Created attachment 129895
dump from /sys/class/drm/card0/error

Hi:

While transcoding HEVC 10-bit video to H.264, the GPU may hang. It recovers from this state only after the device is rebooted.

The kernel messages are as follows:

[ 973.709462] [drm] GPU HANG: ecode 9:4:0xacdfbffd, in ffmpeg [6992], reason: Hang on video enhancement ring, action: reset
[ 973.720436] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 973.729630] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 973.738477] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 973.748085] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 973.757000] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 973.764295] drm/i915: Resetting chip after gpu hang
[ 973.769821] [drm] GuC firmware load skipped
[ 983.766780] [drm:i915_gem_wait_for_error.part.38] *ERROR* Timed out waiting for the gpu reset to complete
[ 993.780258] [drm:i915_gem_wait_for_error.part.38] *ERROR* Timed out waiting for the gpu reset to complete
[ 1003.792770] [drm:i915_gem_wait_for_error.part.38] *ERROR* Timed out waiting for the gpu reset to complete
[ 1013.805246] [drm:i915_gem_wait_for_error.part.38] *ERROR* Timed out waiting for the gpu reset to complete
[ 1023.817750] [drm:i915_gem_wait_for_error.part.38] *ERROR* Timed out waiting for the gpu reset to complete
[ 1024.719067] [drm:i915_gem_wait_for_error.part.38] *ERROR* Timed out waiting for the gpu reset to complete
[ 1033.830242] [drm:i915_gem_wait_for_error.part.38] *ERROR* Timed out waiting for the gpu reset to complete
[ 1034.731572] [drm:i915_gem_wait_for_error.part.38] *ERROR* Timed out waiting for the gpu reset to complete
[ 1043.843737] [drm:i915_gem_wait_for_error.part.38] *ERROR* Timed out waiting for the gpu reset to complete

dump from /sys/class/drm/card0/error:

GPU HANG: ecode 9:4:0xacdfbffd, in ffmpeg [6992], reason: Hang on video enhancement ring, action: reset
Time: 1487838143 s 4506 us
Kernel: 4.2.8
Active process (on ring vebox): ffmpeg [6992]
Reset count: 0
Suspend count: 0
PCI ID: 0x5a85
PCI Revision: 0x0b
PCI Subsystem: 8086:2112
IOMMU enabled?: 0
DMC loaded: yes
DMC fw version: 1.7
EIR: 0x00000000
IER: 0x08000000
GTIER gt 0: 0x01010101
GTIER gt 1: 0x01010101
GTIER gt 2: 0x00000070
GTIER gt 3: 0x00000101
PGTBL_ER: 0x00000000
FORCEWAKE: 0xffff0001
DERRMR: 0x2077efef
CCID: 0x00000000
Missed interrupts: 0x00000000

CPU: Intel(R) Celeron(R) CPU J3455 @ 1.50GHz
Git src tag: drm-intel-testing-2016-07-25

Please advise. Thank you!

Cheers,
Edward Tseng
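For anyone triaging similar reports, the "GPU HANG" line quoted above can be split into its fields before comparing hangs across runs. The following is only a minimal sketch, assuming the line format exactly as it appears in this log; the function name parse_gpu_hang and the field names are hypothetical, and reading the first two ecode numbers as hardware generation and engine index is an assumption, not something stated in this report.

```python
import re
from typing import Dict, Optional

# Pattern written against the hang line as it appears in this report.
# The group names "gen" and "engine_id" are an assumed interpretation.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<gen>\d+):(?P<engine_id>\d+):(?P<ecode>0x[0-9a-fA-F]+), "
    r"in (?P<comm>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.*?), action: (?P<action>\w+)"
)


def parse_gpu_hang(line: str) -> Optional[Dict[str, object]]:
    """Return the fields of a 'GPU HANG' log line, or None if it does not match."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "gen": int(fields["gen"]),
        "engine_id": int(fields["engine_id"]),
        "ecode": int(fields["ecode"], 16),
        "comm": fields["comm"],
        "pid": int(fields["pid"]),
        "reason": fields["reason"],
        "action": fields["action"],
    }
```

Two hangs that yield the same ecode and reason are likely worth comparing side by side, though matching fields alone do not prove a common cause.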
There have been changes to GuC firmware in kernel 4.10. Could you update your kernel and let us know whether this still happens? Also, 01.org has an updated GuC firmware to download: https://01.org/linuxgraphics/downloads/broxton-guc-8.7

I will place the bug into the NeedInfo state. As soon as you add the details, please change the bug back to Reopen.
I'm not sure this is GuC related: "[drm] GuC firmware load skipped"

Could you please boot with drm.debug=0xe, reproduce the issue, and post the dmesg output here? Also, could you please attach /sys/kernel/debug/dri/0/i915_guc_load_status?

Thanks,
Rodrigo.
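Before reproducing the hang, it may be worth confirming that the drm.debug parameter requested above actually reached the kernel. This is a minimal sketch, not part of the original request; the helper name drm_debug_mask is hypothetical, and it only inspects a command-line string passed to it.

```python
from typing import Optional


def drm_debug_mask(cmdline: str) -> Optional[int]:
    """Return the drm.debug value from a kernel command line, or None if absent or malformed."""
    for token in cmdline.split():
        if token.startswith("drm.debug="):
            value = token.split("=", 1)[1]
            try:
                # Base 0 accepts both hexadecimal (0xe) and decimal (14) spellings.
                return int(value, 0)
            except ValueError:
                return None
    return None
```

On the affected machine the string would typically be read from /proc/cmdline; a result of 14 corresponds to drm.debug=0xe.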
Hi Rodrigo:

I turned on debugging, and the kernel messages are as follows:

<7>[ 486.419915] [drm:drm_ioctl] pid=30654, dev=0xe280, auth=1, I915_GEM_SW_FINISH
<7>[ 486.419917] [drm:drm_ioctl] pid=30654, dev=0xe280, auth=1, I915_GEM_SW_FINISH
<7>[ 486.419919] [drm:drm_ioctl] pid=30654, dev=0xe280, auth=1, I915_GEM_EXECBUFFER2
<6>[ 494.557219] [drm] GPU HANG: ecode 9:4:0xacdfbffd, in ffmpeg [30654], reason: Hang on video enhancement ring, action: reset
<6>[ 494.568355] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
<6>[ 494.577563] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
<6>[ 494.586404] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
<6>[ 494.596034] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
<6>[ 494.604973] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 494.611401] [drm:i915_reset_and_wakeup] resetting chip
<5>[ 494.611406] drm/i915: Resetting chip after gpu hang
<7>[ 494.611425] [drm:drm_ioctl] pid=30654, dev=0xe280, auth=1, I915_GEM_MADVISE
<7>[ 494.616426] [drm:gen8_init_common_ring] Execlists enabled for render ring
<7>[ 494.616442] [drm:gen8_init_common_ring] Execlists enabled for blitter ring
<7>[ 494.616454] [drm:gen8_init_common_ring] Execlists enabled for bsd ring
<7>[ 494.616465] [drm:gen8_init_common_ring] Execlists enabled for video enhancement ring
<7>[ 494.616482] [drm:intel_guc_setup] GuC fw status: path i915/kbl_guc_ver9_14.bin, fetch NONE, load NONE
<6>[ 494.616484] [drm] GuC firmware load skipped
<7>[ 494.620724] [drm:drm_ioctl] pid=30654, dev=0xe280, auth=1, DRM_IOCTL_GEM_CLOSE
<7>[ 494.620728] [drm:drm_ioctl] pid=30654, dev=0xe280, auth=1, DRM_IOCTL_GEM_CLOSE
<7>[ 494.620732] [drm:drm_ioctl] pid=30654, dev=0xe280, auth=1, DRM_IOCTL_GEM_CLOSE

PS. I cannot find the /sys/kernel/debug/dri/0/i915_guc_load_status node. Is there any debug option I need to set up?

Thank you!

Cheers,
Edward Tseng
Rodrigo, can you help with this question? Once you reply, you can reset the assignee to the mailing list.
This is not a firmware-related bug, since GuC is not being loaded, so I am changing the category.
Looking at the error state, it appears to hang on the very first attempt to use the VECS ring. The very first entry on the VECS ring doesn't look like a valid command.

This looks like a userspace bug to me. I assume you are using open source libva with vaapi-intel-driver. If this is the case, please go ahead and report this issue at https://github.com/01org/intel-vaapi-driver/issues.

I'm closing this bug here for now. Feel free to reopen if necessary.