Created attachment 96758 [details] dmesg log
Getting a GPU hang error in dmesg on a Broadwell Y SKU:
"GPU HANG: ecode 0:0x4cb94c99, reason: Ring hung, action: reset"
The system still continues to operate, however. The crash dump file "/sys/class/drm/card0" is empty.
OS: Ubuntu 13.10 (64 bit)
Linux Kernel: 3.14.0-rc7
Processor: BDW Y
RAM: 4 GB
Created attachment 96759 [details] xorg.0.log
Use "cat /sys/class/drm/card0/error > /tmp/error"
Created attachment 96823 [details] cat /sys/class/drm/card0/error > /tmp/error
Noor, please clear "NEEDINFO" status when you provide the answer.
Upgraded the kernel to 3.15.0-rc2 intel-drm-nightly. Now getting a different error code with the GPU hang:
[ 80.502986] [drm:gen8_irq_handler] *ERROR* Pipe A FIFO underrun
[ 97.778228] [drm] stuck on render ring
[ 97.780656] [drm] GPU HANG: ecode 0:0x43c6e6cf, reason: Ring hung, action: reset
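Since the ecode changes between kernels in this thread, it can help to pull the hang lines out of a saved dmesg automatically. The following is a minimal sketch, assuming the message layout seen in the logs quoted in this bug (including the later variant that also names the process, e.g. "in Xorg [1066]"). The names `HANG_RE`, `GpuHang` and `parse_gpu_hangs` are hypothetical helpers, not part of any i915 tooling, and the number before the colon in the ecode is stored as `prefix` without interpreting it.

```python
import re
from typing import List, NamedTuple, Optional

# Pattern derived only from the dmesg lines quoted in this bug report;
# other kernel versions may format the message differently (assumption).
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<prefix>\d+):(?P<ecode>0x[0-9a-fA-F]+)"
    r"(?:, in (?P<process>\S+) \[(?P<pid>\d+)\])?"
    r", reason: (?P<reason>[^,]+), action: (?P<action>\w+)"
)


class GpuHang(NamedTuple):
    prefix: int             # number before the colon in "ecode 0:0x..."
    ecode: str              # hex error code as printed, e.g. "0x43c6e6cf"
    process: Optional[str]  # process name, when the kernel prints it
    pid: Optional[int]      # PID of that process, when printed
    reason: str             # e.g. "Ring hung"
    action: str             # e.g. "reset"


def parse_gpu_hangs(dmesg_text: str) -> List[GpuHang]:
    """Return one GpuHang per 'GPU HANG' line found in dmesg_text."""
    hangs = []
    for m in HANG_RE.finditer(dmesg_text):
        pid = m.group("pid")
        hangs.append(
            GpuHang(
                prefix=int(m.group("prefix")),
                ecode=m.group("ecode"),
                process=m.group("process"),
                pid=int(pid) if pid else None,
                reason=m.group("reason").strip(),
                action=m.group("action"),
            )
        )
    return hangs
```

Feeding it the text of a saved dmesg attachment gives a list that can be compared across kernel versions, for example to check whether the same ecode recurs.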
Created attachment 97899 [details] dmesg for ecode 0:0x43c6e6cf
The output of 'cat /sys/class/drm/card0/error > /tmp/error' for this error is too big to attach here (above 3 MB). Kindly let me know if I should share it some other way.
(In reply to comment #7)
> cat /sys/class/drm/card0/error > /tmp/error for this error is too big to
> attach here (above 3 MB). Kindly let me know if I need to share it in any
> other way.
You might need to compress it with gzip or something like that.
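One way to do the capture and the compression in a single step is sketched below. It assumes the error state lives at /sys/class/drm/card0/error as in the earlier comments; the function name `save_compressed_error_state` and the default output path `/tmp/error.gz` are illustrative choices, not anything mandated by the driver. Reading the sysfs file may require root.

```python
import gzip
import shutil


def save_compressed_error_state(
    src: str = "/sys/class/drm/card0/error",  # path used earlier in this bug
    dest: str = "/tmp/error.gz",              # hypothetical output location
) -> str:
    """Copy the error state from src into a gzip-compressed file at dest."""
    # Stream the data in chunks so a multi-megabyte dump is never
    # held in memory all at once.
    with open(src, "rb") as fin, gzip.open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest
```

The resulting file can be attached directly; `zcat /tmp/error.gz` restores the original text.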
We upgraded the kernel to the latest intel-drm-nightly:
commit f79ba79cf037eea9ee757ad37730b00f43d5ef80
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Fri May 16 21:54:54 2014 +0200
    drm-intel-nightly: 2014y-05m-16d-21h-54m-36s integration manifest
The GPU hang error no longer appears in dmesg, and 'cat /sys/class/drm/card0/error > /tmp/error' writes "no error state collected" to the output file.
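When collecting logs over long runs, it is useful to tell a real dump apart from the placeholder text before attaching anything. This is a small sketch under the assumption that an absent error state shows up either as an empty file (as in the first comment) or as the "no error state collected" message quoted above; `has_error_state` is a hypothetical helper name.

```python
# Placeholder text as quoted in this bug report.
NO_ERROR_MARKER = "no error state collected"


def has_error_state(captured_text: str) -> bool:
    """Return True if the captured text looks like a real error dump."""
    text = captured_text.strip()
    # An empty capture or the placeholder message both mean
    # there is nothing worth attaching.
    return bool(text) and not text.lower().startswith(NO_ERROR_MARKER)
```

For example, `has_error_state(open("/tmp/error").read())` can gate whether the file is compressed and attached.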
It looks like the issue is still there on longer runs. I am attaching the dmesg (dmesg2.txt) as well as the 'cat /sys/class/drm/card0/error > /tmp/error2' output (error2.bz2).
Created attachment 99615 [details] cat /sys/class/drm/card0/error > /tmp/error2
Created attachment 99616 [details] dmesg2
From dmesg2, it looks like the system is hanging really quickly. Please try to update your kernel to the rc7-based one (it has the null context), update your BIOS, and update your silicon to the latest. Then, if the bug still persists, report back with the error state. Thanks.
Created attachment 100863 [details] dmesg3.log is for 3.15-rc7 (222ccbc) drm-intel-nightly
Created attachment 100864 [details] error dump cat /sys/class/drm/card0/error > error3.log
I am able to reproduce the GPU hang on 3.15-rc7 (222ccbc - drm-intel-nightly) as well. I have attached dmesg3.log and error3.log.bz2. It looks like this crash happened while I was playing a video in VLC player.
I am able to reproduce this issue on the mainline kernel (3.16.0-rc6-mainline, master-82e13c7) while opening VLC player. I am attaching the logs dmesg4.log and error4.log.bz2.
Created attachment 103435 [details] dmesg dmesg4
Created attachment 103436 [details] cat /sys/class/drm/card0/error > error4.log
Ben, are you still looking at this issue? Has anyone in OTC been able to reproduce this?
Noor, please also update the bug with steps to reproduce it.
Per your request:
Steps:
---------
1. Execute commands:
   sudo -s
   echo disk > /sys/power/state
2. Wait 60 seconds
3. Resume the DUT using the keyboard
4. Wait a moment
Actual result:
-----------------
4. I tried several times, with several different results:
   DUT resumes but is frozen with the screen on (terminal is restored); mouse and keyboard are not responding
   DUT reboots (image is not restored)
   Boot/resume stops with some logs on the screen
Expected result:
-------------------
4. DUT successfully suspends to disk and resumes
Gavin, the hang seems to happen quite randomly, especially when I am running graphics operations such as playing videos in VLC player. Regarding Glenn's comment, I can confirm the GPU hang happens when resuming from S4. But I would also like to mention that I get the GPU hang even without any S4 cycle.
Noor, sorry, it's been a while with no update on this. Could you please check the state of the latest -nightly, setting the suspend-resume case aside for now? Please report that one as a separate bug. I'm afraid the hang Glenn was getting was related to PSR suspend-resume.
I have not hit the GPU hang with the configuration below in more than 20 hours of playing a high-resolution video with mplayer. However, the issue is reproducible (10/10) when playing the video with VLC player.
VLC player - VLC media player 2.1.4 Rincewind (revision 2.1.4-0-g2a072be)
"""
[148638.947640] drm/i915: Resetting chip after gpu hang
[148644.978512] [drm] GPU HANG: ecode 0:0x85dffffb, in Xorg [1066], reason: Ring hung, action: reset
[148644.979107] [drm:i915_context_is_banned [i915]] *ERROR* gpu hanging too fast, banning!
[148644.985697] drm/i915: Resetting chip after gpu hang
"""
The VLC player used is up to date, so the issue appears to be specific to VLC player. Do let me know if more information is needed.
Configuration used:
Board : BDW Y LPDDR3 (PV) CRB FAB2 WITH MEMORY H11021-201
CPU : BROADWELL 2+2 ULX E3 MOBILE QDF: QNJ5
Libdrm : (master)libdrm-2.4.58-4-g00847fa48b83a85b0cb882594a12ed1511f780dbq
Mesa : (master) 600066af93fe60abbfff5be82527b529e1e44916
Xserver : (master) e9db7682028bb0464c211c1f7bb6983fcfb6f37b
Xf86-video-intel : 61436c2fabe117b85404eecb06158ba0a63a7741
Cairo : (master) b4e218c3e8402e149115a59406796b751118237f
Libva : (master) ccd93de5a707e92a629cccd595757c8d436fa3cc
Libva_intel_driver : (master) 24cba20a119c96556ae4dc9a90043896ea70e567
Kernel : (drm-intel-nightly) 32cefad9992e67b4ee7487adf465bd7e189c9c1c
Thanks,
Vijay
Thanks for the test. Let's consider it fixed. Feel free to reopen in case you still face the issue with latest -nightly.