Created attachment 114360 [details]
dmesg log

==System Environment==
--------------------------
Only eDP without external monitor.
BIOS: V59
Regression: Not sure. Hard to test previous version because of bug 89005

Non-working platforms: BSW

==kernel==
--------------------------
-testing: drm-intel-testing-2015-03-13 (fails)

==Bug detailed description==
-----------------------------
I tried S3 and then S4, and dmesg showed GPU hang

==Reproduce steps==
----------------------------
1. Boot
2. echo mem > /sys/power/state and resume
3. echo disk > /sys/power/state and resume
4. dmesg shows:
[ 289.878592] [drm] stuck on render ring
[ 289.899951] [drm] GPU HANG: ecode 8:0:0xfffffffe, reason: Ring hung, action: reset
[ 289.899959] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 289.899962] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 289.899965] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 289.899968] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 289.899972] [drm] GPU crash dump saved to /sys/class/drm/card0/error
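The reproduce steps above can be sketched as a small shell helper. This is only an illustration, not part of the original report: the function names `enter_state` and `count_gpu_hangs` are hypothetical, and the match string assumes the "GPU HANG" line format shown in the dmesg excerpt above.

```shell
#!/bin/sh
# Hypothetical helpers for the S3 + S4 reproduction steps.

# Write a sleep state ("mem" for S3, "disk" for S4) to /sys/power/state.
# Needs root and really suspends the machine, so it is only defined here,
# never called.
enter_state() {
    echo "$1" > /sys/power/state
}

# Count "GPU HANG" lines in a kernel log supplied on stdin.
# grep -c prints 0 and exits non-zero when nothing matches; "|| true"
# keeps the function's exit status at 0 in that case.
count_gpu_hangs() {
    grep -c 'GPU HANG' || true
}
```

On a test machine, as root, one cycle would then be `enter_state mem`, resume, `enter_state disk`, resume, followed by `dmesg | count_gpu_hangs`.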
Created attachment 114361 [details] /sys/class/drm/card0/error
exists on drm-intel-testing-2015-03-27
exists on drm-intel-testing-2015-04-10
glxgears looks fine after this issue appears.
I can reproduce this. If I test again after the suspend/hibernate has failed it works though (with a similar work/fail/work/fail pattern). Can you confirm a similar behaviour?
Yes. I saw similar behavior (5 GPU hangs after 10 suspend/resume cycles).
Could you try passing the enable_execlists=0 parameter to i915? This seems to "fix" (work around) the issue for me.
You're right. I don't see GPU hang with i915.enable_execlists=0
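When comparing results it helps to confirm which execlists setting a boot actually requested. Below is a minimal sketch, assuming the option is passed on the kernel command line as `i915.enable_execlists=N`; the function name `execlists_from_cmdline` is hypothetical, and the driver's default when the option is absent is not something this sketch tries to guess.

```shell
#!/bin/sh
# Print the value of i915.enable_execlists found in a kernel command line
# string, or "unset" if the option is not present. If the option is given
# more than once, the last occurrence is reported.
execlists_from_cmdline() {
    value=unset
    # Intentionally unquoted so the command line splits into words.
    for arg in $1; do
        case $arg in
            i915.enable_execlists=*) value=${arg#i915.enable_execlists=} ;;
        esac
    done
    echo "$value"
}
```

Typical use would be `execlists_from_cmdline "$(cat /proc/cmdline)"`. On kernels that expose the module parameter, `cat /sys/module/i915/parameters/enable_execlists` shows the value the loaded driver was given.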
Please could you check if https://patchwork.freedesktop.org/patch/47251/ fixes the problem. Obviously make sure that you test with i915.enable_execlists=1
Created attachment 115204 [details]
dmesg after applying the patch

Still see GPU HANG after applying the patch to drm-intel-testing-2015-04-10.
Can you test if you can reproduce the bug with:

* enable_execlists=0 + only S3
* enable_execlists=0 + only S4

In my testing I can only reproduce the issue when S4 is involved; when execlists are disabled S3 will not trigger any bugs (if execlists are used I can trivially trigger a similar -- not necessarily identical -- issue by using just S3).
With i915.enable_execlists=0, I tried boot + 10 S3, and boot + 10 S4, and I didn't see a GPU hang in either case.
Has anyone seen this on any platform other than BSW? I'm currently on a BDW, 30+ cycles and no hangs, with execlists on.
So far my experience is that if I boot the system in single user mode I'm unable to reproduce this issue. Booting in multi user mode and using X (in my case GNOME 3) during the test will however eventually trigger the GPU hangs.

Running a test set with only S3 involved seems to work fine. S4 only seems to work fine too. I've only triggered this with the combination of S3 + S4 + X.

So my theory is that some state (that only matters when using X -- presumably the accelerated parts that GNOME 3 requires, but that's just a hunch) that both S3 and S4 mess with isn't handled properly by one of those two code paths.

I've tested both 61.4 and 66 and see similar results on both.

I've yet to confirm whether this is specific to Braswell only. If it is, it could be a BIOS issue; I don't think we have any Braswell-specific S3/S4 kernel code (it could still be a kernel issue if Braswell introduces some new properties that we don't restore properly on resume/thaw).
Hi David,

With the default enable_execlists (which I believe is i915.enable_execlists=1), I can reproduce this issue without X on my test machine after several S3 cycles. My test machine is re-worked, though.

You can ssh into x-bsw14 remotely if you want. rtcwake can also reproduce this issue.
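For unattended runs, the rtcwake approach mentioned here could look roughly like the sketch below. It only prints the commands it would run, so the plan can be reviewed first. The function name `plan_cycles` and the wake delay are assumptions for illustration; `-m` (sleep mode) and `-s` (seconds until the RTC wake alarm) are standard util-linux rtcwake options.

```shell
#!/bin/sh
# Print a suspend/resume test plan using rtcwake.
# Usage: plan_cycles COUNT SECONDS MODE...
#   COUNT   number of rounds
#   SECONDS delay before the RTC alarm wakes the machine
#   MODE    one or more rtcwake modes per round, e.g. "mem" (S3), "disk" (S4)
plan_cycles() {
    count=$1
    seconds=$2
    shift 2
    i=1
    while [ "$i" -le "$count" ]; do
        for mode in "$@"; do
            echo "rtcwake -m $mode -s $seconds"
        done
        i=$((i + 1))
    done
}
```

For example, `plan_cycles 10 30 mem disk` prints ten rounds of S3 followed by S4. The printed commands need root to run, and checking dmesg for "GPU HANG" after each resume is left to the tester.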
(In reply to Jeff Zheng from comment #15)
> Hi David,
>
> With default enable_execlists (I believe to be i915.enable_execlists=1), I
> can reproduce this issue without X on my test machine after several S3. My
> test machine are re-worked though.
>
> You can remote ssh into x-bsw14 if you want. rtcwake can also reproduce this
> issue.

My tests were with enable_execlists=0; I'm perfectly aware that the enable_execlists=1 case is broken.
Seems to be a BSW issue. The exact same kernel that was run on the BDW, installed on the BSW, gets a GPU hang on the first P3. Will see if I can find the cause now.
Tested it on BDW-Y with the testing kernel drm-intel-testing-2015-04-23. This problem also exists there, and it also appears when running S4 twice.
(In reply to ye.tian from comment #18)
> Tested it on BDW-Y with the testing kernel drm-intel-testing-2015-04-23,
> This problem also exists, or run twice S4.

OK, so this isn't a Braswell-specific issue, after all?
Hi Tian Ye, could you please also upload dmesg?
I was on the BDW-U (rvp).
Created attachment 115366 [details]
dmesg info on BDW-Y after S3+S4
(In reply to peter.antoine from comment #21)
> I was on the BDW-U (rvp).

I am unable to reproduce this issue on the BDW-U.
When you reproduced it in BDW-Y, was that with execlists enabled or disabled?
This is a problem that has been seen before on other systems. It is a timing issue with the way the registers are restored when the system comes out of power saving. I have a patch that has survived 10 S3 and 10 S4 without a GPU hang. The patch simply changes the order that bits of the system are being re-initialised after a resume. Patch will be released to linux-gfx within the hour.
The possible fix from Peter is at: https://patchwork.freedesktop.org/patch/48028/ Please could you try this patch and see if it fixes the problem?
(In reply to Chris Harris from comment #26)
> The possible fix from Peter is at:
>
> https://patchwork.freedesktop.org/patch/48028/
>
> Please could you try this patch and see if it fixes the problem?

On BSW, I applied the patch to drm-intel-testing-2015-04-23. I tried 10 S3 and then S3+S4 3 times and could not reproduce this issue. Without the patch, I can easily reproduce this issue.
(In reply to David Weinehall from comment #24)
> When you reproduced it in BDW-Y, was that with execlists enabled or disabled?

With execlists enabled. I tested the latest nightly kernel with this patch, and the problem does not exist there.
I can confirm that the patch does the trick, even with execlists enabled. @Peter: Nice work!
Still exists on drm-intel-testing-2015-05-08... Is the patch checked in?
No, the patch has not been merged yet. The patch, while indeed fixing the issue, was deemed to be too "ugly". Discussions are currently taking place on the intel-gfx mailing list as to what a better solution should look like.
The patch was too heavy-handed. I have currently only added the part that is directly causing the issue. There are some other issues that will need to be fixed, but these are being scheduled for later. New patches are on the mailing list.
Observing this with the below stack on BSW

OS: Ubuntu 14.04.01. 64-bit
kernel: Eywa-4.0.0-rc7
Bios: BSW_SPI_Quad_R10_Config3_PreProduction_BRASWEL_X64_R_X068_01_ME-2.0.0.2060
KSC: 1.08

Software Stack:
===============
mesa - 10.6.0-devel ef5d4bcc3a21f1aa3e6a919c8888f26ec754707f
libdrm - 2.4.60 812e8fe6ce46d733c30207ee26c788c61f546294
libva - 0.37.1 9bfde38f19d81b7f33db8c4c8e80420c9e60429e
Xf86_video_intel - 5054e2271210a52bf88b0f12c35d687ce9e8210d
xserver - 1.15.1 b1029716e41e252f149b82124a149da180607c96
intel-driver - 37d1ee43a223766164ccc1de9079cac27c44e8f0
Fixed by

commit 364aece01a2dd748fc36a1e8bf52ef639b0857bd
Author: Peter Antoine <peter.antoine@intel.com>
Date: Mon May 11 08:50:45 2015 +0100

    drm/i915: Avoid GPU hang when coming out of s3 or s4

in drm-intel-fixes.
Tested on latest nightly 2015_05_15 and this issue is fixed.
GPU hang issue exists with the below stack

Kernel: Eywa-4.1.0-rc3
commit id: 21cb3a48ab8b421aba19939151d7ad4cd8c6e531
bios ver: 68.1
ksc: 1.08
(In reply to vivekanandhan J from comment #37)
> gpu hung issue exists with the below stack
>
> Kernel: Eywa-4.1.0-rc3
> commit id: 21cb3a48ab8b421aba19939151d7ad4cd8c6e531
> bios ver: 68.1
> ksc: 1.08

Well, do you have the commit referenced in comment #35 above?
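One way to answer that question for a given kernel tree is to ask git whether the fix is an ancestor of the checked-out HEAD. This is a sketch under assumptions: the function name `has_commit` and the tree path are hypothetical, and a tree that cherry-picked or backported the fix would carry it under a different hash, in which case this check reports it as missing.

```shell
#!/bin/sh
# Return success if commit $2 is reachable from HEAD in the git tree at $1.
# An unknown commit id makes git fail, which is reported as "not present";
# git's error output is suppressed.
has_commit() {
    git -C "$1" merge-base --is-ancestor "$2" HEAD 2>/dev/null
}
```

For example, `has_commit /path/to/linux 364aece01a2dd748fc36a1e8bf52ef639b0857bd && echo present`. Searching the log for the subject line ("drm/i915: Avoid GPU hang when coming out of s3 or s4") with `git log --grep` is a useful cross-check for backported trees.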
<changing title> GPU hang observed with s4 exit only
(In reply to appala from comment #39)
> <changing title>
> GPU hang observed with s4 exit only

If the observed behaviour doesn't match the reported behaviour of a pre-existing bug (which yours does not seem to do), then please file a new bug rather than re-using an existing bug.
Hi David,

We have filed a new bug for the GPU hang issue; the bug id is 92435.
Closing old verified.