Bug 85054 - [BDW] GPU HANG: ecode 0:0x00200003 on resume from suspend (fixed on drm-intel-next and up)
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
Reported: 2014-10-15 10:32 UTC by Timo Aaltonen
Modified: 2017-07-24 22:51 UTC

Description Timo Aaltonen 2014-10-15 10:32:12 UTC
I promised to file this for reference, in case you want to fix it in v3.17.x.

I get a GPU hang on resume when the machine is suspended from the login screen (lightdm). This happens on kernels earlier than drm-intel-next from Oct 4th, so 3.17 and earlier. Tested on Wilson Beach.

error state attached
Comment 1 Timo Aaltonen 2014-10-15 10:34:29 UTC
Hm, I can't attach the error-state file since it is larger than 3 MB, so I put it here:

http://koti.kapsi.fi/~tjaalton/bdw/3.17.0-error-state-resume
Comment 2 Rodrigo Vivi 2014-10-16 16:04:04 UTC
Thank you Timo.

It would be good if you could bisect the development branch to identify which patches fix the issue, so that we can mark them for stable. Can you please help us with that?
Comment 3 Timo Aaltonen 2014-10-16 18:35:25 UTC
Sure thing. It's easiest to test the drm-intel-next-* builds first.
Comment 4 Rodrigo Vivi 2014-10-17 23:43:03 UTC
Apparently the patch that fixes it is:

http://cgit.freedesktop.org/drm-intel/commit/drivers/gpu/drm/i915?h=drm-intel-next&id=6689c167ae14c312972e89be1121e933e4de0001

drm/i915: Rework GPU reset sequence to match driver load & thaw
This patch is to address Daniels concerns over different code during reset:

http://lists.freedesktop.org/archives/intel-gfx/2014-June/047758.html

"The reason for aiming as hard as possible to use the exact same code for
driver load, gpu reset and runtime pm/system resume is that we've simply
seen too many bugs due to slight variations and unintended omissions."

Tested using igt drv_hangman.

V2: Cleaner way of preventing check_wedge returning -EAGAIN
V3: Clean the last_context during reset, to ensure do_switch() does the MI_SET_CONTEXT. As per review.
Signed-off-by: McAulay, Alistair <alistair.mcaulay@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
[danvet: Rebase over ctx->ppgtt rework and extend the comment in
check_wedge a bit.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Comment 5 Timo Aaltonen 2014-10-19 11:20:01 UTC
Thanks! Verified with v3.17.0 + the original v3 patch from the list (that one applied).

Tested-by: Timo Aaltonen <timo.aaltonen@canonical.com>

