Bug 54677 - [GM965 regression] Kernel WARNING + stack trace and display lock-up on lid open
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: medium normal
Assignee: Daniel Vetter
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-09-08 23:46 UTC by Ari Entlich
Modified: 2017-07-24 23:00 UTC

See Also:
i915 platform:
i915 features:


Attachments
Kernel log (856.69 KB, text/plain)
2012-09-08 23:46 UTC, Ari Entlich
intel_error_decode output (1.29 MB, text/plain)
2012-09-08 23:49 UTC, Ari Entlich
intel_reg_dumper output (13.51 KB, text/plain)
2012-09-08 23:49 UTC, Ari Entlich
Un-decoded i915_error_state (674.01 KB, text/plain)
2012-09-09 16:58 UTC, Ari Entlich
Kernel log for Oct 14 kernel (550.62 KB, text/plain)
2012-10-21 18:38 UTC, Ari Entlich
intel_reg_dumper output for Oct 14 kernel (13.57 KB, text/plain)
2012-10-21 18:41 UTC, Ari Entlich
i915_error_state for Oct 14 kernel (759.87 KB, text/plain)
2012-10-21 18:44 UTC, Ari Entlich
force restore on lid open (3.50 KB, patch)
2012-10-21 19:18 UTC, Daniel Vetter

Description Ari Entlich 2012-09-08 23:46:44 UTC
Created attachment 66855
Kernel log

Sometimes when I open my laptop lid, the screen goes black and refuses to show anything else. The laptop is not completely frozen, since I can ssh in and I also appear to be able to switch VTs. Once, I was only able to get the driver to fill in the i915_error_state file after I switched away from and back to my X VT (though that time I did it with chvt).

I have experienced this with 3.4.4 and 3.5.3, but not with 2.6.31 (so it is probably, hopefully not a hardware issue). It also happens with both libdrm 2.4.33 and 2.4.39.

Attached are:

1. A log of kernel messages that shows an assertion failure in the drm code, and also includes a lot of debug logging (I used drm.debug=0xf even though I was told to use 0xe; I hope that's alright).

2. The output of the intel_error_decode utility from intel-gpu-tools.

3. The output of the intel_reg_dumper utility from intel-gpu-tools.
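
A note on the drm.debug values mentioned in item 1: the parameter is a bitmask of debug categories. The sketch below decodes a mask, assuming the category bits used by kernels of this era (core 0x1, driver 0x2, KMS 0x4, PRIME 0x8); that table is an assumption and should be checked against the kernel source actually in use. The function names are illustrative only.

```python
# Assumed drm.debug category bits; verify against your kernel's DRM headers.
DRM_DEBUG_BITS = {
    0x1: "core",
    0x2: "driver",
    0x4: "kms",
    0x8: "prime",
}


def decode_drm_debug(mask):
    """Return the names of the categories enabled by a drm.debug mask."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if mask & bit]


def extra_categories(requested, used):
    """Return the categories enabled in `used` but not in `requested`."""
    return decode_drm_debug(used & ~requested)
```

Under that assumed table, 0xf enables everything 0xe does plus the core category, so the log would be a superset of what was asked for.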

Note that the intel-gpu-tools output and the kernel log are actually from different instances of the problem, because I didn't have the drm.debug option set when I was running those tools. I can get it all from one instance if that is necessary.

I also have some more information if you need it, and can do whatever else is necessary to figure this out.

Thanks!
Comment 1 Ari Entlich 2012-09-08 23:49:02 UTC
Created attachment 66856
intel_error_decode output
Comment 2 Ari Entlich 2012-09-08 23:49:38 UTC
Created attachment 66857
intel_reg_dumper output
Comment 3 Ben Widawsky 2012-09-09 01:32:24 UTC
Try this patch: https://patchwork.kernel.org/patch/1426961/
Comment 4 Chris Wilson 2012-09-09 08:54:07 UTC
Ari, can you add your reproduction method to the older bug?

*** This bug has been marked as a duplicate of bug 53379 ***
Comment 5 Daniel Vetter 2012-09-09 08:55:55 UTC
(In reply to comment #3)
> Try this patch: https://patchwork.kernel.org/patch/1426961/

This patch is for gen3 and earlier only; your machine is gen4, so it doesn't apply. The other thing is that the GPU hang and the WARN are two different bugs: in the attached dmesg there's no sign of a GPU hang afaict. We should concentrate this bug report on the WARN. If the GPU hang is still reproducible with the latest userspace components, please file a new bug for that (attaching the undecoded error_state is usually better, in case we take the opportunity to improve the error state decoder).
Comment 6 Daniel Vetter 2012-09-09 08:57:34 UTC
Chris marked the gpu hang in this bug as duplicate, but since I've decided to track the WARN in this one here we can reopen ;-)
Comment 7 Chris Wilson 2012-09-09 09:02:38 UTC
The WARN looks pretty sweet, but can be avoided with https://bugs.freedesktop.org/attachment.cgi?id=62701
Comment 8 Ari Entlich 2012-09-09 16:58:39 UTC
Created attachment 66883
Un-decoded i915_error_state
Comment 9 Chris Wilson 2012-10-21 18:12:37 UTC
Daniel, do you want the patch to avoid the WARN or respin an equivalent for modeset-rework?
Comment 10 Ari Entlich 2012-10-21 18:38:58 UTC
Created attachment 68879
Kernel log for Oct 14 kernel

Hey guys, here are some more logs. This is with a kernel built from drm-intel-nightly on October 14th. For the kernel log, I filtered out the "drm:drm_ioctl" lines, because they were making the file very big and they don't seem to carry much information.
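
The filtering described above can be sketched as follows. This is a minimal illustration, not the command actually used for the attachment; the function names and the file names in the usage note are hypothetical.

```python
def drop_matching(lines, needle="drm:drm_ioctl"):
    """Yield only the lines that do not contain `needle`."""
    for line in lines:
        if needle not in line:
            yield line


def filter_file(src_path, dst_path, needle="drm:drm_ioctl"):
    """Copy src_path to dst_path, skipping lines that contain `needle`."""
    # errors="replace" keeps going if the log contains undecodable bytes.
    with open(src_path, "r", errors="replace") as src, open(dst_path, "w") as dst:
        dst.writelines(drop_matching(src, needle))
```

For example, `filter_file("kernel.log", "kernel-filtered.log")` would write a copy of the log without the ioctl trace lines while leaving every other message in place.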
Comment 11 Ari Entlich 2012-10-21 18:41:33 UTC
Created attachment 68880
intel_reg_dumper output for Oct 14 kernel
Comment 12 Ari Entlich 2012-10-21 18:44:16 UTC
Created attachment 68881
i915_error_state for Oct 14 kernel
Comment 13 Ari Entlich 2012-10-21 18:48:52 UTC
Oh, and this kernel did not have any other patches applied. It looked to me like the patch in comment #7 was no longer relevant, because of this patch: http://cgit.freedesktop.org/~danvet/drm-intel/commit/?h=drm-intel-nightly&id=3b7a89fce3e3dc96b549d6d829387b4439044d0d
Comment 14 Daniel Vetter 2012-10-21 19:18:08 UTC
Created attachment 68883
force restore on lid open

The attached patch (only compile tested) should shut up the warnings and restore the display on lid open.
Comment 15 Chris Wilson 2012-11-15 13:07:59 UTC
Ari, do you mind confirming that Daniel's patch does exactly what it says on the tin?
Comment 16 Chris Wilson 2012-11-28 23:57:42 UTC
commit a3443c5e7be8831ae634ed0738232452a11f3739
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Fri Nov 23 18:16:34 2012 +0100

    drm/i915: force restore on lid open
    
    There seem to be indeed some awkwards machines around, mostly those
    without OpRegion support, where the firmware changes the display hw
    state behind our backs when closing the lid.
Comment 17 Florian Mickler 2012-12-22 09:18:27 UTC
A patch referencing this bug report has been merged in Linux v3.8-rc1:

commit 45e2b5f640b3766da3eda48f6c35f088155c06f3
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Fri Nov 23 18:16:34 2012 +0100

    drm/i915: force restore on lid open

