Bug 60820 - [snb gt2] Hard hang after updating kernel, libdrm2 and intel_drv to 3.8-rc7, 2.4.42+git20130211.20c5607b, 2.21.2+git20130211.75406775 respectively.
Status: CLOSED INVALID
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: Damien Lespiau
QA Contact: Intel GFX Bugs mailing list
Reported: 2013-02-13 22:31 UTC by Gökçen Eraslan
Modified: 2017-07-24 22:58 UTC

Attachments
Xorg.0.log (33.07 KB, text/plain)
2013-02-13 22:31 UTC, Gökçen Eraslan
dmesg (62.54 KB, text/plain)
2013-02-13 22:31 UTC, Gökçen Eraslan
i915_error_state (2.07 MB, text/plain)
2013-02-13 22:32 UTC, Gökçen Eraslan
Exact update history (3.90 KB, text/plain)
2013-02-13 22:32 UTC, Gökçen Eraslan
i915_gem_pageflip contents continuously appended (7.11 KB, text/plain)
2013-02-14 14:37 UTC, Gökçen Eraslan

Description Gökçen Eraslan 2013-02-13 22:31:21 UTC
Created attachment 74788 [details]
Xorg.0.log

After kernel/libdrm/mesa/intel_drv updates, X started freezing after a few minutes of usage. It doesn't happen suddenly, though: first the mouse and keyboard get slower and slower, and then X freezes completely.

I am attaching lspci, dmesg, Xorg.0.log and i915_error_state.
Comment 1 Gökçen Eraslan 2013-02-13 22:31:45 UTC
Created attachment 74789 [details]
dmesg
Comment 2 Gökçen Eraslan 2013-02-13 22:32:32 UTC
Created attachment 74790 [details]
i915_error_state
Comment 3 Gökçen Eraslan 2013-02-13 22:32:57 UTC
Created attachment 74791 [details]
Exact update history
Comment 4 Chris Wilson 2013-02-13 22:48:30 UTC
That is puzzling. There is no reason for it to become trapped inside the select loop there and cause the mieq overflow, as it could wake up and process events at any time. Is this 100% reproducible with similar logs?

Can you try the baseline -intel and xserver-xorg-core from raring (individually downgrading them) to see which upgrade triggered the hang?
Comment 5 Gökçen Eraslan 2013-02-13 23:53:44 UTC
(In reply to comment #4)
> That is puzzling. There is no reason for it to become trapped inside the
> select loop there and cause the mieq overflow, as it could wake up and
> process events at any time. Is this 100% reproducible with similar logs?
> 

Yes.

> Can you try the baseline -intel and xserver-xorg-core from raring
> (individually downgrading them) to see which upgrade triggered the hang?

I use Quantal+xorg edgers PPA. Raring has xserver-xorg-video-intel-2:2.21.2-0ubuntu1 and xserver-xorg-core-2:1.13.2-0ubuntu2.

Which version should I use for downgrading? Back to xserver-xorg-video-intel-2.21.0+git20130204.9640640a-0ubuntu0ricotz~quantal? Or 2.20.19?
Comment 6 Chris Wilson 2013-02-14 00:10:22 UTC
Hmm, my fault, I expected xorg-edgers to have more recent packages. Should have paid closer attention to the version numbers.

Probably best to try the raring packages first as they are more recent.
Comment 7 Chris Wilson 2013-02-14 00:39:00 UTC
Can you also please check /sys/kernel/debug/dri/0/i915_gem_pageflip when it hangs?
Comment 8 Gökçen Eraslan 2013-02-14 00:44:02 UTC
(In reply to comment #7)
> Can you also please check /sys/kernel/debug/dri/0/i915_gem_pageflip when it
> hangs?

Of course. Right now, I'm trying a more recent version of xserver-xorg-video-intel, namely 2.21.2+git20130213.9861423a-0ubuntu0sarvatt~quantal.
Comment 9 Gökçen Eraslan 2013-02-14 14:29:56 UTC
(In reply to comment #7)
> Can you also please check /sys/kernel/debug/dri/0/i915_gem_pageflip when it
> hangs?

It is almost impossible to run any commands after the freeze; even ssh connections stop working. But I was able to capture the i915_gem_pageflip contents by using a watch command that continuously appended the file contents to a temporary file.
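
The capture approach described above (polling the debugfs file and appending each snapshot to a temporary file) could be sketched roughly as follows. This is only an illustration, not the command that was actually run: the function name, the polling interval and the output path are assumptions, and each snapshot is fsync'ed in the hope that it reaches disk before the machine hangs.

```python
import os
import time


def append_snapshots(src, dst, interval=0.5, count=None):
    """Append the contents of `src` to `dst` every `interval` seconds.

    Each snapshot is flushed and fsync'ed so it has a better chance of
    surviving a hard hang. `count=None` means "loop until interrupted".
    Returns the number of snapshots written.
    """
    taken = 0
    with open(dst, "a") as out:
        while count is None or taken < count:
            # Re-open the source each time: debugfs files are regenerated on read.
            with open(src) as f:
                out.write(f.read())
            out.flush()
            os.fsync(out.fileno())
            taken += 1
            if count is None or taken < count:
                time.sleep(interval)
    return taken
```

Run as root, a call such as `append_snapshots("/sys/kernel/debug/dri/0/i915_gem_pageflip", "/tmp/pageflip.log")` would keep sampling until the process is killed or the machine stops responding; the `/tmp/pageflip.log` path is hypothetical.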

Here is part of the file I got:

No flip due on pipe A (plane A)
No flip due on pipe B (plane B)
No flip due on pipe A (plane A)
No flip due on pipe B (plane B)
No flip due on pipe A (plane A)
No flip due on pipe B (plane B)
Flip queued on pipe A (plane A)
Stall check enabled, 1 prepares
Old framebuffer gtt_offset 0x0087c000
New framebuffer gtt_offset 0x05c69000
No flip due on pipe B (plane B)
Flip queued on pipe A (plane A)
Stall check enabled, 1 prepares
Old framebuffer gtt_offset 0x0bc6d000
New framebuffer gtt_offset 0x0087c000
No flip due on pipe B (plane B)
No flip due on pipe A (plane A)
No flip due on pipe B (plane B)
No flip due on pipe A (plane A)
No flip due on pipe B (plane B)
No flip due on pipe A (plane A)
No flip due on pipe B (plane B)
No flip due on pipe A (plane A)
No flip due on pipe B (plane B)
No flip due on pipe A (plane A)
No flip due on pipe B (plane B)
No flip due on pipe A (plane A)
No flip due on pipe B (plane B)
No flip due on pipe A (plane A)
No flip due on pipe B (plane B)
No flip due on pipe A (plane A)
No flip due on pipe B (plane B)
Flip queued on pipe A (plane A)
Stall check enabled, 1 prepares
Old framebuffer gtt_offset 0x0087c000
New framebuffer gtt_offset 0x0bc6d000
No flip due on pipe B (plane B)
No flip due on pipe A (plane A)
No flip due on pipe B (plane B)
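
A long capture like this is easier to read once it is summarized per pipe. The following is a minimal sketch that assumes the file contains only line formats like the ones shown above ("No flip due on pipe X" and "Flip queued on pipe X"); the function name is hypothetical and any other lines are simply ignored.

```python
import re
from collections import Counter

# Matches only the two status lines that appear in the excerpt above.
FLIP_LINE = re.compile(r"^(No flip due|Flip queued) on pipe (\w+)")


def summarize_flips(text):
    """Count status lines per (pipe, state) in captured i915_gem_pageflip output."""
    counts = Counter()
    for line in text.splitlines():
        match = FLIP_LINE.match(line.strip())
        if match:
            state, pipe = match.group(1), match.group(2)
            counts[(pipe, state)] += 1
    return counts
```

In the excerpt above, every sample for pipe B reads "No flip due", while pipe A shows three queued flips among its idle samples.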
Comment 10 Gökçen Eraslan 2013-02-14 14:37:00 UTC
Created attachment 74819 [details]
i915_gem_pageflip contents continuously appended
Comment 11 Chris Wilson 2013-02-14 14:59:33 UTC
As it is a hard hang, it is probably not related to the pageflip->compiz/X freeze. Can you please try booting with i915.i915_enable_rc6=0 and see if that stops the hangs?
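
When testing a boot parameter like this, it helps to confirm that the option actually reached the kernel. A small sketch, assuming the option was passed on the kernel command line (readable from /proc/cmdline on Linux); the helper name is hypothetical:

```python
def i915_params(cmdline):
    """Return {name: value} for every `i915.name=value` token in a kernel command line."""
    params = {}
    for token in cmdline.split():
        if token.startswith("i915.") and "=" in token:
            name, value = token[len("i915."):].split("=", 1)
            params[name] = value
    return params
```

For example, `i915_params(open("/proc/cmdline").read()).get("i915_enable_rc6")` should give `"0"` after booting with the option above, and `None` if the option was not picked up.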
Comment 12 Gökçen Eraslan 2013-02-16 13:22:57 UTC
(In reply to comment #11)
> As it is a hard hang, it is probably not related to the pageflip->compiz/X
> freeze. Can you please try booting with i915.i915_enable_rc6=0 and see if
> that stops the hangs?

No, it didn't stop the hang; I was able to reproduce it with rc6 disabled. However, I then had to switch to Raring, since I was not able to use my desktop for more than 5 minutes.

I cannot reproduce the hang in Raring now (with Xorg server 1.13.2, mesa 9.0.2 and intel 2.21.2), but I haven't installed the xorg-edgers packages yet. I will report back after installing them.
Comment 13 Chris Wilson 2013-02-16 22:18:21 UTC
If not rc6, then that suggests one of the other workarounds (w/a), probably HiZ-related, as they could cause hangs iirc.
Comment 14 Charles Samuels 2013-03-24 19:32:04 UTC
I have this on Debian testing (and unstable): I added a few comments to bug #62507, but they seem to belong here.
Comment 15 Ben Widawsky 2013-06-04 20:12:26 UTC
(In reply to comment #14)
> I have this on Debian testing (and unstable): I added a few comments to bug
> #62507, but they seem to belong here.

Please check with the latest version of mesa to see if it is fixed by the recent HiZ fixes.
Comment 16 Chris Wilson 2013-06-12 09:31:28 UTC
Can you please try with this patch: https://patchwork.kernel.org/patch/2707341/ as it claims to fix some instability with rc6 on SandyBridge?
Comment 17 Charles Samuels 2013-06-15 21:31:15 UTC
I'm still seeing this on Debian 7.1. I have i915.i915_enable_rc6=0 i915.semaphores=1 set.
Comment 18 Daniel Vetter 2013-10-28 18:23:33 UTC
Please test Ken's snb blorp fixes from

http://cgit.freedesktop.org/~kwg/mesa/log/?h=snbfixes

Note that this is a mesa series, not kernel patches. But it could be that a GPU hang triggered by mesa is what leads to the hang you are seeing.
Comment 19 Jani Nikula 2014-08-14 13:29:53 UTC
Timeout. Please reopen if the problem persists with recent kernels/mesa.

