Bug 62443

Summary: [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
Product: DRI
Reporter: Franz Fellner <alpine.art.de>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED DUPLICATE
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal    
Priority: medium    
Version: XOrg git   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform:
i915 features:
Attachments:
Description                  Flags
dmesg                        none
i915_error_state             none
Xorg.0.log                   none
perf.data (xz compressed)    none
perf report                  none

Description Franz Fellner 2013-03-17 17:06:40 UTC
I get the above message from time to time, about once every 1-2 weeks. After that the desktop is extremely laggy: no vsync (moving windows moves several blocks), and scrolling is slow as hell. I only get normal behaviour back by restarting X.
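Since the hang only shows up every week or two, it can help to check a saved kernel log for the message automatically. The following is a minimal sketch, not part of the original report: the function name `find_gpu_hangs` is hypothetical, and the two marker substrings are taken from the summary line of this bug, so other kernel versions may word the message differently.

```python
from typing import Iterable, List

# Substrings copied from this bug's summary line (assumption: the kernel
# in use prints the same wording; other versions may differ).
HANG_MARKERS = ("i915_hangcheck_hung", "GPU hung")


def find_gpu_hangs(lines: Iterable[str]) -> List[str]:
    """Return the log lines that contain every hangcheck marker."""
    return [
        line.rstrip("\n")
        for line in lines
        if all(marker in line for marker in HANG_MARKERS)
    ]
```

For example, after saving the log with `dmesg > dmesg.txt`, calling `find_gpu_hangs(open("dmesg.txt"))` returns the matching lines, and an empty list means no hang was logged in that capture.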
Comment 1 Franz Fellner 2013-03-17 17:07:14 UTC
Created attachment 76656
dmesg
Comment 2 Franz Fellner 2013-03-17 17:08:28 UTC
Created attachment 76657
i915_error_state
Comment 3 Franz Fellner 2013-03-17 17:08:47 UTC
Created attachment 76658
Xorg.0.log
Comment 4 Chris Wilson 2013-03-17 18:49:52 UTC

*** This bug has been marked as a duplicate of bug 54226 ***
Comment 5 Chris Wilson 2013-03-17 18:54:39 UTC
Hmm, it really shouldn't be that noticeable after a GPU hang... unless you are using a compositor? Certain operations will be unavailable (accelerated GL, vsync, etc.), but everything else should fall back to a shadow buffer. Typical rendering may be an order of magnitude slower, but it shouldn't actually impact latency: moving windows and scrolling should still be crisp. So if you can, please grab a 'sudo perf record -f -g -a' after such an event.
Comment 6 Franz Fellner 2013-03-30 08:25:01 UTC
Created attachment 77229
perf.data (xz compressed)

It happened again, so here is the requested perf record.
And yes, I am using a compositor (kwin).
Comment 7 Chris Wilson 2013-03-30 21:23:02 UTC
You have to parse the perf.data locally so that it can resolve the symbols etc. Can you please do 'perf report -i /path/to/perf.data | head -1500'? Sorry for skipping that detail before.
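For anyone repeating this step with the xz-compressed attachment, the local workflow could be scripted roughly as below. This is a sketch under assumptions: the helper names `decompress_xz`, `head` and `perf_report` are hypothetical, `perf` must be installed on the machine that recorded the profile so symbols resolve, and the only perf invocation used is the `perf report -i <file>` form quoted above.

```python
import lzma
import shutil
import subprocess
from pathlib import Path


def decompress_xz(src: Path, dst: Path) -> Path:
    """Unpack an xz-compressed perf.data file to dst and return dst."""
    with lzma.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst


def head(text: str, limit: int = 1500) -> str:
    """Keep only the first `limit` lines, like piping through `head -1500`."""
    return "".join(text.splitlines(keepends=True)[:limit])


def perf_report(perf_data: Path, limit: int = 1500) -> str:
    """Run `perf report -i <file>` locally and return the trimmed text.

    Raises subprocess.CalledProcessError if perf exits with an error.
    """
    result = subprocess.run(
        ["perf", "report", "-i", str(perf_data)],
        capture_output=True,
        text=True,
        check=True,
    )
    return head(result.stdout, limit)
```

The trimmed text returned by `perf_report` is what would be attached to the bug, rather than the raw perf.data.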
Comment 8 Franz Fellner 2013-03-31 08:23:13 UTC
Created attachment 77242
perf report

85.26%                X  libc-2.15.so                           [.] __memcpy_ssse3_back

weird...

I also recorded a profile just now, while everything is fine (should I post that, too?). This is its top line:

5.09%          swapper  [kernel.kallsyms]              [k] mwait_idle_with_hints
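
The two top lines above can be compared mechanically. The sketch below is an illustration only: it assumes each report line has the four-column shape shown in this comment (overhead, command, shared object, symbol), that the command name contains no spaces, and the names `PerfEntry`, `parse_line` and `hot_symbols` are hypothetical. Other perf versions or options may print a different layout.

```python
import re
from typing import List, NamedTuple, Optional


class PerfEntry(NamedTuple):
    overhead: float  # percentage of samples
    command: str     # process name, e.g. "X"
    dso: str         # shared object, e.g. "libc-2.15.so"
    level: str       # "." for user space, "k" for kernel
    symbol: str      # resolved function name


# Matches lines such as:
#   85.26%   X  libc-2.15.so   [.] __memcpy_ssse3_back
LINE_RE = re.compile(
    r"^\s*(?P<overhead>\d+(?:\.\d+)?)%\s+"
    r"(?P<command>\S+)\s+"
    r"(?P<dso>\S+)\s+"
    r"\[(?P<level>.)\]\s+"
    r"(?P<symbol>.+?)\s*$"
)


def parse_line(line: str) -> Optional[PerfEntry]:
    """Parse one report line; return None for headers and other text."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    return PerfEntry(
        float(m.group("overhead")),
        m.group("command"),
        m.group("dso"),
        m.group("level"),
        m.group("symbol"),
    )


def hot_symbols(report: str, threshold: float = 50.0) -> List[PerfEntry]:
    """Return the entries whose overhead is at or above threshold percent."""
    entries = (parse_line(line) for line in report.splitlines())
    return [e for e in entries if e is not None and e.overhead >= threshold]
```

Run over the report from the laggy session, `hot_symbols` would single out the `__memcpy_ssse3_back` entry in X, while the idle-system line at 5.09% falls below the default threshold.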
Comment 9 Chris Wilson 2013-03-31 11:03:59 UTC
I was hoping for just a little more information from the stacktraces. The most obvious cause for the GTT reads would be the DRI2 copies, but having that confirmed would have been useful. However, those cannot be eliminated due to the API constraints. So other than working around the broken hw, we also need to prevent the false positive EIO - which should be fixed in v3.9.
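
To see whether a given machine is already on a kernel at or past the release mentioned above, the running version can be compared against 3.9. This is a sketch, not an authoritative check: the comment only says the false positive EIO "should be fixed in v3.9", distribution kernels may carry backports or lack the fix, and the names `kernel_version` and `has_expected_fix` are hypothetical.

```python
import platform
import re
from typing import Tuple


def kernel_version(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a release string like '3.8.0-19-generic'."""
    m = re.match(r"(\d+)\.(\d+)", release)
    if m is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(m.group(1)), int(m.group(2))


def has_expected_fix(release: str, fixed_in: Tuple[int, int] = (3, 9)) -> bool:
    """True if the release is at or past the version named in the comment."""
    return kernel_version(release) >= fixed_in
```

On the affected machine this would be called as `has_expected_fix(platform.release())`.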
