Bug 62443 - [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
Summary: [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
Status: CLOSED DUPLICATE of bug 54226
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-03-17 17:06 UTC by Franz Fellner
Modified: 2017-07-24 22:58 UTC (History)
0 users

See Also:
i915 platform:
i915 features:


Attachments
dmesg (35.84 KB, text/plain)
2013-03-17 17:07 UTC, Franz Fellner
i915_error_state (2.32 MB, text/plain)
2013-03-17 17:08 UTC, Franz Fellner
Xorg.0.log (51.97 KB, text/plain)
2013-03-17 17:08 UTC, Franz Fellner
perf.data (xz compressed) (524.70 KB, application/x-xz)
2013-03-30 08:25 UTC, Franz Fellner
perf report (107.86 KB, text/plain)
2013-03-31 08:23 UTC, Franz Fellner

Description Franz Fellner 2013-03-17 17:06:40 UTC
I get the above message from time to time, roughly once every 1-2 weeks. Afterwards the desktop is extremely laggy: there is no vsync (a window being moved is drawn in several blocks) and scrolling is very slow. Normal behaviour only returns after restarting X.
Comment 1 Franz Fellner 2013-03-17 17:07:14 UTC
Created attachment 76656
dmesg
Comment 2 Franz Fellner 2013-03-17 17:08:28 UTC
Created attachment 76657
i915_error_state
Comment 3 Franz Fellner 2013-03-17 17:08:47 UTC
Created attachment 76658
Xorg.0.log
Comment 4 Chris Wilson 2013-03-17 18:49:52 UTC

*** This bug has been marked as a duplicate of bug 54226 ***
Comment 5 Chris Wilson 2013-03-17 18:54:39 UTC
Hmm, it really shouldn't be that noticeable after a GPU hang... unless you are using a compositor? Certain operations will be unavailable (accelerated GL, vsync, etc.), but everything else should fall back to a shadow buffer. Typical rendering may be an order of magnitude slower, but it shouldn't actually affect latency: moving windows and scrolling should still be crisp. So if you can, please grab a 'sudo perf record -f -g -a' after such an event.
Comment 6 Franz Fellner 2013-03-30 08:25:01 UTC
Created attachment 77229
perf.data (xz compressed)

It happened again, so here is the requested perf record.
And yes, I am using a compositor (kwin).
Comment 7 Chris Wilson 2013-03-30 21:23:02 UTC
You have to parse the perf.data locally so that it can resolve the symbols etc. Can you please do 'perf report -i /path/to/perf.data | head -1500'? Sorry for skipping that detail before.
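For reference, the local post-processing step requested here can also be scripted. The sketch below is in Python and assumes `perf` is installed on the machine that recorded the profile. `report_command`, `first_lines` and `trimmed_report` are hypothetical helper names, and `--stdio` is an addition not in the original command, used so that perf prints plain text instead of opening its interactive viewer.

```python
import subprocess
from typing import List


def report_command(perf_data: str) -> List[str]:
    """Build the argv for a plain-text perf report of the given file."""
    return ["perf", "report", "--stdio", "-i", perf_data]


def first_lines(text: str, limit: int = 1500) -> str:
    """Keep only the first `limit` lines, like `head -1500`."""
    return "".join(text.splitlines(keepends=True)[:limit])


def trimmed_report(perf_data: str, limit: int = 1500) -> str:
    """Run perf on the recording machine so symbols resolve, then trim."""
    result = subprocess.run(
        report_command(perf_data),
        capture_output=True,
        text=True,
        check=True,
    )
    return first_lines(result.stdout, limit)
```

Calling `print(trimmed_report("/path/to/perf.data"))` would then produce text suitable for attaching to the bug.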
Comment 8 Franz Fellner 2013-03-31 08:23:13 UTC
Created attachment 77242
perf report

85.26%                X  libc-2.15.so                           [.] __memcpy_ssse3_back

weird...

I also recorded a profile just now, while everything is fine (should I post that, too?). This is its top line:

5.09%          swapper  [kernel.kallsyms]              [k] mwait_idle_with_hints
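The two top lines quoted in this comment can be compared mechanically. A minimal sketch in Python, assuming the plain-text `perf report` layout seen here (overhead percentage, command, shared object, a bracketed privilege marker, then the symbol). The names `PerfEntry` and `parse_perf_line` are made up for illustration and are not part of perf, and commands containing spaces are not handled.

```python
import re
from typing import NamedTuple, Optional


class PerfEntry(NamedTuple):
    overhead: float      # percentage of samples
    command: str         # process name, e.g. "X"
    shared_object: str   # library or kernel image
    level: str           # "." = user space, "k" = kernel
    symbol: str


# Assumed layout: "<pct>%  <command>  <dso>  [<level>] <symbol>"
_LINE = re.compile(
    r"^\s*(?P<pct>\d+(?:\.\d+)?)%\s+(?P<cmd>\S+)\s+(?P<dso>\S+)\s+"
    r"\[(?P<lvl>.)\]\s+(?P<sym>.+?)\s*$"
)


def parse_perf_line(line: str) -> Optional[PerfEntry]:
    """Return the parsed entry, or None for headers and call-graph lines."""
    m = _LINE.match(line)
    if m is None:
        return None
    return PerfEntry(float(m["pct"]), m["cmd"], m["dso"], m["lvl"], m["sym"])


# The two lines quoted in this comment: after the hang and during normal use.
hung = parse_perf_line(
    "85.26%                X  libc-2.15.so"
    "                           [.] __memcpy_ssse3_back"
)
idle = parse_perf_line(
    "5.09%          swapper  [kernel.kallsyms]"
    "              [k] mwait_idle_with_hints"
)
```

After the hang, the dominant entry is a user-space `memcpy` inside the X server, whereas the healthy profile is led by the kernel idle routine.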
Comment 9 Chris Wilson 2013-03-31 11:03:59 UTC
I was hoping for just a little more information from the stack traces. The most obvious cause of the GTT reads would be the DRI2 copies, and having that confirmed would have been useful. However, those copies cannot be eliminated because of API constraints. So besides working around the broken hw, we also need to prevent the false-positive EIO, which should be fixed in v3.9.

