Bug 111384 - [BXT/Iris] (recoverable) GPU hang in SynMark compute CSCloth
Summary: [BXT/Iris] (recoverable) GPU hang in SynMark compute CSCloth
Status: VERIFIED WORKSFORME
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Gallium/Iris
Version: git
Hardware: Other All
Importance: medium normal
Assignee: Intel 3D Bugs Mailing List
QA Contact: Intel 3D Bugs Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: mesa-19.2
Reported: 2019-08-12 14:24 UTC by Eero Tamminen
Modified: 2019-09-05 16:12 UTC (History)
0 users

See Also:
i915 platform:
i915 features:


Attachments
i915 error state for GPU hang (4.49 KB, text/plain)
2019-08-12 14:24 UTC, Eero Tamminen

Description Eero Tamminen 2019-08-12 14:24:59 UTC
Created attachment 145036
i915 error state for GPU hang

Setup:
- BXT J4205
- ClearLinux (30730)
- drm-tip git kernel (0330b51e91)
- Mesa git (5ed4e31c08d)
- Weston git build

Test-case:
- 3x fullscreen FullHD SynMark CSCloth compute test-case (Wayland version):
  synmark2 OglCSCloth

Actual outcome:
- Recoverable GPU hang:
-------------------------------
[ 8477.103209] i915 0000:00:02.0: GPU HANG: ecode 9:1:0x00000000, hang on rcs0
[ 8477.103216] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 8477.103217] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 8477.103219] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 8477.103220] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 8477.103222] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 8477.104241] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[ 8477.105009] [drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
-------------------------------
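
For triaging kernel logs like the one above, a minimal Python sketch that pulls the hang lines out of dmesg-style text might look like this. The regular expression is derived only from the single "GPU HANG" line quoted in this report, and the function name `find_gpu_hangs` is hypothetical; other kernel versions may format the message differently.

```python
import re

# Pattern modelled on the one "GPU HANG" line quoted in this report
# (assumption: other i915 versions may word the message differently).
HANG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+i915 (?P<pci>[0-9a-f:.]+): GPU HANG: "
    r"ecode (?P<ecode>\S+), hang on (?P<engine>\w+)"
)


def find_gpu_hangs(log_text):
    """Return one dict per 'GPU HANG' line found in dmesg-style text."""
    hangs = []
    for line in log_text.splitlines():
        match = HANG_RE.search(line)
        if match:
            hangs.append(
                {
                    "timestamp": float(match.group("ts")),
                    "device": match.group("pci"),
                    "ecode": match.group("ecode"),
                    "engine": match.group("engine"),
                }
            )
    return hangs
```

Feeding it the saved output of `dmesg` gives a list that can be compared across test runs, for example to see which engine hung and when.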

This is the faster of the 2 SynMark compute tests.

After a reboot, I wasn't able to reproduce the hang by re-running just CSCloth 10 times, so it may depend on the preceding tests, or it is just very hard to reproduce.
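
A repeat-run attempt like the one described can be sketched in Python as follows. This is only an illustration: `run_until_hang`, `has_error_state` and `repro_cscloth` are hypothetical names, and it assumes that the error state file named in the kernel log reads "No error state collected" when no hang has been recorded (reading it usually requires root). It runs a single test instance per iteration, not the 3 parallel fullscreen instances of the original test-case.

```python
import subprocess
from pathlib import Path

# Path taken from the kernel log in this report.
ERROR_STATE = Path("/sys/class/drm/card0/error")
# Assumption: text i915 reports while no error state has been captured.
NO_ERROR_MARKER = "No error state collected"


def has_error_state(text):
    """True if the error state text looks like a captured GPU hang."""
    stripped = text.strip()
    return bool(stripped) and not stripped.startswith(NO_ERROR_MARKER)


def run_until_hang(run_once, read_state, max_runs=10):
    """Run the test up to max_runs times; return the run number that
    left an error state behind, or None if no hang was recorded."""
    for attempt in range(1, max_runs + 1):
        run_once()
        if has_error_state(read_state()):
            return attempt
    return None


def repro_cscloth(max_runs=10):
    """Wire the loop to the test command quoted in this report."""
    return run_until_hang(
        run_once=lambda: subprocess.run(["synmark2", "OglCSCloth"], check=False),
        read_state=lambda: ERROR_STATE.read_text(errors="replace"),
        max_runs=max_runs,
    )
```

Calling `repro_cscloth(10)` mirrors the 10 reruns mentioned above. Passing the runner and the reader in as callables keeps the loop testable on a machine without the benchmark or the GPU.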

I haven't seen such hangs with the i965 driver, and this test didn't hang on SKL GT2 (another one did). I didn't see such a hang when running a similar test set a month ago, so it may be a regression.
Comment 1 Eero Tamminen 2019-08-29 12:37:37 UTC
I haven't seen these hangs since, but that setup doesn't have automated runs, so I don't have enough data points to conclude anything (currently Weston bug #273 is also preventing semi-automated testing with the latest gfx stack).

I'm fine with this being closed as WORKSFORME once bug 111385 is fixed or after a few weeks (whichever comes first), provided I haven't seen it again.
Comment 2 Eero Tamminen 2019-09-05 16:12:00 UTC
In the last ~2 weeks (3 test runs on most days), the only (recoverable) Iris GPU hang on BXT was one hang in Manhattan 3.1, with nothing in CSCloth -> WORKSFORME.

