Summary: | [GEN9] random/rare GPU hangs in tessellation tests | | |
---|---|---|---|
Product: | Mesa | Reporter: | Eero Tamminen <eero.t.tamminen> |
Component: | Drivers/DRI/i965 | Assignee: | Intel 3D Bugs Mailing List <intel-3d-bugs> |
Status: | VERIFIED WORKSFORME | QA Contact: | Intel 3D Bugs Mailing List <intel-3d-bugs> |
Severity: | normal | | |
Priority: | medium | | |
Version: | git | | |
Hardware: | Other | | |
OS: | All | | |
Whiteboard: | | | |
i915 platform: | | i915 features: | |
Attachments: | SKL GT4e SynMark OglTerrainFlyTess hang error state | | |
Description
Eero Tamminen
2019-03-15 12:02:28 UTC
(In reply to Eero Tamminen from comment #0)
> No idea whether these are related to compute hangs bug 108820, or Heaven
> hang bug 103556.

There were Heaven hangs during weekend, but no tessellation test hangs. If I don't happen to notice more of these by end of the month, I'll close this as WORKSFORME.

(In reply to Eero Tamminen from comment #0)
> Created attachment 143679 [details]
> SKL GT4e SynMark OglTerrainFlyTess hang error state
>
> Setup:
> * Ubuntu 18.04
> * v5.0+ drm-tip kernel & git version of Xserver
> * Mesa git version
>
> In last few days I've seen couple of random GPU hangs in tessellation
> related tests:
>
> - on SKL GT4e, recoverable one once in SynMark2 v7 OglTerrainFlyTess, and
> once in GfxBench v5 GL Aztec Ruins normal (does also lot of other things
> besides tessellation)
> - one system hang in GfxBench tessellation test on KBL GT2 day before
>
> It's possible that first item is related to starting to use Weston/XWayland
> instead of normal X:
> ----------------------------------------------------
> [ 8231.866172] i915 0000:00:02.0: GPU HANG: ecode 9:1:0xfffffffe, in [0],
> hang on rcs0
> [ 8231.866174] [drm] GPU hangs can indicate a bug anywhere in the entire gfx
> stack, including userspace.
> [ 8231.866174] [drm] Please file a _new_ bug report on bugs.freedesktop.org
> against DRI -> DRM/Intel
> [ 8231.866175] [drm] drm/i915 developers can then reassign to the right
> component if it's not a kernel issue.
> [ 8231.866175] [drm] The gpu crash dump is required to analyze gpu hangs, so
> please always attach it.
> [ 8231.866175] [drm] GPU crash dump saved to /sys/class/drm/card0/error
> [ 8231.867183] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
> [ 8239.858844] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
> [ 8243.313841] Asynchronous wait on fence i915:weston[643]/1:5eb9e timed out
> (hint:intel_atomic_commit_ready+0x0/0x54 [i915])
> [ 8247.858844] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
> ----------------------------------------------------
>
> See attachment for error state.
>
> (Another possibility could be that those started (to be more visible?) after
> the "intel/nir: Vectorize all IO" fix to bug 107510, as that improved
> tessellation tests.)
>
> Note: I don't actively read dmesg output, so I may have missed most of the
> recoverable GPU hangs unless they've been serious enough to hang the system,
> fail the test, or at least slow it down enough to significantly impact
> performance. I'll add some better tracking for that.
>
> No idea whether these are related to compute hangs bug 108820, or Heaven
> hang bug 103556.

This error state has consistent HS/TE/DS stage programming so it would seem to be a different issue from the unigine bug.

(In reply to Lionel Landwerlin from comment #2)
> This error state has consistent HS/TE/DS stage programming so it would seem
> to be a different issue from the unigine bug.

Thanks!

I have now tracking for GPU resets, and I haven't seen any tessellation test hangs since I filed this bug (only bug 108820 & bug 103556 hangs). If they don't appear by next week, I'll close this as WORKSFORME.

(In reply to Eero Tamminen from comment #3)
> I have now tracking for GPU resets, and I haven't seen any tessellation test
> hangs since I filed this bug (only bug 108820 & bug 103556 hangs). If they
> don't appear by next week, I'll close this as WORKSFORME.

-> WORKSFORME.

I'm seeing (recoverable) GEN9+ GPU hangs only in Manhattan 3.1, CarChase and AztecRuins (and all of those could be compute issues, bug 108820), not in the tests listed in this bug.
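
The comments in this report mention adding tracking for GPU resets but do not show how it was done. As an illustration only, here is a minimal Python sketch of one way to count hang and reset messages in kernel log text. The function name `count_gpu_events`, the two regular expressions and the `SAMPLE` text are assumptions made for this example; the patterns are derived solely from the dmesg excerpt quoted in this report (with the wrapped lines rejoined), and other kernel versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns derived from the dmesg excerpt in this report (assumption: the
# message wording is the same on the kernel being monitored).
HANG_RE = re.compile(r"GPU HANG: ecode (?P<ecode>[^,\s]+),.*hang on (?P<engine>\w+)")
RESET_RE = re.compile(r"Resetting (?P<engine>\w+) for hang on \w+")


def count_gpu_events(log_text):
    """Return (hangs, resets) as Counters keyed by engine name."""
    hangs, resets = Counter(), Counter()
    for line in log_text.splitlines():
        match = HANG_RE.search(line)
        if match:
            hangs[match.group("engine")] += 1
            continue
        match = RESET_RE.search(line)
        if match:
            resets[match.group("engine")] += 1
    return hangs, resets


# Log lines copied from this report, with the quote wrapping undone.
SAMPLE = """\
[ 8231.866172] i915 0000:00:02.0: GPU HANG: ecode 9:1:0xfffffffe, in [0], hang on rcs0
[ 8231.866175] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 8231.867183] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[ 8239.858844] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[ 8243.313841] Asynchronous wait on fence i915:weston[643]/1:5eb9e timed out (hint:intel_atomic_commit_ready+0x0/0x54 [i915])
[ 8247.858844] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
"""

if __name__ == "__main__":
    hangs, resets = count_gpu_events(SAMPLE)
    print("hangs:", dict(hangs))
    print("resets:", dict(resets))
```

In a test harness, the same function could be given the text of `dmesg` captured before and after each benchmark run, so that recoverable hangs are noticed even when the test itself passes.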