Bug 112004

Summary: [ICL] Recoverable GPU hangs on ICL when running vulkan CTS on 5.3 mainline kernel
Product: DRI
Reporter: Clayton Craft <clayton.a.craft>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: RESOLVED MOVED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: major
Priority: high
CC: caio.oliveira, intel-gfx-bugs
Version: unspecified
Hardware: Other
OS: All
Whiteboard: Triaged, ReadyForDev
i915 platform: ICL
i915 features: GPU hang
Attachments: card error state (flags: none)

Description Clayton Craft 2019-10-14 20:20:16 UTC
Created attachment 145738: card error state

In the Mesa CI there are regular GPU hangs on ICL when running the Vulkan CTS, e.g. https://mesa-ci.01.org/vulkancts_daily/builds/1590/group/63a9f0ea7bb98050796b649e85481845#fails

I've not been able to isolate which CTS test(s) cause the hangs, and I cannot reproduce the hangs when running tests in isolation, one at a time. They only seem to happen when tests are running concurrently on the system.

This is on the 5.3 mainline kernel, not drm-tip.
Comment 1 Chris Wilson 2019-10-14 20:39:32 UTC
IPEHR: 0xfffff080

It ate some garbage that is, I suspect, the scratch address for a pipecontrol.

HEAD:  0x0000243c

is well advanced into the ring, and has not yet wrapped, so a bug in ringbuffer management is unlikely.

Yet it ended up seeing a corrupt command stream. Mysteries.
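
For readers following along, a rough sketch of why the IPEHR value above reads as garbage rather than as a command header. It assumes the usual i915 convention that bits 31:29 of a command header dword select the instruction client (0 for MI, 2 for the blitter, 3 for render/media). The helper names instr_client and describe are made up for this illustration and are not part of any driver or tool.

```python
# Assumption: bits 31:29 of a command header dword are the instruction client.
INSTR_CLIENT_SHIFT = 29
INSTR_CLIENT_MASK = 0x7

# Assumption: the client values commonly used by i915 command streams.
KNOWN_CLIENTS = {0x0: "MI", 0x2: "blitter", 0x3: "render/media"}


def instr_client(header):
    """Return the 3-bit client field from a 32-bit command header."""
    return (header >> INSTR_CLIENT_SHIFT) & INSTR_CLIENT_MASK


def describe(header):
    """Format the header with its client field (hypothetical helper)."""
    client = instr_client(header)
    name = KNOWN_CLIENTS.get(client, "unknown")
    return f"0x{header:08x}: client {client} ({name})"


if __name__ == "__main__":
    # IPEHR value quoted in this comment.
    print(describe(0xFFFFF080))
```

Under those assumptions the quoted IPEHR decodes to client 7, which matches none of the listed clients. That is consistent with the command streamer having parsed a data value, such as an address, as if it were an instruction.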
Comment 2 Francesco Balestrieri 2019-11-11 07:24:09 UTC
Clayton, sorry for the lack of follow-up. Do you still observe the same behaviour or is there any new information?
Comment 3 Francesco Balestrieri 2019-11-11 09:57:34 UTC
Clayton, any chance you could try this with drm-tip? Or if that's not possible, share instructions on how we could try to reproduce?
Comment 4 Martin Peres 2019-11-29 19:40:05 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/intel/issues/506.