Created attachment 145738
card error state
In the Mesa CI there are regular GPU hangs on ICL when running the Vulkan CTS, e.g. https://mesa-ci.01.org/vulkancts_daily/builds/1590/group/63a9f0ea7bb98050796b649e85481845#fails
I've not been able to isolate which CTS test(s) cause the hangs, and I cannot reproduce them when running tests in isolation, one at a time. They only seem to happen when tests are running concurrently on the system.
This is on the 5.3 mainline kernel, not drm-tip.
It ate some garbage that I suspect is the scratch address for a pipecontrol.
is well advanced into the ring, and has not yet wrapped, so unlikely a bug in ringbuffer management.
Yet it ended up seeing a corrupt command stream. Mysteries.
Clayton, sorry for the lack of follow-up. Do you still observe the same behaviour or is there any new information?
Clayton, any chance you could try this with drm-tip? Or if that's not possible, share instructions on how we could try to reproduce?