Summary: | [KBL] reset request times out (GPU reset failure) | ||
---|---|---|---|
Product: | DRI | Reporter: | Eero Tamminen <eero.t.tamminen> |
Component: | DRM/Intel | Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Status: | CLOSED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Severity: | normal | ||
Priority: | medium | CC: | bjorn, intel-gfx-bugs |
Version: | DRI git | ||
Hardware: | Other | ||
OS: | All | ||
Whiteboard: | ReadyForDev | ||
i915 platform: | KBL | i915 features: | GPU hang |
Attachments: |
Description
Eero Tamminen
2016-11-16 16:23:47 UTC
They are all secondary effects of the GPU not resetting.

That device has managed to run the full test set to the end only twice before this, in early September and on the 27th of October (the latter had the same X, Intel DDX and Mesa as the version which has these extra symptoms). In both of these cases, there was a hang with CSDof and a GPU reset failure. However, there were no repeated hang resets and the GPU was completely idle after the tests had finished.

Eero, can you also attach the error dump?

Eero, can you have a try with Chris' patch: https://patchwork.freedesktop.org/series/15471/ ?

Created attachment 128054 [details]
Last error state from build where there were no repeated resets
Created attachment 128055 [details]
Last error state from build with the repeated resets
Created attachment 128056 [details]
Last error state from build with the repeated resets, using newer mesa git
This one uses:
- kernel: 04145fe15cf8c81c221e62fc9d65d93053f9bd1a
- mesa: 341fc0073a3c05fd43e9c7a33613bcb881f25f33
(In reply to yann from comment #4)
> Eero, can you have a try with Chris' patch:
> https://patchwork.freedesktop.org/series/15471/ ?

Didn't help. It still gets recurring hangs after the test-case stops.

Valtteri came up with a test-case that triggers the issue within a few minutes:
------ hang.sh --------
#!/bin/sh
for i in $(seq $1); do
    ./synmark2 OglBatch0 &
    sleep 2
    killall synmark2
done
-----------------------
$ ./hang.sh 100
-----------------------
(Mika's now looking into the issue.)

This does not appear on other gt3 boxes? I suggest we close this and reopen if it does. Eero?

(In reply to Mika Kuoppala from comment #9)
> This does not appear on other gt3 boxes? I suggest we close this and reopen
> if it does. Eero?

If you refer to reset request timeouts or higher power usage due to GPU reset failing completely, I haven't seen those on any HW in the last couple of weeks. But we no longer have the KBL-U QL9J machine in regular testing.

(There have been system hangs on the same CarChase offscreen tests on SKL GT2 & BXT, but I guess that's a different issue.)

Something similar may now be happening on SKL GT2, since yesterday. After the GFXBench CarChase tests (which often hang the GPU), all tests fail. However, I don't have logs, as Jenkins times out the test-run and reboots into another test-run.

I haven't seen reset request timeout errors this year, so I think this can be closed.

BXT J4205 had higher power consumption after all tests had been run (and CarChase offscreen had hung as earlier) on 3 days around May 7th, but no reset timeouts, so it's a different issue. I didn't see anything similar on other devices in the last 2 months (or when using newer Mesa, which no longer triggers the hangs so frequently).

Haven't seen this in a long time, so marking it as fixed.
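For anyone wanting to retry the hang.sh reproducer from the earlier comment with a different workload, the same start/sleep/kill loop could be sketched as a small function. This is only a sketch under assumptions: the function name start_kill_loop and its arguments are hypothetical, and it kills the background job by PID instead of using killall, which may behave differently if the workload spawns child processes.

```shell
#!/bin/sh
# Sketch of the start/sleep/kill loop from hang.sh, generalized.
# start_kill_loop is a hypothetical helper, not part of the original report.
# Usage: start_kill_loop COUNT RUN_SECS COMMAND [ARGS...]
start_kill_loop() {
    count=$1
    run_secs=$2
    shift 2
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@" &                           # start the workload in the background
        pid=$!
        sleep "$run_secs"                # let it submit work for a while
        kill "$pid" 2>/dev/null || true  # kill it mid-run (by PID, not killall)
        wait "$pid" 2>/dev/null || true  # reap it before the next cycle
        i=$((i + 1))
    done
    echo "completed $i start/kill cycles"
}
```

Assuming synmark2 is in the current directory as in the original script, `start_kill_loop 100 2 ./synmark2 OglBatch0` would correspond to `./hang.sh 100`.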