Created attachment 63205 [details]
running hangman debug info

System Environment:
--------------------------
Platform: Ivybridge
Kernel: (drm-intel-next-queued)33ee6d190ce8e4c33a7caf7d75618feb97936517

Bug detailed description:
-------------------------
Running the i-g-t tool ZZ_hangman causes "*ERROR* render ring initialization failed ctl 00000000 head 00000000 tail 00000000 start 00000000".
I can reproduce this on my ivb, but it takes a few loops of ZZ_hangman. How reliably can you hit this?
(In reply to comment #1)
> I can reproduce this on my ivb, but it takes a few loops of ZZ_hangman. How
> reliably can you hit this?

Every run of hangman triggers this error.
I've noticed that disabling rc6 with i915.i915_enable_rc6=0 makes ZZ_hangman completely stable for me on both ivb&snb, even when I run it in a loop. Note that you need to have a sleep 10 in that loop, otherwise the kernel complains about the gpu hanging too fast and stops accepting batchbuffer commands. I usually do

while tests/ZZ_hangman; do sleep 10; done

that will stop as soon as the gpu died. Can you confirm that disabling rc6 makes gpu reset stable for you, too?
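For anyone who wants to bound the number of iterations, the loop from the comment above could be wrapped roughly as follows. This is only a sketch: `loop_until_fail` and its arguments are hypothetical names made up for illustration, not part of i-g-t, and it assumes the test exits non-zero once the gpu has died.

```shell
#!/bin/sh
# Hypothetical helper (not part of i-g-t): rerun a test command until it
# fails, pausing between runs so the kernel does not see hangs too quickly.
loop_until_fail() {
    test_cmd=$1          # command to run, e.g. tests/ZZ_hangman
    pause=${2:-10}       # seconds to sleep between runs (10 as in the comment)
    max_runs=${3:-0}     # stop after this many passing runs; 0 means no limit
    runs=0
    while "$test_cmd"; do
        runs=$((runs + 1))
        if [ "$max_runs" -gt 0 ] && [ "$runs" -ge "$max_runs" ]; then
            echo "stopped after $runs passing runs"
            return 0
        fi
        sleep "$pause"
    done
    # The test command returned non-zero, so the loop ends here.
    echo "failed after $runs passing runs"
    return 1
}

# Example invocation (assumed path, run from the i-g-t tree):
#   loop_until_fail tests/ZZ_hangman 10
```

With no third argument it behaves like the one-liner and only stops when the test fails.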
(In reply to comment #3)
> I've noticed that disabling rc6 with i915.i915_enable_rc6=0 makes ZZ_hangman
> completely stable for me on both ivb&snb, even when I run it in a loop. Note
> that you need to have a sleep 10 in that loop, otherwise the kernel complains
> about the gpu hanging too fast and stops accepting batchbuffer commands. I
> usually do
>
> while tests/ZZ_hangman; do sleep 10; done
>
> that will stop as soon as the gpu died. Can you confirm that disabling rc6
> makes gpu reset stable for you, too?

Yes, with rc6 disabled the gpu reset becomes stable. The dmesg shows:

[ 302.188495] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[ 302.188551] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
[ 302.191248] [drm] Enabling RC6 states: RC6 off, RC6p off, RC6pp off
I've created some patches to make the gpu reset more stable. Can you please test my reset-fail git branch from my personal git repo? http://cgit.freedesktop.org/~danvet/drm/log/?h=reset-fail
Created attachment 63426 [details]
running hangman with reset-fail's debug info

(In reply to comment #5)
> I've created some patches to make the gpu reset more stable. Can you please
> test my reset-fail git branch from my personal git repo?
>
> http://cgit.freedesktop.org/~danvet/drm/log/?h=reset-fail

I tested reset-fail at its latest commit:
Kernel: (reset-fail)aefaf55d4cbb279d5029fdaf428edd22a83f575f
The dmesg is attached.
Looks like it works now with -fixes merged in. Can you confirm that the gpu works after running the hangman test (i.e. running gl apps doesn't crash it)?
(In reply to comment #7)
> Looks like it works now with -fixes merged in. Can you confirm that the gpu
> works after running the hangman test (i.e. running gl apps doesn't crash it)?

I ran glxgears after the hangman test, and it works well.
Ok, patches are now all merged to -queued.
Closing old verified.