| Summary: | [CI][SHARDS] igt@kms_frontbuffer_tracking@psr-modesetfrombusy - dmesg-warn - GEM_BUG_ON(((tail) & ~((__typeof__(tail))((64)-1))) == ((ring->head) & ~((__typeof__(ring->head))((64)-1))) && tail < ring->head) | | |
|---|---|---|---|
| Product: | DRI | Reporter: | Lakshmi <lakshminarayana.vudum> |
| Component: | DRM/Intel | Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
| Status: | RESOLVED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
| Severity: | not set | | |
| Priority: | highest | CC: | intel-gfx-bugs |
| Version: | DRI git | | |
| Hardware: | Other | | |
| OS: | All | | |
| Whiteboard: | | | |
| i915 platform: | SKL | i915 features: | GEM/Other |
Description
Lakshmi 2019-10-28 16:38:12 UTC

The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* SKL: igt@kms_frontbuffer_tracking@psr-modesetfrombusy - dmesg-warn - GEM_BUG_ON(((tail) & ~((__typeof__(tail))((64)-1))) == ((ring->head) & ~((__typeof__(ring->head))((64)-1))) && tail < ring->head)
  https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7181/shard-skl5/igt@kms_frontbuffer_tracking@psr-modesetfrombusy.html

At least this has happened more than once, so we may be in luck and be able to add some more debug.

Same run:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7181/shard-apl6/pstore15-1572035145_Panic_2.log
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7181/shard-glk2/pstore22-1572035113_Panic_2.log
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7181/shard-kbl3/pstore20-1572041883_Panic_2.log

<0>[ 2016.271448] <idle>-0 0..s1 2012175800us : process_csb: rcs0 cs-irq head=0, tail=1
<0>[ 2016.271485] <idle>-0 0..s1 2012175800us : process_csb: rcs0 csb[1]: status=0x00000882:0x00000060
<0>[ 2016.271524] <idle>-0 0..s1 2012175801us : trace_ports: rcs0: preempted { 3d30:32, 0:0 }
<0>[ 2016.271561] gem_exec-8517 1d..1 2012175801us : __execlists_submission_tasklet: vcs1: queue_priority_hint:-2147483648, submit:yes
<0>[ 2016.271601] <idle>-0 0..s1 2012175801us : process_csb: reset_active(rcs0): { rq=3d30:32 }
<0>[ 2016.271639] gem_exec-8517 1d..1 2012175802us : trace_ports: vcs1: submit { a:428, 0:0 }
<0>[ 2016.271683] gem_exec-8517 1.... 2012175811us : __i915_request_commit: vecs0 fence c:608
<0>[ 2016.271721] <idle>-0 0..s1 2012175811us : trace_ports: rcs0: promote { 4:9442!, 0:0 }
<0>[ 2016.271765] <idle>-0 0d.s2 2012175813us : __i915_request_submit: rcs0 fence 3d30:12, current 10
<0>[ 2016.271810] <idle>-0 0d.s2 2012175820us : __i915_request_submit: rcs0 fence 3d30:14, current 10
<0>[ 2016.271855] <idle>-0 0d.s2 2012175825us : __i915_request_submit: rcs0 fence 3d30:16, current 10
<0>[ 2016.271900] gem_exec-8517 1d..1 2012175825us : __i915_request_submit: vecs0 fence c:608, current 606
<0>[ 2016.271944] <idle>-0 0d.s2 2012175829us : __i915_request_submit: rcs0 fence 3d30:18, current 10
<0>[ 2016.271985] gem_exec-8517 1d..1 2012175831us : __execlists_submission_tasklet: vecs0: queue_priority_hint:-2147483648, submit:yes
<0>[ 2016.272027] gem_exec-8517 1d..1 2012175832us : trace_ports: vecs0: submit { c:608, 0:0 }
<0>[ 2016.272070] <idle>-0 0d.s2 2012175833us : __i915_request_submit: rcs0 fence 3d30:20, current 10
<0>[ 2016.272115] <idle>-0 0d.s2 2012175837us : __i915_request_submit: rcs0 fence 3d30:22, current 10
<0>[ 2016.272159] <idle>-0 0d.s2 2012175840us : __i915_request_submit: rcs0 fence 3d30:24, current 10
<0>[ 2016.272203] <idle>-0 0d.s2 2012175843us : __i915_request_submit: rcs0 fence 3d30:26, current 10
<0>[ 2016.272247] <idle>-0 0d.s2 2012175846us : __i915_request_submit: rcs0 fence 3d30:28, current 10
<0>[ 2016.272292] <idle>-0 0d.s2 2012175850us : __i915_request_submit: rcs0 fence 3d30:30, current 10
<0>[ 2016.272331] gem_exec-8517 1.... 2012176048us : __intel_context_do_pin: rcs0 context:3d2e pin ring:{head:0000, tail:0000}
<0>[ 2016.272370] gem_exec-8517 1.... 2012176053us : intel_context_unpin: rcs0 context:3d2e retire
<0>[ 2016.272409] <idle>-0 0d.s2 2012176132us : assert_ring_tail_valid.part.38: assert_ring_tail_valid:101 GEM_BUG_ON(((tail) & ~((__typeof__(tail))((64)-1))) == ((ring->head) & ~((__typeof__(ring->head))((64)-1))) && tail < ring->head)
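For readability, the GEM_BUG_ON() condition above is just a cacheline-overlap check on the ring offsets. Below is a minimal standalone sketch of that check, a simplification of assert_ring_tail_valid(); the struct and function names here are illustrative, not the driver source verbatim:

    /*
     * Simplified model of the check in assert_ring_tail_valid().
     * CACHELINE_BYTES is 64, which is where the "(64)-1" in the expanded
     * GEM_BUG_ON() condition comes from.
     */
    #include <assert.h>
    #include <stdint.h>

    #define CACHELINE_BYTES 64u
    #define cacheline(a) ((a) & ~(uint32_t)(CACHELINE_BYTES - 1))

    struct ring_model {
        uint32_t head; /* offset of the oldest not-yet-consumed request */
        uint32_t tail; /* offset just past the last emitted request */
    };

    static void assert_ring_tail_valid_model(const struct ring_model *ring,
                                             uint32_t tail)
    {
        /*
         * The new tail must never land in the same cacheline as head
         * while being numerically behind it: that would mean tail has
         * wrapped over requests that have not yet been consumed. In
         * this bug, a bogus ring->head left behind after cancelling the
         * context's requests makes exactly this condition fire.
         */
        assert(!(cacheline(tail) == cacheline(ring->head) &&
                 tail < ring->head));
    }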
So it's the rogue i915.hangcheck=0 setting that leads to the context cancellation.
I believe this is fixed by:

commit a7f328fc789817a6a0e5c46411956810d5ee00ca
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Oct 28 12:41:25 2019 +0000

    drm/i915/execlists: Simply walk back along request timeline on reset

    The request's timeline will only contain requests from this context, in
    order of execution. Therefore, we can simply look back along this
    timeline to find the currently executing request.

    If we do find that the current context has completed its last request,
    that does not imply that all requests are completed in the context, so
    only advance the ring->head up to the end of the known completions!

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20191028124125.25176-1-chris@chris-wilson.co.uk
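To make the fix concrete, here is a minimal sketch of the walk-back the commit message describes, written in the style of the i915 code. The helpers i915_request_completed() and list_for_each_entry_continue_reverse() do exist in the kernel, but this function body is an illustration of the approach under those assumptions, not the patch itself:

    /*
     * Starting from the request found in the hardware ports, walk
     * backwards along its timeline -- which contains only requests from
     * this context, in execution order -- and stop at the first request
     * that has already completed. Whatever we saw just before that
     * boundary is the oldest incomplete request, i.e. the one actually
     * executing; ring->head is then advanced only up to the end of the
     * known completions, never past them.
     */
    static struct i915_request *
    active_request(const struct intel_timeline *tl, struct i915_request *rq)
    {
        struct i915_request *active = rq;

        list_for_each_entry_continue_reverse(rq, &tl->requests, link) {
            if (i915_request_completed(rq))
                break; /* everything older than this has retired */

            active = rq;
        }

        return active;
    }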
The dangerous part is that this is precipitated by hangcheck=0 and so will not appear again in CI... That suggests I need a better smoketest for persistence opt-out.
A CI Bug Log filter associated to this bug has been updated:
{- SKL: igt@kms_frontbuffer_tracking@psr-modesetfrombusy - dmesg-warn - GEM_BUG_ON(((tail) & ~((__typeof__(tail))((64)-1))) == ((ring->head) & ~((__typeof__(ring->head))((64)-1))) && tail < ring->head) -}
{+ SKL: igt@kms_frontbuffer_tracking@psr-modesetfrombusy - dmesg-warn - GEM_BUG_ON(((tail) & ~((__typeof__(tail))((64)-1))) == ((ring->head) & ~((__typeof__(ring->head))((64)-1))) && tail < ring->head) +}
No new failures caught with the new filter