https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5595/fi-icl-u3/igt@i915_selftest@live_workarounds.html

<0>[ 425.800821] ksoftirq-21 2d.s1 443464553us : process_csb: rcs0 cs-irq head=5, tail=1
<0>[ 425.800821] ksoftirq-21 2d.s1 443464554us : process_csb: rcs0 csb[0]: status=0x10000001:0x00000000, active=0x1
<0>[ 425.800821] ksoftirq-21 2d.s1 443464555us : process_csb: rcs0 csb[1]: status=0x10000018:0x00000060, active=0x5
<0>[ 425.800821] ksoftirq-21 2d.s1 443464557us : process_csb: rcs0 out[0]: ctx=96.1, global=8 (fence 401c:2) (current 0:7), prio=2
<0>[ 425.800821] i915_sel-4429 0.... 443464742us : i915_request_add: rcs0 fence 401b:4
<0>[ 425.800821] i915_sel-4429 0.... 443464743us : i915_request_add: marking (null) as active
<0>[ 425.800821] ksoftirq-21 2d.s1 443464744us : process_csb: process_csb:1103 GEM_BUG_ON(!i915_request_completed(rq))
<0>[ 425.800821] ---------------------------------
<4>[ 425.800821] ---[ end trace 56491dea06ff360f ]---
<4>[ 426.373035] RIP: 0010:process_csb+0x640/0x9a0 [i915]
<4>[ 426.373035] Code: ef 56 4a e0 48 8b 35 8f 23 1b 00 49 c7 c0 b6 54 d7 a0 b9 4f 04 00 00 48 c7 c2 e0 8e d5 a0 48 c7 c7 9b ee c7 a0 e8 a0 eb 50 e0 <0f> 0b 48 c7 c1 e8 42 d9 a0 ba 29 04 00 00 48 c7 c6 e0 8e d5 a0 48
<4>[ 426.373035] RSP: 0018:ffffc90000153d28 EFLAGS: 00010082
<4>[ 426.373035] RAX: 000000000000000b RBX: 0000000000000002 RCX: 0000000000000000
<4>[ 426.373035] RDX: 0000000000000001 RSI: 0000000000000004 RDI: ffff8884ae2574e8
<4>[ 426.373035] RBP: ffffc90000153d98 R08: 0000000000127878 R09: ffff8884ae398000
<4>[ 426.373035] R10: ffffc90000153cb8 R11: ffff8884ae2574e8 R12: 0000000000000000
<4>[ 426.373035] R13: ffff88845b4bc2a8 R14: 0000000000000001 R15: ffff888492f21040
<4>[ 426.373035] FS: 0000000000000000(0000) GS:ffff8884aff00000(0000) knlGS:0000000000000000
<4>[ 426.373035] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 426.373035] CR2: 00007f5b2ad345a0 CR3: 00000004a8540001 CR4: 0000000000760ee0
<4>[ 426.373035] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[ 426.373035] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4>[ 426.373035] PKRU: 55555554
<0>[ 426.373035] Kernel panic - not syncing: Fatal exception in interrupt
<0>[ 426.373035] Shutting down cpus with NMI
<0>[ 426.373035] Dumping ftrace buffer:
<0>[ 426.373035] (ftrace buffer empty)
<0>[ 426.373035] Kernel Offset: disabled
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* CFL ICL: igt@i915_selftest@live_workarounds - incomplete
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_3590/fi-cfl-8109u/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_3592/fi-cfl-8109u/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_3592/fi-cfl-8700k/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_3596/fi-cfl-8109u/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_3596/fi-cfl-guc/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_3596/fi-icl-u2/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_3596/fi-icl-u3/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_3617/fi-cfl-8109u/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_3617/fi-cfl-8700k/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_3617/fi-cfl-guc/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_3617/fi-icl-u2/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_3617/fi-icl-u3/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12200/fi-icl-u2/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2380/fi-icl-u2/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2380/fi-icl-u3/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5594/fi-icl-u3/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5595/fi-icl-u3/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12209/fi-icl-u2/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2391/fi-icl-u3/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4824/fi-icl-u2/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12213/fi-cfl-8109u/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12213/fi-icl-u3/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_3845/fi-icl-u2/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2387/fi-icl-u2/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_3848/fi-icl-u2/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_3848/fi-icl-u3/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5598/fi-cfl-8109u/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5598/fi-icl-u3/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2394/fi-icl-u2/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5599/fi-icl-u2/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5599/fi-icl-u3/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4825/fi-icl-u3/igt@i915_selftest@live_workarounds.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2396/fi-icl-u3/igt@i915_selftest@live_workarounds.html
(In reply to CI Bug Log from comment #1)
> The CI Bug Log issue associated to this bug has been updated.
>
> ### New filters associated
>
> * CFL ICL: igt@i915_selftest@live_workarounds - incomplete ...

Only a few of those are this very specific bug.
commit c836eb79c033c2be13aa8b41729b28d2ab1f72ab (HEAD -> drm-intel-next-queued, drm-intel/drm-intel-next-queued)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Feb 13 22:48:05 2019 +0000

    drm/i915/selftests: Always use an active engine while resetting

    Currently, we only try to reset a live engine for checking the whitelist
    retention across a per-engine reset. For safety, it appears we need to
    prime the system with a hanging spinner before performing a full-device
    reset. (Figuring out the root cause behind the instability with handling
    a reset during a no-op request is a challenge for another test, the
    whitelist test has its own purpose.)

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109626
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Link: https://patchwork.freedesktop.org/patch/msgid/20190213224805.32021-1-chris@chris-wilson.co.uk
    Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

should prevent it from occurring, and CI hints that

commit 9a3b19a16dc28ab717cf1663d09ffee0715b735a
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Feb 13 23:20:47 2019 +0000

    drm/i915: Only try to park engines after a failed reset

    Currently we try to stop the engine by programming the ring registers to
    be disabled before we perform the reset. Sometimes, we see the context
    image also have invalid ring registers, which one presumes may be
    actually caused by us doing so. Lets risk not doing programming the
    ring to zero on the first attempt to avoid preserving that corruption
    into the context image, leaving the w/a in place for subsequent
    reset attempts.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Mika Kuoppala <mika.kuoppala@intel.com>
    Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20190213232047.8486-1-chris@chris-wilson.co.uk

might be the real deal.
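The idea in the second commit message (do not park the engines on the first reset attempt, but keep that workaround for any retry after a failed reset) can be sketched roughly as below. This is a standalone illustration only, not the i915 code: `struct fake_gpu`, `park_engines()`, `try_reset()` and `reset_with_retries()` are hypothetical names made up for this example, and the "hardware" is just a counter of how many resets fail before one succeeds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device; not an i915 structure. */
struct fake_gpu {
    int resets_until_success; /* reset attempts that fail before one works */
    int park_calls;           /* how often the engines were parked */
};

/* Stand-in for "program the ring registers to be disabled". */
void park_engines(struct fake_gpu *gpu)
{
    gpu->park_calls++;
}

/* Stand-in for the actual reset; fails a configurable number of times. */
bool try_reset(struct fake_gpu *gpu)
{
    if (gpu->resets_until_success > 0) {
        gpu->resets_until_success--;
        return false;
    }
    return true;
}

/*
 * Returns the number of attempts used, or -1 if every attempt failed.
 * The first attempt goes straight to the reset; only retries park the
 * engines first, so a healthy reset never touches the ring registers.
 */
int reset_with_retries(struct fake_gpu *gpu, int max_attempts)
{
    assert(max_attempts > 0);

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        if (attempt > 0)
            park_engines(gpu); /* workaround kept for subsequent attempts */
        if (try_reset(gpu))
            return attempt + 1;
    }
    return -1;
}
```

With this model, a reset that works first time never calls `park_engines()`, while one that fails twice before succeeding parks the engines twice (once before each retry).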
(In reply to Chris Wilson from comment #3)
> commit c836eb79c033c2be13aa8b41729b28d2ab1f72ab (HEAD ->
> drm-intel-next-queued, drm-intel/drm-intel-next-queued)
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date: Wed Feb 13 22:48:05 2019 +0000
>
>     drm/i915/selftests: Always use an active engine while resetting
>
>     Currently, we only try to reset a live engine for checking the whitelist
>     retention across a per-engine reset. For safety, it appears we need to
>     prime the system with a hanging spinner before performing a full-device
>     reset. (Figuring out the root cause behind the instability with handling
>     a reset during a no-op request is a challenge for another test, the
>     whitelist test has its own purpose.)
>
>     Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109626
>     Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>     Link: https://patchwork.freedesktop.org/patch/msgid/20190213224805.32021-1-chris@chris-wilson.co.uk
>     Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>
> should prevent it from occurring, and CI hints that
>
> commit 9a3b19a16dc28ab717cf1663d09ffee0715b735a
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date: Wed Feb 13 23:20:47 2019 +0000
>
>     drm/i915: Only try to park engines after a failed reset
>
>     Currently we try to stop the engine by programming the ring registers to
>     be disabled before we perform the reset. Sometimes, we see the context
>     image also have invalid ring registers, which one presumes may be
>     actually caused by us doing so. Lets risk not doing programming the
>     ring to zero on the first attempt to avoid preserving that corruption
>     into the context image, leaving the w/a in place for subsequent
>     reset attempts.
>
>     Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>     Cc: Mika Kuoppala <mika.kuoppala@intel.com>
>     Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>     Link: https://patchwork.freedesktop.org/patch/msgid/20190213232047.8486-1-chris@chris-wilson.co.uk
>
> might be the real deal.

No idea if this was what fixed it, but it most definitely is fixed! Thanks!
The CI Bug Log issue associated to this bug has been archived. New failures matching the above filters will not be associated to this bug anymore.