Summary: | [CI][BAT] igt@drv_selftest@live_hangcheck - dmesg-fail - failed to idle after reset | ||
---|---|---|---|
Product: | DRI | Reporter: | Martin Peres <martin.peres> |
Component: | DRM/Intel | Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Status: | CLOSED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Severity: | normal | ||
Priority: | high | CC: | intel-gfx-bugs |
Version: | XOrg git | ||
Hardware: | Other | ||
OS: | All | ||
Whiteboard: | ReadyForDev | ||
i915 platform: | ICL | i915 features: | GEM/Other |
Description
Martin Peres
2018-09-13 07:48:49 UTC
What happened here is the tasklet never runs. It's scheduled, enabled and ready... <3>[ 526.201555] i915_reset_engine(vcs0:others-active): failed to idle after reset <6>[ 526.201561] i915 0000:00:02.0: [drm] vcs0 <6>[ 526.201563] i915 0000:00:02.0: [drm] current seqno 15874, last 15874, hangcheck 0 [10481 ms] <6>[ 526.201564] i915 0000:00:02.0: [drm] Reset count: 16464 (global 1) <6>[ 526.201566] i915 0000:00:02.0: [drm] Requests: <6>[ 526.201574] i915 0000:00:02.0: [drm] first 15874! [71d7:6fa] prio=0 @ 52ms: (null) <6>[ 526.201576] i915 0000:00:02.0: [drm] last 15874! [71d7:6fa] prio=0 @ 52ms: (null) <6>[ 526.201612] i915 0000:00:02.0: [drm] RING_START: 0x002f0000 <6>[ 526.201623] i915 0000:00:02.0: [drm] RING_HEAD: 0x00000680 <6>[ 526.201625] i915 0000:00:02.0: [drm] RING_TAIL: 0x00000680 <6>[ 526.201627] i915 0000:00:02.0: [drm] RING_CTL: 0x00000000 <6>[ 526.201629] i915 0000:00:02.0: [drm] RING_MODE: 0x00000200 [idle] <6>[ 526.201631] i915 0000:00:02.0: [drm] RING_IMR: fffffeff <6>[ 526.201633] i915 0000:00:02.0: [drm] ACTHD: 0x00000000_00000680 <6>[ 526.201636] i915 0000:00:02.0: [drm] BBADDR: 0x00000000_00000000 <6>[ 526.201638] i915 0000:00:02.0: [drm] DMA_FADDR: 0x00000000_00000000 <6>[ 526.201640] i915 0000:00:02.0: [drm] IPEIR: 0x00000000 <6>[ 526.201641] i915 0000:00:02.0: [drm] IPEHR: 0x00000000 <6>[ 526.201644] i915 0000:00:02.0: [drm] Execlist status: 0x00000301 00000000 <6>[ 526.201646] i915 0000:00:02.0: [drm] Execlist CSB read 5, write 1 [mmio:1], tasklet queued? yes (enabled) <6>[ 526.201648] i915 0000:00:02.0: [drm] Execlist CSB[0]: 0x00000001 [mmio:0x00000001], context: 0 [mmio:0] <6>[ 526.201650] i915 0000:00:02.0: [drm] Execlist CSB[1]: 0x00000018 [mmio:0x00000018], context: 2 [mmio:2] <6>[ 526.201653] i915 0000:00:02.0: [drm] ELSP[0] count=1, ring->start=002f0000, rq: 15874! [71d7:6fa] prio=0 @ 52ms: (null) <6>[ 526.201654] i915 0000:00:02.0: [drm] ELSP[1] idle <6>[ 526.201655] i915 0000:00:02.0: [drm] HW active? 0x1 <6>[ 526.201682] i915 0000:00:02.0: [drm] E 15874! [71d7:6fa] prio=0 @ 52ms: (null) <6>[ 526.201700] i915 0000:00:02.0: [drm] Queue priority: -2147483648 <6>[ 526.201702] i915 0000:00:02.0: [drm] IRQ? 0x0 (breadcrumbs? no) <6>[ 526.201703] i915 0000:00:02.0: [drm] HWSP: <6>[ 526.201706] i915 0000:00:02.0: [drm] [0000] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 <6>[ 526.201707] i915 0000:00:02.0: [drm] * <6>[ 526.201709] i915 0000:00:02.0: [drm] [0040] 00000001 00000000 00000018 00000002 00000001 00000000 00000001 00000000 <6>[ 526.201711] i915 0000:00:02.0: [drm] [0060] 00000018 00000008 00000001 00000000 00000000 00000000 00000000 00000001 <6>[ 526.201713] i915 0000:00:02.0: [drm] [0080] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 <6>[ 526.201714] i915 0000:00:02.0: [drm] * <6>[ 526.201717] i915 0000:00:02.0: [drm] [00c0] 00015874 00000000 00000000 00000000 00000000 00000000 00000000 00000000 <6>[ 526.201718] i915 0000:00:02.0: [drm] [00e0] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 <6>[ 526.201720] i915 0000:00:02.0: [drm] * <6>[ 526.201721] i915 0000:00:02.0: [drm] Idle? no Loss for words. To cap it all we even try and kick the tasklet when checking idle. To skip that kick means the tasklet was locked? commit 22495b68f954c9018fd52bc620292cc80df2e418 (HEAD -> drm-intel-next-queued, drm-intel/drm-intel-next-queued) Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Fri Sep 14 09:00:16 2018 +0100 drm/i915: Flush the tasklet when checking for idle In order to reduce latency when checking for idle we kick the tasklet directly. Sometimes this is not enough as it is queued on another cpu and so to improve the accuracy of this idle-check (and so to reduce latency overall by avoiding another pass, or worse declaring a timeout!) wait for the tasklet to complete. References: https://bugs.freedesktop.org/show_bug.cgi?id=107916 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180914080017.30308-2-chris@chris-wilson.co.uk Hopefully. This issue occurred only once with CI_DRM_4811 (1 month / 415 runs ago). (In reply to Lakshmi from comment #3) > This issue occurred only once with CI_DRM_4811 (1 month / 415 runs ago). Closing and archiving. Thanks! |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.