https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4829/shard-apl5/igt@gem_eio@in-flight-suspend.html

[179.251251] [drm:i915_reset [i915]] *ERROR* Failed to initialise HW following reset (-5)

Relevant part information leading to this line:

<7> [179.246841] [drm:i915_reset_device [i915]] resetting chip
<5> [179.247282] i915 0000:00:02.0: Resetting chip for Manually set wedged engine mask = ffffffffffffffff
<7> [179.247880] [drm:drm_dp_i2c_do_msg] native defer
<7> [179.249532] i915_gem_set_wedged rcs0
<7> [179.249540] i915_gem_set_wedged \x09current seqno 46, last 46, hangcheck 46 [138 ms]
<7> [179.249545] i915_gem_set_wedged \x09Reset count: 1 (global 5)
<7> [179.249554] [drm:drm_dp_i2c_do_msg] native defer
<7> [179.249561] i915_gem_set_wedged \x09Requests:
<7> [179.249606] i915_gem_set_wedged \x09RING_START: 0x00000000
<7> [179.249613] i915_gem_set_wedged \x09RING_HEAD: 0x00000000
<7> [179.249620] i915_gem_set_wedged \x09RING_TAIL: 0x00000000
<7> [179.249629] i915_gem_set_wedged \x09RING_CTL: 0x00000000
<7> [179.249638] i915_gem_set_wedged \x09RING_MODE: 0x00000200 [idle]
<7> [179.249645] i915_gem_set_wedged \x09RING_IMR: ffffffff
<7> [179.249657] i915_gem_set_wedged \x09ACTHD: 0x00000000_00000000
<7> [179.249668] i915_gem_set_wedged \x09BBADDR: 0x00000000_00000000
<7> [179.249726] i915_gem_set_wedged \x09DMA_FADDR: 0x00000000_00000000
<7> [179.249733] i915_gem_set_wedged \x09IPEIR: 0x00000000
<7> [179.249739] i915_gem_set_wedged \x09IPEHR: 0x00000000
<7> [179.249748] i915_gem_set_wedged \x09Execlist status: 0x00000001 00000000
<7> [179.249756] i915_gem_set_wedged \x09Execlist CSB read 5, write 5 [mmio:7], tasklet queued? no (disabled)
<7> [179.249761] i915_gem_set_wedged \x09\x09ELSP[0] idle
<7> [179.249765] i915_gem_set_wedged \x09\x09ELSP[1] idle
<7> [179.249769] i915_gem_set_wedged \x09\x09HW active? 0x0
<7> [179.249848] i915_gem_set_wedged \x09\x09Queue priority: -1024
<7> [179.249928] i915_gem_set_wedged \x09\x09Q 0 [5:16] prio=-1024 @ 3ms: (null)
<7> [179.249941] i915_gem_set_wedged IRQ? 0x0 (breadcrumbs? no)
<7> [179.249945] i915_gem_set_wedged HWSP:
<7> [179.249952] i915_gem_set_wedged [0000] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [179.249956] i915_gem_set_wedged *
<7> [179.249965] i915_gem_set_wedged [0040] 00000001 00000000 00000018 00000000 00000001 00000000 00000018 00000003
<7> [179.249973] i915_gem_set_wedged [0060] 00008002 00000002 00008002 00000002 00000000 00000000 00000000 00000005
<7> [179.249980] i915_gem_set_wedged [0080] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [179.249984] i915_gem_set_wedged *
<7> [179.249990] i915_gem_set_wedged [00c0] 00000046 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [179.249995] i915_gem_set_wedged [00e0] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [179.249999] i915_gem_set_wedged *
<7> [179.250007] i915_gem_set_wedged Idle? no
<7> [179.250014] i915_gem_set_wedged bcs0
<7> [179.250022] i915_gem_set_wedged \x09current seqno 4, last 4, hangcheck 4 [139 ms]
<7> [179.250027] i915_gem_set_wedged \x09Reset count: 0 (global 5)
<7> [179.250032] i915_gem_set_wedged \x09Requests:
<7> [179.250040] i915_gem_set_wedged \x09RING_START: 0x00000000
<7> [179.250047] i915_gem_set_wedged \x09RING_HEAD: 0x00000000
<7> [179.250059] i915_gem_set_wedged \x09RING_TAIL: 0x00000000
<7> [179.250069] i915_gem_set_wedged \x09RING_CTL: 0x00000000
<7> [179.250078] i915_gem_set_wedged \x09RING_MODE: 0x00000200 [idle]
<7> [179.250085] i915_gem_set_wedged \x09RING_IMR: ffffffff
<7> [179.250100] i915_gem_set_wedged \x09ACTHD: 0x00000000_00000000
<7> [179.250111] i915_gem_set_wedged \x09BBADDR: 0x00000000_00000000
<7> [179.250124] i915_gem_set_wedged \x09DMA_FADDR: 0x00000000_00000000
<7> [179.250131] i915_gem_set_wedged \x09IPEIR: 0x00000000
<7> [179.250139] i915_gem_set_wedged \x09IPEHR: 0x00000000
<7> [179.250149] i915_gem_set_wedged \x09Execlist status: 0x00000001 00000000
<7> [179.250157] i915_gem_set_wedged \x09Execlist CSB read 5, write 5 [mmio:7], tasklet queued? no (disabled)
<7> [179.250161] i915_gem_set_wedged \x09\x09ELSP[0] idle
<7> [179.250166] i915_gem_set_wedged \x09\x09ELSP[1] idle
<7> [179.250170] i915_gem_set_wedged \x09\x09HW active? 0x0
<7> [179.250178] i915_gem_set_wedged \x09\x09Queue priority: -1024
<7> [179.250186] i915_gem_set_wedged \x09\x09Q 0 [8:a] prio=-1024 @ 3ms: (null)
<7> [179.250193] i915_gem_set_wedged IRQ? 0x0 (breadcrumbs? no)
<7> [179.250198] i915_gem_set_wedged HWSP:
<7> [179.250204] i915_gem_set_wedged [0000] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [179.250208] i915_gem_set_wedged *
<7> [179.250215] i915_gem_set_wedged [0040] 00000001 00000000 00000018 00000000 00000001 00000000 00000018 00000003
<7> [179.250221] i915_gem_set_wedged [0060] 00000018 00000000 00000001 00000000 00000000 00000000 00000000 00000005
<7> [179.250229] i915_gem_set_wedged [0080] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [179.250234] i915_gem_set_wedged *
<7> [179.250242] i915_gem_set_wedged [00c0] 00000004 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [179.250249] i915_gem_set_wedged [00e0] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [179.250253] i915_gem_set_wedged *
<7> [179.250258] i915_gem_set_wedged Idle? no
<7> [179.250263] i915_gem_set_wedged vcs0
<7> [179.250267] i915_gem_set_wedged \x09current seqno 4, last 4, hangcheck 4 [139 ms]
<7> [179.250275] i915_gem_set_wedged \x09Reset count: 0 (global 5)
<7> [179.250282] i915_gem_set_wedged \x09Requests:
<7> [179.250292] i915_gem_set_wedged \x09RING_START: 0x00000000
<7> [179.250299] i915_gem_set_wedged \x09RING_HEAD: 0x00000000
<7> [179.250305] i915_gem_set_wedged \x09RING_TAIL: 0x00000000
<7> [179.250315] i915_gem_set_wedged \x09RING_CTL: 0x00000000
<7> [179.250326] i915_gem_set_wedged \x09RING_MODE: 0x00000200 [idle]
<7> [179.250334] i915_gem_set_wedged \x09RING_IMR: ffffffff
<7> [179.250346] i915_gem_set_wedged \x09ACTHD: 0x00000000_00000000
<7> [179.250358] i915_gem_set_wedged \x09BBADDR: 0x00000000_00000000
<7> [179.250371] i915_gem_set_wedged \x09DMA_FADDR: 0x00000000_00000000
<7> [179.250378] i915_gem_set_wedged \x09IPEIR: 0x00000000
<7> [179.250385] i915_gem_set_wedged \x09IPEHR: 0x00000000
<7> [179.250395] i915_gem_set_wedged \x09Execlist status: 0x00000001 00000000
<7> [179.250404] i915_gem_set_wedged \x09Execlist CSB read 5, write 5 [mmio:7], tasklet queued? no (disabled)
<7> [179.250410] i915_gem_set_wedged \x09\x09ELSP[0] idle
<7> [179.250414] i915_gem_set_wedged \x09\x09ELSP[1] idle
<7> [179.250418] i915_gem_set_wedged \x09\x09HW active? 0x0
<7> [179.250427] i915_gem_set_wedged \x09\x09Queue priority: -1024
<7> [179.250434] i915_gem_set_wedged \x09\x09Q 0 [b:7] prio=-1024 @ 3ms: (null)
<7> [179.250439] i915_gem_set_wedged IRQ? 0x0 (breadcrumbs? no)
<7> [179.250443] i915_gem_set_wedged HWSP:
<7> [179.250450] i915_gem_set_wedged [0000] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [179.250456] i915_gem_set_wedged *
<7> [179.250463] i915_gem_set_wedged [0040] 00000001 00000000 00000018 00000000 00000001 00000000 00000018 00000003
<7> [179.250471] i915_gem_set_wedged [0060] 00008002 00000002 00000018 00000002 00000000 00000000 00000000 00000005
<7> [179.250477] i915_gem_set_wedged [0080] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [179.250480] i915_gem_set_wedged *
<7> [179.250486] i915_gem_set_wedged [00c0] 00000004 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [179.250493] i915_gem_set_wedged [00e0] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [179.250499] i915_gem_set_wedged *
<7> [179.250506] i915_gem_set_wedged Idle? no
<7> [179.250511] i915_gem_set_wedged vecs0
<7> [179.250515] i915_gem_set_wedged \x09current seqno 4, last 4, hangcheck 4 [139 ms]
<7> [179.250519] i915_gem_set_wedged \x09Reset count: 0 (global 5)
<7> [179.250524] i915_gem_set_wedged \x09Requests:
<7> [179.250534] i915_gem_set_wedged \x09RING_START: 0x00000000
<7> [179.250542] i915_gem_set_wedged \x09RING_HEAD: 0x00000000
<7> [179.250550] i915_gem_set_wedged \x09RING_TAIL: 0x00000000
<7> [179.250558] i915_gem_set_wedged \x09RING_CTL: 0x00000000
<7> [179.250569] i915_gem_set_wedged \x09RING_MODE: 0x00000200 [idle]
<7> [179.250579] i915_gem_set_wedged \x09RING_IMR: ffffffff
<7> [179.250591] i915_gem_set_wedged \x09ACTHD: 0x00000000_00000000
<7> [179.250603] i915_gem_set_wedged \x09BBADDR: 0x00000000_00000000
<7> [179.250614] i915_gem_set_wedged \x09DMA_FADDR: 0x00000000_00000000
<7> [179.250622] i915_gem_set_wedged \x09IPEIR: 0x00000000
<7> [179.250629] i915_gem_set_wedged \x09IPEHR: 0x00000000
<7> [179.250639] i915_gem_set_wedged \x09Execlist status: 0x00000001 00000000
<7> [179.250646] i915_gem_set_wedged \x09Execlist CSB read 5, write 5 [mmio:7], tasklet queued? no (disabled)
<7> [179.250656] i915_gem_set_wedged \x09\x09ELSP[0] idle
<7> [179.250661] i915_gem_set_wedged \x09\x09ELSP[1] idle
<7> [179.250665] i915_gem_set_wedged \x09\x09HW active? 0x0
<7> [179.250670] i915_gem_set_wedged \x09\x09Queue priority: -1024
<7> [179.250675] i915_gem_set_wedged \x09\x09Q 0 [e:9] prio=-1024 @ 3ms: (null)
<7> [179.250704] i915_gem_set_wedged IRQ? 0x0 (breadcrumbs? no)
<7> [179.250710] i915_gem_set_wedged HWSP:
<7> [179.250717] i915_gem_set_wedged [0000] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [179.250721] i915_gem_set_wedged *
<7> [179.250727] i915_gem_set_wedged [0040] 00000001 00000000 00000018 00000000 00000001 00000000 00000018 00000003
<7> [179.250732] i915_gem_set_wedged [0060] 00000018 00000002 00000001 00000000 00000000 00000000 00000000 00000005
<7> [179.250739] i915_gem_set_wedged [0080] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [179.250745] i915_gem_set_wedged *
<7> [179.250753] i915_gem_set_wedged [00c0] 00000004 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [179.250759] i915_gem_set_wedged [00e0] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [179.250763] i915_gem_set_wedged *
<7> [179.250768] i915_gem_set_wedged Idle? no
<3> [179.251251] [drm:i915_reset [i915]] *ERROR* Failed to initialise HW following reset (-5)
In init_hw, we found the device was still wedged, so either we failed to clear the wedged status at the start of the reset (with reporting an error) or another thread set-wedged during the reset. As the first is impossible, the race seems more likely.
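To make the suspected race concrete, here is a minimal single-threaded model in C. It is only a sketch: `fake_device`, `fake_init_hw` and `fake_reset` are hypothetical names invented for illustration and are not the i915 code. The `racing_set_wedged` flag stands in for another thread marking the device wedged after the reset path has cleared the status but before the HW init step runs, which is the window in which init would see a wedged device and return -EIO (-5 on Linux).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's global "wedged" state. */
struct fake_device {
    bool wedged;
};

/* Stand-in for HW re-initialisation: refuses to run on a wedged device. */
static int fake_init_hw(const struct fake_device *dev)
{
    assert(dev != NULL);
    return dev->wedged ? -EIO : 0; /* -EIO is -5 on Linux */
}

/*
 * Model one reset. 'racing_set_wedged' simulates another thread setting the
 * wedged status after the reset path cleared it but before init_hw ran.
 */
static int fake_reset(struct fake_device *dev, bool racing_set_wedged)
{
    assert(dev != NULL);

    dev->wedged = false;        /* start of reset: clear wedged status */

    if (racing_set_wedged)
        dev->wedged = true;     /* concurrent set-wedged wins the race */

    return fake_init_hw(dev);   /* fails only if the race happened */
}
```

Calling `fake_reset(&dev, false)` returns 0, while `fake_reset(&dev, true)` returns -EIO, matching the "(-5)" in the error message above.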
This issue occurred only once, 3 weeks 6 days ago.
Also seen on GLK:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5051/shard-glk3/igt@gem_eio@in-flight-suspend.html
<3> [909.326086] [drm:i915_reset [i915]] *ERROR* Failed to initialise HW following reset (-5)
Also seen on ICL:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5219/shard-iclb5/igt@gem_eio@in-flight-suspend.html
<3> [1705.709370] [drm:i915_reset [i915]] *ERROR* Failed to initialise HW following reset (-5)
Last occurrence three weeks ago:
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_173/fi-glk-j4005/igt@gem_eio@in-flight-suspend.html

It was happening sporadically before that, so we need to keep monitoring, but I'm lowering the priority for now.
This issue hasn't shown up in 2 months, but previous occurrences were 1-2 months apart, so there is no guarantee it's gone, at least judging by the CI results alone. I'm lowering the priority and unassigning it for now.
commit 9a3b19a16dc28ab717cf1663d09ffee0715b735a
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Feb 13 23:20:47 2019 +0000

    drm/i915: Only try to park engines after a failed reset

    Currently we try to stop the engine by programming the ring registers to
    be disabled before we perform the reset. Sometimes, we see the context
    image also have invalid ring registers, which one presumes may be
    actually caused by us doing so. Lets risk not doing programming the ring
    to zero on the first attempt to avoid preserving that corruption into
    the context image, leaving the w/a in place for subsequent reset
    attempts.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Mika Kuoppala <mika.kuoppala@intel.com>
    Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20190213232047.8486-1-chris@chris-wilson.co.uk
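The policy described in the commit message (skip parking on the first reset attempt, keep the workaround for later attempts) can be sketched as a retry loop. This is an illustrative model only, not the actual patch: `fake_gpu`, `fake_park_engines`, `fake_do_reset`, `fake_reset_with_retries` and the retry limit of 3 are all assumptions made up for this example.

```c
#include <stdbool.h>

#define MAX_RESET_ATTEMPTS 3 /* hypothetical retry limit */

/* Hypothetical model of a GPU whose reset fails a set number of times. */
struct fake_gpu {
    int failures_left; /* how many upcoming reset attempts will fail */
    int park_calls;    /* how often the rings were programmed to zero */
};

/* Stand-in for "programming the ring registers to be disabled". */
static void fake_park_engines(struct fake_gpu *gpu)
{
    gpu->park_calls++;
}

/* Stand-in for the reset itself; true means the reset succeeded. */
static bool fake_do_reset(struct fake_gpu *gpu)
{
    if (gpu->failures_left > 0) {
        gpu->failures_left--;
        return false;
    }
    return true;
}

/* First attempt resets without parking; only retries park the engines. */
static int fake_reset_with_retries(struct fake_gpu *gpu)
{
    for (int attempt = 0; attempt < MAX_RESET_ATTEMPTS; attempt++) {
        if (attempt > 0)
            fake_park_engines(gpu); /* workaround kept for retries only */
        if (fake_do_reset(gpu))
            return 0;
    }
    return -1; /* every attempt failed */
}
```

In this model a reset that succeeds on the first try never touches the ring registers, so any corruption that parking might introduce cannot reach the context image on the common path.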
Thanks, closing this bug as fixed. No occurrences in the last month or so.
The CI Bug Log issue associated to this bug has been archived. New failures matching the above filters will not be associated to this bug anymore.