https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_38/fi-skl-6600u/igt@drv_suspend@sysfs-reader.html

[ 378.040826] [drm:gen8_reset_engines [i915]] *ERROR* bcs0: reset request timeout
[ 378.041039] ------------[ cut here ]------------
[ 378.041040] WARN_ON(intel_gpu_reset(i915, (~0)))
[ 378.041079] WARNING: CPU: 1 PID: 126 at drivers/gpu/drm/i915/i915_gem.c:4978 i915_gem_sanitize+0x4d/0x80 [i915]
[ 378.041080] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core i915 snd_pcm asix usbnet btusb btrtl btbcm btintel mii bluetooth ecdh_generic mei_me mei prime_numbers i2c_hid pinctrl_sunrisepoint pinctrl_intel
[ 378.041121] CPU: 1 PID: 126 Comm: kworker/u8:2 Tainted: G U W 4.17.0-rc4-gfe5bde58dca5-drmtip_38+ #1
[ 378.041122] Hardware name: Dell Inc. XPS 13 9350/, BIOS 1.4.12 11/30/2016
[ 378.041126] Workqueue: events_unbound async_run_entry_fn
[ 378.041152] RIP: 0010:i915_gem_sanitize+0x4d/0x80 [i915]
[ 378.041154] RSP: 0018:ffff9ec7004c7cc8 EFLAGS: 00010286
[ 378.041156] RAX: 0000000000000000 RBX: ffff8b2865c90000 RCX: 0000000000000001
[ 378.041158] RDX: 0000000080000001 RSI: ffffffffb20fb2c9 RDI: 00000000ffffffff
[ 378.041159] RBP: ffff8b2865c90068 R08: 00000000564bbeab R09: 0000000000000000
[ 378.041161] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8b2865c989b0
[ 378.041162] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 378.041164] FS: 0000000000000000(0000) GS:ffff8b287dc80000(0000) knlGS:0000000000000000
[ 378.041165] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 378.041167] CR2: 000055eba5d3b1a8 CR3: 0000000026210006 CR4: 00000000003606e0
[ 378.041168] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 378.041170] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 378.041171] Call Trace:
[ 378.041196] i915_gem_suspend+0xec/0x140 [i915]
[ 378.041215] i915_drm_suspend+0x5f/0x160 [i915]
[ 378.041220] pci_pm_suspend+0x7c/0x130
[ 378.041223] ? pci_pm_freeze+0xc0/0xc0
[ 378.041226] dpm_run_callback+0x5d/0x2f0
[ 378.041230] __device_suspend+0x11f/0x600
[ 378.041234] ? dpm_watchdog_set+0x60/0x60
[ 378.041240] async_suspend+0x15/0x90
[ 378.041243] async_run_entry_fn+0x34/0x160
[ 378.041247] process_one_work+0x229/0x6a0
[ 378.041252] worker_thread+0x35/0x380
[ 378.041256] ? process_one_work+0x6a0/0x6a0
[ 378.041258] kthread+0x119/0x130
[ 378.041261] ? _kthread_create_on_node+0x60/0x60
[ 378.041279] ret_from_fork+0x3a/0x50
[ 378.041286] Code: e0 03 00 84 c0 74 f1 be ff ff ff ff 48 89 df e8 5a de 03 00 85 c0 74 e0 48 c7 c6 c0 b3 6c c0 48 c7 c7 1d 30 6b c0 e8 93 28 ae f0 <0f> 0b eb c9 48 8d 6f 68 31 f6 48 89 ef e8 81 e8 39 f1 48 89 df
[ 378.041359] irq event stamp: 3276
[ 378.041362] hardirqs last enabled at (3275): [<ffffffffb10fc757>] vprintk_emit+0x4b7/0x4d0
[ 378.041365] hardirqs last disabled at (3276): [<ffffffffb1a0111c>] error_entry+0x7c/0x100
[ 378.041367] softirqs last enabled at (3258): [<ffffffffb1c0032b>] __do_softirq+0x32b/0x4e1
[ 378.041370] softirqs last disabled at (3237): [<ffffffffb108f6f4>] irq_exit+0xa4/0xb0
[ 378.041393] WARNING: CPU: 1 PID: 126 at drivers/gpu/drm/i915/i915_gem.c:4978 i915_gem_sanitize+0x4d/0x80 [i915]
[ 378.041395] ---[ end trace 84ea7be84ec5687c ]---
The behaviour should have substantially changed with

commit f4e60c5cfbf217cc9faa3aeb63742860154fcfef (HEAD -> drm-intel-next-queued, drm-intel/drm-intel-next-queued)
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date: Mon Aug 13 16:01:16 2018 +0300

    drm/i915: Force reset on unready engine

    If engine reports that it is not ready for reset, we give up. Evidence
    shows that forcing a per engine reset on an engine which is not reporting
    to be ready for reset, can bring it back into a working order. There is
    risk that we corrupt the context image currently executing on that
    engine. But that is a risk worth taking as if we unblock the engine, we
    prevent a whole device wedging in a case of full gpu reset.

    Reset individual engine even if it reports that it is not prepared for
    reset, but only if we aim for full gpu reset and not on first reset
    attempt.

    v2: force reset only on later attempts, readability (Chris)
    v3: simplify with adequate caffeine levels (Chris)
    v4: comment about risks and migitations (Chris)

    Cc: Chris Wilson <chris@chris-wilson.co.uk>
    Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
    Link: https://patchwork.freedesktop.org/patch/msgid/20180813130116.7250-1-mika.kuoppala@linux.intel.com
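For readers who do not want to dig through the patch, the policy described in the commit message can be modelled roughly as below. This is only an illustrative Python sketch of the decision rule as worded above, not the driver code: the names `Engine`, `ready_for_reset`, `should_reset_engine`, `full_gpu_reset` and `attempt` are all made up for this example, and the assumption that attempts are counted from 0 is mine.

```python
from dataclasses import dataclass


@dataclass
class Engine:
    # Hypothetical stand-in for a GPU engine such as bcs0.
    name: str
    ready_for_reset: bool


def should_reset_engine(engine: Engine, full_gpu_reset: bool, attempt: int) -> bool:
    """Decide whether to go ahead with resetting this engine.

    attempt is assumed to be 0 for the first reset attempt.
    """
    if engine.ready_for_reset:
        # The engine acknowledged the reset request, so proceed normally.
        return True
    # Engine is not ready: force the reset only when a full GPU reset is
    # wanted and this is not the first attempt, accepting the risk of
    # corrupting the context image running on that engine.
    return full_gpu_reset and attempt > 0


if __name__ == "__main__":
    bcs0 = Engine("bcs0", ready_for_reset=False)
    for attempt in range(3):
        decision = should_reset_engine(bcs0, full_gpu_reset=True, attempt=attempt)
        print(f"{bcs0.name} attempt {attempt}: reset={decision}")
```

Under this model, an unready bcs0 would be skipped on the first attempt of a full GPU reset (the situation that ended in the WARN_ON above) and forced on any later attempt; a per engine reset of an unready engine is never forced.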
Closing the bug, as this was last seen 2 months ago.
This occurred only twice in the past, with a gap of 20 rounds of drmtip execution between the two occurrences. To make sure the bug is really gone, we can wait for a few more rounds of execution to see whether it still occurs. So, reopening this issue. This doesn't mean that the issue needs a fix.
This issue was last seen with drmtip_59 (4 months, 1 week / 2229 runs ago). Closing this bug.