https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_64/fi-cfl-8700k/igt@gem_eio@hibernate.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_64/fi-cfl-8700k/igt@gem_eio@suspend.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_64/fi-bsw-n3050/igt@gem_eio@hibernate.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_64/fi-bsw-n3050/igt@gem_eio@suspend.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_64/fi-bdw-5557u/igt@gem_eio@hibernate.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_64/fi-bdw-5557u/igt@gem_eio@suspend.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_64/fi-kbl-7500u/igt@gem_eio@hibernate.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_64/fi-kbl-7560u/igt@gem_eio@suspend.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_64/fi-skl-6260u/igt@gem_eio@hibernate.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_64/fi-skl-6700hq/igt@gem_eio@hibernate.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_64/fi-skl-6700k2/igt@gem_eio@hibernate.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_64/fi-skl-6770hq/igt@gem_eio@hibernate.html

<0>[ 277.303716] ---------------------------------
<4>[ 277.303717] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_pcm mei_me e1000e mei prime_numbers
<4>[ 277.303727] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G U 4.17.0-rc7-g02d8db1a894b-drmtip_64+ #1
<4>[ 277.303727] Hardware name: Micro-Star International Co., Ltd. MS-7B54/Z370M MORTAR (MS-7B54), BIOS 1.00 10/31/2017
<4>[ 277.303743] RIP: 0010:process_csb+0x53c/0x8b0 [i915]
<4>[ 277.303744] RSP: 0018:ffff940966203e18 EFLAGS: 00010286
<4>[ 277.303745] RAX: 000000000000000d RBX: 0000000000000018 RCX: 0000000000000000
<4>[ 277.303746] RDX: 0000000000000001 RSI: 0000000000000004 RDI: ffff94096542fa38
<4>[ 277.303746] RBP: ffff940966203e90 R08: 00000000001840b0 R09: ffff940965481000
<4>[ 277.303747] R10: 0000000000000000 R11: ffff94096542fa38 R12: 0000000000000003
<4>[ 277.303747] R13: ffff94095690c6f0 R14: ffff9409527b6040 R15: ffff94095690c2a8
<4>[ 277.303748] FS: 0000000000000000(0000) GS:ffff940966200000(0000) knlGS:0000000000000000
<4>[ 277.303749] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 277.303750] CR2: 000056541d36ab90 CR3: 0000000101210002 CR4: 00000000003606f0
<4>[ 277.303750] Call Trace:
<4>[ 277.303751] <IRQ>
<4>[ 277.303768] execlists_submission_tasklet+0xb1/0xe20 [i915]
<4>[ 277.303770] ? lock_acquire+0xa6/0x210
<4>[ 277.303772] ? handle_irq_event+0x3a/0x50
<4>[ 277.303774] tasklet_action_common.isra.5+0x47/0xb0
<4>[ 277.303776] __do_softirq+0xc1/0x4e1
<4>[ 277.303778] ? _raw_spin_unlock+0x29/0x40
<4>[ 277.303780] irq_exit+0xa4/0xb0
<4>[ 277.303781] do_IRQ+0x9a/0x120
<4>[ 277.303782] common_interrupt+0xf/0xf
<4>[ 277.303783] </IRQ>
<4>[ 277.303785] RIP: 0010:cpuidle_enter_state+0xac/0x360
<4>[ 277.303785] RSP: 0018:ffffffff8f203e70 EFLAGS: 00000206 ORIG_RAX: ffffffffffffffd8
<4>[ 277.303786] RAX: ffffffff8f2167c0 RBX: 0000000000014582 RCX: 0000000000000000
<4>[ 277.303787] RDX: 0000000000000046 RSI: ffffffff8f0fc071 RDI: ffffffff8f0a8eef
<4>[ 277.303788] RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000
<4>[ 277.303788] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff8f296138
<4>[ 277.303789] R13: ffffcb4dffa00a70 R14: 0000000000000000 R15: 000000408f13b924
<4>[ 277.303792] do_idle+0x1f3/0x250
<4>[ 277.303794] cpu_startup_entry+0x6a/0x70
<4>[ 277.303796] start_kernel+0x4a2/0x4c2
<4>[ 277.303799] secondary_startup_64+0xa5/0xb0
<4>[ 277.303801] Code: e8 f3 1f bb cd 48 8b 35 03 63 19 00 49 c7 c0 90 c5 64 c0 b9 27 04 00 00 48 c7 c2 b0 4f 61 c0 48 c7 c7 07 be 54 c0 e8 64 8a c1 cd <0f> 0b 48 8d 83 20 16 00 00 48 89 c7 48 89 45 c8 e8 3f d9 3f ce
<1>[ 277.303840] RIP: process_csb+0x53c/0x8b0 [i915] RSP: ffff940966203e18
<4>[ 277.303866] ---[ end trace bb212b5641eb6667 ]---
commit 4fdd5b4e9aba5fbbc6d3072a5a87fa1d3f3fc030
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Sat Jun 16 21:25:34 2018 +0100

    drm/i915: Fix fallout of fake reset along resume

    commit b2209e62a450 ("drm/i915/execlists: Reset the CSB head tracking on
    reset/sanitization") and
    commit 1288786b18f7 ("drm/i915: Move GEM sanitize from resume_early to
    resume") show the conflicting requirements on the code. We must reset the
    GPU before trashing live state on a fast resume (hibernation debug, or
    error paths), but we must only reset our state tracking iff the GPU is
    reset (or power cycled). This is tricky if we are disabling GPU reset to
    simulate broken hardware; we reset our state tracking but the GPU is left
    intact and recovers from its stale state.

    v2: Again without the assertion for forcewake, no longer required since
    commit b3ee09a4de33 ("drm/i915/ringbuffer: Fix context restore upon reset")
    as the contexts are reset from the CS ensuring everything is powered up.

    Fixes: b2209e62a450 ("drm/i915/execlists: Reset the CSB head tracking on reset/sanitization")
    Fixes: 1288786b18f7 ("drm/i915: Move GEM sanitize from resume_early to resume")
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20180616202534.18767-1-chris@chris-wilson.co.uk
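The rule in the commit message (reset the driver's state tracking only if the GPU was really reset) can be illustrated with a toy model. This is only a sketch and not i915 code: `ToyEngine`, `hw_head`, `sw_head`, `reset_disabled`, the ring size, and the reset value of 0 are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass

CSB_ENTRIES = 6  # assumption: toy ring size, not taken from the driver


@dataclass
class ToyEngine:
    """Toy model: hw_head stands in for hardware state, sw_head for the driver's copy."""

    hw_head: int = 0
    sw_head: int = 0
    reset_disabled: bool = False  # models "GPU reset disabled to simulate broken hardware"

    def advance(self, events: int) -> None:
        # Hardware moves on and the driver's tracking follows it.
        self.hw_head = (self.hw_head + events) % CSB_ENTRIES
        self.sw_head = self.hw_head

    def try_hw_reset(self) -> bool:
        # Returns True only if the hardware state was actually cleared.
        if self.reset_disabled:
            return False
        self.hw_head = 0
        return True

    def sanitize_naive(self) -> None:
        # Resets the tracking unconditionally, even if the hardware kept its state.
        self.try_hw_reset()
        self.sw_head = 0

    def sanitize_guarded(self) -> None:
        # Resets the tracking iff the hardware reset really happened.
        if self.try_hw_reset():
            self.sw_head = 0

    def in_sync(self) -> bool:
        return self.sw_head == self.hw_head


if __name__ == "__main__":
    for name in ("sanitize_naive", "sanitize_guarded"):
        engine = ToyEngine(reset_disabled=True)
        engine.advance(5)
        getattr(engine, name)()
        print(name, "in sync:", engine.in_sync())
```

With reset disabled, the naive path leaves the two heads disagreeing, while the guarded path keeps them equal because it leaves the tracking alone when the hardware was not reset.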
Closing, thanks.
Still happening with drm-tip: 2018y-06m-17d-12h-42m-13s UTC integration manifest

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_66/fi-whl-u/igt@gem_eio@suspend.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_66/fi-whl-u/igt@gem_eio@hibernate.html

The output looks a little different though, so I guess it is progress? :)
Bug fix hasn't percolated as far as drmtip-66. Check again after drmtip-67/-68!
(In reply to Chris Wilson from comment #4)
> Bug fix hasn't percolated as far as drmtip-66. Check again after
> drmtip-67/-68!

Are you sure? You pushed the patch on the 16th, and the drmtip run was with drmtip 2018y-06m-17d-12h-42m-13s.
Pretty confident, yes. The error (CSB head==5 but mmio reads 1) is the same as fixed by the patch, and the same as showing up in the shards for https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4332/ fixed in CI_DRM_4333.

I do think we need a clearer indication of what base drmtip is using.

drm-intel drm-intel-next-queued f677bd558de2e98b70b7f8c522024b26d2d1120d drm/i915/icl: update VBT's child_device_config flags2 field

which is just (2 patches!) before

commit 4fdd5b4e9aba5fbbc6d3072a5a87fa1d3f3fc030
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Sat Jun 16 21:25:34 2018 +0100

    drm/i915: Fix fallout of fake reset along resume

was committed.
Indeed was not reproduced on drmtip_67. Closing! Thanks and sorry for the noise :s