Intel-GFX-CI hosts are starting to run i915 selftests, and this bug is part of the series "Initial findings" drv_selftest@live_sanitycheck fails on SKL and KBL GUC-enabled hosts: [ 368.932807] [loop 0] Failed to busy the object [ 368.932825] i915/i915_gem_object_live_selftests: igt_mmap_offset_exhaustion failed with error -5 [ 369.026897] i915: probe of 0000:00:02.0 failed with error -5 Full traces: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4432/fi-skl-guc/igt@drv_selftest@live_sanitycheck.html https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4432/fi-kbl-guc/igt@drv_selftest@live_sanitycheck.html
Wrong cut/paste there, this has probably more relevant information: [ 369.531090] RIP: 0010:plist_check_prev_next+0x0/0x40 [ 369.531094] Code: ff 85 c0 89 c2 7e 17 be 01 00 00 00 48 89 df e8 76 c5 b6 ff ba 01 00 00 00 85 c0 0f 4e d0 89 d0 5b c3 90 90 90 90 90 90 90 90 <48> 8b 42 08 48 39 f0 74 2b 49 89 f0 48 8b 4f 08 50 ff 32 52 48 89 [ 369.531147] RSP: 0018:ffffc9000051fa88 EFLAGS: 00010006 [ 369.531151] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002 [ 369.531155] RDX: 0000000000000000 RSI: ffff880212c87820 RDI: ffffffff82243de0 [ 369.531159] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001 [ 369.531164] R10: ffffc9000051fa70 R11: 0000000000000000 R12: ffffffff82243de0 [ 369.531168] R13: ffffffff82243de0 R14: ffff8802168b7820 R15: 0000000077359400 [ 369.531173] FS: 00007f17536e5980(0000) GS:ffff880236c80000(0000) knlGS:0000000000000000 [ 369.531178] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 369.531182] CR2: 0000000000000008 CR3: 000000014df46004 CR4: 00000000003606e0 [ 369.531186] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 369.531190] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 369.531195] BUG: sleeping function called from invalid context at ./include/linux/percpu-rwsem.h:34 [ 369.531200] in_atomic(): 1, irqs_disabled(): 1, pid: 8902, name: drv_selftest [ 369.531205] INFO: lockdep is turned off. [ 369.531207] irq event stamp: 230888 [ 369.531211] hardirqs last enabled at (230887): [<ffffffff8193e0dc>] _raw_spin_unlock_irqrestore+0x4c/0x60 [ 369.531217] hardirqs last disabled at (230888): [<ffffffff8193df4d>] _raw_spin_lock_irqsave+0xd/0x50 [ 369.531224] softirqs last enabled at (230732): [<ffffffff81c0034f>] __do_softirq+0x34f/0x505 [ 369.531230] softirqs last disabled at (230725): [<ffffffff8108c759>] irq_exit+0xa9/0xc0 [ 369.531235] Preemption disabled at: [ 369.531235] [<0000000000000000>] (null) [ 369.531242] CPU: 2 PID: 8902 Comm: drv_selftest Tainted: G UD W 4.18.0-rc3-CI-CI_DRM_4432+ #1 [ 369.531247] Hardware name: System manufacturer System Product Name/Z170M-PLUS, BIOS 3610 03/29/2018 [ 369.531252] Call Trace: [ 369.531256] dump_stack+0x67/0x9b [ 369.531260] ___might_sleep+0x167/0x250 [ 369.531264] exit_signals+0x2b/0x2d0 [ 369.531268] do_exit+0xa3/0xd40 [ 369.531273] rewind_stack_do_exit+0x17/0x20
Fwiw, there was a kasan run https://intel-gfx-ci.01.org/tree/drm-tip/kasan_48/ and that didn't seemingly suggest anything. (The closest, most novel splat, is a hit for live_objects.)
commit 7ab87ede5078af1daccf26951096e16ac16e19cb (HEAD -> drm-intel-next-queued, drm-intel/for-linux-next, drm-intel/drm-intel-next-queued) Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Jul 10 15:38:21 2018 +0100 drm/i915: Unwind HW init after GVT setup failure Following intel_gvt_init() failure, we missed unwinding our setup leaving pointers dangling past the module unload. For our example, the pm_qos: [ 441.057615] top: 000000006b3baf1c, n: 0000000054d8ef33, p: 0000000097cdf1a2 prev: 0000000054d8ef33, n: 0000000097cdf1a2, p: 000000006b3baf1c next: 0000000097cdf1a2, n: 000000006de8fc8b, p: 0000000081087253 [ 441.057627] WARNING: CPU: 4 PID: 9277 at lib/plist.c:42 plist_check_prev_next+0x2d/0x40 [ 441.057628] Modules linked in: i915(+) vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec snd_hwdep snd_hda_core e1000e snd_pcm mei_me mei prime_numbers [last unloaded: i915] [ 441.057652] CPU: 4 PID: 9277 Comm: drv_selftest Tainted: G U 4.18.0-rc4-CI-CI_DRM_4464+ #1 [ 441.057653] Hardware name: System manufacturer System Product Name/Z170 PRO GAMING, BIOS 3402 04/26/2017 [ 441.057656] RIP: 0010:plist_check_prev_next+0x2d/0x40 [ 441.057657] Code: 08 48 39 f0 74 2b 49 89 f0 48 8b 4f 08 50 ff 32 52 48 89 fe 41 ff 70 08 48 8b 17 48 c7 c7 d8 ae 14 82 4d 8b 08 e8 63 0e 76 ff <0f> 0b 48 83 c4 20 c3 48 39 10 75 d0 f3 c3 0f 1f 44 00 00 41 54 55 [ 441.057717] RSP: 0018:ffffc900003a3a68 EFLAGS: 00010082 [ 441.057720] RAX: 0000000000000000 RBX: ffff8802193978c0 RCX: 0000000000000002 [ 441.057721] RDX: 0000000080000002 RSI: ffffffff820c65a4 RDI: 00000000ffffffff [ 441.057722] RBP: ffff8802193978c0 R08: 0000000000000000 R09: 0000000000000001 [ 441.057724] R10: ffffc900003a3a70 R11: 0000000000000000 R12: ffffffff82243de0 [ 441.057725] R13: ffffffff82243de0 R14: ffff88021a6c78c0 R15: 0000000077359400 [ 441.057726] FS: 00007fc23a4a9980(0000) GS:ffff880236d00000(0000) knlGS:0000000000000000 [ 441.057728] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 441.057729] CR2: 0000563e4503d038 CR3: 0000000138f86005 CR4: 00000000003606e0 [ 441.057730] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 441.057731] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 441.057732] Call Trace: [ 441.057736] plist_check_list+0x2e/0x40 [ 441.057738] plist_add+0x23/0x130 [ 441.057743] pm_qos_update_target+0x1bd/0x2f0 [ 441.057771] i915_driver_load+0xec4/0x1060 [i915] [ 441.057775] ? trace_hardirqs_on_caller+0xe0/0x1b0 [ 441.057800] i915_pci_probe+0x29/0x90 [i915] [ 441.057804] pci_device_probe+0xa1/0x130 [ 441.057807] driver_probe_device+0x306/0x480 [ 441.057810] __driver_attach+0xdb/0x100 [ 441.057812] ? driver_probe_device+0x480/0x480 [ 441.057813] ? driver_probe_device+0x480/0x480 [ 441.057816] bus_for_each_dev+0x74/0xc0 [ 441.057819] bus_add_driver+0x15f/0x250 [ 441.057821] ? 0xffffffffa0696000 [ 441.057823] driver_register+0x56/0xe0 [ 441.057825] ? 0xffffffffa0696000 [ 441.057827] do_one_initcall+0x58/0x370 [ 441.057830] ? do_init_module+0x1d/0x1ea [ 441.057832] ? rcu_read_lock_sched_held+0x6f/0x80 [ 441.057834] ? kmem_cache_alloc_trace+0x282/0x2e0 [ 441.057838] do_init_module+0x56/0x1ea [ 441.057841] load_module+0x2435/0x2b20 [ 441.057852] ? __se_sys_finit_module+0xd3/0xf0 [ 441.057854] __se_sys_finit_module+0xd3/0xf0 [ 441.057861] do_syscall_64+0x55/0x190 [ 441.057863] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 441.057865] RIP: 0033:0x7fc239d75839 [ 441.057866] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48 [ 441.057927] RSP: 002b:00007fffb7825d38 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 441.057930] RAX: ffffffffffffffda RBX: 0000563e45035dd0 RCX: 00007fc239d75839 [ 441.057931] RDX: 0000000000000000 RSI: 0000563e4502f8a0 RDI: 0000000000000004 [ 441.057932] RBP: 0000563e4502f8a0 R08: 0000000000000004 R09: 0000000000000000 [ 441.057933] R10: 00007fffb7825ea0 R11: 0000000000000246 R12: 0000000000000000 [ 441.057934] R13: 0000563e4502f690 R14: 0000000000000000 R15: 000000000000003f [ 441.057940] irq event stamp: 231338 [ 441.057943] hardirqs last enabled at (231337): [<ffffffff8193e3fc>] _raw_spin_unlock_irqrestore+0x4c/0x60 [ 441.057944] hardirqs last disabled at (231338): [<ffffffff8193e26d>] _raw_spin_lock_irqsave+0xd/0x50 [ 441.057947] softirqs last enabled at (231024): [<ffffffff81c0034f>] __do_softirq+0x34f/0x505 [ 441.057949] softirqs last disabled at (231005): [<ffffffff8108c7b9>] irq_exit+0xa9/0xc0 [ 441.057951] WARNING: CPU: 4 PID: 9277 at lib/plist.c:42 plist_check_prev_next+0x2d/0x40 v2: Add a load failure point to intel_gvt_init() so that we always exercise this path in future. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107129 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180710143821.1889-1-chris@chris-wilson.co.uk
This test has been green since CI_DRM_4465 on fi-skl-guc and fi-kbl-guc, closing
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.