CI_DRM_3302 fi-cnl-y igt@kms_addfb_basic@invalid-set-prop

kernel BUG at drivers/gpu/drm/i915/intel_lrc.c:886!

Note, this is from:
GEM_DEBUG_BUG_ON(buf[2 * head + 1] != port->context_id);

bug 102035 is from:
GEM_BUG_ON(status & GEN8_CTX_STATUS_PREEMPTED);

Also, note a secondary display was added from this run to the CNL system.

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3302/fi-cnl-y/igt@kms_addfb_basic@invalid-set-prop.html

<14>[ 304.794369] [IGT] kms_addfb_basic: starting subtest invalid-set-prop
<14>[ 304.794685] [IGT] kms_addfb_basic: exiting, ret=0
<4>[ 304.795912] ------------[ cut here ]------------
<2>[ 304.795920] kernel BUG at drivers/gpu/drm/i915/intel_lrc.c:886!
<4>[ 304.795954] invalid opcode: 0000 [#1] PREEMPT SMP
<4>[ 304.795964] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 snd_hda_intel snd_hda_codec x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul asix snd_hwdep usbnet snd_hda_core mii ghash_clmulni_intel e1000e snd_pcm ptp pps_core prime_numbers i2c_hid
<4>[ 304.796034] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G U 4.14.0-rc7-CI-CI_DRM_3302+ #1
<4>[ 304.796048] Hardware name: Intel Corporation CannonLake Client Platform/CannonLake Y LPDDR4 RVP, BIOS CNLSFWR1.R00.X100.A01.1708151220 08/15/2017
<4>[ 304.796067] task: ffff880267058040 task.stack: ffffc90000104000
<4>[ 304.796118] RIP: 0010:intel_lrc_irq_handler+0x2b2/0x8d0 [i915]
<4>[ 304.796128] RSP: 0018:ffff880271183ea0 EFLAGS: 00010297
<4>[ 304.796138] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000003
<4>[ 304.796150] RDX: 0000000000000005 RSI: 0000000000000001 RDI: 00000000ffffffff
<4>[ 304.796161] RBP: ffff880271183f00 R08: 0000000000000000 R09: 0000000000000001
<4>[ 304.796173] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
<4>[ 304.796184] R13: 0000000000000008 R14: ffff88025b4442a8 R15: ffff88025c4fa040
<4>[ 304.796196] FS: 0000000000000000(0000) GS:ffff880271180000(0000) knlGS:0000000000000000
<4>[ 304.796210] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 304.796220] CR2: 00007efc3a770000 CR3: 0000000003e0f005 CR4: 00000000007606e0
<4>[ 304.796231] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[ 304.796243] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4>[ 304.796254] PKRU: 55555554
<4>[ 304.796260] Call Trace:
<4>[ 304.796266] <IRQ>
<4>[ 304.796274] ? tasklet_hi_action+0x71/0x120
<4>[ 304.796285] tasklet_hi_action+0x98/0x120
<4>[ 304.796295] __do_softirq+0xc0/0x4ae
<4>[ 304.796306] irq_exit+0xae/0xc0
<4>[ 304.796313] do_IRQ+0x71/0x130
<4>[ 304.796322] common_interrupt+0x9a/0x9a
<4>[ 304.796330] </IRQ>
<4>[ 304.796337] RIP: 0010:cpuidle_enter_state+0x136/0x370
<4>[ 304.796346] RSP: 0018:ffffc90000107e80 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff1d
<4>[ 304.796360] RAX: ffff880267058040 RBX: 0000000000001ad9 RCX: 0000000000000001
<4>[ 304.796372] RDX: 0000000000000000 RSI: ffffffff81d0eb0c RDI: ffffffff81cc26f6
<4>[ 304.796383] RBP: ffffc90000107eb8 R08: 000000000000003a R09: 0000000000000006
<4>[ 304.796394] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
<4>[ 304.796405] R13: 0000000000000001 R14: ffff880264b4f1b8 R15: 00000046f73fec43
<4>[ 304.796436] cpuidle_enter+0x17/0x20
<4>[ 304.796446] call_cpuidle+0x23/0x40
<4>[ 304.796454] do_idle+0x192/0x1e0
<4>[ 304.796463] cpu_startup_entry+0x1d/0x20
<4>[ 304.796472] start_secondary+0x11c/0x140
<4>[ 304.796482] secondary_startup_64+0xa5/0xa5
<4>[ 304.796491] Code: 6d b0 4c 89 ef e8 af fc ff ff 4c 89 ef e8 67 fc ff ff 49 8b 86 80 03 00 00 a8 02 74 7e 41 0f ba b6 80 03 00 00 01 e9 ff fd ff ff <0f> 0b 0f 0b 0f 0b 4c 89 c7 e8 e0 dc 4a e1 e9 f0 fe ff ff 41 8b
<1>[ 304.796592] RIP: intel_lrc_irq_handler+0x2b2/0x8d0 [i915] RSP: ffff880271183ea0
<4>[ 304.796619] ---[ end trace 7a8f0715b10dca37 ]---
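
The two assertions quoted at the top of this report check different dwords of the same context-status buffer (CSB) entry. As a rough sketch only: the indexing `buf[2 * head + 1]` is taken from the quoted GEM_DEBUG_BUG_ON, while the status dword living at `buf[2 * head]`, the bit value used for GEN8_CTX_STATUS_PREEMPTED, and all helper names are assumptions made for illustration and are not the driver's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position; check the i915 headers for the real definition. */
#define GEN8_CTX_STATUS_PREEMPTED (1u << 1)

/* Hypothetical helper: status dword of CSB entry `head` (assumed layout). */
static inline uint32_t csb_status(const uint32_t *buf, unsigned int head)
{
	return buf[2 * head];
}

/* Hypothetical helper: context ID dword, indexed as in the quoted assertion. */
static inline uint32_t csb_context_id(const uint32_t *buf, unsigned int head)
{
	return buf[2 * head + 1];
}

/* Condition behind the assertion named in this report. */
static bool csb_context_mismatch(const uint32_t *buf, unsigned int head,
				 uint32_t port_context_id)
{
	return csb_context_id(buf, head) != port_context_id;
}

/* Condition behind the assertion named for bug 102035. */
static bool csb_unexpected_preempt(const uint32_t *buf, unsigned int head)
{
	return (csb_status(buf, head) & GEN8_CTX_STATUS_PREEMPTED) != 0;
}
```

Under these assumptions, an entry can trip one condition without the other, which is why the two reports are tracked separately.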
We had a bug for the cnl failure... Where did that go?
*** Bug 103027 has been marked as a duplicate of this bug. ***
new subtest:

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3326/fi-cnl-y/igt@gem_exec_flush@basic-wb-prw-default.html

<2>[ 175.361223] kernel BUG at drivers/gpu/drm/i915/intel_lrc.c:886!
<4>[ 175.361261] invalid opcode: 0000 [#1] PREEMPT SMP
<4>[ 175.361271] Modules linked in: snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_intel crct10dif_pclmul crc32_pclmul snd_hda_codec asix snd_hwdep snd_hda_core ghash_clmulni_intel usbnet mii snd_pcm e1000e ptp pps_core prime_numbers i2c_hid
<4>[ 175.361330] CPU: 3 PID: 1864 Comm: gem_exec_flush Tainted: G U 4.14.0-rc8-CI-CI_DRM_3326+ #1
<4>[ 175.361341] Hardware name: Intel Corporation CannonLake Client Platform/CannonLake Y LPDDR4 RVP, BIOS CNLSFWR1.R00.X100.A01.1708151220 08/15/2017
<4>[ 175.361356] task: ffff8802601ec040 task.stack: ffffc90000598000
<4>[ 175.361398] RIP: 0010:intel_lrc_irq_handler+0x2b4/0x8d0 [i915]
<4>[ 175.361405] RSP: 0018:ffff880271183ea0 EFLAGS: 00010293
<4>[ 175.361413] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000002
<4>[ 175.361422] RDX: 0000000000000005 RSI: 0000000000000001 RDI: 00000000ffffffff
<4>[ 175.361430] RBP: ffff880271183f00 R08: ffff8801d15ae340 R09: 0000000000000000
<4>[ 175.361438] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
<4>[ 175.361446] R13: 0000000000000008 R14: ffff880255912158 R15: ffff88025baf2040
<4>[ 175.361455] FS: 00007f39cc90c8c0(0000) GS:ffff880271180000(0000) knlGS:0000000000000000
<4>[ 175.361465] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 175.361472] CR2: 00007ffd6c5c17e8 CR3: 0000000260e42004 CR4: 00000000007606e0
<4>[ 175.361481] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[ 175.361489] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4>[ 175.361497] PKRU: 55555554
<4>[ 175.361502] Call Trace:
<4>[ 175.361506] <IRQ>
<4>[ 175.361514] ? tasklet_hi_action+0x71/0x120
<4>[ 175.361522] tasklet_hi_action+0x98/0x120
<4>[ 175.361530] __do_softirq+0xc0/0x4ae
<4>[ 175.361537] irq_exit+0xae/0xc0
<4>[ 175.361543] do_IRQ+0x71/0x130
<4>[ 175.361549] common_interrupt+0x9a/0x9a
<4>[ 175.361555] </IRQ>
<4>[ 175.361560] RIP: 0010:osq_lock+0x7c/0x110
<4>[ 175.361566] RSP: 0018:ffffc9000059bbb8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff1d
<4>[ 175.361576] RAX: 0000000000000000 RBX: ffff88027119b840 RCX: 5a6fd2f7be4312d3
<4>[ 175.361584] RDX: ffff8802601ec040 RSI: ffffffff81cb63b3 RDI: ffffffff81cc3ced
<4>[ 175.361593] RBP: ffffc9000059bbd0 R08: ffff8802601ec900 R09: 0000000000000000
<4>[ 175.361601] R10: 000000006be4761e R11: 000000009318ba4e R12: ffff88027111b840
<4>[ 175.361610] R13: ffff88025b6600b0 R14: ffffffff81c7c692 R15: ffff8802601ec040
<4>[ 175.361621] ? osq_lock+0x2f/0x110
<4>[ 175.361629] __mutex_lock+0x665/0x9b0
<4>[ 175.361635] ? __mutex_unlock_slowpath+0x43/0x2c0
<4>[ 175.361667] ? i915_gem_pread_ioctl+0x2cc/0x7f0 [i915]
<4>[ 175.361676] ? lock_acquire+0xb0/0x200
<4>[ 175.361706] ? i915_gem_pread_ioctl+0x251/0x7f0 [i915]
<4>[ 175.361736] ? i915_gem_object_get_page+0x60/0x60 [i915]
<4>[ 175.361744] mutex_lock_interruptible_nested+0x1b/0x20
<4>[ 175.361752] ? mutex_lock_interruptible_nested+0x1b/0x20
<4>[ 175.361782] i915_gem_pread_ioctl+0x2cc/0x7f0 [i915]
<4>[ 175.361789] ? lock_acquire+0xb0/0x200
<4>[ 175.361796] ? __might_fault+0x3e/0x90
<4>[ 175.361824] ? i915_gem_object_get_page+0x60/0x60 [i915]
<4>[ 175.361832] drm_ioctl_kernel+0x69/0xb0
<4>[ 175.361838] drm_ioctl+0x2f9/0x3d0
<4>[ 175.361864] ? i915_gem_object_get_page+0x60/0x60 [i915]
<4>[ 175.361874] ? lock_acquire+0xb0/0x200
<4>[ 175.361881] ? __might_fault+0x3e/0x90
<4>[ 175.361888] do_vfs_ioctl+0x94/0x670
<4>[ 175.361894] ? entry_SYSCALL_64_fastpath+0x5/0xb1
<4>[ 175.361901] ? __this_cpu_preempt_check+0x13/0x20
<4>[ 175.361908] ? trace_hardirqs_on_caller+0xe3/0x1b0
<4>[ 175.361916] SyS_ioctl+0x41/0x70
<4>[ 175.361922] entry_SYSCALL_64_fastpath+0x1c/0xb1
<4>[ 175.361929] RIP: 0033:0x7f39cae19587
<4>[ 175.361934] RSP: 002b:00007ffd6c5c1778 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
<4>[ 175.361944] RAX: ffffffffffffffda RBX: ffffffff81492003 RCX: 00007f39cae19587
<4>[ 175.361953] RDX: 00007ffd6c5c17b0 RSI: 000000004020645c RDI: 0000000000000003
<4>[ 175.361961] RBP: ffffc9000059bf88 R08: 0000000000000004 R09: 00007ffd6c5ec080
<4>[ 175.361969] R10: 0000000000000073 R11: 0000000000000246 R12: 00000000000000d6
<4>[ 175.361978] R13: 0000000000000003 R14: 000000004020645c R15: 0000000000000003
<4>[ 175.361987] ? __this_cpu_preempt_check+0x13/0x20
<4>[ 175.361995] Code: 4c 89 ef e8 af fc ff ff 4c 89 ef e8 67 fc ff ff 49 8b 86 80 03 00 00 a8 02 74 7e 41 0f ba b6 80 03 00 00 01 e9 ff fd ff ff 0f 0b <0f> 0b 0f 0b 4c 89 c7 e8 e0 26 4a e1 e9 f0 fe ff ff 41 8b 46 28
<1>[ 175.362091] RIP: intel_lrc_irq_handler+0x2b4/0x8d0 [i915] RSP: ffff880271183ea0
<4>[ 175.362117] ---[ end trace 72c868c4aac301ba ]---
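
Both traces in this report end in "invalid opcode", and in both `Code:` lines the byte marked `<0f>` is followed by `0b`, which is the x86 `ud2` encoding. The first trace stops at +0x2b2 and the second at +0x2b4, two bytes later in the same run of `0f 0b` pairs. A small sketch for checking this when triaging similar oopses; the function name is hypothetical and it only inspects the marked byte and the one after it:

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return true if the byte marked <xx> in an oops "Code:" line, together
 * with the byte that follows it, reads 0f 0b (x86 ud2).
 * `code` is the hex dump text, e.g. "ff ff <0f> 0b 0f 0b".
 */
static bool code_line_faults_on_ud2(const char *code)
{
	const char *mark = strchr(code, '<');
	char *end;
	unsigned long first, second;

	if (!mark)
		return false;

	/* Parse the marked byte and require the closing '>'. */
	first = strtoul(mark + 1, &end, 16);
	if (end == mark + 1 || *end != '>')
		return false;

	/* Parse the next byte; strtoul skips the separating space. */
	second = strtoul(end + 1, &end, 16);

	return first == 0x0f && second == 0x0b;
}
```

Passing the hex dump from either `Code:` line above should return true, since the marked byte is `0f` and the next is `0b` in both.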
Not seen for yonks; unfortunately, it has also not been seen since we enabled GEM_TRACE to capture the execlists leading to the bug. Presuming this is related to bug 103800, which has not been reproducible since roughly

commit ba74cb10c775c839f6e1d0fabd1e772eabd9c43f
Author: Michel Thierry <michel.thierry@intel.com>
Date: Mon Nov 20 12:34:58 2017 +0000

    drm/i915/execlists: Delay writing to ELSP until HW has processed the previous write
OK