Testhost is i7-6700K on Gigabyte Z170 MB. Serial is real COM1 header. Lockup happened while running IGT testlist in loop (started from serial console, if that matters). Kernel is built from DRM-Tip today. BIOS newest. Kconfig at https://intel-gfx-ci.01.org/CI/CI_DRM_2352/kernel.config.bz2 [22018.856538] [drm:intel_atomic_commit_tail [i915]] [CRTC:46:pipe C] [22018.856551] [drm:intel_power_well_disable [i915]] disabling DC off [22018.856566] [drm:skl_enable_dc6 [i915]] Enabling DC6 [22018.856577] [drm:gen9_set_dc_state [i915]] Setting DC state from 00 to 02 [22018.856997] [drm:intel_power_well_disable [i915]] disabling always-on [22018.857038] [drm:intel_runtime_suspend [i915]] Suspending device [22018.857680] [IGT] kms_flip: exiting, ret=0 [22018.859602] [drm:intel_runtime_suspend [i915]] Device suspended [22018.884658] [drm:intel_runtime_resume [i915]] Resuming device [22018.888911] [drm:intel_runtime_resume [i915]] Device resumed [22018.888993] [drm:intel_runtime_suspend [i915]] Suspending device [22018.891615] [drm:intel_runtime_suspend [i915]] Device suspended [22055.545628] INFO: task kms_flip:25920 blocked for more than 30 seconds. [22055.545629] Tainted: G U W 4.11.0-rc2-CI-CI_DRM_2357+ #1 [22055.545629] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [22055.545630] kms_flip D 0 25920 20731 0x00000000 [22055.545631] Call Trace: [22055.545635] __schedule+0x3fd/0xbc0 [22055.545636] ? __mutex_unlock_slowpath+0x3e/0x2e0 [22055.545637] schedule+0x3b/0x90 [22055.545639] schedule_timeout+0x23b/0x490 [22055.545641] ? __save_stack_trace+0x7e/0xd0 [22055.545643] __down+0x80/0xc0 [22055.545644] ? _raw_spin_lock_irqsave+0x49/0x60 [22055.545645] down+0x3c/0x50 [22055.545647] console_lock+0x27/0x70 [22055.545648] ? capable+0x17/0x20 [22055.545650] vt_ioctl+0x51a/0x1440 [22055.545651] ? _raw_spin_unlock_irqrestore+0x3d/0x60 [22055.545652] ? __slab_free+0x39f/0x4f0 [22055.545654] tty_ioctl+0x384/0xe80 [22055.545655] ? _raw_spin_unlock_irqrestore+0x3d/0x60 [22055.545656] ? debug_check_no_obj_freed+0x117/0x220 [22055.545658] do_vfs_ioctl+0x90/0x6e0 [22055.545659] ? putname+0x4f/0x60 [22055.545661] SyS_ioctl+0x3c/0x70 [22055.545662] entry_SYSCALL_64_fastpath+0x1c/0xb1 [22055.545663] RIP: 0033:0x7f6c17851357 [22055.545663] RSP: 002b:00007fffa5d50198 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [22055.545664] RAX: ffffffffffffffda RBX: 0000000001566ea0 RCX: 00007f6c17851357 [22055.545664] RDX: 0000000000000000 RSI: 0000000000004b3a RDI: 0000000000000006 [22055.545665] RBP: 0000000000000000 R08: 00007fffa5d50160 R09: 0000000000000000 [22055.545665] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000000 [22055.545665] R13: 00007fffa5d50ed0 R14: 0000000000000000 R15: 0000000000000000 [22055.545667] INFO: lockdep is turned off. [22055.545667] NMI backtrace for cpu 5 [22055.545669] CPU: 5 PID: 66 Comm: khungtaskd Tainted: G U W 4.11.0-rc2-CI-CI_DRM_2357+ #1 [22055.545669] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F21 01/06/2 017 [22055.545669] Call Trace: [22055.545671] dump_stack+0x67/0x92 [22055.545672] ? irq_force_complete_move+0xf0/0xf0 [22055.545673] nmi_cpu_backtrace+0x92/0xa0 [22055.545674] ? irq_force_complete_move+0xf0/0xf0 [22055.545675] nmi_trigger_cpumask_backtrace+0xc0/0x100 [22055.545677] arch_trigger_cpumask_backtrace+0x14/0x20 [22055.545678] watchdog+0x433/0x600 [22055.545678] ? watchdog+0x9c/0x600 [22055.545680] kthread+0x107/0x140 [22055.545680] ? reset_hung_task_detector+0x10/0x10 [22055.545681] ? kthread_create_on_node+0x40/0x40 [22055.545682] ret_from_fork+0x2e/0x40 [22055.545684] Sending NMI from CPU 5 to CPUs 0-4,6-7: [22055.545690] NMI backtrace for cpu 0 skipped: idling at pc 0xffffffff818760f0 [22055.545691] NMI backtrace for cpu 1 skipped: idling at pc 0xffffffff818760f0 [22055.545721] NMI backtrace for cpu 7 skipped: idling at pc 0xffffffff818760f0 [22055.545722] NMI backtrace for cpu 3 skipped: idling at pc 0xffffffff818760f0 [22055.545723] NMI backtrace for cpu 2 skipped: idling at pc 0xffffffff818760f0 [22055.545723] NMI backtrace for cpu 6 skipped: idling at pc 0xffffffff818760f0 [22055.546688] NMI backtrace for cpu 4 [22055.546688] CPU: 4 PID: 25851 Comm: kworker/4:1 Tainted: G U W 4.11.0-rc2-CI-CI_DRM_23 57+ #1 [22055.546689] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F21 01/06/2 017 [22055.546689] Workqueue: usb_hub_wq hub_event [22055.546689] task: ffff88041d3c0040 task.stack: ffffc900042b4000 [22055.546689] RIP: 0010:io_serial_in+0x13/0x20 [22055.546690] RSP: 0018:ffffc900042b79a0 EFLAGS: 00000002 [22055.546690] RAX: 0000000080000000 RBX: ffffffff83680100 RCX: 0000000000000000 [22055.546690] RDX: 00000000000003fd RSI: 0000000000000005 RDI: ffffffff83680100 [22055.546691] RBP: ffffc900042b79a0 R08: 0000000000000001 R09: 0000000000000001 [22055.546691] R10: 0000000000000000 R11: ffff88041d3c0040 R12: 0000000000002704 [22055.546691] R13: 0000000000000020 R14: ffffffff82d04fbb R15: ffffffff83680100 [22055.546691] FS: 0000000000000000(0000) GS:ffff88043ed00000(0000) knlGS:0000000000000000 [22055.546691] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [22055.546692] CR2: 00007f6c142323c0 CR3: 0000000003e0f000 CR4: 00000000003406e0 [22055.546692] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [22055.546692] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [22055.546692] Call Trace: [22055.546693] wait_for_xmitr+0x37/0xa0 [22055.546693] serial8250_console_putchar+0x17/0x30 [22055.546693] ? wait_for_xmitr+0xa0/0xa0 [22055.546693] uart_console_write+0x2b/0x60 [22055.546693] serial8250_console_write+0xbc/0x270 [22055.546693] ? vt_console_print+0x22d/0x3b0 [22055.546694] univ8250_console_write+0x1c/0x20 [22055.546694] console_unlock+0x40e/0x530 [22055.546694] ? vprintk_emit+0x27a/0x380 [22055.546694] vprintk_emit+0x283/0x380 [22055.546694] ? snprintf+0x39/0x40 [22055.546695] dev_vprintk_emit+0xab/0x220 [22055.546695] ? usb_cache_string+0x7e/0xa0 [22055.546695] ? trace_hardirqs_on+0xd/0x10 [22055.546695] dev_printk_emit+0x39/0x40 [22055.546695] __dev_printk+0x37/0x80 [22055.546696] _dev_info+0x57/0x60 [22055.546696] usb_new_device+0x122/0x4a0 [22055.546696] hub_event+0xb18/0x1500 [22055.546696] process_one_work+0x1f4/0x6d0 [22055.546696] ? process_one_work+0x16e/0x6d0 [22055.546696] worker_thread+0x49/0x4a0 [22055.546697] kthread+0x107/0x140 [22055.546697] ? process_one_work+0x6d0/0x6d0 [22055.546697] ? kthread_create_on_node+0x40/0x40 [22055.546697] ret_from_fork+0x2e/0x40 [22055.546697] Code: 48 89 e5 d3 e6 48 63 f6 48 03 77 40 8b 06 5d c3 0f 1f 84 00 00 00 00 00 0f b6 8 f d1 00 00 00 8b 57 38 55 48 89 e5 d3 e6 01 f2 ec <0f> b6 c0 5d c3 0f 1f 84 00 00 00 00 00 0f b6 8f d1 00 00 00 89 [22055.546706] Kernel panic - not syncing: hung_task: blocked tasks [22055.546707] CPU: 5 PID: 66 Comm: khungtaskd Tainted: G U W 4.11.0-rc2-CI-CI_DRM_2357+ #1 [22055.546707] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F21 01/06/2 017 [22055.546707] Call Trace: [22055.546708] dump_stack+0x67/0x92 [22055.546709] panic+0xcf/0x205 [22055.546711] ? irq_force_complete_move+0xf0/0xf0 [22055.546712] watchdog+0x43f/0x600 [22055.546712] ? watchdog+0x9c/0x600 [22055.546713] kthread+0x107/0x140 [22055.546714] ? reset_hung_task_detector+0x10/0x10 [22055.546715] ? kthread_create_on_node+0x40/0x40 [22055.546716] ret_from_fork+0x2e/0x40 [22055.555570] Kernel Offset: disabled [22070.033123] ---[ end Kernel panic - not syncing: hung_task: blocked tasks
Each character has a 1s timeout in wait_for_xmitr. If the string was log enough and the serial console bust, it can easily appear to be hung. As wait_for_xmitr keeps poking the nmi watchdog. the slow loop will not be caught. If that is correct, this appears to be a hw issue. An alternative hypothesis is that the usb device struct sent garbage to be printed and it really is stuck printing out all of memory, 1 character at a time. Kasan might help. I don't think the vtcon switch of modes is significant. What would be useful is the dmesg before to see if we can identify what usb event triggered.
(In reply to Chris Wilson from comment #1) > Each character has a 1s timeout in wait_for_xmitr. If the string was log > enough and the serial console bust, it can easily appear to be hung. As > wait_for_xmitr keeps poking the nmi watchdog. the slow loop will not be > caught. If that is correct, this appears to be a hw issue. > > An alternative hypothesis is that the usb device struct sent garbage to be > printed and it really is stuck printing out all of memory, 1 character at a > time. Kasan might help. > > I don't think the vtcon switch of modes is significant. > > What would be useful is the dmesg before to see if we can identify what usb > event triggered. Hello, Is this bug still valid? If so, could you attach the dmesg mentioned before or any other relevant information? Thank you.
Hello, I asked Tomi about this case. The problem couldn't be reproduced again, if this is pointed up again it will be reopened. Thank you.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.