There's a regression: v4.19 is rock solid on a ThinkPad X220, while v4.20-rc1+ crashes after a day or two of use. It also happens on linux-next 20181210.
One of the oopses:

Nov 8 18:35:01 duo CRON[28511]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Nov 8 18:42:57 duo kernel: list_del corruption. prev->next should be ffff8801742b8178, but was ffffc9000192fec8
Nov 8 18:42:57 duo kernel: ------------[ cut here ]------------
Nov 8 18:42:57 duo kernel: kernel BUG at /data/fast/l/k/lib/list_debug.c:53!
Nov 8 18:42:57 duo kernel: invalid opcode: 0000 [#1] SMP PTI
Nov 8 18:42:57 duo kernel: CPU: 2 PID: 1082 Comm: i915/signal:1 Not tainted 4.20.0-rc1+ #3
Nov 8 18:42:57 duo kernel: Hardware name: LENOVO 42872WU/42872WU, BIOS 8DET74WW (1.44 ) 03/13/2018
Nov 8 18:42:57 duo kernel: RIP: 0010:__list_del_entry_valid+0x8e/0x90
Nov 8 18:42:57 duo kernel: Code: 66 88 d1 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 90 74 5e 85 e8 53 88 d1 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 c8 74 5e 85 e8 40 88 d1 ff <0f> 0b 55 48 89 d0 48 8b 52 08 48 89 e5 48 39 f2 75 19 48 8b 32 48
Nov 8 18:42:57 duo kernel: RSP: 0000:ffffc9000196be78 EFLAGS: 00210086
Nov 8 18:42:57 duo kernel: RAX: 0000000000000054 RBX: ffff8801742b8178 RCX: 0000000000000000
Nov 8 18:42:57 duo kernel: RDX: 0000000000000000 RSI: ffff88019e2a53d8 RDI: ffff88019e2a53d8
Nov 8 18:42:57 duo kernel: RBP: ffffc9000196be78 R08: ffff880196e2cd10 R09: 0000000000000000
Nov 8 18:42:57 duo kernel: R10: 00000000e7684eb9 R11: 3863656632393101 R12: ffffc9000196bec8
Nov 8 18:42:57 duo kernel: R13: ffff88019707e000 R14: ffff8801742b8080 R15: ffffc9000192fdd0
Nov 8 18:42:57 duo kernel: FS: 0000000000000000(0000) GS:ffff88019e280000(0000) knlGS:0000000000000000
Nov 8 18:42:57 duo kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 8 18:42:57 duo kernel: CR2: 00000000ed2bf000 CR3: 000000000581e001 CR4: 00000000000606a0
Nov 8 18:42:57 duo kernel: Call Trace:
Nov 8 18:42:57 duo kernel: intel_breadcrumbs_signaler+0x162/0x330
Nov 8 18:42:57 duo kernel: kthread+0x116/0x150
Nov 8 18:42:57 duo kernel: ? intel_engine_wakeup+0x40/0x40
Nov 8 18:42:57 duo kernel: ? kthread_park+0x90/0x90
Nov 8 18:42:57 duo kernel: ret_from_fork+0x35/0x40
Nov 8 18:42:57 duo kernel: Modules linked in:
Nov 8 18:42:57 duo kernel: ---[ end trace 2f8da183a56f80f6 ]---
Nov 8 18:42:57 duo kernel: RIP: 0010:__list_del_entry_valid+0x8e/0x90
Nov 8 18:42:57 duo kernel: Code: 66 88 d1 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 90 74 5e 85 e8 53 88 d1 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 c8 74 5e 85 e8 40 88 d1 ff <0f> 0b 55 48 89 d0 48 8b 52 08 48 89 e5 48 39 f2 75 19 48 8b 32 48
Nov 8 18:42:57 duo kernel: RSP: 0000:ffffc9000196be78 EFLAGS: 00210086
Nov 8 18:42:57 duo kernel: RAX: 0000000000000054 RBX: ffff8801742b8178 RCX: 0000000000000000
Nov 8 18:42:57 duo kernel: RDX: 0000000000000000 RSI: ffff88019e2a53d8 RDI: ffff88019e2a53d8
Nov 8 18:42:57 duo kernel: RBP: ffffc9000196be78 R08: ffff880196e2cd10 R09: 0000000000000000
Nov 8 18:42:57 duo kernel: R10: 00000000e7684eb9 R11: 3863656632393101 R12: ffffc9000196bec8
Nov 8 18:42:57 duo kernel: R13: ffff88019707e000 R14: ffff8801742b8080 R15: ffffc9000192fdd0
Nov 8 18:42:57 duo kernel: FS: 0000000000000000(0000) GS:ffff88019e280000(0000) knlGS:0000000000000000
Nov 8 18:42:57 duo kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 8 18:42:57 duo kernel: CR2: 00000000ed2bf000 CR3: 000000000581e001 CR4: 00000000000606a0
Second one:

Dec 8 11:45:01 duo CRON[29325]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Dec 8 11:46:42 duo org.mate.panel.applet.MateWeatherAppletFactory[3983]: (mateweather-applet-2:4242): GLib-CRITICAL **: Source ID 14603 was not found when attempting to remove it
Dec 8 11:54:59 duo kernel: list_del corruption. prev->next should be ffff88019283ea28, but was ffff8801411a1c68
Dec 8 11:54:59 duo kernel: ------------[ cut here ]------------
Dec 8 11:54:59 duo kernel: kernel BUG at /data/fast/l/k/lib/list_debug.c:53!
Dec 8 11:54:59 duo kernel: invalid opcode: 0000 [#1] SMP PTI
Dec 8 11:54:59 duo kernel: CPU: 1 PID: 3428 Comm: Xorg Not tainted 4.20.0-rc1+ #4
Dec 8 11:54:59 duo kernel: Hardware name: LENOVO 42872WU/42872WU, BIOS 8DET74WW (1.44 ) 03/13/2018
Dec 8 11:54:59 duo kernel: RIP: 0010:__list_del_entry_valid+0x8e/0x90
Dec 8 11:54:59 duo kernel: Code: 16 88 d1 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 08 75 5e 85 e8 03 88 d1 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 40 75 5e 85 e8 f0 87 d1 ff <0f> 0b 55 48 89 d0 48 8b 52 08 48 89 e5 48 39 f2 75 19 48 8b 32 48
Dec 8 11:54:59 duo kernel: RSP: 0000:ffffc90000223ac0 EFLAGS: 00213282
Dec 8 11:54:59 duo kernel: RAX: 0000000000000054 RBX: ffff880115a07c40 RCX: 0000000000000000
Dec 8 11:54:59 duo kernel: RDX: 0000000000000000 RSI: ffff88019e2653d8 RDI: ffff88019e2653d8
Dec 8 11:54:59 duo kernel: RBP: ffffc90000223ac0 R08: ffff880193a2ad10 R09: 0000000000000000
Dec 8 11:54:59 duo kernel: R10: 00000000008e9088 R11: 2e6e6f6974707501 R12: ffff8801960cb240
Dec 8 11:54:59 duo kernel: R13: ffff88019283e900 R14: ffff880115a07ec0 R15: ffff88019283ea28
Dec 8 11:54:59 duo kernel: FS: 0000000000000000(0000) GS:ffff88019e240000(0063) knlGS:00000000f79c4880
Dec 8 11:54:59 duo kernel: CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
Dec 8 11:54:59 duo kernel: CR2: 00000000086b0df8 CR3: 00000001939f6004 CR4: 00000000000606a0
Dec 8 11:54:59 duo kernel: Call Trace:
Dec 8 11:54:59 duo kernel: i915_vma_move_to_active+0x1c3/0x510
Dec 8 11:54:59 duo kernel: ? i915_request_await_object+0xf4/0x280
Dec 8 11:54:59 duo kernel: i915_gem_do_execbuffer+0xe2f/0x10a0
Dec 8 11:54:59 duo kernel: ? find_held_lock+0x39/0xb0
Dec 8 11:54:59 duo kernel: ? kvmalloc_node+0x26/0x70
Dec 8 11:54:59 duo kernel: i915_gem_execbuffer2_ioctl+0x1b4/0x360
Dec 8 11:54:59 duo kernel: ? i915_gem_execbuffer_ioctl+0x290/0x290
Dec 8 11:54:59 duo kernel: drm_ioctl_kernel+0xaa/0xf0
Dec 8 11:54:59 duo kernel: drm_ioctl+0x323/0x3d0
Dec 8 11:54:59 duo kernel: ? i915_gem_execbuffer_ioctl+0x290/0x290
Dec 8 11:54:59 duo kernel: ? posix_ktime_get_ts+0xc/0x10
Dec 8 11:54:59 duo kernel: i915_compat_ioctl+0x37/0x40
Dec 8 11:54:59 duo kernel: __ia32_compat_sys_ioctl+0x429/0xe90
Dec 8 11:54:59 duo kernel: ? put_old_timespec32+0x9/0x10
Dec 8 11:54:59 duo kernel: ? __ia32_compat_sys_clock_gettime+0x67/0x90
Dec 8 11:54:59 duo kernel: do_int80_syscall_32+0x50/0x100
Dec 8 11:54:59 duo kernel: entry_INT80_compat+0x7d/0x82
Dec 8 11:54:59 duo kernel: RIP: 0023:0xf7fd5c42
Dec 8 11:54:59 duo kernel: Code: 65 8b 15 04 00 00 00 8b 0e 8b 0c ca 83 f9 ff 75 0c 89 04 24 89 f0 e8 b3 fe ff ff eb 05 8b 46 04 01 c8 83 c4 14 5b 5e c3 cd 80 <c3> 8d b6 00 00 00 00 8d bc 27 00 00 00 00 8b 1c 24 c3 8d b6 00 00
Dec 8 11:54:59 duo kernel: RSP: 002b:00000000fff1a014 EFLAGS: 00203292 ORIG_RAX: 0000000000000036
Dec 8 11:54:59 duo kernel: RAX: ffffffffffffffda RBX: 000000000000000a RCX: 0000000040406469
Dec 8 11:54:59 duo kernel: RDX: 00000000fff1a0bc RSI: 0000000000000000 RDI: 0000000040406469
Dec 8 11:54:59 duo kernel: RBP: 000000000000000a R08: 0000000000000000 R09: 0000000000000000
Dec 8 11:54:59 duo kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Dec 8 11:54:59 duo kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Dec 8 11:54:59 duo kernel: Modules linked in:
Dec 8 11:54:59 duo kernel: ---[ end trace 0c1e74ccc719c763 ]---
Dec 8 11:54:59 duo kernel: RIP: 0010:__list_del_entry_valid+0x8e/0x90
Dec 8 11:54:59 duo kernel: Code: 16 88 d1 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 08 75 5e 85 e8 03 88 d1 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 40 75 5e 85 e8 f0 87 d1 ff <0f> 0b 55 48 89 d0 48 8b 52 08 48 89 e5 48 39 f2 75 19 48 8b 32 48
Dec 8 11:54:59 duo kernel: RSP: 0000:ffffc90000223ac0 EFLAGS: 00213282
Dec 8 11:54:59 duo kernel: RAX: 0000000000000054 RBX: ffff880115a07c40 RCX: 0000000000000000
Dec 8 11:54:59 duo kernel: RDX: 0000000000000000 RSI: ffff88019e2653d8 RDI: ffff88019e2653d8
Dec 8 11:54:59 duo kernel: RBP: ffffc90000223ac0 R08: ffff880193a2ad10 R09: 0000000000000000
Dec 8 11:54:59 duo kernel: R10: 00000000008e9088 R11: 2e6e6f6974707501 R12: ffff8801960cb240
Dec 8 11:54:59 duo kernel: R13: ffff88019283e900 R14: ffff880115a07ec0 R15: ffff88019283ea28
Dec 8 11:54:59 duo kernel: FS: 0000000000000000(0000) GS:ffff88019e240000(0063) knlGS:00000000f79c4880
Dec 8 11:54:59 duo kernel: CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
Dec 8 11:54:59 duo kernel: CR2: 00000000086b0df8 CR3: 00000001939f6004 CR4: 00000000000606a0
Dec 8 11:54:59 duo org.mate.panel.applet.WnckletFactory[3983]: wnck-applet: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Dec 8 11:54:59 duo org.mate.panel.applet.MateWeatherAppletFactory[3983]: mateweather-applet-2: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Dec 8 11:55:00 duo org.mate.panel.applet.CommandAppletFactory[3983]: command-applet: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Dec 8 11:55:00 duo org.mate.panel.applet.NotificationAreaAppletFactory[3983]: notification-area-applet: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Dec 8 11:55:00 duo org.mate.panel.applet.ClockAppletFactory[3983]: clock-applet: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Dec 8 11:55:01 duo CRON[30056]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Dec 8 11:55:02 duo org.mate.panel.applet.InhibitAppletFactory[3983]: mate-inhibit-applet: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Dec 8 11:55:09 duo org.a11y.atspi.Registry[4114]: XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0"
Another one. fgfs always segfaults; that's "normal".

[79648.504664] fgfs[6740]: segfault at 8b58308 ip 0000000008b58308 sp 00000000ffd1327c error 15
[79648.504674] Code: 00 00 00 00 00 00 29 00 00 00 f0 83 b5 08 38 82 b5 08 00 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 a8 82 b5 08 00 00 00 00 <00> 00 00 00 11 00 00 00 08 8d b5 08 00 00 00 00 00 00 00 00 21 00
[79922.056325] list_del corruption. next->prev should be ffff88818761da28, but was ffff88817cb31e80
[79922.056348] ------------[ cut here ]------------
[79922.056355] kernel BUG at /data/fast/l/k/lib/list_debug.c:56!
[79922.056368] invalid opcode: 0000 [#1] SMP PTI
[79922.056375] CPU: 1 PID: 3478 Comm: Xorg Not tainted 4.20.0-rc6-next-20181210 #3
[79922.056379] Hardware name: LENOVO 42872WU/42872WU, BIOS 8DET74WW (1.44 ) 03/13/2018
[79922.056391] RIP: 0010:__list_del_entry_valid+0x52/0x90
[79922.056396] Code: 41 48 8b 12 48 39 d7 75 4c 48 8b 50 08 48 39 d7 75 07 b8 01 00 00 00 5d c3 48 89 fe 31 c0 48 c7 c7 20 9f 5e 85 e8 5c bb d1 ff <0f> 0b 48 89 c2 48 89 fe 31 c0 48 c7 c7 50 9e 5e 85 e8 46 bb d1 ff
[79922.056401] RSP: 0000:ffffc90000287ac0 EFLAGS: 00213282
[79922.056406] RAX: 0000000000000054 RBX: ffff88817cb31c00 RCX: 0000000000000000
[79922.056410] RDX: 0000000000000000 RSI: ffff88819e2653d8 RDI: ffff88819e2653d8
[79922.056413] RBP: ffffc90000287ac0 R08: ffff8881944c1090 R09: 0000000000000000
[79922.056417] R10: 00000000008e9088 R11: 3038653133626301 R12: ffff888187769b40
[79922.056421] R13: ffff88818761d900 R14: ffff88817cb31e80 R15: ffff88818761da28
[79922.056425] FS: 0000000000000000(0000) GS:ffff88819e240000(0063) knlGS:00000000f793c880
[79922.056430] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
[79922.056433] CR2: 0000000008064670 CR3: 000000018bdc6001 CR4: 00000000000606a0
[79922.056436] Call Trace:
[79922.056447] i915_vma_move_to_active+0x1c3/0x530
[79922.056454] ? i915_request_await_object+0xf4/0x280
[79922.056461] i915_gem_do_execbuffer+0xe28/0x1090
[79922.056470] ? find_held_lock+0x39/0xb0
[79922.056478] ? kvmalloc_node+0x26/0x70
[79922.056484] i915_gem_execbuffer2_ioctl+0x1b4/0x360
[79922.056489] ? i915_gem_execbuffer_ioctl+0x290/0x290
[79922.056494] ? i915_gem_execbuffer_ioctl+0x290/0x290
[79922.056500] drm_ioctl_kernel+0xaa/0xf0
[79922.056505] drm_ioctl+0x31b/0x3c0
[79922.056510] ? i915_gem_execbuffer_ioctl+0x290/0x290
[79922.056517] i915_compat_ioctl+0x37/0x40
[79922.056523] __ia32_compat_sys_ioctl+0x429/0xe90
[79922.056528] ? ksys_read+0x53/0xc0
[79922.056536] do_int80_syscall_32+0x50/0x100
[79922.056543] entry_INT80_compat+0x7d/0x82
[79922.056548] RIP: 0023:0xf7f4dc42
[79922.056552] Code: 65 8b 15 04 00 00 00 8b 0e 8b 0c ca 83 f9 ff 75 0c 89 04 24 89 f0 e8 b3 fe ff ff eb 05 8b 46 04 01 c8 83 c4 14 5b 5e c3 cd 80 <c3> 8d b6 00 00 00 00 8d bc 27 00 00 00 00 8b 1c 24 c3 8d b6 00 00
[79922.056556] RSP: 002b:00000000fff8c764 EFLAGS: 00203292 ORIG_RAX: 0000000000000036
[79922.056561] RAX: ffffffffffffffda RBX: 000000000000000a RCX: 0000000040406469
[79922.056565] RDX: 00000000fff8c80c RSI: 0000000000000000 RDI: 0000000040406469
[79922.056569] RBP: 000000000000000a R08: 0000000000000000 R09: 0000000000000000
[79922.056572] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[79922.056576] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[79922.056580] Modules linked in:
[79922.056587] ---[ end trace 5702775963af313b ]---
[79922.056593] RIP: 0010:__list_del_entry_valid+0x52/0x90
[79922.056597] Code: 41 48 8b 12 48 39 d7 75 4c 48 8b 50 08 48 39 d7 75 07 b8 01 00 00 00 5d c3 48 89 fe 31 c0 48 c7 c7 20 9f 5e 85 e8 5c bb d1 ff <0f> 0b 48 89 c2 48 89 fe 31 c0 48 c7 c7 50 9e 5e 85 e8 46 bb d1 ff
[79922.056601] RSP: 0000:ffffc90000287ac0 EFLAGS: 00213282
[79922.056605] RAX: 0000000000000054 RBX: ffff88817cb31c00 RCX: 0000000000000000
[79922.056609] RDX: 0000000000000000 RSI: ffff88819e2653d8 RDI: ffff88819e2653d8
[79922.056612] RBP: ffffc90000287ac0 R08: ffff8881944c1090 R09: 0000000000000000
[79922.056616] R10: 00000000008e9088 R11: 3038653133626301 R12: ffff888187769b40
[79922.056619] R13: ffff88818761d900 R14: ffff88817cb31e80 R15: ffff88818761da28
[79922.056624] FS: 0000000000000000(0000) GS:ffff88819e240000(0063) knlGS:00000000f793c880
[79922.056628] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
[79922.056631] CR2: 0000000008064670 CR3: 000000018bdc6001 CR4: 00000000000606a0
[79950.522519] wlan1: deauthenticating from 00:00:00:00:00:01 by local choice (Reason: 3=DEAUTH_LEAVING)
[80077.424202] wlan1: authenticate with 00:00:00:00:00:01
[80077.428304] wlan1: send auth to 00:00:00:00:00:01 (try 1/3)
[80077.430713] wlan1: authenticated
[80077.433089] wlan1: associate with 00:00:00:00:00:01 (try 1/3)
[80077.437077] wlan1: RX AssocResp from 00:00:00:00:00:01 (capab=0x401 status=0 aid=4)
[80077.440273] wlan1: associated
#1 is not even the same list as #2/#3, nor are these out-of-the-way corners of the driver :| Please do grab https://cgit.freedesktop.org/drm/drm-tip and enable CONFIG_DRM_I915_DEBUG_GEM in case that finds a common root cause.
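For readers following the oopses: the "list_del corruption" messages come from the list debugging check in lib/list_debug.c, which verifies that both neighbours of an entry still point back at it before the entry is unlinked. The following is a minimal userspace sketch of that idea, not the kernel's code; the names `list_node`, `list_init`, `list_add_tail`, `list_del_entry_valid` and `list_del_checked` are illustrative, and it returns false where the kernel hits BUG().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Circular doubly linked list node, shaped like the kernel's list_head. */
struct list_node {
    struct list_node *next;
    struct list_node *prev;
};

static void list_init(struct list_node *head)
{
    head->next = head;
    head->prev = head;
}

static void list_add_tail(struct list_node *node, struct list_node *head)
{
    assert(node != NULL && head != NULL);
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Returns true only if both neighbours still point back at entry. */
static bool list_del_entry_valid(const struct list_node *entry)
{
    if (entry->prev->next != entry) {
        fprintf(stderr,
                "list_del corruption. prev->next should be %p, but was %p\n",
                (const void *)entry, (const void *)entry->prev->next);
        return false;
    }
    if (entry->next->prev != entry) {
        fprintf(stderr,
                "list_del corruption. next->prev should be %p, but was %p\n",
                (const void *)entry, (const void *)entry->next->prev);
        return false;
    }
    return true;
}

/* Unlinks entry if the check passes; this sketch clears the pointers
 * with NULL rather than the poison values the kernel uses. */
static bool list_del_checked(struct list_node *entry)
{
    if (!list_del_entry_valid(entry))
        return false;
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    entry->next = NULL;
    entry->prev = NULL;
    return true;
}
```

The first branch corresponds to the message seen in the first two oopses (prev->next) and the second branch to the one in the third (next->prev). Either can fire if something modifies a neighbouring node without proper serialisation, or if a neighbour was freed and reused; which of those applies here is not established by the logs.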
Ok, let me try. It is strange that DEBUG_GEM depends on WERROR in Kconfig. Kernel is compiling now.
I have run this kernel (drm-tip with debugging enabled) for a few hours, and it panicked -- as seen on the capslock LED. Unfortunately, nothing was in the syslog after reboot.
I just got another crash, while switching consoles. The previous one might also have happened while switching consoles. Anyway, blinking capslock, but nothing in the syslog after reboot :-(.
Another day, another crash, also while switching console. Unfortunately, blinking capslock... so I assume there will be nothing in the syslog.
Would it be possible to use netconsole or a USB serial adapter to capture the dmesg live from another machine? That way the extra debug information could be captured, which should give a hint about what goes wrong. Can you provoke the error more quickly if you keep switching consoles back and forth?
So based on search this is SNB/2520M?
(In reply to Jani Saarinen from comment #10)
> So based on search this is SNB/2520M?

And on CI we have a similar system: https://intel-gfx-ci.01.org/tree/drm-tip/fi-snb-2520m.html = https://intel-gfx-ci.01.org/hardware.html#fi-snb-2520m
With the latest drm-tip it seems to work properly.
Ok, so I am running YouTube on one console and FlightGear on the other, and switching between them:

date; while true; do DISPLAY=:0.0 xdotool key F1; sleep .1; DISPLAY=:0.0 xdotool key F2; sleep .2; done

...5 minutes of testing so far, and no crash. But this is not exactly representative of my "usual" load -- too much CPU use and not enough swapping....
Next crash, this time running flightgear and no console switching involved. Panic -- capslock blinks.
Ok, I have got netconsole running. It did not get everything, but it seems like it got the important parts:

usb 2-1.1.4: New USB device strings: Mfr=0, Product=2, SerialNumber=0
usb 2-1.1.4: Product: CSR8510 A10
------------[ cut here ]------------
ODEBUG: init destroyed (active state 0) object type: i915_sw_fence hint: submit_notify+0x0/0xa8
Modules linked in: netconsole [last unloaded: netconsole]
CPU: 3 PID: 3568 Comm: Xorg Not tainted 4.20.0+ #3
Hardware name: LENOVO 42872WU/42872WU, BIOS 8DET74WW (1.44 ) 03/13/2018
RIP: 0010:debug_print_object+0x72/0x90
Code: 10 83 c2 01 4c 89 ee 48 c7 c7 a8 62 5f 85 89 15 1c 3d 03 02 8b 4b 14 4d 8b 04 24 48 8b 14 c5 c0 a0 24 85 31 c0 e8 8e 49 cb ff <0f> 0b 5b 83 05 e8 82 57 01 01 41 5c 41 5d 5d c3 66 66 66 66 66 2e
RSP: 0000:ffffc900007a3a20 EFLAGS: 00213082
RAX: 0000000000000000 RBX: ffff88816c9ae308 RCX: 0000000000000006
RDX: 0000000000000007 RSI: ffff8881909e28e0 RDI: ffff88819e2e53d0
RBP: ffffc900007a3a38 R08: ffff8881909e28b8 R09: 0000000000000000
R10: 0000000060b7c18f R11: 00000001939c7401 R12: ffffffff8588aa40
R13: ffffffff856f0081 R14: ffffffff86468280 R15: 0000000000203202
(This time it was flightgear and mplayer, at the same time).
Created attachment 143018: Complete dmesg showing the two bugs.
Attaching the complete dmesg. There were actually two oopses; this has them both, along with the complete messages from bootup to shutdown.
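As a reading aid for the netconsole capture: an "ODEBUG: init destroyed" warning is, as far as I understand the kernel's debugobjects facility, reported when an object is initialised again while the tracker still has it recorded as destroyed. The sketch below is a simplified userspace model of that bookkeeping, not the kernel implementation; `tracked_obj`, `track_init`, `track_destroy` and `track_free` are hypothetical names, and the real tracker has more states than shown.

```c
#include <assert.h>
#include <stdio.h>

/* Simplified lifecycle states; this model omits most of the real ones. */
enum obj_state { OBJ_NONE, OBJ_INITIALIZED, OBJ_DESTROYED };

static const char *const state_names[] = { "none", "initialized", "destroyed" };

struct tracked_obj {
    enum obj_state state;
    unsigned int astate; /* extra "active state" value, unused in this model */
};

/* Returns NULL on success, or the name of the state that made the
 * initialisation suspicious (after printing a warning to stderr). */
static const char *track_init(struct tracked_obj *obj)
{
    assert(obj != NULL);
    if (obj->state == OBJ_DESTROYED) {
        fprintf(stderr, "ODEBUG: init %s (active state %u)\n",
                state_names[obj->state], obj->astate);
        return state_names[obj->state];
    }
    obj->state = OBJ_INITIALIZED;
    return NULL;
}

/* Marks the object as torn down but still tracked. */
static void track_destroy(struct tracked_obj *obj)
{
    obj->state = OBJ_DESTROYED;
}

/* Forgets the object, as when its memory is released to the allocator. */
static void track_free(struct tracked_obj *obj)
{
    obj->state = OBJ_NONE;
}
```

In this model the warning fires only for the sequence init, destroy, init with no free in between, which would fit an i915_sw_fence being reused after teardown. Whether that is what actually happens in the driver is an assumption here, not something the log proves.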
Fwiw, the signal handling code (which is presumably the chief suspect here with the exchange of semaphores for interrupts) has been substantially modified in drm-tip: https://cgit.freedesktop.org/drm-tip.
Treating as fixed as the likely suspect code has been replaced.