Bug 109175 - list_del corruption on thinkpad x220, in kernel v4.20-rc1+
Summary: list_del corruption on thinkpad x220, in kernel v4.20-rc1+
Status: NEW
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
Reported: 2018-12-29 20:08 UTC by zblabunk
Modified: 2019-01-08 17:49 UTC

i915 platform: SNB


Attachments
Complete dmesg showing the two bugs. (165.12 KB, text/plain)
2019-01-08 17:48 UTC, zblabunk

Description zblabunk 2018-12-29 20:08:31 UTC
There's a regression: v4.19 is rock solid on the ThinkPad X220, while v4.20-rc1+ crashes after a day or two of use. It also happened on linux-next 20181210.
Comment 1 zblabunk 2018-12-29 20:08:55 UTC
One of the oopses:

> > Nov  8 18:35:01 duo CRON[28511]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
> > Nov  8 18:42:57 duo kernel: list_del corruption. prev->next should be ffff8801742b8178, but was ffffc9000192fec8
> > Nov  8 18:42:57 duo kernel: ------------[ cut here ]------------
> > Nov  8 18:42:57 duo kernel: kernel BUG at /data/fast/l/k/lib/list_debug.c:53!
> > Nov  8 18:42:57 duo kernel: invalid opcode: 0000 [#1] SMP PTI
> > Nov  8 18:42:57 duo kernel: CPU: 2 PID: 1082 Comm: i915/signal:1 Not tainted 4.20.0-rc1+ #3
> > Nov  8 18:42:57 duo kernel: Hardware name: LENOVO 42872WU/42872WU, BIOS 8DET74WW (1.44 ) 03/13/2018
> > Nov  8 18:42:57 duo kernel: RIP: 0010:__list_del_entry_valid+0x8e/0x90
> > Nov  8 18:42:57 duo kernel: Code: 66 88 d1 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 90 74 5e 85 e8 53 88 d1 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 c8 74 5e 85 e8 40 88 d1 ff <0f> 0b 55 48 89 d0 48 8b 52 08 48 89 e5 48 39 f2 75 19 48 8b 32 48
> > Nov  8 18:42:57 duo kernel: RSP: 0000:ffffc9000196be78 EFLAGS: 00210086
> > Nov  8 18:42:57 duo kernel: RAX: 0000000000000054 RBX: ffff8801742b8178 RCX: 0000000000000000
> > Nov  8 18:42:57 duo kernel: RDX: 0000000000000000 RSI: ffff88019e2a53d8 RDI: ffff88019e2a53d8
> > Nov  8 18:42:57 duo kernel: RBP: ffffc9000196be78 R08: ffff880196e2cd10 R09: 0000000000000000
> > Nov  8 18:42:57 duo kernel: R10: 00000000e7684eb9 R11: 3863656632393101 R12: ffffc9000196bec8
> > Nov  8 18:42:57 duo kernel: R13: ffff88019707e000 R14: ffff8801742b8080 R15: ffffc9000192fdd0
> > Nov  8 18:42:57 duo kernel: FS:  0000000000000000(0000) GS:ffff88019e280000(0000) knlGS:0000000000000000
> > Nov  8 18:42:57 duo kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > Nov  8 18:42:57 duo kernel: CR2: 00000000ed2bf000 CR3: 000000000581e001 CR4: 00000000000606a0
> > Nov  8 18:42:57 duo kernel: Call Trace:
> > Nov  8 18:42:57 duo kernel: intel_breadcrumbs_signaler+0x162/0x330
> > Nov  8 18:42:57 duo kernel: kthread+0x116/0x150
> > Nov  8 18:42:57 duo kernel: ? intel_engine_wakeup+0x40/0x40
> > Nov  8 18:42:57 duo kernel: ? kthread_park+0x90/0x90
> > Nov  8 18:42:57 duo kernel: ret_from_fork+0x35/0x40
> > Nov  8 18:42:57 duo kernel: Modules linked in:
> > Nov  8 18:42:57 duo kernel: ---[ end trace 2f8da183a56f80f6 ]---
> > Nov  8 18:42:57 duo kernel: RIP: 0010:__list_del_entry_valid+0x8e/0x90
> > Nov  8 18:42:57 duo kernel: Code: 66 88 d1 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 90 74 5e 85 e8 53 88 d1 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 c8 74 5e 85 e8 40 88 d1 ff <0f> 0b 55 48 89 d0 48 8b 52 08 48 89 e5 48 39 f2 75 19 48 8b 32 48
> > Nov  8 18:42:57 duo kernel: RSP: 0000:ffffc9000196be78 EFLAGS: 00210086
> > Nov  8 18:42:57 duo kernel: RAX: 0000000000000054 RBX: ffff8801742b8178 RCX: 0000000000000000
> > Nov  8 18:42:57 duo kernel: RDX: 0000000000000000 RSI: ffff88019e2a53d8 RDI: ffff88019e2a53d8
> > Nov  8 18:42:57 duo kernel: RBP: ffffc9000196be78 R08: ffff880196e2cd10 R09: 0000000000000000
> > Nov  8 18:42:57 duo kernel: R10: 00000000e7684eb9 R11: 3863656632393101 R12: ffffc9000196bec8
> > Nov  8 18:42:57 duo kernel: R13: ffff88019707e000 R14: ffff8801742b8080 R15: ffffc9000192fdd0
> > Nov  8 18:42:57 duo kernel: FS:  0000000000000000(0000) GS:ffff88019e280000(0000) knlGS:0000000000000000
> > Nov  8 18:42:57 duo kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > Nov  8 18:42:57 duo kernel: CR2: 00000000ed2bf000 CR3: 000000000581e001 CR4: 00000000000606a0
Comment 2 zblabunk 2018-12-29 20:09:46 UTC
Second:

Dec  8 11:45:01 duo CRON[29325]: (root) CMD (command -v debian-sa1 >
/dev/null && debian-sa1 1 1)
Dec  8 11:46:42 duo
org.mate.panel.applet.MateWeatherAppletFactory[3983]:
(mateweather-applet-2:4242): GLib-CRITICAL **: Source ID 14603 was not
found
 when attempting to remove it
 Dec  8 11:54:59 duo kernel: list_del corruption. prev->next should be
 ffff88019283ea28, but was ffff8801411a1c68
 Dec  8 11:54:59 duo kernel: ------------[ cut here ]------------
 Dec  8 11:54:59 duo kernel: kernel BUG at
 /data/fast/l/k/lib/list_debug.c:53!
 Dec  8 11:54:59 duo kernel: invalid opcode: 0000 [#1] SMP PTI
 Dec  8 11:54:59 duo kernel: CPU: 1 PID: 3428 Comm: Xorg Not tainted
 4.20.0-rc1+ #4
 Dec  8 11:54:59 duo kernel: Hardware name: LENOVO 42872WU/42872WU,
 BIOS 8DET74WW (1.44 ) 03/13/2018
 Dec  8 11:54:59 duo kernel: RIP:
 0010:__list_del_entry_valid+0x8e/0x90
 Dec  8 11:54:59 duo kernel: Code: 16 88 d1 ff 0f 0b 48 89 fe 31 c0 48
 c7 c7 08 75 5e 85 e8 03 88 d1 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 40 75
 5e 85 e8 f0
  87 d1 ff <0f> 0b 55 48 89 d0 48 8b 52 08 48 89 e5 48 39 f2 75 19 48  
  8b 32 48
  Dec  8 11:54:59 duo kernel: RSP: 0000:ffffc90000223ac0 EFLAGS:
  00213282
  Dec  8 11:54:59 duo kernel: RAX: 0000000000000054 RBX:
  ffff880115a07c40 RCX: 0000000000000000
  Dec  8 11:54:59 duo kernel: RDX: 0000000000000000 RSI:
  ffff88019e2653d8 RDI: ffff88019e2653d8
  Dec  8 11:54:59 duo kernel: RBP: ffffc90000223ac0 R08:
  ffff880193a2ad10 R09: 0000000000000000
  Dec  8 11:54:59 duo kernel: R10: 00000000008e9088 R11:
  2e6e6f6974707501 R12: ffff8801960cb240
  Dec  8 11:54:59 duo kernel: R13: ffff88019283e900 R14:
  ffff880115a07ec0 R15: ffff88019283ea28
  Dec  8 11:54:59 duo kernel: FS:  0000000000000000(0000)
  GS:ffff88019e240000(0063) knlGS:00000000f79c4880
  Dec  8 11:54:59 duo kernel: CS:  0010 DS: 002b ES: 002b CR0:
  0000000080050033  
  Dec  8 11:54:59 duo kernel: CR2: 00000000086b0df8 CR3:
  00000001939f6004 CR4: 00000000000606a0
  Dec  8 11:54:59 duo kernel: Call Trace:
  Dec  8 11:54:59 duo kernel: i915_vma_move_to_active+0x1c3/0x510
  Dec  8 11:54:59 duo kernel: ? i915_request_await_object+0xf4/0x280
  Dec  8 11:54:59 duo kernel: i915_gem_do_execbuffer+0xe2f/0x10a0
  Dec  8 11:54:59 duo kernel: ? find_held_lock+0x39/0xb0
  Dec  8 11:54:59 duo kernel: ? kvmalloc_node+0x26/0x70
  Dec  8 11:54:59 duo kernel: i915_gem_execbuffer2_ioctl+0x1b4/0x360
  Dec  8 11:54:59 duo kernel: ? i915_gem_execbuffer_ioctl+0x290/0x290
  Dec  8 11:54:59 duo kernel: drm_ioctl_kernel+0xaa/0xf0
  Dec  8 11:54:59 duo kernel: drm_ioctl+0x323/0x3d0
  Dec  8 11:54:59 duo kernel: ? i915_gem_execbuffer_ioctl+0x290/0x290
  Dec  8 11:54:59 duo kernel: ? posix_ktime_get_ts+0xc/0x10
  Dec  8 11:54:59 duo kernel: i915_compat_ioctl+0x37/0x40
  Dec  8 11:54:59 duo kernel: __ia32_compat_sys_ioctl+0x429/0xe90
  Dec  8 11:54:59 duo kernel: ? put_old_timespec32+0x9/0x10
  Dec  8 11:54:59 duo kernel: ?
  __ia32_compat_sys_clock_gettime+0x67/0x90
  Dec  8 11:54:59 duo kernel: do_int80_syscall_32+0x50/0x100
  Dec  8 11:54:59 duo kernel: entry_INT80_compat+0x7d/0x82
  Dec  8 11:54:59 duo kernel: RIP: 0023:0xf7fd5c42
  Dec  8 11:54:59 duo kernel: Code: 65 8b 15 04 00 00 00 8b 0e 8b 0c
  ca 83 f9 ff 75 0c 89 04 24 89 f0 e8 b3 fe ff ff eb 05 8b 46 04 01 c8
  83 c4 14 5b 5e c3 cd 80 <c3> 8d b6 00 00 00 00 8d bc 27 00 00 00 00
  8b 1c 24 c3 8d b6 00 00
  Dec  8 11:54:59 duo kernel: RSP: 002b:00000000fff1a014 EFLAGS:
  00203292 ORIG_RAX: 0000000000000036
  Dec  8 11:54:59 duo kernel: RAX: ffffffffffffffda RBX:
  000000000000000a RCX: 0000000040406469
  Dec  8 11:54:59 duo kernel: RDX: 00000000fff1a0bc RSI:
  0000000000000000 RDI: 0000000040406469
  Dec  8 11:54:59 duo kernel: RBP: 000000000000000a R08:
  0000000000000000 R09: 0000000000000000
  Dec  8 11:54:59 duo kernel: R10: 0000000000000000 R11:
  0000000000000000 R12: 0000000000000000
  Dec  8 11:54:59 duo kernel: R13: 0000000000000000 R14:
  0000000000000000 R15: 0000000000000000
  Dec  8 11:54:59 duo kernel: Modules linked in:
  Dec  8 11:54:59 duo kernel: ---[ end trace 0c1e74ccc719c763 ]---
  Dec  8 11:54:59 duo kernel: RIP:
  0010:__list_del_entry_valid+0x8e/0x90
  Dec  8 11:54:59 duo kernel: Code: 16 88 d1 ff 0f 0b 48 89 fe 31 c0
  48 c7 c7 08 75 5e 85 e8 03 88 d1 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 40
  75 5e 85 e8 f0 87 d1 ff <0f> 0b 55 48 89 d0 48 8b 52 08 48 89 e5 48
  39 f2 75 19 48 8b 32 48
  Dec  8 11:54:59 duo kernel: RSP: 0000:ffffc90000223ac0 EFLAGS:
  00213282
  Dec  8 11:54:59 duo kernel: RAX: 0000000000000054 RBX:
  ffff880115a07c40 RCX: 0000000000000000
  Dec  8 11:54:59 duo kernel: RDX: 0000000000000000 RSI:
  ffff88019e2653d8 RDI: ffff88019e2653d8
  Dec  8 11:54:59 duo kernel: RBP: ffffc90000223ac0 R08:
  ffff880193a2ad10 R09: 0000000000000000
  Dec  8 11:54:59 duo kernel: R10: 00000000008e9088 R11:
  2e6e6f6974707501 R12: ffff8801960cb240
  Dec  8 11:54:59 duo kernel: R13: ffff88019283e900 R14:
  ffff880115a07ec0 R15: ffff88019283ea28
  Dec  8 11:54:59 duo kernel: FS:  0000000000000000(0000)
  GS:ffff88019e240000(0063) knlGS:00000000f79c4880
  Dec  8 11:54:59 duo kernel: CS:  0010 DS: 002b ES: 002b CR0:
  0000000080050033
  Dec  8 11:54:59 duo kernel: CR2: 00000000086b0df8 CR3:
  00000001939f6004 CR4: 00000000000606a0
  Dec  8 11:54:59 duo org.mate.panel.applet.WnckletFactory[3983]:
  wnck-applet: Fatal IO error 11 (Resource temporarily unavailable) on
  X server :0.
  Dec  8 11:54:59 duo
  org.mate.panel.applet.MateWeatherAppletFactory[3983]:
  mateweather-applet-2: Fatal IO error 11 (Resource temporarily
  unavailable) on X server :0.
  Dec  8 11:55:00 duo
  org.mate.panel.applet.CommandAppletFactory[3983]: command-applet:
  Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
  Dec  8 11:55:00 duo
  org.mate.panel.applet.NotificationAreaAppletFactory[3983]:
  notification-area-applet: Fatal IO error 11 (Resource temporarily
  unavailable) on X server :0.
  Dec  8 11:55:00 duo org.mate.panel.applet.ClockAppletFactory[3983]:
  clock-applet: Fatal IO error 11 (Resource temporarily unavailable)
  on X server :0.
  Dec  8 11:55:01 duo CRON[30056]: (root) CMD (command -v debian-sa1 >
  /dev/null && debian-sa1 1 1)
  Dec  8 11:55:02 duo
  org.mate.panel.applet.InhibitAppletFactory[3983]:
  mate-inhibit-applet: Fatal IO error 11 (Resource temporarily
  unavailable) on X server :0.
  Dec  8 11:55:09 duo org.a11y.atspi.Registry[4114]: XIO:  fatal IO
  error 11 (Resource temporarily unavailable) on X server ":0"
Comment 3 zblabunk 2018-12-29 20:11:15 UTC
Another one.

fgfs always segfaults; that's "normal".

[79648.504664] fgfs[6740]: segfault at 8b58308 ip 0000000008b58308 sp 00000000ffd1327c error 15
[79648.504674] Code: 00 00 00 00 00 00 29 00 00 00 f0 83 b5 08 38 82 b5 08 00 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 a8 82 b5 08 00 00 00 00 <00> 00 00 00 11 00 00 00 08 8d b5 08 00 00 00 00 00 00 00 00 21 00
[79922.056325] list_del corruption. next->prev should be ffff88818761da28, but was ffff88817cb31e80
[79922.056348] ------------[ cut here ]------------
[79922.056355] kernel BUG at /data/fast/l/k/lib/list_debug.c:56!
[79922.056368] invalid opcode: 0000 [#1] SMP PTI
[79922.056375] CPU: 1 PID: 3478 Comm: Xorg Not tainted 4.20.0-rc6-next-20181210 #3
[79922.056379] Hardware name: LENOVO 42872WU/42872WU, BIOS 8DET74WW (1.44 ) 03/13/2018
[79922.056391] RIP: 0010:__list_del_entry_valid+0x52/0x90
[79922.056396] Code: 41 48 8b 12 48 39 d7 75 4c 48 8b 50 08 48 39 d7 75 07 b8 01 00 00 00 5d c3 48 89 fe 31 c0 48 c7 c7 20 9f 5e 85 e8 5c bb d1 ff <0f> 0b 48 89 c2 48 89 fe 31 c0 48 c7 c7 50 9e 5e 85 e8 46 bb d1 ff
[79922.056401] RSP: 0000:ffffc90000287ac0 EFLAGS: 00213282
[79922.056406] RAX: 0000000000000054 RBX: ffff88817cb31c00 RCX: 0000000000000000
[79922.056410] RDX: 0000000000000000 RSI: ffff88819e2653d8 RDI: ffff88819e2653d8
[79922.056413] RBP: ffffc90000287ac0 R08: ffff8881944c1090 R09: 0000000000000000
[79922.056417] R10: 00000000008e9088 R11: 3038653133626301 R12: ffff888187769b40
[79922.056421] R13: ffff88818761d900 R14: ffff88817cb31e80 R15: ffff88818761da28
[79922.056425] FS:  0000000000000000(0000) GS:ffff88819e240000(0063) knlGS:00000000f793c880
[79922.056430] CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
[79922.056433] CR2: 0000000008064670 CR3: 000000018bdc6001 CR4: 00000000000606a0
[79922.056436] Call Trace:
[79922.056447]  i915_vma_move_to_active+0x1c3/0x530
[79922.056454]  ? i915_request_await_object+0xf4/0x280
[79922.056461]  i915_gem_do_execbuffer+0xe28/0x1090
[79922.056470]  ? find_held_lock+0x39/0xb0
[79922.056478]  ? kvmalloc_node+0x26/0x70
[79922.056484]  i915_gem_execbuffer2_ioctl+0x1b4/0x360
[79922.056489]  ? i915_gem_execbuffer_ioctl+0x290/0x290
[79922.056494]  ? i915_gem_execbuffer_ioctl+0x290/0x290
[79922.056500]  drm_ioctl_kernel+0xaa/0xf0
[79922.056505]  drm_ioctl+0x31b/0x3c0
[79922.056510]  ? i915_gem_execbuffer_ioctl+0x290/0x290
[79922.056517]  i915_compat_ioctl+0x37/0x40
[79922.056523]  __ia32_compat_sys_ioctl+0x429/0xe90
[79922.056528]  ? ksys_read+0x53/0xc0
[79922.056536]  do_int80_syscall_32+0x50/0x100
[79922.056543]  entry_INT80_compat+0x7d/0x82
[79922.056548] RIP: 0023:0xf7f4dc42
[79922.056552] Code: 65 8b 15 04 00 00 00 8b 0e 8b 0c ca 83 f9 ff 75 0c 89 04 24 89 f0 e8 b3 fe ff ff eb 05 8b 46 04 01 c8 83 c4 14 5b 5e c3 cd 80 <c3> 8d b6 00 00 00 00 8d bc 27 00 00 00 00 8b 1c 24 c3 8d b6 00 00
[79922.056556] RSP: 002b:00000000fff8c764 EFLAGS: 00203292 ORIG_RAX: 0000000000000036
[79922.056561] RAX: ffffffffffffffda RBX: 000000000000000a RCX: 0000000040406469
[79922.056565] RDX: 00000000fff8c80c RSI: 0000000000000000 RDI: 0000000040406469
[79922.056569] RBP: 000000000000000a R08: 0000000000000000 R09: 0000000000000000
[79922.056572] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[79922.056576] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[79922.056580] Modules linked in:
[79922.056587] ---[ end trace 5702775963af313b ]---
[79922.056593] RIP: 0010:__list_del_entry_valid+0x52/0x90
[79922.056597] Code: 41 48 8b 12 48 39 d7 75 4c 48 8b 50 08 48 39 d7 75 07 b8 01 00 00 00 5d c3 48 89 fe 31 c0 48 c7 c7 20 9f 5e 85 e8 5c bb d1 ff <0f> 0b 48 89 c2 48 89 fe 31 c0 48 c7 c7 50 9e 5e 85 e8 46 bb d1 ff
[79922.056601] RSP: 0000:ffffc90000287ac0 EFLAGS: 00213282
[79922.056605] RAX: 0000000000000054 RBX: ffff88817cb31c00 RCX: 0000000000000000
[79922.056609] RDX: 0000000000000000 RSI: ffff88819e2653d8 RDI: ffff88819e2653d8
[79922.056612] RBP: ffffc90000287ac0 R08: ffff8881944c1090 R09: 0000000000000000
[79922.056616] R10: 00000000008e9088 R11: 3038653133626301 R12: ffff888187769b40
[79922.056619] R13: ffff88818761d900 R14: ffff88817cb31e80 R15: ffff88818761da28
[79922.056624] FS:  0000000000000000(0000) GS:ffff88819e240000(0063) knlGS:00000000f793c880
[79922.056628] CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
[79922.056631] CR2: 0000000008064670 CR3: 000000018bdc6001 CR4: 00000000000606a0
[79950.522519] wlan1: deauthenticating from 00:00:00:00:00:01 by local choice (Reason: 3=DEAUTH_LEAVING)
[80077.424202] wlan1: authenticate with 00:00:00:00:00:01
[80077.428304] wlan1: send auth to 00:00:00:00:00:01 (try 1/3)
[80077.430713] wlan1: authenticated
[80077.433089] wlan1: associate with 00:00:00:00:00:01 (try 1/3)
[80077.437077] wlan1: RX AssocResp from 00:00:00:00:00:01 (capab=0x401 status=0 aid=4)
[80077.440273] wlan1: associated
Comment 4 Chris Wilson 2018-12-29 20:51:50 UTC
#1 is not even the same list as #2/#3, nor are these out-of-the-way corners of the driver :|

Please do grab https://cgit.freedesktop.org/drm/drm-tip and enable CONFIG_DRM_I915_DEBUG_GEM in case that finds a common root cause.
Comment 5 zblabunk 2018-12-31 11:39:48 UTC
Ok, let me try. It is strange that DEBUG_GEM depends on WERROR in Kconfig. Kernel is compiling now.
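
For anyone reproducing this, the setup requested in comment 4 might look roughly like the sketch below. It assumes the commands run from the top of a drm-tip checkout that already has a working .config. The DRM_I915_WERROR dependency is the one noted above; the EXPERT entry is an assumption about what WERROR itself needs and may not be required on every tree. `scripts/config` and `make olddefconfig` are the standard in-tree helpers.

```shell
#!/bin/sh
# Sketch only: run from the top of a configured drm-tip checkout.
# EXPERT is an assumed prerequisite of DRM_I915_WERROR; drop it if your
# tree does not need it.
OPTS="EXPERT DRM_I915_WERROR DRM_I915_DEBUG_GEM"

if [ -x scripts/config ] && [ -f .config ]; then
    for opt in $OPTS; do
        # Switch each option on in .config
        scripts/config --enable "$opt"
    done
    # Let Kconfig resolve any remaining dependencies with default answers
    make olddefconfig
    # Confirm the i915 options survived dependency resolution
    grep -E 'CONFIG_DRM_I915_(WERROR|DEBUG_GEM)=' .config
else
    echo "not in a configured kernel tree; nothing changed" >&2
fi
```

If the final grep prints nothing, a dependency is still unmet and the option was silently dropped by olddefconfig.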
Comment 6 zblabunk 2018-12-31 21:05:39 UTC
I ran this kernel (drm-tip with debugging enabled) for a few hours, and it panicked -- as seen on the capslock LED. Unfortunately, nothing was in the syslog after reboot.
Comment 7 zblabunk 2019-01-01 17:22:52 UTC
I just got another crash, while switching consoles. The previous one might also have happened while switching consoles. Anyway, blinking capslock, but nothing in the syslog after reboot :-(.
Comment 8 zblabunk 2019-01-02 10:38:18 UTC
Another day, another crash, also while switching consoles. Unfortunately, blinking capslock... so I assume there will be nothing in the syslog.
Comment 9 Joonas Lahtinen 2019-01-03 07:35:25 UTC
Is it possible for you to use netconsole or a USB serial adapter to capture the dmesg live from another machine? That way the extra debug information could be captured, which should give a hint about what goes wrong.

Can you provoke the error more quickly if you keep on switching consoles back and forth?
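
A minimal netconsole sketch, in case it helps. Every address, port, MAC and interface name here is a placeholder, and `netconsole_param` is just a hypothetical helper for assembling the module parameter in the format given in the kernel's netconsole documentation. The script only prints the commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Sketch only: every address, port and interface name below is a placeholder.

# Build the netconsole= parameter:
#   src-port@src-ip/dev,tgt-port@tgt-ip/tgt-mac
netconsole_param() {
    printf '%s@%s/%s,%s@%s/%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

PARAM=$(netconsole_param 6665 192.168.1.10 eth0 6666 192.168.1.20 00:11:22:33:44:55)

# On the crashing machine (as root): raise the console log level so debug
# output is forwarded too, then load the module.
echo "dmesg -n 8"
echo "modprobe netconsole netconsole=$PARAM"

# On the receiving machine: listen for the UDP stream and keep a copy.
# Depending on the netcat variant this is "nc -u -l 6666" or "nc -u -l -p 6666".
echo "nc -u -l 6666 | tee netconsole.log"
```

Netconsole runs over UDP, so some messages can be lost; a wired interface on the same LAN segment tends to give the most complete capture.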
Comment 10 Jani Saarinen 2019-01-03 07:42:40 UTC
So based on search this is SNB/2520M?
Comment 11 Jani Saarinen 2019-01-03 07:47:20 UTC
(In reply to Jani Saarinen from comment #10)
> So based on search this is SNB/2520M?

And on CI we have a similar system: https://intel-gfx-ci.01.org/tree/drm-tip/fi-snb-2520m.html = https://intel-gfx-ci.01.org/hardware.html#fi-snb-2520m

And with the latest drm-tip it seems to work properly.
Comment 12 zblabunk 2019-01-05 21:56:48 UTC
OK, so I am running YouTube on one console and FlightGear on the other, and am switching between them:

date; while true; do DISPLAY=:0.0 xdotool key F1; sleep .1; DISPLAY=:0.0 xdotool key F2; sleep .2; done

...5 minutes of testing so far, and no crash. But this is not exactly representative of my "usual" load -- too much CPU use and not enough swapping....
Comment 13 zblabunk 2019-01-08 16:56:57 UTC
Next crash, this time running flightgear and no console switching involved. Panic -- capslock blinks.
Comment 14 zblabunk 2019-01-08 17:38:01 UTC
Ok, I have got netconsole running. It did not get everything, but it seems like it got the important parts:

usb 2-1.1.4: New USB device strings: Mfr=0, Product=2, SerialNumber=0
usb 2-1.1.4: Product: CSR8510 A10
------------[ cut here ]------------
ODEBUG: init destroyed (active state 0) object type: i915_sw_fence hint: submit_notify+0x0/0xa8
Modules linked in: netconsole [last unloaded: netconsole]
CPU: 3 PID: 3568 Comm: Xorg Not tainted 4.20.0+ #3
Hardware name: LENOVO 42872WU/42872WU, BIOS 8DET74WW (1.44 ) 03/13/2018
RIP: 0010:debug_print_object+0x72/0x90
Code: 10 83 c2 01 4c 89 ee 48 c7 c7 a8 62 5f 85 89 15 1c 3d 03 02 8b 4b 14 4d 8b 04 24 48 8b 14 c5 c0 a0 24 85 31 c0 e8 8e 49 cb ff <0f> 0b 5b 83 05 e8 82 57 01 01 41 5c 41 5d 5d c3 66 66 66 66 66 2e
RSP: 0000:ffffc900007a3a20 EFLAGS: 00213082
RAX: 0000000000000000 RBX: ffff88816c9ae308 RCX: 0000000000000006
RDX: 0000000000000007 RSI: ffff8881909e28e0 RDI: ffff88819e2e53d0
RBP: ffffc900007a3a38 R08: ffff8881909e28b8 R09: 0000000000000000
R10: 0000000060b7c18f R11: 00000001939c7401 R12: ffffffff8588aa40
R13: ffffffff856f0081 R14: ffffffff86468280 R15: 0000000000203202
Comment 15 zblabunk 2019-01-08 17:39:20 UTC
(This time it was flightgear and mplayer, at the same time).
Comment 16 zblabunk 2019-01-08 17:48:43 UTC
Created attachment 143018
Complete dmesg showing the two bugs.
Comment 17 zblabunk 2019-01-08 17:49:26 UTC
Attaching complete dmesg. There were actually two oopses, this has them both, along with complete messages from bootup to shutdown.

