Bug 109175 - list_del corruption on thinkpad x220, in kernel v4.20-rc1+
Status: NEW
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
Reported: 2018-12-29 20:08 UTC by zblabunk
Modified: 2019-02-04 22:51 UTC
i915 platform: SNB

Attachments
Complete dmesg showing the two bugs. (165.12 KB, text/plain)
2019-01-08 17:48 UTC, zblabunk

Description zblabunk 2018-12-29 20:08:31 UTC
This is a regression: v4.19 is rock solid on the ThinkPad X220, while v4.20-rc1+ crashes after a day or two of use. The same crash also happened on linux-next 20181210.
Comment 1 zblabunk 2018-12-29 20:08:55 UTC
One of the oopses:

Nov  8 18:35:01 duo CRON[28511]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Nov  8 18:42:57 duo kernel: list_del corruption. prev->next should be ffff8801742b8178, but was ffffc9000192fec8
Nov  8 18:42:57 duo kernel: ------------[ cut here ]------------
Nov  8 18:42:57 duo kernel: kernel BUG at /data/fast/l/k/lib/list_debug.c:53!
Nov  8 18:42:57 duo kernel: invalid opcode: 0000 [#1] SMP PTI
Nov  8 18:42:57 duo kernel: CPU: 2 PID: 1082 Comm: i915/signal:1 Not tainted 4.20.0-rc1+ #3
Nov  8 18:42:57 duo kernel: Hardware name: LENOVO 42872WU/42872WU, BIOS 8DET74WW (1.44 ) 03/13/2018
Nov  8 18:42:57 duo kernel: RIP: 0010:__list_del_entry_valid+0x8e/0x90
Nov  8 18:42:57 duo kernel: Code: 66 88 d1 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 90 74 5e 85 e8 53 88 d1 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 c8 74 5e 85 e8 40 88 d1 ff <0f> 0b 55 48 89 d0 48 8b 52 08 48 89 e5 48 39 f2 75 19 48 8b 32 48
Nov  8 18:42:57 duo kernel: RSP: 0000:ffffc9000196be78 EFLAGS: 00210086
Nov  8 18:42:57 duo kernel: RAX: 0000000000000054 RBX: ffff8801742b8178 RCX: 0000000000000000
Nov  8 18:42:57 duo kernel: RDX: 0000000000000000 RSI: ffff88019e2a53d8 RDI: ffff88019e2a53d8
Nov  8 18:42:57 duo kernel: RBP: ffffc9000196be78 R08: ffff880196e2cd10 R09: 0000000000000000
Nov  8 18:42:57 duo kernel: R10: 00000000e7684eb9 R11: 3863656632393101 R12: ffffc9000196bec8
Nov  8 18:42:57 duo kernel: R13: ffff88019707e000 R14: ffff8801742b8080 R15: ffffc9000192fdd0
Nov  8 18:42:57 duo kernel: FS:  0000000000000000(0000) GS:ffff88019e280000(0000) knlGS:0000000000000000
Nov  8 18:42:57 duo kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov  8 18:42:57 duo kernel: CR2: 00000000ed2bf000 CR3: 000000000581e001 CR4: 00000000000606a0
Nov  8 18:42:57 duo kernel: Call Trace:
Nov  8 18:42:57 duo kernel: intel_breadcrumbs_signaler+0x162/0x330
Nov  8 18:42:57 duo kernel: kthread+0x116/0x150
Nov  8 18:42:57 duo kernel: ? intel_engine_wakeup+0x40/0x40
Nov  8 18:42:57 duo kernel: ? kthread_park+0x90/0x90
Nov  8 18:42:57 duo kernel: ret_from_fork+0x35/0x40
Nov  8 18:42:57 duo kernel: Modules linked in:
Nov  8 18:42:57 duo kernel: ---[ end trace 2f8da183a56f80f6 ]---
Nov  8 18:42:57 duo kernel: RIP: 0010:__list_del_entry_valid+0x8e/0x90
Nov  8 18:42:57 duo kernel: Code: 66 88 d1 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 90 74 5e 85 e8 53 88 d1 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 c8 74 5e 85 e8 40 88 d1 ff <0f> 0b 55 48 89 d0 48 8b 52 08 48 89 e5 48 39 f2 75 19 48 8b 32 48
Nov  8 18:42:57 duo kernel: RSP: 0000:ffffc9000196be78 EFLAGS: 00210086
Nov  8 18:42:57 duo kernel: RAX: 0000000000000054 RBX: ffff8801742b8178 RCX: 0000000000000000
Nov  8 18:42:57 duo kernel: RDX: 0000000000000000 RSI: ffff88019e2a53d8 RDI: ffff88019e2a53d8
Nov  8 18:42:57 duo kernel: RBP: ffffc9000196be78 R08: ffff880196e2cd10 R09: 0000000000000000
Nov  8 18:42:57 duo kernel: R10: 00000000e7684eb9 R11: 3863656632393101 R12: ffffc9000196bec8
Nov  8 18:42:57 duo kernel: R13: ffff88019707e000 R14: ffff8801742b8080 R15: ffffc9000192fdd0
Nov  8 18:42:57 duo kernel: FS:  0000000000000000(0000) GS:ffff88019e280000(0000) knlGS:0000000000000000
Nov  8 18:42:57 duo kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov  8 18:42:57 duo kernel: CR2: 00000000ed2bf000 CR3: 000000000581e001 CR4: 00000000000606a0
Comment 2 zblabunk 2018-12-29 20:09:46 UTC
Second:

Dec  8 11:45:01 duo CRON[29325]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Dec  8 11:46:42 duo org.mate.panel.applet.MateWeatherAppletFactory[3983]: (mateweather-applet-2:4242): GLib-CRITICAL **: Source ID 14603 was not found when attempting to remove it
Dec  8 11:54:59 duo kernel: list_del corruption. prev->next should be ffff88019283ea28, but was ffff8801411a1c68
Dec  8 11:54:59 duo kernel: ------------[ cut here ]------------
Dec  8 11:54:59 duo kernel: kernel BUG at /data/fast/l/k/lib/list_debug.c:53!
Dec  8 11:54:59 duo kernel: invalid opcode: 0000 [#1] SMP PTI
Dec  8 11:54:59 duo kernel: CPU: 1 PID: 3428 Comm: Xorg Not tainted 4.20.0-rc1+ #4
Dec  8 11:54:59 duo kernel: Hardware name: LENOVO 42872WU/42872WU, BIOS 8DET74WW (1.44 ) 03/13/2018
Dec  8 11:54:59 duo kernel: RIP: 0010:__list_del_entry_valid+0x8e/0x90
Dec  8 11:54:59 duo kernel: Code: 16 88 d1 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 08 75 5e 85 e8 03 88 d1 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 40 75 5e 85 e8 f0 87 d1 ff <0f> 0b 55 48 89 d0 48 8b 52 08 48 89 e5 48 39 f2 75 19 48 8b 32 48
Dec  8 11:54:59 duo kernel: RSP: 0000:ffffc90000223ac0 EFLAGS: 00213282
Dec  8 11:54:59 duo kernel: RAX: 0000000000000054 RBX: ffff880115a07c40 RCX: 0000000000000000
Dec  8 11:54:59 duo kernel: RDX: 0000000000000000 RSI: ffff88019e2653d8 RDI: ffff88019e2653d8
Dec  8 11:54:59 duo kernel: RBP: ffffc90000223ac0 R08: ffff880193a2ad10 R09: 0000000000000000
Dec  8 11:54:59 duo kernel: R10: 00000000008e9088 R11: 2e6e6f6974707501 R12: ffff8801960cb240
Dec  8 11:54:59 duo kernel: R13: ffff88019283e900 R14: ffff880115a07ec0 R15: ffff88019283ea28
Dec  8 11:54:59 duo kernel: FS:  0000000000000000(0000) GS:ffff88019e240000(0063) knlGS:00000000f79c4880
Dec  8 11:54:59 duo kernel: CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
Dec  8 11:54:59 duo kernel: CR2: 00000000086b0df8 CR3: 00000001939f6004 CR4: 00000000000606a0
Dec  8 11:54:59 duo kernel: Call Trace:
Dec  8 11:54:59 duo kernel: i915_vma_move_to_active+0x1c3/0x510
Dec  8 11:54:59 duo kernel: ? i915_request_await_object+0xf4/0x280
Dec  8 11:54:59 duo kernel: i915_gem_do_execbuffer+0xe2f/0x10a0
Dec  8 11:54:59 duo kernel: ? find_held_lock+0x39/0xb0
Dec  8 11:54:59 duo kernel: ? kvmalloc_node+0x26/0x70
Dec  8 11:54:59 duo kernel: i915_gem_execbuffer2_ioctl+0x1b4/0x360
Dec  8 11:54:59 duo kernel: ? i915_gem_execbuffer_ioctl+0x290/0x290
Dec  8 11:54:59 duo kernel: drm_ioctl_kernel+0xaa/0xf0
Dec  8 11:54:59 duo kernel: drm_ioctl+0x323/0x3d0
Dec  8 11:54:59 duo kernel: ? i915_gem_execbuffer_ioctl+0x290/0x290
Dec  8 11:54:59 duo kernel: ? posix_ktime_get_ts+0xc/0x10
Dec  8 11:54:59 duo kernel: i915_compat_ioctl+0x37/0x40
Dec  8 11:54:59 duo kernel: __ia32_compat_sys_ioctl+0x429/0xe90
Dec  8 11:54:59 duo kernel: ? put_old_timespec32+0x9/0x10
Dec  8 11:54:59 duo kernel: ? __ia32_compat_sys_clock_gettime+0x67/0x90
Dec  8 11:54:59 duo kernel: do_int80_syscall_32+0x50/0x100
Dec  8 11:54:59 duo kernel: entry_INT80_compat+0x7d/0x82
Dec  8 11:54:59 duo kernel: RIP: 0023:0xf7fd5c42
Dec  8 11:54:59 duo kernel: Code: 65 8b 15 04 00 00 00 8b 0e 8b 0c ca 83 f9 ff 75 0c 89 04 24 89 f0 e8 b3 fe ff ff eb 05 8b 46 04 01 c8 83 c4 14 5b 5e c3 cd 80 <c3> 8d b6 00 00 00 00 8d bc 27 00 00 00 00 8b 1c 24 c3 8d b6 00 00
Dec  8 11:54:59 duo kernel: RSP: 002b:00000000fff1a014 EFLAGS: 00203292 ORIG_RAX: 0000000000000036
Dec  8 11:54:59 duo kernel: RAX: ffffffffffffffda RBX: 000000000000000a RCX: 0000000040406469
Dec  8 11:54:59 duo kernel: RDX: 00000000fff1a0bc RSI: 0000000000000000 RDI: 0000000040406469
Dec  8 11:54:59 duo kernel: RBP: 000000000000000a R08: 0000000000000000 R09: 0000000000000000
Dec  8 11:54:59 duo kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Dec  8 11:54:59 duo kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Dec  8 11:54:59 duo kernel: Modules linked in:
Dec  8 11:54:59 duo kernel: ---[ end trace 0c1e74ccc719c763 ]---
Dec  8 11:54:59 duo kernel: RIP: 0010:__list_del_entry_valid+0x8e/0x90
Dec  8 11:54:59 duo kernel: Code: 16 88 d1 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 08 75 5e 85 e8 03 88 d1 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 40 75 5e 85 e8 f0 87 d1 ff <0f> 0b 55 48 89 d0 48 8b 52 08 48 89 e5 48 39 f2 75 19 48 8b 32 48
Dec  8 11:54:59 duo kernel: RSP: 0000:ffffc90000223ac0 EFLAGS: 00213282
Dec  8 11:54:59 duo kernel: RAX: 0000000000000054 RBX: ffff880115a07c40 RCX: 0000000000000000
Dec  8 11:54:59 duo kernel: RDX: 0000000000000000 RSI: ffff88019e2653d8 RDI: ffff88019e2653d8
Dec  8 11:54:59 duo kernel: RBP: ffffc90000223ac0 R08: ffff880193a2ad10 R09: 0000000000000000
Dec  8 11:54:59 duo kernel: R10: 00000000008e9088 R11: 2e6e6f6974707501 R12: ffff8801960cb240
Dec  8 11:54:59 duo kernel: R13: ffff88019283e900 R14: ffff880115a07ec0 R15: ffff88019283ea28
Dec  8 11:54:59 duo kernel: FS:  0000000000000000(0000) GS:ffff88019e240000(0063) knlGS:00000000f79c4880
Dec  8 11:54:59 duo kernel: CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
Dec  8 11:54:59 duo kernel: CR2: 00000000086b0df8 CR3: 00000001939f6004 CR4: 00000000000606a0
Dec  8 11:54:59 duo org.mate.panel.applet.WnckletFactory[3983]: wnck-applet: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Dec  8 11:54:59 duo org.mate.panel.applet.MateWeatherAppletFactory[3983]: mateweather-applet-2: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Dec  8 11:55:00 duo org.mate.panel.applet.CommandAppletFactory[3983]: command-applet: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Dec  8 11:55:00 duo org.mate.panel.applet.NotificationAreaAppletFactory[3983]: notification-area-applet: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Dec  8 11:55:00 duo org.mate.panel.applet.ClockAppletFactory[3983]: clock-applet: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Dec  8 11:55:01 duo CRON[30056]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Dec  8 11:55:02 duo org.mate.panel.applet.InhibitAppletFactory[3983]: mate-inhibit-applet: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Dec  8 11:55:09 duo org.a11y.atspi.Registry[4114]: XIO:  fatal IO error 11 (Resource temporarily unavailable) on X server ":0"
Comment 3 zblabunk 2018-12-29 20:11:15 UTC
Another one.

fgfs always segfaults; that's "normal".

[79648.504664] fgfs[6740]: segfault at 8b58308 ip 0000000008b58308 sp 00000000ffd1327c error 15
[79648.504674] Code: 00 00 00 00 00 00 29 00 00 00 f0 83 b5 08 38 82 b5 08 00 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 a8 82 b5 08 00 00 00 00 <00> 00 00 00 11 00 00 00 08 8d b5 08 00 00 00 00 00 00 00 00 21 00
[79922.056325] list_del corruption. next->prev should be ffff88818761da28, but was ffff88817cb31e80
[79922.056348] ------------[ cut here ]------------
[79922.056355] kernel BUG at /data/fast/l/k/lib/list_debug.c:56!
[79922.056368] invalid opcode: 0000 [#1] SMP PTI
[79922.056375] CPU: 1 PID: 3478 Comm: Xorg Not tainted 4.20.0-rc6-next-20181210 #3
[79922.056379] Hardware name: LENOVO 42872WU/42872WU, BIOS 8DET74WW (1.44 ) 03/13/2018
[79922.056391] RIP: 0010:__list_del_entry_valid+0x52/0x90
[79922.056396] Code: 41 48 8b 12 48 39 d7 75 4c 48 8b 50 08 48 39 d7 75 07 b8 01 00 00 00 5d c3 48 89 fe 31 c0 48 c7 c7 20 9f 5e 85 e8 5c bb d1 ff <0f> 0b 48 89 c2 48 89 fe 31 c0 48 c7 c7 50 9e 5e 85 e8 46 bb d1 ff
[79922.056401] RSP: 0000:ffffc90000287ac0 EFLAGS: 00213282
[79922.056406] RAX: 0000000000000054 RBX: ffff88817cb31c00 RCX: 0000000000000000
[79922.056410] RDX: 0000000000000000 RSI: ffff88819e2653d8 RDI: ffff88819e2653d8
[79922.056413] RBP: ffffc90000287ac0 R08: ffff8881944c1090 R09: 0000000000000000
[79922.056417] R10: 00000000008e9088 R11: 3038653133626301 R12: ffff888187769b40
[79922.056421] R13: ffff88818761d900 R14: ffff88817cb31e80 R15: ffff88818761da28
[79922.056425] FS:  0000000000000000(0000) GS:ffff88819e240000(0063) knlGS:00000000f793c880
[79922.056430] CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
[79922.056433] CR2: 0000000008064670 CR3: 000000018bdc6001 CR4: 00000000000606a0
[79922.056436] Call Trace:
[79922.056447]  i915_vma_move_to_active+0x1c3/0x530
[79922.056454]  ? i915_request_await_object+0xf4/0x280
[79922.056461]  i915_gem_do_execbuffer+0xe28/0x1090
[79922.056470]  ? find_held_lock+0x39/0xb0
[79922.056478]  ? kvmalloc_node+0x26/0x70
[79922.056484]  i915_gem_execbuffer2_ioctl+0x1b4/0x360
[79922.056489]  ? i915_gem_execbuffer_ioctl+0x290/0x290
[79922.056494]  ? i915_gem_execbuffer_ioctl+0x290/0x290
[79922.056500]  drm_ioctl_kernel+0xaa/0xf0
[79922.056505]  drm_ioctl+0x31b/0x3c0
[79922.056510]  ? i915_gem_execbuffer_ioctl+0x290/0x290
[79922.056517]  i915_compat_ioctl+0x37/0x40
[79922.056523]  __ia32_compat_sys_ioctl+0x429/0xe90
[79922.056528]  ? ksys_read+0x53/0xc0
[79922.056536]  do_int80_syscall_32+0x50/0x100
[79922.056543]  entry_INT80_compat+0x7d/0x82
[79922.056548] RIP: 0023:0xf7f4dc42
[79922.056552] Code: 65 8b 15 04 00 00 00 8b 0e 8b 0c ca 83 f9 ff 75 0c 89 04 24 89 f0 e8 b3 fe ff ff eb 05 8b 46 04 01 c8 83 c4 14 5b 5e c3 cd 80 <c3> 8d b6 00 00 00 00 8d bc 27 00 00 00 00 8b 1c 24 c3 8d b6 00 00
[79922.056556] RSP: 002b:00000000fff8c764 EFLAGS: 00203292 ORIG_RAX: 0000000000000036
[79922.056561] RAX: ffffffffffffffda RBX: 000000000000000a RCX: 0000000040406469
[79922.056565] RDX: 00000000fff8c80c RSI: 0000000000000000 RDI: 0000000040406469
[79922.056569] RBP: 000000000000000a R08: 0000000000000000 R09: 0000000000000000
[79922.056572] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[79922.056576] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[79922.056580] Modules linked in:
[79922.056587] ---[ end trace 5702775963af313b ]---
[79922.056593] RIP: 0010:__list_del_entry_valid+0x52/0x90
[79922.056597] Code: 41 48 8b 12 48 39 d7 75 4c 48 8b 50 08 48 39 d7 75 07 b8 01 00 00 00 5d c3 48 89 fe 31 c0 48 c7 c7 20 9f 5e 85 e8 5c bb d1 ff <0f> 0b 48 89 c2 48 89 fe 31 c0 48 c7 c7 50 9e 5e 85 e8 46 bb d1 ff
[79922.056601] RSP: 0000:ffffc90000287ac0 EFLAGS: 00213282
[79922.056605] RAX: 0000000000000054 RBX: ffff88817cb31c00 RCX: 0000000000000000
[79922.056609] RDX: 0000000000000000 RSI: ffff88819e2653d8 RDI: ffff88819e2653d8
[79922.056612] RBP: ffffc90000287ac0 R08: ffff8881944c1090 R09: 0000000000000000
[79922.056616] R10: 00000000008e9088 R11: 3038653133626301 R12: ffff888187769b40
[79922.056619] R13: ffff88818761d900 R14: ffff88817cb31e80 R15: ffff88818761da28
[79922.056624] FS:  0000000000000000(0000) GS:ffff88819e240000(0063) knlGS:00000000f793c880
[79922.056628] CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
[79922.056631] CR2: 0000000008064670 CR3: 000000018bdc6001 CR4: 00000000000606a0
[79950.522519] wlan1: deauthenticating from 00:00:00:00:00:01 by local choice (Reason: 3=DEAUTH_LEAVING)
[80077.424202] wlan1: authenticate with 00:00:00:00:00:01
[80077.428304] wlan1: send auth to 00:00:00:00:00:01 (try 1/3)
[80077.430713] wlan1: authenticated
[80077.433089] wlan1: associate with 00:00:00:00:00:01 (try 1/3)
[80077.437077] wlan1: RX AssocResp from 00:00:00:00:00:01 (capab=0x401 status=0 aid=4)
[80077.440273] wlan1: associated
Comment 4 Chris Wilson 2018-12-29 20:51:50 UTC
#1 is not even the same list as #2/#3, nor are these out-of-the-way corners of the driver :|

Please do grab https://cgit.freedesktop.org/drm/drm-tip and enable CONFIG_DRM_I915_DEBUG_GEM in case that finds a common root cause.
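
For reference, the "list_del corruption" messages in the oopses above come from the kernel's list debugging check, which verifies that an entry's neighbours still point back at it before the entry is unlinked. The Python sketch below is a simplified model of that check, not the kernel code: `Node`, `list_add`, `check_list_del` and `list_del` are illustrative names, nodes are identified by a label instead of an address, and the poison-pointer checks are left out.

```python
class Node:
    """Minimal stand-in for a doubly linked list entry (illustrative only)."""

    def __init__(self, name):
        self.name = name
        # A fresh node forms a one-element circular list.
        self.prev = self
        self.next = self


def list_add(new, head):
    """Insert `new` right after `head`, keeping the list circular."""
    new.prev = head
    new.next = head.next
    head.next.prev = new
    head.next = new


def check_list_del(entry):
    """Return None if `entry` can be unlinked safely, else a diagnostic string."""
    if entry.prev.next is not entry:
        return "prev->next should be %s, but was %s" % (entry.name, entry.prev.next.name)
    if entry.next.prev is not entry:
        return "next->prev should be %s, but was %s" % (entry.name, entry.next.prev.name)
    return None


def list_del(entry):
    """Unlink `entry`, refusing to touch a list that already looks corrupted."""
    problem = check_list_del(entry)
    if problem is not None:
        raise RuntimeError("list_del corruption. " + problem)
    entry.prev.next = entry.next
    entry.next.prev = entry.prev
    # Leave the removed node as a valid one-element list.
    entry.prev = entry.next = entry
```

In this model the first branch corresponds to the message seen in comments 1 and 2 (`prev->next should be ...`) and the second branch to the one in comment 3 (`next->prev should be ...`).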
Comment 5 zblabunk 2018-12-31 11:39:48 UTC
Ok, let me try. It is strange that DEBUG_GEM depends on WERROR in Kconfig. The kernel is compiling now.
Comment 6 zblabunk 2018-12-31 21:05:39 UTC
I ran this kernel (drm-tip with debugging enabled) for a few hours, and it panicked, as seen on the blinking Caps Lock LED. Unfortunately, nothing was in the syslog after reboot.
Comment 7 zblabunk 2019-01-01 17:22:52 UTC
I just got another crash, while switching consoles. The previous one may also have happened while switching consoles. Anyway, blinking capslock, but nothing in the syslog after reboot :-(.
Comment 8 zblabunk 2019-01-02 10:38:18 UTC
Another day, another crash, also while switching consoles. Unfortunately, blinking capslock... so I assume there will be nothing in the syslog.
Comment 9 Joonas Lahtinen 2019-01-03 07:35:25 UTC
Would it be possible to use netconsole or a USB serial adapter to capture the dmesg live from another machine? That way the extra debug information could be captured, which should give a hint about what goes wrong.

Can you provoke the error more quickly if you keep on switching consoles back and forth?
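
netconsole sends kernel log messages as UDP datagrams to a configured target, so the second machine only needs something that listens on the chosen port and writes whatever arrives to disk. A minimal Python sketch of such a receiver follows. The port number 6666 and the file name `netconsole.log` are assumptions that have to match the netconsole setup on the crashing machine, `format_record` and `capture` are illustrative names, and a tool such as netcat can do the same job.

```python
import socket


def format_record(payload, sender_ip):
    """Turn one received datagram into printable lines tagged with the sender."""
    # Kernel messages are mostly ASCII; replace undecodable bytes instead of failing.
    text = payload.decode("utf-8", errors="replace").rstrip("\n")
    return "".join("[%s] %s\n" % (sender_ip, line) for line in text.split("\n"))


def capture(port=6666, logfile="netconsole.log"):
    """Append every datagram received on `port` to `logfile` until interrupted."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", port))
    # buffering=1 selects line buffering, so each line is written out promptly.
    with open(logfile, "a", buffering=1) as out:
        while True:
            payload, (sender_ip, _sender_port) = sock.recvfrom(65535)
            out.write(format_record(payload, sender_ip))
```

Calling `capture()` on the receiving machine before starting the workload keeps whatever the kernel manages to send before it locks up, even when nothing reaches the local syslog.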
Comment 10 Jani Saarinen 2019-01-03 07:42:40 UTC
So based on search this is SNB/2520M?
Comment 11 Jani Saarinen 2019-01-03 07:47:20 UTC
(In reply to Jani Saarinen from comment #10)
> So based on search this is SNB/2520M?

And on CI we have a similar system: https://intel-gfx-ci.01.org/tree/drm-tip/fi-snb-2520m.html = https://intel-gfx-ci.01.org/hardware.html#fi-snb-2520m

And with the latest drm-tip it seems to work properly.
Comment 12 zblabunk 2019-01-05 21:56:48 UTC
Ok, so I am running YouTube on one console and FlightGear on the other, and am switching between them:

date; while true; do DISPLAY=:0.0 xdotool key F1; sleep .1; DISPLAY=:0.0 xdotool key F2; sleep .2; done

...5 minutes of testing so far, and no crash. But this is not exactly representative of my "usual" load -- too much CPU use and not enough swapping....
Comment 13 zblabunk 2019-01-08 16:56:57 UTC
Next crash, this time running flightgear and no console switching involved. Panic -- capslock blinks.
Comment 14 zblabunk 2019-01-08 17:38:01 UTC
Ok, I have got netconsole running. It did not get everything, but it seems like it got the important parts:

usb 2-1.1.4: New USB device strings: Mfr=0, Product=2, SerialNumber=0
usb 2-1.1.4: Product: CSR8510 A10
------------[ cut here ]------------
ODEBUG: init destroyed (active state 0) object type: i915_sw_fence hint: submit_notify+0x0/0xa8
Modules linked in: netconsole [last unloaded: netconsole]
CPU: 3 PID: 3568 Comm: Xorg Not tainted 4.20.0+ #3
Hardware name: LENOVO 42872WU/42872WU, BIOS 8DET74WW (1.44 ) 03/13/2018
RIP: 0010:debug_print_object+0x72/0x90
Code: 10 83 c2 01 4c 89 ee 48 c7 c7 a8 62 5f 85 89 15 1c 3d 03 02 8b 4b 14 4d 8b 04 24 48 8b 14 c5 c0 a0 24 85 31 c0 e8 8e 49 cb ff <0f> 0b 5b 83 05 e8 82 57 01 01 41 5c 41 5d 5d c3 66 66 66 66 66 2e
RSP: 0000:ffffc900007a3a20 EFLAGS: 00213082
RAX: 0000000000000000 RBX: ffff88816c9ae308 RCX: 0000000000000006
RDX: 0000000000000007 RSI: ffff8881909e28e0 RDI: ffff88819e2e53d0
RBP: ffffc900007a3a38 R08: ffff8881909e28b8 R09: 0000000000000000
R10: 0000000060b7c18f R11: 00000001939c7401 R12: ffffffff8588aa40
R13: ffffffff856f0081 R14: ffffffff86468280 R15: 0000000000203202
Comment 15 zblabunk 2019-01-08 17:39:20 UTC
(This time it was flightgear and mplayer, at the same time).
Comment 16 zblabunk 2019-01-08 17:48:43 UTC
Created attachment 143018
Complete dmesg showing the two bugs.
Comment 17 zblabunk 2019-01-08 17:49:26 UTC
Attaching complete dmesg. There were actually two oopses, this has them both, along with complete messages from bootup to shutdown.
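
When a complete dmesg like this one holds more than one oops, a small script can help locate where each report starts. The sketch below is a hypothetical helper, not part of any kernel tooling: it only looks for the marker strings that appear in the logs quoted in this report (`list_del corruption`, `kernel BUG at`, `ODEBUG:`), and `find_markers` and `count_reports` are illustrative names.

```python
import re

# Marker strings taken from the logs quoted in this report.
MARKERS = re.compile(r"list_del corruption|kernel BUG at|ODEBUG:")


def find_markers(lines):
    """Return (line_number, text) for every line containing a known marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if MARKERS.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


def count_reports(lines):
    """Count '[ cut here ]' separators; the logs here show one per report."""
    return sum(1 for line in lines if "[ cut here ]" in line)
```

Both functions accept any iterable of lines, so an open file object for a saved copy of the attachment can be passed in directly.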
Comment 18 Chris Wilson 2019-02-04 22:51:27 UTC
Fwiw, the signal handling code (which is presumably the chief suspect here with the exchange of semaphores for interrupts) has been substantially modified in drm-tip: https://cgit.freedesktop.org/drm-tip.

