Bug 106084 - [CI] igt@.* - BUG kmalloc-2048 (Tainted: G U W ): Poison overwritten
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2018-04-16 17:31 UTC by Martin Peres
Modified: 2018-05-22 06:12 UTC

i915 platform: IVB
i915 features: display/Other


Description Martin Peres 2018-04-16 17:31:44 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4055/fi-ivb-3520m/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4057/fi-ivb-3520m/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b.html

[  409.021307] =============================================================================
[  409.021400] BUG kmalloc-2048 (Tainted: G     U  W        ): Poison overwritten
[  409.021405] -----------------------------------------------------------------------------

[  409.021411] Disabling lock debugging due to kernel taint
[  409.021413] INFO: 0x00000000db70b4f3-0x00000000419e839a. First byte 0x0 instead of 0x6b
[  409.021419] INFO: Allocated in usb_alloc_dev+0x29/0x300 age=513 cpu=1 pid=390
[  409.021422] 	kmem_cache_alloc_trace+0x234/0x2d0
[  409.021425] 	usb_alloc_dev+0x29/0x300
[  409.021428] 	hub_event+0x27d/0x1580
[  409.021432] 	process_one_work+0x21a/0x640
[  409.021435] 	worker_thread+0x48/0x3a0
[  409.021438] 	kthread+0xfb/0x130
[  409.021442] 	ret_from_fork+0x3a/0x50
[  409.021447] INFO: Freed in device_release+0x28/0x80 age=253 cpu=1 pid=390
[  409.021451] 	kobject_put+0xb7/0x190
[  409.021454] 	hub_event+0x1da/0x1580
[  409.021457] 	process_one_work+0x21a/0x640
[  409.021460] 	worker_thread+0x1ff/0x3a0
[  409.021463] 	kthread+0xfb/0x130
[  409.021466] 	ret_from_fork+0x3a/0x50
[  409.021469] INFO: Slab 0x00000000ab3cebe1 objects=13 used=13 fp=0x          (null) flags=0x8000000000008100
[  409.021472] INFO: Object 0x0000000096211629 @offset=4776 fp=0x00000000faaf5302
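
For context on the "First byte 0x0 instead of 0x6b" line: with slab debugging enabled, a freed object is filled with the poison byte 0x6b, and any other value found later means something wrote to the object after it was freed. The following is a minimal userspace sketch of that check, not the kernel's implementation. The names `MODEL_POISON_FREE`, `model_first_bad_byte` and `model_stale_write` are hypothetical, and the sketch ignores details such as the distinct end-of-object poison byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Poison value written over freed objects; matches the 0x6b in the log. */
#define MODEL_POISON_FREE 0x6b

/* Return the offset of the first byte that is not the poison value,
 * or -1 if the whole object is still intact. */
long model_first_bad_byte(const unsigned char *obj, size_t len)
{
    assert(obj != NULL);
    for (size_t i = 0; i < len; i++) {
        if (obj[i] != MODEL_POISON_FREE)
            return (long)i;
    }
    return -1;
}

/* Simulate a free (poison fill) followed by a stale write of 0x00 at
 * `offset`, then report where the poison check would trip. */
long model_stale_write(size_t offset)
{
    unsigned char obj[2048];   /* same size class as kmalloc-2048 */

    assert(offset < sizeof(obj));
    memset(obj, MODEL_POISON_FREE, sizeof(obj));  /* "free" the object */
    obj[offset] = 0x00;                           /* late write via stale pointer */
    return model_first_bad_byte(obj, sizeof(obj));
}
```

Calling `model_stale_write(100)` reports offset 100 as the first corrupted byte. The check only says that a late write happened, not who did it, which is why the allocation and free traces in the report are so generic.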
Comment 3 Chris Wilson 2018-04-18 11:13:49 UTC
The question here is which usb device so we can go and throw stones at the sinner. kasan should be able to tell us where the use-after-free occurs, which hopefully will have a less generic stacktrace.

Alternatively, bisect?
Comment 4 Chris Wilson 2018-04-20 09:13:36 UTC
<3>[   56.014815] ==================================================================
<3>[   56.014947] BUG: KASAN: use-after-free in xhci_free_virt_device.part.18+0x5e4/0x650
<3>[   56.014959] Read of size 4 at addr ffff8800aaffd178 by task systemd-udevd/1516

<4>[   56.014981] CPU: 0 PID: 1516 Comm: systemd-udevd Tainted: G     U  W         4.17.0-rc1-g47f407780a2b-kasan_27+ #1
<4>[   56.014985] Hardware name: LENOVO 2356GCG/2356GCG, BIOS G7ET31WW (1.13 ) 07/02/2012
<4>[   56.014990] Call Trace:
<4>[   56.014995]  <IRQ>
<4>[   56.015004]  dump_stack+0x7c/0xbb
<4>[   56.015012]  ? xhci_free_virt_device.part.18+0x5e4/0x650
<4>[   56.015019]  print_address_description+0x65/0x270
<4>[   56.015027]  ? xhci_free_virt_device.part.18+0x5e4/0x650
<4>[   56.015035]  kasan_report+0x23e/0x360
<4>[   56.015047]  xhci_free_virt_device.part.18+0x5e4/0x650
<4>[   56.015065]  handle_cmd_completion+0x1791/0x41a0
<4>[   56.015092]  ? lock_acquire+0x138/0x3c0
<4>[   56.015098]  ? lock_acquire+0x138/0x3c0
<4>[   56.015106]  ? xhci_queue_new_dequeue_state+0x860/0x860
<4>[   56.015125]  xhci_irq+0x1c89/0x64e0
<4>[   56.015160]  ? debug_check_no_locks_freed+0x2a0/0x2a0
<4>[   56.015168]  ? finish_td+0x350/0x350
<4>[   56.015186]  ? xhci_irq+0x64e0/0x64e0
<4>[   56.015195]  __handle_irq_event_percpu+0xe5/0x6e0
<4>[   56.015212]  handle_irq_event_percpu+0x65/0x120
<4>[   56.015221]  ? __handle_irq_event_percpu+0x6e0/0x6e0
<4>[   56.015227]  ? lock_acquire+0x138/0x3c0
<4>[   56.015233]  ? handle_edge_irq+0x24/0x750
<4>[   56.015243]  ? do_raw_spin_unlock+0x4f/0x240
<4>[   56.015254]  handle_irq_event+0x9c/0x130
<4>[   56.015263]  handle_edge_irq+0x2ba/0x750
<4>[   56.015278]  handle_irq+0x39/0x50
<4>[   56.015285]  do_IRQ+0x7d/0x1a0
<4>[   56.015296]  common_interrupt+0xf/0xf
<4>[   56.015301]  </IRQ>
<4>[   56.015308] RIP: 0010:unwind_get_return_address+0x72/0x90
<4>[   56.015313] RSP: 0018:ffff8800b17ef330 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd6
<4>[   56.015322] RAX: ffffffffa65a68a2 RBX: ffff8800b17ef3c8 RCX: 0000000000000000
<4>[   56.015328] RDX: 1ffff100162fde70 RSI: ffff8800b17ef200 RDI: ffffffffa65a68a2
<4>[   56.015332] RBP: ffff8800b17ef3b0 R08: 0000000000000001 R09: 0000000000000001
<4>[   56.015337] R10: ffff8800b17efc90 R11: 000000000001e033 R12: 0000000000000000
<4>[   56.015342] R13: 0000000000000000 R14: ffff88010d7e4ec0 R15: ffff88011a18de80
<4>[   56.015358]  ? filename_lookup+0x172/0x2e0
<4>[   56.015366]  ? filename_lookup+0x172/0x2e0
<4>[   56.015378]  __save_stack_trace+0x7e/0xd0
<4>[   56.015392]  ? filename_lookup+0x172/0x2e0
<4>[   56.015404]  kasan_kmalloc+0xe4/0x170
<4>[   56.015414]  ? kmem_cache_alloc+0xdf/0x2e0
<4>[   56.015420]  ? __d_alloc+0x25/0x900
<4>[   56.015425]  ? d_alloc+0x3f/0x240
<4>[   56.015430]  ? d_alloc_parallel+0xdf/0x13e0
<4>[   56.015436]  ? __lookup_slow+0x167/0x390
<4>[   56.015442]  ? lookup_slow+0x4b/0x70
<4>[   56.015447]  ? walk_component+0x67e/0xcc0
<4>[   56.015453]  ? path_lookupat+0x1a1/0x880
<4>[   56.015466]  ? __d_alloc+0x25/0x900
<4>[   56.015472]  ? __d_alloc+0x25/0x900
<4>[   56.015479]  ? set_track+0x86/0x100
<4>[   56.015485]  ? init_object+0x66/0x80
<4>[   56.015498]  ? ___slab_alloc.constprop.35+0x232/0x3e0
<4>[   56.015505]  ? ___slab_alloc.constprop.35+0x232/0x3e0
<4>[   56.015510]  ? __d_alloc+0x25/0x900
<4>[   56.015532]  ? mark_held_locks+0xa8/0xf0
<4>[   56.015542]  ? __d_alloc+0x25/0x900
<4>[   56.015548]  ? trace_hardirqs_on_caller+0x33f/0x590
<4>[   56.015560]  ? __d_alloc+0x25/0x900
<4>[   56.015565]  kmem_cache_alloc+0xdf/0x2e0
<4>[   56.015576]  __d_alloc+0x25/0x900
<4>[   56.015590]  d_alloc+0x3f/0x240
<4>[   56.015603]  d_alloc_parallel+0xdf/0x13e0
<4>[   56.015613]  ? debug_check_no_locks_freed+0x2a0/0x2a0
<4>[   56.015629]  ? __lock_acquire+0x8a4/0x4f30
<4>[   56.015638]  ? __mutex_unlock_slowpath+0xd3/0x670
<4>[   56.015645]  ? __d_lookup_rcu+0x720/0x720
<4>[   56.015657]  ? mark_held_locks+0xa8/0xf0
<4>[   56.015670]  ? trace_hardirqs_on_caller+0x33f/0x590
<4>[   56.015680]  ? __lockdep_init_map+0xdf/0x580
<4>[   56.015688]  ? __lockdep_init_map+0xdf/0x580
<4>[   56.015704]  __lookup_slow+0x167/0x390
<4>[   56.015724]  ? follow_dotdot+0x1f0/0x1f0
<4>[   56.015752]  lookup_slow+0x4b/0x70
<4>[   56.015761]  walk_component+0x67e/0xcc0
<4>[   56.015769]  ? inode_permission+0x2c7/0x380
<4>[   56.015777]  ? lookup_fast+0x10b0/0x10b0
<4>[   56.015785]  ? link_path_walk+0x6cc/0x1240
<4>[   56.015801]  ? walk_component+0xcc0/0xcc0
<4>[   56.015821]  path_lookupat+0x1a1/0x880
<4>[   56.015826]  ? getname_flags+0x4a/0x3e0
<4>[   56.015832]  ? user_path_at_empty+0x18/0x30
<4>[   56.015841]  ? path_mountpoint+0x900/0x900
<4>[   56.015855]  ? getname_flags+0x4a/0x3e0
<4>[   56.015862]  ? getname_flags+0x4a/0x3e0
<4>[   56.015869]  ? set_track+0x86/0x100
<4>[   56.015875]  ? init_object+0x66/0x80
<4>[   56.015888]  ? ___slab_alloc.constprop.35+0x232/0x3e0
<4>[   56.015900]  filename_lookup+0x172/0x2e0
<4>[   56.015912]  ? filename_parentat+0x380/0x380
<4>[   56.015934]  ? strncpy_from_user+0x75/0x280
<4>[   56.015941]  ? getname_flags+0x4a/0x3e0
<4>[   56.015947]  ? rcu_read_lock_sched_held+0x10f/0x130
<4>[   56.015954]  ? kmem_cache_alloc+0x278/0x2e0
<4>[   56.015965]  ? getname_flags+0x88/0x3e0
<4>[   56.015981]  ? do_readlinkat+0xad/0x240
<4>[   56.015986]  do_readlinkat+0xad/0x240
<4>[   56.015997]  ? __x32_compat_sys_newfstat+0x70/0x70
<4>[   56.016007]  ? syscall_trace_enter+0x27e/0x880
<4>[   56.016013]  ? do_faccessat+0x36d/0x570
<4>[   56.016021]  ? syscall_slow_exit_work+0x400/0x400
<4>[   56.016040]  __x64_sys_readlinkat+0x8e/0xf0
<4>[   56.016049]  do_syscall_64+0x97/0x400
<4>[   56.016060]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[   56.016065] RIP: 0033:0x7f869cfbcd1a
<4>[   56.016070] RSP: 002b:00007ffe3fe72f08 EFLAGS: 00000202 ORIG_RAX: 000000000000010b
<4>[   56.016079] RAX: ffffffffffffffda RBX: 000056128e58ec10 RCX: 00007f869cfbcd1a
<4>[   56.016083] RDX: 000056128e58ec10 RSI: 00007ffe3fe72f90 RDI: 00000000ffffff9c
<4>[   56.016088] RBP: 0000000000000064 R08: 000000000000fefe R09: 0000000000000018
<4>[   56.016092] R10: 0000000000000063 R11: 0000000000000202 R12: 00007ffe3fe72f90
<4>[   56.016097] R13: 00000000ffffff9c R14: 00007ffe3fe72f60 R15: 0000000000000063

<3>[   56.016130] Allocated by task 153:
<4>[   56.016140]  kmem_cache_alloc_trace+0x125/0x300
<4>[   56.016147]  usb_alloc_dev+0x50/0xc70
<4>[   56.016153]  hub_event+0x10b9/0x3370
<4>[   56.016159]  process_one_work+0x6f8/0x1600
<4>[   56.016164]  worker_thread+0xc9/0xc20
<4>[   56.016170]  kthread+0x30c/0x3d0
<4>[   56.016175]  ret_from_fork+0x3a/0x50

<3>[   56.016187] Freed by task 153:
<4>[   56.016196]  kfree+0xe9/0x310
<4>[   56.016202]  device_release+0x6e/0x1d0
<4>[   56.016208]  kobject_put+0x14b/0x400
<4>[   56.016213]  hub_event+0xfc9/0x3370
<4>[   56.016218]  process_one_work+0x6f8/0x1600
<4>[   56.016223]  worker_thread+0x5dd/0xc20
<4>[   56.016229]  kthread+0x30c/0x3d0
<4>[   56.016234]  ret_from_fork+0x3a/0x50

<3>[   56.016247] The buggy address belongs to the object at ffff8800aaffcb08
                   which belongs to the cache kmalloc-2048 of size 2048
<3>[   56.016257] The buggy address is located 1648 bytes inside of
                   2048-byte region [ffff8800aaffcb08, ffff8800aaffd308)
<3>[   56.016266] The buggy address belongs to the page:
<0>[   56.016276] page:ffffea0002abfe00 count:1 mapcount:0 mapping:0000000000000000 index:0x0 compound_mapcount: 0
<0>[   56.016293] flags: 0x4000000000008100(slab|head)
<1>[   56.016303] raw: 4000000000008100 0000000000000000 0000000000000000 00000001000d000d
<1>[   56.016314] raw: ffffea0002ab6a20 ffffea0002a45e20 ffff88011a0113c0 0000000000000000
<1>[   56.016322] page dumped because: kasan: bad access detected

<3>[   56.016338] Memory state around the buggy address:
<3>[   56.016347]  ffff8800aaffd000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
<3>[   56.016356]  ffff8800aaffd080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
<3>[   56.016365] >ffff8800aaffd100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
<3>[   56.016373]                                                                 ^
<3>[   56.016382]  ffff8800aaffd180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
<3>[   56.016391]  ffff8800aaffd200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
<3>[   56.016399] ==================================================================
Comment 5 Guenter Roeck 2018-04-25 18:25:29 UTC
Same problem seen with v4.17-rc2 when inserting a USB Type-C dongle.

[ 1657.051472] usb 2-2: new SuperSpeed USB device number 5 using xhci_hcd
[ 1657.084589] usb 2-2: device descriptor read/8, error -71
[ 1657.199254] usb 2-2: new SuperSpeed USB device number 5 using xhci_hcd
[ 1657.319426] usb 2-2: device descriptor read/8, error -71
[ 1657.453806] ==================================================================
[ 1657.462113] BUG: KASAN: use-after-free in xhci_free_virt_device+0x33b/0x38e
[ 1657.469911] Read of size 4 at addr ffff88040e82b550 by task kworker/3:3/2085

[ 1657.479477] CPU: 3 PID: 2085 Comm: kworker/3:3 Not tainted 4.17.0-rc2-00001-g41e284e58369-dirty #10
[ 1657.489598] Hardware name: Google Eve/Eve, BIOS Google_Eve.9584.95.0 09/27/2017
[ 1657.497782] Workqueue: usb_hub_wq hub_event
[ 1657.502469] Call Trace:
[ 1657.505212]  <IRQ>
[ 1657.507469]  dump_stack+0x7d/0xbd
[ 1657.511184]  print_address_description+0x80/0x2d2
[ 1657.516443]  ? xhci_free_virt_device+0x33b/0x38e
[ 1657.521619]  kasan_report+0x26a/0x2aa
[ 1657.525721]  xhci_free_virt_device+0x33b/0x38e
[ 1657.530695]  handle_cmd_completion+0x5e6/0x1f19
[ 1657.535768]  ? lock_acquire+0x1f5/0x22b
[ 1657.540071]  ? match_held_lock+0x1d/0xff
[ 1657.544466]  xhci_irq+0x20c7/0x2284
[ 1657.548371]  ? match_held_lock+0x1d/0xff
[ 1657.552766]  ? xhci_irq+0x2284/0x2284
[ 1657.556874]  __handle_irq_event_percpu+0x1da/0x424
[ 1657.562238]  handle_irq_event_percpu+0x34/0x8f
[ 1657.567212]  handle_irq_event+0x59/0x89
[ 1657.571514]  handle_edge_irq+0x13e/0x188
[ 1657.575921]  handle_irq+0x19f/0x1b0
[ 1657.579823]  do_IRQ+0x8b/0xfa
[ 1657.583144]  common_interrupt+0xf/0xf
[ 1657.587244]  </IRQ>
[ 1657.589600] RIP: 0010:__asan_load4+0x63/0x84
[ 1657.594379] RSP: 0018:ffff8804149af7d8 EFLAGS: 00000a06 ORIG_RAX: ffffffffffffffdc
[ 1657.602853] RAX: 1ffff10082935f1d RBX: ffff8804149af8e8 RCX: ffffffff9f2e52a7
[ 1657.610841] RDX: 0000000000000008 RSI: 0000000000000003 RDI: ffff8804149af8e8
[ 1657.618838] RBP: ffff8804149af7d8 R08: dffffc0000000000 R09: ffffed0081d055dd
[ 1657.626828] R10: fffffbfff4198620 R11: ffffffffa0cc30fd R12: 0000000000000008
[ 1657.634811] R13: ffff8804149afc98 R14: ffff8804149b0000 R15: ffff8804149a8000
[ 1657.642802]  ? on_stack+0x38/0x71
[ 1657.646514]  ? stack_access_ok+0x17/0x41
[ 1657.650903]  on_stack+0x38/0x71
[ 1657.654424]  ? device_release+0x9b/0xda
[ 1657.658719]  stack_access_ok+0x17/0x41
[ 1657.662915]  deref_stack_reg+0x1d/0x44
[ 1657.667127]  ? unwind_next_frame+0x65f/0x7a0
[ 1657.671913]  unwind_next_frame+0x674/0x7a0
[ 1657.676502]  ? kobject_put+0x9f/0xb9
[ 1657.680500]  ? kobject_put+0x9f/0xb9
[ 1657.684501]  __save_stack_trace+0xbf/0xe2
[ 1657.688992]  ? kobject_put+0x9f/0xb9
[ 1657.692998]  ? kfree+0x1d9/0x26f
[ 1657.696610]  save_stack+0x46/0xce
[ 1657.700319]  ? __kasan_slab_free+0x102/0x126
[ 1657.705105]  ? slab_free_freelist_hook+0x84/0xd1
[ 1657.710285]  ? kfree+0x1d9/0x26f
[ 1657.713898]  ? device_release+0x9b/0xda
[ 1657.718191]  ? look_up_lock_class+0x104/0x127
[ 1657.723073]  ? register_lock_class+0x4a2/0x507
[ 1657.728067]  ? hlock_class+0x67/0x85
[ 1657.732069]  ? mark_lock+0x3a/0x27a
[ 1657.735974]  ? lock_acquire+0x1f5/0x22b
[ 1657.740271]  ? lookup_chain_cache+0x4c/0x76
[ 1657.744956]  ? __lock_acquire+0x13d9/0x1522
[ 1657.749637]  ? match_held_lock+0x1d/0xff
[ 1657.754051]  ? hlock_class+0x67/0x85
[ 1657.758059]  ? mark_lock+0x3a/0x27a
[ 1657.761965]  ? mark_held_locks+0x30/0x87
[ 1657.766357]  __kasan_slab_free+0x102/0x126
[ 1657.770948]  slab_free_freelist_hook+0x84/0xd1
[ 1657.775926]  kfree+0x1d9/0x26f
[ 1657.779345]  ? device_release+0x9b/0xda
[ 1657.783637]  device_release+0x9b/0xda
[ 1657.787743]  kobject_put+0x9f/0xb9
[ 1657.791555]  hub_event+0x1058/0x1626
[ 1657.795558]  ? xhci_address_device+0x14/0x14
[ 1657.800336]  process_one_work+0x423/0x761
[ 1657.804830]  worker_thread+0x2ec/0x469
[ 1657.809046]  ? cancel_delayed_work+0xdd/0xdd
[ 1657.813827]  kthread+0x1d2/0x1e1
[ 1657.817439]  ? kthread_flush_work+0x118/0x118
[ 1657.822322]  ret_from_fork+0x3a/0x50

[ 1657.827994] Allocated by task 2085:
[ 1657.831897]  kasan_kmalloc+0x99/0xa8
[ 1657.835902]  kmem_cache_alloc_trace+0x10d/0x133
[ 1657.840978]  usb_alloc_dev+0x41/0x551
[ 1657.845070]  hub_event+0x9d2/0x1626
[ 1657.848995]  process_one_work+0x423/0x761
[ 1657.853487]  worker_thread+0x2ec/0x469
[ 1657.857683]  kthread+0x1d2/0x1e1
[ 1657.861295]  ret_from_fork+0x3a/0x50

[ 1657.866961] Freed by task 2085:
[ 1657.870482]  __kasan_slab_free+0x102/0x126
[ 1657.875071]  slab_free_freelist_hook+0x84/0xd1
[ 1657.880066]  kfree+0x1d9/0x26f
[ 1657.883484]  __kfree_skb+0x30/0x3a
[ 1657.887296]  unix_stream_read_generic+0xa61/0xb09
[ 1657.892563]  unix_stream_recvmsg+0x53/0x69
[ 1657.897146]  ___sys_recvmsg+0x167/0x289
[ 1657.901437]  __sys_recvmsg+0x63/0xa2
[ 1657.905444]  do_syscall_64+0x74/0x94
[ 1657.909449]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

[ 1657.916786] The buggy address belongs to the object at ffff88040e82aee8
                which belongs to the cache kmalloc-2048 of size 2048
[ 1657.930989] The buggy address is located 1640 bytes inside of
                2048-byte region [ffff88040e82aee8, ffff88040e82b6e8)
[ 1657.944320] The buggy address belongs to the page:
[ 1657.949685] page:ffffea00103a0a00 count:1 mapcount:0 mapping:0000000000000000 index:0xffff88040e828008 compound_mapcount: 0
[ 1657.962146] flags: 0x8000000000008100(slab|head)
[ 1657.967317] raw: 8000000000008100 0000000000000000 ffff88040e828008 00000001000d000c
[ 1657.975986] raw: ffffea00108fe420 ffff88042d403200 ffff88042d40d0c0 0000000000000000
[ 1657.984650] page dumped because: kasan: bad access detected

[ 1657.992554] Memory state around the buggy address:
[ 1657.997915]  ffff88040e82b400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1658.005999]  ffff88040e82b480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1658.014096] >ffff88040e82b500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1658.022174]                                                  ^
[ 1658.028705]  ffff88040e82b580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1658.036792]  ffff88040e82b600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1658.044873] ==================================================================

(gdb) l *xhci_free_virt_device+0x33b
0xffffffff8191b855 is in xhci_free_virt_device (/mnt/host/source/src/third_party/kernel/v4.14/drivers/usb/host/xhci-mem.c:916).
911		if (dev->in_ctx)
912			xhci_free_container_ctx(xhci, dev->in_ctx);
913		if (dev->out_ctx)
914			xhci_free_container_ctx(xhci, dev->out_ctx);
915	
916		if (dev->udev && dev->udev->slot_id)
917			dev->udev->slot_id = 0;
918		kfree(xhci->devs[slot_id]);
919		xhci->devs[slot_id] = NULL;
920	}


It appears that dev->udev has been freed.

Wonder why this is filed against drm ?
Comment 6 Guenter Roeck 2018-04-25 18:29:13 UTC
The culprit may be a400efe455f7 ("xhci: zero usb device slot_id member when disabling and freeing a xhci slot") which introduces the code in question.
Comment 7 Martin Peres 2018-04-25 21:16:30 UTC
(In reply to Guenter Roeck from comment #5)
> Same problem seen with v4.17-rc2 when inserting a USB Type-C dongle.
> 
> [...]
> 
> (gdb) l *xhci_free_virt_device+0x33b
> 0xffffffff8191b855 is in xhci_free_virt_device
> (/mnt/host/source/src/third_party/kernel/v4.14/drivers/usb/host/xhci-mem.c:
> 916).
> 911		if (dev->in_ctx)
> 912			xhci_free_container_ctx(xhci, dev->in_ctx);
> 913		if (dev->out_ctx)
> 914			xhci_free_container_ctx(xhci, dev->out_ctx);
> 915	
> 916		if (dev->udev && dev->udev->slot_id)
> 917			dev->udev->slot_id = 0;
> 918		kfree(xhci->devs[slot_id]);
> 919		xhci->devs[slot_id] = NULL;
> 920	}
> 
> 
> It appears that dev->udev has been freed.

Thanks for your analysis!

> 
> Wonder why this is filed against drm ?

It is filed against DRM because it was caught by Intel GFX CI. I am working through the failures found there, so I first file them against DRM to give the DRM devs a chance to check out the bugs and report them to the right people. If they don't, I do it myself once the storm passes and I get to pick up some bugs and drive them to fixes.

FYI, here are all the bugs tracked by our CI system: https://intel-gfx-ci.01.org/cibuglog/
Comment 8 Chris Wilson 2018-05-11 14:31:21 UTC
commit 44a182b9d17765514fa2b1cc911e4e65134eef93
Author: Mathias Nyman <mathias.nyman@linux.intel.com>
Date:   Thu May 3 17:30:07 2018 +0300

    xhci: Fix use-after-free in xhci_free_virt_device
    
    KASAN found a use-after-free in xhci_free_virt_device+0x33b/0x38e
    where xhci_free_virt_device() sets slot id to 0 if udev exists:
    if (dev->udev && dev->udev->slot_id)
            dev->udev->slot_id = 0;
    
    dev->udev will be true even if udev is freed because dev->udev is
    not set to NULL.
    
    set dev->udev pointer to NULL in xhci_free_dev()
    
    The original patch went to stable so this fix needs to be applied
    there as well.
    
    Fixes: a400efe455f7 ("xhci: zero usb device slot_id member when disabling and freeing a xhci slot")
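
The pattern described in the commit message can be modelled in userspace roughly as follows. This is a sketch under assumptions, not the driver code: `model_udev`, `model_virt_dev` and the `model_*` functions are hypothetical stand-ins for `struct usb_device`, `struct xhci_virt_device`, `xhci_free_dev()` and `xhci_free_virt_device()`, and the teardown order is inferred from the traces in comments 4 and 5.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct usb_device. */
struct model_udev {
    int slot_id;
};

/* Hypothetical stand-in for struct xhci_virt_device. */
struct model_virt_dev {
    struct model_udev *udev;   /* back-pointer that dangles if not cleared */
};

/* Models the fixed xhci_free_dev(): clear the back-pointer so later
 * code cannot reach the udev that is about to be freed. */
static void model_free_dev(struct model_virt_dev *vdev)
{
    assert(vdev != NULL);
    vdev->udev = NULL;
}

/* Models the check in xhci_free_virt_device().
 * Returns 1 if udev was dereferenced, 0 if the NULL check skipped it. */
static int model_free_virt_device(struct model_virt_dev *vdev)
{
    assert(vdev != NULL);
    if (vdev->udev && vdev->udev->slot_id) {
        vdev->udev->slot_id = 0;
        return 1;
    }
    return 0;
}

/* Replays the teardown: free_dev, then release of the udev, then the
 * late command completion. Returns 1 if freed memory would have been
 * touched, 0 if not, -1 on allocation failure. */
int model_teardown(void)
{
    struct model_virt_dev vdev;
    struct model_udev *udev = malloc(sizeof(*udev));

    if (!udev)
        return -1;
    udev->slot_id = 1;        /* arbitrary non-zero slot id */
    vdev.udev = udev;

    model_free_dev(&vdev);    /* the fix: drop the pointer first */
    free(udev);               /* device_release() -> kfree() in the trace */
    return model_free_virt_device(&vdev);  /* handle_cmd_completion() path */
}
```

With the pointer cleared, `model_teardown()` returns 0 because the `vdev->udev` test fails before any dereference. Without the assignment in `model_free_dev()`, the same check would pass on a stale pointer and read `slot_id` from freed memory, which matches the 4-byte read KASAN reported.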
Comment 9 Martin Peres 2018-05-22 06:12:14 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_31/fi-ivb-3520m/igt@kms_vblank@pipe-b-ts-continuation-suspend.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_33/fi-ivb-3520m/igt@kms_atomic@plane_invalid_params_fence.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_33/fi-ivb-3520m/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-a-planes.html

(In reply to Chris Wilson from comment #8)
> commit 44a182b9d17765514fa2b1cc911e4e65134eef93
> Author: Mathias Nyman <mathias.nyman@linux.intel.com>
> Date:   Thu May 3 17:30:07 2018 +0300
> 
>     xhci: Fix use-after-free in xhci_free_virt_device
>     
>     KASAN found a use-after-free in xhci_free_virt_device+0x33b/0x38e
>     where xhci_free_virt_device() sets slot id to 0 if udev exists:
>     if (dev->udev && dev->udev->slot_id)
>             dev->udev->slot_id = 0;
>     
>     dev->udev will be true even if udev is freed because dev->udev is
>     not set to NULL.
>     
>     set dev->udev pointer to NULL in xhci_free_dev()
>     
>     The original patch went to stable so this fix needs to be applied
>     there as well.
>     
>     Fixes: a400efe455f7 ("xhci: zero usb device slot_id member when
> disabling and freeing a xhci slot")

Yep, it seems like it did the trick! Thanks!

