Bug 112313

Summary: [CI][SHARDS]igt@gem_exec_reuse@contexts - dmesg-warn - cache_from_obj: Wrong slab cache. active_node but object is from kmalloc-\d+
Product: DRI Reporter: Lakshmi <lakshminarayana.vudum>
Component: DRM/Intel Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: RESOLVED MOVED QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: not set    
Priority: not set CC: intel-gfx-bugs
Version: DRI git   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: ICL i915 features: GEM/Other

Description Lakshmi 2019-11-18 09:06:15 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5290/shard-iclb7/igt@gem_exec_reuse@contexts.html
<4> [512.506573] cache_from_obj: Wrong slab cache. active_node but object is from kmalloc-64
<4> [512.506582] WARNING: CPU: 1 PID: 2545 at mm/slab.h:523 kmem_cache_free+0x374/0x390
<4> [512.506583] Modules linked in: snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal coretemp mei_hdcp snd_hda_intel snd_intel_dspcfg crct10dif_pclmul snd_hda_codec crc32_pclmul snd_hwdep snd_hda_core cdc_ether ghash_clmulni_intel usbnet snd_pcm mii e1000e ptp pps_core mei_me mei thunderbolt prime_numbers
<4> [512.506596] CPU: 1 PID: 2545 Comm: gem_exec_reuse Tainted: G     U            5.4.0-rc7-CI-CI_DRM_7357+ #1
<4> [512.506597] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.3234.A01.1906141750 06/14/2019
<4> [512.506600] RIP: 0010:kmem_cache_free+0x374/0x390
<4> [512.506601] Code: ff 0f 0b e9 fc fd ff ff 49 8b 4d 58 48 8b 55 58 48 c7 c6 90 8d e1 81 48 c7 c7 48 16 0b 82 c6 05 33 74 0d 01 01 e8 ec 15 e7 ff <0f> 0b 4c 89 ed e9 b1 fc ff ff ba 01 00 00 00 e9 b7 fe ff ff 0f 1f
<4> [512.506603] RSP: 0018:ffffc9000184fb68 EFLAGS: 00010082
<4> [512.506604] RAX: 0000000000000000 RBX: ffff88832a227e40 RCX: 0000000000000002
<4> [512.506605] RDX: 0000000080000002 RSI: 0000000000000000 RDI: 00000000ffffffff
<4> [512.506607] RBP: ffff8884985d49c0 R08: 0000000000000000 R09: 0000000000000001
<4> [512.506608] R10: 0000000030a5050d R11: 00000000fea845a1 R12: ffff8883aa227e40
<4> [512.506609] R13: ffff88849e00f3c0 R14: 0000000000000007 R15: 7fffffffffffffff
<4> [512.506610] FS:  00007f2bfdc95300(0000) GS:ffff88849fc80000(0000) knlGS:0000000000000000
<4> [512.506612] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [512.506613] CR2: 00007f7a60661538 CR3: 000000047c6d6003 CR4: 0000000000760ee0
<4> [512.506614] PKRU: 55555554
<4> [512.506615] Call Trace:
<4> [512.506659]  __active_retire+0x135/0x250 [i915]
<4> [512.506664]  dma_fence_signal_locked+0x9e/0x1b0
<4> [512.506666]  dma_fence_signal+0x1f/0x40
<4> [512.506703]  i915_request_wait+0x292/0x880 [i915]
<4> [512.506705]  ? dma_resv_get_fences_rcu+0xc4/0x550
<4> [512.506740]  i915_gem_object_wait+0xcc/0x430 [i915]
<4> [512.506774]  i915_gem_wait_ioctl+0xf3/0x270 [i915]
<4> [512.506805]  ? i915_gem_object_wait+0x430/0x430 [i915]
<4> [512.506809]  drm_ioctl_kernel+0xa7/0xf0
<4> [512.506813]  drm_ioctl+0x2e1/0x390
<4> [512.506844]  ? i915_gem_object_wait+0x430/0x430 [i915]
<4> [512.506848]  ? __handle_mm_fault+0x8b1/0xff0
<4> [512.506854]  do_vfs_ioctl+0xa0/0x6f0
<4> [512.506857]  ? __do_page_fault+0x2da/0x4f0
<4> [512.506862]  ksys_ioctl+0x35/0x60
<4> [512.506865]  __x64_sys_ioctl+0x11/0x20
<4> [512.506867]  do_syscall_64+0x4f/0x210
<4> [512.506870]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [512.506872] RIP: 0033:0x7f2bfd3435d7
<4> [512.506873] Code: b3 66 90 48 8b 05 b1 48 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 81 48 2d 00 f7 d8 64 89 01 48
<4> [512.506875] RSP: 002b:00007ffe7e04e268 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
<4> [512.506876] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2bfd3435d7
<4> [512.506877] RDX: 00007ffe7e04e2a0 RSI: 00000000c010646c RDI: 0000000000000005
<4> [512.506879] RBP: 00007ffe7e04e2a0 R08: 00007ffe7e07c1b0 R09: 00007ffe7e04e2e0
<4> [512.506880] R10: 0000000000000056 R11: 0000000000000246 R12: 00000000c010646c
<4> [512.506881] R13: 0000000000000005 R14: 00007ffe7e04e2a0 R15: 0000000000000004
<4> [512.506887] irq event stamp: 72017198
<4> [512.506890] hardirqs last  enabled at (72017197): [<ffffffff8124020d>] __slab_alloc.isra.84.constprop.89+0x4d/0x70
<4> [512.506892] hardirqs last disabled at (72017198): [<ffffffff819e8fbd>] _raw_spin_lock_irqsave+0xd/0x50
<4> [512.506894] softirqs last  enabled at (72016450): [<ffffffff81c00385>] __do_softirq+0x385/0x47f
<4> [512.506896] softirqs last disabled at (72016443): [<ffffffff810b803a>] irq_exit+0xba/0xc0
<4> [512.506897] ---[ end trace 147f0731e0a8dd42 ]---
<3> [512.507012] =============================================================================
<3> [512.507187] BUG kmalloc-64 (Tainted: G     U  W        ): Invalid object pointer 0x000000008ef4961b
<3> [512.507206] -----------------------------------------------------------------------------

<4> [512.507226] Disabling lock debugging due to kernel taint
<3> [512.507228] INFO: Slab 0x00000000700d1aaf objects=32 used=32 fp=0x0000000061c35381 flags=0x8000000000010201
<4> [512.507235] CPU: 1 PID: 2545 Comm: gem_exec_reuse Tainted: G    BU  W         5.4.0-rc7-CI-CI_DRM_7357+ #1
<4> [512.507236] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.3234.A01.1906141750 06/14/2019
<4> [512.507237] Call Trace:
<4> [512.507239]  dump_stack+0x71/0x9b
<4> [512.507241]  slab_err+0xa8/0xd0
<4> [512.507244]  ? free_debug_processing+0x37/0x380
<4> [512.507246]  free_debug_processing+0x201/0x380
<4> [512.507276]  ? __active_retire+0x135/0x250 [i915]
<4> [512.507279]  __slab_free+0x35b/0x520
<4> [512.507281]  ? _raw_spin_unlock_irqrestore+0x39/0x60
<4> [512.507283]  ? debug_check_no_obj_freed+0x11d/0x210
<4> [512.507285]  ? kmem_cache_free+0x31f/0x390
<4> [512.507316]  ? __active_retire+0x135/0x250 [i915]
<4> [512.507317]  kmem_cache_free+0x31f/0x390
<4> [512.507347]  __active_retire+0x135/0x250 [i915]
<4> [512.507349]  dma_fence_signal_locked+0x9e/0x1b0
<4> [512.507351]  dma_fence_signal+0x1f/0x40
<4> [512.507383]  i915_request_wait+0x292/0x880 [i915]
<4> [512.507384]  ? dma_resv_get_fences_rcu+0xc4/0x550
<4> [512.507415]  i915_gem_object_wait+0xcc/0x430 [i915]
<4> [512.507445]  i915_gem_wait_ioctl+0xf3/0x270 [i915]
<4> [512.507474]  ? i915_gem_object_wait+0x430/0x430 [i915]
<4> [512.507476]  drm_ioctl_kernel+0xa7/0xf0
<4> [512.507478]  drm_ioctl+0x2e1/0x390
<4> [512.507506]  ? i915_gem_object_wait+0x430/0x430 [i915]
<4> [512.507508]  ? __handle_mm_fault+0x8b1/0xff0
<4> [512.507511]  do_vfs_ioctl+0xa0/0x6f0
<4> [512.507512]  ? __do_page_fault+0x2da/0x4f0
<4> [512.507515]  ksys_ioctl+0x35/0x60
<4> [512.507517]  __x64_sys_ioctl+0x11/0x20
<4> [512.507518]  do_syscall_64+0x4f/0x210
<4> [512.507519]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [512.507520] RIP: 0033:0x7f2bfd3435d7
<4> [512.507521] Code: b3 66 90 48 8b 05 b1 48 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 81 48 2d 00 f7 d8 64 89 01 48
<4> [512.507522] RSP: 002b:00007ffe7e04e268 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
<4> [512.507523] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2bfd3435d7
<4> [512.507523] RDX: 00007ffe7e04e2a0 RSI: 00000000c010646c RDI: 0000000000000005
<4> [512.507524] RBP: 00007ffe7e04e2a0 R08: 00007ffe7e07c1b0 R09: 00007ffe7e04e2e0
<4> [512.507525] R10: 0000000000000056 R11: 0000000000000246 R12: 00000000c010646c
<4> [512.507525] R13: 0000000000000005 R14: 00007ffe7e04e2a0 R15: 0000000000000004
<3> [512.507528] FIX kmalloc-64: Object at 0x000000008ef4961b not freed
Comment 1 CI Bug Log 2019-11-18 09:07:10 UTC
The CI Bug Log issue associated with this bug has been updated.

### New filters associated

* ICL: igt@gem_exec_reuse@contexts - dmesg-warn - cache_from_obj: Wrong slab cache. active_node but object is from kmalloc-\d+
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5290/shard-iclb7/igt@gem_exec_reuse@contexts.html
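A minimal sketch of how the filter text above lines up with the dmesg output in the description, assuming the filter is applied as an ordinary regular expression (the `\d+` suffix suggests so). The helper name `matches_filter` is hypothetical, and Python's `re` module is used here only for illustration; this is not the CI Bug Log implementation.

```python
import re

# Filter text copied from the bug summary, treated as a regular expression
# (assumption: CI Bug Log may use a different regex engine or anchoring).
FILTER = r"cache_from_obj: Wrong slab cache. active_node but object is from kmalloc-\d+"

# First warning line from the dmesg excerpt in the description.
DMESG_LINE = (
    "<4> [512.506573] cache_from_obj: Wrong slab cache. "
    "active_node but object is from kmalloc-64"
)


def matches_filter(line: str) -> bool:
    """Return True if the line contains text matching the filter pattern."""
    return re.search(FILTER, line) is not None


if __name__ == "__main__":
    print(matches_filter(DMESG_LINE))
    print(matches_filter("<4> [512.506615] Call Trace:"))
```

Because the size suffix is written as `\d+`, the same filter would also catch the warning if the stray object came from a different kmalloc size class.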
Comment 2 Chris Wilson 2019-11-18 09:33:14 UTC
I took a quick look at all the allocations used for the tree, and they look valid. So I wonder whether this is a use-after-free or some other form of memory corruption.
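For readers unfamiliar with the warning, here is a toy userspace model of the check that produced it, written under stated assumptions: the kernel looks up which slab cache owns the object being freed and warns when that differs from the cache passed to kmem_cache_free(). The class names and the `cache_free` helper are hypothetical and do not model the real allocator; the sketch only shows that a pointer which no longer (or never did) belong to the active_node cache produces exactly this message.

```python
from dataclasses import dataclass


@dataclass
class Cache:
    """Stand-in for a slab cache; only the name matters in this model."""
    name: str


@dataclass
class Obj:
    """Stand-in for an allocated object; 'owner' plays the role of the
    slab cache recorded for the page the object lives on."""
    owner: Cache


def cache_free(cache: Cache, obj: Obj):
    """Model of the ownership check on free.

    Returns the warning text on a mismatch, otherwise None.
    """
    if obj.owner is not cache:
        return (
            "cache_from_obj: Wrong slab cache. "
            f"{cache.name} but object is from {obj.owner.name}"
        )
    return None


if __name__ == "__main__":
    active_node = Cache("active_node")
    kmalloc_64 = Cache("kmalloc-64")

    # Freeing an object to the cache it came from: no warning.
    print(cache_free(active_node, Obj(active_node)))

    # Freeing a pointer that now refers to kmalloc-64 memory, as a stale
    # or corrupted pointer might: the warning from the log is produced.
    print(cache_free(active_node, Obj(kmalloc_64)))
```

The model does not say how the pointer went bad; it only illustrates why valid-looking allocation sites are compatible with the warning if the pointer was stale or overwritten by the time it was freed.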
Comment 3 Chris Wilson 2019-11-18 23:31:34 UTC
Even worse, I've now seen a similar error myself:
       GEM_BUG_ON(i915_active_fence_isset(&it->base));

I still cannot see where the fence might be set without holding a ref->count, yet there it is. Notably, it has not occurred while hunting for it with KASAN and similar tools.
Comment 4 Martin Peres 2019-11-29 19:48:17 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed to further activity.

You can subscribe to and participate in the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/intel/issues/613.
