https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5096/shard-apl8/igt@gem_exec_reuse@baggage.html

<6> [630.362731] [IGT] gem_exec_reuse: starting subtest baggage
<6> [682.800570] [IGT] gem_exec_reuse: exiting, ret=0
<6> [685.973846] Console: switching to colour frame buffer device 240x67
<4> [697.383196] ODEBUG: Out of memory. ODEBUG disabled
<4> [698.591841] stack segment: 0000 [#1] PREEMPT SMP NOPTI
<4> [698.591905] CPU: 0 PID: 12 Comm: kworker/0:1 Tainted: G U 4.20.0-rc1-CI-CI_DRM_5096+ #1
<4> [698.591974] Hardware name: /NUC6CAYB, BIOS AYAPLCEL.86A.0049.2018.0508.1356 05/08/2018
<4> [698.592041] Workqueue: events free_obj_work
<4> [698.592078] RIP: 0010:free_obj_work+0x143/0x210
<4> [698.592116] Code: dd 00 00 00 49 bd 00 01 00 00 00 00 ad de 49 bc 00 02 00 00 00 00 ad de 48 89 43 08 4c 89 2e 4c 89 66 08 e8 3f 0e d6 ff eb 16 <48> 89 45 08 48 89 de 4c 89 2b 4c 89 63 08 48 89 eb e8 27 0e d6 ff
<4> [698.592247] RSP: 0018:ffffc9000007fe30 EFLAGS: 00010202
<4> [698.592288] RAX: ffffc9000007fe30 RBX: ffff8801529c5d68 RCX: 0000000000000001
<4> [698.592342] RDX: 0000000080000001 RSI: 00000000ffffffff RDI: ffff880276816a80
<4> [698.592395] RBP: 6b6b6b6b6b6b6b6b R08: 0000000000000000 R09: 0000000000000001
<4> [698.592448] R10: 0000000000000000 R11: 0000000000000000 R12: dead000000000200
<4> [698.592500] R13: dead000000000100 R14: 0000000000000000 R15: 0000000000000000
<4> [698.592554] FS: 0000000000000000(0000) GS:ffff880277a00000(0000) knlGS:0000000000000000
<4> [698.592614] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [698.592658] CR2: 0000564ee3e10f28 CR3: 000000026e5c4000 CR4: 00000000003406f0
<4> [698.592711] Call Trace:
<4> [698.592740] process_one_work+0x262/0x630
<4> [698.592779] worker_thread+0x37/0x380
<4> [698.592812] ? process_one_work+0x630/0x630
<4> [698.592847] kthread+0x119/0x130
<4> [698.592876] ? kthread_park+0x80/0x80
<4> [698.592911] ret_from_fork+0x3a/0x50
<4> [698.592947] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core lpc_ich snd_pcm r8169 mei_me prime_numbers mei pinctrl_broxton pinctrl_intel
I get the feeling that rc1 brought in a slew of new debugobject users, too many for the debugobject preallocation to handle. The current preallocation code dates from Feb 2018, so it is unlikely to be behind the recent discovery.
I wonder if we've enabled rcu debug objects recently. That ties in with the recent reports for untracked rcu objects on what seems to be an old problem.
Something else to note is that the debugobject preallocation is done in debug_object_init, so if we have not been calling it for our obj->rcu, that is a lot of debugobjects being allocated without any preallocation!
commit 8811d616dfaa8c6e1905a20ce0543ec401275997 (HEAD -> drm-intel-next-queued, drm-intel/for-linux-next, drm-intel/drm-intel-next-queued)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Nov 9 09:03:11 2018 +0000

    drm/i915: Initialise the obj->rcu head

    Make the rcu_head known to the system, in particular for debugobjects.
    And having declared it for debugobjects, we need to tidy up afterwards.

    v2: mark the obj->rcu as being destroyed when we reuse its location for
    the freed list.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108691
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20181109090311.15321-1-chris@chris-wilson.co.uk
This indeed fixed it, thanks!