https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6977/shard-skl9/igt@gem_exec_reuse@contexts.html

<6> [2075.844850] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),global_oom,task_memcg=/,task=gem_exec_reuse,pid=3690,uid=0
<3> [2075.846994] Out of memory: Killed process 3690 (gem_exec_reuse) total-vm:214048kB, anon-rss:9712kB, file-rss:4kB, shmem-rss:0kB, UID:0 pgtables:425984kB oom_score_adj:1000
<6> [2076.112341] oom_reaper: reaped process 3690 (gem_exec_reuse), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
<4> [2079.622101] java invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=-1000
<4> [2079.622127] CPU: 0 PID: 852 Comm: java Tainted: G U 5.4.0-rc1-CI-CI_DRM_6977+ #1
<4> [2079.622141] Hardware name: Google Caroline/Caroline, BIOS MrChromebox 08/27/2018
<4> [2079.622152] Call Trace:
<4> [2079.622184] dump_stack+0x67/0x9b
<4> [2079.622206] dump_header+0x4a/0x3f0
<4> [2079.622231] oom_kill_process+0xe8/0x200
<4> [2079.622257] out_of_memory+0xfa/0x380
<4> [2079.622291] __alloc_pages_slowpath+0xc1d/0xdc0
<4> [2079.622397] __alloc_pages_nodemask+0x2ce/0x330
<4> [2079.622441] pagecache_get_page+0xb5/0x240
<4> [2079.622475] filemap_fault+0x6e5/0x9c0
<4> [2079.622498] ? filemap_map_pages+0x1cd/0x560
<4> [2079.622539] ? ext4_filemap_fault+0x22/0x39
<4> [2079.622586] ext4_filemap_fault+0x2a/0x39
<4> [2079.622605] __do_fault+0x4a/0xa0
<4> [2079.622632] __handle_mm_fault+0xa0f/0xf80
<4> [2079.622699] handle_mm_fault+0x159/0x350
<4> [2079.622733] __do_page_fault+0x2bb/0x4f0
<4> [2079.622774] page_fault+0x34/0x40
<4> [2079.622792] RIP: 0033:0x7f0080b13fd3
<4> [2079.622819] Code: Bad RIP value.
<4> [2079.622832] RSP: 002b:00007f0061cbf6e0 EFLAGS: 00010206
<4> [2079.622849] RAX: 00007f007825e4f0 RBX: 00007f007825d800 RCX: 0000000000002744
<4> [2079.622860] RDX: 0000000000002745 RSI: 00000000000001a1 RDI: 00000000c3ec70d8
<4> [2079.622872] RBP: 00007f0061cbf740 R08: 00007ffd29dc0090 R09: 00007f0061cbf770
<4> [2079.622884] R10: 00007f006955d838 R11: 00000000c3ee2860 R12: 00000000000000c7
<4> [2079.622896] R13: 00007f0061cbf750 R14: 0000000000000000 R15: 00007f007825d800
<4> [2079.623119] Mem-Info:
<4> [2079.623164] active_anon:43953 inactive_anon:2312 isolated_anon:0 active_file:61 inactive_file:129 isolated_file:0 unevictable:108514 dirty:0 writeback:0 unstable:0 slab_reclaimable:46619 slab_unreclaimable:752469 mapped:346 shmem:110826 pagetables:1516 bounce:0 free:22460 free_pcp:341 free_cma:0
<4> [2079.623213] Node 0 active_anon:175812kB inactive_anon:9248kB active_file:244kB inactive_file:516kB unevictable:434056kB isolated(anon):0kB isolated(file):0kB mapped:1384kB dirty:0kB writeback:0kB shmem:443304kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 108544kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
<4> [2079.623332] DMA free:15400kB min:276kB low:344kB high:412kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:176kB writepending:0kB present:15996kB managed:15904kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
<4> [2079.623355] lowmem_reserve[]: 0 1823 3783 3783
<4> [2079.623444] DMA32 free:40100kB min:32444kB low:40552kB high:48660kB active_anon:0kB inactive_anon:0kB active_file:28kB inactive_file:64kB unevictable:337220kB writepending:0kB present:1992708kB managed:1936720kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:496kB local_pcp:248kB free_cma:0kB
<4> [2079.623465] lowmem_reserve[]: 0 0 1959 1959
<4> [2079.623564] Normal free:34340kB min:34856kB low:43568kB high:52280kB active_anon:175812kB inactive_anon:9248kB active_file:0kB inactive_file:384kB unevictable:96660kB writepending:0kB present:2080768kB managed:2006912kB mlocked:0kB kernel_stack:2672kB pagetables:6064kB bounce:0kB free_pcp:868kB local_pcp:124kB free_cma:0kB
<4> [2079.623586] lowmem_reserve[]: 0 0 0 0
<4> [2079.623653] DMA: 2*4kB (U) 0*8kB 2*16kB (UE) 0*32kB 2*64kB (UE) 1*128kB (U) 1*256kB (E) 1*512kB (E) 2*1024kB (UE) 2*2048kB (ME) 2*4096kB (M) = 15400kB
<4> [2079.623836] DMA32: 5*4kB (UME) 3*8kB (UE) 1*16kB (E) 13*32kB (ME) 14*64kB (UME) 8*128kB (ME) 5*256kB (UM) 3*512kB (UM) 6*1024kB (ME) 4*2048kB (ME) 5*4096kB (M) = 40028kB
<4> [2079.623982] Normal: 1501*4kB (UMEH) 866*8kB (ME) 423*16kB (MEH) 197*32kB (UMEH) 88*64kB (UMEH) 21*128kB (UMH) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 34324kB
<6> [2079.624110] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
<4> [2079.624130] 111038 total pagecache pages
<4> [2079.624153] 0 pages in swap cache
<4> [2079.624176] Swap cache stats: add 0, delete 0, find 0/0
<4> [2079.624197] Free swap = 0kB
<4> [2079.624324] Total swap = 0kB
<4> [2079.624344] 1022368 pages RAM
<4> [2079.624364] 0 pages HighMem/MovableOnly
<4> [2079.624384] 32484 pages reserved
<6> [2079.624408] Unreclaimable slab info:
<6> [2079.624431] Name Used Total
<6> [2079.624463] i915_vma 115187KB 115189KB
<6> [2079.624491] i915_priolist 0KB 8KB
<6> [2079.624519] i915_dependency 0KB 8KB
<6> [2079.624549] drm_i915_gem_object 409648KB 409656KB
<6> [2079.624578] i915_lut_handle 38949KB 38950KB
<6> [2079.624606] intel_context 3594KB 3606KB
<6> [2079.624640] active_node 0KB 32KB
<6> [2079.624710] bio-2 1KB 15KB
<6> [2079.624764] fib6_nodes 3KB 8KB
<6> [2079.624792] ip6_dst_cache 5KB 15KB
<6> [2079.624824] RAWv6 16KB 31KB
<6> [2079.624856] UDPv6 0KB 31KB
<6> [2079.624893] TCPv6 3KB 31KB
<6> [2079.624937] sd_ext_cdb 0KB 7KB
<6> [2079.624965] sgpool-128 8KB 31KB
<6> [2079.624993] sgpool-64 4KB 31KB
<6> [2079.625020] sgpool-32 2KB 31KB
<6> [2079.625048] sgpool-16 1KB 15KB
<6> [2079.625076] sgpool-8 1KB 15KB
<6> [2079.625105] mqueue_inode_cache 1KB 30KB
<6> [2079.625144] jbd2_inode 6KB 31KB
<6> [2079.625175] ext4_system_zone 9KB 15KB
<6> [2079.625203] ext4_bio_post_read_ctx 52KB 54KB
<6> [2079.625385] bio-1 2KB 15KB
<6> [2079.625407] posix_timers_cache 0KB 31KB
<6> [2079.625423] iommu_devinfo 11KB 16KB
<6> [2079.625439] iommu_domain 45KB 63KB
<6> [2079.625455] iommu_iova 50475KB 50478KB
<6> [2079.625473] UNIX 187KB 223KB
<6> [2079.625494] tcp_bind_bucket 1KB 8KB
<6> [2079.625510] inet_peer_cache 1KB 15KB
<6> [2079.625531] ip_fib_trie 3KB 7KB
<6> [2079.625547] ip_fib_alias 3KB 7KB
<6> [2079.625564] ip_dst_cache 7KB 47KB
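For anyone wanting to rank the caches in an "Unreclaimable slab info" dump like the one above, here is a minimal Python sketch. The function name `parse_slab_info` and the regex are my own assumptions about the line layout (`<level> [timestamp] name UsedKB TotalKB`), not something from the kernel or the CI tooling; the sample rows are copied from the log above.

```python
import re

# Assumed layout: "<6> [2079.624463] i915_vma 115187KB 115189KB".
# The header row ("Name Used Total") does not match and is skipped.
SLAB_ROW = re.compile(r"\]\s+(\S+)\s+(\d+)KB\s+(\d+)KB")

def parse_slab_info(text):
    """Return (name, used_kb, total_kb) tuples, largest 'used' first."""
    rows = [(m.group(1), int(m.group(2)), int(m.group(3)))
            for m in SLAB_ROW.finditer(text)]
    return sorted(rows, key=lambda row: row[1], reverse=True)

# A few rows taken verbatim from the dmesg above.
SAMPLE = """\
<6> [2079.624431] Name Used Total
<6> [2079.624463] i915_vma 115187KB 115189KB
<6> [2079.624491] i915_priolist 0KB 8KB
<6> [2079.624549] drm_i915_gem_object 409648KB 409656KB
<6> [2079.624578] i915_lut_handle 38949KB 38950KB
<6> [2079.625439] iommu_iova 50475KB 50478KB
"""

for name, used_kb, total_kb in parse_slab_info(SAMPLE)[:3]:
    print(f"{name:24s} {used_kb:>8d} KB used of {total_kb} KB")
```

Since the regex uses `finditer`, it should also cope with a dump that has been pasted as one long line, as long as the `]` after each timestamp survives.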
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* SKL: igt@gem_exec_reuse@contexts - incomplete - Out of memory
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6977/shard-skl9/igt@gem_exec_reuse@contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6978/shard-skl7/igt@gem_exec_reuse@contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6979/shard-skl10/igt@gem_exec_reuse@contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14594/shard-skl6/igt@gem_exec_reuse@contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14598/shard-skl6/igt@gem_exec_reuse@contexts.html
We've disabled kmemleak in normal builds (it should still be enabled for kasan builds; check with Tomi for precise details, or check the kconfig of the relevant run). A bisect is ongoing to find out how/why memory usage suddenly exploded.
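As a rough illustration of "check the kconfig of the relevant run", the sketch below tests whether a symbol is enabled in a kernel `.config` dump. `CONFIG_DEBUG_KMEMLEAK` and `CONFIG_KASAN` are the real kconfig symbols for kmemleak and kasan; the helper name `kconfig_enabled` and the sample config text are hypothetical and not taken from any CI run.

```python
def kconfig_enabled(config_text: str, symbol: str) -> bool:
    """Return True if `symbol` is set to y or m in a kernel .config dump."""
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_FOO is not set".
        if line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name == symbol:
            return value in ("y", "m")
    return False  # symbol absent or commented out

# Hypothetical excerpt for illustration only.
SAMPLE_CONFIG = """\
CONFIG_SLUB=y
# CONFIG_DEBUG_KMEMLEAK is not set
CONFIG_KASAN=y
"""

print("kmemleak:", kconfig_enabled(SAMPLE_CONFIG, "CONFIG_DEBUG_KMEMLEAK"))
print("kasan:   ", kconfig_enabled(SAMPLE_CONFIG, "CONFIG_KASAN"))
```

Where the kconfig for a given run is published is not covered here; the function only assumes you already have the text of the config.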
ickle@broadwell:~/linux$ git bisect bad
c5665868183fec689dbab9fb8505188b2c4f0757 is the first bad commit
commit c5665868183fec689dbab9fb8505188b2c4f0757
Author: Catalin Marinas <catalin.marinas@arm.com>
Date:   Mon Sep 23 15:34:05 2019 -0700

    mm: kmemleak: use the memory pool for early allocations

    Currently kmemleak uses a static early_log buffer to trace all memory allocation/freeing before the slab allocator is initialised. Such early log is replayed during kmemleak_init() to properly initialise the kmemleak metadata for objects allocated up that point. With a memory pool that does not rely on the slab allocator, it is possible to skip this early log entirely.

    In order to remove the early logging, consider kmemleak_enabled == 1 by default while the kmem_cache availability is checked directly on the object_cache and scan_area_cache variables. The RCU callback is only invoked after object_cache has been initialised as we wouldn't have any concurrent list traversal before this.

    In order to reduce the number of callbacks before kmemleak is fully initialised, move the kmemleak_init() call to mm_init().

    [akpm@linux-foundation.org: coding-style fixes]
    [akpm@linux-foundation.org: remove WARN_ON(), per Catalin]
    Link: http://lkml.kernel.org/r/20190812160642.52134-4-catalin.marinas@arm.com
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: Qian Cai <cai@lca.pw>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Reverts cleanly, but still OOMs. Hmm, maybe I took a wrong turn in bisecting.
Fwiw, disabling kmemleak had the desired effect of preventing the incompletes during shard runs.
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/intel/issues/471.