| Summary: | [EXT][NO SNB, IVB] gem_tiled_swapping@non-threaded failing or hanging testhosts | ||
|---|---|---|---|
| Product: | DRI | Reporter: | Tomi Sarvela <tomi.p.sarvela> |
| Component: | DRM/Intel | Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
| Status: | CLOSED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
| Severity: | normal | ||
| Priority: | medium | CC: | intel-gfx-bugs |
| Version: | DRI git | ||
| Hardware: | Other | ||
| OS: | All | ||
| URL: | https://patchwork.freedesktop.org/patch/160838/ | ||
| Whiteboard: | |||
| i915 platform: | BDW, HSW, ILK, KBL, SKL | i915 features: | GEM/Other |
The issue is that we are not adding the newly mmaped objects to the shrinker list and so instead of swapping out objects, we exhaust all physical memory -> ENOMEM on current tip, system dead due to ineffective oomkiller before. Hmm, analysis is probably wrong. We are calling gem_set_domain() which does i915_gem_object_bump_inactive_ggtt() and that should be placing the new objects on the shrinker list (i915->mm.unbound_list). The shrinker is being run before the failures, e.g.: [ 31.367525] i915_gem_shrinker_count: count = 256 pages (actual 354 pages) [ 31.367832] i915_gem_shrinker_scan: freed = 256 / 128 pages [ 31.367861] i915_gem_shrinker_count: count = 0 pages (actual 98 pages) [ 31.368090] i915_gem_shrinker_count: count = 0 pages (actual 98 pages) [ 31.368124] i915_gem_shrinker_count: count = 0 pages (actual 98 pages) [ 31.368138] gem_tiled_swapp: page allocation failure: order:0, mode:0x1401000(GFP_NOIO|__GFP_NORETRY), nodemask=(null) [ 31.368147] gem_tiled_swapp cpuset=/ mems_allowed=0 [ 31.368153] CPU: 0 PID: 747 Comm: gem_tiled_swapp Not tainted 4.11.0-rc3+ #7 [ 31.368155] Hardware name: /NUC6CAYB, BIOS AYAPLCEL.86A.0029.2016.1124.1625 11/24/2016 [ 31.368156] Call Trace: [ 31.368164] ? dump_stack+0x46/0x5d [ 31.368168] ? warn_alloc+0xe6/0x180 [ 31.368171] ? __alloc_pages_slowpath+0xab6/0xb40 [ 31.368174] ? set_next_entity+0x6b/0x9c0 [ 31.368177] ? __alloc_pages_nodemask+0x197/0x1b0 [ 31.368179] ? shmem_alloc_and_acct_page+0x140/0x230 [ 31.368181] ? find_get_entry+0x14/0xe0 [ 31.368183] ? shmem_getpage_gfp.isra.56+0x15a/0xae0 [ 31.368186] ? shmem_read_mapping_page_gfp+0x31/0x60 [ 31.368216] ? i915_gem_object_get_pages_gtt+0x20f/0x5f0 [i915] [ 31.368239] ? ____i915_gem_object_get_pages+0x38/0x70 [i915] [ 31.368262] ? __i915_gem_object_get_pages+0x4e/0x60 [i915] [ 31.368284] ? i915_gem_set_domain_ioctl+0x1bd/0x210 [i915] [ 31.368295] ? drm_ioctl+0x1d1/0x3b0 [drm] [ 31.368317] ? i915_gem_obj_prepare_shmem_write+0x140/0x140 [i915] [ 31.368320] ? do_mmap+0x356/0x440 [ 31.368323] ? do_vfs_ioctl+0x86/0x5b0 [ 31.368325] ? SyS_ioctl+0x36/0x70 [ 31.368328] ? entry_SYSCALL_64_fastpath+0x17/0x98 [ 31.368330] Mem-Info: [ 31.368336] active_anon:8601 inactive_anon:78273 isolated_anon:45 active_file:2772 inactive_file:634 isolated_file:0 unevictable:1872890 dirty:13 writeback:15104 unstable:0 slab_reclaimable:992 slab_unreclaimable:4998 mapped:3590 shmem:68205 pagetables:4174 bounce:0 free:26901 free_pcp:72 free_cma:0 [ 31.368341] Node 0 active_anon:34404kB inactive_anon:313092kB active_file:11088kB inactive_file:2536kB unevictable:7491560kB isolated(anon):180kB isolated(file):0kB mapped:14360kB dirty:52kB writeback:60416kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 272820kB writeback_tmp:0kB unstable:0kB pages_scanned:8 all_unreclaimable? no [ 31.368346] DMA free:15892kB min:132kB low:164kB high:196kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15984kB managed:15892kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB [ 31.368346] lowmem_reserve[]: 0 1800 7803 7803 [ 31.368354] DMA32 free:39836kB min:15560kB low:19448kB high:23336kB active_anon:19528kB inactive_anon:287760kB active_file:4kB inactive_file:12kB unevictable:1498088kB writepending:58328kB present:1926492kB managed:1853664kB mlocked:1498088kB slab_reclaimable:620kB slab_unreclaimable:3852kB kernel_stack:0kB pagetables:2920kB bounce:0kB free_pcp:288kB local_pcp:288kB free_cma:0kB [ 31.368355] lowmem_reserve[]: 0 0 6003 6003 [ 31.368362] Normal free:51876kB min:51884kB low:64852kB high:77820kB active_anon:14716kB inactive_anon:24904kB active_file:11084kB inactive_file:2364kB unevictable:5993492kB writepending:2100kB present:6291456kB managed:6147088kB mlocked:5993492kB slab_reclaimable:3348kB slab_unreclaimable:16140kB kernel_stack:1648kB pagetables:13776kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB [ 31.368363] lowmem_reserve[]: 0 0 0 0 [ 31.368367] DMA: 3*4kB (U) 3*8kB (U) 3*16kB (U) 2*32kB (U) 2*64kB (U) 0*128kB 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15892kB [ 31.368383] DMA32: 5*4kB (ME) 21*8kB (UE) 19*16kB (UE) 15*32kB (UME) 8*64kB (ME) 5*128kB (UME) 8*256kB (UME) 4*512kB (ME) 1*1024kB (M) 0*2048kB 8*4096kB (M) = 40012kB [ 31.368399] Normal: 286*4kB (UM) 111*8kB (UME) 58*16kB (M) 54*32kB (UM) 28*64kB (UME) 2*128kB (E) 3*256kB (ME) 3*512kB (UM) 2*1024kB (ME) 4*2048kB (UME) 8*4096kB (M) = 52048kB [ 31.368416] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [ 31.368417] 86889 total pagecache pages [ 31.368421] 15150 pages in swap cache [ 31.368422] Swap cache stats: add 36066, delete 20916, find 22/37 [ 31.368423] Free swap = 8089596kB [ 31.368424] Total swap = 8233980kB [ 31.368425] 2058483 pages RAM [ 31.368426] 0 pages HighMem/MovableOnly [ 31.368426] 54322 pages reserved and that also says we don't have stray objects left lying around. So why is everything "unevictable"? Ugh, the test uses igt_lock_mem(). That explains why everything appears to be unevictable. Ok, looks like this is the classic oom with swap + writeable pages, and not the untracked object bug as I first feared. (In reply to Chris Wilson from comment #6) > https://patchwork.freedesktop.org/patch/160838/ Changing the status to NEEDINFO. Thank you. tested again and the test is now passing.. |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.
IGT test gem_tiled_swapping@non-threaded has hanged many testhosts for a week. On a newest DRM-Tip From a fresh boot, dmesg below happens. 8229a8c712c22ff8e94e3244d4fd942a7dcd89af 2017-03-23_09-57-57 drm-tip: 2017y-03m-23d-09h-57m-34s UTC integration manifest 12:27 < ickle> it's genuine 12:27 < ickle> better give me a bug so I don't lalala foever 12:29 < ickle> on current tip, at least it should no longer hang :| [ 50.304609] Console: switching to colour dummy device 80x25 [ 50.305082] [IGT] gem_tiled_swapping: executing [ 50.318345] gem_tiled_swapp (6810): drop_caches: 4 [ 51.829865] [IGT] gem_tiled_swapping: starting subtest non-threaded [ 51.872167] gem_tiled_swapp: page allocation failure: order:0, mode:0x1000(__GFP_NORETRY), nodemask=(null) [ 51.872183] CPU: 2 PID: 6810 Comm: gem_tiled_swapp Not tainted 4.11.0-rc3-CI-CI_DRM_2388+ #1 [ 51.872184] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F21 01/06/2017 [ 51.872185] Call Trace: [ 51.872188] dump_stack+0x67/0x92 [ 51.872191] warn_alloc+0xd9/0x170 [ 51.872194] __alloc_pages_nodemask+0xe97/0x1260 [ 51.872197] ? __lock_acquire+0x45b/0x1bb0 [ 51.872202] ? __this_cpu_preempt_check+0x13/0x20 [ 51.872203] ? __percpu_counter_add+0x85/0xb0 [ 51.872206] shmem_alloc_and_acct_page+0x1b8/0x380 [ 51.872208] ? page_cache_prev_hole+0x50/0x50 [ 51.872211] shmem_getpage_gfp.isra.9+0x171/0xc40 [ 51.872213] ? synchronize_rcu_expedited+0x35/0xb0 [ 51.872217] shmem_read_mapping_page_gfp+0x37/0x50 [ 51.872233] i915_gem_object_get_pages_gtt+0x3b0/0x600 [i915] [ 51.872247] ____i915_gem_object_get_pages+0x25/0x60 [i915] [ 51.872259] __i915_gem_object_get_pages+0x59/0x70 [i915] [ 51.872271] i915_gem_set_domain_ioctl+0x2f7/0x330 [i915] [ 51.872282] ? i915_gem_set_domain_ioctl+0x1e2/0x330 [i915] [ 51.872285] drm_ioctl+0x200/0x450 [ 51.872296] ? i915_gem_obj_prepare_shmem_write+0x200/0x200 [i915] [ 51.872300] ? up_write+0x1a/0x40 [ 51.872303] do_vfs_ioctl+0x90/0x6e0 [ 51.872305] ? entry_SYSCALL_64_fastpath+0x5/0xb1 [ 51.872307] ? __this_cpu_preempt_check+0x13/0x20 [ 51.872309] ? trace_hardirqs_on_caller+0xe7/0x200 [ 51.872311] SyS_ioctl+0x3c/0x70 [ 51.872313] entry_SYSCALL_64_fastpath+0x1c/0xb1 [ 51.872315] RIP: 0033:0x7f57f2a48357 [ 51.872316] RSP: 002b:00007ffc98788468 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 51.872319] RAX: ffffffffffffffda RBX: ffffffff8147ed53 RCX: 00007f57f2a48357 [ 51.872320] RDX: 00007ffc987884a0 RSI: 00000000400c645f RDI: 0000000000000003 [ 51.872321] RBP: ffffc90000dc3f88 R08: 0000000000000003 R09: 0000000104600000 [ 51.872322] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000003 [ 51.872323] R13: 0000000000000003 R14: 00000000400c645f R15: 00007f57f3f1e000 [ 51.872325] ? __this_cpu_preempt_check+0x13/0x20 [ 51.872339] Mem-Info: [ 51.872342] active_anon:1190 inactive_anon:68620 isolated_anon:0 active_file:1176 inactive_file:944 isolated_file:0 unevictable:3927540 dirty:6 writeback:833 unstable:0 slab_reclaimable:6942 slab_unreclaimable:14952 mapped:2138 shmem:18326 pagetables:8873 bounce:0 free:33830 free_pcp:1105 free_cma:0 [ 51.872345] Node 0 active_anon:4760kB inactive_anon:274480kB active_file:4704kB inactive_file:3776kB unevictable:15710160kB isolated(anon):0kB isolated(file):0kB mapped:8552kB dirty:24kB writeback:3332kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 15552512kB anon_thp: 73304kB writeback_tmp:0kB unstable:0kB pages_scanned:384 all_unreclaimable? no [ 51.872347] DMA free:15876kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15884kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:8kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB [ 51.872348] lowmem_reserve[]: 0 2848 15860 15860 [ 51.872355] DMA32 free:64072kB min:12124kB low:15152kB high:18180kB active_anon:80kB inactive_anon:10240kB active_file:0kB inactive_file:0kB unevictable:2881644kB writepending:0kB present:3058324kB managed:2956336kB mlocked:2891884kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB [ 51.872356] lowmem_reserve[]: 0 0 13012 13012 [ 51.872362] Normal free:55372kB min:55392kB low:69240kB high:83088kB active_anon:4776kB inactive_anon:264240kB active_file:4704kB inactive_file:3776kB unevictable:12828332kB writepending:3356kB present:13615104kB managed:13324684kB mlocked:12879532kB slab_reclaimable:27768kB slab_unreclaimable:59800kB kernel_stack:3712kB pagetables:35492kB bounce:0kB free_pcp:4420kB local_pcp:692kB free_cma:0kB [ 51.872363] lowmem_reserve[]: 0 0 0 0 [ 51.872368] DMA: 3*4kB (U) 1*8kB (U) 3*16kB (U) 0*32kB 3*64kB (U) 2*128kB (U) 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15876kB [ 51.872388] DMA32: 3*4kB (UM) 3*8kB (UM) 3*16kB (UM) 2*32kB (U) 0*64kB 7*128kB (UM) 7*256kB (UM) 10*512kB (UM) 11*1024kB (UM) 2*2048kB (UM) 10*4096kB (M) = 64276kB [ 51.872409] Normal: 77*4kB (UME) 499*8kB (UM) 377*16kB (UM) 179*32kB (UME) 78*64kB (ME) 66*128kB (UME) 68*256kB (UME) 17*512kB (M) 0*1024kB 0*2048kB 0*4096kB = 55612kB [ 51.872430] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [ 51.872431] 21844 total pagecache pages [ 51.872434] 1209 pages in swap cache [ 51.872435] Swap cache stats: add 28852, delete 27657, find 294/534 [ 51.872436] Free swap = 8157180kB [ 51.872437] Total swap = 8272892kB [ 51.872438] 4172354 pages RAM [ 51.872439] 0 pages HighMem/MovableOnly [ 51.872440] 98128 pages reserved [ 51.902964] [IGT] gem_tiled_swapping: exiting, ret=99 [ 51.927222] Console: switching to colour frame buffer device 240x75