Summary: | [CI][SHARDS] igt@drv_suspend@shrink - incomplete - hard hang? | | |
---|---|---|---|
Product: | DRI | Reporter: | Martin Peres <martin.peres> |
Component: | DRM/Intel | Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Status: | RESOLVED WORKSFORME | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Severity: | normal | | |
Priority: | medium | CC: | intel-gfx-bugs |
Version: | XOrg git | | |
Hardware: | Other | | |
OS: | All | | |
Whiteboard: | ReadyForDev | | |
i915 platform: | BSW/CHT, CFL, G33, HSW, I945GM, KBL, SNB | i915 features: | power/suspend-resume |
Description
Martin Peres
2018-06-11 13:33:45 UTC
It looks like we get into a livelock loop where we make no progress. Some report continuing to hit our shrinking, but a few others just go quiet.

Reproduced on CFL, this time with pstores!
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_61/fi-cfl-u2/igt@drv_suspend@shrink.html

Hmm, on SNB, the test just gets killed by the OOM killer:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4310/shard-snb5/igt@drv_suspend@shrink.html

Stdout:
child 0 died with signal 9, Killed

Dmesg:
<6>[ 66.295205] drv_suspend (1451) used greatest stack depth: 11288 bytes left
<4>[ 69.647709] python3 invoked oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
<4>[ 69.647741] CPU: 1 PID: 1362 Comm: python3 Tainted: G U 4.17.0-rc7-CI-CI_DRM_4310+ #1
<4>[ 69.647743] Hardware name: Dell Inc. XPS 8300 /0Y2MRG, BIOS A06 10/17/2011
<4>[ 69.647744] Call Trace:
<4>[ 69.647751] dump_stack+0x67/0x9b
<4>[ 69.647755] dump_header+0x60/0x42e
<4>[ 69.647759] ? trace_hardirqs_on_caller+0xe0/0x1b0
<4>[ 69.647762] ? _raw_spin_unlock_irqrestore+0x39/0x60
<4>[ 69.647767] oom_kill_process+0x2be/0x6d0
<4>[ 69.647773] out_of_memory+0x103/0x390
<4>[ 69.647777] __alloc_pages_nodemask+0xe3f/0x1250
<4>[ 69.647791] filemap_fault+0x276/0x620
<4>[ 69.647798] ext4_filemap_fault+0x27/0x40
<4>[ 69.647802] __do_fault+0x1b/0x80
<4>[ 69.647805] __handle_mm_fault+0x888/0xe30
<4>[ 69.647815] handle_mm_fault+0x196/0x3a0
<4>[ 69.647820] __do_page_fault+0x295/0x590
<4>[ 69.647826] ? page_fault+0x8/0x30
<4>[ 69.647829] page_fault+0x1e/0x30
<4>[ 69.647831] RIP: 0033:0x54e734
<4>[ 69.647833] RSP: 002b:00007fc52ac6f260 EFLAGS: 00010246
<4>[ 69.647836] RAX: 0000000000000000 RBX: 0000000002037950 RCX: 00007fc5397f510d
<4>[ 69.647838] RDX: 0000000000000000 RSI: 0000000000000080 RDI: 0000000000a8b1c0
<4>[ 69.647839] RBP: 0000000002037950 R08: 0000000000a8b180 R09: 0000000000000000
<4>[ 69.647841] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fc52b473126
<4>[ 69.647843] R13: 00007fc52400a8a0 R14: 00007fc52400a790 R15: 0000000000000057
<4>[ 69.647852] Mem-Info:
<4>[ 69.647856] active_anon:0 inactive_anon:61 isolated_anon:0 active_file:113 inactive_file:0 isolated_file:0 unevictable:1926764 dirty:0 writeback:0 unstable:0 slab_reclaimable:13274 slab_unreclaimable:12073 mapped:1926824 shmem:1926766 pagetables:9135 bounce:0 free:25273 free_pcp:1 free_cma:0
<4>[ 69.647860] Node 0 active_anon:0kB inactive_anon:244kB active_file:452kB inactive_file:0kB unevictable:7707056kB isolated(anon):0kB isolated(file):0kB mapped:7707296kB dirty:0kB writeback:0kB shmem:7707064kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
<4>[ 69.647864] DMA free:15360kB min:128kB low:160kB high:192kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15984kB managed:15360kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
<4>[ 69.647865] lowmem_reserve[]: 0 3050 7771 7771
<4>[ 69.647876] DMA32 free:44916kB min:26476kB low:33092kB high:39708kB active_anon:76kB inactive_anon:156kB active_file:140kB inactive_file:616kB unevictable:3056012kB writepending:132kB present:3238728kB managed:3127532kB mlocked:3056012kB kernel_stack:0kB pagetables:11976kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
<4>[ 69.647878] lowmem_reserve[]: 0 0 4721 4721
<4>[ 69.647888] Normal free:40816kB min:40976kB low:51220kB high:61464kB active_anon:44kB inactive_anon:148kB active_file:0kB inactive_file:300kB unevictable:4650828kB writepending:44kB present:4978688kB managed:4834472kB mlocked:4650828kB kernel_stack:3872kB pagetables:24564kB bounce:0kB free_pcp:4kB local_pcp:0kB free_cma:0kB
<4>[ 69.647890] lowmem_reserve[]: 0 0 0 0
<4>[ 69.647898] DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15360kB
<4>[ 69.647933] DMA32: 77*4kB (UM) 39*8kB (ME) 27*16kB (UME) 80*32kB (UME) 14*64kB (ME) 1*128kB (M) 3*256kB (UM) 1*512kB (U) 3*1024kB (UME) 0*2048kB 9*4096kB (UM) = 45852kB
<4>[ 69.647967] Normal: 445*4kB (UME) 221*8kB (UME) 192*16kB (UME) 50*32kB (UME) 61*64kB (ME) 38*128kB (ME) 21*256kB (ME) 9*512kB (UME) 14*1024kB (UME) 0*2048kB 0*4096kB = 41308kB
<6>[ 69.648050] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
<4>[ 69.648053] 1927023 total pagecache pages
<4>[ 69.648059] 1 pages in swap cache
<4>[ 69.648062] Swap cache stats: add 80511, delete 80577, find 6280/10999
<4>[ 69.648065] Free swap = 1809148kB
<4>[ 69.648068] Total swap = 2097148kB
<4>[ 69.648071] 2058350 pages RAM
<4>[ 69.648073] 0 pages HighMem/MovableOnly
<4>[ 69.648076] 64009 pages reserved
<6>[ 69.648079] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
<6>[ 69.648133] [ 239] 0 239 35727 0 303104 174 0 systemd-journal
<6>[ 69.648203] [ 276] 0 276 11024 1 114688 460 -1000 systemd-udevd
<6>[ 69.648209] [ 471] 102 471 17652 0 172032 165 0 systemd-resolve
<6>[ 69.648214] [ 506] 118 506 11814 0 135168 107 0 avahi-daemon
<6>[ 69.648219] [ 519] 118 519 11769 0 126976 85 0 avahi-daemon
<6>[ 69.648225] [ 523] 0 523 45273 0 204800 208 0 thermald
<6>[ 69.648230] [ 524] 0 524 106814 0 335872 356 0 ModemManager
<6>[ 69.648236] [ 526] 104 526 65759 0 163840 406 0 rsyslogd
<6>[ 69.648241] [ 529] 0 529 1138 0 53248 41 0 acpid
<6>[ 69.648247] [ 537] 106 537 12591 1 147456 250 -900 dbus-daemon
<6>[ 69.648252] [ 548] 0 548 11188 0 131072 138 0 wpa_supplicant
<6>[ 69.648258] [ 553] 0 553 17646 1 176128 178 0 systemd-logind
<6>[ 69.648263] [ 554] 0 554 8135 1 114688 74 0 cron
<6>[ 69.648269] [ 555] 0 555 27617 0 114688 95 0 irqbalance
<6>[ 69.648275] [ 557] 0 557 42936 2 217088 1999 0 networkd-dispat
<6>[ 69.648280] [ 592] 0 592 72217 0 204800 729 0 polkitd
<6>[ 69.648286] [ 610] 0 610 120439 1 421888 678 0 NetworkManager
<6>[ 69.648291] [ 736] 0 736 18074 0 184320 188 -1000 sshd
<6>[ 69.648297] [ 737] 0 737 4350 0 77824 37 0 agetty
<6>[ 69.648303] [ 741] 0 741 6414 1 86016 304 0 dhclient
<6>[ 69.648308] [ 904] 0 904 26431 1 249856 247 0 sshd
<6>[ 69.648314] [ 909] 1000 909 19150 1 188416 271 0 systemd
<6>[ 69.648319] [ 912] 1000 912 28487 0 253952 609 0 (sd-pam)
<6>[ 69.648325] [ 951] 1000 951 27046 0 249856 290 0 sshd
<6>[ 69.648330] [ 985] 1000 985 1515142 0 827392 30069 0 java
<6>[ 69.648336] [ 1022] 1000 1022 3526 1 69632 72 0 bash
<6>[ 69.648341] [ 1091] 0 1091 16707 1 184320 119 0 sudo
<6>[ 69.648347] [ 1096] 0 1096 2479 1 65536 33 0 rngd
<6>[ 69.648352] [ 1119] 1000 1119 3807 0 73728 43 0 dmesg
<6>[ 69.648358] [ 1121] 0 1121 16707 1 163840 120 0 sudo
<6>[ 69.648363] [ 1125] 0 1125 1129 0 57344 22 0 owatch
<6>[ 69.648369] [ 1126] 0 1126 338543 0 606208 32399 0 python3
<6>[ 69.648375] [ 1450] 0 1450 2042986 1926080 15818752 425 1000 drv_suspend
<6>[ 69.648380] [ 1452] 0 1452 2042986 1926766 15814656 419 1000 drv_suspend
<3>[ 69.648384] Out of memory: Kill process 1452 (drv_suspend) score 1766 or sacrifice child
<3>[ 69.648845] Killed process 1452 (drv_suspend) total-vm:8171944kB, anon-rss:0kB, file-rss:0kB, shmem-rss:7707064kB
<6>[ 69.651916] oom_reaper: reaped process 1452 (drv_suspend), now anon-rss:0kB, file-rss:0kB, shmem-rss:7707064kB
<4>[ 69.656037] in:imklog invoked oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
<4>[ 69.656079] CPU: 3 PID: 543 Comm: in:imklog Tainted: G U 4.17.0-rc7-CI-CI_DRM_4310+ #1
<4>[ 69.656082] Hardware name: Dell Inc. XPS 8300 /0Y2MRG, BIOS A06 10/17/2011
<4>[ 69.656085] Call Trace:
<4>[ 69.656091] dump_stack+0x67/0x9b
<4>[ 69.656097] dump_header+0x60/0x42e
<4>[ 69.656102] ? trace_hardirqs_on_caller+0xe0/0x1b0
<4>[ 69.656107] ? _raw_spin_unlock_irqrestore+0x39/0x60
<4>[ 69.656115] oom_kill_process+0x2be/0x6d0
<4>[ 69.656125] out_of_memory+0x103/0x390
<4>[ 69.656132] __alloc_pages_nodemask+0xe3f/0x1250
<4>[ 69.656157] __read_swap_cache_async+0x148/0x260
<4>[ 69.656166] swapin_readahead+0x312/0x410
<4>[ 69.656177] ? pagecache_get_page+0x2b/0x210
<4>[ 69.656186] ? do_swap_page+0x2e2/0x910
<4>[ 69.656190] do_swap_page+0x2e2/0x910
<4>[ 69.656202] __handle_mm_fault+0x65e/0xe30
<4>[ 69.656218] handle_mm_fault+0x196/0x3a0
<4>[ 69.656226] __do_page_fault+0x295/0x590
<4>[ 69.656238] page_fault+0x1e/0x30

Should I file another bug?
The OOM killer also struck on HSW and APL:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4312/shard-apl3/igt@drv_suspend@shrink.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4312/shard-hsw6/igt@drv_suspend@shrink.html

Also seen on HSW:
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_65/fi-hsw-4770r/igt@drv_suspend@shrink.html

Also seen on GDG and ILK:
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_73/fi-gdg-551/igt@drv_suspend@shrink.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_74/fi-ilk-650/igt@drv_suspend@shrink.html

Also seen on PNV:
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_77/fi-pnv-d510/igt@drv_suspend@shrink.html

Also seen on SNB:
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4535/shard-snb4/igt@drv_suspend@shrink.html
This time, there is nothing interesting in the logs, except a crash.

The test is gone. Closing!

The CI Bug Log issue associated with this bug has been archived. New failures matching the above filters will not be associated with this bug anymore.