Bug 105272 - [IGT] gem_shrink subtest get-pages-userptr failed with "child 4 died with signal 9, Killed" and dmesg-fail: Out of memory: Kill process 700 (gem_shrink)
Status: CLOSED WORKSFORME
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-02-27 18:14 UTC by Hector Velazquez
Modified: 2018-07-02 10:59 UTC

See Also:
i915 platform: GLK
i915 features: GEM/Other


Attachments
output (1.88 KB, text/plain)
2018-02-27 18:14 UTC, Hector Velazquez
kernel log (445.04 KB, text/plain)
2018-02-27 18:14 UTC, Hector Velazquez
dmesg -w -H (134.96 KB, text/plain)
2018-02-27 18:14 UTC, Hector Velazquez
dmesg_shrink-test (127.20 KB, text/plain)
2018-04-10 16:45 UTC, Elizabeth

Description Hector Velazquez 2018-02-27 18:14:06 UTC
Created attachment 137653
output

This test failed on GLK in QA.

Test List

igt@gem_shrink@get-pages-userptr

======================================
        output
======================================
. . .
IGT-Version: 1.21-ga2664f8 (x86_64) (Linux: 4.16.0-rc2-drm-tip-ww9-commit-3a86cab+ x86_64)
(gem_shrink:666) igt-core-DEBUG: Test requirement passed: !igt_run_in_simulation()
(gem_shrink:666) intel-chipset-DEBUG: Test requirement passed: pci_dev
Using 62 processes and 128MiB per process
(gem_shrink:666) intel-os-DEBUG: Checking 62 surfaces of size 134217728 bytes (total 8321531904) against RAM + swap
(gem_shrink:666) drmtest-DEBUG: Test requirement passed: !(fd<0)
(gem_shrink:666) igt-debugfs-DEBUG: Opening debugfs directory '/sys/kernel/debug/dri/0'
(gem_shrink:666) igt-debugfs-DEBUG: Opening debugfs directory '/sys/kernel/debug/dri/0'
(gem_shrink:666) intel-os-DEBUG: Test requirement passed: __intel_check_memory(count, size, mode, &required, &total)
(gem_shrink:666) igt-core-DEBUG: Test requirement passed: !igt_run_in_simulation()
(gem_shrink:666) drmtest-DEBUG: Test requirement passed: !(fd<0)
(gem_shrink:666) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd)
(gem_shrink:666) igt-debugfs-DEBUG: Opening debugfs directory '/sys/kernel/debug/dri/0'
(gem_shrink:666) ioctl-wrappers-DEBUG: Test requirement passed: dir >= 0
(gem_shrink:666) ioctl-wrappers-DEBUG: Test requirement passed: err == 0
(gem_shrink:666) DEBUG: Test requirement passed: nengine
(gem_shrink:666) igt-core-DEBUG: Starting subtest: get-pages-userptr
(gem_shrink:666) drmtest-DEBUG: Test requirement passed: !(fd<0)
(gem_shrink:666) DEBUG: Test requirement passed: has_userptr()
Subtest get-pages-userptr failed.
**** DEBUG ****
(gem_shrink:666) drmtest-DEBUG: Test requirement passed: !(fd<0)
(gem_shrink:666) DEBUG: Test requirement passed: has_userptr()
****  END  ****
child 4 died with signal 9, Killed
Subtest get-pages-userptr: FAIL (254.924s)
(gem_shrink:666) igt-core-DEBUG: Exiting with status code 137
(gem_shrink:666) igt-debugfs-DEBUG: Opening debugfs directory '/sys/kernel/debug/dri/0'
. . .
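
Note on the exit status: the "Exiting with status code 137" line is consistent with the common convention of reporting 128 plus the signal number when a process is killed by a signal, which matches "child 4 died with signal 9". A minimal Python sketch of that decoding follows; the helper name describe_exit_status is hypothetical and is not part of IGT.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret a shell-style exit status (128 + N means killed by signal N)."""
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited with code {status}"


# Status taken from the IGT output above.
print(describe_exit_status(137))
```

Signal 9 is SIGKILL, which is what the kernel OOM killer sends, so the test output and the dmesg sample below describe the same event.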

======================================
        dmesg-fail sample
======================================
. . .
[  +0.000001] Out of memory: Kill process 756 (gem_shrink) score 1008 or sacrifice child
[  +0.000012] Killed process 756 (gem_shrink) total-vm:4385476kB, anon-rss:131184kB, file-rss:0kB, shmem-rss:0kB
[  +0.025216] oom_reaper: reaped process 756 (gem_shrink), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[ +16.570287] Purging GPU memory, 0 pages freed, 1083671 pages still pinned.
[  +0.000003] 1 and 360448 pages still available in the bound and unbound GPU page lists.
[  +0.000164] kworker/u9:4 invoked oom-killer: gfp_mask=0x14280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null), order=0, oom_score_adj=0
[  +0.000001] kworker/u9:4 cpuset=/ mems_allowed=0
[  +0.000007] CPU: 0 PID: 726 Comm: kworker/u9:4 Tainted: G        W        4.16.0-rc2-drm-tip-ww9-commit-3a86cab+ #1
[  +0.000001] Hardware name: Intel Corp. Geminilake/GLK RVP1 DDR4 (05), BIOS GELKRVPA.X64.0077.B50.1712072148 12/07/2017
[  +0.000055] Workqueue: i915-userptr-acquire __i915_gem_userptr_get_pages_worker [i915]
[  +0.000002] Call Trace:
[  +0.000010]  dump_stack+0x5c/0x85
[  +0.000005]  dump_header+0x6b/0x27c
[  +0.000004]  ? apparmor_capable+0xcf/0xe0
[  +0.000002]  oom_kill_process+0x239/0x460
[  +0.000002]  ? oom_badness+0xeb/0x160
[  +0.000002]  out_of_memory+0x10f/0x480
[  +0.000003]  __alloc_pages_slowpath+0xcdd/0xdb0
[  +0.000003]  __alloc_pages_nodemask+0x246/0x280
[  +0.000004]  alloc_pages_vma+0x7c/0x1e0
[  +0.000003]  __handle_mm_fault+0xcfc/0x1130
[  +0.000002]  handle_mm_fault+0xdf/0x1e0
[  +0.000003]  __get_user_pages+0x11a/0x650
[  +0.000002]  get_user_pages_remote+0x137/0x1f0
[  +0.000035]  __i915_gem_userptr_get_pages_worker+0x185/0x230 [i915]
[  +0.000005]  process_one_work+0x147/0x3c0
[  +0.000003]  worker_thread+0x4a/0x440
[  +0.000002]  kthread+0xf8/0x130
[  +0.000002]  ? rescuer_thread+0x360/0x360
[  +0.000002]  ? kthread_associate_blkcg+0x90/0x90
[  +0.000002]  ret_from_fork+0x35/0x40
[  +0.000002] Mem-Info:
[  +0.000006] active_anon:945954 inactive_anon:922236 isolated_anon:704
               active_file:32 inactive_file:32 isolated_file:0
               unevictable:0 dirty:0 writeback:0 unstable:0
               slab_reclaimable:5982 slab_unreclaimable:7278
               mapped:439 shmem:923477 pagetables:64421 bounce:0
               free:26625 free_pcp:42 free_cma:0
[  +0.000003] Node 0 active_anon:3783816kB inactive_anon:3688944kB active_file:128kB inactive_file:128kB unevictable:0kB isolated(anon):2816kB isolated(file):0kB mapped:1756kB dirty:0kB writeback:0kB shmem:3693908kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 2048kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
[  +0.000001] Node 0 DMA free:15888kB min:132kB low:164kB high:196kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15980kB managed:15888kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[  +0.000005] lowmem_reserve[]: 0 1763 7748 7748 7748
[  +0.000003] Node 0 DMA32 free:38944kB min:15352kB low:19188kB high:23024kB active_anon:887416kB inactive_anon:845460kB active_file:0kB inactive_file:8kB unevictable:0kB writepending:0kB present:1913052kB managed:1847484kB mlocked:0kB kernel_stack:292kB pagetables:57880kB bounce:0kB free_pcp:76kB local_pcp:0kB free_cma:0kB
[  +0.000004] lowmem_reserve[]: 0 0 5984 5984 5984
[  +0.000003] Node 0 Normal free:51668kB min:52092kB low:65112kB high:78132kB active_anon:2896400kB inactive_anon:2843224kB active_file:128kB inactive_file:120kB unevictable:0kB writepending:0kB present:6291456kB managed:6132192kB mlocked:0kB kernel_stack:4252kB pagetables:199804kB bounce:0kB free_pcp:92kB local_pcp:0kB free_cma:0kB
[  +0.000004] lowmem_reserve[]: 0 0 0 0 0
[  +0.000003] Node 0 DMA: 0*4kB 2*8kB (U) 2*16kB (U) 3*32kB (U) 2*64kB (U) 2*128kB (U) 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15888kB
[  +0.000012] Node 0 DMA32: 7184*4kB (UM) 1186*8kB (UM) 1*16kB (U) 0*32kB 1*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 0*1024kB 0*2048kB 0*4096kB = 38944kB
[  +0.000011] Node 0 Normal: 12385*4kB (UME) 258*8kB (UME) 4*16kB (UM) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 51668kB
[  +0.000011] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[  +0.000001] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[  +0.000001] 923685 total pagecache pages
[  +0.000003] 144 pages in swap cache
[  +0.000001] Swap cache stats: add 10394, delete 10274, find 596/1369
[  +0.000001] Free swap  = 7965180kB
[  +0.000000] Total swap = 8000508kB
[  +0.000001] 2055122 pages RAM
[  +0.000001] 0 pages HighMem/MovableOnly
[  +0.000000] 56231 pages reserved
[  +0.000001] 0 pages cma reserved
[  +0.000000] 0 pages hwpoisoned
[  +0.000001] [ pid ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[  +0.000004] [  252]     0   252    16310      503   163840       95             0 systemd-journal
. . .
[  +0.000002] Out of memory: Kill process 700 (gem_shrink) score 1008 or sacrifice child
[  +0.000008] Killed process 700 (gem_shrink) total-vm:4254404kB, anon-rss:131192kB, file-rss:0kB, shmem-rss:0kB
[ +13.886034] systemd[1]: systemd-resolved.service: Main process exited, code=dumped, status=6/ABRT
[  +0.006083] systemd[1]: systemd-resolved.service: Unit entered failed state.
[  +0.000583] systemd[1]: systemd-resolved.service: Failed with result 'watchdog'.
[  +0.015226] systemd[1]: systemd-udevd.service: Main process exited, code=dumped, status=6/ABRT
[  +0.001475] systemd[1]: systemd-udevd.service: Unit entered failed state.
[  +0.000105] systemd[1]: systemd-udevd.service: Failed with result 'watchdog'.
[  +0.003130] systemd[1]: systemd-udevd.service: Service has no hold-off time, scheduling restart.
[  +0.000153] systemd[1]: systemd-resolved.service: Service has no hold-off time, scheduling restart.
[  +0.000448] systemd[1]: Stopped Network Name Resolution.
[  +0.044719] systemd[1]: Starting Network Name Resolution...
[  +0.538217] systemd-journald[1252]: File /run/log/journal/f67bf632faf24157a4676ab8a79ac726/system.journal corrupted or uncleanly shut down, renaming and replacing.
[  +2.457190] [IGT] gem_shrink: exiting, ret=137
. . .
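
Note on reading the sample above: two figures stand out. The Normal zone reports free:51668kB against min:52092kB, so it is below its minimum watermark, and the "Purging GPU memory" line reports 1083671 pages still pinned. The Python sketch below pulls those numbers out of the log text. It assumes 4 KiB pages (the x86-64 default, and consistent with the page and kB counts in the Mem-Info block); the helper names are hypothetical.

```python
import re

PAGE_KB = 4  # assumption: 4 KiB pages, the x86-64 default


def pages_to_gib(pages: int) -> float:
    """Convert a page count to GiB under the PAGE_KB assumption."""
    return pages * PAGE_KB / (1024 * 1024)


def zone_watermarks(line: str) -> dict:
    """Extract the zone name and free/min/low/high (kB) from a 'Node N <zone> free:...' line."""
    m = re.search(
        r"Node \d+ (\w+) free:(\d+)kB min:(\d+)kB low:(\d+)kB high:(\d+)kB", line
    )
    if m is None:
        raise ValueError("not a zone line")
    zone, free, minimum, low, high = m.groups()
    return {
        "zone": zone,
        "free": int(free),
        "min": int(minimum),
        "low": int(low),
        "high": int(high),
    }


# Start of the Normal zone line from the dmesg sample (truncated for brevity).
normal = zone_watermarks(
    "Node 0 Normal free:51668kB min:52092kB low:65112kB high:78132kB active_anon:2896400kB"
)
print(normal["zone"], "below min watermark:", normal["free"] < normal["min"])

# Pinned page count from the "Purging GPU memory" line.
print("pinned GiB:", round(pages_to_gib(1083671), 2))
```

Under the 4 KiB assumption the pinned pages amount to roughly 4.13 GiB on a machine reporting 7.63 GB of RAM, while swap is almost entirely free (7965180kB of 8000508kB).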
======================================
        Graphic stack
======================================
Component: drm
    tag: libdrm-2.4.89-79-g009634e
    commit: 009634e493097afae95d190fc26cb04a1664648a

Component: intel-gpu-tools
    tag: intel-gpu-tools-1.21-155-ga2664f8
    commit: a2664f86bb75e642c432531e2bf30e030fb3e828

Component: piglit
    tag: piglit-v1
    commit: 47b42158f3d8037ae968f5c3ceef930e4973e8af

Note: Using cairo component included in Ubuntu 17.10 (artful) distribution...
======================================
             Software
======================================
kernel version              : 4.16.0-rc2-drm-tip-ww9-commit-3a86cab+
architecture                : x86_64
os version                  : Ubuntu 17.10
os codename                 : artful
kernel driver               : i915
hardware acceleration       : disabled
swap partition              : enabled on (/dev/sda2)
======================================
        Graphic drivers
======================================
modesetting                 : enabled
modesetting compiled for    : 1.19.5 X.Org Video Driver
libdrm                      : 2.4.90
intel-gpu-tools (tag)       : intel-gpu-tools-1.21-155-ga2664f86
intel-gpu-tools (commit)    : a2664f86
======================================
             Hardware
======================================
platform                   : Geminilake
motherboard id             : GLKRVP1DDR4(05)
form factor                : Hand Held
cpu information            : Intel(R) Pentium(R) Silver N5000 CPU @ 1.10GHz
gpu card                   : Intel Corporation Device 3184 (rev 03) (prog-if 00 [VGA controller])
memory ram                 : 7.63 GB
current cd clock frequency : 316800 kHz
maximum cd clock frequency : 316800 kHz
displays connected         : eDP-1 DP-1 HDMI-A-2
======================================
             Firmware
======================================
dmc fw loaded             : yes
dmc version               : 1.4
guc fw loaded             : fetch NONE, load NONE
======================================
             kernel parameters
======================================
drm.debug=0x1e intel_iommu=igfx_off fsck.repair=yes i915.error_capture=yes log_buf_len=4M resume=/dev/sda2
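
Note on the parameters: drm.debug is a bitmask. The Python sketch below splits the command line above into key/value pairs and decodes the mask. The bit names are an assumption based on the DRM_UT_* debug categories in kernels of this era and should be checked against the kernel in use; the helper names are hypothetical.

```python
# Assumed mapping, following the DRM_UT_* debug categories of ~4.16 kernels.
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
}


def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into a {name: value} dict (bare flags map to '')."""
    params = {}
    for token in cmdline.split():
        key, _, value = token.partition("=")
        params[key] = value
    return params


def decode_drm_debug(mask: int) -> list:
    """Return the names of the debug categories enabled in the mask."""
    return [name for bit, name in DRM_DEBUG_BITS.items() if mask & bit]


params = parse_cmdline(
    "drm.debug=0x1e intel_iommu=igfx_off fsck.repair=yes "
    "i915.error_capture=yes log_buf_len=4M resume=/dev/sda2"
)
print(decode_drm_debug(int(params["drm.debug"], 16)))
```

With that mapping, 0x1e enables the DRIVER, KMS, PRIME and ATOMIC categories but not CORE.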
Comment 1 Hector Velazquez 2018-02-27 18:14:41 UTC
Created attachment 137654
kernel log
Comment 2 Hector Velazquez 2018-02-27 18:14:54 UTC
Created attachment 137655
dmesg -w -H
Comment 3 Jani Saarinen 2018-03-29 07:11:20 UTC
First of all, sorry about the spam.
This is a mass update for our bugs.

Sorry if you find this annoying, but we are trying to understand whether this bug is still valid.
If the bug investigation is still in progress, please ignore this, and I apologize!

If you think this bug is no longer valid, please comment on the bug so that it can be closed.
If you haven't tested with our latest pre-upstream tree (drm-tip), can you do that as well to see whether the issue is still valid there? If you cannot see the issue there, please comment on the bug.
Comment 4 Elizabeth 2018-04-10 16:45:40 UTC
Created attachment 138741
dmesg_shrink-test

I can't reproduce the failure anymore on GLK, though the test, even when it succeeds, now produces a warning in dmesg, which is reported in bug 105873.

[gfx@GLK-1-GLKRVP1DDR405] [~]$ : time sudo -E ./intel-graphics/intel-gpu-tools/tests/gem_shrink --r get-pages-userptr
IGT-Version: 1.22-g7c474e0 (x86_64) (Linux: 4.16.0-rc7-drm-intel-qa-ww15-commit-617cdf0+ x86_64)
Using 30 processes and 128MiB per process
Subtest get-pages-userptr: SUCCESS (93.695s)

real    1m33.850s
user    0m6.524s
sys     1m12.708s
Comment 5 Francesco Balestrieri 2018-07-02 10:06:05 UTC
Marking resolved based on the above comment.

