Summary: | [CI] igt@drv_selftest@live_gtt - fail | | |
---|---|---|---|
Product: | DRI | Reporter: | Martin Peres <martin.peres> |
Component: | DRM/Intel | Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Status: | CLOSED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Severity: | normal | | |
Priority: | medium | CC: | intel-gfx-bugs |
Version: | XOrg git | | |
Hardware: | Other | | |
OS: | All | | |
Whiteboard: | ReadyForDev | | |
i915 platform: | BXT, GLK, KBL | i915 features: | GEM/Other |
Description
Martin Peres
2018-03-05 15:49:09 UTC
You're looking in the wrong place for test output. The test runs inside the kernel and randomly dies when java or systemd-journald allocate during the test. The test itself should back off gracefully under allocation failure, but we can't defend against OOM generated by non-igt processes.

Thanks Chris, I am assigning Tomi on this to either further reduce our memory usage or increase the amount of RAM on it.

Another instance of this issue happened last week, so the issue is still here: https://bugs.freedesktop.org/show_bug.cgi?id=106609

To be fair, the test does try to allocate as much of ~64GiB as it can within one second, at one point. (That being it tries to exercise allocating the whole set of pagetables required for 48b.)

Apollo Lake (BXT) hardware memory limit is 8GB, so adding memory isn't really the correct solution.
https://ark.intel.com/products/95598/Intel-Celeron-Processor-N3350-2M-Cache-up-to-2_4-GHz

I see.

(In reply to Chris Wilson from comment #3)
> To be fair, the test does try to allocate as much of ~64GiB as it can within
> one second, at one point. (That being it tries to exercise allocating the
> whole set of pagetables required for 48b.)

I see... So what can we do to be more deterministic? Otherwise, the test will have to be suppressed forever, which would defeat its purpose.

(In reply to Tomi Sarvela from comment #4)
> Apollo Lake (BXT) hardware memory limit is 8GB, so adding memory isn't
> really the correct solution.
>
> https://ark.intel.com/products/95598/Intel-Celeron-Processor-N3350-2M-Cache-up-to-2_4-GHz

Yeah, sorry for assigning you! I re-assigned it!
Also seen on GLK and KBL:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4225/shard-glk6/igt@drv_selftest@live_gtt.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4247/shard-kbl1/igt@drv_selftest@live_gtt.html

This one has a pstore:
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4492/shard-kbl4/igt@drv_selftest@live_gtt.html

<0>[ 174.348941] ---------------------------------
<4>[ 174.348947] Modules linked in: i915(+) snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic btusb btrtl btbcm btintel bluetooth snd_hda_codec x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul snd_hwdep snd_hda_core ghash_clmulni_intel ecdh_generic snd_pcm e1000e mei_me mei prime_numbers [last unloaded: i915]
<4>[ 174.349001] CPU: 2 PID: 5794 Comm: drv_selftest Tainted: G U 4.17.0-rc6-CI-CI_DRM_4221+ #1
<4>[ 174.349012] Hardware name: /NUC7i5BNB, BIOS BNKBL357.86A.0054.2017.1025.1822 10/25/2017
<4>[ 174.349062] RIP: 0010:i915_vma_destroy+0x1fd/0x410 [i915]
<4>[ 174.349069] RSP: 0018:ffffc900005579a0 EFLAGS: 00010286
<4>[ 174.349078] RAX: 000000000000000c RBX: ffff8802677ef9c0 RCX: 0000000000000000
<4>[ 174.349086] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff880275a074e8
<4>[ 174.349094] RBP: ffff88026775f700 R08: 000000000000e750 R09: ffff880275b9c000
<4>[ 174.349103] R10: 0000000000000000 R11: ffff880275a074e8 R12: ffff88026b72ef98
<4>[ 174.349111] R13: 0000000000000000 R14: ffff8802677ef9c0 R15: 00000000fffffe00
<4>[ 174.349120] FS: 00007f5b5cf30980(0000) GS:ffff88027ed00000(0000) knlGS:0000000000000000
<4>[ 174.349129] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 174.349137] CR2: 00007f765f402180 CR3: 000000025f8ec002 CR4: 00000000003606e0
<4>[ 174.349145] Call Trace:
<4>[ 174.349191] __igt_write_huge+0xf7/0x2d0 [i915]
<4>[ 174.349238] igt_write_huge+0x255/0x350 [i915]
<4>[ 174.349285] igt_ppgtt_exhaust_huge+0x250/0x590 [i915]
<4>[ 174.349340] __i915_subtests+0x44/0xd0 [i915]
<4>[ 174.349389] i915_gem_huge_page_live_selftests+0x7d/0xc0 [i915]
<4>[ 174.349444] __run_selftests+0x10b/0x190 [i915]
<4>[ 174.349494] i915_live_selftests+0x2c/0x60 [i915]
<4>[ 174.349538] i915_pci_probe+0x3b/0x90 [i915]
<4>[ 174.349548] pci_device_probe+0xa1/0x130
<4>[ 174.349557] driver_probe_device+0x306/0x480
<4>[ 174.349564] __driver_attach+0xb7/0xe0
<4>[ 174.349571] ? driver_probe_device+0x480/0x480
<4>[ 174.349578] ? driver_probe_device+0x480/0x480
<4>[ 174.349586] bus_for_each_dev+0x74/0xc0
<4>[ 174.349593] bus_add_driver+0x15f/0x250
<4>[ 174.349599] ? 0xffffffffa0759000
<4>[ 174.349605] driver_register+0x52/0xc0
<4>[ 174.349611] ? 0xffffffffa0759000
<4>[ 174.349617] do_one_initcall+0x58/0x370
<4>[ 174.349625] ? do_init_module+0x1d/0x1ea
<4>[ 174.349632] ? rcu_read_lock_sched_held+0x6f/0x80
<4>[ 174.349639] ? kmem_cache_alloc_trace+0x282/0x2e0
<4>[ 174.349648] do_init_module+0x56/0x1ea
<4>[ 174.349655] load_module+0x2435/0x2b20
<4>[ 174.349667] ? __se_sys_finit_module+0xd3/0xf0
<4>[ 174.349674] __se_sys_finit_module+0xd3/0xf0
<4>[ 174.349685] do_syscall_64+0x55/0x190
<4>[ 174.349692] entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[ 174.349699] RIP: 0033:0x7f5b5c5e2839
<4>[ 174.349704] RSP: 002b:00007ffd90f3fa58 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
<4>[ 174.349715] RAX: ffffffffffffffda RBX: 00005604d46ddde0 RCX: 00007f5b5c5e2839
<4>[ 174.349723] RDX: 0000000000000000 RSI: 00005604d46debf0 RDI: 0000000000000004
<4>[ 174.349731] RBP: 00005604d46debf0 R08: 0000000000000004 R09: 0000000000000000
<4>[ 174.349740] R10: 00007ffd90f3fbc0 R11: 0000000000000246 R12: 0000000000000000
<4>[ 174.349748] R13: 00005604d46d7b00 R14: 0000000000000000 R15: 000000000000003d
<4>[ 174.349759] Code: e8 82 c4 ba e0 48 8b 35 ba 62 1a 00 49 c7 c0 4e f7 63 a0 b9 e5 02 00 00 48 c7 c2 00 5f 62 a0 48 c7 c7 88 d0 54 a0 e8 83 2e c1 e0 <0f> 0b 48 c7 c1 b0 ae 65 a0 ba d4 02 00 00 48 c7 c6 e0 5e 62 a0
<1>[ 174.349874] RIP: i915_vma_destroy+0x1fd/0x410 [i915] RSP: ffffc900005579a0
<4>[ 174.352106] ---[ end trace f8090bf9ca9aa028 ]---

commit 207b700050b8d323d0c23b457c200b22c7ed3737
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Fri Jul 6 13:53:38 2018 +0100

    drm/i915/selftests: Limit live_gtt allocation test to fit within RAM

    Limit the GTT size we try and allocate to ensure that it fits within RAM
    and does not trigger the oomkiller indiscriminately.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Matthew Auld <matthew.auld@intel.com>
    Reviewed-by: Matthew Auld <matthew.auld@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20180706125338.24432-1-chris@chris-wilson.co.uk

This has been green since CI_DRM_4445 on shard-glk, 4448 on shard-apl, and 4447 on shard-kbl. Closing.
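
For readers unfamiliar with the fix, the idea described in the commit message (cap the range the test tries to allocate at the amount of physical RAM) can be illustrated roughly as follows. This is only a sketch in Python, not the kernel patch itself: the names `clamp_gtt_limit`, `allocation_sizes`, `vm_total` and `total_ram_pages` are hypothetical, the 4 KiB page size is an assumption, and the geometric size sweep is an assumed shape for the test loop rather than something stated in this report.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages


def clamp_gtt_limit(vm_total: int, total_ram_pages: int,
                    page_shift: int = PAGE_SHIFT) -> int:
    """Return the largest range (in bytes) the test should try to cover.

    vm_total        -- size of the GPU address space in bytes (hypothetical name)
    total_ram_pages -- number of physical pages in the machine (hypothetical name)
    """
    ram_bytes = total_ram_pages << page_shift
    # Never attempt to cover more address space than there is RAM.
    return min(vm_total, ram_bytes)


def allocation_sizes(limit: int, start: int = 4096, factor: int = 4):
    """Yield geometrically growing sizes that stay within the limit.

    The start value and growth factor are illustrative assumptions.
    """
    size = start
    while size <= limit:
        yield size
        size *= factor


if __name__ == "__main__":
    # A 48-bit address space on a machine with 8 GiB of RAM.
    limit = clamp_gtt_limit(1 << 48, (8 << 30) >> PAGE_SHIFT)
    print(limit == 8 << 30)
    print(max(allocation_sizes(limit)) <= limit)
```

With a cap like this, the largest range attempted on an 8GB Apollo Lake machine is bounded by its RAM instead of by the 48b address space, which is the behaviour the commit message describes.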