Summary: | [CI] igt@drv_selftest@live_gtt - incomplete | ||
---|---|---|---|
Product: | DRI | Reporter: | Marta Löfstedt <marta.lofstedt> |
Component: | DRM/Intel | Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Status: | CLOSED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Severity: | normal | ||
Priority: | medium | CC: | intel-gfx-bugs |
Version: | DRI git | ||
Hardware: | Other | ||
OS: | All | ||
Whiteboard: | ReadyForDev | ||
i915 platform: | BXT, GLK | i915 features: |
Description
Marta Löfstedt
2017-11-13 12:11:05 UTC
Only seen once so far (I think, at least); it looks to be a kernel leak. At the moment, the obvious thing to do is a run with kmemleak, but my initial guess is that it's the result of an early failure not cleaning up properly. The module's allocations (such as drm_mm, kmem_cache, etc.) are checked upon module unload (and by kselftest), but no warning was seen, hence the search for something a little more unusual.

(In reply to Chris Wilson from comment #1)
This incomplete is pretty frequent on APL, but due to ftrace messing up pstore and the recent 4.15.0-rc1 fire, we'll have to wait and see if we can get any reasonable data on this.

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3415/shard-apl6/igt@drv_selftest@live_gtt.html
doesn't have any of the OOM stuff, so I am changing the title and filing all igt@drv_selftest@live_gtt incompletes on this bug.

run.log doesn't hint at a timeout or softdog, so a system hang is assumed.

This is the last dmesg:
<7>[ 2890.553261] [drm:gen9_set_dc_state [i915]] Setting DC state from 00 to 01
<5>[ 2891.380887] __shrink_hole timed out at ofset 1ffffff000 [0 - 1000000000000]
<5>[ 2892.632006] lowlevel_hole timed out before 192296/260705
<5>[ 2893.636015] drunk_hole timed out after 114947/521410
<5>[ 2894.637006] walk_hole timed out at 1c93a000
<5>[ 2895.784092] pot_hole timed out after 16/31
<5>[ 2896.837487] fill_hole timed out (npages=279841, prime=23)
<6>[ 2896.842475] Console: switching to colour dummy device 80x25

(In reply to Marta Löfstedt from comment #3)
The dmesg snippet is wrong; it is from this GLK-shards run:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3415/shard-glkb4/igt@drv_selftest@live_gtt.html

The last APL dmesgs are:
<7>[ 1653.449383] [drm:intel_fb_initial_config [i915]] Not using firmware configuration
<7>[ 1653.449404] [drm:drm_setup_crtcs] looking for cmdline mode on connector 72
<7>[ 1653.449427] [drm:drm_setup_crtcs] looking for preferred mode on connector 72 0
<7>[ 1653.449434] [drm:drm_setup_crtcs] found mode 1024x768
<7>[ 1653.449439] [drm:drm_setup_crtcs] picking CRTCs for 8192x8192 config
<7>[ 1653.449467] [drm:drm_setup_crtcs] desired mode 1024x768 set on crtc 40 (0,0)
<7>[ 1653.449577] [drm:intelfb_create [i915]] no BIOS fb, allocating a new one
<7>[ 1653.483768] [drm:asle_work [i915]] bclp = 0x800000ff
<7>[ 1653.483842] [drm:asle_work [i915]] updating opregion backlight 255/255
<6>[ 1664.028427] perf: interrupt took too long (2503 > 2500), lowering kernel.perf_event_max_sample_rate to 79000

To clarify my previous mess: here are 2 new occurrences of this issue; both look like system hangs.
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3415/shard-apl6/igt@drv_selftest@live_gtt.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3415/shard-glkb4/igt@drv_selftest@live_gtt.html

I think this explains this failure, and it should also prevent the sanitycheck incompletes.

commit c325dd948b4e4e9fe0cc7d612f2101fb3804de5c (HEAD, upstream/master)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Thu Nov 30 21:41:10 2017 +0000

    igt/drv_selftests: Disable initialising the display

    Many of the selftests try to completely fill global resources;
    resources that are presumed available for bringing up the display.
    Avoid the contention by simply not bringing up the display! This does
    limit the effectiveness of selftesting to GEM for the time being. To
    exercise KMS from selftests we would essentially have to always mock
    the displays.

    References: https://bugs.freedesktop.org/show_bug.cgi?id=103718
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Fix included in CI_DRM_3449.

I will close.
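
For triage, the selftest timeout lines in a dmesg capture like the GLK one quoted in this thread can be listed with a small script. This is a minimal sketch, assuming the log is available as plain text in the `<level>[ timestamp] message` format shown in this report; the function name `find_selftest_timeouts` and the regular expression are illustrative and not part of IGT or the kernel.

```python
import re

# Matches kernel log entries such as:
#   <5>[ 2892.632006] lowlevel_hole timed out before 192296/260705
# The detail group stops at the next '<', so entries are also found when
# several log lines have been flattened onto one line.
TIMEOUT_RE = re.compile(
    r"<(?P<level>\d)>\[\s*(?P<ts>\d+\.\d+)\]\s+"
    r"(?P<name>\w+) timed out\b(?P<detail>[^<]*)"
)


def find_selftest_timeouts(dmesg_text):
    """Return (timestamp, subtest name, detail) for every 'timed out' entry."""
    return [
        (float(m.group("ts")), m.group("name"), m.group("detail").strip())
        for m in TIMEOUT_RE.finditer(dmesg_text)
    ]


# Sample lines copied from the GLK dmesg in this report.
sample = (
    "<7>[ 2890.553261] [drm:gen9_set_dc_state [i915]] Setting DC state from 00 to 01\n"
    "<5>[ 2891.380887] __shrink_hole timed out at ofset 1ffffff000 [0 - 1000000000000]\n"
    "<5>[ 2895.784092] pot_hole timed out after 16/31\n"
)

for ts, name, detail in find_selftest_timeouts(sample):
    print(f"{ts:.6f} {name}: {detail}")
```

Note that these timeout messages are logged at level 5 (notice) and appear in the GLK log only; the APL log quoted above ends without them, so a script like this helps separate the two cases but does not by itself identify the hang.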