Summary: | [CI][SHARDS] igt@gem_mmap_gtt@forked-* - timeout - extremely slow on ICL | ||
---|---|---|---|
Product: | DRI | Reporter: | Martin Peres <martin.peres> |
Component: | DRM/Intel | Assignee: | Mika Kuoppala <mika.kuoppala> |
Status: | RESOLVED NOTOURBUG | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Severity: | normal | ||
Priority: | high | CC: | intel-gfx-bugs |
Version: | XOrg git | ||
Hardware: | Other | ||
OS: | All | ||
Whiteboard: | ReadyForDev | ||
i915 platform: | ICL | i915 features: | GEM/Other |
Description
Martin Peres
2019-06-10 12:59:00 UTC
One interesting comparison:

| small-copy | single | forked |
|---|---|---|
| glk | 2.15s | 2.89s |
| icl | 2.50s | 281.08s |

Quite clearly it simply explodes with concurrent use.

- https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6077/shard-iclb1/igt@gem_mmap_gtt@forked-big-copy.html -> 74.80s
- https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6079/shard-iclb7/igt@gem_mmap_gtt@forked-big-copy.html -> +1058.06s (timed out and killed)

For the sake of consistency, the next iclb1 result was https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6092/shard-iclb1/igt@gem_mmap_gtt@forked-big-copy.html -> +1032s (timed out)

On May 14 (when CI_DRM_6078 wasn't run on shard-iclb) the BIOSes were upgraded for the shard, so there is a large probability that this change caused the issue.

The new BIOS was WW18; the shards are now running a newer one. Versions older than WW18 can't be tested on current CPUs, so to reproduce the issue we need to find an older CPU, or a full host that can be downgraded BIOS-wise.

Temporary band-aid:

    commit 6cb3d4a9457cdfb993ebb2a086a4844b85c49ee2 (upstream/master, origin/master, origin/HEAD)
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date:   Mon Jun 10 13:52:02 2019 +0100

        i915/gem_mmap_gtt: Disregard forked subtests on ICL for reasons

        Nothing to see here, please move along.

        The short story seems to be that a BIOS update made concurrent GTT
        access a few orders of magnitude slower, severely hampering CI. Where
        the fault actually lies is unknown, and how to circumvent it, unknown.

        References: https://bugs.freedesktop.org/show_bug.cgi?id=110882
        Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
        Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
        Cc: Martin Peres <martin.peres@linux.intel.com>
        Acked-by: Daniel Vetter <daniel@ffwll.ch>

(In reply to Chris Wilson from comment #5)

Thanks, it got rid of most of the issues. We still have the following tests being executed though:

| Test | Machine | Min (s) | Avg (s) | Max (s) |
|---|---|---|---|---|
| igt@gem_mmap_gtt@forked-big-copy | shard-iclb | 803.781 | 937.90 | 1063.49 |
| igt@gem_mmap_gtt@forked-big-copy-odd | shard-iclb | 974.263 | 1002.33 | 1044.71 |

(In reply to Martin Peres from comment #6)

My bad, we have not received the new results yet! Sorry for the noise!

The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* ICL: igt@gem_mmap_gtt@forked-* - fail - Failed assertion: !(intel_gen(devid) >= 11 && ncpus > 1)
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3136/shard-iclb1/igt@gem_mmap_gtt@forked-big-copy-xy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3136/shard-iclb1/igt@gem_mmap_gtt@forked-big-copy-odd.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3136/shard-iclb4/igt@gem_mmap_gtt@forked-basic-small-copy-odd.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3136/shard-iclb4/igt@gem_mmap_gtt@forked-medium-copy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3136/shard-iclb4/igt@gem_mmap_gtt@forked-medium-copy-odd.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3136/shard-iclb4/igt@gem_mmap_gtt@forked-basic-small-copy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3136/shard-iclb4/igt@gem_mmap_gtt@forked-big-copy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3136/shard-iclb5/igt@gem_mmap_gtt@forked-medium-copy-xy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3136/shard-iclb8/igt@gem_mmap_gtt@forked-basic-small-copy-xy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5051/shard-iclb3/igt@gem_mmap_gtt@forked-big-copy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5051/shard-iclb5/igt@gem_mmap_gtt@forked-medium-copy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5051/shard-iclb5/igt@gem_mmap_gtt@forked-basic-small-copy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5051/shard-iclb5/igt@gem_mmap_gtt@forked-medium-copy-xy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5051/shard-iclb7/igt@gem_mmap_gtt@forked-big-copy-odd.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5051/shard-iclb7/igt@gem_mmap_gtt@forked-big-copy-xy.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5051/shard-iclb8/igt@gem_mmap_gtt@forked-basic-small-copy-odd.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5051/shard-iclb8/igt@gem_mmap_gtt@forked-medium-copy-odd.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5051/shard-iclb8/igt@gem_mmap_gtt@forked-basic-small-copy-xy.html

Are these failures from old runs? AFAIK the tests have been removed.

Actually no. We want to keep tracking this as an open issue, but we also don't want to clog CI with slow tests, so the choice was to make the tests fail deliberately on ICL.

Investigation is ongoing into the exponential slowdown on multicore access.

The CI Bug Log issue associated to this bug has been archived. New failures matching the above filters will not be associated to this bug anymore.

The CI Bug Log issue associated to this bug has been restored. All the previous filters are now active.

Sorry for the noise! I archived the wrong bug. Anyway, this skip is hit on TGL. Can you check whether the issue is reproducible there, and adjust the condition if it is not?
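The condition quoted in the filter above, `!(intel_gen(devid) >= 11 && ncpus > 1)`, catches every platform from gen 11 upward, which is why TGL (gen 12) trips it as well as ICL (gen 11). The sketch below only models that predicate so the two behaviours can be compared side by side. It is not the IGT code: the function names are hypothetical, the generation is passed in as a plain integer instead of being derived from `intel_gen(devid)`, and the ICL-only variant is just one possible way the condition could be adjusted if TGL turned out not to be affected.

```c
#include <assert.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical helper: true when the forked subtests would be refused
 * under the condition quoted in the filter, i.e. gen >= 11 on a
 * multi-CPU machine. */
static bool forked_blocked_gen11_plus(int gen, long ncpus)
{
	return gen >= 11 && ncpus > 1;
}

/* Hypothetical narrower variant: refuse only on gen 11 (ICL), so that
 * gen 12 (TGL) would run the forked subtests again. */
static bool forked_blocked_icl_only(int gen, long ncpus)
{
	return gen == 11 && ncpus > 1;
}

/* Number of online CPUs as reported by sysconf() on Linux/glibc;
 * falls back to 1 if the query fails. */
static long online_cpus(void)
{
	long n = sysconf(_SC_NPROCESSORS_ONLN);

	return n > 0 ? n : 1;
}

/* Small self-check documenting the intended behaviour of both variants. */
static void guard_self_check(void)
{
	assert(forked_blocked_gen11_plus(11, 8));  /* ICL, multi-CPU: refused */
	assert(forked_blocked_gen11_plus(12, 8));  /* TGL is caught as well   */
	assert(!forked_blocked_gen11_plus(11, 1)); /* single CPU: allowed     */
	assert(forked_blocked_icl_only(11, 8));
	assert(!forked_blocked_icl_only(12, 8));   /* TGL no longer caught    */
}
```

Calling `forked_blocked_gen11_plus(gen, online_cpus())` before a forked subtest would reproduce the behaviour the filter describes; swapping in the ICL-only variant is the kind of adjustment the question about TGL asks for.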
A CI Bug Log filter associated to this bug has been updated:

{- ICL: igt@gem_mmap_gtt@forked-* - fail - Failed assertion: !(intel_gen(devid) >= 11 && ncpus > 1) -}

{+ ICL TGL: igt@gem_mmap_gtt@forked-* - fail - Failed assertion: !(intel_gen(devid) >= 11 && ncpus > 1) +}

New failures caught by the filter:

* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_365/fi-tgl-u/igt@gem_mmap_gtt@forked-basic-small-copy-odd.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_365/fi-tgl-u/igt@gem_mmap_gtt@forked-big-copy.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_362/fi-tgl-u/igt@gem_mmap_gtt@forked-medium-copy-xy.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_365/fi-tgl-u/igt@gem_mmap_gtt@forked-medium-copy-xy.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_362/fi-tgl-u/igt@gem_mmap_gtt@forked-medium-copy.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_365/fi-tgl-u/igt@gem_mmap_gtt@forked-basic-small-copy.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_365/fi-tgl-u/igt@gem_mmap_gtt@forked-basic-small-copy-xy.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_365/fi-tgl-u/igt@gem_mmap_gtt@forked-medium-copy.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_362/fi-tgl-u/igt@gem_mmap_gtt@forked-basic-small-copy-odd.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_362/fi-tgl-u/igt@gem_mmap_gtt@forked-big-copy-xy.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_363/fi-tgl-u/igt@gem_mmap_gtt@forked-basic-small-copy-odd.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_363/fi-tgl-u/igt@gem_mmap_gtt@forked-big-copy.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_363/fi-tgl-u/igt@gem_mmap_gtt@forked-big-copy-xy.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_363/fi-tgl-u/igt@gem_mmap_gtt@forked-basic-small-copy.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_363/fi-tgl-u/igt@gem_mmap_gtt@forked-basic-small-copy-xy.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_363/fi-tgl-u/igt@gem_mmap_gtt@forked-medium-copy.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_364/fi-tgl-u/igt@gem_mmap_gtt@forked-basic-small-copy-odd.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_364/fi-tgl-u/igt@gem_mmap_gtt@forked-big-copy.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_364/fi-tgl-u/igt@gem_mmap_gtt@forked-medium-copy-odd.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_364/fi-tgl-u/igt@gem_mmap_gtt@forked-big-copy-xy.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_364/fi-tgl-u/igt@gem_mmap_gtt@forked-basic-small-copy.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_364/fi-tgl-u/igt@gem_mmap_gtt@forked-big-copy-odd.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_364/fi-tgl-u/igt@gem_mmap_gtt@forked-basic-small-copy-xy.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_364/fi-tgl-u/igt@gem_mmap_gtt@forked-medium-copy.html

Early tgl results:

* basic-small-copy: SUCCESS (1,671s)
* forked-basic-small-copy: SUCCESS (37,568s)

Not great, but not as bad as icl (might just be difference in memdebug options?)

* medium-copy: SUCCESS (3,307s)
* forked-medium-copy: SUCCESS (76,614s)
* forked-medium-copy-XY: SUCCESS (203,251s)
* forked-medium-copy-odd: SUCCESS (204,265s)

Moved to the attic:

    commit 0e9510b83502af3e230870df2d66d4f68918d3a4 (HEAD, upstream/master)
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date:   Tue Sep 17 14:01:27 2019 +0100

        i915/gem_mmap_gtt: Replace forked-mmapped tests with a lighter variant

        Introduce a new 2-process fork test that is bound to a single cpu to
        exercise contention during pagefaults. This is a much lighter variant of
        the all-cpus test intended to be viable even on the legendary frozen
        lakes of molasses.

        References: https://bugs.freedesktop.org/show_bug.cgi?id=110882
        Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
        Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
        Cc: Martin Peres <martin.peres@linux.intel.com>
        Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

(In reply to Chris Wilson from comment #17)

:'D

The CI Bug Log issue associated to this bug has been archived. New failures matching the above filters will not be associated to this bug anymore.
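The lighter variant described in the commit above uses two processes bound to a single CPU that contend during page faults. The general shape of such a test can be sketched as follows. This is an illustration under stated assumptions, not the IGT test itself: it uses an anonymous shared mapping as a stand-in for a GTT mmap of a GEM object, the function names `pin_to_one_cpu` and `two_process_fault_contention` are hypothetical, and it checks only that both writers' pages end up with the expected contents.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Pin the calling process to the first CPU it is currently allowed to
 * run on. A child created by fork() afterwards inherits this mask.
 * Returns 0 on success, -1 on failure. */
static int pin_to_one_cpu(void)
{
	cpu_set_t allowed, one;
	int cpu;

	if (sched_getaffinity(0, sizeof(allowed), &allowed))
		return -1;

	for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
		if (CPU_ISSET(cpu, &allowed)) {
			CPU_ZERO(&one);
			CPU_SET(cpu, &one);
			return sched_setaffinity(0, sizeof(one), &one);
		}
	}
	return -1;
}

/* Parent and child share one CPU and fault in interleaved pages of a
 * shared mapping: the parent touches even pages, the child odd pages.
 * Note: the caller stays pinned after this returns.
 * Returns 0 if every page carries the expected writer's mark. */
static int two_process_fault_contention(size_t npages)
{
	long page = sysconf(_SC_PAGESIZE);
	unsigned char *map;
	size_t psz, len, i;
	pid_t child;
	int status, ret = 0;

	if (page <= 0 || pin_to_one_cpu())
		return -1;
	psz = (size_t)page;
	len = npages * psz;

	/* Stand-in for a GTT mmap; pages are populated on first touch. */
	map = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return -1;

	child = fork();
	if (child < 0) {
		munmap(map, len);
		return -1;
	}

	/* The first write to each page triggers the page fault. */
	for (i = (child == 0); i < npages; i += 2)
		map[i * psz] = (child == 0) ? 'C' : 'P';

	if (child == 0)
		_exit(0);

	if (waitpid(child, &status, 0) != child ||
	    !WIFEXITED(status) || WEXITSTATUS(status))
		ret = -1;

	for (i = 0; i < npages && !ret; i++)
		if (map[i * psz] != ((i & 1) ? 'C' : 'P'))
			ret = -1;

	munmap(map, len);
	return ret;
}
```

A call such as `two_process_fault_contention(64)` returns 0 when both processes faulted in their pages. Keeping both processes on one CPU keeps the cost bounded regardless of core count, which appears to be the point of replacing the all-CPUs variant.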