Bug 112402

Summary: [CI][BAT]igt@i915_selftest@live_gem_contexts - dmesg-fail - igt_ctx_sseu failed with error -\d+
Product: DRI
Reporter: Lakshmi <lakshminarayana.vudum>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: RESOLVED FIXED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: not set
Priority: not set
CC: intel-gfx-bugs
Version: DRI git
Hardware: Other
OS: All
Whiteboard:
i915 platform: KBL, SKL
i915 features: GEM/Other

Description Lakshmi 2019-11-27 08:23:18 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5310/fi-skl-6770hq/igt@i915_selftest@live_gem_contexts.html
(i915_selftest:4962) igt_kmod-WARNING: [drm:gen9_set_dc_state [i915]] Setting DC state from 00 to 02
(i915_selftest:4962) igt_kmod-WARNING: [drm:intel_power_well_disable [i915]] disabling always-on
(i915_selftest:4962) igt_kmod-WARNING: idle: Failed with -11!
(i915_selftest:4962) igt_kmod-WARNING: i915/i915_gem_context_live_selftests: igt_ctx_sseu failed with error -11
(i915_selftest:4962) igt_kmod-WARNING: [drm:intel_power_well_enable [i915]] enabling always-on
(i915_selftest:4962) igt_kmod-WARNING: [drm:intel_power_well_enable [i915]] enabling DC off
(i915_selftest:4962) igt_kmod-WARNING: [drm:gen9_set_dc_state [i915]] Setting DC state from 02 to 00
(i915_selftest:4962) igt_kmod-WARNING: [drm:intel_power_well_enable [i915]] enabling power well 2
(i915_selftest:4962) igt_kmod-WARNING: [drm:intel_power_well_enable [i915]] enabling DDI A/E IO power well
(i915_selftest:4962) igt_kmod-WARNING: [drm:intel_power_well_enable [i915]] enabling DDI B IO power well
(i915_selftest:4962) igt_kmod-WARNING: [drm:intel_power_well_enable [i915]] enabling DDI C IO power well
(i915_selftest:4962) igt_kmod-WARNING: [drm:intel_power_well_enable [i915]] enabling DDI D IO power well
(i915_selftest:4962) igt_kmod-WARNING: i915: probe of 0000:00:02.0 failed with error -11
(i915_selftest:4962) igt_kmod-CRITICAL: Test assertion failure function igt_kselftest_execute, file ../lib/igt_kmod.c:588:
(i915_selftest:4962) igt_kmod-CRITICAL: Failed assertion: err == 0
(i915_selftest:4962) igt_kmod-CRITICAL: kselftest "i915 igt__33__live_gem_contexts=1 live_selftests=-1 disable_display=1 st_filter=" failed: Resource temporarily unavailable [11]
(i915_selftest:4962) igt_core-INFO: Stack trace:
(i915_selftest:4962) igt_core-INFO:   #0 ../lib/igt_core.c:1830 __igt_fail_assert()
(i915_selftest:4962) igt_core-INFO:   #1 [igt_kselftest_execute+0x2e5]
(i915_selftest:4962) igt_core-INFO:   #2 ../lib/igt_kmod.c:622 igt_kselftests()
(i915_selftest:4962) igt_core-INFO:   #3 /usr/include/x86_64-linux-gnu/bits/stdio2.h:64 __real_main29()
(i915_selftest:4962) igt_core-INFO:   #4 ../tests/i915/i915_selftest.c:29 main()
(i915_selftest:4962) igt_core-INFO:   #5 ../csu/libc-start.c:344 __libc_start_main()
(i915_selftest:4962) igt_core-INFO:   #6 [_start+0x2a]
****  END  ****
Subtest live_gem_contexts: FAIL (33.168s)

Comment 1 CI Bug Log 2019-11-27 08:24:41 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* KBL SKL:igt@i915_selftest@live_gem_contexts - dmesg-fail - igt_ctx_sseu failed with error -\d+
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_4966/fi-skl-6260u/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_4966/fi-skl-gvtdvm/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_4966/fi-skl-lmem/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_4967/fi-skl-6260u/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_4967/fi-skl-gvtdvm/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_4967/fi-skl-lmem/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15444/fi-skl-lmem/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15449/fi-skl-6770hq/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15445/fi-skl-lmem/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15447/fi-skl-6770hq/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15452/fi-skl-6770hq/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15454/fi-skl-lmem/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15434/shard-kbl3/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7425/fi-kbl-7560u/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3758/fi-kbl-7560u/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3758/fi-skl-lmem/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5310/fi-skl-6770hq/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15438/fi-skl-6770hq/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7426/fi-skl-lmem/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3760/fi-skl-6770hq/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3761/fi-skl-lmem/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_5376/fi-skl-6770hq/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7429/fi-skl-6770hq/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15456/fi-skl-lmem/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15439/fi-skl-6770hq/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15440/fi-kbl-7560u/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15440/fi-skl-lmem/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15441/fi-skl-6770hq/igt@i915_selftest@live_gem_contexts.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15441/fi-skl-lmem/igt@i915_selftest@live_gem_contexts.html

Comment 2 CI Bug Log 2019-11-27 14:18:44 UTC
A CI Bug Log filter associated to this bug has been updated:

{- KBL SKL:igt@i915_selftest@live_gem_contexts - dmesg-fail - igt_ctx_sseu failed with error -\d+ -}
{+ KBL SKL:igt@i915_selftest@live_gem_contexts - dmesg-fail - igt_ctx_sseu failed with error -11 +}


  No new failures caught with the new filter

Comment 3 Chris Wilson 2019-11-27 18:33:47 UTC
commit df9f85d8582ebda052835c55ae940e4f866e1ef5
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Nov 27 13:45:27 2019 +0000

    drm/i915: Serialise i915_active_fence_set() with itself
    
    The expected downside to commit 58b4c1a07ada ("drm/i915: Reduce nested
    prepare_remote_context() to a trylock") was that it would need to return
    -EAGAIN to userspace in order to resolve potential mutex inversion. Such
    an unsightly round trip is unnecessary if we could atomically insert a
    barrier into the i915_active_fence, so make it happen.
    
    Currently, we use the timeline->mutex (or some other named outer lock)
    to order insertion into the i915_active_fence (and so individual nodes
    of i915_active). Inside __i915_active_fence_set, we only need then
    serialise with the interrupt handler in order to claim the timeline for
    ourselves.
    
    However, if we remove the outer lock, we need to ensure the order is
    intact between not only multiple threads trying to insert themselves
    into the timeline, but also with the interrupt handler completing the
    previous occupant. We use xchg() on insert so that we have an ordered
    sequence of insertions (and each caller knows the previous fence on
    which to wait, preserving the chain of all fences in the timeline), but
    we then have to cmpxchg() in the interrupt handler to avoid overwriting
    the new occupant. The only nasty side-effect is having to temporarily
    strip off the RCU-annotations to apply the atomic operations, otherwise
    the rules are much more conventional!
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112402
    Fixes: 58b4c1a07ada ("drm/i915: Reduce nested prepare_remote_context() to a trylock")
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20191127134527.3438410-1-chris@chris-wilson.co.uk
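
For readers unfamiliar with the pattern the commit message describes, here is a minimal userspace sketch of "xchg() on insert, cmpxchg() on completion". It is not the i915 code: `struct fence`, `struct slot`, `slot_set()` and `slot_complete()` are hypothetical names invented for illustration, and C11 `<stdatomic.h>` stands in for the kernel's `xchg()`/`cmpxchg()`. RCU annotations, reference counting and fence callbacks are all left out.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only, not the i915 types. */
struct fence {
    int seqno;
};

struct slot {
    _Atomic(struct fence *) fence; /* current occupant, NULL when idle */
};

/*
 * Insert side: unconditionally swap in the new fence and hand back the
 * previous occupant. Concurrent callers each receive a distinct
 * predecessor, so insertions form one ordered chain without an outer lock.
 * The caller is expected to wait on the returned fence (if any).
 */
static struct fence *slot_set(struct slot *s, struct fence *next)
{
    return atomic_exchange(&s->fence, next);
}

/*
 * Completion side (the "interrupt handler" role): clear the slot only if
 * it still holds the fence that just completed. Returns 1 if the slot was
 * cleared, 0 if a newer fence has already replaced it and must be kept.
 */
static int slot_complete(struct slot *s, struct fence *done)
{
    struct fence *expected = done;

    return atomic_compare_exchange_strong(&s->fence, &expected,
                                          (struct fence *)NULL);
}
```

In this sketch, a plain store of NULL in `slot_complete()` could wipe out a fence inserted between the completion and the store; the compare-and-exchange is what prevents the handler from overwriting the new occupant.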
