Bug 111923

Summary: [CI][SHARDS]igt@i915_selftest@live_execlists - dmesg-warn - WARNING: possible irq lock inversion dependency detected
Product: DRI
Reporter: Lakshmi <lakshminarayana.vudum>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: RESOLVED WORKSFORME
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: not set
Priority: not set
CC: intel-gfx-bugs
Version: DRI git
Hardware: Other
OS: All
Whiteboard:
i915 platform: ICL, KBL, SKL
i915 features: GEM/Other

Description Lakshmi 2019-10-08 16:05:52 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7014/shard-kbl1/igt@i915_selftest@live_execlists.html

<4> [2709.442616] ========================================================
<4> [2709.442619] WARNING: possible irq lock inversion dependency detected
<4> [2709.442621] 5.4.0-rc1-CI-CI_DRM_7014+ #1 Tainted: G     U           
<4> [2709.442623] --------------------------------------------------------
<4> [2709.442626] i915_selftest/7718 just changed the state of lock:
<4> [2709.442628] ffff88825da80ea0 (&i915_request_get(rq)->submit/1){-...}, at: __i915_sw_fence_complete+0x1b2/0x250 [i915]
<4> [2709.442673] but this lock took another, HARDIRQ-unsafe lock in the past:
<4> [2709.442675]  (&ce->pin_mutex/2){+...}
<4> [2709.442676] 

and interrupts could create inverse lock ordering between them.

<4> [2709.442681] 
other info that might help us debug this:
<4> [2709.442683] Chain exists of:
  &i915_request_get(rq)->submit/1 --> &engine->active.lock --> &ce->pin_mutex/2

<4> [2709.442688]  Possible interrupt unsafe locking scenario:

<4> [2709.442691]        CPU0                    CPU1
<4> [2709.442693]        ----                    ----
<4> [2709.442694]   lock(&ce->pin_mutex/2);
<4> [2709.442696]                                local_irq_disable();
<4> [2709.442699]                                lock(&i915_request_get(rq)->submit/1);
<4> [2709.442702]                                lock(&engine->active.lock);
<4> [2709.442704]   <Interrupt>
<4> [2709.442705]     lock(&i915_request_get(rq)->submit/1);
<4> [2709.442708] 
 *** DEADLOCK ***
Comment 1 CI Bug Log 2019-10-08 16:07:48 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* SKL KBL ICL: igt@i915_selftest@live_execlists - dmesg-warn - WARNING: possible irq lock inversion dependency detected
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_5118/fi-icl-u3/igt@i915_selftest@live_execlists.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_5119/fi-icl-u2/igt@i915_selftest@live_execlists.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14677/shard-iclb2/igt@i915_selftest@live_execlists.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14679/fi-skl-6260u/igt@i915_selftest@live_execlists.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14680/fi-icl-u2/igt@i915_selftest@live_execlists.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5214/fi-icl-u3/igt@i915_selftest@live_execlists.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7014/shard-kbl1/igt@i915_selftest@live_execlists.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7016/fi-icl-u4/igt@i915_selftest@live_execlists.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7018/fi-icl-u3/igt@i915_selftest@live_execlists.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7021/fi-icl-u3/igt@i915_selftest@live_execlists.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14685/fi-icl-u2/igt@i915_selftest@live_execlists.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7023/fi-skl-6260u/igt@i915_selftest@live_execlists.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14687/shard-kbl2/igt@i915_selftest@live_execlists.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14694/fi-skl-6260u/igt@i915_selftest@live_execlists.html
Comment 2 Chris Wilson 2019-10-08 16:09:06 UTC
This annotation never stops giving! The silly part of it is that it's purely an annotation meant to make the lock handling easier!!!
Comment 3 CI Bug Log 2019-10-08 20:20:30 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* ICL: igt@i915_selftest@live_execlists - dmesg-warn - WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7022/fi-icl-u3/igt@i915_selftest@live_execlists.html
Comment 4 Francesco Balestrieri 2019-10-10 06:05:03 UTC
Chris, is this a real problem or just noise?
Comment 5 Chris Wilson 2019-10-14 07:24:17 UTC
(In reply to Francesco Balestrieri from comment #4)
> Chris, is this a real problem or just noise?

No, this one is purely self-inflicted.
Comment 6 Chris Wilson 2019-10-22 12:37:05 UTC
commit 0587152bf9a0d7ebfd7fcb401068a742027adb2a (HEAD -> drm-intel-next-queued, drm-intel/drm-intel-next-queued)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Oct 22 13:28:45 2019 +0100

    drm/i915: Drop assertion that ce->pin_mutex guards state updates
    
    The actual conditions are that we know the GPU is not accessing the
    context, and we hold a pin on the context image to allow CPU access. We
    used a fake lock on ce->pin_mutex so that we could try and use lockdep
    to assert that access is serialised, but the various different
    hardirq/softirq contexts where we need to *fake* holding the pin_mutex
    are causing more trouble.
    
    Still it would be nice if we did have a way to reassure ourselves that
    the direct update to the context image is serialised with GPU execution.
    In the meantime, stop lockdep complaining about false irq inversions.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111923
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20191022122845.25038-1-chris@chris-wilson.co.uk
