Bug 107754

Summary: [CI][DRMTIP] igt@kms_vblank@pipe-a-wait-busy-hang - fail - Failed assertion: __gem_execbuf_wr(fd, execbuf) == 0 / error: -5 != 0
Product: DRI Reporter: Martin Peres <martin.peres>
Component: DRM/Intel Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED FIXED QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal    
Priority: medium CC: intel-gfx-bugs
Version: XOrg git   
Hardware: Other   
OS: All   
Whiteboard: ReadyForDev
i915 platform: SKL i915 features: GEM/Other

Description Martin Peres 2018-08-30 14:04:27 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_87/fi-skl-gvtdvm/igt@kms_vblank@pipe-a-wait-busy-hang.html

(kms_vblank:1876) ioctl_wrappers-CRITICAL: Test assertion failure function gem_execbuf_wr, file ../lib/ioctl_wrappers.c:635:
(kms_vblank:1876) ioctl_wrappers-CRITICAL: Failed assertion: __gem_execbuf_wr(fd, execbuf) == 0
(kms_vblank:1876) ioctl_wrappers-CRITICAL: error: -5 != 0
Subtest pipe-A-wait-busy-hang failed.
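
For readers triaging similar failures: the `-5` in the assertion above appears to be a negated errno value, and on Linux errno 5 is EIO, which is consistent with the GPU being declared wedged. A minimal Python sketch of that decoding follows; the helper name `describe_neg_errno` is hypothetical and not part of IGT or the kernel.

```python
import errno
import os


def describe_neg_errno(ret):
    """Map a negative errno-style return value to (symbolic name, message).

    Hypothetical helper for illustration only; assumes the caller passes
    a value of the form -errno, as in the assertion output above.
    """
    code = -ret
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


name, message = describe_neg_errno(-5)
print(name, "-", message)
```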
Comment 1 Chris Wilson 2018-08-30 18:17:33 UTC
Lots of

<7>[  192.784110] [drm:i915_reset_device [i915]] resetting chip
<5>[  192.784186] i915 0000:00:06.0: Resetting chip for Manually set wedged engine mask = ffffffffffffffff
<3>[  192.989305] i915 0000:00:06.0: Failed to idle engines, declaring wedged!

abound. Looks like it stemmed from no longer resetting the preempt status across a GPU reset.

commit 0051163ab3d8090a08ea2ea5edbb738c0920000a
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Jul 16 13:54:24 2018 +0100

    drm/i915/execlists: Always clear preempt status on cancelling all
    
    On reset/wedging, we cancel all pending replies from the HW and we also
    want to cancel an outstanding preemption event. Since we use the same
    function to cancel the pending replies for reset and for a preemption
    event, we can simply clear the active tracking for all.
    
    v2: Keep execlists_user_end() markup for wedging
    v3: Move assignment to inline to hide the bare assignment.
    
    Fixes: 60a943245413 ("drm/i915/execlists: Drop clear_gtiir() on GPU reset")
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20180716125424.5715-1-chris@chris-wilson.co.uk
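
The kernel log lines quoted in this comment can be picked out of a dmesg capture mechanically. The sketch below is one rough way to do that in Python, assuming each line has the `<level>[ timestamp] message` shape shown above; the function name `find_wedged` and the regular expression are illustrative assumptions, not part of any CI tooling.

```python
import re

# Assumed line shape, based on the excerpt above:
#   <3>[  192.989305] i915 0000:00:06.0: Failed to idle engines, declaring wedged!
LINE_RE = re.compile(r"^<(?P<level>\d)>\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<msg>.*)$")


def find_wedged(lines):
    """Return (timestamp, message) pairs for lines that declare the GPU wedged."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and "declaring wedged" in m.group("msg"):
            hits.append((float(m.group("ts")), m.group("msg")))
    return hits


# Sample input copied from the log excerpt in this comment.
sample = [
    "<7>[  192.784110] [drm:i915_reset_device [i915]] resetting chip",
    "<5>[  192.784186] i915 0000:00:06.0: Resetting chip for Manually set wedged engine mask = ffffffffffffffff",
    "<3>[  192.989305] i915 0000:00:06.0: Failed to idle engines, declaring wedged!",
]

for ts, msg in find_wedged(sample):
    print(ts, msg)
```

Only the third sample line matches, because the second line mentions a manually set wedged engine mask but does not declare the device wedged.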
Comment 2 Martin Peres 2018-09-14 13:02:45 UTC
(In reply to Chris Wilson from comment #1)

Thanks, it never happened again in almost 4 months. Let's close it!
