Bug 88435

Summary: [BDW] igt/gem_concurrent_blit/cpu-bcs-early-read-forked-hang(rcs) sporadically takes more than 10 minutes or fails
Product: DRI
Reporter: lu hua <huax.lu>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED FIXED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal
Priority: medium
CC: christophe.prigent, intel-gfx-bugs
Version: unspecified
Hardware: All
OS: Linux (All)
Whiteboard:
i915 platform: BDW
i915 features: GEM/Other
Attachments:
Description Flags
dmesg none

Description lu hua 2015-01-15 03:31:19 UTC
Created attachment 112262: dmesg

==System Environment==
--------------------------
Regression: no, new case

Non-working platforms:  BDW

==kernel==
--------------------------
drm-intel-nightly/95cce4b4c5f3ecaf9c1c01d42f670da2748fcffb
commit 95cce4b4c5f3ecaf9c1c01d42f670da2748fcffb
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Wed Jan 14 20:58:31 2015 +0100

    drm-intel-nightly: 2015y-01m-14d-19h-58m-09s UTC integration manifest

==Bug detailed description==
-----------------------------
Ran 3 cycles on the BDW nightly kernel: the subtest failed once and took more than 10 minutes twice.
It skips on the -fixes kernel.

output:
[root@x-bdw01 tests]# time ./gem_concurrent_blit --run-subtest cpu-bcs-early-read-forked-hang\(rcs\)
IGT-Version: 1.9-g3214a27 (x86_64) (Linux: 3.19.0-rc4_drm-intel-nightly_95cce4_20150115+ x86_64)
using 2x512 buffers, each 1MiB
Test assertion failure function gem_quiescent_gpu, file drmtest.c:168:
Failed assertion: drmIoctl((fd), ((((1U) << (((0+8)+8)+14)) | ((('d')) << (0+8)) | (((0x40 + 0x29)) << 0) | ((((sizeof(struct drm_i915_gem_execbuffer2)))) << ((0+8)+8)))), (&execbuf)) == 0
Last errno: 5, Input/output error
child 10 failed with exit status 99
Subtest cpu-bcs-early-read-forked-hang(rcs): FAIL (257.637s)

real    4m21.527s
user    0m6.242s
sys     0m8.217s
[root@x-bdw01 tests]# time ./gem_concurrent_blit --run-subtest cpu-bcs-early-read-forked-hang\(rcs\)
IGT-Version: 1.9-g3214a27 (x86_64) (Linux: 3.19.0-rc4_drm-intel-nightly_95cce4_20150115+ x86_64)
using 2x512 buffers, each 1MiB
Test assertion failure function gem_quiescent_gpu, file drmtest.c:168:
Failed assertion: drmIoctl((fd), ((((1U) << (((0+8)+8)+14)) | ((('d')) << (0+8)) | (((0x40 + 0x29)) << 0) | ((((sizeof(struct drm_i915_gem_execbuffer2)))) << ((0+8)+8)))), (&execbuf)) == 0
Last errno: 5, Input/output error
child 5 failed with exit status 99
Subtest cpu-bcs-early-read-forked-hang(rcs): FAIL (721.842s)

real    12m5.701s
user    0m7.615s
sys     0m12.919s
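
The long expression in the failed assertion is a preprocessor-expanded ioctl request number, and errno 5 is EIO. As a rough aid for reading the log, the sketch below rebuilds and decodes that number using the generic Linux ioctl field layout (8 bits nr, 8 bits type, 14 bits size, 2 bits direction), which matches the shifts visible in the log. The helper names are made up for illustration, and the 64-byte value for sizeof(struct drm_i915_gem_execbuffer2) is an assumption for x86_64, not something the log states.

```python
import errno
import os

# Generic Linux ioctl field layout; matches the shifts in the log:
# nr << 0, type << (0+8), size << ((0+8)+8), dir << (((0+8)+8)+14).
NR_SHIFT, TYPE_SHIFT, SIZE_SHIFT, DIR_SHIFT = 0, 8, 16, 30
IOC_WRITE = 1  # the "(1U)" direction field in the failed assertion


def ioc(direction, type_char, nr, size):
    """Encode an ioctl request number from its four fields."""
    return ((direction << DIR_SHIFT) | (ord(type_char) << TYPE_SHIFT)
            | (nr << NR_SHIFT) | (size << SIZE_SHIFT))


def decode_ioc(request):
    """Split an ioctl request number back into its four fields."""
    return {
        "dir": (request >> DIR_SHIFT) & 0x3,
        "type": chr((request >> TYPE_SHIFT) & 0xFF),
        "nr": (request >> NR_SHIFT) & 0xFF,
        "size": (request >> SIZE_SHIFT) & 0x3FFF,
    }


# ASSUMPTION: sizeof(struct drm_i915_gem_execbuffer2) is 64 bytes on x86_64.
EXECBUFFER2_SIZE = 64

# 'd' is the DRM ioctl type; 0x40 + 0x29 is the command number from the log.
request = ioc(IOC_WRITE, "d", 0x40 + 0x29, EXECBUFFER2_SIZE)

print(hex(request), decode_ioc(request))
print("errno 5:", errno.errorcode[5], "-", os.strerror(5))
```

Read this way, the assertion says that an execbuffer2 ioctl issued from gem_quiescent_gpu returned EIO.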

==Reproduce steps==
---------------------------- 
1. time ./gem_concurrent_blit --run-subtest cpu-bcs-early-read-forked-hang\(rcs\)
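
Since the failure is sporadic, it may help to run the step above several times and record the status and wall-clock time of each cycle. The following is a minimal sketch of such a loop; `run_cycles` is a hypothetical helper, and the exit-code mapping (0 pass, 77 skip, 99 fail) is assumed from IGT conventions and from the "exit status 99" seen in the output above.

```python
import subprocess
import sys
import time

# ASSUMPTION: IGT exit codes are 0 = pass, 77 = skip, 99 = fail.
IGT_EXIT = {0: "PASS", 77: "SKIP", 99: "FAIL"}


def run_cycles(cmd, cycles=3, timeout_s=1800):
    """Run cmd `cycles` times; return a list of (status, seconds) tuples."""
    results = []
    for _ in range(cycles):
        start = time.monotonic()
        try:
            rc = subprocess.run(cmd, timeout=timeout_s).returncode
            status = IGT_EXIT.get(rc, f"exit {rc}")
        except subprocess.TimeoutExpired:
            # The child is killed once the timeout expires.
            status = "TIMEOUT"
        results.append((status, time.monotonic() - start))
    return results


# Example invocation (run from the IGT tests directory):
#   cmd = ["./gem_concurrent_blit", "--run-subtest",
#          "cpu-bcs-early-read-forked-hang(rcs)"]
#   for status, elapsed in run_cycles(cmd):
#       print(f"{status} in {elapsed:.1f}s")
```

Passing the command as a list avoids the shell escaping of the parentheses needed in step 1.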
Comment 1 Chris Wilson 2015-01-15 07:49:56 UTC
The hang is unstable; the execlists reset is broken.
Comment 2 Chris Wilson 2016-09-09 17:52:15 UTC
commit 821ed7df6e2a1dbae243caebcfe21a0a4329fca0
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Sep 9 14:11:53 2016 +0100

    drm/i915: Update reset path to fix incomplete requests
    
    Update reset path in preparation for engine reset which requires
    identification of incomplete requests and associated context and fixing
    their state so that engine can resume correctly after reset.
    
    The request that caused the hang will be skipped and head is reset to the
    start of breadcrumb. This allows us to resume from where we left-off.
    Since this request didn't complete normally we also need to cleanup elsp
    queue manually. This is vital if we employ nonblocking request
    submission where we may have a web of dependencies upon the hung request
    and so advancing the seqno manually is no longer trivial.
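
The idea in the commit message (find the incomplete request, skip its payload, and resume at the start of its breadcrumb) can be illustrated with a toy model. This is only a sketch and not the i915 code: the `Request` fields, the function names, and the ring offsets are all hypothetical, chosen to mirror the wording of the message.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Request:
    seqno: int    # value the breadcrumb writes when the request completes
    head: int     # ring offset where the request's payload starts
    postfix: int  # ring offset where the breadcrumb starts
    tail: int     # ring offset just past the breadcrumb


def find_incomplete(requests: List[Request], completed_seqno: int) -> Optional[Request]:
    """Return the oldest request whose breadcrumb has not been written."""
    for rq in requests:  # requests are kept in submission order
        if rq.seqno > completed_seqno:
            return rq
    return None


def head_after_reset(requests: List[Request], completed_seqno: int, ring_head: int) -> int:
    """Pick the ring head to resume from after a reset.

    The payload of the hung request is skipped, but its breadcrumb still
    runs, so its seqno is signalled and later requests can proceed.
    """
    hung = find_incomplete(requests, completed_seqno)
    if hung is None:
        return ring_head  # nothing was in flight; leave the head alone
    return hung.postfix


ring = [Request(1, 0, 16, 24), Request(2, 24, 40, 48), Request(3, 48, 64, 72)]
# Seqno 1 completed and the GPU hung inside request 2 (head stuck at 30).
print(head_after_reset(ring, completed_seqno=1, ring_head=30))
```

In this model the seqno is still written by the breadcrumb rather than advanced by hand, which is the property the commit message says matters once requests can depend on each other.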
