Bug 109580 - [CI][DRMTIP] igt@gem_ringfill@basic-default-hang - dmesg-fail - Failed assertion: __gem_execbuf(fd, execbuf) == 0
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
Whiteboard: Triaged, ReadyForDev
Reported: 2019-02-07 17:19 UTC by Lakshmi
Modified: 2019-03-06 16:24 UTC

i915 platform: G33, PNV
i915 features: GEM/Other

Description Lakshmi 2019-02-07 17:19:10 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_204/fi-blb-e6850/igt@gem_ringfill@basic-default-hang.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_204/fi-pnv-d510/igt@gem_ringfill@basic-default-hang.html

Starting subtest: basic-default-hang
(gem_ringfill:1623) ioctl_wrappers-CRITICAL: Test assertion failure function gem_execbuf, file ../lib/ioctl_wrappers.c:609:
(gem_ringfill:1623) ioctl_wrappers-CRITICAL: Failed assertion: __gem_execbuf(fd, execbuf) == 0
(gem_ringfill:1623) ioctl_wrappers-CRITICAL: error: -5 != 0
Subtest basic-default-hang failed.
**** DEBUG ****
(gem_ringfill:1623) DEBUG: Test requirement passed: !(m->flags & NEWFD && master)
(gem_ringfill:1623) igt_core-DEBUG: Test requirement passed: !igt_run_in_simulation()
(gem_ringfill:1623) ioctl_wrappers-DEBUG: Test requirement passed: gem_has_ring(fd, ring)
(gem_ringfill:1623) DEBUG: Test requirement passed: gem_can_store_dword(fd, ring)
(gem_ringfill:1623) igt_debugfs-DEBUG: Opening debugfs directory '/sys/kernel/debug/dri/0'
(gem_ringfill:1623) DEBUG: Verifying result
(gem_ringfill:1623) DEBUG: Test requirement passed: setup_execbuf(fd, &execbuf, obj, reloc, ring) == 0
(gem_ringfill:1623) ioctl_wrappers-DEBUG: Test requirement passed: gem_has_ring(fd, ring)
(gem_ringfill:1623) i915/gem_context-DEBUG: Test requirement passed: has_ban_period || has_bannable
(gem_ringfill:1623) igt_gt-DEBUG: Test requirement passed: has_gpu_reset(fd)
(gem_ringfill:1623) igt_gt-DEBUG: Test requirement passed: ctx == 0 || has_ctx_exec(fd, ring, ctx)
(gem_ringfill:1623) igt_dummyload-DEBUG: Test requirement passed: nengine
(gem_ringfill:1623) DEBUG: Executing execbuf 4096 times
(gem_ringfill:1623) ioctl_wrappers-CRITICAL: Test assertion failure function gem_execbuf, file ../lib/ioctl_wrappers.c:609:
(gem_ringfill:1623) ioctl_wrappers-CRITICAL: Failed assertion: __gem_execbuf(fd, execbuf) == 0
(gem_ringfill:1623) ioctl_wrappers-CRITICAL: error: -5 != 0
(gem_ringfill:1623) igt_core-INFO: Stack trace:
(gem_ringfill:1623) igt_core-INFO:   #0 ../lib/igt_core.c:1474 __igt_fail_assert()
(gem_ringfill:1623) igt_core-INFO:   #1 ../lib/ioctl_wrappers.c:610 gem_execbuf()
(gem_ringfill:1623) igt_core-INFO:   #2 ../tests/i915/gem_ringfill.c:89 fill_ring()
(gem_ringfill:1623) igt_core-INFO:   #3 ../tests/i915/gem_ringfill.c:227 run_test()
(gem_ringfill:1623) igt_core-INFO:   #4 ../tests/i915/gem_ringfill.c:284 __real_main241()
(gem_ringfill:1623) igt_core-INFO:   #5 ../tests/i915/gem_ringfill.c:241 main()
(gem_ringfill:1623) igt_core-INFO:   #6 ../csu/libc-start.c:344 __libc_start_main()
(gem_ringfill:1623) igt_core-INFO:   #7 [_start+0x2a]
****  END  ****
Subtest basic-default-hang: FAIL (21.368s)
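
For reference, the `error: -5 != 0` line above looks like a negated errno value from the execbuf wrapper, and errno 5 on Linux is EIO, which matches the -EIO discussed in the fix. A minimal sketch of decoding such a value follows; `describe_neg_errno` is a hypothetical helper written for illustration and is not part of IGT.

```c
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper (not an IGT function): turn a negative
 * errno-style return value into a human-readable string.
 * Zero and positive values are reported as "success".
 */
static const char *describe_neg_errno(int ret)
{
    return ret < 0 ? strerror(-ret) : "success";
}
```

Calling `describe_neg_errno(-5)` on a Linux system returns the same string as `strerror(EIO)`.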
Comment 2 Chris Wilson 2019-02-07 23:08:46 UTC
About the only thing of real note there is the issue that execbuf sees the temporary wedging.
Comment 3 Chris Wilson 2019-02-20 16:38:56 UTC
commit c41166f9a145f1c4ce2961b338f9b57495ace4b5 (HEAD -> drm-intel-next-queued, drm-intel/drm-intel-next-queued)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Feb 20 14:56:37 2019 +0000

    drm/i915: Beware temporary wedging when determining -EIO
    
    At a few points in our uABI, we check to see if the driver is wedged and
    report -EIO back to the user in that case. However, as we perform the
    check and reset asynchronously (where once before they were both
    serialised by the struct_mutex), we may instead see the temporary wedging
    used to cancel inflight rendering to avoid a deadlock during reset
    (caused by either us timing out in our reset handler,
    i915_wedge_on_timeout or with malice aforethought in intel_reset_prepare
    for a stuck modeset). If we suspect this is the case, that is we see a
    wedged driver *and* reset in progress, then wait until the reset is
    resolved before reporting upon the wedged status.
    
    v2: might_sleep() (Mika)
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109580
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20190220145637.23503-1-chris@chris-wilson.co.uk
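
The decision described in the commit message can be sketched in userspace C as below. This is only an illustration of the logic under stated assumptions, not the i915 code: `struct fake_gpu_error`, `fake_wait_for_reset` and `fake_terminally_wedged` are hypothetical names, and the wait is modelled as resolving immediately, whereas the real driver would sleep until the reset completes.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's reset state (not an i915 struct). */
struct fake_gpu_error {
    bool wedged;            /* driver currently marked as wedged */
    bool reset_in_progress; /* a reset is still being processed */
};

/*
 * Hypothetical wait: the real driver would sleep here until the reset
 * is resolved. In this sketch the outcome is passed in and applied
 * immediately; a successful reset clears the wedged flag.
 */
static void fake_wait_for_reset(struct fake_gpu_error *err, bool reset_succeeded)
{
    err->reset_in_progress = false;
    if (reset_succeeded)
        err->wedged = false;
}

/*
 * Report -EIO only for a wedge that persists after the reset. If the
 * driver is wedged *and* a reset is in progress, the wedge may be
 * temporary, so wait for the reset before deciding.
 */
static int fake_terminally_wedged(struct fake_gpu_error *err, bool reset_will_succeed)
{
    if (!err->wedged)
        return 0;

    if (err->reset_in_progress)
        fake_wait_for_reset(err, reset_will_succeed);

    return err->wedged ? -EIO : 0;
}
```

In this model, a caller that sees the wedged flag during a reset that later succeeds gets 0 rather than -EIO, which is the behaviour the failing subtest was expecting from execbuf.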
Comment 4 Martin Peres 2019-03-06 16:24:49 UTC
(In reply to Chris Wilson from comment #3)
> commit c41166f9a145f1c4ce2961b338f9b57495ace4b5 (HEAD -> drm-intel-next-queued, drm-intel/drm-intel-next-queued)
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Wed Feb 20 14:56:37 2019 +0000
> 
>     drm/i915: Beware temporary wedging when determining -EIO
>     
>     At a few points in our uABI, we check to see if the driver is wedged and
>     report -EIO back to the user in that case. However, as we perform the
>     check and reset asynchronously (where once before they were both
>     serialised by the struct_mutex), we may instead see the temporary wedging
>     used to cancel inflight rendering to avoid a deadlock during reset
>     (caused by either us timing out in our reset handler,
>     i915_wedge_on_timeout or with malice aforethought in intel_reset_prepare
>     for a stuck modeset). If we suspect this is the case, that is we see a
>     wedged driver *and* reset in progress, then wait until the reset is
>     resolved before reporting upon the wedged status.
>     
>     v2: might_sleep() (Mika)
>     
>     Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109580
>     Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>     Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>     Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>     Link: https://patchwork.freedesktop.org/patch/msgid/20190220145637.23503-1-chris@chris-wilson.co.uk

Thanks, this seems to have done the trick: the failure was seen 7 times in 5 runs, and has now not been seen for 28 runs!
Comment 5 CI Bug Log 2019-03-06 16:24:59 UTC
The CI Bug Log issue associated with this bug has been archived.

New failures matching the above filters will no longer be associated with this bug.

