Bug 105348 - [CI] [KBL only] gt@drv_selftest@live_hangcheck - fail - Failed assertion: err == 0 / incomplete system hang?
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2018-03-05 16:11 UTC by Martin Peres
Modified: 2018-04-11 10:07 UTC
CC List: 1 user

See Also:
i915 platform: KBL
i915 features: GEM/Other


Description Martin Peres 2018-03-05 16:11:16 UTC
(drv_selftest:6078) igt-kmod-CRITICAL: Test assertion failure function igt_kselftest_execute, file igt_kmod.c:513:
(drv_selftest:6078) igt-kmod-CRITICAL: Failed assertion: err == 0
(drv_selftest:6078) igt-kmod-CRITICAL: kselftest "i915 igt__22__live_hangcheck=1 live_selftests=-1 disable_display=1" failed: Input/output error [5]
Subtest live_hangcheck failed.
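
For triage, the kselftest failure line above can be unpacked mechanically into the module name, the module parameters that were passed, and the errno. The following is a minimal sketch in Python, not part of IGT: the regular expression and the helper name `parse_kselftest_failure` are assumptions made for illustration and are only fitted to the single message shape quoted in this report.

```python
import errno
import re

# Failure line copied verbatim from the report above.
LINE = ('(drv_selftest:6078) igt-kmod-CRITICAL: kselftest '
        '"i915 igt__22__live_hangcheck=1 live_selftests=-1 disable_display=1" '
        'failed: Input/output error [5]')

# Hypothetical pattern matching this one message shape; it is not a
# format specification published by IGT.
PATTERN = re.compile(
    r'kselftest "(?P<module>\S+) (?P<params>[^"]*)" '
    r'failed: (?P<msg>.*) \[(?P<code>\d+)\]'
)


def parse_kselftest_failure(line):
    """Split a kselftest failure line into module, parameters and errno."""
    m = PATTERN.search(line)
    if m is None:
        return None
    # Module parameters are space-separated key=value pairs.
    params = dict(p.split('=', 1) for p in m.group('params').split())
    code = int(m.group('code'))
    return {
        'module': m.group('module'),
        'params': params,
        'errno': code,
        # Map the numeric code to its symbolic name via the stdlib table.
        'errno_name': errno.errorcode.get(code, 'UNKNOWN'),
        'message': m.group('msg'),
    }


if __name__ == '__main__':
    print(parse_kselftest_failure(LINE))
```

Applied to the line in this report, the sketch yields module `i915`, the three parameters as passed on the module command line, and errno 5, which the standard errno table names `EIO`. That matches the "Input/output error" text in the log.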

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3870/shard-kbl7/igt@drv_selftest@live_hangcheck.html
Comment 2 Chris Wilson 2018-04-09 20:40:37 UTC
I claim to have fixed the incompletes and kbl failures, as of

commit 028666793a0291b63eb61bae7252345821326a1b
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Mar 30 14:18:01 2018 +0100

    drm/i915/selftests: Avoid repeatedly harming the same innocent context
    
    We don't handle resetting the kernel context very well, or presumably any
    context executing its breadcrumb commands in the ring as opposed to the
    batchbuffer and flush. If we trigger a device reset twice in quick
    succession while the kernel context is executing, we may end up skipping
    the breadcrumb.  This is really only a problem for the selftest as
    normally there is a large interlude between resets (hangcheck), or we
    focus on resetting just one engine and so avoid repeatedly resetting
    innocents.
    
    Something to try would be a preempt-to-idle to quiesce the engine before
    reset, so that innocent contexts would be spared the reset.
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Cc: Michał Winiarski <michal.winiarski@intel.com>
    CC: Michel Thierry <michel.thierry@intel.com>
    Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20180330131801.18327-1-chris@chris-wilson.co.uk

