Bug 111925 - [CI][SHARDS]igt@gem_eio@in-flight-contexts-immediate|igt@gem_eio@in-flight-contexts-10ms - fail - Failed assertion: igt_seconds_elapsed(&ts) <= 10
Status: RESOLVED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: low minor
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
Reported: 2019-10-08 16:34 UTC by Lakshmi
Modified: 2019-11-11 16:40 UTC

i915 platform: BYT, HSW, SNB
i915 features: GEM/Other


Description Lakshmi 2019-10-08 16:34:45 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7012/shard-hsw4/igt@gem_eio@in-flight-contexts-10ms.html
Starting subtest: in-flight-contexts-10ms
(gem_eio:1508) CRITICAL: Test assertion failure function trigger_reset, file ../tests/i915/gem_eio.c:85:
(gem_eio:1508) CRITICAL: Failed assertion: igt_seconds_elapsed(&ts) <= 10
(gem_eio:1508) CRITICAL: error: 26 > 10
Subtest in-flight-contexts-10ms failed.

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7011/shard-snb7/igt@gem_eio@in-flight-contexts-immediate.html
Starting subtest: in-flight-contexts-immediate
(gem_eio:3011) CRITICAL: Test assertion failure function trigger_reset, file ../tests/i915/gem_eio.c:85:
(gem_eio:3011) CRITICAL: Failed assertion: igt_seconds_elapsed(&ts) <= 10
(gem_eio:3011) CRITICAL: error: 39 > 10
Subtest in-flight-contexts-immediate failed.
Comment 1 CI Bug Log 2019-10-08 16:35:27 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* SNB BYT HSW: igt@gem_eio@in-flight-contexts-immediate|igt@gem_eio@in-flight-contexts-10ms - fail - Failed assertion: igt_seconds_elapsed(&ts) <= 10
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7011/shard-snb7/igt@gem_eio@in-flight-contexts-immediate.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14671/shard-snb1/igt@gem_eio@in-flight-contexts-immediate.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14677/shard-snb4/igt@gem_eio@in-flight-contexts-immediate.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7012/shard-hsw4/igt@gem_eio@in-flight-contexts-10ms.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7012/shard-snb2/igt@gem_eio@in-flight-contexts-immediate.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14678/shard-snb2/igt@gem_eio@in-flight-contexts-immediate.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_384/fi-byt-j1900/igt@gem_eio@in-flight-contexts-10ms.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_384/fi-snb-2600/igt@gem_eio@in-flight-contexts-immediate.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14681/shard-snb6/igt@gem_eio@in-flight-contexts-immediate.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7021/shard-snb6/igt@gem_eio@in-flight-contexts-immediate.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3544/shard-snb4/igt@gem_eio@in-flight-contexts-immediate.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7024/shard-snb7/igt@gem_eio@in-flight-contexts-immediate.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14686/shard-snb6/igt@gem_eio@in-flight-contexts-immediate.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7026/shard-hsw6/igt@gem_eio@in-flight-contexts-10ms.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3550/shard-snb1/igt@gem_eio@in-flight-contexts-immediate.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3552/shard-snb7/igt@gem_eio@in-flight-contexts-immediate.html
Comment 2 Chris Wilson 2019-10-09 08:14:19 UTC
Hopefully,

commit 998e29e137444db6e0ec2c65611a7eab3e0b6a21
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Oct 7 09:26:28 2019 +0100

    lib/i915: Bump conservative threshold for ring size
    
    We are still hitting the occasional stall upon submission, so be extra
    caution and leave one more spare.
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Comment 3 Lakshmi 2019-10-18 12:17:05 UTC
Still happening
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7124/shard-snb2/igt@gem_eio@in-flight-contexts-immediate.html
Starting subtest: in-flight-contexts-immediate
(gem_eio:4019) CRITICAL: Test assertion failure function trigger_reset, file ../tests/i915/gem_eio.c:90:
(gem_eio:4019) CRITICAL: Failed assertion: igt_seconds_elapsed(&ts) <= 10
(gem_eio:4019) CRITICAL: error: 21 > 10
Subtest in-flight-contexts-immediate failed.
Comment 4 CI Bug Log 2019-10-18 12:19:34 UTC
A CI Bug Log filter associated to this bug has been updated:

{- SNB BYT HSW: igt@gem_eio@in-flight-contexts-immediate|igt@gem_eio@in-flight-contexts-10ms - fail - Failed assertion: igt_seconds_elapsed(&ts) <= 10 -}
{+ SNB BYT HSW: igt@gem_eio@in-flight-contexts-immediate|igt@gem_eio@in-flight-contexts-10ms|igt@gem_eio@in-flight-contexts-1us - fail - Failed assertion: igt_seconds_elapsed(&ts) <= 10 +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7125/shard-snb4/igt@gem_eio@in-flight-contexts-1us.html
Comment 5 Francesco Balestrieri 2019-11-11 11:28:29 UTC
Minor issue according to Chris.
Comment 6 Chris Wilson 2019-11-11 16:40:57 UTC
Next installment of this saga,
commit a64a29e775443e869e4524e5bd3a3427225810dc (upstream/master)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Nov 11 11:36:34 2019 +0000

    i915/gem_eio: Flush RCU before timing our own critical sections
    
    We cannot control how long RCU takes to find a quiescent point as that
    depends upon the background load and so may take an arbitrary time.
    Instead, let's try to avoid that impacting our measurements by inserting
    an rcu_barrier() before our critical timing sections and hope that hides
    the issue, letting us always perform a fast reset. Fwiw, we do the
    expedited RCU synchronize, but that is not always enough.
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

Let's be brave and like all the other times assume that this fixes it for real.
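
For readers following along, the idea in the commit message above (do the unpredictable wait before the clock starts, then time only the reset) can be sketched in plain C. This is an illustration under assumptions, not the code from the patch: seconds_since(), timed_after_flush(), slow_background_flush() and fast_reset() are hypothetical names, and the flush callback merely stands in for whatever the test does to get an rcu_barrier() issued.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Whole seconds elapsed since *start on the monotonic clock. */
static long seconds_since(const struct timespec *start)
{
	struct timespec now;
	long secs;

	clock_gettime(CLOCK_MONOTONIC, &now);
	secs = (long)(now.tv_sec - start->tv_sec);
	if (now.tv_nsec < start->tv_nsec)
		secs--; /* the last second is not complete yet */
	return secs;
}

typedef void (*step_fn)(void *data);

/*
 * Run flush() first, outside the timed window, then time section() alone.
 * Returns true if section() finished within limit_secs whole seconds.
 */
static bool timed_after_flush(step_fn flush, step_fn section,
			      void *data, long limit_secs)
{
	struct timespec ts;

	assert(section != NULL);
	if (flush)
		flush(data); /* may take arbitrarily long; not measured */

	clock_gettime(CLOCK_MONOTONIC, &ts);
	section(data);
	return seconds_since(&ts) <= limit_secs;
}

/* Hypothetical stand-in for a wait that depends on background load. */
static void slow_background_flush(void *data)
{
	struct timespec delay = { .tv_sec = 1, .tv_nsec = 100000000L };

	nanosleep(&delay, NULL);
	++*(int *)data;
}

/* Hypothetical stand-in for the reset that is expected to be fast. */
static void fast_reset(void *data)
{
	++*(int *)data;
}
```

Calling timed_after_flush(slow_background_flush, fast_reset, &calls, 10) mirrors the shape of the failing check, igt_seconds_elapsed(&ts) <= 10. The 1.1 s spent in the flush step does not count against the limit because the timestamp is taken only after it returns.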

