Bug 110917 - [CI][BAT] igt@i915_selftest@live_hangcheck - dmesg-fail - i915/intel_hangcheck_live_selftests: igt_reset_engines failed with error -110
Summary: [CI][BAT] igt@i915_selftest@live_hangcheck - dmesg-fail - i915/intel_hangchec...
Status: RESOLVED WORKSFORME
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: high normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2019-06-14 06:17 UTC by Martin Peres
Modified: 2019-08-06 05:07 UTC

See Also:
i915 platform: ICL
i915 features: GEM/Other


Description Martin Peres 2019-06-14 06:17:23 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6265/fi-icl-y/igt@i915_selftest@live_hangcheck.html

<7> [593.368203] [drm:i915_reset_engine [i915]] Failed to reset vcs2, ret=-110
<3> [593.368380] i915_reset_engine(vcs2:idle): failed, err=-110
<6> [593.368450] i915_reset_engine(vcs2:idle): 35 resets
<3> [593.368453] i915_reset_engine(vcs2:idle): reset 35 times, but reported 36
<3> [593.368461] i915/intel_hangcheck_live_selftests: igt_reset_engines failed with error -110
<3> [593.575953] intel_hangcheck_live_selftests+0xa1/0xd0 [i915] timed out, cancelling all further testing.
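
For readers triaging similar logs, here is a minimal sketch of how the excerpt above can be parsed. It assumes the `<level> [timestamp] message` layout seen in these lines; the regular expressions and function names are illustrative, not part of any CI tooling. On Linux, errno 110 is ETIMEDOUT, so `-110` means the reset timed out. The leading number is the kernel log level (3 is error, 6 is info, 7 is debug).

```python
import re

# Linux errno value: -110 in kernel logs is -ETIMEDOUT.
ETIMEDOUT = 110

# Layout assumed from the excerpt: "<level> [timestamp] message".
LINE_RE = re.compile(r"^<(\d)> \[\s*(\d+\.\d+)\] (.*)$")
# Matches "ret=-110", "err=-110" and "error -110" as they appear in the excerpt.
ERR_RE = re.compile(r"(?:err|ret|error)[= ](-\d+)")


def parse_line(line):
    """Split one log line into (level, timestamp, message), or None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    return int(m.group(1)), float(m.group(2)), m.group(3)


def error_codes(lines):
    """Collect the negative error codes mentioned in the messages."""
    codes = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        codes.extend(int(c) for c in ERR_RE.findall(parsed[2]))
    return codes


LOG = [
    "<7> [593.368203] [drm:i915_reset_engine [i915]] Failed to reset vcs2, ret=-110",
    "<3> [593.368380] i915_reset_engine(vcs2:idle): failed, err=-110",
    "<3> [593.368461] i915/intel_hangcheck_live_selftests: igt_reset_engines failed with error -110",
]

codes = error_codes(LOG)
# True when every reported code is a timeout.
timed_out = all(-c == ETIMEDOUT for c in codes)
```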
Comment 1 CI Bug Log 2019-06-14 06:18:52 UTC
The CI Bug Log issue associated with this bug has been updated.

### New filters associated

* igt@i915_selftest@live_hangcheck - dmesg-fail - i915/intel_hangcheck_live_selftests: igt_reset_engines failed with error -110
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_4399/fi-icl-dsi/igt@i915_selftest@live_hangcheck.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6265/fi-icl-y/igt@i915_selftest@live_hangcheck.html
Comment 2 Francesco Balestrieri 2019-06-18 13:57:52 UTC
From a chat with Chris (incorrectness to be blamed on me paraphrasing):

The test repeatedly resets the GPU and checks that we can execute a request afterwards. The fact that it fails isn't so critical for users, and may even be caused by us using a timeout that is too short. However, there have been other issues with reset (see https://bugs.freedesktop.org/show_bug.cgi?id=110683) that could be related. Also, in passing runs ICL completes fewer than 100 resets, while other platforms complete more than 1000.
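
A rough sketch of the loop described above, assuming it is bounded by time: reset, check that a request still completes, repeat until the time budget runs out. This is not the i915 selftest code. `run_reset_loop`, `FakeEngine` and the per-reset costs are hypothetical and only illustrate why a platform with slower resets ends up with a much lower reset count in the same time window.

```python
def run_reset_loop(reset, request_completes, now, duration_s):
    """Reset repeatedly for duration_s; after each reset, check that a request completes.

    Returns (number of successful resets, error string or None).
    """
    deadline = now() + duration_s
    resets = 0
    while now() < deadline:
        if not reset():
            return resets, "reset failed"
        resets += 1
        if not request_completes():
            return resets, "request did not complete"
    return resets, None


class FakeEngine:
    """Hypothetical stand-in with a simulated clock; each reset costs reset_cost_s seconds."""

    def __init__(self, reset_cost_s, fail_on=None):
        self.clock = 0.0
        self.reset_cost_s = reset_cost_s
        self.fail_on = fail_on  # 1-based attempt number that should fail, if any
        self.attempts = 0

    def now(self):
        return self.clock

    def reset(self):
        self.attempts += 1
        self.clock += self.reset_cost_s
        return self.attempts != self.fail_on

    def request_completes(self):
        return True


# Illustrative costs only, not measurements from any platform.
slow = FakeEngine(reset_cost_s=0.5)
fast = FakeEngine(reset_cost_s=0.125)
failing = FakeEngine(reset_cost_s=0.5, fail_on=3)

slow_result = run_reset_loop(slow.reset, slow.request_completes, slow.now, 10.0)
fast_result = run_reset_loop(fast.reset, fast.request_completes, fast.now, 10.0)
failing_result = run_reset_loop(failing.reset, failing.request_completes, failing.now, 10.0)
```

In this toy model the slower engine finishes a quarter as many resets in the same window, and a single failed reset ends the loop early with the count reached so far.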
Comment 3 James Ausmus 2019-06-19 15:13:48 UTC
Set to "high" priority to reflect assessment done by Francesco's team
Comment 4 Francesco Balestrieri 2019-07-30 04:27:21 UTC
So far this has occurred twice in two consecutive runs on icl-y, then not again for a month/68 runs. Let's keep monitoring it to see if it happens again and decide next steps then.
Comment 5 Francesco Balestrieri 2019-08-06 05:07:24 UTC
Not seen in 90 runs / 1 month, 3 weeks. Closing.

