Bug 110566 - [CI][BAT] [CML only] igt@* - incomplete - timeout/system hang?
Status: NEW
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: highest normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2019-04-30 12:04 UTC by Martin Peres
Modified: 2019-11-05 19:21 UTC

See Also:
i915 platform: CML
i915 features:


Description Martin Peres 2019-04-30 12:04:05 UTC

    
Comment 2 CI Bug Log 2019-05-02 07:09:46 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CML: random tests - incomplete -}
{+ CML: random tests - incomplete +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6010/re-cml-u/igt@gem_tiled_wb.html
  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6012/re-cml-u/igt@gem_tiled_wb.html
Comment 3 CI Bug Log 2019-05-08 15:30:14 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CML: random tests - incomplete -}
{+ CML: random tests - incomplete +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6062/re-cml-u/igt@kms_chamelium@common-hpd-after-suspend.html
Comment 4 CI Bug Log 2019-05-09 06:51:33 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CML: random tests - incomplete -}
{+ CML: random tests - incomplete +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6061/re-cml-u/igt@kms_cursor_crc@cursor-64x64-offscreen.html
Comment 5 CI Bug Log 2019-05-09 11:43:19 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CML: random tests - incomplete -}
{+ CML: random tests - incomplete +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6067/re-cml-u/igt@kms_flip@flip-vs-fences.html
  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6067/re-cml-u/igt@kms_psr@psr2_sprite_plane_move.html
Comment 6 CI Bug Log 2019-05-13 06:19:08 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CML: random tests - incomplete -}
{+ CML: random tests - incomplete +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6072/re-cml-u/igt@gem_pwrite@big-gtt-backwards.html
Comment 7 CI Bug Log 2019-05-15 05:50:52 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CML: random tests - incomplete -}
{+ CML: random tests - incomplete +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_285/fi-cml-u/igt@gem_pwrite@big-cpu-backwards.html
Comment 8 CI Bug Log 2019-05-16 08:07:40 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CML: random tests - incomplete -}
{+ CML: random tests - incomplete +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/re-cml-u/igt@i915_pm_rpm@pm-caching.html
Comment 10 Martin Peres 2019-05-23 13:58:00 UTC
Putting to highest priority, since incompletes are now everywhere!
Comment 11 Stanislav Lisovskiy 2019-08-07 08:16:52 UTC
Different tests, but always the same 2 machines (fi-cml-u, re-cml-u), which is kind of suspicious. Maybe it would make sense to take them out of CI, run the tests manually, and check?
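
One way to check the "same machines" observation is to group the failure links from the filter updates by host. The following is only a minimal sketch: it assumes the result URLs keep the .../<run>/<machine>/igt@<test>.html layout seen in this bug, and the function name group_by_machine is made up for illustration.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_by_machine(urls):
    """Group CI result URLs by machine name.

    Assumes the path ends in <run>/<machine>/igt@<test>.html, as in the
    links posted in this bug. URLs that do not fit are skipped.
    """
    groups = defaultdict(list)
    for url in urls:
        parts = urlparse(url).path.strip("/").split("/")
        if len(parts) < 5 or not parts[-1].startswith("igt@"):
            continue  # not a per-test result page
        run, machine, page = parts[-3], parts[-2], parts[-1]
        test = page[: -len(".html")] if page.endswith(".html") else page
        groups[machine].append((run, test))
    return dict(groups)


if __name__ == "__main__":
    sample = [
        "https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6010/re-cml-u/igt@gem_tiled_wb.html",
        "https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_285/fi-cml-u/igt@gem_pwrite@big-cpu-backwards.html",
    ]
    for machine, hits in sorted(group_by_machine(sample).items()):
        print(machine, len(hits), hits)
```

If a single host dominates the grouping, that would point at the machine or its CI setup rather than at any individual test.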
Comment 12 Jani Saarinen 2019-09-25 15:34:27 UTC
Lakshmi, is this still valid?
Comment 13 Lakshmi 2019-09-25 17:19:03 UTC
(In reply to Jani Saarinen from comment #12)
> Lakshmi, is this still valid?
Yes, it's still happening. But I see that Jenkins gives up in some cases.

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_378/fi-cml-s/igt@gem_ctx_switch@vcs0-heavy.html

Starting subtest: vcs0-heavy
[32.199975] Abort requested by sudo LD_LIBRARY_PATH=/opt/igt/lib:/opt/igt/lib/x86_64-linux-gnu IGT_CI_META_TEST=yes IGT_REBOOT_ON_FATAL_ERROR=yes stdb [1183] via Terminated, terminating children
[32.200923] Abort requested by /sbin/init 3  [1] via Hangup, terminating children
[32.201059] Abort requested by sudo LD_LIBRARY_PATH=/opt/igt/lib:/opt/igt/lib/x86_64-linux-gnu IGT_CI_META_TEST=yes IGT_REBOOT_ON_FATAL_ERROR=yes stdb [1183] via Hangup, terminating children
[32.212075] Closing watchdogs
Done.

We need to go through all the failures and see if we need to split the issue.
Comment 14 Vanshidhar Konda 2019-11-05 19:21:39 UTC
I looked over about 20 logs for this issue. About 60% of them contain the line: "This is power.sh, remotely rebooting this machine."

It seems like the machine was reset remotely during the test - in a number of cases while the test was executing.
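
The manual check above could be repeated over a larger set of logs with something like the sketch below. It is not the tool used for the numbers in this comment. The marker string is the one quoted from the logs; the directory layout, the "*.txt" pattern, and the function names are assumptions for illustration.

```python
from pathlib import Path

# Line quoted from the CI logs in this comment.
MARKER = "This is power.sh, remotely rebooting this machine."


def remote_reboot_count(log_texts, marker=MARKER):
    """Return (logs containing the marker, total logs)."""
    texts = list(log_texts)
    hits = sum(1 for text in texts if marker in text)
    return hits, len(texts)


def scan_directory(log_dir, pattern="*.txt"):
    """Scan downloaded logs in log_dir (hypothetical local layout)."""
    texts = [
        path.read_text(errors="replace")
        for path in sorted(Path(log_dir).glob(pattern))
    ]
    return remote_reboot_count(texts)


if __name__ == "__main__":
    hits, total = scan_directory(".")
    if total:
        print(f"{hits}/{total} logs ({100 * hits / total:.0f}%) mention a remote reboot")
    else:
        print("no logs found")
```

Logs without the marker are the more interesting ones to look at by hand, since those incompletes are not explained by a remote reset.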
