Bug 110566 - [CI][BAT] [CML only] igt@* - incomplete - timeout/system hang?
Summary: [CI][BAT] [CML only] igt@* - incomplete - timeout/system hang?
Status: RESOLVED MOVED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: highest normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2019-04-30 12:04 UTC by Martin Peres
Modified: 2019-11-29 19:05 UTC
CC List: 3 users

See Also:
i915 platform: CML
i915 features: CI Infra


Description Martin Peres 2019-04-30 12:04:05 UTC

    
Comment 2 CI Bug Log 2019-05-02 07:09:46 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CML: random tests - incomplete -}
{+ CML: random tests - incomplete +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6010/re-cml-u/igt@gem_tiled_wb.html
  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6012/re-cml-u/igt@gem_tiled_wb.html
Comment 3 CI Bug Log 2019-05-08 15:30:14 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CML: random tests - incomplete -}
{+ CML: random tests - incomplete +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6062/re-cml-u/igt@kms_chamelium@common-hpd-after-suspend.html
Comment 4 CI Bug Log 2019-05-09 06:51:33 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CML: random tests - incomplete -}
{+ CML: random tests - incomplete +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6061/re-cml-u/igt@kms_cursor_crc@cursor-64x64-offscreen.html
Comment 5 CI Bug Log 2019-05-09 11:43:19 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CML: random tests - incomplete -}
{+ CML: random tests - incomplete +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6067/re-cml-u/igt@kms_flip@flip-vs-fences.html
  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6067/re-cml-u/igt@kms_psr@psr2_sprite_plane_move.html
Comment 6 CI Bug Log 2019-05-13 06:19:08 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CML: random tests - incomplete -}
{+ CML: random tests - incomplete +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6072/re-cml-u/igt@gem_pwrite@big-gtt-backwards.html
Comment 7 CI Bug Log 2019-05-15 05:50:52 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CML: random tests - incomplete -}
{+ CML: random tests - incomplete +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_285/fi-cml-u/igt@gem_pwrite@big-cpu-backwards.html
Comment 8 CI Bug Log 2019-05-16 08:07:40 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CML: random tests - incomplete -}
{+ CML: random tests - incomplete +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/re-cml-u/igt@i915_pm_rpm@pm-caching.html
Comment 10 Martin Peres 2019-05-23 13:58:00 UTC
Raising this to highest priority, since incompletes are now everywhere!
Comment 11 Stanislav Lisovskiy 2019-08-07 08:16:52 UTC
Different tests, but always the same two machines (fi-cml-u, re-cml-u), which is kind of suspicious. Maybe it would make sense to take them out of CI, run the tests manually, and check?
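
To check the "same machines" observation without reading every report by hand, the failure URLs posted by the CI Bug Log could be grouped by machine. The snippet below is only a rough sketch: it assumes the machine name and test name are always the last two path components of the result URL (as in the links in this bug), and the helper name `machine_and_test` is made up for illustration.

```python
from collections import Counter
from urllib.parse import urlparse


def machine_and_test(url):
    """Split a CI result URL into (machine, test).

    Assumption: the URL path ends in .../<machine>/<test>.html, which holds
    for the links quoted in this bug but is not a documented guarantee.
    """
    parts = urlparse(url).path.strip("/").split("/")
    machine, page = parts[-2], parts[-1]
    test = page[: -len(".html")] if page.endswith(".html") else page
    return machine, test


# A few of the URLs reported in the comments above.
urls = [
    "https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6010/re-cml-u/igt@gem_tiled_wb.html",
    "https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_285/fi-cml-u/igt@gem_pwrite@big-cpu-backwards.html",
    "https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/re-cml-u/igt@i915_pm_rpm@pm-caching.html",
]

# Count failures per machine to see whether they cluster on specific hosts.
per_machine = Counter(machine_and_test(u)[0] for u in urls)
for machine, count in sorted(per_machine.items()):
    print(machine, count)
```

If the counts stay concentrated on one or two hosts across many runs, that would support pulling those machines for manual testing.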
Comment 12 Jani Saarinen 2019-09-25 15:34:27 UTC
Lakshmi, is this still valid?
Comment 13 Lakshmi 2019-09-25 17:19:03 UTC
(In reply to Jani Saarinen from comment #12)
> Lakshmi, is this still valid?
Yes, it's still happening. But I see that Jenkins gives up in some cases.

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_378/fi-cml-s/igt@gem_ctx_switch@vcs0-heavy.html

Starting subtest: vcs0-heavy
[32.199975] Abort requested by sudo LD_LIBRARY_PATH=/opt/igt/lib:/opt/igt/lib/x86_64-linux-gnu IGT_CI_META_TEST=yes IGT_REBOOT_ON_FATAL_ERROR=yes stdb [1183] via Terminated, terminating children
[32.200923] Abort requested by /sbin/init 3  [1] via Hangup, terminating children
[32.201059] Abort requested by sudo LD_LIBRARY_PATH=/opt/igt/lib:/opt/igt/lib/x86_64-linux-gnu IGT_CI_META_TEST=yes IGT_REBOOT_ON_FATAL_ERROR=yes stdb [1183] via Hangup, terminating children
[32.212075] Closing watchdogs
Done.

We need to go through all the failures and see if we need to split the issue.
Comment 14 Vanshidhar Konda 2019-11-05 19:21:39 UTC
I looked over about 20 logs for this issue. About 60% of the logs contain a line saying: "This is power.sh, remotely rebooting this machine."

It seems like the machine was reset remotely during the test - in a number of cases while the test was executing.
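
For anyone wanting to repeat this check on more logs, here is a minimal sketch of the counting step. It assumes the logs have already been downloaded and read into a dict of name to text; the function name `reboot_fraction` and the two sample logs are made-up placeholders, not real CI output.

```python
# Marker line quoted from the logs in this bug.
MARKER = "This is power.sh, remotely rebooting this machine."


def reboot_fraction(logs):
    """Return (hits, total, fraction) for logs that contain MARKER.

    `logs` maps a log name to its full text. How the logs are fetched is
    left out on purpose; this only does the counting.
    """
    total = len(logs)
    hits = sum(1 for text in logs.values() if MARKER in text)
    return hits, total, (hits / total if total else 0.0)


# Placeholder data for illustration only, not taken from real runs.
sample_logs = {
    "log_a.txt": "Starting subtest: foo\n" + MARKER + "\n",
    "log_b.txt": "Starting subtest: bar\nClosing watchdogs\nDone.\n",
}

hits, total, fraction = reboot_fraction(sample_logs)
print(f"{hits}/{total} logs contain the reboot marker ({fraction:.0%})")
```

Logs without the marker would then be the ones worth reading by hand, since they were presumably not cut short by a remote reboot.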
Comment 15 Jani Saarinen 2019-11-18 07:05:24 UTC
Tomi, is there something that can be done in CI for this?
Comment 16 Lakshmi 2019-11-18 12:35:41 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_396/fi-cml-s/igt_runner6.txt
Subtest pipe-A-overlay-size-64: SUCCESS (2.454s)
[37.463866] [11/99] (945s left) kms_plane (plane-panning-top-left-pipe-b-planes)
Starting subtest: plane-panning-top-left-pipe-B-planes
[41.235917] Abort requested by /sbin/init 3 [1] via Hangup, terminating children
[41.237585] Abort requested by sudo LD_LIBRARY_PATH=/opt/igt/lib:/opt/igt/lib/x86_64-linux-gnu IGT_CI_META_TEST=yes IGT_REBOOT_ON_FATAL_ERROR=yes stdb [1215] via Terminated, terminating children
[42.417619] Closing watchdogs
Done.
These incompletes should be fixed as part of bug 111747.
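
When going through many of these runner logs, it may help to pull out the "Abort requested by" lines automatically. The following is a sketch only: the regular expression is derived from the two abort lines quoted above and may not cover every variant the runner can print, and `parse_aborts` is a hypothetical helper name.

```python
import re

# Pattern inferred from the abort lines quoted in this bug; treat it as an
# assumption about the log format, not a specification.
ABORT_RE = re.compile(
    r"^\[(?P<time>\d+\.\d+)\] Abort requested by (?P<requester>.+?)\s+"
    r"\[(?P<pid>\d+)\] via (?P<signal>\w+), terminating children$"
)


def parse_aborts(log_text):
    """Return a list of (timestamp, program, pid, signal) for each abort line."""
    aborts = []
    for line in log_text.splitlines():
        match = ABORT_RE.match(line.strip())
        if match:
            # Keep only the first word of the requester (e.g. "sudo").
            program = match["requester"].split()[0]
            aborts.append(
                (float(match["time"]), program, int(match["pid"]), match["signal"])
            )
    return aborts


# Lines copied from the log excerpt in this comment.
log_text = """\
Starting subtest: plane-panning-top-left-pipe-B-planes
[41.235917] Abort requested by /sbin/init 3 [1] via Hangup, terminating children
[41.237585] Abort requested by sudo LD_LIBRARY_PATH=/opt/igt/lib:/opt/igt/lib/x86_64-linux-gnu IGT_CI_META_TEST=yes IGT_REBOOT_ON_FATAL_ERROR=yes stdb [1215] via Terminated, terminating children
[42.417619] Closing watchdogs
Done.
"""

for entry in parse_aborts(log_text):
    print(entry)
```

Grouping logs by which program requested the abort, and via which signal, could be one way to decide whether the failures need to be split into separate issues.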
Comment 17 Martin Peres 2019-11-29 19:05:53 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/intel/issues/283.

