Bug 110566

Summary: [CI][BAT] [CML only] igt@* - incomplete - timeout/system hang?
Product: DRI Reporter: Martin Peres <martin.peres>
Component: DRM/Intel Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: RESOLVED MOVED QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal    
Priority: highest CC: intel-gfx-bugs, radoslaw.szwichtenberg, tomi.p.sarvela
Version: XOrg git   
Hardware: Other   
OS: All   
Whiteboard: ReadyForDev
i915 platform: CML i915 features: CI Infra

Description Martin Peres 2019-04-30 12:04:05 UTC

    
Comment 2 CI Bug Log 2019-05-02 07:09:46 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CML: random tests - incomplete -}
{+ CML: random tests - incomplete +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6010/re-cml-u/igt@gem_tiled_wb.html
  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6012/re-cml-u/igt@gem_tiled_wb.html
Comment 3 CI Bug Log 2019-05-08 15:30:14 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CML: random tests - incomplete -}
{+ CML: random tests - incomplete +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6062/re-cml-u/igt@kms_chamelium@common-hpd-after-suspend.html
Comment 4 CI Bug Log 2019-05-09 06:51:33 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CML: random tests - incomplete -}
{+ CML: random tests - incomplete +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6061/re-cml-u/igt@kms_cursor_crc@cursor-64x64-offscreen.html
Comment 5 CI Bug Log 2019-05-09 11:43:19 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CML: random tests - incomplete -}
{+ CML: random tests - incomplete +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6067/re-cml-u/igt@kms_flip@flip-vs-fences.html
  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6067/re-cml-u/igt@kms_psr@psr2_sprite_plane_move.html
Comment 6 CI Bug Log 2019-05-13 06:19:08 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CML: random tests - incomplete -}
{+ CML: random tests - incomplete +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6072/re-cml-u/igt@gem_pwrite@big-gtt-backwards.html
Comment 7 CI Bug Log 2019-05-15 05:50:52 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CML: random tests - incomplete -}
{+ CML: random tests - incomplete +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_285/fi-cml-u/igt@gem_pwrite@big-cpu-backwards.html
Comment 8 CI Bug Log 2019-05-16 08:07:40 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CML: random tests - incomplete -}
{+ CML: random tests - incomplete +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/re-cml-u/igt@i915_pm_rpm@pm-caching.html
Comment 10 Martin Peres 2019-05-23 13:58:00 UTC
Putting to highest priority, since incompletes are now everywhere!
Comment 11 Stanislav Lisovskiy 2019-08-07 08:16:52 UTC
Different tests, but always the same 2 machines (fi-cml-u, re-cml-u) - kinda suspicious. Maybe it would make sense to take them out of CI to run the tests manually and check?
Comment 12 Jani Saarinen 2019-09-25 15:34:27 UTC
Lakshmi, is this still valid?
Comment 13 Lakshmi 2019-09-25 17:19:03 UTC
(In reply to Jani Saarinen from comment #12)
> Lakshmi, is this still valid?
Yes, it's still happening. But I see that Jenkins gives up in some cases.

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_378/fi-cml-s/igt@gem_ctx_switch@vcs0-heavy.html

Starting subtest: vcs0-heavy
[32.199975] Abort requested by sudo LD_LIBRARY_PATH=/opt/igt/lib:/opt/igt/lib/x86_64-linux-gnu IGT_CI_META_TEST=yes IGT_REBOOT_ON_FATAL_ERROR=yes stdb [1183] via Terminated, terminating children
[32.200923] Abort requested by /sbin/init 3  [1] via Hangup, terminating children
[32.201059] Abort requested by sudo LD_LIBRARY_PATH=/opt/igt/lib:/opt/igt/lib/x86_64-linux-gnu IGT_CI_META_TEST=yes IGT_REBOOT_ON_FATAL_ERROR=yes stdb [1183] via Hangup, terminating children
[32.212075] Closing watchdogs
Done.

We need to go through all the failures and see if we need to split the issue.
Comment 14 Vanshidhar Konda 2019-11-05 19:21:39 UTC
I looked over about 20 logs for this issue. For about 60% of the logs there is a line saying: This is power.sh, remotely rebooting this machine.

It seems like the machine was reset remotely during the test - in a number of cases while the test was executing.
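The check described above could be scripted roughly as follows. This is only a minimal sketch, assuming the logs have already been downloaded as plain text; the function name and the sample log contents are hypothetical, and only the marker string is taken from the logs.

```python
# Marker line quoted in the comment above; everything else here is illustrative.
REBOOT_MARKER = "This is power.sh, remotely rebooting this machine"


def remote_reboot_ratio(logs):
    """Return the fraction of log texts that contain the power.sh reboot marker."""
    if not logs:
        return 0.0
    hits = sum(1 for text in logs if REBOOT_MARKER in text)
    return hits / len(logs)


# Hypothetical sample data, not taken from real CI runs.
sample_logs = [
    "Starting subtest: vcs0-heavy\nThis is power.sh, remotely rebooting this machine.\n",
    "Starting subtest: basic\nSubtest basic: SUCCESS\n",
]
print(remote_reboot_ratio(sample_logs))
```

A high ratio would only show that the marker is common in the logs; it would not by itself say why the reboot was triggered.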
Comment 15 Jani Saarinen 2019-11-18 07:05:24 UTC
Tomi, is there something that can be done in CI for this?
Comment 16 Lakshmi 2019-11-18 12:35:41 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_396/fi-cml-s/igt_runner6.txt
Subtest pipe-A-overlay-size-64: SUCCESS (2.454s)
[37.463866] [11/99] (945s left) kms_plane (plane-panning-top-left-pipe-b-planes)
Starting subtest: plane-panning-top-left-pipe-B-planes
[41.235917] Abort requested by /sbin/init 3 [1] via Hangup, terminating children
[41.237585] Abort requested by sudo LD_LIBRARY_PATH=/opt/igt/lib:/opt/igt/lib/x86_64-linux-gnu IGT_CI_META_TEST=yes IGT_REBOOT_ON_FATAL_ERROR=yes stdb [1215] via Terminated, terminating children
[42.417619] Closing watchdogs
Done.

These incompletes should be fixed as part of bug 111747.
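
When going through many runner logs like the excerpt above, the "Abort requested by" lines could be extracted with a small script. This is a rough sketch, assuming the line format seen in the excerpts in this bug; the regular expression, the function name, and the tuple layout are assumptions made for illustration and are not part of igt_runner.

```python
import re

# Pattern inferred from the excerpts in this bug; the real format may vary.
ABORT_RE = re.compile(
    r"^\[(?P<ts>\d+\.\d+)\] Abort requested by (?P<requester>.+?)\s+"
    r"\[(?P<pid>\d+)\] via (?P<signal>\w+), terminating children$"
)


def parse_aborts(log_text):
    """Return (timestamp, first word of requester, pid, signal) per abort line."""
    events = []
    for line in log_text.splitlines():
        m = ABORT_RE.match(line.strip())
        if m:
            command = m["requester"].split()[0]
            events.append((float(m["ts"]), command, int(m["pid"]), m["signal"]))
    return events


# Lines copied from the comment 16 excerpt.
excerpt = """\
Starting subtest: plane-panning-top-left-pipe-B-planes
[41.235917] Abort requested by /sbin/init 3 [1] via Hangup, terminating children
[41.237585] Abort requested by sudo LD_LIBRARY_PATH=/opt/igt/lib:/opt/igt/lib/x86_64-linux-gnu IGT_CI_META_TEST=yes IGT_REBOOT_ON_FATAL_ERROR=yes stdb [1215] via Terminated, terminating children
[42.417619] Closing watchdogs
"""
for event in parse_aborts(excerpt):
    print(event)
```

Grouping the results by requester and signal across runs may help when deciding whether the issue needs to be split.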
Comment 17 Martin Peres 2019-11-29 19:05:53 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/intel/issues/283.
