Bug 108680 - [CI][BAT] igt@kms_chamelium may need to wait a little bit for the network to come back up before connecting to chamelium
Status: CLOSED DUPLICATE of bug 108767
Alias: None
Product: DRI
Classification: Unclassified
Component: IGT
Version: XOrg git
Hardware: Other All
Importance: medium normal
Assignee: Petri Latvala
Whiteboard: ReadyForDev
Reported: 2018-11-06 12:07 UTC by Martin Peres
Modified: 2018-12-07 16:19 UTC (History)

Description Martin Peres 2018-11-06 12:07:32 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5091/fi-skl-6700k2/igt@kms_chamelium@common-hpd-after-suspend.html

Starting subtest: common-hpd-after-suspend
Subtest common-hpd-after-suspend: SUCCESS (36.272s)
(kms_chamelium:3059) igt_chamelium-CRITICAL: Test assertion failure function chamelium_rpc, file ../lib/igt_chamelium.c:303:
(kms_chamelium:3059) igt_chamelium-CRITICAL: Failed assertion: !chamelium->env.fault_occurred
(kms_chamelium:3059) igt_chamelium-CRITICAL: Last errno: 113, No route to host
(kms_chamelium:3059) igt_chamelium-CRITICAL: Chamelium RPC call failed: libcurl failed to execute the HTTP POST transaction, explaining:  Failed to connect to 192.168.1.220 port 9992: No route to host
Comment 1 Jani Saarinen 2018-11-09 08:37:18 UTC
From Petri:
"The story for it is that the test suspends a couple of times and then reports success. After that, chamelium's atexit handlers attempt to reset the chamelium to a good default state for other tests, and that fails with 'No route to host'. To be investigated: is the theory correct that the DUT doesn't have an IP when that happens, or that the link is still down? If the DUT's network interface is still down from the suspend, can a wait (using netlink) be added to chamelium_rpc() while the interface is coming up, or can this be made more robust in some other way?"
Comment 2 Petri Latvala 2018-11-09 10:55:14 UTC
Turns out, fi-skl-6700k2's network interface is just prone to completely disappearing on suspend. Work is ongoing to make IGT instead check for working network connectivity after suspending to mitigate this.
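
A minimal sketch of what such a connectivity check could look like, assuming a plain TCP connect to the Chamelium's address and port (192.168.1.220 port 9992 in the log above) is a good enough probe. wait_for_host() and try_connect_once() are hypothetical names for illustration, not the code that went into IGT. The connect() here is blocking, so a single attempt can take as long as the kernel's own connect timeout.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: one blocking TCP connect attempt to ip:port. */
static bool try_connect_once(const char *ip, uint16_t port)
{
	struct sockaddr_in addr;
	bool ok;
	int fd;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_port = htons(port);
	if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
		return false; /* not a valid dotted-quad IPv4 address */

	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return false;

	/* Fails with e.g. EHOSTUNREACH ("No route to host") if the NIC is gone. */
	ok = connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0;
	close(fd);
	return ok;
}

/*
 * Hypothetical helper: retry up to 'attempts' times, sleeping 'delay_ms'
 * between tries. Returns true as soon as one connect succeeds.
 */
static bool wait_for_host(const char *ip, uint16_t port,
			  int attempts, unsigned int delay_ms)
{
	for (int i = 0; i < attempts; i++) {
		if (try_connect_once(ip, port))
			return true;

		if (i + 1 < attempts) {
			struct timespec ts = {
				.tv_sec = delay_ms / 1000,
				.tv_nsec = (long)(delay_ms % 1000) * 1000000L,
			};
			nanosleep(&ts, NULL);
		}
	}
	return false;
}
```

A caller could run something like wait_for_host("192.168.1.220", 9992, 10, 500) after resume and only issue RPC calls if it returns true. A false result would mean the network did not come back within the retry budget (values here are arbitrary).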
Comment 3 Martin Peres 2018-11-09 11:35:00 UTC
Another example of this issue. The watchdog just randomly killed the platform before it was done:

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5110/fi-skl-6700k2/igt@drv_selftest@live_execlists.html
Comment 4 Lakshmi 2018-11-12 12:12:47 UTC
IGT infrastructure issue rather than driver issue.
Comment 5 Jani Saarinen 2018-11-16 11:15:52 UTC
Dropping to low as feature request. 
CI failures on CI due to broken HW or driver on NIC.
Comment 6 Martin Peres 2018-11-16 12:17:30 UTC
(In reply to Jani Saarinen from comment #5)
> Dropping to low as feature request. 
> CI failures on CI due to broken HW or driver on NIC.

Even perfectly-functioning platforms may hit this issue, so raising the priority a bit.
Comment 7 Petri Latvala 2018-11-28 11:18:46 UTC
(In reply to Martin Peres from comment #6)
> (In reply to Jani Saarinen from comment #5)
> > Dropping to low as feature request. 
> > CI failures on CI due to broken HW or driver on NIC.
> 
> Even perfectly-functioning platforms may hit this issue, so raising the
> priority a bit.

But the story as stated by the bug title is bogus, as the network is not coming up. It's dead, a late network. If it wasn't nailed to the perch it would be pushing up the daisies.

In other words, this bug cannot be solved.

Only thing CI can do is have a solution for bug #108767.
Comment 8 Lakshmi 2018-12-07 16:19:58 UTC

*** This bug has been marked as a duplicate of bug 108767 ***

