Bug 109483 - [CI][DRMTIP] igt@kms_chamelium@dp-edid-read - warn - Chamelium RPC call failed: RPC failed at server. <class 'chameleond.utils.i2c.I2cBusError'>:I2C access error
Status: NEW
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: medium normal
Assignee: Stuart Summers
QA Contact: Intel GFX Bugs mailing list
Whiteboard: ReadyForDev
Depends on:
Reported: 2019-01-28 14:28 UTC by Martin Peres
Modified: 2019-02-27 15:58 UTC
CC List: 3 users

See Also:
i915 platform: KBL
i915 features: display/Other


Description Martin Peres 2019-01-28 14:28:45 UTC

Starting subtest: dp-edid-read
Subtest dp-edid-read: SUCCESS (2.015s)
(kms_chamelium:2873) igt_chamelium-CRITICAL: Test assertion failure function chamelium_rpc, file ../lib/igt_chamelium.c:303:
(kms_chamelium:2873) igt_chamelium-CRITICAL: Failed assertion: !chamelium->env.fault_occurred
(kms_chamelium:2873) igt_chamelium-CRITICAL: Chamelium RPC call failed: RPC failed at server.  <class 'chameleond.utils.i2c.I2cBusError'>:I2C access error
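
The fault string in the last log line carries the server-side exception class and its message. As a minimal Python sketch for triaging such logs, assuming the string always has the `<class '...'>:message` shape seen above (the helper name `parse_chamelium_fault` is hypothetical and is not part of IGT or chameleond), the two parts could be separated like this:

```python
import re

# Pattern inferred from the single log line in this report (an assumption);
# fault strings with a different shape will simply not match.
_FAULT_RE = re.compile(r"<class '(?P<cls>[\w.]+)'>:(?P<msg>.*)$")


def parse_chamelium_fault(fault_string):
    """Return (exception_class, message), or None if the shape differs."""
    match = _FAULT_RE.search(fault_string)
    if match is None:
        return None
    return match.group("cls"), match.group("msg").strip()


line = ("Chamelium RPC call failed: RPC failed at server.  "
        "<class 'chameleond.utils.i2c.I2cBusError'>:I2C access error")
print(parse_chamelium_fault(line))
```

Grouping CI results by the extracted class name would make it easier to tell I2C bus errors apart from other server-side faults.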
Comment 1 CI Bug Log 2019-01-28 14:31:25 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* CHAMELIUM: igt@kms_chamelium@dp-edid-read - warn - Chamelium RPC call failed: RPC failed at server.  <class 'chameleond.utils.i2c.I2cBusError'>:I2C access error
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5488/fi-kbl-7500u/igt@kms_chamelium@dp-edid-read.html
Comment 2 Stuart Summers 2019-02-08 22:01:30 UTC
From an initial analysis of the logs, the tests are passing, but we appear to be getting an out-of-band error during DP hotplug: an error code sent to IGT in response to a command issued via XML-RPC.

I don't see anything obvious in dmesg at this point that shows when the error occurred relative to the passing dp-edid-read subtest. The two commands sent as part of dp-edid-read are "ApplyEdid" and "Plug", so perhaps the plug event is taking too long to return or is hitting some unexpected I2C delays?

I do see this around the time of the failure (dmesg):
<7>[  145.024275] [drm:i915_hotplug_work_func [i915]] Connector DP-1 (pin 5) received hotplug event.
<7>[  145.024306] [drm:intel_dp_detect [i915]] [CONNECTOR:85:DP-1]
<7>[  145.024577] [drm:drm_fb_helper_hotplug_event.part.24] 
<7>[  145.024627] [drm:drm_setup_crtcs] 
<7>[  145.025115] [drm:intel_dp_read_dpcd [i915]] DPCD: 11 0a 84 01 01 00 01 00 02 00 00 00 00 00 00
<7>[  145.025517] [drm:intel_dp_print_rates [i915]] source rates: 162000, 216000, 270000, 324000, 432000, 540000
<7>[  145.025558] [drm:intel_dp_print_rates [i915]] sink rates: 162000, 270000
<7>[  145.025598] [drm:intel_dp_print_rates [i915]] common rates: 162000, 270000
<7>[  145.026021] [drm:drm_dp_read_desc] DP sink: OUI 00-00-00 dev-ID  HW-rev 0.0 SW-rev 0.0 quirks 0x0000
<7>[  145.026062] [drm:intel_dp_detect [i915]] MST support? port B: yes, sink: no, modparam: yes
<7>[  145.029336] [drm:drm_dp_i2c_xfer] Partial I2C reply: requested 16 bytes got 2 bytes
<7>[  145.032999] [drm:drm_dp_i2c_xfer] Partial I2C reply: requested 2 bytes got 1 bytes
<7>[  145.033493] [drm:drm_dp_i2c_do_msg] I2C defer
<7>[  145.067560] [drm:drm_dp_i2c_xfer] Partial I2C reply: requested 16 bytes got 2 bytes

Specifically, those last few lines occur only here in the dmesg log (this is the only time we get an I2C defer from the Chamelium).
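
To check whether such lines really occur only once in a given log, a small Python sketch along these lines could scan a saved dmesg dump. It assumes the `<level>[ timestamp]` prefix shown in the excerpt above and matches only the two message texts quoted there; the function name `find_i2c_anomalies` is made up for illustration.

```python
import re

# Matches the "<7>[  145.033493]" prefix seen in the excerpt above.
_TS_RE = re.compile(r"^<\d+>\[\s*(\d+\.\d+)\]")


def find_i2c_anomalies(dmesg_lines):
    """Return (timestamp, kind) for each I2C defer or partial-reply line."""
    hits = []
    for line in dmesg_lines:
        match = _TS_RE.match(line)
        if match is None:
            continue  # not a timestamped kernel log line
        if "I2C defer" in line:
            kind = "defer"
        elif "Partial I2C reply" in line:
            kind = "partial"
        else:
            continue
        hits.append((float(match.group(1)), kind))
    return hits


sample = [
    "<7>[  145.026062] [drm:intel_dp_detect [i915]] MST support? port B: yes, sink: no, modparam: yes",
    "<7>[  145.029336] [drm:drm_dp_i2c_xfer] Partial I2C reply: requested 16 bytes got 2 bytes",
    "<7>[  145.032999] [drm:drm_dp_i2c_xfer] Partial I2C reply: requested 2 bytes got 1 bytes",
    "<7>[  145.033493] [drm:drm_dp_i2c_do_msg] I2C defer",
    "<7>[  145.067560] [drm:drm_dp_i2c_xfer] Partial I2C reply: requested 16 bytes got 2 bytes",
]
for timestamp, kind in find_i2c_anomalies(sample):
    print(timestamp, kind)
```

Running the same scan over logs from passing and failing runs would show whether the defer correlates with the RPC fault or also appears in clean runs.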

I am planning to work on this further early next week when I get access to a Chamelium device to try to reproduce.
Comment 3 CI Bug Log 2019-02-11 12:50:58 UTC
A CI Bug Log filter associated to this bug has been updated:

{- CHAMELIUM: igt@kms_chamelium@dp-edid-read - warn - Chamelium RPC call failed: RPC failed at server.  <class 'chameleond.utils.i2c.I2cBusError'>:I2C access error -}
{+ CHAMELIUM: igt@kms_chamelium@dp-edid-read - warn / fail - Chamelium RPC call failed: RPC failed at server.  <class '...'>:I2C access error +}

New failures caught by the filter:

* https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5586/fi-kbl-7500u/igt@kms_chamelium@dp-edid-read.html
Comment 4 Stuart Summers 2019-02-16 00:54:09 UTC
The latest failure is interesting, since it actually results in a failed test. From the logs, it's a little unclear in the warn case why we are getting an RPC call to the Chamelium outside of a running test. The assert location is the same as the failure case, so I'd expect the test to fail in the first case as well.

That said, given that 1) we don't have full logs for the successful case and 2) we don't have *any* logs coming off of the Chamelium itself, there isn't a lot to go on. I have a patch up on igt-dev which attempts to extract a little more information from the CI log, although it will only hit in a failure case. Unfortunately the patch requires a new package be added to the CI system. I have a request for this in i915-infra: https://gitlab.freedesktop.org/gfx-ci/i915-infra/issues/29.

Next steps here are to:
1) Reproduce manually
   - I haven't been successful there since this issue seems specifically related to DP connectors while my test system only has an HDMI connector. I'll look into this next week.
2) Add better logging capability in these kms_chamelium tests.
   - This is probably better in the long term anyway to reduce debug time of these types of problems.
   - This logging can also come from IGT or from the chameleond daemon running on the Chamelium board itself. If I can't get traction in IGT, I'll look into updating chameleond.
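
As an illustration of the kind of per-call logging meant in step 2, here is a rough Python sketch of a client-side wrapper that records each RPC name, its arguments, its duration, and any fault. It is not how IGT (which is C) or chameleond implement anything; the `LoggedProxy` class is hypothetical, and the argument lists of `ApplyEdid` and `Plug` in the usage comment are assumptions.

```python
import logging
import time
import xmlrpc.client

log = logging.getLogger("chamelium_rpc")


class LoggedProxy:
    """Wrap an RPC proxy so every call is logged with duration and outcome."""

    def __init__(self, proxy, clock=time.monotonic):
        self._proxy = proxy
        self._clock = clock

    def __getattr__(self, name):
        # Only reached for names not defined on LoggedProxy itself,
        # i.e. the remote method names.
        method = getattr(self._proxy, name)

        def call(*args):
            start = self._clock()
            try:
                result = method(*args)
            except xmlrpc.client.Fault as fault:
                log.error("%s%r faulted after %.3fs: %s",
                          name, args, self._clock() - start, fault.faultString)
                raise  # keep the original failure behaviour
            log.debug("%s%r ok in %.3fs", name, args, self._clock() - start)
            return result

        return call


# Hypothetical usage; the URL and the argument lists are placeholders:
#   chamelium = LoggedProxy(xmlrpc.client.ServerProxy("http://<chamelium-host>:<port>"))
#   chamelium.ApplyEdid(port_id, edid_id)
#   chamelium.Plug(port_id)
```

With timing recorded for every call, a slow `Plug` would be visible even when the subtest itself passes.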

Another thing that might be interesting is to start producing debug logging even for tests which succeed, at least for CI. I'll check whether that is feasible and, if it looks promising, file another i915-infra request.
Comment 5 Stuart Summers 2019-02-16 00:59:33 UTC
Added one more i915-infra request: https://gitlab.freedesktop.org/gfx-ci/i915-infra/issues/31 to add more information for passing tests. I'll follow up there.
Comment 6 Stuart Summers 2019-02-27 15:58:45 UTC
Bumping to Medium given the issue seems most likely to be in the interaction between IGT and Chamelium.
