Bug 110271

Summary: [CI][SHARDS] igt@i915_pm_rpm@i2c - dmesg-fail - Failed assertion: diff <= vga_outputs && diff >= 0, Last errno: 121, Remote I/O error
Product: DRI
Reporter: Martin Peres <martin.peres>
Component: DRM/Intel
Assignee: Imre Deak <imre.deak>
Status: RESOLVED FIXED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal
Priority: highest
CC: intel-gfx-bugs
Version: XOrg git
Hardware: Other
OS: All
Whiteboard: ReadyForDev
i915 platform: ICL
i915 features: power/runtime PM

Description Martin Peres 2019-03-28 12:44:23 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5825/shard-iclb2/igt@i915_pm_rpm@i2c.html

Failed assertion: diff <= vga_outputs && diff >= 0
Last errno: 121, Remote I/O error
Comment 1 CI Bug Log 2019-03-28 12:45:06 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* ICL: igt@i915_pm_rpm@i2c - dmesg-fail - Failed assertion: diff <= vga_outputs && diff >= 0, Last errno: 121, Remote I/O error
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12595/shard-iclb2/igt@i915_pm_rpm@i2c.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5825/shard-iclb2/igt@i915_pm_rpm@i2c.html
Comment 2 Daniel Vetter 2019-04-01 12:52:51 UTC
Smells a bit like a hw issue; could also be a runtime pm/power well tracking screw-up somewhere. Very much a functionality issue (some display configurations can only be done over i2c, and the kms driver doesn't expose it, which is why the raw i2c /dev driver needs to work).
Comment 3 Daniel Vetter 2019-04-01 12:54:59 UTC
This is used by ddccontrol and ddcutil to set display configuration as per the VESA standard (stuff like brightness and all the other things you can usually also adjust through the OSD).
Comment 4 Daniel Vetter 2019-04-01 12:57:10 UTC
btw this is ofc also used by the kernel internally, so if it's a general issue then the priority needs to be highest because we need this to read the EDID. Comment #3 is only for the /dev i2c interface, for the case where the bug only affects that one.
Comment 5 Martin Peres 2019-04-01 13:07:11 UTC
The failure has been seen only on iclb2 so far, and iclb2 also has issues with DP aux (also only on iclb2): https://bugs.freedesktop.org/show_bug.cgi?id=109982

So this might indicate a HW issue with the pre-production machine, as opposed to the platform.

Keeping the priority high because of the potential high customer impact (as per Daniel's comment), in order to investigate potential fixes quickly.
Comment 6 Imre Deak 2019-04-01 13:20:01 UTC
The TypeC mode tracking is broken on ICL atm. Unlike on other platforms we can't do AUX or enable a mode unless a sink is plugged in. I'm working on fixing the mode switching in general, that should get rid of the timeout errors.
Comment 7 Martin Peres 2019-04-12 13:18:07 UTC
*** Bug 109982 has been marked as a duplicate of this bug. ***
Comment 8 Martin Peres 2019-04-25 06:17:11 UTC
(In reply to Imre Deak from comment #6)
> The TypeC mode tracking is broken on ICL atm. Unlike on other platforms we
> can't do AUX or enable a mode unless a sink is plugged in. I'm working on
> fixing the mode switching in general, that should get rid of the timeout
> errors.

Thanks! Bumping to highest as this could lead to failed modesets or missed hot plug detections.
Comment 9 Jani Saarinen 2019-05-03 15:50:29 UTC
Imre is already working on this.
Comment 10 Jani Saarinen 2019-05-06 10:33:27 UTC
A Type-C preparation series has now been sent and needs to land first: https://patchwork.freedesktop.org/series/60242/
Comment 11 Jani Saarinen 2019-05-13 06:07:14 UTC
That is now all reviewed, but Imre still needs to do further testing.
Comment 12 Arek Hiler 2019-06-10 06:15:01 UTC
The reproduction rate was about 5% (1 in 20) on just a single machine from the ICL shards. I think that Daniel is right in attributing it to a pre-prod hw issue.

We have changed the configuration of that machine and the issue has not been reproduced since CI_DRM_6001.

We are at CI_DRM_6222 right now. We are past 10x the old rate, so I am closing the issue.
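
As a rough sanity check of that reasoning, the chance of seeing no failure purely by luck can be estimated. This is only a sketch: it assumes independent runs, the roughly 5% rate quoted above, and (a simplification not confirmed anywhere in this report) one run of the test on that machine per CI_DRM build. `prob_no_failure` is a hypothetical helper name.

```python
def prob_no_failure(rate: float, runs: int) -> float:
    """Probability of zero failures in `runs` independent runs at failure `rate`."""
    return (1.0 - rate) ** runs


# Assumption: one run per CI_DRM build between CI_DRM_6001 and CI_DRM_6222.
runs = 6222 - 6001
p = prob_no_failure(0.05, runs)
print(f"{runs} clean runs at a 5% failure rate: p = {p:.2e}")
```

Under those assumptions the probability is far below 0.1%, which is consistent with the issue being gone rather than just not hit.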

As a side note, we were quite lucky that the errno was actually tied to the problem. The test currently opens all the /dev/i2c-* devices, so the reported errno might have been just random pollution.
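
One way to avoid that kind of pollution is to record the errno per device at the point of failure, rather than relying on whatever value was set last. A minimal sketch, not the IGT test's actual code; `probe_devices` and its return shape are hypothetical names introduced for illustration:

```python
import errno
import glob
import os


def probe_devices(pattern="/dev/i2c-*"):
    """Try to open every path matching `pattern`.

    Returns a dict mapping each path to the errno of the failed open,
    or None if the open succeeded.
    """
    results = {}
    for path in sorted(glob.glob(pattern)):
        try:
            fd = os.open(path, os.O_RDWR)
        except OSError as exc:
            # Keep the errno next to the device that produced it.
            results[path] = exc.errno
        else:
            os.close(fd)
            results[path] = None
    return results


for path, err in probe_devices().items():
    if err is None:
        status = "ok"
    else:
        status = f"{errno.errorcode.get(err, err)}: {os.strerror(err)}"
    print(f"{path}: {status}")
```

With the errno stored per path, a failure on an unrelated adapter cannot be mistaken for the one under test.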

There's a twin bug that is more generic/errno-less:
https://bugs.freedesktop.org/show_bug.cgi?id=104097

There's also a patch that will enable us to pinpoint i2c issues to a particular connector and will log much more detail on how we failed:
https://patchwork.freedesktop.org/series/60357/
Comment 13 CI Bug Log 2019-09-04 08:39:44 UTC
The CI Bug Log issue associated to this bug has been archived.

New failures matching the above filters will not be associated to this bug anymore.
