Bug 108131 - [CI][SHARDS] igt@* - dmesg-warn - *ERROR* LSPCON mode hasn't settled
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: low normal
Assignee: Swati Sharma
QA Contact: Intel GFX Bugs mailing list
Whiteboard: ReadyForDev
Depends on:
Blocks: 105979
Reported: 2018-10-02 15:28 UTC by Martin Peres
Modified: 2019-11-29 17:53 UTC

See Also:
i915 platform: BXT
i915 features: display/LSPCON


Description Martin Peres 2018-10-02 15:28:21 UTC

<3> [317.006930] [drm:lspcon_wait_mode [i915]] *ERROR* LSPCON mode hasn't settled
Comment 1 Martin Peres 2018-10-25 11:35:09 UTC


<3> [633.225687] [drm:lspcon_wait_mode [i915]] *ERROR* LSPCON mode hasn't settled
<3> [633.362409] [drm:lspcon_change_mode.constprop.4 [i915]] *ERROR* Error reading LSPCON mode
<3> [633.362506] [drm:intel_dp_detect [i915]] *ERROR* LSPCON resume failed
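
For anyone triaging similar reports, here is a minimal sketch of how the dmesg lines quoted above could be split into fields. The regular expression and the `parse_dmesg_line` helper are assumptions made for illustration; this is not the matcher that the CI Bug Log filter actually uses.

```python
import re

# Hypothetical pattern for lines shaped like the ones quoted in this bug:
# <level> [timestamp] [drm:function [module]] message
DMESG_RE = re.compile(
    r"<(?P<level>\d)>\s*\[\s*(?P<ts>\d+\.\d+)\]\s*"
    r"\[drm:(?P<func>[\w.]+) \[(?P<module>\w+)\]\]\s*"
    r"(?P<msg>.*)"
)


def parse_dmesg_line(line):
    """Return the fields of one drm dmesg line, or None if it does not match."""
    m = DMESG_RE.match(line.strip())
    if m is None:
        return None
    return {
        "level": int(m.group("level")),
        "timestamp": float(m.group("ts")),
        "function": m.group("func"),
        "module": m.group("module"),
        "message": m.group("msg"),
    }


line = ("<3> [633.362409] [drm:lspcon_change_mode.constprop.4 [i915]] "
        "*ERROR* Error reading LSPCON mode")
print(parse_dmesg_line(line))
```

Grouping the parsed records by `function` separates the initial timeout in `lspcon_wait_mode` from the follow-on errors it triggers.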
Comment 2 Swati Sharma 2019-03-25 12:17:52 UTC
Updated CI results?
Comment 3 Lakshmi 2019-03-25 12:43:14 UTC
(In reply to Swati Sharma from comment #2)
> Updated CI results?

Last seen on IGT_4777_full (2 months / 1284 runs ago). This issue used to occur once every 1-3 weeks, or once every 3-974 runs.
Dropping the priority to Medium.
Comment 4 CI Bug Log 2019-06-27 08:37:33 UTC
A CI Bug Log filter associated to this bug has been updated:

{- APL: random tests - dmesg-warn - LSPCON mode hasn't settled -}
{+ APL: random tests - dmesg-warn - LSPCON mode hasn't settled +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6361/shard-apl5/igt@kms_flip@flip-vs-suspend-interruptible.html
Comment 5 Matt Roper 2019-08-27 17:29:29 UTC
LSPCON refers to a DP -> HDMI adapter used on these systems ("Level Shifter and Protocol CONverter"); it's a separate downstream device and when we perform a suspend/resume cycle, we need to settle into its PCON mode before using it.  The messages here indicate that although the LSPCON is responding to DPCD reads on the aux channel following resume, when we try to check the mode (LS or PCON) by doing DPCD reads of offset 41, all of those reads return "defer" until we eventually give up and declare a timeout.

Higher-level logic does retry probing the LSPCON mode itself, and the LSPCON finally starts responding again after more than a second has passed (658.672242 -> 659.860423).
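
To illustrate the two layers described above (an inner wait that gives up after a timeout, and a higher-level retry that eventually succeeds), here is a small simulation. It is only a sketch and not the i915 code: `FakeClock`, `make_device`, `wait_for_mode` and `probe_with_retry` are hypothetical names, the 400 ms timeout is borrowed from the figure mentioned in this comment thread, and the 10 ms poll interval and 1200 ms recovery time are made-up values.

```python
DEFER = object()  # hypothetical stand-in for an AUX "defer" reply


class FakeClock:
    """Simulated millisecond clock so the sketch runs instantly."""

    def __init__(self):
        self.t = 0

    def now(self):
        return self.t

    def sleep(self, ms):
        self.t += ms


def make_device(clock, ready_at_ms):
    """Fake LSPCON that defers every mode read until ready_at_ms."""
    def read_mode():
        return "PCON" if clock.now() >= ready_at_ms else DEFER
    return read_mode


def wait_for_mode(read_mode, clock, timeout_ms=400, poll_ms=10):
    """Inner wait: poll until a mode is returned or timeout_ms elapses."""
    deadline = clock.now() + timeout_ms
    while True:
        mode = read_mode()
        if mode is not DEFER:
            return mode
        if clock.now() >= deadline:
            return None  # corresponds to the "mode hasn't settled" case
        clock.sleep(poll_ms)


def probe_with_retry(read_mode, clock, attempts=5):
    """Outer retry: run the inner wait again after each timeout."""
    for attempt in range(1, attempts + 1):
        mode = wait_for_mode(read_mode, clock)
        if mode is not None:
            return mode, attempt
    return None, attempts


clock = FakeClock()
result = probe_with_retry(make_device(clock, ready_at_ms=1200), clock)
print(result, clock.now())
```

With a device that stays silent for 1200 ms, the first two inner waits time out and the third one gets an answer. That matches the pattern in the logs: an error is printed, but the retry recovers.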

It's hard to say why the LSPCON flakes out for over a second and fails to respond to us, but there have been a few upstream changes to extend the timeouts in places (e.g., "drm/i915: Increase LSPCON timeout").  From the CI database, it looks like the issue became significantly less common once those timeouts were extended (last seen two months ago, and the previous occurrence was five months before that).  We could probably eliminate this completely if we kept extending timeouts far enough, but that would likely lead to poor user experience in situations where we legitimately do need to time out for an operation (the commit message for the commit above does indicate they chose 400ms rather than the original 1000ms for this reason).

Given the rarity of this problem and the lack of user-visible impact (the higher-level code does retry further and get a response, as we can see in the logs), I think it's safe to downgrade this bug to 'low' exposure.
Comment 6 Martin Peres 2019-11-29 17:53:09 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/intel/issues/165.
