Bug 108529 - [CI][BAT] igt@* - dmesg-warn - [drm:drm_dp_dpcd_access] Too many retries, giving up. First error: -5, then *ERROR* Failed to probe lspcon
Summary: [CI][BAT] igt@* - dmesg-warn - [drm:drm_dp_dpcd_access] Too many retries, giv...
Status: RESOLVED WORKSFORME
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: medium normal
Assignee: Swati Sharma
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2018-10-23 15:16 UTC by Martin Peres
Modified: 2019-07-23 10:44 UTC

See Also:
i915 platform: BXT, KBL, SKL
i915 features: display/LSPCON


Attachments
BIOS(VBT) information from failed machine. (9.00 KB, text/plain)
2018-10-30 07:20 UTC, Lakshmi
BIOS VBT file (6.01 KB, application/octet-stream)
2018-10-30 10:29 UTC, Lakshmi

Description Martin Peres 2018-10-23 15:16:19 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5021/fi-apl-guc/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html

<7> [324.212725] [drm:drm_dp_dpcd_access] Too many retries, giving up. First error: -5
<3> [324.212863] [drm:intel_dp_start_link_train [i915]] *ERROR* failed to enable link training

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5021/fi-apl-guc/igt@drv_module_reload@basic-reload.html

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5021/fi-apl-guc/igt@drv_module_reload@basic-reload-inject.html

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5021/fi-apl-guc/igt@pm_rpm@module-reload.html

<3> [349.753045] [drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
<3> [349.753168] [drm:intel_ddi_init [i915]] *ERROR* LSPCON init failed on port B
<3> [353.686349] [drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
<3> [353.686443] [drm:intel_ddi_init [i915]] *ERROR* LSPCON init failed on port B
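
The excerpts above follow the CI's `<level> [timestamp] message` layout. A minimal sketch, assuming that layout holds for every line, of how such lines could be split into fields and the `*ERROR*` messages counted per reporting function. `parse_line` and `count_errors` are hypothetical helper names, not part of IGT or the CI tooling.

```python
import re
from collections import Counter

# Matches lines such as:
#   <3> [349.753045] [drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
LINE_RE = re.compile(r"^<(?P<level>\d)>\s*\[\s*(?P<ts>\d+\.\d+)\]\s*(?P<msg>.*)$")
# Pulls the function name out of a "[drm:func ...]" prefix.
FUNC_RE = re.compile(r"^\[drm:(?P<func>\w+)")


def parse_line(line):
    """Return (level, timestamp, message), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return int(m.group("level")), float(m.group("ts")), m.group("msg")


def count_errors(lines):
    """Count *ERROR* messages per reporting function."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        _level, _ts, msg = parsed
        func = FUNC_RE.match(msg)
        if "*ERROR*" in msg and func:
            counts[func.group("func")] += 1
    return counts


# Lines copied from the description above.
sample = [
    "<7> [324.212725] [drm:drm_dp_dpcd_access] Too many retries, giving up. First error: -5",
    "<3> [349.753045] [drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon",
    "<3> [349.753168] [drm:intel_ddi_init [i915]] *ERROR* LSPCON init failed on port B",
    "<3> [353.686349] [drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon",
]
print(count_errors(sample))
```

Grouping by function makes it easier to see that every `intel_ddi_init` error is a consequence of a preceding `lspcon_init` probe failure.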
Comment 1 Martin Peres 2018-10-23 15:17:43 UTC
This also got hit in the same way in https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_1981/issues.html
Comment 3 Jani Nikula 2018-10-29 17:35:29 UTC
Please attach /sys/kernel/debug/dri/0/i915_vbt from the failing device.
Comment 4 Lakshmi 2018-10-30 07:20:26 UTC
Created attachment 142272
BIOS(VBT) information from failed machine.
Comment 5 shashank.sharma@intel.com 2018-10-30 07:42:33 UTC
Swati tested LSPCON probing on the Bangalore APL system, where:
 	- The LSPCON HW is on port B.
 	- The VBT was bad and reported LSPCON on all of ports A, B and C.
 	- On port B the probe succeeded, but it failed on ports A and C (as there was no LSPCON there).
 
 On the FI-CI build APL system:
 	- The VBT seems to show LSPCON only on port B,
 	- but we are seeing a failure during the probe itself, even on the LSPCON port.
 
 So technically, Swati has not been able to reproduce this issue yet (the probe failures she saw were on the non-LSPCON ports A and C, caused by the wrong VBT reports).
 
 Next steps for debug:
 Lakshmi: Can you please share the BIOS(VBT) that is flashed on the CI APL system where the failure was observed?
 
 Swati: Please flash this BIOS(VBT) on the Bangalore APL system and see if you are able to reproduce this issue.
 	- If yes: Let's debug this issue further and root-cause it.
 	- If no: We might need to get access to the CI APL system for root-causing.
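
The reasoning in this comment separates probe failures that merely reflect a wrong VBT from failures on a port that really carries an LSPCON. A minimal sketch of that classification, assuming the set of ports with real LSPCON hardware is known for each machine. The function name, the labels and the data layout are illustrative only and do not come from the i915 driver.

```python
def classify_probe_results(vbt_ports, hw_ports, failed_ports):
    """Label each port that the VBT claims has an LSPCON.

    vbt_ports:    ports the VBT reports as LSPCON
    hw_ports:     ports that really have LSPCON hardware (assumed known)
    failed_ports: ports where the LSPCON probe failed
    """
    result = {}
    for port in sorted(vbt_ports):
        if port not in failed_ports:
            result[port] = "ok"
        elif port in hw_ports:
            # Probe failed although the hardware is present: the case in this bug.
            result[port] = "genuine failure"
        else:
            # Probe failed because the VBT advertises hardware that is not there.
            result[port] = "bad VBT entry"
    return result


# Bangalore APL system: VBT lists A, B and C, hardware only on B.
bangalore = classify_probe_results({"A", "B", "C"}, {"B"}, {"A", "C"})
# CI APL system: VBT lists only B, and the probe on B fails.
ci = classify_probe_results({"B"}, {"B"}, {"B"})
print(bangalore)
print(ci)
```

Only the CI system ends up with a "genuine failure" label, which is why the Bangalore results do not count as a reproduction.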
Comment 6 Lakshmi 2018-10-30 10:29:17 UTC
Created attachment 142273
BIOS VBT file
Comment 7 Lakshmi 2018-11-12 15:14:47 UTC
Adding KBL platform based on the latest results.
Comment 8 Swati Sharma 2018-11-20 05:54:15 UTC
In the boot log at https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5021/fi-apl-guc/boot0.log, the following can be seen:
<7>[    6.938880] [drm:lspcon_wake_native_aux_ch [i915]] Native AUX CH up, DPCD version: 0.0

The DPCD version is read incorrectly, and after that the lspcon probe error appears.

The failures of the following i-g-t subtests all have this in common:
igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a
igt@drv_module_reload@basic-reload
igt@drv_module_reload@basic-reload-inject
igt@pm_rpm@module-reload

The same thing is happening on APL and KBL.

On the KBL NUC on which I am trying to reproduce this issue, the DPCD version is read correctly and the probe failure is not observed.
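
The `DPCD version: 0.0` line is the signature to look for. In the DisplayPort DPCD, the revision register (DPCD_REV at address 0x000) stores the major version in the upper nibble and the minor version in the lower nibble, so a raw value of 0x00 is reported as 0.0. A small sketch, assuming the log message format shown in this comment, that decodes a revision byte and flags log lines reporting 0.0. The helper names are hypothetical.

```python
import re

DPCD_RE = re.compile(r"Native AUX CH up, DPCD version: (\d+)\.(\d+)")


def decode_dpcd_rev(raw):
    """Split a DPCD_REV byte into (major, minor): upper nibble, lower nibble."""
    return raw >> 4, raw & 0xF


def bad_dpcd_lines(lines):
    """Return the log lines whose reported DPCD version is 0.0."""
    bad = []
    for line in lines:
        m = DPCD_RE.search(line)
        if m and (int(m.group(1)), int(m.group(2))) == (0, 0):
            bad.append(line)
    return bad


log = [
    "<7>[    6.938880] [drm:lspcon_wake_native_aux_ch [i915]] Native AUX CH up, DPCD version: 0.0",
    # Hypothetical healthy line for comparison (not taken from the CI logs).
    "<7>[    6.900000] [drm:lspcon_wake_native_aux_ch [i915]] Native AUX CH up, DPCD version: 1.2",
]
print(decode_dpcd_rev(0x12))
print(len(bad_dpcd_lines(log)))
```

Running a check like this over the boot logs of both the failing and the passing machines would show whether the 0.0 read always precedes the probe error.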
Comment 9 Lakshmi 2019-02-19 07:44:29 UTC
Swati, how should we proceed?
Comment 10 Lakshmi 2019-02-19 08:05:15 UTC
This issue was last seen in CI_DRM_5629 (11 hours, 1 minute / 5 runs ago).
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5629/fi-kbl-7567u/igt@pm_rpm@module-reload.html
Comment 11 CI Bug Log 2019-02-25 16:54:08 UTC
A CI Bug Log filter associated to this bug has been updated:

{- LSPCON fi-apl-guc fi-kbl-7567u: igt@drv_module_reload@basic-reload(-inject)? igt@pm_rpm@module-reload - dmesg-warn - *ERROR* Failed to probe lspcon -}
{+ LSPCON fi-apl-guc fi-kbl-7567u: igt@drv_module_reload@basic-reload(-inject)? igt@pm_rpm@module-reload - dmesg-warn - *ERROR* Failed to probe lspcon +}

 No new failures caught with the new filter
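
The test names in the filter above read like regular expressions: `basic-reload(-inject)?` would cover both `basic-reload` and `basic-reload-inject`. A minimal sketch of that matching, assuming full-match semantics. How CI Bug Log actually evaluates its filters is not shown in this report, so `filter_matches` is purely illustrative.

```python
import re

# Test patterns copied from the filter text above.
TEST_PATTERNS = [
    r"igt@drv_module_reload@basic-reload(-inject)?",
    r"igt@pm_rpm@module-reload",
]


def filter_matches(test_name):
    """True if test_name fully matches one of the filter's test patterns."""
    return any(re.fullmatch(p, test_name) for p in TEST_PATTERNS)


for name in (
    "igt@drv_module_reload@basic-reload",
    "igt@drv_module_reload@basic-reload-inject",
    "igt@pm_rpm@module-reload",
    "igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a",
):
    print(name, filter_matches(name))
```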
Comment 12 CI Bug Log 2019-02-25 16:57:51 UTC
A CI Bug Log filter associated to this bug has been updated:

{- LSPCON fi-apl-guc fi-kbl-7567u: igt@drv_module_reload@basic-reload(-inject)? igt@pm_rpm@module-reload - dmesg-warn - *ERROR* Failed to probe lspcon -}
{+ LSPCON fi-apl-guc SKL fi-kbl-7567u: igt@drv_module_reload@basic-reload(-inject)? igt@pm_rpm@module-reload -dmesg-warn - *ERROR* Failed to probe lspcon +}

 No new failures caught with the new filter
Comment 13 Lakshmi 2019-02-25 17:01:57 UTC
SKL is added to this bug as there is a failure from pre-merge testing.

https://patchwork.freedesktop.org/series/57037/#rev1

https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12276/fi-skl-6770hq/igt@i915_module_load@reload-with-fault-injection.html

<4> [327.445661] drm_dp_i2c_do_msg: 2 callbacks suppressed
<3> [327.543286] [drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
<3> [327.543340] [drm:intel_ddi_init [i915]] *ERROR* LSPCON init failed on port B
<3> [328.252606] [drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
<3> [328.252661] [drm:intel_ddi_init [i915]] *ERROR* LSPCON init failed on port B
<3> [329.682570] [drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
<3> [329.682643] [drm:intel_ddi_init [i915]] *ERROR* LSPCON init failed on port B
<3> [331.082274] [drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
<3> [331.082329] [drm:intel_ddi_init [i915]] *ERROR* LSPCON init failed on port B
<4> [332.446796] drm_dp_i2c_do_msg: 48 callbacks suppressed
<3> [332.455316] [drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
<3> [332.455374] [drm:intel_ddi_init [i915]] *ERROR* LSPCON init failed on port B
<3> [333.853294] [drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
<3> [333.853351] [drm:intel_ddi_init [i915]] *ERROR* LSPCON init failed on port B
<3> [335.249472] [drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
<3> [335.249529] [drm:intel_ddi_init [i915]] *ERROR* LSPCON init failed on port B
<3> [336.649781] [drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
<3> [336.649886] [drm:intel_ddi_init [i915]] *ERROR* LSPCON init failed on port B
<4> [337.953550] drm_dp_i2c_do_msg: 28 callbacks suppressed
<3> [338.051282] [drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
<3> [338.051339] [drm:intel_ddi_init [i915]] *ERROR* LSPCON init failed on port B
Comment 14 CI Bug Log 2019-02-25 17:03:08 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* SKL: igt@i915_module_load@reload-with-fault-injection - dmesg-warn -  Too many retries, giving up. First error: -110 (No new failures associated)
Comment 15 CI Bug Log 2019-02-26 14:58:32 UTC
A CI Bug Log filter associated to this bug has been updated:

{- SKL: igt@i915_module_load@reload-with-fault-injection - dmesg-warn -  Too many retries, giving up. First error: -110 -}
{+ SKL: igt@i915_module_load@reload-with-fault-injection - dmesg-warn -  Too many retries, giving up. First error: -110 +}

 No new failures caught with the new filter
Comment 16 Lakshmi 2019-02-27 10:21:53 UTC
Swati was unable to reproduce this issue locally. Needs further investigation.
Dropping the priority to high, as the issue is not seen locally and occurs only on one machine.
Comment 17 Swati Sharma 2019-03-25 12:37:43 UTC
Lakshmi, can you please check whether this error is still seen in CI with the latest FW 1.75.01? If not, can we change the state to resolved?
Comment 18 Swati Sharma 2019-05-07 08:16:39 UTC
(In reply to Swati Sharma from comment #17)
> Lakshmi, can you please check if this error is seen in CI with latest FW
> 1.75.01 or not. If not change state to resolved?

Should we close this?
Comment 19 Lakshmi 2019-05-07 11:59:43 UTC
(In reply to Swati Sharma from comment #18)
> (In reply to Swati Sharma from comment #17)
> > Lakshmi, can you please check if this error is seen in CI with latest FW
> > 1.75.01 or not. If not change state to resolved?
> 
> Should we close this?

Sorry for the delay. This issue used to occur once in 1-56 runs. It was last seen on CI_DRM_5870, so we can close this issue if there are no new failures until CI_DRM_6430.

Dropping the priority to Medium, as the issue was last seen 1 month ago.
Comment 20 Swati Sharma 2019-07-23 09:42:05 UTC
@lakshmi, should we close this issue?
Comment 21 Lakshmi 2019-07-23 10:44:27 UTC
Last seen in CI_DRM_5870 (3 months, 2 weeks old). Until then, this issue used to occur once in 6 BAT runs. The current run is CI_DRM_6537.
Closing this issue as WORKSFORME.
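
The decision to close rests on the gap since the last sighting. A back-of-the-envelope sketch, assuming each CI_DRM build corresponds to one BAT run on the affected machines and that failures are independent at the quoted rate of once in 6 runs. Both are simplifying assumptions, not statements taken from the CI data.

```python
import math

last_seen = 5870      # CI_DRM_5870, last run showing the failure
current = 6537        # CI_DRM_6537, current run
failure_rate = 1 / 6  # "once in 6 BAT runs"

clean_runs = current - last_seen
# Probability of seeing that many consecutive clean runs if the bug were
# still present at the old rate, computed in log space to avoid underflow.
log10_p = clean_runs * math.log10(1 - failure_rate)

print(clean_runs)  # 667
print(f"probability of a clean streak by chance: about 10^{log10_p:.0f}")
```

Under these assumptions a streak this long is extremely unlikely to be luck, which supports treating the issue as no longer occurring.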
Comment 22 CI Bug Log 2019-07-23 10:44:41 UTC
The CI Bug Log issue associated to this bug has been archived.

New failures matching the above filters will not be associated to this bug anymore.

