Bug 111214 - [CI][BAT] igt@i915_module_load@reload - timeout - Unable to sync register 0x2f8100
Summary: [CI][BAT] igt@i915_module_load@reload - timeout - Unable to sync register 0x2...
Status: NEW
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel (show other bugs)
Version: XOrg git
Hardware: Other All
: high major
Assignee: Filip Kaczmarski
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2019-07-25 08:29 UTC by Martin Peres
Modified: 2019-08-20 12:53 UTC (History)
2 users (show)

See Also:
i915 platform: ICL
i915 features: display/audio


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Martin Peres 2019-07-25 08:29:22 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6546/fi-icl-u2/igt@i915_module_load@reload.html

<6> [385.496790] Console: switching to colour dummy device 80x25
<6> [385.496825] [IGT] i915_module_load: executing
<6> [385.499252] [IGT] i915_module_load: starting subtest reload
<4> [388.633911] snd_hda_intel 0000:00:1f.3: azx_get_response timeout, switching to polling mode: last cmd=0x202f8100
<4> [389.635866] snd_hda_intel 0000:00:1f.3: No response from codec, disabling MSI: last cmd=0x202f8100
<3> [581.484906] snd_hda_codec_hdmi hdaudioC0D2: Unable to sync register 0x2f8100. -11
<7> [581.485225] [drm:i915_audio_component_get_eld [i915]] Not valid for port B
<7> [581.485342] [drm:i915_audio_component_get_eld [i915]] Not valid for port B
<7> [581.485438] [drm:i915_audio_component_get_eld [i915]] Not valid for port B
<7> [581.485528] [drm:i915_audio_component_get_eld [i915]] Not valid for port C
<7> [581.485615] [drm:i915_audio_component_get_eld [i915]] Not valid for port C
<7> [581.485703] [drm:i915_audio_component_get_eld [i915]] Not valid for port C
<7> [581.485900] [drm:i915_audio_component_get_eld [i915]] Not valid for port D
<7> [581.486017] [drm:i915_audio_component_get_eld [i915]] Not valid for port D
<7> [581.486140] [drm:i915_audio_component_get_eld [i915]] Not valid for port E
<7> [581.486297] [drm:i915_audio_component_get_eld [i915]] Not valid for port E
<7> [581.486403] [drm:i915_audio_component_get_eld [i915]] Not valid for port E
<7> [581.486510] [drm:i915_audio_component_get_eld [i915]] Not valid for port F
<7> [581.486615] [drm:i915_audio_component_get_eld [i915]] Not valid for port F
<7> [581.486723] [drm:i915_audio_component_get_eld [i915]] Not valid for port F
Comment 2 CI Bug Log 2019-07-26 06:36:39 UTC
A CI Bug Log filter associated to this bug has been updated:

{- ICL: igt@i915_module_load@reload - timeout - Unable to sync register 0x2f8100 -}
{+ ICL: igt@i915_module_load@(reload|module-reload) - timeout - Unable to sync register 0x2f8100 +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6550/fi-icl-y/igt@i915_pm_rpm@module-reload.html
Comment 3 Lakshmi 2019-07-26 08:09:25 UTC
Filip, this bug needs assessment. Can you please add your comments and set appropriate severity and priority?
Comment 4 Cezary Rojewski 2019-07-30 09:10:27 UTC
Issue seems to be connected with SGPC - Shared Graphics Power Component. That is, during power state change on gfx side, codec -or clock (not entirely sure which one) may be turned off without notifying audio side about such change.

Now, from audio side codec is still present on the link and thus counted as active. And thus verbs are still being sent to it - timing out in consequence.

Problem seems to exist since SKL and isn't strictly tied to ICL. The only reason it is visible now, is probably of yet another issue present for ICL platforms. We need to fix the root cause though. GFX guys must be engaged and comment on this.
Comment 5 Stanislav Lisovskiy 2019-08-05 07:17:30 UTC
The issue is indeed related to power management and suspend/resume logic. 

I could reproduce it with suspending/resuming the machine, azx_response_timeout can be reproduced easily by enabling codec while disabling correspondent power domain. I also discovered that at least in Ubuntu if I have DP audio device available before suspend, then after suspend it disappears. 

There seems to be some issue/mismatch in intel_audio_codec_enable/intel_audio_codec_disable calls. I'm now also investigating it - to check which issues we might need to fix from GFX side.
Comment 6 Stanislav Lisovskiy 2019-08-07 13:44:07 UTC
Looks like that power well might be not an issue, as power wells 2,3 were enabled just before the timeout. 

However the reason seems to be in a codec itself, at least according to that mailing thread:

https://www.mail-archive.com/intel-gfx@lists.freedesktop.org/msg198762.html
Comment 7 Chris Wilson 2019-08-09 07:12:16 UTC
Note that although BAT is not reporting the timeout anymore, it is still present as icl-u2: 198.37 igt@i915_module_load@reload dmesg-warn
Comment 8 Stanislav Lisovskiy 2019-08-19 13:35:03 UTC
(In reply to Chris Wilson from comment #7)
> Note that although BAT is not reporting the timeout anymore, it is still
> present as icl-u2: 198.37 igt@i915_module_load@reload dmesg-warn

Do I understand correctly that this is supposed to be fixed by Takashi Iwai?

Patches: https://patchwork.freedesktop.org/patch/319867/
Comment 9 Lakshmi 2019-08-20 12:53:13 UTC
Comments from Stan: This issue can lead to unpredictable timeouts when verbs are sent to code and it doesn't break the system, not even audio I guess. Most likely this is going to be fixed by this: https://patchwork.freedesktop.org/patch/319867/

Updating the priority and severity.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.