Starting subtest: system-suspend-execbuf
(pm_rpm:1305) CRITICAL: Test assertion failure function system_suspend_execbuf_subtest, file ../tests/pm_rpm.c:1476:
(pm_rpm:1305) CRITICAL: Failed assertion: wait_for_suspended()
Subtest system-suspend-execbuf failed.
<4> [307.965661] snd_hda_intel 0000:00:1b.0: No response from codec, disabling MSI: last cmd=0x204f0900
<3> [308.973666] snd_hda_intel 0000:00:1b.0: azx_get_response timeout, switching to single_cmd mode: last cmd=0x204f0900
<7> [308.974737] [drm:i915_audio_component_get_eld [i915]] Not valid for port B
<6> [308.986686] acpi LNXPOWER:02: Turning OFF
<6> [308.988673] acpi LNXPOWER:01: Turning OFF
<6> [308.989711] Bluetooth: hci0: RTL: rtl: examining hci_ver=06 hci_rev=000b lmp_ver=06 lmp_subver=8723
<6> [308.990539] acpi LNXPOWER:00: Turning OFF
<6> [308.991186] OOM killer enabled.
<6> [308.991197] Restarting tasks ...
<6> [308.992383] Bluetooth: hci0: RTL: rom_version status=0 version=1
<6> [308.992415] Bluetooth: hci0: RTL: rtl: loading rtl_bt/rtl8723b_fw.bin
<4> [308.994397] done.
<6> [309.002877] Bluetooth: hci0: RTL: rtl: loading rtl_bt/rtl8723b_config.bin
<4> [309.003231] bluetooth hci0: Direct firmware load for rtl_bt/rtl8723b_config.bin failed with error -2
<6> [309.003284] Bluetooth: hci0: RTL: cfg_sz -2, total sz 22496
<6> [309.009486] PM: suspend exit
The CI Bug Log issue associated to this bug has been updated.
### New filters associated
* BYT ICL: igt@pm_rpm@system-suspend-* - fail - <3> ...: azx_get_response timeout, switching to single_cmd mode
Created attachment 143409
Dump additional logs
Created attachment 143410
Dump additional traces
After thousands of iterations, we are still unable to reproduce the issue.
Attaching simple debug and trace scripts that dump additional driver logs.
Please retest using the following steps:
1. Upload dmesg_logs.sh and trace_logs.sh to the tested machine
2. Run chmod +x on both files
3. Start "./dmesg_logs.sh > dmesg.txt" in a first terminal and "./trace_logs.sh > trace.txt" in a second terminal
4. Note: the scripts call some commands with "sudo", so you may need to enter a password
5. Execute your test
6. Interrupt the scripts with Ctrl+C
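For unattended runs, the steps above can be approximated from a single shell instead of two terminals. The following is only a sketch: `capture_during` is a hypothetical helper name, the test command is a placeholder, and it assumes the attached scripts stop cleanly when they receive SIGTERM (their contents are not shown here, so this is untested against them).

```shell
#!/bin/sh
# capture_during LOGGER1 OUT1 LOGGER2 OUT2 TEST_CMD [ARGS...]
# Hypothetical helper: runs two logger scripts in the background while
# TEST_CMD executes, then stops them and returns TEST_CMD's exit status.
capture_during() {
    logger1=$1; out1=$2; logger2=$3; out2=$4
    shift 4

    # Start both loggers in the background, each writing to its own file.
    "$logger1" > "$out1" 2>&1 &
    pid1=$!
    "$logger2" > "$out2" 2>&1 &
    pid2=$!

    # Run the test in the foreground and remember its exit status.
    status=0
    "$@" || status=$?

    # Stop the loggers. SIGTERM is used rather than SIGINT (Ctrl+C) because
    # a non-interactive shell starts background commands with SIGINT ignored.
    kill "$pid1" "$pid2" 2>/dev/null || true
    wait "$pid1" "$pid2" 2>/dev/null || true

    return "$status"
}

# Example (placeholder test command; "sudo -v" caches credentials first so
# the background scripts are less likely to block on a password prompt):
#   sudo -v && capture_during ./dmesg_logs.sh dmesg.txt \
#       ./trace_logs.sh trace.txt ./your_test_command
```

If the scripts only flush their output on Ctrl+C, the two-terminal procedure above remains the safer option.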
Sorry for forgetting to add the link to the ICL failure. After a fair bit of archaeology, I found it: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5164/shard-iclb1/igt@firstname.lastname@example.org
(In reply to cezary.j.rojewski from comment #4)
> After thousands of iterations we are still not able to reproduce the issue.
> Attaching simple debug and trace scripts which dump additional driver logs.
> Please retest using following steps:
> 1. Upload dmesg_logs.sh and trace_logs.sh to tested machine
> 2. Call chmod +x on both files
> 3. Start "./dmesg_logs.sh > dmesg.txt" in first terminal and
> "./trace_logs.sh > trace.txt" in second terminal
> 4. Note – scripts are calling some commands as “sudo” so you may need to
> enter password
> 5. Execute your test
> 6. Interrupt scripts with ctrl+c
Thanks for doing this.
So far, we have had a reproduction rate of ~1/500. Which test did you execute in a loop? Was it on ICL (please do not mention the HW revision here, though)? How many thousand times did you run the test?
Thanks to your work, we will be able to drop the priority of this issue for ICL, but the problem is still very visible on our fi-byt-j1900, which you probably still want to look into:
https://intel-gfx-ci.01.org/hardware.html#fi-byt-j1900
https://www.gigabyte.com/Mini-PcBarebone/GB-BXBT-1900-rev-10
We tried to reproduce this issue on ICL U.
We installed pm-graph, then ran analyze_suspend.py 2000 times:
sudo ./analyze_suspend.py -rtcwake 3 -multi 2000 1 -m mem
The issue did not reproduce.
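As a rough sanity check on what 2000 clean cycles mean against the ~1/500 rate quoted earlier, the chance of seeing zero failures can be estimated as (1 - 1/500)^2000. This is a sketch under strong assumptions: runs are treated as independent, and the CI failure rate of the igt test is assumed to carry over to analyze_suspend.py cycles, which may not hold. `p_no_failure` is a hypothetical helper name.

```shell
#!/bin/sh
# p_no_failure DENOM RUNS
# Probability of observing zero failures in RUNS independent runs when each
# run fails with probability 1/DENOM (assumed rate, not measured here).
p_no_failure() {
    LC_ALL=C awk -v d="$1" -v n="$2" \
        'BEGIN { printf "%.3f\n", exp(n * log(1 - 1 / d)) }'
}

# 1/500 failure rate, 2000 suspend/resume cycles
p_no_failure 500 2000
```

Under those assumptions the result is roughly 2%, so a clean 2000-cycle run would be unlikely if the same failure mode were being exercised at the same rate.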
The failure has not been seen on ICL since. As for BYT, we used to see the issue every 4.5 runs on average, but it has not been seen since drmtip_240. Let's wait until drmtip_285 before closing!