Bug 107623

Summary: (4.18.3)[drm:hwss_edp_wait_for_hpd_ready [amdgpu]] *ERROR* hwss_edp_wait_for_hpd_ready: wait timed out!
Product: DRI
Reporter: jian-hong
Component: DRM/AMDgpu
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED WORKSFORME
QA Contact:
Severity: normal
Priority: medium
CC: harry.wentland, sunpeng.li
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
See Also: https://bugs.freedesktop.org/show_bug.cgi?id=110477
Attachments:
Description                               Flags
"wait timed out!" in dmesg                none
The resume process video                  none
dmesg of boot into multi-user.target      none

Description jian-hong 2018-08-20 10:27:59 UTC
Created attachment 141196
"wait timed out!" in dmesg

We have an ASUS X570ZD laptop equipped with AMD Ryzen 7 2700U with Radeon Vega Mobile Gfx and NVIDIA GeForce GTX 1050 Mobile.

We tried with Linux kernel 4.18.3.

After the system resumes from suspend, a picture is shown briefly, then the display goes blank for a long time.
After that, the system turns off the display's backlight. The user must press a key to get the login screen.

The resume process is not smooth and hits the error:
[drm:hwss_edp_wait_for_hpd_ready [amdgpu]] *ERROR* hwss_edp_wait_for_hpd_ready: wait timed out!

The dmesg was captured with amdgpu.dc_log=1 and drm.debug=6.
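
To check a saved dmesg capture for the error quoted above, a small filter along these lines may help. This is only a sketch: the function name find_hpd_timeouts is hypothetical, and the pattern is built solely from the error line in this report, so other kernel versions may format the message differently.

```python
import re

# Pattern built from the error line quoted in this report; the exact
# message format on other kernel versions is an assumption.
HPD_TIMEOUT = re.compile(
    r"\[drm:hwss_edp_wait_for_hpd_ready \[amdgpu\]\] \*ERROR\* .*wait timed out!"
)


def find_hpd_timeouts(dmesg_text):
    """Return the dmesg lines that contain the eDP HPD-ready timeout error."""
    return [line for line in dmesg_text.splitlines() if HPD_TIMEOUT.search(line)]
```

Passing the contents of the attached dmesg as a string returns the matching lines, which makes it easy to compare a failing resume against a clean one.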
Comment 1 jian-hong 2018-08-20 10:31:31 UTC
Created attachment 141197
The resume process video
Comment 2 jian-hong 2018-08-20 10:34:42 UTC
The NVIDIA GeForce GTX 1050 Mobile card is driven by nouveau with the parameters runpm=0 and noaccel=1.
Comment 3 jian-hong 2018-08-21 02:57:49 UTC
Created attachment 141207
dmesg of boot into multi-user.target

If I boot the system into "multi-user.target", run "systemctl suspend", and then press a key to resume, the issue does not reproduce.
Comment 4 Harry Wentland 2018-08-24 13:18:08 UTC
Have you tried with nouveau disabled?

https://askubuntu.com/questions/841876/how-to-disable-nouveau-kernel-driver/951892#951892
Comment 5 Daniel Drake 2018-08-28 03:24:03 UTC
This issue was detected on 4.18.3 when using a somewhat reduced developer config. When using our usual full distro kernel config (from Ubuntu), the issue convincingly goes away.

Both Jian-Hong and I looked at the differences in the config and couldn't spot anything obvious that would explain different behaviour. Trying to bisect good and bad configs, for a while it looked like it might be CONFIG_NUMA_BALANCING=y and CONFIG_NUMA_BALANCING_DEFAULT_ENABLED=y but we then disproved that in further experiments. We didn't find which config option is responsible for the different behaviour.
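
For anyone repeating the config comparison described above, a rough sketch of listing the options that differ between two kernel .config files could look like this. The helper names parse_kconfig and diff_kconfig are hypothetical, and treating an option that is absent from a file as "n" is a simplifying assumption, not something we verified against Kconfig semantics.

```python
def parse_kconfig(text):
    """Map option name -> value; '# CONFIG_X is not set' is recorded as 'n'."""
    suffix = " is not set"
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
        elif line.startswith("# CONFIG_") and line.endswith(suffix):
            # Drop the leading "# " and the trailing " is not set".
            options[line[2:-len(suffix)]] = "n"
    return options


def diff_kconfig(good_text, bad_text):
    """Return {option: (good_value, bad_value)} for options that differ.

    Assumption: an option missing from one file is treated as 'n'.
    """
    good, bad = parse_kconfig(good_text), parse_kconfig(bad_text)
    return {
        name: (good.get(name, "n"), bad.get(name, "n"))
        for name in sorted(set(good) | set(bad))
        if good.get(name, "n") != bad.get(name, "n")
    }
```

The resulting dictionary is the candidate list to bisect: flip half of the differing options in the bad config, rebuild, retest the resume, and repeat on whichever half still shows the different behaviour.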

There's probably a bug here somewhere, but as it's working on our shipped config I'll close this issue, as we need to focus our time on other issues on this platform.
