On a Fedora 25 system running kernel 4.10.9, I have an RX480 with three Dell P2715Q monitors connected via DisplayPort.

Steps to reproduce:
1. Leave the machine alone until the monitors are put into sleep mode.
2. Move the mouse until the monitors show signs of coming up.

Expected: all monitors come up cleanly and an unlock screen is presented.

Actual: not all monitors come up. Some monitors indicate that no signal is coming. Which monitor or monitors fail to come up is non-deterministic.

Every time this happens, dmesg shows exactly three entries of the form:

[drm:amdgpu_atombios_dp_link_train [amdgpu]] *ERROR* displayport link status failed
[drm:amdgpu_atombios_dp_link_train [amdgpu]] *ERROR* clock recovery failed

It doesn't matter how many of the three monitors come up; dmesg always shows this message three times.

I've modified the failure point to print the return value of drm_dp_dpcd_read_link_status(), and it comes back as -5. I believe that is -EIO.

Also, switching to VT2 via Ctrl-Alt-F2 brings up all the monitors with a 100% success rate. Switching back to VT1 may:
* present a working unlock screen (20% of the time)
* present an unlock screen with Xorg locked up in a poll() call (50% of the time)
* completely crash Xorg (20% of the time)
* lock up the machine (10% of the time)

This procedure crashes Wayland 100% of the time.
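For anyone wanting to double-check what a raw return value like -5 means, here is a minimal sketch that looks it up in Python's errno table. The helper name describe_errno is made up for illustration, and the human-readable message text depends on the platform's C library.

```python
import errno
import os


def describe_errno(ret):
    """Translate a kernel-style negative return value into its errno name.

    describe_errno is a hypothetical helper for illustration only.
    """
    code = -ret if ret < 0 else ret
    # errno.errorcode maps numeric codes to symbolic names such as 'EIO'.
    name = errno.errorcode.get(code, "UNKNOWN")
    return "-%s (%s)" % (name, os.strerror(code))


# The value printed by the modified failure point in the report above.
print(describe_errno(-5))
```

On Linux, code 5 is EIO, the generic input/output error.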
OK, I had a short look into this. It seems that amdgpu_atombios_dp_aux_transfer() calls amdgpu_atombios_dp_process_aux_chan(), which gets a ucReplyStatus of either 2 or 3 back from atombios.

It would be most useful if you could attach dmesg logs after running

# echo 0xf > /sys/module/drm/parameters/debug

and waiting for the situation to reoccur.
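As a rough illustration of what the 0xf value requests, the sketch below decodes the low four bits of the drm debug mask. The bit names follow my understanding of the DRM_UT_* categories (core, driver, kms, prime); treat the table as an assumption and check the kernel's drm documentation for the authoritative list, which also has higher bits not shown here.

```python
# Assumed meaning of the low four drm.debug bits (DRM_UT_CORE, DRM_UT_DRIVER,
# DRM_UT_KMS, DRM_UT_PRIME). Higher bits exist but are not covered here.
DRM_DEBUG_BITS = {
    0x01: "core",
    0x02: "driver",
    0x04: "kms",
    0x08: "prime",
}


def decode_drm_debug(mask):
    """Return the names of the debug categories enabled by mask."""
    return [name for bit, name in DRM_DEBUG_BITS.items() if mask & bit]


print(decode_drm_debug(0xF))
```

Under that assumption, 0xf turns on all four of these categories at once, which is why the resulting log is so verbose.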
Created attachment 130957 [details] log around the time the problem happens (with excessive debug info)
(In reply to mr.nuke.me from comment #2)
> Created attachment 130957 [details]
> log around the time the problem happens (with excessive debug info)

Yes, OK, so we are indeed hitting 'ucReplyStatus == 2' from atombios. Someone from AMD will have to determine the problem with that then, because atombios is a closed component.
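To connect the two observations in this thread (ucReplyStatus of 2 or 3, and a return value of -5), here is a small Python model of the mapping. It is only a sketch of the behaviour described above, not the driver code: the table covers just the two status values named in the thread, and the function name is hypothetical.

```python
import errno

# Assumption drawn from this thread: a ucReplyStatus of 2 or 3 from atombios
# ends up as -EIO (-5) at the caller. Other status values are not modelled.
AUX_REPLY_STATUS_TO_ERRNO = {
    2: -errno.EIO,
    3: -errno.EIO,
}


def aux_result(reply_status):
    """Return the modelled error code, or None if the status is not covered."""
    return AUX_REPLY_STATUS_TO_ERRNO.get(reply_status)


for status in (2, 3):
    print(status, aux_result(status))
```

If this reading is right, the -5 seen in drm_dp_dpcd_read_link_status() is simply the AUX channel failure propagating upward.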
Are the monitors set to DP input or to auto-select? If they are on auto-select, does setting the input to DP explicitly help? I've seen auto-select mode have problems with DP many times, especially in scenarios like coming back from DPMS or S3 resume.
P2715Q does not have auto-select mode. They're always listening on the same input.
Wanted to add that I'm seeing this issue now under a similar setup. I've seen it in the past, but the last few kernel releases have been pretty smooth. Once I upgraded to GNOME 3.26, however, both 4.13 and now 4.14-rc3 are displaying this issue.

I'm on Arch (using Wayland primarily), with a Fury X and three Dell P2415Q monitors, also connected via DisplayPort. I have MST disabled on all three, since this model has issues hitting 4K@60Hz with it enabled (even Dell has documented this).

The same "displayport link status failed" and "clock recovery failed" messages appear for me, also three times in a row. More often than not this leads to gnome-shell crashing. I see it most often when I try to wake the computer after putting it to sleep. Other times, when putting it to sleep, one monitor will stay powered and show a backlit blank screen.
Created attachment 135668 [details] dmesg log error
I hit the same problem today after enabling amdgpu.dc=1
The screen doesn't light up at all if I boot the kernel with amdgpu.dc=1

Config is:
Fedora 27 + kernel 4.15.0-0.rc0.git7.1.fc28.x86_64
Radeon R9 380X
Dell U2414H

dmesg error is:
kernel: [drm:dm_logger_write [amdgpu]] *ERROR* perform_clock_recovery_sequence: Link Training Error, could not get CR after 100 tries.
(In reply to Benjamin Bellec from comment #8)
> I hit the same problem today after enabling amdgpu.dc=1
> The screen doesn't light up at all if I boot the kernel with amdgpu.dc=1

AFAICT this report is about the non-DC code, please file your own report about the issue with DC.
This has been a real problem for me as well for some time now, with amdgpu (Radeon RX560), Fedora 27, gnome-shell and a Dell P2715Q monitor. It happens on both Xorg and Wayland.

> kernel: [drm:amdgpu_atombios_dp_link_train [amdgpu]] *ERROR* displayport link status failed
> kernel: [drm:amdgpu_atombios_dp_link_train [amdgpu]] *ERROR* clock recovery failed
> kernel: [drm:amdgpu_atombios_dp_link_train [amdgpu]] *ERROR* displayport link status failed
> kernel: [drm:amdgpu_atombios_dp_link_train [amdgpu]] *ERROR* clock recovery failed

The nuisance here is this almost always crashes gnome-shell. Attached coredump excerpt.

I used to be able to circumvent the gnome-shell crash by disabling dpms ("xset -dpms" and/or "xset dpms force off") but this doesn't seem to help anymore.
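Since several reports here hinge on how many times these messages show up, a small script can make the counting less error-prone. This is a minimal sketch: the regular expression is written against the log lines quoted in this thread, the function name is made up, and the sample text just repeats the lines from the comment above.

```python
import re
from collections import Counter

# Matches the link-training errors exactly as they are quoted in this thread.
LINK_TRAIN_RE = re.compile(
    r"\[drm:amdgpu_atombios_dp_link_train \[amdgpu\]\] \*ERROR\* (?P<msg>.+)$"
)


def count_link_train_errors(dmesg_text):
    """Count each distinct link-training error message in dmesg output."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = LINK_TRAIN_RE.search(line)
        if match:
            counts[match.group("msg").strip()] += 1
    return counts


SAMPLE = """\
kernel: [drm:amdgpu_atombios_dp_link_train [amdgpu]] *ERROR* displayport link status failed
kernel: [drm:amdgpu_atombios_dp_link_train [amdgpu]] *ERROR* clock recovery failed
kernel: [drm:amdgpu_atombios_dp_link_train [amdgpu]] *ERROR* displayport link status failed
kernel: [drm:amdgpu_atombios_dp_link_train [amdgpu]] *ERROR* clock recovery failed
"""

for message, count in count_link_train_errors(SAMPLE).items():
    print(count, message)
```

In practice the text would come from saved dmesg or journal output rather than an inline sample.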
Created attachment 137017 [details] gnome-shell coredump after amdgpu displayport link status failed
(In reply to Dimitrios Liappis from comment #10)
> The nuisance here is this almost always crashes gnome-shell. Attached
> coredump excerpt.

FWIW, that's most likely a gnome-shell/mutter bug.
(In reply to Michel Dänzer from comment #12)
> FWIW, that's most likely a gnome-shell/mutter bug.

Thank you, this is indeed a mutter bug. I tracked it down to https://bugzilla.gnome.org/show_bug.cgi?id=789501, which has a specific patch for a monitor-manager/kms bug that fixes it.
Confirming still similar problem (Screen stays black while trying resume from suspend) 2x DELL u2415h + display port daisy chain + RX480 (amdgpu 18.0.1-2)
(In reply to Kimmo from comment #14)
> Confirming still similar problem (Screen stays black while trying resume
> from suspend)
> 2x DELL u2415h + display port daisy chain + RX480 (amdgpu 18.0.1-2)

Actually, I need to correct myself. The suspend problem seems to be fixed and has been working OK for me so far. The problem seems to occur when the Dell monitor is allowed to shut down by itself due to inactivity, but I'm not sure whether it has anything to do with amdgpu. Using KDE Plasma desktop 5.13.3. Sorry for the inconvenience.
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/158.