Bug 106529 - dc=1 kernels somehow trigger a disconnect of an lg ultrawide monitor during DP link training while attempting a wakeup
Status: RESOLVED MOVED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/AMDgpu
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Default DRI bug account
Reported: 2018-05-15 15:00 UTC by Mariusz Mazur
Modified: 2019-11-19 08:38 UTC

Attachments
ubuntu's 4.15 dmesg with amdgpu.dc_log=1 (90.83 KB, text/plain)
2018-05-15 15:00 UTC, Mariusz Mazur

Description Mariusz Mazur 2018-05-15 15:00:35 UTC
Created attachment 139577
ubuntu's 4.15 dmesg with amdgpu.dc_log=1

It appears that the display-handling code in 4.15+ (including 4.17-rc5) marks any display that is not fully woken up as disconnected, or something to that effect. So when I have multiple monitors and they wake from sleep (it doesn't matter whether the computer was suspended or just idle for a long period), and my primary display wakes up more slowly than any of the other displays (and it does), then for a few hundred ms KDE tries to change the primary display to the one that woke up first (I'm guessing it thinks that's the only one available), then changes its mind a bit later, once all displays have woken up.

End result looks like crap and is not usable, since all my windows get displaced. Here's what it looks like in practice:

Setup:
Radeon RX 560
Primary display: bottom right, DisplayPort, on
Secondary display 1: left, HDMI, on
Secondary display 2: top right, DVI, off

Video of how it used to work, using 4.4.15: https://www.youtube.com/watch?v=h7nMYbm5ZxU
Note how even though the left (secondary) display wakes up first, both instantly display their proper desktops.

Video of how it works with Ubuntu 18.04's default 4.15 kernel, 4.16.7, and 4.17-rc5: https://www.youtube.com/watch?v=VdVFNXPszSI
Note how when the left (secondary) display wakes up first, KDE decides that this is probably the only display available, so it tries to switch the primary desktop to it. A split second later the bottom right display wakes up, KDE knows *that* should be the primary according to the config, and it switches back.

Attached is the dmesg from Ubuntu's 4.15 with amdgpu.dc_log=1. I put the computer to sleep, then woke it up and got the behavior shown in the video.
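
For anyone trying to reproduce this, the connector state the kernel reports can be watched from userspace while the monitors wake up. The following is only a minimal sketch: it assumes the usual DRM sysfs layout, where each connector directory under /sys/class/drm (for example card0-DP-1, a name assumed here for illustration) has a `status` file. The helper names are hypothetical.

```python
from pathlib import Path


def connector_status(drm_root="/sys/class/drm"):
    """Map each connector directory name to the contents of its 'status' file."""
    result = {}
    for status_file in sorted(Path(drm_root).glob("card*-*/status")):
        result[status_file.parent.name] = status_file.read_text().strip()
    return result


def changed_connectors(before, after):
    """Return {name: (old_state, new_state)} for connectors whose state differs."""
    return {
        name: (before.get(name), state)
        for name, state in after.items()
        if before.get(name) != state
    }

# Possible usage: take one snapshot before sleep and several right after wakeup,
# then compare them with changed_connectors() to spot a transient "disconnected".
```

Polling like this only shows what the driver reports at each sample; it does not say why a connector was marked disconnected.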
Comment 1 Mariusz Mazur 2018-08-09 20:29:46 UTC
I've tried the same thing with only the DP ultrawide monitor connected and got the same issue. The DC logs, reading the code, and some kernel tracing (ftrace) of the DC link training code basically tell me that the important part is this:

21:23:16 [drm] [LKTN]        [DP][ConnIdx:0] RBRx4 pass VS=2, PE=1^
21:23:16 [drm] link=0, dc_sink_in=          (null) is now Disconnected
21:23:18 [drm] [LKTN]        [DP][ConnIdx:0] HBRx4 pass VS=1, PE=0^
21:23:18 [drm] link=0, dc_sink_in=00000000dbbee48e is now Connected
21:23:18 [drm] [LKTN]        [DP][ConnIdx:0] RBRx4 pass VS=1, PE=1^

Whatever's going on with the first attempt at link training, it ends up with the monitor disconnecting (not sure if deliberately or by causing some error in the monitor; didn't dive deep enough into the code).
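
To look for the same pattern in a longer dmesg, something like the following could be used. It is a rough sketch: the regular expressions are modelled only on the five log lines quoted above, so other DC log formats are not handled, and the function names are made up for this example.

```python
import re

# Patterns derived solely from the log excerpt above (an assumption about the format).
TRAINING = re.compile(r"\[LKTN\]\s+\[DP\]\[ConnIdx:(\d+)\]\s+(\w+)x(\d+) (\w+)")
SINK = re.compile(r"link=(\d+), dc_sink_in=\s*(\S+) is now (Connected|Disconnected)")


def parse_events(lines):
    """Turn log lines into (timestamp, kind, detail) tuples."""
    events = []
    for line in lines:
        timestamp = line.split(" ", 1)[0]
        m = TRAINING.search(line)
        if m:
            # detail looks like "RBRx4 pass": link rate, lane count, result
            events.append((timestamp, "training", f"{m.group(2)}x{m.group(3)} {m.group(4)}"))
            continue
        m = SINK.search(line)
        if m:
            events.append((timestamp, m.group(3).lower(), m.group(2)))
    return events


def disconnects_after_training(events):
    """Timestamps where a passed link training is directly followed by a disconnect."""
    hits = []
    for prev, cur in zip(events, events[1:]):
        if prev[1] == "training" and prev[2].endswith(" pass") and cur[1] == "disconnected":
            hits.append(cur[0])
    return hits


LOG = """\
21:23:16 [drm] [LKTN]        [DP][ConnIdx:0] RBRx4 pass VS=2, PE=1^
21:23:16 [drm] link=0, dc_sink_in=          (null) is now Disconnected
21:23:18 [drm] [LKTN]        [DP][ConnIdx:0] HBRx4 pass VS=1, PE=0^
21:23:18 [drm] link=0, dc_sink_in=00000000dbbee48e is now Connected
21:23:18 [drm] [LKTN]        [DP][ConnIdx:0] RBRx4 pass VS=1, PE=1^
""".splitlines()

print(disconnects_after_training(parse_events(LOG)))  # ['21:23:16']
```

On the excerpt above this flags the 21:23:16 disconnect, which comes right after the first RBRx4 training pass.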

Pre-DC codepaths did not have an issue like this at all until Michel Dänzer created this patch: https://patchwork.freedesktop.org/patch/209464/ for bug 105308 thereby introducing a problem with the same effects (DC monitor gets disconnected on wakeup, which on multi-display causes issues) via a quite different approach (a deliberate DRM_MODE_DPMS_OFF & ON). But that should be a separate bug, I think.

And since Michel's patch got applied to all the kernels, the effect is that on any up-to-date 4.15+ kernel the same issue shows up whether you're running dc=1 or dc=0, while the actual code paths causing it are different. (Which made this a PITA to figure out.)

Anyway, since I know nothing about DP link training and how to fix code relating to it, I've just bought a DP->HDMI cable thus "fixing" my problem entirely (including the second issue of a 2s lag with DP audio).

Hopefully someone in the future, a future man so to speak, will fix this issue eventually. Cause currently DP support is quite bad, it seems.
Comment 2 Michel Dänzer 2018-08-15 14:43:52 UTC
(In reply to Mariusz Mazur from comment #1)
> Pre-DC codepaths did not have an issue like this at all until Michel Dänzer
> created this patch: https://patchwork.freedesktop.org/patch/209464/ for bug
> 105308 thereby introducing a problem with the same effects (DC monitor gets
> disconnected on wakeup, which on multi-display causes issues) via a quite
> different approach (a deliberate DRM_MODE_DPMS_OFF & ON). But that should be
> a separate bug, I think.

FWIW, that change should have no direct effect on whether the display is considered connected or disconnected. The only change is that when the driver is notified that a DP display is "disconnected" (which can also happen without a physical disconnection, e.g. if the display is turned off), it doesn't immediately turn off the GPU's DP source anymore, but waits until either userspace asks to turn it off, or it gets notified that a display is "connected" again. Both "disconnect" / "connect" hotplug events are sent to userspace before and after this change. I suspect that immediately turning off the DP source simply happened to delay sending the "disconnect" hotplug event to userspace enough to avoid the issue on your system.
Comment 3 Mariusz Mazur 2018-08-15 16:08:25 UTC
With the dc=1 codepath the problem occurs 100% of the time the display wakes up (iirc).

With dc=0 and your patch it does not. It happens regularly, but not 100% of the time, which I guess does indicate there are some timing issues involved.
Comment 4 Alex Deucher 2018-08-15 17:29:15 UTC
I suspect it may have something to do with the pulse timing from the monitor. Long pulses are for connect/disconnect events and short pulses are a feedback mechanism from the monitor to the driver. I suspect the monitor is sending a short pulse that the driver is interpreting as a long pulse.
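
As a toy illustration of that hypothesis (not amdgpu code), the long/short distinction can be modelled as a threshold on how long the hot-plug line stays low. The 2 ms boundary and all names below are assumptions made for this sketch; the real limits come from the DisplayPort specification and the hardware.

```python
# Assumed boundary between a short (IRQ) pulse and a long (plug/unplug) pulse.
# Illustrative value only; not taken from this report or from the driver.
LONG_PULSE_MIN_MS = 2.0


def classify_hpd_pulse(low_duration_ms, threshold_ms=LONG_PULSE_MIN_MS):
    """Classify a hot-plug-detect pulse by how long the line stayed low."""
    if low_duration_ms < 0:
        raise ValueError("pulse duration cannot be negative")
    return "long" if low_duration_ms >= threshold_ms else "short"


def handle_pulse(low_duration_ms, threshold_ms=LONG_PULSE_MIN_MS):
    """Describe what a driver following this model would do with the pulse."""
    if classify_hpd_pulse(low_duration_ms, threshold_ms) == "long":
        return "hotplug: re-detect the sink, may report a disconnect"
    return "irq: read sink status, keep the connection"


# A 1.5 ms pulse is "short" under the assumed 2 ms boundary, but a receiver
# that effectively uses a tighter boundary would treat it as "long".
print(classify_hpd_pulse(1.5))
print(classify_hpd_pulse(1.5, threshold_ms=1.0))
```

In this model a pulse whose width sits close to the boundary is enough to turn a status interrupt into an apparent disconnect, which would match the timing-dependent behavior described in comment 3.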
Comment 5 Michel Dänzer 2018-08-16 07:36:50 UTC
Maybe one thing we could do is delay sending the disconnect hotplug event to userspace by some time (though at most until the next connect hotplug event).
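
A rough model of that idea, written as a plain simulation with explicit timestamps rather than kernel code. It assumes one possible reading of the suggestion: a disconnect that is followed by a connect within the grace period is never forwarded on its own. The class name, the 3000 ms grace period, and the event strings are all hypothetical.

```python
class HotplugDebouncer:
    """Hold back a 'disconnect' for grace_ms; cancel it if 'connect' arrives first."""

    def __init__(self, grace_ms):
        self.grace_ms = grace_ms
        self.pending_since = None  # time of the held-back disconnect, if any
        self.sent = []             # (time_ms, event) pairs forwarded to userspace

    def tick(self, now_ms):
        """Forward the held-back disconnect once its grace period has expired."""
        if self.pending_since is not None and now_ms - self.pending_since >= self.grace_ms:
            self.sent.append((self.pending_since + self.grace_ms, "disconnect"))
            self.pending_since = None

    def on_event(self, now_ms, event):
        """Feed a raw driver-side event into the debouncer."""
        self.tick(now_ms)
        if event == "disconnect":
            if self.pending_since is None:
                self.pending_since = now_ms
        elif event == "connect":
            self.pending_since = None  # cancel any held-back disconnect
            self.sent.append((now_ms, "connect"))
        else:
            raise ValueError(f"unknown event: {event}")


# The log in comment 1 shows roughly 2 s between "Disconnected" and "Connected",
# so a 3000 ms grace period (an arbitrary choice) would hide that transient.
debouncer = HotplugDebouncer(grace_ms=3000)
debouncer.on_event(0, "disconnect")
debouncer.on_event(2000, "connect")
print(debouncer.sent)  # [(2000, 'connect')]
```

The trade-off is that a real unplug is reported to userspace late by up to the grace period.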
Comment 6 Martin Peres 2019-11-19 08:38:27 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/386.

