Created attachment 138987 [details] dmesg When amdgpu.dc=1 and an Xorg compositor are enabled at the same time, there is stuttering when moving windows. It's most visible with Compton (which is completely stutter-free when amdgpu.dc=0), but also KWin and to a lesser extent also Gnome-Mutter. It happens also with GPU clocks forced to maximum, so it doesn't seem to be a powersaving issue. In the recent past, there was an issue with performance degrading when amdgpu.dc=1 and an Xorg compositor are enabled and the hardware mouse cursor was used, maybe it's still related (just guessing though)? Tested with Linux 4.17 RC1 and drm-next-4.18-wip (4.16.1.52132fd03) xorg-server 1.19.6+13+gd0d1a694f (amdgpu DDX & modesetting) RX 560
Created attachment 138988 [details] xorg log
Does this also happen without overriding the EDID of the DVI-D output?
Yes, then it also occurs with the monitor's resolution/refreshrate provided by its own edid (2560x1440 59.95Hz). Sorry, should have mentioned that. Without compositor, moving of windows looks smooth. I also tried all vsync settings provided by Compton, but all show the stuttery behavior with amdgpu.dc.
I found out that it's still related to the hardware cursor. When I set Option "SWCursor" "true" in Xorg config, moving windows is smooth. The mouse uses 1000Hz polling rate, if that makes any difference. To spot the stuttering best, set acceleration profile to flat for libinput. Btw: There is also a little problem with Redshift and hardware cursor, it turns more yellow/orange than the rest of the screen. Using software cursor works around this problem as well.
amdgpu.dc=1 also causes performance issue with 2 games I own: "Rise of the Tomb Raider" and "Helium Rain" (UE4 game with sources publicly available on github). I have 2 monitors (1st is 60Hz, 2nd in 144Hz). During tests I was using 1920x1080 resolution in both games which is 60Hz on both monitors. 1. With amdgpu.dc=0 everything is fine. 2. With amdgpu.dc=1: The issue was showing up only in menus when cursor was visible and was in the game window e.g. when Helium Rain was running on 2nd monitor (fullscreen) and I was moving mouse over its window - mouse was lagging/stuttering/not responding and Xorg server was producing messages like below in Xorg.0.log: (II) event5 - USB Gaming Mouse: SYN_DROPPED event - some input events have been lost. (EE) client bug: timer event5 debounce short: offset negative (-7ms) When I moved mouse to the 1st monitor the messages were not produced and mouse was not lagging. I tried few things (including switching from full dyntick system to idle dyntick) until finally MrCooper at #xorg-devel suggested trying amdgpu.dc=0 which fixed the issue. With amdgpu.dc=1 latencytop was showing that drm_modeset_backoff was taking ~20ms. I was using kernel 4.17.0-rc1 and 4.17.0-rc2, Mesa 18.2.0-devel (git-d136a5fad9) and X.Org X Server 1.19.99.904 (1.20.0 RC 4) with xf86-video-amdgpu and radeonsi. I'm using 1000Hz gaming mouse.
I forgot to mention: I have a Radeon Fury X graphics card.
Does your kernel tree have the following patches? 90fef6476917 Revert "drm/amd/display: disable CRTCs with NULL FB on their primary plane (V2)" c7bd22893408 Revert "drm/amd/display: fix dereferencing possible ERR_PTR()" If not can you grab the latest drm-next-4.18-wip and check again? Those reverts should have fixed problems where mouse movement would slow the system down.
I had these patches in the kernel tree - mine is from 22nd April, while these patches were committed on 12th April. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/?ofs=350
I got them too. Before those commits, my issue was way more severe. It's still really nasty stutter though.
I have the same issue. For me amdgpu.dc=0 does not really fix it either. I have a 3840x1600 monitor running at 75 Hz. This is the different behavior I noticed when toggling the DC setting: With amdgpu.dc=0, moving windows is smooth most of the time, but the cursor frequently skips frames. With amdgpu.dc=1, there is stutter/tearing in the movement of windows, but the cursor is otherwise completely smooth. Running 4.16.2 kernel here with a Radeon Pro WX 7100.
I noticed that this issue also exists apart from Xorg compositors. When I run Serious Sam: Fusion (both OpenGL and Vulkan) in fullscreen (no Xorg compositor enabled in the background), the mouse cursor (hardware cursor) in the main menu can be moved without stuttering. But as soon as I enable vsync in the game, its movement becomes stuttery. Again no problem with amdgpu.dc=0.
Latest drm-next-4.18-wip aa1bce17d841a362d40da940487e13affe4c7b3b still shows the same behavior. I'd be happy if more users would comment on this, since it makes use of amdgpu.dc totally impossible for me.
When I use modesetting driver with Option "PageFlip" "false", the stuttering is gone (however, as expected tearing is not fully prevented anymore). So there might be an actual connection to pageflipping?
(In reply to tempel.julian from comment #13) > So there might be an actual connection to pageflipping? Yeah, the problem seems to be a bad interaction between page flipping and cursor updates. FWIW, page flipping can be disabled with xf86-video-amdgpu as well, with Option "EnablePageFlip" "false"
I just tried that option with the xf86 amdgpu DDX driver and as expected, stuttering disappears in exchange for tearing close to the very top of the screen. I'm really glad you could confirm the issue, the absent reports of other users really worried me that I'd have to live forever with it.
I have this issue too, disabling page flipping fixes it for me on my vega10. It started with 4.16rc1 IIRC
https://patchwork.freedesktop.org/patch/227925/ might provide inspiration for how this could be solved.
My hypothesis is that it has something to do with the mouse polling rate. Could you set the polling rate to 125 Hz (8 ms) and see if the problem persists? This information will help us troubleshoot the problem. Set mouse polling rate: https://wiki.archlinux.org/index.php/mouse_polling_rate
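To put the polling rates in perspective, here is a minimal sketch of how many mouse reports arrive per display refresh. It is plain arithmetic, not anything taken from the driver, and the function name is made up for illustration.

```python
def cursor_events_per_refresh(polling_hz: float, refresh_hz: float) -> float:
    """Average number of mouse reports that arrive during one refresh interval."""
    if polling_hz <= 0 or refresh_hz <= 0:
        raise ValueError("rates must be positive")
    return polling_hz / refresh_hz


# Compare the two polling rates mentioned in this report on a 60 Hz display.
for polling_hz in (125, 1000):
    interval_ms = 1000.0 / polling_hz
    per_frame = cursor_events_per_refresh(polling_hz, 60.0)
    print(f"{polling_hz} Hz mouse: one report every {interval_ms:g} ms, "
          f"about {per_frame:.1f} reports per 60 Hz frame")
```

At 60 Hz this works out to roughly 2 reports per frame for a 125 Hz mouse and roughly 17 for a 1000 Hz mouse, so a lower polling rate would reduce the number of cursor updates per frame but not remove them.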
(In reply to David Francis from comment #18) > My hypothesis is that has something to do with the mouse polling rate. What is that hypothesis based on? The kernel is supposed to be able to process any number of DRM_IOCTL_MODE_CURSOR(2) ioctls in parallel with a DRM_IOCTL_MODE_PAGE_FLIP ioctl, without them interfering with each other. Most likely there's an issue in the DC code interfering with this. See the patch I referenced in comment 17 for an example of what might need to be done to solve this.
Note that the ioctls don't literally run "in parallel"; both ioctls are called by the Xorg main thread, so they can't preempt each other. What I mean is that any number of cursor ioctls can happen while a page flip is pending.
Don't want to nag at anyone, but this bug still makes DC unusable for me and thus is a real dealbreaker. Does implementing a fix for it require lots of efforts?
Created attachment 140858 [details] GALLIUM_HUD showing stuttering I can confirm the issue. Having a Radeon R9 290X (Hawaii XT), DC introduces heavy stuttering on a composited X desktop while interacting with the window manager. This stuttering can't be resolved by forcing high power states. Attached is a screenshot to give an impression of the stuttering. Top left is the GALLIUM_HUD with the compositor's frame rate and frame time graph. On the right, there is Firefox with the page https://testufo.com/photo#photo=quebec.jpg&pps=960&pursuit=0&height=0 having hardware acceleration force-enabled and also showing fps/frametime graphs. The web page has continuous movement, ensures that there is a screen update every frame and makes it easy to detect stutter. This looks perfectly fine and the graphs represent that as well. On the bottom there is the same setup but while changing window focus or moving a third window around. Firefox stutters as hell (suspecting vsync stuttering) and the graphs show that as well. Disabling DC resolves the issue completely and the bottom scenario would look and feel the same as the upper one with DC. In this current situation, I could disable DC and have a smooth desktop at the cost of several dozens of watts idle power or save power and use a stuttering desktop with DC enabled.
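For anyone who wants a number instead of eyeballing the frame time graph, the sketch below counts how many refreshes went by without a new frame. It assumes you have a list of frame times in milliseconds from some tool; the function name, the rounding rule and the sample values are invented for illustration and are not taken from the attached screenshot.

```python
def count_missed_refreshes(frame_times_ms, refresh_hz=60.0, tolerance=0.5):
    """Count refresh intervals that passed without a new frame.

    A frame that takes about one interval counts as on time; a frame that
    takes about two intervals means one refresh was missed, and so on.
    """
    interval_ms = 1000.0 / refresh_hz
    missed = 0
    for frame_ms in frame_times_ms:
        # Round to the nearest whole number of refresh intervals (at least 1).
        intervals = max(1, int(frame_ms / interval_ms + tolerance))
        missed += intervals - 1
    return missed


# Invented sample data: a steady run and a run with two long frames.
steady = [16.7] * 10
stuttering = [16.7, 33.3, 16.7, 50.0]
print("steady:", count_missed_refreshes(steady))
print("stuttering:", count_missed_refreshes(stuttering))
```

A steady 60 Hz run should report zero missed refreshes, while any frame that spans two or more intervals shows up in the count.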
If I bought Vega, Raven Ridge or, in the future, Navi, I'd be really annoyed by this bug because I had to turn off page flipping, resulting in unacceptable tearing. :( Could we please get an update?
The challenge here is that we still can't seem to reproduce this internally on any of our setups. Can anyone identify a commonality in setup to help isolate the reproducing behaviour?
It doesn't seem to be related to a certain GCN generation, as there are exactly matching reports for at least Hawaii, Polaris 10/11 and Vega 10 (probably also Fiji). It probably isn't related to the type of display output either, as I am using DL-DVI and grmat afaik uses DisplayPort. And I suppose we all tried the standard refresh rate of 60Hz without success. So, unfortunately, I am rather clueless. But I just noticed we haven't yet followed David's idea of setting a low "standard" mouse polling rate of 125Hz. I currently don't have my Radeon installed, so I can't give this a quick try (but can do so in the future). Does anybody have this issue with a native resolution of 1920x1080 60Hz? I haven't tested such a display yet.
Yes, I'm using DP (required for 144 Hz with WQHD). However, I just reproduced the issue on a 19" monitor with 1280x1024 at 60 Hz and with a cheap old mouse with a 100 Hz polling rate. The issue is not *that* bad with the low polling rate but still very much noticeable.
Is this commit related to it? https://lists.freedesktop.org/archives/amd-gfx/2018-October/027726.html
(In reply to tempel.julian from comment #27) > Is this commit related to it? > https://lists.freedesktop.org/archives/amd-gfx/2018-October/027726.html It shouldn't be. You would likely be experiencing a driver hang in this case because of the fault.
I gave it a try again: Unfortunately, there are no improvements to report with latest 4.21-wip vs. the status of some months ago. I really wonder how you can have trouble reproducing. This is not meant as a reproach, but it's really frustrating.
https://github.com/yshui/compton/issues/25 is related to this issue. Some tests that others and I did with compton show there is some relation between the vsync issues and the HW cursor. When I turn swcursor on in the Xorg config, both the KWin compositor and compton get significantly smoother. Similar behavior is noticeable in some games for me: the game stays smooth when playing with keyboard or gamepad, then you move the mouse and it starts stuttering hard. With swcursor I get a bit of input lag but smoother performance overall.
Note that SWcursor completely disables page flipping, at least with xf86-video-amdgpu, because the two things are fundamentally incompatible with each other. Does only disabling page flipping also avoid the problem?
(In reply to Michel Dänzer from comment #31) > Does only disabling page flipping also avoid the problem? Not from what I can tell. Setting Option "EnablePageFlip" "off" results in the log line "[ 35496.178] (II) AMDGPU(0): KMS Pageflipping: disabled", but obvious stuttering is still present with TearFree on.
I suppose TearFree forces pageflipping regardless, as we don't see any tearing with that configuration.
(In reply to tempel.julian from comment #33) > I suppose TearFree forces pageflipping regardless, as we don't see any > tearing with that configuration. Right, you'd have to disable TearFree as well. Can be done at runtime with xrandr --output <output name> --set TearFree off
(In reply to Michel Dänzer from comment #31) > Note that SWcursor completely disables page flipping, at least with > xf86-video-amdgpu, because the two things are fundamentally incompatible > with each other. Does only disabling page flipping also avoid the problem? Just tested it and yes, disabling pageflip also gets rid of stutter for me.
So, to help find the origin of the issue, there are a few options that get rid of stutter when compositing: 1 - amdgpu.dc=0 - The old (non-DC) display code seems unaffected by the bug. 2 - SWcursor on - Unaffected by the bug because it disables pageflipping. 3 - Pageflipping off.
I think software cursor would also be unusable even if it left pageflipping on. It causes nasty issues like flickering cursor or other visual corruption.
(In reply to tempel.julian from comment #37) > I think software cursor would also be unusable even if it left pageflipping > on. It causes nasty issues like flickering cursor or other visual corruption. Yes I also noticed those, I think we can open another issue for that.
(In reply to Michel Dänzer from comment #34) > > Right, you'd have to disable TearFree as well. Then I think the logs should represent that, even if the manpage tells me that TearFree uses page flipping. If I explicitly set it to off, and the log says so, I expect it to be off. And yes, disabling page flipping "resolves" the issue, but that's not new knowledge.
For the DC guys: We've now confirmed that the problem is due to some bad interaction between page flips and HW cursor updates. (In reply to tempel.julian from comment #37) > I think software cursor would also be unusable even if it left pageflipping > on. It causes nasty issues like flickering cursor or other visual corruption. Yeah, that's why xf86-video-amdgpu disables DRI page flipping while there's an SW cursor, as I said in comment 31. Note that the modesetting driver doesn't do this, allowing users to run into those issues. (In reply to grmat from comment #39) > (In reply to Michel Dänzer from comment #34) > > > > Right, you'd have to disable TearFree as well. > > Then I think the logs should represent that, even when the manpage tells me > that tearfree is using page flipping. If i set explicitly to off, and the > log says so, I expect it to be off. Patches or at least specific suggestions welcome, but I'm afraid it's tricky to describe all possible interactions concisely and clearly. DRI page flipping and TearFree are mostly separate things, but they use the same kernel page flipping mechanism, which is what matters for this issue.
Guys, please take a closer look at this. It's actually a lot worse than what the OP describes and affects a lot of other use cases. Vsync is a vital feature for any kind of PC activity; literally everything you do on a computer sucks with tearing. amdgpu.dc=1 has been the default for a few kernels and has been updated almost daily with features and various other fixes, but basic vital stuff like vsync and higher frequencies (flickering and screen glitches) has been broken for many people with all ranges of cards for a while now. BTW, TearFree is slow and stuttery even with old dc for me. I'd love to provide more info about any of those issues and help with testing, as I'm sure other users would; we just need a bit more attention from you devs. Features are awesome, I love to wake up every day to new mesa/drm features, but what I love even more is waking up to an annoying bug fixed.
This is pretty serious. Just moving the mouse cursor around while something slightly GPU-heavy is running at 60hz can produce frame-skipping. I switched the display core off with amdgpu.dc=0 and everything got significantly smoother and chromium doesn't chug on heavy pages any more. I'm using 4.19.x. I haven't tried the drm-next-4.21-wip tree yet.
(In reply to Brandon Wright from comment #42) > This is pretty serious. Just moving the mouse cursor around while something > slightly GPU-heavy is running at 60hz can produce frame-skipping. > > I switched the display core off with amdgpu.dc=0 and everything got > significantly smoother and chromium doesn't chug on heavy pages any more. > > I'm using 4.19.x. I haven't tried the drm-next-4.21-wip tree yet. No need to try drm-next-4.21-wip, I just did and it still has the issue. If devs want an easy test case, use these links for reproducing it in chromium: https://www.vsynctester.com/ https://www.testufo.com/photo https://www.slither.io Move the cursor around and move/resize some windows, and you will notice it. The vsync/cursor stutters and frame-skips are pretty noticeable with dc=1 on all three links. KWin, compton, TearFree, mutter, xfwm4 all have the same problems.
You're too late, I already tried it. But as you say, there's no improvement.
(In reply to bmilreu from comment #43) > If devs want an easy test case, use these links for reproducing it in > chromium: > > https://www.vsynctester.com/ > https://www.testufo.com/photo > https://www.slither.io > > move the cursor around, move/resize some windows. you will notice it > > the vsync/cursor stutters and frame-skips are pretty noticeable with dc=1 on > all three links > > KWin, compton, TearFree, mutter, xfwm4 all have the same problems. I just tried dc=1 and I only seem to have a problem if I use TearFree. Things are totally fine without TearFree. To be clear about what I'm doing here right now: I made sure DC is enabled: $ systool -vm amdgpu | grep dc dc = "1" $ dmesg | grep -i display [ 1.014297] [drm] Display Core initialized with v3.1.59! I removed TearFree from my X config: $ cat /etc/X11/xorg.conf.d/20-amdgpu.conf Section "OutputClass" Identifier "my amdgpu settings" MatchDriver "amdgpu" Option "DRI" "3" EndSection And I started Compton like this to make sure it's a clean config: $ compton --config /dev/null --backend glx --vsync opengl With this setup, I don't seem to have any stutter. I visited the websites you mention in a Chromium window, then opened another window and tried moving things around and resizing. It behaves fine, same as what I know from normally using dc=0. Kernel is 4.19.2, Mesa 18.2.4, Xorg 1.20.3, the GPU is a RX480, monitor is 60 Hz. After I had typed this, I have now added TearFree to the X config and restarted X: $ cat /etc/X11/xorg.conf.d/20-amdgpu.conf Section "OutputClass" Identifier "my amdgpu settings" MatchDriver "amdgpu" Option "TearFree" "true" Option "DRI" "3" EndSection Now, with TearFree enabled, things are super terrible. Trying to move a window around has extreme stutter, it seems to drop frames. If I restart Compton with "GALLIUM_HUD=fps" and then try moving a window around in circles, I can see it stays below 40 fps instead of hitting the 60 fps that it should be running at.
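The two "is DC enabled" checks above can also be scripted. This is a minimal sketch, assuming the module parameter is exposed at /sys/module/amdgpu/parameters/dc (the usual sysfs location for module parameters) and that the dmesg line has the format quoted above. The function names are made up for illustration.

```python
import re
from pathlib import Path

# Assumed sysfs location of the amdgpu.dc module parameter.
DC_PARAM = Path("/sys/module/amdgpu/parameters/dc")


def read_dc_param(path=DC_PARAM):
    """Return the amdgpu.dc value as reported by the kernel, or None if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def parse_dc_version(dmesg_line):
    """Extract the version from a 'Display Core initialized with vX.Y.Z!' line."""
    match = re.search(r"Display Core initialized with v(\d+(?:\.\d+)*)", dmesg_line)
    return match.group(1) if match else None


# Safe to run anywhere: prints None on systems without the amdgpu module.
print("amdgpu.dc =", read_dc_param())
print(parse_dc_version("[ 1.014297] [drm] Display Core initialized with v3.1.59!"))
```

The sysfs read only reports what the kernel was loaded with; the dmesg line is the better indication that the Display Core actually initialized.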
I've never run TearFree, so that's not the case here. My Xorg config is similar to yours, just amdgpu and DRI 3. I did have an extra section to use evdev instead of libinput, but I tried removing that and there's still no change.
FWIW, note that TearFree can be toggled at runtime using the RandR output property of the same name. At its default value "auto", TearFree is automatically enabled for an output using rotation / scaling / other transformations. (In reply to bmilreu from comment #41) > BTW, TearFree is slow and stuttery even with old dc for me. Sounds like the issue you're seeing with TearFree might be different from the one this report is about.
With amdgpu.dc=0, TearFree works as expected for me (no tearing without compositor, scrolling in Firefox windowed is free of stutter, no issues with compositor vsync either). I think we should leave TearFree out of this as it's entirely unrelated, apart from the fact that it forces pageflipping. Regarding the original issue with amdgpu.dc=1: Still totally unchanged for me with latest stable versions and also 4.21-wip, llvm-svn, mesa-git, libdrm-git, xorg-server 1.20.3 and modesetting / xf86-video-amdgpu-git on Arch. I'm getting an Asus Vega 56 Strix card tomorrow, which I will try instead of my current MSI Aero RX 560 card. But since there were already reports for Vega, I'm not hopeful.
I'm going to speculate that maybe the hardware cursor updates are triggering an update to the vsync timestamp counter or msc that's incorrect and throwing off the timing.
> I have this issue too, disabling page flipping fixes it for me on my vega10. It started with 4.16rc1 IIRC Negative. I checked back as far as the DC/DAL was integrated (4.15) and it's been there from the start. It's in the kernel somewhere, in the DC DRM layer above the device specific stuff. I looked in and couldn't see anything that's grossly problematic. I suspect Michel's suggestion for async cursor updates might be the fix, but I can't help wondering why the legacy DRM code is unaffected.
DC uses the atomic KMS interface, the old code uses the legacy KMS interface.
Ok, I think I understand what's going on. Forgive me if this sounds stupid, I'm looking at the DRM code for the first time. The old KMS interface uses what's flagged as "legacy" cursor updates. These are "asynchronous" in that they're handled and passed to the hardware as they come in. On the vertical retrace interrupt, it uses whatever the last data passed in was. My theory is the DC interface isn't passing these on to the hardware immediately. It's aggregating them until the next sync, when they're all handled at once. And that is what's causing the disturbance at page-flip time. High-report-rate mice might exacerbate it. Intel's driver hasn't merged that async code yet. It's still using legacy cursor updates and working around this. The DC code seems to have a TODO comment in amdgpu_dm.c that suggests something about the legacy_cursor_update flag, but it doesn't do anything with it.
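The theory above can be illustrated with a toy model. The sketch below is not derived from the amdgpu or DRM source. It simply assumes that each refresh can complete one queued commit, and compares a cursor move that needs its own refresh slot with one that is applied immediately. The function name and parameters are hypothetical.

```python
def frames_presented(vblanks, cursor_moving, cursor_needs_vblank):
    """Toy model: count page flips that complete within `vblanks` refreshes.

    Each refresh can complete at most one queued commit. The compositor wants
    to flip on every refresh. If `cursor_needs_vblank` is True, a pending
    cursor move is treated as a commit of its own that competes with the flip
    for the refresh slot; otherwise it is applied immediately at no cost.
    """
    flips = 0
    cursor_pending = False
    last_was_cursor = False
    for _ in range(vblanks):
        if cursor_moving:
            cursor_pending = True  # new motion arrived since the last refresh
        if cursor_needs_vblank and cursor_pending and not last_was_cursor:
            # The cursor commit takes this refresh; the flip has to wait.
            cursor_pending = False
            last_was_cursor = True
            continue
        # Any cursor move was applied without waiting; the flip completes.
        cursor_pending = False
        last_was_cursor = False
        flips += 1
    return flips


# One second at 60 Hz with the mouse in constant motion.
print("cursor waits for refresh:", frames_presented(60, True, True))
print("cursor applied immediately:", frames_presented(60, True, False))
```

In this model, continuous mouse motion halves the flip rate when cursor moves wait for a refresh and leaves it untouched when they don't. The exact ratio is an artifact of the simplification and should not be read as a measurement.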
(In reply to rropid from comment #45) > (In reply to bmilreu from comment #43) > > If devs want an easy test case, use these links for reproducing it in > > chromium: > > > > https://www.vsynctester.com/ > > https://www.testufo.com/photo > > https://www.slither.io > > > > move the cursor around, move/resize some windows. you will notice it > > > > the vsync/cursor stutters and frame-skips are pretty noticeable with dc=1 on > > all three links > > > > KWin, compton, TearFree, mutter, xfwm4 all have the same problems. > > I just tried dc=1 and I only seem to have a problem if I use TearFree. > Things are totally fine without TearFree. > > To be clear about what I'm doing here right now: > > I made sure DC is enabled: > > $ systool -vm amdgpu | grep dc > dc = "1" > $ dmesg | grep -i display > [ 1.014297] [drm] Display Core initialized with v3.1.59! > > I removed TearFree from my X config: > > $ cat /etc/X11/xorg.conf.d/20-amdgpu.conf > Section "OutputClass" > Identifier "my amdgpu settings" > MatchDriver "amdgpu" > Option "DRI" "3" > EndSection > > And I started Compton like this to make sure it's a clean config: > > $ compton --config /dev/null --backend glx --vsync opengl > > With this setup, I don't seem to have any stutter. I visited the websites > you mention in a Chromium window, then opened another window and tried > moving things around and resizing. It behaves fine, same as what I know from > normally using dc=0. > > Kernel is 4.19.2, Mesa 18.2.4, Xorg 1.20.3, the GPU is a RX480, monitor is > 60 Hz. > > After I had typed this, I have now added TearFree to the X config and > restarted X: > > $ cat /etc/X11/xorg.conf.d/20-amdgpu.conf > Section "OutputClass" > Identifier "my amdgpu settings" > MatchDriver "amdgpu" > Option "TearFree" "true" > Option "DRI" "3" > EndSection > > Now, with TearFree enabled, things are super terrible. Trying to move a > window around has extreme stutter, it seems to drop frames. 
If I restart > Compton with "GALLIUM_HUD=fps" and then try moving a window around in > circles, I can see it stays below 40 fps instead of hitting the 60 fps that > it should be running at. "compton --vsync opengl" is a case that is less affected (or not affected at all) by this in my setup. Try --vsync opengl-swc, --vsync opengl-oml or --vsync opengl-mswc. Also try other compositors: KWin, mutter, xfwm4.
You should btw. also set CPU clock governor to either acpi-cpufreq performance or intel_pstate performance, since governors like powersave, ondemand or schedutil can already cause severe stuttering at vsynctester.com, even without a compositor. The result should be 100% stutter free, at least that's the case for me with amdgpu.dc=0. This way you should be able to be absolutely sure if the result is badly affected by amdgpu.dc=1.
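To rule the governor out before testing, something like the following can list the current governor per CPU. It is a minimal sketch that assumes the standard cpufreq sysfs layout under /sys/devices/system/cpu; the function names are made up for illustration.

```python
from pathlib import Path


def read_governors(cpu_root="/sys/devices/system/cpu"):
    """Map each CPU directory name (e.g. 'cpu0') to its current scaling governor."""
    governors = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for gov_file in sorted(Path(cpu_root).glob(pattern)):
        try:
            governors[gov_file.parent.parent.name] = gov_file.read_text().strip()
        except OSError:
            continue  # CPU offline or file unreadable; skip it
    return governors


def not_performance(governors):
    """Return the CPUs whose governor is anything other than 'performance'."""
    return sorted(cpu for cpu, gov in governors.items() if gov != "performance")


current = read_governors()
print("governors:", current)
print("not on performance:", not_performance(current))
```

If the second list is non-empty, switch those CPUs to the performance governor before judging whether the stutter comes from amdgpu.dc=1.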
Created attachment 142558 [details] [review] Patch that "fixes" the problem. I've attached a patch that fixes the problem for me. It copies parts from the intel patch and uses the existing async infrastructure for the cursor. It's really tiny, so I hope this is helpful enough to get this problem fixed quick.
(In reply to Brandon Wright from comment #55) > Created attachment 142558 [details] [review] [review] > Patch that "fixes" the problem. > > I've attached a patch that fixes the problem for me. It copies parts from > the intel patch and uses the existing async infrastructure for the cursor. > > It's really tiny, so I hope this is helpful enough to get this problem fixed > quick. Tested, and it solves the problem for me on a Polaris RX580. This also solves my stuttering with TearFree, which makes it possible to avoid using a compositor only for vsync. Games that stuttered with mouse movement are also fixed. Please review and push this asap as a fix, you are a hero.
@Brandon Wright Sorry for double posting, but I think if you send the patch to amd-gfx mailing-list directly it might get reviewed faster.
(In reply to Brandon Wright from comment #55) > Created attachment 142558 [details] [review] [review] > Patch that "fixes" the problem. > > I've attached a patch that fixes the problem for me. It copies parts from > the intel patch and uses the existing async infrastructure for the cursor. > > It's really tiny, so I hope this is helpful enough to get this problem fixed > quick. This is a nice attempt but it only resolves the problem because it relies on the blocking behavior in atomic check that amdgpu_dm currently does (and shouldn't be doing). Asynchronous updates can and will occur in parallel with other commits on worker threads. Without the wait in atomic_check you'll see the IGT legacy cursor tests break with this patch (and there will probably be system faults as well). There are larger problems within amdgpu_dm's commit tail that if addressed should resolve this issue for compton I'd imagine.
(In reply to Nicholas Kazlauskas from comment #58) > (In reply to Brandon Wright from comment #55) > > Created attachment 142558 [details] [review] [review] [review] > > Patch that "fixes" the problem. > > > > I've attached a patch that fixes the problem for me. It copies parts from > > the intel patch and uses the existing async infrastructure for the cursor. > > > > It's really tiny, so I hope this is helpful enough to get this problem fixed > > quick. > > This is a nice attempt but it only resolves the problem because it relies on > the blocking behavior in atomic check that amdgpu_dm currently does (and > shouldn't be doing). > > Asynchronous updates can and will occur in parallel with other commits on > worker threads. Without the wait in atomic_check you'll see the IGT legacy > cursor tests break with this patch (and there will probably be system faults > as well). > > There are larger problems within amdgpu_dm's commit tail that if addressed > should resolve this issue for compton I'd imagine. Since you've been working on Freesync, you should know your patches are also affected by this bug in some wine games. Any chance you could kindly try to tackle this? btw, I don't have igt on my system atm, nor have I gotten any system fault yet with the patch. I really need dc for the extra headphone jack, mine is broken atm :(
> There are larger problems within amdgpu_dm's commit tail that if addressed > should resolve this issue for compton I'd imagine. Honestly, I don't care about compton. I don't think you realize the effects of this issue. It seriously affects performance when the cursor is in motion with any page-flipping application. GNOME and KDE, while the window motion is less affected, stutter in composited client applications. > This is a nice attempt but it only resolves the problem because it relies on > the blocking behavior in atomic check that amdgpu_dm currently does > (and shouldn't be doing). > > Asynchronous updates can and will occur in parallel with other commits on > worker threads. Without the wait in atomic_check you'll see the IGT legacy > cursor tests break with this patch (and there will probably be system faults > as well). You'd have to point this out to me, because I didn't see anything that would obviously block, unless it's buried in dc_validate_plane. Since, as you say, atomic_check is blocking for now, why not work around this issue with a tiny change. If someone ever gets around to doing things the correct way it's no big deal to remove.
Thanks a lot @ Brandon Wright, your patch really does the trick. I also totally agree on your opinion that it should be mainlined as at least a temporary solution (and also get backported to older kernels). I just noticed that it works fine with xf86-video-amdgpu driver, but with modesetting driver, xorg or the driver freezes when starting/logging in. Not sure if this is related to latest 4.21-wip-changes or the cursor patch though.
(In reply to tempel.julian from comment #61) > I just noticed that it works fine with xf86-video-amdgpu driver, but with > modesetting driver, xorg or the driver freezes when starting/logging in. Not > sure if this is related to latest 4.21-wip-changes or the cursor patch > though. I'm getting the modesetting freeze, too, on 4.20-rc3, so it's likely the cursor patch. I called it a "fix", in quotation marks for a reason. I've barely looked at the KMS/DRM stuff for an hour, so I have no clue what I'm doing. I just wanted to show the AMD guys that we have pinpointed the problem, give them something that we can confirm no longer produces the problem, and hope that they'd go ahead and do things correctly.
(In reply to Brandon Wright from comment #62) > (In reply to tempel.julian from comment #61) > > I just noticed that it works fine with xf86-video-amdgpu driver, but with > > modesetting driver, xorg or the driver freezes when starting/logging in. Not > > sure if this is related to latest 4.21-wip-changes or the cursor patch > > though. > I'm getting the modesetting freeze, too, on 4.20-rc3, so it's likely the > cursor patch. I called it a "fix", in quotation marks for a reason. I've > barely looked at the KMS/DRM stuff for an hour, so I have no clue what I'm > doing. I just wanted to show the AMD guys that we have pinpointed the > problem, give them something that we can confirm no longer produces the > problem, and hope that they'd go ahead and do things correctly. It's probably easy to make the workaround only activate on xf86-video-amdgpu. Luckily I don't need the modesetting driver for anything that I'm aware of. What do you guys use that driver for? Is it for GPU switching?
Created attachment 142574 [details] [review] 0001-drm-amd-display-Add-fast-path-for-legacy-cursor-plan.patch This patch is similar to the async_update one but it takes care to lock if anything is modifying the plane. It's very close to what i915 does with a few minor differences with framebuffer handling. I've tested it for compton with Gallium HUD up and I no longer see the issue on mouse movement (cursor fb changes are still a bit slow, so you'll still probably see spikes on cursor changes). You can try this on top of amd-staging-drm-next and I'd imagine it'd fix your problems.
(In reply to Nicholas Kazlauskas from comment #64) > Created attachment 142574 [details] [review] [review] > 0001-drm-amd-display-Add-fast-path-for-legacy-cursor-plan.patch > > This patch is similar to the async_update one but it takes care to lock if > anything is modifying the plane. It's very close to what i915 does with a > few minor differences with framebuffer handling. > > I've tested it for compton with Gallium HUD up and I no longer see the issue > on mouse movement (cursor fb changes are still a bit slow, so you'll still > probably see spikes on cursor changes). > > You can try this on top of amd-staging-drm-next and I'd imagine it'd fix > your problems. Patch does work for me. Is there an easy way to backport this to 4.19 mainline? Would be very useful to integrate the fix into stable kernels. As it is currently it wont work on 4.19 because it uses <drm/drm_atomic_uapi.h> which isnt mainlined yet. Brandon's hack works on 4.19 just in case it matters. Last question, is this patch https://patchwork.freedesktop.org/patch/263412/ you just submitted related to this issue? Thanks a LOT for tackling this Nicholas and Brandon
(In reply to bmilreu from comment #65) > Is there an easy way to backport this to 4.19 mainline? Would be very useful > to integrate the fix into stable kernels. > > As it is currently it won't work on 4.19 because it uses > <drm/drm_atomic_uapi.h> which isn't mainlined yet. Brandon's hack works on > 4.19 just in case it matters. Just remove the header include. There was some refactoring, and the functions needed from that file are in the other included headers. > Last question, is this patch https://patchwork.freedesktop.org/patch/263412/ > you just submitted related to this issue? Looks like it's related. Thanks for taking on our issue, Nicholas.
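For anyone scripting the 4.19 backport, the "remove the header include" step could be sketched roughly as below. This is only an illustration: the helper name `strip_include` is hypothetical, and the source file it should be run on (after applying the patch, before compiling) is not named in this thread, so it is left to the caller.

```python
import re

# Header named in the thread; per the comment above it is not in 4.19 mainline.
HEADER = "drm/drm_atomic_uapi.h"


def strip_include(source: str, header: str = HEADER) -> str:
    """Return `source` with every '#include <header>' line removed.

    Hypothetical helper: it only edits text and knows nothing about the
    kernel tree or which file the patch touches.
    """
    pattern = re.compile(r"^\s*#\s*include\s*<" + re.escape(header) + r">\s*$")
    kept = [
        line
        for line in source.splitlines(keepends=True)
        if not pattern.match(line)
    ]
    return "".join(kept)
```

All other includes are left untouched, so the functions that moved during the refactoring are still picked up from the headers that already exist on 4.19.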
Just wanted to note that applying [PATCH 1/2] drm/amd/display: Use private obj helpers for dm_atomic_state [PATCH 2/2] drm/amd/display: Remove wait for hw/flip done in atomic check does not solve/workaround the issue, unlike Brandon's patch.
(In reply to tempel.julian from comment #67) > Just wanted to note that applying > [PATCH 1/2] drm/amd/display: Use private obj helpers for dm_atomic_state > [PATCH 2/2] drm/amd/display: Remove wait for hw/flip done in atomic check > does not solve/workaround the issue, unlike Brandon's patch. try 0001-drm-amd-display-Add-fast-path-for-legacy-cursor-plan.patch
(In reply to bmilreu from comment #68) > try 0001-drm-amd-display-Add-fast-path-for-legacy-cursor-plan.patch That one works, also with modesetting driver. Regarding your question whether the modesetting driver is of any benefit: I'd say generally not, as it doesn't offer every feature of the xf86 DDX driver. But it can be sufficient in many cases, and I also just found a bug with the xf86 driver + amdgpu.dc=1 causing stutter in mpv. So I'm lucky to have modesetting as a fallback in the meantime.
Comment on attachment 142558 [details] [review] Patch that "fixes" the problem. Marked my patch obsolete.
@Nicholas Kazlauskas any reason not to push this fix to staging or next?
(In reply to bmilreu from comment #71) > @Nicholas Kazlauskas > any reason not to push this fix to staging or next? I agree. This will reduce stuttering for everyone, especially those who think the problem is caused elsewhere and just discount it as bad software or graphics card performance like I did.
Yeah, it'd be extremely disappointing if this didn't land before the Linux 4.21 DRM merge window closes. Like I already said, I think this is even worth backporting to older kernels, as I'd consider it an important fix. Likely every AMD Xorg user has degraded performance because of this.
Is anyone from the AMD driver team still following this? Could we please have a review of Nicholas's patch and try to get it into 4.20? It's not that disruptive code-wise, but it makes a big smoothness difference. I can quickly compile a kernel/module for myself pretty easily, but most users aren't going to be that technical or even know why things are so stuttery.
Any update, please?
https://patchwork.freedesktop.org/series/53589/ A new patch has been submitted. So it's in the pipeline for inclusion now.
@Nicholas Kazlauskas Is there anything important in the new patch vs. the first one? It fails a hunk on 4.19 for me. Thanks for submitting it to amd-gfx.
It's queued now for 4.21: https://cgit.freedesktop.org/~agd5f/linux/commit/?h=drm-next-4.21-wip&id=0e65ba74dbd61f54f2dc74035d07490d5fd99a38 Thanks, guys! Tested with latest 4.21-wip kernel and seems to be working fine on a quick try. I just want to mention here that Alex assumes the following bug regarding stutter with gamma adjustments is also rooted in the driver being atomic: https://bugs.freedesktop.org/show_bug.cgi?id=108917 Would be nice if this could be solved a bit faster than the cursor issue (Nicholas Kazlauskas to the rescue? :) ). I guess I can close this ticket regarding the cursor now.
Any chance of backporting the latest version of the patch to 4.20?
It seems like the issue is actually not 100% resolved (linux 5.0-rc1). The moving of windows is free of stutter now, but moving of windows can still negatively affect performance of other windows as long as fullscreen vsync is enabled (not necessarily via compositor, can also be done via TearFree without a compositor). Again, this is best seen on https://www.vsynctester.com/ . This seems to happen mostly when mouse clicks occur, but sometimes also apart from this. It can also happen when just moving the mouse cursor repeatedly on top of shell elements which trigger pop ups, like the system tray of KDE Plasma. As expected, setting amdgpu.dc=0 completely "fixes" the issue.
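Since so many results in this thread hinge on whether the kernel was booted with amdgpu.dc=0 or amdgpu.dc=1, it can help to confirm the setting before comparing. A minimal sketch, assuming the parameter was passed on the kernel command line (which can be read from /proc/cmdline) and assuming the last occurrence wins if it appears more than once; the function name is hypothetical:

```python
from typing import Optional


def amdgpu_dc_from_cmdline(cmdline: str) -> Optional[int]:
    """Return the last amdgpu.dc=<n> value found on a kernel command line.

    Returns None when the parameter is absent (the driver's default then
    applies) or when its value is not an integer.
    """
    value = None
    for token in cmdline.split():
        key, sep, raw = token.partition("=")
        if key == "amdgpu.dc" and sep:
            try:
                value = int(raw)
            except ValueError:
                continue
    return value


# Typical use on a running system:
#   with open("/proc/cmdline") as f:
#       print(amdgpu_dc_from_cmdline(f.read()))
```

Note that this only covers the command line; a value set through a modprobe configuration file would not show up here.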
(In reply to tempel.julian from comment #80) > It seems like the issue is actually not 100% resolved (linux 5.0-rc1). > The moving of windows is free of stutter now, but moving of windows can > still negatively affect performance of other windows as long as fullscreen > vsync is enabled (not necessarily via compositor, can also be done via > TearFree without a compositor). > Again, this is best seen on https://www.vsynctester.com/ . This seems to > happen mostly when mouse clicks occur, but sometimes also apart from this. > It can also happen when just moving the mouse cursor repeatedly on top of > shell elements which trigger pop ups, like the system tray of KDE Plasma. > As expected, setting amdgpu.dc=0 completely "fixes" the issue. I'm not sure how much of this is actually amdgpu or Plasma. I can reproduce what you're reporting - red lines and spikes in the graph on vsynctester.com. This happens whenever I do something like open the dock or volume widgets in the tray on Plasma with the compositor tearing prevention set to automatic. However, moving the cursor or moving windows doesn't really seem to affect this and you can verify that in Plasma and other compositors. As for the difference between dc=1 and dc=0, that might just be a difference in behavior on the DRM level for atomic vs non-atomic drivers. Or a difference in userspace if they make a distinction there. It affects more than just amdgpu at least.
Could you please try the following?
-disable Plasma compositing with Ctr + Alt + F12 (or in the compositor settings and log out and in again)
-get the latest Compton release by yshui: https://github.com/yshui/compton
-start Compton (Mesa still enables vsync automatically in the latest release, this has changed with compton-git)
-open a Dolphin window and www.vsynctester.com in a web browser
-now click Dolphin's title bar, move it a bit, release it and repeat the procedure several times in a row
-there should be stutter happening in the vsynctester.com browser window

-now quit Compton and start it with "vblank_mode=0 compton" to enable GLX compositing without vsync
-now repeating the Dolphin window-moving procedure described above shouldn't lead to any stutter
-when booting with amdgpu.dc=0, there also shouldn't be any stutter when vsync is kept enabled

My layman impression is that this indicates that the problem originates somewhere in the xorg/driver space, and not in Plasma or the compositor.
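The two Compton runs in the steps above differ only in the Mesa `vblank_mode` environment variable, so the comparison could be scripted along these lines. This is a sketch, not a tested reproduction: the helper name is hypothetical, and it deliberately passes no extra Compton flags because none are given in this thread.

```python
import os
from typing import Dict, List, Mapping, Optional, Tuple


def compton_launch(
    vsync: bool, base_env: Optional[Mapping[str, str]] = None
) -> Tuple[List[str], Dict[str, str]]:
    """Build argv and environment for one of the two Compton runs.

    vsync=True  -> plain "compton" (Mesa's default vsync behaviour applies)
    vsync=False -> equivalent of running "vblank_mode=0 compton"
    """
    env = dict(os.environ if base_env is None else base_env)
    if vsync:
        # Make sure a leftover override from the shell does not leak in.
        env.pop("vblank_mode", None)
    else:
        env["vblank_mode"] = "0"
    return ["compton"], env
```

The result can be handed to `subprocess.Popen(argv, env=env)`; the window dragging and the observation on vsynctester.com still have to be done by hand.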
Edit: Sorry, I meant disabling KWin compositing via Shift + Alt F12, not Ctrl.
(In reply to Nicholas Kazlauskas from comment #81) > As for the difference between dc=1 and dc=0, that might just be a difference > in behavior on the DRM level for atomic vs non-atomic drivers. Or a > difference in userspace if they make a distinction there. xf86-video-amdgpu doesn't, FWIW.
(In reply to tempel.julian from comment #82) > Could you please try the following? > > -disable Plasma compositing with Ctr + Alt + F12 (or in the compositor > settings and log out and in again) > -get the latest Compton release by yshui: https://github.com/yshui/compton > -start Compton (Mesa still enables vsync automatically in the latest > release, this has changed with compton-git) > -open a Dolphin window and www.vsynctester.com in webbrowser > -now click Dolphin's title bar, move it a bit, release it and repeat the > procedure several times in a row > -there should be stutter happening in the vsynctester.com browser window > > -now quit Compton and start it with "vblank_mode=0 compton" to enable GLX > compositing without vsync > -now the repeatedly moving Dolphin's window procedure described above > shouldn't lead to any stutter > -when booting with amdgpu.dc=0, there also shouldn't be any stutter when > vsync is kept enabled > > My layman impression is that this indicates the problem originating > somewhere in the xorg/driver space, and not in Plasma or the compositor. I've tried the setup you've described but I see no stuttering in vsynctester and no difference with vblank_mode=0. The perf overlay on compton itself looks stutter free as well. I only see the stuttering doing what you first described with the dock widgets/tray widgets.
Thanks for trying to reproduce. Hm, that leaves me clueless. It's 100% reproducible here and I tried things like forcing high performance profile, using plain 60Hz edid and disabling overclock, which all don't show any effect in that regard. I perhaps should attach new logfiles.
It's important to distinguish between when the compositor and application desync and when there's actually a driver hiccup. If the compositor and application swaps get switched around briefly, it'll produce stutter, but this won't be reflected on the vsynctester graph (it'll stay green). If it's red, that means the application missed a frame. The former is strictly the compositor's fault.
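To make the "missed a frame" case concrete: a frame counts as missed when the gap to the previous frame is clearly longer than one refresh period. The sketch below illustrates that idea on a list of frame timestamps. It is not vsynctester.com's actual algorithm, and the function name and the `tolerance` parameter are assumptions made for the example.

```python
from typing import List


def missed_frames(
    timestamps_ms: List[float], refresh_hz: float, tolerance: float = 0.5
) -> List[int]:
    """Return indices of frames that arrived late.

    Frame i is flagged when the gap since frame i-1 exceeds one refresh
    period by more than `tolerance` periods (assumed threshold).
    """
    period = 1000.0 / refresh_hz
    late = []
    for i in range(1, len(timestamps_ms)):
        gap = timestamps_ms[i] - timestamps_ms[i - 1]
        if gap > period * (1.0 + tolerance):
            late.append(i)
    return late
```

A compositor/application desync would not show up in such a list at all, since every frame still arrives on time; it is only presented in the wrong order relative to the compositor's swap.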
Well, there's no compositor to blame, as the issue also occurs with just TearFree enabled without any compositor. Then you just have to avoid moving another window on top of the browser window, as there are no independent window layers without a compositor. But yes, there are red spikes of the "inter-frame (green/red) right scale" with amdgpu.dc=1, while there are none with amdgpu.dc=0 (TearFree however doesn't work well in general without amdgpu.dc=1). I can also reproduce the issue with mpv playing a video, instead of a web browser. It stutters the same when doing the "clicky window moving" described above. Attaching recent dmesg and xorg log.
Created attachment 143080 [details] new dmesg log for reopened cursor issue
Created attachment 143081 [details] new xorg.0 log for reopened cursor issue
I think I've narrowed down the culprit: The issue does not occur with stable version 18.1.0 of xf86-video-amdgpu DDX driver. But it does occur with both xorg modesetting and latest git commit of xf86-video-amdgpu. I'll try to find the faulty commit of the xf86 DDX driver.
This is the commit with which the stutter is introduced: https://gitlab.freedesktop.org/xorg/driver/xf86-video-amdgpu/commit/0d60233d26ec70d4e1faa343b438e33829c6d5e4 (But unfortunately amdgpu.dc=1 also seems to be a bit more prone to stutter in general when new windows appear, compared to the old display stack.)
I suppose to keep reports clean, you'd want me to create a new ticket for this new issue. Thus I went ahead: https://bugs.freedesktop.org/show_bug.cgi?id=109332 Closing again.
(In reply to tempel.julian from comment #88) > (TearFree however doesn't work well in general without amdgpu.dc=1). Can you elaborate on that? Maybe file another report on it.
(In reply to Michel Dänzer from comment #94) > (In reply to tempel.julian from comment #88) > > (TearFree however doesn't work well in general without amdgpu.dc=1). > > Can you elaborate on that? Maybe file another report on it. For me, at least, it hiccups regularly every second and introduces noticeable latency. With dc=1 it's smooth and more responsive.
Yep, that's my observation as well (no compositor vsync enabled at the same time). I also noticed that sometimes switching between Vulkan fullscreen windows (radv) and desktop can break TearFree, it needs to be reactivated then. I however would like to find a 100% reliable way to reproduce it before reporting.
(In reply to tempel.julian from comment #96) > I also noticed that sometimes switching between Vulkan fullscreen windows > (radv) and desktop can break TearFree, it needs to be reactivated then. I > however would like to find a 100% reproducible way to reproduce first before > reporting. Make sure you test with https://gitlab.freedesktop.org/xorg/driver/xf86-video-amdgpu/merge_requests/15 before reporting. :)
(In reply to Brandon Wright from comment #95) > For me, at least, it hiccups regularly every second and introduces > noticeable latency. With dc=1 it's smooth Thanks, I was able to reproduce the hiccups with vsynctester.com in windowed Firefox. > and more responsive. Couldn't see a difference there (looking at how much the red dot trails the mouse cursor on vsynctester.com). Ironically, the problem is actually that non-DC can be too responsive, sometimes completing page flips in the same vertical blank period. https://gitlab.freedesktop.org/xorg/driver/xf86-video-amdgpu/merge_requests/23 fixes it for me.
Does the trick for me too, TearFree with amdgpu.dc=0 seems to be completely smooth now. Delay / input latency seems to be the same between amdgpu.dc=1 and 0, I suppose this is as low as it can be with traditional vsync (without variable vblank).
(In reply to tempel.julian from comment #99) > Does the trick for me too, TearFree with amdgpu.dc=0 seems to be completely > smooth now. Delay / input latency seems to be the same between amdgpu.dc=1 > and 0, I suppose this is as low as it can be with traditional vsync (without > variable vblank). Confirmed here, too.
For me the TearFree option works flawlessly with amdgpu.dc=0 on 5.0-rc2 and 4.20.1, but 3D games lag. With amdgpu.dc=1 there are some slight hiccups every 2-3 seconds (Firefox's autoscrolling, video playback etc.), but 3D works without noticeable issues.
Is that with latest git version of the xf86 DDX driver, including the PR Michel posted? I had subpar game performance (looked like half of the fps) too without vsync the last time I checked before the aforementioned PR, but everything looks well here now (apart from TearFree breaking issue I just opened).
With the latest xf86 DDX driver 3d works better in amdgpu.dc=0, but still lags, even though a built-in counter is showing high fps. Does amdgpu.dc=1 improve 3d performance that much?
(In reply to Hans D from comment #103) > With the latest xf86 DDX driver 3d works better in amdgpu.dc=0, but still > lags, even though a built-in counter is showing high fps. Does amdgpu.dc=1 > improve 3d performance that much? It shouldn't affect 3D performance directly at all. I'm not sure what exactly you mean by "lags", so it's hard to guess what's going on.
I have installed xf86-video-amdgpu 19.0.0 on Solus but the issue is still present. I'm running Plasma 5.14 and kernel 4.20.10 with mesa 18.3.3. Is there something else I should do to fix the stuttering?