Bug 102646 - Screen flickering under amdgpu-experimental [buggy auto power profile]
Summary: Screen flickering under amdgpu-experimental [buggy auto power profile]
Status: NEW
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/AMDgpu
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: Default DRI bug account
QA Contact:
URL:
Whiteboard:
Keywords:
Duplicates: 96868 105300
Depends on:
Blocks:
 
Reported: 2017-09-10 17:06 UTC by Justin Mitzel
Modified: 2018-12-02 09:17 UTC

See Also:
i915 platform:
i915 features:


Attachments
The Dmesg output (80.63 KB, text/plain)
2017-09-12 21:52 UTC, Justin Mitzel
Current Xorg log (92.28 KB, text/plain)
2017-09-12 21:53 UTC, Justin Mitzel
Output of journalctl -k (94.50 KB, text/plain)
2017-09-12 21:57 UTC, Justin Mitzel
New Xorg Log (115.30 KB, text/x-log)
2017-09-17 00:22 UTC, Justin Mitzel
dmesg log (maximum verbosity) RX580 (125.32 KB, text/plain)
2018-03-05 07:13 UTC, Ruben Harutyunyan
attachment-9516-0.html (1.30 KB, text/html)
2018-11-21 11:34 UTC, Tim Writer
possible fix (1.18 KB, patch)
2018-11-30 01:25 UTC, Alex Deucher

Description Justin Mitzel 2017-09-10 17:06:45 UTC
GPU/Stack: R9 390x/amdgpu-experimental/xf86-video-amdgpu/radeonsi
CPU: R7 1700x
Distro/Kernel: Manjaro/4.12.11
Display: ASUS VG248QE 144hz.
Desktop Environment: KDE

In games, and occasionally on the desktop, the screen flickers.

Example: https://vid.me/n0t1r (sorry for audio/video, OBS is unable to capture the flickering)

This happens at every refresh rate I can set (60-144 Hz), on both HDMI and DisplayPort. I have tried all sorts of vsync options and the Compton compositor, but none of them helped.

Let me know what other information you want.
Comment 1 Justin Mitzel 2017-09-11 00:02:50 UTC
Update: This does not appear to happen under KDE wayland
Comment 2 Justin Mitzel 2017-09-11 00:20:12 UTC
It also does not happen on XFCE or LXQt. I tried openbox/KDE and the problem happened there.
Comment 3 Michel Dänzer 2017-09-11 00:25:25 UTC
Please attach the corresponding dmesg output and Xorg log file.
Comment 4 Justin Mitzel 2017-09-12 21:52:38 UTC
Created attachment 134180 [details]
The Dmesg output
Comment 5 Justin Mitzel 2017-09-12 21:53:51 UTC
Created attachment 134181 [details]
Current Xorg log
Comment 6 Justin Mitzel 2017-09-12 21:57:02 UTC
Created attachment 134182 [details]
Output of journalctl -k

Thought I'd add this for good measure.
Comment 7 Michel Dänzer 2017-09-13 02:52:45 UTC
Looks like a display related issue in the kernel driver. Any chance you can try an amd-staging kernel with DC enabled, to see if it happens with that as well?
Comment 8 Justin Mitzel 2017-09-13 13:29:36 UTC
Sure, could you link me directly to the one you wanted me to try?
Comment 9 Justin Mitzel 2017-09-14 18:37:53 UTC
I went looking for them and noticed that these are all made for ubuntu. Will they still work with Manjaro?
Comment 10 Justin Mitzel 2017-09-15 15:38:20 UTC
I managed to get Kernel 4.12 of amd-staging working and the problem still persists.
Comment 11 Justin Mitzel 2017-09-17 00:15:44 UTC
The problem also happens in Kernel 4.13.2. Although it does not seem as bad.
Comment 12 Justin Mitzel 2017-09-17 00:22:36 UTC
Created attachment 134285 [details]
New Xorg Log
Comment 13 Harry Wentland 2017-10-02 14:14:07 UTC
I'm curious if you only see this with Stardew Valley? I've seen reports of people having the same issue with Stardew Valley independent of their graphics adapter. Seems to be a combination of game and window manager.

https://steamcommunity.com/app/413150/discussions/1/405693392918042081/
https://steamcommunity.com/app/413150/discussions/1/333656722968187453/
Comment 14 Justin Mitzel 2017-10-03 20:44:38 UTC
No, I see this in other games too. Antichamber and Torchlight II come to mind.
Comment 15 Justin Mitzel 2017-10-03 20:53:26 UTC
Oh, I should also mention that it is not only in games that I see this flickering. It happens in firefox, and when switching activities in KDE sometimes.
Comment 16 Justin Mitzel 2017-10-21 01:54:11 UTC
I tested under other Desktop Environments, and it does appear that the problem is still there, just not as noticeable as under plasma. I do want to know though, is there any work I can do that would help someone to debug?
Comment 17 Justin Mitzel 2017-11-29 23:03:54 UTC
The problem seems to have gotten better within the last month, but is still not solved.
Comment 18 Justin Mitzel 2017-12-14 04:06:12 UTC
I have tried using a different monitor and also using a DVI cable. Neither of these mitigated the problem in any way. I also tried setting the resolution of the monitor to 720p, which interestingly did help the problem quite a bit. Still did not solve it, unfortunately, but I hope when someone sees this it will give them a better idea of what they're dealing with.
Comment 19 Justin Mitzel 2017-12-16 18:37:31 UTC
I figured it out: since the power profile in DRI was set to auto, it keeps switching between the maximum and minimum clock speeds.
Comment 20 Harry Wentland 2017-12-20 20:22:22 UTC
Looks like some guys more familiar with the power profiles should take a peek at this. Not really familiar with that myself.
Comment 21 Alex Deucher 2017-12-20 20:29:06 UTC
Can you post a video of the flickering? The one linked above no longer works. Is it like tearing, or does the display go blank or show an unstable image?
Comment 22 Justin Mitzel 2018-01-30 19:26:32 UTC
Hi, sorry I took so long. I usually check this around once a month. I reuploaded my gameplay on youtube. https://www.youtube.com/watch?v=-uPHG8mz4Xc&feature=youtu.be  

This happens in every game, and on the desktop if I don't set my power profile manually to high. Auto and low exhibit buggy behavior, with low being far worse than auto.
Comment 23 Ruben Harutyunyan 2018-03-05 07:10:50 UTC
Hello!

I am having a similar (same?) issue on my RX580 (Asus STRIX TOC).
Seems to be an issue with MCLK switching.


Here is a video of it happening on the desktop:
https://www.youtube.com/edit?o=U&video_id=z28fFqNdjAY
(there is also screen flickering that is not visible on camera, but it doesn't happen as often as the horizontal lines)

OBS is unable to capture the glitches though:
https://www.youtube.com/edit?o=U&video_id=iMEnprhBKFQ

Notes: 
1) Most of the time glitches happen when something new gets rendered. 
2) Google Chrome/Chromium always glitch (to a lesser extent when only the start page is open and nothing changes on the screen, opening Facebook guarantees glitches).
3) Playing video in VLC doesn't cause any glitches (x264 encoded MKV).
4) It's really easy to reproduce by setting the power profile to low (which fixes the issue) and then switching to high while looking at the screen. The glitch will occur for a split second. Switching from high to low also causes the issue.

Workarounds so far:
1) Recompiling the kernel with "smu7_vblank_too_short" forced to return true (i.e. disabling MCLK switching) fixes the problem but locks the MCLK at 2 GHz and causes coil whine and higher temps.
2) Setting the power profile to anything but "auto".
3) Disabling DC.

It's also worth noting that in my case "low" power profile works fine, but R9 390x users seem to need "high" power profile to fix it (from the "smu7_vblank_too_short" thread: https://bugs.freedesktop.org/show_bug.cgi?id=96868#c32).

I can test any patches/programs/cases if you need it.
Comment 24 Ruben Harutyunyan 2018-03-05 07:13:49 UTC
Created attachment 137789 [details]
dmesg log (maximum verbosity) RX580
Comment 25 Ruben Harutyunyan 2018-03-13 16:29:28 UTC
Seems like this is actually a regression. I've managed to bisect it to the commit that caused the problems.

Last working commit: https://cgit.freedesktop.org/~agd5f/linux/commit/?h=drm-next-4.15-dc&id=8ee5702afdd48b5864c46418ad310d6a23c8e9ab

Breaking commit: https://cgit.freedesktop.org/~agd5f/linux/commit/?h=drm-next-4.15-dc&id=b9e56e41e0c55c2b2ab5919c5e167faa4200b083

Keep in mind that you need https://cgit.freedesktop.org/~agd5f/linux/commit/?h=drm-next-4.15-dc&id=9ba29fcb76a559078491adffc74f66bf92b9dbea commit to be able to compile the kernel, but judging by the changes the merge commit was the one that broke it.

Some more details:

1) Disabling/enabling FreeSync doesn't matter
2) Issue happens under Xorg KDE/Wayland Gnome/Xorg GNOME
3) HDMI or DP doesn't matter


I'd try to narrow it down to the exact part of the commit that causes the issue, but I am having an issue with my Ryzen soft locking thread by thread until a complete lockup happens when compiling with 16 threads.
Comment 26 Harry Wentland 2018-03-13 18:05:58 UTC
That regression commit is the one that introduced the new DC display driver, so going back to the last working commit will effectively be the same as disabling DC for you.
Comment 27 Ruben Harutyunyan 2018-03-13 19:01:20 UTC
Hmm.

I guess I got deceived by `CONFIG_DRM_AMD_DC=y` being available in that commit and logs like `dc_link_detect` and `dc_link_handle_hpd_rx_irq` that didn't show up before.

Sorry about that!
Comment 28 Justin Mitzel 2018-04-22 15:45:37 UTC
I'm not sure what the status of this bug is, but it's only gotten worse with kernel 4.16 and the amd-staging-drm-next branch.
Comment 29 Hadrien Lacour 2018-06-25 18:09:25 UTC
For what it's worth, the problem seems gone since I switched from 4.16.16 to 4.17.2 (with CONFIG_DRM_AMD_DC=y).
Comment 30 L.Y. Sim 2018-09-06 12:06:10 UTC
I have this issue on a 3840x1600 Acer XR382CQK with an RX560 with Kernel 4.18.5-1 on Manjaro. 

When I set the refresh rate to 75Hz, severe artifacts and flickering appear.

Both 

    echo high > /sys/class/drm/card0/device/power_dpm_force_performance_level
and

    echo low > /sys/class/drm/card0/device/power_dpm_force_performance_level

stop the flickering and artifacting, and I can see via 

    cat /sys/class/drm/card0/device/pp_dpm_mclk 

that the memory clocks are set to 1750Mhz and 300Mhz respectively. 

However, if /sys/class/drm/card0/device/power_dpm_force_performance_level is set to auto, I can see (via watching /sys/class/drm/card0/device/pp_dpm_mclk with time intervals around 0.1s), that the memory clock oscillates rapidly between 300Mhz, 625Mhz and 1750Mhz.

So it seems to me that the rapid change in memory frequency is what's causing the flickering.
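
The oscillation described above can be counted rather than eyeballed. The following is only a minimal sketch: it assumes pp_dpm_mclk marks the active state with a trailing `*` (the usual amdgpu output format), the function names are hypothetical, and a fractional `sleep` interval needs GNU coreutils or an equivalent.

```shell
#!/bin/sh
# Print the currently active memory clock (the line marked with '*').
# $1: path to a pp_dpm_mclk-style file.
active_mclk() {
    # Example line: "2: 1750Mhz *"  ->  prints "1750Mhz"
    awk '/\*/ { print $2; exit }' "$1"
}

# Sample the file $3 times, $2 seconds apart, and print how many times
# the active state differed from the previous sample.
count_mclk_switches() {
    file=$1
    interval=$2
    samples=$3
    prev=$(active_mclk "$file")
    switches=0
    i=1
    while [ "$i" -lt "$samples" ]; do
        sleep "$interval"
        cur=$(active_mclk "$file")
        if [ "$cur" != "$prev" ]; then
            switches=$((switches + 1))
        fi
        prev=$cur
        i=$((i + 1))
    done
    echo "$switches"
}

# Usage (card0 path taken from the comment above; adjust for your system):
# count_mclk_switches /sys/class/drm/card0/device/pp_dpm_mclk 0.1 50
```

A count near zero with the level forced to high or low, and a large count with auto, would match the behaviour reported here.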
Comment 31 Timothy Pearson 2018-10-27 07:40:45 UTC
(In reply to L.Y. Sim from comment #30)
> I have this issue on a 3840x1600 Acer XR382CQK with an RX560 with Kernel
> 4.18.5-1 on Manjaro. 
> 
> When I set the refresh rate to 75Hz, severe artifacts and flickering appear.
> 
> Both 
> 
>     echo high > /sys/class/drm/card0/device/power_dpm_force_performance_level
> and
> 
>     echo low > /sys/class/drm/card0/device/power_dpm_force_performance_level
> 
> stop the flickering and artifacting, and I can see via 
> 
>     cat /sys/class/drm/card0/device/pp_dpm_mclk 
> 
> that the memory clocks are set to 1750Mhz and 300Mhz respectively. 
> 
> However, if /sys/class/drm/card0/device/power_dpm_force_performance_level is
> set to auto, I can see (via watching /sys/class/drm/card0/device/pp_dpm_mclk
> with time intervals around 0.1s), that the memory clock oscillates rapidly
> between 300Mhz, 625Mhz and 1750Mhz.
> 
> So it seems to me that the rapid change in memory frequency is what's
> causing the flickering.

Confirmed here on a Polaris 10 GPU (WX7100) on DisplayPort with the latest kernel master from GIT (4.19+).  The workaround sequence above stops the irritating flickering.  dc=0 alone does /not/ stop the flickering, and dc=1 yields no displays detected due to some other bug.
Comment 32 Peter 2018-10-28 18:45:27 UTC
(In reply to L.Y. Sim from comment #30)
> I have this issue on a 3840x1600 Acer XR382CQK with an RX560 with Kernel
> 4.18.5-1 on Manjaro. 
> 
> When I set the refresh rate to 75Hz, severe artifacts and flickering appear.
> 
> Both 
> 
>     echo high > /sys/class/drm/card0/device/power_dpm_force_performance_level
> and
> 
>     echo low > /sys/class/drm/card0/device/power_dpm_force_performance_level
> 
> stop the flickering and artifacting, and I can see via 
> 
>     cat /sys/class/drm/card0/device/pp_dpm_mclk 
> 
> that the memory clocks are set to 1750Mhz and 300Mhz respectively. 
> 
> However, if /sys/class/drm/card0/device/power_dpm_force_performance_level is
> set to auto, I can see (via watching /sys/class/drm/card0/device/pp_dpm_mclk
> with time intervals around 0.1s), that the memory clock oscillates rapidly
> between 300Mhz, 625Mhz and 1750Mhz.
> 
> So it seems to me that the rapid change in memory frequency is what's
> causing the flickering.

I think the "rapid change in memory frequency" really is the problem.
When I run
  echo manual > /sys/class/drm/card0/device/power_dpm_force_performance_level
and then
  echo "0" > /sys/class/drm/card0/device/pp_dpm_mclk
or
  echo "1" > /sys/class/drm/card0/device/pp_dpm_mclk
or
  echo "2" > /sys/class/drm/card0/device/pp_dpm_mclk
there is no more flickering.
(This limits the memory clock to 300 MHz, 1000 MHz or 2000 MHz on my RX580 card. Using Arch Linux kernel 4.18.16, by the way.)

Whereas
  echo "0 1 2" > /sys/class/drm/card0/device/pp_dpm_mclk
or any combination of 2 memory clock frequencies brings the flickering back.
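
For convenience, the two-step workaround above could be wrapped in a small helper. This is a sketch only: the function name `lock_mclk` is hypothetical, the device directory is passed in as an argument rather than hard-coded, and writing to the real sysfs files needs root.

```shell
#!/bin/sh
# Pin the memory clock to a single DPM state.
# $1: device directory (e.g. /sys/class/drm/card0/device)
# $2: state index as listed in pp_dpm_mclk (0, 1, 2, ...)
lock_mclk() {
    dev=$1
    state=$2
    # Switch to manual mode first, as in the comment above.
    echo manual > "$dev/power_dpm_force_performance_level" || return 1
    # Allow exactly one state, so the driver has nothing to switch between.
    echo "$state" > "$dev/pp_dpm_mclk"
}

# Usage (as root):
# lock_mclk /sys/class/drm/card0/device 0
```

Passing a single index is the point: per the report above, any set of two or more allowed states brings the flickering back.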
Comment 33 Alex Deucher 2018-10-29 15:41:52 UTC
The mclk switching needs to happen during the vblank period to avoid the flickering.  If there is not enough time in the vblank period, you may see flickering outside of the blanking period.  Can you figure out what modes and refresh rates exhibit this issue?
Comment 34 bmilreu 2018-10-29 17:57:48 UTC
Alex this bug looks like the same I reported here https://bugs.freedesktop.org/show_bug.cgi?id=108322 

The flickering issues I have with 75hz are gone after forcing profile to high.
Comment 35 Alex Deucher 2018-10-29 18:38:25 UTC
(In reply to bmilreu from comment #34)
> The flickering issues I have with 75hz are gone after forcing profile to
> high.

That disables mclk switching by forcing the clocks to high.  What modes and refresh rates exhibit the problem?
Comment 36 Alex Deucher 2018-10-31 19:08:25 UTC
*** Bug 105300 has been marked as a duplicate of this bug. ***
Comment 37 bmilreu 2018-10-31 22:57:45 UTC
(In reply to Alex Deucher from comment #35)
> (In reply to bmilreu from comment #34)
> > The flickering issues I have with 75hz are gone after forcing profile to
> > high.
> 
> That disables mclk switching by forcing the clocks to high.  What modes and
> refresh rates exhibit the problem?

Any resolution at 75hz. My FullHD modelines are:

[   802.898] (II) AMDGPU(0): Modeline "1920x1080"x0.0  148.50  1920 2008 2052 2200  1080 1084 1089 1125 +hsync +vsync (67.5 kHz eP)
[   802.898] (II) AMDGPU(0): Modeline "1920x1080"x0.0  170.00  1920 1928 1960 2026  1080 1105 1113 1119 +hsync +vsync (83.9 kHz e)

First one is 1080p@60hz (good), second is 1080p@75hz (flickers)
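
Since mclk switching is supposed to fit inside the vblank period (comment 33), it may help to compare how long vertical blanking lasts in the two modelines. The sketch below only does the arithmetic from the modeline numbers (pixel clock in MHz, htotal, vdisplay, vtotal); the function name is hypothetical, and the thread does not say how much time the driver actually needs, so this is not a pass/fail check.

```shell
#!/bin/sh
# Vertical blanking time in microseconds for an X modeline.
# Arguments: pixel clock in MHz, htotal, vdisplay, vtotal.
vblank_us() {
    awk -v clk="$1" -v htotal="$2" -v vdisp="$3" -v vtotal="$4" 'BEGIN {
        line_us = htotal / clk                        # one scanline, in microseconds
        printf "%.1f\n", (vtotal - vdisp) * line_us   # blanking lines * line time
    }'
}

vblank_us 148.50 2200 1080 1125   # 1080p@60Hz modeline above
vblank_us 170.00 2026 1080 1119   # 1080p@75Hz modeline above
```

With these inputs the 60 Hz mode has about 666.7 us of vertical blanking (45 lines) and the 75 Hz mode about 464.8 us (39 lines), so the flickering mode leaves roughly 30% less time for a switch.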
Comment 38 bmilreu 2018-10-31 23:06:30 UTC
(In reply to Alex Deucher from comment #36)
> *** Bug 105300 has been marked as a duplicate of this bug. ***

https://bugs.freedesktop.org/show_bug.cgi?id=108322 - Also related. 

When I tested the 4.18 kernel, the bug used to trigger only above 73hz after sleep/wakeup. In the latest drm-next-4.21-wip it triggers as soon as I switch to 75hz. If I switch back from 75hz to 60hz, it keeps flickering until I manually turn my monitor off and on.
Comment 39 bmilreu 2018-11-01 16:52:13 UTC
Another interesting detail: even with amdgpu.dc=0 I get flickering @75hz. The difference is that the flickering immediately stops when I switch back to 60hz (no need to reboot or switch the monitor off/on).
Comment 40 tempel.julian 2018-11-17 14:00:26 UTC
Still having this issue with 2560 x 1440 @ 75Hz and latest 4.21-wip kernel.
Manually forcing a single VRAM clock state eliminates the flicker artifacts. As soon as there is dynamic VRAM clocking happening, it flickers.
Comment 41 afm 2018-11-18 17:36:54 UTC
RX 570
1002:67df [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] (rev ef)
Kernel: 4.19.2-300.fc29.x86_64
1920x1080     59.93*+

Lots of strange flickering and black lines with dc=1. Strange ghosting and screen corruption with dc=0

echo manual > /sys/class/drm/card0/device/power_dpm_force_performance_level
echo "0" >  /sys/class/drm/card0/device/pp_dpm_mclk

Fixes the issue.

echo "2" >  /sys/class/drm/card0/device/pp_dpm_mclk
echo "0" >  /sys/class/drm/card0/device/pp_dpm_mclk
echo "2" >  /sys/class/drm/card0/device/pp_dpm_mclk

Produces screen corruption for half a second on each change. Across all 3 monitors.
Comment 42 afm 2018-11-18 18:12:12 UTC
Just to add: I noticed the corruption during the plymouth boot screen too.
After just waking the monitors from DPMS I got corruption until I echo'd 0 again.
Comment 43 afm 2018-11-18 18:38:08 UTC
OK, really sorry for the noise on this ticket. It seems I did this to myself with featuremask 0xfffffff. All is working without it. (I had been using WattmanGTK.)
Comment 44 tempel.julian 2018-11-18 18:47:44 UTC
I have this issue also without amdgpu.ppfeaturemask=0xffffffff.
amdgpu.dc 0 or 1 doesn't make a difference for me.
Comment 45 bmilreu 2018-11-18 22:54:31 UTC
A temporary way to at least lock mclk forever would be great until this is fixed. There is no sane way to do it permanently (some events switch it back to auto, i.e. resolution changes and sleep/wake).
Comment 46 tempel.julian 2018-11-18 23:16:50 UTC
As a workaround, a timer job/script writing every few seconds might do the trick.
Comment 47 tempel.julian 2018-11-21 11:34:21 UTC
I had the chance to test three different GPUs:
RX 560: flickers
RX 580: flickers
RX Vega 56: doesn't flicker (and saves way more power than Polaris at the same time)
Comment 48 Tim Writer 2018-11-21 11:34:31 UTC
Created attachment 142539 [details]
attachment-9516-0.html

I'm out-of-office for jury duty Nov. 19 - Dec. 11, returning on Dec. 12. I will be checking e-mail daily and will endeavour to route important e-mail to my team.

Regards,
Tim
Comment 49 bmilreu 2018-11-23 19:35:58 UTC
(In reply to tempel.julian from comment #46)
> As a workaround, a timer job/script writing every few seconds might do the
> trick.

I tried a systemd timer, but it floods my dmesg every time it triggers and also caused weird lag in systemd monitoring; perhaps writing to sysfs constantly has a noticeable cost. Maybe someone could hack a kernel option to force it to high as a temporary workaround? Unfortunately I'm ignorant in C.
Comment 50 tempel.julian 2018-11-29 11:01:49 UTC
Or just watch pp_dpm_mclk e.g. every second instead of constantly writing into it, and write only in case of a change?
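
A check-then-write loop in the spirit of this suggestion might look like the sketch below. Note the assumptions: it watches power_dpm_force_performance_level rather than pp_dpm_mclk, on the assumption that the reset to auto mentioned in comment 45 is visible there; the function name `ensure_level` is hypothetical; and the real sysfs write needs root.

```shell
#!/bin/sh
# Re-apply a forced performance level only when it has been reset.
# $1: path to power_dpm_force_performance_level
# $2: wanted level (e.g. high)
# Prints "rewritten" when a write happened, "unchanged" otherwise.
ensure_level() {
    file=$1
    want=$2
    if [ "$(cat "$file")" = "$want" ]; then
        echo unchanged
    else
        echo "$want" > "$file" && echo rewritten
    fi
}

# Polling loop (as root; 1 s interval as suggested above):
# while :; do
#     ensure_level /sys/class/drm/card0/device/power_dpm_force_performance_level high >/dev/null
#     sleep 1
# done
```

Because the file is only read in the common case, this should avoid the constant sysfs writes that the systemd timer attempt in comment 49 suffered from.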
Comment 51 Alex Deucher 2018-11-29 20:52:27 UTC
Does this patch help?
https://patchwork.freedesktop.org/patch/264781/
or this patch for older kernels:
https://bugs.freedesktop.org/attachment.cgi?id=142660
Comment 52 tempel.julian 2018-11-29 22:40:58 UTC
It still flickers with 4.21-wip build including this patch.
Comment 53 Alex Deucher 2018-11-30 01:25:40 UTC
Created attachment 142662 [details] [review]
possible fix

How about this patch?
Comment 54 Alex Deucher 2018-11-30 01:26:34 UTC
*** Bug 96868 has been marked as a duplicate of this bug. ***
Comment 55 bmilreu 2018-11-30 04:10:24 UTC
(In reply to Alex Deucher from comment #53)
> Created attachment 142662 [details] [review]
> possible fix
> 
> How about this patch?

For me it still flickers after sleep/wake @75hz unless I lock mclk.
Comment 56 tempel.julian 2018-11-30 09:24:55 UTC
(In reply to bmilreu from comment #55)
> (In reply to Alex Deucher from comment #53)
> > Created attachment 142662 [details] [review]
> > possible fix
> > 
> > How about this patch?
> 
> For me it still flickers after sleep/wake @75hz unless I lock mclk.

I can confirm this. For me, it instantly starts flickering after starting x.
Comment 57 tempel.julian 2018-12-02 09:17:45 UTC
Instead of my usual DL-DVI display, I connected a Samsung C27H711 via DisplayPort. It officially supports 2560x1440 at 75Hz, and the very same flickering issue occurs.

I suspect the Polaris Windows driver has the same or a similar issue, as there is sometimes black flashing when there is heavy VRAM clock jumping going on, e.g. in games during loading screens or when watching videos with low GPU load in mpv.

