Summary: | AMD Navi10 GPU powerplay issues when using two DisplayPort connectors | ||
---|---|---|---|
Product: | DRI | Reporter: | Timur Kristóf <venemo> |
Component: | DRM/AMDgpu | Assignee: | Default DRI bug account <dri-devel> |
Status: | RESOLVED MOVED | QA Contact: | |
Severity: | normal | ||
Priority: | not set | CC: | stefan |
Version: | DRI git | ||
Hardware: | All | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: |
Description
Timur Kristóf
2019-10-07 08:17:38 UTC
After looking at the problem a bit further, it seems that the problem occurs when using any two DisplayPort connectors, but does not happen when using just one DisplayPort and the HDMI connector.

Forgot to mention, this happened with kernel 5.4-rc1 and mesa 19.2.

I can confirm this. My card is a PowerColor Radeon RX 5700 XT Red Dragon. As soon as I connect a second monitor, I get the same errors in dmesg as Timur Kristóf described. Unfortunately, the workaround with the HDMI connection does not seem to work in my case. It does not matter whether the monitors are connected via DP or HDMI. One important fact: the problem started with kernel 5.4-rc1 and persists in 5.4-rc2, but 5.3 works fine (except for the problem with the high idle power consumption, but that is a different story :))!

Just to clarify: this is not just a "cosmetic" issue. The computer is barely usable. Applications take extremely long to start and/or run slowly. Also, the files in sysfs (/sys/class/drm/card0/device/pp_*) don't return anything anymore, and lm_sensors reports N/A for all sensors except the fan speed.

Are both monitors 60hz? I've seen this occur with 2x60hz setups, but not with other combinations of refresh rates. It seems to be similar to issues with 75hz in a single monitor configuration. Other combinations of dual monitor refresh rates don't exhibit the issue for me (although there are other problems, as discussed in https://bugs.freedesktop.org/show_bug.cgi?id=111482).

Yes, both monitors run at 60 Hz.

(In reply to Andrew Sheldon from comment #5)
> Are both monitors 60hz? I've seen this occur with 2x60hz setups, but not
> with other combinations of refresh rates. It seems to be similar to issues
> with 75hz in a single monitor configuration.
In my case, both are Dell U2718Q monitors, the resolution is 4K (3840x2160), and the refresh rate is 60Hz on both monitors.
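The symptom reported above, where the files under /sys/class/drm/card0/device/pp_* stop returning anything, can be checked with a small shell sketch. The function name `check_pp_files` and its directory argument are hypothetical (they do not come from this thread or from any amdgpu tool), and which pp_* files exist depends on the card and kernel, so treat this as a rough diagnostic only.

```shell
#!/bin/sh
# Hypothetical helper: list which pp_* files in a device directory
# return no data when read. The default path is the one quoted in this
# thread; the card index may differ on other systems.
check_pp_files() {
    dir="${1:-/sys/class/drm/card0/device}"
    empty=0
    for f in "$dir"/pp_*; do
        # Skip the unexpanded glob when no pp_* files exist at all.
        [ -e "$f" ] || continue
        # Count the bytes actually returned; a failed read counts as 0.
        size=$(( $(cat "$f" 2>/dev/null | wc -c) ))
        if [ "$size" -eq 0 ]; then
            echo "EMPTY: $f"
            empty=$((empty + 1))
        else
            echo "OK:    $f"
        fi
    done
    echo "empty files: $empty"
}
```

Running `check_pp_files` with no argument inspects card0. Some pp_* files may legitimately be empty or unreadable without root, so an EMPTY line is a hint and not proof of this bug.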
In my case the resolution of both monitors is 2560x1440.

(In reply to Stefan Rehm from comment #8)
> In my case the resolution of both monitors is 2560x1440
You could try overclocking (or underclocking) one or both monitors to see if the bug still exists, using: https://github.com/kevinlekiller/cvt_modeline_calculator_12
I recommend using the "-b" option, which uses reduced blanking V2 mode, but you could experiment with different options.
Then to use it:
    xrandr --output <monitor output> --newmode <modeline name> <modeline details from cvt>
    xrandr --output <monitor output> --addmode <monitor output> <modeline name>
    xrandr --output <monitor output> --mode <modeline name>
The modeline name can be whatever you like. You'll probably have to launch X with one of the monitors disconnected (as the bug may trigger before you can apply the modeline change). I believe the amdgpu DDX has support for specifying modelines, but I don't know the syntax off the top of my head.

Correction: the exact frequency reported by xrandr is 59.95.

I took Andrew Sheldon's advice and experimented a bit with refresh rates and resolutions. It turns out that the problem does not occur at lower resolutions, even when both displays operate at 60 Hz.

git bisect shows that commit fb6959ae50176758a073687dbb081d26521f4576 ("Embed DCN2 SOC bounding box") is the first to trigger the bug.

If I change dcn2_0_soc.dram_clock_change_latency_us in "drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c" from 404.0 to 10.0 (the value used in kernel 5.3), the messages disappear and the system behaves normally again. However, as long as /sys/class/drm/card0/device/power_dpm_force_performance_level is set to "auto", I am now seeing massive flickering. Forcing it to "low" or "high" fixes that. According to the sources for kernel 5.3, the value of 10.0 for dram_clock_change_latency is a hack. Can anyone elaborate on this?
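For anyone repeating the refresh-rate experiment, the three xrandr steps suggested earlier in this thread can be wrapped in a small dry-run sketch. The function name `print_modeline_cmds` is hypothetical, and the output name, mode name, and timings are whatever you pass in; the helper only prints the commands in the same form as quoted in the thread, so they can be reviewed before anything is applied.

```shell
#!/bin/sh
# Hypothetical helper: print the three xrandr commands from this thread
# for a given output, mode name, and modeline timings. Nothing is
# executed; review the output, then pipe it to sh if it looks right.
print_modeline_cmds() {
    output="$1"    # output name as listed by `xrandr --query` (varies per driver)
    name="$2"      # any label you like for the new mode
    shift 2
    timings="$*"   # everything after the mode name in the calculator's Modeline
    echo "xrandr --output $output --newmode $name $timings"
    echo "xrandr --output $output --addmode $output $name"
    echo "xrandr --output $output --mode $name"
}
```

Usage would look like `print_modeline_cmds <monitor output> <modeline name> <modeline details from cvt>`, and appending `| sh` runs the printed commands. As noted above, X may need to be started with one monitor disconnected so the mode can be applied before the bug triggers.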
-- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/929.