Bug 97940

Summary: Short desktop freezes with Radeon 7850
Product: xorg
Component: Driver/Radeon
Status: RESOLVED FIXED
Severity: normal
Priority: medium
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Reporter: Daniel Lichtenberger <daniel.lichtenberger>
Assignee: xf86-video-ati maintainers <xorg-driver-ati>
QA Contact: Xorg Project Team <xorg-team>
CC: lyude
Whiteboard:

Attachments (description, flags):
test program to list all connectors (flags: none)
Boot log with drm.debug=0x6 (flags: none)
Boot log for drm-next (flags: none)

Description Daniel Lichtenberger 2016-09-26 19:45:45 UTC
Created attachment 126797
test program to list all connectors

Kernel: openSUSE Tumbleweed 4.7.4-2-default
HW: Radeon 7850 2GB, dual DVI monitors

I have an annoying issue on my Radeon 7850. At some point in
the kernel 4.x timeframe, the entire desktop started to freeze for
short periods (< 1 second), almost exclusively during startup when
the desktop applets/widgets were loading. It's very
noticeable/annoying because the mouse cursor freezes as well.

Aside from the startup freezes, which happen every time, the issue
is hard to reproduce (I also get short freezes when opening a PDF
in a new instance of Okular, the KDE PDF viewer).

With the help of perf I stumbled over suspicious traces of DRM
getconnector calls. I hacked together a small program that reproduces
the issue on my system: when run in a loop, I can barely move the
mouse cursor and I get perf traces like the following:


-   45,15%     0,00%  drm-getconnecto  [kernel.kallsyms]  [k] drm_helper_probe_single_connector_modes
   - drm_helper_probe_single_connector_modes
      - 41,70% radeon_dvi_detect
         - 40,19% radeon_connector_get_edid
            - drm_get_edid
               - 39,30% drm_do_get_edid
                    drm_do_probe_ddc_edid
                    i2c_transfer
                    __i2c_transfer
                  - bit_xfer
                     + 16,45% sclhi
                       15,74% delay_tsc
                     + 4,39% acknak
                       1,14% set_clock
                       0,76% get_data
                     + 0,54% try_address
               + 0,89% drm_do_probe_ddc_edid
         + 1,51% radeon_ddc_probe
      + 3,37% radeon_dp_detect

The program enumerates the display connectors with the DRM_IOCTL_MODE_GETRESOURCES and DRM_IOCTL_MODE_GETCONNECTOR ioctls. It's the first call to DRM_IOCTL_MODE_GETCONNECTOR that seems to cause the issues on my system.
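
For readers without access to the attachment, here is a minimal sketch of what such an enumeration could look like. It is not the attached program. The struct layouts and ioctl numbers are transcribed by hand from the kernel's drm_mode.h uapi header as an assumption (Linux, x86-64), and the helper name list_connectors and the default path /dev/dri/card0 are hypothetical. As far as I understand the kernel code, a GETCONNECTOR call with count_modes set to 0 asks for a fresh probe, which would match the drm_helper_probe_single_connector_modes entry in the trace.

```python
import ctypes
import fcntl
import os


def _iowr(nr, size):
    # Linux _IOWR('d', nr, size) encoding: direction bits, size, type, number.
    return (3 << 30) | (size << 16) | (ord("d") << 8) | nr


class DrmModeCardRes(ctypes.Structure):
    # Mirrors struct drm_mode_card_res (hand-transcribed, assumption).
    _fields_ = [
        ("fb_id_ptr", ctypes.c_uint64),
        ("crtc_id_ptr", ctypes.c_uint64),
        ("connector_id_ptr", ctypes.c_uint64),
        ("encoder_id_ptr", ctypes.c_uint64),
        ("count_fbs", ctypes.c_uint32),
        ("count_crtcs", ctypes.c_uint32),
        ("count_connectors", ctypes.c_uint32),
        ("count_encoders", ctypes.c_uint32),
        ("min_width", ctypes.c_uint32),
        ("max_width", ctypes.c_uint32),
        ("min_height", ctypes.c_uint32),
        ("max_height", ctypes.c_uint32),
    ]


class DrmModeGetConnector(ctypes.Structure):
    # Mirrors struct drm_mode_get_connector (hand-transcribed, assumption).
    _fields_ = [
        ("encoders_ptr", ctypes.c_uint64),
        ("modes_ptr", ctypes.c_uint64),
        ("props_ptr", ctypes.c_uint64),
        ("prop_values_ptr", ctypes.c_uint64),
        ("count_modes", ctypes.c_uint32),
        ("count_props", ctypes.c_uint32),
        ("count_encoders", ctypes.c_uint32),
        ("encoder_id", ctypes.c_uint32),
        ("connector_id", ctypes.c_uint32),
        ("connector_type", ctypes.c_uint32),
        ("connector_type_id", ctypes.c_uint32),
        ("connection", ctypes.c_uint32),
        ("mm_width", ctypes.c_uint32),
        ("mm_height", ctypes.c_uint32),
        ("subpixel", ctypes.c_uint32),
        ("pad", ctypes.c_uint32),
    ]


DRM_IOCTL_MODE_GETRESOURCES = _iowr(0xA0, ctypes.sizeof(DrmModeCardRes))
DRM_IOCTL_MODE_GETCONNECTOR = _iowr(0xA7, ctypes.sizeof(DrmModeGetConnector))


def list_connectors(path="/dev/dri/card0"):
    """Return (id, type, connection, count_modes) for each connector."""
    fd = os.open(path, os.O_RDWR)
    try:
        # First call: all counts are 0, so the kernel only reports sizes.
        res = DrmModeCardRes()
        fcntl.ioctl(fd, DRM_IOCTL_MODE_GETRESOURCES, res)

        # Second call: supply a buffer for the connector IDs only.
        ids = (ctypes.c_uint32 * res.count_connectors)()
        res = DrmModeCardRes()
        res.connector_id_ptr = ctypes.addressof(ids)
        res.count_connectors = len(ids)
        fcntl.ioctl(fd, DRM_IOCTL_MODE_GETRESOURCES, res)

        result = []
        for cid in ids:
            # count_modes is left at 0 here, which requests a probe.
            conn = DrmModeGetConnector(connector_id=cid)
            fcntl.ioctl(fd, DRM_IOCTL_MODE_GETCONNECTOR, conn)
            result.append((cid, conn.connector_type,
                           conn.connection, conn.count_modes))
        return result
    finally:
        os.close(fd)
```

Calling list_connectors() repeatedly in a loop would be one way to approximate the reproduction described above; it needs read/write access to the DRM device node.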

After I stop the program all is well again, the system is stable and I
experience no other issues. 

When I ran the program on another system with a Radeon 5750 and a
recent kernel (4.5 or 4.6), it did not cause the intermittent freezes and worked faster.

I'm running a dual DVI monitor setup and KDE 5. The issue also happens
under other window managers/DEs (IceWM, XFCE), so I don't think it is
related to KDE (but the pattern of connector lookups may have changed in a recent Qt or xorg version so that the issue became more noticeable).
Comment 1 Alex Deucher 2016-09-26 20:04:17 UTC
Can you bisect?
Comment 2 Daniel Lichtenberger 2016-10-02 12:38:28 UTC
Bisection points to 82922da39190199260a726d7081a8ea4873e5fd6 (drm/dp_helper: Retry aux transactions on all errors).

Reverting the commit on 4.7.5 almost fixes the issue for me. There's still a short freeze that is noticeable when moving the mouse around, but it's *much* better than before.
Comment 3 Lyude Paul 2016-10-03 14:40:04 UTC
Hi. The reason we do so many retries in the aux handler is that monitors are notoriously bad at handling aux transactions, and a lot of the time just retrying until we get what we want is the only way to make things work. That said, the radeon driver did have issues around 4.7.x with retrying aux transactions far more often than it needed to (100+ times in many cases). In 4.8+ this has been fixed so that we only retry 32 times, which makes a pretty significant difference in how long reprobes take.
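
To illustrate why the retry cap matters, here is a toy model of a bounded retry loop. It is not the kernel's code: transfer_with_retries and always_times_out are hypothetical names, and the only figures taken from this thread are the caps of 32 and 100. When a sink never answers, the number of attempts (and so the time spent waiting on timeouts) grows linearly with the cap.

```python
def transfer_with_retries(transfer, max_retries):
    """Call transfer() until it succeeds or max_retries attempts are used.

    Returns (succeeded, attempts_made).
    """
    for attempt in range(1, max_retries + 1):
        if transfer():
            return True, attempt
    return False, max_retries


def always_times_out():
    # Stand-in for an aux transaction on a port with no responsive sink.
    return False


# Worst case: every attempt fails, so the full cap is spent each time.
for cap in (32, 100):
    ok, attempts = transfer_with_retries(always_times_out, cap)
    print(f"cap={cap}: succeeded={ok}, attempts={attempts}")
```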

So, I'd say try a 4.8 kernel and see if the behavior improves. If it does it'll probably be a good idea to send those patches to stable so they get merged in 4.7.y. If that doesn't work, then the next thing to check would be why your card seems to be so generous about consistently reprobing the connectors. Reprobes generally shouldn't be happening often enough for you to even notice the time required for them. To help us out with this, boot your machine with

drm.debug=0x6

added to the kernel command line, and use your computer normally until it freezes up for a second. Once it does, upload the full kernel log (dmesg) here. That might give some insight into what your card is up to.
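
For reference, drm.debug is a bitmask of message categories. The small decoder below is a sketch: the bit names follow the DRM_UT_* categories as I recall them from kernels of that era, so treat the table as an assumption and check it against the kernel source for your version. The function name decode_drm_debug is made up for this example.

```python
# Assumed mapping of drm.debug bits to message categories (DRM_UT_*).
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
}


def decode_drm_debug(mask):
    """Return the category names enabled by a drm.debug bitmask."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if mask & bit]


print(decode_drm_debug(0x6))
```

Under that assumed table, 0x6 enables the driver and modesetting (KMS) messages without the noisier core messages.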
Comment 4 Daniel Lichtenberger 2016-10-03 16:37:47 UTC
Created attachment 126977
Boot log with drm.debug=0x6

Kernel 4.8 didn't really improve the situation, unfortunately. I attached the boot log with drm.debug=0x6. It contains many "dp_aux_ch timed out" messages that are probably linked to my problem.
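
A quick way to quantify those messages is to count them per second of kernel uptime and compare the bursts with the moments the desktop froze. This is a rough sketch: it assumes the usual dmesg timestamp prefix of the form "[   12.345678]", and the helper name count_aux_timeouts is hypothetical.

```python
import re
from collections import Counter

# Matches the leading "[ seconds.micros]" timestamp that dmesg prints.
TIMESTAMP = re.compile(r"^\[\s*(\d+)\.\d+\]")


def count_aux_timeouts(lines, needle="dp_aux_ch timed out"):
    """Count log lines containing `needle`, in total and per uptime second.

    Lines without a recognisable timestamp count toward the total only.
    """
    per_second = Counter()
    total = 0
    for line in lines:
        if needle not in line:
            continue
        total += 1
        match = TIMESTAMP.match(line)
        if match:
            per_second[int(match.group(1))] += 1
    return total, per_second


# Usage, assuming the log was saved with `dmesg > dmesg.txt`:
#     with open("dmesg.txt") as log:
#         total, per_second = count_aux_timeouts(log)
#     print(total, per_second.most_common(5))
```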
Comment 5 Lyude Paul 2016-10-03 18:03:16 UTC
Okay, so it looks like I jumped the gun a little bit. The patchset I was thinking of hasn't actually been merged into the kernel tree yet, so you'll have to use drm-next or build your own kernel with the patches to see if they fix your problem. The patch series in question is:

https://patchwork.freedesktop.org/series/4827/
Comment 6 Daniel Lichtenberger 2016-10-03 20:12:49 UTC
Created attachment 126980
Boot log for drm-next

With drm-next it's much better, but I wouldn't call it "fixed". The desktop still freezes for shorter periods (I'd guess 100-200ms) multiple times during startup. I attached another boot log; maybe you can get something out of it.
Comment 7 Daniel Lichtenberger 2017-03-01 16:25:43 UTC
With 4.10 the freezes are completely gone. As far as I'm concerned, this bug can be closed. Thanks for the general handling of this issue!
