Bug 97635 - radeon fails to initialize some DisplayPort monitors
Status: RESOLVED MOVED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Radeon
Version: XOrg git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Default DRI bug account
Reported: 2016-09-08 10:44 UTC by Reg
Modified: 2019-11-19 09:18 UTC


Attachments
Logs to compare all screens properly booted to some not (619.59 KB, application/x-xz)
2016-09-08 10:44 UTC, Reg

Description Reg 2016-09-08 10:44:42 UTC
Created attachment 126301
Logs to compare all screens properly booted to some not

After a false start or two, I have been directed here as the right place to report this issue. I believe I am in a unique position to help with DisplayPort issues (and want to do so) because I have been able to generate both working and non-working logs and because I have a significant number of DisplayPorts on my system, 6 in total. Also, I put a wealth of information together (automated for completeness and consistency) that should help the development team nail down the cause of this issue.

Here's everything I have been able to determine, but first the hardware setup: my graphics card is an "HD 5870 Eyefinity 6", which has 6 DisplayPorts. I have them set up in a grid of 3 across by 2 down. Each display runs at a resolution of 2560x1440, creating a total work area of 7680x2880 in a Xinerama setup running on the KDE4 desktop.

I currently have 3 kernels in my grub list which are:
  kernel-3.16.7
  kernel-4.7.0
  kernel-4.7.2

These all run under openSUSE Tumbleweed; however, kernel-3.16.7 came with openSUSE 13.2.

I have no evidence that my problem is related to having so many DisplayPort screens, but it does let me see more variations of the problem than most people do, which hopefully helps pinpoint the real cause.

Focusing on kernel-4.7.2: the kernel would only turn on the first two displays. That happens during boot, long before Xorg gets loaded.

In Xorg the behavior is a little strange when it inherits DisplayPorts that the kernel left off. Xorg will acknowledge all 6 displays, but it is not able to turn on any that were off while the kernel was handling them, e.g. the last 4 monitors in the case of the 4.x kernels.

The upshot is that when I go to the multidisplay setup part of KDE, all 6 displays are shown as active even though only the first two are actually turned on. If I disable and re-enable the displays that are off, they don't turn on. If I use xrandr to turn them on, no dice. That is, if they were off while the kernel was handling them they are off for good; nothing in Xorg or KDE that I have found can change it.

That said, adding radeon.audio=0 to the boot command line makes things better but doesn't fix the issue completely. With that setting I sometimes get all 6 displays to come up correctly; more often 5 out of 6 come up and one does not. Usually the last one (DisplayPort 5) is the one that fails, but not always.

I went to the trouble to write a script to gather information and I think I got enough to show where things are going wrong. At least enough to show a difference between a good and bad boot and I will help with more information as needed. I really want to get this problem solved and I'll do whatever I can to help. 

In the tarred file, to see what's different between a good and bad boot all you have to do is a diff on the files:
    ./logs/timing-stripped/filtered-drm/
        screens-0-4-good-5-bad_kernel-4.7.2-1-default_logo.nologo-radeon.audio=0-debug-debug_objects_dmsg.txt
        screens-0-5-good_kernel-4.7.2-1-default_logo.nologo-radeon.audio=0-debug-debug_objects_dmsg.txt
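The comparison can be sketched as a small wrapper around diff. The function name compare_boots is made up for illustration and is not part of the attached scripts; it assumes a POSIX shell and a diff that supports -u:

```shell
# compare_boots GOOD BAD: print a unified diff of two log files.
# diff exits with status 1 when the files differ, which is the expected
# case here, so only statuses above 1 (real errors such as a missing
# file) are reported as failure.
compare_boots() {
    cb_status=0
    diff -u "$1" "$2" || cb_status=$?
    [ "$cb_status" -le 1 ]
}

# Example, run from ./logs/timing-stripped/filtered-drm/:
#   compare_boots \
#     'screens-0-5-good_kernel-4.7.2-1-default_logo.nologo-radeon.audio=0-debug-debug_objects_dmsg.txt' \
#     'screens-0-4-good-5-bad_kernel-4.7.2-1-default_logo.nologo-radeon.audio=0-debug-debug_objects_dmsg.txt'
```

Lines prefixed with "-" appear only in the good boot and lines prefixed with "+" only in the bad one.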

Anybody who wanted to also gather comprehensive information for the developers could take the file ./gather-info-for-diagnostics.sh in the tarred file and modify as needed for their own system.

That said, below explains in detail what's in the tarred compressed file.

Directory structure
===================
.
+-- logs
    +-- filtered-drm
    +-- timing-stripped
        +-- filtered-drm

This structure is as follows:
    . 
    =
    The script that creates the log files and the script that turns on any screens that are off during boot (more on this one later).

    ./logs
    ======
    The raw log files the script gathered which include:
        dmsg.txt                            - from dmesg
        proc-cmdline.txt                    - from /proc/cmdline
        module-kernel-parameters.txt        - from /sys/module/kernel/parameters/*
        module-processor-parameters.txt     - from /sys/module/processor/parameters/*
        sys-module-radeon-parameters.txt    - from /sys/module/radeon/parameters/*
        Xorg.0.log.txt                      - from /var/log/Xorg.0.log

    ./logs/filtered-drm
    ===================
    Some of the above raw log files with lines that do not contain radeon information removed - makes it easier to see what's relevant. If you want to know exactly how the lines were filtered you can look at the script ./gather-info-for-diagnostics.sh.

    ./logs/timing-stripped
    ======================
    The above raw log files with the timing at the beginning of each line removed. This makes using diff programs easier (I use meld on Linux). If you want to know exactly how this was done you can look at the script ./gather-info-for-diagnostics.sh.

    ./logs/timing-stripped/filtered-drm
    ===================================
    Some of the above raw log files with the timing at the beginning of each line removed and lines that do not contain radeon information removed. Again, makes it easier to see what's relevant.  If you want to know exactly how this was done you can look at the script ./gather-info-for-diagnostics.sh.
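For readers without the archive, the two transformations can be approximated as below. This is only a sketch: the real rules are in ./gather-info-for-diagnostics.sh and may differ. It assumes dmesg-style lines that begin with a bracketed timestamp such as "[    3.123456]", and it assumes that "radeon information" means lines mentioning radeon or drm. The function names strip_timing and filter_drm are hypothetical.

```shell
# strip_timing: remove a leading "[   seconds.fraction] " timestamp from
# each line on stdin, so that two boots can be diffed line by line.
strip_timing() {
    sed -E 's/^\[ *[0-9]+\.[0-9]+\] ?//'
}

# filter_drm: keep only lines that mention radeon or drm
# (case-insensitive). This pattern is an assumption, not the one the
# attached script uses.
filter_drm() {
    grep -i -E 'radeon|drm'
}

# Example:
#   dmesg | strip_timing | filter_drm > dmsg-filtered.txt
```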


Scripts
=======

./gather-info-for-diagnostics.sh
--------------------------------
Does all the heavy lifting in gathering the info.

./display-on.sh
---------------
This was a curious discovery and may make fixing the issue easier. I found that when the script was like this:

    xrandr --output DisplayPort-${1} --mode 1920x1080
    xrandr --output DisplayPort-${1} --mode 2560x1440

it would sometimes turn the display on, but at other times it would turn it off. To consistently turn the display on I had to change it to this:

    xrandr --output DisplayPort-${1} --mode 1920x1080
    sleep 5
    xrandr --output DisplayPort-${1} --mode 2560x1440

suggesting there might be a timing problem that needs to be addressed. Even though running this script can turn on a display that was erroneously off during boot, the display turns itself back off after a few seconds, so it's not a usable workaround. My guess is that the kernel sets some status flag during boot that can't be changed or overridden and that eventually reasserts itself.

Update: It may not be that the 5 second delay solved the issue. It may be that just running the script again was the solution. Perhaps the first run cleared some cache; I'm not really sure. Some experimentation is needed on this one.
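One way to separate the two explanations is to make the delay a parameter, so the same mode switch can be run with and without a pause. The sketch below is not the attached display-on.sh; the function name cycle_mode and the XRANDR override variable are hypothetical, and only the xrandr --output and --mode options already shown above are used:

```shell
# XRANDR may be overridden for a dry run, e.g. XRANDR=echo, which prints
# the arguments instead of changing any display.
XRANDR=${XRANDR:-xrandr}

# cycle_mode INDEX DELAY: switch DisplayPort-INDEX to 1920x1080, wait
# DELAY seconds, then switch it back to 2560x1440.
cycle_mode() {
    "$XRANDR" --output "DisplayPort-$1" --mode 1920x1080
    sleep "$2"
    "$XRANDR" --output "DisplayPort-$1" --mode 2560x1440
}

# Example:
#   cycle_mode 5 0; cycle_mode 5 0    # two runs, no delay
#   cycle_mode 5 5                    # one run, 5 second delay
```

If two back-to-back runs with no delay bring the display up as reliably as one run with the 5 second pause, the re-run rather than the delay is the more likely factor.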

File Names
==========

File names take the form of:
    <what happened to the screens at boot>_<partial command line when booting the kernel>_<the file name>.txt
    E.g. The file:
        screens-0-4-good-5-bad_kernel-4.7.2-1-default_logo.nologo-radeon.audio=0-debug-debug_objects_dmsg.txt

    can be broken down to:
        screens-0-4-good-5-bad      = The first 5 of the 6 screens came on as they should during boot but the 6th one (number 5) did not.
        kernel-4.7.2-1-default_logo.nologo-radeon.audio=0-debug-debug_objects
                                    = shows most of the boot command line
        dmsg                        = A key indicating the file contents, from dmesg in this case
        .txt                        = That this is a text file

If the file starts off with something like this:  screens-0-5-good-after-5-fixed-with_display-on.sh it means after booting and logging in I ran the script ./display-on.sh to turn on the display and then gathered all the log information. I will have gathered the log information prior to running the script as well so you will also see files prefixed with just screens-0-5-good in such a case.
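The naming scheme can be taken apart again with shell parameter expansion. This is a sketch only: the function name parse_log_name and the log_* variables are hypothetical and not part of the attached scripts. It splits on the first "_kernel-" rather than on the first underscore, because both the screen-state prefix (with_display-on.sh) and the command line (debug_objects) can contain underscores themselves.

```shell
# parse_log_name FILE: split a log file name into its three parts and
# store them in log_screens, log_cmdline and log_key.
# Returns 1 if the name does not contain "_kernel-".
parse_log_name() {
    pl_base=${1##*/}          # drop any directory part
    pl_base=${pl_base%.txt}   # drop the .txt extension
    case $pl_base in
        *_kernel-*) ;;
        *) return 1 ;;
    esac
    log_screens=${pl_base%%_kernel-*}      # text before the first "_kernel-"
    pl_rest=kernel-${pl_base#*_kernel-}    # command line plus content key
    log_key=${pl_rest##*_}                 # text after the last underscore
    log_cmdline=${pl_rest%_*}              # everything before that underscore
}

# Example:
#   parse_log_name 'screens-0-4-good-5-bad_kernel-4.7.2-1-default_logo.nologo-radeon.audio=0-debug-debug_objects_dmsg.txt'
#   echo "$log_screens | $log_cmdline | $log_key"
```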

Let me know what else I can do to help.
Comment 1 Martin Peres 2019-11-19 09:18:26 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/739.

