[Bug 97635] radeon fails to initialize some DisplayPort monitors

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Thu Sep 8 10:44:42 UTC 2016


https://bugs.freedesktop.org/show_bug.cgi?id=97635

            Bug ID: 97635
           Summary: radeon fails to initialize some DisplayPort monitors
           Product: DRI
           Version: XOrg git
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: DRM/Radeon
          Assignee: dri-devel at lists.freedesktop.org
          Reporter: nybbles2bytes at gmail.com

Created attachment 126301
  --> https://bugs.freedesktop.org/attachment.cgi?id=126301&action=edit
Logs to compare all screens properly booted to some not

After a false start or two, I have been directed here as the right place to
report this issue. I believe I am in a good position to help with DisplayPort
issues (and want to do so) because I have been able to generate both working
and non-working logs, and because my system has 6 DisplayPort outputs. I have
also put together a wealth of information (gathered by script for completeness
and consistency) that should help the development team nail down the cause of
this issue.

Here's everything I have been able to determine, but first the hardware setup:
My graphics card is an "HD 5870 Eyefinity 6", which has 6 DisplayPort outputs.
I have the monitors set up in a grid of 3 across by 2 down. Each display runs
at a resolution of 2560x1440, giving a total work area of 7680x2880 in a
Xinerama setup running on the KDE4 desktop.

I currently have 3 kernels in my grub list which are:
  kernel-3.16.7
  kernel-4.7.0
  kernel-4.7.2

All of these run under openSUSE Tumbleweed; kernel-3.16.7, however, originally
came with openSUSE 13.2.

I have no evidence that my problem is related to having so many DisplayPort
screens, but it does allow me to see more variations of the problem than most
people do, which should help pinpoint the real cause (hopefully!).

Focusing on kernel-4.7.2: the kernel only turns on the first two displays.
That happens during boot, long before Xorg gets loaded.

Xorg's behavior is a little strange when the kernel hands it DisplayPort
outputs that are off. Xorg acknowledges all 6 displays but is not able to turn
on any that were off while the kernel was handling them, e.g. the last 4
monitors in the case of the 4.x kernels.

The upshot is that when I go to the multidisplay setup part of KDE, all 6
displays show as active even though only the first two are actually turned on.
If I disable and re-enable the displays that are off, they don't turn on. If I
use xrandr to turn them on, no dice. That is, if they were off while the kernel
was handling them they are off for good; nothing in Xorg or KDE that I have
found can change it.

That said, adding radeon.audio=0 to the boot command line makes things better
but doesn't fix the issue completely. With that setting I sometimes get all 6
displays to come up, but more often 5 out of 6 come up and one does not.
Usually the last one (DisplayPort 5) is the one that fails, though not always.
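
One way to look at the mismatch described above (KDE and xrandr reporting a
display as active while the panel stays dark) is to read the connector state
the kernel exposes under /sys/class/drm. The following is only a sketch and is
not part of the attached scripts. The function name show_dp_state and the
card*-DP-* glob are assumptions; note that the kernel's connector names (for
example card0-DP-1) are not the same as the xrandr output names
(DisplayPort-0 and so on).

```shell
#!/bin/sh
# Hypothetical helper: print the kernel's view of each DisplayPort connector.
# Usage: show_dp_state [base-dir]   (base-dir defaults to /sys/class/drm)
show_dp_state() {
    base="${1:-/sys/class/drm}"
    for conn in "$base"/card*-DP-*; do
        # Skip the unexpanded glob when no connector directory matches.
        [ -d "$conn" ] || continue
        # status: connected/disconnected, enabled: enabled/disabled, dpms: On/Off
        printf '%s status=%s enabled=%s dpms=%s\n' \
            "$(basename "$conn")" \
            "$(cat "$conn/status" 2>/dev/null)" \
            "$(cat "$conn/enabled" 2>/dev/null)" \
            "$(cat "$conn/dpms" 2>/dev/null)"
    done
}
```

Running show_dp_state once after a good boot and once after a bad boot would
give two short listings that can be compared side by side.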

I went to the trouble to write a script to gather information and I think I got
enough to show where things are going wrong. At least enough to show a
difference between a good and bad boot and I will help with more information as
needed. I really want to get this problem solved and I'll do whatever I can to
help. 

In the tarred file, to see what's different between a good and bad boot all you
have to do is a diff on these two files in ./logs/timing-stripped/filtered-drm/:
    screens-0-4-good-5-bad_kernel-4.7.2-1-default_logo.nologo-radeon.audio=0-debug-debug_objects_dmsg.txt
    screens-0-5-good_kernel-4.7.2-1-default_logo.nologo-radeon.audio=0-debug-debug_objects_dmsg.txt
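
The comparison can be sketched as follows, assuming the tar file has been
unpacked into the current directory. The helper name compare_boots is
hypothetical, and diff -u is only one choice; any diff tool works.

```shell
#!/bin/sh
# Hypothetical helper: unified diff of a good-boot log against a bad-boot log.
# The exit status is diff's own: 0 if identical, 1 if the files differ.
compare_boots() {
    good="$1"
    bad="$2"
    diff -u "$good" "$bad"
}
```

For example, with dir=./logs/timing-stripped/filtered-drm, the call would be
compare_boots "$dir/<good file>" "$dir/<bad file>" using the two file names
listed above.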

Anybody who wanted to also gather comprehensive information for the developers
could take the file ./gather-info-for-diagnostics.sh in the tarred file and
modify as needed for their own system.

The rest of this report explains in detail what's in the compressed tar file.

Directory structure
===================
.
+-- logs
    +-- filtered-drm
    +-- timing-stripped
        +-- filtered-drm

This structure is as follows:
    .
    =
    The script that creates the log files and the script to turn on any
    screens that are off during boot (more on this one later).

    ./logs
    ======
    The raw log files the script gathered, which include:
        dmsg.txt                          - from dmesg
        proc-cmdline.txt                  - from /proc/cmdline
        module-kernel-parameters.txt      - from /sys/module/kernel/parameters/*
        module-processor-parameters.txt   - from /sys/module/processor/parameters/*
        sys-module-radeon-parameters.txt  - from /sys/module/radeon/parameters/*
        Xorg.0.log.txt                    - from /var/log/Xorg.0.log
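
    A minimal sketch of how files like these could be collected is shown
    here. It is not the attached gather-info-for-diagnostics.sh; the function
    names and the "name = value" output layout are assumptions made for
    illustration, and dmesg may need root on systems that restrict it.

```shell
#!/bin/sh
# Hypothetical helper: print every readable parameter file in a module's
# parameters directory as "name = value", one per line.
dump_params() {
    for f in "$1"/*; do
        [ -f "$f" ] && [ -r "$f" ] || continue
        printf '%s = %s\n' "$(basename "$f")" "$(cat "$f")"
    done
}

# Hypothetical driver: collect some of the raw logs into an output directory.
gather_logs() {
    out="$1"
    mkdir -p "$out"
    dmesg > "$out/dmsg.txt"
    cat /proc/cmdline > "$out/proc-cmdline.txt"
    dump_params /sys/module/radeon/parameters \
        > "$out/sys-module-radeon-parameters.txt"
    cp /var/log/Xorg.0.log "$out/Xorg.0.log.txt"
}
```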

    ./logs/filtered-drm
    ===================
    Some of the above raw log files with lines that do not contain radeon
    information removed - makes it easier to see what's relevant. If you want
    to know exactly how the lines were filtered you can look at the script
    ./gather-info-for-diagnostics.sh.
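
    As a rough illustration of this kind of filtering (the exact pattern used
    by the attached script is not reproduced here, so the one below is an
    assumption), lines mentioning radeon or drm can be kept with grep:

```shell
#!/bin/sh
# Hypothetical filter: keep only lines that mention radeon or drm
# (case-insensitive). Reads the named files, or stdin if none are given.
filter_drm() {
    grep -i -E 'radeon|drm' "$@"
}
```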

    ./logs/timing-stripped
    ======================
    The above raw log files with the timing at the beginning of each line
    removed. This makes using diff programs easier (I use meld on Linux). If
    you want to know exactly how this was done you can look at the script
    ./gather-info-for-diagnostics.sh.
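
    A sketch of the idea, assuming the timing prefix has the usual bracketed
    "[   12.345678] " form that dmesg and Xorg.0.log use. The function name
    and the exact expression are assumptions, not taken from the attached
    script:

```shell
#!/bin/sh
# Hypothetical helper: strip a leading "[  seconds.fraction] " timestamp
# from each line. Reads the named files, or stdin if none are given.
strip_timing() {
    sed -E 's/^\[ *[0-9]+\.[0-9]+\] ?//' "$@"
}
```

    Only a bracketed number at the very start of the line is removed, so tags
    such as [drm] later in the line are left alone.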

    ./logs/timing-stripped/filtered-drm
    ===================================
    Some of the above raw log files with the timing at the beginning of each
    line removed and lines that do not contain radeon information removed.
    Again, makes it easier to see what's relevant. If you want to know exactly
    how this was done you can look at the script
    ./gather-info-for-diagnostics.sh.


Scripts
=======

./gather-info-for-diagnostics.sh
--------------------------------
Does all the heavy lifting in gathering the info.

./display-on.sh
---------------
This was a curious discovery and may make fixing the issue easier. I found
that when the script was like this:

    xrandr --output DisplayPort-${1} --mode 1920x1080
    xrandr --output DisplayPort-${1} --mode 2560x1440

it would sometimes turn the display on, but at other times it would turn it
off. To consistently turn the display on I had to change it to this:

    xrandr --output DisplayPort-${1} --mode 1920x1080
    sleep 5
    xrandr --output DisplayPort-${1} --mode 2560x1440

suggesting there might be a timing problem that needs to be addressed. Even
though running this script can turn on a display that was erroneously off
during boot, the display turns itself back off after a few seconds, so it's
not a usable workaround. My guess is that some status flag set in the kernel
during boot can't be changed or overridden and eventually reasserts itself.

Update: It may not be the 5 second delay that solved the issue. It may be that
just running the command again was the solution. Perhaps the first run cleared
some cache; I'm not really sure. Some experimenting is needed on this one.
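
To help separate the two possibilities (the delay itself versus simply running
the mode switch a second time), a parameterized variant of the script could be
used. This is only a sketch, not the attached display-on.sh: the function name,
the optional delay argument and the XRANDR override variable are assumptions
added so that the delay can be varied, including down to zero, between runs.

```shell
#!/bin/sh
# Hypothetical variant of display-on.sh.
# Usage: display_on <port-number> [delay-seconds]   (delay defaults to 5)
# Set XRANDR=echo for a dry run that only prints the xrandr arguments.
display_on() {
    port="$1"
    delay="${2:-5}"
    xr="${XRANDR:-xrandr}"
    # Switch to a lower mode first, wait, then switch to the native mode.
    "$xr" --output "DisplayPort-$port" --mode 1920x1080 || return 1
    sleep "$delay"
    "$xr" --output "DisplayPort-$port" --mode 2560x1440
}
```

Comparing display_on 5 0 against display_on 5 5, each from a fresh bad boot,
should show whether the delay matters or whether the second mode switch alone
is enough.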

File Names
==========

File names take the form of:
    <what happened to the screens at boot>_<partial command line when booting the kernel>_<the file name>.txt
    E.g. the file:
        screens-0-4-good-5-bad_kernel-4.7.2-1-default_logo.nologo-radeon.audio=0-debug-debug_objects_dmsg.txt

    can be broken down to:
        screens-0-4-good-5-bad      = The first 5 of the 6 screens came on as
                                      they should during boot but the 6th one
                                      (number 5) did not.
        kernel-4.7.2-1-default_logo.nologo-radeon.audio=0-debug-debug_objects
                                    = Shows most of the boot command line
        dmsg                        = A key indicating the file contents, from
                                      dmesg in this case
        .txt                        = That this is a text file
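
The breakdown above can be sketched with shell parameter expansion. The
function name parse_log_name is hypothetical, and the sketch assumes the
screen-state part contains no underscore, so a prefix that itself contains an
underscore would be split incorrectly.

```shell
#!/bin/sh
# Hypothetical parser for log file names of the form
#   <screen-state>_<boot-command-line>_<key>.txt
parse_log_name() {
    name="$(basename "$1")"
    state="${name%%_*}"     # text before the first underscore
    rest="${name#*_}"       # everything after the first underscore
    cmdline="${rest%_*}"    # drop the trailing _<key>.txt
    key="${rest##*_}"       # text after the last underscore
    key="${key%.txt}"       # drop the .txt extension
    printf 'state=%s\ncmdline=%s\nkey=%s\n' "$state" "$cmdline" "$key"
}
```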

If the file starts off with something like this: 
screens-0-5-good-after-5-fixed-with_display-on.sh it means after booting and
logging in I ran the script ./display-on.sh to turn on the display and then
gathered all the log information. I will have gathered the log information
prior to running the script as well so you will also see files prefixed with
just screens-0-5-good in such a case.

Let me know what else I can do to help.
