Bug 41121

Summary: Apple Thunderbolt display is not initialized when plugged into iMac 12,2 (Radeon 6870)
Product: DRI Reporter: Brad Campbell <brad>
Component: DRM/RadeonAssignee: Default DRI bug account <dri-devel>
Status: RESOLVED FIXED QA Contact:
Severity: normal    
Priority: medium Keywords: regression
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Attachments:
Description Flags
Clean boot with nothing plugged in
none
Boot with thunderbolt plugged in
none
boot with DVI plugged in
none
boot with DVI plugged in and thunderbolt inserted after about 30s
none
lspci -vv
none
gzipped config none

Description Brad Campbell 2011-09-22 07:40:29 UTC
Created attachment 51518 [details] [review]
Clean boot with nothing plugged in

This iMac has 2 "Thunderbolt" (mini-displayport connectors on the back). They will happily drive a pair of 1920x1200 DVI monitors with mini-displayport to DVI dongles, but will not look at the Apple "thunderbolt" monitor when plugged in.

It does work in OSX however.

If the monitor is plugged in at boot, the kernel appears to read the EDID correctly (the modes are displayed as what they should be), however it always fails to bring the displayport link up.

I've included 4 dmesg runs from clean boot to a single user console.
1) clean with no external monitors
2) clean with thunderbolt monitor at boot
3) clean with DVI at boot
4) (3) but with thunderbolt plugged in at about the 30s point. (This often reduces the machine to a steaming pile of molasses, almost like an interrupt storm but with no outward signs except a very slow console/processing.
Comment 1 Brad Campbell 2011-09-22 07:40:57 UTC
Created attachment 51519 [details] [review]
Boot with thunderbolt plugged in
Comment 2 Brad Campbell 2011-09-22 07:41:25 UTC
Created attachment 51520 [details] [review]
boot with DVI plugged in
Comment 3 Brad Campbell 2011-09-22 07:41:59 UTC
Created attachment 51521 [details] [review]
boot with DVI plugged in and thunderbolt inserted after about 30s
Comment 4 Brad Campbell 2011-09-22 07:43:34 UTC
Created attachment 51522 [details] [review]
lspci -vv
Comment 5 Brad Campbell 2011-09-22 07:43:54 UTC
Created attachment 51523 [details]
gzipped config
Comment 6 Brad Campbell 2011-09-28 00:39:56 UTC
Ok, so some kernel version backtracking shows that this worked in 2.6.39.
I've run a bisect between 2.6.39 and 3.0-rc1, but I keep getting lost in the major displayport re-write that happened in there.

When the machine is booted using Bootcamp v4 directly (not via refit) the bios sets up the thunderbolt bridge, makes all the PCIe end devices visible on the system and wires the displayport through to the radeon.

I've added the regression tag to this bug.

I'm quite happy to spend time on irc running crash and burn cycles as required.
Comment 7 Brad Campbell 2011-09-29 03:46:09 UTC
Further bisection reveals this commit broke it :

Commit 834b2904bbfde3d85b5e984688777d56e9c7bf80
Author: Alex Deucher <alexdeucher@gmail.com>
Date:   Fri May 20 04:34:24 2011 -0400

    drm/radeon/kms: improve aux error handling

    Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
    Signed-off-by: Dave Airlie <airlied@redhat.com>


Hacking the aux retries back in fixes it

diff --git a/drivers/gpu/drm/radeon/atombios_dp.c b/drivers/gpu/drm/radeon/atombios_dp.c
index 7ad43c6..1a44aa9 100644
--- a/drivers/gpu/drm/radeon/atombios_dp.c
+++ b/drivers/gpu/drm/radeon/atombios_dp.c
@@ -60,11 +60,11 @@ static int radeon_process_aux_ch(struct radeon_i2c_chan *chan,
        int index = GetIndexIntoMasterTable(COMMAND, ProcessAuxChannelTransaction);
        unsigned char *base;
        int recv_bytes;
-
+       int retry_count = 0;
        memset(&args, 0, sizeof(args));
 
        base = (unsigned char *)rdev->mode_info.atom_context->scratch;
-
+retry:
        memcpy(base, send, send_bytes);
 
        args.v1.lpAuxRequest = 0;
@@ -77,8 +77,17 @@ static int radeon_process_aux_ch(struct radeon_i2c_chan *chan,
 
        atom_execute_table(rdev->mode_info.atom_context, index, (uint32_t *)&args);
 
-       *ack = args.v1.ucReplyStatus;
+   if (args.v1.ucReplyStatus && !args.v1.ucDataOutLen) {
+       if (args.v1.ucReplyStatus == 0x20 && retry_count++ < 10){
+                       printk("doing retry : %i\n",retry_count);
+           goto retry;}
+       DRM_DEBUG_KMS("failed to get auxch %02x%02x %02x %02x 0x%02x %02x after %d retries\n",
+             send[1], send[0], send[2], send[3],
+             chan->rec.i2c_id, args.v1.ucReplyStatus, retry_count);
+       return false;
+               }
 
+       *ack = args.v1.ucReplyStatus;
        /* timeout */
        if (args.v1.ucReplyStatus == 1) {
                DRM_DEBUG_KMS("dp_aux_ch timeout\n");
@@ -128,6 +137,7 @@ static int radeon_dp_aux_native_write(struct radeon_connector *radeon_connector,
        while (1) {
                ret = radeon_process_aux_ch(dig_connector->dp_i2c_bus,
                                            msg, msg_bytes, NULL, 0, delay, &ack);
+               printk("aux_native_write : ret : %i\n",ret);
                if (ret < 0)
                        return ret;
                if ((ack & AUX_NATIVE_REPLY_MASK) == AUX_NATIVE_REPLY_ACK)
@@ -158,6 +168,8 @@ static int radeon_dp_aux_native_read(struct radeon_connector *radeon_connector,
        while (1) {
                ret = radeon_process_aux_ch(dig_connector->dp_i2c_bus,
                                            msg, msg_bytes, recv, recv_bytes, delay, &ack);
+               
+               printk("aux_native_read : ret : %i\n",ret);
                if (ret == 0)
                        return -EPROTO;
                if (ret < 0)

Quite a few retries sprinkled through the log, so it appears it is required on some links.

[   16.365260] ADDRCONF(NETDEV_UP): eth2: link is not ready
[   17.393646] aux_native_read : ret : 8
[   17.592073] aux_native_read : ret : 0
[   17.604003] aux_native_read : ret : 0
[   17.612451] doing retry : 1
[   17.612689] aux_native_read : ret : 8
[   17.743642] aux_native_read : ret : 0
[   17.751633] aux_native_read : ret : 0
[   18.572898] tg3 0000:02:00.0: eth1: Link is up at 1000 Mbps, full duplex
[   18.572902] tg3 0000:02:00.0: eth1: Flow control is on for TX and on for RX
[   18.572905] tg3 0000:02:00.0: eth1: EEE is disabled
[   18.573642] ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
[   19.078061] NET: Registered protocol family 17
[   19.116439] aux_native_read : ret : 8
[   19.299182] aux_native_read : ret : 0
[   19.307137] aux_native_read : ret : 0
[   19.315577] doing retry : 1
[   19.315815] aux_native_read : ret : 8
[   19.430797] aux_native_read : ret : 0
[   19.438793] aux_native_read : ret : 0
[   19.463489] aux_native_read : ret : 8
[   19.646190] aux_native_read : ret : 0
[   19.654164] aux_native_read : ret : 0
[   19.662574] doing retry : 1
[   19.662823] aux_native_read : ret : 8
[   19.781789] aux_native_read : ret : 0
[   19.789763] aux_native_read : ret : 0
Comment 8 Alex Deucher 2011-09-29 08:01:27 UTC
It should still be retrying, just restructured slightly.  The retry logic just moved into radeon_dp_i2c_aux_ch(), radeon_dp_aux_native_write(), and radeon_dp_aux_native_read(), e.g.,

               else if ((ack & AUX_NATIVE_REPLY_MASK) == AUX_NATIVE_REPLY_DEFER)
                       udelay(400);

Perhaps the delay is causing a problem.  Does removing the udelay(400); help?
Comment 10 Florian Mickler 2011-10-06 05:53:17 UTC
A patch referencing this bug report has been merged in Linux v3.1-rc9:

commit 109bc10d30f33e84f1d7289f0039e0c858ade82f
Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Mon Oct 3 09:13:45 2011 -0400

    drm/radeon/kms: fix regression in DP aux defer handling

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.