Bug 67628 - [NVC1] [BISECTED] Monitor on Display port shows distortions
Summary: [NVC1] [BISECTED] Monitor on Display port shows distortions
Status: RESOLVED FIXED
Product: xorg
Classification: Unclassified
Component: Driver/nouveau
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: Nouveau Project
QA Contact: Xorg Project Team
Duplicates: 74815

Reported: 2013-08-01 15:29 UTC by Torsten Wagner
Modified: 2014-02-19 03:27 UTC

Attachments
Kernel bisect (3.47 KB, text/plain)
2013-08-01 15:29 UTC, Torsten Wagner
dmesg of a working kernel (72.41 KB, text/plain)
2013-08-01 15:30 UTC, Torsten Wagner
dmesg of a kernel with colour problem (71.95 KB, text/plain)
2013-08-01 15:31 UTC, Torsten Wagner
image of the monitor distortion (852.20 KB, image/jpeg)
2013-08-01 15:33 UTC, Torsten Wagner
noveau_debug_NOT_working_kernel_3.12.9-2-ARCH.txt (489.63 KB, text/plain)
2014-02-05 09:28 UTC, Torsten Wagner
noveau_debug_working_kernel_3.8.0-rc6-debug-00080-g5cc027f.txt (500.43 KB, text/plain)
2014-02-05 09:28 UTC, Torsten Wagner
quadro 400 (gt216gl) vbios.rom (61.50 KB, application/octet-stream)
2014-02-12 17:25 UTC, Michael Gulick
Quadro 600 (GF108GL) vbios.rom (60.00 KB, application/octet-stream)
2014-02-12 17:26 UTC, Michael Gulick
dmesg - working dp - linux git 0a0afd28^ (245.12 KB, text/plain)
2014-02-13 21:11 UTC, Michael Gulick
dmesg - broken dp - linux git 0a0afd28 (245.15 KB, text/plain)
2014-02-13 21:11 UTC, Michael Gulick
dmesg - working dp - linux git 0a0afd28 with fixed 4 lanes (360.06 KB, text/plain)
2014-02-13 22:56 UTC, Michael Gulick

Description Torsten Wagner 2013-08-01 15:29:30 UTC
Created attachment 83446 [details]
Kernel bisect

After updating to a kernel > 3.8, the monitor connected via DisplayPort shows heavy colour distortions (see attached image).
I bisected the kernel and it came down to commit

5cc027f6b1ec651c18a4322ed3e30c6e9cf01e96
(drm/nv50-/disp: move DP link training to core and train from supervisor)

This makes sense, since this patch deals with DisplayPort functions.

The problem appears during boot, so I guess it is in the DRM or KMS part.

Attached are dmesg outputs for a working and a non-working kernel,
an image showing the kind of distortion, and the kernel bisect log.

Relevant system information:

01:00.0 VGA compatible controller: NVIDIA Corporation GF108GL [Quadro 600] (rev a1)
Display: DELL U2410
Comment 1 Torsten Wagner 2013-08-01 15:30:16 UTC
Created attachment 83447 [details]
dmesg of a working kernel
Comment 2 Torsten Wagner 2013-08-01 15:31:00 UTC
Created attachment 83448 [details]
dmesg of a kernel with colour problem
Comment 3 Torsten Wagner 2013-08-01 15:33:16 UTC
Created attachment 83449 [details]
image of the monitor distortion
Comment 4 Per Arnold Blaasmo 2013-08-30 10:14:49 UTC
I have the same problem with dual screen and displayport.
I have described it in bug #66129.

Bug #66129 was resolved for VGA displays, but not for displayport.
Comment 5 Ben Skeggs 2013-10-31 04:32:47 UTC
I wasn't able to reproduce your problem exactly, even on the same GPU.  However, I did note additional issues which could be related.  I'll keep you posted once I get around to looking at a fix.
Comment 6 Torsten Wagner 2013-10-31 11:40:13 UTC
Hi,
thanks for the info. If there is anything I could provide or test,
please let me know.
I hope it is not a monitor issue... will test this.



Comment 7 Ilia Mirkin 2013-11-09 21:01:57 UTC
Please try the kernel at nouveau/master (http://cgit.freedesktop.org/nouveau/linux-2.6/), or drm-next. This includes a fix for a particularly nasty bug for NVC1 specifically, which could manifest itself in a wide variety of ways (since 3.11), as well as a few DP-related fixes, e.g. http://cgit.freedesktop.org/nouveau/linux-2.6/commit/?id=fa95062e052723fdeaede16af736c2bcc4cd1537 (and a couple others).
Comment 8 Torsten Wagner 2013-11-26 21:51:29 UTC
Hi,

Sorry for the late reply; it took me some time to test. Unfortunately,
my test a week ago with the then-current git did not show any improvement:
the display connected via DisplayPort still shows the same distortion.
The only thing I can think of that might help now is to connect another
monitor via DisplayPort, to see whether it is just a bad combination of
display manufacturer and GPU/driver.
I will try to do this within the next few days.

Thanks for the help and support.

Torsten


Comment 9 Ilia Mirkin 2014-02-05 03:21:38 UTC
Can you post boot logs of kernels with and without the problem booted with

nouveau.debug=trace drm.debug=0xe
Comment 10 Torsten Wagner 2014-02-05 09:28:31 UTC
Created attachment 93430 [details]
noveau_debug_NOT_working_kernel_3.12.9-2-ARCH.txt

Hi Ilia,
thanks for the feedback. Attached you will find the kernel logs of a
working and a non-working kernel.
A quick look showed that the non-working kernel reads in (or simply
logs) much more data during DisplayPort init.
The lines below can be found in the non-working kernel's log but not in
the working one's:

NOT WORKING KERNEL: (grep DP)

[    1.742406] nouveau T[   VBIOS][0000:01:00.0] 0x57ab[1]: DP_CONDITION    0x05 0x15
[    1.742584] nouveau T[   VBIOS][0000:01:00.0] 0x591d[ ]: DP_CONDITION    0x00 0x08
[    1.742589] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: 2 lanes at 270000 KB/s
[    1.742772] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: training pattern 1
[    1.743115] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: config lane 0 00
[    1.743122] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: config lane 1 00
[    1.743796] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: status 00 00 00 00 cc cc
[    1.743797] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: config lane 0 38
[    1.743807] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: config lane 1 38
[    1.744478] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: status 11 00 00 00 cc cc
[    1.744479] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: training pattern 2
[    1.745486] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: status 11 00 00 00 cc cc
[    1.745487] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: config lane 0 38
[    1.745493] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: config lane 1 38
[    1.746432] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: status 11 00 00 00 88 88
[    1.746432] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: config lane 0 10
[    1.746443] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: config lane 1 10
[    1.747381] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: status 11 00 00 00 44 44
[    1.747382] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: config lane 0 08
[    1.747390] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: config lane 1 08
[    1.748330] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: status 11 00 00 00 00 00
[    1.748331] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: config lane 0 00
[    1.748340] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: config lane 1 00
[    1.749301] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: status 11 00 00 00 44 44
[    1.749301] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: config lane 0 08
[    1.749308] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: config lane 1 08
[    1.750265] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: status 77 00 01 00 44 44
[    1.750265] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: config lane 0 08
[    1.750272] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: config lane 1 08
[    1.750596] nouveau D[   PDISP][0000:01:00.0] DP:0006:0342: training pattern 0
[    1.751350] nouveau T[   VBIOS][0000:01:00.0] 0x58fd[0]: DP_CONDITION    0x00 0x11


vs.

WORKING KERNEL (grep DP)

[    5.677400] nouveau T[   VBIOS][0000:01:00.0] 0x57ab[1]: DP_CONDITION    0x05 0x15
[    5.677577] nouveau T[   VBIOS][0000:01:00.0] 0x591d[ ]: DP_CONDITION    0x00 0x08
[    5.686368] nouveau T[   VBIOS][0000:01:00.0] 0x58fd[0]: DP_CONDITION    0x00 0x11

As you can see, the entire PDISP part is missing from the working
kernel's log. I am not sure whether it is simply not logged there or
whether it is a new feature that creates the problem.

Thanks for looking into this....


Comment 11 Torsten Wagner 2014-02-05 09:28:31 UTC
Created attachment 93431 [details]
noveau_debug_working_kernel_3.8.0-rc6-debug-00080-g5cc027f.txt
Comment 12 Michael Gulick 2014-02-12 15:20:23 UTC
*** Bug 74815 has been marked as a duplicate of this bug. ***
Comment 13 Ilia Mirkin 2014-02-12 15:24:30 UTC
One thing to note, the initial comment is VERY misleading, since it lists the wrong hash! [but the right description, which no one looks at] The correct commit that was bisected as the cause is:

commit 0a0afd282fd715dd63d64b243299a64da14f8e8d
Author: Ben Skeggs <bskeggs@redhat.com>
Date:   Mon Feb 18 23:17:53 2013 -0500

    drm/nv50-/disp: move DP link training to core and train from supervisor
    
    We need to be able to do link training for PIOR-connected ANX9805 from
    the third supervisor handler (due to script ordering in the bios, can't
    have the "user" call train because some settings are overwritten from
    the modesetting bios scripts).
    
    This moves link training for SOR-connected DP encoders to the second
    supervisor interrupt, *before* we call the modesetting scripts (yes,
    different ordering from PIOR is necessary).  This is useful since we
    should now be able to remove some hacks to workaround races between
    the supervisor and link training paths.
    
    Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Comment 14 Michael Gulick 2014-02-12 15:27:50 UTC
I have found a similar issue with the same commit; however, in my case the DisplayPort output is blank (the monitor doesn't sync).  As mentioned in bug #74815, this issue occurred on both Quadro 400 (GT216GL) and Quadro 600 (GF108GL).

Let me know if there is any additional information I can provide to help diagnose this bug.
Comment 15 Ilia Mirkin 2014-02-12 15:32:22 UTC
Please upload your VBIOS (for both cards, if it's not difficult). You can get it from /sys/kernel/debug/dri/0/vbios.rom
Comment 16 Michael Gulick 2014-02-12 17:25:29 UTC
Created attachment 93949 [details]
quadro 400 (gt216gl) vbios.rom
Comment 17 Michael Gulick 2014-02-12 17:26:17 UTC
Created attachment 93950 [details]
Quadro 600 (GF108GL) vbios.rom
Comment 18 Ilia Mirkin 2014-02-13 04:34:08 UTC
Looking at the earlier vbios execution logs:

working:

[    5.517869] nouveau D[   PDISP][0000:01:00.0] supervisor 0x00000010 0x000002a0
[    5.517885] nouveau T[   VBIOS][0000:01:00.0] 0x568f[0]: DONE
[    5.539255] nouveau D[   PDISP][0000:01:00.0] supervisor 0x00000020 0x000002b0
[    5.539272] nouveau T[   VBIOS][0000:01:00.0] 0x5690[0]: NV_REG	R[0x4061c00c] &= 0xfffffffe |= 0x00000001
[    5.539276] nouveau T[   VBIOS][0000:01:00.0] 0x569d[0]: NV_REG	R[0x4061c014] &= 0xff3fffff |= 0x00c00000
[    5.539279] nouveau T[   VBIOS][0000:01:00.0] 0x56aa[0]: DONE
[    5.539332] nouveau D[   PDISP][0000:01:00.0] supervisor 0x00000040 0x000002b0

non-working:

[    1.679797] nouveau D[   PDISP][0000:01:00.0] supervisor 0x00000010 0x000002a0
[    1.679816] nouveau T[   VBIOS][0000:01:00.0] 0x568f[0]: DONE
[    1.707114] nouveau D[   PDISP][0000:01:00.0] supervisor 0x00000020 0x000002b0
[    1.707128] nouveau T[   VBIOS][0000:01:00.0] 0x5690[0]: NV_REG	R[0x4061c00c] &= 0xfffffffe |= 0x00000001
[    1.707133] nouveau T[   VBIOS][0000:01:00.0] 0x569d[0]: NV_REG	R[0x4061c014] &= 0xff3fffff |= 0x00c00000
[    1.707136] nouveau T[   VBIOS][0000:01:00.0] 0x56aa[0]: DONE
[    1.707162] nouveau T[   VBIOS][0000:01:00.0] 0x53c8[0]: ZM_REG_SEQUENCE	0x04
[    1.707163] nouveau T[   VBIOS][0000:01:00.0] 0x53ce[0]: 		R[0x61c00c] = 0x01000000
[    1.707164] nouveau T[   VBIOS][0000:01:00.0] 0x53d2[0]: 		R[0x61c010] = 0x00101500
[    1.707165] nouveau T[   VBIOS][0000:01:00.0] 0x53d6[0]: 		R[0x61c014] = 0x00000000
[    1.707166] nouveau T[   VBIOS][0000:01:00.0] 0x53da[0]: 		R[0x61c018] = 0x00245af8
[    1.707166] nouveau T[   VBIOS][0000:01:00.0] 0x53de[0]: ZM_REG_SEQUENCE	0x02
[    1.707167] nouveau T[   VBIOS][0000:01:00.0] 0x53e4[0]: 		R[0x61c118] = 0x27272727
[    1.707168] nouveau T[   VBIOS][0000:01:00.0] 0x53e8[0]: 		R[0x61c11c] = 0x00000027
[    1.707169] nouveau T[   VBIOS][0000:01:00.0] 0x53ec[0]: ZM_REG_SEQUENCE	0x02
[    1.707169] nouveau T[   VBIOS][0000:01:00.0] 0x53f2[0]: 		R[0x61c198] = 0x27272727
[    1.707170] nouveau T[   VBIOS][0000:01:00.0] 0x53f6[0]: 		R[0x61c19c] = 0x00000027
[    1.707171] nouveau T[   VBIOS][0000:01:00.0] 0x53fa[0]: NV_REG	R[0x61c120] &= 0xffffffff |= 0x00000000
[    1.707175] nouveau T[   VBIOS][0000:01:00.0] 0x5407[0]: NV_REG	R[0x61c1a0] &= 0xffffffff |= 0x00000000
[    1.707178] nouveau T[   VBIOS][0000:01:00.0] 0x5414[0]: SUB_DIRECT	0x5cb0
[    1.707179] nouveau T[   VBIOS][0000:01:00.0] 0x5cb0[1]: DONE
[    1.707180] nouveau T[   VBIOS][0000:01:00.0] 0x5417[0]: SUB_DIRECT	0x546e
[    1.707180] nouveau T[   VBIOS][0000:01:00.0] 0x546e[1]: SUB_DIRECT	0x4f5c
[    1.707181] nouveau T[   VBIOS][0000:01:00.0] 0x4f5c[2]: SUB_DIRECT	0x56ab
[    1.707182] nouveau T[   VBIOS][0000:01:00.0] 0x56ab[3]: ZM_REG_SEQUENCE	0x10
[    1.707183] nouveau T[   VBIOS][0000:01:00.0] 0x56b1[3]: 		R[0x4061c040] = 0x1f0b0000
[    1.707184] nouveau T[   VBIOS][0000:01:00.0] 0x56b5[3]: 		R[0x4061c044] = 0x1f0a0000
[    1.707184] nouveau T[   VBIOS][0000:01:00.0] 0x56b9[3]: 		R[0x4061c048] = 0x1e080000
[    1.707185] nouveau T[   VBIOS][0000:01:00.0] 0x56bd[3]: 		R[0x4061c04c] = 0x1e042000
[    1.707186] nouveau T[   VBIOS][0000:01:00.0] 0x56c1[3]: 		R[0x4061c050] = 0x00008000
[    1.707187] nouveau T[   VBIOS][0000:01:00.0] 0x56c5[3]: 		R[0x4061c054] = 0x00008000
[    1.707188] nouveau T[   VBIOS][0000:01:00.0] 0x56c9[3]: 		R[0x4061c058] = 0x00008000
[    1.707188] nouveau T[   VBIOS][0000:01:00.0] 0x56cd[3]: 		R[0x4061c05c] = 0x00008000
[    1.707189] nouveau T[   VBIOS][0000:01:00.0] 0x56d1[3]: 		R[0x4061c060] = 0x00002000
[    1.707190] nouveau T[   VBIOS][0000:01:00.0] 0x56d5[3]: 		R[0x4061c064] = 0x1f002000
[    1.707191] nouveau T[   VBIOS][0000:01:00.0] 0x56d9[3]: 		R[0x4061c068] = 0x1f0c0000
[    1.707191] nouveau T[   VBIOS][0000:01:00.0] 0x56dd[3]: 		R[0x4061c06c] = 0x1f0a0000
[    1.707192] nouveau T[   VBIOS][0000:01:00.0] 0x56e1[3]: 		R[0x4061c070] = 0x1f0b8000
[    1.707193] nouveau T[   VBIOS][0000:01:00.0] 0x56e5[3]: 		R[0x4061c074] = 0x1f0b8000
[    1.707194] nouveau T[   VBIOS][0000:01:00.0] 0x56e9[3]: 		R[0x4061c078] = 0x1f0b8000
[    1.707195] nouveau T[   VBIOS][0000:01:00.0] 0x56ed[3]: 		R[0x4061c07c] = 0x1f0b8000
[    1.707195] nouveau T[   VBIOS][0000:01:00.0] 0x56f1[3]: DONE
[    1.707196] nouveau T[   VBIOS][0000:01:00.0] 0x4f5f[2]: NV_REG	R[0x4061c130] &= 0xffbf00ff |= 0x00400600
[    1.707201] nouveau T[   VBIOS][0000:01:00.0] 0x4f6c[2]: NV_REG	R[0x4061c1b0] &= 0xffbf00ff |= 0x00400600
[    1.707205] nouveau T[   VBIOS][0000:01:00.0] 0x4f79[2]: NV_REG	R[0x40614300] &= 0xfcf3ffff |= 0x00040000
[    1.707207] nouveau T[   VBIOS][0000:01:00.0] 0x4f86[2]: DONE
[    1.707208] nouveau T[   VBIOS][0000:01:00.0] 0x5471[1]: DONE
[    1.707209] nouveau T[   VBIOS][0000:01:00.0] 0x541a[0]: DONE
[    1.707486] nouveau D[   PDISP][0000:01:00.0] supervisor 0x00000040 0x000002b0
[    1.707505] nouveau T[   VBIOS][0000:01:00.0] 0x4f87[0]: SUB_DIRECT	0x4f9c
[    1.707506] nouveau T[   VBIOS][0000:01:00.0] 0x4f9c[1]: CONDITION_TIME	0x05 0xff
[    1.707508] nouveau T[   VBIOS][0000:01:00.0] 0x4f9f[1]: 	[0x05] (R[0x4061c030] & 0x10000000) == 0x00000000
[    1.707511] nouveau T[   VBIOS][0000:01:00.0] 0x4f9f[1]: RESUME
[    1.707512] nouveau T[   VBIOS][0000:01:00.0] 0x4fa0[1]: DONE
[    1.707512] nouveau T[   VBIOS][0000:01:00.0] 0x4f8a[0]: SUB_DIRECT	0x4fa1
[    1.707513] nouveau T[   VBIOS][0000:01:00.0] 0x4fa1[1]: NV_REG	R[0x4061c10c] &= 0xfffffffe |= 0x00000000
[    1.707516] nouveau T[   VBIOS][0000:01:00.0] 0x4fae[1]: DONE
[    1.707517] nouveau T[   VBIOS][0000:01:00.0] 0x4f8d[0]: DONE


And also the non-working one happens later in the driver load process... I think. (Also the commit description mentions something about that, so not unexpected.) What _is_ unexpected is that the non-working one executes a boatload more scripts. Will compare the script dispatch mechanisms... perhaps the new code is missing a 'break' somewhere.
Comment 19 Ilia Mirkin 2014-02-13 04:47:10 UTC
Hmm... nothing _obvious_ in the code...

Torsten, could you upload your vbios? It seems like your Quadro 600 doesn't quite line up with Michael's vbios (why would it, not like they're identical cards... ugh).

Alternatively, Michael, mind giving me a dmesg from before & after with

nouveau.debug=PDISP=debug,VBIOS=trace drm.debug=0xe

(and make sure to mention which machine it's on)

If it's not _too_ difficult, the before & after should be from 0a0afd282f^ and 0a0afd282f, to minimize differences due to other changes that were made.
Comment 20 Ilia Mirkin 2014-02-13 05:07:19 UTC
Hmmm... the additional vbios seems to be coming from a

exec_clkcmp(priv, head, 0, pclk, &outp);

which the bad commit moved around in core/engine/disp/nv50.c, but it isn't actually new... just moved a tad. Which begs the question... how did the old code get out of executing exec_clkcmp (looks like no one was claiming responsibility for generating the interrupt?), and whether it actually matters.
Comment 21 Michael Gulick 2014-02-13 21:11:08 UTC
Created attachment 94025 [details]
dmesg - working dp - linux git 0a0afd28^
Comment 22 Michael Gulick 2014-02-13 21:11:47 UTC
Created attachment 94026 [details]
dmesg - broken dp - linux git 0a0afd28
Comment 23 Ilia Mirkin 2014-02-13 21:58:20 UTC
OK, those two boots are WAY more similar. Which makes the differences stand out.

working:

[    4.217706] nouveau  [     DRM] 4 lanes at 270000 KB/s
[    4.217708] nouveau T[   VBIOS][0000:03:00.0] 0x577f[0]: ZM_REG	R[0x4061c00c] = 0x01000300
[    4.217709] nouveau T[   VBIOS][0000:03:00.0] 0x5788[0]: DONE
[    4.218118] nouveau  [     DRM] training pattern 1
[    4.218991] nouveau  [     DRM] config lane 0 00
[    4.219000] nouveau  [     DRM] config lane 1 00
[    4.219007] nouveau  [     DRM] config lane 2 00
[    4.219014] nouveau  [     DRM] config lane 3 00
[    4.219980] nouveau  [     DRM] status 00 00 80 02 22 22
[    4.219981] nouveau  [     DRM] config lane 0 02
[    4.219990] nouveau  [     DRM] config lane 1 02
[    4.219999] nouveau  [     DRM] config lane 2 02
[    4.220008] nouveau  [     DRM] config lane 3 02
[    4.220995] nouveau  [     DRM] status 11 11 80 02 22 22
[    4.220996] nouveau  [     DRM] training pattern 2
[    4.222708] nouveau  [     DRM] status 11 11 80 02 66 66
[    4.222708] nouveau  [     DRM] config lane 0 0a
[    4.222717] nouveau  [     DRM] config lane 1 0a
[    4.222723] nouveau  [     DRM] config lane 2 0a
[    4.222730] nouveau  [     DRM] config lane 3 0a
[    4.223994] nouveau  [     DRM] status 77 77 81 02 66 66
[    4.223994] nouveau  [     DRM] config lane 0 0a
[    4.224001] nouveau  [     DRM] config lane 1 0a
[    4.224008] nouveau  [     DRM] config lane 2 0a
[    4.224015] nouveau  [     DRM] config lane 3 0a
[    4.224440] nouveau  [     DRM] training pattern 0

non-working:

[    4.371173] nouveau D[   PDISP][0000:03:00.0] DP:0006:0342: 2 lanes at 270000 KB/s
[    4.371589] nouveau T[   VBIOS][0000:03:00.0] 0x577f[0]: ZM_REG	R[0x4061c00c] = 0x01000300
[    4.371590] nouveau T[   VBIOS][0000:03:00.0] 0x5788[0]: DONE
[    4.371596] nouveau D[   PDISP][0000:03:00.0] DP:0006:0342: training pattern 1
[    4.372470] nouveau D[   PDISP][0000:03:00.0] DP:0006:0342: config lane 0 00
[    4.372478] nouveau D[   PDISP][0000:03:00.0] DP:0006:0342: config lane 1 00
[    4.373451] nouveau D[   PDISP][0000:03:00.0] DP:0006:0342: status 00 00 80 02 22 00
[    4.373465] nouveau D[   PDISP][0000:03:00.0] DP:0006:0342: config lane 0 02
[    4.373473] nouveau D[   PDISP][0000:03:00.0] DP:0006:0342: config lane 1 02
[    4.374475] nouveau D[   PDISP][0000:03:00.0] DP:0006:0342: status 11 00 80 02 22 00
[    4.374476] nouveau D[   PDISP][0000:03:00.0] DP:0006:0342: training pattern 2
[    4.376205] nouveau D[   PDISP][0000:03:00.0] DP:0006:0342: status 11 00 80 02 66 00
[    4.376207] nouveau D[   PDISP][0000:03:00.0] DP:0006:0342: config lane 0 0a
[    4.376218] nouveau D[   PDISP][0000:03:00.0] DP:0006:0342: config lane 1 0a
[    4.377498] nouveau D[   PDISP][0000:03:00.0] DP:0006:0342: status 77 00 81 02 66 00
[    4.377499] nouveau D[   PDISP][0000:03:00.0] DP:0006:0342: config lane 0 0a
[    4.377508] nouveau D[   PDISP][0000:03:00.0] DP:0006:0342: config lane 1 0a
[    4.377934] nouveau D[   PDISP][0000:03:00.0] DP:0006:0342: training pattern 0

Note the difference in the quantity of lanes detected. No idea if this actually matters. But that's the only difference of any significance that I can see.

Could you just fudge your nouveau_dp_train function to determine that you need 4 lanes and see if that helps?
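
For readers decoding the "status" lines in the logs above, here is a minimal sketch. It assumes the six bytes printed are DPCD registers 0x202 through 0x207, so that the first two bytes carry one status nibble per lane; the thread itself does not state this. The bit meanings are the per-lane status bits from the DisplayPort standard. The helper names are hypothetical and not part of nouveau.

```c
#include <assert.h>
#include <stdint.h>

/* Per-lane status bits as defined by the DisplayPort standard
 * (DPCD 0x202/0x203, one nibble per lane). */
#define LANE_CR_DONE         0x1  /* clock recovery done */
#define LANE_CHANNEL_EQ_DONE 0x2  /* channel equalization done */
#define LANE_SYMBOL_LOCKED   0x4  /* symbol lock achieved */

/* Return the 4-bit status nibble for `lane` (0..3).
 * Assumption: stat[0] and stat[1] hold DPCD 0x202 and 0x203. */
static unsigned lane_status(const uint8_t stat[6], int lane)
{
	assert(lane >= 0 && lane < 4);
	return (stat[lane >> 1] >> ((lane & 1) * 4)) & 0xf;
}

/* Count the lanes that report all three done/locked bits. */
static int lanes_fully_trained(const uint8_t stat[6])
{
	const unsigned all = LANE_CR_DONE | LANE_CHANNEL_EQ_DONE |
			     LANE_SYMBOL_LOCKED;
	int lane, count = 0;

	for (lane = 0; lane < 4; lane++)
		if ((lane_status(stat, lane) & all) == all)
			count++;
	return count;
}
```

Under that assumption, the final working status "77 77 81 02 66 66" reports four fully trained lanes, while the final non-working status "77 00 81 02 66 00" reports only lanes 0 and 1, which matches the lane counts printed at the start of each log.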
Comment 24 Michael Gulick 2014-02-13 22:19:35 UTC
It looks like that function is in drivers/gpu/drm/nouveau/core/engine/disp/dport.c, but I can't figure out how it should be modified to set the lane count.
Comment 25 Ilia Mirkin 2014-02-13 22:25:01 UTC
(In reply to comment #24)
> It looks like that function is in
> drivers/gpu/drm/nouveau/core/engine/disp/dport.c, but I can't figure out how
> it should be modified to set the lane count.

Change

		dp->link_nr = dp->dpcd[2] & DPCD_RC02_MAX_LANE_COUNT;
		while ((dp->link_nr >> 1) * link_bw[0] > datarate)
			dp->link_nr >>= 1;

To

dp->link_nr = 4;
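
To see what the replaced loop does, here is a standalone sketch of the same halving logic. The function name and parameters are hypothetical: max_lanes stands in for dp->dpcd[2] & DPCD_RC02_MAX_LANE_COUNT, link_bw for link_bw[0], and the datarate values used to exercise it are made up for illustration.

```c
#include <assert.h>

/* Halve the lane count while half as many lanes would still carry more
 * than `datarate`, mirroring the loop quoted above.
 * All names here are illustrative stand-ins, not driver symbols. */
static unsigned min_lane_count(unsigned max_lanes, unsigned link_bw,
			       unsigned datarate)
{
	unsigned link_nr = max_lanes;

	assert(max_lanes == 1 || max_lanes == 2 || max_lanes == 4);
	/* At link_nr == 1 the left side is 0, so the loop always ends. */
	while ((link_nr >> 1) * link_bw > datarate)
		link_nr >>= 1;
	return link_nr;
}
```

With a 4-lane sink at 270000 KB/s per lane, a hypothetical datarate of 700000 keeps 4 lanes and a hypothetical datarate of 400000 drops to 2. Hard-coding dp->link_nr = 4 skips this reduction entirely, so it is only a diagnostic hack.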
Comment 26 Michael Gulick 2014-02-13 22:55:20 UTC
(In reply to comment #25)
Yes, after making that change, the displayport output is working again.

I'll attach the dmesg in case you're curious.
Comment 27 Michael Gulick 2014-02-13 22:56:48 UTC
Created attachment 94033 [details]
dmesg - working dp - linux git 0a0afd28 with fixed 4 lanes
Comment 28 Ilia Mirkin 2014-02-13 22:58:10 UTC
(In reply to comment #26)
> (In reply to comment #25)
> Yes, after making that change, the displayport output is working again.
> 
> I'll attach the dmesg in case you're curious.

Fantastic!

I think what would be interesting would be adding a _ton_ of prints to that function, basically at every step, and comparing to the old version (which unfortunately was in a somewhat different place, but look at the diff and it should be apparent). Reading over the code, the computations should be identical, but perhaps there's a subtle bug somewhere, or the inputs end up different.

If you're not up to figuring out how to do this yourself, I can try to write up some patches to apply to the old + new versions.
Comment 29 Ilia Mirkin 2014-02-13 23:23:37 UTC
FWIW in Torsten's logs, he also went from 4x 270000 to 2x 270000. It sounds like the value of 'bandwidth' is different now, since when forcing 4 lanes, Michael's logs indicate 4x 162000 (and it otherwise did 2x 270000... so the 'new' bandwidth value is between 540000 and 324000, while before it must have been above 648000 [4x 162000]).
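
The bounds quoted above are products of lane count and per-lane rate. A minimal sketch of that arithmetic follows, assuming the driver settles on the smallest link configuration that still carries the required rate (my reading of the comment, not something the thread spells out). The helper names and the sample datarate in the usage note are hypothetical.

```c
#include <assert.h>

/* Total link capacity for a lane count and per-lane rate, in the same
 * units the log prints (KB/s). */
static unsigned link_capacity(unsigned lanes, unsigned rate)
{
	assert(lanes == 1 || lanes == 2 || lanes == 4);
	return lanes * rate;
}

/* Nonzero if a link of `lanes` x `rate` can carry `datarate`. */
static int link_fits(unsigned lanes, unsigned rate, unsigned datarate)
{
	return link_capacity(lanes, rate) >= datarate;
}
```

Here link_capacity(2, 270000) is 540000, link_capacity(2, 162000) is 324000 and link_capacity(4, 162000) is 648000. A made-up datarate of 433125 fits 2x 270000 and 4x 162000 but not 2x 162000, which is the window described for the new code.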
Comment 30 Ilia Mirkin 2014-02-14 00:30:27 UTC
Here's a crazy idea --

in drivers/gpu/drm/nouveau/core/engine/disp/nv50.c:nv50_disp_intr_unk20_2, change

u32 ctrl = nv_rd32(priv, 0x610798 + soff);

to

u32 ctrl = nv_rd32(priv, 0x610794 + soff);

And obviously get rid of the link_nr = 4 hack.

(I'm basing this on the fact that nv50_disp_intr_unk20_2_dp reads the ctrl value from there for figuring out the bpp.)
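
As a rough illustration of why the depth matters, here is a sketch that assumes the required data rate scales as pixel clock times bits per pixel. The thread does not show the driver's actual computation, and the pixel clock and depths used below are hypothetical.

```c
#include <assert.h>

/* Payload data rate in KB/s for a pixel clock in kHz and a depth in
 * bits per pixel (kHz * bits / 8 = kilobytes per second). */
static unsigned payload_datarate(unsigned pclk_khz, unsigned bpp)
{
	assert(bpp > 0);
	return pclk_khz * bpp / 8;
}
```

For a made-up 154000 kHz pixel clock, 18 bpp gives 346500 KB/s and 30 bpp gives 577500 KB/s. If the depth were taken from the wrong control word, the computed rate could come out too low, which in turn could make the lane selection settle on fewer lanes than the mode needs.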
Comment 31 Ilia Mirkin 2014-02-14 19:35:54 UTC
As per the response to the patch I sent, sounds like things are working for both Michael and Torsten now:

http://lists.freedesktop.org/archives/nouveau/2014-February/016205.html

[hm, the responses aren't there yet, but I'm sure someone will push them through the moderation queue eventually.]

I'll mark the bug as fixed once this hits upstream.
Comment 32 Ilia Mirkin 2014-02-19 03:27:38 UTC
The fix should now be upstream and will be included in the next -rc, as well as various stable kernel trees:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=a7f1c1e65b68e1e1ab70898528d5977ed68a0a7d

