Bug 78439 - [NVC1] Display corruption when DP connector is reattached
Status: RESOLVED WONTFIX
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/nouveau
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: Nouveau Project
QA Contact: Xorg Project Team
Reported: 2014-05-08 13:28 UTC by Andrius Štikonas
Modified: 2016-09-20 14:44 UTC (History)
0 users



Attachments
dmesg (244.37 KB, text/plain)
2014-05-08 13:28 UTC, Andrius Štikonas
vbios.rom (63.00 KB, application/octet-stream)
2014-05-08 13:28 UTC, Andrius Štikonas
Photo of the screen. (1.89 MB, image/jpeg)
2014-05-08 13:30 UTC, Andrius Štikonas
dmesg (drm-next for 3.16) (69.94 KB, text/plain)
2014-06-12 13:30 UTC, Andrius Štikonas
dmesg (3.16) with debug info (249.72 KB, text/plain)
2014-06-12 14:59 UTC, Andrius Štikonas
dmesg (3.16) with debug info (506.88 KB, text/plain)
2014-06-12 15:17 UTC, Andrius Štikonas
dmesg_good (564.07 KB, text/plain)
2014-06-16 15:10 UTC, Andrius Štikonas
test patch (870 bytes, patch)
2014-06-17 05:51 UTC, Ben Skeggs
dmesg_dont_touch_link (407.71 KB, text/plain)
2014-06-18 13:05 UTC, Andrius Štikonas
dmesg.txt (higher modes) (265.60 KB, text/plain)
2015-05-08 18:17 UTC, Andrius Štikonas
dmesg.txt.xz (lower modes, reattaching) (208.54 KB, application/x-xz)
2015-05-08 18:33 UTC, Andrius Štikonas

Description Andrius Štikonas 2014-05-08 13:28:11 UTC
Created attachment 98687 [details]
dmesg

When I unplug and replug the DP connector, the screen becomes corrupted (until I restart X).

I attached the dmesg file. At 266s I plug in the DP connector, so that part of the dmesg corresponds to the screen corruption. Then at 386s I restart X, so that part corresponds to the working state.

I suppose nouveau DDX is doing something wrong because this only happens with X. E.g. everything is fine if I start Wayland.

I'm running:
Linux kernel:		3.14.1
xf86-video-nouveau:	1.0.10
xorg-server:		1.15.0
Comment 1 Andrius Štikonas 2014-05-08 13:28:47 UTC
Created attachment 98688 [details]
vbios.rom
Comment 2 Andrius Štikonas 2014-05-08 13:30:03 UTC
Created attachment 98689 [details]
Photo of the screen.
Comment 3 Ilia Mirkin 2014-05-21 00:17:37 UTC
Does it fix itself if you do a suspend/resume cycle?
Comment 4 Andrius Štikonas 2014-05-21 09:39:37 UTC
I can't test with suspend to RAM. My laptop only partially suspends and then it is stuck in that state.

I tried it with suspend to disk. The laptop also fails to suspend; however, this time it aborts the suspend and returns to the working state with no screen corruption. Moreover, the corruption is completely gone till the next reboot, i.e. I can reattach the connector and the display is still fine.
Comment 5 Ilia Mirkin 2014-06-11 23:31:05 UTC
There is a substantial DP rework bound for 3.16. Can you give http://cgit.freedesktop.org/~airlied/linux/log/?h=drm-next a shot?
Comment 6 Andrius Štikonas 2014-06-12 13:29:40 UTC
(In reply to comment #5)
> There is a substantial DP rework bound for 3.16. Can you give
> http://cgit.freedesktop.org/~airlied/linux/log/?h=drm-next a shot?

Completely broken. Does not even start DP screen.
Comment 7 Andrius Štikonas 2014-06-12 13:30:54 UTC
Created attachment 100920 [details]
dmesg (drm-next for 3.16)
Comment 8 Ilia Mirkin 2014-06-12 14:44:35 UTC
That's unfortunate... would you mind providing a dmesg from a boot with

nouveau.debug=PDISP=trace,VBIOS=trace,I2C=trace,DRM=trace
Comment 9 Andrius Štikonas 2014-06-12 14:59:38 UTC
Created attachment 100923 [details]
dmesg (3.16) with debug info
Comment 10 Andrius Štikonas 2014-06-12 15:17:50 UTC
Created attachment 100924 [details]
dmesg (3.16) with debug info
Comment 11 Ben Skeggs 2014-06-13 00:05:26 UTC
Would you be able to bisect nouveau between 3.15 and current -next to find where it got worse?
Comment 12 Andrius Štikonas 2014-06-13 00:10:27 UTC
(In reply to comment #11)
> Would you be able to bisect nouveau between 3.15 and current -next to find
> where it got worse?

I think so. But it might have to wait till Friday afternoon or Saturday (I can't do this from home now).
Comment 13 Andrius Štikonas 2014-06-14 23:14:35 UTC
I didn't have enough time to finish bisection today but I managed to narrow it down, so it might already be useful. There are just two untested revisions left.


Bad revision is 8894f4919bc43f821775db2cfff4b917871b2102
Untested revisions are ebd6acbb068b6558735eb80aabce1e7af9e78e1e
and 55f083c33feb7231c7574a64cd01b0477715a370
Good revision is 13a61757db124b16cef9b8f9b60ff7337a01b398

I will finish bisecting on Monday.
Comment 14 Andrius Štikonas 2014-06-16 11:56:13 UTC
55f083c33feb7231c7574a64cd01b0477715a370 is the first bad commit which makes the problem worse.
    drm/nouveau/disp/dp: maintain link in response to hpd signal
    
    This previously worked for the most part due to userspace doing a
    modeset in response to HPD interrupts.  This will allow us to
    properly handle cases where sync is lost for other reasons, or if
    userspace isn't caring.

Also, I noticed that the original problem is only present if just the DP screen is active. If I have both the DP and LVDS screens active, then unplugging the DP cable and plugging it back in does not result in display corruption.
Comment 15 Ben Skeggs 2014-06-16 12:46:47 UTC
Thanks for that.  Can I grab the same debug log, but from the (working) commit before it too?

Thanks.
Comment 16 Andrius Štikonas 2014-06-16 15:10:26 UTC
Created attachment 101180 [details]
dmesg_good
Comment 17 Ben Skeggs 2014-06-17 05:51:09 UTC
Created attachment 101204 [details] [review]
test patch

Are you able to test whether this helps at all?
Comment 18 Andrius Štikonas 2014-06-17 11:05:01 UTC
Yes it helps. DP screen is no longer black.

The original issue has diminished too. The corrupted portion of the display is three times narrower. But I haven't tested whether this patch helps or one of the patches after that "bad revision" 8894f4919bc43f821775db2cfff4b917871b2102.
Comment 19 Ben Skeggs 2014-06-17 12:29:13 UTC
(In reply to comment #18)
> Yes it helps. DP screen is no longer black.
> 
> The original issue has diminished too. The corrupted portion of the display
> is three times narrower. But I haven't tested whether this patch helps or
> one of the patches after that "bad revision"
> 8894f4919bc43f821775db2cfff4b917871b2102.

Well, the interesting thing is that the "bad" patch is doing nothing wrong.  There is a change in behaviour in that we train the link at the highest rate supported between the display and the GPU instead of the bare minimum required to support the mode (something which is sensible for supporting MST, and which the NVIDIA binary driver also does).

I've seen some odd behaviour on my GF108 with one particular DP->VGA adapter at the high bit rate too, which the NVIDIA binary driver is also affected by, however, I'm not sure it's the same as you're seeing.  Mine is random, and works a lot of the time.  The bad log *does* look like the same symptoms (the link trains successfully, but continually drops out afterwards).

Would it be at all possible for you to try NVIDIA's driver, and see how it does?  And, if it works, get a trace[1] of it initialising the screen?  Hopefully in your case it works perfectly fine, and I can find something we're doing wrong for the high link rate.

[1] http://nouveau.freedesktop.org/wiki/MmioTrace/
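
Editor's sketch (not part of the original comment): the behaviour change described in comment 19, training at the highest supported rate instead of the minimum the mode needs, can be illustrated roughly as below. This is not the actual nouveau code. The struct layout, the field names `rate`, `bw` and `nr`, and the helpers `mode_datarate`, `pick_minimum` and `pick_maximum` are all hypothetical. The table values are the ones quoted later in comment 24, and the table is assumed to be already restricted to what both the GPU and the display support.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry: total payload rate (kB/s), link-bandwidth code, lane count. */
struct dp_rate {
    uint32_t rate;
    uint8_t  bw;
    uint8_t  nr;
};

/* Illustrative table, sorted fastest first (values as quoted in comment 24). */
static const struct dp_rate dp_rates[] = {
    { 648000, 0x06, 4 },
    { 324000, 0x06, 2 },
    { 162000, 0x06, 1 },
};
#define DP_RATES_COUNT (sizeof(dp_rates) / sizeof(dp_rates[0]))

/* Payload rate a mode needs, in kB/s: pixel clock (kHz) * bits per pixel / 8. */
uint32_t mode_datarate(uint32_t clock_khz, uint32_t bpp)
{
    return (uint32_t)(((uint64_t)clock_khz * bpp) / 8);
}

/* "Bare minimum" policy: slowest entry that still carries the mode, or NULL. */
const struct dp_rate *pick_minimum(uint32_t datarate)
{
    for (size_t i = DP_RATES_COUNT; i-- > 0; ) {
        if (dp_rates[i].rate >= datarate)
            return &dp_rates[i];
    }
    return NULL;
}

/* "Highest rate" policy: always the fastest entry, if it can carry the mode. */
const struct dp_rate *pick_maximum(uint32_t datarate)
{
    return dp_rates[0].rate >= datarate ? &dp_rates[0] : NULL;
}
```

Under these assumptions, a mode with a 65000 kHz pixel clock at 24 bpp needs 195000 kB/s, so `pick_minimum` returns the 2-lane entry while `pick_maximum` returns the 4-lane entry. That difference in link configuration is what the bisected commit is described as introducing.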
Comment 20 Andrius Štikonas 2014-06-17 22:25:32 UTC
I sent mmiotrace to the usual address.
Comment 21 Andrius Štikonas 2014-06-18 13:05:28 UTC
Created attachment 101299 [details]
dmesg_dont_touch_link

dmesg with "drm/nouveau/disp/dp: don't touch link config after success" patch
Comment 22 Andrius Štikonas 2014-06-18 16:15:26 UTC
(In reply to comment #21)
> Created attachment 101299 [details]
> dmesg_dont_touch_link
> 
> dmesg with "drm/nouveau/disp/dp: don't touch link config after success" patch

I forgot to mention that the screen is not working even with this patch.
Comment 23 Andrius Štikonas 2014-10-24 23:20:32 UTC
I retested this with the newest kernel (drm-fixes branch) but the screen is still black.
Comment 24 Andrius Štikonas 2015-05-08 18:17:56 UTC
Created attachment 115643 [details]
dmesg.txt (higher modes)

Ilia asked me to create a dmesg with the following nouveau_dp_rates[] entries enabled:

{  648000, 0x06, 4 },
{  324000, 0x06, 2 },
{  162000, 0x06, 1 },

In this 648 mode the whole screen flickers a lot (no reattaching is necessary to reproduce it; it happens immediately).
Comment 25 Andrius Štikonas 2015-05-08 18:33:43 UTC
Created attachment 115644 [details]
dmesg.txt.xz (lower modes, reattaching)

Only the lower modes are enabled:

{  324000, 0x06, 2 },
{  162000, 0x06, 1 }

When booting, everything starts fine. After reattaching DP connector, the right side of the screen flickers. (Note that flickering in 648 mode is more severe, affects the whole screen and even causes blinking of the screen)
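
Editor's sketch (not part of the original comment): the three numbers in each `nouveau_dp_rates[]` entry quoted in comments 24 and 25 can be read as total payload rate, link-bandwidth code and lane count. That reading is an assumption, as are the field names and the helpers `dp_lane_rate`, `dp_total_rate` and `dp_table_is_consistent` below. It relies on the DisplayPort convention that one bandwidth-code unit is 0.27 Gbit/s on the wire, which after 8b/10b coding leaves 27000 kB/s of payload per lane.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical field names for the triples quoted in the comments. */
struct dp_rate {
    uint32_t rate; /* total payload rate in kB/s */
    uint8_t  bw;   /* link-bandwidth code (0x06 = 1.62 Gbit/s per lane) */
    uint8_t  nr;   /* number of lanes */
};

static const struct dp_rate quoted_rates[] = {
    { 648000, 0x06, 4 },
    { 324000, 0x06, 2 },
    { 162000, 0x06, 1 },
};

/* Payload rate of one lane in kB/s: 27000 kB/s per bandwidth-code unit. */
uint32_t dp_lane_rate(uint8_t bw_code)
{
    return (uint32_t)bw_code * 27000u;
}

/* Payload rate of the whole link in kB/s. */
uint32_t dp_total_rate(uint8_t bw_code, uint8_t lanes)
{
    return dp_lane_rate(bw_code) * lanes;
}

/* Returns 1 if every quoted entry matches bw code * 27000 * lanes, else 0. */
int dp_table_is_consistent(void)
{
    for (size_t i = 0; i < sizeof(quoted_rates) / sizeof(quoted_rates[0]); i++) {
        const struct dp_rate *r = &quoted_rates[i];
        if (r->rate != dp_total_rate(r->bw, r->nr))
            return 0;
    }
    return 1;
}
```

Read this way, all three entries use the same per-lane rate (code 0x06) and differ only in lane count, so the "648 mode" is the 4-lane configuration and the "lower modes" are the 2-lane and 1-lane ones.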
Comment 26 Andrius Štikonas 2015-09-21 13:39:11 UTC
The screen that I used in this bug is now broken. I can't test this bug right now because the replacement screen that I was given does not have a DP connector.
Comment 27 Andrius Štikonas 2016-09-20 14:44:53 UTC
I'll close this because I don't have the hardware any more and nobody else was able to reproduce it.

