Bug 97244 - [DP] [SKL] 5k tiled dual DP (two-pipe, two-port) display sync issues
Summary: [DP] [SKL] 5k tiled dual DP (two-pipe, two-port) display sync issues
Status: NEW
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium enhancement
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Duplicates: 105198
Depends on:
Blocks:
 
Reported: 2016-08-08 14:22 UTC by Tomas Bzatek
Modified: 2018-10-25 06:56 UTC
CC List: 9 users

See Also:
i915 platform: SKL
i915 features: display/DP


Attachments
picture of the corruption (89.51 KB, image/jpeg)
2016-08-08 14:22 UTC, Tomas Bzatek
dmesg drm.debug=0x1e (4.8.0-rc1-10779-g8ca71ca-dirty, drm-intel-nightly 2016y-08m-08d-09h-02m-24s UTC) (2.69 MB, text/plain)
2016-08-08 14:25 UTC, Tomas Bzatek
Xorg.0.log (20.68 KB, text/plain)
2016-08-08 14:27 UTC, Tomas Bzatek
xrandr --verbose (8.19 KB, text/plain)
2017-03-30 14:12 UTC, Tomas Bzatek
attachment-32342-0.html (1.59 KB, text/html)
2018-01-03 18:21 UTC, Elio
RWEverything PCI dump (12.59 KB, application/gzip)
2018-01-18 20:06 UTC, Tomas Bzatek
attachment-5493-0.html (1.69 KB, text/html)
2018-09-21 09:00 UTC, Elio

Description Tomas Bzatek 2016-08-08 14:22:09 UTC
Created attachment 125593 [details]
picture of the corruption

(this is a split-off from bug 95207 after solving modelines detection)

While modelines are properly detected now and the native resolution can be set, the monitor shows a corrupted image (see attached photo).

This is always reproducible, with a random level of corruption. So far I have never been able to get a proper, stable image. Lower resolutions (via a single DP cable) work fine, and so does Windows at native resolution (dual DP).

No interesting messages in the logs either, attached below.

This is a Dell UP2715K, a 5k tiled monitor connected via two DP ports on a C236 board running a Skylake Xeon.

I'll hang on IRC on #intel-gfx (nick: tbzatek) for direct debugging.
Comment 1 Tomas Bzatek 2016-08-08 14:25:21 UTC
Created attachment 125594 [details]
dmesg drm.debug=0x1e (4.8.0-rc1-10779-g8ca71ca-dirty, drm-intel-nightly 2016y-08m-08d-09h-02m-24s UTC)

The system boots up with a lower resolution; the native 5120x2880 resolution is set at around 38.4 sec.
Comment 2 Tomas Bzatek 2016-08-08 14:27:05 UTC
Created attachment 125595 [details]
Xorg.0.log

Xorg.0.log, xorg-server 1.18.4, xf86-video-intel git
Comment 3 Tomas Bzatek 2016-08-09 18:19:55 UTC
After a number of experiments I've managed to get a proper image in native resolution. The hit rate is very low though; it takes about 20 tries fiddling with xrandr to get the right sync.

Here is my theory: the monitor is one physical panel taking data from two DP inputs making a final composited image from both, side by side. It probably takes one output as a reference and tries to synchronize scanlines from the other. It also probably expects equal timings and particular frames sent at the same moment on both outputs.

When I see tearing (e.g. scrolling a large image) it's consistent across whole image, on both outputs.

This was normally not the case when I had two monitors connected to each DP port in clone mode - tearing was at different positions (perhaps not relevant, but worth mentioning).


Related xf86-video-intel options:

  Option  "TearFree"           "off"
  Option  "VSync"              "on"
  Option  "DRI"                "3"
  Option  "Present"            "on"
  Option  "ReprobeOutputs"     "off"
  Option  "HotPlug"            "off"
  Option  "HWRotation"         "off"
  Option  "DebugFlushCaches"   "off"
  Option  "DebugWait"          "on"
  Option  "DebugFlushBatches"  "off"
  Option  "FallbackDebug"      "off"

(basically random options but seem to help a little)

This is with a 4.7.0 kernel (zen patchset); from the drm point of view it's basically vanilla with the following patches:

  drm/i915/skl: Add support for the SAGV, fix underrun hangs
  drm/i915/skl: Fix redundant cursor update, fix cursor underruns 

So yes, it is possible to drive the monitor with current SW, just the drivers need some tweaking. Hope this helps...
Comment 4 spgle212 2016-08-22 11:36:19 UTC
Maybe looking at xinerama helps. Xinerama does not produce syncing issues.
Comment 5 spgle212 2016-08-25 16:20:53 UTC
By the way, the same issues also occur on Fedora 25 (kernel 4.8) with the Dell UP2715K and nouveau (GTX 960).
Comment 6 Andrew Snow 2016-10-16 02:52:48 UTC
Same problem here with Dell 5K and a Skylake i915 cpu/gpu.

I think the problem is that the intel driver is assigning separate PLLs for each port.  The capability exists to share a common PLL clock source for both ports.

I suspect the Windows driver sees both ports have the same resolution and framerate and shares a PLL automatically.

Is there a way to force PLL sharing to test this theory?
Comment 7 Sebastian 2016-11-10 23:22:20 UTC
I use a Dell UP2715K with Arch Linux.

And since last week (Update to nvidia Driver 375.10) it works like a charm. 

----

Nvidia Driver Version: 375.10
Xorg: 1.18.4 (11804000)
Kernel: 4.8.6.1
Comment 8 David DeCarmine 2016-11-14 21:39:30 UTC
Same issue here as well. Tried on Fedora 25 on a Dell XPS 15 (9550) laptop and the Dell 5k monitor (up2715k). It looks exactly like the corruption image that Tomas Bzatek posted.

It freaked me out at first because the flickering that happened sort of "burned in" to the monitor. Even when I unplugged it from the laptop it was still flickering the same image. If anyone else runs into this, just leave it running at a 4k resolution (or any that doesn't have the corruption issues) and eventually it fixes itself.

I'm using Skylake i915 CPU/GPU (HD Graphics 530). I confirmed it wasn't hardware since it works on Windows. After years of Linux I had to revert back to Windows so that I can use the monitor at its real resolution. =\

Will be watching this issue so I can hopefully bounce back to Linux again. I can help test/debug as well.
Comment 9 Sebastian 2016-11-15 11:30:47 UTC
@david

The flickering and tiling issue is a problem of the UP2715K. See here:

http://www.dell.com/support/article/de/de/debsdt1/The_Left_and_Right_side_of_screen_do_not_sync/align

Often after these distortions you have constant flickering for 30-60 minutes (see the Dell remarks).

I think it was too early for Dell to jump into the 5K market.
Comment 10 Tomas Bzatek 2016-11-15 11:57:59 UTC
Well for the time being I managed to make it run by using the buggy BFS scheduler (out-of-the-tree CFS process scheduler replacement) that somehow makes the whole drm subsystem laggy. Please disregard my findings in comment 3, it's more or less independent of Xorg settings. No luck with setting up Xinerama either (in reply to comment 4).

My theory is that the BFS bug delays the phase in which the outputs are set up, and once the lag is over, both outputs get activated at the same moment. It's lame but it's a sufficient workaround for now :-)

And when the outputs are in sync, the monitor holds the sync for a whole day, without flickering. There are no issues with cursor or windows DP1 <-> DP2 transition either, it really feels like a single screen, even tearing is in sync. Good work on that front.


Unfortunately I didn't have much time to debug this issue, it's still on my TODO list though. Assigning common PLL clock source (as suggested in comment 6) could be the way to go. I've also tried patching the kernel to delay the second output to be in vsync with the first (as suggested by intel developers on irc), with no luck so far.
Comment 11 David DeCarmine 2016-11-16 03:31:07 UTC
Hey Tomas Bzatek,

Would you have any instructions on how I may get the same version of BFS running? Would really love to get a working Linux even if it is buggy in other ways. Also, when I plug the monitor in, it doesn't correctly configure both monitors together. I have to manually put in some xrandr commands: `xrandr --output DP-1 --mode 2560x2880 --output DP-2 --mode 2560x2880 --right-of DP-1` Something like this. Is this what you're doing, or do you have an xorg config? Or does your desktop just auto-detect?

As soon as I run the above command is when I get the really weird and glitchy tiling errors with flickering that stays for a while. I actually tried using the kernel in this repo (https://copr.fedorainfracloud.org/coprs/hubbitus/kernel-pf/) which uses MuQSS instead of BFS, but same issue. I may try compiling BFS into the kernel to see if that works instead.
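
For anyone who wants to script the manual layout from the xrandr command above, here is a minimal sketch. The output names (DP-1, DP-2) and the 2560x2880 tile mode are simply taken from that command and may differ on other systems; `build_tile_command` is a made-up helper name. It only reproduces the layout and does nothing about the sync problem itself.

```python
import shlex


def build_tile_command(left="DP-1", right="DP-2", mode="2560x2880"):
    """Return the xrandr argv that places two tiles side by side.

    The defaults mirror the command quoted in this comment; they are
    assumptions about the local output names, not detected values.
    """
    return [
        "xrandr",
        "--output", left, "--mode", mode,
        "--output", right, "--mode", mode, "--right-of", left,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(build_tile_command()))
```

Printing rather than executing is deliberate: as noted above, applying the mode can leave the monitor flickering for a while, so it is safer to inspect the command before running it.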
Comment 12 kingcong 2016-11-25 17:18:06 UTC
(In reply to Sebastian from comment #7)
> I use a Dell UP2715K with Arch Linux.
> 
> And since last week (Update to nvidia Driver 375.10) it works like a charm. 
> 
> ----
> 
> Nvidia Driver Version: 375.10
> Xorg: 1.18.4 (11804000)
> Kernel: 4.8.6.1

UP2715K with Arch Linux, but I can't get 5k resolution in gnome, 5k seems to be stretched and I can only see half of the full screen. 

Can you tell me some details?

Nvidia Driver Version: 375.20
Xorg: 1.18.4 (11804000)
Kernel: 4.8.10.1
Comment 13 Sebastian 2016-11-25 23:35:58 UTC
I also use Gnome. You have to set the display to its native resolution (5120 x 2880) in Gnome-Display-Settings, or to 2x 2560x2880 in nVidia-Settings.

Both ways work. And the "two half displays" are handled as one by Gnome automatically.

Screenshot: http://www.naanoo.com/upstream/gnome-5k.png
Comment 14 Tomas Bzatek 2016-11-29 12:14:04 UTC
Drop me an e-mail for the workaround with BFS.

Please keep this discussion on-topic and strictly technical. This bugreport is related to intel drm, you may need to clone it to discuss other GPUs.
Comment 15 Andrew Snow 2017-01-20 02:36:49 UTC
Any news on this issue?  

Is it worth trying a new version yet?

Last time I tested it my monitor was unusable for over 24 hours so I'm not keen to be a guinea pig.

But I would like to be able to test my "PLL sharing" theory (https://bugs.freedesktop.org/show_bug.cgi?id=97244#c6)
Comment 16 Tomas Bzatek 2017-03-06 17:45:20 UTC
Sorry Andrew, didn't have time to dive into the sources and hack the PLL sharing theory yet.

Elio, is the NEEDINFO targeted to the original reporter? As far as I've tested the drm-intel branch about a month ago, the issue was still present. I'll retest tonight again.
Comment 17 Elio 2017-03-06 20:52:07 UTC
Yes, my bad. We probably need to check this issue with the latest kernel version so far, 4.10, since a lot of new patches were merged for DP. Changing state.
Comment 18 Tomas Bzatek 2017-03-06 21:02:31 UTC
Just tested latest drm-intel and vanilla 4.11-rc1, both with the same bad results. The result in attachment 125593 [details] still stands. Also tried turning "nuclear_pageflip" module argument on and off, with no difference.
Comment 19 Jani Nikula 2017-03-07 10:41:59 UTC
(In reply to Andrew Snow from comment #6)
> Same problem here with Dell 5K and a Skylake i915 cpu/gpu.
> 
> I think the problem is that the intel driver is assigning separate PLLs for
> each port.  The capability exists to share a common PLL clock source for
> both ports.
> 
> I suspect the Windows driver sees both ports have the same resolution and
> framerate and shares a PLL automatically.

If anyone does have the hardware and Windows readily available, dumping the PCI MMIO BAR on Windows would let us check how it configures the hardware in this case. Alas, I have no idea what the tool for dumping it is.
Comment 20 Jani Nikula 2017-03-07 10:58:04 UTC
Btw if the display requires special handling for the two port operation, the driver also needs to identify the case. Please attach 'xrandr --verbose' output for when the displays work.
Comment 21 Tomas Bzatek 2017-03-30 14:12:08 UTC
Created attachment 130582 [details]
xrandr --verbose

Attaching 'xrandr --verbose' output grabbed at the moment when it all comes up synchronized. Screen tiles can be identified by parsing the DisplayID data - see bug 95207.
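
As an illustration of what identifying the tiles could look like, below is a sketch that parses the colon-separated string the kernel DRM core exposes as the connector "TILE" property (derived from the DisplayID tile block). The eight-field order is my reading of the kernel source and should be treated as an assumption; the field names, the helper names and the sample values in the usage note are made up for the example. Parsing raw DisplayID bytes is not shown.

```python
from typing import NamedTuple


class TileInfo(NamedTuple):
    # Field order assumed to follow the kernel's TILE property string.
    group_id: int
    single_monitor: int
    num_h_tile: int
    num_v_tile: int
    tile_h_loc: int
    tile_v_loc: int
    tile_h_size: int
    tile_v_size: int


def parse_tile(value: str) -> TileInfo:
    """Parse a TILE property string such as '1:1:2:1:0:0:2560:2880'."""
    fields = value.strip("\x00 \n").split(":")
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}")
    return TileInfo(*(int(f) for f in fields))


def same_panel(a: TileInfo, b: TileInfo) -> bool:
    """True if both tiles share a group but sit at different locations."""
    same_group = a.group_id == b.group_id
    different_spot = (a.tile_h_loc, a.tile_v_loc) != (b.tile_h_loc, b.tile_v_loc)
    return same_group and different_spot
```

With hypothetical values for the two halves of a 5k panel, `parse_tile("1:1:2:1:0:0:2560:2880")` and `parse_tile("1:1:2:1:1:0:2560:2880")` would be reported by `same_panel` as belonging to one monitor, which is the case a driver would need to detect before applying any special two-port handling.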

I have a Windows 7 installation on my external HDD that works, but since I'm no Windows expert I will need guidance with the PCI BAR dump...
Comment 22 Jani Saarinen 2017-05-24 07:06:58 UTC
Reporters, is the issue still valid on latest drm-tip?
Comment 23 Tomas Bzatek 2017-05-24 08:15:14 UTC
Yes, this is still a problem with latest drm-tip.

However, I noticed recently that the image is torn at a consistent point across multiple output mode switches - previously, with the 4.8/4.9 kernels, the image was torn differently every time the target resolution was set on both outputs.

So there's some improvement (and hope) now, but the outputs are still not in sync.

Regarding the PCI MMIO BAR dump on Windows - I still have no idea how to make such a dump...
Comment 24 Ricardo Madrigal 2017-06-30 19:56:16 UTC
Hello

I just tried to reproduce the problem with following configuration:

KBL NUC, using an MST Sunix connected with a mini-DP to DP and 2 DP-DP connectors with 2 external monitors (Acer), 3840 x 2160 (4K).

Attaching my configuration used to test

======================================
             Software
======================================
kernel version              : 4.12.0-rc3-drm-tip-ww22-commit-187376e+
architecture                : x86_64
os version                  : Ubuntu 17.04
os codename                 : zesty
kernel driver               : i915
bios revision               : 5.12
bios release date           : 09/12/2016

======================================
        Graphic drivers
======================================
mesa                      : 17.0.3
modesetting               : modesetting_drv.so
xorg-xserver              : 1.19.3
libdrm                    : 2.4.76
libva                     : 1.7.3-2
vaapi (intel-driver)      : 1.7.3
cairo                     : 1.14.8-1
intel-gpu-tools           : 1.17-1

======================================
             Hardware
======================================
platform                   : KBL-Nuc
motherboard model          : MS-B142
motherboard id             : MS-B1421
form factor                : Desktop
manufacturer               : Micro-StarInternationalCo.,Ltd.
cpu family                 : Core i7
cpu family id              : 6
cpu information            : Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz
gpu card                   : Intel Corporation Device 5916 (rev 02) (prog-if 00 [VGA controller])
memory ram                 : 7.65 GB
max memory ram             : 64 GB
display resolution         : 1600x900
cpu thread                 : 4
cpu core                   : 2
cpu model                  : 142
cpu stepping               : 9
socket                     : Other
signature                  : Type 0, Family 6, Model 142, Stepping 9
hard drive                 : 111GiB (120GB)
current cd clock frequency : 540000 kHz
maximum cd clock frequency : 675000 kHz
displays connected         : DP-1 HDMI-A-2

======================================
             Firmware
======================================
dmc fw loaded             : yes
dmc version               : 1.1
guc fw loaded             : NONE
guc version wanted        : 0.0
guc version found         : 0.0

======================================
             kernel parameters
======================================
quiet splash fastboot drm.debug=0xe

Yes, this is still a problem.

Using 4K, you can connect only one monitor first, change the resolution to 1920x1080, then connect the other monitor and change the resolution to the same (1920x1080).
Comment 25 Jani Nikula 2017-07-04 13:13:02 UTC
(In reply to Ricardo Madrigal from comment #24)
> I just tried to reproduce the problem with following configuration:
> 
> KBL NUC, using an MST Sunix connected with a mini-DP to DP and 2 DP-DP
> connectors with 2 external monitors (Acer), 3840 x 2160 (4K).

...

> Yes, this is still a problem.
> 
> Using 4K, you can connect only one monitor first, change the resolution to
> 1920x1080, then connect the other monitor and change the resolution to the
> same (1920x1080).

So I don't know what you're observing, but the original report is about a very specific issue. The UP2715K display has two DP inputs to support 5k in a single display [1]. From the driver perspective, at least currently, they are treated as two separate displays, while the display may have stricter requirements about e.g. PLL syncing. I would be rather hesitant to make conclusions about tests done with two physically separate displays.

[1] http://www.dell.com/ae/business/p/dell-up2715k-monitor/pd
Comment 26 Elizabeth 2017-08-11 20:24:51 UTC
Hello everyone,
Any update on trying to get the PCI MMIO BAR dump in Windows?
Thank you.
Comment 27 Tomas Bzatek 2017-08-14 08:40:28 UTC
Hi Elizabeth,

as far as I understand we're still missing a howto for the dump. I'm able to grab one as long as a list of tools and brief howto is provided (see comment 19 and comment 21).

So the NEEDINFO status should be on your developers, not the reporters. You may also try to forward this query to your colleagues in the Windows driver team in-house; perhaps they would know more.
Comment 28 nic30 2017-09-08 13:54:02 UTC
Hello,

do you still need the PCI MMIO BAR dump?
Or is there already a solution?
Comment 29 Jani Nikula 2017-09-18 09:06:36 UTC
I recently tried to acquire a dual DP port 5k display, but they are really hard to come by. Seems like Dell has discontinued the UP2715K.
Comment 30 Jani Nikula 2017-09-18 09:10:01 UTC
Basically the background of the issue is that for the driver, the display shows up as two DP displays, due to the two DP ports, and the legacy modesetting tries to enable them like two independent displays. It's just that the display apparently requires much better sync for enabling and driving the two parts.
Comment 31 Nicolae Carabut 2017-11-18 03:12:21 UTC
Hi, I can help with debugging the problems, I have two Dell UP2715K running on Arch Linux @ 5K

My setup looks like this:
2 Dell UP2715K -> 5k
2 Asus MG24U -> 4k

They all are setup like this from left to right: https://goo.gl/sfmUjU

This was possible ONLY because I used the "Xinerama" in "nvidia-settings" and this is the only 
configuration that worked and allowed me to have 4 monitors up at their native resolutions.

My GPUs
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1070] [10de:1b81] (rev a1) (prog-if 00 [VGA controller])
03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP106GL [Quadro P2000] [10de:1c30] (rev a1) (prog-if 00 [VGA controller])



Before getting to this config I spent two days of continuous trial and error with xrandr; nothing worked.
If the monitors came up, the image would flicker wildly; if it did not flicker, then whatever I did the image would show on only one half of the screen and the other half would be black (screen off).


etc, complete misery 


Now, with Xinerama everything shows up and works @ 5k / 4k; the problem is that if I start many activities the whole thing becomes noticeably slow.
I suspect that if xrandr (disabled by Xinerama) worked, it would be a lot faster.



So, here, use me, I want these beasts at their best
Comment 32 Elizabeth 2017-11-21 17:42:54 UTC
(In reply to Nicolae Carabut from comment #31)
> Hi, I can help with debugging the problems, I have two Dell UP2715K running
> on Arch Linux @ 5K...
Hello Nicolae, thanks for offering. You can visit the IRC channel of the developer community for more direct communication: https://01.org/linuxgraphics/community
Comment 33 Ross Bishop 2018-01-03 18:21:10 UTC
I'm also in the UP2715K boat with screen corruption.

There's a range of software which appears to provide information about PCI devices. I've been looking at PCIScope, it gives you access to the PCI registers and there is a dump facility.

I'm brand new to PCI though and I don't know how to translate the base address to the actual location of the MMIO registers. It feels like there's far too little information in the dump tab.

http://envytools.readthedocs.io/en/latest/hw/mmio.html#gf100-mmio-map

Supposedly this page contains the Nvidia MMIO map for GF100 devices and onwards (GP102 here).

What are we actually looking for in this dump? It was mentioned but never specified what can be found that would be helpful for achieving sync for the two panel halves.
Comment 34 Elio 2018-01-03 18:21:26 UTC
Created attachment 136533 [details]
attachment-32342-0.html

See you next year!
Comment 35 Tomas Bzatek 2018-01-18 20:06:24 UTC
Created attachment 136834 [details]
RWEverything PCI dump

Still a problem with drm-tip 2018y-01m-18d-18h-18m-07s

I was finally able to grab the PCI configuration space from a Windows 7 system running full resolution - see attached. The dump was created with the RWEverything tool. Let me know if you need any other memory region, hope this one would be useful.

Although the Dell UP2715K is now EOL, there are some other panels on the market requiring dual DP1.2 inputs: HP Z27q, Philips 275P4VYKEB. I have no idea what controllers these panels use, though.
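
One caveat on the dump: PCI configuration space only records where the MMIO BAR is mapped, not the contents of the registers behind it. As a rough sketch of how the BAR0 base address could be decoded from such a dump, assuming the raw configuration bytes are available as a Python `bytes` object (`decode_bar0` is a made-up helper name, and the layout follows the standard PCI type 0 header):

```python
import struct


def decode_bar0(config: bytes) -> dict:
    """Decode vendor/device IDs and BAR0 from raw PCI configuration space."""
    if len(config) < 0x18:
        raise ValueError("need at least 0x18 bytes of configuration space")
    # Offsets 0x00/0x02: vendor and device ID, little-endian 16-bit.
    vendor, device = struct.unpack_from("<HH", config, 0x00)
    # Offsets 0x10/0x14: BAR0 and BAR1 (BAR1 holds the upper half of a 64-bit BAR0).
    low, high = struct.unpack_from("<II", config, 0x10)
    if low & 0x1:
        raise ValueError("BAR0 is an I/O BAR, not a memory BAR")
    is_64bit = ((low >> 1) & 0x3) == 0x2
    base = low & 0xFFFFFFF0  # the low four bits are flags, not address bits
    if is_64bit:
        base |= high << 32
    return {"vendor": vendor, "device": device, "is_64bit": is_64bit, "base": base}
```

On Linux the same bytes can be read from the sysfs `config` file of the device (for the integrated GPU that is typically `/sys/bus/pci/devices/0000:00:02.0/config`, but treat that address as an assumption). The register values themselves would still have to be read from the region starting at the decoded base address.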
Comment 36 Ross Bishop 2018-01-23 00:23:46 UTC
(In reply to Tomas Bzatek from comment #35)
> Created attachment 136834 [details]
> RWEverything PCI dump
> 
> Still a problem with drm-tip 2018y-01m-18d-18h-18m-07s
> 
> I was finally able to grab the PCI configuration space from a Windows 7
> system running full resolution - see attached. The dump was created with the
> RWEverything tool. Let me know if you need any other memory region, hope
> this one would be useful.
> 
> Although Dell UP2715K is now EOL, there are some other panels on the market
> requiring dual DP1.2 inputs: HP Z27q, Philips 275P4VYKEB. I have no idea
> what controllers these panels use, though.

Nice one Tomas, I was completely lost in attempting to do this.

It's not just the 5K monitors of the past: Dell's Ultrasharp UP3218K is an 8K tiled monitor currently available, and Philips has previewed the 328P8K, which is set to release this year as well.

So far as I see it, this approach to the cutting edge isn't going to go away and as these monitors enter the second hand market, they represent excellent value for money, and are at times the only option to have access to such high resolutions. The first single port solution (ignoring the LG TB3 monitor which only works properly with Mac) is only just hitting the market, some 3 or so years after the first 5K monitors dropped.
Comment 37 Jani Nikula 2018-02-28 19:00:12 UTC
Related bug about LG 5k is bug 105198.
Comment 38 Jani Saarinen 2018-03-29 07:10:10 UTC
First of all, sorry about the spam.
This is a mass update for our bugs.

Sorry if you find this annoying, but with this we are trying to understand whether the bug is still valid or not.
If the bug investigation is still in progress, please ignore this, and I apologize!

If you think this is no longer valid, please comment on the bug that it can be closed.
If you haven't tested with our latest pre-upstream tree (drm-tip), can you also do that to see whether the issue is still valid there? If you cannot see the issue there, please comment on the bug.
Comment 39 Tomas Bzatek 2018-04-02 19:17:39 UTC
Still a problem with drm-tip 2018y-03m-30d-18h-56m-26s

Both the bug 105198 and bug 105651 mostly concern logical monitor setup issues. What we're seeing here in this bugreport is a hardware sync problem. See also my comment in bug 105198#c19

The radeon bug 99801 is basically what I'm also seeing on i965. That one is on an HP Z27q, but the corruption/flicker seems to match. You may try to acquire that monitor for testing if the Dell UP2715K is not available anymore.

So this bug report has been open for almost two years now with little to no improvement... Who should I bribe to get this issue fixed? :-) If there's anything else needed from the reporter side, please let us know.
Comment 40 Jani Saarinen 2018-04-25 06:35:42 UTC
Jani, any advice on how to move forward here?
Comment 41 Christopher Snowhill 2018-09-14 04:35:56 UTC
This and the previous issue may also apply to the Retina 5k iMac series, as those are identical panels to the Dell 5k panel, and are also dual DP input tiled. I can try to grab a RWEverything dump from Boot Camp running on my iMac, as well as EDID dumps.
Comment 42 Manasi 2018-09-21 08:58:04 UTC
This is probably caused by the fact that a feature called transcoder port sync is required for synchronizing across two pipes and two ports.
It is currently not enabled in the driver. IMHO, until it is enabled, the driver should prune this mode so that userspace only sees 4K as the preferred mode and doesn't try to modeset for 5K and cause corruption.

Manasi
Comment 43 Elio 2018-09-21 09:00:40 UTC
Created attachment 141671 [details]
attachment-5493-0.html

will be back on monday
Comment 44 Martin Peres 2018-10-19 11:36:56 UTC
Removing Elio, since he does not contribute to this bug anymore and his email client spams this thread.
Comment 45 Lakshmi 2018-10-21 18:26:45 UTC
Manasi, any updates here?
Comment 46 Lakshmi 2018-10-25 06:36:08 UTC
Updated the priority and severity based on the feature.
Comment 47 Lakshmi 2018-10-25 06:56:05 UTC
*** Bug 105198 has been marked as a duplicate of this bug. ***