Bug 108081

Summary: LG 31MU97 monitor native resolution is 4096x2160, but highest mode found is 3840x2160
Product: DRI Reporter: Brian Vincent <brainn>
Component: DRM/Intel Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED WORKSFORME QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal    
Priority: high CC: intel-gfx-bugs
Version: XOrg git   
Hardware: Other   
OS: All   
Whiteboard: Triaged, ReadyForDev
i915 platform: KBL i915 features: display/Other
Attachments:
Description | Flags
31mu97 EDID | none
dmesg with drm.debug=14 | none
dmesg drm-tip with drm.debug=0x1e | none
[PATCH] drm/dp/mst: Configure no_stop_bit correctly for remote i2c xfers | none
drm-tip with enable_dp_mst=0 | none
31mu97 EDID with enable_dp_mst=0 | none
31mu97 EDID when attaching to amdgpu | none
dmesg on 4.15 AMDGPU machine | none
dmesg drm-tip amdgpu machine drm.debug=0x1e | none

Description Brian Vincent 2018-09-27 00:02:19 UTC
Created attachment 141755 [details]
31mu97 EDID

My working theory for this problem is that the EDID contains no information about the monitor being 4096x2160, so there is no hope in detecting this mode.

I've attached my binary EDID file for the monitor.  I'll also add dmesg with drm.debug=14.
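
One way to check this theory against a binary EDID dump is to look at how many extension blocks the base block declares and whether each 128-byte block has a valid checksum. The following is a minimal sketch, assuming the standard EDID layout (8-byte header, extension count at byte 126, checksum at byte 127 of every block). The function name summarize_edid is made up for illustration.

```python
EDID_HEADER = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])
BLOCK_SIZE = 128


def summarize_edid(data: bytes) -> dict:
    """Summarize the block structure of a raw EDID dump."""
    if len(data) < BLOCK_SIZE or data[:8] != EDID_HEADER:
        raise ValueError("not an EDID base block")

    # Only whole 128-byte blocks are considered.
    usable = len(data) - len(data) % BLOCK_SIZE
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, usable, BLOCK_SIZE)]

    return {
        # Byte 126 of the base block: number of extension blocks to follow.
        "declared_extensions": data[126],
        "blocks_present": len(blocks),
        # All 128 bytes of a valid block sum to 0 modulo 256.
        "checksums_ok": [sum(block) % 256 == 0 for block in blocks],
        # First byte of each extension is its tag
        # (0x02 is a CTA-861 extension, 0x70 a DisplayID extension).
        "extension_tags": [block[0] for block in blocks[1:]],
    }
```

The input would be the raw bytes of the attached EDID file, or of the connector's edid file under /sys/class/drm (the exact connector name depends on the machine, so it is not shown here).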
Comment 1 Brian Vincent 2018-09-27 00:02:51 UTC
Created attachment 141756 [details]
dmesg with drm.debug=14
Comment 2 Lakshmi 2018-09-27 06:51:37 UTC
Brian, have you ever been able to view the display in native resolution (4096x2160)?

What kind of cable is used to connect the display?

Have you tried to verify this issue with latest drm-tip (https://cgit.freedesktop.org/drm-tip) ?

Can you attach dmesg from boot with the kernel parameters drm.debug=0x1e log_buf_len=4M? This will give more information about the issue.
Comment 3 Brian Vincent 2018-09-27 17:48:30 UTC
Created attachment 141761 [details]
dmesg drm-tip with drm.debug=0x1e
Comment 4 Brian Vincent 2018-09-27 17:49:41 UTC
Yes.  I can get the native resolution (4096x2160) to work in X.  For a long time, I've used the following mode line:

xrandr --newmode "4096x2160_60" 556.730  4096 4104 4136 4176  2160 2208 2216 2222 +hsync +vsync

But I noticed in the DRI source code that a DMT mode for 4096x2160@60Hz exists, so I tried that too, and it also works fine:

xrandr --newmode "4096x2160_60" 556.744  4096 4104 4136 4176  2160 2208 2216 2222 +hsync +vsync
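
The two mode lines differ only in pixel clock. The refresh rate each one produces is the pixel clock divided by the total pixels per frame (horizontal total times vertical total). A small sketch of that arithmetic follows; the helper name modeline_refresh_hz is hypothetical, and it expects the part of the mode line that comes after the mode name.

```python
def modeline_refresh_hz(modeline: str) -> float:
    """Compute the refresh rate in Hz from an xrandr-style mode line.

    Expected fields: clock_mhz hdisp hsync_start hsync_end htotal
                     vdisp vsync_start vsync_end vtotal [flags...]
    """
    fields = modeline.split()
    clock_mhz = float(fields[0])
    htotal = int(fields[4])
    vtotal = int(fields[8])
    return clock_mhz * 1_000_000 / (htotal * vtotal)


if __name__ == "__main__":
    timings = "4096 4104 4136 4176  2160 2208 2216 2222 +hsync +vsync"
    for clock in ("556.730", "556.744"):
        rate = modeline_refresh_hz(f"{clock}  {timings}")
        print(f"{clock} MHz -> {rate:.4f} Hz")
```

Both clocks come out just under 60 Hz, with the 556.744 MHz one being the closer of the two.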

I'm currently using a Dell XPS 15 9560, with a base station attached using a Thunderbolt port.  The monitor is connected to the base station with a full-sized DisplayPort cable.

I've compiled drm-tip and included the kernel parameters you wanted.  I waited a while before logging in and capturing this dmesg I'm attaching.
Comment 5 Ville Syrjala 2018-09-27 18:30:44 UTC
The edid has the base block repeated over where the second extension block should be. Maybe a problem with the ddc segment address handling.

Hmm. Looks like the monitor is hooked up via MST. I see we fail to set the no_stop_bit flag on the remote i2c reads, which could explain the segment address being reset by the sink back to 0 before we read the extension block, and then we end up reading the base block instead. I'll cook up a patch for that.

You could also test passing i915.enable_dp_mst=0 on the kernel cmdline to make i915 not use MST.
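
To illustrate the segment addressing described above, here is a minimal sketch. It assumes the usual E-DDC layout: 128-byte EDID blocks, two blocks per 256-byte segment, and a segment pointer that falls back to 0 if the transfer is ended by a stop before the data read. The function names are made up for illustration and are not kernel APIs.

```python
BLOCK_SIZE = 128
BLOCKS_PER_SEGMENT = 2


def eddc_address(block: int):
    """Return (segment, byte offset within the segment) for an EDID block."""
    segment = block // BLOCKS_PER_SEGMENT
    offset = (block % BLOCKS_PER_SEGMENT) * BLOCK_SIZE
    return segment, offset


def block_actually_read(block: int, segment_survives: bool) -> int:
    """Return which block the sink hands back for a requested block.

    If the segment pointer does not survive until the data read
    (for example because a stop was issued in between), the sink
    serves the same offset from segment 0 instead.
    """
    segment, offset = eddc_address(block)
    if not segment_survives:
        segment = 0
    return segment * BLOCKS_PER_SEGMENT + offset // BLOCK_SIZE


if __name__ == "__main__":
    for block in range(3):
        print(block, eddc_address(block),
              block_actually_read(block, segment_survives=False))
```

Under these assumptions, a request for block 2 (segment 1, offset 0) with the segment pointer lost returns block 0, i.e. the base block shows up where the second extension block should be.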
Comment 6 Ville Syrjala 2018-09-27 18:32:03 UTC
Created attachment 141762 [details] [review]
[PATCH] drm/dp/mst: Configure no_stop_bit correctly for remote i2c xfers

Please give this a go. It could fix the bad EDID in MST mode.
Comment 7 Brian Vincent 2018-09-27 19:42:22 UTC
Here's what I've noticed so far.  I applied your patch, I didn't make clean, but I made new image and modules and it rebuilt a drm module, and of course I installed it.  I booted it up and I couldn't find a difference, no new modes, and the EDID was the same.

I then tried the same kernel but with i915.enable_dp_mst=0.  No new modes, but the EDID no longer has the doubling effect you mentioned.  The EDID is all zeros from 0x100 onward.

Would you like to see the new EDID or dmesg logs from one of these configurations?
Comment 8 Ville Syrjala 2018-09-27 19:54:26 UTC
(In reply to Brian Vincent from comment #7)
> Here's what I've noticed so far.  I applied your patch, I didn't make clean,
> but I made new image and modules and it rebuilt a drm module, and of course
> I installed it.  I booted it up and I couldn't find a difference, no new
> modes, and the EDID was the same.
> 
> I then tried the same kernel but with i915.enable_dp_mst=0.  No new modes,
> but the EDID no longer has the doubling effect you mentioned.  The EDID is
> all zeros from 0x100 onward.

And otherwise identical to the earlier result?

> 
> Would you like to see the new EDID or dmesg logs from one of these
> configurations?

dmesg with drm.debug=0xe i915.enable_dp_mst=0 might show something useful.
Comment 9 Brian Vincent 2018-09-27 20:06:38 UTC
Created attachment 141766 [details]
drm-tip with enable_dp_mst=0
Comment 10 Brian Vincent 2018-09-27 20:07:06 UTC
Created attachment 141767 [details]
31mu97 EDID with enable_dp_mst=0
Comment 11 Brian Vincent 2018-09-27 20:08:19 UTC
I've attached the dmesg you asked for, on the new kernel with the patch, plus i915.enable_dp_mst=0.  I've also attached the new EDID.  Now that I look at it closer, there is another difference.  The number of extensions is 1 instead of 2, and the checksum is different.
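
For comparing the captures block by block, a small sketch like the following labels each 128-byte block of a dump as the base block, an all-zero block, a copy of the base block, or a distinct extension. The helper name describe_blocks is hypothetical.

```python
BLOCK_SIZE = 128


def describe_blocks(data: bytes) -> list:
    """Label every 128-byte block of a raw EDID dump."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    labels = []
    for index, block in enumerate(blocks):
        if index == 0:
            labels.append("base")
        elif not any(block):
            # Matches the "all zeros from 0x100 onward" case (block 2 starts at 0x100).
            labels.append("all zeros")
        elif block == blocks[0]:
            # Matches the doubling effect seen in MST mode.
            labels.append("copy of base block")
        else:
            labels.append("extension (tag 0x%02x)" % block[0])
    return labels
```

Running it over the MST dump and the enable_dp_mst=0 dump side by side makes the difference in the third block easy to spot.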
Comment 12 Ville Syrjala 2018-09-28 13:07:47 UTC
(In reply to Brian Vincent from comment #11)
> I've attached the dmesg you asked for, on the new kernel with the patch,
> plus i915.enable_dp_mst=0.  I've also attached the new EDID.  Now that I
> look at it closer, there is another difference.  The number of extensions is
> 1 instead of 2, and the checksum is different.

Yeah. No indication in the logs that anything went wrong. So I guess the monitor just doesn't care to include the second extension block in SST mode.

So it would appear that the only real mystery is why the second extension block read doesn't give us sensible data in MST mode. Since the no_stop_bit fix didn't help, I'm out of good ideas for now.
Comment 13 Brian Vincent 2018-09-28 14:39:41 UTC
Created attachment 141777 [details]
31mu97 EDID when attaching to amdgpu

When I attach this same monitor to my other computer with an AMDGPU, I get the attached EDID and the mode 4096x2160 is actually found.  So this is probably what the proper EDID is supposed to look like, right?
Comment 14 Ville Syrjala 2018-09-28 17:02:59 UTC
(In reply to Brian Vincent from comment #13)
> Created attachment 141777 [details]
> 31mu97 EDID when attaching to amdgpu
> 
> When I attach this same monitor to my other computer with an AMDGPU, I get
> the attached EDID and the mode 4096x2160 is actually found.  So this is
> probably what the proper EDID is supposed to look like, right?

Interesting. The extra block is the DisplayID as I suspected. amdgpu uses the same MST sideband code that i915 uses, so can't immediately see how it could get different results.

Looking at the earlier i915 logs I see a bunch of
[    6.899169] [drm:drm_dp_mst_hpd_irq [drm_kms_helper]] Got NAK reply: req 0x22, reason 0x09, nak data 0x40

Those are REMOTE_I2C_READ/I2C_NAK replies with NAK data 0x40 (the spec doesn't say what that value means). Could be the remaining amount of data or something, I suppose.

Can you grab similar logs from amdgpu to see if we get any of those errors there?
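
A quick way to look for those replies in a saved dmesg is to match the message format shown above and count occurrences. This is a rough sketch: the regular expression is based only on the single log line quoted here, and the lookup tables contain just the two codes identified in this comment (0x22 and 0x09); anything else is printed as a raw hex value.

```python
import re
from collections import Counter

NAK_RE = re.compile(
    r"Got NAK reply: req (0x[0-9a-fA-F]+), "
    r"reason (0x[0-9a-fA-F]+), nak data (0x[0-9a-fA-F]+)"
)

# Only the codes named in this bug; not a complete table.
REQUESTS = {0x22: "REMOTE_I2C_READ"}
REASONS = {0x09: "I2C_NAK"}


def count_naks(lines) -> Counter:
    """Count NAK replies per (request, reason, nak data) combination."""
    counts = Counter()
    for line in lines:
        match = NAK_RE.search(line)
        if not match:
            continue
        req, reason, data = (int(group, 16) for group in match.groups())
        key = (REQUESTS.get(req, hex(req)), REASONS.get(reason, hex(reason)), data)
        counts[key] += 1
    return counts
```

It can be fed the lines of a dmesg capture, for example count_naks(open(path)) where path points at the saved log.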

The only clear difference I see with amdgpu vs. i915 is the number of ESI registers we read when processing the hpd irq. The following diff should make i915 match amdgpu behaviour:
diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 8e465095fe06..3f94dcc96f22 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -43,7 +43,7 @@
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
 
-#define DP_DPRX_ESI_LEN 14
+#define DP_DPRX_ESI_LEN 4
 
 /* Compliance test status bits  */
 #define INTEL_DP_RESOLUTION_SHIFT_MASK 0
Comment 15 Brian Vincent 2018-09-28 17:39:27 UTC
There's a problem with that patch.

drivers/gpu/drm/i915/intel_dp.c: In function ‘intel_dp_check_mst_status’:
drivers/gpu/drm/i915/intel_dp.c:4265:9: warning: array subscript 10 is above array bounds of ‘u8[4]’ {aka ‘unsigned char[4]’} [-Warray-bounds]
        !drm_dp_channel_eq_ok(&esi[10], intel_dp->lane_count)) {
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I ran it anyway, and I saw an error:

[drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
Comment 16 Ville Syrjala 2018-09-28 18:09:49 UTC
(In reply to Brian Vincent from comment #15)
> There's a problem with that patch.
> 
> drivers/gpu/drm/i915/intel_dp.c: In function ‘intel_dp_check_mst_status’:
> drivers/gpu/drm/i915/intel_dp.c:4265:9: warning: array subscript 10 is above
> array bounds of ‘u8[4]’ {aka ‘unsigned char[4]’} [-Warray-bounds]
>         !drm_dp_channel_eq_ok(&esi[10], intel_dp->lane_count)) {
>          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Should be OK to just comment out/remove that link status/retrain block of code.
Comment 17 Brian Vincent 2018-09-28 18:51:45 UTC
I commented out that block.  It ran, but I got the same EDID as I originally posted, and no new modes.
Comment 18 Brian Vincent 2018-09-28 18:53:01 UTC
Created attachment 141780 [details]
dmesg on 4.15 AMDGPU machine

This is the dmesg on the working amdgpu machine, on a 4.15 kernel.  Let me know if you would prefer to see the logs from drm-tip.
Comment 19 Brian Vincent 2018-09-28 18:56:25 UTC
Oh, also, on the i915 machine, I forced the EDID to be the "proper" EDID that I pulled from the amdgpu machine.  When I did that, it found the 4096x2160 mode.  So that confirms that once the EDID is being read properly, it will work.
Comment 20 Ville Syrjala 2018-09-28 19:50:40 UTC
(In reply to Brian Vincent from comment #18)
> Created attachment 141780 [details]
> dmesg on 4.15 AMDGPU machine
> 
> This is the dmesg on the working amdgpu machine, on a 4.15 kernel.  Let me
> know if you would prefer to see the logs from drm-tip.

Hmm. No MST stuff in the log that I can see. I guess maybe it's decided to do SST for whatever reason. Can't see any obvious code in amdgpu to not do MST though. So that's a bit mysterious.
Comment 21 Ville Syrjala 2018-10-03 16:05:18 UTC
(In reply to Ville Syrjala from comment #20)
> (In reply to Brian Vincent from comment #18)
> > Created attachment 141780 [details]
> > dmesg on 4.15 AMDGPU machine
> > 
> > This is the dmesg on the working amdgpu machine, on a 4.15 kernel.  Let me
> > know if you would prefer to see the logs from drm-tip.
> 
> Hmm. No MST stuff in the log that I can see. I guess maybe it's decided to do
> SST for whatever reason. Can't see any obvious code in amdgpu to not do MST
> though. So that's a bit mysterious.

4.15 is maybe a bit old. Can you test again with a more recent kernel on the amdgpu machine?
Comment 22 Brian Vincent 2018-10-03 18:45:39 UTC
Created attachment 141863 [details]
dmesg drm-tip amdgpu machine drm.debug=0x1e

I'm attaching dmesg with drm-tip on the amdgpu machine with drm.debug=0x1e

So this amdgpu machine is a desktop, with the monitor directly plugged into the video card with a full sized displayport cable.


More information on the problematic i915 laptop setup:

It's a Dell XPS 15 9560.  It's connected to a TB16 Thunderbolt docking station.  Over the weekend, I updated the firmware on the laptop itself, the dock, the cable, everything.  I had to create a Windows USB drive to update the firmware.  I've noticed no difference in behavior after the firmware upgrades.  I get the same EDID.  Two of the firmwares that I updated were:

Synaptics VM3320 inside Dell WD15/TB16 wired dock
Summary: Multi-Stream Transport Device

Synaptics VM3330 inside Dell WD15/TB16 wired dock
Summary: Multi-Stream Transport Device

So I think it makes sense that you would see MST stuff for this laptop setup, but you wouldn't see it on the amdgpu machine.

I also want to say that when I was booted up with Windows, it initially didn't find the mode 4096x2160, but eventually after Windows did a bunch of hardware discovery stuff, it was available.  I still have that USB drive if you would like me to look at something in a Windows environment on the problematic i915 laptop/dock setup.
Comment 23 Ville Syrjala 2018-10-03 18:54:40 UTC
(In reply to Brian Vincent from comment #22)
> Created attachment 141863 [details]
> dmesg drm-tip amdgpu machine drm.debug=0x1e
> 
> I'm attaching dmesg with drm-tip on the amdgpu machine with drm.debug=0x1e
> 
> So this amdgpu machine is a desktop, with the monitor directly plugged into
> the video card with a full sized displayport cable.
> 
> 
> More information on the problematic i915 laptop setup:
> 
> It's a Dell XPS 15 9560.  It's connected to a TB16 thunderbolt docking
> station.

Ah. Yes that would explain why amdgpu doesn't do mst. It's not the sink that's mst, it's just the dock. Still doesn't explain why i915 in sst mode didn't find the second extension block, and why in mst mode the extension block was corrupted even with the no_stop_bit patch.
Comment 24 Ville Syrjala 2019-01-22 20:26:05 UTC
The no_stop_bit stuff has been merged. While the earlier results weren't promising a retest with fresh drm-tip is probably a good idea.
Comment 25 Lakshmi 2019-02-27 14:05:55 UTC
Brian, can you give your feedback after testing with the latest drm-tip?
(https://cgit.freedesktop.org/drm-tip)
Comment 26 Lakshmi 2019-06-04 10:07:30 UTC
No feedback for more than 3 months. Closing this bug as WORKSFORME.
Comment 27 Lakshmi 2019-06-04 10:10:11 UTC
If the problem persists with the latest drm-tip, please attach the dmesg from boot with kernel parameters drm.debug=0x1e log_buf_len=4M.

Closing this bug.
