Bug 112008

Summary: eDP -> Dual Channel LVDS bridge unable to accept any modelines: Corrupt display!
Product: DRI
Reporter: Babblebones
Component: DRM/AMDgpu
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED MOVED
Severity: major    
Priority: high    
Version: DRI git   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Attachments (description: flags):
kernel5.3-drm.debug=0xe: none
kernel4.17.0-rc3-drm.debug=4 (Good display): none
Screen Output: none
EDID.bin: none

Description Babblebones 2019-10-15 02:52:25 UTC
Created attachment 145745
kernel5.3-drm.debug=0xe

I'm at my wits' end on this bug. My bisecting has proven useless and I wanted to get some help from you guys. Several regressions that were later reverted reproduce the same behavior along the way, and I have spent hours sifting and booting to see where it's at.

The short of it is that the panel in my HP laptop boots to a 640x480 modeline on mainline kernels newer than 4.17-rc3.
That would not be a colossal problem, since I can force a working modeline from userspace, except that until that modeset happens the bridge utterly scrambles the display: the 640x480 output is laid out as a wrapping strip along the very top of my screen, as a jumble of colors.

The issue does not depend on the specific GPU but is AMD-related: both the W7170M and the WX7100 are affected.
Somewhere after kernel 4.17-rc3 the rodeo begins.
My dmesg is probably going to be much, much more helpful when it comes to debugging, and I will GLADLY test any kernel patches if you need more info and detailed kernel/custom dmesg dumps.

Notable stuff in the dump:

[    4.773991] [drm:amdgpu_vm_init [amdgpu]] VM update mode is SDMA
[    4.774149] [drm:drm_client_modeset_probe] 
[    4.774156] [drm:drm_helper_probe_single_connector_modes] [CONNECTOR:59:eDP-1]
[    4.774159] [drm:drm_helper_probe_single_connector_modes] [CONNECTOR:59:eDP-1] status updated from unknown to connected
[    4.774268] [drm:update_stream_scaling_settings [amdgpu]] Destination Rectangle x:0  y:0  width:640  height:480
[    4.774278] [drm:drm_helper_probe_single_connector_modes] [CONNECTOR:59:eDP-1] probed modes :
[    4.774282] [drm:drm_mode_debug_printmodeline] Modeline "640x480": 60 25175 640 656 752 800 480 490 492 525 0x40 0xa
[    4.774284] [drm:drm_helper_probe_single_connector_modes] [CONNECTOR:63:DP-1]
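
For reference, the modeline above can be decoded field by field. The following is a minimal Python sketch, assuming the field order the DRM core prints in this debug line (name, vrefresh, clock in kHz, the horizontal and vertical timings, then type and flags). `parse_modeline` and `refresh_hz` are hypothetical helper names, not kernel functions.

```python
import re

# Assumed field order of the DRM modeline debug line quoted above.
MODELINE_KEYS = [
    "vrefresh", "clock_khz",
    "hdisplay", "hsync_start", "hsync_end", "htotal",
    "vdisplay", "vsync_start", "vsync_end", "vtotal",
    "type", "flags",
]
MODELINE_RE = re.compile(r'Modeline "([^"]+)": (.+)$')


def parse_modeline(line):
    """Parse one 'Modeline "...": ...' log line into a dict of integer fields."""
    match = MODELINE_RE.search(line.strip())
    if match is None:
        raise ValueError("no modeline found in line")
    name, rest = match.groups()
    fields = rest.split()
    if len(fields) != len(MODELINE_KEYS):
        raise ValueError("unexpected number of modeline fields")
    # The first ten fields are decimal; type and flags are printed in hex.
    values = [int(f) for f in fields[:10]] + [int(f, 16) for f in fields[10:]]
    mode = dict(zip(MODELINE_KEYS, values))
    mode["name"] = name
    return mode


def refresh_hz(mode):
    """Refresh rate = pixel clock / (total pixels per frame, including blanking)."""
    return mode["clock_khz"] * 1000 / (mode["htotal"] * mode["vtotal"])
```

Applied to the line from the dump, this gives a refresh rate of roughly 59.94 Hz (25175 kHz over 800 x 525 total pixels), i.e. the classic 640x480 VGA timing. That would be consistent with a fallback mode being used because no EDID was available, though the log excerpt alone does not prove where the mode came from.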
Comment 1 Babblebones 2019-10-15 02:54:43 UTC
Created attachment 145746
kernel4.17.0-rc3-drm.debug=4 (Good display)
Comment 2 Babblebones 2019-10-15 03:23:28 UTC
Created attachment 145747
Screen Output
Comment 3 Babblebones 2019-10-15 03:25:24 UTC
Created attachment 145748
EDID.bin
Comment 4 Alex Deucher 2019-10-15 14:30:49 UTC
Can you use git bisect on the issue to find the exact commit that broke it, or at least narrow down when the regression happened?  E.g., working in 4.17-rc3 and broken in 4.17-rc4?
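
Since every bisect step here needs a reboot into the test kernel, a small helper can keep the good/bad call consistent between steps. This is a minimal Python sketch, assuming the regression always shows up in dmesg either as the "No EDID read" error or as eDP-1 exposing only the 640x480 mode, and that `drm.debug` is enabled so the probed modes are logged. `edp_modes` and `bisect_verdict` are hypothetical names.

```python
import re


def edp_modes(dmesg_text, connector="eDP-1"):
    """Collect the modeline names logged right after the connector's 'probed modes' line."""
    modes = []
    collecting = False
    for line in dmesg_text.splitlines():
        if "probed modes" in line:
            # Only collect the modelines that belong to the requested connector.
            collecting = f":{connector}]" in line
            continue
        if collecting:
            match = re.search(r'Modeline "([^"]+)"', line)
            if match:
                modes.append(match.group(1))
            else:
                collecting = False
    return modes


def bisect_verdict(dmesg_text):
    """Return 'good', 'bad' or 'skip', matching the git bisect subcommands."""
    if "No EDID read" in dmesg_text:
        return "bad"
    modes = edp_modes(dmesg_text)
    if not modes:
        # No mode information in the log, so this boot cannot be judged.
        return "skip"
    if set(modes) == {"640x480"}:
        return "bad"
    return "good"
```

The returned string can be passed by hand to `git bisect good`, `git bisect bad` or `git bisect skip` after each boot.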
Comment 5 Babblebones 2019-10-15 19:56:59 UTC
(In reply to Alex Deucher from comment #4)
> Can you use git bisect on the issue to find the exact commit that broke it,
> or at least narrow down when the regression happened?  E.g., working in
> 4.17-rc3 and broken in 4.17-rc4?

So far I have found two commits that break my EDID read and cause the exact same issue.

I bisected mainline to try to find the issue, and the trail led into your drm-next git...

The first commit I found that broke my EDID read:
https://cgit.freedesktop.org/~agd5f/linux/commit/?h=amd-staging-drm-next&id=8a61bc085ffab3071c59efcbeff4044c034e7490
Was later reverted.

I followed more commits after this one down a branch and ended up here...
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86-urgent-for-linus&id=bd4caed47a19f25fe8674344ea06d469c27ac314

Surprisingly, swapping out the memory allocs actually breaks the EDID read in this commit/branch too. I know for a fact that Ubuntu's old 4.18 kernel reverted this specific commit and the treewide memory allocation change, and that worked!? I stopped about here; my head was starting to spin as to why THAT would do anything.

I am ready and able to fork over any binary/parameter/debug dumps to help solve this.
Comment 6 Babblebones 2019-10-21 23:46:18 UTC
This may be useful to you!


[    4.603742] [drm:amdgpu_dm_initialize_drm_device [amdgpu]] amdgpu_dm_connector_init()
[    4.604326] [drm:dm_dp_aux_transfer [amdgpu]] Op: Read, addr: 0000, SideBand Msg: Nop, Op res: OK
[    4.604327] Body: 11
[    4.604699] [drm:dm_dp_aux_transfer [amdgpu]] Op: Read, addr: 0000, SideBand Msg: Nop, Op res: OK
[    4.604699] Body: 11 06 84 00 01 00 00 00 00 00 00 00 00 00 00 00
[    4.604932] [drm:dm_dp_aux_transfer [amdgpu]] Op: Read, addr: 0000, SideBand Msg: Nop, Op res: OK
[    4.604933] Body: 11
[    4.605160] [drm:dm_dp_aux_transfer [amdgpu]] Op: Read, addr: 0200, SideBand Msg: Nop, Op res: OK
[    4.605161] Body: 41
[    4.605393] [drm:dm_dp_aux_transfer [amdgpu]] Op: Read, addr: 0000, SideBand Msg: Nop, Op res: OK
[    4.605394] Body: 11
[    4.605700] [drm:dm_dp_aux_transfer [amdgpu]] Op: Read, addr: 0400, SideBand Msg: Nop, Op res: OK
[    4.605700] Body: 00 00 00 00 00 00 00 00 00
[    4.605933] [drm:dm_dp_aux_transfer [amdgpu]] Op: Read, addr: 0000, SideBand Msg: Nop, Op res: OK
[    4.605933] Body: 11
[    4.606178] [drm:dm_dp_aux_transfer [amdgpu]] Op: Read, addr: 0409, SideBand Msg: Nop, Op res: OK
[    4.606178] Body: 00 00 00
[    4.606218] [drm:dc_conn_log_hex_linux [amdgpu]] 11 
[    4.606257] [drm:dc_conn_log_hex_linux [amdgpu]] 06 
[    4.606296] [drm:dc_conn_log_hex_linux [amdgpu]] 84 
[    4.606334] [drm:dc_conn_log_hex_linux [amdgpu]] 00 
[    4.606372] [drm:dc_conn_log_hex_linux [amdgpu]] 01 
[    4.606410] [drm:dc_conn_log_hex_linux [amdgpu]] 00 
[    4.606447] [drm:dc_conn_log_hex_linux [amdgpu]] 00 
[    4.606485] [drm:dc_conn_log_hex_linux [amdgpu]] 00 
[    4.606522] [drm:dc_conn_log_hex_linux [amdgpu]] 00 
[    4.606560] [drm:dc_conn_log_hex_linux [amdgpu]] 00 
[    4.606597] [drm:dc_conn_log_hex_linux [amdgpu]] 00 
[    4.606635] [drm:dc_conn_log_hex_linux [amdgpu]] 00 
[    4.606672] [drm:dc_conn_log_hex_linux [amdgpu]] 00 
[    4.606710] [drm:dc_conn_log_hex_linux [amdgpu]] 00 
[    4.606748] [drm:dc_conn_log_hex_linux [amdgpu]] 00 
[    4.606786] [drm:dc_conn_log_hex_linux [amdgpu]] 00 
[    4.606825] [drm:retrieve_link_cap [amdgpu]] Rx Caps: 
[    4.607189] [drm:dm_dp_aux_transfer [amdgpu]] Op: Write, addr: 0050, SideBand Msg: Nop, Op res: OK
[    4.607192] [drm:drm_dp_i2c_do_msg] I2C nack (result=0, size=0
[    4.607559] [drm:dm_dp_aux_transfer [amdgpu]] Op: Write, addr: 0050, SideBand Msg: Nop, Op res: OK
[    4.607561] [drm:drm_dp_i2c_do_msg] I2C nack (result=0, size=0
[    4.607925] [drm:dm_dp_aux_transfer [amdgpu]] Op: Write, addr: 0050, SideBand Msg: Nop, Op res: OK
[    4.607926] [drm:drm_dp_i2c_do_msg] I2C nack (result=0, size=0
[    4.608290] [drm:dm_dp_aux_transfer [amdgpu]] Op: Write, addr: 0050, SideBand Msg: Nop, Op res: OK
[    4.608291] [drm:drm_dp_i2c_do_msg] I2C nack (result=0, size=0
[    4.608661] [drm:dm_dp_aux_transfer [amdgpu]] Op: Write, addr: 0050, SideBand Msg: Nop, Op res: OK
[    4.608662] [drm:drm_dp_i2c_do_msg] I2C nack (result=0, size=0
[    4.609021] [drm:dm_dp_aux_transfer [amdgpu]] Op: Write, addr: 0050, SideBand Msg: Nop, Op res: OK
[    4.609022] [drm:drm_dp_i2c_do_msg] I2C nack (result=0, size=0
[    4.609381] [drm:dm_dp_aux_transfer [amdgpu]] Op: Write, addr: 0050, SideBand Msg: Nop, Op res: OK
[    4.609383] [drm:drm_dp_i2c_do_msg] I2C nack (result=0, size=0
[    4.609741] [drm:dm_dp_aux_transfer [amdgpu]] Op: Write, addr: 0050, SideBand Msg: Nop, Op res: OK
[    4.609742] [drm:drm_dp_i2c_do_msg] I2C nack (result=0, size=0
[    4.610189] [drm:dm_dp_aux_transfer [amdgpu]] Op: Write, addr: 0050, SideBand Msg: Nop, Op res: OK
[    4.610190] [drm:drm_dp_i2c_do_msg] I2C nack (result=0, size=0
[    4.610549] [drm:dm_dp_aux_transfer [amdgpu]] Op: Write, addr: 0050, SideBand Msg: Nop, Op res: OK
[    4.610550] [drm:drm_dp_i2c_do_msg] I2C nack (result=0, size=0
[    4.610587] [drm:dc_link_detect [amdgpu]] *ERROR* No EDID read.


Kernel 4.19 Gentoo stock

Any idea why the dc_conn_log_hex_linux output is so short?
Other people's connectors seem to get lots of hex data in the debug dump here, while mine is nearly completely empty!
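
One observation on the dump above: the 16 bytes printed by dc_conn_log_hex_linux match, byte for byte, the 16-byte body returned by the AUX read at address 0000, so that part looks like the receiver capability bytes rather than EDID data. The first three bytes can be decoded with a minimal Python sketch like the one below, assuming the standard DPCD layout (revision at offset 0, maximum link rate in units of 0.27 Gbps per lane at offset 1, maximum lane count in the low bits of offset 2 with the enhanced framing bit on top). `decode_dpcd_caps` is a hypothetical helper name.

```python
LINK_RATE_UNIT_GBPS = 0.27  # assumed unit of the DPCD max link rate byte, per lane


def decode_dpcd_caps(body_hex):
    """Decode the first three receiver capability bytes from a hex string."""
    data = bytes.fromhex(body_hex)  # whitespace between bytes is accepted
    if len(data) < 3:
        raise ValueError("need at least 3 capability bytes")
    revision = data[0]
    return {
        # Major revision in the high nibble, minor revision in the low nibble.
        "dpcd_rev": f"{revision >> 4}.{revision & 0x0F}",
        "max_link_rate_gbps": round(data[1] * LINK_RATE_UNIT_GBPS, 2),
        # Lane count lives in bits 4:0, enhanced framing capability in bit 7.
        "max_lane_count": data[2] & 0x1F,
        "enhanced_framing": bool(data[2] & 0x80),
    }
```

For the body `11 06 84 ...` this reads as DPCD revision 1.1, 1.62 Gbps per lane and 4 lanes with enhanced framing. The AUX reads themselves succeed; it is the I2C-over-AUX writes to address 0050 that get NACKed before the "No EDID read" error.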
Comment 7 Babblebones 2019-11-09 15:41:54 UTC
The EDID seems to have some kind of proprietary tag block in between the last and middle detailed modes.

Would this break parsing in more recent kernels? edid-decode seems to say that this breaks the standard.
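
To see which descriptor carries the unusual tag, the four 18-byte descriptors of the EDID base block can be listed directly. This is a minimal Python sketch, assuming the usual 128-byte base block layout (8-byte header, descriptors at offsets 54, 72, 90 and 108, last byte as checksum). The function names and the short tag table are illustrative only and do not mirror the kernel's parser.

```python
EDID_HEADER = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])
DESCRIPTOR_OFFSETS = (54, 72, 90, 108)
DESCRIPTOR_SIZE = 18
# A few well-known display descriptor tags; not an exhaustive list.
KNOWN_TAGS = {
    0xFF: "serial number",
    0xFE: "alphanumeric text",
    0xFD: "range limits",
    0xFC: "product name",
    0x10: "dummy",
}


def check_base_block(edid):
    """Check the fixed header and the checksum of the 128-byte base block."""
    block = edid[:128]
    if len(block) < 128:
        raise ValueError("EDID base block must be 128 bytes")
    return {
        "header_ok": block[:8] == EDID_HEADER,
        # All 128 bytes must sum to 0 modulo 256.
        "checksum_ok": sum(block) % 256 == 0,
    }


def describe_descriptors(edid):
    """Label each of the four 18-byte descriptors in the base block."""
    labels = []
    for offset in DESCRIPTOR_OFFSETS:
        desc = edid[offset:offset + DESCRIPTOR_SIZE]
        if desc[0] or desc[1]:
            # A non-zero pixel clock marks a detailed timing descriptor.
            labels.append("detailed timing")
            continue
        tag = desc[3]
        if tag <= 0x0F:
            labels.append(f"manufacturer-specific (tag 0x{tag:02x})")
        else:
            labels.append(KNOWN_TAGS.get(tag, f"other (tag 0x{tag:02x})"))
    return labels
```

Assuming the attached EDID.bin is saved locally, `describe_descriptors(open("EDID.bin", "rb").read())` would show where the manufacturer-specific block sits relative to the detailed timings.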
Comment 8 Martin Peres 2019-11-19 09:58:29 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/938.
