Bug 110371 - HP Dreamcolor display *Error* No EDID read
Summary: HP Dreamcolor display *Error* No EDID read
Status: RESOLVED MOVED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/other
Version: unspecified
Hardware: All Linux (All)
Importance: medium normal
Assignee: Default DRI bug account
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-04-10 03:38 UTC by Babblebones
Modified: 2019-10-15 02:28 UTC
CC List: 2 users

See Also:
i915 platform:
i915 features:


Attachments
Dmesg output from an affected kernel (74.31 KB, text/plain)
2019-04-13 02:14 UTC, Babblebones
Dmesg drm.debug=4 (126.93 KB, text/plain)
2019-05-02 03:09 UTC, Babblebones
EDID file (128 bytes, application/octet-stream)
2019-05-15 05:02 UTC, Babblebones

Description Babblebones 2019-04-10 03:38:39 UTC
My Tonga FirePro W7170M MXM GPU seems to react badly to kernels newer than Ubuntu's 4.18 on 18.10.
I just tried a whole bunch of mainline kernels and Ubuntu 19.04 and narrowed it down: on any kernel newer than 4.18 the GPU can no longer read the EDID of my DreamColor IPS display.

Not sure how to best go about bisecting this...

The display uses a colorboard to convert what I can only assume is horizontal and vertical signalling information from a single eDP connector coming off the board into two separate LVDS channels.

Without the EDID, the image shown on screen is completely unreadable, with crazy scan wrapping from the channels getting mixed.

May be related to: https://bugs.freedesktop.org/show_bug.cgi?id=108806
Comment 1 Michel Dänzer 2019-04-10 07:38:05 UTC
Please attach the corresponding full output of dmesg.
Comment 2 Babblebones 2019-04-13 02:14:23 UTC
Created attachment 143957
Dmesg output from an affected kernel

Here is the issue. You can see:
[    1.335096] [drm:dc_link_detect [amdgpu]] *ERROR* No EDID read.

Which is about where the display turns into fruit salad.
Please let me know if I can submit any more debug info that would assist! I am very much in need of running a newer kernel.
Comment 3 Nicholas Kazlauskas 2019-04-18 14:40:55 UTC
Looks like this is a bug in DRM's own video mode parsing.

[    1.325111] [drm] parse error at position 12 in video mode '1920x1080@59.94'
[    1.335096] [drm:dc_link_detect [amdgpu]] *ERROR* No EDID read.

It's hitting the "." as part of the video mode and erroring out because DRM doesn't consider it a valid character.
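
To illustrate why position 12 is the culprit, here is a minimal Python sketch. It is not the kernel's actual parser: the helper name `first_unexpected_char` and its allowed character set are assumptions made up for illustration. It only shows that the first character outside digits, `x`, and `@` in this mode string is the `.` at 0-based index 12.

```python
def first_unexpected_char(mode: str, allowed: str = "0123456789x@") -> int:
    """Return the 0-based index of the first character not in `allowed`.

    Returns -1 when every character is allowed. The `allowed` set is a
    simplifying assumption, not the set the real DRM parser accepts.
    """
    for index, char in enumerate(mode):
        if char not in allowed:
            return index
    return -1


# The mode string from the dmesg line above.
position = first_unexpected_char("1920x1080@59.94")
```

With an integer refresh rate such as `1920x1080@60`, the same helper finds nothing to complain about.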
Comment 4 Michel Dänzer 2019-04-18 15:11:44 UTC
(In reply to Nicholas Kazlauskas from comment #3)
> It's hitting the "." as part of the video mode and erroring out because DRM
> doesn't consider it a valid character.

Or maybe there are two separate issues? Failure to parse the mode name shouldn't affect whether or not DC picks up EDID, should it?
Comment 5 Nicholas Kazlauskas 2019-04-18 15:29:25 UTC
(In reply to Michel Dänzer from comment #4)
> (In reply to Nicholas Kazlauskas from comment #3)
> > It's hitting the "." as part of the video mode and erroring out because DRM
> > doesn't consider it a valid character.
> 
> Or maybe there's two separate issues? Failure to parse the mode name
> shouldn't affect whether or not DC picks up EDID, should it?

I suppose that doesn't actually influence whether the EDID has been read or is valid. So that's a different bug.

The "No EDID read." error comes from the result from drm_get_edid(...) being NULL, however.

While we're responsible for the actual transfers to and from the receiver, the logic for reading the EDID is shared between drivers.

(In reply to Babblebones from comment #2)
> Created attachment 143957 [details]
> Dmesg output from an affected kernel
> 
> Here is the issue, you can see
> [    1.335096] [drm:dc_link_detect [amdgpu]] *ERROR* No EDID read.
> 
> Which is about where the display turns into fruit salad.
> Please let me know if I can submit any more debug info that would assist! I
> am very much in need of running a newer kernel.

Can you post a dmesg log with drm.debug=4 as part of your kernel boot parameters?
Comment 6 Babblebones 2019-04-19 17:59:49 UTC
Will do. Just switched my distro to Gentoo, specifically so I can stay on kernel 4.18 for as long as necessary to combat the issue and apply a patch when ready, and cleared all of the cruft out of the grub config.

I can drop my EDID itself here as well if that will help.

Give me a bit to get everything ready and I will post the drm debug of old and new.
Comment 7 Babblebones 2019-05-02 03:09:09 UTC
Created attachment 144124
Dmesg drm.debug=4

Funnily enough, it complains about missing EDID in this one too. That may not even be the issue, now that I'm looking at it more closely.
But I seem to have found the commit where my panel breaks down.

It's a big drm-next merge. It has to be one of the core changes listed on the changelog.
I'm not a git wizard, is there any way to get more granular about this commit?
Anyone have an idea what's broken in here?

54c88a029a0a86fe00a0ee7d2a15ee08e6d04db9
Comment 8 Babblebones 2019-05-05 17:48:59 UTC
I've been going down the rabbit hole looking for the commit that soured my display.

https://cgit.freedesktop.org/~airlied/linux/commit/?id=5c0e0b45c4936295d6333dd7961d0b89b15b070d

Or

This branch

https://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-fixes-4.17

Is the closest I can get so far. If I go one commit back, the kernel works with my display.

The commit directly before that

https://cgit.freedesktop.org/~airlied/linux/commit/?id=44ef02c241e7c99af77b408d52af708aa159e968

Works just fine.

Is there anything inside this merge that would cause this?
I don't know much of what I'm looking at or looking for in these commits but I'll continue dissecting.
Comment 9 Babblebones 2019-05-05 18:31:45 UTC
I've found the exact commit!

https://lists.freedesktop.org/archives/amd-gfx/2018-July/023920.html

That patch fixes the issue on a few of the affected kernels, but the code base has since been modified so heavily, while retaining the same behavior, that I can't apply it to kernel 5.2 from the linux-stable git tree.

I can't even discern where to manually edit the related files to change the behavior.

It may be necessary to include another fix from that list of related patches to correct the behavior for my connector/panel. I'm not a programmer myself, so I'm not sure what's supposed to happen here.
Comment 10 Babblebones 2019-05-10 22:32:13 UTC
Best I can tell, and I may be wrong, the error-checking code was moved from the DC part straight into DRM, which now replicates the exact bug that was reverted by the above DC commits; an equivalent fix was never implemented for DRM.

https://www.st.com/en/touch-and-display-controllers/stdp8028.html

My DreamColor display uses an STDP8028 chip, sitting in between the display and the motherboard just behind the screen, to convert a DisplayPort signal coming off the board into dual-channel LVDS to run the display.

For some reason the EDID can't be read through this chip, and no modelines at all are printed for the display, so the lowest possible resolution gets picked and all the timings are incorrect, resulting in the scrambled display.

I hope the behavior highlighted in the above commit can help someone search for the regression in the new DRM mode setting as it produces the exact same type of scramble and lack of modelines.
Comment 11 Babblebones 2019-05-15 05:02:14 UTC
Created attachment 144277
EDID file

Don't know if this helps, but ALL kernels seem to be unable to grab the EDID on startup. The changes after 4.18 break something further, making the display unusable by changing the default behavior of the panel's mode when there is no EDID.

[    0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-4.18.20 root=/dev/sdb2 ro amdgpu.ppfeaturemask=0xffffffff amdgpu.dpm=1 amdgpu.dc=1 amdgpu.gpu_recovery=1 amdgpu.powerplay=1 drm.edid_firmware=eDP-1:edid/edid.bin
[    4.340989] [drm] Got external EDID base block and 0 extensions from "edid/edid.bin" for connector "eDP-1"
[    4.451624]  drm_do_probe_ddc_edid+0xb9/0x130
[    4.451628]  ? drm_edid_block_valid+0x180/0x180
[    4.451629]  drm_do_get_edid+0xb1/0x330
[    4.451631]  drm_get_edid+0x61/0x380
[    4.451671]  dm_helpers_read_local_edid+0x4c/0xe0 [amdgpu]
[    4.630956]  drm_do_probe_ddc_edid+0xb9/0x130
[    4.630966]  ? drm_edid_block_valid+0x180/0x180
[    4.630969]  drm_do_get_edid+0xb1/0x330
[    4.630972]  drm_get_edid+0x61/0x380
[    4.631115]  dm_helpers_read_local_edid+0x4c/0xe0 [amdgpu]
[    4.801879]  drm_do_probe_ddc_edid+0xb9/0x130
[    4.801889]  ? drm_edid_block_valid+0x180/0x180
[    4.801892]  drm_do_get_edid+0xb1/0x330
[    4.801895]  drm_get_edid+0x61/0x380
[    4.802006]  dm_helpers_read_local_edid+0x4c/0xe0 [amdgpu]
[    4.972889]  drm_do_probe_ddc_edid+0xb9/0x130
[    4.972900]  ? drm_edid_block_valid+0x180/0x180
[    4.972903]  drm_do_get_edid+0xb1/0x330
[    4.972907]  drm_get_edid+0x61/0x380
[    4.973028]  dm_helpers_read_local_edid+0x4c/0xe0 [amdgpu]
[    5.145825]  drm_do_probe_ddc_edid+0xb9/0x130
[    5.145835]  ? drm_edid_block_valid+0x180/0x180
[    5.145837]  drm_do_get_edid+0xb1/0x330
[    5.145841]  drm_get_edid+0x61/0x380
[    5.145942]  dm_helpers_read_local_edid+0x4c/0xe0 [amdgpu]
[    5.170556] [drm:dc_link_detect [amdgpu]] *ERROR* No EDID read.

This is what happens when I set the EDID manually from the kernel command line.
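
For reference, the `drm.edid_firmware` value in the command line above follows the documented `[<connector>:]<file>` form, with multiple entries separated by commas. Here is a rough Python sketch of how that value breaks down. The function name `parse_edid_firmware` is hypothetical, and this is not how the kernel itself parses the parameter.

```python
def parse_edid_firmware(cmdline: str) -> dict:
    """Map connector name -> firmware path from a kernel command line.

    An entry without a connector prefix is stored under the key None,
    meaning it is not tied to one specific connector.
    """
    prefix = "drm.edid_firmware="
    for token in cmdline.split():
        if token.startswith(prefix):
            mapping = {}
            for entry in token[len(prefix):].split(","):
                connector, separator, path = entry.partition(":")
                if separator:
                    mapping[connector] = path
                else:
                    mapping[None] = entry
            return mapping
    return {}


cmdline = ("BOOT_IMAGE=/boot/vmlinuz-4.18.20 root=/dev/sdb2 ro "
           "amdgpu.dc=1 drm.edid_firmware=eDP-1:edid/edid.bin")
overrides = parse_edid_firmware(cmdline)
```

For the command line in this log, the only override is for connector `eDP-1`, which matches the "Got external EDID base block" message naming that connector.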

Setting it manually freaks out newer kernels: my display won't modeset properly and turns into a mess, whereas 4.18 seemed to drop this.

Any debug patches I can run to help you guys figure it out?
I have included my EDID file if you want to run it through anything to see what breaks.
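
A quick sanity check anyone can run on the attached file: an EDID base block is 128 bytes long, starts with the fixed header 00 FF FF FF FF FF FF 00, and its bytes sum to 0 modulo 256. Below is a minimal Python sketch of those checks. The function name is an assumption, and a block that passes is not guaranteed to be accepted by the kernel, which validates more than this.

```python
EDID_HEADER = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])
EDID_BLOCK_SIZE = 128


def check_edid_base_block(block: bytes) -> list:
    """Return a list of human-readable problems; empty means none found."""
    if len(block) != EDID_BLOCK_SIZE:
        # Without the full block the other checks are meaningless.
        return ["expected 128 bytes, got %d" % len(block)]
    problems = []
    if block[:8] != EDID_HEADER:
        problems.append("bad header: " + block[:8].hex())
    if sum(block) % 256 != 0:
        problems.append("bad checksum: bytes sum to %d mod 256" % (sum(block) % 256))
    return problems
```

It could be run on the attachment with something like `check_edid_base_block(open("edid.bin", "rb").read())`, where `edid.bin` is whatever name the file was saved under.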