Summary: | [KBL-R] DP link training failures lead to downgraded link parameters and resolution | ||
---|---|---|---|
Product: | DRI | Reporter: | Ricardo Ribalda <ricardo.ribalda> |
Component: | DRM/Intel | Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Status: | RESOLVED MOVED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Severity: | normal | ||
Priority: | low | CC: | intel-gfx-bugs, jani.nikula, manasi.d.navare, shashank.sharma, ville.syrjala |
Version: | DRI git | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Whiteboard: | Triaged, ReadyForDev | ||
i915 platform: | KBL | i915 features: | display/DP |
Attachments: |
Created attachment 139074 [details]
Screen detected as 1440p
Just tested on drm-tip with similar results, but the attached patch fixes the bug for me. So far I have only seen 1 retry, so we could tune it to: tries < 2...
Created attachment 139077 [details] [review]
Workaround for this issue
Created attachment 139078 [details]
dmesg for drm tip
Created attachment 139079 [details]
Screen detected as 1440p using drm tip and drm.debug=6
Created attachment 139080 [details]
Screen detected as HD using drm tip and drm.debug=6
Document CPU: CPU0: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz (family: 0x6, model: 0x8e, stepping: 0xa) => KBL-R.

Ville, Jani, any advice here?

For some reason the link training sometimes fails at the higher link bandwidth, and, as expected, the link is downgraded to ensure we get something other than a black screen. While I'm sure the degraded resolution is very annoying, it's surely more desirable than a black screen. Needs further analysis of the link training failure.

Side note, you seem to be missing the DMC firmware, available from the linux-firmware repository. Probably unrelated to the problem at hand, but, well, you never know.

Hi Jani,
Thanks for your comment. I believe that with the proposed workaround there won't be a black screen situation. In the worst case, when the link is bad, link training will be 5 times slower. We can even reduce that to two times. I have never seen my system fail training twice in a row.

What proposed workaround?

(In reply to Jani Nikula from comment #12)
> What proposed workaround?

This patch: https://bugs.freedesktop.org/attachment.cgi?id=139077

Ville, Jani, how does the patch mentioned above look?

I just hit a situation where I got 2 and 3 failed attempts before an OK sync. So it seems that 5 tries is not that bad a guess. BTW, in both situations the amount of black screen time was within reason.

Ricardo,
Do you have the model number of the Xiaomi dongle? We need to find the vendor of the LSPCON in the dongle.

Never mind. It looks like they only make one product that is a USB-C hub with HDMI output.
Xiaomi USB Type-C to HDMI Multifunction Adapter
Is this the device you are using?
This is the device: http://item.mi.com/1163000011.html
On the device it can be read, among other Chinese characters:
USB-C HDMI
Model: ZJQ01TM
https://photos.app.goo.gl/biCnqgD1RJCRjrrR2

One of the reasons to use this adapter was to be able to charge the notebook while using the adapter. It seems that this is a "proprietary" feature and other adapters might not support it. But I am not an expert on cables :P.

Retrying the clock recovery phase 5 times without giving up is what the hack used to do earlier, before it was removed to pass DP compliance. But it looks like there are more non-compliant panels than compliant ones, and maybe we need that hack permanently so that the driver doesn't fall back on the link parameters way too quickly. This same workaround could probably be tried for: https://bugs.freedesktop.org/show_bug.cgi?id=105338
Manasi

This issue could also manifest if link training is being attempted before the USB-C hub is ready to accept a signal. A small delay before attempting link training with a USB-C LSPCON may work.

Ricardo,
Could you try adding a short delay, msleep(200), to the start of intel_dp_start_link_train() and see if your retries drop to 0?

(In reply to Clinton Taylor from comment #21)
> This issue could also manifest if link training is being attempted before
> the USB-C Hub is ready to accept a signal. A small delay before attempting
> Link Training with a USB-C LSPCON may work.
>
> Ricardo,
> Could you try adding a short delay msleep (200) to the start of
> intel_dp_start_link_train() and see if your retries drops to 0?

That did not do the trick :(. It also does not explain why a couple of times I got more than one retry.
@@ -317,10 +317,18 @@ void intel_dp_start_link_train(struct intel_dp *intel_dp)
 {
 	struct intel_connector *intel_connector = intel_dp->attached_connector;
+	int tries;
 
-	if (!intel_dp_link_training_clock_recovery(intel_dp))
-		goto failure_handling;
-	if (!intel_dp_link_training_channel_equalization(intel_dp))
+	msleep(200);
+
+	for (tries = 0; tries < 5; tries++) {
+		if (!intel_dp_link_training_clock_recovery(intel_dp))
+			continue;
+		if (intel_dp_link_training_channel_equalization(intel_dp))
+			break;
+	}
+
+	if (tries == 5)
 		goto failure_handling;
 
 	DRM_DEBUG_KMS("[CONNECTOR:%d:%s] Link Training Passed at Link Rate = %d, Lane count = %d",

The msleep(200); should be before the first clock recovery call. Right after the int tries;

I'm trying to give extra time for the LSPCON and micro controller in the USB-C dongle to complete its initialization before we try and use it. Clock Recovery is a very basic function and should not be causing this much issue. Equalization I can see compatibility issue, but not clock recovery.

(In reply to Clinton Taylor from comment #23)
> The msleep(200); should be before the first clock recovery call. Right after
> the int tries;
>
> I'm trying to give extra time for the LSPCON and micro controller in the
> USB-C dongle to complete its initialization before we try and use it. Clock
> Recovery is a very basic function and should not be causing this much issue.
> Equalization I can see compatibility issue, but not clock recovery.

I believe I placed the msleep where you are saying. This is the code that I tested:

intel_dp_start_link_train(struct intel_dp *intel_dp)
{
	struct intel_connector *intel_connector = intel_dp->attached_connector;
	int tries;

	msleep(200);

	for (tries = 0; tries < 5; tries++) {
		if (!intel_dp_link_training_clock_recovery(intel_dp))
			continue;
		if (intel_dp_link_training_channel_equalization(intel_dp))
			break;
	}
	...
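For readers who want to see the control flow of the workaround in isolation, here is a minimal userspace sketch of the same bounded retry loop. It is not i915 code: `struct fake_sink`, `fake_clock_recovery()`, `fake_channel_equalization()` and `train_with_retries()` are hypothetical stand-ins that only simulate a sink failing a fixed number of times, and the retry limit is passed in as a parameter instead of being hard-coded to 5.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the sink: each phase fails a fixed number of
 * times before it starts succeeding. Not part of i915. */
struct fake_sink {
	int cr_failures_left;	/* clock recovery attempts that will still fail */
	int eq_failures_left;	/* channel equalization attempts that will still fail */
};

static bool fake_clock_recovery(struct fake_sink *sink)
{
	if (sink->cr_failures_left > 0) {
		sink->cr_failures_left--;
		return false;
	}
	return true;
}

static bool fake_channel_equalization(struct fake_sink *sink)
{
	if (sink->eq_failures_left > 0) {
		sink->eq_failures_left--;
		return false;
	}
	return true;
}

/*
 * Run both training phases up to max_tries times at the same link
 * parameters. Returns the number of attempts used (1..max_tries) on
 * success, or -1 if every attempt failed and the caller should fall back
 * to lower link parameters.
 */
static int train_with_retries(struct fake_sink *sink, int max_tries)
{
	int tries;

	for (tries = 0; tries < max_tries; tries++) {
		if (!fake_clock_recovery(sink))
			continue;	/* start over from clock recovery */
		if (fake_channel_equalization(sink))
			return tries + 1;
	}

	return -1;	/* same condition as "tries == 5" in the patch */
}
```

In this model, a sink that fails clock recovery five times in a row exhausts a limit of 5 but trains on the sixth attempt with a limit of 10, which matches the later report in this thread that more than 5 retries were occasionally needed.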
Also, I am never disconnecting the USB-C dongle. I am reconnecting the HDMI cable, or just going to the text console and then back to X.

(In reply to Ricardo Ribalda from comment #25)
> Also, I am never disconecting the usb-C dongle, I am reconnecting the hdmi
> cable, or just going to the text console and then back to X

The USB-C device should be completely ready to go before HPD is asserted to the SOC. This is just a test to try and simplify a possible quirk for this particular device. The location of the msleep() in comment 24 is now correct for this test. Thanks for helping with the debug.

+Shashank.

I don't think we have HDMI over USB-C (HDMI alt mode) yet, definitely not on KBL. This issue is not related to LSPCON.
- Shashank

(In reply to shashank.sharma@intel.com from comment #28)
> I dont think we have HDMI over USB-C (HDMI alt mode) yet, definitely not on
> KBL. This issue is not related to LSPCON.

I think Clinton isn't talking about on-board LSPCON, but rather LSPCON embedded in a dongle or a cable, converting the USB Type-C DP Alt Mode to HDMI.

One of the questions is, do we need to treat these things as some special snowflakes?

(In reply to Ricardo Ribalda from comment #19)
> One of the reasons to use this adapter was to be able to charge the notebook
> while using the adapter.

Does charging vs. not charging make a difference for link training?

(In reply to Ricardo Ribalda from comment #18)
> This is the device:
> http://item.mi.com/1163000011.html

And do you have other devices connected to the adapter? Do they make a difference?

IIRC the DP alt mode spec allows for 2 or 4 lane configurations, dynamically allocating 2 lanes to other USB needs.

(In reply to Jani Nikula from comment #31)
> (In reply to Ricardo Ribalda from comment #18)
> > This is the device:
> > http://item.mi.com/1163000011.html
>
> And do you have other devices connected to the adapter? Do they make a
> difference?
>
> IIRC the DP alt mode spec allows for 2 or 4 lane configurations, dynamically
> allocating 2 lanes to other USB needs.

I have experienced wrong sync with all the combinations :(
- With and without USB
- With and without charging

(In reply to Jani Nikula from comment #29)
> (In reply to shashank.sharma@intel.com from comment #28)
> > I dont think we have HDMI over USB-C (HDMI alt mode) yet, definitely not on
> > KBL. This issue is not related to LSPCON.
>
> I think Clinton isn't talking about on-board LSPCON, but rather LSPCON
> embedded in a dongle or a cable, converting the USB Type-C DP Alt Mode to
> HDMI.
>
> One of the questions is, do we need to treat these things as some special
> snowflakes?

This depends on the cable/HW specs. MCA/Parade LSPCONs are motherboard down config, and they need proper probing and enabling. I hope this device knows what it's doing :-).
- Shashank

From the logs:

[ 51.556954] [drm:drm_dp_read_desc [drm_kms_helper]] DP branch: OUI 00-1c-f8 dev-ID 176GB0 HW-rev 1.0 SW-rev 7.38 quirks 0x0000

That's Parade OUI I think.

(In reply to Jani Nikula from comment #34)
> From the logs:
>
> [ 51.556954] [drm:drm_dp_read_desc [drm_kms_helper]] DP branch: OUI
> 00-1c-f8 dev-ID 176GB0 HW-rev 1.0 SW-rev 7.38 quirks 0x0000
>
> That's Parade OUI I think.

And dev-ID smells like https://www.paradetech.com/products/ps176/

(In reply to Jani Nikula from comment #34)
> From the logs:
>
> [ 51.556954] [drm:drm_dp_read_desc [drm_kms_helper]] DP branch: OUI
> 00-1c-f8 dev-ID 176GB0 HW-rev 1.0 SW-rev 7.38 quirks 0x0000
>
> That's Parade OUI I think.

Yes, this is Parade OUI, but the Parade LSPCON device is PS175. From the comment here, this looks like PS176, so probably an external device or cable.
- Shashank

So then do we need a quirk specific to this OUI?

Quick update: I have seen a couple of times that it needed more than 5 retries :S. I have updated my patch to 10 retries.
Cheers

Since there has been no proper solution for three months.
Can we just apply the proposed patch? If it breaks a standard, we could just add a kernel parameter to allow it.
Cheers!

(In reply to Ricardo Ribalda from comment #39)
> Since there has been no proper solution for three months. Can we just apply
> the proposed patch?

Have you tried with latest drm-tip, which might help in this case?

We will keep it open for now and consider fixing it when we see similar issues in future. Changing the priority to low. When we see a similar issue again, we can consider changing the priority of this bug.

I have tried with a different screen (asus PB278) and a different notebook (Thinkpad T420) and I am getting the same results.
A simple Google search shows other people with a similar problem: https://askubuntu.com/questions/581574/ubuntu-14-low-screen-resolution-on-intel-hd-display
I believe that people simply do not know where to ask for help and simply use their screens at lower resolutions. And not that many people know how to patch and rebuild their kernels (or do not want to compile for over an hour every month).
I can make a patch that enables multiple tries in intel_dp_start_link_train via debugfs; that API does not need to be maintained. I will post to the forums to ask whether enabling that option fixes their issue, and then you can have more information for setting the bug priority.

Ricardo, does the patch with 5 retries of link training (clock recovery plus channel EQ) still work for you to fix this issue? In that case, the patch will have to be modified to add a quirk for this specific dongle OUI, so that this retry loop, which is outside of the DP spec, is added only for this dongle. This dongle seems to not follow the DP spec and needs more retries for clock recovery.
Manasi

Created attachment 145386 [details]
dmesg logs for the issue where "Link Training failed at link rate = 540000"
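The quirk suggested by Manasi above could be keyed on the branch device identity that the log already prints ("DP branch: OUI 00-1c-f8 dev-ID 176GB0"). The following is only a standalone sketch of such a lookup: `struct branch_quirk`, `quirk_table`, `DEFAULT_LINK_TRAIN_TRIES` and `link_train_tries_for()` are hypothetical names, and the value 5 is taken from the workaround patch. The `quirks 0x0000` field in that log line suggests the DRM DP helper keeps its own per-device quirk mask; this sketch does not use or imitate that interface.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical quirk entry; the field names are illustrative only. */
struct branch_quirk {
	uint8_t oui[3];		/* IEEE OUI from the DP branch descriptor */
	const char *dev_id;	/* device ID string, or NULL to match any */
	int link_train_tries;	/* attempts before falling back */
};

/* Values taken from the "DP branch: OUI 00-1c-f8 dev-ID 176GB0" log line. */
static const struct branch_quirk quirk_table[] = {
	{ { 0x00, 0x1c, 0xf8 }, "176GB0", 5 },
};

/* Spec-conformant behaviour in this sketch: one attempt, then fall back. */
#define DEFAULT_LINK_TRAIN_TRIES 1

/* Return how many training attempts to allow for the given branch device. */
static int link_train_tries_for(const uint8_t oui[3], const char *dev_id)
{
	size_t i;

	for (i = 0; i < sizeof(quirk_table) / sizeof(quirk_table[0]); i++) {
		const struct branch_quirk *q = &quirk_table[i];

		if (memcmp(q->oui, oui, 3) != 0)
			continue;
		if (q->dev_id && strcmp(q->dev_id, dev_id) != 0)
			continue;
		return q->link_train_tries;
	}

	return DEFAULT_LINK_TRAIN_TRIES;
}
```

Matching on both the OUI and the device ID keeps the extra retries away from other Parade parts, such as the PS175 LSPCON mentioned earlier in the thread.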
I am able to reproduce the issue on a WHL device: when two 4K external monitors are connected, one 4K monitor is sometimes trained as 2K. I am connecting the monitors using a USB-C to DP cable on the USB-C port and an HDMI to HDMI cable on the HDMI port.

Here are the error messages that I am seeing:

Line 2230: [ 841.162692] [drm:intel_dp_start_link_train] Channel equalization failed 5 times
Line 2231: [ 841.162846] [drm:intel_dp_start_link_train] [CONNECTOR:90:DP-2] Link Training failed at link rate = 540000, lane count = 4

I was able to reproduce the issue on drm-tip, but it is very hard to reproduce, about one out of 50 times. I tried the patch that is specified in the above comments and couldn't reproduce the bug. However, monitors take a longer time to train. I attached the log in my previous comment for more details.

The patch still works for me. Agree on the quirk. Do you have an example of another OUI with a quirk that I can use as a reference?

Created attachment 145405 [details]
dmesg logs for the issue where "Link Training failed at link rate = 540000"
Adding a full log for the issue when 4K monitor is being trained as 2K.
When I connect Dell monitor using USB-C to DP cable and Samsung monitor using HDMI cable to DUT, the issue is not reproducible. However, when I have Dell connected to DUT using HDMI and Samsung connected to DUT using USB-C to DP, Samsung monitor resolution changes to 2K instead of 4K.

Another data point: using LG-27UD88-W monitor with USB-C to DP cable and Dell-P2715Q with HDMI cable, the issue is not reproducible.

Created attachment 145407 [details]
dmesg logs for the issue where "Link Training failed at link rate = 540000"
You are the reporter of this issue, which currently has low priority. Do you still see the issue? If so, please specify clearly what the impact is to you.

Next week I can try with the same hardware again.

(In reply to Jani Saarinen from comment #51)
> You are reporter of the issue currently having low priority. Do you still
> see issue. If so, please spesify clearly what is impact to you.

I still see the issue when I use the USB-C to HDMI adapter. I need the full resolution (2560x1440), so I have ended up buying a USB-C to mDP adapter, which seems more stable.

I think that a version of the proposed patches should be merged.

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/intel/issues/111.
Created attachment 139073 [details]
Screen detected as HD

Tested on Debian 4.15 and 4.16 (from experimental).

I have a secondary monitor connected via a USB-C to HDMI adapter (detected as DP by xrandr). It can manage resolutions up to 2560x1440.

Most of the time, when the system is booted, the resolution is detected OK, but if I suspend the machine, replug the screen, or switch to the text console, the resolution is "downgraded" to Full HD.

It is very annoying to reconnect the cable up to 5 times to get the expected resolution when this happens.

I have added the parameter drm.debug=0x06 and I can see that when it is on Full HD there are only 2 lanes detected instead of 4.

The adapter is brand new (Xiaomi) and the cable should be of good quality (Ethernet capable and tested on another platform).

Attached you will find a trace at Full HD and 1440p.
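As a rough illustration of why a drop from 4 lanes to 2 can cap the resolution at Full HD, the bandwidth check can be sketched as below. The assumptions are mine, not from the report: 24 bits per pixel, 8b/10b coding (8 data bits per link symbol), pixel clocks of 148500 kHz for 1920x1080@60 and 241500 kHz for 2560x1440@60 with reduced blanking, and no other protocol overhead. The report does not state which link rate the adapter trains at, so 270000 (HBR) is used purely as an example; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Payload capacity of a DP main link in kbit/s. link_rate_khz is the
 * per-lane symbol clock (270000 for HBR, 540000 for HBR2); with 8b/10b
 * coding each symbol carries 8 data bits.
 */
static uint64_t dp_link_capacity_kbps(uint32_t link_rate_khz, int lanes)
{
	return (uint64_t)link_rate_khz * 8u * (uint64_t)lanes;
}

/* Bandwidth a mode needs in kbit/s, ignoring any protocol overhead. */
static uint64_t mode_required_kbps(uint32_t pixel_clock_khz, int bpp)
{
	return (uint64_t)pixel_clock_khz * (uint64_t)bpp;
}

/* True if the mode fits within the link at the given rate and lane count. */
static bool mode_fits(uint32_t pixel_clock_khz, int bpp,
		      uint32_t link_rate_khz, int lanes)
{
	return mode_required_kbps(pixel_clock_khz, bpp) <=
	       dp_link_capacity_kbps(link_rate_khz, lanes);
}
```

With these assumed numbers, 2 lanes at 270000 carry 4.32 Gbit/s, which is enough for Full HD (about 3.56 Gbit/s) but not for 2560x1440 (about 5.80 Gbit/s), while 4 lanes at the same rate carry 8.64 Gbit/s and fit both modes.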