Bug 107281

Summary: Some DP MST hubs cause the driver to get stuck in link training clock recovery loop on hot-plugging
Product: DRI Reporter: Nathan Ciobanu <nathan.d.ciobanu>
Component: DRM/Intel Assignee: Dhinakaran Pandiyan <dhinakaran.pandiyan>
Status: CLOSED FIXED QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal    
Priority: medium CC: intel-gfx-bugs
Version: DRI git   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard: Triaged, ReadyForDev
i915 platform: KBL i915 features: display/DP MST
Attachments:
Description Flags
Instrumented driver dmesg none

Description Nathan Ciobanu 2018-07-18 18:53:23 UTC
Created attachment 140705 [details]
Instrumented driver dmesg

The i915 driver can get stuck in the link training clock recovery infinite loop when a USB-C DP MST hub with two DP displays is hot-plugged. 

Hardware:
- USB-C DP 1.2 MST hub with 2xDP connectors, 2xUSB2.0 ports, 1xEthernet port, 1 USB-C PD (pass-through power)
- Dell P2715Qt 3840x2160 DP display
- HP ZR24w 1920x1200 DP display
- KBL-Y Chromebook (running ChromeOS but tested with drm-tip kernel)

Steps to reproduce:
1. Connect the two displays to the MST hub: Dell in the first port, HP in the second port
1.5. The ports may not be labeled with numbers, so check the display order in the laptop's display settings UI. The order should be eDP - DP 3840x2160 - DP 1920x1200. The order matters: the bug does not reproduce with any other order.
2. Plug the MST hub's USB-C connector into one of the USB-C DP ports of the laptop
3. Allow time for the driver to train the eDP and the two external displays
4. Verify the order in step 1.5 in the laptop display settings UI
5. Unplug the USB-C MST hub
6. Give the system time to recover and the eDP to be retrained
7. Plug the USB-C MST hub back in as described in step 2

Observed behavior:
1. The two external displays remain blank
2. The display settings UI on the laptop shows that 2 external monitors are connected
3. Subsequent hot-plugs of the MST hub do not help, even if the hub is plugged into a different USB-C port on the laptop
4. The eDP continues to work
5. Rebooting the laptop is the only solution to get the two external displays working again

Frequency:
Always (100%)

Workarounds:
- Swap the ports that the two displays are connected to on the MST hub
- Reboot laptop and do not hot-plug
Comment 1 Dhinakaran Pandiyan 2018-07-19 19:47:07 UTC
Patches submitted to list - https://patchwork.freedesktop.org/series/46797/
Comment 2 Dhinakaran Pandiyan 2018-07-26 07:45:40 UTC
commit 65172699a8bd9956705f71fb8b66b1068a1bb5cd
Author: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Date:   Wed Jul 18 10:19:43 2018 -0700

    drm/i915/mst: Continue state updates even if AUX writes fail.

    We are too late in the enabling sequence to back out cleanly, not updating
    state tracking variables, like intel_dp->active_mst_links in this
    instance, results in incorrect behaviour further along.

    v2: Fixed int v/s bool comparison

    Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Cc: Nathan Ciobanu <nathan.d.ciobanu@linux.intel.com>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107281
    Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
    Reviewed-by: Nathan Ciobanu <nathan.d.ciobanu@linux.intel.com>
    Tested-by: Nathan Ciobanu <nathan.d.ciobanu@linux.intel.com>
    Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20180718171943.3246-2-dhinakaran.pandiyan@intel.com
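
    Editorial illustration (not part of the commit): a minimal standalone C
    sketch of why skipping a state update on a failed AUX write can corrupt
    later bookkeeping. This is not the i915 code. The struct, the function
    names, and the exact way the counter goes out of sync are hypothetical
    and chosen only to show the pattern the commit message describes.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for per-port driver state. */
struct port_state {
    int active_links; /* signed counter, so an unbalanced decrement goes negative */
};

/* Hypothetical AUX write: returns false when the write fails. */
static bool aux_write(bool simulate_failure)
{
    return !simulate_failure;
}

/* Early return on AUX failure: the counter is never incremented. */
static void enable_link_early_return(struct port_state *p, bool aux_fails)
{
    if (!aux_write(aux_fails))
        return;
    p->active_links++;
}

/* Report the failure but still update the state tracking. */
static void enable_link_continue(struct port_state *p, bool aux_fails)
{
    if (!aux_write(aux_fails))
        fprintf(stderr, "AUX write failed, continuing with state update\n");
    p->active_links++;
}

/* The teardown path always decrements, assuming enable incremented. */
static void disable_link(struct port_state *p)
{
    p->active_links--;
}
```

    With a failing AUX write, one enable followed by one disable leaves the
    early-return variant at -1 and the continue variant at 0.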

commit 45ef40aab72e21eb81147a6e8a2bca863f0234fd
Author: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Date:   Wed Jul 18 10:19:42 2018 -0700

    drm/i915/mst: Do not retrain new links

    The short pulse handler checks if channel equalization is okay and
    goes onto retrain a link if there are active MST links. This retraining
    path is not meant for new MST connections, but due to a bug elsewhere, if
    active_mst_links is < 0 the boolean check for active_mst_links passes and
    we proceed to retrain a new link. This results in a sequence of failed link
    training attempts, most likely due to the hardware not setup for link
    training at that point i.e., missing the DDI pre_enable sequence.

    [   80.301272] [drm:intel_dp_check_mst_status] channel EQ not ok, retraining
    [   80.301312] [drm:intel_ddi_prepare_link_retrain] *ERROR* Timeout waiting for DDI BUF C idle bit

    The above error gives us a hint something went wrong before link
    training started.

    Check for a positive value of active_mst_links and throw in a warning for
    invalid active_mst_links as debug aid.

    Cc: Nathan Ciobanu <nathan.d.ciobanu@linux.intel.com>
    Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
    Tested-by: Nathan Ciobanu <nathan.d.ciobanu@linux.intel.com>
    Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20180718171943.3246-1-dhinakaran.pandiyan@intel.com
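
    Editorial illustration (not part of the commit): the truthiness pitfall
    described in "Do not retrain new links" can be sketched in standalone C.
    The two helper functions are hypothetical and are not the driver's code.
    Only the name active_mst_links and the idea of warning on an invalid
    value come from the commit message.

```c
#include <stdbool.h>
#include <stdio.h>

/* Boolean-style check: any non-zero value passes, including negative ones. */
static bool should_retrain_truthy(int active_mst_links)
{
    return active_mst_links != 0;
}

/* Positive-value check, with a warning for an invalid (negative) count. */
static bool should_retrain_positive(int active_mst_links)
{
    if (active_mst_links < 0)
        fprintf(stderr, "warning: invalid active_mst_links %d\n",
                active_mst_links);
    return active_mst_links > 0;
}
```

    For active_mst_links == -1 the first helper returns true, so a retrain
    would be attempted on a link that was never brought up, while the second
    returns false and logs a warning.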
Comment 3 Radosław Szwichtenberg 2018-07-27 07:20:47 UTC
(In reply to nathan.d.ciobanu from comment #0)
> Created attachment 140705 [details]
> Instrumented driver dmesg
> 
> The i915 driver can get stuck in the link training clock recovery infinite
> loop when a USB-C DP MST hub with two DP displays is hot-plugged. 
> 
Hello!
Could you please confirm whether the problem is fixed with the changes prepared by Dhinakaran Pandiyan?

Thanks!
Comment 4 Nathan Ciobanu 2018-07-27 16:32:51 UTC
Yes, the patches fix the bug and are also being backported to ChromeOS kernels. I also provided my Tested-by for both commits.

Thanks,
Nathan
