Bug 107489 - [SKL] Sometimes fails to DPMS on
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Manasi
QA Contact: Intel GFX Bugs mailing list
Whiteboard: Triaged, ReadyForDev
 
Reported: 2018-08-05 21:22 UTC by Matt Turner
Modified: 2018-11-06 18:33 UTC

See Also:
i915 platform: SKL
i915 features: display/DP


Attachments
drm-debug.log.gz (139.11 KB, application/gzip), 2018-08-14 20:51 UTC, Matt Turner
drm-debug.log (3.17 MB, text/plain), 2018-08-15 06:30 UTC, Jani Saarinen
drm-debug.log with Manasi's patch (282.93 KB, text/plain), 2018-08-21 22:46 UTC, Matt Turner
version 2 of test patch (4.95 KB, patch), 2018-08-22 01:22 UTC, Manasi

Description Matt Turner 2018-08-05 21:22:06 UTC
On my Lenovo P50, I'd guess 1/4 of the time it attempts to DPMS on, it fails to modeset, leaving me with a black screen.

In my dmesg I see:

[drm:intel_dp_start_link_train] *ERROR* [CONNECTOR:71:eDP-1] Link Training failed at link rate = 540000, lane count = 4

This has occurred with every kernel I've used on this machine, 4.9 to 4.17.5.

Typically closing the lid and letting the machine suspend/resume is sufficient to get it back.

# lspci -s 0:02 -nn
00:02.0 VGA compatible controller [0300]: Intel Corporation HD Graphics P530 [8086:191d] (rev 06)

# xrandr -q
Screen 0: minimum 320 x 200, current 3840 x 2160, maximum 8192 x 8192
eDP-1 connected primary 3840x2160+0+0 (normal left inverted right x axis y axis) 346mm x 194mm
   3840x2160     60.00*+

Running xserver-1.19.5 and the modesetting driver.
Comment 1 Francesco Balestrieri 2018-08-09 08:23:23 UTC
Could you try to reproduce the error using drm-tip (https://cgit.freedesktop.org/drm-tip) and kernel parameters drm.debug=0x1e log_buf_len=4M, and if the problem persists attach the full dmesg from boot. Thanks!
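For readers unfamiliar with the drm.debug parameter requested above: it is a bitmask that selects debug message categories. The following is a minimal sketch of how such a mask decodes. The bit-to-name table reflects my understanding of the kernel's DRM debug categories and should be treated as an assumption to check against your kernel version; the function name `decode_drm_debug` is purely illustrative.

```python
# Debug category bits as I understand them from the kernel's DRM debug
# documentation. ASSUMPTION: verify against the kernel version you run.
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
}


def decode_drm_debug(mask):
    """Return the names of the debug categories enabled by the given mask."""
    return [name for bit, name in DRM_DEBUG_BITS.items() if mask & bit]


if __name__ == "__main__":
    # The value requested in this bug report.
    print(decode_drm_debug(0x1e))
```

Under that table, 0x1e enables the driver, KMS, PRIME and atomic categories while leaving the very verbose core category off.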
Comment 2 Jani Saarinen 2018-08-13 09:59:30 UTC
Matt, any luck testing latest drm-tip?
Comment 3 Matt Turner 2018-08-13 15:41:20 UTC
To be honest, I'm loath to test drm-tip without some evidence that it might actually fix things. I've had this problem with 8 major kernel versions dating back to 2016.
Comment 4 Jani Saarinen 2018-08-14 06:29:35 UTC
Imre, Manasi, any help here?
Comment 5 Imre Deak 2018-08-14 11:45:58 UTC
(In reply to Matt Turner from comment #3)
> To be honest, I'm loath to test drm-tip without some evidence that it might
> actually fix things. I've had this problem with 8 major kernel versions
> dating back to 2016.

Matt, could you still provide the full drm.debug=0x1e dmesg log with your current kernel up to the dpms off?
Comment 6 Imre Deak 2018-08-14 11:47:18 UTC
(In reply to Imre Deak from comment #5)
> (In reply to Matt Turner from comment #3)
> > To be honest, I'm loath to test drm-tip without some evidence that it might
> > actually fix things. I've had this problem with 8 major kernel versions
> > dating back to 2016.
> 
> Matt, could you still provide the full drm.debug=0x1e dmesg log with your
> current kernel up to the dpms off?

I mean dpms off/on.
Comment 7 Matt Turner 2018-08-14 20:51:39 UTC
Created attachment 141086
drm-debug.log.gz

The attached log was captured with drm.debug=0x1e log_buf_len=4M

Snippet of relevant part (I think):

Aug 14 13:44:32 p50 kernel: [drm:edp_panel_on] Wait for panel power on
Aug 14 13:44:32 p50 kernel: [drm:wait_panel_status] mask b000000f value 80000008 status 0000000a control 00000003
Aug 14 13:44:32 p50 kernel: [drm:gen8_de_irq_handler] hotplug event received, stat 0x01000000, dig 0x12101010, pins 0x00000010
Aug 14 13:44:32 p50 kernel: [drm:intel_hpd_irq_handler] digital hpd port A - long
Aug 14 13:44:32 p50 kernel: [drm:intel_hpd_irq_handler] Received HPD interrupt on PIN 4 - cnt: 0
Aug 14 13:44:32 p50 kernel: [drm:intel_dp_hpd_pulse] ignoring long hpd on eDP port A
Aug 14 13:44:32 p50 kernel: [drm:wait_panel_status] Wait complete
Aug 14 13:44:32 p50 kernel: [drm:intel_power_well_enable] enabling DDI A/E IO power well
Aug 14 13:44:32 p50 kernel: [drm:edp_panel_vdd_on] Turning eDP port A VDD on
Aug 14 13:44:32 p50 kernel: [drm:edp_panel_vdd_on] PP_STATUS: 0x80000008 PP_CONTROL: 0x0000000b
Aug 14 13:44:32 p50 kernel: [drm:intel_dp_set_signal_levels] Using signal levels 00000000
Aug 14 13:44:32 p50 kernel: [drm:intel_dp_set_signal_levels] Using vswing level 0
Aug 14 13:44:32 p50 kernel: [drm:intel_dp_set_signal_levels] Using pre-emphasis level 0
Aug 14 13:44:32 p50 kernel: [drm:intel_dp_program_link_training_pattern] Using DP training pattern TPS1
Aug 14 13:44:32 p50 kernel: [drm:intel_dp_start_link_train] clock recovery OK
Aug 14 13:44:32 p50 kernel: [drm:intel_dp_start_link_train] 5.4 Gbps link rate without sink TPS3 support
Aug 14 13:44:32 p50 kernel: [drm:intel_dp_program_link_training_pattern] Using DP training pattern TPS2
Aug 14 13:44:32 p50 kernel: [drm:intel_dp_set_signal_levels] Using signal levels 08000000
Aug 14 13:44:32 p50 kernel: [drm:intel_dp_set_signal_levels] Using vswing level 2
Aug 14 13:44:32 p50 kernel: [drm:intel_dp_set_signal_levels] Using pre-emphasis level 1
Aug 14 13:44:32 p50 kernel: [drm:intel_dp_set_signal_levels] Using signal levels 08000000
Aug 14 13:44:32 p50 kernel: [drm:intel_dp_set_signal_levels] Using vswing level 2
Aug 14 13:44:32 p50 kernel: [drm:intel_dp_set_signal_levels] Using pre-emphasis level 1
Aug 14 13:44:32 p50 kernel: [drm:intel_dp_set_signal_levels] Using signal levels 08000000
Aug 14 13:44:32 p50 kernel: [drm:intel_dp_set_signal_levels] Using vswing level 2
Aug 14 13:44:32 p50 kernel: [drm:intel_dp_set_signal_levels] Using pre-emphasis level 1
Aug 14 13:44:32 p50 kernel: [drm:intel_dp_set_signal_levels] Using signal levels 08000000
Aug 14 13:44:32 p50 kernel: [drm:intel_dp_set_signal_levels] Using vswing level 2
Aug 14 13:44:32 p50 kernel: [drm:intel_dp_set_signal_levels] Using pre-emphasis level 1
Aug 14 13:44:32 p50 kernel: [drm:intel_dp_set_signal_levels] Using signal levels 08000000
Aug 14 13:44:32 p50 kernel: [drm:intel_dp_set_signal_levels] Using vswing level 2
Aug 14 13:44:32 p50 kernel: [drm:intel_dp_set_signal_levels] Using pre-emphasis level 1
Aug 14 13:44:32 p50 kernel: [drm:intel_dp_dump_link_status] ln0_1:0x11 ln2_3:0x11 align:0x0 sink:0x0 adj_req0_1:0x66 adj_req2_3:0x66
Aug 14 13:44:32 p50 kernel: [drm:intel_dp_start_link_train] Channel equalization failed 5 times
Aug 14 13:44:32 p50 kernel: [drm:intel_dp_start_link_train] *ERROR* [CONNECTOR:71:eDP-1] Link Training failed at link rate = 540000, lane count = 4
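The intel_dp_dump_link_status line in the snippet above packs per-lane state into hex bytes. As a rough sketch, assuming the standard DisplayPort DPCD layout (one nibble per lane; status bit 0 = clock recovery done, bit 1 = channel equalization done, bit 2 = symbol locked; adjust request bits 0-1 = voltage swing, bits 2-3 = pre-emphasis), it can be decoded as follows. The helper names are illustrative and not part of any kernel or library API.

```python
import re

# Per-lane status bits, assuming the DPCD LANEx_y_STATUS layout.
CR_DONE = 0x1
CHANNEL_EQ_DONE = 0x2
SYMBOL_LOCKED = 0x4

# The line as it appears in the log above, minus the syslog prefix.
LINE = ("[drm:intel_dp_dump_link_status] ln0_1:0x11 ln2_3:0x11 align:0x0 "
        "sink:0x0 adj_req0_1:0x66 adj_req2_3:0x66")


def parse_fields(line):
    """Extract name:0xVALUE pairs from a link-status debug line."""
    pairs = re.findall(r"(\w+):0x([0-9a-fA-F]+)", line)
    return {name: int(value, 16) for name, value in pairs}


def lane_nibbles(fields, prefix):
    """Split the two packed bytes into four per-lane nibbles (lanes 0..3)."""
    lo, hi = fields[prefix + "0_1"], fields[prefix + "2_3"]
    return [lo & 0xF, lo >> 4, hi & 0xF, hi >> 4]


def decode(line):
    """Return one dict per lane with status flags and requested drive levels."""
    fields = parse_fields(line)
    lanes = []
    for status, adj in zip(lane_nibbles(fields, "ln"),
                           lane_nibbles(fields, "adj_req")):
        lanes.append({
            "cr_done": bool(status & CR_DONE),
            "eq_done": bool(status & CHANNEL_EQ_DONE),
            "symbol_locked": bool(status & SYMBOL_LOCKED),
            "req_vswing": adj & 0x3,
            "req_pre_emphasis": (adj >> 2) & 0x3,
        })
    return lanes


if __name__ == "__main__":
    for index, lane in enumerate(decode(LINE)):
        print(index, lane)
```

Read this way, all four lanes report clock recovery done but neither channel equalization nor symbol lock, and the sink keeps requesting vswing level 2 with pre-emphasis level 1, which matches the repeated signal-level lines before the "Channel equalization failed 5 times" message.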
Comment 8 Jani Saarinen 2018-08-15 06:30:03 UTC
Created attachment 141099
drm-debug.log

Debug log as plain text.
Comment 9 Matt Turner 2018-08-19 21:55:34 UTC
I'm currently testing Manasi's https://patchwork.freedesktop.org/patch/223573/ on top of 4.17.5.
Comment 10 Matt Turner 2018-08-21 22:40:40 UTC
With that patch, the results have been markedly better. After a few days of use, I just got my first black screen on DPMS on. I don't see any messages about link training failing in journalctl. Not sure I can spot any errors in the log.
Comment 11 Matt Turner 2018-08-21 22:46:36 UTC
Created attachment 141227
drm-debug.log with Manasi's patch

Actually, found it, I was wrong. I see the same Link Training failed message. The attached log contains both the failure and the 'Link Training Passed' message after I suspended the laptop.
Comment 12 Manasi 2018-08-22 01:22:42 UTC
Created attachment 141229
version 2 of test patch
Comment 13 Manasi 2018-08-22 01:24:27 UTC
In this case the patch doesn't have any effect, and the kernel fails to send a uevent to userspace asking it to retrain. So I have uploaded v2 of the patch, which unconditionally sends the uevent, as per Chris Wilson's suggestion.
Userspace will redo the modeset a number of times until it decides to give up.
Please try the uploaded v2.

Manasi
Comment 14 Matt Turner 2018-08-29 15:49:53 UTC
Thank you so much. v2 seems to have solved the problem! I've been using it for a week on top of 4.17.5 and have not seen the issue once.

Please have a 

Tested-by: Matt Turner <mattst88@gmail.com>
Comment 15 Manasi 2018-08-29 16:14:06 UTC
That's great! I will submit patch v2 to the mailing list and add your Tested-by tag to it.

Manasi
Comment 16 Jani Nikula 2018-10-24 07:33:37 UTC
Presumed fixed by

commit 1e712535c51ab025ebc776d4405683d81521996d
Author: Manasi Navare <manasi.d.navare@intel.com>
Date:   Tue Oct 9 14:28:04 2018 -0700

    drm/i915/dp: Link train Fallback on eDP only if fallback link BW can fit panel's native mode

thanks for the report and testing.
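To illustrate the idea in that commit title (only fall back to a lower link configuration if it can still carry the panel's native mode), here is a back-of-the-envelope sketch. It assumes the logged link rate (540000) is the per-lane symbol clock in kHz with 8 data bits per symbol under 8b/10b encoding, which is my understanding of the i915 convention. The pixel clock below is a hypothetical value for a 3840x2160 at 60 Hz reduced-blanking mode, not a figure taken from this panel's EDID, and the function names are illustrative rather than the driver's own.

```python
# ASSUMPTION: hypothetical pixel clock (kHz) for 3840x2160@60 with reduced
# blanking. The real value for the reporter's panel is not in this report.
ASSUMED_PIXEL_CLOCK_KHZ = 533_250


def max_data_rate(link_clock_khz, lanes):
    """Payload capacity in kB/s: one data byte per lane per symbol clock."""
    return link_clock_khz * lanes


def link_required(pixel_clock_khz, bpp):
    """Bandwidth in kB/s needed by a mode at the given bits per pixel, rounded up."""
    return (pixel_clock_khz * bpp + 7) // 8


def mode_fits(link_clock_khz, lanes, pixel_clock_khz, bpp):
    """True if the link configuration can carry the mode."""
    return link_required(pixel_clock_khz, bpp) <= max_data_rate(link_clock_khz, lanes)


if __name__ == "__main__":
    for clock, lanes in [(540000, 4), (270000, 4), (540000, 2)]:
        ok = mode_fits(clock, lanes, ASSUMED_PIXEL_CLOCK_KHZ, 24)
        print(f"link clock {clock} x{lanes} lanes at 24 bpp: fits={ok}")
```

Under these assumptions only the full 540000 x 4 configuration carries the native mode, so on a panel like this a fallback to a lower rate or fewer lanes could not light up the native mode, and retrying at the same parameters is the only useful option.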
Comment 17 nanericwang 2018-10-28 05:49:57 UTC
Hi Intel team,

I have a similar bug, filed at https://bugzilla.kernel.org/show_bug.cgi?id=201547, but I'm not sure whether it has the same root cause as this one.

I attached dmesg in bug #201547 (https://bugzilla.kernel.org/attachment.cgi?id=279233), with kernel parameters drm.debug=0x1e log_buf_len=4M.

Can you help to confirm?

Thanks a lot!
Comment 18 James Ausmus 2018-11-06 18:33:43 UTC
From the linked kernel log, I'm not seeing any of the same "[drm:intel_dp_start_link_train] *ERROR* [CONNECTOR:71:eDP-1] Link Training failed at link rate = 540000, lane count = 4" type of messages.

Please try with drm-tip, and see if your problem persists. If it does, please open a new FDO bug with your dmesg output from drm-tip with drm.debug=0xe

