Bug 96436

Summary: [BDW/HSW/SNB/IVB][drm:intel_dp_link_training_clock_recovery] *ERROR* too many voltage retries, give up / failure to train DP
Product: DRI
Reporter: Martin Steigerwald <Martin>
Component: DRM/Intel
Assignee: Elio <elio.martinez.monroy>
Status: CLOSED FIXED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal
Priority: medium
CC: chris.cheney, colin, eugene.shatokhin, humberto.i.perez.rodriguez, intel-gfx-bugs, jani.nikula, jeroko, leho, mchehab, patrik.lundquist, peter, solstag
Version: unspecified
Hardware: Other
OS: All
Whiteboard:
i915 platform: BDW, HSW, IVB, SNB
i915 features: display/DP
Attachments:
Description            Flags
pm_rpm-results.json    none
pm_rpm-kern.log        none

Description Martin Steigerwald 2016-06-08 06:38:21 UTC
I know there is bug #70117, but it is about issues with older kernels. This issue started with about kernel 4.5 (maybe 4.4, but I think it was 4.5); I am currently running a 4.6 kernel. It also seems to fail to train the DP link completely in my case. Bug #83516, bug #88919, bug #92878 and other possibly related bugs are also much older.

I basically get this:

[70574.696129] [drm:intel_dp_link_training_clock_recovery] *ERROR* too many voltage retries, give up
[70574.703576] [drm:intel_dp_link_training_clock_recovery] *ERROR* too many voltage retries, give up
[70574.711020] [drm:intel_dp_link_training_clock_recovery] *ERROR* too many voltage retries, give up
[70574.718444] [drm:intel_dp_link_training_clock_recovery] *ERROR* too many voltage retries, give up
[70574.725864] [drm:intel_dp_link_training_clock_recovery] *ERROR* too many voltage retries, give up
[70574.733275] [drm:intel_dp_link_training_clock_recovery] *ERROR* too many voltage retries, give up
[70574.740657] [drm:intel_dp_link_training_clock_recovery] *ERROR* too many voltage retries, give up
[70574.740845] [drm:intel_dp_start_link_train] *ERROR* failed to train DP, aborting

repeatedly.
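
To quantify how often the training failure repeats, the messages can be counted from saved dmesg output. The following is only a minimal sketch, assuming the log line layout shown above; the function name `summarize_dp_training_errors` and the `SAMPLE` text are illustrative and not part of any tool.

```python
import re

# Matches kernel log lines of the form shown in this report, e.g.
# [70574.696129] [drm:intel_dp_start_link_train] *ERROR* failed to train DP, aborting
LINE_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+\[drm:(?P<func>\w+)\]\s+\*ERROR\*\s+(?P<msg>.*)$"
)


def summarize_dp_training_errors(log_text):
    """Count DRM *ERROR* lines per reporting function and return the time span in seconds."""
    counts = {}
    timestamps = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not a DRM error line in the assumed layout
        func = match.group("func")
        counts[func] = counts.get(func, 0) + 1
        timestamps.append(float(match.group("ts")))
    span = max(timestamps) - min(timestamps) if timestamps else 0.0
    return counts, span


# Three lines copied from the log excerpt above.
SAMPLE = """\
[70574.696129] [drm:intel_dp_link_training_clock_recovery] *ERROR* too many voltage retries, give up
[70574.703576] [drm:intel_dp_link_training_clock_recovery] *ERROR* too many voltage retries, give up
[70574.740845] [drm:intel_dp_start_link_train] *ERROR* failed to train DP, aborting
"""

if __name__ == "__main__":
    counts, span = summarize_dp_training_errors(SAMPLE)
    print(counts, round(span, 6))
```

Feeding it the text of a full `dmesg` capture instead of `SAMPLE` would show how many retry bursts end in an aborted link training.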

This happens on a ThinkPad T520 with a Sandy Bridge i5-2520M and one external display at my home office, yet so far I have never seen it with the display in my office. It often happens when hibernating to disk with in-kernel software suspend, where it displays this message between every 10% progress update while saving the image to disk.

Now it also happened after resuming from hibernation. The external display was not active, and I found these messages looping in dmesg and the mouse pointer jerking at regular intervals.

I understand you may be interested in drm.debug=0xe output, but it might take some time until I can capture it, as the message mostly appears during hibernation and nothing is logged to disk at that stage. How much of a performance impact does running the kernel with that debug setting enabled have over a longer period of usual work?
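
For reference, drm.debug is a bitmask of message categories. The sketch below decodes 0xe, assuming the DRM_UT_* category bits as documented for the drm.debug module parameter in kernels of this era; the helper name `decode_drm_debug` is made up for illustration, and the sketch says nothing about the performance cost asked about above.

```python
# Category bits of the drm.debug module parameter (DRM_UT_* values),
# assumed here as documented for kernels around 4.6.
DRM_DEBUG_BITS = {
    0x01: "CORE",    # drm core code
    0x02: "DRIVER",  # drm controller (driver) code
    0x04: "KMS",     # modesetting code
    0x08: "PRIME",   # prime (buffer sharing) code
    0x10: "ATOMIC",  # atomic modesetting code
    0x20: "VBL",     # vblank code
}


def decode_drm_debug(mask):
    """Return the names of the debug categories enabled by the given bitmask."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if mask & bit]


print(decode_drm_debug(0xe))  # -> ['DRIVER', 'KMS', 'PRIME']
```

So 0xe enables driver, modesetting and prime messages but leaves out the core messages. On a running system the same value can usually be written to /sys/module/drm/parameters/debug instead of being passed on the kernel command line.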
Comment 1 Martin Steigerwald 2016-06-08 06:45:23 UTC
The external display at my home office is a Fujitsu P24T-7, 24 inch, Full HD. The laptop has a Full HD internal display. Hmmm, one further difference between the office setup, where I didn't see this, and the home office setup, where I do see this issue: the ThinkPad is docked at home in a Minidock Plus Series 3. I can connect the display to the DVI port of the ThinkPad itself for testing, if you want.

Also, what worked when I had this just now after resuming from hibernation was to disable the external screen in Plasma's System Settings and re-enable it. Since then the external screen works okay. The DP connection is generally a bit flaky: sometimes the display goes black for a short moment, even on some movement of the table or so, but it usually recovers from that. Maybe a better cable could help?


# xrandr
Screen 0: minimum 320 x 200, current 3840 x 1080, maximum 8192 x 8192
LVDS1 connected primary 1920x1080+0+0 (normal left inverted right x axis y axis) 344mm x 193mm
   1920x1080     60.00*+  59.93    50.00  
   1680x1050     59.95    59.88  
   1600x1024     60.17  
   1400x1050     59.98  
   1280x1024     60.02  
   1440x900      59.89  
   1280x960      60.00  
   1360x768      59.80    59.96  
   1152x864      60.00  
   1024x768      60.00  
   800x600       60.32    56.25  
   640x480       59.94  
VGA1 disconnected (normal left inverted right x axis y axis)
HDMI1 disconnected (normal left inverted right x axis y axis)
DP1 disconnected (normal left inverted right x axis y axis)
HDMI2 disconnected (normal left inverted right x axis y axis)
HDMI3 disconnected (normal left inverted right x axis y axis)
DP2 connected 1920x1080+1920+0 (normal left inverted right x axis y axis) 531mm x 299mm
   1920x1080     60.00*+  50.00    50.00    59.94  
   1920x1080i    60.00    50.00    59.94  
   1680x1050     59.95  
   1600x900      60.00  
   1280x1024     75.02    60.02  
   1440x900      59.89  
   1280x720      60.00    50.00    59.94  
   1024x768      75.03    60.00  
   800x600       75.00    60.32  
   720x576       50.00  
   720x480       60.00    59.94  
   640x480       75.00    60.00    59.94  
   720x400       70.08  
DP3 disconnected (normal left inverted right x axis y axis)
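
Disabling and re-enabling the external screen, as described above, could probably also be scripted from this xrandr output. This is only a sketch: it assumes that `xrandr --output NAME --off` followed by `--auto` has the same effect as the Plasma toggle (not verified in this report) and that the internal panel is the output whose name starts with LVDS. The function names are illustrative.

```python
import re

# An output header line looks like "DP2 connected 1920x1080+1920+0 (...)".
OUTPUT_RE = re.compile(r"^(?P<name>\S+) (?P<state>connected|disconnected)\b")


def connected_outputs(xrandr_text):
    """Return the names of all outputs that xrandr reports as connected."""
    names = []
    for line in xrandr_text.splitlines():
        match = OUTPUT_RE.match(line)
        if match and match.group("state") == "connected":
            names.append(match.group("name"))
    return names


def external_outputs(xrandr_text):
    """Connected outputs, excluding the internal panel (assumed to be LVDS*)."""
    return [n for n in connected_outputs(xrandr_text) if not n.startswith("LVDS")]


def toggle_commands(output):
    """Commands that switch one output off and back on with its preferred mode."""
    return [
        ["xrandr", "--output", output, "--off"],
        ["xrandr", "--output", output, "--auto"],
    ]


# Shortened excerpt of the xrandr output above.
SAMPLE = """\
Screen 0: minimum 320 x 200, current 3840 x 1080, maximum 8192 x 8192
LVDS1 connected primary 1920x1080+0+0 (normal left inverted right x axis y axis) 344mm x 193mm
   1920x1080     60.00*+  59.93    50.00
VGA1 disconnected (normal left inverted right x axis y axis)
DP2 connected 1920x1080+1920+0 (normal left inverted right x axis y axis) 531mm x 299mm
   1920x1080     60.00*+  50.00    50.00    59.94
"""
```

Each command list can be passed to `subprocess.run` from within the X session. After `--auto` the screen position may need to be restored, for example with `--right-of LVDS1`.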


# phoronix-test-suite system-info

Phoronix Test Suite v5.2.1
System Information

Hardware:
Processor: Intel Core i5-2520M @ 3.20GHz (4 Cores), Motherboard: LENOVO 42433WG, Chipset: Intel 2nd Generation Core Family DRAM, Memory: 16384MB, Disk: 300GB INTEL SSDSA2CW30 + 480GB Crucial_CT480M50, Graphics: Intel HD 3000 (1300MHz), Audio: Conexant CX20590, Monitor: P24T-7 LED, Network: Intel 82579LM Gigabit Connection + Intel Centrino Advanced-N 6205

Software:
OS: Debian unstable, Kernel: 4.6.0-tp520-btrfstrim+ (x86_64), Desktop: KDE Frameworks 5, Display Server: X Server 1.18.3, Display Driver: intel 2.99.917, OpenGL: 3.3 Mesa 11.2.2, Compiler: GCC 5.4.0 20160603, File-System: btrfs, Screen Resolution: 3840x1080
Comment 2 yann 2016-06-08 17:28:45 UTC
*** Bug 70117 has been marked as a duplicate of this bug. ***
Comment 3 yann 2016-06-08 17:29:59 UTC
*** Bug 83516 has been marked as a duplicate of this bug. ***
Comment 4 yann 2016-06-08 17:30:24 UTC
*** Bug 88919 has been marked as a duplicate of this bug. ***
Comment 5 yann 2016-06-08 17:35:25 UTC
Jim and Maarten, please use this bug as the main bug for this topic. We had 6 bugs with the same root cause, so we are taking this one as the main one.
Comment 6 yann 2016-06-08 17:38:05 UTC
*** Bug 92878 has been marked as a duplicate of this bug. ***
Comment 7 yann 2016-06-08 17:42:21 UTC
*** Bug 96229 has been marked as a duplicate of this bug. ***
Comment 8 Peter Wu 2016-06-08 22:05:17 UTC
Bug 92878 was closed as a duplicate of this one, but it contains a reproducer and logs in https://bugs.freedesktop.org/show_bug.cgi?id=92878#c18. That bug has worse effects: the screen is simply blacked out.

Meanwhile I discovered that this also helps to revive the screen:
 xset dpms force off; xset dpms force on

This was tested with Linux 4.6.1-1 (Arch Linux), also with KDE Plasma 5. The CPU/GPU is an i7-6700HQ.
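
The DPMS toggle above can be wrapped in a small helper, for instance to bind it to a hotkey. This is a minimal sketch that only calls the two xset commands quoted in this comment; the `pause` parameter is an added assumption (the original runs both commands back to back), and `revive_display` is a hypothetical name.

```python
import subprocess
import time

# The two commands from the workaround above.
DPMS_OFF = ["xset", "dpms", "force", "off"]
DPMS_ON = ["xset", "dpms", "force", "on"]


def revive_display(run=subprocess.run, pause=0.0):
    """Force DPMS off and then on again, optionally waiting in between.

    `run` is injectable so that the command sequence can be checked
    without an X session; by default the commands are really executed.
    """
    run(DPMS_OFF, check=True)
    if pause > 0:
        time.sleep(pause)  # assumption: not part of the original workaround
    run(DPMS_ON, check=True)
```

Calling `revive_display()` from within the X session runs the same sequence as the one-liner; with `check=True` a failing xset call raises an exception instead of being silently ignored.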
Comment 9 cprigent 2016-06-16 11:59:15 UTC
Created attachment 124557 [details]
pm_rpm-results.json

I reproduced it on APL by executing pm_rpm.
For example: ./pm_rpm --run-subtest pm-tiling

Platform: Broxton P A0 Platform 
CPU Name : Intel(R) @ 1.2 GHz (family: 6, model: 92, stepping: 8) – 4 cores
SoC : BROXTON-P A0
QDF : QYE2
CRB : Apollo Lake RVPC1 Fab1
Software 
Bios: APLKRVPA.X64.0116.R20.1512211905
KSC: 1.05
Linux distribution: Ubuntu 15.10 64 bits
Kernel: drm-intel-nightly 4.7.0-rc2 1d755f1 from http://cgit.freedesktop.org/drm-intel/
  commit 1d755f1572d845372fdb004c249b52b8ffc02535
  Author: Daniel Vetter <daniel.vetter@ffwll.ch>
  Date:   Wed Jun 15 22:21:13 2016 +0200
  drm-intel-nightly: 2016y-06m-15d-20h-20m-54s UTC integration manifest
drm: libdrm-2.4.68 625d181 from git://anongit.freedesktop.org/mesa/drm
mesa: mesa-11.1.2 7bcd827 from git://anongit.freedesktop.org/mesa/mesa
cairo: 1.15.2 db8a7f1 from git://anongit.freedesktop.org/cairo
DMC 1.07
GuC 8.7
intel-gpu-tools 1.15 f5d370c from http://anongit.freedesktop.org/git/xorg/app/intel-gpu-tools.git
Comment 10 cprigent 2016-06-16 12:00:28 UTC
Created attachment 124558 [details]
pm_rpm-kern.log
Comment 11 Imre Deak 2016-06-16 17:09:47 UTC
(In reply to cprigent from comment #10)
> Created attachment 124558 [details]
> pm_rpm-kern.log

Could you try the following on APL:
https://lists.freedesktop.org/archives/intel-gfx/2016-June/098595.html
Comment 12 Imre Deak 2016-06-22 16:48:19 UTC
(In reply to Imre Deak from comment #11)
> (In reply to cprigent from comment #10)
> > Created attachment 124558 [details]
> > pm_rpm-kern.log
> 
> Could you try the following on APL:
> https://lists.freedesktop.org/archives/intel-gfx/2016-June/098595.html

The patchset is merged now, please retry with -nightly on APL.
Comment 13 Imre Deak 2016-06-27 12:35:19 UTC
(In reply to Imre Deak from comment #12)
> (In reply to Imre Deak from comment #11)
> > (In reply to cprigent from comment #10)
> > > Created attachment 124558 [details]
> > > pm_rpm-kern.log
> > 
> > Could you try the following on APL:
> > https://lists.freedesktop.org/archives/intel-gfx/2016-June/098595.html
> 
> The patchset is merged now, please retry with -nightly on APL.

Removing APL based on the above.
Comment 14 Ricardo 2017-02-22 16:35:25 UTC
Christophe, it seems that a patch was provided; can you please retest?
Comment 15 cprigent 2017-03-10 14:32:15 UTC
I confirm this is not reproduced on APL with boot, suspend to disk and IGT test pm_rpm@pm-tiling.

Tested with:
Platform BXT-P: APL system
CPU Name : Intel(R) Genuine Processor @ 1.1 GHz (family: 6, model: 12, stepping: 9) 4 cores
QDF : Q6HE
SoC : B1
CRB : Apollo Lake DDR3L RVP1A FAB2
Reworks: R19, R20

Software 
Bios: 144_B10 APLK_B0_IFWI_X64_R_2016_06_27_0956_SPI_RVP1.bin from \\gar\ec\proj\ba\CCG\APL BIOS\External\BIOS_Release\Daily\v144_10_2016_WW27.1\IFWI\IFWI_RVP1_Release\IFWI
KSC: 1.15
Linux distribution: Ubuntu 16.04 64 bits
DMC 1.07
GuC 8.7

Kernel: 4.11.0-rc1 e060007 branch drm-tip from https://cgit.freedesktop.org/drm-tip
  commit e06000745435e65b4c056fe8f5bf149b298a0526
  Author: Chris Wilson <chris@chris-wilson.co.uk>
  Date:   Mon Mar 6 14:40:03 2017 +0000
  drm-tip: 2017y-03m-06d-14h-39m-38s UTC integration manifest

libdrm-2.4.75-10 f6499b1 from git://anongit.freedesktop.org/mesa/drm
mesa: mesa-17.0.0 683462e from git://anongit.freedesktop.org/mesa/mesa
cairo 1.15.4 9fe6683 from git://anongit.freedesktop.org/cairo
xorg-server-1.19.0-125 7d7788e from git://git.freedesktop.org/git/xorg/xserver
xf86-video-intel 2.99.917-758 860c366 from git://git.freedesktop.org/git/xorg/driver/xf86-video-intel
libva-1.7.3.pre1-84 e613327 from git://git.freedesktop.org/git/vaapi/libva 
vaapi-intel-driver: 1.7.3-325 03a86fc from git://git.freedesktop.org/git/vaapi/intel-driver
intel-gpu-tools-1.17-261 8f3164f from http://anongit.freedesktop.org/git/xorg/app/intel-gpu-tools.git

External screens: ASUS PB238Q (HDMI), LG 25UM55D (DP)
Comment 16 cprigent 2017-03-10 14:33:02 UTC
Reassigned to Elio to check on the other platforms before closing it.
