Bug 96914 - [APL / SKL] too many voltage retries / CPU pipe B FIFO underrun when suspending to disk and crash after 25 iterations
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium critical
Assignee: Mauro Carvalho Chehab
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-07-13 13:18 UTC by cprigent
Modified: 2017-01-31 15:34 UTC
CC List: 1 user

See Also:
i915 platform: BXT, SKL
i915 features: display/atomic, display/DP, power/suspend-resume


Attachments
dmesg-error (103.88 KB, text/plain)
2016-07-13 13:18 UTC, cprigent
dmesg with the patch applied on Skull Canyon (83.26 KB, text/plain)
2016-10-04 15:53 UTC, Mauro Carvalho Chehab

Description cprigent 2016-07-13 13:18:40 UTC
Created attachment 125052
dmesg-error

Platform: APL system
CPU Name : Intel(R) Genuine Processor @ 1.1 GHz (family: 6, model: 12, stepping: 9) 4 cores
QDF : Q6HE
SoC : B1
CRB : Apollo Lake DDR3L RVP1A FAB2
Reworks : R19, R20

Software 
Bios: 144_B10 APLK_B0_IFWI_X64_R_2016_06_27_0956_SPI_RVP1.bin from \\gar\ec\proj\ba\CCG\APL BIOS\External\BIOS_Release\Daily\v144_10_2016_WW27.1\IFWI\IFWI_RVP1_Release\IFWI
KSC: 1.15
Linux distribution: Ubuntu 16.04 64 bits
Kernel: tag drm-intel-testing-2016-07-11 4.7.0-rc6 0230e3c from http://cgit.freedesktop.org/drm-intel/
commit 0230e3c4eb76cf8f57cf40db0e908b96b84e3911
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sun Jul 10 13:24:46 2016 +0100
drm-intel-nightly: 2016y-07m-10d-12h-23m-38s UTC integration manifest
drm: libdrm-2.4.68-14 8c8d5dd from git://anongit.freedesktop.org/mesa/drm
mesa: mesa-11.2.2 3a9f628 from git://anongit.freedesktop.org/mesa/mesa
cairo: 1.15.2 db8a7f1 from git://anongit.freedesktop.org/cairo
xserver: xorg-server-1.18.0-454 033888e from git://git.freedesktop.org/git/xorg/xserver
xf86-video-intel: 2.99.917-676 26f8ab5 from git://git.freedesktop.org/git/xorg/driver/xf86-video-intel
libva: libva-1.7.0-26 c36971c from git://git.freedesktop.org/git/vaapi/libva
vaapi-intel-driver: 1.7.0-53 bcde10d from git://git.freedesktop.org/git/vaapi/intel-driver
GuC 8.7
DMC 1.07 from https://01.org/linuxgraphics/downloads/broxton-dmc-1.07

Steps:
------
1. Suspend to RAM and resume:
Execute commands
sudo -s 
echo mem > /sys/power/state
Wait 60 seconds
Resume with keyboard
2. Wait 30 seconds
3. Repeat previous steps several times
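
The manual steps above could be automated roughly as follows. This is a minimal sketch, not the reporter's actual test script: it swaps the keyboard resume for an RTC wake alarm set through `rtcwake` from util-linux (an assumption about the test machine), and the function names `suspend_command` and `run_cycles` are hypothetical. Actually suspending requires root.

```python
import subprocess
import time


def suspend_command(sleep_seconds=60):
    """Build an rtcwake invocation: suspend to RAM, wake after sleep_seconds."""
    # -m mem selects suspend-to-RAM (the same state as `echo mem > /sys/power/state`);
    # -s sets the RTC wake alarm, in seconds from now.
    return ["rtcwake", "-m", "mem", "-s", str(sleep_seconds)]


def run_cycles(iterations, sleep_seconds=60, settle_seconds=30,
               runner=subprocess.run, wait=time.sleep):
    """Suspend and resume `iterations` times; return how many cycles completed."""
    completed = 0
    for _ in range(iterations):
        result = runner(suspend_command(sleep_seconds))
        if getattr(result, "returncode", 0) != 0:
            break  # stop at the first failed suspend attempt
        wait(settle_seconds)  # step 2: wait 30 seconds after resume
        completed += 1
    return completed
```

Calling `run_cycles(25)` as root would attempt the 25 iterations mentioned in this report. `runner` and `wait` are injectable so the loop can be dry-run without suspending the machine.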

Actual results:
---------------
2. The log shows errors such as:
[ 2589.334581] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
[ 2589.341383] [drm:intel_dp_link_training_clock_recovery [i915]] *ERROR* too many voltage retries, give up
3. DUT crashed after 25 iterations
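
To track how often these two messages occur across iterations, a dmesg dump can be scanned for them. The following is a rough sketch: the two signatures are copied from the log lines above, while the `summarize` helper and its return format are illustrative assumptions.

```python
import re

# Error signatures copied from the log excerpt in this report.
UNDERRUN = re.compile(r"\*ERROR\* CPU pipe ([A-Z]) FIFO underrun")
VOLTAGE = "*ERROR* too many voltage retries, give up"


def summarize(dmesg_text):
    """Count the two i915 error signatures in a dmesg dump."""
    underruns = {}  # pipe letter -> number of underrun messages
    voltage_failures = 0
    for line in dmesg_text.splitlines():
        match = UNDERRUN.search(line)
        if match:
            pipe = match.group(1)
            underruns[pipe] = underruns.get(pipe, 0) + 1
        elif VOLTAGE in line:
            voltage_failures += 1
    return {"underruns": underruns, "voltage_failures": voltage_failures}


sample = """\
[ 2589.334581] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
[ 2589.341383] [drm:intel_dp_link_training_clock_recovery [i915]] *ERROR* too many voltage retries, give up
"""
print(summarize(sample))
```

Running it on the output of `dmesg` after each resume would show whether the counts grow with every cycle.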

Expected result:
----------------
No such error
Suspend to disk and resume works each time
Comment 1 cprigent 2016-07-13 13:19:43 UTC
Related bugs:
*ERROR* CPU pipe B FIFO underrun:
Bug 94605, Bug 94884, Bug 96226, Bug 96851

*ERROR* too many voltage retries, give up:
Bug 96436
Comment 2 Imre Deak 2016-07-22 16:27:44 UTC
Could you try the following:

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 78beb7e..3c0c2d6 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -13682,12 +13682,14 @@ static void intel_atomic_commit_tail(struct drm_atomic_state *state)
 			 */
 			intel_check_cpu_fifo_underruns(dev_priv);
 			intel_check_pch_fifo_underruns(dev_priv);
-
-			if (!crtc->state->active)
-				intel_update_watermarks(crtc);
 		}
 	}
 
+	for_each_crtc_in_state(state, crtc, old_crtc_state, i) {
+		if (needs_modeset(crtc->state) && !crtc->state->active)
+			intel_update_watermarks(crtc);
+	}
+
 	/* Only after disabling all output pipelines that will be changed can we
 	 * update the the output configuration. */
 	intel_modeset_update_crtc_state(state);
Comment 3 Imre Deak 2016-08-01 14:03:26 UTC
Ping: could you try the patch at
https://lists.freedesktop.org/archives/intel-gfx/2016-July/101462.html
Comment 4 yann 2016-08-01 14:12:45 UTC
Christophe, please try with Imre's patch (also available at https://patchwork.freedesktop.org/series/10206/)
Comment 5 Mauro Carvalho Chehab 2016-10-04 15:52:48 UTC
(In reply to yann from comment #4)
> Christophe, please try with Imre's patch (also available at
> https://patchwork.freedesktop.org/series/10206/)

I'm having the same trouble with a Skull Canyon NUC. These are the relevant drm messages:

[    8.603825] [drm] Initialized drm 1.1.0 20060810
[    9.630681] [drm] Found 128MB of eDRAM
[    9.630906] [drm] Memory usable by graphics device = 4096M
[    9.630908] fb: switching to inteldrmfb from EFI VGA
[    9.630973] [drm] Replacing VGA console driver
[    9.637229] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[    9.637230] [drm] Driver supports precise vblank timestamp query.
[    9.643972] [drm] Finished loading i915/skl_dmc_ver1_26.bin (v1.26)
[    9.911954] [drm] failed to retrieve link info, disabling eDP
[    9.913633] [drm] GuC firmware load skipped
[   10.346209] [drm] Initialized i915 1.6.0 20160711 for 0000:00:02.0 on minor 0
[   10.668749] fbcon: inteldrmfb (fb0) is primary device
[   10.879006] [drm:intel_dp_link_training_clock_recovery [i915]] *ERROR* too many voltage retries, give up
[   10.915657] [drm:intel_dp_link_training_clock_recovery [i915]] *ERROR* too many voltage retries, give up
[   11.354708] i915 0000:00:02.0: fb0: inteldrmfb frame buffer device
[   11.792904] [drm] RC6 on
[   15.501684] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
[  283.095160] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe C FIFO underrun


The CPU is:

model name	: Intel(R) Core(TM) i7-6770HQ CPU @ 2.60GHz

The dmidecode relevant data:

  BIOS Information
        Vendor: American Megatrends Inc.
        Version: KYSKLi70.86A.0041.2016.0817.1130
        Release Date: 08/17/2016

  Base Board Information
        Manufacturer: Intel Corporation
        Product Name: NUC6i7KYB
        Version: H90766-404

BIOS is using the default settings.

In my case, I'm using 3 monitors, one on each port (mini-DP, HDMI and USB-C), as shown via xrandr:

  DP-1 connected 1920x1080+3840+0 (normal left inverted right x axis y axis) 510mm x 287mm
  DP-2 connected 1920x1080+0+0 (normal left inverted right x axis y axis) 509mm x 286mm
  HDMI-3 connected primary 1920x1080+1920+0 (normal left inverted right x axis y axis) 886mm x 498mm

I tested applying https://patchwork.freedesktop.org/series/10206/ on top of kernel 4.8. No change.
Comment 6 Mauro Carvalho Chehab 2016-10-04 15:53:57 UTC
Created attachment 126998
dmesg with the patch applied on Skull Canyon
Comment 7 Imre Deak 2016-10-11 11:19:17 UTC
(In reply to Mauro Carvalho Chehab from comment #6)
> Created attachment 126998
> dmesg with the patch applied on Skull Canyon

This doesn't look like the same problem as the one originally reported: in your case the problem happens without a suspend/resume cycle. If that's really the case, could you file a separate bug for it?
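
One way to check this against a log is to list when each error first appears relative to boot. The sketch below assumes the usual `[seconds.micros] message` dmesg prefix and uses lines copied from the excerpt in comment 5; the `error_times` helper is hypothetical.

```python
import re

# Matches a kernel log line of the form "[   10.879006] message".
LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def error_times(dmesg_text):
    """Return (seconds_since_boot, message) for each line tagged *ERROR*."""
    hits = []
    for line in dmesg_text.splitlines():
        match = LINE.match(line.strip())
        if match and "*ERROR*" in match.group(2):
            hits.append((float(match.group(1)), match.group(2)))
    return hits


excerpt = """\
[   10.668749] fbcon: inteldrmfb (fb0) is primary device
[   10.879006] [drm:intel_dp_link_training_clock_recovery [i915]] *ERROR* too many voltage retries, give up
[   15.501684] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
[  283.095160] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe C FIFO underrun
"""
for seconds, message in error_times(excerpt):
    print(f"{seconds:10.3f}s  {message}")
```

In that excerpt the first error is logged about 10.9 seconds after boot, during display initialization, which is consistent with the problem not depending on a suspend/resume cycle.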
Comment 8 Jani Saarinen 2016-12-09 10:52:53 UTC
Is this still valid?
Comment 9 Rami 2017-01-31 15:21:41 UTC
This bug is no longer reproduced with the latest setup on either the BXT or the SKL platform, but we now see a different error.
setup:
=========
SKLS1:
------
Platform SKL Gigabyte
CPU: Intel(R) Core(TM) i5-6600 CPU @ 3.30GHz (family 6, model 94, stepping 3)
GPU: Intel® HD Graphics 530 - Intel Corporation Sky Lake Integrated Graphics (rev 06)
Motherboard version: H170N-WIFI-CF
Memory: 2x 4GB Kingston 9905622-055.A00G

Software
Bios: F3
Linux distribution: Ubuntu 16.04 64 bits
DMC 1.26 from https://01.org/sites/default/files/downloads/intelr-graphics-linux/skldmcver126.tar_1.bz2
GUC 6.1 from https://01.org/sites/default/files/downloads/intelr-graphics-linux/sklgucver61.tar.bz2

BXTP6:
------
Platform BXT-P: APL system
CPU Name : Intel(R) Genuine Processor @ 1.1 GHz (family: 6, model: 12, stepping: 9) 4 cores
QDF : Q6HE
SoC : B1
CRB : Apollo Lake DDR3L RVP1A FAB2
Reworks: R19, R20

Software 
Bios: 144_B10 APLK_B0_IFWI_X64_R_2016_06_27_0956_SPI_RVP1.bin from \\gar\ec\proj\ba\CCG\APL BIOS\External\BIOS_Release\Daily\v144_10_2016_WW27.1\IFWI\IFWI_RVP1_Release\IFWI
KSC: 1.15
Linux distribution: Ubuntu 16.04 64 bits
DMC 1.07
GuC 8.7
Kernel: drm-tip: 2017y-01m-30d-21h-14m-37s UTC integration manifest
commit 123d798c350471aba7e0625c154c6d9e395756c8
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:  Mon Jan 30 21:15:12 2017 +0000
drm: libdrm-2.4.75-3-gd4b8344 from git://anongit.freedesktop.org/mesa/drm
mesa: mesa-13.0.3-bec04114 from git://anongit.freedesktop.org/mesa/mesa
cairo: 1.15.4-68bbb693 from git://anongit.freedesktop.org/cairo
xserver: xorg-server-1.19.0-66-ga6fcb15 from git://git.freedesktop.org/git/xorg/xserver
xf86-video-intel: 2.99.917-750-g2d6f2e8 from git://git.freedesktop.org/git/xorg/driver/xf86-video-intel
libva: libva-1.7.3.pre1-48-ge677ad9 from git://git.freedesktop.org/git/vaapi/libva
vaapi-intel-driver: 1.7.3-287-g05d2d25 from git://git.freedesktop.org/git/vaapi/intel-driver

