Bug 91697 - [SKL, BIOS upgrade regression] distorted display after resume from suspend
Summary: [SKL, BIOS upgrade regression] distorted display after resume from suspend
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: highest blocker
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
Reported: 2015-08-20 07:25 UTC by Timo Aaltonen
Modified: 2017-07-24 22:45 UTC

i915 platform: SKL

Attachments
dmesg dump (316.52 KB, text/plain)
2015-08-20 10:56 UTC, Timo Aaltonen
dmesg from dpms cycle (225.27 KB, text/plain)
2015-08-20 10:58 UTC, Timo Aaltonen
drm log for distorted display in CRB (1.36 MB, text/plain)
2015-08-25 05:14 UTC, Gary Wang
BIOS version for comment #5 (948.29 KB, text/plain)
2015-08-25 05:15 UTC, Gary Wang
Distorted eDP display for comment #5 (1.24 MB, text/plain)
2015-08-25 05:17 UTC, Gary Wang
Dump reg for symptom (before S3 and after resume) (4.60 KB, application/x-7z-compressed)
2015-08-26 10:31 UTC, Gary Wang
force to check CDCLK in gfx driver during resuming from S3 (1.32 KB, text/plain)
2015-08-27 09:53 UTC, Gary Wang
set CDCLK if DPLL0 enabled during resuming from S3 (1.56 KB, text/plain)
2015-08-28 02:43 UTC, Gary Wang
set CDCLK if DPLL0 enabled during resuming from S3 (1.56 KB, text/plain)
2015-08-28 05:33 UTC, Gary Wang

Description Timo Aaltonen 2015-08-20 07:25:20 UTC
After a BIOS upgrade, a Skylake machine now gets a distorted display after resume from suspend, and the logs show:

[   52.754334] [drm:gen8_irq_handler [i915]] *ERROR* Fault errors on pipe A
[   52.754334] : 0x00000000
[   52.754343] [drm:gen8_irq_handler [i915]] *ERROR* Fault errors on pipe A
[   52.754343] : 0x00000000[drm:gen8_irq_handler [i915]] *ERROR* Fault errors on pipe A
[   52.755689] : 0x00000000
[   52.755696] [drm:gen8_irq_handler [i915]] *ERROR* Fault errors on pipe A
[   52.755696] : 0x00000000[drm:gen8_irq_handler [i915]] *ERROR* Fault errors on pipe A
[   52.756626] : 0x00000000

This repeats ad infinitum.

It happens with current drm-intel-nightly too.

The new BIOS apparently has "Intel RC code 1.4.0", whatever that means. The old, working one had 1.3.0.
Comment 1 Timo Aaltonen 2015-08-20 10:56:47 UTC
Created attachment 117805 [details]
dmesg dump
Comment 2 Timo Aaltonen 2015-08-20 10:58:59 UTC
Created attachment 117806 [details]
dmesg from dpms cycle

Apparently a DPMS cycle clears the error, and normal operation resumes.
Comment 3 Phidias Chiang 2015-08-24 03:47:42 UTC
In further testing, a DPMS cycle does not clear the error on every platform.
Comment 4 Rodrigo Vivi 2015-08-24 21:50:49 UTC
Which DMC version are you using?
Could you please check what happens with the latest one, ver1_21: https://01.org/linuxgraphics/downloads/skldmcver121

Thanks,
Rodrigo.
Comment 5 Gary Wang 2015-08-25 05:14:39 UTC
Created attachment 117905 [details]
drm log for distorted display in CRB

I used drm-intel-nightly (2015y-08m-24d-07h-39m-09s) on Ubuntu 15.04 with the following firmware on a CRB with BIOS RC 1.4, and the issue could be reproduced.

skl_dmc_ver1_21
Release Date: 14 Aug, 2015
Version: ver1_21
Type: dmc

skl_guc_ver4_3
Release Date: 23 Jul, 2015
Version: ver4_3
Type: guc
Comment 6 Gary Wang 2015-08-25 05:15:54 UTC
Created attachment 117906 [details]
BIOS version for comment #5

List BIOS version for comment #5
Comment 7 Gary Wang 2015-08-25 05:17:42 UTC
Created attachment 117907 [details]
Distorted eDP display for comment #5

Distorted eDP display for comment #5
Comment 8 Gary Wang 2015-08-25 07:08:50 UTC
For comment #5, if I change the CRB to BIOS RC 1.3, the issue goes away.
Comment 9 Gavin Hindman 2015-08-26 04:24:26 UTC
Is this using a Dell BIOS or an Intel reference BIOS? Assuming a Dell BIOS, which Intel BIOS version corresponds to that RC code version?
Comment 10 Gary Wang 2015-08-26 04:31:40 UTC
I checked it with the Intel BIOS (please refer to https://bugs.freedesktop.org/attachment.cgi?id=117906).
Comment 11 Gary Wang 2015-08-26 10:31:45 UTC
Created attachment 117924 [details]
Dump reg for symptom (before S3 and after resume)

I checked the related register values while the issue was occurring (please refer to "reg_dump.7z"). According to intel_reg_dumper, only DSPASURF changed.
Comment 12 Cooper Chiou 2015-08-26 10:37:37 UTC
This is a critical issue reported by a customer, and it is now blocking the customer's shipping schedule. Please help to fix it. Thanks,
Comment 13 Gary Wang 2015-08-27 09:53:05 UTC
Created attachment 117946 [details]
force to check CDCLK in gfx driver during resuming from S3

Hi Timo, 

I found that BIOS RC 1.4 enables the CDCLK PLL during S3 resume, so the Linux gfx driver does not set CDCLK during its early CDCLK init phase. Forcing a CDCLK check avoids the display corruption. I verified my patch over 10 test cycles, and it passed.

Could you please help verify "0001-drm-i915-force-to-check-CDCLK-during-resuming-from-S.patch" until an official solution is available from the Intel back end? Thanks!
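The failure mode described in this comment can be illustrated with a small stand-alone model. This is only a sketch, not driver code: `struct fake_hw`, `resume_state()` and `early_init_old()` are hypothetical names, and the model assumes that the older firmware leaves the PLL off at resume, which the thread implies but does not state. The only behaviour taken from the thread is that the early init path does not program CDCLK when it finds the PLL already enabled.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the hardware state; not a real i915 structure. */
struct fake_hw {
	bool dpll0_enabled;    /* did firmware leave the PLL running? */
	bool cdclk_programmed; /* has the driver programmed CDCLK since resume? */
};

/* State seen by the driver right after S3 resume. */
static struct fake_hw resume_state(bool firmware_enables_pll)
{
	struct fake_hw hw = { firmware_enables_pll, false };
	return hw;
}

/*
 * Models the early init path as described in the thread:
 * when the PLL is already running, return without touching CDCLK.
 */
static void early_init_old(struct fake_hw *hw)
{
	if (hw->dpll0_enabled)
		return; /* CDCLK is never programmed on this path */

	hw->dpll0_enabled = true;
	hw->cdclk_programmed = true;
}
```

In this model, calling `early_init_old()` on `resume_state(true)` leaves `cdclk_programmed` false, which corresponds to the corrupted-display case, while `resume_state(false)` ends with CDCLK programmed.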
Comment 14 Gary Wang 2015-08-27 09:56:17 UTC
I verified it on drm-intel-nightly 2015/08/24.
Comment 15 Damien Lespiau 2015-08-27 15:13:42 UTC
Could you post your patch to the intel-gfx mailing list for review?

It is also important to check whether we correctly re-initialize the DDI PHY registers. Could you check the value of register 0x64E00?

$ sudo ./tools/intel_reg read 0x64E00

Making sure that's not set to the reset value will avoid hard-to-debug bugs down the line.
Comment 16 Damien Lespiau 2015-08-27 15:45:20 UTC
As an early review:

Your patch makes the code still go through skl_dpll0_enable() when PLL0 is on after resume, which touches the CDCLK and DPLL0 registers. It sort of works because the PLL is already locked and the writes are ignored, but it doesn't feel totally satisfactory.

Could you skip skl_dpll0_enable() if DPLL0 is already enabled?

Thanks,
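The control flow requested in this review can be sketched with a simulated hardware state. Everything below is hypothetical (`struct fake_hw`, `fake_dpll0_enable()`, `fake_set_cdclk()`, `init_cdclk_skip_pll()`); it is not the patch that was attached or merged, only a model of "skip the PLL enable sequence when DPLL0 is already on, but still program CDCLK".

```c
#include <stdbool.h>

/* Hypothetical simulated state; not a real i915 structure. */
struct fake_hw {
	bool dpll0_enabled;
	int dpll0_enable_calls; /* how often the PLL enable sequence ran */
	int cdclk_khz;          /* 0 means "not programmed since resume" */
};

/* Stand-in for the PLL enable sequence. */
static void fake_dpll0_enable(struct fake_hw *hw)
{
	hw->dpll0_enabled = true;
	hw->dpll0_enable_calls++;
}

/* Stand-in for programming the CD clock frequency. */
static void fake_set_cdclk(struct fake_hw *hw, int khz)
{
	hw->cdclk_khz = khz;
}

/* Enable DPLL0 only when it is off, then always program CDCLK. */
static void init_cdclk_skip_pll(struct fake_hw *hw, int boot_cdclk_khz)
{
	if (!hw->dpll0_enabled)
		fake_dpll0_enable(hw);

	fake_set_cdclk(hw, boot_cdclk_khz);
}
```

With the PLL already on, `init_cdclk_skip_pll()` leaves `dpll0_enable_calls` at 0 and still sets `cdclk_khz`. With the PLL off, the enable sequence runs exactly once before CDCLK is programmed. The frequency passed in is an arbitrary example value chosen by the caller.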
Comment 17 Gary Wang 2015-08-28 02:43:21 UTC
Created attachment 117957 [details]
set CDCLK if DPLL0 enabled during resuming from S3

Hi Rodrigo and Damien,

Thanks for your comments. Yes, skl_dpll0_enable() should be skipped if DPLL0 is already enabled.

I verified the new patch against this issue on drm-intel-nightly 2015/08/24, and it passed (10 test cycles).
Comment 18 Gary Wang 2015-08-28 02:59:43 UTC
Hi Rodrigo and Damien,

I have checked the DDI PHY register before S3 and after resume completed. The values are the same:
(0x00064e00): 0x00000018
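A before/after comparison like this one can be automated with a small helper. The sketch below assumes each captured line has exactly the `(0x<offset>): 0x<value>` shape shown above; real tool output may carry extra fields, in which case the format string would need adjusting. `parse_reg_line()` and `reg_unchanged()` are hypothetical helper names.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Parse one line of the form "(0x00064e00): 0x00000018".
 * Returns 1 on success, 0 if the line does not match that shape.
 */
static int parse_reg_line(const char *line, uint32_t *offset, uint32_t *value)
{
	unsigned int off, val;

	if (sscanf(line, " (0x%x): 0x%x", &off, &val) != 2)
		return 0;

	*offset = off;
	*value = val;
	return 1;
}

/*
 * Returns 1 only if both lines parse, refer to the same register
 * offset, and report the same value.
 */
static int reg_unchanged(const char *before, const char *after)
{
	uint32_t off_b, val_b, off_a, val_a;

	if (!parse_reg_line(before, &off_b, &val_b) ||
	    !parse_reg_line(after, &off_a, &val_a))
		return 0;

	return off_b == off_a && val_b == val_a;
}
```

Feeding the line quoted in this comment as both the "before" and "after" capture makes `reg_unchanged()` return 1. Note that an unchanged value alone does not show the register differs from its reset value; that still has to be checked against the hardware documentation.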
Comment 19 Gary Wang 2015-08-28 05:33:21 UTC
Created attachment 117958 [details]
set CDCLK if DPLL0 enabled during resuming from S3

Please ignore the patch from comment #17.

I have uploaded the patch for comments #17/#18. Thanks!
Comment 20 Timo Aaltonen 2015-08-28 10:25:10 UTC
We've verified that the latest version from comment #19 works fine.
Comment 21 Akshu Agrawal 2015-09-01 11:43:33 UTC
If we just enable the ‘DBUF Power Request’ bit in DBUF_CNTL, there is no display corruption. IMO this bit is left in reset state as we are enabling the PW2 in BIOS. Following change would also fix the issue:

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index af0bcfe..fd69c4c 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -5698,6 +5698,12 @@ void skl_init_cdclk(struct drm_i915_private *dev_priv)
        /* DPLL0 already enabed !? */
        if (I915_READ(LCPLL1_CTL) & LCPLL_PLL_ENABLE) {
                DRM_DEBUG_DRIVER("DPLL0 already running\n");
+               /* check for DBUF status and enable if in reset state */
+               if (!(I915_READ(DBUF_CTL) & DBUF_POWER_STATE)) {
+                       I915_WRITE(DBUF_CTL, I915_READ(DBUF_CTL)
+                                       | DBUF_POWER_REQUEST);
+                       POSTING_READ(DBUF_CTL);
+               }
                return;
        }
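The read-modify-write in the hunk above can be exercised outside the kernel with a simulated register. This is a sketch under stated assumptions: the bit positions, the `FAKE_` macros, and the helpers `fake_read()`, `fake_write()` and `ensure_dbuf_powered()` are all hypothetical, and the simulated hardware reports the power state immediately, whereas real hardware may need time to acknowledge the request.

```c
#include <stdint.h>

/* Bit positions are assumptions for this sketch; consult i915_reg.h. */
#define FAKE_DBUF_POWER_REQUEST (1u << 31)
#define FAKE_DBUF_POWER_STATE   (1u << 30)

static uint32_t fake_dbuf_ctl; /* simulated register contents */

static uint32_t fake_read(void)
{
	return fake_dbuf_ctl;
}

/* Simulated write: the state bit follows the request bit immediately. */
static void fake_write(uint32_t val)
{
	fake_dbuf_ctl = val;
	if (val & FAKE_DBUF_POWER_REQUEST)
		fake_dbuf_ctl |= FAKE_DBUF_POWER_STATE;
	else
		fake_dbuf_ctl &= ~FAKE_DBUF_POWER_STATE;
}

/*
 * Request DBUF power only if the state bit is clear, preserving all
 * other bits. Returns 1 if a request was issued, 0 if already powered.
 */
static int ensure_dbuf_powered(void)
{
	if (fake_read() & FAKE_DBUF_POWER_STATE)
		return 0;

	fake_write(fake_read() | FAKE_DBUF_POWER_REQUEST);
	return 1;
}
```

Starting from a zeroed register, the first `ensure_dbuf_powered()` call issues the request and the second is a no-op, mirroring the "enable if in reset state" check in the proposed change.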
Comment 22 Jani Nikula 2015-09-01 13:14:58 UTC
(In reply to Akshu Agrawal from comment #21)
> If we just enable the ‘DBUF Power Request’ bit in DBUF_CNTL, there is no
> display corruption. IMO this bit is left in reset state as we are enabling
> the PW2 in BIOS. Following change would also fix the issue:

We already have

commit 39d9b85a4d4fa1642663ca0d208b5c246a3d6f50
Author: Gary Wang <gary.c.wang@intel.com>
Date:   Fri Aug 28 16:40:34 2015 +0800

    drm/i915: set CDCLK if DPLL0 enabled during resuming from S3

merged in drm-intel-next-fixes. I just overlooked closing the bug. If there's a problem with that commit, please reopen.

