Bug 61457

Summary: [Intel G45] Screen turns blank when terminating a low res X server on a different low res KMS mode.
Product: DRI Reporter: Egbert Eich <eich>
Component: DRM/IntelAssignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED FIXED QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal    
Priority: medium CC: daniel, rodrigo.vivi
Version: XOrg git   
Hardware: x86 (IA32)   
OS: All   
Whiteboard:
i915 platform: i915 features:
Attachments:
Description Flags
xorg.conf which triggers this problem.
none
Ugly workaround.
none
Workaround without the mdelay(9)
none
Simper workaround. none

Description Egbert Eich 2013-02-25 19:02:36 UTC
Created attachment 75525 [details]
xorg.conf which triggers this problem.

When booting Intel G45 graphics into a low res KMS mode (ie 800x600) and
starting an Xserver (using the native Intel driver) in a different low res mode
(for instance 640x480) the text console remains blank after terminating the Xserver.

This issue can easily be reproduced with the attached Xserver config file:

1. Boot the machine into an 800x600 mode. This works with any monitor if the 
   command line option: video=VGA-1:800x600 is specified.
2. Once booted start an Xserver with attached config file.

When the Xserver terminates the probability that the screen remains blank is
high (> 90 percent).
Comment 1 Egbert Eich 2013-02-25 19:10:57 UTC
Created attachment 75526 [details] [review]
Ugly workaround.

After a long debugging session I came to the conclusion that this issue was related to the hardware cursor plane being disabled when enabling the display plane.
I tried several workarounds, the one attached is the simplest.
I would like to put it up for discussion as it contains an ugly and totally not-understood delay which was determined experimentally.
It seems that the only way to avoid this problem is to wait for a vblank and then add an delay of about 8-15 milliseconds.
It turned out that a shorter or longer delay let the problem reoccur.
Comment 2 Chris Wilson 2013-02-25 22:09:45 UTC
Hmm, my first guess would be the WM programming is screwy across the transition and that is enough to prevent enabling the pipe.
Comment 3 Egbert Eich 2013-02-28 10:42:24 UTC
(In reply to comment #2)
> Hmm, my first guess would be the WM programming is screwy across the
> transition and that is enough to prevent enabling the pipe.

Chris, is there anything I can test here?

(This is again an issue that we are seeing on one of the POS systems)
Comment 4 Chris Wilson 2013-02-28 11:30:54 UTC
I would try:

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 61fee7f..f52379a 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -1357,6 +1357,7 @@ static void g4x_update_wm(struct drm_device *dev)
 	int plane_sr, cursor_sr;
 	unsigned int enabled = 0;
 
+#if 0
 	if (g4x_compute_wm0(dev, 0,
 			    &g4x_wm_info, latency_ns,
 			    &g4x_cursor_wm_info, latency_ns,
@@ -1368,6 +1369,13 @@ static void g4x_update_wm(struct drm_device *dev)
 			    &g4x_cursor_wm_info, latency_ns,
 			    &planeb_wm, &cursorb_wm))
 		enabled |= 2;
+#else
+	planea_wm = g4x_wm_info.max_size;
+	cursora_wm = g4x_cursor_wm_info.max_size;
+
+	planeb_wm = g4x_wm_info.max_size;
+	cursorb_wm = g4x_cursor_wm_info.max_size;
+#endif
 
 	if (single_plane_enabled(enabled) &&
 	    g4x_compute_srwm(dev, ffs(enabled) - 1,


in order to set the WM to a constant value and disable the low-power switching.
Comment 5 Egbert Eich 2013-02-28 17:21:54 UTC
(In reply to comment #4)
> 
> in order to set the WM to a constant value and disable the low-power
> switching.

Hrm, right - WM == 'watermarks'. Darn acronyms :(

Ok, I've looked into this - it looks promising.

It is this code path that is causing the problems:

        if (single_plane_enabled(enabled) &&
            g4x_compute_srwm(dev, ffs(enabled) - 1,
                             sr_latency_ns,
                             &g4x_wm_info,
                             &g4x_cursor_wm_info,
                             &plane_sr, &cursor_sr)) {
                I915_WRITE(FW_BLC_SELF, FW_BLC_SELF_EN);
       } ...

The problem occurs when the FW_BLC_SELF_EN bit is set after it has been unset before.
The reason why the issue doesn't occur for larger modes seems to be that g4x_compute_srwm() will return false for those.
Comment 6 Chris Wilson 2013-02-28 21:35:03 UTC
Hmm, maybe it is a stale SR value, so perhaps changing the enabling sequence:

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 61fee7f..f32aa00 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -1375,7 +1375,6 @@ static void g4x_update_wm(struct drm_device *dev)
                             &g4x_wm_info,
                             &g4x_cursor_wm_info,
                             &plane_sr, &cursor_sr)) {
-               I915_WRITE(FW_BLC_SELF, FW_BLC_SELF_EN);
        } else {
                I915_WRITE(FW_BLC_SELF,
                           I915_READ(FW_BLC_SELF) & ~FW_BLC_SELF_EN);
@@ -1399,6 +1398,9 @@ static void g4x_update_wm(struct drm_device *dev)
        I915_WRITE(DSPFW3,
                   (I915_READ(DSPFW3) & ~(DSPFW_HPLL_SR_EN | DSPFW_CURSOR_SR_MAS
                   (cursor_sr << DSPFW_CURSOR_SR_SHIFT));
+
+       if (plane_sr && cursor_sr)
+               I915_WRITE(FW_BLC_SELF, FW_BLC_SELF_EN);
 }
 
 static void i965_update_wm(struct drm_device *dev)
Comment 7 Egbert Eich 2013-03-01 13:57:07 UTC
(In reply to comment #6)
> Hmm, maybe it is a stale SR value, so perhaps changing the enabling sequence:
> 

Ok, this sounds like a reasonable idea but unfortunately it didn't help :(
Comment 8 Chris Wilson 2013-03-01 14:06:36 UTC
Let's try that a little bit harder:

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 61fee7f..ec01ddd 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -1369,18 +1369,17 @@ static void g4x_update_wm(struct drm_device *dev)
 			    &planeb_wm, &cursorb_wm))
 		enabled |= 2;
 
-	if (single_plane_enabled(enabled) &&
-	    g4x_compute_srwm(dev, ffs(enabled) - 1,
-			     sr_latency_ns,
-			     &g4x_wm_info,
-			     &g4x_cursor_wm_info,
-			     &plane_sr, &cursor_sr)) {
-		I915_WRITE(FW_BLC_SELF, FW_BLC_SELF_EN);
-	} else {
-		I915_WRITE(FW_BLC_SELF,
-			   I915_READ(FW_BLC_SELF) & ~FW_BLC_SELF_EN);
+	I915_WRITE(FW_BLC_SELF,
+		   I915_READ(FW_BLC_SELF) & ~FW_BLC_SELF_EN);
+	POSTING_READ(FW_BLC_SELF);
+
+	if (!single_plane_enabled(enabled) ||
+	    !g4x_compute_srwm(dev, ffs(enabled) - 1,
+			      sr_latency_ns,
+			      &g4x_wm_info,
+			      &g4x_cursor_wm_info,
+			      &plane_sr, &cursor_sr))
 		plane_sr = cursor_sr = 0;
-	}
 
 	DRM_DEBUG_KMS("Setting FIFO watermarks - A: plane=%d, cursor=%d, B: plane=%d, cursor=%d, SR: plane=%d, cursor=%d\n",
 		      planea_wm, cursora_wm,
@@ -1399,6 +1398,9 @@ static void g4x_update_wm(struct drm_device *dev)
 	I915_WRITE(DSPFW3,
 		   (I915_READ(DSPFW3) & ~(DSPFW_HPLL_SR_EN | DSPFW_CURSOR_SR_MASK)) |
 		   (cursor_sr << DSPFW_CURSOR_SR_SHIFT));
+
+	if (plane_sr && cursor_sr)
+		I915_WRITE(FW_BLC_SELF, FW_BLC_SELF_EN);
 }
 
 static void i965_update_wm(struct drm_device *dev)
Comment 9 Egbert Eich 2013-03-01 16:46:27 UTC
(In reply to comment #8)
> Let's try that a little bit harder:

No, unfortunately this doesn't do the trick either.
Comment 10 Egbert Eich 2013-03-01 17:24:39 UTC
I've played around a bit more - my original workaround seems to work without the not-understood mdelay(9) if I do a:

u32 fw_blc_self = I915_READ(FW_BLC_SELF);
I915_WRITE(FW_BLC_SELF, fw_blc_self & ~FW_BLC_SELF_EN);

before I enable the cursor plane and restore the previous value after I disable it again.
If the cursor plane is off already or (fw_blc_self & FW_BLC_SELF_EN) == 0 I don't have to do anything.

Of course this is only to the extent I've tested using the described test case and
and a limited number of test cycles - since this involves visual inspection if the screen lights up it's hard to automate.

I was a bit surprised that it wasn't sufficient to unset the FW_BLC_SELF_EN bit before enabling the display plane and restore the old value after it's enabled but that also the cursor had to be enabled when enabling the display plane.
Comment 11 Egbert Eich 2013-03-01 17:30:24 UTC
Created attachment 75744 [details] [review]
Workaround without the mdelay(9)

This workaround seems to fix the problem seen with the test case.
Comment 12 Chris Wilson 2013-03-02 21:59:12 UTC
The w/a looks safe enough. Can you make it a g4x_enable_plane function and do the if IS_G4x check higher, i.e.

if (IS_G4X(dev))
  g4x_enable_plane(dev);
else
  intel_enable_plane(dev);

I don't know of any errata that matches this, so send it to list + Daniel with Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Comment 13 Egbert Eich 2013-03-04 14:06:44 UTC
I've investigated this some more.
The issue seems to occur:
a. when a previous mode had the cursor plane enabled (Xserver).
b. when this or the previous mode was using self refresh.

The reason why a blank screen is not seen in the Xserver is that the Xserver has the cursor plane enabled. Enabling the cursor plane on the console manually by writing to the appropriate registers lets the screen light up.

Now I've got a slightly simpler workaround which has survived my tests.
Comment 14 Egbert Eich 2013-03-04 14:12:41 UTC
Created attachment 75897 [details] [review]
Simper workaround.

This patch seems to work also.

Chris, what do you think?
Comment 15 Chris Wilson 2013-03-04 14:17:52 UTC
I would have thought you needed the I915_WRITE(CURBASE, I915_READ(CURBASE)) before the wait_for_vblank() as well. Just try that for sanity's sake, but otherwise it looks like an acceptably minimal w/a.
Comment 16 Egbert Eich 2013-03-04 14:31:52 UTC
(In reply to comment #15)
> I would have thought you needed the I915_WRITE(CURBASE, I915_READ(CURBASE))
> before the wait_for_vblank() as well. Just try that for sanity's sake, but
> otherwise it looks like an acceptably minimal w/a.

Exactly, one would think so as this triggers the latch.
But surprisingly this doesn't seem to be the case. I've verified this by poking the registers directly: writing CURSOR_MODE_64_ARGB_AX to register 0x70080 made the screen light up.
Comment 17 Egbert Eich 2013-03-04 15:25:38 UTC
Chris, are you ok with an
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
for the last patch?
Comment 18 Chris Wilson 2013-03-04 15:36:30 UTC
Aye, if it does the job, I cannot see any harm that it might cause, so acked.
Comment 19 Egbert Eich 2013-03-04 15:46:42 UTC
(In reply to comment #18)
> Aye, if it does the job, I cannot see any harm that it might cause, so acked.

Ok, patch is on its way. 

Thanks Chris for suffering thru this - with me :)

I will close this ticket when the patch has made it in.
Comment 20 Daniel Vetter 2013-03-04 16:15:11 UTC
Fix merged:

commit c69b22e11c917245adf2b72b5a274a4cc24f893b
Author: Egbert Eich <eich@suse.com>
Date:   Mon Mar 4 09:24:38 2013 -0500

    DRM/i915: On G45 enable cursor plane briefly after enabling the display plane.
Comment 21 Egbert Eich 2016-01-16 15:56:08 UTC
Just for the records: 
Intel meanwhile found the root of the issues seen. The proper fix (which also contains a description of the background) is in:

commit 564ed191f5d816d24501664296991ec70327e2bc
Author: Imre Deak <imre.deak@intel.com>
Date:   Fri Jun 13 14:54:21 2014 +0300

    drm/i915: gmch: fix stuck primary plane due to memory self-refresh mode

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.