Bug 42923 - [IVB] System fails to resume from S3/S4 with recent bioses
Summary: [IVB] System fails to resume from S3/S4 with recent bioses
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: Other Linux (All)
Importance: high critical
Assignee: Keith Packard
QA Contact: fangxun
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 42991
Reported: 2011-11-14 07:22 UTC by roberth
Modified: 2017-07-24 23:03 UTC (History)
CC List: 11 users

See Also:
i915 platform:
i915 features:


Attachments
dmesg of S3 failure on 3.0 (209.16 KB, text/plain)
2011-11-14 07:22 UTC, roberth
S3 on 3.2-rc1 (4.91 KB, text/plain)
2011-11-14 07:23 UTC, roberth
S4 on 3.0 (107.50 KB, text/plain)
2011-11-14 07:24 UTC, roberth
Good bios, reg dump after boot. (11.04 KB, text/plain)
2011-11-14 10:26 UTC, roberth
Good bios, reg dump after resume. (11.04 KB, text/plain)
2011-11-14 10:26 UTC, roberth
Bad bios, reg dump after boot. (11.04 KB, text/plain)
2011-11-14 10:27 UTC, roberth
Bad bios, reg dump after (failed) resume. (11.04 KB, text/plain)
2011-11-14 10:27 UTC, roberth

Description roberth 2011-11-14 07:22:49 UTC
Created attachment 53531 [details]
dmesg of S3 failure on 3.0

System environment: 

chipset:    ivybridge mobile
arch:       x86_64
distro:     ubuntu 11.10
kernels:    3.0.0-13.22 (3.0.6+backports), 3.1, 3.2-rc1
userspace:  xserver:           1.10.4 and 1.11.2
            libdrm:            2.4.26 and git master (nov 10th)
            mesa:              7.11 and git master (nov 10th)
            xf86-video-intel:  2.16 and git master (nov 10th)


Since BIOS updates in mid-October, every Ivy Bridge system we have come across has been exhibiting strange behavior. 3.0 kernels did not boot until "drm/i915: enable ring freq scaling, RC6 and graphics turbo on Ivy Bridge v3" was backported from 3.1; without that patch, i915 failed to load with this error:

 [drm:init_status_page], render ring hws offset: 0x00000000
 [drm:init_ring_common] *ERROR* render ring initialization failed ctl 00000000 head 00000000 tail 00000000 start 00000000
 [drm:i915_driver_load] *ERROR* failed to init modeset
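
For anyone triaging similar logs, here is a minimal Python sketch that pulls the ring name and register values out of the init_ring_common line quoted above. The regular expression is modelled only on that one line, and the helper names (parse_ring_failure, all_zero) are made up for illustration; other kernels may format the message differently.

```python
import re

# Pattern modelled on the single init_ring_common line quoted in this report.
# The field names (ctl, head, tail, start) come from that line.
RING_FAIL = re.compile(
    r"\[drm:init_ring_common\] \*ERROR\* (?P<ring>\w+) ring initialization failed "
    r"ctl (?P<ctl>[0-9a-f]{8}) head (?P<head>[0-9a-f]{8}) "
    r"tail (?P<tail>[0-9a-f]{8}) start (?P<start>[0-9a-f]{8})"
)


def parse_ring_failure(line):
    """Return the ring name and register values as ints, or None if no match."""
    m = RING_FAIL.search(line)
    if m is None:
        return None
    regs = {k: int(v, 16) for k, v in m.groupdict().items() if k != "ring"}
    return {"ring": m.group("ring"), **regs}


def all_zero(parsed):
    """True when every reported ring register read back as zero."""
    return all(v == 0 for k, v in parsed.items() if k != "ring")


line = (" [drm:init_ring_common] *ERROR* render ring initialization failed "
        "ctl 00000000 head 00000000 tail 00000000 start 00000000")
result = parse_ring_failure(line)
print(result["ring"], all_zero(result))
```

In the line from this report all four registers read back as zero, which is what all_zero checks for.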

Now they come up, but every kernel we tried oopses on every resume from S3 or S4. Attached are logs of the oops on various kernels. Downgrading the BIOS avoids the problem on every machine, but this affects machines from multiple ODMs, including an Intel reference platform after a BIOS update.
Comment 1 roberth 2011-11-14 07:23:33 UTC
Created attachment 53532 [details]
S3 on 3.2-rc1
Comment 2 roberth 2011-11-14 07:24:27 UTC
Created attachment 53533 [details]
S4 on 3.0
Comment 3 Chris Wilson 2011-11-14 09:08:50 UTC
Hmm, can you attach the intel_reg_dumper output after boot vs. after resume (with both BIOSes)? I presume that the BIOS is doing additional bring-up that we need to replicate upon resume.
Comment 4 Chris Wilson 2011-11-14 09:11:09 UTC
On second thoughts (having read the OOPS), may I say wtf happened to our data structures upon resume?
Comment 5 roberth 2011-11-14 10:26:12 UTC
Created attachment 53542 [details]
Good bios, reg dump after boot.
Comment 6 roberth 2011-11-14 10:26:40 UTC
Created attachment 53543 [details]
Good bios, reg dump after resume.
Comment 7 roberth 2011-11-14 10:27:10 UTC
Created attachment 53544 [details]
Bad bios, reg dump after boot.
Comment 8 roberth 2011-11-14 10:27:33 UTC
Created attachment 53545 [details]
Bad bios, reg dump after (failed) resume.
Comment 9 roberth 2011-11-14 10:31:24 UTC
(In reply to comment #8)
> Created attachment 53545 [details]
> Bad bios, reg dump after (failed) resume.

--- X13_After_Resume.txt	2011-11-14 13:28:39.837222007 -0500
+++ X18_After_Resume.txt	2011-11-14 13:28:57.465221705 -0500
@@ -28,10 +28,10 @@
                  PIPEA_LINK_N1: 0x00041eb0 (val 0x41eb0 270000)
                  PIPEA_LINK_M2: 0x00000000 (val 0x0 0)
                  PIPEA_LINK_N2: 0x00000000 (val 0x0 0)
-                      DSPACNTR: 0xd8004400 (enabled)
+                      DSPACNTR: 0xd8004000 (enabled)
                       DSPABASE: 0x00000000
-                    DSPASTRIDE: 0x00001600 (88)
-                      DSPASURF: 0x046fb008
+                    DSPASTRIDE: 0x00001580 (86)
+                      DSPASURF: 0x00063000
                    DSPATILEOFF: 0x00000000 (0, 0)
                      PIPEBCONF: 0x00000000 (disabled, inactive, 8bpc)
                       HTOTAL_B: 0x00000000 (1 active, 1 total)
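
A comparison like the one above can also be done per register rather than per line. The following is a rough Python sketch, assuming each dump line has the form "NAME: 0xVALUE (optional text)" as in the excerpt; parse_dump and diff_dumps are hypothetical helper names, and the two sample strings just repeat the four display-plane registers shown in the diff.

```python
import re

# One register per line, e.g. "DSPACNTR: 0xd8004400 (enabled)".
REG_LINE = re.compile(r"^\s*(?P<name>\w+):\s*(?P<value>0x[0-9a-fA-F]+)")


def parse_dump(text):
    """Map register name -> integer value for every line that matches."""
    regs = {}
    for line in text.splitlines():
        m = REG_LINE.match(line)
        if m:
            regs[m.group("name")] = int(m.group("value"), 16)
    return regs


def diff_dumps(old_text, new_text):
    """Return {name: (old, new)} for registers in both dumps whose values differ."""
    old, new = parse_dump(old_text), parse_dump(new_text)
    return {n: (old[n], new[n]) for n in old if n in new and old[n] != new[n]}


# Values copied from the X13 and X18 sides of the diff above.
x13 = """
    DSPACNTR: 0xd8004400 (enabled)
    DSPABASE: 0x00000000
    DSPASTRIDE: 0x00001600 (88)
    DSPASURF: 0x046fb008
"""
x18 = """
    DSPACNTR: 0xd8004000 (enabled)
    DSPABASE: 0x00000000
    DSPASTRIDE: 0x00001580 (86)
    DSPASURF: 0x00063000
"""

for name, (a, b) in diff_dumps(x13, x18).items():
    # a ^ b highlights exactly which bits changed between the two dumps.
    print(f"{name}: {a:#010x} -> {b:#010x} (changed bits {a ^ b:#010x})")
```

Printing the XOR of the two values makes single-bit differences, such as the one in DSPACNTR, easier to spot than in a textual diff.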
Comment 10 Gordon Jin 2011-11-14 15:53:27 UTC
Xun, do you have problem for S3/S4 after upgrading the BIOS to the latest one (v67)?
Comment 11 Chris Wilson 2011-11-14 16:06:50 UTC
Ok, not too much interesting to see there; a few minor differences in link configuration that would be good to understand but would appear not to be relevant to this issue. Back to hunting for an explanation for the apparent memory corruption.
Comment 12 fangxun 2011-11-15 02:03:59 UTC
(In reply to comment #10)
> Xun, do you have problem for S3/S4 after upgrading the BIOS to the latest one
> (v67)?

We can reproduce this problem for S3 on our Ivy Bridge machines (both desktop and mobile) after upgrading the BIOS to the latest one.
S4 seems good. It doesn't happen in text mode.
Comment 13 Eugeni Dodonov 2011-11-19 01:59:45 UTC
Hi folks,

for those of you affected by this issue, could you please test Jesse/Keith's patch at http://lists.freedesktop.org/archives/intel-gfx/2011-November/013544.html and report your results?

We'd like to see Tested-by acknowledgements if it works for you if possible..
Comment 14 fangxun 2011-11-20 22:12:24 UTC
This patch works for me. The S3 problem disappears on our Ivy Bridge machines (both desktop and mobile).
Comment 15 Stefan Dirsch 2011-11-21 03:54:40 UTC
BTW, I was unable to figure out which VBIOS is in use on our machines. Intel talks about
a version scheme like this:

54 (old)
59 (adds "multi-threaded force wake", which possibly requires a Linux
    graphics driver update)
60
64
67

On our HP machine I see "INTEL 2120". I have no idea how this maps to the Intel version. Apparently HP doesn't know either. I couldn't find anything in dmidecode that matches the Intel version scheme either, which is where Jesse told me to look. :-(
Comment 16 roberth 2011-11-21 07:50:19 UTC
(In reply to comment #13)
> Hi folks,
> 
> for those of you affected by this issue, could you please test Jesse/Keith's
> patch at
> http://lists.freedesktop.org/archives/intel-gfx/2011-November/013544.html and
> report your results?
> 
> We'd like to see Tested-by acknowledgements if it works for you if possible..

Indeed Keith's version does work, sent my tested-by. Thank you very much!
Comment 17 Manoj Iyer 2011-11-21 08:44:48 UTC
Following tests passed:
1. S3 tested 10 times; resumes OK with no oops.
2. Connect an external monitor, open a terminal, move it to the external monitor, and do S3: resumes OK, and I am able to move the mouse and windows back and forth after resume.
3. Boot with i915.i915_enable_rc6=1: lightdm comes up.

Note: If I connect an external monitor, open a terminal, move it to the external monitor, and then disconnect the external monitor, the window that was on the monitor is lost when X resizes to the LCD screen, i.e. clicking on the icon for the terminal does nothing.
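
A repeated S3 test like item 1 could be scripted roughly as follows. This is only a sketch under assumptions, not the procedure used for the results above: it assumes rtcwake (util-linux) and dmesg are available and that the script runs as root, and the oops markers it searches for are a crude guess at what a kernel oops leaves in the log. The function names are hypothetical.

```python
import subprocess


def rtcwake_cmd(seconds=30):
    """Command line that suspends to RAM and wakes via the RTC alarm."""
    return ["rtcwake", "-m", "mem", "-s", str(seconds)]


def log_has_oops(log_text):
    """Crude check for the markers a kernel oops typically leaves in dmesg."""
    markers = ("Oops:", "BUG:")
    return any(marker in log_text for marker in markers)


def s3_cycle_test(cycles=10, seconds=30, run=subprocess.run):
    """Suspend/resume up to `cycles` times.

    Returns the number of cycles completed before an oops marker showed up
    in the kernel log (equal to `cycles` if none did). `run` is injectable
    so the loop can be exercised without actually suspending the machine.
    """
    for done in range(cycles):
        run(rtcwake_cmd(seconds), check=True)
        log = run(["dmesg"], check=True, capture_output=True, text=True).stdout
        if log_has_oops(log):
            return done
    return cycles


# Example (needs root, and really suspends the machine):
#     clean = s3_cycle_test(cycles=10)
#     print(f"{clean} clean resume cycles")
```

Because dmesg is cumulative, an oops already in the log before the first cycle makes the function return 0; clearing or noting the log beforehand avoids that false alarm.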
Comment 18 Keith Packard 2011-11-21 09:03:14 UTC
I think the other issue you're seeing is an unrelated DRM problem; doing an xrandr --output VGA1 --off followed by xrandr --output VGA1 --auto can leave VGA1 off when you set it to the same mode as it was before.
Comment 19 Gordon Jin 2011-11-30 16:42:40 UTC
(In reply to comment #13)
> Hi folks,
> for those of you affected by this issue, could you please test Jesse/Keith's
> patch at
> http://lists.freedesktop.org/archives/intel-gfx/2011-November/013544.html and
> report your results?
> We'd like to see Tested-by acknowledgements if it works for you if possible..

This patch has been committed to drm-intel-fixes:
http://cgit.freedesktop.org/~keithp/linux/commit/?h=drm-intel-fixes&id=8d715f0024f64ad1b1be85d8c081cf577944c847

Can the reporters verify with drm-intel-fixes?
Comment 20 fangxun 2011-12-04 22:04:39 UTC
I confirm it works with drm-intel-fixes commit 5be93ad2ebb975df8ba01f6c76b541ff4e9929f4.
Comment 21 Florian Mickler 2012-02-01 13:11:05 UTC
A patch referencing a commit referencing this bug report has been merged in Linux v3.3-rc2:

commit 8109021313c7a3d8947677391ce6ab9cd0bb1d28
Author: Daniel Vetter <daniel@ffwll.ch>
Date:   Fri Jan 13 16:20:06 2012 -0800

    drm/i915: convert force_wake_get to func pointer in the gpu reset code

