Bug 86067 - [GM965][DDX SNA] Resume fails and causes reboot
Summary: [GM965][DDX SNA] Resume fails and causes reboot
Status: RESOLVED DUPLICATE of bug 76554
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: unspecified
Hardware: x86 (IA32) Linux (All)
Importance: medium normal
Assignee: Chris Wilson
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-11-09 16:47 UTC by Jean-Pierre van Riel
Modified: 2014-11-09 19:03 UTC

See Also:
i915 platform:
i915 features:


Attachments
Xorg.0.log with SNA option (18.79 KB, text/plain)
2014-11-09 16:47 UTC, Jean-Pierre van Riel
Xorg.0.log with UXA option (28.44 KB, text/plain)
2014-11-09 16:53 UTC, Jean-Pierre van Riel
dmesg before (success) (73.83 KB, text/plain)
2014-11-09 16:54 UTC, Jean-Pierre van Riel
dmesg after (success) (77.75 KB, text/plain)
2014-11-09 16:55 UTC, Jean-Pierre van Riel
i915 module parameters (1.00 KB, text/plain)
2014-11-09 16:57 UTC, Jean-Pierre van Riel
Cannot identify error in pm-suspend.log and think bug is triggered way before resume scripts run (475.74 KB, text/plain)
2014-11-09 17:15 UTC, Jean-Pierre van Riel
UXA fails too, but not as often as SNA (88.60 KB, text/plain)
2014-11-09 17:34 UTC, Jean-Pierre van Riel

Description Jean-Pierre van Riel 2014-11-09 16:47:10 UTC
Created attachment 109158 [details]
Xorg.0.log with SNA option

Bug description:

Resume from suspend on Intel GM965/GL960 (GMA X3100) with DDX SNA fails and causes a reboot, but resume works fine if the non-default UXA DDX acceleration option is used. 

System environment:

-- chipset: Intel GM965/GL960 (GMA X3100)
-- system architecture: 32-bit (i686)
-- xf86-video-intel: 2:2.99.910-0ubuntu1.1 
-- xserver: 2:1.15.1-0ubuntu2.1
-- kernel: 3.13.0-39
-- Linux distribution: Mint 17 Qiana, Cinnamon, 32-bit
-- Machine or mobo model: Acer Extensa 5220, model: 5220-051G08Mi MS2205
-- Display connector: standard laptop lcd

Reproducing steps:

- trigger bug with Option "AccelMethod" "sna": resume from suspend often causes a reboot
- avoid bug with Option "AccelMethod" "uxa": resume from suspend seems reliable (no reboot noticed yet after several tests)

Example config to test with
/etc/X11/xorg.conf.d/20-intel.conf
  Section "Device"
     Identifier "Intel Graphics"
     Driver "intel"
     #Option "AccelMethod" "uxa"
     Option "AccelMethod" "sna"
  EndSection

Probability: Very frequent. I'd say only 1 in 10 resume attempts (with Option "AccelMethod" "sna") worked and the rest failed and caused a reboot.

Attempted to follow the guide here: https://01.org/linuxgraphics/documentation/how-debug-suspend-resume-issues-0
- resume from S3 suspend to RAM without mdm (display manager) running appeared to work fine, e.g. 'echo mem > /sys/power/state'
- the default suspend via menu GUI in Cinnamon often caused the resume bug to surface
- S4 hibernation worked fine and didn't seem to trigger the bug
- given the system reboots, unable to login via another terminal or ssh and cannot capture dmesg after, etc
- have not had time to try using intel_reg_dumper or to capture intel_gpu_dump output

Found several similar bug reports for Intel 965 / X3100 type laptops and Intel drivers in Ubuntu 14.04 and/or kernel 3.13. My suspicion is that newer Intel video drivers have caused a number of regressions in the latest kernels, especially with older Intel chipsets. These are similar bugs, but their fixes don't appear to work in my case:
- [TOSHIBA Satellite U400] suspend/resume failure: https://bugs.launchpad.net/ubuntu/+source/linux-meta/+bug/1290787
- [965gm regression v3.13] TOSHIBA Satellite U400 intel GM965/GL960 suspend/resume failure kernel 3.14 rc7, rc6, 3.13: https://bugs.freedesktop.org/show_bug.cgi?id=76520
- [Dell Inspiron 1525] Cannot resume from suspend: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1331654

Logged bug specific to Mint here, but realised this is quite probably an upstream bug and probably not specific to Mint: https://bugs.launchpad.net/ubuntu/+bug/1390923
Comment 1 Jean-Pierre van Riel 2014-11-09 16:50:51 UTC
Changed bug to 'normal' as I'm not sure whether and how this affects other laptop models with GM965/GL960. I haven't 100% ruled out that it's distro or BIOS specific, but the xorg.conf UXA workaround suggests it's probably a regression in the SNA code that triggers the issue.
Comment 2 Jean-Pierre van Riel 2014-11-09 16:53:25 UTC
Created attachment 109159 [details]
Xorg.0.log with UXA option
Comment 3 Jean-Pierre van Riel 2014-11-09 16:54:17 UTC
Created attachment 109160 [details]
dmesg before (success)
Comment 4 Jean-Pierre van Riel 2014-11-09 16:55:26 UTC
Created attachment 109161 [details]
dmesg after (success)

Note: I don't have a valid 'dmesg after' for the failed-resume scenario, because the bug causes a reboot.
Comment 5 Jean-Pierre van Riel 2014-11-09 16:57:13 UTC
Created attachment 109162 [details]
i915 module parameters
Comment 6 Jean-Pierre van Riel 2014-11-09 17:15:05 UTC
Created attachment 109164 [details]
Cannot identify error in pm-suspend.log and think bug is triggered way before resume scripts run

When I first noticed the bug, I ran multiple tests and saw the following patterns in pm-suspend.log. Unfortunately, I was unable to catch any error here. I later modified the script with 'set -x' to try to catch which script in the resume process was causing the issue, but that was futile: the reboot occurs before the resume scripts are even triggered. So I think I can rule out any scripts called during resume being the cause.

# When it is able to resume #

Tue Sep  2 00:06:02 SAST 2014: Running hooks for suspend.
...
Tue Sep  2 00:06:02 SAST 2014: performing suspend
Tue Sep  2 00:06:20 SAST 2014: Awake.
Tue Sep  2 00:06:20 SAST 2014: Running hooks for resume
...
Running hook /usr/lib/pm-utils/sleep.d/000kernel-change resume suspend:
/usr/lib/pm-utils/sleep.d/000kernel-change resume suspend: success.

Tue Sep  2 00:06:20 SAST 2014: Finished.


# When it fails to resume #

Tue Sep  2 00:10:19 SAST 2014: Running hooks for suspend.
...
Tue Sep  2 00:10:19 SAST 2014: performing suspend
...
??? not followed by <date>: Awake. ???
...

In the failure case, we don't get the 'Awake.' message. The next entry in the log belongs to the next suspend/resume test after rebooting, i.e. instead of seeing 'Awake.', we see the start of the next test:
Tue Sep  2 00:11:34 SAST 2014: Running hooks for suspend.
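
The success/failure pattern above could be counted mechanically rather than by eye. The following is only a rough sketch, not part of pm-utils: the function name count_resumes is made up for illustration, and it assumes every attempt logs a line ending in 'performing suspend' and that a successful resume logs a line ending in 'Awake.', as in the excerpts above. A 'performing suspend' that is never followed by 'Awake.' is counted as a failed resume, including one at the very end of the log (an assumption that only holds if the log was copied after the reboot).

```python
# Markers as they appear in the pm-suspend.log excerpts above.
SUSPEND_MARK = "performing suspend"
AWAKE_MARK = "Awake."


def count_resumes(lines):
    """Return (successes, failures) for the suspend attempts found in lines.

    lines is any iterable of log lines, e.g. an open file object.
    """
    successes = 0
    failures = 0
    pending = False  # True after "performing suspend" until "Awake." is seen
    for line in lines:
        text = line.rstrip()
        if text.endswith(SUSPEND_MARK):
            if pending:
                # Previous suspend never logged "Awake.": treat as a failure.
                failures += 1
            pending = True
        elif pending and text.endswith(AWAKE_MARK):
            successes += 1
            pending = False
    if pending:
        # Assumption: the log was copied after the reboot, so a trailing
        # suspend without "Awake." is also a failed resume.
        failures += 1
    return successes, failures
```

It could be run against the log by opening the file and passing the file object to count_resumes(); the log location (commonly /var/log/pm-suspend.log) is assumed here and may differ per distribution.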
Comment 7 Jean-Pierre van Riel 2014-11-09 17:34:39 UTC
Created attachment 109166 [details]
UXA fails too, but not as often as SNA

Sadly, it appears SNA isn't fully at fault. UXA just triggers the bug less frequently than SNA does. After about 10 successful resumes, I had the same failure with UXA configured.

Attached is the pm-suspend.log, which I copied after the system rebooted while attempting to resume. As before, no resume scripts were called (or at least no logs were written to disk during the resume process).
Comment 8 Chris Wilson 2014-11-09 19:03:02 UTC

*** This bug has been marked as a duplicate of bug 76554 ***

