Bug 86565

Summary: black screen after resume from hibernation since linux kernel 3.18
Product: DRI
Reporter: Martin Steigerwald <Martin>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED WORKSFORME
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: major
Priority: medium
CC: intel-gfx-bugs, noga.dany
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Attachments:
Description | Flags
debug log of a working hibernation cycle | none
debug log of a hibernation cycle with longer delay with black screen on resume | none
debug log of a hibernation cycles with longer delay with black screen on resume | none
debug log of a hibernation cycle with longer delay with black screen on resume | none

Description Martin Steigerwald 2014-11-22 09:58:22 UTC
Since linux kernel 3.18-rc2 I sometimes get a black screen after resuming from hibernation via in-kernel-suspend. With 3.17 I didn't have this issue. Additionally, on hibernation the machine sometimes does not switch off after writing the image, but stays on with the power LED blinking. I then usually switch it off manually with a long press of the power button, and it usually resumes just fine afterwards.

Since the black screen issue is more severe, this bug is about that issue. I only mentioned the other issue in case it is somehow related.

This is with a ThinkPad T520 with BIOS Version: 8AET63WW (1.43 ), Release Date: 05/08/2013. The black screen happens with 3.18-rc2 to 3.18-rc5.


The machine appears to run and I switch it off manually and reboot to get things back to work. Since it doesn't happen all the time and since the machine carries production data, I am very reluctant to try to bisect this one. It would likely take a lot of time and also mean running probably unstable kernel versions between 3.17 and 3.18-rc2.

I use the following customization for pm-utils:

merkaba:/etc/pm/config.d> grep . *                     
sleepmodule.conf:SLEEP_MODULE=kernel
unload_vbox_modules.conf:SUSPEND_MODULES="$SUSPEND_MODULES vboxdrv vboxnetflt vboxnetadp vboxpci "

That unloading of vbox modules is new. I will test with this, due to having read:

https://www.virtualbox.org/ticket/9260
https://www.virtualbox.org/ticket/9305

But these are already closed with VirtualBox 4.1.2 and I have 4.3.18 on this machine. Additionally I didn't see any black screen issues with kernel 3.17.

merkaba:~> lspci -nn | grep -i vga
00:02.0 VGA compatible controller [0300]: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller [8086:0126] (rev 09)

Is there a way to force pm-utils to switch to a text console so that I can possibly see an error? If it happens over the weekend, I may try to SSH into the box while it only shows the black screen. But I bet SSH would not work, as the machine also didn't write any /var/log/kernel.log entries between hibernation and reboot:

Nov 22 00:06:26 merkaba kernel: [30777.215518] CPU1: Package temperature/speed normal
Nov 22 00:12:16 merkaba kernel: [31127.220536] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
Nov 22 00:12:16 merkaba kernel: [31127.228142] PM: Hibernation mode set to 'platform'
Nov 22 10:15:15 merkaba kernel: [    0.000000] Initializing cgroup subsys cpuset
Nov 22 10:15:15 merkaba kernel: [    0.000000] Initializing cgroup subsys cpu
Nov 22 10:15:15 merkaba kernel: [    0.000000] Initializing cgroup subsys cpuacct
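
A silence like the one in the excerpt above can be located mechanically in a longer log. The following is only a minimal sketch: it assumes the traditional syslog timestamp prefix shown above, which carries no year, so the year is supplied as a parameter (2014 here, taken from the report date). `parse_ts` and `find_gaps` are hypothetical helper names, and the 30-minute threshold is an arbitrary choice.

```python
from datetime import datetime, timedelta

def parse_ts(line, year=2014):
    # Traditional syslog lines start with "Mon DD HH:MM:SS" (15 characters).
    # The year is not recorded, so it is passed in (assumption: 2014).
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")

def find_gaps(lines, threshold=timedelta(minutes=30)):
    """Return (previous_line, next_line, gap) for each silence above threshold."""
    gaps = []
    prev_line, prev_ts = None, None
    for line in lines:
        ts = parse_ts(line)
        if prev_ts is not None and ts - prev_ts > threshold:
            gaps.append((prev_line, line, ts - prev_ts))
        prev_line, prev_ts = line, ts
    return gaps

# The six log lines quoted in the description.
LOG = """\
Nov 22 00:06:26 merkaba kernel: [30777.215518] CPU1: Package temperature/speed normal
Nov 22 00:12:16 merkaba kernel: [31127.220536] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
Nov 22 00:12:16 merkaba kernel: [31127.228142] PM: Hibernation mode set to 'platform'
Nov 22 10:15:15 merkaba kernel: [    0.000000] Initializing cgroup subsys cpuset
Nov 22 10:15:15 merkaba kernel: [    0.000000] Initializing cgroup subsys cpu
Nov 22 10:15:15 merkaba kernel: [    0.000000] Initializing cgroup subsys cpuacct
""".splitlines()

for before, after, gap in find_gaps(LOG):
    print(gap, "|", before[:15], "->", after[:15])
```

The gap naturally includes the time the machine spent powered off, so it does not by itself prove a hang. It only shows that nothing from the failed resume attempt reached the log before the reboot.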

Distribution is Debian Sid with:

xserver-xorg-video-intel:amd64/sid 2:2.21.15-2+b2 uptodate
xserver-xorg-video-intel:i386 not installed
libdrm-dev:amd64/sid 2.4.58-2 uptodate
libdrm-dev:i386 not installed
libdrm-intel1:amd64/sid 2.4.58-2 uptodate
libdrm-intel1:i386/sid 2.4.58-2 uptodate
libdrm-nouveau2:amd64/sid 2.4.58-2 uptodate
libdrm-nouveau2:i386/sid 2.4.58-2 uptodate
libdrm-radeon1:amd64/sid 2.4.58-2 uptodate
libdrm-radeon1:i386/sid 2.4.58-2 uptodate
libdrm2:amd64/sid 2.4.58-2 uptodate
libdrm2:i386/sid 2.4.58-2 uptodate

Thanks,
Martin
Comment 1 Martin Steigerwald 2014-11-22 10:01:47 UTC
I wonder whether

Bug 80773 - [hsw backlight bisected] backlight is off after resume

with this fixing commit

sna/transform: Correctly check for imprecise fractional translations
http://cgit.freedesktop.org/xorg/driver/xf86-video-intel/commit/?id=7ecc778691c452285f754743a93a46fa1d3da52f

may be related to it. But it's a driver fix in X.org, whereas for me kernel 3.17 works and kernel 3.18 does not. So it might be something different.

I can also try changing sleep modules to userspace software suspend… maybe I can tell it to switch to VT1 before hibernation this way so I may see some kernel debug messages.
Comment 2 Martin Steigerwald 2014-11-22 10:05:06 UTC
Oh, it definitely loads the hibernation image. I see the kernel messages where it reports progress in 10 percent steps. Then it switches to a black screen and sits there, while the machine seems to be not completely locked up. I think I saw some disk activity. Next time I may leave it running like that for some minutes. Maybe it will eventually write something into the logs.
Comment 3 Imre Deak 2014-11-22 21:42:43 UTC
(In reply to Martin Steigerwald from comment #2)
> Oh it definitely loads in the hibernation image. I see the kernel messages
> where it tells progress in 10 percent steps. Then it switches to black
> screen and sits there while the machine seems to be not completely locked. I
> think I saw some disk activity. Next time I may leave it running like that
> for some minutes. Maybe it will eventually write something into logs.

Could you boot with the kernel parameters "drm.debug=0xe no_console_suspend initcall_debug" and provide the dmesg log after the hang? If you can't access the device otherwise, you could try using serial console or netconsole.

Can you reproduce the problem if you boot with nomodeset?

Could you try the git://anongit.freedesktop.org/drm-intel drm-intel-nightly branch? It contains some S3/S4 fixes that may help here.
Comment 4 Martin Steigerwald 2014-11-24 08:42:21 UTC
Created attachment 109928 [details]
debug log of a working hibernation cycle

I am now running

merkaba:~> cat /proc/version
Linux version 3.18.0-rc6-tp520 (martin@merkaba) (gcc version 4.9.2 (Debian 4.9.2-1) ) #12 SMP PREEMPT Mon Nov 24 08:46:56 CET 2014

with

merkaba:~> cat /proc/cmdline 
BOOT_IMAGE=/vmlinuz-3.18.0-rc6-tp520 root=/dev/mapper/sata-debian ro rootflags=subvol=debian resume=/dev/mapper/sata-swap init=/bin/systemd drm.debug=0xe no_console_suspend initcall_debug

I attach a log of a hibernation cycle that worked okay.

As soon as I find something during the next days I will try to capture logs.

Thanks,
Martin
Comment 5 Martin Steigerwald 2015-01-06 09:34:54 UTC
Created attachment 111837 [details]
debug log of a hibernation cycle with longer delay with black screen on resume

Happy new year!

This still happens with 3.19-rc2. After I enabled debugging I didn't see the issue for a long time, so I disabled debugging again after a while – and then it happened again. But with debugging enabled, it once took a longer time to get from the black screen to a usable system, and I thought I might be hitting that issue again.

Log attached. Maybe it yields something useful.

I will now add "no_console_suspend initcall_debug" to the boot options again and hope that it will happen again with debug enabled. Maybe it doesn't because of the changed timing, but then I can at least use debug output as a work-around to make this annoying issue disappear.
Comment 6 Martin Steigerwald 2015-01-10 09:43:19 UTC
Created attachment 112050 [details]
debug log of a hibernation cycles with longer delay with black screen on resume

This still happens with 3.19-rc3. Now with debug enabled as in:

martin@merkaba:~> cat /proc/cmdline 
BOOT_IMAGE=/vmlinuz-3.19.0-rc3-tp520-trim-all-bgroups+ root=/dev/mapper/sata-debian ro rootflags=subvol=debian resume=/dev/mapper/sata-swap init=/bin/systemd no_console_suspend no_console_suspend initcall_debug

Hmmm, I see, there is still something missing: the DRM debug option drm.debug=0xe is not there, and I accidentally added no_console_suspend twice. I will fix this for the next cycles.
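
A slip like this can be caught before starting the next test cycle. The sketch below is illustrative only: `check_cmdline` is a hypothetical helper, and the required set is simply the three parameters requested in comment 3. It splits the command line on whitespace and reports parameters that are missing or repeated.

```python
from collections import Counter

# Parameters requested in comment 3.
REQUIRED = ("drm.debug=0xe", "no_console_suspend", "initcall_debug")

def check_cmdline(cmdline, required=REQUIRED):
    """Return (missing, duplicated) parameters for a kernel command line."""
    counts = Counter(cmdline.split())
    missing = [p for p in required if counts[p] == 0]
    duplicated = sorted(p for p, n in counts.items() if n > 1)
    return missing, duplicated

# The command line quoted above in this comment.
CMDLINE = ("BOOT_IMAGE=/vmlinuz-3.19.0-rc3-tp520-trim-all-bgroups+ "
           "root=/dev/mapper/sata-debian ro rootflags=subvol=debian "
           "resume=/dev/mapper/sata-swap init=/bin/systemd "
           "no_console_suspend no_console_suspend initcall_debug")

missing, duplicated = check_cmdline(CMDLINE)
print("missing:", missing)
print("duplicated:", duplicated)
```

On a running system the same check could be fed the contents of /proc/cmdline instead of the pasted string.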

Anyway, I attach a syslog with the above kernel command line and the black screen delay happening. I don't see any noticeable delays in the initcall debug statements. I have another one from a few days before available as well.

Two more observations:

1. It seems that if I wait long enough after resume, I actually get the Plasma/KDE 4.14 screen locker instead of the black screen.

2. The other issue, the delay on hibernating with the power LED blinking instead of the laptop shutting down after saving the image, seems to be related. Every time I have this issue, I have the black screen delay on the next resume as well.

Okay, next round with drm.debug=0xe reenabled as well. This is really annoying, and I don't like it when something breaks my user experience that way.
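
Scanning a large log for slow calls by eye is error-prone, so a small filter may help. This is a sketch under a stated assumption: the regular expression encodes the line shapes that initcall_debug is believed to produce ("initcall <function> ... returned <rc> after <N> usecs" for initcalls and "call <device>+ returned <rc> after <N> usecs" for device PM callbacks). These shapes have not been checked against the attached logs. `slow_calls` and its one-second default threshold are hypothetical choices.

```python
import re

# Assumed line shapes (not verified against the attached logs):
#   "initcall <fn>+0x.../0x... returned <rc> after <N> usecs"
#   "call <device>+ returned <rc> after <N> usecs"
PATTERN = re.compile(
    r"\b(?:initcall|call) (\S+).* returned (-?\d+) after (\d+) usecs"
)

def slow_calls(lines, min_usecs=1_000_000):
    """Return (usecs, name, return_code) tuples at or above min_usecs, slowest first."""
    hits = []
    for line in lines:
        m = PATTERN.search(line)
        if m:
            name, rc, usecs = m.group(1), int(m.group(2)), int(m.group(3))
            if usecs >= min_usecs:
                hits.append((usecs, name, rc))
    return sorted(hits, reverse=True)
```

Passing the lines of the captured syslog to `slow_calls` would list every callback that took a second or longer. An empty result would support the observation that the delay is not visible in the initcall statements.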

I also thought to try without systemd and using sysvinit again, but let me first catch a DRM debug log as well.
Comment 7 Jani Nikula 2015-01-12 13:48:48 UTC
Please try http://patchwork.freedesktop.org/patch/40237/
Comment 8 Martin Steigerwald 2015-01-17 09:12:26 UTC
Created attachment 112375 [details]
debug log of a hibernation cycle with longer delay with black screen on resume

Okay, now I caught a log with:

merkaba:~> cat /proc/version /proc/cmdline
Linux version 3.19.0-rc4-tp520-trim-all-bgroups+ (martin@merkaba) (gcc version 4.9.2 (Debian 4.9.2-10) ) #17 SMP PREEMPT Mon Jan 12 10:43:23 CET 2015
BOOT_IMAGE=/vmlinuz-3.19.0-rc4-tp520-trim-all-bgroups+ root=/dev/mapper/sata-debian ro rootflags=subvol=debian resume=/dev/mapper/sata-swap init=/bin/systemd no_console_suspend drm.debug=0xe initcall_debug

So this is with all debugging and it is huge.

Again, this is for the case where I get a longer delay on resuming from hibernation. Also, on hibernation itself the power LED continued to blink and the laptop didn't switch off after saving the image. It may be that it would eventually do so, but I didn't want to have it running all night as I was about to go to bed.

On resuming I saw something interesting: I had the black screen for at least a minute, I think more like two minutes, and then I saw the tty briefly displaying about half a screen of messages before the KDE/Plasma screen locker appeared. I don't know what it displayed there, as I was too far away from the laptop.

Jani, I will look at your suggestion next. May take a while.
Comment 9 Martin Steigerwald 2015-02-07 19:08:12 UTC
I think I didn't see this with 3.19-rc7 anymore, maybe not even with an earlier version. Thus closing. Thank you, Martin
Comment 10 Martin Steigerwald 2015-02-18 08:00:04 UTC
This still happens. Currently on 3.19-rc7.

Jani, does 3.20 contain the patch from your comment #7? If so, I think I will simply wait for 3.20-rc2 or so. Otherwise I will try the patch.

I think I will also ask on linux-pm about it, as it may be more related to PM in general than to Intel graphics: I noticed now that the laptop makes a beep after this delay. So it may be stuck in the general resume procedure.
Comment 11 Jani Nikula 2015-02-18 13:33:23 UTC
Odd, dmesg is filled with

Jan 16 11:35:14 merkaba kernel: [166660.246491] [drm:add_framebuffer_internal] [FB:914]
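
To see how much of a log is taken up by such messages, the DRM debug lines can be tallied by the function name in the `[drm:...]` tag. This is only a sketch based on the single line quoted above. `count_drm_messages` is a hypothetical helper, and the assumption is that every DRM debug line carries a tag of that shape.

```python
import re
from collections import Counter

# Matches the "[drm:<function>]" tag seen in the line quoted above.
DRM_TAG = re.compile(r"\[drm:(\w+)\]")

def count_drm_messages(lines):
    """Count log lines per DRM function name."""
    counts = Counter()
    for line in lines:
        m = DRM_TAG.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts

# The one line quoted in this comment; a real run would read the whole log.
SAMPLE = [
    "Jan 16 11:35:14 merkaba kernel: [166660.246491] "
    "[drm:add_framebuffer_internal] [FB:914]",
]

print(count_drm_messages(SAMPLE).most_common(5))
```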
Comment 12 Jani Nikula 2015-02-18 13:35:38 UTC
(In reply to Martin Steigerwald from comment #10)
> Jani, does 3.20 contain the patch from your comment #7? If so, I think I
> will simply wait for 3.20-rc2 or so. Otherwise I will try the patch.

The patch has been merged in v3.19-rc7.
Comment 13 Martin Steigerwald 2015-03-04 10:33:28 UTC
Thanks, Jani. I now tested with 4.0-rc2, but it doesn't hibernate at all:

[Bug 94241] New: Blanks screen on hibernation but does not switch off the machine
https://bugzilla.kernel.org/show_bug.cgi?id=94241
Comment 14 Jani Nikula 2015-03-04 11:25:12 UTC
(In reply to Martin Steigerwald from comment #13)
> [Bug 94241] New: Blanks screen on hibernation but does not switch off the
> machine
> https://bugzilla.kernel.org/show_bug.cgi?id=94241

I suspect we have this covered already, please check my comment on the bug and report back.
Comment 15 Martin Steigerwald 2015-03-05 08:24:23 UTC
Okay, with

https://bugzilla.kernel.org/show_bug.cgi?id=94241#c3

on top of

https://bugzilla.kernel.org/show_bug.cgi?id=94241#c1

the machine hibernates again.

I still get the symptom described in this bug on resume. The screen is black for a minute or more and then is suddenly unblanked.

And I know something else now: I am able to log into the machine from a second ThinkPad at that time.

Yet, I do not see *anything* regarding the initial wait in kern.log. I am still back at 3.19-rc7 due to the issue I described in

https://bugzilla.kernel.org/show_bug.cgi?id=94241#c4

as the backtrace described there has

Mar  5 08:57:41 merkaba kernel: [  858.107406]  [<ffffffff812b4315>] do_unblank_screen+0xd3/0x141

in it, there may be some relation, although that issue happens on opening a second Plasma/KDE session, as I described in the comment on the other bug. So either this is a third bug, or it is somehow related to the other bug or to this one.
Comment 16 Martin Steigerwald 2015-04-25 12:45:38 UTC
I am now using vanilla 4.0 without any hibernation or intel gfx related patch and I still have this issue. I am not using the patch from

https://bugzilla.kernel.org/show_bug.cgi?id=94241#c4

anymore, since Ilya doesn't think that the patch would help on my machine.

I am reopening since this is still unsolved.
Comment 17 Imre Deak 2015-06-22 08:48:31 UTC
Martin,

related to https://bugzilla.kernel.org/show_bug.cgi?id=94241#c14:

Did the issue in this report also get resolved meanwhile? You mentioned in comment 15 that the patch from

https://bugzilla.kernel.org/show_bug.cgi?id=94241#c3

made a difference for you. Could you confirm if that's still needed or not? I'd assume it's not needed based on the comments in the kernel bugzilla ticket.

Thanks.
Comment 18 Martin Steigerwald 2015-06-22 12:11:36 UTC
Hello Imre, I am not seeing this issue anymore either, with no patches related to this issue applied. Thank you, Martin
