Bug 70145 - Random blank screen after login with Haswell 4400
Summary: Random blank screen after login with Haswell 4400
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-10-04 16:51 UTC by drewvs
Modified: 2017-07-24 22:57 UTC

See Also:
i915 platform:
i915 features:


Attachments
xorg (34.98 KB, text/plain)
2013-10-04 16:53 UTC, drewvs
dmesg snippet with "drm.debug=0xe" set (27.19 KB, text/plain)
2014-03-30 01:21 UTC, Maksim Lin
full dmesg showing issue happening soon after boot (247.54 KB, text/plain)
2014-03-31 21:36 UTC, Maksim Lin

Description drewvs 2013-10-04 16:51:31 UTC
I am seeing odd black/blank/gray screen issues with the xf86-video-intel driver on the 3.11.1 kernel.

I've updated to the latest driver with the same result.

Sometimes, on a cold boot, the login manager (gdm3 in this case) shows properly, but after I enter my credentials the screen either goes black or turns gray with thick black lines.

This can also happen (less often) after typing a LUKS encryption passphrase. In this laptop's case, the display tries to switch from a lower resolution to a higher one (3200x1800). When the resolution switches, it fails and leaves a permanent black screen. I can tell the system has booted successfully, at least to the desktop, because the Fn sound keys still work.

This happens about 50% of the time. About 10% of the time, switching to a virtual terminal and back fixes the issue and displays the desktop as expected.

This also happens randomly after a suspend/resume cycle.

Xorg log coming soon.

I am not passing any relevant kernel parameters at boot, but when I use the nomodeset option, no issues are present.

I also compiled the 3.12 kernel/driver, and the issue is still present.

lspci:
00:00.0 Host bridge: Intel Corporation Haswell-ULT DRAM Controller (rev 09)
00:02.0 VGA compatible controller: Intel Corporation Haswell-ULT Integrated Graphics Controller (rev 09)
00:03.0 Audio device: Intel Corporation Device 0a0c (rev 09)
00:14.0 USB controller: Intel Corporation Lynx Point-LP USB xHCI HC (rev 04)
00:16.0 Communication controller: Intel Corporation Lynx Point-LP HECI #0 (rev 04)
00:1b.0 Audio device: Intel Corporation Lynx Point-LP HD Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation Lynx Point-LP PCI Express Root Port 1 (rev e4)
00:1c.2 PCI bridge: Intel Corporation Lynx Point-LP PCI Express Root Port 3 (rev e4)
00:1c.3 PCI bridge: Intel Corporation Lynx Point-LP PCI Express Root Port 4 (rev e4)
00:1d.0 USB controller: Intel Corporation Lynx Point-LP USB EHCI #1 (rev 04)
00:1f.0 ISA bridge: Intel Corporation Lynx Point-LP LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation Lynx Point-LP SATA Controller 1 [AHCI mode] (rev 04)
00:1f.3 SMBus: Intel Corporation Lynx Point-LP SMBus Controller (rev 04)
02:00.0 Network controller: Intel Corporation Wireless 7260 (rev 6b)
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)


Please let me know if additional information is needed and I will provide it promptly.
Comment 1 drewvs 2013-10-04 16:53:48 UTC
Created attachment 87132
xorg

drew@8bit:~$ inxi -Gx
Graphics:  Card: Intel Haswell-ULT Integrated Graphics Controller bus-ID: 00:02.0 
           X.Org: 1.12.4 drivers: intel (unloaded: fbdev,vesa) Resolution: 1920x1080@59.9hz 
           GLX Renderer: Mesa DRI Intel Haswell Mobile GLX Version: 3.0 Mesa 9.1.6 Direct Rendering: Yes
Comment 2 Paulo Zanoni 2013-10-08 20:01:35 UTC
Hi

Please boot with the "drm.debug=0xe" Kernel parameter, reproduce the problem and attach the output of "dmesg" here.

What happens if you don't have X running? Does it always work? It seems you're using Debian, so you can try to use the "text" Kernel parameter to test this.
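For readers unsure what the "drm.debug=0xe" value selects, here is a minimal sketch that decodes the mask. The bit names are an assumption based on the drm.debug parameter description in the kernel's DRM core for kernels of this era; check them against the kernel version in use. The helper name `decode_drm_debug` is hypothetical.

```python
# Assumed mapping of drm.debug bits to message categories
# (taken from the DRM core's parameter description; verify for your kernel).
DRM_DEBUG_BITS = {
    0x01: "core",
    0x02: "driver",
    0x04: "kms",
    0x08: "prime",
}


def decode_drm_debug(mask):
    """Return the names of the debug categories enabled by a drm.debug mask."""
    return [name for bit, name in DRM_DEBUG_BITS.items() if mask & bit]


# 0xe is binary 1110: every category above except "core".
print(decode_drm_debug(0xE))
```

Under that assumed mapping, 0xe turns on driver, KMS and PRIME messages while leaving the noisier core messages off.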

Thanks,
Paulo
Comment 3 drewvs 2013-10-08 20:02:32 UTC
Please close, I returned the laptop.
Comment 4 Paulo Zanoni 2013-10-08 20:11:24 UTC
(In reply to comment #3)
> Please close, I returned the laptop.

Did you conclude somehow that the problem was happening because the hardware had a problem?

I was expecting this bug to be our fault since we don't have many users of 3200x1800 yet :(
Comment 5 drewvs 2013-10-08 21:39:01 UTC
It was not the hardware. I returned it twice.
Comment 6 Paulo Zanoni 2013-10-08 21:58:52 UTC
Closing bug.

Thanks for the report. If you get the machine again, please reopen the bug. 3200x1800 sounds too exciting to not work :)
Comment 7 Maksim Lin 2014-03-30 01:21:02 UTC
Hi, I'm re-opening this since the issue I'm seeing sounds very similar. I'm also on Haswell HD4400 (Dell Latitude 7440) and I'm seeing the issue on everything from 3.12 through to 3.14rc7, though my LVDS is just 1080p, not the exciting 3200x1800.

Basically, the screen goes black at random times, or as random as I can tell: there seems to be no correlation with what I am doing, or with whether the laptop is in the eDock, connected by HDMI, or has no external display.

I'll attach the dmesg snippet from the last time this happened, a few minutes ago. Sorry, when it did I instinctively pressed Ctrl-Alt-F1 to switch to a virtual console and then Alt-F7 to go back to X, since I have noticed that once in a while this brings the LVDS screen back on (but this time it didn't).
Comment 8 Maksim Lin 2014-03-30 01:21:55 UTC
Created attachment 96606
dmesg snippet with "drm.debug=0xe" set
Comment 9 Maksim Lin 2014-03-30 01:26:11 UTC
Sorry, I should have also said that I am usually running with the patch from https://bugs.freedesktop.org/show_bug.cgi?id=73694 applied, though I have also seen this happen on kernels without the patch.

Also, sometimes when this occurs, even a reboot does not resolve the problem: even the firmware ("BIOS") splash screens don't show, and I have to pull out the battery and reboot. This has only happened when I've been out and have not had an external screen to connect to; when the same thing happens while docked in a Dell eDock or connected via the laptop's own HDMI port, I see the firmware boot screens show up on the external monitor.

Also, since I'm already building my own kernel, I'm happy to try out any patches or alternative git branches that people can point me to.
Comment 10 Maksim Lin 2014-03-30 01:56:21 UTC
Sorry, I forgot to include the output of lspci -v, in case it helps:

00:02.0 VGA compatible controller: Intel Corporation Haswell-ULT Integrated Graphics Controller (rev 0b) (prog-if 00 [VGA controller])
        Subsystem: Dell Device 05cb
        Flags: bus master, fast devsel, latency 0, IRQ 63
        Memory at f7800000 (64-bit, non-prefetchable) [size=4M]
        Memory at e0000000 (64-bit, prefetchable) [size=256M]
        I/O ports at f000 [size=64]
        Expansion ROM at <unassigned> [disabled]
        Capabilities: <access denied>
        Kernel driver in use: i915
        Kernel modules: i915
Comment 11 Maksim Lin 2014-03-31 21:36:56 UTC
Created attachment 96684
full dmesg showing issue happening soon after boot

Hi, sorry, being a dev myself, I realised how annoying it would be to have just the log snippet, so I managed to get a full dmesg, with the laptop screen going black a little after boot and an external monitor connected via the laptop's own HDMI port (not the eDock).

I didn't notice exactly when it happened, but I think it was around the 09:19:53 mark in the attached dmesg log.

Hope this helps.
Comment 12 Jani Nikula 2014-04-01 09:38:00 UTC
First things first, please check your kernel build process and make sure your kernel and modules aren't tainted. Grep for "taint" in the dmesg. This one you'll need to figure out yourself.

One thing to try is i915.i915_enable_fbc=0 (or i915.enable_fbc=0 on drm-next).
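The "grep for taint" step can also be done on a saved dmesg dump. The following is only a sketch: the function name `taint_lines` is hypothetical, and the sample log lines are made-up placeholders, not output from any machine in this report.

```python
def taint_lines(dmesg_text):
    """Return the lines of a dmesg dump that mention kernel tainting."""
    return [line for line in dmesg_text.splitlines() if "taint" in line.lower()]


# Placeholder log text for illustration only (not real output).
sample = """\
[    1.000000] example_a: module has bad taint, not creating trace events
[    2.000000] example_b: an unrelated message
[    3.000000] example_c: module verification failed - tainting kernel
"""

for line in taint_lines(sample):
    print(line)
```

This is the equivalent of running `dmesg | grep -i taint`, but it works on an attached log file read into a string.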
Comment 13 Maksim Lin 2014-04-01 09:45:20 UTC
Thanks Jani.
I did and I do get:
dmesg|grep taint
[Tue Apr  1 08:13:41 2014] videodev: module has bad taint, not creating trace events
[Tue Apr  1 08:13:42 2014] kvm: module has bad taint, not creating trace events

I thought "taints" were something to do with using binary-only (non-GPL) kernel modules, but I guess this is something else since I'm running a mainline kernel I built myself. As you say, I'll google and try to educate myself on this.

And thanks for the suggested kernel params, I'll try them too.

Also, would I be better off compiling from the drm-next branch instead of Linus's master branch?
Comment 14 Jani Nikula 2014-04-01 10:32:33 UTC
(In reply to comment #13)
> Also would I be better off trying to compile from the drm-next branch
> instead of Linus's master branch?

The released v3.14 is fine for now. We're in the middle of the merge window so Linus' master might be a bit too volatile right now.
Comment 15 Maksim Lin 2014-04-03 00:58:54 UTC
Hi,

OK, having read this LWN article http://lwn.net/Articles/588799/ I'm assuming this output:
dmesg|grep taint
[Thu Apr  3 11:41:39 2014] sdhci: module verification failed: signature and/or  required key missing - tainting kernel
[Thu Apr  3 11:41:40 2014] sunrpc: module has bad taint, not creating trace events
[Thu Apr  3 11:41:40 2014] nfs: module has bad taint, not creating trace events
[Thu Apr  3 11:41:40 2014] cfg80211: module has bad taint, not creating trace events
...etc

is just because of the bad re-use of the taint flag that will be fixed in 3.15.

So given I'm building my own 3.14 kernel with just one patch applied from Daniel (bug #73694), I'm pretty sure I don't have any tainted modules.

I'm rebuilding the 3.14 released kernel now (instead of rc7), will try using "i915.i915_enable_fbc=0" and will report back.
Comment 16 Maksim Lin 2014-04-03 01:08:54 UTC
Ok maybe I got the wrong end of the stick there - I just tried:

cat /proc/sys/kernel/tainted
2

and according to this (http://kmaiti.blogspot.com.au/2011/09/how-to-check-whether-current-running.html) it means I have done a forced module load, but I have just booted (ubuntu 12.02.3) and done no such thing myself. Does Ubuntu do this in an init script for some reason?

More to the point: I have found by trial and error that switching from virtual console 7 to 1 and back into X again, then suspending the laptop and waking it again immediately, does seem to bring the screen back on. Doing just one of those two actions does not. However, after three or four rounds of this, it stops working too, and then nothing short of forcing a power-down, pulling the battery and restarting fixes it.
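A value read from /proc/sys/kernel/tainted is a bitmask, so it can be decoded bit by bit. The sketch below lists only a subset of flags, with descriptions paraphrased from the kernel's tainted-kernels documentation; treat the table as an assumption and check it against the kernel version in use. The names `TAINT_FLAGS`, `decode_taint` and `read_taint` are hypothetical.

```python
# Assumed subset of taint bits (bit number -> meaning), paraphrased from
# the kernel's tainted-kernels documentation; not a complete list.
TAINT_FLAGS = {
    0: "proprietary module was loaded",
    1: "module was force-loaded",
    9: "kernel issued a warning",
    12: "out-of-tree module was loaded",
    13: "unsigned module was loaded",
}


def decode_taint(value):
    """Return descriptions of the known taint bits set in value."""
    return [desc for bit, desc in sorted(TAINT_FLAGS.items()) if value & (1 << bit)]


def read_taint(path="/proc/sys/kernel/tainted"):
    """Read the current taint mask from procfs (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())


# The value reported in this comment was 2, i.e. only bit 1 is set.
print(decode_taint(2))
```

A value of 2 therefore maps to the "force-loaded" flag. Per Comment 15, kernels before 3.15 appear to have reused that flag for modules that fail signature verification, which would explain seeing it without any explicit forced load.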
Comment 17 drewvs 2014-04-03 01:12:00 UTC
Save your breath, it seems things like this never get solved. I returned the laptop and bought a different one...

I had the same behavior with tty terminals too...
Comment 18 Maksim Lin 2014-04-03 01:18:39 UTC
drewvs: I'm a bit more optimistic than that, since for the previous issue I reported here (#73694) Daniel Vetter provided a patch that fixed it very quickly.
Even though this is a *VERY* annoying issue when I'm using the laptop without an external monitor, I'm hoping that if I can provide the kernel devs with enough information, they can figure out what the issue is, or at least a workaround.
Comment 19 Maksim Lin 2014-04-05 04:07:46 UTC
Sorry to waste everyone's time: this has turned out to be a hardware fault with the laptop.
Comment 20 Daniel Vetter 2014-04-05 10:15:15 UTC
In the future please file a new bug report, even when you're fairly sure that the symptoms match. GFX bugs are tricky and bugzilla is a pain when unrelated issues are mixed together in one report.