When grub2 starts the kernel it may hand off the VGA controller initialised in either text or graphics mode; until now text mode has been the common case. When the console is handed off in graphics mode, the intel i915 kernel driver does not initialise the display correctly, leaving it either all black or all purple (the colour is likely a corner of the background which X is attempting to display on the framebuffer at the time). gdm starts normally and, if you type blind, you can log in, but the display does not change. If you then suspend and resume the machine, the display is restored to normal.
As recommended, I have captured intel_reg_dumper dumps of the display as handed off from GRUB in both text and graphics mode, plus dumps taken when the display is solid purple and when it is working.
Looking at the difference between the GRUB text mode and the GRUB graphics mode dumps it appears there is very little difference:
$ diff -u GRUB-TEXT GRUB-GRAPHICS
- DSPBBASE: 0x00000000
+ DSPBBASE: 0x0012c000
And although there are a number of other differences between the purple and working dumps, there is a similar discrepancy:
$ diff -u PURPLE-DUMP WORKING-DUMP | grep BASE
- DSPBBASE: 0x0012c000
+ DSPBBASE: 0x00000000
Could this register account for the difference in behaviour? I will attach these dumps below. All of this testing was done with grub 1.99 and v2.6.37-rc3 of the kernel.
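For anyone wanting to repeat the comparison without eyeballing a full diff, here is a minimal sketch that lists only the registers whose values differ between two dumps. It assumes each register appears on a line of the form `NAME: 0xVALUE`, as in the excerpts above; the function names and the `EXAMPLE_REG` entry are made up for illustration, and real intel_reg_dumper output may carry extra decoded text that this simply ignores.

```python
import re

# Assumption: a register line looks like "DSPBBASE: 0x0012c000", possibly
# followed by decoded text that we do not need.
REG_LINE = re.compile(r"^\s*([A-Za-z0-9_]+):\s*(0x[0-9a-fA-F]+)")


def parse_dump(text):
    """Map register name -> integer value for every line that matches."""
    regs = {}
    for line in text.splitlines():
        match = REG_LINE.match(line)
        if match:
            regs[match.group(1)] = int(match.group(2), 16)
    return regs


def changed_registers(before_text, after_text):
    """Return {name: (before, after)} for registers present in both dumps
    whose values differ."""
    before, after = parse_dump(before_text), parse_dump(after_text)
    common = sorted(before.keys() & after.keys())
    return {n: (before[n], after[n]) for n in common if before[n] != after[n]}


if __name__ == "__main__":
    # DSPBBASE values are the ones quoted above; EXAMPLE_REG is a
    # hypothetical placeholder for an unchanged register.
    grub_text = "DSPBBASE: 0x00000000\nEXAMPLE_REG: 0x00000001\n"
    grub_graphics = "DSPBBASE: 0x0012c000\nEXAMPLE_REG: 0x00000001\n"
    for name, (old, new) in changed_registers(grub_text, grub_graphics).items():
        print(f"{name}: {old:#010x} -> {new:#010x}")
```

In practice the two strings would be read from the attached dump files rather than written inline.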
Created attachment 40771
Intel register dump in early boot with GRUB using text mode
Created attachment 40772
Intel register dump in early boot with GRUB using graphics mode
Created attachment 40773
Intel register dump at gdm prompt, display solid purple
Created attachment 40774
Intel register dump at gdm prompt following suspend/resume cycle, display normal
I think the key clue is the FIFO_UNDERRUN on pipe B.
BIOS starts on pipe A, plane A (probably).
grub2 switches to pipe B, plane B (though neither plane is actually enabled!)
i915.ko starts on pipe B, plane A.
My hypothesis is that we fail to tear down the state correctly (because it doesn't match our pipe/plane coupling), so there is a small window where the registers are misconfigured, leading to undefined behaviour.
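To make the coupling mismatch concrete, here is a rough sketch of the check the hypothesis implies: given the value of a plane control register, decide whether the plane is enabled and attached to a pipe other than the one the driver expects to pair it with. The bit positions (enable in bit 31, pipe select starting at bit 24) are my assumption of the pre-Ironlake DSPxCNTR layout and should be checked against i915_reg.h; the function name is hypothetical and this is not the patch itself.

```python
# Assumed layout of a DSPxCNTR plane control register; treat these
# constants as illustrative rather than authoritative.
DISPLAY_PLANE_ENABLE = 1 << 31
DISPPLANE_SEL_PIPE_SHIFT = 24
DISPPLANE_SEL_PIPE_MASK = 0x3 << DISPPLANE_SEL_PIPE_SHIFT

PIPE_A, PIPE_B = 0, 1


def plane_conflicts(dspcntr, expected_pipe):
    """Return True if the plane is enabled but reads from a pipe other
    than the one the driver intends to couple it with."""
    if not dspcntr & DISPLAY_PLANE_ENABLE:
        # A disabled plane cannot conflict, whatever its pipe select says.
        return False
    attached = (dspcntr & DISPPLANE_SEL_PIPE_MASK) >> DISPPLANE_SEL_PIPE_SHIFT
    return attached != expected_pipe


if __name__ == "__main__":
    # Hypothetical register value: plane enabled and pointed at pipe B,
    # while the driver expects to pair it with pipe A.
    value = DISPLAY_PLANE_ENABLE | (PIPE_B << DISPPLANE_SEL_PIPE_SHIFT)
    print(plane_conflicts(value, PIPE_A))
```

A plane for which this returns True is one the normal teardown path would not expect to find active.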
Created attachment 40777
Sanitize modesetting registers.
Created attachment 40778
Sanitize modesetting registers (v2).
Second version that should hopefully not also clear an uninitialised plane.
I have pulled the v2 patch and tested it on the affected hardware. Over 30 reboots I have had no failures at all; every boot consistently displayed the gdm screen as expected. Without the patch we are 0/5 for a warm reboot.
Thank you for your work on this:
Tested-by: Andy Whitcroft <firstname.lastname@example.org>
I've pushed the patch to -fixes and will be sending a pull request to Dave in a couple of days if nothing turns up.
Author: Chris Wilson <email@example.com>
Date: Fri Dec 3 15:37:31 2010 +0000
drm/i915: Clean conflicting modesetting registers upon init
If we leave the registers in a conflicting state then when we attempt
to teardown the active mode, we will not disable the pipes and planes
in the correct order -- leaving a plane reading from a disabled pipe and
possibly leading to undefined behaviour.
Reported-and-tested-by: Andy Whitcroft <firstname.lastname@example.org>
Signed-off-by: Chris Wilson <email@example.com>
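The ordering concern in the commit message can be illustrated with a toy model. This is only a sketch of the idea (disable a conflicting plane before the pipe it reads from), not the code that was merged; the `ToyDisplay` class and `sanitize` function are invented for the example, and real hardware also needs waits between the steps that are omitted here.

```python
class ToyDisplay:
    """Toy stand-in for the display hardware: two pipes, two planes."""

    def __init__(self):
        self.pipe_enabled = {"A": False, "B": False}
        # Which pipe each plane reads from; None means the plane is disabled.
        self.plane_pipe = {"A": None, "B": None}

    def disable_plane(self, plane):
        self.plane_pipe[plane] = None

    def disable_pipe(self, pipe):
        # Model the failure in the commit message: a plane must never be
        # left reading from a pipe that has been switched off.
        readers = [p for p, src in self.plane_pipe.items() if src == pipe]
        if readers:
            raise RuntimeError(f"plane(s) {readers} still reading from pipe {pipe}")
        self.pipe_enabled[pipe] = False


def sanitize(display, plane, expected_pipe):
    """If the plane is active on the 'wrong' pipe, shut down the plane
    first and then that pipe. Returns True if anything was changed."""
    attached = display.plane_pipe[plane]
    if attached is None or attached == expected_pipe:
        return False
    display.disable_plane(plane)
    display.disable_pipe(attached)
    return True


if __name__ == "__main__":
    display = ToyDisplay()
    # Hypothetical hand-off state: plane A enabled and attached to pipe B.
    display.pipe_enabled["B"] = True
    display.plane_pipe["A"] = "B"
    print(sanitize(display, "A", "A"), display.plane_pipe, display.pipe_enabled)
```

Calling `disable_pipe` before `disable_plane` in this model raises, which is the misordering the patch is meant to rule out.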
Andy did verify, and a couple of days have gone by with nothing turning up ... closing.