Bug 25417 - [855GM KMS bisected] blank screen after upgrading to 2.6.32
Status: RESOLVED FIXED
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: high major
Assigned To: Jesse Barnes
Keywords: NEEDINFO, regression
Depends on:
Blocks: miss-2.6.33
Reported: 2009-12-03 07:29 UTC by Miklos Vajna
Modified: 2010-03-01 05:02 UTC

Attachments
2.6.32 kernel config to reproduce the bug. (108.32 KB, text/plain)
2009-12-03 07:29 UTC, Miklos Vajna
Patch that reverts the "first bad commit", and tmp works around the issue here. (1.03 KB, text/plain)
2009-12-04 20:31 UTC, Miklos Vajna
dmidecode output (6.29 KB, text/plain)
2010-02-10 15:21 UTC, Miklos Vajna
blacklist Kapok M450C boards (479 bytes, patch)
2010-02-10 15:41 UTC, Jesse Barnes

Description Miklos Vajna 2009-12-03 07:29:08 UTC
Created attachment 31708
2.6.32 kernel config to reproduce the bug.

Hi,

Bug description:

Today I upgraded my kernel from 2.6.31.4 to 2.6.32 on my test machine.

I was using KMS and it worked fine. After the upgrade I get a blank screen (the backlight seems to be turned off) right after grub. I booted with init=/bin/sh and confirmed that the blank screen occurs when I load the i915 kernel module.

System environment:
-- chipset:

$ sudo lspci |grep VGA
00:02.0 VGA compatible controller: Intel Corporation 82852/855GM Integrated Graphics Device (rev 02)

-- system architecture: 32-bit

-- xf86-video-intel:
-- xserver:
-- mesa:
-- libdrm:


$ pacman -Q xf86-video-intel xorg-server mesa libdrm
xf86-video-intel 2.8.0-1
xorg-server 1.6.1-8
mesa 7.5.1-2
libdrm 2.4.11-1

-- kernel: vanilla 2.6.32 (as mentioned above, this upgrade caused the problem; 2.6.31.4 works fine, and 2.6.32 also works fine if I append i915.modeset=0 to the kernel command line)
-- Linux distribution: Frugalware Linux 1.2pre1
-- Machine or mobo model: Clevo M450C
-- Display connector: Nothing special, notebook internal LCD

Reproducing steps:

Upgrade to 2.6.32 and reboot.

Given that the box freezes when KMS is enabled, I can't really attach logs - if it's needed, I can attach xorg log and dmesg when KMS is disabled.

Additional info:

The same 2.6.32 kernel on another machine with another Intel card:

00:02.0 VGA compatible controller: Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller (rev 09)

just works fine.

I'm attaching the 2.6.32 kernel config I use.

Thanks!
Comment 1 Jesse Barnes 2009-12-03 09:38:07 UTC
Are you sure the machine hangs?  If you wait for it to boot can you ssh into it?  Another thing to try would be to boot w/o loading the i915 driver (e.g. by booting to single user) then writing a small script to load the driver, collect the log, and reboot.  If you can get a dmesg with the drm debug param set to 1 we might get some good information...
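
The load-and-capture script suggested here might look roughly like the sketch below. It assumes drm and i915 are built as modules (so that `debug=1` can be passed at load time), and the log path, the 10 second delay and the DRY_RUN switch are illustrative choices, not something from this report. If loading i915 really hangs the machine, the log will of course never be written.

```shell
#!/bin/sh
# Sketch: load i915 with DRM debugging enabled, save dmesg, then reboot.
# LOG and DRY_RUN are hypothetical names; DRY_RUN defaults to 1 so that
# running this as-is only prints what it would do.
LOG="${LOG:-/root/i915-dmesg.log}"
DRY_RUN="${DRY_RUN:-1}"

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

capture_i915_log() {
    run modprobe drm debug=1        # enable DRM debug messages
    run modprobe i915 modeset=1     # the step that blanks the screen
    run sleep 10                    # give the driver time to log
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: dmesg > $LOG"
    else
        dmesg > "$LOG"
    fi
    run sync                        # flush the log to disk before rebooting
    run reboot
}

capture_i915_log
```

Run it with DRY_RUN=0 as root from single-user mode (or an init=/bin/sh shell) once the printed commands look right.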

Otherwise you could try a bisect to figure out where things went bad.
Comment 2 Miklos Vajna 2009-12-03 16:19:04 UTC
On Thu, Dec 03, 2009 at 09:38:08AM -0800, bugzilla-daemon@freedesktop.org wrote:
> Are you sure the machine hangs?  If you wait for it to boot can you ssh into
> it?  Another thing to try would be to boot w/o loading the i915 driver (e.g. by
> booting to single user) then writing a small script to load the driver, collect
> the log, and reboot.  If you can get a dmesg with the drm debug param set to 1
> we might get some good information...

As I said earlier, it is responsive until the i915 module is loaded (i.e. I
can switch the numlock on/off), but at least the display and the
keyboard are no longer accessible after it.

Does it make sense to just load the drm module (or anything else) and
try to see if that outputs anything useful to dmesg _and_ does not
hang the machine?

> Otherwise you could try a bisect to figure out where things went bad.

Sure, I'll give it a try tomorrow.
Comment 3 Miklos Vajna 2009-12-04 04:42:24 UTC
Okay, here is the result of bisect:

$ git bisect bad
58a27471d00dc09945cbcfbbc5cbcdcd3c28211d is the first bad commit
commit 58a27471d00dc09945cbcfbbc5cbcdcd3c28211d
Author: Zhenyu Wang <zhenyuw@linux.intel.com>
Date:   Fri Sep 25 08:01:28 2009 +0000

    drm/i915: Fix FDI M/N setting according with correct color depth

    FDI M/N calculation hasn't taken the current pipe color depth into account,
    but always set as 24bpp. This one checks current pipe color depth setting,
    and change FDI M/N calculation a little to use bits_per_pixel first, then
    convert to bytes_per_pixel later.

    This fixes display corrupt issue on Arrandle LVDS with 1600x900 panel
    in 18bpp dual-channel mode.

    Cc: Stable Team <stable@kernel.org>
    Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
    Signed-off-by: Eric Anholt <eric@anholt.net>

:040000 040000 6483e84c0b178172bbb3e80033c7e5c7860754bc 3964465109b745523a5ce641dc47bf347fca5701 M      drivers

Full log:

$ git bisect log
git bisect start 'drivers/gpu'
# bad: [22763c5cf3690a681551162c15d34d935308c8d7] Linux 2.6.32
git bisect bad 22763c5cf3690a681551162c15d34d935308c8d7
# good: [74fca6a42863ffacaf7ba6f1936a9f228950f657] Linux 2.6.31
git bisect good 74fca6a42863ffacaf7ba6f1936a9f228950f657
# good: [aa96e341c2a14d6bec114c933bd813ecb972605f] drm/radeon: Fix setting of bits
git bisect good aa96e341c2a14d6bec114c933bd813ecb972605f
# good: [c03342fa6d4617a77cb867ee0ec71665d520eb69] drm/i915: disable powersave feature for Ironlake currently
git bisect good c03342fa6d4617a77cb867ee0ec71665d520eb69
# good: [6fa8d66af83710b3610bd3b2581f051074f2b416] drm/radeon/kms: remove some misleading debugging output
git bisect good 6fa8d66af83710b3610bd3b2581f051074f2b416
# bad: [ca9ab10033d190c1ede85fdf456307bdfdabf079] drm/i915: Select CONFIG_SHMEM
git bisect bad ca9ab10033d190c1ede85fdf456307bdfdabf079
# bad: [4204878179c99d419d392d78d817729992b4c442] drm/i915: Ironlake suspend/resume support
git bisect bad 4204878179c99d419d392d78d817729992b4c442
# bad: [b1f60b7029989da71fd8ea1b1176480fac9e846c] drm/i915: fix panel fitting filter coefficient select for Ironlake
git bisect bad b1f60b7029989da71fd8ea1b1176480fac9e846c
# bad: [0eb96d6ed38430b72897adde58f5477a6b71757a] drm/i915: save/restore BLC histogram control reg across suspend/resume
git bisect bad 0eb96d6ed38430b72897adde58f5477a6b71757a
# bad: [58a27471d00dc09945cbcfbbc5cbcdcd3c28211d] drm/i915: Fix FDI M/N setting according with correct color depth
git bisect bad 58a27471d00dc09945cbcfbbc5cbcdcd3c28211d
Comment 4 Miklos Vajna 2009-12-04 10:07:45 UTC
I suspect this is a config issue. I wanted to double-check that c03342fa6d4617a77cb867ee0ec71665d520eb69 (the parent of 58a27471d00dc09945cbcfbbc5cbcdcd3c28211d) is really good, but now it tests bad as well. I'm trying to produce a "good" and a "bad" config and will attach the diff, in case you can spot a difference that matters.
Comment 5 Miklos Vajna 2009-12-04 19:53:42 UTC
So I did the bisect again, but this time I did not limit it to a path, and I started every step with a 'cp -f ../config.orig .config && yes "" |make config' so that the config could not change between steps.
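
For anyone repeating this, the per-step procedure can be sketched as below. The config reset is the command quoted above; the v2.6.31/v2.6.32 tags, the plain `make` build and the DRY_RUN switch are assumptions added for illustration. Each built kernel still has to be booted and marked by hand with `git bisect good` or `git bisect bad`.

```shell
#!/bin/sh
# Sketch: unrestricted bisect between 2.6.31 and 2.6.32 with a pinned config.
# CONFIG_ORIG matches the path used above; DRY_RUN is a hypothetical switch
# that defaults to 1, so running this as-is only prints the commands.
CONFIG_ORIG="${CONFIG_ORIG:-../config.orig}"
DRY_RUN="${DRY_RUN:-1}"

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Start the bisect over the whole tree (no path limit).
start_bisect() {
    run git bisect start
    run git bisect bad v2.6.32
    run git bisect good v2.6.31
}

# Reset the config to the known baseline before building each step, so
# every kernel is built from the same starting configuration.
prepare_step() {
    run cp -f "$CONFIG_ORIG" .config
    if [ "$DRY_RUN" = "1" ]; then
        echo 'would run: yes "" | make config'
    else
        # 'yes ""' answers every new prompt with its default value.
        yes "" | make config
    fi
    run make
}

start_bisect
prepare_step
```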

Here is the result:

$ git bisect good
b42d4c5c6a872815d711e5d51a600f5122c38eee is the first bad commit
commit b42d4c5c6a872815d711e5d51a600f5122c38eee
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date:   Thu Sep 10 15:28:04 2009 -0700

    drm/i915: use ACPI LID status for LVDS ->detect hook

    We can't load or hotplug detect LVDS like we can other outputs, but if
    there's a lid device present we can use it as a proxy.  This allows the
    LFP state to be determined at ->detect time, making configurations
    requiring manual intervention today "just work" assuming the lid device
    status is correct.

    Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    Signed-off-by: Eric Anholt <eric@anholt.net>

:040000 040000 e8860db5d490afe490ea66083c57a639874fcafa e004d6ea8d9f32c812a28f1ec59e5068db49ad85 M      drivers

The log:

git bisect start
# bad: [22763c5cf3690a681551162c15d34d935308c8d7] Linux 2.6.32
git bisect bad 22763c5cf3690a681551162c15d34d935308c8d7
# good: [74fca6a42863ffacaf7ba6f1936a9f228950f657] Linux 2.6.31
git bisect good 74fca6a42863ffacaf7ba6f1936a9f228950f657
# good: [73c583e4e2dd0fbbf2fafe0cc57ff75314fe72df] Merge branch 'omap-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6
git bisect good 73c583e4e2dd0fbbf2fafe0cc57ff75314fe72df
# bad: [8b3f6af86378d0a10ca2f1ded1da124aef13b62c] Merge branch 'master' of /home/davem/src/GIT/linux-2.6/
git bisect bad 8b3f6af86378d0a10ca2f1ded1da124aef13b62c
# good: [a87e84b5cdfacf11af4e8a85c4bca9793658536f] Merge branch 'for-2.6.32' of git://linux-nfs.org/~bfields/linux
git bisect good a87e84b5cdfacf11af4e8a85c4bca9793658536f
# good: [fd8b327ee46593ccc5230bfd053287fbf7c38a69] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6
git bisect good fd8b327ee46593ccc5230bfd053287fbf7c38a69
# good: [9f6ac7850a9c6363f4117fd2248e232a2d534627] Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6
git bisect good 9f6ac7850a9c6363f4117fd2248e232a2d534627
# good: [2c9871de0ae89a0e2c365ea6e277135fe031d8b4] Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
git bisect good 2c9871de0ae89a0e2c365ea6e277135fe031d8b4
# bad: [94e0fb086fc5663c38bbc0fe86d698be8314f82f] Merge branch 'drm-intel-next' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel
git bisect bad 94e0fb086fc5663c38bbc0fe86d698be8314f82f
# bad: [2d7ef395b310e17c86fa6190f21ea1f2eccae5d1] drm/i915: Immediately discard any backing storage for uneeded objects
git bisect bad 2d7ef395b310e17c86fa6190f21ea1f2eccae5d1
# bad: [11ed50ec2a316928c2bacc1149bded86c6a96068] drm/i915: Implement GPU reset on i965
git bisect bad 11ed50ec2a316928c2bacc1149bded86c6a96068
# bad: [8082400327d8d2ca54254b593644942bed0edd25] drm/i915: framebuffer compression for pre-GM45
git bisect bad 8082400327d8d2ca54254b593644942bed0edd25
# good: [e270846fa7c350712553d767e61cf8b3bbfbd58a] drm/i915: Add the missing clone_mask for SDVO-VGA(RGB1)
git bisect good e270846fa7c350712553d767e61cf8b3bbfbd58a
# good: [7e12715ecc47a8a59154afe2746e48998225bb69] ACPI button: provide lid status functions
git bisect good 7e12715ecc47a8a59154afe2746e48998225bb69
# bad: [b42d4c5c6a872815d711e5d51a600f5122c38eee] drm/i915: use ACPI LID status for LVDS ->detect hook
git bisect bad b42d4c5c6a872815d711e5d51a600f5122c38eee
# good: [c1c7af60892070e4b82ad63bbfb95ae745056de0] drm/i915: force mode set at lid open time
git bisect good c1c7af60892070e4b82ad63bbfb95ae745056de0
Comment 6 Miklos Vajna 2009-12-04 20:30:00 UTC
FYI, reverting the "first bad commit" works around the issue for me. Given that git revert results in a merge conflict, I'm attaching the patch I'm using here.
Comment 7 Miklos Vajna 2009-12-04 20:31:38 UTC
Created attachment 31760
Patch that reverts the "first bad commit", and tmp works around the issue here.
Comment 8 Vasyl Demin 2010-01-01 06:09:31 UTC
Same problem on an HP Compaq nx9020 laptop.
Video: Intel Corporation 82852/855GM Integrated Graphics Device (rev 02)
OS: Arch Linux i686
Kernel: 2.6.32.2

When the i915 module is loaded with modeset=1, the screen is blanked and stays
that way, with the following in dmesg:

[drm] set up 31M of stolen space
[drm] DAC-6: set mode 640x480 0
------------[ cut here ]------------
WARNING: at drivers/gpu/drm/drm_crtc_helper.c:1032 drm_helper_initial_config+0x57/0x60 [drm_kms_helper]()
Hardware name: compaq nx9020 (PG711ES#ABB)
No connectors reported connected with modes
Modules linked in: i915(+) drm_kms_helper drm i2c_algo_bit button i2c_core video output intel_agp agpgart
Pid: 33, comm: modprobe Not tainted 2.6.32-ARCH #1
Call Trace:
 [<c103ea9e>] ? warn_slowpath_common+0x6e/0xb0
 [<ee77bb27>] ? drm_helper_initial_config+0x57/0x60 [drm_kms_helper]
 [<c103eb2b>] ? warn_slowpath_fmt+0x2b/0x30
 [<ee77bb27>] ? drm_helper_initial_config+0x57/0x60 [drm_kms_helper]
 [<ee8a3b0c>] ? i915_driver_load+0x136c/0x1540 [i915]
 [<ee8a2780>] ? i915_vga_set_decode+0x0/0x20 [i915]
 [<ee75faf0>] ? drm_get_minor+0x1b0/0x2e0 [drm]
 [<ee75feb9>] ? drm_get_dev+0x299/0x4c0 [drm]
 [<c11322ef>] ? sysfs_addrm_start+0x3f/0xb0
 [<c1182990>] ? pci_match_device+0xa0/0xc0
 [<c118281b>] ? local_pci_probe+0xb/0x10
 [<c1183601>] ? pci_device_probe+0x61/0x80
 [<c11f697b>] ? driver_probe_device+0x7b/0x170
 [<c1182990>] ? pci_match_device+0xa0/0xc0
 [<c11f6ae9>] ? __driver_attach+0x79/0x80
 [<c11f6a70>] ? __driver_attach+0x0/0x80
 [<c11f61e2>] ? bus_for_each_dev+0x52/0x80
 [<c11f6816>] ? driver_attach+0x16/0x20
 [<c11f6a70>] ? __driver_attach+0x0/0x80
 [<c11f5ad6>] ? bus_add_driver+0xc6/0x2b0
 [<c1183540>] ? pci_device_remove+0x0/0x40
 [<c11f6d83>] ? driver_register+0x63/0x120
 [<ee75b52d>] ? drm_init+0x2d/0xf0 [drm]
 [<ee8e4000>] ? i915_init+0x0/0x48 [i915]
 [<c118382d>] ? __pci_register_driver+0x3d/0xb0
 [<c100112f>] ? do_one_initcall+0x2f/0x190
 [<c1073294>] ? sys_init_module+0xb4/0x220
 [<c1003ad4>] ? syscall_call+0x7/0xb
---[ end trace bff4d44a38794df4 ]---
[drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
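
For comparing other machines against this trace, a minimal sketch that checks a saved dmesg file for the same warning. The function name and passing the log path as the first argument are illustrative only. Finding the message does not prove the cause is the same, although it would be consistent with the LVDS panel being reported as disconnected.

```shell
#!/bin/sh
# Sketch: look for the "no connectors" warning in a saved kernel log.
# has_no_connector_warning is a hypothetical helper name.
has_no_connector_warning() {
    # $1: path to a file containing dmesg output
    grep -q "No connectors reported connected with modes" "$1"
}

# Optional command-line use: pass the log file as the first argument.
LOG="${1:-}"
if [ -n "$LOG" ]; then
    if has_no_connector_warning "$LOG"; then
        echo "$LOG: warning present"
    else
        echo "$LOG: warning not found"
    fi
fi
```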
Comment 9 Miklos Vajna 2010-01-10 12:37:14 UTC
I just wanted to mention that I tried to upgrade to 2.6.33-rc3, which has the same problem. (And the same patch "fixes" it for me).
Comment 10 Jesse Barnes 2010-02-08 16:43:50 UTC
Sounds like your lid status is incorrect.  What does /proc/acpi/button/lid/LID/state report on your machine after bootup?  Many machines report "closed" until the first lid switch event...

If that's the case, can you attach the output of 'dmidecode' so we can quirk your machine?
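
A quick way to check this on any machine, as a sketch: the state file holds a single line of the form "state:      open", and the lid device is not always named LID, so the loop below globs over whatever lid devices exist. The function name is made up for illustration.

```shell
#!/bin/sh
# Sketch: print the lid state the kernel currently reports.
# lid_state is a hypothetical helper name.
lid_state() {
    # $1: path to a lid state file; prints "open" or "closed"
    awk '/^state:/ { print $2 }' "$1"
}

# Report every lid device found; prints nothing if there is none.
for f in /proc/acpi/button/lid/*/state; do
    [ -r "$f" ] || continue
    echo "$f: $(lid_state "$f")"
done
```

Run it right after boot with the lid open, before opening or closing the lid, since the first lid event may correct the reported state.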
Comment 11 Miklos Vajna 2010-02-10 15:21:51 UTC
Created attachment 33229
dmidecode output
Comment 12 Miklos Vajna 2010-02-10 15:37:09 UTC
On Mon, Feb 08, 2010 at 04:43:50PM -0800, bugzilla-daemon@freedesktop.org wrote:
> Sounds like your lid status is incorrect.  What does
> /proc/acpi/button/lid/LID/state report on your machine after bootup?  Many
> machines report "closed" until the first lid switch event...

That's right, it reports closed, even though the lid is obviously open.

> If that's the case, can you attach the output of 'dmidecode' so we can quirk
> your machine?

Done.

Thanks,

Miklos
Comment 13 Jesse Barnes 2010-02-10 15:41:49 UTC
Created attachment 33230
blacklist Kapok M450C boards

Can you give this patch a try?  Hopefully it'll work around your machine's broken _LID method.
Comment 14 Miklos Vajna 2010-02-10 17:02:49 UTC
Yes, this fixes the problem. :)

Thanks!
Comment 15 Eric Anholt 2010-02-26 11:36:38 UTC
Queued to -next:

commit 7b9c5abee98c54f85bcc04bd4d7ec8d5094c73f4
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date:   Fri Feb 12 09:30:00 2010 -0800

    drm/i915: give up on 8xx lid status