Bug 34462 - 180 second hang on boot, DRM doesn't seem to initialize (firmware issue?)
Status: RESOLVED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Radeon
Version: unspecified
Hardware: x86 (IA32) Linux (All)
Importance: medium normal
Assignee: Default DRI bug account
Reported: 2011-02-18 14:01 UTC by Owen Riddy
Modified: 2014-09-14 07:04 UTC

Attachments
dmesg from Debian Unstable, fails to load (55.50 KB, patch)
2011-02-18 14:01 UTC, Owen Riddy

Description Owen Riddy 2011-02-18 14:01:20 UTC
Created attachment 43540
dmesg from Debian Unstable, fails to load

I have a Radeon 3450 graphics card and have been using Debian Squeeze (linux 2.6.32). The card was/is very well supported by the radeon graphics driver, and firmware was being properly loaded (Debian splits the firmware into a separate package from the kernel).

Whenever I compiled a newer vanilla kernel, KMS didn't work (starting with 2.6.33, I believe; it was a while ago) and X wouldn't load. Every so often I tried a different kernel, and the problem remained. When starting X, the system would completely hang with a blank screen (I couldn't access any virtual terminals). I assumed I'd misconfigured my kernel and gave up.

Debian has now upgraded to linux 2.6.37-1 and this image seems to have the same problem. After upgrading, I get a suspicious 180 second pause when the kernel "populates /dev", which I now attribute to a bug. I'm getting an error message about modprobe blocking (in dmesg).

I suspect firmware loading problems:
Wheezy (fails, 2.6.37):
dmesg | grep -C 4 icroc

[    4.325551] [drm] radeon: 256M of VRAM memory ready
[    4.325554] [drm] radeon: 512M of GTT memory ready.
[    4.325619] [drm] radeon: irq initialized.
[    4.325622] [drm] GART: num cpu pages 131072, num gpu pages 131072
[    4.326544] [drm] Loading RS780 Microcode
[    4.329997] input: HDA ATI SB Headphone as /devices/pci0000:00/0000:00:14.2/sound/card0/input7
[    4.414321] radeon 0000:01:05.0: WB enabled
[    4.446469] [drm] ring test succeeded in 1 usecs
[    4.446595] [drm] radeon: ib pool ready.

Squeeze (works, 2.6.32):
dmesg | grep -C 4 icroc

[    4.616162] [drm] radeon: 256M of VRAM memory ready
[    4.616163] [drm] radeon: 512M of GTT memory ready.
[    4.616204] [drm] radeon: irq initialized.
[    4.616206] [drm] GART: num cpu pages 131072, num gpu pages 131072
[    4.616624] [drm] Loading RS780 Microcode
[    4.616627] platform radeon_cp.0: firmware: requesting radeon/RS780_pfp.bin
[    4.714728] hda_codec: ALC1200: BIOS auto-probing.
[    4.716081] input: HDA Digital PCBeep as /devices/pci0000:00/0000:00:14.2/input/input6
[    4.809447] platform radeon_cp.0: firmware: requesting radeon/RS780_me.bin
--
[    4.941941] radeon 0000:02:00.0: irq 30 for MSI/MSI-X
[    4.941946] radeon 0000:02:00.0: radeon: using MSI.
[    4.941978] [drm] radeon: irq initialized.
[    4.941980] [drm] GART: num cpu pages 131072, num gpu pages 131072
[    4.942370] [drm] Loading RV620 Microcode
[    4.942372] platform radeon_cp.0: firmware: requesting radeon/RV620_pfp.bin
[    4.982496] platform radeon_cp.0: firmware: requesting radeon/RV620_me.bin
[    4.988086] platform radeon_cp.0: firmware: requesting radeon/R600_rlc.bin
[    5.025570] [drm] ring test succeeded in 0 usecs
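For anyone comparing similar logs, the difference between the two excerpts above can be reduced to which chip families reach the "Loading ... Microcode" step. A minimal sketch, assuming the dmesg output has been saved to a file; the function name `microcode_chips` is made up for illustration and is not part of any radeon tooling:

```shell
#!/bin/sh
# microcode_chips FILE
# Print, one per line, each chip family named in a
# "[drm] Loading <CHIP> Microcode" line of a saved dmesg log.
# (Hypothetical helper for illustration only.)
microcode_chips() {
    sed -n 's/.*\[drm\] Loading \([A-Za-z0-9_]*\) Microcode.*/\1/p' "$1"
}
```

Usage would be something like `dmesg > dmesg.txt; microcode_chips dmesg.txt`. Applied to the excerpts above, the failing log would list only RS780, while the working log would list both RS780 and RV620.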

To be sure I haven't tainted the configuration, the method to get a test platform was:
* Installed the Squeeze release base system
* Upgraded to Wheezy, rebooted to test KMS (virtual terminal screen resolution was large, suggesting it worked)
* Upgraded to unstable and rebooted, at which point I got the 180 second hang and modprobe errors, as well as a rather low resolution VT.

lspci and other info can be found on this bug report: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=613922
Comment 1 Alex Deucher 2011-02-18 14:29:18 UTC
Make sure you have the firmware in your initrd if you are using modules, or built into the kernel if you built radeon into the kernel rather than as a module.  See the "Troubleshooting Extra Firmware" section of this page:
http://wiki.x.org/wiki/radeonBuildHowTo
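One way to check the first point is to look through the initrd's file listing for the radeon module and its firmware. A rough sketch, assuming a Debian-style system where `lsinitramfs` (from initramfs-tools) is available; the helper name `radeon_in_listing` and the path patterns it matches are assumptions for illustration:

```shell
#!/bin/sh
# radeon_in_listing
# Reads an initramfs file listing on stdin and reports whether it
# contains the radeon module and any radeon firmware files.
# (Hypothetical helper; the matched path patterns are assumptions.)
radeon_in_listing() {
    listing=$(cat)
    if printf '%s\n' "$listing" | grep -q 'radeon\.ko'; then
        echo "module: present"
    else
        echo "module: missing"
    fi
    if printf '%s\n' "$listing" | grep -q 'firmware/radeon/'; then
        echo "firmware: present"
    else
        echo "firmware: missing"
    fi
}
```

It could be used as `lsinitramfs /boot/initrd.img-$(uname -r) | radeon_in_listing`. If the module is present but the firmware is missing, the driver would be loaded from the initrd without access to its microcode files.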
Comment 2 Owen Riddy 2011-02-18 21:23:16 UTC
Comparing the initrds of 2.6.37 and 2.6.32 shows they are the same, and neither of them has any firmware in it. They didn't actually have the radeon module at all.

I compared /lib/firmware to the git repository http://git.kernel.org/?p=linux/kernel/git/dwmw2/linux-firmware.git , everything matched (I checked with md5sum) except the newer BARTS/CAICOS/BTC/PALM/SUMO/TURKS firmware which I assume is all later than my card.
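The checksum comparison described above can be sketched as follows. This is an illustration only, assuming a local checkout of the linux-firmware repository; the function name `compare_firmware` and the checkout path in the usage note are hypothetical:

```shell
#!/bin/sh
# compare_firmware INSTALLED_DIR REFERENCE_DIR
# For each regular file directly inside INSTALLED_DIR, compare its MD5
# checksum with the file of the same name in REFERENCE_DIR and print
# OK, DIFFERS or MISSING followed by the file name.
# (Hypothetical helper for illustration only.)
compare_firmware() {
    installed=$1
    reference=$2
    for f in "$installed"/*; do
        [ -f "$f" ] || continue
        name=$(basename "$f")
        if [ ! -f "$reference/$name" ]; then
            echo "MISSING $name"
        elif [ "$(md5sum < "$f")" = "$(md5sum < "$reference/$name")" ]; then
            echo "OK $name"
        else
            echo "DIFFERS $name"
        fi
    done
}
```

For example: `compare_firmware /lib/firmware/radeon ~/linux-firmware/radeon`. Note that this direction only reports installed files that differ from or are absent in the reference; swapping the arguments would list reference files that are not installed.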

I tried to force update-initramfs to include the radeon kernel module by listing it in /etc/initramfs-tools/modules, but this caused update-initramfs to hang when loading radeon.ko in.

Rather than try to insert everything manually in my initrd, I blacklisted radeon in /etc/modprobe.d and modprobe'd it in after a complete boot to a virtual terminal.
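For reference, blacklisting a module this way comes down to a one-line configuration fragment. A minimal sketch, in which the helper name and the file name `radeon-blacklist.conf` are arbitrary choices rather than anything taken from this report:

```shell
#!/bin/sh
# write_radeon_blacklist DIR
# Write a modprobe configuration fragment into DIR that stops radeon
# from being loaded automatically. The file name is an arbitrary choice.
# (Hypothetical helper for illustration only.)
write_radeon_blacklist() {
    printf 'blacklist radeon\n' > "$1/radeon-blacklist.conf"
}
```

Run as root, this would be `write_radeon_blacklist /etc/modprobe.d`, followed by a reboot and then `modprobe radeon` from a virtual terminal, with `dmesg | tail` to watch the result. A blacklist entry only prevents automatic loading; an explicit `modprobe radeon` still loads the module.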

On my old kernel, this worked swimmingly (no radeon in lsmod, module loaded nicely). On the Debian stock kernel, modprobe hangs (I stopped timing after ten minutes) and the screen doesn't resize as usual for KMS, although lsmod lists radeon/drm/drm_kms_helper. dmesg again shows only RS780 Microcode being loaded.
Comment 3 Owen Riddy 2011-02-20 01:47:38 UTC
:/ Sorry, my last paragraph was a little ambiguous.

On my old Debian 2.6.32 kernel, this worked swimmingly (module loaded nearly instantly, screen resolution jumped up).
On the Debian 2.6.37 kernel, modprobe hangs (I stopped timing after ten minutes) and the screen doesn't resize as usual for KMS, but lsmod lists radeon/drm/drm_kms_helper. dmesg again shows only RS780 Microcode being loaded.

In each case, before starting, I checked with lsmod to make sure radeon was not loaded.



Since my last comment, I also checked Ubuntu 10.10 (2.6.35), and it has graphics issues as well - it is much harder to see what is happening because plymouth gives me a blank screen, but it also has a modprobe timeout, apparently no KMS loading, and firmware seems to be available in /lib/firmware.
Comment 4 Michel Dänzer 2011-02-23 06:05:20 UTC
(In reply to comment #0)
> [    4.414321] radeon 0000:01:05.0: WB enabled

Does passing the parameter no_wb=1 to the radeon kernel module help?

Alternatively, does disabling the integrated RS780 GPU work around the problem?
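For the first suggestion, a module parameter such as no_wb=1 can be given directly on the command line (`modprobe radeon no_wb=1`), on the kernel command line as `radeon.no_wb=1`, or persistently through a modprobe configuration fragment. A minimal sketch of the last option, where the helper name and the file name `radeon-options.conf` are arbitrary choices for illustration:

```shell
#!/bin/sh
# write_radeon_option DIR OPTION
# Write a modprobe configuration fragment into DIR that passes OPTION
# (for example no_wb=1) to the radeon module whenever it is loaded.
# (Hypothetical helper; the file name is an arbitrary choice.)
write_radeon_option() {
    printf 'options radeon %s\n' "$2" > "$1/radeon-options.conf"
}
```

Run as root, `write_radeon_option /etc/modprobe.d no_wb=1` would apply the setting on the next load of the module; `dmesg | grep WB` should then indicate whether writeback ended up enabled or disabled.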
Comment 5 Owen Riddy 2011-02-24 02:11:27 UTC
> Does passing the parameter no_wb=1 to the radeon kernel module help?

It doesn't seem to help. `dmesg | grep WB` shows it as Disabled, but modprobe still hangs on load at the RS780 firmware.

> Alternatively, does disabling the integrated RS780 GPU work around the problem?

Works. With the RS780 GPU disabled, everything functions.

BIOS -> Advanced Chipset Features -> Surround View -> [Disabled], KMS works, no modprobe hang.

I didn't know that that card could be switched on and off. Thanks :)
Comment 6 Jonathan Nieder 2011-12-03 00:16:17 UTC
Owen, can you bisect?

Just trying a few intermediate versions from http://snapshot.debian.org/ would already be useful.  For narrowing down the regression range beyond that, "git bisect" can help:

 git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
 cd linux
 git bisect start -- drivers/gpu/drm/radeon
 git checkout (bad version)
 make localmodconfig; # minimal configuration
 make -j2 deb-pkg
 dpkg -i ../(package).deb
 reboot
 # confirm that it is actually bad
 git bisect bad

 git checkout (good version)
 make silentoldconfig; # reuse configuration
 make -j2 deb-pkg
 dpkg -i ../(package).deb
 reboot
 # confirm that it is good
 git bisect good

 # now it checks out an intermediate version to test
 make silentoldconfig
 make -j2 deb-pkg
 dpkg -i ../(package).deb
 reboot
 git bisect bad; # if it hangs in the same way
 git bisect good; # if it boots correctly
 git bisect skip; # if some other problem makes it hard to test

 ... rinse and repeat ...

 git bisect visualize; # to watch the regression range narrowing
 git bisect log; # to summarize partial results if you get bored
Comment 7 Owen Riddy 2014-09-14 07:04:17 UTC
This problem was resolved a couple of kernel updates ago.

