Bug 83792

Summary: Kernel hangs on boot without nomodeset option
Product: DRI
Reporter: Tim Nelson <wayland>
Component: DRM/Radeon
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED MOVED
QA Contact:
Severity: normal    
Priority: medium    
Version: XOrg git   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform:
i915 features:
Attachments:
Kernel messages, round 1 (flags: none)
Kernel messages, last working version (3.6.11) (flags: none)

Description Tim Nelson 2014-09-12 08:09:53 UTC
Hi all. I have a Radeon FireMV 2400. I'm attempting to upgrade from Fedora 13 (kernel 2.6.33) to Fedora 20. More recent kernels have not worked for me; I recently discovered that if I use the nomodeset option, then I can boot, but Xorg no longer functions. Since it seems that the nomodeset option is no longer going to be supported, I need to make this work without it. Unfortunately, the computer hangs (no ping via network) during boot. I tried the netconsole option, but this did not seem to produce any results. What else can I try to help resolve this problem?

Thanks,
Comment 1 Alex Deucher 2014-09-12 12:46:56 UTC
Can you elaborate on what "not worked" means?  Blank screen, corruption, hangs?  Are you able to access the machine remotely?  If so, can you attach your xorg log and dmesg output with the radeon driver loaded?  Can you bisect the kernel to determine what commit broke your card?
Comment 2 Tim Nelson 2014-09-12 13:06:12 UTC
Regarding "not worked", these kernels have always hung on boot with a blank screen; most of these are from my attempts to upgrade to various different versions of Fedora between 13 (currently working for me) and 20 (the one I'm attempting to install now).  I suspect that these kernels weren't being passed the "nomodeset" option by default, but I have no evidence of that, and I don't plan to go back and look unless it becomes necessary, since I want to get the Fedora 20 kernel working, rather than the intermediate ones.  

When I boot the machine (after removing the "quiet rhgb" options, and adding "debug"), the machine spits out a bunch of log messages that scroll too quickly for me to see, and then the screens go blank (the LCD one displaying the message "No Signal"), and I can't access the machine remotely at that point; unfortunately this means that I also can't post the relevant dmesg and xorg information.  I think the machine is hanging before the network card is enabled, although I will try to double-check that.  

As for bisecting the kernel, I can have a go, but I suspect it will be the original KMS commit.  

Thanks for the time you've put into this.  I'll have some more time to put into it later, but probably not for another week or two, unfortunately.
Comment 3 Christian König 2014-09-12 13:13:20 UTC
Well the nomodeset option only makes sense with the UMS support and that was deprecated in January 2013.

So unless Fedora 20 compiles its kernels with that deprecated feature enabled, they probably won't support nomodeset any more.

Without any logs it would be rather hard to figure out what's going wrong here.

So is there a possibility to attach for example a serial or network console to grab the last few lines of system log before the machine goes into nirvana?
Comment 4 Alex Deucher 2014-09-12 13:41:04 UTC
Can you also try blacklisting radeon and loading it after the box has booted and you have remote access?  E.g.,  append modprobe.blacklist=radeon 3 to the kernel command line in grub (the 3 is to boot into runlevel 3 so X doesn't start).  Then once the system is booted and you have remote access, as root, run: modprobe radeon
and see if you can get the dmesg output.
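
The steps in this comment can be sketched roughly as below. This is only an illustration, assuming a POSIX shell and a root login over ssh; the helper names `is_blacklisted` and `load_radeon_and_capture` and the log file argument are hypothetical and do not come from this report. Because the machine may lock up the moment radeon loads, the sketch saves a copy of dmesg before calling modprobe.

```shell
#!/bin/sh
# Kernel command line addition (edit the "linux" line in grub at boot time):
#   modprobe.blacklist=radeon 3

# is_blacklisted CMDLINE MODULE
# Succeeds if CMDLINE contains modprobe.blacklist=MODULE.
# Assumption: only the single-module form used in this report is handled,
# not a comma-separated list of modules.
is_blacklisted() {
    case " $1 " in
        *" modprobe.blacklist=$2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# load_radeon_and_capture LOGFILE   (hypothetical helper; run as root)
# Saves dmesg before and after loading radeon by hand.
load_radeon_and_capture() {
    is_blacklisted "$(cat /proc/cmdline)" radeon || {
        echo "radeon is not blacklisted on this boot" >&2
        return 1
    }
    dmesg > "$1.before"
    sync                      # flush to disk in case the machine hangs next
    modprobe radeon
    dmesg > "$1.after"        # only reached if the machine survives the load
}
```

If the machine hangs during `modprobe radeon`, the `.after` file is never written, so a serial console or netconsole is still needed to see the final messages.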
Comment 5 Tim Nelson 2014-09-16 13:45:56 UTC
Just reporting back; blacklisting radeon as Alex specifies allows the machine to boot.  "modprobe radeon" after boot makes it die.  Unfortunately I only had a little time to put into it, so that's all my results at the moment, but I hope to put more time into it in the next couple of days.
Comment 6 Tim Nelson 2014-09-30 10:49:42 UTC
Created attachment 107113 [details]
Kernel messages, round 1

Good news; I have some kernel messages to attach.  These were gained using netconsole, which was on another machine.  It seems to me that it detects the radeon FireMV2400 (although it never mentions it by name), and detects two of the four connectors (I'm assuming two that are connected to the same chip; this card has 2 chips with 2 displays each; total of 4 displays).
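
For anyone trying to reproduce this kind of capture, here is a rough sketch of how a netconsole parameter is put together. The exact settings used for the attachment are not stated in this report; the helper name `netconsole_param` and all addresses in the usage note are made up for illustration. The field order follows the kernel's documented form, `[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]`.

```shell
#!/bin/sh
# netconsole_param SRC_PORT SRC_IP DEV TGT_PORT TGT_IP TGT_MAC
# Prints a netconsole= parameter for the kernel command line
# (or for "modprobe netconsole <param>").
# Hypothetical helper; every value is supplied by the caller.
netconsole_param() {
    printf 'netconsole=%s@%s/%s,%s@%s/%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# On the receiving machine, listen for the UDP messages with netcat.
# The flags differ between netcat variants, e.g.:
#   nc -u -l 6666        (BSD-style netcat)
#   nc -u -l -p 6666     (traditional netcat)
```

For example, `netconsole_param 6665 192.0.2.10 eth0 6666 192.0.2.20 00:11:22:33:44:55` prints a parameter that sends messages from eth0 on the crashing machine to UDP port 6666 on 192.0.2.20. Those addresses are placeholders only.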
Comment 7 Tim Nelson 2014-09-30 11:10:54 UTC
Anyway, it looks like I was wrong about the kernel modesetting; I think this is a side-effect of the more modern radeon driver not supporting nomodeset.  So it's just a bug in the radeon driver, pure and simple.  So now I suspect I have to do a kernel bisect.  If you have any other advice, though, let me know.
Comment 8 Alex Deucher 2014-09-30 12:58:57 UTC
(In reply to comment #6)
> Created attachment 107113 [details]
> Kernel messages, round 1
> 
> Good news; I have some kernel messages to attach.  These were gained using
> netconsole, which was on another machine.  It seems to me that it detects
> the radeon FireMV2400 (although it never mentions it by name), and detects
> two of the four connectors (I'm assuming two that are connected to the same
> chip; this card has 2 chips with 2 displays each; total of 4 displays).

If netconsole works, can you get remote access to the box (e.g., ssh) as well or does it hang completely?
Comment 9 Tim Nelson 2014-10-15 04:56:51 UTC
OK, I've narrowed it down quite a bit.  

The kernel which Fedora (18) calls kernel-debug-3.6.10-4.fc18.x86_64 works.  Xorg segfaults, but the radeon module successfully loads.  The kernel which Fedora (19) calls kernel-debug-3.9.5-301.fc19.x86_64 crashes the machine.  

Anyway, I will do some more narrowing down, and I'll try ssh access as well.
Comment 10 Tim Nelson 2014-10-17 10:57:24 UTC
OK, another report.  First, when it locks up, there is no ssh access possible; it doesn't even ping.  

Second, I've narrowed it down this far:

Working: kernel-debug-3.6.11-3.fc18.x86_64
Not working: config-3.7.0-0.rc0.git2.4.fc19.x86_64

Those kernel version numbers are the RPM packages that I installed.  

Third, I grabbed a copy of the dmesg messages on the working kernel, so that we can see what happened immediately after the place where the crash was introduced.  These may also be enlightening.  I'll attach them.
Comment 11 Tim Nelson 2014-10-17 10:58:22 UTC
Created attachment 107982 [details]
Kernel messages, last working version (3.6.11)
Comment 12 Tim Nelson 2014-10-17 11:00:30 UTC
Oh, one more comment - is there anything else I can do to help track down this bug further?
Comment 13 Tim Nelson 2014-10-17 11:20:39 UTC
Hmm.  Just noticed that that one kernel version number says "config" at the start instead of "kernel".  That's because I got the version number from the config file in /boot.
Comment 14 Alex Deucher 2014-10-17 13:41:38 UTC
(In reply to Tim Nelson from comment #12)
> Oh, one more comment - is there anything else I can do to help track down
> this bug further?

Ideally you could use git to bisect between 3.6 and 3.7 to see what change caused the regression.
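
A rough sketch of what such a bisect could look like, assuming a clone of the mainline kernel tree that carries the `v3.6` and `v3.7` tags. The helper names (`bisect_verdict`, `start_bisect`, `record_result`) and the outcome labels are hypothetical; building, installing and booting each candidate kernel is left out because it depends on the distribution.

```shell
#!/bin/sh
# bisect_verdict OUTCOME
# Maps what happened on a test boot to a "git bisect" subcommand.
bisect_verdict() {
    case "$1" in
        booted) echo good ;;  # "modprobe radeon" survived
        hung)   echo bad ;;   # machine locked up, as described in this bug
        *)      echo skip ;;  # could not be tested (e.g. build failure)
    esac
}

# start_bisect KERNEL_TREE
# Marks v3.7 as bad and v3.6 as good; git then checks out a commit to test.
start_bisect() {
    git -C "$1" bisect start v3.7 v3.6
}

# record_result KERNEL_TREE OUTCOME
# Records the result for the commit that was just tested.
record_result() {
    git -C "$1" bisect "$(bisect_verdict "$2")"
}
```

After each `record_result`, git checks out the next commit to build and test, and it prints the first bad commit once the range is exhausted. `git bisect log` shows the steps taken so far and `git bisect reset` returns the tree to its original state.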
Comment 15 Martin Peres 2019-11-19 08:56:11 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/530.
