Bug 66942

Summary: Cayman HD 6950 hangs at start when loading kernel 3.11.0-rc1 or drm-next
Product: DRI Reporter: Alexandre Demers <alexandre.f.demers>
Component: DRM/Radeon Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED NOTABUG QA Contact:
Severity: major    
Priority: medium CC: h.judt
Version: XOrg git   
Hardware: All   
OS: All   
See Also: https://bugs.freedesktop.org/show_bug.cgi?id=66963
Whiteboard:
i915 platform: i915 features:

Description Alexandre Demers 2013-07-15 20:19:56 UTC
I just compiled and tested kernel 3.11.0-rc1. It hangs and does nothing during initialization (the last thing displayed is "Loading initial ramdisk").

Setting radeon.dpm=1 or 0 gives the same result. 

Since I know I'm also experiencing this hang with last week's drm-next branch, I'll try to bisect and see where it leads us.
Comment 1 Alexandre Demers 2013-07-15 20:21:08 UTC
As a side note, it could be related to bug 66940
Comment 2 Alex Deucher 2013-07-15 20:26:47 UTC
probably a duplicate of bug 66551.
Comment 3 Alexandre Demers 2013-07-15 23:17:03 UTC
(In reply to comment #2)
> probably a duplicate of bug 66551.

Probably not: the bug happens prior to the commit identified in bug 66551. I'm bisecting as we speak.

I also noticed a small glitch that was not in kernel 3.9 but is present in 3.10 that I'll report in another bug.
Comment 4 Alexandre Demers 2013-07-16 23:22:10 UTC
Here is the first bad commit, which is pretty large, and it doesn't point to what was changed before it. It could be just about any commit before that one that touches Cayman.

69e0b57a91adca2e3eb56ed4db39ab90f3ae1043 is the first bad commit
commit 69e0b57a91adca2e3eb56ed4db39ab90f3ae1043
Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Fri Apr 12 16:42:42 2013 -0400

    drm/radeon/kms: add dpm support for cayman (v5)
    
    This adds dpm support for cayman asics.  This includes:
    - clockgating
    - dynamic engine clock scaling
    - dynamic memory clock scaling
    - dynamic voltage scaling
    - dynamic pcie gen1/gen2 switching (requires additional acpi support)
    - power containment
    - shader power scaling
    
    Set radeon.dpm=1 to enable.
    
    v2: fold in tdp fix
    v3: fix indentation
    v4: fix 64 bit div
    v5: attempt to fix state enable
    
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Reviewed-by: Jerome Glisse <jglisse@redhat.com>

:040000 040000 aa90c4d442d7640629695b1dc48e0ef1b3958e20 fc06e4965a658e0600ab5b250ae135e9e8225ca4 M	drivers
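
For anyone repeating this kind of search, here is a minimal sketch of the bisect setup. It is not the exact procedure used above. It assumes v3.10 is a known-good tag and v3.11-rc1 a known-bad one, and it limits the search to the radeon driver directory, which can miss a culprit that lives elsewhere in the tree. The function name start_radeon_bisect is hypothetical.

```shell
# Sketch only: the git commands are wrapped in a function so that nothing
# runs until it is called from inside a kernel git tree.
# ASSUMPTION: v3.10 is known good and v3.11-rc1 is known bad.
start_radeon_bisect() {
    good="${1:-v3.10}"
    bad="${2:-v3.11-rc1}"
    # Restrict the search to commits touching the radeon driver directory.
    git bisect start "$bad" "$good" -- drivers/gpu/drm/radeon
}

# After building and test-booting each revision that git checks out,
# record the result with one of:
#   git bisect good
#   git bisect bad
#   git bisect skip     (the revision could not be built or tested)
# and return to the original branch at the end with:
#   git bisect reset
```

Dropping the path argument makes git search every commit between the two endpoints, which takes more build-and-boot rounds but does not assume the regression is inside the driver.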
Comment 5 Alex Deucher 2013-07-16 23:28:28 UTC
Sounds like you may be waiting for the firmware loader to time out.  Make sure you have the new cayman smc ucode installed:
http://people.freedesktop.org/~agd5f/radeon_ucode/CAYMAN_smc.bin
Comment 6 Alexandre Demers 2013-07-17 01:22:31 UTC
(In reply to comment #5)
> Sounds like you may be waiting for the firmware loader to timeout.  Make
> sure you have the new cayman smc ucode installed:
> http://people.freedesktop.org/~agd5f/radeon_ucode/CAYMAN_smc.bin

Yes, it was missing. I'll test kernel 3.11-rc1 with it and with bug 66551's patch.
Comment 7 Alexandre Demers 2013-07-17 04:17:00 UTC
I may be doing something wrong, but even with the firmware in the same folder as the others (/usr/lib/firmware/radeon) and 66551's patch, it still hangs at the same place.

Here is the content of that folder.
/usr/lib/firmware/radeon$ ls -l CAYMAN*
-rw-r--r-- 1 root root 24148 May 29 20:22 CAYMAN_mc.bin
-rw-r--r-- 1 root root  8704 May 29 20:22 CAYMAN_me.bin
-rw-r--r-- 1 root root  8704 May 29 20:22 CAYMAN_pfp.bin
-rw-r--r-- 1 root root  4096 May 29 20:22 CAYMAN_rlc.bin
-rw-r--r-- 1 root root 31212 Jul 16 21:10 CAYMAN_smc.bin
Comment 8 Michel Dänzer 2013-07-17 08:52:13 UTC
Did you re-generate the initrd after installing the missing firmware?

If the problem is the missing firmware, it should continue booting after three minutes or so. Have you waited that long?
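
One way to check whether the firmware actually made it into the initrd is to search the image's file listing. The sketch below is only an illustration: has_firmware is a hypothetical helper, and which listing tool applies (lsinitramfs from initramfs-tools, lsinitrd from dracut) and where the image lives depend on the distribution.

```shell
# Returns 0 if the file listing read from stdin contains the given
# radeon firmware file, non-zero otherwise.
# $1 = firmware file name, e.g. CAYMAN_smc.bin
has_firmware() {
    # -F: match the path as a fixed string; -q: no output, only an exit status
    grep -qF -- "firmware/radeon/$1"
}

# Usage (ASSUMPTION: Debian-style image path, adjust for your system):
#   lsinitramfs /boot/initrd.img-"$(uname -r)" | has_firmware CAYMAN_smc.bin \
#       && echo "firmware present" || echo "firmware missing"
# On a dracut-based system, lsinitrd produces a comparable listing.
```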
Comment 9 Alexandre Demers 2013-07-18 03:25:05 UTC
(In reply to comment #8)
> Did you re-generate the initrd after installing the missing firmware?
> 
> If the problem is the missing firmware, it should continue booting after
> three minutes or so. Have you waited that long?

Yes. And just to be sure, I rebuilt the kernel tonight and reinstalled it (with a newly generated initramfs). I also decompressed the initramfs file and checked whether the firmware had been packaged. There it was.

Also, it never timed out; it sat there forever (I took the time to have dinner).
Comment 10 Alexandre Demers 2013-07-20 16:03:51 UTC
Tested with today's kernel from Linus' git. Still the same. I built the radeon driver in the kernel instead of building it as a module. No change.
Comment 11 Alex Deucher 2013-07-20 16:32:39 UTC
Does it boot correctly if you build radeon as a module, then disable radeon (add radeon.modeset=0 to your kernel command line in grub) and boot to a non-X runlevel, then load radeon manually?

modprobe -r radeon
modprobe radeon modeset=1
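
Before running the commands above, it can help to confirm that the parameter really reached the kernel. This is a small sketch under that assumption; cmdline_value is a hypothetical helper that only parses a command line string, such as the contents of /proc/cmdline.

```shell
# Print the value of a "name=value" parameter found in a kernel command line.
# $1 = parameter name (e.g. radeon.modeset), $2 = command line text
cmdline_value() (
    set -f    # keep the unquoted expansion below from globbing
    for word in $2; do
        case "$word" in
            "$1"=*) printf '%s\n' "${word#*=}" ;;
        esac
    done
)

# Usage on the running system:
#   cmdline_value radeon.modeset "$(cat /proc/cmdline)"
```

If nothing is printed, the parameter was not on the command line, for example because the bootloader entry that was edited is not the one that booted.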
Comment 12 Alexandre Demers 2013-07-22 13:39:39 UTC
(In reply to comment #11)
> Does it boot correctly if you build radeon as a module, then disable radeon
> (add radeon.modeset=0 to your kernel command line in grub) and boot to a
> non-X runlevel, then load radeon manually?
> 
> modprobe -r radeon
> modprobe radeon modeset=1

No, sadly, it still hangs at the same place. Problem is, I can't tell if it hangs when loading the initramfs or when it tries to initialize the display beyond that...

Any other suggestions? By the way, I built a 3.11.0-rc2 kernel this morning with as few drivers as possible for my system, just in case, but without any luck.
Comment 13 Harald Judt 2013-07-27 08:27:51 UTC
Same/similar problem here, with firmware built into the kernel (vanilla 3.11.0-rc2). With nomodeset the system boots, without it it doesn't.
Comment 14 Harald Judt 2013-07-28 17:47:42 UTC
Please ignore my post, it was a kernel configuration problem. I got confused by the many tests and used the wrong config... DPM works fine now, but the radeon ring 5 problem with a GPU stall after resuming from hibernation still exists in 3.11-rc2. Maybe some patches haven't made it into git yet, so I'll look for them.
Comment 15 Alexandre Demers 2013-07-28 19:14:34 UTC
(In reply to comment #14)
> Please ignore my post, it was a kernel configuration problem. Got confused
> by the many tests and used a wrong config... DPM works fine now, but radeon
> ring 5 problem with GPU stall after resuming from hibernation still exists
> in 3.11-rc2. Maybe some patches haven't made it into git yet, so I'll look
> for them.

What was the error in your kernel config?
Comment 16 Harald Judt 2013-07-28 19:50:06 UTC
Sorry, I think I have to disappoint you; I don't think it will help with your problem. I had used the wrong config, one without CAYMAN_smc.bin, and it hung for a minute or so. After adding it, everything worked fine (except for the issue with hibernate/resume).
Comment 17 Alexandre Demers 2013-07-30 02:35:24 UTC
Alex, is there a chance for me to reverse some commits prior to 69e0b57 to find which one or which feature is hanging my computer? Any approach I should try?
Comment 18 Alex Deucher 2013-07-30 02:49:59 UTC
(In reply to comment #17)
> Alex, is there a chance for me to reverse some commits prior to 69e0b57 to
> find which one or which feature is hanging my computer? Any approach I
> should try?

I'm not really sure what would have broken your system.  I also don't really see how 69e0b57 could have broken anything since nothing changes with that as long as dpm is disabled.  Do you still have the issue if you reset your tree to the commit prior to 69e0b57?  Do you still get hangs if you disable radeon (e.g., add radeon.modeset=0 to your kernel command line in grub)?
Comment 19 Alex Deucher 2013-07-30 02:52:00 UTC
Does booting a recent 3.11rc kernel with radeon.aspm=0 help?
Comment 20 Alexandre Demers 2013-07-30 17:21:33 UTC
(In reply to comment #19)
> Does booting a recent 3.11rc kernel with radeon.aspm=0 help?

No, still hangs at the same point.
Comment 21 Alexandre Demers 2013-07-30 17:24:00 UTC
(In reply to comment #18)
> (In reply to comment #17)
> > Alex, is there a chance for me to reverse some commits prior to 69e0b57 to
> > find which one or which feature is hanging my computer? Any approach I
> > should try?
> 
> I'm not really sure what would have broken your system.  I also don't really
> see how 69e0b57 could have broken anything since nothing changes with that
> as long as dpm is disabled.  Do you still have the issue if you reset your
> tree to the commit prior to 69e0b57?  Do you still get hangs if you disable
> radeon (e.g., add radeon.modeset=0 to your kernel command line in grub)?

I'll have to test when I get home, but I rebuilt rc3 last night and made sure the build was clean. I'll see if this helps. If it doesn't, I'll also go back to just before 69e0b57. Finally, I'll play with the init and disable some options, just in case. Who knows what might help.
Comment 22 Alexandre Demers 2013-07-31 05:05:15 UTC
(In reply to comment #18)
> (In reply to comment #17)
> > Alex, is there a chance for me to reverse some commits prior to 69e0b57 to
> > find which one or which feature is hanging my computer? Any approach I
> > should try?
> 
> I'm not really sure what would have broken your system.  I also don't really
> see how 69e0b57 could have broken anything since nothing changes with that
> as long as dpm is disabled.  Do you still have the issue if you reset your
> tree to the commit prior to 69e0b57?  Do you still get hangs if you disable
> radeon (e.g., add radeon.modeset=0 to your kernel command line in grub)?

Good news, if I can say so. I cleaned my build tree, updated my packages (one was gcc-related), redownloaded the firmware and recompiled 69e0b57: it now boots with and without radeon.dpm=1. So, I'm closing this bug. One of those things must have been the culprit.

I just have to test with rc3 now and hope for the best.
