Created attachment 125267
dmesg from my latest boot (with some wifi messages at end deleted for privacy)
I have an HP Pavilion 17-g133cl with A10 Carrizo+Topaz. I struggled for hours/days to get this to actually display anything at all using the amdgpu driver. In the meantime, efifb works perfectly every time (except of course no 3D, no X resolution switching, no brightness control, and slow enough that video playback is also impossible). Since it seemed like there were just issues powering up the display controller correctly, I decided at first to add my notes to https://bugzilla.kernel.org/show_bug.cgi?id=117591 but my symptoms (and cures) are, indeed, different. After much blind playing with my system (I have no way to ssh in, and the keyboard is flaky as well, so that was a lot of fun), I made the following observations:
- kernel power management options appeared to have no effect
- sometimes (very rarely), it just starts working, regardless
- once it starts working for more than 1 minute, it stays working
- often, it starts partially working, by giving me a flickering display
- often, it starts working with a stable display, only to start flickering again after a few seconds (very rarely even going completely black again)
- my first way of fixing it is fairly reliable, but almost always requires at least one blind reboot before it starts giving me a display: xrandr --output eDP --crtc 1 (added to my .xinitrc)
- if, when it finally comes up, it is flickering, it can be cured by toggling the crtc between 0 and 1 often enough until it remains stable for at least a minute.
- if crtc 1 is enabled when X is killed, the machine goes blank and hangs hard; switching consoles while X is up works, though (although the consoles remain black).
- changing the crtc to 1 in X does nothing for the console; only X displays anything. I have not tried to write a libdrm program to make the console switch to crtc 1, nor have I managed to trace where the crtc list comes from, or how to force it in the amdgpu driver itself.
- if, instead of playing with the crtc, I play with power management again, I seem to be able to get it working by booting once with power management enabled, then rebooting with it disabled, and then it works (video in console as well as X, and no crtc switching necessary). However, that may just have been how it decides to work today, and tomorrow it will no longer work.
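The crtc toggling from the observations above could be scripted roughly as follows. This is only a sketch: the function names, the number of cycles, and the pause between switches are assumptions made for illustration, and only the `xrandr --output eDP --crtc N` invocation comes from the report itself.

```python
import subprocess
import time


def crtc_toggle_commands(output="eDP", cycles=3):
    """Build xrandr invocations that alternate the output between crtc 0 and crtc 1."""
    commands = []
    for _ in range(cycles):
        # Each cycle ends on crtc 1, the one that produced a picture in the report.
        for crtc in (0, 1):
            commands.append(["xrandr", "--output", output, "--crtc", str(crtc)])
    return commands


def run_toggle(output="eDP", cycles=3, pause=2.0):
    """Run the toggle sequence; the pause length is a guess, not a measured value."""
    for argv in crtc_toggle_commands(output, cycles):
        subprocess.run(argv, check=False)  # keep going even if one call fails
        time.sleep(pause)
```

Calling `run_toggle()` from inside a running X session would issue the six switches in order. Whether the display is stable afterwards still has to be judged by eye, since nothing is logged during the procedure.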
I get identical behavior with kernels 4.6.2, 4.6.3, 4.6.4, 4.7-rc4 (which I tried given the supposed major amdgpu overhaul), and 4.4.15 (which I tried given that the poster of the kernel.org bug was using Ubuntu's 4.4 kernel). All of these are on Gentoo; with the exception of 4.7-rc4, they carry Gentoo's fbdecor patches, but I obviously have that feature disabled while working through this problem.
Overall, this is very weird and frustrating. I've had power management issues with previous Radeon laptops (all of them), but the gpu pm issue usually manifested as hard locks while playing games with power management enabled, not this crap. I have a feeling that anything that gets it to work only gets it to work due to random chance, and it's just that I'm beating at it often enough that I finally hit the jackpot at some point, and it keeps working correctly until I power cycle for an extended period again. I also get the feeling that I must be a major masochist to keep using ATI/AMD hardware, given that in 15+ years of using it I've never had an experience better than "meh, mostly works". If I weren't dead broke, I'd have chucked this machine over a freeway overpass and bought something else.
Created attachment 125268
My latest Xorg.0.log
Even though it's irrelevant to the fact that I couldn't even get console video, here's my Xorg.0.log. I suppose it may provide more detailed chip info than the kernel log. Also, I should mention that I diffed dmesg, Xorg.0.log, and xrandr outputs with and without video, and found no differences outside of time stamps (and of course the chosen crtc in xrandr when using that technique to get it working, but nothing is printed to kernel or X logs during this procedure).
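For anyone wanting to repeat the log comparison, here is a minimal sketch of a timestamp-insensitive diff. It assumes both dmesg and Xorg.0.log lines start with a `[seconds.fraction]` stamp; the function names and the "working"/"black-screen" labels are made up for the example.

```python
import difflib
import re

# Leading "[   12.345678] " style stamp, as printed by dmesg and Xorg.0.log.
_TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s?")


def strip_timestamps(text):
    """Return the log's lines with any leading [seconds.fraction] stamp removed."""
    return [_TIMESTAMP.sub("", line) for line in text.splitlines()]


def diff_logs(good, bad):
    """Unified diff of two logs ignoring timestamps; an empty list means no difference."""
    return list(
        difflib.unified_diff(
            strip_timestamps(good),
            strip_timestamps(bad),
            "working",
            "black-screen",
            lineterm="",
        )
    )
```

Feeding it the text of a dmesg capture from a working boot and one from a black-screen boot should return an empty list if, as reported here, the logs differ only in time stamps.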
Still broken in 4.7.0, in case that wasn't obvious. In related news, took me 20 minutes of blind reboots to get this piece of crap running this afternoon (came up after "only" 4 boots this morning). Now it seems to like going black in X shortly after starting, even with power management turned off. Thermal issues? Not likely, since the machine runs hot during use and doesn't flake out, but who knows? Defective hardware? Not likely, since efifb works perfectly every time (but I do want to occasionally do things that are not affected by the limitations of efifb). Given the complete lack of documentation for just about anything power related, both hardware and software (how about that incredibly useful barely English documentation "select this option will enable AMD powerplay component." for the "Enable AMD powerplay component" compile-time option? Good thing everybody knows exactly what AMD powerplay is, and how it differs from all the other undocumented power management options), I can't exactly fix this myself.
This bug affects me on Ubuntu 14.04, 15.10, 16.04, and Arch Linux. I too have the same AMD chip, A10-8700p, on an HP 17-g121wm. I can only get past the black screen if I boot into low-graphics mode with the nomodeset parameter.
I have scoured countless online resources looking for a solution to this problem. I hope someone knowledgeable and able to do something about it takes notice of this page and works on the bug.
Does appending amdgpu.runpm=0 to the kernel command line in grub help?
(In reply to Alex Deucher from comment #4)
> Does appending:
> amdgpu.runpm=0
> to the kernel command line in grub help?
No, appending amdgpu.runpm=0 does not help.
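When testing options like this blind, it may help to confirm afterwards that the parameter really reached the kernel. A small sketch, assuming the command line is read from /proc/cmdline once the system is reachable; the helper name is hypothetical:

```python
def amdgpu_options(cmdline):
    """Return the amdgpu.* parameters found on a kernel command line as a dict."""
    options = {}
    for token in cmdline.split():
        if token.startswith("amdgpu.") and "=" in token:
            name, value = token[len("amdgpu."):].split("=", 1)
            options[name] = value
    return options
```

For example, `amdgpu_options(open("/proc/cmdline").read())` would show whether `runpm=0` was actually passed on the current boot.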
Hi, I have a slightly different problem with slightly different HW (FX APU, same GPU), so this may be unrelated, but try this kernel: https://uloz.to/!qq2wDEC87/linux45-4-5rc6-1-x86-64-pkg-tar-xz (it's from Manjaro, works just fine in Arch). If I wait 2 minutes before booting the system I have no problems as long as I don't suspend it. It may be more stable for you too.
(In reply to Tom from comment #6)
Not insane enough yet to try a binary kernel, ever, even from a more reputable source than some site in a language I don't even understand.
> If I wait 2 minutes before booting the
> system I have no problems as long as I don't suspend it.
I don't understand what this means. Do you let it sit in the bootloader for 2 minutes, or what?
> It may be more stable for you too.
The official kernels are stable for me, once the display comes up and stays up for more than a minute. Most of the time, this involves doing a quick boot with power management enabled, starting X, then rebooting normally, with power management disabled. Sometimes this does not work, which doesn't surprise me, since this is basically just blindly throwing shit at the wall until something sticks.
I'm not going to even try suspending until I can at least reliably boot. Suspending always brings its own set of problems, and has never worked 100% reliably for me.
In related news, 4.8-rc1 still doesn't work. Actually, it's worse, in that X won't start at all (something about unable to schedule ib on the TOPAZ). I'm pretty sure that's a different issue, though, so I won't elaborate (or report, until this bug is fixed first).
> Not insane enough yet to try a binary kernel, ever, even from a more
> reputable source than some site in a language I don't even understand.
Okay, understand that. I tried to find it on some reliable site but they just delete old packages without anything like ALA. Maybe 4.5rc6 vanilla kernel would work but I'm not sure whether they used some patches etc. BTW the page is in English (and other languages), there's a flag in the top right corner to change language.
> I don't understand what this means. Do you let it sit in the bootloader for
> 2 minutes, or what?
Yeah, I have a 2-minute timeout in the bootloader and wait. Then it works.
> The official kernels are stable for me, once the display comes up and stays
> up for more than a minute. Most of the time, this involves doing a quick
> boot with power management enabled, starting X, then rebooting normally,
> with power management disabled. Sometimes this does not work, which doesn't
> surprise me, since this is basically just blindly throwing shit at the wall
> until something sticks.
The magical kernel just works without reboot cycles, changing parameters etc. It doesn't need any user interaction.
> In related news, 4.8-rc1 still doesn't work. Actually, it's worse, in that
> X won't start at all (something about unable to schedule ib on the TOPAZ).
> I'm pretty sure that's a different issue, though, so I won't elaborate (or
> report, until this bug is fixed first).
Just thought I'd mention that 4.8-rc2 and 4.8-rc3 are still worse off than before, even though both have changes for amdgpu power management (i.e., no fix for this, and X still won't start any more, although I suppose maybe the latter is because I'd need a matching update to xf86-video-amdgpu I'm not aware of). I won't report any further lack of progress.
4.8-rc5 appears to no longer prevent X from starting. I guess that's some sort of progress (or at least no longer regress). Still doesn't fix this bug, though.
I too am affected by this issue. Is this issue ever going to be fixed? Who do I need to talk to in order to get this issue properly looked into? Where can I go to contribute my services to try and get this issue resolved? Because this was not an issue with fglrx/Catalyst drivers. But now suddenly the new driver that has replaced it can't handle what is expected of it. I would be more than happy to take my complaints and possible offering of services to the proper place, so who needs to be informed of this issue so it can finally be fixed?
(In reply to Josh from comment #11)
> I too am affected by this issue.
It might help if you add yourself to the CC list. Not that anyone talks on this bug but me.
> Is this issue ever going to be fixed?
I have my doubts. I have made efifb (nomodeset) my default boot now, so I can at least use this machine. Can't adjust brightness, play games or watch full-screen videos, but I guess I'll just have to live with that. The only way I expect this will ever get fixed for me is if I get enough money together to buy a new machine.
The fact is, the only changes I've noticed have been regressions. The non-start of X in 4.8 was fixed, but later 4.8-series kernels (and 4.9 kernels as well) seem to crash badly (panic? hard to tell, since I have no way of knowing what happens when the screen is black -- I can't even get LEDs on the keyboard to flash, since this piece of crap machine also suffers from a non-linux-compatible keyboard: http://unix.stackexchange.com/questions/233396/system-creates-extra-shift-alt-control-keypresses/302890, and my attempts to get it to log to EFI have been unsuccessful). The crash then forces a filesystem check on reboot, which takes forever and I don't have the patience to deal with that any more. I was able to get amdgpu running "properly" (w/o power management) in 4.8.11 eventually using a 4.4.32 kernel to do the initial boot with power management enabled, but it's still unreliable enough that it isn't worth trying very often.
I said I wouldn't report any more on the lack of progress, but yeah, 4.8.11 and 4.9-rc7 are still complete garbage, even worse than before.
@Thomas J Moore
My HP laptop with AMD A10-7300 (Kaveri iGPU, Topaz dGPU) has a fully working X up to kernel 4.9 (not beyond) with these module options applied as such :
# cat /etc/modprobe.d/amdgpu.conf
install radeon /bin/false
options amdgpu aspm=0 bapm=0 runpm=0 powerplay=1
In case it helps.
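To check that options like these were really picked up, the live values can be compared with the modprobe.d line. The sketch below assumes the loaded module exposes its parameters under /sys/module/amdgpu/parameters, which depends on how each parameter is declared in the driver; all helper names are hypothetical.

```python
import os


def parse_options_line(line, module="amdgpu"):
    """Parse an 'options <module> key=value ...' line from modprobe.d into a dict."""
    parts = line.split()
    if len(parts) < 2 or parts[0] != "options" or parts[1] != module:
        return {}
    return dict(p.split("=", 1) for p in parts[2:] if "=" in p)


def read_module_params(module="amdgpu", sysfs_root="/sys/module"):
    """Read the parameter values a loaded module exposes under sysfs, if any."""
    param_dir = os.path.join(sysfs_root, module, "parameters")
    values = {}
    if os.path.isdir(param_dir):
        for name in os.listdir(param_dir):
            try:
                with open(os.path.join(param_dir, name)) as handle:
                    values[name] = handle.read().strip()
            except OSError:
                continue  # skip parameters that are not readable
    return values


def mismatches(wanted, live):
    """Return the wanted options whose live value is missing or different."""
    return {k: (v, live.get(k)) for k, v in wanted.items() if live.get(k) != v}
```

Running `mismatches(parse_options_line("options amdgpu aspm=0 bapm=0 runpm=0 powerplay=1"), read_module_params())` would list any option that did not take effect. An empty result only says the values match, not that the display will come up.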
@Thomas J Moore
One more note : my laptop won't boot with dpm=0.
(I suppose you have efficiently blacklisted radeon, and that the kernel is compiled with CIK enabled).
(In reply to SET from comment #13)
> My HP laptop with AMD A10-7300 (Kaveri iGPU, Topaz dGPU) has a fully working
> X up to kernel 4.9 (not beyond) with these module options applied as such :
Thanks. I probably went through that particular set of options back when I was still banging my head against the wall trying to make it work (and tried it again just now in case it miraculously started working). It does nothing for me in 4.9, but may help others. Right now, turning amdgpu off entirely (via modprobe.blacklist=amdgpu or nomodeset if compiled in) is the only thing that reliably works (and thus using efifb instead). And, in case anyone reading this thinks otherwise, it's not just X that doesn't work; the console doesn't work, either (on the rare occasions the console works and stays working, X and Mesa work as well). The only thing that consistently works with the amdgpu driver is the backlight. The display is still black, and maybe occasionally will get some stray pixels temporarily lit up during boot, as if the display is pointing to nonexistent memory.
*** Bug 97605 has been marked as a duplicate of this bug. ***
Is this the place for AMD-related kernel bugs to die? Is that why I was told to open the bug here, rather than the Linux kernel bugzilla? Who knows? In any case, not one person who could actually address the problem has made a single comment in nearly a year. In fact, the sum total of comments I've read on other bugs have been "are you using the latest" and "try without power management". Whatever. I have solved this problem for myself in the only way I could: I bought a new machine. Whom do I send the bill to? Oh, right, you want me to pay *you* to look at bugs. I'm not sure what possessed me to buy another AMD machine, but I did, and it has almost identical graphics (but unlike the HP Pavilion, it only suffers from the kernel panic with power management enabled, not this bug, and given the response I've gotten from this bug, I won't even bother opening a new one). I am tempted to close this bug, or mark it as a duplicate of #97605 (even though the opposite is true, so I went ahead and marked it as such). Instead I'll simply ignore it like everyone else.
Would be good to get to the bottom of this one as it affects my wife's laptop :(
Currently stuck with Windows 10 which is sluggish to do anything...
Running the kernels from https://github.com/M-Bab/linux-kernel-amdgpu-binaries did actually allow it to boot to a proper desktop. It was a little flaky, though: mainly very fast flickering of the screen when logged out, and when the laptop tried to go into sleep mode it just went nuts, with the screen overdrawing on itself until the entire screen was white!
Still, it does look as if the DC merge in the kernel may help significantly with this problem, if those kernels are anything to go by. That said, my wireless is completely broken with this kernel :(
Just can't win!
(In reply to Thomas J. Moore from comment #17)
> Is this the place for AMD-related kernel bugs to die?
Yes. Unfortunately, the only solution is to get a computer that is already known to be supported. You won't get any proper support, because there is no company or support to go to. Nobody who knows how to fix these bugs actually reads these posts. This is the power of open source. Think I'll just bite the bullet and buy a Mac.
(In reply to James Payne from comment #18)
> Would be good to get to the bottom of this one as it affects my wife's
> laptop :(
> Currently stuck with Windows 10 which is sluggish to do anything...
> Running the kernels from
> https://github.com/M-Bab/linux-kernel-amdgpu-binaries did actually allow it
> to boot to a proper desktop. Though was a little flakey, mainly very fast
> flickering of the screen when logged out and additionally when the laptop
> was tried to move to sleep mode it just went nuts, the screen was just
> overdrawing on itself until the entire screen was white!
> Still it does look as if the DC merge in the kernel may help significantly
> with this problem if those kernels are anything to go by. Saying that my
> wireless is completely broken with this kernel :(
> Just can't win!
Have you tried running the ROCK (ROCm) kernel? My machine (Acer E5, FX-9800P + Topaz) works OK-ish with the 1.6.x branch; even my wireless works (Atheros QCA9377).
The black screen problem seems to be solved ->