Bug 97055

Summary: Black screens on A10-8780P (Carrizo) + R7 M260/M265 (Topaz) Combo
Product: DRI
Reporter: Thomas J. Moore <darktjm>
Component: DRM/AMDgpu
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED MOVED
Severity: critical
Priority: medium
CC: jamoflaw, nmset
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Attachments:
  - dmesg from my latest boot (with some wifi messages at end deleted for privacy)
  - My latest Xorg.0.log

Description Thomas J. Moore 2016-07-23 02:32:31 UTC
Created attachment 125267
dmesg from my latest boot (with some wifi messages at end deleted for privacy)

I have an HP Pavilion 17-g133cl with A10 Carrizo+Topaz.  I struggled for hours/days to get this to actually display anything at all using the amdgpu driver.  In the meantime, efifb works perfectly every time (except of course no 3d, no X resolution switching, no brightness control, and slow enough that video playing is also impossible).  Since it seemed like there were just issues powering up the display controller correctly, I decided at first to add my notes to https://bugzilla.kernel.org/show_bug.cgi?id=117591 but my symptoms (and cures) are, indeed, different.  After much blind playing with my system (I have no way to ssh in, and the keyboard is flaky as well, so that was a lot of fun), I made the following observations:

  - kernel power management options appeared to have no effect
  - sometimes (very rarely), it just starts working, regardless
  - once it starts working for more than 1 minute, it stays working
  - often, it starts partially working, by giving me a flickering display
  - often, it starts working with a stable display, only to start flickering again after a few seconds (very rarely even going completely black again)
  - my first way of fixing it is fairly reliable, but almost always requires at least one blind reboot before it starts giving me a display:  xrandr --output eDP --crtc 1 (added to my .xinitrc)
   - if, when it finally comes up, it is flickering, it can be cured by toggling the crtc between 0 and 1 often enough until it remains stable for at least a minute.
   - if crtc 1 is enabled when X is killed, the machine goes blank and hangs hard; switching consoles while X is up works, though (although the consoles remain black).
   - changing the crtc to 1 in X does nothing for the console; only X displays anything.  I have not tried to write a libdrm program to make the console switch to crtc 1, nor have I managed to trace where the crtc list comes from, or how to force it in the amdgpu driver itself.
   - if, instead of playing with the crtc, I play with power management again, I seem to be able to get it working by booting once with power management enabled, then rebooting with it disabled, and then it works (video in console as well as X, and no crtc switching necessary).  However, that may just have been how it decides to work today, and tomorrow it will no longer work.
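
The crtc toggling described in the list above could be scripted roughly as follows. This is only a sketch: the toggle_crtc function name, the round count, and the pause length are made up for illustration, and the XRANDR and PAUSE variables exist only so the loop can be dry-run without an X server. The xrandr --output eDP --crtc N invocation itself is the one quoted above.

```shell
#!/bin/sh
# Sketch only: alternate the panel between crtc 0 and crtc 1 a few times.
# toggle_crtc, XRANDR and PAUSE are hypothetical names, not part of xrandr.
toggle_crtc() {
    output=${1:-eDP}      # output name as reported by `xrandr`
    rounds=${2:-3}        # how many 0 -> 1 toggles to attempt
    i=0
    while [ "$i" -lt "$rounds" ]; do
        "${XRANDR:-xrandr}" --output "$output" --crtc 0
        sleep "${PAUSE:-2}"
        "${XRANDR:-xrandr}" --output "$output" --crtc 1
        sleep "${PAUSE:-2}"
        i=$((i + 1))
    done
}
```

Calling toggle_crtc eDP 5 from a terminal inside X would attempt five toggles; with XRANDR=echo and PAUSE=0 set first, it only prints the commands it would run. The loop ends on crtc 1, which per the notes above should be switched back to 0 before X is killed.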

I get identical behavior with kernels 4.6.2, 4.6.3, 4.6.4, 4.7-rc4 (which I decided to try given the supposed major amdgpu overhaul), and 4.4.15 (which I decided to try given that the poster of the kernel.org bug was using Ubuntu's 4.4 kernel).  All of these were built on Gentoo; with the exception of 4.7-rc4, they carry Gentoo's fbdecor patches, but I obviously have fbdecor disabled while working through this problem.

Overall, this is very weird and frustrating.  I've had power management issues with previous Radeon laptops (all of them), but the gpu pm issue usually manifested as hard locks while playing games with power management enabled, not this crap.  I have a feeling that anything that gets it to work only gets it to work due to random chance, and it's just that I'm beating at it often enough that I finally hit the jackpot at some point, and it keeps working correctly until I power cycle for an extended period again.  I also get the feeling that I must be a major masochist to keep using ATI/AMD hardware, given that in 15+ years of using it I've never had an experience better than "meh, mostly works".  If I weren't dead broke, I'd have chucked this machine over a freeway overpass and bought something else.
Comment 1 Thomas J. Moore 2016-07-23 02:39:11 UTC
Created attachment 125268
My latest Xorg.0.log

Even though it's irrelevant to the fact that I couldn't even get console video, here's my Xorg.0.log.  I suppose it may provide more detailed chip info than the kernel log.  Also, I should mention that I diffed dmesg, Xorg.0.log, and xrandr outputs with and without video, and found no differences outside of time stamps (and of course the chosen crtc in xrandr when using that technique to get it working, but nothing is printed to kernel or X logs during this procedure).
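
The timestamp-insensitive comparison mentioned above can be approximated like this. It is a sketch, assuming the bracketed "[   12.345678]" prefix that both dmesg and Xorg.0.log put at the start of each line; strip_ts and diff_logs are hypothetical helper names.

```shell
#!/bin/sh
# Sketch only: compare two logs while ignoring leading "[ seconds.fraction]" stamps.

# Remove a leading bracketed timestamp from each line of the given files (or stdin).
strip_ts() {
    sed -E 's/^\[ *[0-9]+\.[0-9]+\] ?//' "$@"
}

# Print a unified diff of two logs with timestamps stripped.
# Exit status follows diff: 0 if identical, 1 if they differ.
diff_logs() {
    a=$(mktemp) || return 2
    b=$(mktemp) || { rm -f "$a"; return 2; }
    strip_ts "$1" > "$a"
    strip_ts "$2" > "$b"
    diff -u "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    return "$status"
}
```

For example, diff_logs dmesg.working dmesg.black (placeholder file names) prints nothing when the two boots logged the same messages apart from their timestamps.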
Comment 2 Thomas J. Moore 2016-07-31 20:10:41 UTC
Still broken in 4.7.0, in case that wasn't obvious.  In related news, took me 20 minutes of blind reboots to get this piece of crap running this afternoon (came up after "only" 4 boots this morning).  Now it seems to like going black in X shortly after starting, even with power management turned off.  Thermal issues?  Not likely, since the machine runs hot during use and doesn't flake out, but who knows?  Defective hardware?  Not likely, since efifb works perfectly every time (but I do want to occasionally do things that are not affected by the limitations of efifb).  Given the complete lack of documentation for just about anything power related, both hardware and software (how about that incredibly useful barely English documentation "select this option will enable AMD powerplay component." for the "Enable AMD powerplay component" compile-time option?  Good thing everybody knows exactly what AMD powerplay is, and how it differs from all the other undocumented power management options), I can't exactly fix this myself.
Comment 3 Josh 2016-08-05 17:34:18 UTC
This bug affects me on Ubuntu 14.04, 15.10, 16.04, and Arch Linux. I too have the same AMD chip, A10-8700p, on an HP 17-g121wm. I can only get past the black screen if I boot into low-graphics mode with the nomodeset parameter.

I have scoured countless online resources looking for a solution to this problem. I hope someone knowledgeable and able to do something about it takes notice of this page and works on the bug.
Comment 4 Alex Deucher 2016-08-05 17:41:40 UTC
Does appending:
amdgpu.runpm=0
to the kernel command line in grub help?
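
For anyone trying this, here is a minimal sketch of making the option persistent and confirming that it reached the kernel. It assumes a GRUB 2 setup that reads /etc/default/grub (the usual convention; distributions differ, and the boot configuration has to be regenerated afterwards with the distribution's own tool). cmdline_has is a hypothetical helper name.

```shell
#!/bin/sh
# Assumed GRUB 2 convention, shown as a comment because it belongs in
# /etc/default/grub rather than in a script:
#   GRUB_CMDLINE_LINUX_DEFAULT="amdgpu.runpm=0"

# Succeed if the exact parameter appears on the kernel command line.
# The optional second argument replaces /proc/cmdline (useful for testing).
cmdline_has() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}
```

After rebooting, cmdline_has amdgpu.runpm=0 && echo applied shows whether the option was passed. When amdgpu is loaded, the value in effect should also be readable from /sys/module/amdgpu/parameters/runpm.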
Comment 5 Josh 2016-08-09 17:20:24 UTC
(In reply to Alex Deucher from comment #4)
> Does appending:
> amdgpu.runpm=0
> to the kernel command line in grub help?

No, appending amdgpu.runpm=0 does not help.
Comment 6 eegeeBahdoo1ohHoothie3Ajaegee5ia4Ri6cheizoawohkoogoleawahs0Uxahn 2016-08-11 16:38:55 UTC
Hi, I have slightly different problem with slightly different HW (FX APU, same GPU) so this may be unrelated but try this kernel: https://uloz.to/!qq2wDEC87/linux45-4-5rc6-1-x86-64-pkg-tar-xz (it's from Manjaro, works just fine in Arch). If I wait 2 minutes before booting the system I have no problems as long as I don't suspend it. It may be more stable for you too.
Comment 7 Thomas J. Moore 2016-08-13 00:44:28 UTC
(In reply to Tom from comment #6)
> https://uloz.to/!qq2wDEC87/linux45-4-5rc6-1-x86-64-pkg-tar-xz

Not insane enough yet to try a binary kernel, ever, even from a more reputable source than some site in a language I don't even understand.

> If I wait 2 minutes before booting the
> system I have no problems as long as I don't suspend it.

I don't understand what this means.  Do you let it sit in the bootloader for 2 minutes, or what?

> It may be more stable for you too.

The official kernels are stable for me, once the display comes up and stays up for more than a minute.  Most of the time, this involves doing a quick boot with power management enabled, starting X, then rebooting normally, with power management disabled.  Sometimes this does not work, which doesn't surprise me, since this is basically just blindly throwing shit at the wall until something sticks.

I'm not going to even try suspending until I can at least reliably boot.  Suspending always brings its own set of problems, and has never worked 100% reliably for me.

In related news, 4.8-rc1 still doesn't work.  Actually, it's worse, in that X won't start at all (something about unable to schedule ib on the TOPAZ).  I'm pretty sure that's a different issue, though, so I won't elaborate (or report, until this bug is fixed first).
Comment 8 eegeeBahdoo1ohHoothie3Ajaegee5ia4Ri6cheizoawohkoogoleawahs0Uxahn 2016-08-14 20:56:33 UTC
> Not insane enough yet to try a binary kernel, ever, even from a more
> reputable source than some site in a language I don't even understand.

Okay, I understand that. I tried to find it on a reliable site, but they just delete old packages and have nothing like the ALA (Arch Linux Archive). Maybe a vanilla 4.5-rc6 kernel would work, but I'm not sure whether they applied any patches. BTW the page is in English (and other languages); there's a flag in the top right corner to change the language.

> I don't understand what this means.  Do you let it sit in the bootloader for
> 2 minutes, or what?

Yeah, I have a 2-minute timeout in the bootloader and wait. Then it works.
 
> The official kernels are stable for me, once the display comes up and stays
> up for more than a minute.  Most of the time, this involves doing a quick
> boot with power management enabled, starting X, then rebooting normally,
> with power management disabled.  Sometimes this does not work, which doesn't
> surprise me, since this is basically just blindly throwing shit at the wall
> until something sticks.

The magical kernel just works without reboot cycles, changing parameters etc. It doesn't need any user interaction.

> In related news, 4.8-rc1 still doesn't work.  Actually, it's worse, in that
> X won't start at all (something about unable to schedule ib on the TOPAZ). 
> I'm pretty sure that's a different issue, though, so I won't elaborate (or
> report, until this bug is fixed first).

Same here.
Comment 9 Thomas J. Moore 2016-08-30 16:11:43 UTC
Just thought I'd mention that 4.8-rc2 and 4.8-rc3 are still worse off than before, even though both have changes for amdgpu power management (i.e., no fix for this, and X still won't start any more, although I suppose maybe the latter is because I'd need a matching update to xf86-video-amdgpu I'm not aware of).  I won't report any further lack of progress.
Comment 10 Thomas J. Moore 2016-09-09 15:58:23 UTC
4.8-rc5 appears to no longer prevent X from starting.  I guess that's some sort of progress (or at least no longer regress).  Still doesn't fix this bug, though.
Comment 11 Josh 2016-12-04 02:31:45 UTC
I too am affected by this issue. Is this issue ever going to be fixed? Who do I need to talk to in order to get this issue properly looked into? Where can I go to contribute my services to try and get this issue resolved? Because this was not an issue with fglrx/Catalyst drivers. But now suddenly the new driver that has replaced it can't handle what is expected of it. I would be more than happy to take my complaints and possible offering of services to the proper place, so who needs to be informed of this issue so it can finally be fixed?
Comment 12 Thomas J. Moore 2016-12-04 21:54:30 UTC
(In reply to Josh from comment #11)
> I too am affected by this issue.

It might help if you add yourself to the CC list.  Not that anyone talks on this bug but me.

> Is this issue ever going to be fixed?

I have my doubts.  I have made efifb (nomodeset) my default boot now, so I can at least use this machine.  Can't adjust brightness, play games or watch full-screen videos, but I guess I'll just have to live with that.  The only way I expect this will ever get fixed for me is if I get enough money together to buy a new machine.

The fact is, the only changes I've noticed have been regressions.  The non-start of X in 4.8 was fixed, but later 4.8-series kernels (and 4.9 kernels as well) seem to crash badly (panic?  hard to tell, since I have no way of knowing what happens when the screen is black -- I can't even get LEDs on the keyboard to flash, since this piece of crap machine also suffers from a non-linux-compatible keyboard:  http://unix.stackexchange.com/questions/233396/system-creates-extra-shift-alt-control-keypresses/302890, and my attempts to get it to log to EFI have been unsuccessful).  The crash then forces a filesystem check on reboot, which takes forever and I don't have the patience to deal with that any more.  I was able to get amdgpu running "properly" (w/o power management) in 4.8.11 eventually using a 4.4.32 kernel to do the initial boot with power management enabled, but it's still unreliable enough that it isn't worth trying very often.

I said I wouldn't report any more on the lack of progress, but yeah, 4.8.11 and 4.9-rc7 are still complete garbage, even worse than before.
Comment 13 SET 2017-01-02 21:54:11 UTC
@Thomas J Moore

My HP laptop with AMD A10-7300 (Kaveri iGPU, Topaz dGPU) has a fully working X up to kernel 4.9 (not beyond) with these module options applied as such :

# cat /etc/modprobe.d/amdgpu.conf 

install radeon /bin/false
options amdgpu aspm=0 bapm=0 runpm=0 powerplay=1

In case it helps.
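
Whether options like these actually took effect can be checked after boot. This is a sketch, assuming amdgpu is loaded and exposes its parameters under /sys/module/amdgpu/parameters (the standard sysfs location for module parameters). show_amdgpu_params is a hypothetical helper, and a parameter that a given kernel version does not provide is reported as not exposed.

```shell
#!/bin/sh
# Sketch only: print the amdgpu module parameters named in the options line.
# The optional argument overrides the sysfs directory (useful for testing).
show_amdgpu_params() {
    dir=${1:-/sys/module/amdgpu/parameters}
    for p in aspm bapm runpm powerplay; do
        if [ -r "$dir/$p" ]; then
            printf '%s=%s\n' "$p" "$(cat "$dir/$p")"
        else
            printf '%s=(not exposed)\n' "$p"
        fi
    done
}
```

If the printed values differ from the file, one possible cause is that the module was loaded from an initramfs that still carries an older copy of the configuration.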
Comment 14 SET 2017-01-02 22:06:18 UTC
@Thomas J Moore

One more note : my laptop won't boot with dpm=0.

(I suppose you have effectively blacklisted radeon, and that the kernel is compiled with CIK enabled).
Comment 15 Thomas J. Moore 2017-01-03 23:38:38 UTC
(In reply to SET from comment #13)
> My HP laptop with AMD A10-7300 (Kaveri iGPU, Topaz dGPU) has a fully working
> X up to kernel 4.9 (not beyond) with these module options applied as such :

Thanks.  I probably went through that particular set of options back when I was still banging my head against the wall trying to make it work (and tried it again just now in case it miraculously started working).  It does nothing for me in 4.9, but may help others.  Right now, turning amdgpu off entirely (via modprobe.blacklist=amdgpu or nomodeset if compiled in) is the only thing that reliably works (and thus using efifb instead).  And, in case anyone reading this thinks otherwise, it's not just X that doesn't work; the console doesn't work, either (on the rare occasions the console works and stays working, X and Mesa work as well).  The only thing that consistently works with the amdgpu driver is the backlight.  The display is still black, and maybe occasionally will get some stray pixels temporarily lit up during boot, as if the display is pointing to nonexistent memory.
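
To confirm which driver ended up owning the console after booting with modprobe.blacklist=amdgpu or nomodeset, the framebuffer list can be inspected. This is a sketch; fb_names is a hypothetical helper. efifb normally reports itself as "EFI VGA", but treat the exact string as an assumption.

```shell
#!/bin/sh
# Sketch only: list registered framebuffer drivers, one name per line.
# /proc/fb lines look like "<index> <name>"; the optional argument
# replaces /proc/fb (useful for testing).
fb_names() {
    cut -d' ' -f2- "${1:-/proc/fb}"
}
```

An empty result means no framebuffer device was registered at all.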
Comment 16 Thomas J. Moore 2017-06-06 15:38:40 UTC
*** Bug 97605 has been marked as a duplicate of this bug. ***
Comment 17 Thomas J. Moore 2017-06-06 15:40:11 UTC
Is this the place for AMD-related kernel bugs to die?  Is that why I was told to open the bug here, rather than the Linux kernel bugzilla?  Who knows?  In any case, not one person who could actually address the problem has made a single comment in nearly a year.  In fact, the sum total of comments I've read on other bugs have been "are you using the latest" and "try without power management".  Whatever.  I have solved this problem for myself in the only way I could:  I bought a new machine.  Whom do I send the bill to?  Oh, right, you want me to pay *you* to look at bugs.  I'm not sure what possessed me to buy another AMD machine, but I did, and it has almost identical graphics (but unlike the HP Pavilion, it only suffers from the kernel panic with power management enabled, not this bug, and given the response I've gotten from this bug, I won't even bother opening a new one).  I am tempted to close this bug, or mark it as a duplicate of #97605 (even though the opposite is true, so I went ahead and marked it as such).  Instead I'll simply ignore it like everyone else.
Comment 18 James Payne 2017-10-26 22:52:52 UTC
Would be good to get to the bottom of this one as it affects my wife's laptop :(

Currently stuck with Windows 10 which is sluggish to do anything...

Running the kernels from https://github.com/M-Bab/linux-kernel-amdgpu-binaries did actually allow it to boot to a proper desktop. Though it was a little flaky, mainly very fast flickering of the screen when logged out, and additionally when the laptop tried to go into sleep mode it just went nuts; the screen was just overdrawing on itself until the entire screen was white!

Still it does look as if the DC merge in the kernel may help significantly with this problem if those kernels are anything to go by. Saying that my wireless is completely broken with this kernel :( 

Just can't win!
Comment 19 Josh 2017-11-01 14:54:56 UTC
(In reply to Thomas J. Moore from comment #17)
> Is this the place for AMD-related kernel bugs to die? 
Yes. Unfortunately, the only solution is to get a computer that is already known to be supported. You won't get any proper support, because there is no company or support to go to. Nobody who knows how to fix these bugs actually reads these posts. This is the power of open source. Think I'll just bite the bullet and buy a Mac.
Comment 20 Jan Vesely 2017-11-01 15:23:56 UTC
(In reply to James Payne from comment #18)
> Would be good to get to the bottom of this one as it affects my wife's
> laptop :(
> 
> Currently stuck with Windows 10 which is sluggish to do anything...
> 
> Running the kernels from
> https://github.com/M-Bab/linux-kernel-amdgpu-binaries did actually allow it
> to boot to a proper desktop. Though it was a little flaky, mainly very fast
> flickering of the screen when logged out, and additionally when the laptop
> tried to go into sleep mode it just went nuts; the screen was just
> overdrawing on itself until the entire screen was white!
> 
> Still it does look as if the DC merge in the kernel may help significantly
> with this problem if those kernels are anything to go by. Saying that my
> wireless is completely broken with this kernel :( 
> 
> Just can't win!

Have you tried running the ROCK (ROCm) kernel? My machine (Acer E5, FX-9800P + Topaz) works OKish with the 1.6.x branch; even my wireless works (Atheros QCA9377).
Comment 21 FFAB 2017-12-18 00:29:19 UTC
The black screen problem seems to be solved ->
https://bugs.freedesktop.org/show_bug.cgi?id=101483#c36
Comment 26 Martin Peres 2019-11-19 08:09:05 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/83.