Created attachment 128807
AC Power Unigine Heaven dmesg
I have a Dell laptop with an Intel iGPU and a Radeon dGPU (a FirePro W5130M, which lspci reports as a Radeon HD 8830M).
Whenever I attempt to use the Radeon card via DRI_PRIME=1 while the laptop is plugged in, the card crashes. When on battery the card performs fine.
One way I have tested this is with glxgears. Often glxgears will work the first time it is run, but if it is closed and then opened a few seconds later, it will crash.
I have also tried running programs like Portal 2 and Unigine Heaven. These always hang. Unfortunately, the error messages in dmesg are not consistent from run to run. Sometimes I get a backtrace, and sometimes I even get a kernel panic on reboot.
I'll post the various logs I have. One thing I have found is that glxgears does not crash if I use radeon.runpm=0; however, the card still hangs when running something like Portal.
Created attachment 128808
AC Power Portal 2 dmesg
Created attachment 128809
AC Power glxgears
The last part of the log, from 121 onward, is the second glxgears attempt, which failed.
The preceding part, 94-96, is a successful glxgears attempt seconds earlier. These log entries are the only thing that has been consistent across my tests.
Possibly a duplicate of bug 98897. Does attachment 128780 fix the issue?
It has no effect.
Created attachment 128811
Created attachment 128812
I sometimes get these when I reboot after testing the card, usually after testing it with something like Portal or Unigine Heaven. It doesn't always happen.
Sorry that it's a picture. I haven't been able to get kdump to work, though if necessary I'll give it another try.
It's been a while, but this is still happening. I've since switched distros (now Fedora 26), but I'm getting the exact same error. Though it's nearly identical, I'll upload a new log generated by journalctl -b --no-pager | egrep 'radeon|drm|kernel' >> radeon_crash
Created attachment 132833
Crash on AC Power
Generated by journalctl -b --no-pager | egrep 'radeon|drm|kernel' >> radeon_crash
after running DRI_PRIME=1 vblank_mode=0 glxgears
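For anyone trying to reproduce this, the procedure described earlier in this report (run glxgears on the dGPU, close it, start it again a few seconds later) can be automated roughly as follows. This is only a sketch: the helper names run_for and reproduce, the 5 second run time, and the 3 second pause are assumptions chosen for illustration and are not taken from the logs. Only DRI_PRIME=1, vblank_mode=0, and glxgears come from the command above.

```python
import os
import subprocess
import time


def run_for(cmd, seconds, extra_env=None):
    """Start cmd, let it run for up to `seconds`, then stop it.

    Returns the exit status if the process ended on its own,
    or None if it was still running and we terminated it.
    """
    env = dict(os.environ, **(extra_env or {}))
    proc = subprocess.Popen(
        cmd, env=env, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        return proc.wait(timeout=seconds)
    except subprocess.TimeoutExpired:
        proc.terminate()
        try:
            proc.wait(timeout=5)
        except subprocess.TimeoutExpired:
            # The process ignored SIGTERM; force it down.
            proc.kill()
            proc.wait()
        return None


def reproduce(cmd=("glxgears",), runs=2, run_seconds=5.0, pause_seconds=3.0):
    """Run cmd `runs` times on the PRIME-offloaded GPU with a pause in between."""
    prime_env = {"DRI_PRIME": "1", "vblank_mode": "0"}
    results = []
    for i in range(runs):
        results.append(run_for(list(cmd), run_seconds, prime_env))
        if i + 1 < runs:
            time.sleep(pause_seconds)
    return results
```

Calling reproduce() on the affected machine returns one entry per run. None means the program was still running when the script stopped it, while an integer means it exited by itself before the time was up, which for glxgears most likely indicates a crash. A GPU hang may not show up as a process exit at all, so dmesg still needs to be checked after each run.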
To clarify, since it has been a while: the radeon card crashes when the laptop is plugged in, but not when it's unplugged. It has been like this for as long as I have had it (since July 2016).
I also have a Dell notebook with the same dGPU, but I cannot reproduce your error with Ubuntu 16.04 and kernel 4.10.0.
Instead, my problem is that the dGPU only runs at a low clock frequency, which makes it slower than the Intel GPU.
When I run glmark2, I look at "/sys/kernel/debug/dri/0/radeon_pm_info" and always see the following output:
default engine clock: 925000 kHz
current engine clock: 299990 kHz
default memory clock: 1000000 kHz
current memory clock: 148990 kHz
voltage: 1225 mV
PCIE lanes: 16
Btw.: This dGPU cannot be connected to a CRT directly, as there seems to be no connection available at all.
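To put numbers on the downclocking reported above, the radeon_pm_info output can be parsed and the current clocks compared with the defaults. This is a minimal sketch that assumes the exact line format quoted in the comment; the function names parse_clocks and clock_ratios are made up for illustration, and the sample text is copied from the output above rather than read from debugfs.

```python
import re

# Sample copied from the radeon_pm_info output quoted above.
SAMPLE = """\
default engine clock: 925000 kHz
current engine clock: 299990 kHz
default memory clock: 1000000 kHz
current memory clock: 148990 kHz
voltage: 1225 mV
PCIE lanes: 16
"""


def parse_clocks(text):
    """Return {(kind, domain): kHz} for lines such as
    'current engine clock: 299990 kHz'."""
    pattern = re.compile(
        r"^(default|current) (engine|memory) clock: (\d+) kHz$", re.MULTILINE
    )
    return {(kind, domain): int(khz) for kind, domain, khz in pattern.findall(text)}


def clock_ratios(text):
    """Return current/default clock ratio for the engine and memory domains."""
    clocks = parse_clocks(text)
    return {
        domain: clocks[("current", domain)] / clocks[("default", domain)]
        for domain in ("engine", "memory")
    }


ratios = clock_ratios(SAMPLE)
for domain, ratio in ratios.items():
    print(f"{domain}: {ratio:.1%} of default")
```

With the values from this comment, the engine clock comes out at about 32.4% of its default and the memory clock at about 14.9%. To use it on a live system, the contents of /sys/kernel/debug/dri/0/radeon_pm_info (readable as root) could be passed in instead of SAMPLE.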
(In reply to Mathias from comment #10)
Is the GPU at low power only while plugged in, or always?
glmark2 crashes instantly for me while plugged in. On battery I get a score of 1395, which is terrible compared to the 3680 from my integrated GPU, and there are severe rendering artifacts as well. I'm not sure whether that is really related to my main problem, though.
This is "/sys/kernel/debug/dri/0/radeon_pm_info" on battery with glmark2 for me:
uvd vclk: 0 dclk: 0
power level 2 sclk: 40000 mclk: 30000 vddc: 900 vddci: 0 pcie gen: 3
Oddly enough, this is a completely different format from yours.
This card seems to have a multitude of problems.
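Since the two radeon_pm_info formats in this thread differ, a separate parser is needed for the "power level" line shown above. The sketch below assumes the line always has the shape quoted in this comment (key: integer pairs after "power level N"); parse_power_level is a hypothetical helper name. The output does not state units, so the values are kept as raw integers.

```python
import re

# Sample copied from the on-battery radeon_pm_info output quoted above.
SAMPLE = """\
uvd vclk: 0 dclk: 0
power level 2 sclk: 40000 mclk: 30000 vddc: 900 vddci: 0 pcie gen: 3
"""


def parse_power_level(text):
    """Parse the 'power level N key: value ...' line into a dict.

    Returns None if no such line is present.
    """
    match = re.search(r"^power level (\d+)\s+(.*)$", text, re.MULTILINE)
    if match is None:
        return None
    fields = {"level": int(match.group(1))}
    # Keys may contain spaces (e.g. 'pcie gen'), so match lazily up to the colon.
    for key, value in re.findall(r"([a-z ]+?):\s*(\d+)", match.group(2)):
        fields[key.strip()] = int(value)
    return fields


info = parse_power_level(SAMPLE)
print(info)
```

Logging the parsed level, sclk, and mclk once on battery and once on AC power would make it easier to see whether the card is in a different power state in the two cases.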
Created attachment 133178
Rendering artifacts on battery
Perhaps not entirely related, but these are rendering artifacts seen on battery with glmark2. Note the horizontal lines: they are the main issue, along with major stuttering. glxgears produces similar results.
As an owner of the same GPU, I can confirm this bug still appears under Linux 4.15.0-rc7.
The card crashes or hangs when used with DRI_PRIME=1. After seeing this report, I tried with the laptop unplugged and, surprisingly, it worked.
It would be nice to have a solution, as the W5130M isn't currently supported by amdgpu either, which leaves it unusable under Linux.
(In reply to l.hartmail from comment #13)
Glad to finally find someone else with the same issue! I was beginning to think I was the only one.
-- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/767.