I am running an AMD Ryzen 2400G, using its integrated graphics on the Linux host and a GTX 1070 bound to VFIO for passthrough to a VM. When I boot a kernel fetched from git using amd-dri-next on 4.16, the boot process halts, usually around the point where it checks UTMP. My keyboard lights do not lock up and the system still responds to Control-Alt-Delete, but nothing is displayed on the screen. My system is set up to boot to the command line, and I normally start X11 from there with "startx" once the system has completed boot.
Can you attach your kernel log or dmesg output from the boot? Do other kernels work?
Where is my kernel log located? I successfully use a 4.16 mainline kernel... it's dri-next-staging that's causing problems. I am using mesa from git also... though I don't know if it ever gets around to using 3d graphics before the driver fails anyway.
How do I find a prior boot's dmesg?
Try "options amdgpu dpm=0 dc=1" and see if it still locks up.
(In reply to Edward Kigwana from comment #4)
> options amdgpu dpm=0 dc=1 and see if it still locks up.
That's in /etc/modconf.d or the like, right?
On Ubuntu the kernel log keeps appending to /var/log/kern.log, but that might look different on different distros.
If you have the luxury of a second system, you might be able to ssh into the Ryzen system and run dmesg that way.
As for the options Edward mentioned, you can pass them on the kernel command line. If you use grub as your bootloader, press 'e' on the selected kernel and append " amdgpu.dpm=0 amdgpu.dc=1" to the end of the line that starts with "linux". Alternatively, you can append them to GRUB_CMDLINE_LINUX in /etc/default/grub and run "sudo update-grub".
Keep in mind that this is how I'd do it on Ubuntu. There might be a way to pass these through /etc/modconf.d as well.
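For reference, module options written in the "options amdgpu ..." form are normally read from a file under /etc/modprobe.d rather than /etc/modconf.d. A minimal sketch follows; the file name amdgpu.conf is an assumption (any name ending in .conf should work), and it only takes effect when amdgpu is loaded as a module. If the driver is loaded from the initramfs, the initramfs probably needs to be regenerated as well.

```
# /etc/modprobe.d/amdgpu.conf  (hypothetical file name)
# Same settings as amdgpu.dpm=0 amdgpu.dc=1 on the kernel command line
options amdgpu dpm=0 dc=1
```

After a reboot, the values the driver actually picked up should be readable from /sys/module/amdgpu/parameters/dpm and /sys/module/amdgpu/parameters/dc.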
I'm not sure where the kernel log is on Arch. When I add the option you recommended to my kernel command line, both my drm-next-staging kernel and my 4.16 mainline kernel fail. I have to remove it; then my 4.16 kernel works, but the drm-next-staging kernel still fails to drive the screen. (The kernel doesn't crash: my keyboard still works and I can even press Control-Alt-Delete to reboot, so I suspect the amdgpu driver just isn't using my screen.)
Is CONFIG_DRM_AMD_DC_DCN1_0 enabled in the kernel build configuration in both cases?
This is possibly the same bug described in bug #105760.
And yes, Arch has had CONFIG_DRM_AMD_DC_DCN1_0=y since 4.15.
(In reply to Joshua Lee from comment #3)
> How do I find a prior boot's dmesg?
Try "journalctl -b -1" (for the boot attempt directly prior to this one, -2 for the one before that, etc...).
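To narrow that output down to the kernel messages that matter here, something like the following might help. It is only a sketch: filter_gpu_lines is a made-up helper name, the grep pattern is a guess at what is relevant, and reading a previous boot assumes the systemd journal is stored persistently.

```shell
# Hypothetical helper: keep only amdgpu/drm lines from log text on stdin.
filter_gpu_lines() {
    grep -E 'amdgpu|\[drm'
}

# Intended use (not run here): kernel messages from the previous boot.
# -k limits the output to kernel messages, -b -1 selects the prior boot.
#   journalctl -k -b -1 --no-pager | filter_gpu_lines
```

Saving the unfiltered "journalctl -k -b -1" output to a file and attaching it to this bug is probably still the most useful thing, since the filter can hide relevant context.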
Someone on the /r/VFIO discord with a Ryzen APU (he usually boots his VM from the console, rather than having a graphical host) confirmed the crashiness by running Furmark, which crashed his GPU driver in ten minutes; his dmesg showed that as well.
13877 0.1 0.0 0 0 pts/1 ZNl+ 10:09 0:00 [GpuTest] <defunct>
[90972.383503] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, last signaled seq=36081, last emitted seq=36083
[90972.383512] [drm] IP block:psp is hung!
[90972.383514] [drm] GPU recovery disabled.
To be clear, Furmark was being run on his host system, not within a VM.
-- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/342.