With ASUS Radeon R9 270, just after I boot up and GDM is supposed to start, my screen goes blank and some GPU faults are reported in dmesg.

Upgrading to kernel 3.13 solves this issue. One time the graphics also recovered with 3.12.8 after coming back from suspend, but I have not been able to reproduce that.

Using up-to-date Arch Linux testing with:
xorg-server 1.15.0
mesa 10.0.2
libdrm 2.4.51
xf86-video-ati 7.2.0
kernel versions tested: 3.12.8, 3.12.1, 3.11.5, 3.10.10

The following errors are reported in dmesg (full log attached):

radeon 0000:01:00.0: GPU fault detected: 147 0x005e7001
radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x091C0002
radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x1E070001
VM fault (0x01, vmid 15) at page 152829954, read from CP (112)
radeon 0000:01:00.0: GPU fault detected: 147 0x02de8801
radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000
radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
VM fault (0x00, vmid 0) at page 0, read from unknown (0)
radeon 0000:01:00.0: GPU fault detected: 147 0x02de8801
radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000
radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
VM fault (0x00, vmid 0) at page 0, read from unknown (0)
radeon 0000:01:00.0: GPU fault detected: 147 0x04df8402
radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00080826
radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x1F084002
VM fault (0x02, vmid 15) at page 526374, write from TC (132)
Created attachment 92487 [details] dmesg.log
Created attachment 92488 [details] Xorg.0.log
(In reply to comment #0)
> With ASUS Radeon R9 270, just after I boot up and GDM is supposed to start,
> my screen goes blank and some GPU faults are reported in dmesg.
>
> Upgrading to kernel 3.13 solves this issue.

Any chance you could bisect to see what the fix was?
(In reply to comment #3)
> Any chance you could bisect to see what the fix was?

Is that safe? I'm not thrilled at the thought of booting prerelease kernels on my primary workstation. There's a chance of hitting filesystem/RAID/etc corruption bugs, no?

Are there any liveUSB systems I could use instead of my own main installation?
(In reply to comment #3)
> Any chance you could bisect to see what the fix was?

I got my hands on a spare disk and did the bisect. Strangely enough this turned up (!?)
["first bad commit" meaning good, since I had to invert bisect bad/good]

ad41550666f89b5af9335fcde9e98b61190daf99 is the first bad commit
commit ad41550666f89b5af9335fcde9e98b61190daf99
Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Thu Sep 26 13:11:18 2013 -0400

    drm/radeon: enable hdmi audio by default

    Seems to be stable enough for the majority of users. It can be
    disabled on the fly via connector attributes.

    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Just to make sure, I double-checked...

uname -r && dmesg |grep 'VM fault'
3.12.0-rc3-ARCH-00404-gad41550

uname -r && dmesg |grep 'VM fault'
3.12.0-rc3-ARCH-00403-g10ebc0b
[ 18.716461] VM fault (0x00, vmid 0) at page 0, read from unknown (0)
[ 18.716466] VM fault (0x00, vmid 0) at page 0, read from unknown (0)
[ 18.716470] VM fault (0x00, vmid 0) at page 0, read from unknown (0)
...

GDM successfully starts with the 1st and fails with the 2nd.

What does this even mean, how can disabling HDMI audio break things? :)
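Since the search here is for the commit that fixed the problem rather than the one that broke it, the good/bad labels had to be swapped by hand. The following is a minimal sketch of the same search using git bisect's custom terms (git 2.7 or later), which avoids the inversion. The term names `broken` and `fixed`, both helper names, and the v3.12/v3.13 endpoints are assumptions chosen for illustration; the verdict simply mirrors the `dmesg | grep 'VM fault'` check above and is only meaningful once GDM has tried to start.

```shell
#!/bin/sh
# Hypothetical helper: read dmesg output on stdin and print the bisect term
# for the kernel that produced it. "broken" = VM faults present (old
# behaviour), "fixed" = none found (new behaviour).
bisect_verdict() {
    if grep -q 'VM fault'; then
        echo broken
    else
        echo fixed
    fi
}

# Hypothetical helper: start a bisect that looks for the first *fixed* commit.
# Run inside a kernel git checkout; the two tags are assumed endpoints.
start_fix_bisect() {
    git bisect start --term-old=broken --term-new=fixed &&
    git bisect broken v3.12 &&
    git bisect fixed v3.13
}

# After booting each candidate kernel, mark it from the kernel checkout:
#   git bisect "$(dmesg | bisect_verdict)"
```

With these terms git reports the result as "is the first fixed commit", so the output no longer has to be read backwards.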
Created attachment 92625 [details]
dmesg when booting with git rev 10ebc0b

This message looks interesting (also present in the original upload):

Jan 23 00:45:54 tewn kernel: radeon 0000:01:00.0: Invalid ROM contents
Does manually disabling audio on 3.13 break things? E.g., set radeon.audio=0 on the kernel command line in grub.
Created attachment 92681 [details]
xorg.log with EDID

I also experience problems related to this, but in my case the situation is reversed. Kernel 3.12 works fine but 3.13 gives me a blank screen on my HDMI head. Booting 3.13 with radeon.audio=0 solves the problem. There are no warnings in dmesg or Xorg.log. My card is a Juniper HD6770.
(In reply to comment #7)
> Does manually disabling audio on 3.13 break things? E.g., set
> radeon.audio=0 on the kernel command line in grub.

No, that doesn't seem to change anything. GDM still works, no errors in logs. alsamixer still displays an "HDA ATI HDMI" card. Am I doing it wrong?

% cat /proc/cmdline
initrd=\initramfs-linux.img root=UUID=f76fdeca-b4f3-49f7-891e-910c1c17b1f8 rw radeon.audio=0
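One way to double-check that the option actually reached the driver is to compare what was requested at boot with what the loaded module reports. This is only a sketch: both helper names are made up for illustration, and the sysfs path assumes the radeon module exports its `audio` parameter as readable, which may not hold for every kernel build.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one key=value parameter from a
# kernel command line string. Prints nothing if the parameter is absent.
cmdline_param() {
    # $1 = parameter name, $2 = command line text
    printf '%s\n' "$2" | tr ' ' '\n' |
        awk -v k="$1=" 'index($0, k) == 1 { print substr($0, length(k) + 1) }'
}

# Hypothetical helper: show what was requested at boot and, if the driver
# exposes it (assumed path), what the loaded module reports.
show_radeon_audio() {
    echo "cmdline: $(cmdline_param radeon.audio "$(cat /proc/cmdline)")"
    if [ -r /sys/module/radeon/parameters/audio ]; then
        echo "module:  $(cat /sys/module/radeon/parameters/audio)"
    fi
}

# Usage on the affected machine:
#   show_radeon_audio
```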
Also, this bug doesn't occur on the same hardware with a fresh installation on kernel 3.12.8, so it's something specific to my system configuration. Any ideas?
Created attachment 92964 [details]
dmesg

Here is a dmesg log for debugging my problem (assuming it's related to this bug). It was taken on 3.13 with drm.debug=0xe.

The kernel framebuffer shows up on both monitors. After starting X only the secondary DVI head shows anything. I ran "xrandr --output HDMI-0 --set audio off --auto" and this brings the primary HDMI head back. After that I took the dmesg dump.

It's possible that my monitor is defective. It's been unreliable in the past.
Alex, I spent many hours setting up the bisect environment and doing all the builds. It would be fair if you would spend some of your time to at least give me a reply.

Thomas, you have a different card and the symptoms are different. Unless you have good reasons to believe otherwise, I think you're experiencing a different issue.
The audio hardware doesn't interact with the memory controller (or 3D engine for that matter) so I don't really see how it could cause a GPU page fault. Also, the fact that disabling audio on a newer kernel doesn't break things leads me to believe it's not related to the audio at all. Maybe some stale mesa stuff floating around on your system? Nothing else really comes to mind.
(In reply to comment #13)
> Also, the fact that disabling audio on a newer kernel doesn't break things

If I disable audio, shouldn't the Radeon HDMI ALSA device disappear? That didn't happen for me when I set radeon.audio=0. Am I doing something wrong?

> The audio hardware doesn't interact with the memory controller (or 3D engine
> for that matter) so I don't really see how it could cause a GPU page fault.

Could it be a timing issue? Audio init delays startup enough that it doesn't hit some races anymore?

A broken system sometimes managed to recover after coming back from suspend, though rarely.
(In reply to comment #14)
> (In reply to comment #13)
> > Also, the fact that disabling audio on a newer kernel doesn't break things
>
> If I disable audio, shouldn't the Radeon HDMI ALSA device disappear? That
> didn't happen for me when I set radeon.audio=0. Am I doing something wrong?

No. Disabling audio in the radeon driver just disables the audio stream in the HDMI stream. The audio device itself can't be disabled.

> > The audio hardware doesn't interact with the memory controller (or 3D engine
> > for that matter) so I don't really see how it could cause a GPU page fault.
>
> Could it be a timing issue? Audio init delays startup enough that it doesn't
> hit some races anymore?

If you disable acceleration (add Option "NoAccel" "true" to the device section of your xorg config) do you still get the problems? It's most likely some issue related to the 3D engine setup in mesa.
(In reply to comment #15)
> If you disable acceleration (add Option "NoAccel" "true" to the device
> section of your xorg config) do you still get the problems? It's most
> likely some issue related to the 3D engine set up in mesa.

You're right, NoAccel true also fixes this issue.
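For anyone wanting to repeat the NoAccel test, here is a minimal sketch that prints a Device section with the option set. The helper name, the Identifier label, and the target file name are assumptions for illustration; adapt them to the existing xorg configuration. Note that this turns acceleration off entirely, so it is a diagnostic step rather than a real fix.

```shell
#!/bin/sh
# Hypothetical helper: print an xorg.conf Device section that turns off
# acceleration for the radeon driver. The Identifier is an arbitrary label.
noaccel_conf() {
    cat <<'EOF'
Section "Device"
    Identifier "Radeon"
    Driver     "radeon"
    Option     "NoAccel" "true"
EndSection
EOF
}

# Example (assumed path; needs root), then restart the display manager:
#   noaccel_conf > /etc/X11/xorg.conf.d/20-radeon-noaccel.conf
```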
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/428.