Since Arch Linux shipped an update to the linux-firmware package (20170217.12987ca-1), there have been reports of various issues, ranging from power management failing to, in my case (AMD Radeon RX 460), Xorg failing to work at all: it either blinks and drops back to a frozen VT as the GPU hangs, or the GPU hangs on some kind of full-screen corruption. This is broken on both kernel 4.9.11 and 4.10 in my testing, on Xorg 1.19.1. The system still responds to SSH connections, but fails to shut down properly when a shutdown is attempted over SSH.
Tracker link: https://bugs.archlinux.org/task/53042
I've traced it back to a specific commit to linux-firmware, 7a110b85a46d7f884f4ac712ff52e02ed57234bd, https://git.kernel.org/cgit/linux/kernel/git/firmware/linux-firmware.git/commit/amdgpu/carrizo_ce.bin?id=7a110b85a46d7f884f4ac712ff52e02ed57234bd, pushed to the git repo on 2017-02-17, which updates a large subset of the firmware images used by AMDGPU. Since these are binary files, I'm not sure how to test further to provide more useful information. Apologies if this is the wrong place to report a firmware issue; I was unsure where else to file it.
There's a set of dmesg logs on the Arch bug tracker for the power management failure case:
Previous firmware revision: https://bugs.archlinux.org/task/53042?getfile=15004
Current firmware revision: https://bugs.archlinux.org/task/53042?getfile=15005
Looking at my Xorg logs over SSH at the time, there were no differences from a successful Xorg session on the previous firmware. I didn't think to take a dmesg capture, and I've got a large workload running on my machine right now. I can probably experiment with getting the logs for my case in a day or two.
Does the firmware here fix the issue? https://people.freedesktop.org/~agd5f/radeon_ucode/polaris/
I tried the firmware you linked and the problems persisted (GPU hang when starting Xorg). Since the machine still responds over SSH, I took the opportunity to capture my Xorg and kernel logs, which I will attach. For the record, the symptoms are the same with AMDGPU with my standard config (DRI 3, TearFree), with a blank config file, and with Modesetting.
Created attachment 129843 [details] 4.9.11 kernel log with old functioning firmware.
Created attachment 129844 [details] 4.9.11 kernel log with new malfunctioning firmware.
Created attachment 129845 [details] Xorg 1.19.1 log with new malfunctioning firmware.
Created attachment 129846 [details] Xorg 1.19.1 log with old functional firmware.
RX 460 user here. Same issue. Kernel panic and backtrace messages in my log file might help.
Created attachment 129852 [details] journal log
I've reverted the polaris 11 changes in the firmware git tree. Just waiting for them to land.
Okay, I'm glad to hear that. There are still people on Polaris10 and other hardware having power management failures; someone's card doubled in idle temperature.
Arch has a testing update (linux-firmware-20170217.12987ca-2) that's the same git revision that was causing problems, but with the troublesome AMD commits reverted. It has fixed both my RX 460 GPU hang and the power management issues on an R9 Fury.
Does the new firmware work properly with kernel 4.10 or newer?
Which new firmware? The one you linked earlier in the discussion, or the new setup with the one git commit reverted?
(In reply to saunders.52 from comment #14)
> Which new firmware? The one you linked earlier in the discussion, or the new
> setup with the one git commit reverted?
Either or both.
(In reply to Alex Deucher from comment #15)
> (In reply to saunders.52 from comment #14)
> > Which new firmware? The one you linked earlier in the discussion, or the new
> > setup with the one git commit reverted?
>
> Either or both.
The git commit reversion should be similar enough to a manual change I tried with both 4.9 and 4.10 (putting the old firmware back by hand) that I can almost certainly say it would work. I haven't tried the other, and won't be able to for about 5 hours (away from the desktop in question).
(In reply to saunders.52 from comment #16)
>
> The GIT commit reversion should be similar enough to a manual change I tried
> with both 4.9 and 4.10 I can almost certainly say it would work (trying the
> old firmware manually). I haven't tried the other, and won't be able to for
> about 5 hours (away from the desktop in question).
So the new firmware works in 4.10, but not in 4.9?
(In reply to Alex Deucher from comment #17)
> (In reply to saunders.52 from comment #16)
> >
> > The GIT commit reversion should be similar enough to a manual change I tried
> > with both 4.9 and 4.10 I can almost certainly say it would work (trying the
> > old firmware manually). I haven't tried the other, and won't be able to for
> > about 5 hours (away from the desktop in question).
>
> So the new firmware works in 4.10, but not in 4.9?
The old firmware works in 4.10. The new firmware hasn't been tested by me outside of 4.9.
Well, the one you linked above didn't work in 4.9. The one shipping in the repos that is getting reverted (20170217.12987ca-1) didn't work in 4.9 and 4.10. The oldest of the three (the one shipped originally as part of 20161222.4b9559f) is stable in both. Are there some version numbers I can refer to these by to make this less insanely confusing?
And I didn't check the one you linked in 4.10. I think.
(In reply to saunders.52 from comment #19) > Are there some version numbers I can refer to these by to make this less > insanely confusing? The 5th dword in each binary is the version.
Okay, assuming I'm reading this right with hexdump... On an RX 460 (4 GB):
Old Committed Version (0080 0000): Works on 4.9 and 4.10.
New Committed Version, Now Uncommitted (0083 0000): Does not work on 4.9 and 4.10.
Download Version (0086 0000): Tested on 4.9, where it doesn't work. Probably not tested on 4.10 (I don't remember.)
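For anyone else trying to tell these images apart, the hexdump reading above can be sketched in a few lines of Python. This is only a rough sketch, not an official tool: it assumes "dword" means a 32-bit word, that the 5th one sits at byte offset 16, and that it is stored little-endian (which is what the hexdump output above suggests). The function name firmware_version is made up for illustration.

```python
import struct


def firmware_version(blob: bytes) -> int:
    """Return the 5th 32-bit word of a firmware image as an integer.

    Assumptions (not confirmed in this thread): the word is at byte
    offset 16 (four dwords of 4 bytes precede it) and is little-endian.
    """
    if len(blob) < 20:
        raise ValueError("image too short to contain a 5th dword")
    # "<I" = little-endian unsigned 32-bit integer, read at offset 16.
    (version,) = struct.unpack_from("<I", blob, 16)
    return version
```

To use it, read the first 20 bytes of a firmware file opened in binary mode, pass them to firmware_version, and print the result in hex with something like f"{value:#x}". Under the assumptions above, the image that hexdump shows as 0080 0000 would come out as 0x80.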
I was able to get back to the machine in question sooner than I thought. The version you have for download in Comment 2, (0086 0000) does not work on 4.10, and has the same crash issue.
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/142.