Bug 107001 - hard system freeze with mesa 18.1.x on AMD RX 580
Summary: hard system freeze with mesa 18.1.x on AMD RX 580
Status: RESOLVED WORKSFORME
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Gallium/radeonsi
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Default DRI bug account
QA Contact: Default DRI bug account
 
Reported: 2018-06-22 14:36 UTC by claude
Modified: 2018-09-19 16:37 UTC
CC List: 0 users


Description claude 2018-06-22 14:36:29 UTC
I had been running Debian Buster quite happily with my AMD RX 580 until Monday, when a routine upgrade brought in Mesa 18.1.1. Since then the system has been unreliable, with randomly occurring hard freezes: sshd disconnects, nothing in the logs, no mouse cursor, no Caps Lock/Num Lock LED response, and even the SysRq key combos don't work.  Nothing in particular seems to trigger it; the system is under light load (firefox, thunderbird, a few terminals and other apps).

I tried 18.1.2 from the unstable repository, but it didn't fix the issue.  Uptime between lockups has ranged from 10 minutes to 10 hours.

$ uname -a
Linux eiskaffee 4.16.0-2-amd64 #1 SMP Debian 4.16.12-1 (2018-05-27) x86_64 GNU/Linux
(I believe this version means amdgpu.dc=0 because I haven't set anything manually)
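
The display-core setting can be read back rather than inferred. A minimal sketch, assuming the amdgpu module is loaded and exposes its dc parameter under sysfs at the path below (the function name is made up for illustration, and the path may differ on other kernels):

```shell
# Print the amdgpu "dc" module parameter, or "unknown" if it cannot be read.
# Assumption: the parameter lives at /sys/module/amdgpu/parameters/dc.
# Pass a different path as the first argument to override.
amdgpu_dc_setting() {
    param_file="${1:-/sys/module/amdgpu/parameters/dc}"
    if [ -r "$param_file" ]; then
        cat "$param_file"
    else
        echo "unknown"
    fi
}

amdgpu_dc_setting
```

A value set explicitly on the kernel command line should show up here as given; how the driver resolves an unset value depends on the kernel version and the card.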

$ sudo lspci | grep VGA
1d:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480] (rev e7)

$ clinfo | grep Device\ Name | head -n 1
  Device Name                                     Radeon RX 580 Series (POLARIS10 / DRM 3.23.0 / 4.16.0-2-amd64, LLVM 6.0.0)

I downgraded to Mesa 18.0.5 and all seems fine again.

$ apt-cache policy mesa-common-dev 
mesa-common-dev:
  Installed: 18.0.5-1
  Candidate: 18.1.1-1
  Version table:
     18.1.2-1 500
        500 http://ftp.uk.debian.org/debian unstable/main amd64 Packages
     18.1.1-1 990
        990 http://ftp.uk.debian.org/debian buster/main amd64 Packages
 *** 18.0.5-1 100
        100 /var/lib/dpkg/status

I picked one package arbitrarily; all the corresponding Debian binary packages for Mesa 18.0.5 are installed. I built and installed 18.0.5-1 from source myself, following this recipe:

mkdir work
cd work
sudo apt-get build-dep mesa
git clone https://salsa.debian.org/xorg-team/lib/mesa.git
cd mesa
git checkout mesa-18.0.5-1
dpkg-buildpackage -us -uc
cd ..
sudo dpkg -i *.deb
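
Since the policy output above still lists 18.1.1-1 as the candidate, a routine upgrade would presumably replace the downgraded packages again. One possible way to prevent that, sketched here as an assumption rather than something that was part of the recipe, is to hold the locally built packages with apt-mark. The helper name is hypothetical; it only prints the commands so they can be reviewed before running, and it derives package names from the name_version_arch.deb file naming that dpkg-buildpackage produces:

```shell
# Print an "apt-mark hold" command for every .deb file in a directory.
# Package names are taken from the file names (name_version_arch.deb).
# Nothing is executed; the output is meant to be reviewed first.
hold_built_packages() {
    dir="${1:-.}"
    for deb in "$dir"/*.deb; do
        [ -e "$deb" ] || continue      # no .deb files: glob stayed literal
        pkg=$(basename "$deb")
        pkg=${pkg%%_*}                 # strip _version_arch.deb
        echo "sudo apt-mark hold $pkg"
    done
}

# Example: run from the work directory that contains the built packages.
hold_built_packages .
```

The holds can be released later with apt-mark unhold once a fixed version is available.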
Comment 1 claude 2018-07-27 13:09:33 UTC
Seems fixed in mesa-18.1.4; no freezes since upgrading.  I'm on a new kernel too, with amdgpu.dc=0 set for an unrelated issue.

$ uptime
 14:07:18 up 1 day, 12:48,  1 user,  load average: 16.86, 16.36, 15.81

$ uname -a
Linux eiskaffee 4.17.0-1-amd64 #1 SMP Debian 4.17.8-1 (2018-07-20) x86_64 GNU/Linux

$ clinfo | grep Device\ Name | head -n 1
  Device Name                                     Radeon RX 580 Series (POLARIS10, DRM 3.25.0, 4.17.0-1-amd64, LLVM 6.0.1)

$ apt-cache policy mesa-common-dev
mesa-common-dev:
  Installed: 18.1.4-1
  Candidate: 18.1.4-1
  Version table:
 *** 18.1.4-1 990
        990 http://ftp.uk.debian.org/debian buster/main amd64 Packages
        500 http://ftp.uk.debian.org/debian unstable/main amd64 Packages
        100 /var/lib/dpkg/status
Comment 2 claude 2018-08-08 22:25:35 UTC
Got plenty of freezes after upgrading to Debian Mesa 18.1.5-1, so I downgraded to 18.0.5 again.

Will retest 18.1.4 soonish.
Comment 3 claude 2018-09-11 13:55:55 UTC
So, I got some freezes with Linux 4.18.5 and Mesa 18.0.5 (the version of Mesa that I had assumed was bug-free), and so far (24 hours uptime) no freezes with Linux 4.18.7 and Mesa 18.1.7-1.  The Linux change log mentions some fixes for various deadlocks; perhaps I was just unlucky, and some Mesa versions were more likely than others to trigger these Linux bugs on my system.  I'll leave this open for another week or so of testing.
Comment 4 claude 2018-09-19 16:37:31 UTC
Closing this as I've had no problems for a week or two.

I debated selecting "not our bug" as the reason, as I think it's most likely a Linux kernel bug that just happened to be exposed more often by certain Mesa versions. But I have no proof of this, so I selected the more neutral "works for me".

