Kernel: 4.5.0-0.rc4.git2.2.fc24.x86_64
MESA: git master, Feb 19-20th builds
LLVM: trunk 3.9, Feb 19-20th builds
DDX:

For some reason, X crashes with radeonsi triggering a VM fault when in GNOME or KDE environments:

If you open up a shell console (gnome-terminal, konsole etc), run mock as non-root, as soon as an attempt to prompt for root happens (with gtksu) it locks up system.

[ 38.111551] radeon 0000:01:00.0: GPU fault detected: 146 0x0008480c
[ 38.111861] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000
[ 38.112219] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0804800C
[ 38.112576] VM fault (0x0c, vmid 4) at page 0, read from 'TC2' (0x54433200) (72)
[ 38.112931] radeon 0000:01:00.0: GPU fault detected: 146 0x0008440c
[ 38.113229] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x08000000
[ 38.113587] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x08044001
[ 38.113945] VM fault (0x01, vmid 4) at page 134217728, read from 'TC3' (0x54433300) (68)

I've attached a VM fault debug capture of the crash.
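For anyone reading the fault lines above, here is a minimal sketch of how the raw register values relate to the decoded "VM fault" lines. The bit positions are an assumption: they were chosen because they reproduce the values the kernel printed in this report (protections 0x0c/0x01, vmid 4, client IDs 72/68), and they are not taken from the register documentation. The function names are hypothetical.

```python
def decode_fault_status(status: int) -> dict:
    """Split a VM_CONTEXT1_PROTECTION_FAULT_STATUS value into fields.

    ASSUMPTION: the field layout below is inferred from the log lines in
    this report, not from the hardware documentation.
    """
    return {
        "protections": status & 0xFF,          # e.g. 0x0c in the first fault
        "client_id": (status >> 12) & 0xFF,    # e.g. 72 in the first fault
        "is_write": bool((status >> 24) & 1),  # 0 matches "read from"
        "vmid": (status >> 25) & 0xF,          # e.g. vmid 4
    }


def decode_client_tag(tag: int) -> str:
    """Turn the 32-bit client tag into text, e.g. 0x54433200 into 'TC2'."""
    raw = tag.to_bytes(4, "big")               # four ASCII bytes, NUL padded
    return raw.rstrip(b"\x00").decode("ascii")


if __name__ == "__main__":
    for status, tag in ((0x0804800C, 0x54433200), (0x08044001, 0x54433300)):
        print(decode_fault_status(status), decode_client_tag(tag))
```

The page number in the second fault is simply the decimal form of the address register value: 0x08000000 is 134217728.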
Created attachment 121873
R600_DEBUG=check_vm capture of VM fault
(In reply to Shawn Starr from comment #0)
> If you open up a shell console (gnome-terminal, konsole etc), run mock as
> non-root, as soon as an attempt to prompt for root happens (with gtksu) it
> locks up system.

Since you can retrieve the GPUVM fault messages, it's hard to believe that the system locks up completely. Can you try logging in via ssh after the problem occurs and getting a gdb backtrace of the Xorg process?
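One way to collect such a backtrace non-interactively from an ssh session is sketched below. It only builds and runs a standard gdb batch command; the helper name is hypothetical, the process ID has to be looked up by the caller, and attaching to the X server normally requires root.

```python
import subprocess
import sys
from typing import List


def gdb_backtrace_command(pid: int) -> List[str]:
    """Build a gdb command that attaches to `pid`, prints all thread
    backtraces, and exits without waiting for interactive input."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt full"]


if __name__ == "__main__":
    # Usage (as root, from the ssh session): python3 bt.py <Xorg pid>
    xorg_pid = int(sys.argv[1])
    result = subprocess.run(
        gdb_backtrace_command(xorg_pid),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    print(result.stdout)
```

Batch mode avoids having to type at a gdb prompt, which may help if the interactive session cannot be interrupted.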
This isn't limited to gtksu. I can reproduce it fairly quickly by playing around with a MATE desktop session (which seems to use GTK2). OTOH I haven't run into it with this GNOME3 session, which mostly uses GTK3, though with some GTK2 apps as well. I bisected it to 9aaf28da ("radeonsi: enable compiling one variant per shader"). I also confirmed that it happens with Marek's current si-one-variant branch as well as an older snapshot of that branch. Now the "fun" part will be tracking down which glamor shaders are broken by this and why. Meanwhile, it might be better to disable the single shader variant by default, especially on the 11.2 branch.
Attempting to attach gdb to X, I am unable to break out of gdb.

X info:

X.Org X Server 1.18.0
Release Date: 2015-11-09
X Protocol Version 11, Revision 0
Created attachment 121931
apitrace reproducing the problem

This apitrace reproduces the problem for me on Kaveri and Tonga.
Sadly, I can't reproduce this on Verde, Bonaire, or Tonga using the apitrace. Could you please get a new check_vm report with this branch?

https://cgit.freedesktop.org/~mareko/mesa/log/?h=ddebug-shader-dump
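A possible way to produce that report is to replay the attached trace with R600_DEBUG=check_vm set, as in the first attachment. The sketch below assumes the trace was saved locally as repro.trace (a hypothetical file name) and that the apitrace command-line tool and a Mesa build of the branch are already in use; the helper name is also hypothetical.

```python
import os
import subprocess
from typing import Dict, List, Optional, Tuple


def check_vm_replay(trace_path: str,
                    base_env: Optional[Dict[str, str]] = None
                    ) -> Tuple[List[str], Dict[str, str]]:
    """Return the replay command and an environment with check_vm enabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["R600_DEBUG"] = "check_vm"   # same setting as the earlier capture
    cmd = ["apitrace", "replay", trace_path]
    return cmd, env


if __name__ == "__main__":
    # "repro.trace" is a placeholder for the downloaded attachment.
    cmd, env = check_vm_replay("repro.trace")
    subprocess.run(cmd, env=env)
```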
Created attachment 121979
check_vm dump from ddebug-shader-dump branch
The bad news is the check_vm report probably doesn't contain the problematic shaders. The good news is I can reproduce this after updating LLVM, thus this is an LLVM bug. I'm bisecting.
The first bad commit:

commit 98ef4478258fda9028cd1786841eca952c136319
Author: Tom Stellard <thomas.stellard@amd.com>
Date: Fri Feb 12 23:45:29 2016 +0000

    AMDGPU/SI: Detect uniform branches and emit s_cbranch instructions

    Reviewers: arsenm

    Subscribers: mareko, MatzeB, qcolombet, arsenm, llvm-commits

    Differential Revision: http://reviews.llvm.org/D16603

    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260765 91177308-0d34-0410-b5e6-96231b3b80d8
Created attachment 121988
problematic shader
The problematic shader is attached. It has "s_branch" at the end and "ret" somewhere in the middle. My initial theory is that the shader fails to jump to the epilog, which is outside of the binary, and jumps somewhere else. It may even be stuck in an infinite loop due to an incorrect jump.
Got the same issue.

Gigabyte HD7870
Arch Linux x64
mesa-git - from mesa-git repo (http://pkgbuild.com/~lcarlier/mesa-git/)
kernel - linux-mainline 4.5.0-rc7-mainline

radeon 0000:01:00.0: GPU fault detected: 147 0x0c024801
radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0FFFF860
radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x02048001

The last working version was at commit 89d25a8 (mesa-11.2); the problems started after commit ff360a5 (mesa-11.3).

When I launch 'plank' or 'mate-system-monitor', everything freezes (on some versions/commits my mouse still works, on others it doesn't) and my Xorg server crashes. Sometimes the session restarts, but after login OpenGL is not available (glxinfo shows some errors).
The fix is under review: http://reviews.llvm.org/D17964
Fixed in LLVM SVN r263441.