Created attachment 105638 [details]
kernel.log file with kernel 3.17rc3

* Tested with both kernel 3.16.1 and kernel 3.17rc3, with and without hyperz
* OpenGL renderer string: Gallium 0.4 on AMD PITCAIRN
* OpenGL core profile version string: 3.3 (Core Profile) Mesa 10.4.0-devel (git-021e84f)

I can reproduce the lockup with the trace: http://pkgbuild.com/~lcarlier/trace/Sam3.tar.xz
I cannot reproduce it on Kabini with the same git version (021e84f). With Mesa built against current llvm-3.6 svn the trace replays fine, and when I build Mesa against LLVM 3.5 the apitrace just segfaults... In both cases there is no lockup. This is on Debian.
I get no lockup either, but I do see the same GPUVM protection faults: radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0FF00819 The FF bits make me suspect bits 32-4x of the GPUVM address are getting clobbered, maybe because of the LLVM backend generating invalid shader code.
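To make the address arithmetic above concrete, here is a minimal sketch. It assumes the fault register reports a 4 KiB page number rather than a byte address; that is an assumption about the register format, not something stated in this report. Under that assumption, 0x0FF00819 corresponds to a byte address whose bits 32-39 are all set, which is where the "FF bits" land. The helper names are hypothetical.

```python
PAGE_SHIFT = 12  # assumption: the fault register holds a 4 KiB page number


def fault_byte_address(fault_reg: int) -> int:
    """Convert the reported page number into a byte address."""
    return fault_reg << PAGE_SHIFT


def bits(value: int, lo: int, hi: int) -> int:
    """Extract bits lo..hi (inclusive) of value."""
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)


addr = fault_byte_address(0x0FF00819)
print(hex(addr))                 # 0xff00819000
print(hex(bits(addr, 32, 39)))   # 0xff, the suspicious high byte
print(hex(bits(addr, 0, 31)))    # 0x819000, the low 32 bits
```

If the low 32 bits are the intended address, then only the upper part looks wrong, which is consistent with the suspicion that the high half of a 64-bit address computation is being corrupted.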
Created attachment 105674 [details]
output of 'R600_DEBUG=ps,vs glretrace Sam3.trace'

LLVM is 3.6svn r216889
Link to the trace on Google Drive: https://drive.google.com/file/d/0B1WCo3k21FK3dTZmaFFmU2wwQzQ/edit?usp=sharing
(In reply to comment #2)
> I get no lockup either, but I do see the same GPUVM protection faults:
>
> radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0FF00819
>
> The FF bits make me suspect bits 32-4x of the GPUVM address are getting
> clobbered, maybe because of the LLVM backend generating invalid shader code.

For me there is nothing new in dmesg, but something very interesting happens here. When radeonsi.so is stripped, this trace segfaults for me; if it is not stripped, it replays fine with no segfault. What could that be? Hmm...
Just to note that this trace was produced with apitrace 5.0 and the following command line: GALLIUM_HUD=num-bytes-moved apitrace32 trace %command%
Created attachment 105676 [details]
segfault

(In reply to comment #5)
> For me there is nothing new in dmesg, but something very interesting happens
> here. When radeonsi.so is stripped, this trace segfaults for me; if it is not
> stripped, it replays fine with no segfault. What could that be? Hmm...

After a restart it works, but then it segfaults again, weird... This one was tried on a pure 32-bit OS.
@Laurent Carlier Is this a new issue, or maybe a regression? I don't have the SSAM3 game, but I remember from earlier versions that Serious Sam has a bunch of different settings. Maybe you can try some different settings, starting with Low or something; maybe only some of the settings trigger the issue, etc.
Also try some stable Mesa releases if you can, 10.2 or 10.3. I have very strange issues with 32-bit Mesa and apps; in particular, the build system in current git seems very broken for me: make install, the SSE41 macro compile needing much more CPU time, stripping not working properly, the default optimization level not being good (-O3 fixes it), etc.
Just tried with mesa-10.2.6/llvm-3.4.2 and the trace works fine, except for the following error from LLVM:

LLVM ERROR: ran out of registers during register allocation

Here are the flags used:

CPPFLAGS="-D_FORTIFY_SOURCE=2"
CFLAGS="-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong --param=ssp-buffer-size=4"
CXXFLAGS="-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong --param=ssp-buffer-size=4"
LDFLAGS="-Wl,-O1,--sort-common,--as-needed,-z,relro"
DEBUG_CFLAGS="-g -fvar-tracking-assignments"
DEBUG_CXXFLAGS="-g -fvar-tracking-assignments"
(In reply to comment #2)
> I get no lockup either, but I do see the same GPUVM protection faults:
>
> radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0FF00819
>
> The FF bits make me suspect bits 32-4x of the GPUVM address are getting
> clobbered, maybe because of the LLVM backend generating invalid shader code.

I've found a similar bug with an incorrect high part of the address. The problem was that the LLVM backend uses S_ADD/SUB_I32 for lowering 64-bit integer add/sub, but it should use the _U32 versions instead. I was going to send the patch but the fix is trivial: basically just replace all uses of S_ADD/SUB_I32 with S_ADD/SUB_U32. I'm not sure if you are hitting the same issue though.
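To illustrate why the choice of low-half add matters, here is a minimal Python model of a 64-bit add split into two 32-bit adds. It assumes the _U32 form sets its status flag on unsigned carry-out while the _I32 form sets it on signed overflow, and that the high-half add consumes that flag as carry-in. Those flag semantics are an assumption for this sketch; the comment above only says that _U32 should be used. All function names are hypothetical and model the idea, not the actual backend code.

```python
MASK32 = 0xFFFFFFFF


def add_u32(a: int, b: int):
    """32-bit add; flag = unsigned carry-out (assumed _U32 behaviour)."""
    total = a + b
    return total & MASK32, total >> 32


def add_i32(a: int, b: int):
    """32-bit add; flag = signed overflow (assumed _I32 behaviour)."""
    result = (a + b) & MASK32
    # Signed overflow: both operands differ in sign from the result.
    overflow = ((a ^ result) & (b ^ result)) >> 31
    return result, overflow


def add64(a: int, b: int, low_add) -> int:
    """64-bit add built from a low add plus a high add-with-carry."""
    lo, flag = low_add(a & MASK32, b & MASK32)
    hi = ((a >> 32) + (b >> 32) + flag) & MASK32
    return (hi << 32) | lo


# Low half wraps around: the carry into bit 32 is needed.
print(hex(add64(0xFFFFF000, 0x2000, add_u32)))  # 0x100001000 (correct)
print(hex(add64(0xFFFFF000, 0x2000, add_i32)))  # 0x1000 (carry lost)

# Low half crosses 0x80000000: no carry should happen.
print(hex(add64(0x7FFFF000, 0x2000, add_u32)))  # 0x80001000 (correct)
print(hex(add64(0x7FFFF000, 0x2000, add_i32)))  # 0x180001000 (spurious carry)
```

In this model the signed-overflow flag both drops real carries and injects false ones, so the high 32 bits of a computed address can end up wrong even though the low 32 bits are right.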
Created attachment 105709 [details] [review]
Fix suggested by Vadim

Can you try this patch?
(In reply to comment #12)
> Can you try this patch?

The patch fixes the GPUVM faults for me while replaying the apitrace.
(In reply to comment #12)
> Created attachment 105709 [details] [review]
> Fix suggested by Vadim
>
> Can you try this patch?

It doesn't fix the lockup for me.

I've tested mesa-git with llvm 3.4.3, both the trace and the game, and they both failed with the following error:

LLVM ERROR: Cannot select: 0x1671def0: i32 = truncate 0x16716ff4 [ORD=21] [ID=121]
  0x16716ff4: i128 = srl 0x1671cb14, 0x16717198 [ORD=21] [ID=102]
    0x1671cb14: i128,ch = load 0x166a9484, 0x167123bc, 0x16712e20<LD16[%32](tbaa=!"const")> [ORD=21] [ID=90]
      0x167123bc: i64,ch = CopyFromReg 0x166a9484, 0x16712330 [ID=81]
        0x16712330: i64 = Register %vreg66 [ID=2]
      0x16712e20: i64 = undef [ID=8]
    0x16717198: i32 = Constant<96> [ID=76]
In function: main
I can confirm that 8bd67231797e5d79d72a4e91b37ea81da30c6df3 fixes the hang. Thanks Marek, closing!
Bad luck, it's hanging again! -> reopened
Does this Mesa patch help? https://bugs.freedesktop.org/attachment.cgi?id=105755
(In reply to comment #17)
> Does this Mesa patch help?
>
> https://bugs.freedesktop.org/attachment.cgi?id=105755

No, it doesn't help.
Fixed with current Mesa trunk, so closing.