Bug 61747

Summary: [r600g] GPU lockup when playing WoW with HD6450 with htile enabled
Product: Mesa
Reporter: Chris Rankin <rankincj>
Component: Drivers/Gallium/r600
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED FIXED
QA Contact:
Severity: major
Priority: medium
CC: niels_ole
Version: git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Attachments: dmesg output, showing lockup and failed reset.
  Xorg.0.log file from the failed session.
  dmesg output showing GPU lockup

Description Chris Rankin 2013-03-03 17:06:46 UTC
Created attachment 75852
dmesg output, showing lockup and failed reset.

I started playing WoW on Fedora's kernel-3.8.1-201.fc18.x86_64, and the GPU locked up moments later. The GPU reset did not restore anything, and I was forced to "kill -HUP" the Xorg process from a virtual console.

Mesa's git HEAD is:

commit 0b6e72f8d75a31ef233ad5be0c9f59497880657f
Author: Brian Paul <brianp@vmware.com>
Date:   Fri Mar 1 17:36:34 2013 -0700

    st/mesa: add switch case for ir_txf_ms to silence warning
Comment 1 Chris Rankin 2013-03-03 17:07:37 UTC
Created attachment 75853
Xorg.0.log file from the failed session.
Comment 2 Chris Rankin 2013-03-03 17:08:49 UTC
I have successfully played WoW using this exact same version of Mesa and a stock x86_64 3.8.1 kernel with my RV790.
Comment 3 Chris Rankin 2013-03-03 17:25:59 UTC
WTF? According to glxinfo, it's not using r600g at all...?!?!

$ LIBGL_DEBUG=verbose LD_LIBRARY_PATH=/usr/local/lib64 glxinfo 
name of display: :1
libGL: screen 0 does not appear to be DRI2 capable
libGL: OpenDriver: trying /usr/local/lib64/dri/swrast_dri.so
libGL error: dlopen /usr/local/lib64/dri/swrast_dri.so failed (/usr/local/lib64/dri/swrast_dri.so: cannot open shared object file: No such file or directory)
libGL error: unable to load driver: swrast_dri.so
libGL error: failed to load driver: swrast
display: :1  screen: 0
direct rendering: No (If you want to find out why, try setting LIBGL_DEBUG=verbose)

OpenGL vendor string: VMware, Inc.
OpenGL renderer string: Gallium 0.4 on llvmpipe (LLVM 0x301)
OpenGL version string: 1.4 (2.1 Mesa 9.0.1)

It can't find swrast_dri.so because I didn't compile it, but r600g_dri.so is definitely there!

Fedora's gnome-shell process has suspiciously switched to swrast_dri.so too. I have no idea what has just happened.
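
A quick way to spot this kind of fallback is to check the renderer string that glxinfo prints. The sketch below is only an illustration: the function name uses_software_renderer is hypothetical, and it assumes that a renderer string mentioning llvmpipe or softpipe means software rendering is in use.

```shell
# Hypothetical helper (not part of Mesa or glxinfo): reads glxinfo output
# on stdin and succeeds if the renderer line names a software rasterizer.
uses_software_renderer() {
    # Assumption: llvmpipe and softpipe are the software renderers of interest.
    grep -Eq '^OpenGL renderer string:.*(llvmpipe|softpipe)'
}
```

It could be used as: glxinfo | uses_software_renderer && echo "fell back to software rendering"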
Comment 4 Chris Rankin 2013-03-03 17:58:46 UTC
After a reboot, the new glxinfo reports:

OpenGL vendor string: X.Org
OpenGL renderer string: Gallium 0.4 on AMD CAICOS
OpenGL version string: 3.0 Mesa 9.2-devel (git-0b6e72f)
OpenGL shading language version string: 1.30

I am guessing that it switched to swrast_dri after it failed to recover from the GPU lockup correctly.
Comment 5 Alex Deucher 2013-03-04 13:44:52 UTC
Can you bisect mesa?
Comment 6 Chris Rankin 2013-03-04 13:48:11 UTC
(In reply to comment #5)
> Can you bisect mesa?

I'm not sure that makes sense. The crashes started after Fedora upgraded its 3.7.9 kernel to one based on 3.8.1, and so I have no idea if there's even a "good" commit in Mesa to start bisecting from.
Comment 7 Alex Deucher 2013-03-04 15:14:39 UTC
(In reply to comment #6)
> (In reply to comment #5)
> > Can you bisect mesa?
> 
> I'm not sure that makes sense. The crashes started after Fedora upgraded its
> 3.7.9 kernel to one based on 3.8.1, and so I have no idea if there's even a
> "good" commit in Mesa to start bisecting from.

I'm guessing the kernel update enabled some additional feature in mesa (htile or async/cp dma support).  Does disabling htile support help?  Set env var R600_HYPERZ=0
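
For a test run, the variable can be set for a single command rather than for the whole session. A minimal sketch follows; the wrapper name run_without_hyperz is hypothetical, and only the R600_HYPERZ=0 setting comes from this report.

```shell
# Hypothetical wrapper: run one command with r600g HyperZ (htile) disabled.
run_without_hyperz() {
    # env sets R600_HYPERZ=0 only for the command that follows,
    # so the calling shell's environment is left unchanged.
    env R600_HYPERZ=0 "$@"
}
```

For example, run_without_hyperz glxinfo, or the same wrapper in front of whatever command normally starts the game.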
Comment 8 Chris Rankin 2013-03-06 20:40:43 UTC
(In reply to comment #7)
> Does disabling htile support help?  Set env var R600_HYPERZ=0

Yes, this prevents the GPU from locking up.
Comment 9 Alex Deucher 2013-03-06 21:35:35 UTC
Possibly a duplicate of bug 59592 or bug 60848.
Comment 10 Chris Rankin 2013-03-23 17:19:08 UTC
*** Bug 62577 has been marked as a duplicate of this bug. ***
Comment 11 Jerome Glisse 2013-04-24 19:23:48 UTC
Please check whether the patch below fixes the issue:

http://people.freedesktop.org/~glisse/0001-r600g-force-full-cache-for-hyperz.patch
Comment 12 Chris Rankin 2013-05-01 19:10:19 UTC
Created attachment 78735
dmesg output showing GPU lockup

No, it doesn't appear to. I compiled this version of Mesa after recompiling libdrm-2.4.44-2.fc19.src.rpm for F18:

OpenGL core profile version string: 3.1 (Core Profile) Mesa 9.2.0 (git-3bba787)
OpenGL core profile shading language version string: 1.40
OpenGL core profile context flags: (none)
Comment 13 Chris Rankin 2013-05-04 19:57:54 UTC
This version of Mesa is a lot more promising!

OpenGL vendor string: X.Org
OpenGL renderer string: Gallium 0.4 on AMD CAICOS
OpenGL core profile version string: 3.1 (Core Profile) Mesa 9.2.0 (git-8c347d4)
OpenGL core profile shading language version string: 1.40
OpenGL core profile context flags: (none)

No lock-ups so far!
Comment 14 Jerome Glisse 2013-05-20 14:44:19 UTC
Closing
