Bug 74868 - r600g: Diablo III Crashes After a few minutes
r600g: Diablo III Crashes After a few minutes
Status: RESOLVED FIXED
Product: Mesa
Classification: Unclassified
Component: Drivers/Gallium/r600
git
x86-64 (AMD64) Linux (All)
: medium normal
Assigned To: Default DRI bug account
:
Depends on:
Blocks: 85596
  Show dependency treegraph
 
Reported: 2014-02-12 02:20 UTC by Nick Tenney
Modified: 2014-10-29 11:37 UTC (History)
3 users (show)

See Also:
i915 platform:
i915 features:


Attachments
Xorg Log (41.77 KB, text/plain)
2014-02-12 02:20 UTC, Nick Tenney
Details
dmesg log (72.72 KB, text/plain)
2014-02-12 02:21 UTC, Nick Tenney
Details
Mesa PKGBUILD (6.89 KB, text/plain)
2014-02-12 02:26 UTC, Nick Tenney
Details
Diablo III Valgrind output (2.47 MB, text/plain)
2014-02-16 02:05 UTC, Nick Tenney
Details
D3 Valgrind output with debugging symbols (62.20 KB, text/plain)
2014-02-21 03:24 UTC, Nick Tenney
Details
r600g: Don't leak bytecode on shader compile failure (1.74 KB, patch)
2014-03-17 07:02 UTC, Michel Dänzer
Details | Splinter Review

Note You need to log in before you can comment on or make changes to this bug.
Description Nick Tenney 2014-02-12 02:20:19 UTC
Created attachment 93894 [details]
Xorg Log

Disclaimer: This is my first bug report, so I apologize if I do not attach enough information.

I tried booting up Diablo III a few days back to see the new geometry shaders in action on my 4890, and alas, I had a few minutes of happy gameplay followed by a startling black screen. I ran it in a terminal window, though wine was not kind enough to allow me to properly pipe the output (only grabbed the launcher window and gave up) it contained ooodles of r600_shader_select issues. There were thousands of the (type=1) -1 lines prior to the the -12 lines right before the crash:

EE r600_state_common.c:750 r600_shader_select - Failed to build shader variant (type=1) -1
EE r600_shader.c:157 r600_pipe_shader_create - translation from TGSI failed !
EE r600_state_common.c:750 r600_shader_select - Failed to build shader variant (type=1) -1
EE r600_shader.c:157 r600_pipe_shader_create - translation from TGSI failed !
EE r600_state_common.c:750 r600_shader_select - Failed to build shader variant (type=1) -1
EE r600_shader.c:157 r600_pipe_shader_create - translation from TGSI failed !
EE r600_state_common.c:750 r600_shader_select - Failed to build shader variant (type=1) -1
EE r600_shader.c:157 r600_pipe_shader_create - translation from TGSI failed !
EE r600_state_common.c:750 r600_shader_select - Failed to build shader variant (type=1) -1
EE r600_shader.c:157 r600_pipe_shader_create - translation from TGSI failed !
EE r600_state_common.c:750 r600_shader_select - Failed to build shader variant (type=1) -1
EE r600_shader.c:157 r600_pipe_shader_create - translation from TGSI failed !
EE r600_state_common.c:750 r600_shader_select - Failed to build shader variant (type=1) -12
EE r600_shader.c:157 r600_pipe_shader_create - translation from TGSI failed !
EE r600_state_common.c:750 r600_shader_select - Failed to build shader variant (type=1) -12
EE r600_shader.c:157 r600_pipe_shader_create - translation from TGSI failed !
EE r600_state_common.c:750 r600_shader_select - Failed to build shader variant (type=1) -12
EE r600_shader.c:157 r600_pipe_shader_create - translation from TGSI failed !
EE r600_state_common.c:750 r600_shader_select - Failed to build shader variant (type=1) -12
EE r600_shader.c:157 r600_pipe_shader_create - translation from TGSI failed !
EE r600_state_common.c:750 r600_shader_select - Failed to build shader variant (type=1) -12
EE r600_shader.c:157 r600_pipe_shader_create - translation from TGSI failed !
EE r600_state_common.c:750 r600_shader_select - Failed to build shader variant (type=1) -12
EE r600_shader.c:157 r600_pipe_shader_create - translation from TGSI failed !
EE r600_state_common.c:750 r600_shader_select - Failed to build shader variant (type=1) -12
EE r600_shader.c:157 r600_pipe_shader_create - translation from TGSI failed !
EE r600_state_common.c:750 r600_shader_select - Failed to build shader variant (type=1) -12
EE r600_shader.c:157 r600_pipe_shader_create - translation from TGSI failed !
EE r600_state_common.c:750 r600_shader_select - Failed to build shader variant (type=1) -12
EE r600_shader.c:157 r600_pipe_shader_create - translation from TGSI failed !
EE r600_state_common.c:750 r600_shader_select - Failed to build shader variant (type=1) -12
EE r600_texture.c:1003 r600_texture_transfer_map - failed to create temporary texture to hold untiled copy
Mesa: User error: GL_OUT_OF_MEMORY in glCompressedTexSubImage2D
err:d3d:wined3d_debug_callback 0x1c8178: "GL_OUT_OF_MEMORY in glCompressedTexSubImage2D".
err:d3d_surface:surface_upload_data >>>>>>>>>>>>>>>>> GL_OUT_OF_MEMORY (0x505) from glCompressedTexSubImage2DARB @ ../../../wine-1.7.12/dlls/wined3d/surface.c / 1688
EE r600_texture.c:1003 r600_texture_transfer_map - failed to create temporary texture to hold untiled copy
Mesa: User error: GL_OUT_OF_MEMORY in glCompressedTexSubImage2D
err:d3d:wined3d_debug_callback 0x1c8178: "GL_OUT_OF_MEMORY in glCompressedTexSubImage2D".
err:d3d_surface:surface_upload_data >>>>>>>>>>>>>>>>> GL_OUT_OF_MEMORY (0x505) from glCompressedTexSubImage2DARB @ ../../../wine-1.7.12/dlls/wined3d/surface.c / 1688
EE r600_texture.c:1003 r600_texture_transfer_map - failed to create temporary texture to hold untiled copy
Mesa: User error: GL_OUT_OF_MEMORY in glCompressedTexSubImage2D
err:d3d:wined3d_debug_callback 0x1c8178: "GL_OUT_OF_MEMORY in glCompressedTexSubImage2D".
err:d3d_surface:surface_upload_data >>>>>>>>>>>>>>>>> GL_OUT_OF_MEMORY (0x505) from glCompressedTexSubImage2DARB @ ../../../wine-1.7.12/dlls/wined3d/surface.c / 1688
EE r600_texture.c:1003 r600_texture_transfer_map - failed to create temporary texture to hold untiled copy
Mesa: User error: GL_OUT_OF_MEMORY in glCompressedTexSubImage2D
err:d3d:wined3d_debug_callback 0x1c8178: "GL_OUT_OF_MEMORY in glCompressedTexSubImage2D".
err:d3d_surface:surface_upload_data >>>>>>>>>>>>>>>>> GL_OUT_OF_MEMORY (0x505) from glCompressedTexSubImage2DARB @ ../../../wine-1.7.12/dlls/wined3d/surface.c / 1688
EE r600_texture.c:1003 r600_texture_transfer_map - failed to create temporary texture to hold untiled copy
Mesa: User error: GL_OUT_OF_MEMORY in glCompressedTexSubImage2D
err:d3d:wined3d_debug_callback 0x1c8178: "GL_OUT_OF_MEMORY in glCompressedTexSubImage2D".
err:d3d_surface:surface_upload_data >>>>>>>>>>>>>>>>> GL_OUT_OF_MEMORY (0x505) from glCompressedTexSubImage2DARB @ ../../../wine-1.7.12/dlls/wined3d/surface.c / 1688
EE r600_texture.c:1003 r600_texture_transfer_map - failed to create temporary texture to hold untiled copy
Mesa: User error: GL_OUT_OF_MEMORY in glCompressedTexSubImage2D
err:d3d:wined3d_debug_callback 0x1c8178: "GL_OUT_OF_MEMORY in glCompressedTexSubImage2D".
err:d3d_surface:surface_upload_data >>>>>>>>>>>>>>>>> GL_OUT_OF_MEMORY (0x505) from glCompressedTexSubImage2DARB @ ../../../wine-1.7.12/dlls/wined3d/surface.c / 1688
Comment 1 Nick Tenney 2014-02-12 02:21:45 UTC
Created attachment 93895 [details]
dmesg log
Comment 2 Nick Tenney 2014-02-12 02:26:32 UTC
Created attachment 93896 [details]
Mesa PKGBUILD

Distribution: Archlinux 64-bit
Kernel: 3.14rc2
Mesa: git (PKGBUILD attached with ops)
libdrm: 2.4.52
Comment 3 Michel Dänzer 2014-02-12 08:10:02 UTC
Looks like there's a memory leak somewhere. Does valgrind --leak-check=full give any hints? (Not sure if or how that can be used for Wine though)
Comment 4 Nick Tenney 2014-02-13 03:26:47 UTC
I took a quick peak at Valgrind, but Diablo III has some workarounds that seem to be messing with it. I'll toy around with it a bit more on the weekend. Thanks!
Comment 5 Nick Tenney 2014-02-15 17:37:18 UTC
I have tried various configurations with valgrind as recommended here: http://wiki.winehq.org/Wine_and_Valgrind

I have gotten it to begin loading, but before it completes, Diablo crashes with an internal error. Are there any other tools available or more limited valgrind traces I could run to track this down? I know wine stuff is tricky to work with.
Comment 6 Ilia Mirkin 2014-02-15 20:33:18 UTC
(In reply to comment #5)
> Are there any other tools available or more limited
> valgrind traces I could run to track this down? I know wine stuff is tricky
> to work with.

apitrace -- records all OpenGL calls. If replaying a trace causes the same leak, then you're all set. (You could also run valgrind on the trace replay.)
Comment 7 Nick Tenney 2014-02-16 02:05:20 UTC
Created attachment 94142 [details]
Diablo III Valgrind output

Thanks Ilia, that did the trick! I just grabbed the first few minutes of game play in Valgrind up until it started encountering the error that was popping previously.
Comment 8 Ilia Mirkin 2014-02-16 04:15:30 UTC
(In reply to comment #7)
> Created attachment 94142 [details]
> Diablo III Valgrind output
> 
> Thanks Ilia, that did the trick! I just grabbed the first few minutes of
> game play in Valgrind up until it started encountering the error that was
> popping previously.

I'm not a radeon guy, just a drive-by commenter, but I'm going to go ahead and guess that this output is going to be of little use to anyone debugging :(

A trace with symbols in r600_dri would be much more useful. With what you provided, we just know that the memory is allocated with malloc and calloc... which is of little surprise.
Comment 9 Nick Tenney 2014-02-16 17:21:23 UTC
Thanks for the help so far! I am happy to keep on debugging with more specificity, but will need some help as I'm still pretty new with this stuff. Let me know if there is anything I can provide to help shine light on this.
Comment 10 Michel Dänzer 2014-02-17 03:29:09 UTC
Make sure your /usr/lib/xorg/modules/dri/r600_dri.so has debugging symbols.
Comment 11 Nick Tenney 2014-02-21 03:24:23 UTC
Created attachment 94468 [details]
D3 Valgrind output with debugging symbols

Recompiled and ran Valgrind again. The resulting Valgrind looks more informative than previous attempts. Let me know if anything else would be helpful.
Comment 12 Nick Tenney 2014-03-07 03:49:34 UTC
Commands used:

$ apitrace32 wine Diablo\ III.exe
$ valgrind --leak-check=full --trace-children=yes --vex-iropt-register-updates=allregs-at-mem-access --workaround-gcc296-bugs=yes --log-file=d3-valgrind-debug_v2 --suppressions=$HOME/valgrind-suppressions apitrace replay d3debug_v2.trace

Realized I didn't have an apitrace on here, so I grabbed a new one for posting. Dropbox link is as follows:

https://www.dropbox.com/s/1zlzceaho5ym5pa/d3-trace.tar.bz2
sha256sum: f60ded613d5a9dd71629ec75ad2cc3b1f7c93ed80b22f92aeadd0f94f97ade19

My valgrind output was a bit above the limit too:
https://www.dropbox.com/s/uyqzxsf1ma5sqds/d3-valgrind-debug_v2
sha256sum: ffb4a35cfd714c78b228bfa0219445648190430324020924c8d04d3ce217a5dc
Comment 13 Nick Tenney 2014-03-15 01:22:07 UTC
I decided to try running with R600_DEBUG=hyperz,nosb to see what happened and I got another line in the shader failure output about the shader requiring 227 registers:

EE r600_state_common.c:750 r600_shader_select - Failed to build shader variant (type=1) -12
EE r600_shader.c:2183 r600_shader_from_tgsi - GPR limit exceeded - shader requires 227 registers
EE r600_shader.c:157 r600_pipe_shader_create - translation from TGSI failed !

Any tips on further debugging to find which shader is failing?
Comment 14 Michel Dänzer 2014-03-17 07:02:09 UTC
Created attachment 95909 [details] [review]
r600g: Don't leak bytecode on shader compile failure

Does this patch help?
Comment 15 Reiner Herrmann 2014-03-22 00:08:22 UTC
(In reply to comment #14)
> Created attachment 95909 [details] [review] [review]
> r600g: Don't leak bytecode on shader compile failure
> 
> Does this patch help?

Yes, this patch is helping, thank you! :)
I applied it to current Mesa 10.1 and was able to play Diablo 3 for several hours without crashing due to being out of memory.
Comment 16 640bugs 2014-03-29 14:28:24 UTC
(In reply to comment #14)
> Created attachment 95909 [details] [review] [review]
> r600g: Don't leak bytecode on shader compile failure
> 
> Does this patch help?

I can also confirm, that the game doesnt crash anymore.
Comment 17 Nick Tenney 2014-03-29 20:44:24 UTC
(In reply to comment #14)
> Created attachment 95909 [details] [review] [review]
> r600g: Don't leak bytecode on shader compile failure
> 
> Does this patch help?

Finally had a chance to test the patch. I compiled this morning with and without the patch and GALLIUM_HUD=fps,requested-VRAM,requested-GTT. It clearly leaked away without and the flow was stemmed with the proper error handling patch. Awesome!

I see that the patch properly destroys the shader on error now, which is definitely an improvement. The error spamming to stdout is still occurring, so I'm not sure whether that should be another bug report/left open in this one with a new name or anything.
Comment 18 John Lenz 2014-03-31 03:45:32 UTC
Just to confirm, I had this exact problem with mesa 10.1 and diablo 3, and the attached patch fixes the problem.  I played for ~1.5 hours with no crash, when before it would crash within 2-3 minutes.
Comment 19 Mike Mestnik 2014-04-06 13:45:46 UTC
The patch worked for me, binary download based on Oibaf package.

https://launchpad.net/~cheako/+archive/buildingrounds/+build/5883102/+files/libgl1-mesa-dri_10.2~git1404021931.0f641b~mmestnik~gd~s_i386.deb
Comment 20 Michel Dänzer 2014-04-09 06:35:26 UTC
(In reply to comment #17)
> The error spamming to stdout is still occurring, so I'm not sure whether that
> should be another bug report/left open in this one with a new name or anything.

Please file separate bugs for the shader compile failures.
Comment 21 Michel Dänzer 2014-04-10 06:42:15 UTC
Module: Mesa
Branch: master
Commit: ee2bcf38a4c8930d8f9cecfac580030a45c41dae
URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=ee2bcf38a4c8930d8f9cecfac580030a45c41dae

Author: Michel Dänzer <michel.daenzer@amd.com>
Date:   Wed Apr  9 15:38:21 2014 +0900

r600g: Don't leak bytecode on shader compile failure