Created attachment 101971 [details]
callstack of the wrong read access
I am trying to understand why "XCOM: Enemy Unknown" (a game that was released for Linux x64 last week on Steam) leads to crashes when running with the open source Radeon driver. The people who did the port do not officially support the open source driver but seem to be open to the idea of making it run on it.
Ubuntu 14.04 x64
Radeon HD 4870
Unfortunately that Radeon model is not supported by AMD's closed source driver. However, the game runs very well with the open source driver, with the exception of several random crashes. This bug report is one of the problems I have found and for which I have some information.
The main crash I was experiencing was corrupting the heap, so I used a tool to detect illegal read and write accesses. The first problem it found is an illegal read access at the start of the game, after the different logos, when the main menu is about to appear.
What is happening is that a memcpy with a size of 1360 bytes is made with a source buffer of only 1280 bytes in u_upload_data(). I will attach the callstack to this ticket.
In that context, the entry point in the driver is:
vbo_exec_DrawRangeElementsBaseVertex(GLenum mode,
                                     GLuint start, GLuint end,
                                     GLsizei count, GLenum type,
                                     const GLvoid *indices,
                                     GLint basevertex)
Here are the values of the parameters:
mode: 4 (GL_TRIANGLES)
type: 5123 (GL_UNSIGNED_SHORT)
indices: 72 bytes:
00 00 02 00 03 00 00 00 01 00 02 00 04 00 06 00
07 00 04 00 05 00 06 00 08 00 0a 00 0b 00 08 00
09 00 0a 00 0c 00 0e 00 0f 00 0c 00 0d 00 0e 00
00 00 04 00 06 00 00 00 06 00 02 00 01 00 03 00
07 00 01 00 07 00 05 00
reordered as 36 u16:
0000 0002 0003 0000 0001 0002 0004 0006
0007 0004 0005 0006 0008 000a 000b 0008
0009 000a 000c 000e 000f 000c 000d 000e
0000 0004 0006 0000 0006 0002 0001 0003
0007 0001 0007 0005
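As a quick sanity check (my own addition, not part of the driver), the 36 indices above describe 12 triangles and only reference vertices 0 through 15, so this particular draw expects at least 16 vertices in the bound arrays:

```c
#include <stdint.h>

/* The 36 u16 indices dumped above, copied verbatim. */
static const uint16_t xcom_indices[36] = {
    0x0000, 0x0002, 0x0003, 0x0000, 0x0001, 0x0002, 0x0004, 0x0006,
    0x0007, 0x0004, 0x0005, 0x0006, 0x0008, 0x000a, 0x000b, 0x0008,
    0x0009, 0x000a, 0x000c, 0x000e, 0x000f, 0x000c, 0x000d, 0x000e,
    0x0000, 0x0004, 0x0006, 0x0000, 0x0006, 0x0002, 0x0001, 0x0003,
    0x0007, 0x0001, 0x0007, 0x0005,
};

/* Return the highest vertex index referenced by a u16 index buffer. */
static uint16_t max_index(const uint16_t *idx, unsigned count)
{
    uint16_t max = 0;
    for (unsigned i = 0; i < count; i++)
        if (idx[i] > max)
            max = idx[i];
    return max;
}
```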
Now I'm no expert in OpenGL and this is the first time I have looked at Mesa code, so I can't identify the root cause of the error. Hopefully you will have a clue.
A tip for debugging the game: if you have Steam and the game, I renamed the main executable "game.x86_64" in ".../Steam/SteamApps/common/XCom-Enemy-Unknown/binaries/linux/" to "game.x86_64_real" and created a shell script named "game.x86_64" that performs the settings necessary for debugging and then launches "game.x86_64_real". In my case I am setting LD_PRELOAD with a library that catches malloc() calls and fences the buffer before returning it.
Created attachment 101972 [details]
Hi, your Mesa seems to be rather old. Did you try to update it to the newer version 10.2.x or even to git-master? It could be the case that your crash was already resolved. For Ubuntu you could use oibaf-ppa.
Hi, it seems to be a duplicate of this bug :
I have tried with Mesa git, and kernels 3.15 and 3.16-rc2; the crash still occurs.
(In reply to comment #2)
> Hi, your Mesa seems to be rather old. Did you try to update it to the newer
> version 10.2.x or even to git-master? It could be the case that your crash
> was already resolved. For ubuntu you could use oibaf-ppa.
Hi, I installed from oibaf-ppa and got version 10.3.0-devel (git-15b5e66 trusty-oibaf-ppa) but unfortunately I could not start any game with it, I only got several error messages, so I rolled back to 10.1.3 :-/
10.1.3 is the version released with Ubuntu 14.04, which dates from May; I saw that the 10.1 branch is now at 10.1.6.
I've been looking at the commits on the 10.1 branch but could not see anything relevant to this problem. I would like to know more about it and do step-by-step debugging inside the Mesa code, but for that I need to know how to build a debug version and make Ubuntu use my build instead of the installed one. I have no idea how to do that unfortunately :(
Maybe tracing the OpenGL calls leading up to the fatal glDrawRangeElementsBaseVertex call would help too. I heard Valve made a tool like that but I am not sure if it can act as an OpenGL validator.
(In reply to comment #3)
> hie, it seems to be a duplicate of this bug :
> I have tried with mesa git, and kernel 3.15 and 3.16 rc2 , the crash still
Hi. The bug you are referencing describes a crash that occurs later in the game, so I am not sure it is related to the bug described here, which occurs at the start of the game when the menu is about to be shown.
I am making progress: I can build r600_dri.so in debug mode, make programs use it instead of the version shipped with Ubuntu, and do step-by-step debugging. However some functions are located in libgallium.so and I can't figure out how to build that library. The closest I get is a static library in a hidden directory at "src/gallium/auxiliary/.libs".
I finally got all the needed libraries to build in debug mode and be used by the game! I followed the instructions on this page: http://x.debian.net/howto/build-mesa.html, with the exception of "--with-dri-drivers=r600", which didn't work but I guess is not important as gallium is now standard for r600, and with the addition of "--enable-debug".
With those debug libraries I do not get the crash at the very same moment, but about one second later. It is the same kind of problem though: a read past the end of the source buffer (I'll attach the call stack again). This time a memcpy of 520 bytes is attempted from a buffer of 512 bytes. Now that I can do step-by-step debugging with all symbols identified by gdb, I will try to understand what's happening.
Created attachment 102330 [details]
callstack of other similar problem when running with Mesa debug libs
This time 520 bytes were read but the buffer is 512 bytes only.
It seems that what is happening is that the "end" argument references a vertex that does not exist when glDrawRangeElementsBaseVertex is called. The documentation says that "start" and "end" are inclusive indices, so I guess a missing "-1" in the game code is the cause of the problem.
In the last callstack, the game informs the driver that vertices 0 to 64 (inclusive) should be used for drawing: vbo_exec_DrawRangeElementsBaseVertex(mode=4, start=0, end=64, count=168, type=5123, indices=0x7f22c87bfeb0, basevertex=0)
And later the driver will upload data with those arguments:
u_vbuf_upload_buffers(mgr=0x7f22efa9e760, start_vertex=0, num_vertices=65, start_instance=0, num_instances=1)
So an attempt to upload 65 vertices is made for the range [0..64], which is correct per the spec, but in my case the buffer only holds 64 elements.
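The sizes line up exactly with an inclusive off-by-one (my own arithmetic; the 8-byte stride is inferred from 520 / 65):

```c
/* glDrawRangeElements takes an inclusive [start, end] vertex range, so the
 * driver must upload end - start + 1 vertices. */
static unsigned vertices_to_upload(unsigned start, unsigned end)
{
    return end - start + 1;
}

/* With the values from the call above: start=0, end=64 -> 65 vertices.
 * The stride works out to 520 / 65 = 8 bytes per vertex, so the driver
 * reads 65 * 8 = 520 bytes from a buffer that only holds 64 * 8 = 512. */
```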
I do not know how the driver could protect itself from such a case. I've been looking at the glVertexPointer API, but the client does not inform the driver of the total size of the pointed-to buffer, so I don't see how to detect this kind of out-of-bounds error.
The only idea I have would be to scan all the indices to get the min and max values and issue a warning if they do not fit exactly within the "start" and "end" indices. However this would probably be too expensive, as the purpose of the "Range" functions seems to be precisely to inform the driver of the range so it does not need to compute it. Moreover, though it is inefficient to include unused vertices in the range, the spec does not seem to forbid it.
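For reference, the scan described above would look roughly like this (a hypothetical debug-only helper, not existing Mesa code):

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical debug-only check: warn when the [start, end] range declared
 * to glDrawRangeElements* is not exactly the min/max of the indices.
 * A loose range is legal per the spec, so this can only ever warn, and the
 * scan defeats the point of the Range variants, so it would belong in a
 * debug build only. */
static bool range_is_tight(const uint16_t *indices, unsigned count,
                           unsigned start, unsigned end)
{
    if (count == 0)
        return true;
    uint16_t min = indices[0], max = indices[0];
    for (unsigned i = 1; i < count; i++) {
        if (indices[i] < min) min = indices[i];
        if (indices[i] > max) max = indices[i];
    }
    if (min != start || max != end) {
        fprintf(stderr, "declared [%u, %u] but indices span [%u, %u]\n",
                start, end, min, max);
        return false;
    }
    return true;
}
```

In the failing call above, such a check would have flagged the end=64 declaration, assuming the game's indices really only go up to 63.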
Sadly, it seems there is nothing to do on the Mesa side. At least I will inform the game developers about my analysis and then close this bug.
Hadrien, your findings are correct. Can we close this bug now?
(In reply to comment #9)
> Hadrien, your findings are correct. Can we close this bug now?
Sure. I gave all the information to the game developers.
A new version of this game has been released which should fix this issue.
Marking as FIXED as the issue was resolved on the client side in 2014