Created attachment 46638 [details] openarena + verbose vertex buffer upload logs Test case : openarena + anholt benchmark + r600g I added several fprintf debug trace to mesa code around Vertex buffer uploading. Basically, openarena render things using : glVertexPointeer() glLockArraysEXT() several glDrawElements() glUnlockArraysEXT() It seems that each call to glDrawElements implies a reupload of the vertex buffers, even if it has not changed (as glLock/Unlock calls tell, at least for part of the buffer). Another related bug (it seems) is in : cso_set_vertex_buffers() which does a test before calling : util_copy_vertex_buffers and pipe->set_vertex_buffers This test always return true, thus the 2 above functions are always called. The test is always true because it memcmp all pipe_vertex_buffer, which contains a 'buffer' pointer, which changes at each frame (see st_draw.c:349). I'm trying to build a patch which fix the cso_set_vertex_buffers problem and then, taking advantage of glLock/Unlock calls to fix the upload issue. What do you think ?
(In reply to comment #0) > Created an attachment (id=46638) [details] > openarena + verbose vertex buffer upload logs > > Test case : openarena + anholt benchmark + r600g > > I added several fprintf debug trace to mesa code around Vertex buffer > uploading. > Basically, openarena render things using : > glVertexPointeer() > glLockArraysEXT() > several glDrawElements() > glUnlockArraysEXT() > > It seems that each call to glDrawElements implies a reupload of the vertex > buffers, even if it has not changed (as glLock/Unlock calls tell, at least for > part of the buffer). EXT_compiled_vertex_array is dead. It has been dead for almost 10 years. No new applications should use it, and old applications that use it should be fast enough on modern hardware. I really don't think we should add any code to Mesa to make this case fast. We'd be better off submitting patches to openarena to use VBOs instead. > Another related bug (it seems) is in : cso_set_vertex_buffers() which does a > test before calling : util_copy_vertex_buffers and pipe->set_vertex_buffers > This test always return true, thus the 2 above functions are always called. The > test is always true because it memcmp all pipe_vertex_buffer, which contains a > 'buffer' pointer, which changes at each frame (see st_draw.c:349). > > I'm trying to build a patch which fix the cso_set_vertex_buffers problem and > then, taking advantage of glLock/Unlock calls to fix the upload issue. > > What do you think ?
(In reply to comment #0) > What do you think ? You can't optimize the uploads because the sizes of all buffers are not known when glLockArraysEXT is called. The only things you've got are pointers to the buffers.
> EXT_compiled_vertex_array is dead. It has been dead for almost 10 years. No > new applications should use it, and old applications that use it should be fast > enough on modern hardware. I really don't think we should add any code to Mesa > to make this case fast. We'd be better off submitting patches to openarena to > use VBOs instead. That's why I asked for external opinions. I still feel that IF it can brings significant enough performance improvement, it could be useful - mainly because openarena and all games using quake3 based engine are used in every open-source drivers benchmarks.
(In reply to comment #2) > (In reply to comment #0) > > What do you think ? > > You can't optimize the uploads because the sizes of all buffers are not known > when glLockArraysEXT is called. The only things you've got are pointers to the > buffers. Sure, but I was thinking to something more like : - glLockArraysEXT does nothing more than curently - uploading a vertex buffer is done as before except in one case : when trying to upload the same buffer AND glLock is enabled => the uploading is skipped as data is already in GPU memory (which is the case shown in the attached trace file)
Note that ioquake3 will probably support VBOs soon: https://bugzilla.icculus.org/show_bug.cgi?id=4358
(In reply to comment #3) > That's why I asked for external opinions. I still feel that IF it can brings > significant enough performance improvement, it could be useful - mainly because > openarena and all games using quake3 based engine are used in every open-source > drivers benchmarks. Hey! A bug that can be solved by ignoring Phoronix!
Created attachment 46804 [details] [review] vertex buffer upload caching Here's a patch adding vertex buffer upload caching. It's not extensively tested (only with custom app and openarena), and is a bit hackish (mainly regarding : ctx->Array.NewState field, but it seems unused elsewhere). Anyway, it allowed me to do some testing and confirm the initial guess : it brings no performance improvements to openarena :-/ (but that was expected, as profiling the game tells that u_upload_data < 1% cpu time) So I guess the handling of EXT_compiled_vertex_array is useless - maybe with 1 exception : lower end video cards with lower bandwidth ? I can't test this scenario here, as I only have AMD HD4850.
Closing. Ian's comment explains it: > EXT_compiled_vertex_array is dead. It has been dead for almost 10 years. > No new applications should use it, and old applications that use it should > be fast enough on modern hardware.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.