Created attachment 105688 [details]
Glut sample code

The following behavior occurs on an Aaeon QM77 mainboard with Intel HD Graphics 4000 (Ivy Bridge) under Ubuntu 14.04 (kernel 3.13.0-34) with Mesa 10.2.2: copying HD bitmap data into a texture takes significantly longer than under Ubuntu 12.04 (kernel 3.2.0-36) with Mesa 8.0.4 on the same hardware.

Consider the following sample code:

// Init
GLuint texId = 0;
glGenTextures(1, &texId);   // was missing; the texture must be created first
glBindTexture(GL_TEXTURE_2D, texId);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, 1920, 1080, 0,
             GL_RGBA, GL_UNSIGNED_BYTE, 0);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glBindTexture(GL_TEXTURE_2D, 0);

// Render
unsigned char* hdData = ...;
glBindTexture(GL_TEXTURE_2D, texId);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 1920, 1080,
                GL_RGBA, GL_UNSIGNED_BYTE, hdData);
glBindTexture(GL_TEXTURE_2D, 0);

With Ubuntu 12.04 and Mesa 8.0.4 the render code takes approx. 6-8 ms. The exact same code on the same hardware, but with Ubuntu 14.04 and Mesa 10.2.2, takes 10-12 ms.

In our specific application the incoming bitmap data has the format RGBA and the dimensions 1920x1080. I experimented with different internalFormat/format/type combinations, but none of them matched the Mesa 8.0.4 results. I also tried pixel buffer objects, but no attempt reached the times of the older Linux/Mesa stack.

I attached a simple glut program demonstrating the effect by copying an array of 1920*1080*4 bytes into a texture object in each render call. The time measurement is done with boost chrono, with some glReadPixels() calls before and after the copy to make sure the render pipeline is executed during the measurement.

Another observation (not the topic of this issue, but possibly a hint towards the explanation): when I use the Qt framework and paint some overlay text into a fullscreen (1920x1080) QGLWidget using the renderText() methods, performance is nearly halved with Ubuntu 14.04 and Mesa 10.2.2.

I attached the results of glxinfo and Xorg.0.log for both configurations. Any help on how to optimize the behavior with the latest drivers would be much appreciated.
Created attachment 105689 [details] glxinfo for Ubuntu 14.04 and Mesa 10.2.2
Created attachment 105690 [details] Xorg log for Ubuntu 14.04 and Mesa 10.2.2
Created attachment 105691 [details] glxinfo for Ubuntu 12.04 and Mesa 8.0.4
Created attachment 105692 [details] Xorg log for Ubuntu 12.04 and Mesa 8.0.4
There have been a bunch of changes recently in the texture upload paths. Can you retest with Mesa master?
Yeah, this should be hitting the tiled_memcpy path and be crazy-fast. Even if it's not hitting that path, it should still be pretty good on recent mesa.
(In reply to comment #6)
> Yeah, this should be hitting the tiled_memcpy path and be crazy-fast.

You mean the SSSE3 code that's only built if you're using appropriate CFLAGS? :)
Thanks for the fast response. OK, I will try to build the latest Mesa sources. Just to make sure: Mesa master is version 10.3, correct? Are there any special flags/settings I should set for the build (e.g. should I enable the DRI drivers)? Or is it basically just configure, make, sudo make install? Thanks.
I'm not sure what flags ubuntu builds with, so you probably don't want to do a "sudo make install". Instead, just build and then set LD_LIBRARY_PATH=$BUILDDIR/lib LIBGL_DRIVERS_PATH=$BUILDDIR/lib where $BUILDDIR is wherever you built mesa. You may need to use "lib64" instead of "lib" depending on your system.
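A build-and-run sequence along these lines should work (a sketch only: the repository URL is the one in use at the time, the configure flags are an assumption and may differ per distribution, and ./my_glut_test is a hypothetical stand-in for the attached test program):

```shell
# Build Mesa from git master without installing it system-wide.
git clone git://anongit.freedesktop.org/mesa/mesa
cd mesa
./autogen.sh --enable-dri --with-dri-drivers=i965
make -j4

# Point the dynamic linker and libGL at the fresh build
# (use lib64 instead of lib on some systems).
export LD_LIBRARY_PATH=$PWD/lib
export LIBGL_DRIVERS_PATH=$PWD/lib
./my_glut_test   # hypothetical test binary
```

This leaves the distribution's Mesa packages untouched; unsetting the two environment variables reverts to the system driver.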
I ran the provided glut test program with today's master (95058bd) on IVB. The render code takes approx 3-6 ms.
Sounds like it's already fixed.
I compiled the 10.3 Mesa version and can confirm that the sample code now performs better than with the 8.0.4 Mesa version.

It also turned out that some glPixelStorei() calls were causing the performance drop; I forgot to include them in the sample code. I was using the following lines to split one large 1920x1080 side-by-side bitmap into 2 texture objects:

// Storing left texture.
glPixelStorei(GL_UNPACK_ROW_LENGTH, 1920);
glPixelStorei(GL_UNPACK_SKIP_PIXELS, 0);
glTexSubImage2D(...);

// Storing right texture.
glPixelStorei(GL_UNPACK_ROW_LENGTH, 1920);
glPixelStorei(GL_UNPACK_SKIP_PIXELS, 960);
glTexSubImage2D(...);

For some reason glTexSubImage2D() together with these glPixelStorei() calls performs better with the old Mesa version. I am now splitting the 1920x1080 texture inside the shader, so I no longer need the glPixelStorei() calls. In total I gained a performance increase, as glTexSubImage2D() without pixel-storage changes behaves better with the new Mesa versions.

I guess this ticket can be set to fixed then. Thanks.