Bug 83442 - [IVB regression] glTexSubImage2D performance
Summary: [IVB regression] glTexSubImage2D performance
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: 10.2
Hardware: x86 (IA32) Linux (All)
Importance: medium normal
Assignee: Ian Romanick
QA Contact: Intel 3D Bugs Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-09-03 14:57 UTC by Mark Newiger
Modified: 2014-09-15 14:49 UTC

See Also:
i915 platform:
i915 features:


Attachments
Glut sample code (2.44 KB, text/plain)
2014-09-03 14:57 UTC, Mark Newiger
glxinfo for Ubuntu 14.04 and Mesa 10.2.2 (17.12 KB, text/plain)
2014-09-03 14:59 UTC, Mark Newiger
Xorg log for Ubuntu 14.04 and Mesa 10.2.2 (22.56 KB, text/plain)
2014-09-03 15:01 UTC, Mark Newiger
glxinfo for Ubuntu 12.04 and Mesa 8.0.4 (12.27 KB, text/plain)
2014-09-03 15:04 UTC, Mark Newiger
Xorg log for Ubuntu 12.04 and Mesa 8.0.4 (25.66 KB, text/plain)
2014-09-03 15:05 UTC, Mark Newiger

Description Mark Newiger 2014-09-03 14:57:44 UTC
Created attachment 105688 [details]
Glut sample code

The following behavior happens on an Aaeon QM77 mainboard with Intel HD Graphics 4000 (Ivy Bridge), running Ubuntu 14.04 (3.13.0-34) and Mesa 10.2.2.

Copying HD bitmap data into a texture unit takes significantly longer than with Ubuntu 12.04 (3.2.0-36) and Mesa 8.0.4 (using the same hardware).

Consider the following sample code:
// Init
GLuint g_texId = 0;
glGenTextures(1, &g_texId);  // create a texture name before binding it
glBindTexture(GL_TEXTURE_2D, g_texId);
// Allocate storage only; the pixel data is uploaded in the render step.
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, 1920, 1080, 0, GL_RGBA, GL_UNSIGNED_BYTE, 0);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glBindTexture(GL_TEXTURE_2D, 0);
// Render
unsigned char* hdData = ...;
glBindTexture(GL_TEXTURE_2D, g_texId);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0,
                1920,
                1080,
                GL_RGBA,
                GL_UNSIGNED_BYTE,
                hdData);
glBindTexture(GL_TEXTURE_2D, 0);

With Ubuntu 12.04 and Mesa 8.0.4 the render code takes approx 6-8 ms. The exact same code on the same hardware but with Ubuntu 14.04 and Mesa 10.2.2 results in 10-12 ms.
In our specific application the bitmap data coming in has the format RGBA and the dimensions 1920x1080. I experimented with different internalFormat/format/types but no combination achieved the same results as with Mesa 8.0.4.
I also tried pixel buffer objects, but none of these attempts matched the timings of the older Linux/Mesa combination.

I attached a simple glut program demonstrating the effect by copying an array of 1920x1080*4 bytes into a texture object in each render call. The time measurement is done with boost chrono and some glReadPixels() calls before and after the copying, to make sure the render pipeline is executed during the time measurement.
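
The measurement approach described above can be sketched roughly as follows. This is not the attached glut program: it is a minimal C illustration that uses the C11 timespec_get() call instead of boost chrono, and the names frame_bytes, time_call_ms, copy_job and copy_frame are hypothetical. The copy_frame callback is only a memcpy stand-in; in a real test the callback would issue glTexSubImage2D() followed by glReadPixels() (or glFinish()) so that the upload has actually completed before the second timestamp is taken.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Bytes in one frame, e.g. 1920 x 1080 x 4 for RGBA with 8 bits per channel. */
static size_t frame_bytes(size_t width, size_t height, size_t bytes_per_pixel)
{
    return width * height * bytes_per_pixel;
}

/* Hypothetical stand-in for the upload step: a plain memcpy of the frame. */
struct copy_job {
    unsigned char *dst;
    const unsigned char *src;
    size_t size;
};

static void copy_frame(void *user)
{
    struct copy_job *job = user;
    memcpy(job->dst, job->src, job->size);
}

/* Wall-clock duration of one call to fn(user), in milliseconds.
 * TIME_UTC is not monotonic; a monotonic clock is preferable where available. */
static double time_call_ms(void (*fn)(void *), void *user)
{
    struct timespec t0, t1;

    assert(fn != NULL);
    timespec_get(&t0, TIME_UTC);
    fn(user);
    timespec_get(&t1, TIME_UTC);

    return (double)(t1.tv_sec - t0.tv_sec) * 1000.0 +
           (double)(t1.tv_nsec - t0.tv_nsec) / 1.0e6;
}
```

For the frame size discussed here, frame_bytes(1920, 1080, 4) is 8294400 bytes, so every millisecond saved per upload matters at video frame rates.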

Another observation (not the topic of this issue, but possibly a hint towards the explanation):
When I use the Qt framework and paint some overlay text into a fullscreen (1920x1080) QGLWidget using the renderText() methods, the performance is nearly halved with Ubuntu 14.04 and Mesa 10.2.2.

I attached the results of glxinfo and Xorg.0.log for both configurations.

Any help on how to optimize the behavior with the latest drivers would be much appreciated.
Comment 1 Mark Newiger 2014-09-03 14:59:35 UTC
Created attachment 105689 [details]
glxinfo for Ubuntu 14.04 and Mesa 10.2.2
Comment 2 Mark Newiger 2014-09-03 15:01:32 UTC
Created attachment 105690 [details]
Xorg log for Ubuntu 14.04 and Mesa 10.2.2
Comment 3 Mark Newiger 2014-09-03 15:04:28 UTC
Created attachment 105691 [details]
glxinfo for Ubuntu 12.04 and Mesa 8.0.4
Comment 4 Mark Newiger 2014-09-03 15:05:12 UTC
Created attachment 105692 [details]
Xorg log for Ubuntu 12.04 and Mesa 8.0.4
Comment 5 Ian Romanick 2014-09-04 18:23:47 UTC
There have been a bunch of changes recently in the texture upload paths.  Can you retest with Mesa master?
Comment 6 Jason Ekstrand 2014-09-05 05:07:28 UTC
Yeah, this should be hitting the tiled_memcpy path and be crazy-fast.  Even if it's not hitting that path, it should still be pretty good on recent mesa.
Comment 7 Matt Turner 2014-09-05 06:17:49 UTC
(In reply to comment #6)
> Yeah, this should be hitting the tiled_memcpy path and be crazy-fast.

You mean the SSSE3 code that's only built if you're using appropriate CFLAGS? :)
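
For readers unfamiliar with the point about CFLAGS, here is a generic sketch of how a fast path can be gated at compile time. It is not Mesa's source code: swap_rb_copy is a hypothetical function that swaps the red and blue channels of 4-byte pixels while copying. GCC and Clang define __SSSE3__ only when the build enables SSSE3 (for example with -mssse3), so without that flag only the scalar loop is compiled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __SSSE3__
#include <tmmintrin.h>  /* SSSE3 intrinsics, including _mm_shuffle_epi8 */
#endif

/* Copy 'pixels' 4-byte pixels from src to dst, swapping bytes 0 and 2. */
static void swap_rb_copy(uint8_t *dst, const uint8_t *src, size_t pixels)
{
    size_t i = 0;

    assert(dst != NULL && src != NULL);

#ifdef __SSSE3__
    /* Fast path: shuffle four pixels (16 bytes) per iteration. */
    const __m128i shuf = _mm_setr_epi8(2, 1, 0, 3, 6, 5, 4, 7,
                                       10, 9, 8, 11, 14, 13, 12, 15);
    for (; i + 4 <= pixels; i += 4) {
        __m128i v = _mm_loadu_si128((const __m128i *)(src + i * 4));
        _mm_storeu_si128((__m128i *)(dst + i * 4), _mm_shuffle_epi8(v, shuf));
    }
#endif

    /* Scalar path: handles everything without SSSE3, or the leftover tail. */
    for (; i < pixels; i++) {
        dst[i * 4 + 0] = src[i * 4 + 2];
        dst[i * 4 + 1] = src[i * 4 + 1];
        dst[i * 4 + 2] = src[i * 4 + 0];
        dst[i * 4 + 3] = src[i * 4 + 3];
    }
}
```

Both paths produce the same bytes; the build flags only decide which one ends up in the binary, which is why a distribution build may behave differently from a local one.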
Comment 8 Mark Newiger 2014-09-05 15:26:49 UTC
Thanks for the fast response.

Ok, I will try to install the latest Mesa sources.
Just to make sure:
Mesa master is version 10.3, correct?
Are there any special flags/settings I should use for the build (e.g. should I enable the DRI drivers)?
Or is it basically just configure, make, sudo make install?

Thanks.
Comment 9 Jason Ekstrand 2014-09-05 20:13:00 UTC
I'm not sure what flags ubuntu builds with, so you probably don't want to do a "sudo make install".  Instead, just build and then set

LD_LIBRARY_PATH=$BUILDDIR/lib
LIBGL_DRIVERS_PATH=$BUILDDIR/lib

where $BUILDDIR is wherever you built mesa.  You may need to use "lib64" instead of "lib" depending on your system.
Comment 10 Samuel Iglesias Gonsálvez 2014-09-12 08:06:26 UTC
I ran the provided glut test program with today's master (95058bd) on IVB. The render code takes approx 3-6 ms.
Comment 11 Jason Ekstrand 2014-09-12 17:40:21 UTC
Sounds like it's already fixed.
Comment 12 Mark Newiger 2014-09-15 14:49:11 UTC
I compiled the 10.3 Mesa version and can confirm that the sample code is now performing better than the 8.0.4 Mesa version.
It also turned out that some glPixelStorei() calls were causing the performance drop. I forgot to add them to the sample code.
I was using the following lines to split one large 1920x1080 side-by-side bitmap into 2 texture objects:
// Storing left texture.
glPixelStorei(GL_UNPACK_ROW_LENGTH, 1920);
glPixelStorei(GL_UNPACK_SKIP_PIXELS, 0);
glTexSubImage2D(...)
// Storing right texture.
glPixelStorei(GL_UNPACK_ROW_LENGTH, 1920);
glPixelStorei(GL_UNPACK_SKIP_PIXELS, 960);
glTexSubImage2D(...)
For some reason glTexSubImage2D() together with these glPixelStorei() calls performs better with the old Mesa version.
I am now splitting the 1920x1080 texture inside the shader, so I do not need the glPixelStorei() calls any more. In total I gained performance, as glTexSubImage2D() without changed pixel storage settings behaves better with the new Mesa versions.
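
To illustrate what the two pixel storage parameters select, here is a small CPU-side sketch of the source addressing for tightly packed RGBA data. It is an illustration only, not driver code: unpack_rgba_rect is a hypothetical helper, and it assumes 4 bytes per pixel, GL_UNPACK_SKIP_ROWS of 0 and the default unpack alignment, so that each source row is exactly row_length * 4 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy a width x height rectangle out of an RGBA source image.
 * row_length plays the role of GL_UNPACK_ROW_LENGTH (pixels per source row),
 * skip_pixels the role of GL_UNPACK_SKIP_PIXELS (pixels skipped in each row). */
static void unpack_rgba_rect(unsigned char *dst, const unsigned char *src,
                             size_t row_length, size_t skip_pixels,
                             size_t width, size_t height)
{
    const size_t bpp = 4;  /* bytes per RGBA pixel */

    assert(skip_pixels + width <= row_length);

    for (size_t y = 0; y < height; y++) {
        /* Start of the selected pixels within source row y. */
        const unsigned char *row = src + (y * row_length + skip_pixels) * bpp;
        memcpy(dst + y * width * bpp, row, width * bpp);
    }
}
```

With row_length 1920, the left half corresponds to skip_pixels 0 and width 960, and the right half to skip_pixels 960 and width 960. Each destination row is then gathered from a strided source row rather than from one contiguous block, which is the access pattern the glPixelStorei() calls above request.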

I guess this ticket can be set to fixed then.

Thanks.

