System Environment:
----------------------------------------------
Platform: HSW/BYT/IVB
Libdrm: (master)2.4.47-11-gda738d1ed0a0941a0c
Mesa: (master) git-a594cec
Xf86_video_intel: (master)2.99.905-102-g922a8bab89c1a59
Cairo: (master)56a195a76554abe1d5567c733ba679058fe01303
Kernel: (drm-intel-nightly) 164a4cb4c1431a0

Bug detailed description:
----------------------------------------------
Lightsmark v2008 performance dropped by ~30%. The problem occurs under both gnome-session and raw X. It is a Mesa regression; bisecting shows that the first bad commit is:

59b01ca252bd6706f08cd80a864819d71dfe741c
Author: Fredrik Höglund <fredrik@kde.org>
Date:   Tue Apr 9 20:54:25 2013 +0200

    mesa: Add ARB_vertex_attrib_binding

    update_array() and update_array_format() are changed to update the new attrib and binding states, and the client arrays become derived state.

    Reviewed-by: Eric Anholt <eric@anholt.net>

Performance
--------------------------------------------------------------------
Test lightsmark on ULT with Raw X
git-a594cec: 49.40
git-c6a3fb6: 75.68

Reproduce steps:
---------------------------------------------
1. xinit&
2. vblank_mode=0 ./backend silent 1920x1080
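For reference, the relative drop implied by the two scores above can be checked with a few lines of Python. This is only a sketch: it assumes both numbers are average FPS (higher is better) and that git-c6a3fb6 is the good baseline; the helper name `percent_drop` is made up for illustration.

```python
def percent_drop(baseline_fps: float, regressed_fps: float) -> float:
    """Relative drop from the baseline score, in percent."""
    return (baseline_fps - regressed_fps) / baseline_fps * 100.0


# Scores from the report above (assumed to be FPS).
good = 75.68  # git-c6a3fb6
bad = 49.40   # git-a594cec

print(f"{percent_drop(good, bad):.1f}% drop relative to git-c6a3fb6")
```

Measured against the good build this comes out closer to 35% than 30%, so the "~30%" figure should be read as a rough estimate.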
The most likely explanation is the overhead from computing the gl_client_arrays in _mesa_update_state(). Lightsmark makes a lot of calls to glDrawElements() each frame, and changes the vertex arrays between those calls. That being said, I can't reproduce the regression with r600g. I get approximately 115 FPS with and without the ARB_vertex_attrib_binding changes. Could you investigate further?
Looking at other test data from the relevant time period:
* Why is the Lightsmark test marked as "blocked" on HSW D?
* On HSW ULT & Mobile, the performance drop is ~30% between 11-10 and 11-11; on IVB D & M it's 37%; on BYT, the performance drop between 11-07 and 11-11 is ~45%.
* On HSW D, x11perf "aa10text" regressed by 13% during this interval and "rgb10text" regressed by 68%. The x11perf tests are CPU bound, but this could be unrelated, as on BYT "aa10text" got a 45% increase in performance.
* There's no performance drop in any other tests on any other tested HW.

> Lightsmark makes a lot of calls to glDrawElements() each frame,

Nope. It's an old program using glBegin/glEnd...

Stuff that Lightsmark does in every frame:
44922 glBindTexture (avg = 27.4 calls in each of 1638 frames)
42369 glEnable (avg = 25.9 calls in each of 1638 frames)
42336 glActiveTexture (avg = 25.8 calls in each of 1638 frames)
30908 glVertex2f (avg = 18.9 calls in each of 1638 frames)
30908 glMultiTexCoord2f (avg = 18.9 calls in each of 1638 frames)
22780 glGetUniformLocation (avg = 13.9 calls in each of 1638 frames)
22000 glMatrixMode (avg = 13.4 calls in each of 1637 frames)
20253 glDisable (avg = 12.4 calls in each of 1638 frames)
13310 glGetIntegerv (avg = 8.1 calls in each of 1638 frames)
9737 glUniform1i (avg = 5.9 calls in each of 1638 frames)
9582 glBlendFunc (avg = 5.9 calls in each of 1637 frames)
7727 glEnd (avg = 4.7 calls in each of 1638 frames)
7727 glBegin (avg = 4.7 calls in each of 1638 frames)
7586 glUseProgram (avg = 4.6 calls in each of 1638 frames)
7255 glUniform4f (avg = 4.4 calls in each of 1638 frames)
6714 glPushMatrix (avg = 4.1 calls in each of 1637 frames)
6714 glPopMatrix (avg = 4.1 calls in each of 1637 frames)
6639 glLoadMatrixd (avg = 4.1 calls in each of 1637 frames)
6534 glDepthMask (avg = 4.0 calls in each of 1638 frames)
4917 glXMakeContextCurrent (avg = 3.0 calls in each of 1638 frames)
4837 glClear (avg = 3.0 calls in each of 1637 frames)
4762 glAlphaFunc (avg = 2.9 calls in each of 1637 frames)
3954 glTexImage2D (avg = 33.5 calls in each of 118 frames)
3754 glMultMatrixd (avg = 2.3 calls in each of 1637 frames)
3274 glLoadIdentity (avg = 2.0 calls in each of 1637 frames)
3267 glGetBooleanv (avg = 2.0 calls in each of 1638 frames)
2708 glViewport (avg = 1.7 calls in each of 1638 frames)
2509 glCallList (avg = 1.5 calls in each of 1637 frames)
2464 glGetFloatv (avg = 1.5 calls in each of 1637 frames)
1991 glColor3f (avg = 1.2 calls in each of 1637 frames)
1922 glUniform3f (avg = 1.2 calls in each of 1637 frames)
1736 glUniform4fv (avg = 1.1 calls in each of 1635 frames)
1638 glFlush (avg = 1.0 calls in each of 1638 frames)
1637 glOrtho (avg = 1.0 calls in each of 1637 frames)
1637 glColor4f (avg = 1.0 calls in each of 1637 frames)

Stuff that Lightsmark does in some frames:
60558 glClientActiveTexture (avg = 348.0 calls in each of 174 frames)
60378 glTexCoordPointer (avg = 347.0 calls in each of 174 frames)
30787 glVertexPointer (avg = 176.9 calls in each of 174 frames)
30786 glDrawElements (avg = 176.9 calls in each of 174 frames)
30441 glNormalPointer (avg = 174.9 calls in each of 174 frames)
29929 glColorPointer (avg = 173.0 calls in each of 173 frames)
28877 glCullFace (avg = 166.0 calls in each of 174 frames)
6702 glTexParameteri (avg = 36.8 calls in each of 182 frames)
3685 glPixelStorei (avg = 31.2 calls in each of 118 frames)
2347 glBindFramebufferEXT (avg = 3.5 calls in each of 680 frames)
1583 glFramebufferTexture2DEXT (avg = 2.3 calls in each of 680 frames)
1583 glEnableClientState (avg = 9.1 calls in each of 174 frames)
1569 glGenTextures (avg = 65.4 calls in each of 24 frames)
1360 glColorMask (avg = 2.0 calls in each of 680 frames)
933 glGetTexLevelParameteriv (avg = 1.4 calls in each of 684 frames)
883 glDisableClientState (avg = 5.1 calls in each of 174 frames)
819 glCheckFramebufferStatusEXT (avg = 1.2 calls in each of 680 frames)
736 glPolygonOffset (avg = 1.1 calls in each of 680 frames)
680 glClearDepth (avg = 1.0 calls in each of 680 frames)
218 glUniformMatrix4fv (avg = 2.2 calls in each of 101 frames)
177 glColor4ub (avg = 1.0 calls in each of 174 frames)
167 glClearColor (avg = 2.0 calls in each of 84 frames)
167 glReadBuffer (avg = 2.0 calls in each of 83 frames)
167 glDrawBuffer (avg = 2.0 calls in each of 83 frames)
83 glUniform2f (avg = 1.0 calls in each of 83 frames)
83 glReadPixels (avg = 1.0 calls in each of 83 frames)
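To make the "every frame" versus "some frames" split easier to read, here is a rough Python sketch that recomputes the per-frame average and the share of frames for a few of the calls listed above. The counts are copied from the summary; `TOTAL_FRAMES` is an assumption (the largest frame count seen in the summary), and `summarize` is a hypothetical helper, not part of any tracing tool.

```python
# (total calls, number of frames in which the call appears),
# copied from the trace summary above.
stats = {
    "glBegin": (7727, 1638),
    "glDrawElements": (30786, 174),
    "glVertexPointer": (30787, 174),
}

# Assumption: the trace has 1638 frames in total, the largest
# frame count that appears in the summary.
TOTAL_FRAMES = 1638


def summarize(total_calls: int, frames: int) -> tuple[float, float]:
    """Return (average calls per active frame, percent of frames that are active)."""
    return total_calls / frames, frames / TOTAL_FRAMES * 100.0


for name, (total_calls, frames) in stats.items():
    avg, share = summarize(total_calls, frames)
    print(f"{name}: {avg:.1f} calls/frame in {share:.1f}% of frames")
```

Read this way, glDrawElements shows up in only about a tenth of the frames, but those frames issue far more vertex array changes and draws than the glBegin/glEnd path does.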
Verified it with Mesa git-45a56ce39.
(In reply to comment #3)
> Verified it with Mesa git-45a56ce39.

Can you bisect that? I don't see anything in the range a594cec..45a56ce39. The only thing that worries me more than unexplained bugs is unexplained fixes. :(

Also, is the original performance drop reproducible on the 10.0 branch? Did the kernel change at any point?
(In reply to comment #4)
By bisecting, the commit that fixed it is:

commit ff353c218a1ab1fd3fb797a4780612ec4b0451d8
Author: Fredrik Höglund <fredrik@kde.org>
Date:   Mon Nov 11 18:54:15 2013 +0100

    mesa: Fix derived vertex state not being updated in glCallList()

    AEcontext::NewState is not always set when the vertex array state is changed.
(In reply to comment #4) It's reproducible on 10.0 branch.
(In reply to comment #6)
> (In reply to comment #4)
> It's reproducible on 10.0 branch.

But ff353c218a1ab1fd3fb797a4780612ec4b0451d8 was cherry-picked to the 10.0 branch on November 15 according to git.
(In reply to comment #7)
Hi, it's reproducible on the 10.0 branch because git-59b01ca also exists on 10.0. It's also fixed by patch ff353c218 on 10.0.