Bug 27277 - i965: Severe 3d performance regression 7.7 -> 7.8-rc2 (bisected)
Summary: i965: Severe 3d performance regression 7.7 -> 7.8-rc2 (bisected)
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965 (show other bugs)
Version: unspecified
Hardware: Other All
: medium normal
Assignee: Kristian Høgsberg
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-03-23 18:23 UTC by Nick Bowler
Modified: 2010-05-16 11:21 UTC (History)
1 user (show)

See Also:
i915 platform:
i915 features:


Attachments
Video of the tunnel demo being jerky. (909.26 KB, video/x-matroska)
2010-03-23 18:31 UTC, Nick Bowler
Details
dri2: Reverse invalidate and swapbuffer compatibility ordering. (1.62 KB, patch)
2010-04-06 15:10 UTC, Nick Bowler
Details | Splinter Review
dri2: Delete invalidate compatibility code. (2.13 KB, patch)
2010-04-06 18:30 UTC, Nick Bowler
Details | Splinter Review
dri2: Fix compatibility with old servers. (1.65 KB, patch)
2010-04-30 17:46 UTC, Nick Bowler
Details | Splinter Review
Make tunnel demo use frontbuffer rendering with glFlush. (501 bytes, patch)
2010-05-10 11:52 UTC, Nick Bowler
Details | Splinter Review
intel_always_throttle.patch (507 bytes, patch)
2010-05-16 10:50 UTC, Francisco Jerez
Details | Splinter Review

Description Nick Bowler 2010-03-23 18:23:31 UTC
After "upgrading" to mesa 7.8-rc2 from 7.7, I experience _abysmal_
performance in all but the most trivial OpenGL programs, often to the
point where they are no longer usable.  Neverball is completely
unplayable (but it works perfectly with mesa 7.7).  Most of the mesa
demos also experience problems.  The tunnel demo is so bad that I
captured it on camera and am attaching the video to this bug.

The 3D performance problems in one program affect the entire desktop:
_every_ application becomes choppy while one 3D program is running
slowly.  It's somewhat hard to describe: things appear to render in
pulses.  See the attached video.

I'm using a ThinkPad T500 with an Intel GM45 card, with xf86-video-intel-2.10.0, xorg-server-1.7.6 and Linux 2.6.34-rc2.

Bisection reveals the following:

61d26bc82e7c4100acfb551cbb0ba9d84bbc4ba5 is the first bad commit
commit 61d26bc82e7c4100acfb551cbb0ba9d84bbc4ba5
Author: Francisco Jerez <currojerez@riseup.net>
Date:   Mon Feb 8 19:27:56 2010 +0100

    dri2: Event driven buffer validation.
    
    When a buffer invalidation event is received from the X server, the
    "invalidate" hook of the DRI2 flush extension is executed: A generic
    implementation (dri2InvalidateDrawable) is provided that just bumps
    the "pStamp" sequence number in __DRIdrawableRec.
    
    For old servers not supporting buffer invalidation events, the
    invalidate hook will be called before flushing the fake front/back
    buffer (that's typically once per frame -- not a lot worse than the
    situation we were in before).
    
    No effort has been made on preserving backwards compatibility with
    version 2 of the flush extension, but I think it's acceptable because
    AFAIK no released stack is making use of it.
    
    Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>

:040000 040000 7e6b36bc02f7d5e058b2f4125fe0f8b0f735efa3 0ea06a25121475553f89abc2992c662f94370f4e M	include
:040000 040000 5dc9b9f149fbdf5876564611293deaa35872a4a0 9b67d805c245acf5b54840bd7d98d973cac67dac M	src

git bisect start
# bad: [516334b7fff5e0167d3f3fbcd15de08b5ca89747] i965: Stop abusing ctx->NewState flags for storing driver internal changes.
git bisect bad 516334b7fff5e0167d3f3fbcd15de08b5ca89747
# good: [7e210b93376ab6fef63995c48d7b1766c4335ad8] mesa: set version string to 7.7
git bisect good 7e210b93376ab6fef63995c48d7b1766c4335ad8
# good: [51362a75a70f982dc076064ff266e8eb6a0e3a8b] intel: intelScreenContext() is no longer used
git bisect good 51362a75a70f982dc076064ff266e8eb6a0e3a8b
# good: [bee9964b29b2428ee75e2d1efc0e1d2c2518a417] Merge remote branch 'origin/master' into lp-binning
git bisect good bee9964b29b2428ee75e2d1efc0e1d2c2518a417
# bad: [e3114d3f0ff45f6e3ef38c59cceb9b6923b7b0eb] Cygwin build fix: Fix linkage
git bisect bad e3114d3f0ff45f6e3ef38c59cceb9b6923b7b0eb
# good: [2717d9066d46bff9b015f3d17dc05ce1335d0883] Merge branch 'master' of git+ssh://git.freedesktop.org/git/mesa/mesa
git bisect good 2717d9066d46bff9b015f3d17dc05ce1335d0883
# good: [2b4575f16d24a212b9a43cbd4a9966b3668e4b32] os: Make streams abstract.
git bisect good 2b4575f16d24a212b9a43cbd4a9966b3668e4b32
# bad: [0583c29313a4bcd01cc8d30846c815cdc775cb16] progs/demos: update GL version test to accept GL 3.x
git bisect bad 0583c29313a4bcd01cc8d30846c815cdc775cb16
# good: [1587eff1b6d6ab74c302ca2ccb767d917ce3b1dd] progs/redbook: Temporarily disable the demos that extensions not part of windows to fix build.
git bisect good 1587eff1b6d6ab74c302ca2ccb767d917ce3b1dd
# bad: [41b19c279a0eae61f0f95c3b66376a25635241fd] nouveau: fix legacy dri driver build
git bisect bad 41b19c279a0eae61f0f95c3b66376a25635241fd
# good: [97ec6076f596a3820a9b7c4b7eae18bd98c6676b] glx: Move GetGLXDRIDrawable() prototype to glxclient.h
git bisect good 97ec6076f596a3820a9b7c4b7eae18bd98c6676b
# bad: [215d0dae6151e83ca4dc1a65c96d56b0835d27e7] glx: Only register wire handlers for the events the server supports
git bisect bad 215d0dae6151e83ca4dc1a65c96d56b0835d27e7
# bad: [61d26bc82e7c4100acfb551cbb0ba9d84bbc4ba5] dri2: Event driven buffer validation.
git bisect bad 61d26bc82e7c4100acfb551cbb0ba9d84bbc4ba5
# good: [925b901ba313a3ddd7567eca088951be39414430] dri2: Allocate cliprect as part of the __DRIdrawableRec
git bisect good 925b901ba313a3ddd7567eca088951be39414430
Comment 1 Nick Bowler 2010-03-23 18:31:07 UTC
Created attachment 34382 [details]
Video of the tunnel demo being jerky.
Comment 2 Kristian Høgsberg 2010-03-24 12:42:21 UTC
I'm have a GM45 here and I can not reproduce the problem.  The tunnel demo runs at 170 fps here and openarena and neverball are both at normal framerates.  Can you try a make clean and the build mesa from scratch?
Comment 3 Nick Bowler 2010-03-24 13:48:01 UTC
Every single build in that bisection run was a completely clean rebuild
of mesa, done by:

 git clean -xfd
 ./autogen.sh --with-dri-drivers=swrast,i965 --enable-xcb --disable-gallium
 make -j3.

Removing the --enable-xcb and --disable-gallium configure options does
not change anything.  I can reproduce the problem on another machine,
this one is a desktop with a G45.  The issue occurs on the master
branch, as well.

Comment 4 Nick Bowler 2010-03-24 18:48:07 UTC
OK: if I install *both* xf86-video-intel-2.10.903 and xorg-server-1.7.99.902, the problem is resolved.  With just one, the problem persists.

If this sort of dependency on pre-release drivers and server is intended, it should at least be mentioned in the release notes for 7.8.
Comment 5 Nick Bowler 2010-03-24 20:49:30 UTC
Well, I may have spoken too soon.  While the tunnel demo seems to run
fine now, neverball is still jerky even after upgrading xf86-video-intel
and xorg-server (albeit rather less so).  Once again, it's perfectly
smooth if I use mesa 7.7, thus I ran another bisection.

Due to compile errors near the end, the bisection result was not
conclusive, but it seems like the cause is different this time
around.

There are only 'skip'ped commits left to test.
The first bad commit could be any of:
5e1851b144a97bd577409dd5c6f3f6f45b4ff56f
c7fc9bfb2207638a479ddaff3ad108ffd9cd294a
7aed23c36288c2b343073d6d06ca0ea167805cd3
We cannot bisect more!
Comment 6 Nick Bowler 2010-04-05 13:07:39 UTC
Some more data points:

 * The issue does not occur if I set LIBGL_ALWAYS_INDIRECT=1.
 * The issue does not occur if I run compiz.
 * The issue *does* still occur if I run xcompmgr.
Comment 7 Nick Bowler 2010-04-06 15:10:25 UTC
Created attachment 34732 [details] [review]
dri2: Reverse invalidate and swapbuffer compatibility ordering.

After studying the commit implicated by the first bisection, I believe I have found the issue.  Please apply the attached patch, which fixes this bug.

[PATCH] dri2: Reverse invalidate and swapbuffer compatibility ordering.

After bisecting a major jerkiness regression on my GM45 to commit
61d26bc82e7c4100a ("dri2: Event driven buffer validation."), I noticed
that the patch reversed the ordering between the swapbuffers
compatibility code and the invalidate compatibility code.

Restoring the original ordering (swapbuffers followed by invalidate)
fixes the stuttering problems.

Fixes fdo bug #27277.

Signed-off-by: Nick Bowler <nbowler@draconx.ca>
Comment 8 Nick Bowler 2010-04-06 18:30:57 UTC
Created attachment 34742 [details] [review]
dri2: Delete invalidate compatibility code.

So the last patch is wrong.  It fixes the problem for backbuffer rendering only, but (presumably) only because the return in the !swapbuffers case prevents the !invalidate code from running.  The previous patch does not fix the issue for frontbuffer rendering.  It turns out that everything works when the compatibility code is deleted completely.

[PATCH] dri2: Delete invalidate compatibility code.

Commit 61d26bc82e7c4100a ("dri2: Event driven buffer validation.")
introduced severe stuttering on my GM45 when using X server 1.7.
This appears to have been caused by the compatibility code added
to handle the case where the server does not emit invalidate events.

Removing the code entirely solves the issue.

Fixes fdo bug #27277.

Signed-off-by: Nick Bowler <nbowler@draconx.ca>
Comment 9 toolman33 2010-04-26 12:51:44 UTC
Dell Inspiron 1545 Laptop here with GM45 graphics and I do indeed have the issue shown in the video. The patch to remove invalidation code fixes the isssue for me. Running an up to date Arch Linux system.
Comment 10 Nick Bowler 2010-04-30 08:41:00 UTC
OK, I've done some more testing.  First, I cannot reproduce any failures on my
r600 card: it works with and without the patch with both servers 1.7 and 1.8
(this might be a fluke).  So all this is done with xf86-video-intel-2.11 on two
machines, one with a G45 and the other with a GM45.  I'm also using mesa git
master, but it's just as broken on the 7.8 branch.

Next, both servers 1.7.6 and 1.8.0 are detected as not supporting invalidate
events.  As a result, the invalidate compatibility code (the code removed by my
patch) is run on both servers.  Perhaps a notable difference is that 1.8.0
supports swap events, but 1.7.6 does not.

With server 1.7.6, the invalidate compatibility code is run and causes the
problem shown in the video.  Applying the patch to delete the code totally
solves the problem with this server.

With server 1.8.0, the invalidate compatibility code is run and seems to be
required: applying the patch to delete the code causes the GPU to hang,
requiring a reboot to recover.  [seems to be caused by running out of memory,
the X log gets spammed with "(WW) intel(0): divisor 0 get vblank counter
failed: Cannot allocate memory"].
Comment 11 Nick Bowler 2010-04-30 17:46:56 UTC
Created attachment 35351 [details] [review]
dri2: Fix compatibility with old servers.

This is really looking now like it's because 1.7.6 does not support swap
events: the issue is exactly reproducible on 1.8.0 by forcing swapAvailable to
0.

So here's a new patch which both solves the problem *and* works on both the old
and new servers!

[PATCH v3] dri2: Fix compatibility with old servers.

Commit 61d26bc82e7c4100a ("dri2: Event driven buffer validation.")
introduced code to handle "old" servers which do not support invalidate
events (i.e. 1.8.0).  However, the compatibility code causes severe
stuttering on my GM45 with "older" servers that do not support swap
events (i.e. 1.7.6).  It seems that the solution is to only run the
compatibility code when swap events are supported.

Fixes fdo bug #27277.

Signed-off-by: Nick Bowler <nbowler@draconx.ca>
Comment 12 Nick Bowler 2010-05-03 11:55:59 UTC
Even with the last patch, frontbuffer rendering using glFlush() is still broken (even on server 1.8.0).  glFinish() is OK, however...
Comment 13 Kristian Høgsberg 2010-05-10 11:34:57 UTC
(In reply to comment #12)
> Even with the last patch, frontbuffer rendering using glFlush() is still broken
> (even on server 1.8.0).  glFinish() is OK, however...

When you say it's "broken" you mean that it's still very slow, right?
Comment 14 Nick Bowler 2010-05-10 11:52:39 UTC
Created attachment 35550 [details] [review]
Make tunnel demo use frontbuffer rendering with glFlush.

Yes, "broken" here means "inordinately jerky, as shown in the video".

Attached is a patch which modifies the tunnel demo to use frontbuffer rendering
with glFlush.  This causes the jerkiness shown in the video even with latest
git mesa and server 1.8.0, with or without my compatibility fix patch.

This works fine with mesa 7.7.
Comment 15 Kristian Høgsberg 2010-05-10 12:49:30 UTC
(In reply to comment #14)
> Created an attachment (id=35550) [details]
> Make tunnel demo use frontbuffer rendering with glFlush.
> 
> Yes, "broken" here means "inordinately jerky, as shown in the video".
> 
> Attached is a patch which modifies the tunnel demo to use frontbuffer rendering
> with glFlush.  This causes the jerkiness shown in the video even with latest
> git mesa and server 1.8.0, with or without my compatibility fix patch.
> 
> This works fine with mesa 7.7.

Great.  I tracked this down independently with another reproducer and then
remembered your bug report.  The problem is that when we're frontbuffer
rendering, intel_prepare_render() to update our drawbuffers we end up calling
intel_flush() which then ends up in dri2FlushFrontBuffer() which calls
DRI2CopyRegion() and then invalidates the buffers.  Which means that when we
call intel_prepare_renderbuffers() next time, we end up doing the whole thing
(including  DRI2CopyRegion()) again.  intel_prepare_render() was supposed to be
a quick cheap check in most cases...

I'll commit an edited version of your patch - we don't need the invalidate call
in dri2FlushFrontBuffer() at all, so I'll take that out, and in
dri2SwapBuffers, I'll just move the dri2InvalidateBuffers() call below the
!pdp->swapAvailable test.
Comment 16 Kristian Høgsberg 2010-05-10 14:51:33 UTC
Fixes pushed to mesa master, please give it a try and close the bug if it fixes the problem.  Thanks!
Comment 17 Nick Bowler 2010-05-10 18:02:06 UTC
Just tried mesa master: we're not quite in the clear yet.  Backbuffer rendering with glutSwapBuffers and frontbuffer rendering with glFlush are fine (both server versions).

However, frontbuffer rendering with glFinish has regressed: nothing ever appears in the window.  We get an empty window frame (with the usual visual problems).  This occurs with both servers 1.7 and 1.8.
Comment 18 Kristian Høgsberg 2010-05-10 18:11:55 UTC
Argh, of course.  That should be fixed now.
Comment 19 Nick Bowler 2010-05-10 18:22:20 UTC
Looks like victory to me!  Thanks a lot.
Comment 20 Francisco Jerez 2010-05-16 10:50:24 UTC
Created attachment 35696 [details] [review]
intel_always_throttle.patch

(In reply to comment #19)
> Looks like victory to me!  Thanks a lot.

Not quite yet, nouveau-classic and all the gallium drivers depend
on the compat code this patch has removed (see [1]). IMHO we do
want this code (because otherwise we'll have to put the
glViewport hacks back and that'd be pretty ugly, especially in
the case of gallium. Moreover it wouldn't be completely correct
because the client isn't required to call glViewport() on
resize).

I suspect that your problem is being caused by the intel code
that tries to decide whether to turn frame throttling on or
off. If that's the case I think we could fix it using the new
useInvalidate loader extension Kristian has implemented or
something similar.

Could you try the attached patch and check if it has the same
effect as 2d00d16da7f5d2255cb37b48edaf4cbb9ca7e930?

[1] http://lists.freedesktop.org/archives/mesa-dev/2010-May/000692.html
Comment 21 Nick Bowler 2010-05-16 11:19:53 UTC
Yes, your patch solves all jerkiness issues with both server versions, just
like 2d00d16da7f5d does.

Furthermore, your patch *also* fixes fixes bug 28072!
Comment 22 Nick Bowler 2010-05-16 11:21:26 UTC
Er, I forgot to say that I applied the patch on top of the 7.8 branch.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.