Bug 34156 - r600g performance regression between 780c183b and 862ebb41
Summary: r600g performance regression between 780c183b and 862ebb41
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Gallium/r600 (show other bugs)
Version: git
Hardware: x86-64 (AMD64) Linux (All)
: medium normal
Assignee: Default DRI bug account
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-02-10 19:06 UTC by Dave Witbrodt
Modified: 2012-02-22 10:07 UTC (History)
1 user (show)

See Also:
i915 platform:
i915 features:


Attachments
Final output of 'git bisect log': a77e813d is first bad commit -- r600g: Split r600_bc_alu_src [Henri Verbeet] (2.18 KB, text/plain)
2011-02-12 12:56 UTC, Dave Witbrodt
Details

Description Dave Witbrodt 2011-02-10 19:06:24 UTC
Performance suddenly improved by 50-150% across the board in late January, so I've been very happy with Mesa r600g lately!  I've been trying to do quick builds for testing every couple of days when I get home from work, watching out for any changes in performance.

My last build was at commit 780c183b (Feb. 6), and that was working great.

Tonight's build was at commit 862ebb41, and performance was degraded severely.  For example, the previous build allowed torcs to hit fps in mid 30's, occasionally hitting up to 60 fps; tonight I'm getting mid 20's, occasionally hitting 30 fps.  Other apps, and even the desktop, are noticeably more sluggish.  (Reinstalling the previous packages restores excellent performance!)

I will try to bisect this, but some home improvement projects will prevent me from attempting it until tomorrow night at the soonest, and more likely Saturday morning.  Sorry for the delay.


Hardware:  Radeon HD 5750 (Evergreen JUNIPER)

Software:
  kernel 2.6.37 (+ radeon cherry-picks from drm-fixes)
  libdrm 2.4.23
  xorg-server 1.9.4
  xf86-video-ati 6.14.0
Comment 1 Tobias Jakobi 2011-02-11 02:56:31 UTC
780c183b = http://cgit.freedesktop.org/mesa/mesa/commit/?id=780c183b8fdf2d301e1eea7f0b83cd96fb6cbf84
862ebb41 = http://cgit.freedesktop.org/mesa/mesa/commit/?id=862ebb411b911f28bc93316e9e68c42f69f4dff3

I would guess it's either Marek's work on the vertex upload manager or Henri's patches for relative adressing. Since performance is involved my vote would be for Marek :)
Comment 2 Dave Witbrodt 2011-02-12 12:43:16 UTC
(In reply to comment #1)
> 780c183b =
> http://cgit.freedesktop.org/mesa/mesa/commit/?id=780c183b8fdf2d301e1eea7f0b83cd96fb6cbf84
> 862ebb41 =
> http://cgit.freedesktop.org/mesa/mesa/commit/?id=862ebb411b911f28bc93316e9e68c42f69f4dff3
> 
> I would guess it's either Marek's work on the vertex upload manager or Henri's
> patches for relative adressing. Since performance is involved my vote would be
> for Marek :)

I'm a fan of Marek's recent work, as mentioned in my off-topic comment here:

    https://bugs.freedesktop.org/show_bug.cgi?id=33867#c1

I just finished bisecting this regression, and your hunch was wrong about Marek but correct about Henri.  I will post my results next.
Comment 3 Dave Witbrodt 2011-02-12 12:56:02 UTC
Created attachment 43304 [details]
Final output of 'git bisect log':  a77e813d is first bad commit -- r600g: Split r600_bc_alu_src [Henri Verbeet]

After bisecting this regression, the last run of 'git bisect' gave this output:

a77e813de32643ae2dfffd7ad12abed596172cab is first bad commit
commit a77e813de32643ae2dfffd7ad12abed596172cab
Author: Henri Verbeet <hverbeet@gmail.com>
Date:   Mon Feb 7 15:22:07 2011 +0100

    r600g: Split r600_bc_alu_src.
    
    The r600_bc_alu_src structure is used in two different ways, as a vector and
    for the individual channels of that same vector. This is somewhat fragile,
    and probably confusing.

:040000 040000 d8beff954fcb75012395bb5068c10d14088150bd 0456b8700e9170a15554f2a425374b17b891c485 M	src


Before this commit, I can get min fps of 30+ on the 'torcs' track called "E-Road", with an upper bound of 60+ fps occasionally.  At, or after, this commit the upper bound is about 33 fps, roughly the same as the minimum frame rate I was getting before.

I made two naive attempts to revert a77e813d in a local branch based on the HEAD commit I had available (a6b7393e... of Feb. 12), but neither would build because the resulting conflict requires more massaging than simply keeping the "before" or "after" choices offered by git.

At any rate, I'll be sticking with the commit before the one causing the regression (3b1c1f02) until it is decided by the devs whether this is a bug or a feature.
Comment 4 Tobias Jakobi 2011-02-12 13:06:12 UTC
(In reply to comment #2)
> I'm a fan of Marek's recent work, as mentioned in my off-topic comment here:
Haha, me too. I mean, how could you _not_ be a fan of Marek's work! ;)


>     https://bugs.freedesktop.org/show_bug.cgi?id=33867#c1
> 
> I just finished bisecting this regression, and your hunch was wrong about Marek
> but correct about Henri.  I will post my results next.
Looks like another point for Marek then :)
Comment 5 Dave Witbrodt 2011-02-18 11:21:39 UTC
In comment 3, I made two naive attempts to revert the problem commit.  Both failed, but I am only able to work on this stuff in my spare time -- and I have not been able to look at this more carefully until today.

I discovered that several commits following the problem commit touch the same code in 'src/gallium/drivers/r600/r600_shader.c', so today I decided to create a branch from Mesa master at the last of those commits, revert them in reverse order, then merge master onto the branch.  I expected to have the benefits of code added since Feb. 7 without the performance regression identified in this bug.

The commits involved were:

    $ git log a77e813d^..HEAD src/gallium/drivers/r600/r600_shader.c

    commit 077c448d184799e0d9ec962013ec784c6a5c1807
    Author: Henri Verbeet <hverbeet@gmail.com>
    Date:   Mon Feb 7 15:22:08 2011 +0100

        r600g: Add support for relative addressing on constant buffers.

    commit 7687eabaa0470261e059a2d6502628fffd209345
    Author: Henri Verbeet <hverbeet@gmail.com>
    Date:   Mon Feb 7 15:22:07 2011 +0100

        r600g: Split constants in r600_shader_from_tgsi().

    commit 1fa95c7f9e7f1b63364b1f9c6289690418cf6313
    Author: Henri Verbeet <hverbeet@gmail.com>
    Date:   Mon Feb 7 15:22:07 2011 +0100

        r600g: Do the tgsi_full_src_register to r600_shader_src conversion in
    r600_s

    commit a77e813de32643ae2dfffd7ad12abed596172cab
    Author: Henri Verbeet <hverbeet@gmail.com>
    Date:   Mon Feb 7 15:22:07 2011 +0100

        r600g: Split r600_bc_alu_src.

I branched from 077c448d, reverted each of these, then merged master onto the branch.  It merged cleanly.

Unfortunately, this did not restore performance.  From that, I am concluding that later code changes introduced at least one other regression.  I will be busy with other tasks for the rest of the day, but tomorrow I will try some more involved attempts to find out what is going on.
Comment 6 Dave Witbrodt 2011-02-19 22:28:35 UTC
I created a local branch using 077c448d as the start point, then reverted the 4 commits mentioned in Comment 5.  Performance was restored.

Since merging "master" onto this local branch resulted in loss of performance again, I went looking for the cause.  Using 'git merge master' relocated the revert commits to the top/newest part of the list, so I had to use 'git rebase'.  After some experimentation, I was able to create a branch with the following properties:

1. identical commits to "master" until 077c448d
2. local commits reverting 077c448d, 7687eaba, 1fa95c7f, and a77e813d
3. commits identical to those from "master" after 077c448d up through fd8d4b32.


With the commits applied in this order, I was able to investigate whether:

a. reverting Henri's 4 commits would restore performance [yes]
b. there was one or more commits added later which degraded performance [yes]


What follows is a sort of map of what I found using 'git bisect' after reverting 077c448d, 7687eaba, 1fa95c7f, and a77e813d; I use ellipses to indicate commits omitted, while commits not separated by an ellipsis appear consecutively in 'git log':


[TESTED MANUALLY:  BAD]
commit fd8d4b32ede6ebeae332539b71d38c36420e2654
Author: Marek Olšák <maraeo@gmail.com>
Date:   Fri Feb 18 15:29:00 2011 +0100

    r300g: remove tracking whether vertex buffers need to be validated

...


[BAD]
commit 467023e8080489abeff53e18ac83560eaf851827
Author: Marek Olšák <maraeo@gmail.com>
Date:   Tue Feb 8 15:21:35 2011 +0100

    r600g: use the same upload buffer for vertices, indices, and constants

    
[GOOD]
commit 80424700574e128070f457d79e1920d512a1efda
Author: Thomas Hellstrom <thellstrom@vmware.com>
Date:   Tue Feb 8 01:14:22 2011 -0800

    mesa/st: Plug a fragment program variant parameter leak

...


[TESTED MANUALLY:  GOOD]
commit bb1036aae5b31af583b9aa263fec73c899383493
Author: Thomas Hellstrom <thellstrom@vmware.com>
Date:   Mon Feb 7 15:37:05 2011 +0100

    mesa/st: Fix vertex buffer leak
    

[IGNORED; IRRELEVANT ON MY SYSTEM]
commit 1e1b89103e1cdce9f20b9f32825dd166f8d9d6e1
Author: Kristian Høgsberg <krh@bitplanet.net>
Date:   Mon Feb 7 20:49:58 2011 -0500

    wayland-egl: Add struct wl_egl_display argument to +wl_egl_window_create()


[4 LOCAL COMMITS HERE REVERTING 077c448d, 7687eaba, 1fa95c7f, & a77e813d]


commit 077c448d184799e0d9ec962013ec784c6a5c1807
Author: Henri Verbeet <hverbeet@gmail.com>
Date:   Mon Feb 7 15:22:08 2011 +0100

    r600g: Add support for relative addressing on constant buffers.


commit 871460eb149b9868e5750f13b8206e271743c4a2
Author: Henri Verbeet <hverbeet@gmail.com>
Date:   Mon Feb 7 15:22:08 2011 +0100

    r600g: Set the fetch type in r600_bc_vtx_build().


commit 4c30a80e384d523803d70ead3403cf3ca30b8868
Author: Henri Verbeet <hverbeet@gmail.com>
Date:   Mon Feb 7 15:22:07 2011 +0100

    r600g: Handle the ADD_INT instruction in r600_bc_get_num_operands().


commit 5c59eebfae55240a2308c02b0a6ad971c9b83304
Author: Henri Verbeet <hverbeet@gmail.com>
Date:   Mon Feb 7 15:22:07 2011 +0100

    r600g: Generalize the pipe_add_vertex_attrib() functions.

    
commit b9fd1a1e4b2121225195056ea1b679d62c399ddb
Author: Henri Verbeet <hverbeet@gmail.com>
Date:   Mon Feb 7 15:22:07 2011 +0100

    r600g: Remove vs_resource and ps_resource from the pipe context.

    
commit 7687eabaa0470261e059a2d6502628fffd209345
Author: Henri Verbeet <hverbeet@gmail.com>
Date:   Mon Feb 7 15:22:07 2011 +0100

    r600g: Split constants in r600_shader_from_tgsi().


commit 1fa95c7f9e7f1b63364b1f9c6289690418cf6313
Author: Henri Verbeet <hverbeet@gmail.com>
Date:   Mon Feb 7 15:22:07 2011 +0100

    r600g: Do the tgsi_full_src_register to r600_shader_src conversion in
    r600_shader_from_tgsi().


commit a77e813de32643ae2dfffd7ad12abed596172cab
Author: Henri Verbeet <hverbeet@gmail.com>
Date:   Mon Feb 7 15:22:07 2011 +0100

    r600g: Split r600_bc_alu_src.

    
[GOOD]
commit 3b1c1f02537544a11772b94a8f2e8c3d4c886ca8
Author: Henri Verbeet <hverbeet@gmail.com>
Date:   Mon Feb 7 15:22:07 2011 +0100

    r600g: Store literal values in the r600_bc_alu_src structure.
    
...


As the bisect made clear, there was a second regression present after the one I was trying to remove for this bug report; that regression occurs in this commit:

    commit 467023e8080489abeff53e18ac83560eaf851827
    Author: Marek Olšák <maraeo@gmail.com>
    Date:   Tue Feb 8 15:21:35 2011 +0100

        r600g: use the same upload buffer for vertices, indices, and constants

I will open a new bug report for this.  I can confirm that reverting the 4 commits by Henri and this other commit by Marek restores performance.  Currently, the HEAD of my Mesa repository is at fd8d4b32 (Feb. 18), and performance is as good or better than it was before with just these 5 commits reverted.


Let me make clear that I do not understand the code, and I am not sure whether all 4 of Henri's commits cause a loss in performance.  I do know for certain that a77e813d is the first to introduce a negative effect; I had to revert the other three because they depended on the changes in that particular commit.
Comment 7 Henri Verbeet 2011-03-01 23:29:39 UTC
I'm seeing some performance impact from 077c448d, mostly due to the extra calls to evergreen_context_pipe_state_set_vs_resource() / evergreen_context_pipe_state_set_ps_resource(). The earlier commits don't seem to make much of a difference for me. (And I'd be surprised if they did, that code is only called during shader compilation.) As for the issue with 077c448d, constant buffer updates happen a lot, and that commit made them more expensive, so that makes some sense. On the other hand, it seems to me that this also implies state updates being disproportionately expensive. That would make it a general problem with the r600 winsys.
Comment 8 Dave Witbrodt 2011-03-02 04:47:31 UTC
(In reply to comment #7)
> I'm seeing some performance impact from 077c448d, mostly due to the extra calls
> to evergreen_context_pipe_state_set_vs_resource() /
> evergreen_context_pipe_state_set_ps_resource(). The earlier commits don't seem
> to make much of a difference for me.

Hmmm... for me performance was first impacted with a77e813d, and building Mesa from the commit before that one (3b1c1f02) still gave me the faster performance Mesa had jumped to in January.  (I only reverted the other 3 commits because they depended on a77e813d, and I did not know how to preserve them without it.)  Because of this, I was never really able to separate the impact of 077c448d by itself.

As described above, just moving from 3b1c1f02 to a77e813d reduces (for me) the frame rate in 'torcs' by roughly 30%.
Comment 9 Dave Witbrodt 2011-04-11 20:15:22 UTC
I have been away from Linux and testing radeon support since the Japan crisis began.  Updating to Mesa 7.11.0-devel, commit a26121f3, caused the same slowdown I had observed before:  min/max framerate in 'torcs' dropped from 28/54 fps to 17/34 fps.  I was planning on restoring local packages of my last fast-working Mesa, but decided to rebuild my entire X stack against my newly updated kernel (updated from 2.6.38-rc8 to 2.6.38.2).

To my shock, I found that replacing xorg-server 1.9.99.903 with 1.10.0.902 and updating the radeon driver to the latest git version, 6.14.99-devel, commit cc7d1fa3 -- both built against the Mesa mentioned above -- improved Mesa performance in 'torcs' dramatically.  It is no longer as consistently high as I was getting, but gives me 17/54 fps.

I also notice that another test program, 'prboom-plus', has dramatically improved performance all around.  As a result, I am abandoning my local git branch where I intended to pin down the exact cause of the performance changes -- I haven't touch it for 7 weeks anyway -- and start following the Mesa HEAD again.

Sorry for the false alarm.  It appears that Mesa changes alone were not to blame for whatever performance problems I was experiencing.
Comment 10 Jerome Glisse 2012-02-22 10:07:28 UTC
Closing, performance should be lot better with newer, ddx, mesa and kernel


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.