Bug 21029 - [EXA] x11perf performance regression
Summary: [EXA] x11perf performance regression
Status: VERIFIED WONTFIX
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: unspecified
Hardware: Other Linux (All)
Importance: medium major
Assignee: Carl Worth
QA Contact: Xorg Project Team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-04-03 02:22 UTC by zhao jian
Modified: 2009-06-09 14:44 UTC

See Also:
i915 platform:
i915 features:


Attachments
xorg.0.log (48.00 KB, text/plain)
2009-04-03 02:22 UTC, zhao jian

Description zhao jian 2009-04-03 02:22:57 UTC
Created attachment 24502
xorg.0.log

System Environment:
----------------------
Platform:               G45
Arch:           x86_64
OSD:            Fedora release 8 (Werewolf)
Kernel_version:         2.6.29
Libdrm:                (master)51d6346f9f3c425f49e57d185530c6bcaeb94f5e
Mesa:           (mesa_7_4_branch)7be149cfd131c0b3f7d4337bb83e6fba5f563bf9
Xserver:        (server-1.6-branch)60c161545af80eb78eb790a05bde79409dfdf16e
Xf86_video_intel:     (2.7)10b5014c42dc055d9559ee112cc7a017e887d813
Kernel:        (drm-intel-2.6.29)0e56a4d653b66d4729f944b23935a00c4472f987

Bug Description:
---------------------
In EXA mode, start X and run the x11perf 2D benchmarks aa10text and rgb10text. Both perform very badly, with an 80%-90% drop in performance. The problem also exists on 945gm-32.

Reproduce Steps:
---------------------
1. xinit&
2. x11perf -aa10text
Comment 1 Gordon Jin 2009-04-03 19:38:17 UTC
This is a regression after Q1-rc1 (2.6.99.902). Carl, we can bisect next week if needed.
Comment 2 Carl Worth 2009-04-07 12:54:15 UTC
Thanks for the report, (particularly with all the detailed versions).

I took a quick look with my 945 and seem to have replicated this. Here are the results I see on my 945 at the moment:

UXA --aa10text:  261000
EXA --aa10text:   69400

UXA --rgb10text: 242000
EXA --rgb10text:  28100

Gordon, I'm glad to hear that you know it's a regression. Do you mean that you know that it's the xf86-video-intel component that causes this?

Anyway, I'll start trying to find the last time things were working fast. Any suggestions you have would be useful.

And of course, the easy answer to give anyone in this situation is: "Use UXA, not EXA". I'll very much look forward to being able to drop EXA from our driver completely so we have fewer combinations to have to maintain.

-Carl

Comment 3 Carl Worth 2009-04-07 15:12:39 UTC
(In reply to comment #1)
> This is a regression after Q1-rc1 (2.6.99.902). Carl, we can bisect next week
> if needed.

Gordon, if you can bisect, that would be great. I tried running 2.6.99.902 and couldn't even get the X server to start with EXA, (it's hanging the machine quite reliably).

-Carl (who continues to be convinced that the best way to fix all of the EXA bugs is to just delete the EXA paths from our driver...)
Comment 4 zhao jian 2009-04-07 20:55:04 UTC
I bisected it and found it was caused by commit f6f59ee2533e786906dc9a32cf7072f2d2796201 in the 2.7 branch of Xf86_video_intel.
commit f6f59ee2533e786906dc9a32cf7072f2d2796201
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date:   Mon Mar 30 09:53:40 2009 -0700

    Tiling fixes, third set
Comment 5 Carl Worth 2009-04-08 13:25:07 UTC
(In reply to comment #4)
> I bisected it and found it was caused by commit
> f6f59ee2533e786906dc9a32cf7072f2d2796201 in the 2.7 branch of Xf86_video_intel.
> commit f6f59ee2533e786906dc9a32cf7072f2d2796201
> Author: Jesse Barnes <jbarnes@virtuousgeek.org>
> Date:   Mon Mar 30 09:53:40 2009 -0700
> 
>     Tiling fixes, third set

Excellent! Thanks for the bisection work.

That patch can't be trivially reverted for testing with git-revert, (results in several conflicts). But independently, Jesse recently came up with the following patch (originally for bug #21027) that might actually help:

http://bugs.freedesktop.org/attachment.cgi?id=24627

Oddly, if I try that patch with my 945 then starting the X server causes the machine to hang hard. But Jesse doesn't have that problem.

Gordon, if the patch works for you, then it might be interesting to see if it also helps with the performance issue.

-Carl
Comment 6 Carl Worth 2009-04-08 15:45:52 UTC
(In reply to comment #5)
> But independently, Jesse recently came up with the
> following patch (originally for bug #21027) that might actually help:
> 
> http://bugs.freedesktop.org/attachment.cgi?id=24627
> 
> Oddly, if I try that patch with my 945 then starting the X server causes the
> machine to hang hard. But Jesse doesn't have that problem.

I upgraded my X server and pixman repositories, and now the hang has gone away.

Also, I'm happy to report that the patch fixes the performance regression as well. Here are the results I get on my 945GM:

aa10text
--------
UXA:            278000
EXA:             72000
EXA-patched:    245000

rgb10text
---------
UXA:            265000
EXA:             28400
EXA-patched:    228000

I'll be pushing out Jesse's patch momentarily.

-Carl
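
For readers who want the size of the fix as ratios, the 945GM figures in this comment can be put through a little arithmetic. This is only an illustrative sketch: the numbers are copied from the tables above, while the helper names (`speedup`, `fraction_of`) are invented for this example and are not part of x11perf or the driver.

```python
# Throughput figures copied from the 945GM results in this comment.
# The dictionary layout and helper names are assumptions made for illustration.
results = {
    "aa10text": {"UXA": 278000, "EXA": 72000, "EXA-patched": 245000},
    "rgb10text": {"UXA": 265000, "EXA": 28400, "EXA-patched": 228000},
}


def speedup(before, after):
    """Return how many times faster `after` is than `before`."""
    return after / before


def fraction_of(reference, value):
    """Return `value` as a percentage of `reference`."""
    return 100.0 * value / reference


for test, r in results.items():
    gain = speedup(r["EXA"], r["EXA-patched"])
    vs_uxa = fraction_of(r["UXA"], r["EXA-patched"])
    print(f"{test}: patched EXA is {gain:.1f}x unpatched EXA, {vs_uxa:.0f}% of UXA")
```

With these inputs, patched EXA comes out at about 3.4x (aa10text) and 8.0x (rgb10text) the unpatched EXA rate, and at roughly 88% and 86% of the UXA rate.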
Comment 7 Carl Worth 2009-04-08 15:54:11 UTC
Patch pushed to both master and the 2.7 branch.

-Carl

commit 620e97bbd6a811ad69b8ac94df1fe2c9edf65549
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date:   Wed Apr 8 15:49:00 2009 -0700

    Don't enable kernel execbuf fencing w/EXA
    
    If we enable kernel execbuf fence register management, it's best if the
    kernel manages all fence registers.  This works fine if the accel
    method is managing pixmaps or doesn't use offscreen pixmaps.  However
    with EXA, pixmap accesses are done relative to the framebuffer BAR
    mapping (pI830->FbBase) and the Screen pixmap address.  So if we try to
    set the screen pixmap to point at a GTT mapped (and therefore properly
    fenced) address, later calls to intel_get_pixmap_offset() will call
    into EXA, which will use the pseudo-random pixmap addr and the EXA
    offscreen base addr (which is really just FbBase) to calculate the
    offset.  This will fail.  So disable kernel fence reg management in the
    EXA case (this is easier than adding proper EXA pixmap management to
    xf86-video-intel, and makes more sense since we'll be removing EXA soon
    anyway).
    
    Fixes FDO #21027.
    
    Also happens to fix FDO #21029 (as tested by Carl Worth <cworth@cworth.org>).
Comment 8 zhao jian 2009-04-10 01:08:04 UTC
Yes. On my 945Gm-32, it works well now.  
G45-64: 
aa10text
--------
EXA:                          136k
EXA(With the newest code):    272k

rgb10text
---------
EXA:                           54k
EXA(With the newest code):    233k
Comment 9 zhao jian 2009-04-20 18:00:53 UTC
G45-64 with EXA:
RC3 code(2D: 2.7.0), 
--rgb10text: 277k
--aa10text : 300k

RC1(2D: 2.6.99.902)
--rgb10text: 313k
--aa10text : 352k

With the current code, performance is still lower than RC1's, so I am reopening this bug.
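
The remaining gap between RC1 and RC3 can be expressed as a percentage drop. The sketch below uses only the G45-64 figures listed in this comment (in thousands, as reported); the function name `percent_drop` and the dictionary layout are assumptions made for this example.

```python
# G45-64 EXA results from this comment, in thousands as reported.
rc1 = {"rgb10text": 313, "aa10text": 352}  # RC1 (2D: 2.6.99.902)
rc3 = {"rgb10text": 277, "aa10text": 300}  # RC3 (2D: 2.7.0)


def percent_drop(baseline, current):
    """Return how far `current` falls below `baseline`, in percent."""
    return 100.0 * (baseline - current) / baseline


for test in rc1:
    drop = percent_drop(rc1[test], rc3[test])
    print(f"{test}: RC3 is {drop:.1f}% slower than RC1")
```

For these inputs the drop is about 11.5% for rgb10text and 14.8% for aa10text, far smaller than the 80%-90% drop originally reported.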
Comment 10 Gordon Jin 2009-04-20 18:32:57 UTC
I'm marking this not blocking 2.7 release.
Comment 11 Gordon Jin 2009-06-09 14:44:47 UTC
The gap is not too big. I guess Carl won't pursue this EXA performance issue.

