Bug 6877

Summary: RENDER support for mach64
Product: xorg
Reporter: George - <fufutos610>
Component: Driver/mach64
Assignee: Xorg Project Team <xorg-team>
Status: RESOLVED FIXED
QA Contact:
Severity: normal
Priority: high
CC: alexdeucher, cbm, morgoth6
Version: git
Keywords: patch
Hardware: x86 (IA32)
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Bug Depends on: 6017, 6818, 6911
Bug Blocks:
Attachments:
Description                              Flags
RENDER support for mach64                none
RENDER support for mach64 - try 1        none
RENDER support for mach64 - try 2        none
RENDER support for mach64 - try 2 / EXA  none
RENDER support for mach64 - try 3        none
RENDER support for mach64 - try 4        none
RENDER support for mach64 - try 4 / DRI  none
RENDER support for mach64 - try 5        none
RENDER support for mach64 - try 6        none
RENDER support for mach64 - try 7        none
RENDER support for mach64 - try 7 / EXA  none

Description George - 2006-05-10 06:01:50 UTC
This bug tracks development of RENDER acceleration for mach64.

Basic EXA support is bug #6017.
Comment 1 George - 2006-05-10 06:11:38 UTC
Created attachment 5587 [details] [review]
RENDER support for mach64

Using the rendercheck terminology, the status of the attached patch is:

* blend
  accelerates the basic set of operators, the exceptions being Atop, AtopR,
  Saturate, and A8 src with non-A8 dst.

* composite
  no acceleration is provided; the patch has the basic infrastructure, but I
  have not yet found out how to emulate the IN operator with mach64
  multitexturing.

Any help or advice from people knowledgeable in OpenGL or mach64
multitexturing is greatly appreciated.
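
For reference, a minimal sketch of the arithmetic in question, assuming 8-bit
premultiplied channels. RENDER defines a masked composite as (src IN mask) OP
dst; the helper names below are hypothetical and show only the per-channel
math on the CPU, not how mach64 multitexturing would be programmed.

```c
#include <stdint.h>

/* Multiply two 8-bit normalized values (0..255 stands for 0.0..1.0),
 * rounding to nearest. Hypothetical helper, for illustration only. */
static uint8_t sketch_mul_un8(uint8_t a, uint8_t b)
{
    return (uint8_t)((a * b + 127) / 255);
}

/* IN: scale one source channel by the mask's alpha. */
static uint8_t sketch_in(uint8_t src_channel, uint8_t mask_alpha)
{
    return sketch_mul_un8(src_channel, mask_alpha);
}

/* Over, for one premultiplied channel: src + dst * (1 - src_alpha).
 * The sum is clamped in case the inputs are not strictly premultiplied. */
static uint8_t sketch_over(uint8_t src_channel, uint8_t src_alpha,
                           uint8_t dst_channel)
{
    unsigned sum = src_channel +
                   sketch_mul_un8(dst_channel, (uint8_t)(255 - src_alpha));
    return (uint8_t)(sum > 255 ? 255 : sum);
}
```

In a masked Over, every source channel (alpha included) would first go through
sketch_in() and the result would then be fed to sketch_over().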
Comment 2 George - 2006-05-10 06:13:53 UTC
The patch is against git://git.freedesktop.org/~libv/xf86-video-mach64 .
Comment 3 Eric Anholt 2006-05-13 09:28:45 UTC
Another note that I forgot to tell you on IRC: componentAlpha should only affect
your behavior when it's set in a mask, not source or destination.
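
A minimal sketch of that rule, assuming a simplified stand-in for the picture
structure (the struct and function names here are hypothetical, not the
server's own):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a picture; only the flag needed
 * for this illustration is modelled. */
struct sketch_picture {
    bool component_alpha;
};

/* Returns true when the composite must be treated as component-alpha.
 * Only the mask's flag is consulted; the flag on src or dst is ignored. */
static bool sketch_needs_component_alpha(const struct sketch_picture *src,
                                         const struct sketch_picture *mask,
                                         const struct sketch_picture *dst)
{
    (void)src;
    (void)dst;
    return mask != NULL && mask->component_alpha;
}
```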
Comment 4 George - 2006-05-15 10:49:21 UTC
Created attachment 5633 [details] [review]
RENDER support for mach64 - try 1

This is my current diff against:
git://git.freedesktop.org/~libv/xf86-video-mach64.

It accelerates translucent windows due to the special nature of this operation,
which uses "1x1 R" A8 masks. It is pretty useless though, since it does not
accelerate generic A8 masks, which are required for aa fonts.

Any comments/suggestions on how to handle generic A8 masks greatly appreciated
(the patch contains some comments on problems and possible approaches ...)
Comment 5 George - 2006-05-17 16:25:18 UTC
Created attachment 5652 [details] [review]
RENDER support for mach64 - try 2

This patch adds acceleration for aa fonts. I don't see any measurable (with
time ls -lR) or perceivable difference, though.

Overall, the patch accelerates the following cases:
* no-mask
  basic common cases with single-pass
* 1x1-R mask
  basic common cases with single-pass
* generic mask and 1x1-R source
  basic common cases with two-pass using the exaTryComponentAlphaHelper() 
  which is applicable in this case also.
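
The case split above can be sketched roughly as follows. This is an
illustration only: the types and names are hypothetical, the per-operator and
per-format checks ("basic common cases") are left out, and sending every other
combination to a software fallback is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

enum sketch_path {
    SKETCH_FALLBACK,     /* not accelerated (assumed for all other cases) */
    SKETCH_SINGLE_PASS,
    SKETCH_TWO_PASS      /* via exaTryComponentAlphaHelper() in the patch */
};

/* Hypothetical, simplified picture description. */
struct sketch_pict {
    int width;
    int height;
    bool repeat;
};

/* "1x1-R": a 1x1 picture with repeat set. */
static bool sketch_is_1x1_repeat(const struct sketch_pict *p)
{
    return p != NULL && p->width == 1 && p->height == 1 && p->repeat;
}

static enum sketch_path sketch_classify(const struct sketch_pict *src,
                                        const struct sketch_pict *mask)
{
    if (mask == NULL)
        return SKETCH_SINGLE_PASS;   /* no-mask */
    if (sketch_is_1x1_repeat(mask))
        return SKETCH_SINGLE_PASS;   /* 1x1-R mask */
    if (sketch_is_1x1_repeat(src))
        return SKETCH_TWO_PASS;      /* generic mask and 1x1-R source */
    return SKETCH_FALLBACK;
}
```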

The patch seems pretty smooth with xfwm4 4.3.90.1 (built-in compositor) on
PII/350, GTPRO/8MB, DRI disabled, after a day or two of usage.
Comment 6 George - 2006-05-17 16:27:45 UTC
Created attachment 5653 [details] [review]
RENDER support for mach64 - try 2 / EXA

Patch for EXA.

Call exaTryComponentAlphaHelper() also when the src is 1x1-R.
Comment 7 George - 2006-05-20 08:44:35 UTC
Created attachment 5693 [details] [review]
RENDER support for mach64 - try 3

This patch adds acceleration for component alpha with solid source. This passes
rendercheck but I cannot actually test it due to lack of an LCD display.

Two remaining issues:
* x11perf -aa24text uses fonts with an A1 mask, while the desktop uses fonts
  with an A8 mask. How do I make x11perf use fonts with an A8 mask?

* save/restore 3D state when running with DRI.

Apart from the above issues or bug fixes, I do not plan to do any more work on
render acceleration for mach64.
Comment 8 George - 2006-05-22 22:25:03 UTC
Created attachment 5714 [details] [review]
RENDER support for mach64 - try 4

This patch adds save/restore and caching for the texture registers, it also
adds a "RenderAccel" option. Note that with render acceleration, a DRI client
has to upload the texture registers when it acquires the DRI lock.
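
A rough sketch of the caching idea, assuming a small shadow copy of the
texture registers. The register count, the names, and the write counter that
stands in for a real register write are all hypothetical; the actual patch may
be organized differently.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NUM_TEX_REGS 4   /* hypothetical register count */

struct sketch_tex_cache {
    uint32_t value[SKETCH_NUM_TEX_REGS];
    bool     valid[SKETCH_NUM_TEX_REGS];
    unsigned hw_writes;         /* stands in for actual register writes */
};

/* Write a register only if the cached value differs or is unknown. */
static void sketch_cache_write(struct sketch_tex_cache *c, int reg, uint32_t v)
{
    if (c->valid[reg] && c->value[reg] == v)
        return;                 /* hardware already holds this value */
    c->value[reg] = v;
    c->valid[reg] = true;
    c->hw_writes++;
}

/* Forget everything, e.g. after another client may have touched the
 * texture registers, so the next write goes to the hardware again. */
static void sketch_cache_invalidate(struct sketch_tex_cache *c)
{
    for (int i = 0; i < SKETCH_NUM_TEX_REGS; i++)
        c->valid[i] = false;
}
```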

Results from x11perf (-comp means with compositor, -trans means on a
translucent window):

 #  Configuration         Rate      Ratio to 1
 1  text/xaa              12000.0
 2  text/xaa-comp         10500.0   0.88
 3  text/xaa-trans         1570.0   0.13
 4  text/exa-greedy       11900.0   0.99
 5  text/exa-smart        13900.0   1.16
 6  text/exa-smart-comp   12500.0   1.04
 7  text/exa-smart-trans  11100.0   0.93

Operation: Char in 30-char aa line (Charter 24)

Overall, render acceleration provides a modest speedup of 1.16 for aa fonts.
The results also confirm that for fonts on a translucent window exa/smart has
an acceptable performance penalty while xaa falls apart.
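
The relative figures reported above are each rate divided by the text/xaa
baseline. A tiny sketch of that arithmetic (the function name is hypothetical):

```c
/* Ratio of a measured rate to the baseline rate (configuration 1). */
static double sketch_relative_rate(double rate, double baseline)
{
    return rate / baseline;
}
```

For example, 13900.0 / 12000.0 is about 1.16 (exa-smart) and 1570.0 / 12000.0
is about 0.13 (xaa on a translucent window).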
Comment 9 George - 2006-05-22 22:27:14 UTC
Created attachment 5715 [details] [review]
RENDER support for mach64 - try 4 / DRI

Patch for DRI. Restore texture state when acquiring the DRI lock.
Comment 10 George - 2006-06-14 09:57:32 UTC
Created attachment 5909 [details] [review]
RENDER support for mach64 - try 5

This is my current diff against:
git://git.freedesktop.org/~libv/xf86-video-mach64.

I think this patch is committable.
Comment 11 George - 2006-06-20 19:14:43 UTC
The attached patch is against xf86-video-mach64 which can be downloaded with:

git-clone git://git.freedesktop.org/~libv/xf86-video-mach64

You also need to apply the patches from the dependencies of this bug to xserver.
Comment 12 Michel Dänzer 2006-06-21 01:41:41 UTC
Curious, why is bug 6772 a dependency?
Comment 13 George - 2006-06-21 05:53:27 UTC
(In reply to comment #12)
> Curious, why is bug 6772 a dependency?

For performance reasons only. It is very simple and avoids some hiccups.

Now, you remind me of your workaround for tiled pixmaps which is a dependency
for correctness:

diff --git a/exa/exa.c b/exa/exa.c
index 0676838..2359ba2 100644
--- a/exa/exa.c
+++ b/exa/exa.c
@@ -419,9 +419,12 @@ #endif
        if (!pGC->tileIsPixel && FbEvenTile (pGC->tile.pixmap->drawable.width *
                                             pDrawable->bitsPerPixel))
        {
-           exaPrepareAccess(&pGC->tile.pixmap->drawable, EXA_PREPARE_SRC);
+           /* XXX This fixes corruption with tiled pixmaps, but may just be a
+            * workaround for broken drivers
+            */
+           exaMoveOutPixmap(pGC->tile.pixmap);
            fbPadPixmap (pGC->tile.pixmap);
-           exaFinishAccess(&pGC->tile.pixmap->drawable, EXA_PREPARE_SRC);
+           exaDrawableDirty(pDrawable);
        }
        /* Mask out the GCTile change notification, now that we've done FB's
         * job for it.


Adventurous people could also try to merge in the 'exa-damagetrack' branch of
xserver which already contains the above fix.
Comment 14 Michel Dänzer 2006-06-21 06:43:06 UTC
Curious, what's your experience with exa-damagetrack with Mach64?
Comment 15 George - 2006-06-21 08:15:24 UTC
(In reply to comment #14)
> Curious, what's your experience with exa-damagetrack with Mach64?

Consistent with existing reports: it really filters out the outcasts I see on my
8MB card, where exa would move in a first pixmap, perform some operation, then
move in a second pixmap while kicking out the first one, perform some operation,
then move in the first pixmap while kicking out the second one, and so on and so
forth. OTOH, it hurts peak performance a little.

Overall, the desktop is snappier.
Comment 16 George - 2006-07-11 13:36:04 UTC
Created attachment 6192 [details] [review]
RENDER support for mach64 - try 6

This patch adds support for transforms. Nearest seems ok, bilinear seems close
but not quite right.

I also added the same license as radeon_exa_render.c on the basis that both
mach64 and radeon RENDER mostly derive from the kdrive ati driver.
Comment 17 George - 2006-08-09 11:16:28 UTC
Created attachment 6506 [details] [review]
RENDER support for mach64 - try 7

This is what I want to push to xf86-driver-ati/atimisc.

RENDER acceleration is disabled by default. It is strongly recommended that the
patch from bug #6772 and the exa-damagetrack branch be merged into the xserver
before enabling RENDER acceleration. To enable RENDER acceleration, add the
following to xorg.conf:

Section "Device"
	[...]

	Option		"AccelMethod"		"exa"
	Option		"RenderAccel"		"true"
EndSection
Comment 18 George - 2006-08-10 14:26:01 UTC
Created attachment 6515 [details] [review]
RENDER support for mach64 - try 7 / EXA

And this is what I want to push to xserver/exa.
Comment 19 George - 2006-08-12 12:41:41 UTC
Committed.

Drop dependencies on bug #6772, #6811 and close.

Spawn bug #7861 for DRI patch.
