| Summary: | r200 driver should support ARB_point_parameters | ||
|---|---|---|---|
| Product: | Mesa | Reporter: | Roland Scheidegger <sroland> |
| Component: | Drivers/DRI/r200 | Assignee: | Default DRI bug account <dri-devel> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | enhancement | ||
| Priority: | high | ||
| Version: | git | ||
| Hardware: | x86 (IA32) | ||
| OS: | Linux (All) | ||
| Whiteboard: | |||
| i915 platform: | i915 features: | ||
| Attachments: | patch to enable ARB_point_parameters (working) and ARB_point_sprite (broken) | ||
|
Description
Roland Scheidegger
2005-10-07 10:46:10 UTC
Created attachment 3509 [details] [review] patch to enable ARB_point_parameters (working) and ARB_point_sprite (broken) This passes gleans pointAtten test, but only if the code is changed so it won't use large points (i.e. the point sprite primitive) when smooth points are enabled. Though using 1.0 sized "smooth" points doesn't strike me as particularly better than simply ignoring the smooth bit in case of large point size/point atten, it is apparently what the spec calls for. Maximum epsilon which allows the driver to pass gleans point atten test is around 0.9 (i.e. it passes with 0.9 but doesn't with 0.8, both for hw and sw tcl). A smooth point with size 1.0 can still partially overlap multiple pixels on screen. I non-smooth point with size 1.0 will always occupy exactly 1 pixel. It's the same as with line and polygon antialiasing. I suspect that's the difference that you're seeing. Commited (a bit different) to Mesa cvs.
I didn't really solve the "problem" though that the point size calculation is
always done (if Point Attenuation is "enabled"), regardless if actually points
are drawn. I'd suspect the performance hit in tcl mode is fairly small.
Not sure about swtcl though, I think something should be done about submitting
unneded point sizes.
Would something like that make sense? Going through all primitives might be a
bit on the expensive side?
diff -u -r1.32 r200_swtcl.c
--- r200_swtcl.c 13 Oct 2006 22:10:05 -0000 1.32
+++ r200_swtcl.c 14 Oct 2006 00:44:59 -0000
@@ -118,8 +118,22 @@
}
if (RENDERINPUTS_TEST( index_bitset, _TNL_ATTRIB_POINTSIZE )) {
- EMIT_ATTR( _TNL_ATTRIB_POINTSIZE, EMIT_1F, R200_VTX_POINT_SIZE );
- offset += 1;
+ int needptsz = 0;
+ if ((ctx->Polygon.FrontMode == GL_POINT) || (ctx->Polygon.BackMode ==
GL_POINT))
+ needptsz = 1;
+ else {
+ int i;
+ for (i = 0 ; i < VB->PrimitiveCount ; i++) {
+ if ((VB->Primitive[i].mode & PRIM_MODE_MASK) == GL_POINTS) {
+ needptsz = 1;
+ break;
+ }
+ }
+ }
+ if (needptsz) {
+ EMIT_ATTR( _TNL_ATTRIB_POINTSIZE, EMIT_1F, R200_VTX_POINT_SIZE );
+ offset += 1;
+ }
}
rmesa->swtcl.coloroffset = offset;
Note though that the actual (unneeded) point size calculation would still take
place, the sizes just not submitted to the hardware I think.
The potential hitch is that you need to make sure this test is reevaluated for all primitives drawn. Typically the drivers look at the Mesa state and set hardware state appropriately and then try to avoid re-checking again and again unless there are actual Mesa statechanges. Unfortunately you now need to do that because you are basing hardware state off not just Mesa state but what primitives are coming down the pipe, which Mesa doesn't (yet) consider a statechange. At a certain level, if the app requests this state, its' fair enough that it pays for computing the values. Unless you find a real instance where this is a problem - ie an app that asks for this state and then doesn't render points *and* becomes noticably slower as a result, I would consider it not worth optimizing for. (In reply to comment #5) > At a certain level, if the app requests this state, its' fair enough that it > pays for computing the values. Unless you find a real instance where this is a > problem - ie an app that asks for this state and then doesn't render points > *and* becomes noticably slower as a result, I would consider it not worth > optimizing for. Fair enough. The theoretical maximum performance hit would probably be 25% (which is the vertex size increase caused by submitting point size since the minimum otherwise is 3 floats for position and 1 float (4UB) for color0, I'd think the point size calculation itself shouldn't be that expensive), though I'd think in practice it should well stay below that. And the additional tests needed in tcl mode for ARB_point_sprite (to disable perspective correction) didn't seem to have a visible performance impact neither, though I wondered if it's the reason ati does not support that extension on r200 with any of its drivers (even those for other operating systems). Mass version move, cvs -> git |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.