Bug 5366 - arbfpspec runs abnormally slow
Summary: arbfpspec runs abnormally slow
Status: CLOSED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/r300
Version: unspecified
Hardware: x86 (IA32) Linux (All)
Importance: high normal
Assignee: Default DRI bug account
 
Reported: 2005-12-18 00:08 UTC by Aapo Tahkola
Modified: 2006-10-02 07:13 UTC (History)

Description Aapo Tahkola 2005-12-18 00:08:13 UTC
This is caused by the vp writing col0 while the fragprog does not read it.
I'm leaning towards a solution where dead code removal on vps is driven by the
fragprog inputs. This would be done with a hasher similar to what t_vp_build.c
and texenvprogram.c have.

It would be a lot easier to do this if Mesa naturally tried to split vertex
programs into smaller pieces based on the results written. However, I'm not sure
it's worth implementing...
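
A minimal sketch of the dead code removal idea above, assuming a heavily
simplified instruction model. The names vp_inst, dst_file and
vp_remove_dead_code are hypothetical and are not Mesa or r300 code; every
instruction is assumed to fully overwrite its destination (no write masks), and
the caller is assumed to always set the position bit in fp_inputs_read. The
same mask could also serve as part of a cache key for the specialized program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified vertex program instruction (not a Mesa type). */
enum dst_file { DST_TEMP, DST_OUTPUT };

struct vp_inst {
    enum dst_file dst_file; /* register file written by this instruction */
    unsigned dst_index;     /* index within that file, must be < 32 */
    uint32_t src_temps;     /* bitmask of temporaries this instruction reads */
};

/* Remove instructions whose results can never reach the fragment program.
 * fp_inputs_read: bit i is set if vp output i is consumed downstream.
 * Compacts insts[] in place and returns the new instruction count. */
static size_t vp_remove_dead_code(struct vp_inst *insts, size_t count,
                                  uint32_t fp_inputs_read)
{
    uint32_t live_temps = 0; /* temporaries needed by later live instructions */
    size_t w = count;        /* kept instructions are packed at insts[w..count) */

    /* Walk backwards so that each use is seen before its definition. */
    for (size_t i = count; i-- > 0; ) {
        struct vp_inst inst = insts[i];
        uint32_t bit;
        int live;

        assert(inst.dst_index < 32);
        bit = UINT32_C(1) << inst.dst_index;

        if (inst.dst_file == DST_OUTPUT) {
            live = (fp_inputs_read & bit) != 0;
        } else {
            live = (live_temps & bit) != 0;
            live_temps &= ~bit; /* full overwrite: the earlier value is dead */
        }

        if (live) {
            live_temps |= inst.src_temps;
            insts[--w] = inst; /* w - 1 >= i, so nothing unprocessed is lost */
        }
    }

    /* Move the kept instructions back to the start, preserving their order. */
    memmove(insts, insts + w, (count - w) * sizeof *insts);
    return count - w;
}
```

With a program that computes position from one temporary and col0 from
another, passing a mask that contains only the position bit drops both the col0
write and the instruction that feeds it.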
Comment 1 Keith Whitwell 2005-12-18 00:41:56 UTC
This ties into the work IDR's student group are doing on vertex program
optimization.

It seems like this might be an R300-specific slowdown, as other architectures
cope fine with a vertex program that produces results that are subsequently
ignored by the fragment program.

Out of interest, if you modify the fragment program to read the computed color
value, does performance improve?
Comment 2 Aapo Tahkola 2005-12-18 03:33:16 UTC
(In reply to comment #1)
> It seems like this might be an R300-specific slowdown, as other architectures
> cope fine with a vertex program that produces results that are subsequently
> ignored by the fragment program.
> 
> Out of interest, if you modify the fragment program to read the computed color
> value, does performance improve?

Yes, as long as it gets read.
Comment 3 Ben Skeggs 2005-12-18 04:04:09 UTC
It's been quite a while since I looked at r300, but I suspect that the routing
of data between vtx and frag programs still isn't quite right.

Do you also see a speedup if you modify r300_state.c::r300_setup_rs_unit to
route col0 into a fragment program register, even if it isn't read?

You'll also have to modify r300_fragprog.c::init_program to assume that col0
exists so that the fragment program reads the correct registers.
Comment 4 Aapo Tahkola 2005-12-18 04:27:43 UTC
(In reply to comment #3)
> It's been quite a while since I looked at r300, but I suspect that the routing
> of data between vtx and frag programs still isn't quite right.
> 
> Do you also see a speedup if you modify r300_state.c::r300_setup_rs_unit to
> route col0 into a fragment program register, even if it isn't read?
> 
> You'll also have to modify r300_fragprog.c::init_program to assume that col0
> exists so that the fragment program reads the correct registers.

I cannot really do that. R300_RS_ROUTE_0_COLOR is actually the vp result register
number. Although this starts at zero, it's assigned at vof(R300_RS_ROUTE_0_COLOR).

AFAIK, passing undefined values to interpolators is also illegal and will lock
the GPU.
Comment 5 Ben Skeggs 2005-12-18 04:42:16 UTC
(In reply to comment #4)
> I cannot really do that. R300_RS_ROUTE_0_COLOR is actually the vp result register
> number. Although this starts at zero, it's assigned at vof(R300_RS_ROUTE_0_COLOR).
> 
> AFAIK, passing undefined values to interpolators is also illegal and will lock
> the GPU.

Yes, I believe this is the case.  But I don't see how the data is undefined; you
mention that the vp writes the value.

Currently, the rs unit setup will just throw away col0 if the fp doesn't want
it.  I'm suggesting that you make it pass the value to the fp register anyway
(which it'll just ignore).  I can't recall any lockups from not using a value in
a fragment program.

Perhaps I'm misunderstanding the issue though..
Comment 6 Aapo Tahkola 2005-12-18 05:13:10 UTC
> Currently, the rs unit setup will just throw away col0 if the fp doesn't want
> it.  I'm suggesting that you make it pass the value to the fp register anyway
> (which it'll just ignore).  I can't recall any lockups from not using a value in
> a fragment program.

That seems to fix it as well.
Comment 7 Ben Skeggs 2005-12-18 05:21:49 UTC
Cool.  I wonder if the entire rs_unit code should be modified to pass all
registers that are written by the vp, regardless of whether they're used or not.

I can't recall how fglrx does this now.
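
A rough sketch of the routing change discussed in comments 5 to 7, assuming a
simplified model where each vp output is a bit in a mask and fragment program
input registers are handed out in order. The names rs_routing and
route_outputs, the MAX_ATTRS limit and the route_unread flag are hypothetical
and do not reflect the actual r300_setup_rs_unit code or register layout.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ATTRS 16 /* hypothetical limit, not a hardware constant */

struct rs_routing {
    int fp_reg[MAX_ATTRS]; /* fp input register per vp output, -1 = dropped */
    int count;             /* number of fp input registers consumed */
};

/* vp_written:   bit i set if the vertex program writes output i.
 * fp_read:      bit i set if the fragment program reads that output.
 * route_unread: if nonzero, also route outputs the fp never reads. */
static struct rs_routing route_outputs(uint32_t vp_written, uint32_t fp_read,
                                       int route_unread)
{
    struct rs_routing r;

    /* Assumption for this sketch: the fp never reads an unwritten output,
     * since undefined interpolator inputs are said to lock the GPU. */
    assert((fp_read & ~vp_written) == 0);

    r.count = 0;
    for (int i = 0; i < MAX_ATTRS; i++) {
        uint32_t bit = UINT32_C(1) << i;
        int wanted = route_unread || (fp_read & bit) != 0;

        if ((vp_written & bit) != 0 && wanted)
            r.fp_reg[i] = r.count++; /* next free fp input register */
        else
            r.fp_reg[i] = -1;        /* value is thrown away */
    }
    return r;
}
```

If bit 0 stands for col0 and bit 1 for a texture coordinate, routing the unread
col0 moves the texture coordinate from fp register 0 to register 1. That shift
is why the fragment program setup would also have to assume that col0 exists.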
Comment 8 Aapo Tahkola 2005-12-18 05:51:52 UTC
(In reply to comment #7)
> Cool.  I wonder if the entire rs_unit code should be modified to pass all
> registers that are written by the vp, regardless of whether they're used or not.
> 
> I can't recall how fglrx does this now.

Sounds fair enough.

It seems that this is only related to colors. arbvptorus.c:138 has a similar bogus
instruction too, but it doesn't cause problems. Changing it to:
"MOV   result.color.secondary, vertex.texcoord;\n"
brings the slowdown back.
However, I vaguely remember that this instruction has caused arbvptorus to run at
reduced speed in the past.
Comment 9 Rune Petersen 2006-10-02 07:12:11 UTC
I can't see this being a problem anymore....
Comment 10 Rune Petersen 2006-10-02 07:13:45 UTC
closed

