77740 – i965: Relax accumulator dependency scheduling on Gen < 6

Bug 77740 - i965: Relax accumulator dependency scheduling on Gen < 6

Summary: i965: Relax accumulator dependency scheduling on Gen < 6

Status:	RESOLVED FIXED

Alias:	None

Product:	Mesa
Classification:	Unclassified
Component:	Drivers/DRI/i965 (show other bugs)
Version:	unspecified
Hardware:	Other All

Importance:	medium normal
Assignee:	Iago Toral
QA Contact:	Intel 3D Bugs Mailing List

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:	i965-perf
	Show dependency tree / graph

Reported:	2014-04-21 20:30 UTC by Matt Turner
Modified:	2014-05-14 05:34 UTC (History)
CC List:	0 users

See Also:
i915 platform:
i915 features:

Attachments
Patch (9.59 KB, patch) 2014-04-28 09:30 UTC, Iago Toral	Details \| Splinter Review
Scheduling plan (before patch) (64.72 KB, text/plain) 2014-04-28 09:32 UTC, Iago Toral	Details
Scheduling plan (after patch) (64.72 KB, text/plain) 2014-04-28 09:33 UTC, Iago Toral	Details
View All

Description Matt Turner 2014-04-21 20:30:45 UTC

Many instructions implicitly update the accumulator on Gen < 6. The instruction scheduling code just calls add_barrier_deps() for each accumulator access on these platforms, but a large class of operations don't actually update the accumulator -- mostly move and logical instructions.

Search the i965/GM45/ILK docs for "does not implicit" (they misspelled "implicitly" in about half of the cases...)

Teaching the scheduling code about this would allow more flexibility to schedule instructions.

Comment 1 Iago Toral 2014-04-24 08:24:55 UTC

There seems to be a 'writes_accumulator' flag that is set to TRUE for instructions that are known to write to the accumulator. The scheduling code currently checks this flag on Gen >= 6 to compute dependencies among instruction that read/write the accumulator.

For Gen < 6, there seems that for some very specific cases where this flag is set to TRUE even if the scheduling code is not going to use it. As far as I can see this is because the 'writes_accumulator' flag can also be used in the optimization passes to remove dead code (and we don't want to eliminate code that writes to the accumulator values that we intend to use).

So, I think for this task the 'writes_accumulator' flag alone is not enough: in the scheduling code we need to know if the instruction can write to the accumulator independently of whether we are expecting to use that value or not and in Gen < 6, as far as I can see, there is a bunch of operations that can write implicitly to the accumulator (mostly opcodes > 63).

I think the way to go about this would be to have:

bool instruction_backend::writes_accumulator_implicitly(int gen);

that we can call from the scheduling code to know if a particular instruction can write implicitly to the accumulator. In Gen >= 6, it would simply return the value of 'writes_accumulator' but for Gen < 6 it would check also the opcodes.

I think with that operation in place we could unify the scheduling code paths for all Gens.

I'll write a patch for this but I will need help with the testing since I am running on Gen 7.

Comment 2 Iago Toral 2014-04-28 09:30:39 UTC

Created attachment 98117 [details] [review]
Patch

I managed to get an ironlake laptop so I could test the patch myself. I tested with a small program that implemented a simple lightning model demo and seems to work fine.

According to the debug traces in instruction_scheduler::schedule_instructions() I can see that the patch is indeed influencing instruction scheduling (I'll attach the traces).

Might need more thorough testing though, please let me know if there are specific tests that you would like me to run to verify the patch before sending it for review to the mailing list.

Comment 3 Iago Toral 2014-04-28 09:32:34 UTC

Created attachment 98122 [details]
Scheduling plan (before patch)

Scheduling plan before the patch for sample program

Comment 4 Iago Toral 2014-04-28 09:33:30 UTC

Created attachment 98123 [details]
Scheduling plan (after patch)

Scheduling plan after the patch for sample program

Comment 5 Iago Toral 2014-04-29 12:07:03 UTC

A piglit test run raises problems though so this still needs more work.

Comment 6 Iago Toral 2014-05-06 08:21:40 UTC

Fixed patch and sent it to the mailing list for review. Passes piglit tests on  ironlake and ivybridge but I could not test on other gens:

http://lists.freedesktop.org/archives/mesa-dev/2014-May/059001.html

Comment 7 Matt Turner 2014-05-14 05:34:11 UTC

Committed as http://cgit.freedesktop.org/mesa/mesa/commit/?id=520dfa4b5c64a09bb34c7e912b102fb99aef7699

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.