Bug 94681 - dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23 takes 25 minutes to compile
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: git
Hardware: Other All
Importance: medium normal
Assignee: Matt Turner
QA Contact: Intel 3D Bugs Mailing List
Depends on:
Blocks: i965-deqp
Reported: 2016-03-24 07:55 UTC by Kenneth Graunke
Modified: 2016-08-20 01:24 UTC

Description Kenneth Graunke 2016-03-24 07:55:35 UTC
dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23 takes absurdly long to compile.  On my Broadwell laptop, in a debug build, it took 25 minutes.  Most of the time seemed to be spent in the instruction scheduler adding dependencies.

There's probably something trivial we can do to make this much faster.

The test does pass, however, so this isn't a true blocking issue.  It would certainly be worth fixing, though.
Comment 1 Matt Turner 2016-03-24 16:25:04 UTC
I noticed something similar elsewhere. The scheduler does a linked list walk over all potential instructions before choosing one -- O(n^2). Maybe we can sort... something?
Comment 2 Matt Turner 2016-08-17 20:25:32 UTC
Okay, the problem is that there are a ton of untyped_surface_writes, which "have side effects" and are therefore treated as barrier dependencies. add_barrier_deps() walks over the whole list of instructions in the basic block (of which there are about 10 thousand).

It seems a bit absurd to add a dependency on instructions on the other side of another barrier...

Maybe we could go ahead and schedule pending instructions when we see a barrier instead of doing all of this work?
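
A minimal sketch of the "schedule pending instructions when we see a barrier" idea, assuming the block can simply be cut into regions that each end at a barrier. `Inst` and `split_at_barriers` are hypothetical names made up for illustration, not i965 code; each returned region could then be scheduled on its own, so no dependency ever needs to cross a barrier.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a scheduler instruction; only the barrier
// flag matters for this sketch.
struct Inst {
    bool is_barrier;
};

// Returns half-open [begin, end) index ranges. A barrier closes the region
// it belongs to, so instructions are never moved across it.
std::vector<std::pair<std::size_t, std::size_t>>
split_at_barriers(const std::vector<Inst> &block)
{
    std::vector<std::pair<std::size_t, std::size_t>> regions;
    std::size_t begin = 0;

    for (std::size_t i = 0; i < block.size(); i++) {
        if (block[i].is_barrier) {
            regions.emplace_back(begin, i + 1);
            begin = i + 1;
        }
    }

    // Trailing instructions after the last barrier form a final region.
    if (begin < block.size())
        regions.emplace_back(begin, block.size());

    return regions;
}
```

With this split, the dependency work per region is bounded by the region's size rather than by the size of the whole basic block.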
Comment 3 Kenneth Graunke 2016-08-17 23:39:08 UTC
Oh, that is pretty absurd.  Scheduling things seems pretty reasonable.

Maybe an easier trick would be to make add_barrier_deps() stop when it hits something that's already a barrier.

If you have:

   <bunch of instructions we'll call A>
   barrier_1
   <bunch of instructions we'll call B>
   barrier_2

We need to make barrier_2 depend on everything in group B, and also barrier_1.  But since barrier_1 already depends on group A, we don't need to continue.

Something like:

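      while (!prev->is_head_sentinel()) {
         add_dep(prev, n, 0);

         /* prev is itself a barrier and already depends on everything
          * before it, so there is no need to walk any further. */
         if (is_scheduling_barrier(prev->inst))
            break;
         prev = (schedule_node *)prev->prev;
      }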

Using is_scheduling_barrier approximates the right condition...we could also perhaps just add a schedule_node::is_barrier field that we set when calling add_barrier_deps(), and check here.

Seems easy enough and would likely solve this.
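
To get a feel for how much work the early stop saves, here is a small standalone model that only counts dependency edges. It is a sketch under stated assumptions: `Node`, `backward_barrier_edges` and `total_edges_all_barriers` are hypothetical names, the model covers only the backward walk, and it stands in for add_dep() with a counter rather than building a real dependency graph.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for schedule_node; only the barrier flag is modeled.
struct Node {
    bool is_barrier = false;
};

// Counts the edges a barrier at index n would add while walking backwards.
// With stop_at_barrier set, the walk ends after linking to the nearest
// earlier barrier, since that barrier already covers everything before it.
std::size_t backward_barrier_edges(const std::vector<Node> &nodes,
                                   std::size_t n, bool stop_at_barrier)
{
    std::size_t edges = 0;
    for (std::size_t i = n; i-- > 0;) {
        edges++; // stand-in for add_dep(prev, n, 0)
        if (stop_at_barrier && nodes[i].is_barrier)
            break;
    }
    return edges;
}

// Worst case discussed in this bug: every instruction is a barrier.
std::size_t total_edges_all_barriers(std::size_t count, bool stop_at_barrier)
{
    std::vector<Node> nodes(count);
    for (Node &node : nodes)
        node.is_barrier = true;

    std::size_t total = 0;
    for (std::size_t n = 0; n < count; n++)
        total += backward_barrier_edges(nodes, n, stop_at_barrier);
    return total;
}
```

In this model a block of 10,000 barriers needs 49,995,000 backward edges with the full walk and 9,999 with the early stop. For the A/barrier_1/B/barrier_2 layout above with two instructions per group, barrier_2 gets 3 edges instead of 5.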
Comment 4 Kenneth Graunke 2016-08-18 00:11:05 UTC
(In reply to Kenneth Graunke from comment #3)
> Oh, that is pretty absurd.  Scheduling things seems pretty reasonable.

What I meant was that your suggestion of scheduling outstanding work when we hit a barrier sounds reasonable.
Comment 5 Matt Turner 2016-08-20 01:24:28 UTC
Pushed as

commit a73116ecc60414ade89802150b707b3336d8d50f
Author: Matt Turner <mattst88@gmail.com>
Date:   Thu Aug 18 16:47:05 2016 -0700

    i965/sched: Simplify work done by add_barrier_deps().
