This bug prevents the Android CTS from passing. I see two ways to make the CTS happy: 1) convince the Android CTS team to increase the timeout for dEQP, or 2) fix Mesa. I haven't yet discussed option #1 with the Android team.
Tested on a dual-core Skylake machine with the following commits and build configs. I built everything in release mode.
./configure --disable-debug -DNDEBUG CFLAGS='-g -O2' CXXFLAGS=same ...
CommitDate: Tue Oct 17 15:34:35 2017 -0700
Subject: meson: turn on pl111 not vc4 when pl111 driver specificed
cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo -DEQP_TARGET=Wayland ...
CommitDate: Fri Sep 29 03:39:25 2017 -0400
Subject: Test BLOCK_INDEX of uniform inside block array
Similarly, dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.5 takes 11 seconds to compile.
See previous bug for the same test:
dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23 takes 25 minutes to compile
Oh, I think there's actually a simple fix here.
The issue is that add_barrier_deps() sets is_barrier = true, then walks backward to the previous barrier and forward to... the end of the program, because when you're walking forward, is_barrier hasn't been set on any future instructions yet.
I think we just need to change to is_scheduling_barrier() instead, which means moving some things to backend_instruction instead of fs_inst. (I have a vague memory that Curro wanted us to do that in the first place...)
I've got a patch to do that, and it reduces the runtime to 3.6 seconds. The test still passes, but I'll have to do real regression testing and clean it up tomorrow.
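To illustrate why the forward walk degenerates, here is a minimal sketch, not Mesa's actual scheduler code. The `Node` struct, `forward_walk_cost()`, and `example()` are hypothetical names made up for this illustration; only the forward walk is modeled, and "cost" just counts the nodes visited. It compares stopping on a flag that is set lazily as each barrier is processed (roughly what is_barrier does) against stopping on a property known up front (roughly what is_scheduling_barrier() would provide).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for a scheduler node (not Mesa's types).
struct Node {
    bool intrinsic_barrier; // known from the instruction itself, up front
    bool flagged_barrier;   // set only once the node has been processed
};

// Processes nodes in forward order. For each barrier, sets its flag and then
// walks forward until the stop condition is met. Returns the total number of
// nodes visited by all forward walks.
std::size_t forward_walk_cost(std::vector<Node> nodes, bool use_intrinsic) {
    std::size_t visited = 0;
    for (std::size_t i = 0; i < nodes.size(); ++i) {
        if (!nodes[i].intrinsic_barrier)
            continue;
        nodes[i].flagged_barrier = true;
        for (std::size_t j = i + 1; j < nodes.size(); ++j) {
            ++visited;
            // The lazily set flag is never true for later nodes, so that
            // variant always runs to the end of the program.
            bool stop = use_intrinsic ? nodes[j].intrinsic_barrier
                                      : nodes[j].flagged_barrier;
            if (stop)
                break;
        }
    }
    return visited;
}

// Usage sketch: a program made of 1000 consecutive barriers.
inline void example() {
    std::vector<Node> nodes(1000, Node{true, false});
    assert(forward_walk_cost(nodes, false) == 499500); // n(n-1)/2 visits
    assert(forward_walk_cost(nodes, true) == 999);     // n-1 visits
}
```

In this toy model the flag-based walk is quadratic in the number of barriers, while the predicate-based walk is linear. That matches the shape of the slowdown described above, though the real scheduler does more work per node.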
Thanks Ken! I will be silent on this bug for several days, as I'm taking time off, returning Wed 25th.
Patches on list:
We face the same issue on Android O. Please let us know when these fixes will be merged upstream.
The test passes with the above patches.
Author: Kenneth Graunke <email@example.com>
Date: Tue Oct 17 23:19:20 2017 -0700
i965: Use is_scheduling_barrier instead of schedule_node::is_barrier.
Commit a73116ecc60414ade89802150b tried to make add_barrier_deps()
walk to the next barrier, and stop. To accomplish that, it added an
is_barrier flag. Unfortunately, this only works half of the time.
The issue is that add_barrier_deps() walks both backward (to the
previous barrier), and forward (to the next barrier). It also sets
is_barrier. Assuming that we're processing instructions in forward
order, this means that is_barrier will be set for previous instructions,
but not future ones. So we'll never see it, and walk further than we
now compiles its shaders in 3.6 seconds instead of 3.3 minutes.
Reviewed-by: Matt Turner <firstname.lastname@example.org>
Tested-by: Pallavi G <email@example.com>