Mesa Regression: yes, but cannot bisect now
Failed Platform: HSW
Observed HSW performance FPS reduced by 36% when running Synmark2_v6 Multithread case
Bug detailed description:
Compare with good commit, Synmark2_v6 Multithread case performance FPS dropped by 36%
Currently cannot bisect, only find good and bad commit.
Author: Alex Deucher <firstname.lastname@example.org>
Date: Mon Aug 10 15:35:21 2015 -0400
radeonsi: add new OLAND pci id
Author: Ilia Mirkin <email@example.com>
Date: Wed Aug 19 18:43:47 2015 -0400
glsl: expose textureQueryLod in GLSL 4.00+ fragment shaders
See issue from the ARB_texture_query_lod spec for LOD vs Lod confusion:
(3) The core specification uses the "Lod" spelling, not "LOD". Should
this extension be modified to use "Lod"?
Bisect progress log:
1, bf3c50fba221f216e38d3f60f89161ced4c684c0 bad
2, 1bba29ed403e735ba0bf04ed8aa2e571884fcaaf bad
4, 3c04a90e91a64a4a09d77c76c6ddcaca949e9b0e skip
5, bfac8ba9d32be351277c7ea814ac9848bdcb1f16 bad
6, c58534c1384dc63bb1b13eb37c06bdb4652c13ff skip
7, c4b4bad68a90510406c0bef97039f7d0b4f8f5fe skip
8, f72fead4a28d5d8a16bbc20781218ea7df0b9c9a bad
10. fb19df7a626d02cb54614d4610af2d14720a2ef3 skip
11, a47ae8de2cf30fbe45318a18a2ea032f30ab7d10 skip
12. b88f14702d9c02a34d517f95fe840527961631cd skip
13, fe55ab2d12202236ba5bf9beae09803dfe97a7ac skip
14, 86dbd8af40deaa99aedf011e863b908173e63012 skip
15, f9094691378722304dd94deb76ad013bd65c7a5b skip
16, 50545882113b389decc3f05771764f6c62213af3 skip
17, 2cdb24a7c2238843d23b468275d479553f537e7e skip
18, 7e8be000101cc6fe3846745b559f2d785430e253 skip
19, 5f1d5b1c7857f8680b47a7a450ee9e4530e22c6f can't start gnome-session can't run
20, caae52561dabb2d20f2369c547e660d078974285 skip
21, e3eb91af804f449005a2ff535c805eaa1d579d99 can't start gnome-session can't run
3. cd /usr/local/games/OpenGL/SynMark2_6/
4. ./synmark2 OglMultithread
Created attachment 117961 [details]
I'm bisecting it to
Author: Francisco Jerez <firstname.lastname@example.org>
Date: Fri Nov 22 16:00:33 2013 -0800
i965: Define implementation constants for ARB_shader_image_load_store.
Reviewed-by: Paul Berry <email@example.com>
v2: Drop VS support pre-Gen8, drop GS support.
Reviewed-by: Kenneth Graunke <firstname.lastname@example.org>
Specifically, commenting out the assignment of ctx->Const.MaxImageUnits brings performance back to previous levels.
Running strings on synmark indeed shows "GL_ARB_shader_image_load_store", so the benchmark may be doing additional work with the extension enabled.
Except that the extension isn't enabled at that point - it's turned on several commits later. I'm pretty sure only tests with "CS" in their name use ARB_shader_image_load_store, and this isn't one of them.
Yeah, apparently the regression is caused by the additional validation done for each image unit when _NEW_TEXTURE is flagged. I would expect validation for a single disabled image unit to be cheap, little more than a couple of pointer dereferences and a branch, but still it seems to add up to a measurable overhead due to the huge number of image units we expose. The attached patch removes the _NEW_TEXTURE-dependent state (actually just the _Valid field) from the image unit struct so this per-image unit validation is no longer necessary. Instead the driver is required to call a function to calculate the same value when needed, what may make per-image state updates in the driver a little more expensive but is likely to be a net win because the overall complexity is reduced since now only image units which are actually bound to a shader need to be validated.
Created attachment 117980 [details] [review]
Tested on the latest mesa with the patch, this problem does not exists.
(In reply to ye.tian from comment #6)
> Tested on the latest mesa with the patch, this problem does not exists.
Thanks, I've formatted the change as a patch series and sent it for review with your tested-by. 
I've just landed the series that fixes this regression, marking as resolved.