Bug 93650 - GL_ARB_separate_shader_objects is buggy (PCSX2)
Summary: GL_ARB_separate_shader_objects is buggy (PCSX2)
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965 (show other bugs)
Version: git
Hardware: Other All
: medium normal
Assignee: Ian Romanick
QA Contact: Intel 3D Bugs Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-01-10 12:08 UTC by František Zatloukal
Modified: 2016-01-22 13:52 UTC (History)
3 users (show)

See Also:
i915 platform:
i915 features:


Attachments
apitrace (4.99 MB, application/octet-stream)
2016-01-10 12:08 UTC, František Zatloukal
Details
apitrace from dev build without image_load_store (3.14 MB, application/octet-stream)
2016-01-20 20:17 UTC, František Zatloukal
Details
problematic geometry shader (5.21 KB, text/plain)
2016-01-21 10:48 UTC, Ilia Mirkin
Details
disable varying packing for SSO (995 bytes, patch)
2016-01-21 11:42 UTC, Ilia Mirkin
Details | Splinter Review
fix type computation when no producer and we're still varying-packing (2.25 KB, patch)
2016-01-21 12:07 UTC, Ilia Mirkin
Details | Splinter Review

Description František Zatloukal 2016-01-10 12:08:21 UTC
Created attachment 120928 [details]
apitrace

Graphical output is corrupted if GL_ARB_separate_shader_objects is used by PCSX2 in combination with Geometry Shaders.
Disabling any of that extension fixes problem.
Attached apitrace is captured with Mesa 11.1, same problem exists on git. Corruption example can be found at http://imgur.com/a/DW7nt .

Testing was done on Intel Haswell GPU.
Comment 1 Ilia Mirkin 2016-01-20 15:19:27 UTC
in SHADER
{
    vec4 t;
    vec4 c;
    flat vec4 fc;
} GSin[];

decl_var shader_in INTERP_QUALIFIER_FLAT vec4[2] c (VARYING_SLOT_VAR0, 26)
decl_var shader_in INTERP_QUALIFIER_FLAT vec4[2] fc (VARYING_SLOT_VAR2, 28)
decl_var shader_in INTERP_QUALIFIER_FLAT vec4[2] t (VARYING_SLOT_VAR4, 30)

I would have expected these to be packed tightly. At the end you end up with

GS Input VUE map (7 slots, SSO)
  [0] VARYING_SLOT_PSIZ
  [1] VARYING_SLOT_POS
  [2] VARYING_SLOT_VAR0
  [3] BRW_VARYING_SLOT_PAD
  [4] VARYING_SLOT_VAR2
  [5] BRW_VARYING_SLOT_PAD
  [6] VARYING_SLOT_VAR4

I believe the matching VS shader has

out SHADER
{
    vec4 t;
    vec4 c;
    flat vec4 fc;
} VSout;

and at the end prints

VS Output VUE map (5 slots, SSO)
  [0] VARYING_SLOT_PSIZ
  [1] VARYING_SLOT_POS
  [2] VARYING_SLOT_VAR0
  [3] VARYING_SLOT_VAR1
  [4] VARYING_SLOT_VAR2

which makes sense. I think the issue is that glsl_to_nir (or a higher level) is treating the interface array as actually multiple VAR slots instead of one each.

Unfortunately the trace requires ARB_shader_image_load_store so I can't test with other mesa-based drivers.
Comment 2 gregory.hainaut 2016-01-20 19:20:21 UTC
@František,
Could you generate a trace with the GSdx option "override_GL_ARB_shader_image_load_store = 0" (can use the GUI too). And a dev build might be lighter on the trace noise :)

@Ilia,
It is hard to be sure but there is a very high probability that extension isn't used at all.
Comment 3 František Zatloukal 2016-01-20 20:17:33 UTC
Created attachment 121162 [details]
apitrace from dev build without image_load_store

I've added apitrace generated from dev build with disabled image_load_store.
Comment 4 Ilia Mirkin 2016-01-21 10:28:45 UTC
I'm going to tentatively say that the issue lies outside of i965 -- with nouveau I see 2 geometry shaders generated, one for points, one for lines.

GEOM
PROPERTY GS_INPUT_PRIMITIVE LINES
PROPERTY GS_OUTPUT_PRIMITIVE TRIANGLE_STRIP
PROPERTY GS_MAX_OUTPUT_VERTICES 6
PROPERTY GS_INVOCATIONS 1
DCL IN[][0], POSITION
DCL IN[][1], PRIM_ID
DCL IN[][2], GENERIC[0]
DCL IN[][3], GENERIC[2]
DCL IN[][4], GENERIC[4]
DCL OUT[0], POSITION
DCL OUT[1], PRIM_ID
DCL OUT[2], GENERIC[0]
DCL OUT[3], GENERIC[1]
DCL OUT[4], GENERIC[2]

vs

GEOM
PROPERTY GS_INPUT_PRIMITIVE POINTS
PROPERTY GS_OUTPUT_PRIMITIVE TRIANGLE_STRIP
PROPERTY GS_MAX_OUTPUT_VERTICES 6
PROPERTY GS_INVOCATIONS 1
DCL IN[][0], POSITION
DCL IN[][1], PRIM_ID
DCL IN[][2], GENERIC[0]
DCL IN[][3], GENERIC[1]
DCL IN[][4], GENERIC[2]
DCL OUT[0], POSITION
DCL OUT[1], PRIM_ID
DCL OUT[2], GENERIC[0]
DCL OUT[3], GENERIC[1]
DCL OUT[4], GENERIC[2]

Both of those should be getting GENERIC[0..2] on the inputs. So either st/mesa and i965 have the same bug, or it's getting messed up by the linker.
Comment 5 Ilia Mirkin 2016-01-21 10:48:42 UTC
Created attachment 121175 [details]
problematic geometry shader

Attached is a piglit shader_runner test file which reproduces the issue.
Comment 6 Ilia Mirkin 2016-01-21 11:42:34 UTC
Created attachment 121176 [details] [review]
disable varying packing for SSO

The attached patch fixes the trace for me on nouveau... I think. There's still a weird pattern at the end, but perhaps it was intended, or some form of apitrace fail. [Note that apitrace doesn't handle glCopyImageSubData very well... there's a hack in git, but I don't think it's in any released version.]

I don't know if it's the right thing to do though. All these SSO/non-sso/transform feedback/packing/etc rules are too large to all reside in my head simultaneously =/
Comment 7 Ilia Mirkin 2016-01-21 12:07:02 UTC
Created attachment 121177 [details] [review]
fix type computation when no producer and we're still varying-packing

This is another approach to fixing the issue. Not sure which I like more.
Comment 8 Ilia Mirkin 2016-01-21 15:58:53 UTC
I've sent the second version to mesa-dev:

http://patchwork.freedesktop.org/patch/71272/

And can confirm that it fixes the original trace on HSW.
Comment 9 gregory.hainaut 2016-01-22 08:26:03 UTC
Oh nice. Thanks you.

Would it be possible to backport the patch on current stable versions (11.1?) ?
Comment 10 Ilia Mirkin 2016-01-22 13:52:55 UTC
Fix pushed, cc'd to 11.0 and 11.1 (not sure if there will be another 11.0.x release though). Should appear in 11.1.2. (BTW, in case it wasn't clear, this wasn't an i965 bug at all, but was actually a core mesa bug, affecting all mesa-based drivers.)

commit dac2964f3ebd96d5ac227984ab0cd79c2c3b2a1a
Author: Ilia Mirkin <imirkin@alum.mit.edu>
Date:   Thu Jan 21 07:17:06 2016 -0500

    glsl: always compute proper varying type, irrespective of varying packing


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.