Summary: i965/fs generates slow code for vector comparisons
Product: Mesa
Reporter: Matt Turner <mattst88>
Component: glsl-compiler
Assignee: Ian Romanick <idr>
Status: RESOLVED MOVED
QA Contact: Intel 3D Bugs Mailing List <intel-3d-bugs>
Severity: normal
Priority: medium
CC: petri.latvala, siglesias
Version: unspecified
Hardware: Other
OS: All
Whiteboard:
i915 platform:
i915 features:
Bug Depends on:
Bug Blocks: 77547
Attachments: t.shader_test
Created attachment 97371: t.shader_test

The fragment shader runs in scalar mode, so to do vec4 comparisons we generate one compare per component and join the results together using ands or ors, depending on the comparison.

INTEL_DEBUG=fs,no16 bin/shader_runner t.shader_test -auto generates:

    cmp.e.f0(8) g3<1>D g2.3<0,1,0>F g2.7<0,1,0>F
    cmp.e.f0(8) g4<1>D g2.2<0,1,0>F g2.6<0,1,0>F
    cmp.e.f0(8) g5<1>D g2.1<0,1,0>F g2.5<0,1,0>F
    cmp.e.f0(8) g6<1>D g2<0,1,0>F g2.4<0,1,0>F
    and(8) g7<1>D g5<8,8,1>D g6<8,8,1>D
    and(8) g8<1>D g4<8,8,1>D g7<8,8,1>D
    and(8) g9<1>D g3<8,8,1>D g8<8,8,1>D
    and.ne.f0(8) null g9<8,8,1>D 1D
    ...
    (+f0) sel ...

We could have just predicated all but the first cmp instruction and skipped the and instructions completely:

    cmp.e.f0(8) g3<1>D g2.3<0,1,0>F g2.7<0,1,0>F
    (+f0) cmp.e.f0(8) g4<1>D g2.2<0,1,0>F g2.6<0,1,0>F
    (+f0) cmp.e.f0(8) g5<1>D g2.1<0,1,0>F g2.5<0,1,0>F
    (+f0) cmp.e.f0(8) g6<1>D g2<0,1,0>F g2.4<0,1,0>F
    ...
    (+f0) sel ...

I think a similar thing can be done for !=, where the join operation is or.
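To make the equivalence concrete, here is a small Python model of a single channel. It is only a sketch, not driver code: the function names are made up for illustration, and it assumes that a predicated cmp leaves f0 untouched in a channel where the predicate fails. The != variant additionally assumes an inverted predicate (-f0), which the report does not spell out.

```python
from itertools import product


def eq_joined_with_and(a, b):
    """Current code: one cmp per component, results joined with and."""
    results = [x == y for x, y in zip(a, b)]
    acc = results[0]
    for r in results[1:]:
        acc = acc and r
    return acc


def eq_predicated_chain(a, b):
    """Proposed code: first cmp unpredicated, the rest predicated on (+f0)."""
    pairs = list(zip(a, b))
    f0 = pairs[0][0] == pairs[0][1]
    for x, y in pairs[1:]:
        if f0:  # (+f0): assumed to run, and rewrite f0, only where f0 is set
            f0 = x == y
    return f0


def ne_joined_with_or(a, b):
    """!= case: one cmp per component, results joined with or."""
    return any(x != y for x, y in zip(a, b))


def ne_predicated_chain(a, b):
    """!= case, assuming an inverted predicate (-f0) on the later cmps."""
    pairs = list(zip(a, b))
    f0 = pairs[0][0] != pairs[0][1]
    for x, y in pairs[1:]:
        if not f0:  # (-f0): assumed to run only where f0 is still clear
            f0 = x != y
    return f0


def all_vec4_pairs(values=(0.0, 1.0)):
    """Yield every pair of vec4s whose components are drawn from `values`."""
    for comps in product(values, repeat=8):
        yield comps[:4], comps[4:]
```

Under those assumptions the predicated chain and the explicit join agree for every input pair that all_vec4_pairs() generates, which is what makes the and/or instructions redundant.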