Bug 84777

Summary: [BSW]Piglit spec_glsl-1.50_execution_geometry-basic fails
Product: Mesa Reporter: lu hua <huax.lu>
Component: Drivers/DRI/i965 Assignee: Ben Widawsky <ben>
Status: VERIFIED FIXED QA Contact: Intel 3D Bugs Mailing List <intel-3d-bugs>
Severity: major    
Priority: high CC: ben
Version: unspecified   
Hardware: All   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Attachments: Avoid GS DW mul

Description lu hua 2014-10-08 06:16:15 UTC
System Environment:
--------------------------
Platform: BSW
Libdrm:             (master)libdrm-2.4.58-4-g00847fa48b83a85b0cb882594a12ed1511f780db
Mesa:               (master)16b53005a7df4249fecb6641af0934c32181fdea
Xserver:            (master)xorg-server-1.16.0-386-g95a5b92e37f73f497d547fd91c543c16d2cc73de
Xf86_video_intel:   (master)2.99.916-84-gec2b9ac81aed0d2dda2948171ca1c260184bf221
Libva:              (master)cdf8636d5fc5b1558570fede347e1599e0d6af3d
Libva_intel_driver: (master)f11176415ec26eb5960ba6841d2d9c22f2cabc60
Kernel:             (drm-intel-nightly)eabc0c8db15f9ba4d727aee5e0612a68cafe1ab5

Bug detailed description:
---------------------------
This test fails on BSW with the Mesa master branch; it passes on BDW.
The following cases also fail:
spec_glsl-1.50_execution_geometry_clip-distance-bulk-copy
spec_glsl-1.50_execution_geometry_clip-distance-in-bulk-read
spec_glsl-1.50_execution_geometry_clip-distance-in-explicitly-sized
spec_glsl-1.50_execution_geometry_clip-distance-in-param
spec_glsl-1.50_execution_geometry_clip-distance-in-values
spec_glsl-1.50_execution_geometry_clip-distance-itemized-copy
spec_glsl-1.50_execution_geometry_clip-distance-out-values
spec_glsl-1.50_execution_geometry_core-inputs
spec_glsl-1.50_execution_geometry_dynamic_input_array_index
spec_glsl-1.50_execution_geometry_end-primitive_0
spec_glsl-1.50_execution_geometry_end-primitive_127
spec_glsl-1.50_execution_geometry_end-primitive_128
spec_glsl-1.50_execution_geometry_end-primitive_129
spec_glsl-1.50_execution_geometry_end-primitive_130
spec_glsl-1.50_execution_geometry_end-primitive_31
spec_glsl-1.50_execution_geometry_end-primitive_32
spec_glsl-1.50_execution_geometry_end-primitive_33
spec_glsl-1.50_execution_geometry_end-primitive_34
spec_glsl-1.50_execution_geometry_max-input-components
spec_glsl-1.50_execution_geometry_point-size-out
spec_glsl-1.50_execution_geometry_primitive-id-in
spec_glsl-1.50_execution_geometry_primitive-id-out
spec_glsl-1.50_execution_geometry_primitive-id-restart_GL_LINES_ADJACENCY_ffs
spec_glsl-1.50_execution_geometry_primitive-id-restart_GL_TRIANGLES_ADJACENCY_ffs
spec_glsl-1.50_execution_geometry_primitive-id-restart_GL_TRIANGLES_ADJACENCY_other
spec_glsl-1.50_execution_geometry_primitive-id-restart_GL_TRIANGLE_STRIP_ADJACENCY_ffs
spec_glsl-1.50_execution_geometry_primitive-id-restart_GL_TRIANGLE_STRIP_ADJACENCY_other
spec_glsl-1.50_execution_geometry_primitive-types_GL_LINES_ADJACENCY
spec_glsl-1.50_execution_geometry_primitive-types_GL_LINE_STRIP_ADJACENCY
spec_glsl-1.50_execution_geometry_primitive-types_GL_TRIANGLES_ADJACENCY
spec_glsl-1.50_execution_geometry_primitive-types_GL_TRIANGLE_STRIP_ADJACENCY
spec_glsl-1.50_execution_geometry_tri-strip-ordering-with-prim-restart_GL_TRIANGLE_STRIP_ADJACENCY_ffs
spec_glsl-1.50_execution_geometry_tri-strip-ordering-with-prim-restart_GL_TRIANGLE_STRIP_ADJACENCY_other
spec_glsl-1.50_execution_geometry_triangle-strip-adj
spec_glsl-1.50_execution_geometry_triangle-strip-adj-orientation
spec_glsl-1.50_execution_geometry_triangle-strip-orientation

output: 
Probe color at (0,0)
  Expected: 0.000000 1.000000 0.000000 1.000000
  Observed: 0.000000 0.000000 0.000000 0.000000
PIGLIT: {"result": "fail" }

Reproduce steps:
-------------------------
1. xinit
2. bin/shader_runner tests/spec/glsl-1.50/execution/geometry-basic.shader_test -auto
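
When repeating the reproduce steps over the whole list of failing cases, it can help to pull the verdict out of each run's output automatically. The following is only a minimal sketch, assuming that every run prints a line of the form `PIGLIT: {"result": "..." }` as in the output above; the function name `parse_piglit_result` is hypothetical and not part of piglit.

```python
import json

# Prefix of the verdict line, as seen in the shader_runner output above.
PREFIX = "PIGLIT:"


def parse_piglit_result(output):
    """Return the "result" value from the last PIGLIT line, or None if absent."""
    result = None
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith(PREFIX):
            continue
        try:
            # Everything after the prefix is expected to be a small JSON object.
            payload = json.loads(line[len(PREFIX):])
        except json.JSONDecodeError:
            continue  # Not a well-formed verdict line; ignore it.
        if isinstance(payload, dict):
            result = payload.get("result", result)
    return result
```

One way to use it would be to pass in the captured standard output of a `bin/shader_runner ... -auto` run, for example the `stdout` attribute returned by `subprocess.run(..., capture_output=True, text=True)`.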
Comment 1 Ben Widawsky 2014-11-20 22:26:32 UTC
*** Bug 84786 has been marked as a duplicate of this bug. ***
Comment 2 Ben Widawsky 2014-11-20 22:26:57 UTC
*** Bug 85393 has been marked as a duplicate of this bug. ***
Comment 3 Ben Widawsky 2014-11-20 22:27:31 UTC
*** Bug 84778 has been marked as a duplicate of this bug. ***
Comment 4 Ben Widawsky 2014-11-20 22:27:53 UTC
*** Bug 84078 has been marked as a duplicate of this bug. ***
Comment 5 Ben Widawsky 2014-11-20 22:28:29 UTC
*** Bug 84216 has been marked as a duplicate of this bug. ***
Comment 6 Ben Widawsky 2014-11-20 22:29:12 UTC
*** Bug 84776 has been marked as a duplicate of this bug. ***
Comment 7 Ben Widawsky 2014-11-20 22:29:33 UTC
*** Bug 84784 has been marked as a duplicate of this bug. ***
Comment 8 Ben Widawsky 2014-11-20 22:30:01 UTC
*** Bug 84788 has been marked as a duplicate of this bug. ***
Comment 9 Ben Widawsky 2014-11-20 22:30:32 UTC
*** Bug 84779 has been marked as a duplicate of this bug. ***
Comment 10 Ben Widawsky 2014-11-20 22:30:53 UTC
*** Bug 84783 has been marked as a duplicate of this bug. ***
Comment 11 Ben Widawsky 2014-11-20 22:34:00 UTC
*** Bug 84773 has been marked as a duplicate of this bug. ***
Comment 12 Ben Widawsky 2014-11-20 22:37:48 UTC
*** Bug 84772 has been marked as a duplicate of this bug. ***
Comment 13 Ben Widawsky 2014-11-24 18:50:25 UTC
*** Bug 84780 has been marked as a duplicate of this bug. ***
Comment 14 Ben Widawsky 2014-11-24 18:51:21 UTC
*** Bug 84782 has been marked as a duplicate of this bug. ***
Comment 15 Ben Widawsky 2014-11-24 19:44:22 UTC
*** Bug 84774 has been marked as a duplicate of this bug. ***
Comment 16 Ben Widawsky 2014-12-04 02:49:18 UTC
*** Bug 86769 has been marked as a duplicate of this bug. ***
Comment 17 Ben Widawsky 2014-12-04 02:52:11 UTC
*** Bug 84787 has been marked as a duplicate of this bug. ***
Comment 18 Ben Widawsky 2014-12-04 04:09:08 UTC
*** Bug 84221 has been marked as a duplicate of this bug. ***
Comment 19 Ben Widawsky 2014-12-04 04:12:52 UTC
*** Bug 84775 has been marked as a duplicate of this bug. ***
Comment 20 lu hua 2014-12-04 05:44:05 UTC
Increasing priority; this impacts many test cases.
Comment 21 Ben Widawsky 2014-12-04 06:30:02 UTC
Currently all tests using the GS fail. I've filed an internal bug, but haven't yet gotten anywhere with it. At least a few hundred piglit failures are caused by this. The GS failures may be masking real failures. Therefore I agree that this should be high priority (perhaps even higher).
Comment 22 Ben Widawsky 2014-12-04 21:58:48 UTC
Can you please test my mesa branch?
http://cgit.freedesktop.org/~bwidawsk/mesa/log/?h=qw-mul
Comment 23 Ben Widawsky 2014-12-04 23:33:30 UTC
Created attachment 110473
Avoid GS DW mul

You can use this patch instead of my branch.
Comment 24 lu hua 2014-12-05 04:48:13 UTC
(In reply to Ben Widawsky from comment #23)
> Created attachment 110473
> Avoid GS DW mul
> 
> You can use this patch instead of my branch.

With this patch applied, the test passes.
root@x-bsw01:/GFX/Test/Piglit/piglit# bin/shader_runner tests/spec/glsl-1.50/execution/geometry-basic.shader_test -auto
libGL: OpenDriver: trying /opt/X11R7/lib/dri/tls/i965_dri.so
libGL: OpenDriver: trying /opt/X11R7/lib/dri/i965_dri.so
PIGLIT: {"result": "pass" }
Comment 25 Ben Widawsky 2014-12-05 04:55:37 UTC
Can you please do a full piglit run with this patch? It should fix most, if not all of the BSW bugs that don't exist on BDW.
Comment 26 lu hua 2014-12-05 05:00:19 UTC
(In reply to Ben Widawsky from comment #25)
> Can you please do a full piglit run with this patch? It should fix most, if
> not all of the BSW bugs that don't exist on BDW.

OK.
Comment 27 lu hua 2014-12-08 06:14:49 UTC
Hi Ben,
I ran the full piglit suite on bcc7eb115ed6e with your patch applied, and also on the latest Mesa master branch (I noticed that your patch has landed).
In both runs many cases time out; 1000+ cases have finished so far.
The timeouts only happen when the full piglit suite is run automatically. When I run a test binary directly, I am unable to reproduce the timeout (tried more than 20 cycles).

A test run on bcc7eb115ed6e7 does not have this "timeout" issue.

Run case list:
Piglit/glx_glx-dont-care-mask    PASS
Piglit/glx_glx-fbconfig-compliance    PASS
Piglit/glx_glx-fbconfig-sanity    PASS
Piglit/glx_glx-fbo-binding    TIMEOUT

run log:
@test: Piglit/glx_glx-fbconfig-sanity
errors_ignored!
 libGL: OpenDriver: trying /opt/X11R7/lib/dri/tls/i965_dri.so
 libGL: OpenDriver: trying /opt/X11R7/lib/dri/i965_dri.so
!
info: @@@Returncode: 0

Errors:
libGL: OpenDriver: trying /opt/X11R7/lib/dri/tls/i965_dri.so
libGL: OpenDriver: trying /opt/X11R7/lib/dri/i965_dri.so

Dmesg:

Output:
 bin/glx-fbconfig-sanity -auto -fbo
PIGLIT: {"result": "pass" }

Piglit
returncode: 0
result: pass
summary: Piglit/glx_glx-fbconfig-sanity    PASS
@test: Piglit/glx_glx-fbo-binding
errors_ignored!
 libGL: OpenDriver: trying /opt/X11R7/lib/dri/tls/i965_dri.so
 libGL: OpenDriver: trying /opt/X11R7/lib/dri/i965_dri.so
!
info: @@@Returncode: 1

Errors:
libGL: OpenDriver: trying /opt/X11R7/lib/dri/tls/i965_dri.so
libGL: OpenDriver: trying /opt/X11R7/lib/dri/i965_dri.so

Dmesg:

Output:
 bin/glx-fbo-binding -auto

Piglit
returncode: 1
result: TIMEOUT
summary: Piglit/glx_glx-fbo-binding    TIMEOUT
Comment 28 Ben Widawsky 2014-12-08 21:58:50 UTC
I cannot reproduce this locally with mesa master. Can you please try to reproduce this after doing a hard reboot of the target, and running nothing else in between? 

I've seen the behavior you describe on CHV in particular when GPU resets do not actually reset state, and all subsequent tests will hang. If you can still reproduce after the hard reboot, can you please attach the gzipped results.json?
Comment 29 lu hua 2014-12-09 06:10:04 UTC
(In reply to Ben Widawsky from comment #28)
> I cannot reproduce this locally with mesa master. Can you please try to
> reproduce this after doing a hard reboot of the target, and running nothing
> else in between? 
> 
> I've seen the behavior you describe on CHV in particular when GPU resets do
> not actually reset state, and all subsequent tests will hang. If you can
> still reproduce after the hard reboot, can you please attach the gzipped
> results.json?

I finished one full piglit cycle on the latest master branch.
The pass count is similar to BDW.
One case hits a GPU hang, but the hang is intermittent and I am unable to reproduce it manually. The full log is too large to attach, so I will send you the log location by mail.

case                                                                              BDW   BSW       BUG
spec_ARB_timer_query_timestamp-get                                                PASS  FAIL      bug 84791
spec_arb_shading_language_packing_execution_built-in-functions_vs-packHalf2x16    PASS  FAIL
spec_arb_shading_language_packing_execution_built-in-functions_vs-unpackHalf2x16  PASS  FAIL      bug 84223
shaders_glsl-routing                                                              PASS  FAIL      bug 84826
spec_glsl-1.10_execution_built-in-functions_vs-op-le-int-int-using-if             PASS  GPU_hang

log:
@test: Piglit/spec_glsl-1.10_execution_built-in-functions_vs-op-le-int-int-using-if
errors_ignored!
 libGL: OpenDriver: trying /opt/X11R7/lib/dri/tls/i965_dri.so
 libGL: OpenDriver: trying /opt/X11R7/lib/dri/i965_dri.so
!
info: @@@Returncode: 0


test case start at: Fri Jan  5 16:22:15 2001
test case end at:   Fri Jan  5 16:22:17 2001

Errors:
libGL: OpenDriver: trying /opt/X11R7/lib/dri/tls/i965_dri.so
libGL: OpenDriver: trying /opt/X11R7/lib/dri/i965_dri.so


Dmesg:<3>[20185.674527] [drm:i915_hangcheck_elapsed [i915]] *ERROR* Hangcheck timer elapsed... render ring idle


Output:
 bin/shader_runner /GFX/Test/Piglit/piglit/generated_tests/spec/glsl-1.10/execution/built-in-functions/vs-op-le-int-int-using-if.shader_test -auto
PIGLIT: {"result": "pass" }


Piglit
returncode: 0
result: GPU hang
summary: Piglit/spec_glsl-1.10_execution_built-in-functions_vs-op-le-int-int-using-if    GPU_hang
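
In the log above the test itself reports "pass" while the Dmesg section carries a hangcheck error, so the hang is only visible in the kernel log. As a rough sketch of how such lines could be picked out of captured dmesg text, assuming the message wording and the `[seconds.microseconds]` timestamp format shown above (the helper name `find_hangcheck_events` is hypothetical):

```python
import re

# Marker text taken from the i915 hangcheck message in the log above.
HANG_MARKER = "Hangcheck timer elapsed"
# Kernel timestamp such as "[20185.674527]".
TIMESTAMP = re.compile(r"\[\s*(\d+\.\d+)\]")


def find_hangcheck_events(dmesg_text):
    """Return a list of (timestamp or None, line) for each hangcheck message."""
    events = []
    for line in dmesg_text.splitlines():
        if HANG_MARKER not in line:
            continue
        match = TIMESTAMP.search(line)
        stamp = float(match.group(1)) if match else None
        events.append((stamp, line.strip()))
    return events
```

An empty list means no hangcheck message was found in the text; it does not by itself prove that no hang occurred.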
Comment 30 lu hua 2014-12-09 06:51:28 UTC
> pass count is similar to BDW.
> One case has GPU hang issue, the GPU hang issue is unstable. I am unable
> reproduce it manually, 
> 

Since the original "fail" result is gone, I have filed Bug 87138 to track the GPU hang and am closing this bug.
Comment 31 lu hua 2014-12-09 06:51:49 UTC
Verified. Fixed.
