Bug 87138 - [IVB/HSW/BYT/BSW]Piglit sporadically causes GPU hang
Status: RESOLVED INVALID
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: unspecified
Hardware: All Linux (All)
Importance: medium major
Assignee: Elio
QA Contact: Intel 3D Bugs Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-12-09 06:49 UTC by lu hua
Modified: 2017-02-10 22:39 UTC

See Also:
i915 platform:
i915 features:


Attachments
dmesg (128.67 KB, text/plain)
2015-02-10 02:20 UTC, lu hua

Description lu hua 2014-12-09 06:49:03 UTC
System Environment:
--------------------------
Platform: BSW
Libdrm:		(master)libdrm-2.4.58-19-gf99522e678dbbaffeca9462a8edcbe900574dc12
Mesa:		(master)7e8ba77c49b3fc0fe56d0ba60acc734d389fd9bd
Xserver:	(master)xorg-server-1.16.99.901-21-g4b0d0df34f10a88c10cb23dd50087b59f5c4fece
Xf86_video_intel:(master)2.99.916-165-g9c2c485df9fd39ae36779f765a892e36835a8001
Libva:		(master)cdfd3d50d80c092ca3ae8914418be628d5a80832
Libva_intel_driver:(master)f2a34f94c57e1f7ce975b2068fb087df84b92e3a
Kernel:   (drm-intel-nightly)bfdd01aa1825aa0068f9236b21362b550f6d630f

Bug detailed description:
---------------------------
Running the full Piglit suite sporadically causes a GPU hang. When I run the affected test binary on its own, I am unable to reproduce it.


test log:
Dmesg:<3>[20185.674527] [drm:i915_hangcheck_elapsed [i915]] *ERROR* Hangcheck timer elapsed... render ring idle


Output:
 bin/shader_runner /GFX/Test/Piglit/piglit/generated_tests/spec/glsl-1.10/execution/built-in-functions/vs-op-le-int-int-using-if.shader_test -auto
PIGLIT: {"result": "pass" }


Piglit
returncode: 0
result: GPU hang
summary: Piglit/spec_glsl-1.10_execution_built-in-functions_vs-op-le-int-int-using-if    GPU_hang
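
The log above shows the test binary printing "pass" with returncode 0 while the automation still records "GPU hang", apparently because of the hangcheck line in dmesg. The reporter's automation is not shown in this bug, so the following is only a minimal sketch of that kind of override; the function name and the pattern are assumptions for illustration.

```python
import re

# Assumption: the harness treats this kernel message (quoted in the log above)
# as the sign of a GPU hang. The real automation's rule is not shown in the bug.
HANGCHECK_RE = re.compile(r"\*ERROR\* Hangcheck timer elapsed\.\.\. \w+ ring idle")


def classify(test_result: str, dmesg_text: str) -> str:
    """Return 'GPU hang' if dmesg_text contains a hangcheck error line,
    otherwise return the test's own result unchanged."""
    if HANGCHECK_RE.search(dmesg_text):
        return "GPU hang"
    return test_result
```

With the dmesg line from this report, classify("pass", ...) yields "GPU hang" even though the test itself passed, which matches the mismatch between the PIGLIT line and the final result.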

Reproduce steps:
-------------------------
1. xinit
2. run full piglit
Comment 1 Ben Widawsky 2014-12-16 03:02:27 UTC
Can you please attach the error state?
Comment 2 lu hua 2014-12-16 08:42:15 UTC
(In reply to Ben Widawsky from comment #1)
> Can you please attach the error state?

I have hit this issue only once, during automated testing, and I am unable to reproduce it. If I hit it again, I will update this bug.
Comment 3 lu hua 2014-12-17 06:55:48 UTC
It also happened once on HSW. After running 2 cycles on HSW, I was unable to reproduce it.

@test: Piglit/spec_EXT_framebuffer_multisample_accuracy_4_srgb_small
info: @@@Returncode: 0


test case start at: Tue Dec 16 22:41:41 2014
test case end at:   Tue Dec 16 22:41:41 2014

Errors:


Dmesg:<3>[17221.887208] [drm:i915_hangcheck_elapsed [i915]] *ERROR* Hangcheck timer elapsed... blitter ring idle


Output:
 bin/ext_framebuffer_multisample-accuracy 4 srgb small -auto -fbo
Pixels that should be unlit
  count = 214644
  Perfect output
Pixels that should be totally lit
  count = 28972
  Perfect output
Pixels that should be partially lit
  count = 18528
  RMS error = 0.096435
The error threshold for this test is 0.119880
PIGLIT: {"result": "pass" }


Piglit
returncode: 0
result: GPU hang
summary: Piglit/spec_EXT_framebuffer_multisample_accuracy_4_srgb_small    GPU_hang
Comment 4 Ben Widawsky 2014-12-17 07:00:46 UTC
Assigning back to Ian since I am only focusing on BSW-specific failures right now.
Comment 5 Ben Widawsky 2014-12-19 07:07:20 UTC
What's the PCI ID on the HSW machine?
Comment 6 Ben Widawsky 2014-12-20 01:06:20 UTC
Can you please try on the latest drm-intel-nightly?
Comment 7 lu hua 2014-12-23 05:33:31 UTC
(In reply to Ben Widawsky from comment #5)
> What's the PCI ID on the HSW machine?

id=0x0412, rev 06

(In reply to Ben Widawsky from comment #6)
> Can you please try on the latest drm-intel-nightly?

I will test it on the latest drm-intel-nightly.
Comment 8 lu hua 2014-12-24 01:52:10 UTC
Tested on BSW with the latest drm-intel-nightly kernel; this issue still exists.
kernel:
commit 4fa23142a15526f4a4b5df61f26eacdd558a849a
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Fri Dec 19 15:33:33 2014 +0100

    drm-intel-nightly: 2014y-12m-19d-14h-33m-07s UTC integration manifest

test log:
@test: Piglit/spec_glsl-1.20_execution_tex-miplevel-selection_GL2:textureProj_2DShadow
errors_ignored!
 libGL: OpenDriver: trying /opt/X11R7/lib/dri/tls/i965_dri.so
 libGL: OpenDriver: trying /opt/X11R7/lib/dri/i965_dri.so
!
info: @@@Returncode: 0


test case start at: Mon Jan  1 18:32:57 2001
test case end at:   Mon Jan  1 18:33:01 2001

Errors:
libGL: OpenDriver: trying /opt/X11R7/lib/dri/tls/i965_dri.so
libGL: OpenDriver: trying /opt/X11R7/lib/dri/i965_dri.so


Dmesg:<3>[25371.894786] [drm:i915_hangcheck_elapsed [i915]] *ERROR* Hangcheck timer elapsed... render ring idle


Output:
 bin/tex-miplevel-selection GL2:textureProj 2DShadow -auto -fbo
Summary: 58212/58212 passed
PIGLIT: {"result": "pass" }


Piglit
returncode: 0
result: GPU hang
summary: Piglit/spec_glsl-1.20_execution_tex-miplevel-selection_GL2:textureProj_2DShadow    GPU_hang
Comment 9 Jani Nikula 2015-01-21 16:53:52 UTC
Please test current drm-intel-nightly, and if that fails, with http://patchwork.freedesktop.org/patch/39363 applied.
Comment 10 lu hua 2015-01-23 08:00:57 UTC
This is a sporadic issue. I will double-check it, so my update will be delayed.
Comment 11 lu hua 2015-01-26 03:00:26 UTC
It still happens on the latest -nightly kernel. IVB also has this issue.

HSW:
@test: Piglit/spec_ARB_texture_gather_textureGatherOffset_vs-rg-zero-float-2DArray
info: @@@Returncode: 0


test case start at: Sun Jan 25 10:35:19 2015
test case end at:   Sun Jan 25 10:35:22 2015

Errors:


Dmesg:<3>[53863.198094] [drm:i915_hangcheck_elapsed [i915]] *ERROR* Hangcheck timer elapsed... blitter ring idle


Output:
 bin/textureGather vs offset rg zero float 2DArray -auto -fbo
PIGLIT: {"result": "pass" }


Piglit
returncode: 0
result: GPU hang
summary: Piglit/spec_ARB_texture_gather_textureGatherOffset_vs-rg-zero-float-2DArray    GPU_hang

IVB:
@test: Piglit/spec_ARB_gpu_shader5_textureGather_fs-rgba-1-uint-Cube
errors_ignored!
 libGL: OpenDriver: trying /opt/X11R7/lib/dri/tls/i965_dri.so
 libGL: OpenDriver: trying /opt/X11R7/lib/dri/i965_dri.so
!
info: @@@Returncode: 0


test case start at: Fri Jan 23 20:49:35 2015
test case end at:   Fri Jan 23 20:49:37 2015

Errors:
libGL: OpenDriver: trying /opt/X11R7/lib/dri/tls/i965_dri.so
libGL: OpenDriver: trying /opt/X11R7/lib/dri/i965_dri.so


Dmesg:<3>[  890.693113] [drm:i915_hangcheck_elapsed [i915]] *ERROR* Hangcheck timer elapsed... blitter ring idle


Output:
 bin/textureGather fs  rgba 1 uint Cube repeat -auto -fbo
PIGLIT: {"result": "pass" }


Piglit
returncode: 0
result: GPU hang
summary: Piglit/spec_ARB_gpu_shader5_textureGather_fs-rgba-1-uint-Cube    GPU_hang
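
The hangcheck lines quoted so far name different rings (render on BSW, blitter on HSW and IVB), so it may help to extract the ring name and timestamp when comparing runs. Below is a small sketch that parses the message format exactly as it appears in these logs; the type and function names are hypothetical, and other kernels may format the message differently.

```python
import re
from collections import Counter
from typing import Iterable, NamedTuple, Optional

# Matches the kernel message as quoted in this bug; the "Dmesg:" prefix added
# by the test logs is skipped because search() is used instead of match().
LINE_RE = re.compile(
    r"<(?P<level>\d)>\[\s*(?P<ts>\d+\.\d+)\]\s+"
    r"\[drm:i915_hangcheck_elapsed \[i915\]\] "
    r"\*ERROR\* Hangcheck timer elapsed\.\.\. (?P<ring>.+?) idle"
)


class HangcheckEvent(NamedTuple):
    level: int        # syslog level from the <N> prefix
    timestamp: float  # seconds since boot
    ring: str         # e.g. "render ring" or "blitter ring"


def parse_hangcheck(line: str) -> Optional[HangcheckEvent]:
    """Return the parsed event, or None if the line is not a hangcheck error."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return HangcheckEvent(int(m.group("level")), float(m.group("ts")), m.group("ring"))


def rings_seen(lines: Iterable[str]) -> Counter:
    """Count how often each ring is named across the given dmesg lines."""
    events = (parse_hangcheck(line) for line in lines)
    return Counter(e.ring for e in events if e is not None)
```

Feeding it the two dmesg lines from this comment gives a count of two for "blitter ring".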
Comment 12 lu hua 2015-02-04 01:37:16 UTC
It also happens on BYT.
BYT test log:
@test: Piglit/glean_glsl1-Swizzle_in-place
info: @@@Returncode: 0


test case start at: Tue Feb  3 14:21:24 2015
test case end at:   Tue Feb  3 14:21:27 2015

Errors:


Dmesg:<3>[ 7369.711688] [drm:i915_hangcheck_elapsed [i915]] *ERROR* Hangcheck timer elapsed... render ring idle


Output:
 PIGLIT_TEST='Swizzle in-place' bin/glean -o -v -v -v -t +glsl1 --quick
----------------------------------------------------------------------
GLSL test 1: test basic Shading Language functionality.

glsl1: Running single test: Swizzle in-place
glsl1:  PASS rgba8, db, z24, s8, win+pmap, id 32
        1 tests passed, 0 tests failed.



Piglit
returncode: 0
result: GPU hang
summary: Piglit/glean_glsl1-Swizzle_in-place    GPU_hang
Comment 13 lu hua 2015-02-10 02:20:23 UTC
Created attachment 113291
dmesg

I reproduced this error manually on HSW by running ./bin/textureGather fs r 0 uint CubeArray repeat -auto -fbo for 500 cycles. However, no error state was collected in i915_error_state.

[root@x-hsw24 lh]#  dmesg -r|egrep "<[1-4]>"|grep drm
<3>[  152.763646] [drm:i915_hangcheck_elapsed [i915]] *ERROR* Hangcheck timer elapsed... blitter ring idle


[root@x-hsw24 lh]# cat /sys/kernel/debug/dri/0/i915_error_state
no error state collected
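
For anyone repeating the manual reproduction above, the loop could be scripted roughly as follows. This is a sketch only: the command line, the `dmesg -r` call, and the debugfs path are taken from this comment, while the function names and the "stop at the first new hangcheck line" logic are assumptions. Reading dmesg and debugfs normally requires root, and the count comparison assumes the kernel log buffer does not wrap during the run.

```python
import subprocess
from typing import Callable, Optional

HANG_MARKER = "Hangcheck timer elapsed"
ERROR_STATE_PATH = "/sys/kernel/debug/dri/0/i915_error_state"
# Test invocation quoted in this comment; run from the Piglit directory.
TEST_CMD = ["./bin/textureGather", "fs", "r", "0", "uint",
            "CubeArray", "repeat", "-auto", "-fbo"]


def run_test() -> None:
    """Run the test once, discarding its output and exit status."""
    subprocess.run(TEST_CMD, check=False,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)


def read_dmesg() -> str:
    """Return the raw kernel log, as `dmesg -r` prints it."""
    return subprocess.run(["dmesg", "-r"], capture_output=True,
                          text=True, check=False).stdout


def read_error_state() -> str:
    """Return the contents of the i915 error state file in debugfs."""
    with open(ERROR_STATE_PATH) as f:
        return f.read()


def count_hangs(dmesg_text: str) -> int:
    """Count kernel log lines that contain the hangcheck message."""
    return sum(1 for line in dmesg_text.splitlines() if HANG_MARKER in line)


def find_first_hang(cycles: int,
                    run: Callable[[], None] = run_test,
                    dmesg: Callable[[], str] = read_dmesg) -> Optional[int]:
    """Run the test up to `cycles` times; return the 1-based cycle after which
    a new hangcheck line appeared, or None if none appeared."""
    baseline = count_hangs(dmesg())
    for cycle in range(1, cycles + 1):
        run()
        if count_hangs(dmesg()) > baseline:
            return cycle
    return None
```

Calling find_first_hang(500) and then read_error_state() would mirror the 500-cycle run and the i915_error_state check described in this comment.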
Comment 14 cprigent 2015-08-21 10:12:53 UTC
Not reproduced on BSW-M and SKL-Y with fresh setup.

I will check on BYT-M/T.
Elio, could you try with your NUC HSW and IVB?
Comment 15 Elio 2015-11-06 17:29:19 UTC
This issue is not present with the latest configuration on IVB. I will check it on HSW as soon as possible.
Comment 16 Annie 2017-02-10 22:39:03 UTC
Dear Reporter,

This Mesa bug has been in the "NEEDINFO" status for over 60 days. I am closing this bug based on lack of response but feel free to reopen if resolution is still needed. Please ensure you're supplying the correct information as requested.

Thank you.

