Bug 92320

Summary: [BSW] piglit tests intermittently/randomly fail when running concurrently
Product: Mesa
Reporter: Mark Janes <mark.a.janes>
Component: Drivers/DRI/i965
Assignee: Mark Janes <mark.a.janes>
Status: RESOLVED MOVED
QA Contact: Intel 3D Bugs Mailing List <intel-3d-bugs>
Severity: normal    
Priority: medium    
Version: git   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
See Also: https://bugs.freedesktop.org/show_bug.cgi?id=108787
Whiteboard:
i915 platform: i915 features:
Attachments: disable fast clear

Description Mark Janes 2015-10-06 18:55:52 UTC
On Linux 4.2.0, random piglit failures occur when tests run concurrently.

Switching to serial execution eliminates the failures, apart from the occasional GPU hang.

This behavior needs more investigation to verify that:
 * the list of failures is indeed unstable/random
 * disabling tests has no effect on the failures.
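
One way to check the first point is to record the set of failing tests from several concurrent runs and separate the tests that fail every time from those that fail only sometimes. The Python sketch below assumes the failing test names of each run have already been collected into sets; `classify_failures` and the sample test names are hypothetical and are not part of piglit.

```python
from typing import Iterable, Set, Tuple


def classify_failures(runs: Iterable[Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Split failures into (always_failing, intermittent).

    Each element of `runs` is the set of test names that failed in one run.
    """
    runs = list(runs)
    if not runs:
        return set(), set()
    seen_anywhere = set().union(*runs)
    seen_everywhere = set.intersection(*runs)
    return seen_everywhere, seen_anywhere - seen_everywhere


if __name__ == "__main__":
    # Hypothetical data: three runs with partly different failure lists.
    sample_runs = [
        {"test-a", "test-b"},
        {"test-a", "test-c"},
        {"test-a"},
    ]
    always, flaky = classify_failures(sample_runs)
    print("always failing:", sorted(always))
    print("intermittent:", sorted(flaky))
```

A non-empty intermittent set across otherwise identical runs would support the claim that the failure list is unstable.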
Comment 1 Mark Janes 2015-10-12 17:12:47 UTC
Confirmed flaky tests:

arb_depth_buffer_float.depthstencil-render-miplevels.1024.ds=z32f_s8
arb_texture_float.multisample-formats.2.gl_arb_texture_float
arb_texture_multisample.texelfetch.fs.sampler2dms.4.1x130-501x130
ext_framebuffer_multisample.multisample-blit.4.color
glsl-1_30.execution.texelfetch.fs.sampler2d.281x1-281x281
glsl-es-3_00.execution.built-in-functions.fs-unpackhalf2x16
glsl-es-3_00.execution.built-in-functions.vs-packhalf2x16
Comment 2 Mark Janes 2015-10-12 19:24:20 UTC
Additional flaky test:

arb_pixel_buffer_object.texsubimage.array.pbo
Comment 3 Mark Janes 2015-10-13 22:23:51 UTC
Created attachment 118862 [details]
disable fast clear
Comment 4 Mark Janes 2015-10-13 22:24:21 UTC
The attached patch resolves the instability by disabling fast clear.
Comment 5 Ben Widawsky 2015-10-14 03:25:14 UTC
The attached patch should only disable MSAA compression, not single sample fast clears.
Comment 6 Ben Widawsky 2015-10-15 22:25:17 UTC
I ran piglit in a loop 8 times and got these intermittent failures (excludes glsl-routing):

spec/arb_gpu_shader5/execution/built-in-functions/fs-uaddcarry-only-add
spec/arb_shading_language_packing/execution/built-in-functions/fs-unpackhalf2x16
spec/arb_texture_gather/texturegather/vs-r-one-float-2d
spec/arb_texture_multisample/texelfetch fs sampler2dms 4 1x130-501x130
spec/glsl-1.10/execution/built-in-functions/fs-clamp-vec3-vec3-vec3
spec/glsl-es-3.00/execution/built-in-functions/fs-packhalf2x16
Comment 7 Ben Widawsky 2015-10-22 18:34:22 UTC
Assigning to Mark to clean up and send the patch.
Comment 8 Mark Janes 2015-10-23 22:50:31 UTC
While testing my patch, I found that the intermittent failures persisted. Sigh.
Comment 9 Mark Janes 2015-11-11 18:19:22 UTC
I updated to Linux 4.3 and saw no improvement to this behavior.
Comment 10 Denis 2019-03-13 12:01:20 UTC
Hi Mark, could you please review my results and confirm them?
_______________________________
My configuration:
Kernel - 4.20.14-200
OS - Fedora 29
Mesa - 18.3.4 (system, from repository)
GPU - HD Graphics 400 (Braswell)
______________________________

>arb_texture_multisample.texelfetch.fs.sampler2dms.4.1x130-501x130
https://mesa-ci.01.org/mesa_master_daily/test/193124e744bdf73482b709f2d936268c/history
passed in CI
didn't find locally


>arb_depth_buffer_float.depthstencil-render-miplevels.1024.ds=z32f_s8
https://mesa-ci.01.org/mesa_master_daily/builds/4821/results/2670202
./depthstencil-render-miplevels 1024 ds=z24_s8 -auto 
passed locally and in CI


>arb_texture_float.multisample-formats.2.gl_arb_texture_float
https://mesa-ci.01.org/mesa_master_daily/test/9cc2bf441f6deca6a807bd5455427d10/history
passed in CI
didn't find test locally


>ext_framebuffer_multisample.multisample-blit.4.color
didn't find in CI
./ext_framebuffer_multisample-multisample-blit 4 color -auto
ran 100 times in a loop - passed locally


>glsl-1_30.execution.texelfetch.fs.sampler2d.281x1-281x281
https://mesa-ci.01.org/mesa_master_daily/test/e5330e5068324542426d0e28a2026e18/history
passed in CI
didn't find locally


>glsl-es-3_00.execution.built-in-functions.fs-unpackhalf2x16
https://mesa-ci.01.org/mesa_master_daily/test/436ab50eca91cbec2e34c4d7eb1e680d/history
passed in CI
didn't find locally


>glsl-es-3_00.execution.built-in-functions.vs-packhalf2x16
https://mesa-ci.01.org/mesa_master_daily/test/0a0d46d391d2ec6697f0aa32c97b9d46/history
passed in CI
didn't find locally


>arb_pixel_buffer_object.texsubimage.array.pbo
https://mesa-ci.01.org/mesa_master_daily/test/adf056a2120e28e48d57a7f77bd02010/history
passed in CI
didn't find locally

To summarize, it looks like most of the tests are stable now. There is one test I could not find in CI: ext_framebuffer_multisample.multisample-blit.4.color. Maybe it was disabled?
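
For reference, a local loop such as the 100-iteration run mentioned above can be sketched in Python as follows. This is only an illustration: `count_failures` is a hypothetical helper, and it treats a non-zero exit status as a failure, which is an assumption about how the test binary reports its result.

```python
import subprocess
import sys
from typing import Sequence


def count_failures(cmd: Sequence[str], iterations: int) -> int:
    """Run `cmd` `iterations` times and return how many runs exited non-zero."""
    failures = 0
    for _ in range(iterations):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        # Assumption: a non-zero exit status means the test failed.
        if result.returncode != 0:
            failures += 1
    return failures


if __name__ == "__main__":
    if len(sys.argv) > 1:
        # The command to repeat is passed on the command line.
        failed = count_failures(sys.argv[1:], 100)
        print(f"{failed} of 100 runs failed")
    else:
        print("usage: loop.py COMMAND [ARGS...]")
```

It could be invoked with the command from this comment, for example `python loop.py ./ext_framebuffer_multisample-multisample-blit 4 color -auto`, where `loop.py` is a placeholder file name. Note that a serial loop like this does not reproduce the concurrent load under which the failures were originally reported.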
Comment 11 Mark Janes 2019-03-13 15:29:44 UTC
Mesa i965 CI is not a useful tool for investigating this bug. Sadly, we punted on this issue 4 years ago, so at the end of each CI run I have to re-execute all BSW tests that failed, to verify that they did not fail due to this bug.

So every BSW failure reported by CI has in fact failed *twice* in the run.
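
The re-execution step described above can be pictured with the following Python sketch. It is not the actual CI code: `confirmed_failures` and the `run_test` callback are hypothetical names, and the fake runner exists only to make the example self-contained.

```python
from typing import Callable, Dict, Iterable, List


def confirmed_failures(
    tests: Iterable[str], run_test: Callable[[str], bool]
) -> List[str]:
    """Return the tests that fail on the first attempt and again on a re-run.

    `run_test(name)` must return True on pass and False on failure.
    """
    first_attempt_failures = [name for name in tests if not run_test(name)]
    # Re-execute only the failures, after the main run has finished.
    return [name for name in first_attempt_failures if not run_test(name)]


if __name__ == "__main__":
    attempts: Dict[str, int] = {}

    def fake_run(name: str) -> bool:
        # Hypothetical runner: "flaky" fails once, "broken" always fails.
        attempts[name] = attempts.get(name, 0) + 1
        if name == "broken":
            return False
        if name == "flaky":
            return attempts[name] > 1
        return True

    print(confirmed_failures(["ok", "flaky", "broken"], fake_run))
```

With this filter, a test that fails once and passes on the re-run is never reported, which is why CI results hide the intermittent failures tracked in this bug.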

My expectation is that the platform is not properly enabled, leading to these instabilities.  See comment 6 in bug 108787.  Perhaps there is some workaround missing, in addition to the mishandled resource constraint that Ken describes.

BSW has less customer relevance than even SNB, because it did not sell well.  As far as I know, Intel CI is the only source of bugs for the platform, because no one else has them.  I'd love for the bugs to get fixed, but they should probably be prioritized lower than the bugs that are known to impact customers (e.g. 104778).
Comment 12 GitLab Migration User 2019-09-25 18:54:49 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1497.
