On Linux 4.2.0, random piglit failures occur when tests run concurrently. Switching to serial execution eliminates the failures, apart from the occasional GPU hang. This behavior needs more investigation to verify that:
* the list of failures is indeed unstable/random
* disabling tests has no effect on the failures
Confirmed flaky tests:
* arb_depth_buffer_float.depthstencil-render-miplevels.1024.ds=z32f_s8
* arb_texture_float.multisample-formats.2.gl_arb_texture_float
* arb_texture_multisample.texelfetch.fs.sampler2dms.4.1x130-501x130
* ext_framebuffer_multisample.multisample-blit.4.color
* glsl-1_30.execution.texelfetch.fs.sampler2d.281x1-281x281
* glsl-es-3_00.execution.built-in-functions.fs-unpackhalf2x16
* glsl-es-3_00.execution.built-in-functions.vs-packhalf2x16
additional flaky test: arb_pixel_buffer_object.texsubimage.array.pbo
Created attachment 118862: disable fast clear
The attached patch resolves the instability by disabling fast clear.
The attached patch should only disable MSAA compression, not single sample fast clears.
I ran piglit in a loop 8 times and got these intermittent failures (excludes glsl-routing):
* spec/arb_gpu_shader5/execution/built-in-functions/fs-uaddcarry-only-add
* spec/arb_shading_language_packing/execution/built-in-functions/fs-unpackhalf2x16
* spec/arb_texture_gather/texturegather/vs-r-one-float-2d
* spec/arb_texture_multisample/texelfetch fs sampler2dms 4 1x130-501x130
* spec/glsl-1.10/execution/built-in-functions/fs-clamp-vec3-vec3-vec3
* spec/glsl-es-3.00/execution/built-in-functions/fs-packhalf2x16
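The comparison across repeated runs can be sketched as follows. This is a minimal Python sketch, not piglit's own tooling: it assumes each run has already been reduced to a plain mapping from test name to status string, and the function name `intermittent_tests` and the sample data are hypothetical.

```python
from typing import Dict, List


def intermittent_tests(runs: List[Dict[str, str]]) -> List[str]:
    """Return tests whose status is not identical across all runs.

    Each element of `runs` is an assumed, already-parsed mapping of
    test name -> status (for example "pass" or "fail"). A test that is
    missing from some run is also reported, since that run gives no
    evidence that it is stable.
    """
    all_names = set()
    for run in runs:
        all_names.update(run)

    unstable = []
    for name in sorted(all_names):
        # Collect the distinct statuses this test had over all runs.
        statuses = {run.get(name, "missing") for run in runs}
        if len(statuses) > 1:
            unstable.append(name)
    return unstable


if __name__ == "__main__":
    # Hypothetical data for three runs; not real results from this bug.
    runs = [
        {"test-a": "pass", "test-b": "pass", "test-c": "fail"},
        {"test-a": "pass", "test-b": "fail", "test-c": "fail"},
        {"test-a": "pass", "test-b": "pass", "test-c": "fail"},
    ]
    print(intermittent_tests(runs))
```

A test that fails in every run is deliberately not reported: it is a consistent failure, not an intermittent one.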
Assigning to Mark to cleanup and send the patch.
In testing my patch, I found that intermittent failures persisted. sigh.
I updated to Linux 4.3 and saw no improvement to this behavior.
Hi Mark, could you please review my results and confirm them?

My configuration:
Kernel - 4.20.14-200
OS - Fedora 29
Mesa - 18.3.4 (system, from repository)
GPU - HD Graphics 400 (Braswell)

>arb_texture_multisample.texelfetch.fs.sampler2dms.4.1x130-501x130
https://mesa-ci.01.org/mesa_master_daily/test/193124e744bdf73482b709f2d936268c/history
passed in CI
didn't find locally

>arb_depth_buffer_float.depthstencil-render-miplevels.1024.ds=z32f_s8
https://mesa-ci.01.org/mesa_master_daily/builds/4821/results/2670202
./depthstencil-render-miplevels 1024 ds=z24_s8 -auto
passed locally and in CI

>arb_texture_float.multisample-formats.2.gl_arb_texture_float
https://mesa-ci.01.org/mesa_master_daily/test/9cc2bf441f6deca6a807bd5455427d10/history
passed in CI
didn't find test locally

>ext_framebuffer_multisample.multisample-blit.4.color
didn't find in CI
./ext_framebuffer_multisample-multisample-blit 4 color -auto
ran 100 times in a loop - passed locally

>glsl-1_30.execution.texelfetch.fs.sampler2d.281x1-281x281
https://mesa-ci.01.org/mesa_master_daily/test/e5330e5068324542426d0e28a2026e18/history
passed in CI
didn't find locally

>glsl-es-3_00.execution.built-in-functions.fs-unpackhalf2x16
https://mesa-ci.01.org/mesa_master_daily/test/436ab50eca91cbec2e34c4d7eb1e680d/history
passed in CI
didn't find locally

>glsl-es-3_00.execution.built-in-functions.vs-packhalf2x16
https://mesa-ci.01.org/mesa_master_daily/test/0a0d46d391d2ec6697f0aa32c97b9d46/history
passed in CI
didn't find locally

>arb_pixel_buffer_object.texsubimage.array.pbo
https://mesa-ci.01.org/mesa_master_daily/test/adf056a2120e28e48d57a7f77bd02010/history
passed in CI
didn't find locally

To summarize, it looks like most of the tests are stable now. I didn't find this one in CI: ext_framebuffer_multisample.multisample-blit.4.color. Maybe it was disabled?
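For anyone repeating the 100-iteration check on the multisample-blit test, a loop of that kind could look roughly like the Python sketch below. The helper names `run_repeatedly` and `count_failures` are hypothetical, and treating a non-zero exit code as "did not pass" is an assumption that should be checked against the test binary's actual behavior.

```python
import subprocess
import sys
from typing import List, Sequence


def run_repeatedly(cmd: Sequence[str], iterations: int) -> List[int]:
    """Run `cmd` `iterations` times and return the exit code of each run."""
    codes = []
    for _ in range(iterations):
        completed = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        codes.append(completed.returncode)
    return codes


def count_failures(codes: Sequence[int]) -> int:
    # Assumption: a non-zero exit code means the run did not pass.
    return sum(1 for code in codes if code != 0)


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; replace it with the
    # real test invocation, such as the multisample-blit command line above.
    cmd = [sys.executable, "-c", "raise SystemExit(0)"]
    codes = run_repeatedly(cmd, 5)
    print(f"{count_failures(codes)} of {len(codes)} runs failed")
```

Because the original report mentions occasional GPU hangs, a real loop would probably also want the `timeout` argument of `subprocess.run` so that a hung test does not block the remaining iterations.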
Mesa i965 CI is not a useful tool for investigating this bug. Sadly, we punted on this issue 4 years ago, and I have to re-execute all BSW tests that fail at the end of a CI run, to verify that they did not fail due to this bug. So every BSW failure reported by CI has in fact failed *twice* in the run.

My expectation is that the platform is not properly enabled, leading to these instabilities. See comment 6 in bug 108787. Perhaps there is some workaround missing, in addition to the mishandled resource constraint that Ken describes.

BSW has less customer relevance than even SNB, because it did not sell well. As far as I know, Intel CI is the only source of bugs for the platform, because no one else has them. I'd love for the bugs to get fixed, but they should probably be prioritized lower than the bugs that are known to impact customers (e.g. 104778).
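The "failed twice" rule described above can be illustrated with a short Python sketch. This is not the actual Mesa CI implementation: `split_by_retry` and the `run_test` callable are hypothetical names, and the runner in the demo is a fake that simulates one flaky test.

```python
from typing import Callable, Iterable, List, Tuple


def split_by_retry(
    failed: Iterable[str], run_test: Callable[[str], bool]
) -> Tuple[List[str], List[str]]:
    """Re-run each initially failing test once.

    Returns (failed_twice, passed_on_retry). `run_test` is a hypothetical
    callable that executes one test and returns True on pass.
    """
    failed_twice: List[str] = []
    passed_on_retry: List[str] = []
    for name in failed:
        if run_test(name):
            # Failed once, passed once: possibly hit by the instability.
            passed_on_retry.append(name)
        else:
            # Failed in both executions: reported as a failure.
            failed_twice.append(name)
    return failed_twice, passed_on_retry


if __name__ == "__main__":
    # Fake runner for illustration: only "flaky-test" passes on retry.
    def fake_runner(name: str) -> bool:
        return name == "flaky-test"

    reported, suppressed = split_by_retry(
        ["flaky-test", "broken-test"], fake_runner
    )
    print("reported:", reported)
    print("passed on retry:", suppressed)
```

A single retry only lowers the chance that an intermittent failure is reported; a test that happens to fail twice in a row is still counted as a failure.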
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1497.