Summary: | [BDW] piglit@spec@arb_shader_atomic_counters@unused-result intermittent failures | | |
---|---|---|---|
Product: | DRI | Reporter: | Dylan Baker <baker.dylan.c> |
Component: | DRM/Intel | Assignee: | Francisco Jerez <currojerez> |
Status: | CLOSED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Severity: | normal | | |
Priority: | medium | CC: | chris, currojerez, gary.c.wang, intel-gfx-bugs, lemody |
Version: | unspecified | | |
Hardware: | Other | | |
OS: | Linux (All) | | |
Whiteboard: | | | |
i915 platform: | | i915 features: | |
Bug Depends on: | | | |
Bug Blocks: | 93185 | | |
Description
Dylan Baker
2015-07-10 18:35:56 UTC
It's bad synchronisation of the read buffer:

diff --git a/tests/spec/arb_shader_atomic_counters/common.c b/tests/spec/arb_shader_atomic_counters/common.c
index 95d809e..bf8309d 100644
--- a/tests/spec/arb_shader_atomic_counters/common.c
+++ b/tests/spec/arb_shader_atomic_counters/common.c
@@ -35,6 +35,7 @@ atomic_counters_probe_buffer(unsigned base, unsigned count,
         uint32_t *p = glMapBufferRange(
                 GL_ATOMIC_COUNTER_BUFFER, base * sizeof(uint32_t),
                 count * sizeof(uint32_t), GL_MAP_READ_BIT);
+        bool pass = true;
         unsigned i;
 
         if (!p) {
@@ -43,17 +44,18 @@ atomic_counters_probe_buffer(unsigned base, unsigned count,
         }
 
         for (i = 0; i < count; ++i) {
-                if (p[i] != expected[i]) {
+                uint32_t *found = p[i];
+                if (found != expected[i]) {
                         printf("Probe value at (%i)\n", i);
                         printf(" Expected: 0x%08x\n", expected[i]);
-                        printf(" Observed: 0x%08x\n", p[i]);
-                        glUnmapBuffer(GL_ATOMIC_COUNTER_BUFFER);
-                        return false;
+                        printf(" Observed: 0x%08x\n", found);
+                        pass = false;
+                        break;
                 }
         }
 
         glUnmapBuffer(GL_ATOMIC_COUNTER_BUFFER);
-        return true;
+        return pass;
 }
 
 bool

should fix up the error message.

s/*found/found/ ofc

I sent ickle's patch to the mailing list, and fixed the same bug in shader_runner and a compute shader test.

However, this just fixes the test failure message to make sense. The driver still has a race condition that is causing these intermittent failures, and we need to fix that.

It definitely seems to be some kind of race - adding usleep(1000) in the test in between glMapBufferRange and the p[i] access effectively hides the problem.

It also works fine on Haswell - I've run it for a few hours with no failures.

INTEL_DEBUG=sync doesn't help. always_flush_batch=true seems to make it fail a little more often... at minimum, it doesn't help...

Chris, by chance do you have any ideas about this bug? I'm fairly stumped...

piglit.spec.arb_fragment_layer_viewport.layer-gs-writes-in-range depends on atomic counters and suffers from this bug.
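For readers following the patch above, here is a minimal sketch of the same single-exit probe pattern with the s/*found/found/ correction applied. It is not the piglit code: `probe_values` is a hypothetical name, and it compares plain arrays rather than memory returned by glMapBufferRange, so it can be compiled and run without a GL context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the probe loop in the patch: `observed` plays
 * the role of the mapped buffer, which is an assumption made so that the
 * sketch runs without OpenGL.
 */
static bool
probe_values(const uint32_t *observed, const uint32_t *expected, size_t count)
{
	bool pass = true;
	size_t i;

	assert(observed != NULL && expected != NULL);

	for (i = 0; i < count; ++i) {
		/* Read the value once so the comparison and the message
		 * report the same number even if the memory changes. */
		const uint32_t found = observed[i];

		if (found != expected[i]) {
			printf("Probe value at (%zu)\n", i);
			printf(" Expected: 0x%08x\n", (unsigned)expected[i]);
			printf(" Observed: 0x%08x\n", (unsigned)found);
			pass = false;
			break;
		}
	}

	/* Single exit point: the real test unmaps the buffer here. */
	return pass;
}
```

Reading the element into a local by value is what keeps the printed "Observed" value consistent with the comparison; re-reading p[i] after the check could print a value that has since become correct.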
I've sent a test to the mailing list that might be related to this same issue. Adding syncs or sleeps does not seem to help there though.

http://lists.freedesktop.org/archives/piglit/2015-November/018227.html

As another maddening data point... I've never been able to reproduce this on my ThinkPad X250. I had thought it could be a kernel regression (I was on quite an old kernel), but I just ran the test 100 times in a loop without a single failure on 4.2.6-301.fc23.x86_64.

Perhaps it's only a problem with GBM runs? I don't use GBM.

I've been trying to reproduce this failure on BDW today without success. It doesn't seem to make a difference for me whether I use X or GBM to run the test case. I noticed though that the DRM render ring code is missing a DC flush on batchbuffer submission, which could lead to a non-deterministic failure in this test case. Can anyone try out the attached kernel patch and see if it helps? If it does, this should be considered a DRM bug.

Created attachment 120435
gen7_flush_batch_dc_writes.patch

Curro's patch fixes the tests that are intermittent.

However, upgrading to Linux 4.3 (to apply the patch) produces intermittent failures on piglit.spec.arb_shader_image_load_store.qualifiers. I'll write that bug up.

(In reply to Mark Janes from comment #11)
> Curro's patch fixes the tests that are intermittent.
>
> However, upgrading to Linux 4.3 (to apply the patch) produces intermittent
> failures on piglit.spec.arb_shader_image_load_store.qualifiers. I'll write
> that bug up.

Thanks, I've added your tested-by and sent the patch for review.

Fix pushed to drm-intel-next-queued as

commit 965fd602a6436f689f4f2fe40a6789582778ccd5
Author: Francisco Jerez <currojerez@riseup.net>
Date: Wed Jan 13 18:59:39 2016 -0800

    drm/i915: Make sure DC writes are coherent on flush.
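Several comments in this thread rely on running the test many times in a loop to catch the intermittent failure. A minimal sketch of such a loop follows, assuming a POSIX system; `count_failures` is a hypothetical helper and the command string is whatever invocation of the test is used locally, not something taken from this report.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/wait.h>

/*
 * Runs `command` through the shell `runs` times and returns how many runs
 * did not exit with status 0. Hypothetical helper, not part of piglit.
 */
static int
count_failures(const char *command, int runs)
{
	int failures = 0;
	int i;

	assert(command != NULL && runs >= 0);

	for (i = 0; i < runs; ++i) {
		const int status = system(command);

		/* Count a run as failed if the shell could not be started,
		 * the child was killed by a signal, or it exited non-zero. */
		if (status == -1 || !WIFEXITED(status) ||
		    WEXITSTATUS(status) != 0)
			++failures;
	}

	return failures;
}
```

Calling it with the local test command and a run count of 100 reproduces the kind of check described above; a non-zero result indicates that at least one run failed.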