This bug affects Gen8 platforms Braswell and Broadwell. I've been testing on a Braswell C0 part.
I'm running with the oibaf PPA on Ubuntu 14.10. The SNA version from the Xorg log is:
xserver-xorg-video-intel 2:2.99.917+git1501151931.bb279d~gd~u (Oibaf <firstname.lastname@example.org>)
If you run the GLBenchmark Onscreen TRex benchmark, you can see occasional artifacts where leaves draw lines forward into the screen. It seems like an off-by-one error, where one of the vertices is squashed when it is rendered.
Changing to UXA eliminates the behavior.
Created attachment 112377
Use PTE cache settings for winsys buffers.
Sounds like a mesa bug, try this.
(Either that or it is a bug in the rendercopy routine that somehow is only seen within a mesa surface.)
I applied the patch, which had no effect on the behavior.
It should be easy to reproduce, please let me know if you have trouble making it happen.
I have glbenchmark2.7.0 available, what is your full commandline to run the test?
./GLBenchmark -w 1920 -h 1080 -ow 1920 -oh 1080 -t GLB27_TRex_C24Z16_FixedTimeStep
Ok, I spotted the corruption - it seems easiest to spot watching just below halfway down the screen as a green/brown rope extending across the scene.
That is not the corruption I was thinking of and is internal to mesa. It might be the same GEN8_3DSTATE_VF_INSTANCING pollution but in reverse?
i.e. commit 0532a3313ad9c76a6e1d28e8a1c2ea495583fead
Author: Chris Wilson <email@example.com>
Date: Wed Nov 5 20:11:54 2014 +0000
sna/gen8: Clear instancing enabled bit between batches
gen8 sets the instancing bit relative to the vertex element, but we were
clearing it for the vertex buffer. As the maximum number of vertex
elements is fixed, just clear them all when emitting our header. Note
that VF_SGVS is not sufficient by itself to disable all side-effects of
Thanks to Kenneth Graunke for pointing out the change from vertex buffer
to vertex element of the instancing enable bit.
Signed-off-by: Chris Wilson <firstname.lastname@example.org>
Even with render copy eliminated (by always doing pageflips instead), the artifacts are still visible. Note that glbenchmark is buggy: the x11_es path, and only that path, sets a borderWidth, so the window never actually fully covers the screen.
Note: I've also seen wrongly drawn vertices on Windows, on BSW B0, especially with a slightly older Windows driver. They were also visible in SynMark (batch tests). -> Could be a missing HW workaround (WA). Make sure you have the latest kernel.
The kernel we used for reproducing this included braswell mods from bwidask:
I think his branch point was mid-january. If you think there have been relevant kernel changes since then, can you reproduce with a more recent kernel?
It's easily reproducible on Broadwell using a stock drm-intel-nightly kernel, though.
(In reply to Mark Janes from comment #9)
> The kernel we used for reproducing this included braswell mods from bwidask:
> I think his branch point was mid-january. If you think there have been
> relevant kernel changes since then, can you reproduce with a more recent
Haven't tried, but according to Valtteri, the latest drm-intel-nightly kernels hang the BSW machine with the Egypt & T-Rex tests (not with others).
I ran TRex on my Lenovo X250 (Broadwell GT2) using xserver master and xf86-video-modesetting with Glamor for acceleration, and I see the exact same corruption we saw with SNA.
So I think that completely rules out SNA as the source of the problem. It's probably a Mesa bug of sorts...
INTEL_DEBUG=sync makes the corruption go away (correct rendering).
always_flush_batch=true has no effect (still broken).
Created attachment 115082
Fix the blitter code; fixes the bug.
When trying to figure out why synchronization mattered, I started playing around with stalling vs. blitting in brw_buffer_subdata(). It turned out stalling always worked, but blitting failed.
INTEL_DEBUG=sync means that the buffer is never busy, so we can simply map it and edit the data, rather than having to do a stall-avoidance BLT.
Broadwell changed the blitter. Looking at the XY_SRC_COPY_BLT command, it now requires the source and destination addresses to be cacheline aligned. Our BufferSubData calls were performing linear blits with unaligned addresses. I suspect the unaligned portion was just...not copied...leading to entirely bunk vertex data.
Fixing this in intel_emit_linear_blit() is pretty easy. We can just use offset % 64 as the X coordinate, and round the address down. This fixes the bug.
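To make the arithmetic concrete, here is a minimal sketch of the offset split described above. It is not the actual Mesa patch: the struct, the helper name split_blit_offset(), and the CACHELINE constant are hypothetical names chosen for illustration, and it assumes a 64-byte cacheline and a linear blit treated as one byte per pixel.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed blitter alignment requirement in bytes (64, per the comment above). */
#define CACHELINE 64u

/* Hypothetical result type: an aligned base address offset plus the
 * leftover bytes that get passed as the blit's X coordinate. */
struct aligned_blit_offset {
    uint32_t base; /* offset rounded down to a cacheline boundary */
    uint32_t x;    /* offset % 64, used as the X coordinate */
};

/* Hypothetical helper, not a Mesa function: split a byte offset into a
 * cacheline-aligned base and an X coordinate within that cacheline. */
static struct aligned_blit_offset split_blit_offset(uint32_t offset)
{
    struct aligned_blit_offset r;

    r.x = offset % CACHELINE;
    r.base = offset - r.x;

    /* The two parts always add back up to the original offset. */
    assert(r.base % CACHELINE == 0);
    assert(r.base + r.x == offset);
    return r;
}
```

For example, a byte offset of 1000 splits into a base of 960 and an X coordinate of 40. The same split would be applied to the source and destination offsets independently.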
We almost certainly need to alter intelEmitCopyBlit() as well, as untiled images will likely suffer from a similar bug. Maybe the other intel_blit.c functions, too. Given that intelEmitCopyBlit() is allowed to fail, we can always just make it fail when the addresses are not cacheline aligned. We may also want to actually adjust the parameters to make it work in some cases. Not sure.
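The "just fail when unaligned" option could look roughly like the sketch below. This is only an illustration under stated assumptions: try_copy_blit() is a hypothetical stand-in for a copy routine that is allowed to fail, not the real intelEmitCopyBlit() signature, and the 64-byte cacheline size is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed blitter alignment requirement in bytes. */
#define BLT_CACHELINE 64u

/* Hypothetical helper: true if addr sits on a cacheline boundary. */
static bool is_cacheline_aligned(uint64_t addr)
{
    /* The mask trick below only works for power-of-two sizes. */
    assert((BLT_CACHELINE & (BLT_CACHELINE - 1)) == 0);
    return (addr & (BLT_CACHELINE - 1)) == 0;
}

/* Hypothetical stand-in for a copy routine that may refuse the request.
 * Returning false tells the caller to use some other path instead. */
static bool try_copy_blit(uint64_t src_addr, uint64_t dst_addr)
{
    if (!is_cacheline_aligned(src_addr) || !is_cacheline_aligned(dst_addr))
        return false; /* unaligned linear address: bail out early */

    /* ... the blit command would be emitted here ... */
    return true;
}
```

Refusing the blit is the conservative choice. Adjusting the parameters instead would keep more copies on the blitter, at the cost of more cases to get right.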
Oh, you missed the joy of:
Issue: if the 1st pixel in XY_SRC_COPY is not CL aligned when SRC or
DST are linear that will cause failure.
Fixed in master with:
Author: Kenneth Graunke <email@example.com>
Date: Wed Apr 15 03:04:33 2015 -0700
i965: Make intel_emit_linear_blit handle Gen8+ alignment restrictions.