Bug 60172

Summary: Planeshift: triangles where grass would be
Product: Mesa Reporter: Martin Steigerwald <Martin>
Component: Drivers/DRI/i965Assignee: Kenneth Graunke <kenneth>
Status: RESOLVED FIXED QA Contact:
Severity: normal    
Priority: medium CC: idr, jljusten, stereotype441
Version: git   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Attachments: screenshot which shows those triangles that didn´t appear with mesa 8.0.5

Description Martin Steigerwald 2013-02-01 19:47:05 UTC
Created attachment 74068 [details]
screenshot which shows those triangles that didn´t appear with mesa 8.0.5

Since testing out fixes for bug #59086 I have some triangles where grass or rock or whatever would be. Thus spell effects are fast, but triangles appear where there have been none before (with mesa 8.0.5-3 from Debian package). Thus it might be unrelated to the spell effect speedup patches but be related to an issue in the newer mesa.

See example screenshot, that I attach.

Currently I have this with newest mesa git:

OpenGL version string: 3.0 Mesa 9.2-devel (git-a8a5055)

plus the four patches from:

[Mesa-dev] [PATCH 1/4] mesa: Put extern "C" guards in renderbuffer.h.
Kenneth Graunke
http://lists.freedesktop.org/archives/mesa-dev/2013-January/033679.html


I have set LIBGL1_MESA_DRIVERS according to

http://x.debian.net/howto/build-mesa.html

in calling shell, but also in:

martin@merkaba:~> cat /opt/PlaneShift/pslaunch
#!/bin/bash

cd $(dirname $0)
# http://x.debian.net/howto/build-mesa.html
export LIBGL_DRIVERS_PATH=/home/martin/Spielen/Planeshift/spell-effects-slow/mesa.git/lib/
export LD_LIBRARY_PATH=/home/martin/Spielen/Planeshift/spell-effects-slow/mesa.git/lib/:libs/:$LD_LIBRARY_PATH
export EGL_DRIVERS_PATH==/home/martin/Spielen/Planeshift/spell-effects-slow/mesa.git/lib/
exec ./pslaunch.bin $@


I think this would make Planeshift use my newest mesa libraries.

I built mesa with:

martin@merkaba:~/Spielen/Planeshift/spell-effects-slow> ./configure --prefix=/usr \
              --enable-driglx-direct \
              --enable-gles-overlay \
              --enable-gles1 \
              --enable-gles2 \
              --enable-glx-tls \
              --with-driver=dri \
              --with-dri-driverdir=/usr/lib/dri \
              --with-egl-platforms='drm x11' \
              --with-state-trackers=egl,glx,dri,vega \
              --with-dri-drivers=i965 \
            --disable-gallium-llvm \
            --with-gallium-drivers=""

(Disabled gallium stuff, cause it didn´t build out of the box last time.)

Thanks,
Martin
Comment 1 Martin Steigerwald 2013-02-01 19:58:22 UTC
Oh, I have only seen this issue outside of a city so far. Inside cities or buildings all is well.
Comment 2 Kenneth Graunke 2013-02-02 00:12:17 UTC
Confirmed.  This appears to happen with Mesa master but not 9.0.  I'll have to bisect.

(Sadly, it doesn't actually *run* on 9.0 due to an ARB_oq2 bug...I'll pick over the fix for that now so it'll actually work with 9.0.3...)
Comment 3 Kenneth Graunke 2013-02-02 01:21:05 UTC
Managed to track this down.

A trace that reproduces the issue (on Sandy Bridge) is here:
http://people.freedesktop.org/~kwg/traces/planeshift-triangles-2.trace

f3993107f0b997195c4d97b95c47e84220f10b6d is the first bad commit
commit f3993107f0b997195c4d97b95c47e84220f10b6d
Author: Paul Berry <stereotype441@gmail.com>
Date:   Wed Dec 5 10:19:19 2012 -0800

    glsl/linker: Sort varyings by packing class, then vector size.
    
    This patch paves the way for varying packing by adding a sorting step
    before varying assignment, which sorts the varyings into an order that
    increases the likelihood of being able to find an efficient packing.
    
    First, varyings are sorted into "packing classes" by considering
    attributes that can't be mixed during varying packing--at the moment
    this includes base type (float/int/uint/bool) and interpolation mode
    (smooth/noperspective/flat/centroid), though later we will hopefully
    be able to relax some of these restrictions.  The number of packing
    classes places an upper limit on the amount of space that must be
    wasted by varying packing, since in theory a shader might nave 4n+1
    components worth of varyings in each of m packing classes, resulting
    in 3m components worth of wasted space.
    
    Then, within each packing class, varyings are sorted by vector size,
    with vec4's coming first, then vec2's, then scalars, and then finally
    vec3's.  The motivation for this order is that it ensures that the
    only vectors that might be "double parked" (with part of the vector in
    one varying slot and the remainder in another) are vec3's.
    
    Note that the varyings aren't actually packed yet, merely placed in an
    order that will facilitate packing.
    
    Reviewed-by: Eric Anholt <eric@anholt.net>

:040000 040000 665adcc3ac4fb0df02f3a1588400c3a7032e4821 69a17ca15348fca35e5b608a85bd6c061827a767 M       src

Not sure what's going on though.
Comment 4 Kenneth Graunke 2013-02-02 01:32:57 UTC
Drat.  This appears to be Sandy Bridge specific---I can't reproduce it on Ivy Bridge.  Sounds similar to the Steam Big Picture issue we've been tracking down.

CC'ing Jordan in case he's interested...
Comment 5 Martin Steigerwald 2013-02-02 10:23:54 UTC
Kenneth, thank you for digging into it. I am completely astonished by the amount of support I receive for my bug reports. It makes fun filing bug reports that get such a response. I think with these environment variables being set I have a good environment to test out patches. So feel free to throw patches at me. :) Thank you, Martin
Comment 6 Kenneth Graunke 2013-02-02 11:45:36 UTC
My pleasure!

Here's a patch on the mailing list:
http://lists.freedesktop.org/archives/mesa-dev/2013-February/033987.html

It really needs to get cleaned up before going upstream - after several hours of bashing at this, hanging my GPU, and trying to get the numbers exactly right, it got a bit ugly.  But it works now!
Comment 7 Martin Steigerwald 2013-02-02 12:50:57 UTC
(In reply to comment #6)
> My pleasure!
> 
> Here's a patch on the mailing list:
> http://lists.freedesktop.org/archives/mesa-dev/2013-February/033987.html

Tested-by: Martin Steigerwald <martin@lichtvoll.de>

Works just fine on Sandybridge HD 3000 here. Glitches are gone.

Thanks a lot. Great turn around time from bug report to fix! I love it.
Comment 9 Martin Steigerwald 2013-02-03 11:09:45 UTC
Tested-by: Martin Steigerwald <martin@lichtvoll.de>

Corruptions are still gone.

OpenGL version string: 3.0 Mesa 9.2-devel (git-8a4d952)

+ these four patches and the four patches for CopyTexSubImage BLORP y-tiled blits (see bug #59086).

I will observe whether the GPU hangs have gone.

Thanks,
Martin
Comment 10 Kenneth Graunke 2013-02-03 21:44:24 UTC
Fixed in master by:

commit 09fbc298283b41cf3f25d75842a6d8ab4952dca1
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Sat Feb 2 12:46:49 2013 -0800

    i965: Fix the SF Vertex URB Read Length calculation for Sandybridge.
    
    (This commit message was primarily written by Paul Berry, who explained
     what's going on far better than I would have.)
    
    Previous to this patch, we thought that the only restrictions on
    3DSTATE_SF's URB read length were (a) it needs to be large enough to
    read all the VUE data that the SF needs, and (b) it can't be so large
    that it tries to read VUE data that doesn't exist.  Since the VUE map
    already tells us how much VUE data exists, we didn't bother worrying
    about restriction (a); we just did the easy thing and programmed the
    read length to satisfy restriction (b).
    
    However, we didn't notice this erratum in the hardware docs: "[errata]
    Corruption/Hang possible if length programmed larger than recommended".
    Judging by the context surrounding this erratum, it's pretty clear that
    it means "URB read length must be exactly the size necessary to read all
    the VUE data that the SF needs, and no larger".  Which means that we
    can't program the read length based on restriction (b)--we have to
    program it based on restriction (a).
    
    The URB read size needs to precisely match the amount of data that the
    SF consumes; it doesn't work to simply base it on the size of the VUE.
    
    Thankfully, the PRM contains the precise formula the hardware expects.
    
    Fixes random UI corruption in Steam's "Big Picture Mode", random terrain
    corruption in PlaneShift, and Piglit's fbo-5-varyings test.
    
    NOTE: This is a candidate for all stable branches.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56920
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60172
    Tested-by: Jordan Justen <jordan.l.justen@intel.com> (v1/Piglit)
    Tested-by: Martin Steigerwald <martin@lichtvoll.de> (PlaneShift)
    Reviewed-by: Paul Berry <stereotype441@gmail.com>
    Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>

(along with the previous two commits).  These should be cherry-picked back to 9.0 assuming no one finds any issues with them on master.
Comment 11 Andreas Boll 2013-02-21 15:33:41 UTC
Cherry-picked to the 9.0 branch: e11ef54a92843be70850ae0244169b5b9708ecab

It will be available in Mesa 9.0.3.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.