Bug 70435 - [regression] Fast Texture Upload optimization results in corrupt rendering.
Summary: [regression] Fast Texture Upload optimization results in corrupt rendering.
Status: VERIFIED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965 (show other bugs)
Version: git
Hardware: Other All
: medium critical
Assignee: Chad Versace
QA Contact:
URL:
Whiteboard:
Keywords:
: 70737 (view as bug list)
Depends on:
Blocks:
 
Reported: 2013-10-13 21:15 UTC by U. Artie Eoff
Modified: 2013-10-24 22:03 UTC (History)
3 users (show)

See Also:
i915 platform:
i915 features:


Attachments
weston desktop-shell on cairo-glesv2 (193.88 KB, image/png)
2013-10-13 21:15 UTC, U. Artie Eoff
Details
lspci -vvnn (20.62 KB, text/plain)
2013-10-14 19:00 UTC, U. Artie Eoff
Details
patch (1.13 KB, text/plain)
2013-10-20 21:20 UTC, fjhenigman
Details
Potential fix (2.63 KB, text/plain)
2013-10-21 19:07 UTC, Chad Versace
Details

Description U. Artie Eoff 2013-10-13 21:15:59 UTC
Created attachment 87569 [details]
weston desktop-shell on cairo-glesv2

Initially sighted in Weston desktop-shell on cairo-glesv2 (see attached image).  Also seen in various es2 enabled tookits (e.g. EFL and QT5) on wayland and x11.

49ed5991ee002762f963104facdc6b291f14a9b5 is the first bad commit
commit 49ed5991ee002762f963104facdc6b291f14a9b5
Author: Frank Henigman <fjhenigman@google.com>
Date:   Mon Oct 7 21:17:39 2013 -0400

    i965: extend fast texture upload
    
    Extend the fast texture upload from BGRA X-tiled to include RGBA,
    Alpha/Luminance, and Y-tiled.  Speed improvements, measured with
    mesa demos teximage program, on 256 x 256 texture, in MB/s, on a
    Sandy Bridge (Ivy is comparable):
    
                  before  after   increase
    BGRA/X-tiled   3266    4524    1.39x
    BGRA/Y-tiled   1739    3971    2.28x
    RGBA/X-tiled    474    4694    9.90x
    RGBA/Y-tiled    477    3368    7.06x
       L/X-tiled   1268    1516    1.20x
       L/Y-tiled   1439    1581    1.10x
    
    v2: Cosmetic changes only: reformat and reword comments, make doxygen-friendly,
        rename variables, use existing macros, add an assert.
    
    Signed-off-by: Frank Henigman <fjhenigman@google.com>
    Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>

:040000 040000 a38d0d86c05dfe4dd0c9414e6ad33e9590ec6808 54596ee12246a17f1f3df9e9bf254a25e662293c M	src
bisect run success
Comment 1 Chad Versace 2013-10-14 18:48:27 UTC
It's no surprise that this patch caused a regression. I wish Piglit had total and complete coverage of everything :(

Artie, on which hardware did you see the regression? Please provide the GPU's pci id. And how many channels of RAM?
Comment 2 U. Artie Eoff 2013-10-14 19:00:32 UTC
Created attachment 87619 [details]
lspci -vvnn
Comment 3 U. Artie Eoff 2013-10-14 19:01:20 UTC
(In reply to comment #1)
> It's no surprise that this patch caused a regression. I wish Piglit had
> total and complete coverage of everything :(
> 
> Artie, on which hardware did you see the regression? Please provide the
> GPU's pci id. And how many channels of RAM?

Intel Ivybridge hardware... see my attached lscpi output attachment for details.
Comment 4 Chad Versace 2013-10-15 23:56:59 UTC
I have verified the bisect on an Ivybridge machine.

I also verified with gdb that the buggy codepath is exercised only by clients. Weston never hits the codepath.
Comment 5 Chad Versace 2013-10-18 18:34:38 UTC
I'm using weston-terminal to diagnose the bug.

As weston-terminal uses it, there are to formats for which intel_texsubimage_tiled_memcpy codepath gets taken: MESA_FORMAT_ARGB888 and MESA_FORMAT_A8.

I verified that no corruption occurs when format=MESA_FORMAT_ARGB888. And no corruption occurs when format=MESA_FORMAT_A8 and (xoffset, yoffset) = (0, 0).

Corruption does occur when format=MESA_FORMAT_A8 and (xoffset, yoffset) = (10, 9).
Comment 6 Chad Versace 2013-10-18 20:44:32 UTC
I've isolated corruption to this exact call.

intel_texsubimage_tiled_memcpy: level=0 offset=(10,9) (w,h)=(7,7) format=GL_ALPHA(0x1906) type=GL_UNSIGNED_BYTE(0x1401) gl_format=MESA_FORMAT_A8(0x18) tiling=I915_TILING_Y(2) for_glTexImage=false
Comment 7 fjhenigman 2013-10-20 21:19:12 UTC
Thanks Chad.  I was trying to reproduce but had trouble getting wayland/weston/weston-terminal built and running.
Problem here is not taking packing->Alignment into account, which matters when cpp!=4.  In hindsight I was pretty lax in testing the cpp==1 case.
I'll add piglit coverage and test this patch some more before sending to mesa-dev, or apply it now if you're confident.
Comment 8 fjhenigman 2013-10-20 21:20:13 UTC
Created attachment 87898 [details]
patch
Comment 9 Chad Versace 2013-10-21 19:07:43 UTC
Created attachment 87954 [details]
Potential fix

Frank,

I've attached a patch too. I think the correct approach is to calculate the src stride with _mesa_image_row_stride(). That should be more robust against future changes to the function regarding packing options.
Comment 10 fjhenigman 2013-10-21 19:23:24 UTC
Agreed.  What I had in mind for a piglit test is just write textures and read them back to see if they contain the expected stuff - no rendering.  Do this for various formats, sizes, offsets, glPixelStore options.  Sound good?
Comment 11 Chad Versace 2013-10-21 20:18:33 UTC
I think we really need to render the texture contents. Otherwise, the test won't catch the case where a parallel driver bug exists both in the glTexImage and glReadPixels paths. That's not entirely unlikely, because in nearly all cases the upload path for glTexImage and the download path for glReadPixels go through similar code-paths that involve ctx->Driver.MapTextureImage.

For example, if the driver hook for ctx->Driver.MapTextureImage as a bug that miscalculates stride, then it's likely that glTexImage and glReadPixels would agree, but actual texture operations would retrieve incorrect data.
Comment 12 Kristian Høgsberg 2013-10-21 21:13:04 UTC
*** Bug 70737 has been marked as a duplicate of this bug. ***
Comment 13 Chad Versace 2013-10-24 21:28:23 UTC
Fixed by the below Mesa commit. Now waiting for QA to verify.

commit c4205590e731c2e6de6ecc6d13897740df9a7b5f
Author: Chad Versace <chad.versace@linux.intel.com>
Date:   Fri Oct 18 14:06:49 2013 -0700

    i965: Fix glTexImage when packing alignment != cpp

    Fixes texture corruption of Weston clients on cairo-glesv2 backend.
    Commit 49ed599 introduced the bug.

    Corruption occured when glTexSubImage called
    intel_texsubimage_tiled_memcpy() with:
      x,y=10,9
      w,h=7,7
      format=GL_ALPHA(0x1906)
      type=GL_UNSIGNED_BYTE(0x1401)
      gl_format=MESA_FORMAT_A8(0x18)
      packing.alignemnt=4

    The function miscalculated the source image's stride as w*cpp=7 without
    taking into account the packing alignment. The actual stride was 8.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70435
    Reported-by: U. Artie Eoff <ullysses.a.eoff@intel.com>
    Tested-by: Kristian Høgsberg <krh@bitplanet.net>
    Reviewed-by:Frank Henigman <fjhenigman@google.com>
    Signed-off-by: Chad Versace <chad.versace@linux.intel.com>


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.