Bug 109759 - [BISECTED][REGRESSION][IVB, HSW] Font rendering problem in OpenGL
Summary: [BISECTED][REGRESSION][IVB, HSW] Font rendering problem in OpenGL
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965 (show other bugs)
Version: git
Hardware: x86-64 (AMD64) Linux (All)
: medium normal
Assignee: Ian Romanick
QA Contact: Intel 3D Bugs Mailing List
URL:
Whiteboard:
Keywords: bisected, regression
Depends on:
Blocks:
 
Reported: 2019-02-24 10:30 UTC by Ferdi Scholten
Modified: 2019-03-25 17:23 UTC (History)
0 users

See Also:
i915 platform:
i915 features:


Attachments
Screenshot showing the problem (152.80 KB, image/png)
2019-02-24 10:30 UTC, Ferdi Scholten
Details
Apitrace from kitty terminal emulator (269.80 KB, application/octet-stream)
2019-02-24 13:51 UTC, Ferdi Scholten
Details
Changes from before 4cd1a0be7688 to after 4cd1a0be7688 (17.88 KB, text/plain)
2019-02-27 17:20 UTC, Ian Romanick
Details

Description Ferdi Scholten 2019-02-24 10:30:34 UTC
Created attachment 143447 [details]
Screenshot showing the problem

openGL accelerated software like the kitty terminal emulator are unusable because the fonts are not rendered correctly. Only seeing blocks and dashes in several colors.

This is with MESA git (from padoka ppa) on Ubuntu 18.10

Rendering is correct on Mesa 18.3

Screenshot shows the problem in front of a bugzilla page with the same kind of problem on RADV (Bug 109735)
Comment 1 Lionel Landwerlin 2019-02-24 13:18:55 UTC
Could you capture the problem with apitrace?
That would help us to pinpoint the problem to particular commit.
Thanks
Comment 2 Ferdi Scholten 2019-02-24 13:51:23 UTC
Created attachment 143450 [details]
Apitrace from kitty terminal emulator
Comment 3 Lionel Landwerlin 2019-02-24 15:09:39 UTC
Hmm.. I can't reproduce locally with mesa master nor with the last PPA (Mesa revision f4f4ec9).
This could be an issue related to rest of your environment (compositor, etc...).
Comment 4 Ferdi Scholten 2019-02-24 15:30:57 UTC
Running standard Ubuntu 18.10
Linux ferdi-ThinkPad-T530 4.18.0-15-generic #16-Ubuntu SMP Thu Feb 7 10:56:39 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
using default Ubuntu Gnome environment and only upgraded MESA from the Ubuntu standard 18.3 to the MESA git version from the Padoka ppa. Before the upgrade font rendering was correct.

If there is anything else I can do to help let me know, I'm not a software developer though...
Comment 5 Lionel Landwerlin 2019-02-24 20:32:55 UTC
(In reply to Ferdi Scholten from comment #4)
> Running standard Ubuntu 18.10
> Linux ferdi-ThinkPad-T530 4.18.0-15-generic #16-Ubuntu SMP Thu Feb 7
> 10:56:39 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
> using default Ubuntu Gnome environment and only upgraded MESA from the
> Ubuntu standard 18.3 to the MESA git version from the Padoka ppa. Before the
> upgrade font rendering was correct.
> 
> If there is anything else I can do to help let me know, I'm not a software
> developer though...


Thanks for the details, I assumed you would have a more recent kind of HW.
That's probably why I'm not seeing the problem :)
I'll try to test on Ivybridge.
Comment 6 Lionel Landwerlin 2019-02-25 12:08:02 UTC
Bisected to :

commit 4cd1a0be76883c2b13aae8c97972e8f1404d06f7
Author: Ian Romanick <ian.d.romanick@intel.com>
Date:   Tue Jun 19 18:09:05 2018 -0700

    i965/vec4: Propagate conditional modifiers from more compares to other compares
    
    If there is a CMP.NZ that compares a single component (via a .zzzz
    swizzle, for example) with 0, it can propagate its conditional modifier
    back to a previous CMP that writes only that component.  The specific
    case that I saw was:
    
        cmp.l.f0(8)     g42<1>.xF       g61<4>.xF       (abs)g18<4>.zF
        ...
        cmp.nz.f0(8)    null<1>D        g42<4>.xD       0D
    
    In this case we can just delete the second CMP.
    
    No changes on Broadwell or Skylake because they do not use the vec4
    backend.  Also no changes on GM45 or Iron Lake.
    
    Sandy Bridge, Ivy Bridge, and Haswell had similar results. (Sandy Bridge shown)
    total instructions in shared programs: 10856676 -> 10852569 (-0.04%)
    instructions in affected programs: 228322 -> 224215 (-1.80%)
    helped: 1331
    HURT: 0
    helped stats (abs) min: 1 max: 7 x̄: 3.09 x̃: 4
    helped stats (rel) min: 0.11% max: 6.67% x̄: 1.88% x̃: 1.83%
    95% mean confidence interval for instructions value: -3.19 -2.99
    95% mean confidence interval for instructions %-change: -1.93% -1.83%
    Instructions are helped.
    
    total cycles in shared programs: 154788865 -> 154732047 (-0.04%)
    cycles in affected programs: 2485892 -> 2429074 (-2.29%)
    helped: 1097
    HURT: 59
    helped stats (abs) min: 2 max: 168 x̄: 51.96 x̃: 64
    helped stats (rel) min: 0.12% max: 12.70% x̄: 3.44% x̃: 2.22%
    HURT stats (abs)   min: 2 max: 16 x̄: 3.02 x̃: 2
    HURT stats (rel)   min: 0.18% max: 0.83% x̄: 0.64% x̃: 0.71%
    95% mean confidence interval for cycles value: -51.04 -47.26
    95% mean confidence interval for cycles %-change: -3.40% -3.07%
    Cycles are helped.
    
    Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
    Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Comment 7 Lionel Landwerlin 2019-02-25 12:09:57 UTC
I should add that the incorrect rendering changes slightly after the bisected commit so there might be multiple issues.
Comment 8 Ian Romanick 2019-02-27 06:53:10 UTC
An earlier patch in that series 440c0513406 "i965/vec4/dce: Don't narrow the write mask if the flags are used" fixed some things related to swizzles and compares.  It's possible that didn't fix all the problems.  It's also possible that there are latent problems in cmod propagation.  I would first try disabling cmod propagation.  If that fixes the problem, comparing the before & after shader assembly should make the problem obvious.  The fix may not be so obvious. :)

As a side note... someone should capture the shaders from this app and put them in the public shader-db.
Comment 9 Ferdi Scholten 2019-02-27 07:15:12 UTC
(In reply to Ian Romanick from comment #8)
> An earlier patch in that series 440c0513406 "i965/vec4/dce: Don't narrow the
> write mask if the flags are used" fixed some things related to swizzles and
> compares.  It's possible that didn't fix all the problems.  It's also
> possible that there are latent problems in cmod propagation.  I would first
> try disabling cmod propagation.  If that fixes the problem, comparing the
> before & after shader assembly should make the problem obvious.  The fix may
> not be so obvious. :)
> 
> As a side note... someone should capture the shaders from this app and put
> them in the public shader-db.

The source code is available on github, the developer is Kovid Goyal.
Comment 10 Lionel Landwerlin 2019-02-27 11:20:20 UTC
(In reply to Ian Romanick from comment #8)
> An earlier patch in that series 440c0513406 "i965/vec4/dce: Don't narrow the
> write mask if the flags are used" fixed some things related to swizzles and
> compares.  It's possible that didn't fix all the problems.  It's also
> possible that there are latent problems in cmod propagation.  I would first
> try disabling cmod propagation.  If that fixes the problem, comparing the
> before & after shader assembly should make the problem obvious.  The fix may
> not be so obvious. :)
> 
> As a side note... someone should capture the shaders from this app and put
> them in the public shader-db.

Yeah, but shader-db doesn't tell us whether the generated code is correct :(
Comment 11 Lionel Landwerlin 2019-02-27 12:02:48 UTC
Okay, commenting opt_cmd_propagation in brw_vec4.cpp fixes the problem.
Comment 12 Ian Romanick 2019-02-27 14:55:11 UTC
(In reply to Lionel Landwerlin from comment #10)
> (In reply to Ian Romanick from comment #8)
> > An earlier patch in that series 440c0513406 "i965/vec4/dce: Don't narrow the
> > write mask if the flags are used" fixed some things related to swizzles and
> > compares.  It's possible that didn't fix all the problems.  It's also
> > possible that there are latent problems in cmod propagation.  I would first
> > try disabling cmod propagation.  If that fixes the problem, comparing the
> > before & after shader assembly should make the problem obvious.  The fix may
> > not be so obvious. :)
> > 
> > As a side note... someone should capture the shaders from this app and put
> > them in the public shader-db.
> 
> Yeah, but shader-db doesn't tell us whether the generated code is correct :(

That's why it was a side note. :D  As soon as we understand the underlying problem, my comment will be, "Someone should add a pitlit test that reproduces this issue." ;)
Comment 13 Lionel Landwerlin 2019-02-27 15:11:59 UTC
Here is a dump of the replacements I'm seeing, let me know if you see anything wrong :

shader 1:
==============>
scan_inst:
cmp(8).z.f0.0 vgrf155.y:D, vgrf91.xxxx:D, vgrf26.xyyy:D
inst:
cmp(8).nz.f0.0 null.x:D, vgrf155.yyyy:D, 0D
inst replacement:
mov(8) vgrf155.y:D, vgrf281.yyyy:D
==============>
scan_inst:
cmp(8).z.f0.0 vgrf115.y:D, vgrf92.xxxx:D, vgrf26.xyyy:D
inst:
cmp(8).nz.f0.0 null.x:D, vgrf115.yyyy:D, 0D
inst replacement:
mov(8) vgrf115.y:D, vgrf282.yyyy:D
==============>
scan_inst:
cmp(8).z.f0.0 vgrf72.y:D, vgrf71.xxxx:D, vgrf26.xyyy:D
inst:
cmp(8).nz.f0.0 null.x:D, vgrf72.yyyy:D, 0D
inst replacement:
mov(8) vgrf72.y:D, vgrf283.yyyy:D


shader 2:
==============>
scan_inst:
cmp(8).z.f0.0 vgrf71.y:D, vgrf66.xxxx:D, vgrf6.xyyy:D
inst:
cmp(8).nz.f0.0 null.x:D, vgrf71.yyyy:D, 0D
inst replacement:
mov(8) vgrf71.y:D, vgrf98.yyyy:D
==============>
scan_inst:
cmp(8).z.f0.0 vgrf46.y:D, vgrf45.xxxx:D, vgrf6.xyyy:D
inst:
cmp(8).nz.f0.0 null.x:D, vgrf46.yyyy:D, 0D
inst replacement:
mov(8) vgrf46.y:D, vgrf99.yyyy:D


shader 3:
==============>
scan_inst:
cmp(8).z.f0.0 vgrf92.y:D, vgrf69.xxxx:D, vgrf8.xyyy:D
inst:
cmp(8).nz.f0.0 null.x:D, vgrf92.yyyy:D, 0D
inst replacement:
mov(8) vgrf92.y:D, vgrf151.yyyy:D
==============>
scan_inst:
cmp(8).z.f0.0 vgrf49.y:D, vgrf48.xxxx:D, vgrf8.xyyy:D
inst:
cmp(8).nz.f0.0 null.x:D, vgrf49.yyyy:D, 0D
inst replacement:
mov(8) vgrf49.y:D, vgrf152.yyyy:D


shader 4:
==============>
scan_inst:
cmp(8).z.f0.0 vgrf152.y:D, vgrf88.xxxx:D, vgrf23.xyyy:D
inst:
cmp(8).nz.f0.0 null.x:D, vgrf152.yyyy:D, 0D
inst replacement:
mov(8) vgrf152.y:D, vgrf255.yyyy:D
==============>
scan_inst:
cmp(8).z.f0.0 vgrf112.y:D, vgrf89.xxxx:D, vgrf23.xyyy:D
inst:
cmp(8).nz.f0.0 null.x:D, vgrf112.yyyy:D, 0D
inst replacement:
mov(8) vgrf112.y:D, vgrf256.yyyy:D
==============>
scan_inst:
cmp(8).z.f0.0 vgrf69.y:D, vgrf68.xxxx:D, vgrf23.xyyy:D
inst:
cmp(8).nz.f0.0 null.x:D, vgrf69.yyyy:D, 0D
inst replacement:
mov(8) vgrf69.y:D, vgrf257.yyyy:D


shader 5:
==============>
scan_inst:
cmp(8).z.f0.0 vgrf11.y:D, vgrf10.xxxx:D, vgrf2.xyyy:D
inst:
cmp(8).nz.f0.0 null.x:D, vgrf11.yyyy:D, 0D
inst replacement:
mov(8) vgrf11.y:D, vgrf15.yyyy:D
Comment 14 Lionel Landwerlin 2019-02-27 15:16:03 UTC
Well, I just figured that we're not using the right destination register for the MOV replacement instruction.
That kind of makes things a bit better, but it's still not correct rendering.
Comment 15 Lionel Landwerlin 2019-02-27 15:43:48 UTC
Trying to replace the following sequence :

cmp(8).z.f0.0 vgrf11.y:D, vgrf10.xxxx:D, vgrf2.xyyy:D
...
cmp(8).nz.f0.0 null.x:D, vgrf11.yyyy:D, 0D

By:

cmp(8).z.f0.0 vgrf15.x:D, vgrf10.xxxx:D, vgrf2.yyyy:D
...
mov(8) vgrf11.y:D, vgrf15.yyyy:D

I wonder about 2 things :

 - can the value of the vgrf11 register be used in the '...' section of instruction in between the 2 instructions we're trying to replace?

 - can the value of the accumulator we set in the first cmp in the replaced snippet by altered in the '...' second of instruction?

I could see either of those things alter the behavior of shader.
Comment 16 Lionel Landwerlin 2019-02-27 15:59:22 UTC
Stumbled into a fix, given how much of noob I am on the backend compiler, this is probably wrong : https://gitlab.freedesktop.org/mesa/mesa/merge_requests/347
Comment 17 Ian Romanick 2019-02-27 17:20:43 UTC
Created attachment 143491 [details]
Changes from before 4cd1a0be7688 to after 4cd1a0be7688

These changes all look fine to me with a couple exceptions.  Some of the shaders have a hunk at the beginning like:

@@ -1798,6 +1798,7 @@
 mov(8)          g63<1>.xUD      0x00000000UD                    { align16 1Q compacted };
 mov(8)          g67<1>.xUD      0x00000000UD                    { align16 1Q compacted };
 add(8)          g74<1>.xD       g5<4>.zD        g5<4>.xD        { align16 1Q };
+mov(8)          g77<1>.yD       g30<4>.yD                       { align16 1Q };
 mov(8)          g82<1>.xUD      0x00000010UD                    { align16 1Q compacted };
 mov(8)          g86<1>.xUD      0x00000020UD                    { align16 1Q compacted };
 mov(8)          g87<1>UD        g3<4>UD                         { align16 1Q };

I haven't looked at it to closely, but this seems suspicious.  In this case, g77 was previously the result of the CMP instruction.  It seems unlikely that g30 will have a value that could be valid for any remaining uses of g77.
Comment 18 Lionel Landwerlin 2019-02-27 20:28:02 UTC
Fixed in :

commit 6e184147ddce11e90c269a47af7d7395f5ed9c12
Author: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Date:   Wed Feb 27 15:53:21 2019 +0000

    intel/compiler: use correct swizzle for replacement
    
    The optimization in 4cd1a0be76883c introduced a replacement of :
    
    cmp(8).z.f0.0 vgrf11.y:D, vgrf10.xxxx:D, vgrf2.xyyy:D
    ...
    cmp(8).nz.f0.0 null.x:D, vgrf11.yyyy:D, 0D
    
    By :
    
    cmp(8).z.f0.0 vgrf15.x:D, vgrf10.xxxx:D, vgrf2.yyyy:D
    ...
    mov(8) vgrf11.y:D, vgrf15.yyyy:D
    
    The first cmp instruction is storing in x while the second mov is
    sourcing from y. We need to take into account where the replacement on
    the scan_inst destination is going to store thing so that the
    replacement mov can source things from the correct location.
    
    Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
    Fixes: 4cd1a0be76883c ("i965/vec4: Propagate conditional modifiers from more compares to other compares")
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109759
    Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Comment 19 Mark Janes 2019-02-27 20:42:08 UTC
How much work would it be to create a piglit test for this?
Comment 20 Lionel Landwerlin 2019-02-27 21:12:32 UTC
(In reply to Mark Janes from comment #19)
> How much work would it be to create a piglit test for this?

Probably not insanely hard, just need to find what control flow triggered that.
The only thing I could see in the shader that was conditional is switch/case.
Comment 21 Sergii Romantsov 2019-03-12 13:26:05 UTC
Hi Mark and Lionel
I got simplified shader-test that triggers a code with issue.
Trying to complete it.
Comment 22 Sergii Romantsov 2019-03-25 17:23:50 UTC
piglit-test: https://gitlab.freedesktop.org/mesa/piglit/merge_requests/28


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.