Bug 101484

Summary: [regression, bisected] Steam fails to render content, if mesa is compiled with -O2 -march=native (CPU with bmi instruction supported)
Product: Mesa
Reporter: Gregor Münch <gr.muench>
Component: Drivers/Gallium/radeonsi
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED FIXED
QA Contact: Default DRI bug account <dri-devel>
Severity: major
Priority: medium
CC: lonewolf, lucas.francesco93, mihai.dontu, mike, pan.papadopoulos80
Version: git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Attachments: Steam main window
lolscren

Description Gregor Münch 2017-06-17 16:02:16 UTC
Created attachment 132023 [details]
Steam main window

Steam is no longer usable with Mesa git. I suspect this is specific to SI (Radeon HD 7970), though. I confirmed the bad commit by reverting it.

2b8b9a56efc24cc0f27469bf1532c288cdca2076 is the first bad commit
commit 2b8b9a56efc24cc0f27469bf1532c288cdca2076
Author: Marek Olšák <marek.olsak@amd.com>
Date:   Mon May 29 00:40:39 2017 +0200

    radeonsi: move PSIZE and CLIPDIST unique IO indices after GENERIC
    
    Heaven LDS usage for LS+HS is below. The masks are "outputs_written"
    for LS and HS. Note that 32K is the maximum size.
    
    Before:
      heaven_x64: ls=1f1 tcs=1f1, lds=32K
      heaven_x64: ls=31 tcs=31, lds=24K
      heaven_x64: ls=71 tcs=71, lds=28K
    
    After:
      heaven_x64: ls=3f tcs=3f, lds=24K
      heaven_x64: ls=7 tcs=7, lds=13K
      heaven_x64: ls=f tcs=f, lds=17K
    
    All other apps have a similar decrease in LDS usage, because
    the "outputs_written" masks are similar. Also, most apps don't write
    POSITION in these shader stages, so there is room for improvement.
    (tight per-component input/output packing might help even more)
    
    It's unknown whether this improves performance.
    
    Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
    Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
    Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

:040000 040000 53bea508d363add63bb60fe4d7d776a16ef260c7 b654c367d8856228ec2059a2c3f42c5db2e36119 M	src

git bisect log
git bisect start
# bad: [1f958c1337290b4062a77f79fc101bb9f4bdf515] radeonsi: include ac_binary.h for struct ac_shader_binary
git bisect bad 1f958c1337290b4062a77f79fc101bb9f4bdf515
# good: [bec1c13be2154dfb20baa4fbb33b1560a4ef1910] Android: Drop linking libgcc
git bisect good bec1c13be2154dfb20baa4fbb33b1560a4ef1910
# good: [fe14a9a50140d7b2e25052823efa671bf8d63d71] i965: Drop duplicate shadow variable.
git bisect good fe14a9a50140d7b2e25052823efa671bf8d63d71
# bad: [978e6876f1cd8ccc8850a5665e9619a3e29b731e] etnaviv: flush resource when binding as sampler view
git bisect bad 978e6876f1cd8ccc8850a5665e9619a3e29b731e
# good: [ee38dfe9a5525375012d1c6681e7c39c15ac3049] mesa: make _mesa_scissor_bounding_box() static
git bisect good ee38dfe9a5525375012d1c6681e7c39c15ac3049
# good: [9d3f177e4b1ecd5e6ac4673e1ac8c72df9e159eb] dri: Optionally turn off a couple of GLX extensions based on driconf options
git bisect good 9d3f177e4b1ecd5e6ac4673e1ac8c72df9e159eb
# bad: [97f6f411db9b16ebc7c4bebaf26513c185c8f550] i965/surface_state: Images can't handle CCS at all
git bisect bad 97f6f411db9b16ebc7c4bebaf26513c185c8f550
# bad: [9cb42ae997054f52be2e99764199e00eb28056eb] util: Port nir_array functionality to u_dynarray
git bisect bad 9cb42ae997054f52be2e99764199e00eb28056eb
# bad: [e9409c86e7b076801626474dfa5a9151da078a73] radeonsi: remove 8 bytes from si_shader_key
git bisect bad e9409c86e7b076801626474dfa5a9151da078a73
# good: [1887faf73b379f28eb6c73bdb790dbcc97213b3a] svga: Allow format differences in 16-bit RGBA surface sharing
git bisect good 1887faf73b379f28eb6c73bdb790dbcc97213b3a
# good: [df4d6003dc75395f8ded57fdf59046f0d008eea3] svga: Fix imported surface view creation
git bisect good df4d6003dc75395f8ded57fdf59046f0d008eea3
# bad: [2b8b9a56efc24cc0f27469bf1532c288cdca2076] radeonsi: move PSIZE and CLIPDIST unique IO indices after GENERIC
git bisect bad 2b8b9a56efc24cc0f27469bf1532c288cdca2076
# good: [2c4ec3f93fcab3fddcbe132200b210e7def1facc] svga: Always set the alpha value to 1 when sampling using an XRGB view
git bisect good 2c4ec3f93fcab3fddcbe132200b210e7def1facc
# first bad commit: [2b8b9a56efc24cc0f27469bf1532c288cdca2076] radeonsi: move PSIZE and CLIPDIST unique IO indices after GENERIC
Comment 1 Marek Olšák 2017-06-19 23:48:52 UTC
Sadly, I can't reproduce this on Cape Verde. Piglit results good good.
Comment 2 Marek Olšák 2017-06-19 23:49:17 UTC
I mean piglit results look good. ;)
Comment 3 Gregor Münch 2017-06-20 06:11:32 UTC
One thing that comes to mind: I'm using Steam Beta. Maybe the problem is limited to that, or it's just Steam itself.
Comment 4 Joti Papadopoulos 2017-06-25 09:06:43 UTC
I'm having the same issue on Tonga (Gigabyte R9 380) on Arch Linux.
Comment 5 Gregor Münch 2017-06-25 10:37:39 UTC
I've tested this now, and again it's a compiler flags issue.
Arch standard CFLAGS are:
CFLAGS="-march=native -O2 -pipe -fstack-protector-strong"

Looks like "-O2" is already too much. Setting this to -O1 fixes the problem. This is a very nasty issue. However, -O2 is considered "safe", so the issue should probably be fixed in Mesa.
Comment 6 Gregor Münch 2017-06-25 10:40:42 UTC
Forgot to add: it isn't a problem with Steam Beta only; it's also in the non-beta variant.
Comment 7 network723 2017-06-25 10:51:11 UTC
openSUSE, no problem with -O2

CFLAGS='-fmessage-length=0 -grecord-gcc-switches -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector -funwind-tables -fasynchronous-unwind-tables -DNDEBUG'
CXXFLAGS='-fmessage-length=0 -grecord-gcc-switches -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector -funwind-tables -fasynchronous-unwind-tables -DNDEBUG'
FFLAGS='-fmessage-length=0 -grecord-gcc-switches -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector -funwind-tables -fasynchronous-unwind-tables'
Comment 8 Gregor Münch 2017-06-25 12:25:39 UTC
(In reply to network723 from comment #7)
> openSUSE, no problem with -O2
> 
> CFLAGS='-fmessage-length=0 -grecord-gcc-switches -O2 -Wall
> -D_FORTIFY_SOURCE=2 -fstack-protector -funwind-tables
> -fasynchronous-unwind-tables -DNDEBUG'

Using your CFLAGS works, but adding -march=native (in my case -march=haswell) makes the problem reappear.
Comment 9 Joti Papadopoulos 2017-06-25 17:19:34 UTC
Same here. Removing -march=native (which in my case would be Skylake) fixes the issue.
Comment 10 Eric Engestrom 2017-06-26 09:05:32 UTC
Looks to me like you're hitting a compiler optimisation bug; what compiler (&version) are you using?
Can you try another one? For instance, if you were using gcc, `CC=clang CXX=clang++ ./configure` should do it.
Comment 11 Michel Dänzer 2017-06-26 09:07:54 UTC
It could be either a compiler bug, or a Mesa bug which only manifests itself with certain compiler options.
Comment 12 Mika 2017-06-27 09:03:28 UTC
Hi,
I tried with both clang and gcc on Haswell / Pitcairn with "-O2 -march=native", and I got the very same broken display.

clang version 5.0.0 (trunk 306114)
gcc version 7.1.1 20170528 (GCC)
Comment 13 Gregor Münch 2017-07-08 16:17:00 UTC
I've now tested some of the instruction sets enabled by Haswell according to:
https://gcc.gnu.org/onlinedocs/gcc-6.3.0/gcc/x86-Options.html#x86-Options

And found the culprit with:
-mbmi

Compiling Mesa with this flag corrupts the Steam interface and also the in-game graphs enabled with GALLIUM_HUD.

Since this instruction set has also been supported on AMD CPUs since Piledriver, it would be interesting to see whether those CPUs are affected as well.
Comment 14 LoneVVolf 2017-07-26 19:30:35 UTC
(In reply to Gregor Münch from comment #5)
> I've tested this now, and again it's a compiler flags issue.
> Arch standard CFLAGS are:
> CFLAGS="-march=native -O2 -pipe -fstack-protector-strong"
> 

For clarity, the default on Arch Linux is NOT -march=native.

The current pacman version is 5.0.2-2 and it has these defaults:

# ARCHITECTURE, COMPILE FLAGS
#########################################################################
#
CARCH="x86_64"
CHOST="x86_64-pc-linux-gnu"

#-- Compiler and Linker Flags
# -march (or -mcpu) builds exclusively for an architecture
# -mtune optimizes for an architecture, but builds for whole processor family
CPPFLAGS="-D_FORTIFY_SOURCE=2"
CFLAGS="-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt"
CXXFLAGS="-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt"
LDFLAGS="-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now"
#-- Make Flags: change this for DistCC/SMP systems
#MAKEFLAGS="-j2"
#-- Debugging flags
DEBUG_CFLAGS="-g -fvar-tracking-assignments"
DEBUG_CXXFLAGS="-g -fvar-tracking-assignments"

--------------------------------------------------
Comment 15 Lucas Francesco 2017-07-30 01:52:17 UTC
Created attachment 133133 [details]
lolscren

This also happens in LoL using nine (and probably in other games using nine with Wine), but setting -nobmi as a flag makes every game that uses nine fail to launch.

Do we have any hints about which part of the new code is causing this bug?
Comment 16 Gregor Münch 2017-08-06 18:54:43 UTC
(In reply to LoneVVolf from comment #14)
> For clarity, the default on Arch Linux is NOT -march=native.
> 

I'm sorry, you're right.
I took the flags from:
https://wiki.archlinux.org/index.php/makepkg#Creating_optimized_packages

I guess there are still numerous people affected by this bug and the question is what to do now.
Comment 17 Mike Lothian 2017-08-08 15:11:42 UTC
Lucas, the correct flag to disable BMI is -mno-bmi.
Comment 18 Gregor Münch 2018-01-21 19:20:43 UTC
I tried again today and I was not able to reproduce this bug.
I'm using gcc 7.2.1 20180116.

I'm closing this bug now. If someone still has the same problem, please reopen!
