Bug 95011 - LLVM failed to compile shader on TAHITI
Summary: LLVM failed to compile shader on TAHITI
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Gallium/radeonsi
Version: git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: Default DRI bug account
QA Contact: Default DRI bug account
Reported: 2016-04-18 23:35 UTC by Rafael Castillo
Modified: 2016-04-20 03:36 UTC

Attachments
Steam Log X rebirth (63.93 KB, text/plain)
2016-04-18 23:35 UTC, Rafael Castillo
R600_debug log (236.36 KB, application/x-xz)
2016-04-19 00:41 UTC, Rafael Castillo
Source games broken log (1.06 MB, application/x-xz)
2016-04-19 02:21 UTC, Rafael Castillo
Example IR triggering the bug (13.25 KB, text/plain)
2016-04-20 01:13 UTC, Michel Dänzer

Description Rafael Castillo 2016-04-18 23:35:49 UTC
Created attachment 123038 [details]
Steam Log X rebirth

Hi guys, since the GL 4.2 update (not sure whether Mesa or LLVM is to blame), several Steam games have been broken on Tahiti (R9 280).

X Rebirth and Left 4 Dead 2, so far.

LLVM triggered Diagnostic Handler: Illegal instruction detected: Operand has incorrect register class.
LLVM failed to compile shader

Full Steam log attached.

Thank you very much for your hard work.

Hardware/software details:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                8
On-line CPU(s) list:   0-7
Thread(s) per core:    2
Core(s) per socket:    4
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 60
Model name:            Intel(R) Xeon(R) CPU E3-1231 v3 @ 3.40GHz
Stepping:              3
CPU MHz:               3352.718
CPU max MHz:           3800.0000
CPU min MHz:           800.0000
BogoMIPS:              6786.56
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              8192K
NUMA node0 CPU(s):     0-7
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm xsaveopt dtherm ida arat pln pts

OpenGL vendor string: X.Org
OpenGL renderer string: Gallium 0.4 on AMD TAHITI (DRM 2.43.0, LLVM 3.9.0)
OpenGL core profile version string: 4.2 (Core Profile) Mesa 11.3.0-devel (git-3a26ef2)
OpenGL core profile shading language version string: 4.20
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile

llvm-config --version
3.9.0svn

Exports:
R600_DEBUG=sisched
STEAM_RUNTIME=0
Comment 1 Michel Dänzer 2016-04-19 00:14:58 UTC
(In reply to Rafael Castillo from comment #0)
> R600_DEBUG=sisched

Does it also happen without sisched?

Please attach the stderr output from running the game with R600_DEBUG=fs,vs,tes,tcs,gs,ps (if you're running with sisched, you can append those items to R600_DEBUG with a comma).
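
A minimal sketch of the request above, assuming the game can be started from a script: it appends the shader-dump items to whatever R600_DEBUG already holds (for example sisched) and writes stderr to a log file. The helper names and the log path are hypothetical; only the R600_DEBUG item names come from this report.

```python
import os
import subprocess

# Shader-dump items quoted in this report.
SHADER_DUMP_FLAGS = ["fs", "vs", "tes", "tcs", "gs", "ps"]


def merge_debug_flags(existing, extra):
    """Append `extra` items to a comma-separated R600_DEBUG value, skipping duplicates."""
    flags = [flag for flag in existing.split(",") if flag]
    for flag in extra:
        if flag not in flags:
            flags.append(flag)
    return ",".join(flags)


def run_with_shader_dump(command, log_path):
    """Run `command` (an argv list) with the dump items set; stderr goes to `log_path`."""
    env = dict(os.environ)
    env["R600_DEBUG"] = merge_debug_flags(env.get("R600_DEBUG", ""), SHADER_DUMP_FLAGS)
    with open(log_path, "wb") as log:
        return subprocess.run(command, env=env, stderr=log).returncode
```

With R600_DEBUG=sisched already exported, the merged value would be sisched,fs,vs,tes,tcs,gs,ps.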
Comment 2 Rafael Castillo 2016-04-19 00:41:01 UTC
Created attachment 123039 [details]
R600_debug log

Output attached as requested. Thanks very much for your time.
Comment 3 Rafael Castillo 2016-04-19 00:47:43 UTC
Yes, it seems to happen without sisched as well.

By the way, just in case: my Mesa is compiled with these flags:

CFLAGS="-march=native -mtune=native -O3 -pipe -fstack-protector-strong -fstack-check -D_GLIBCXX_USE_CXX11_ABI=1 -flto-compression-level=9 -flto=8 -ffat-lto-objects"
CXXFLAGS="-march=native -mtune=native -O3 -pipe -fstack-protector-strong -fstack-check -D_GLIBCXX_USE_CXX11_ABI=1 -flto-compression-level=9 -flto=8 -ffat-lto-objects"

This shouldn't be too relevant, but just in case (Arch Linux, latest updates):

COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-unknown-linux-gnu/5.3.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: /build/gcc-multilib/src/gcc-5-20160209/configure --prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/ --enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++ --enable-shared --enable-threads=posix --enable-libmpx --with-system-zlib --with-isl --enable-__cxa_atexit --disable-libunwind-exceptions --enable-clocale=gnu --disable-libstdcxx-pch --disable-libssp --enable-gnu-unique-object --enable-linker-build-id --enable-lto --enable-plugin --enable-install-libiberty --with-linker-hash-style=gnu --enable-gnu-indirect-function --enable-multilib --disable-werror --enable-checking=release
Thread model: posix
gcc version 5.3.0 (GCC)
Comment 4 Michel Dänzer 2016-04-19 00:49:33 UTC
(In reply to Rafael Castillo from comment #0)
> hi guys, since GL4.2 update(not sure if mesa or llvm are to blame)

FWIW, LLVM looks more likely to me. Any chance you can try rolling back Mesa to the snapshot before the problem occurred and then bisect LLVM?
Comment 5 Rafael Castillo 2016-04-19 01:02:21 UTC
I can try, but I don't compile LLVM by hand; I use lcarlier's repo. I'll see what I can do; maybe he still has some old LLVM snapshots in the repo that I can compare against.

If not, I'll try to modify the PKGBUILD to use the GitHub repo instead, since I'm allergic to SVN.

Give me a couple of days to check :)

Any tips on telling Mesa to use a custom LLVM path? I would like to avoid trashing my LLVM install if possible.
Comment 6 Michel Dänzer 2016-04-19 01:33:51 UTC
(In reply to Rafael Castillo from comment #5)
> i can try but i don't compile llvm by hand i use lcarlier's repo but i'll
> try to see what can i do, maybe he has some old llvm snapshots still in the
> repo that i can compare. 

FWIW, the first step would be to confirm that the problem occurs with old Mesa and new LLVM.


> any tip to tell mesa to use a custom llvm path?

I recommend the --with-llvm-prefix= parameter to Mesa's configure script for that.
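
As a rough sketch of that suggestion, the helper below builds the argument list for Mesa's configure script so that it points at a separately installed LLVM. The function name and the example prefix are hypothetical, and the check for bin/llvm-config under the prefix is an assumption about how the install is laid out; only --with-llvm-prefix= comes from this comment.

```python
import os


def mesa_configure_args(llvm_prefix, extra_args=()):
    """Return argv for Mesa's configure script using the LLVM under `llvm_prefix`."""
    # Assumption: a usable LLVM install ships bin/llvm-config below its prefix.
    llvm_config = os.path.join(llvm_prefix, "bin", "llvm-config")
    if not os.path.exists(llvm_config):
        raise FileNotFoundError("no llvm-config found under " + llvm_prefix)
    return ["./configure", "--with-llvm-prefix=" + llvm_prefix, *extra_args]
```

The returned list can be passed to subprocess.run from the Mesa source tree, which leaves the system-wide LLVM install untouched.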
Comment 7 Rafael Castillo 2016-04-19 01:47:01 UTC
I found it (sort of), thank god for the pacman cache :)

Everything up to LLVM r266362 is good, but the lcarlier builds from r266410 are broken, so the culprit must be among the commits in between. (My internet is turtle slow right now, so I don't have the exact commit, but that range should be small enough.)

By the way, this was tested without touching my Mesa build, so yes, it is LLVM :) Nice eye.

If I have better internet tomorrow, I'll try to pin down a precise commit in that range.
Comment 8 Rafael Castillo 2016-04-19 02:21:03 UTC
Created attachment 123041 [details]
Source games broken log

Actually, Mesa seems to be broken too, but only for Chromium and Source games (rolling back LLVM fixed X Rebirth, as in my previous post).

Debug log attached.
Comment 9 Michel Dänzer 2016-04-19 02:44:12 UTC
(In reply to Rafael Castillo from comment #8)
> Actually mesa seems to be broken too but only for chromium and source
> games(LLVM version fixed X Rebirth as previous post)

If that's with new Mesa and old LLVM, it might just be due to mismatching LLVM/Mesa snapshots, not a bug. If this problem happens with the new LLVM snapshot as well, please file a new report about it and try getting a backtrace of the crash.


> attached debug log

BTW, this contains a lot of noise due to R600_DEBUG=fs,vs,tes,tcs,gs,ps. You can remove those again.


P.S. Please don't put single files in tarballs, just compress them directly.
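
For illustration, a single log can be compressed directly with Python's standard lzma module, producing a plain .xz file with no tar wrapper. The function name is hypothetical.

```python
import lzma
import shutil


def compress_log(path):
    """Write an xz-compressed copy of `path` next to it and return the new path."""
    xz_path = path + ".xz"
    with open(path, "rb") as src, lzma.open(xz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return xz_path
```

The original file is left in place; the .xz copy is the one to attach.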
Comment 10 Michel Dänzer 2016-04-19 03:59:26 UTC
Using one of the failing shaders from the attached R600_debug log, I was able to bisect this to SVN r266378 ("AMDGPU: Run SIFoldOperands after PeepholeOptimizer").

Matt, can you take a look?
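
The bisection mentioned in this comment can be sketched as a binary search over the revision range reported in comment 7 (good at r266362, bad at r266410). This is only an illustration: the `is_bad` probe is hypothetical and stands for building LLVM at a revision and compiling one failing shader, and the sketch assumes every revision after the first bad one is also bad.

```python
def bisect_first_bad(good, bad, is_bad):
    """Return the first revision in (good, bad] for which is_bad(rev) is true.

    `good` must be a known-good revision and `bad` a known-bad one.
    """
    if good >= bad:
        raise ValueError("good revision must be older than bad revision")
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(mid):
            bad = mid   # first bad revision is mid or earlier
        else:
            good = mid  # first bad revision is after mid
    return bad
```

For a range of 48 revisions this needs at most six probes, which is why a narrow known-good/known-bad range makes the search quick.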
Comment 11 Matt Arsenault 2016-04-19 16:31:48 UTC
(In reply to Michel Dänzer from comment #10)
> Using one of the failing shaders from the attached R600_debug log, I was
> able to bisect this to SVN r266378 ("AMDGPU: Run SIFoldOperands after
> PeepholeOptimizer").
> 
> Matt, can you take a look?

Can you post the IR for the specific shader?
Comment 12 Michel Dänzer 2016-04-20 01:13:50 UTC
Created attachment 123075 [details]
Example IR triggering the bug

(In reply to Matt Arsenault from comment #11)
> Can you post the IR for the specific shader

Here you are.
Comment 13 Matt Arsenault 2016-04-20 03:36:04 UTC
Same problem, fixed by Nicolai's r266825.

