Using "volatile" in OpenCL kernels will generate an "unsupported initializer for address space" error for the combinations of:
Mesa 18.1.3 (POLARIS10, DRM 3.23.0, 4.16.0-1-amd64, LLVM 6.0.1)
Mesa 17.3.9 AMD KABINI (DRM 2.49.0 / 4.9.0-6-amd64, LLVM 5.0.1)
But works for:
Mesa 18.0.4 (POLARIS11 / DRM 3.23.0 / 4.16.13-1-MANJARO, LLVM 6.0.0)
Mesa 18.1.3 (POLARIS11, DRM 3.23.0, 4.16.18-1-MANJARO, LLVM 6.0.0)
So, this problem curiously seems to avoided by the POLARIS11 OpenCL support and not specific to the Mesa version itself.
volatile has been legal in OpenCL code at least since OpenCL 1.1 and is used in the above projects to improve register allocation (without which AMD hardware gets a performance penalty).
mesa/clover does not touch the clc code. that is processed by clang/llvm.
can you run with CLOVER_DEBUG=llvm,native CLOVER_DEBUG_FILE=dump-file and post the files?
My home system has an RX580 (Polaris10).
I just cloned/built CLBlast and it appears to be running ./clblast_tuner_xgemm (mentioned as the specific failing case in https://github.com/CNugteren/CLBlast/issues/298) successfully, so I don't know that the issue is polaris10 specific.
In my case, I've got Mesa 18.2.0-b21b38c46cd61d9 (latest upstream as of the last hour or so), libclc r335280 (latest revision since late June), and llvm 7.0.0-svn as of r337934 (a few minutes ago).
I did notice that the LLVM version for both failing cases is different than the passing ones, so I went and downgraded to llvm 6.0.1... but it still works.
w/ LLVM 6.0.1 (first section of 578 tests):
Found best result 1.43 ms: 1505.0 GFLOPS
w/ LLVM 7.0.0svn:
Found best result 1.50 ms: 1428.2 GFLOPS
I'd agree with Jan that a dump of the dumped llvm bitcode would be useful. Also, it may be interesting to try to upgrade libclc or mesa to the latest upstream code to see if one of those has an effect.
You can see in my original comment that there are passes with LLVM 6.0.0 and failures for LLVM 6.0.1. So indeed, it can't be LLVM either.
I'm not sure if it's possible for libclc to be different between those, but it looks like it must be?
I asked one of the people suffering from this bug if he could provide the requested output. The affected distro appears to be Debian, so maybe we're looking at the version of this: https://packages.debian.org/sid/libclc-amdgcn
My original theory (before I tested 6.0.1 myself) was that the spectre mitigation changes that went into 6.0.1 broke something about compilation of CL kernels, but that doesn't seem to have been the case (I did go through the entire commit history of 6.0.0->6.0.1 to try to identify things that looked suspicious).
I do run Ubuntu at home, so it would be feasible for me to at least install the libclc version they've got and give it a spin. LLVM/Mesa might be a bit more work, but if we run out of other options, I might give it a try. I'd have to downgrade my whole stack, but it's possible they're compiling with different flags/features which changes behavior somehow.
I think for now, it's worth giving the original reporter a bit of time to try to dump the bitcode and get us a bit more info that might help us reproduce/diagnose this since it seems there's something version/distro specific possibly at play. It might be useful to know the/an exact build/tag/revision that is failing for CLBlast as well, just to eliminate that. I'm assuming that the latest git master is broken, but feel free to tell me otherwise.
And to answer the implied question: libclc supports multiple LLVM versions, and we've supported LLVM 6.0.x in libclc for a while now (and still build against 3.9 - 7.0.0svn).
I reverted llvm to the 6.0.0 version packaged in the padoka ppa (from my 7.0svn build) and libclc to the package manager's version and managed to reproduce the failure.
mesa..: 18.2-dev (b21b38c46cd)
Since mesa hadn't changed and LLVM didn't seem to be the issue, I upgraded libclc to the current upstream revision (still built using llvm 6.0.0), and it started working.
Figuring that debian's version included the git checkout date, I checked out the latest code as of both Mar 8 and Mar 12, but both of those worked as well.
So, I checked out the debian sources and re-built their .debs on my system, those failed.
They've patched configure.py in libclc to add in some debian-specific build flags. When I copied those changes to upstream libclc as of 20180312 it started failing in the same way. Those same changes also break the current upstream libclc source.
The specific lines that debian have patched libclc's configure.py with that break things seem to be:
# add in debian build flags
proc_cflags = Popen(("dpkg-buildflags","--get","CFLAGS"), stdout=PIPE)
clang_bc_flags = " " + proc_cflags.communicate().strip("\n")
Where the output of 'dpkg-buildflags --get CFLAGS' for me is:
-g -O2 -fdebug-prefix-map=/home/awatry/src/libclc=. -fstack-protector-strong -Wformat -Werror=format-security
Something in there is breaking the bitcode compilation for libclc in debian.
Passing "-fstack-protector-strong" into clang_bc_flags is what's breaking libclc for debian.
Is this LLVM miscompiling the library, or LLVM trying to do stack protection on AMD GCN code?
For background, this is the CFLAGS that Debian passes when compiling all C programs. From the GCC man page:
Emit extra code to check for buffer overflows, such as stack smashing
attacks. This is done by adding a guard variable to functions with
vulnerable objects. This includes functions that call "alloca", and
functions with buffers larger than 8 bytes. The guards are initialized when
a function is entered and then checked when the function exits. If a guard
check fails, an error message is printed and the program exits.
Like -fstack-protector but includes additional functions to be protected ---
those that have local array definitions, or have references to local frame
Of course GPUs are different so this may not be appropriate. I'm not familiar with how OpenCLs work so please confirm what the most appropriate solution is for Debian, either:
- append -fno-stack-protector to the Debian flags, if the other flags are appropriate/relevant here
- omit all the flags completely and leave clang_bc_flags alone, if this is such a special thing that Debian's system-policy CFLAGS should be completely ignored
I guess a third possibility is that stack protectors are actually relevant for GPUs but Clang/LLVM is not generating correct code for those in this case.
GPUs don't really have a stack (not for data anyway) and AMDGCN backend currently inlines all function calls anyway.
I'm not sure what kind of checks the flag adds.
If anyone can upload the different libclc bitcode it should be easy to spot.
My guess would be that it adds some initialized global variables used in internal checks. This is illegal in CLC. only variables in constant address space can have initializers.
IMO Debian should not be arbitrarily adding compilation flags unless they know what they're doing and they have tested the resulting package.
local variables are stored in private address space which is backed either by register file, or private buffers.
using volatile, to control where it is located, is a rather hacky workaround of suboptimal register allocation/instruction scheduling.
Reporting a llvm bug with a reproducer can help. it'd need to be reproducible using llvm-7, there are no further releases of llvm-5 or 6 planned.
So the bugs are:
1) Debian adding flags even if they don't make sense (i.e. stack protection on GPU code). Should be filed in Debian against libclc.
2) LLVM respecting that flag even if it shouldn't. Should be checked if it still happens with LLVM 7 and filed against LLVM if so.
Created attachment 140907 [details]
clover debug dump
Attached a dump, it's 1 out of several thousand files and compressed is 11MB.
Did you get a chance to analyze my dump?
I filed a LLVM bug here https://bugs.llvm.org/show_bug.cgi?id=38979
Didn't yet get a chance to test it with LLVM 7 yet but I didn't manage to find any related bugs in their tracker so perhaps nobody fixed it yet.
(In reply to infinity0 from comment #14)
> Did you get a chance to analyze my dump?
> I filed a LLVM bug here https://bugs.llvm.org/show_bug.cgi?id=38979
> Didn't yet get a chance to test it with LLVM 7 yet but I didn't manage to
> find any related bugs in their tracker so perhaps nobody fixed it yet.
sorry for the late reply. given the already strained time/resources, this is not a priority. The fix to this bug is for Debian to stop modifying libclc cflags.
There's a fix for this in the Debian experimental package for libclc. If you haven't already done so and this bug affects you, please test and give feedback on the Debian bug.
Pkg change detail: https://tracker.debian.org/news/994503/accepted-libclc-020git20180917-2-source-into-experimental/
Debian bug report: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=904718