Created attachment 124469 [details]
Histogram kernel backtrace
If I compile ImageMagick with OpenCL support, the current version segfaults when I attempt to do any operation which computes a histogram. This is actually done frequently internally, so it's more crippling than it first appears.
1. Get the current latest ImageMagick from GitHub: https://github.com/ImageMagick/ImageMagick
2. Compile it with OpenCL support (--enable-opencl flag)
3. Get an image, and try to equalize its colors (based on the histogram):
convert input.png -equalize output.png
4. The segfault will occur
LLVM version: r272184
Mesa version: commit d5491a8
ImageMagick version: ade0d8e
The source code for the Histogram kernel is in MagickCore/accelerate-kernels-private.h, starting on line 1283.
That's because global atomics are not implemented in LLVM backend (despite driver advertising the extensions).
Shouldn't be too hard to them.
(In reply to jano.vesely from comment #1)
> That's because global atomics are not implemented in LLVM backend (despite
> driver advertising the extensions).
> Shouldn't be too hard to them.
Are you sure the hardware has the necessary instructions? Tom Stellard commented on #radeon back in November that implementing global atomics for r600g will be very very difficult.
I suspected it was that, based on the word "Atomic" being at fault.
Unfortunately, I cannot assist, as I'm not an expert on parallel architecture and spending the last half hour reading AMD's documentation won't make me one. But I would ask one question: is anyone even working on this? I'd hate to write off ImageMagick on this platform.
The instructions are there for EG+. They're tricky to implement because of the register requirements. However, there might be some weird interaction when using CB with the current stores, and loads from the same WI.
I had a version that passed noret variant of global atomic piglit tests, but I'm not sure piglit tests cover all situations.
Can you dig the patch somewhere? I have a Caicos at hand and I would really like to test atomic compare-and-swap.
(In reply to Vedran Miletić from comment #5)
> Can you dig the patch somewhere? I have a Caicos at hand and I would really
> like to test atomic compare-and-swap.
you can try my github branch , you'll need the last 5 patches (2-3 can be skipped since it's just revert).
but I rebase a lot so things might brake.
It passes noret piglit tests, and _noret should work OK, _return variants will give LLVM ERROR: Cannot select.
uncommenting the FIXME parts enables return variants, but it will brake for more than one WI as every WI uses address 0 as scratchpad.
Normally, I would be asking in this comment if there is any progress on this, but I can tell there has been... change, at least.
Now instead of crashing, it deadlocks the calling process waiting on a mutex. Forever, as far as I can tell.
LLVM version: r289770
Mesa version: commit 47351b843a
Perhaps this is more easily fixed? Or is this getting to the heart of the matter implied in comment 2?
(In reply to nixscripter from comment #7)
> Normally, I would be asking in this comment if there is any progress on
> this, but I can tell there has been... change, at least.
> Now instead of crashing, it deadlocks the calling process waiting on a
> mutex. Forever, as far as I can tell.
> LLVM version: r289770
> Mesa version: commit 47351b843a
> Perhaps this is more easily fixed? Or is this getting to the heart of the
> matter implied in comment 2?
the hang is probably a separate bug. ImageMagick test suite results on my Turks GPU are:
# TOTAL: 86
# PASS: 78
# SKIP: 0
# XFAIL: 0
# FAIL: 3
# XPASS: 0
# ERROR: 5
the errors and failures are accompanied by:
Assertion `i < getNumRegs() && "Register number out of range!"' failed.
The patches to add support for noret atomics (should be good enough for histogram) are at:
I have downloaded LLVM r290690 which contains your first diff, manually applied your 2nd diff, and compiled it. The resulting library seems to resolve the issue.
Once that patch is reviewed and applied to the mainline, I will do one final verification and (hopefully) close this bug.
Thanks for all your efforts on this!
(In reply to nixscripter from comment #9)
> I have downloaded LLVM r290690 which contains your first diff, manually
> applied your 2nd diff, and compiled it. The resulting library seems to
> resolve the issue.
> Once that patch is reviewed and applied to the mainline, I will do one final
> verification and (hopefully) close this bug.
> Thanks for all your efforts on this!
I have just landed the patch. Note that if the kernel uses sub-int(short,char) vectors. it will generate incorrect code until D27964 (or alternative) is merged.
I have just tried LLVM r292714, and it does indeed work as advertised. Marking RESOLVED FIXED.