Bug 82717

Summary: OpenCL support for mandelbulber-opencl
Product: Mesa
Reporter: Christoph Haag <haagch>
Component: Gallium/StateTracker/Clover
Assignee: mesa-dev
Status: RESOLVED MOVED
QA Contact: mesa-dev
Severity: normal
Priority: medium
CC: frederic.romagne
Version: git
Hardware: Other
OS: All
Whiteboard:
i915 platform:
i915 features:
Bug Depends on: 87738
Bug Blocks: 99553
Attachments: Add -cl-denorms-are-zero to clang

Description Christoph Haag 2014-08-17 00:56:08 UTC
Website of the program here: https://sites.google.com/site/mandelbulber/ (Watch out, it uses a weird install script)

Start it, go to the OpenCL tab and try to enable OpenCL.

A few things that probably need to be addressed. I applied some workarounds to see them all and I don't even know if the OpenCL programs would work after removing all of that stuff.


Problem 1:
OpenCL Build log:
ERROR: Program::build() (-43)

That's because of the build parameter -cl-denorms-are-zero that is hardcoded in the source code of mandelbulber (-43 is CL_INVALID_BUILD_OPTIONS).

Workaround 1:
sed -i 's/-cl-denorms-are-zero / /g' src/cl_support.cpp
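
Instead of patching the source with sed, the application could drop the offending flag from its options string and retry when the build fails with CL_INVALID_BUILD_OPTIONS. A minimal sketch of the string handling only, assuming the options are plain whitespace-separated tokens; remove_build_option is a hypothetical helper, not something mandelbulber or Mesa provides:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not part of mandelbulber): returns `options` with every
// whitespace-separated token equal to `flag` removed. Remaining tokens are
// re-joined with single spaces.
std::string remove_build_option(const std::string& options, const std::string& flag) {
    std::istringstream tokens(options);
    std::string token;
    std::string result;
    while (tokens >> token) {
        if (token == flag) {
            continue;  // skip the unsupported option
        }
        if (!result.empty()) {
            result += ' ';
        }
        result += token;
    }
    return result;
}
```

Splitting on whitespace would break options that carry quoted paths containing spaces, so this only fits simple option strings.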


Problem 2:
OpenCL Build log:       input.cl:34:1: error: OpenCL does not support the 'static' storage class specifier

So that's not valid in OpenCL 1.1, but for the sake of getting it to "just compile" at least:

Workaround 2:
for i in /usr/share/mandelbulber/cl/*; do sed -i "s/static //g" "$i"; done


Problem 3:
OpenCL Build log:       input.cl:323:10: error: cannot combine with previous 'type-name' declaration specifier
input.cl:323:15: error: expected identifier or '('
input.cl:324:8: error: expected identifier or '('
input.cl:325:32: error: expected expression
�

(WTF is random binary data doing here?)

That's a weird one that seems to come from whatever clang does. I think clang thinks that "half" is already defined in the OpenCL kernels. Not sure if this is intended.

Workaround 3:
for i in /usr/share/mandelbulber/cl/*; do sed -i "s/half /halfFOO /g" "$i"; done
for i in /usr/share/mandelbulber/cl/*; do sed -i "s/half)/halfFOO)/g" "$i"; done
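
The two sed patterns above match plain substrings, so they would also rewrite a name that merely ends in "half" and would miss uses followed by ";" or ",". A whole-word rename is a bit more robust. A rough sketch, assuming the errors really do come from "half" being used as an identifier; rename_identifier is a made-up helper for illustration:

```cpp
#include <regex>
#include <string>

// Hypothetical helper: renames whole-word occurrences of `name` in a piece of
// OpenCL C source text.
// Assumption: `name` and `replacement` consist of identifier characters only
// (letters, digits, underscore), so neither needs escaping for std::regex.
std::string rename_identifier(const std::string& source,
                              const std::string& name,
                              const std::string& replacement) {
    // \b matches at a word boundary, so "behalf" and "half3" are left alone.
    const std::regex whole_word("\\b" + name + "\\b");
    return std::regex_replace(source, whole_word, replacement);
}
```

Like the sed version, this is purely textual: it also touches matches inside comments and string literals, and it would rename genuine uses of the half type.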


Problem 4:
OpenCL Build log:       �}�
OpenCL program built done
OpenCL kernel opened
OpenCL workgroup size: 256
OpenCL Job size: 480256
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc

I don't think the crash itself is the fault of mesa opencl, because the backtrace looks like this:

#0  0x00007fd1154e2d67 in raise () from /usr/lib/libc.so.6
#1  0x00007fd1154e4118 in abort () from /usr/lib/libc.so.6
#2  0x00007fd115dd81f5 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/libstdc++.so.6
#3  0x00007fd115dd6076 in ?? () from /usr/lib/libstdc++.so.6
#4  0x00007fd115dd60c1 in std::terminate() () from /usr/lib/libstdc++.so.6
#5  0x00007fd115dd62d8 in __cxa_throw () from /usr/lib/libstdc++.so.6
#6  0x00007fd115dd6869 in operator new(unsigned long) () from /usr/lib/libstdc++.so.6
#7  0x00007fd115dd68c9 in operator new[](unsigned long) () from /usr/lib/libstdc++.so.6
#8  0x00000000004a3683 in CclSupport::InitFractal (this=0x207e200) at ../src/cl_support.cpp:451
#9  0x000000000042cffb in ChangedOpenClEnabled (widget=0x22bdd60, data=0x0) at ../src/callbacks.cpp:2896
#10 0x00007fd11673b3d8 in g_closure_invoke () from /usr/lib/libgobject-2.0.so.0
#11 0x00007fd11674cd5d in ?? () from /usr/lib/libgobject-2.0.so.0

But why is there random binary data in the OpenCL build log?

llvm-svn 215809
libclc-git 165.20140816
opencl-headers12 1:1.2.r26859 (not sure if relevant)
mesa git from today

radeon 7970M pitcairn
Comment 1 Christoph Haag 2014-08-17 08:26:07 UTC
Small additions:

With these patches the binary garbage from the build log is gone:
http://patchwork.freedesktop.org/patch/31309/
http://patchwork.freedesktop.org/patch/31311/
http://patchwork.freedesktop.org/patch/31310/
So this is already fixed.

Problem 4 is not related to mesa I think. The problem is that there is a huge buffer because CL_DEVICE_MAX_COMPUTE_UNITS is so big.

Curiously CL_DEVICE_MAX_COMPUTE_UNITS gets increased by 100 every time my gpu wakes up from runpm. I have reduced it to this program:

#include <CL/cl.hpp>
#include <iostream>
#include <vector>
int main() {
  cl_int err = CL_SUCCESS;
  cl_uint numberOfComputeUnits = 0;  // CL_DEVICE_MAX_COMPUTE_UNITS is a cl_uint
  std::vector<cl::Platform> platformList;
  cl::Platform::get(&platformList);
  // Use the first platform and create a context covering its GPU devices.
  cl_context_properties cprops[3] = { CL_CONTEXT_PLATFORM, (cl_context_properties) (platformList[0])(), 0 };
  cl::Context context(CL_DEVICE_TYPE_GPU, cprops, NULL, NULL, &err);
  std::vector<cl::Device> devices = context.getInfo<CL_CONTEXT_DEVICES>();
  // Query the first GPU device for its compute unit count.
  devices[0].getInfo(CL_DEVICE_MAX_COMPUTE_UNITS, &numberOfComputeUnits);
  std::cout << "OpenCL Number of compute units: " << numberOfComputeUnits << std::endl;
  return 0;
}

Every time my gpu goes to sleep and wakes up this increases by 100.
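
If the allocation at cl_support.cpp:451 scales with the reported compute unit count (an assumption based on the backtrace and the explanation above; the actual sizing code is not shown here), a guard along these lines would turn a bogus count into a clean failure instead of an uncaught std::bad_alloc. All names and parameters are hypothetical:

```cpp
#include <cstddef>
#include <limits>

// Hypothetical guard: computes compute_units * items_per_unit * bytes_per_item.
// Returns 0 if any factor is 0, if the product would overflow std::size_t,
// or if it exceeds the caller-chosen cap `max_bytes`.
std::size_t checked_buffer_bytes(std::size_t compute_units,
                                 std::size_t items_per_unit,
                                 std::size_t bytes_per_item,
                                 std::size_t max_bytes) {
    const std::size_t limit = std::numeric_limits<std::size_t>::max();
    if (compute_units == 0 || items_per_unit == 0 || bytes_per_item == 0) {
        return 0;
    }
    if (compute_units > limit / items_per_unit) {
        return 0;  // first multiplication would overflow
    }
    const std::size_t items = compute_units * items_per_unit;
    if (items > limit / bytes_per_item) {
        return 0;  // second multiplication would overflow
    }
    const std::size_t bytes = items * bytes_per_item;
    return bytes <= max_bytes ? bytes : 0;
}
```

A caller would treat a return value of 0 as "refuse to enable OpenCL" and report the suspicious device value rather than calling new[].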

I have reduced numberOfComputeUnits in the code to 1.
Then it finally correctly says "clCreateImage() not supported by OpenCL 1.1."
So this will then have to wait for OpenCL 1.2.

But to me it seems these three things should be addressed for 1.0/1.1 already:
* "-cl-denorms-are-zero"
* "cannot combine with previous 'type-name' declaration specifier" (at least get a better error message into clang)
* CL_DEVICE_MAX_COMPUTE_UNITS (??)
Comment 2 Christoph Haag 2014-08-19 09:34:04 UTC
(In reply to comment #1)
> * CL_DEVICE_MAX_COMPUTE_UNITS (??)

Fixed in the kernel: https://bugzilla.kernel.org/show_bug.cgi?id=82581
Comment 3 Tom Stellard 2014-08-19 18:29:15 UTC
Created attachment 104907 [details] [review]
Add -cl-denorms-are-zero to clang

This patch adds the -cl-denorms-are-zero flag to clang, which should fix one of your errors.
Comment 4 Tom Stellard 2014-08-19 19:58:56 UTC
> 
> Problem 3:
> OpenCL Build log:       input.cl:323:10: error: cannot combine with previous
> 'type-name' declaration specifier
> input.cl:323:15: error: expected identifier or '('
> input.cl:324:8: error: expected identifier or '('
> input.cl:325:32: error: expected expression
> �
> 

This is a bug in mandelbulber-opencl.  It's using half as a variable name, which is not allowed since half is also the name of a type.
Comment 5 Christoph Haag 2014-08-21 18:40:07 UTC
(In reply to comment #4)

Thanks, they fixed it upstream: https://code.google.com/p/mandelbulber/source/detail?r=451

That went quicker than I thought.

"Only" OpenCL 1.2 is missing then I guess.
Comment 6 Tom Stellard 2014-08-22 17:09:14 UTC
(In reply to comment #5)
> (In reply to comment #4)
> 
> Thanks, they fixed it upstream:
> https://code.google.com/p/mandelbulber/source/detail?r=451
> 
> That went quicker than I thought.
> 
> "Only" OpenCL 1.2 is missing then I guess.

Does it require any other OpenCL 1.2 features besides the static keyword on functions?  If not then they should just drop the keyword when using a 1.1 implementation.

Have all the issues with this program been fixed?
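
Dropping the keyword only on a 1.1 implementation needs a version check first. The CL_DEVICE_VERSION string has the form "OpenCL <major>.<minor> <vendor-specific information>", so a sketch of the check could look like this; device_supports_cl is a hypothetical name, and how the application then strips static from its kernels is left open:

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: returns true if a CL_DEVICE_VERSION string reports at
// least OpenCL want_major.want_minor. Strings that do not start with
// "OpenCL <major>.<minor>" are treated as unsupported.
bool device_supports_cl(const std::string& version, int want_major, int want_minor) {
    int major = 0;
    int minor = 0;
    if (std::sscanf(version.c_str(), "OpenCL %d.%d", &major, &minor) != 2) {
        return false;  // unparsable version string
    }
    return major > want_major || (major == want_major && minor >= want_minor);
}
```

With this, the kernels would keep static when device_supports_cl(version, 1, 2) is true and have it removed otherwise.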
Comment 7 Christoph Haag 2014-08-22 17:51:45 UTC
(In reply to comment #6)

> Does it require any other OpenCL 1.2 features besides the static keyword on
> functions? 

Yes. The next one is clCreateImage I believe (not at the PC right now).
Mesa has a stub for it that tells the user it is not supported in OpenCL 1.1.

I haven't gotten around to testing the denorms parameter yet, but assuming it works I don't think there's anything to do for OpenCL 1.1.

So close it until there's opencl 1.2 or keep it open and update whether something it uses is implemented, however you prefer it. But it's not very important, so if you have something better to do, please don't go out of your way to support it.
Comment 8 Christoph Haag 2017-03-22 16:43:21 UTC
To document the current state...

Recently I was getting "unsupported call to function get_global_id" so I assumed there was some llvm problem. Turns out if this happens you just need to rebuild libclc for your llvm version.

With these patches mandelbulber-opencl sorta works:
https://cgit.freedesktop.org/~funfunctor/mesa/log/?h=clover-image-support-enabled

When I first tested the patches, I made this video:
https://www.youtube.com/watch?v=-R-r0CEub74
The rendering actually looks close to how OpenCL from amdgpu-pro renders.

Today it sorta works, but looks worse. Some comparison screenshots:

Default view:
clover: https://i.imgur.com/kct6anR.png
amdgpu-pro: https://i.imgur.com/1FIQ07m.jpg

A little bit zoomed in:
clover: https://i.imgur.com/5SmwL6Q.png
amdgpu: https://i.imgur.com/CcPwcpl.png

Possibly it's just an image format mismatch, see
https://cgit.freedesktop.org/~funfunctor/mesa/commit/?h=clover-image-support-enabled&id=894fb7c558e83534855516b499bf66b33397e1ac
Comment 9 GitLab Migration User 2019-09-18 17:55:36 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/129.
