Hi, This morning I recompiled Mesa and found that OpenCL support was broken. I have managed to bisect the regression back to commit cc3aeac ( http://cgit.freedesktop.org/mesa/mesa/commit/?id=cc3aeacab64a6928a903f1dbfeaa7c880a8de5a6 ). Strangely, that commit has nothing to do with Clover. I am using Arch Linux with kernel 3.13.4 and an AMD HD5470. There is nothing interesting in dmesg or the Xorg logs. If I can give you any more information, just ask.
Hi Bruno, What do you mean by "broken" here? If you're talking about a compilation problem, take a look at bug 75356, which has a patch to resolve it. If you are having a different problem, let us know what it is :)
Hi Emil,

No, it's not a compilation error, neither for Mesa nor for the OpenCL code. It's just that OpenCL programs crash with segfaults. Every test from http://cgit.freedesktop.org/~tstellar/opencl-example/ fails, and its 'hello_world' program crashes with a segfault.

As the code changed in that bug has nothing to do with Clover, maybe the problem is with my configuration? Here's what I pass to autogen.sh. Surely there's something in there I don't need, but I took the options from a PKGBUILD:

    --prefix=/usr \
    --sysconfdir=/etc \
    --with-dri-driverdir=/usr/lib/xorg/modules/dri \
    --with-gallium-drivers=r600,swrast \
    --with-dri-drivers=swrast \
    --enable-gallium-llvm \
    --enable-egl \
    --enable-gallium-egl \
    --with-egl-platforms=x11,drm,wayland \
    --enable-shared-glapi \
    --enable-gbm \
    --enable-gallium-gbm \
    --enable-glx-tls \
    --enable-dri \
    --enable-glx \
    --enable-osmesa \
    --enable-texture-float \
    --enable-xa \
    --enable-vdpau \
    --enable-omx \
    --with-llvm-shared-libs \
    --enable-opencl --enable-opencl-icd \
    --with-clang-libdir=/usr/lib

If there's anything else I can do to help, just ask. Thanks!
Strange, I do not see how that commit could cause anything other than compilation issues. FWIW, it might be worth double-checking that the bisect went fine and attaching a backtrace of the segfault.
I am also very surprised that this commit seems to be where it starts. I did the bisect by building Arch packages, installing them and then testing them. So, unless I have missed something, which is also possible, that's it.

I have recompiled at commit cc3aeac with debug information, but for some strange reason gdb doesn't want to step into the OpenCL functions.

Here's what I have worked out:

- The segfault itself comes from an fprintf with a "%s" and a null pointer. It can be solved by just adding a default case to 'clUtilErrorString'.

- The real problem happens with 'clGetPlatformIDs', which returns an error value of '-1001'.

I have also managed to trigger a 'CL_INVALID_VALUE' return, and tried various combinations of parameters to see if anything changed, but it always seems to be one error or the other.

I have checked the code at mesa/src/gallium/state_trackers/clover/api/platform.cpp (where clGetPlatformIDs is) and have no clue how that can be possible.

Sorry if this isn't enough information, but I am completely clueless about what can be happening.

I will check my packages again in case I built one version and labelled it as another.

If I can help with anything else, just ask.
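To illustrate the first point, here is a minimal sketch of an error-string helper that always returns a printable string. It is not the actual 'clUtilErrorString' from 'cl_util.c'; the names 'cl_error_string' and 'report_cl_error' are hypothetical, and the error values are copied in by hand so the sketch builds without the OpenCL headers. As far as I know, -1001 corresponds to CL_PLATFORM_NOT_FOUND_KHR from the cl_khr_icd extension, which an ICD loader returns when it finds no usable platform.

```c
#include <stdio.h>

/* Values copied from the OpenCL headers so the sketch compiles without them. */
#define CL_SUCCESS                  0
#define CL_DEVICE_NOT_FOUND        -1
#define CL_INVALID_VALUE           -30
#define CL_INVALID_PLATFORM        -32
#define CL_PLATFORM_NOT_FOUND_KHR  -1001  /* cl_khr_icd: no usable platform */

/* Hypothetical stand-in for clUtilErrorString: never returns NULL. */
const char *cl_error_string(int err)
{
    switch (err) {
    case CL_SUCCESS:                return "CL_SUCCESS";
    case CL_DEVICE_NOT_FOUND:       return "CL_DEVICE_NOT_FOUND";
    case CL_INVALID_VALUE:          return "CL_INVALID_VALUE";
    case CL_INVALID_PLATFORM:       return "CL_INVALID_PLATFORM";
    case CL_PLATFORM_NOT_FOUND_KHR: return "CL_PLATFORM_NOT_FOUND_KHR";
    default:                        return "unknown OpenCL error";
    }
}

/* Hypothetical helper: report a failed call, including the numeric code. */
int report_cl_error(FILE *out, const char *call, int err)
{
    return fprintf(out, "%s failed: %s (%d)\n", call, cl_error_string(err), err);
}
```

Passing a null pointer for "%s" is undefined behaviour in C, so the default case is what keeps the error path from crashing; printing the numeric code as well means an unrecognised value such as -1001 is still visible in the output.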
(In reply to comment #4)
> I am also very surprised of what commit seems to start this. I have done the
> bisect making Arch packages, installing and then testing them. So, unless I
> have missed something, which is also possible, that's it.
>
> I have recompiled at commit cc3aeac with debug information, but for some
> strange reason, gdb doesn't want to step into OpenCL functions.
>
> Here's what I have guessed:
>
> - Actually, the segfault comes from a fprintf with a "%s" and a null
> pointer. It can be solved by just adding a default case to
> 'clUtilErrorString'.
>
> - The real problem happens with 'clGetPlatformIDs', which returns an error
> value of '-1001'.
>
> I have triggered the return of 'CL_INVALID_VALUE', and tried various
> combinations of parameters to see if it changed anything. And seems to be
> one thing or the other.
>
> I have checked the code at
> mesa/src/gallium/state_trackers/clover/api/platform.cpp (where
> clGetPlatformIDs is) and have no clue how it can be possible.
>
> Sorry if this isn't enough information, but I am completely clueless of what
> can be happening.
>
> I will check again my packages to see if I have compiled some version and
> have called it other.
>
> If I can help with anything else, just ask.

Most likely you're getting that segfault somewhere in the ICD loader because it's unable to load Mesa's ICD library. I guess that this hunk:

+if NEED_WINSYS_XLIB
+AM_CPPFLAGS += -DHAVE_WINSYS_XLIB
+endif

pulls in the XLIB pipe-loader back-end that was previously ifdef-ed out in Clover builds, leading to undefined symbols in the resulting library.
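One way to test that theory outside the ICD loader is to open the library directly and ask the dynamic linker to resolve every symbol up front. The sketch below is an assumption about how one might check this, not something taken from Mesa or from any ICD loader; 'check_library_loads' is a hypothetical helper, and on older glibc versions it needs to be linked with -ldl.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Try to load a shared library with all symbols resolved immediately.
 * Returns NULL on success, or the dynamic linker's error message on failure.
 */
const char *check_library_loads(const char *path)
{
    void *handle;

    dlerror();  /* clear any stale error state */

    /* RTLD_NOW makes unresolved function symbols fail here, not at call time. */
    handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        const char *msg = dlerror();
        return msg != NULL ? msg : "dlopen failed without a message";
    }

    dlclose(handle);
    return NULL;
}
```

Calling it with the path of the installed OpenCL library (for example /usr/lib/libMesaOpenCL.so, which is an assumed location for this setup) and printing a non-NULL result should name the first symbol that cannot be resolved, if there is one.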
(In reply to comment #5)
>
> Most likely you're getting that segfault somewhere in the ICD loader because
> it's unable to load Mesa's ICD library. I guess that this hunk:
>
> +if NEED_WINSYS_XLIB
> +AM_CPPFLAGS += -DHAVE_WINSYS_XLIB
> +endif
>
> pulls in the XLIB pipe-loader back-end that was previously ifdef-ed out in
> Clover builds, leading to undefined symbols in the resulting library.

Would that not cause the build/link to fail? Hmm, I guess not, since the opencl target is missing -no-undefined.

Francisco,
Is there any particular reason why we do not use -no-undefined for opencl?

Bruno,
Feel free to grab the patch from bug 75356, which should handle the symbol problems, and continue from there.
Hi Francisco, The segfaults happened because 'clGetPlatformIDs' returned a strange error (-1001) that 'clUtilErrorString' (from 'cl_util.c') does not handle. The function therefore returned nothing, and fprintf segfaulted when it tried to print the result. Emil, I'll try that patch as soon as I can. Thanks!
Hi, I'm afraid that patch doesn't help. I have also tried the patch you sent to the mailing list ( http://lists.freedesktop.org/archives/mesa-dev/2014-February/054780.html ), but that did not help either. If there's anything else I can do, just ask. Thanks!
(In reply to comment #8)
> Hi,
>
> I'm afraid that that patch doesn't help. I have also tried the patch you
> have sent to the Mailing List (
> http://lists.freedesktop.org/archives/mesa-dev/2014-February/054780.html )
> but also nothing.
>

Interesting, the patch you've linked should have caused build breakage, as there is yet another missing symbol/reference :\

I just pushed a few patches that should resolve the missing symbols within pipe-loader, which is used by opencl. Check out the latest master and give it a try.
Hi, The latest master branch works perfectly. Thanks a lot!