Bug 102009 - [clover, amdgcn] Blender crashes when compiling OpenCL kernel
Status: RESOLVED MOVED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Gallium/StateTracker/Clover
Version: git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: mesa-dev
QA Contact: mesa-dev
Blocks: 99553
Reported: 2017-08-01 22:22 UTC by Markus
Modified: 2019-09-18 17:56 UTC (History)
Attachments
Blender crash dump (1.36 KB, text/plain), 2017-08-01 22:22 UTC, Markus
glxinfo (100.94 KB, text/plain), 2017-08-01 22:23 UTC, Markus
clinfo (5.66 KB, text/plain), 2017-08-01 22:24 UTC, Markus
gdb Stacktrace - Ubuntu 17.10 (2.55 KB, text/plain), 2017-10-01 15:47 UTC, Markus
glxinfo - Ubuntu 17.10 (51.88 KB, text/plain), 2017-10-01 15:50 UTC, Markus
clinfo - Ubuntu 17.10 (5.70 KB, text/plain), 2017-10-01 15:50 UTC, Markus

Description Markus 2017-08-01 22:22:43 UTC
Created attachment 133182 [details]
Blender crash dump

On Ubuntu 17.04 with an AMD Radeon R9 380 (Tonga Pro chipset) and daily Mesa from the Padoka PPA (version 1:17.3~git170731230100.df61a05~z~padoka0), both Blender 2.78 and the 2.79 daily build (09eac0159db) crash when the OpenCL kernel is compiled for GPU rendering.


Steps to reproduce:
1. Start Blender with `CYCLES_OPENCL_SPLIT_KERNEL_TEST=1 ./blender`. The variable is needed because Blender does not otherwise detect OpenCL support (probably due to #101594).
2. Under Render settings, switch `Device` to `GPU Compute` and confirm that this setting is actually used.
3. Start a render and confirm that the OpenCL kernel is compiled.

Actual behavior:
* Blender crashes during kernel compilation (crash dump of 2.79 build is attached).

Expected behavior:
* Blender renders with OpenCL.

Additional info:
* `clinfo` and `glxinfo` output is attached as well.
Comment 1 Markus 2017-08-01 22:23:30 UTC
Created attachment 133183 [details]
glxinfo
Comment 2 Markus 2017-08-01 22:24:21 UTC
Created attachment 133184 [details]
clinfo
Comment 3 Jan Vesely 2017-09-14 22:04:29 UTC
The stacktrace does not say much, and it's not similar to the segfault I see on my machine. Can you repost the stacktrace with Mesa debug information?

thanks
Comment 4 Markus 2017-10-01 15:47:51 UTC
Created attachment 134599 [details]
gdb Stacktrace - Ubuntu 17.10
Comment 5 Markus 2017-10-01 15:50:02 UTC
Created attachment 134600 [details]
glxinfo - Ubuntu 17.10
Comment 6 Markus 2017-10-01 15:50:21 UTC
Created attachment 134601 [details]
clinfo - Ubuntu 17.10
Comment 7 Markus 2017-10-01 15:58:16 UTC
I tried to obtain debug information from Mesa but was unable to do so (i.e. starting Blender with `MESA_DEBUG=context CYCLES_OPENCL_SPLIT_KERNEL_TEST=1 ./blender` did not generate any visible debug information).

What I did instead, is run Blender via gdb which then gave the attached stack trace at the time of crash. Looking at the trace, it appears like the crash is inside LLVM.

What is the best way to debug this further?

P.S.: As I just updated to Ubuntu 17.10 (Beta 2), I've also attached new glxinfo and clinfo output.
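
The gdb run described above can be sketched as a small shell helper. This is only an illustration, not the exact invocation used for the attached trace: the function name `gdb_backtrace` and the `GDB` override variable are hypothetical, and it assumes gdb is installed. A trace is usually far more informative when debug symbols for Mesa and LLVM are available; how to install them depends on the distribution and is not covered here.

```shell
#!/bin/sh
# Sketch: run a program under gdb in batch mode and print a full
# backtrace once it stops (for example on a crash).
# GDB is a hypothetical override for the debugger binary; it defaults to "gdb".
gdb_backtrace() {
    # CYCLES_OPENCL_SPLIT_KERNEL_TEST=1 is the variable from the reproduction steps.
    CYCLES_OPENCL_SPLIT_KERNEL_TEST=1 \
        "${GDB:-gdb}" -batch -ex run -ex "bt full" --args "$@"
}
```

A possible call would be `gdb_backtrace ./blender > blender-gdb.txt 2>&1`, after which the render is started from the UI as in the reproduction steps; the output file name is just an example.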
Comment 8 Jan Vesely 2017-11-14 23:23:33 UTC
(In reply to Markus from comment #7)
> I tried to obtain debug information from Mesa but was unable to do so (i.e.
> starting Blender with `MESA_DEBUG=context CYCLES_OPENCL_SPLIT_KERNEL_TEST=1
> ./blender` did not generate any visible debug information).
> 
> What I did instead, is run Blender via gdb which then gave the attached
> stack trace at the time of crash. Looking at the trace, it appears like the
> crash is inside LLVM.
> 
> What is the best way to debug this further?
> 
> P.S.: As I just updated to Ubuntu 17.10 (Beta 2), I've also attached new
> glxinfo and clinfo output.

You can use CLOVER_DEBUG=clc,llvm,native CLOVER_DEBUG_FILE=blender
to force clover to dump the compiled CL programs (it should produce several dump files: .clc, .ll, and .asm). Make sure the kernels are actually compiled and not loaded from ~/.cache/cycles/kernels.
From there you can run and debug LLVM on the command line.
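
As a rough sketch of the suggestion above, the two variables can be wrapped in a small shell helper. The function name `run_with_clover_dumps` is hypothetical, and the exact names of the dump files clover writes are not confirmed here, so the usage note below relies on a glob instead of fixed names.

```shell
#!/bin/sh
# Sketch: run a command with clover's debug dumps enabled.
# CLOVER_DEBUG and CLOVER_DEBUG_FILE are the variables named in the comment;
# the first argument is the dump file prefix, the rest is the command to run.
run_with_clover_dumps() {
    prefix=$1
    shift
    CLOVER_DEBUG=clc,llvm,native CLOVER_DEBUG_FILE="$prefix" "$@"
}
```

One way to use it: move the kernel cache aside first (`mv ~/.cache/cycles/kernels ~/.cache/cycles/kernels.bak`) so the kernels are really recompiled, then run `run_with_clover_dumps blender env CYCLES_OPENCL_SPLIT_KERNEL_TEST=1 ./blender` and inspect the result with `ls blender*`. Feeding a dumped .ll file to something like `llc -march=amdgcn -mcpu=tonga <dump>.ll -o out.s` is an assumption about how to replay the compile for the reporter's Tonga GPU, not a command taken from this thread.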

Note that I have been unable to reproduce this with
blender-2.79 on
OpenCL 1.1 Mesa 17.4.0-devel (git-138adc72e7)
AMD Radeon R7 Graphics (CARRIZO / DRM 3.18.0 / 4.11.0-ROC-SC, LLVM 5.0.1)
AMD Radeon (TM) R7 M340 (ICELAND / DRM 3.18.0 / 4.11.0-ROC-SC, LLVM 5.0.1)

Instead, it hits the MAX_GLOBAL_BUFFERS assertion in the radeonsi pipe driver.
Bumping the limit renders a picture, albeit one much different from the CPU rendering.
Comment 9 Markus 2019-01-04 15:46:18 UTC
After installing rocm 2.0 on Ubuntu 18.10 this is now working without crashes in Blender 2.79.

Issue can be considered resolved from my perspective.
Comment 10 Jan Vesely 2019-01-04 15:47:44 UTC
(In reply to Markus from comment #9)
> After installing rocm 2.0 on Ubuntu 18.10 this is now working without
> crashes in Blender 2.79.
> 
> Issue can be considered resolved from my perspective.

clover is not part of rocm
Comment 11 GitLab Migration User 2019-09-18 17:56:25 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/140.