Bug 102233

Summary: OpenCL segmentation fault on AMD Radeon (Kaveri+R7) with memtestCL
Product: Mesa
Reporter: Senad <senad.jahic>
Component: Drivers/Gallium/radeonsi
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED NOTOURBUG
QA Contact: Default DRI bug account <dri-devel>
Severity: critical    
Priority: medium    
Version: git   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Bug Depends on:    
Bug Blocks: 99553    

Description Senad 2017-08-15 14:06:16 UTC
Hardware:
AMD A8-7600 with Radeon™ R7 Series
Sapphire R7 240 4GB
ASUS R7 240 4GB

OS: Ubuntu server 17.04 64bit (headless mode).
OpenCL Driver: Padoka


Description:
clinfo works and shows all three GPU platforms.
Sometimes I get a segmentation fault when using memtestCL.
YACMiner (crypto mining software) crashes 100% of the time.


I've traced memtestCL down to this line:
https://github.com/ihaque/memtestCL/blob/607499a54dcac9b846fcc827dc5980d8e5f66cec/memtestCL_cli.cpp#L126

On repeated runs, clGetDeviceIDs sometimes returns 0 for num_cpu and sometimes 1.
If it returns 1, clGetDeviceInfo later crashes when called with a device ID that does not exist.

If I hardcode "num_devices=3" at the offending line, I can run memtestCL on all three devices successfully without faults.
Comment 1 Jan Vesely 2017-09-15 22:41:37 UTC
What is the Mesa/LLVM version?
Can you post the clinfo output?
Do you use the ocl-icd library? If so, can you run with the following set:
OCL_ICD_VENDORS=/etc/OpenCL/vendors/mesa.icd
Comment 2 Jan Vesely 2017-12-22 12:16:15 UTC
Can you try initializing the num_gpu, num_cpu, num_accel variables?
clGetDeviceIDs returns the CL_DEVICE_NOT_FOUND error if no devices match the requested type. In that case it does not set the num_devices output parameter to 0.

Unless the variables are initialized to 0, the loop on line 135 accesses the devids array out of bounds.
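
A minimal sketch of the suggested fix, not the actual memtestCL code. To keep it runnable without an OpenCL runtime, `query_device_count` is a hypothetical stand-in for clGetDeviceIDs that mimics the behaviour described above (the output count is left untouched on error); `count_all_devices` and the `k...` constants are likewise illustrative names.

```cpp
#include <cassert>

// Same numeric values as CL_SUCCESS and CL_DEVICE_NOT_FOUND in the OpenCL headers.
constexpr int kSuccess = 0;
constexpr int kDeviceNotFound = -1;

// Hypothetical stand-in for clGetDeviceIDs (assumption: not a real OpenCL call).
// `available` is the number of devices of the requested type. When it is 0,
// the function reports an error and leaves *num_devices untouched, as this
// bug reports for the real call.
int query_device_count(unsigned available, unsigned *num_devices) {
    assert(num_devices != nullptr);
    if (available == 0) {
        return kDeviceNotFound;  // output parameter is NOT written
    }
    *num_devices = available;
    return kSuccess;
}

// Sums the device counts for three device types (GPU, CPU, accelerator).
unsigned count_all_devices(unsigned gpus, unsigned cpus, unsigned accels) {
    // Zero-initialize so a failed query cannot leave garbage in a counter.
    unsigned num_gpu = 0, num_cpu = 0, num_accel = 0;

    // Also check the status and force the count to 0 on any error.
    if (query_device_count(gpus, &num_gpu) != kSuccess) num_gpu = 0;
    if (query_device_count(cpus, &num_cpu) != kSuccess) num_cpu = 0;
    if (query_device_count(accels, &num_accel) != kSuccess) num_accel = 0;

    return num_gpu + num_cpu + num_accel;
}
```

With three GPUs and no other device types, `count_all_devices(3, 0, 0)` yields 3, so a later loop over the device array stays within bounds.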
Comment 3 Jan Vesely 2017-12-22 12:17:45 UTC
(In reply to Senad from comment #0)
> I get 100% crashes with YACMiner (crypto mining software).

Please report a separate bug for YACMiner.
Comment 4 Jan Vesely 2018-05-10 19:06:18 UTC
This PR should fix memtestCL:
https://github.com/ihaque/memtestCL/pull/9

Feel free to reopen if you still see the problem.
