Bug 102233 - OpenCL segmentation fault on AMD Radeon (Kaveri+R7) with memtestCL
Summary: OpenCL segmentation fault on AMD Radeon (Kaveri+R7) with memtestCL
Status: RESOLVED NOTOURBUG
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Gallium/radeonsi
Version: git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium critical
Assignee: Default DRI bug account
QA Contact: Default DRI bug account
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 99553
 
Reported: 2017-08-15 14:06 UTC by Senad
Modified: 2018-05-10 19:06 UTC

Description Senad 2017-08-15 14:06:16 UTC
Hardware:
AMD A8-7600 with Radeon™ R7 Series
Sapphire R7 240 4GB
ASUS R7 240 4GB

OS: Ubuntu server 17.04 64bit (headless mode).
OpenCL Driver: Padoka


Description:
clinfo works and shows all three GPUs.
memtestCL sometimes crashes with a segmentation fault.
YACMiner (crypto mining software) crashes 100% of the time.


I've traced memtestCL down to this line:
https://github.com/ihaque/memtestCL/blob/607499a54dcac9b846fcc827dc5980d8e5f66cec/memtestCL_cli.cpp#L126

On repeated runs, clGetDeviceIDs sometimes reports 0 for num_cpu and sometimes 1.
If it reports 1, clGetDeviceInfo later crashes because it is called with a device ID that does not exist.

If I hardcode "num_devices=3" at the offending line, I can run memtestCL on all three devices successfully, without faults.
Comment 1 Jan Vesely 2017-09-15 22:41:37 UTC
What are the Mesa and LLVM versions?
Can you post the clinfo output?
Do you use the ocl-icd library? If so, can you run with this environment variable set:
OCL_ICD_VENDORS=/etc/OpenCL/vendors/mesa.icd
Comment 2 Jan Vesely 2017-12-22 12:16:15 UTC
Can you try initializing the num_gpu, num_cpu, num_accel variables?
clGetDeviceIDs returns the CL_DEVICE_NOT_FOUND error if no devices match the requested type. In that case it does not set the num_devices output parameter to 0, so the variable keeps whatever value it had before the call.

Unless the variables are initialized to 0, the loop on line 135 accesses devids array out of bounds.
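
The pattern can be sketched as follows. This is only an illustration, not the memtestCL code or the eventual fix: fake_get_device_ids is a hypothetical stand-in for clGetDeviceIDs so that the sketch runs without an OpenCL runtime, count_devices and total_devices are made-up helper names, and adding the three per-type counts together is an assumption about how the total is formed.

```cpp
#include <cstddef>

// Hypothetical stand-in for clGetDeviceIDs (assumption, not the real API).
// It mimics the behaviour described above: when no device matches, it
// returns an error code and leaves *num_devices untouched.
constexpr int kSuccess = 0;          // same value as CL_SUCCESS
constexpr int kDeviceNotFound = -1;  // same value as CL_DEVICE_NOT_FOUND

int fake_get_device_ids(unsigned available, unsigned* num_devices) {
    if (available == 0) {
        return kDeviceNotFound;  // out-parameter is NOT written
    }
    *num_devices = available;
    return kSuccess;
}

// Query one device type and return a count that is always well defined.
unsigned count_devices(unsigned available) {
    unsigned count = 0;  // initialise before the call
    if (fake_get_device_ids(available, &count) != kSuccess) {
        count = 0;  // treat any failure as "no devices of this type"
    }
    return count;
}

// Assumed total over the GPU, CPU and accelerator queries.
unsigned total_devices(unsigned gpus, unsigned cpus, unsigned accels) {
    return count_devices(gpus) + count_devices(cpus) + count_devices(accels);
}
```

With real OpenCL the same two steps apply: initialise each cl_uint count to 0 and check the return value of clGetDeviceIDs before trusting the count.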
Comment 3 Jan Vesely 2017-12-22 12:17:45 UTC
(In reply to Senad from comment #0)
> I get 100% crashes with YACMiner (crypto mining software).

Please file a separate bug for YACMiner.
Comment 4 Jan Vesely 2018-05-10 19:06:18 UTC
This PR should fix memtestCL:
https://github.com/ihaque/memtestCL/pull/9

Feel free to reopen if you still see the problem.

