The dense_blas-bench-opencl benchmark from ViennaCL suite fails with doubles on AMD CYPRESS (DRM 2.43.0, LLVM 3.8.0): $ ./dense_blas-bench-opencl ---------------------------------------------- Device Info ---------------------------------------------- Name: AMD CYPRESS (DRM 2.43.0, LLVM 3.8.0) Vendor: AMD Type: GPU Available: 1 Max Compute Units: 10 Max Work Group Size: 256 Global Mem Size: 1073741824 Local Mem Size: 32768 Local Mem Type: 1 Host Unified Memory: 1 Benchmark : BLAS ---------------- sCOPY : 64.3 GB/s sAXPY : 95.4 GB/s sDOT : 85.3 GB/s sGEMV-N : 20.8 GB/s sGEMV-T : 44.3 GB/s sGEMM-NN : 126 GFLOPs/s sGEMM-NT : 87.6 GFLOPs/s sGEMM-TN : 90.5 GFLOPs/s sGEMM-TT : 72.3 GFLOPs/s ---- Build Status = -2 ( Err = -11 ) Log: unsupported call to function __subdf3 in av_cpu Sources: #pragma OPENCL EXTENSION cl_khr_fp64 : enable __kernel void av_cpu( __global double * vec1, uint4 size1, ... It looks like DFmode (double) instructions are not enabled correctly in LLVM for targets that report cl_khr_fp64 extension. clinfo reports: Number of platforms 1 Platform Name Clover Platform Vendor Mesa Platform Version OpenCL 1.1 MESA 11.2.2 Platform Profile FULL_PROFILE Platform Extensions cl_khr_icd Platform Extensions function suffix MESA Platform Name Clover Number of devices 1 Device Name AMD CYPRESS (DRM 2.43.0, LLVM 3.8.0) Device Vendor AMD Device Vendor ID 0x1002 Device Version OpenCL 1.1 MESA 11.2.2 Driver Version 11.2.2 Device OpenCL C Version OpenCL C 1.1 Device Type GPU Device Profile FULL_PROFILE Max compute units 10 Max clock frequency 850MHz Max work item dimensions 3 Max work item sizes 256x256x256 Max work group size 256 Preferred work group size multiple 64 Preferred / native vector sizes char 16 / 16 short 8 / 8 int 4 / 4 long 2 / 2 half 0 / 0 (n/a) float 4 / 4 double 2 / 2 (cl_khr_fp64) Half-precision Floating-point support (n/a) Single-precision Floating-point support (core) Denormals No Infinity and NANs Yes Round to nearest Yes Round to zero No Round to infinity No IEEE754-2008 fused multiply-add No Support is emulated in software No Correctly-rounded divide and sqrt operations No Double-precision Floating-point support (cl_khr_fp64) Denormals Yes Infinity and NANs Yes Round to nearest Yes Round to zero Yes Round to infinity Yes IEEE754-2008 fused multiply-add Yes Support is emulated in software No Correctly-rounded divide and sqrt operations No ...
Jan, does R600 backend in LLVM have relevant instructions implemented?
(In reply to Vedran Miletić from comment #1) > Jan, does R600 backend in LLVM have relevant instructions implemented? I don't think they are.
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/589.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.