Bug 70779 - OpenCL hangs with big kernels
Summary: OpenCL hangs with big kernels
Status: RESOLVED DUPLICATE of bug 60879
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Gallium/radeonsi
Version: git
Hardware: Other All
Importance: medium normal
Assignee: Default DRI bug account
Reported: 2013-10-22 19:36 UTC by Hristo Venev
Modified: 2017-03-22 16:05 UTC

Attachments
dmesg after running the uint_add16() kernel (29.83 KB, text/plain)
2016-02-14 11:41 UTC, EoD

Description Hristo Venev 2013-10-22 19:36:31 UTC
Radeon 7870 XT.
A 50-word kernel works, a 67-word one hangs. Other tests confirm this.
See bug 60879
Comment 1 Hristo Venev 2013-11-06 09:38:54 UTC
Bump.
Comment 2 Hristo Venev 2014-01-08 19:38:49 UTC
Bump.
Comment 3 EoD 2016-02-14 04:39:50 UTC
Do you have an example of such a kernel?
Comment 4 Hristo Venev 2016-02-14 07:11:11 UTC
Here are two kernels that fail:
__kernel void uint_div(__global const uint *a,  __global const uint *b, __global uint *c){
    c[0]=a[0]/b[0];
}
__kernel void uint_add16(__global const uint *a,  __global const uint *b, __global uint *c){
    for(uint i=0;i<16;i++) c[i]=a[i]+b[i];
}

This one works:
__kernel void uint_add(__global const uint *a,  __global const uint *b, __global uint *c){
    c[0]=a[0]+b[0];
}

Sadly I don't have access to the hardware anymore (it's probably in a dump somewhere).
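
For anyone retrying this on other hardware, here is a minimal sketch in plain C of what the three kernels above compute when run by a single work-item. The ref_* function names are hypothetical and not part of the original report. The assert on the divisor is an added assumption, since dividing by zero would trap on the CPU.

```c
#include <assert.h>
#include <stdint.h>

/* CPU reference for uint_div: a single unsigned 32-bit division. */
static void ref_uint_div(const uint32_t *a, const uint32_t *b, uint32_t *c)
{
    assert(b[0] != 0); /* assumption: the test data never uses a zero divisor */
    c[0] = a[0] / b[0];
}

/* CPU reference for uint_add16: 16 element-wise unsigned additions.
 * Unsigned addition wraps modulo 2^32 in C, as it does in OpenCL C. */
static void ref_uint_add16(const uint32_t *a, const uint32_t *b, uint32_t *c)
{
    for (uint32_t i = 0; i < 16; i++)
        c[i] = a[i] + b[i];
}

/* CPU reference for uint_add: a single unsigned addition. */
static void ref_uint_add(const uint32_t *a, const uint32_t *b, uint32_t *c)
{
    c[0] = a[0] + b[0];
}
```

Comparing the buffer c read back from the device against these references would help separate a wrong result from a hang.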
Comment 5 EoD 2016-02-14 11:41:38 UTC
Created attachment 121742
dmesg after running the uint_add16() kernel

(In reply to Hristo Venev from comment #4)
> Here are two kernels that fail:
> __kernel void uint_div(__global const uint *a,  __global const uint *b,
> __global uint *c){
>     c[0]=a[0]/b[0];
> }
> __kernel void uint_add16(__global const uint *a,  __global const uint *b,
> __global uint *c){
>     for(uint i=0;i<16;i++) c[i]=a[i]+b[i];
> }

I actually can confirm that the 2nd kernel does cause a GPU stall in the radeon driver (r600/Barts). It causes no problem with amdgpu (radeonsi/Tonga).

I am using kernel 4.5.0-rc3, current llvm 3.8 branch and current mesa git.

As I am not overly good with OpenCL, is this kernel somehow problematic?
Comment 6 Michel Dänzer 2016-02-15 06:13:00 UTC
I think this is probably a duplicate of bug 60879 and that it was basically luck that "small kernels" didn't hang on Hristo's card.
Comment 7 Vedran Miletić 2017-03-22 16:05:46 UTC
(In reply to EoD from comment #5)
> I actually can confirm that the 2nd kernel does cause a GPU stall in the
> radeon driver (r600/Barts). It causes no problem with amdgpu
> (radeonsi/Tonga).
> 
> I am using kernel 4.5.0-rc3, current llvm 3.8 branch and current mesa git.
> 
> As I am not overly good with OpenCL, is this kernel somehow problematic?

EoD, if these stalls on r600 are still reproducible, please open a separate bug.

*** This bug has been marked as a duplicate of bug 60879 ***
