Software in use: Up to date mesa, llvm, clang, libclc, firmware, as of 20130910. Gentoo's Linux 3.11. opencl-example-12905ac620b83713b07ece763ff3c36fb3c2e7e5. Hardware in use: AMD A8 5600K APU (Radeon HD 7560D, Aruba), 32GB system RAM, Biostar Hi-Fi A85W motherboard. Steps to reproduce: Run hello_world program from opencl-example. First run works correctly. Second run either completely locks up the machine or locks up the GPU (which recovers after a short time). Same behavior for other tests - the first test completes and the second causes problems. This is what the second opencl-example run looks like: localhost opencl-example-12905ac620b83713b07ece763ff3c36fb3c2e7e5 # ./hello_world There are 1 platforms. There are 1 GPU devices. clCreateContext() succeeded. clCreateCommandQueue() succeeded. clCreateProgramWithSource() suceeded. clBuildProgram() suceeded. clCreateKernel() suceeded. clCreateBuffer() succeeded. clSetKernelArg() succeeded. clEnqueueNDRangeKernel() suceeded. ((( 10 second hang here, or forever if the machine is toast ))) clEnqueueReadBuffer() suceeded. pi = 3.141590 And, here is the dmesg output from the recoverable lockups: [ 1365.806285] radeon 0000:00:01.0: GPU lockup CP stall for more than 10000msec [ 1365.806292] radeon 0000:00:01.0: GPU lockup (waiting for 0x0000000000007ec3 last fence id 0x0000000000007ec2) [ 1365.821261] radeon 0000:00:01.0: Saved 559 dwords of commands on ring 0. [ 1365.821293] radeon 0000:00:01.0: GPU softreset: 0x00000008 [ 1365.821297] radeon 0000:00:01.0: GRBM_STATUS = 0xB0003828 [ 1365.821300] radeon 0000:00:01.0: GRBM_STATUS_SE0 = 0x00000007 [ 1365.821304] radeon 0000:00:01.0: GRBM_STATUS_SE1 = 0x00000007 [ 1365.821307] radeon 0000:00:01.0: SRBM_STATUS = 0x20000040 [ 1365.821332] radeon 0000:00:01.0: SRBM_STATUS2 = 0x00000000 [ 1365.821335] radeon 0000:00:01.0: R_008674_CP_STALLED_STAT1 = 0x00000000 [ 1365.821338] radeon 0000:00:01.0: R_008678_CP_STALLED_STAT2 = 0x40000000 [ 1365.821341] radeon 0000:00:01.0: R_00867C_CP_BUSY_STAT = 0x00010002 [ 1365.821344] radeon 0000:00:01.0: R_008680_CP_STAT = 0x80220243 [ 1365.821347] radeon 0000:00:01.0: R_00D034_DMA_STATUS_REG = 0x44C83D57 [ 1365.821350] radeon 0000:00:01.0: R_00D834_DMA_STATUS_REG = 0x44C83D57 [ 1365.821354] radeon 0000:00:01.0: VM_CONTEXT0_PROTECTION_FAULT_ADDR 0x00000000 [ 1365.821357] radeon 0000:00:01.0: VM_CONTEXT0_PROTECTION_FAULT_STATUS 0x00000000 [ 1365.821360] radeon 0000:00:01.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000 [ 1365.821363] radeon 0000:00:01.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000 [ 1365.827029] radeon 0000:00:01.0: GRBM_SOFT_RESET=0x00004001 [ 1365.827083] radeon 0000:00:01.0: SRBM_SOFT_RESET=0x00000100 [ 1365.828237] radeon 0000:00:01.0: GRBM_STATUS = 0x00003828 [ 1365.828240] radeon 0000:00:01.0: GRBM_STATUS_SE0 = 0x00000007 [ 1365.828243] radeon 0000:00:01.0: GRBM_STATUS_SE1 = 0x00000007 [ 1365.828246] radeon 0000:00:01.0: SRBM_STATUS = 0x20000040 [ 1365.828271] radeon 0000:00:01.0: SRBM_STATUS2 = 0x00000000 [ 1365.828274] radeon 0000:00:01.0: R_008674_CP_STALLED_STAT1 = 0x00000000 [ 1365.828277] radeon 0000:00:01.0: R_008678_CP_STALLED_STAT2 = 0x00000000 [ 1365.828280] radeon 0000:00:01.0: R_00867C_CP_BUSY_STAT = 0x00000000 [ 1365.828283] radeon 0000:00:01.0: R_008680_CP_STAT = 0x00000000 [ 1365.828286] radeon 0000:00:01.0: R_00D034_DMA_STATUS_REG = 0x44C83D57 [ 1365.828289] radeon 0000:00:01.0: R_00D834_DMA_STATUS_REG = 0x44C83D57 [ 1365.828317] radeon 0000:00:01.0: GPU reset succeeded, trying to resume [ 1365.843638] [drm] PCIE GART of 512M enabled (table at 0x0000000000276000). [ 1365.843775] radeon 0000:00:01.0: WB enabled [ 1365.843781] radeon 0000:00:01.0: fence driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr 0xffff8807dea6bc00 [ 1365.844520] radeon 0000:00:01.0: fence driver on ring 5 use gpu addr 0x0000000000075a18 and cpu addr 0xffffc900057b5a18 [ 1365.844524] radeon 0000:00:01.0: fence driver on ring 1 use gpu addr 0x0000000020000c04 and cpu addr 0xffff8807dea6bc04 [ 1365.844528] radeon 0000:00:01.0: fence driver on ring 2 use gpu addr 0x0000000020000c08 and cpu addr 0xffff8807dea6bc08 [ 1365.844532] radeon 0000:00:01.0: fence driver on ring 3 use gpu addr 0x0000000020000c0c and cpu addr 0xffff8807dea6bc0c [ 1365.844536] radeon 0000:00:01.0: fence driver on ring 4 use gpu addr 0x0000000020000c10 and cpu addr 0xffff8807dea6bc10 [ 1365.863416] [drm] ring test on 0 succeeded in 2 usecs [ 1365.863477] [drm] ring test on 3 succeeded in 2 usecs [ 1365.863485] [drm] ring test on 4 succeeded in 1 usecs [ 1365.909450] [drm] ring test on 5 succeeded in 1 usecs [ 1365.909454] [drm] UVD initialized successfully. [ 1365.936524] [drm] ib test on ring 0 succeeded in 0 usecs [ 1365.937056] [drm] ib test on ring 3 succeeded in 0 usecs [ 1365.937581] [drm] ib test on ring 4 succeeded in 0 usecs [ 1365.958519] [drm] ib test on ring 5 succeeded
Is this a regression? If so, can you bisect?
Created attachment 85794 [details] [review] Don't set DB_DEST or CB_DEST* bit on cp_coher_cntl This patch fixes the hangs for me and all the run_test.sh tests pass. However, this is just a hack and not a proper solution. Can you test this patch?
The patch seems to make it work! rotl tests fail however: Running ./math-int rotl 1 1 2 Failed Running ./math-int rotl 1 32 1 Failed Running ./math-int rotl -1 5 -1 Failed Running ./math-int rotl 4096 23 8 Failed
The component Drivers/DRI/R600 doesn't exist in Mesa.
@Marek Olšák Sorry, should be gallium/r600 right?
(In reply to comment #3) > The patch seems to make it work! > > rotl tests fail however: > > Running ./math-int rotl 1 1 2 > Failed > Running ./math-int rotl 1 32 1 > Failed > Running ./math-int rotl -1 5 -1 > Failed > Running ./math-int rotl 4096 23 8 > Failed Do you the latest code from my opencl-example repo? Older versions were missing the rotl.cl file.
I wonder if this may also be related to bug 69321. Does reverting f0435ebb07d01a77ca0d98967a002898811a5206 also help?
> Do you the latest code from my opencl-example repo? Older versions were missing the rotl.cl file Yes, that was it. It passes all tests here now!
@Alex Deucher: I'm not that familiar with git, can you give me the command to do that, starting with the checked out branch-master gentoo grabs to the build directory? It keeps .git there at least.
git revert f0435ebb07d01a77ca0d98967a002898811a5206
@Alex Deucher: Reverting f0435ebb07d01a77ca0d98967a002898811a5206 also makes it work.
*** This bug has been marked as a duplicate of bug 69321 ***
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.