Bug 72785

Summary: bfgminer --scrypt OpenCL on Clover RadeonSI
Product: Mesa
Reporter: Christoph Haag <haagch>
Component: Drivers/Gallium/radeonsi
Assignee: Default DRI bug account <dri-devel>
QA Contact: Default DRI bug account <dri-devel>
Status: RESOLVED FIXED
Severity: normal
Priority: medium
CC: darkbasic, luke-jr+freedesktopbugs
Version: git
Hardware: All
OS: All
Bug Blocks: 99553
Attachments:
 - bfgminer with original scrypt kernel and RADEON_DUMP_SHADERS=1 loop error
 - R600_DEBUG=cs bfgminer -v1 --intensity 8 --shaders 8192 --scrypt -S opencl:auto ...
 - Patch to test
 - crashing again with patch from #8
 - crash with patches from the pyrit bug

Description Christoph Haag 2013-12-17 12:19:35 UTC
bfgminer --benchmark without scrypt seems to be working fine now.

But I think many people would rather compute scrypt on their 7xxx+ GPUs right now, considering the recent reports that the Radeon 7950 is selling out because of litecoin miners.

With mesa/llvm it doesn't work yet.

llvm 3.5 r197392; mesa and the rest of the stack are very recent git builds.

The kernel is https://github.com/luke-jr/bfgminer/blob/bfgminer/scrypt130511.cl

bfgminer is started with

bfgminer -v1 --scrypt -S opencl:auto --url=<pool> --userpass=<user:worker:pass>

The error is:

Initialising kernel scrypt130511.cl without bitalign, 1 vectors and worksize 256

[...]

LLVM ERROR: Cannot select: 0x7f0b7cbd40a0: ch = br_cc 0x7f0b7cd32bb0, 0x7f0b7cc28c20, 0x7f0b7cd32fb0, 0x7f0b7cc2bc30, 0x7f0b7cc2ad30 [ORD=19251] [ID=4253]
  0x7f0b7cd32fb0: i1,ch = llvm.SI.loop 0x7f0b7cc2c030:1, 0x7f0b7cd333b0, 0x7f0b7cc2c030 [ORD=19250] [ID=4250]
    0x7f0b7cd333b0: i64 = TargetConstant<3431> [ID=132]
    0x7f0b7cc2c030: i64,ch = llvm.SI.if.break 0x7f0b7cf2f3e0, 0x7f0b7cc2b530, 0x7f0b7cc2c530, 0x7f0b7cd339b0 [ORD=19249] [ID=4249]
      0x7f0b7cc2b530: i64 = TargetConstant<3428> [ID=129]
      0x7f0b7cc2c530: i1 = setcc 0x7f0b7cc7dd60, 0x7f0b7cc80760, 0x7f0b7cd337b0 [ORD=19248] [ID=212]
        0x7f0b7cc7dd60: i64 = add 0x7f0b7cc81560, 0x7f0b7cc81360 [ORD=13153] [ID=187]
          0x7f0b7cc81560: i64,ch = CopyFromReg 0x7f0b7c0535d8, 0x7f0b7cc81660 [ORD=13147] [ID=149]
            0x7f0b7cc81660: i64 = Register %vreg53 [ID=1]
          0x7f0b7cc81360: i64 = Constant<1> [ID=2]
        0x7f0b7cc80760: i64 = Constant<4> [ID=4]
      0x7f0b7cd339b0: i64,ch = CopyFromReg 0x7f0b7c0535d8, 0x7f0b7cd342b0 [ORD=19249] [ID=185]
        0x7f0b7cd342b0: i64 = Register %vreg52 [ID=130]
  0x7f0b7cc2bc30: i1 = Constant<-1> [ID=133]
In function: search
Comment 1 Christoph Haag 2014-01-03 18:19:40 UTC
Hm, I did think there would be some interest.

So, because the error had something to do with "llvm.SI.loop" and it said it was in the "search" function, there is only one loop that can be the troublemaker:

#pragma unroll
for (uint i=0; i<4; i++) 
...

So I removed that by manually unrolling it.
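Roughly like this (a sketch; X and tmp are made-up names, not the actual kernel variables):

/* before: the loop (with #pragma unroll) that triggers the llvm.SI.loop selection error */
#pragma unroll
for (uint i = 0; i < 4; i++)
    X[i] ^= tmp[i];

/* after: the same four iterations written out by hand, so no loop is left */
X[0] ^= tmp[0];
X[1] ^= tmp[1];
X[2] ^= tmp[2];
X[3] ^= tmp[3];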
And indeed it gets a little further.
But then I get this error:

0x7fa839ae8160: i32 = ExternalSymbol'__muldi3'
Undefined function
UNREACHABLE executed at AMDGPUISelLowering.h:89!
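(__muldi3 is the libgcc/compiler-rt helper for 64-bit integer multiplication, so presumably some 64-bit multiply in the kernel, something like this sketch with made-up names, is being lowered to a library call that doesn't exist on the GPU:)

ulong a = (ulong)get_global_id(0);
ulong b = 0x0000000100000001UL;
ulong c = a * b;  /* 64-bit mul -> the backend emits a __muldi3 libcall it cannot resolve */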

With pyrit, as in bug #64600, I get the same error, so... are the patches still not in llvm 198401?

I guess I'll try with http://cgit.freedesktop.org/~tstellar/llvm/log/?h=master-testing-si then.
Comment 2 Christoph Haag 2014-01-04 23:52:46 UTC
Created attachment 91500 [details]
bfgminer with original scrypt kernel and RADEON_DUMP_SHADERS=1 loop error

Well, I would test it if I could get clang to compile against it (I think that is required?). With revisions after r197612, clang is incompatible with llvm master-testing-si, and with a few earlier revisions I tried there is a segfault (?) while compiling...

In the meantime, here is an error log with RADEON_DUMP_SHADERS=1 for the original loop problem, which may or may not already be fixed by the patches.

So I'll either wait until Tom Stellard updates the branch, or try later whether upstream merges cleanly with it, since it's "only" two weeks behind.
Comment 3 Tom Stellard 2014-01-13 17:22:23 UTC
(In reply to comment #0)
> bfgminer --benchmark without scrypt seems to be working fine now.
> 
> But I think many people would rather compute scrypt right now on their 7xxx+
> GPUs, just considering the recent rumors that the radeon 7950 is getting
> sold out because of litecoin miners.
> 
> With mesa/llvm it doesn't work yet.
> 
> llvm 3.5 197392, mesa and stuff are very recent git builds.
> 
> The kernel is
> https://github.com/luke-jr/bfgminer/blob/bfgminer/scrypt130511.cl
> 
> bfgminer is started with
> 
> bfgminer -v1 --scrypt -S opencl:auto --url=<pool>
> --userpass=<user:worker:pass>
> 
> The error is:
> 
> Initialising kernel scrypt130511.cl without bitalign, 1 vectors and worksize
> 256
> 
> [...]
> 
> LLVM ERROR: Cannot select: 0x7f0b7cbd40a0: ch = br_cc 0x7f0b7cd32bb0,
> 0x7f0b7cc28c20, 0x7f0b7cd32fb0, 0x7f0b7cc2bc30, 0x7f0b7cc2ad30 [ORD=19251]
> [ID=4253]
>   0x7f0b7cd32fb0: i1,ch = llvm.SI.loop 0x7f0b7cc2c030:1, 0x7f0b7cd333b0,
> 0x7f0b7cc2c030 [ORD=19250] [ID=4250]
>     0x7f0b7cd333b0: i64 = TargetConstant<3431> [ID=132]
>     0x7f0b7cc2c030: i64,ch = llvm.SI.if.break 0x7f0b7cf2f3e0,
> 0x7f0b7cc2b530, 0x7f0b7cc2c530, 0x7f0b7cd339b0 [ORD=19249] [ID=4249]
>       0x7f0b7cc2b530: i64 = TargetConstant<3428> [ID=129]
>       0x7f0b7cc2c530: i1 = setcc 0x7f0b7cc7dd60, 0x7f0b7cc80760,
> 0x7f0b7cd337b0 [ORD=19248] [ID=212]
>         0x7f0b7cc7dd60: i64 = add 0x7f0b7cc81560, 0x7f0b7cc81360 [ORD=13153]
> [ID=187]
>           0x7f0b7cc81560: i64,ch = CopyFromReg 0x7f0b7c0535d8,
> 0x7f0b7cc81660 [ORD=13147] [ID=149]
>             0x7f0b7cc81660: i64 = Register %vreg53 [ID=1]
>           0x7f0b7cc81360: i64 = Constant<1> [ID=2]
>         0x7f0b7cc80760: i64 = Constant<4> [ID=4]
>       0x7f0b7cd339b0: i64,ch = CopyFromReg 0x7f0b7c0535d8, 0x7f0b7cd342b0
> [ORD=19249] [ID=185]
>         0x7f0b7cd342b0: i64 = Register %vreg52 [ID=130]
>   0x7f0b7cc2bc30: i1 = Constant<-1> [ID=133]
> In function: search

This patch: http://cgit.freedesktop.org/~tstellar/llvm/commit/?h=master-testing-si&id=41d1f890bc2095a1354c4502cc97e30b3288ba1a should fix the crash in an unmodified kernel.  However, you will probably still see the __muldi3 error from comment 1.

I need a test case in order to commit this patch.  Could you post the output of R600_DEBUG=cs from an unmodified kernel *without* this patch applied?
Comment 4 Christoph Haag 2014-01-13 17:31:56 UTC
Created attachment 91965 [details]
R600_DEBUG=cs bfgminer -v1 --intensity 8 --shaders 8192 --scrypt -S opencl:auto  ...

This is with upstream llvm 199097.

That patch is small enough to apply to master without conflicts, so I can test later whether it fixes this problem.
Comment 5 darkbasic 2014-02-14 13:59:45 UTC
It doesn't crash for me but it doesn't work at all:

 bfgminer version 3.5.7 - Started: [2014-02-14 14:46:50] - [  0 days 00:01:56]
 [M]anage devices [P]ool management [S]ettings [D]isplay options  [H]elp [Q]uit 
 Connected to hot.wemineltc.com diff 0 with stratum as user <username>
 Block: ...05f1bfaa #513878  Diff:2.67k (19.14Gh/s)  Started: [14:46:49]
 ST:2  F:0  NB:1  AS:0  BW:[ 26/  3 B/s]  E:0.00  I: 0.00 BTC/hr  BS:0
 0      53.0C |   0.0/  0.0/  0.0 h/s | A:0 R:0+0(none) HW:0/none
--------------------------------------------------------------------------------
 OCL 0: 53.0C |   0.0/  0.0/  0.0 h/s | A:0 R:0+0(none) HW:0/none
--------------------------------------------------------------------------------
 [2014-02-14 14:46:49] ADL Initialisation Error! Error -1!
 [2014-02-14 14:46:49] Probing for an alive pool
 [2014-02-14 14:46:49] Testing pool stratum+tcp://hot.wemineltc.com:3335
 [2014-02-14 14:46:49] HTTP request failed: Protocol stratum+tcp not supported or disabled in libcurl
 [2014-02-14 14:46:49] Stratum authorisation success for pool 0
 [2014-02-14 14:46:49] Pool 0 stratum+tcp://hot.wemineltc.com:3335 alive
 [2014-02-14 14:46:49] Network difficulty changed to 2.67k (19.14Gh/s)
 [2014-02-14 14:46:49] New block: ...05f1bfaa #513878 diff 2.67k (19.14Gh/s)
 [2014-02-14 14:46:49] Stratum from pool 0 detected new block
 [2014-02-14 14:46:49] Pool 0 is hiding block contents from us
 [2014-02-14 14:46:50] Init GPU thread 0 GPU 0 virtual GPU 0
 [2014-02-14 14:46:50] CL Platform vendor: Mesa
 [2014-02-14 14:46:50] CL Platform name: Default
 [2014-02-14 14:46:50] CL Platform version: OpenCL 1.1 MESA 10.2.0-devel
 [2014-02-14 14:46:50] List of devices:
 [2014-02-14 14:46:50]  0       AMD TAHITI
 [2014-02-14 14:46:50] Selected 0: AMD TAHITI
 [2014-02-14 14:46:50] Selecting scrypt kernel
 [2014-02-14 14:46:52] Initialising kernel scrypt130511.cl without bitalign, 1 vectors and worksize 256
 [2014-02-14 14:46:52] initCl() finished. Found AMD TAHITI
 [2014-02-14 14:46:58] 5s:  0.0 avg:  0.0 u:  0.0  h/s | A:0 R:0+0(none) HW:0/none
 [2014-02-14 14:47:04] 5s:  0.0 avg:  0.0 u:  0.0  h/s | A:0 R:0+0(none) HW:0/none
 [2014-02-14 14:47:10] 5s:  0.0 avg:  0.0 u:  0.0  h/s | A:0 R:0+0(none) HW:0/none
 [2014-02-14 14:47:16] 5s:  0.0 avg:  0.0 u:  0.0  h/s | A:0 R:0+0(none) HW:0/none
 [2014-02-14 14:47:22] 5s:  0.0 avg:  0.0 u:  0.0  h/s | A:0 R:0+0(none) HW:0/none
 [2014-02-14 14:47:28] 5s:  0.0 avg:  0.0 u:  0.0  h/s | A:0 R:0+0(none) HW:0/none
 [2014-02-14 14:47:34] 5s:  0.0 avg:  0.0 u:  0.0  h/s | A:0 R:0+0(none) HW:0/none
 [2014-02-14 14:47:36] Stratum from pool 0 requested work update
 [2014-02-14 14:47:40] 5s:  0.0 avg:  0.0 u:  0.0  h/s | A:0 R:0+0(none) HW:0/none
 [2014-02-14 14:47:46] 5s:  0.0 avg:  0.0 u:  0.0  h/s | A:0 R:0+0(none) HW:0/none
 [2014-02-14 14:47:52] 5s:  0.0 avg:  0.0 u:  0.0  h/s | A:0 R:0+0(none) HW:0/none
 [2014-02-14 14:47:54] OCL 0: Idle for more than 60 seconds, declaring SICK!
 [2014-02-14 14:47:54] OCL 0: Attempting to restart
 [2014-02-14 14:47:54] CL Platform 0 vendor: Mesa
 [2014-02-14 14:47:54] CL Platform 0 name: Default
 [2014-02-14 14:47:54] CL Platform 0 version: OpenCL 1.1 MESA 10.2.0-devel
 [2014-02-14 14:47:54] Platform 0 devices: 1
 [2014-02-14 14:47:54]  0       AMD TAHITI
 [2014-02-14 14:47:54] Thread 0 still exists, killing it off
 [2014-02-14 14:47:54] Reinit GPU thread 0
 [2014-02-14 14:47:54] CL Platform vendor: Mesa
 [2014-02-14 14:47:54] CL Platform name: Default
 [2014-02-14 14:47:54] CL Platform version: OpenCL 1.1 MESA 10.2.0-devel
 [2014-02-14 14:47:54] List of devices:
 [2014-02-14 14:47:54]  0       AMD TAHITI
 [2014-02-14 14:47:54] Selected 0: AMD TAHITI
 [2014-02-14 14:47:56] Initialising kernel scrypt130511.cl without bitalign, 1 vectors and worksize 256
 [2014-02-14 14:47:56] initCl() finished. Found AMD TAHITI
 [2014-02-14 14:47:56] Thread 0 restarted
 [2014-02-14 14:47:58] 5s:  0.0 avg:  0.0 u:  0.0  h/s | A:0 R:0+0(none) HW:0/none
 [2014-02-14 14:48:04] 5s:  0.0 avg:  0.0 u:  0.0  h/s | A:0 R:0+0(none) HW:0/none
 [2014-02-14 14:48:10] 5s:  0.0 avg:  0.0 u:  0.0  h/s | A:0 R:0+0(none) HW:0/none
 [2014-02-14 14:48:16] 5s:  0.0 avg:  0.0 u:  0.0  h/s | A:0 R:0+0(none) HW:0/none
 [2014-02-14 14:48:22] 5s:  0.0 avg:  0.0 u:  0.0  h/s | A:0 R:0+0(none) HW:0/none
 [2014-02-14 14:48:28] 5s:  0.0 avg:  0.0 u:  0.0  h/s | A:0 R:0+0(none) HW:0/none
 [2014-02-14 14:48:34] 5s:  0.0 avg:  0.0 u:  0.0  h/s | A:0 R:0+0(none) HW:0/none
 [2014-02-14 14:48:36] Stratum from pool 0 requested work update
 [2014-02-14 14:48:40] 5s:  0.0 avg:  0.0 u:  0.0  h/s | A:0 R:0+0(none) HW:0/none



linux 3.14 + everything from git (mesa 10.2, llvm 3.5, ...)
Comment 6 darkbasic 2014-02-14 14:02:03 UTC
Forgot to say: card is 7950 (radeonsi)
Comment 7 Christoph Haag 2014-02-14 14:34:39 UTC
Yes, I see the same now. Either a bfgminer update or an llvm/mesa update fixed the crashes.

It seems like it is trying to use the correct GPU:

 [2014-02-14 15:31:32] OCL 0: Attempting to restart
 [2014-02-14 15:31:32] CL Platform 0 vendor: Mesa
 [2014-02-14 15:31:32] CL Platform 0 name: Default
 [2014-02-14 15:31:32] CL Platform 0 version: OpenCL 1.1 MESA 10.2.0-devel
 [2014-02-14 15:31:32] Platform 0 devices: 1
 [2014-02-14 15:31:32]  0       AMD PITCAIRN

But then it doesn't calculate anything (at least nothing bfgminer can use), and then:

 [2014-02-14 15:32:43] OCL 0: Idle for more than 60 seconds, declaring SICK!
Comment 8 Tom Stellard 2014-02-14 16:18:50 UTC
Created attachment 94078 [details]
Patch to test

Can you test this patch and post the output of R600_DEBUG=cs?
Comment 9 darkbasic 2014-02-14 16:40:26 UTC
Is R600_DEBUG=cs supposed to show something on radeonsi (HD 7950)?
Because I see nothing...
Comment 10 Tom Stellard 2014-02-14 16:47:57 UTC
(In reply to comment #9)
> Is R600_DEBUG=cs supposed to show something on radeonsi (HD 7950)?
> Because I see nothing...

Are you redirecting stderr to a log file? For example:

bfgminer --options 2>logfile.txt
Comment 11 darkbasic 2014-02-14 17:43:18 UTC
Obviously I did, but I see nothing on stderr except the bfgminer output I already posted.
Comment 12 Tom Stellard 2014-02-14 17:53:55 UTC
(In reply to comment #11)
> Obviously I did it but I see nothing in stderr except the bfgminer output I
> already posted.

Does bfgminer do anything with the patch, or does it declare the GPU sick?  If bfgminer declares the GPU sick, try changing the value of UP.Threshold to something smaller, like 100.
Comment 13 Christoph Haag 2014-02-14 18:52:05 UTC
Created attachment 94094 [details]
crashing again with patch from #8

(In reply to comment #8)
> Created attachment 94078 [details]
> Patch to test
> 
> Can you test this patch and post the output of R600_DEBUG=cs

It's crashing again with

0x7f76ed4687c0: i32 = ExternalSymbol'__muldi3'

For some reason the output seems to be doubled in my log.

I hope it's not a problem that llvm was built from a newer revision than clang.
Comment 14 darkbasic 2014-02-14 18:55:29 UTC
It doesn't crash but it keeps doing nothing with both 100 and 500 UP.Threshold.
Comment 15 Tom Stellard 2014-02-14 19:28:31 UTC
If you apply the patches from this comment, it should fix the crash:

https://bugs.freedesktop.org/show_bug.cgi?id=74679#c2
Comment 16 Tom Stellard 2014-02-14 19:33:38 UTC
(In reply to comment #13)
> Created attachment 94094 [details]
> crashing again with patch from #8
> 
> (In reply to comment #8)
> > Created attachment 94078 [details]
> > Patch to test
> > 
> > Can you test this patch and post the output of R600_DEBUG=cs
> 
> It's crashing again with
> 
> 0x7f76ed4687c0: i32 = ExternalSymbol'__muldi3'
> 
> For some reason the output seems to be doubled in my log.
> 
> I hope it's not a problem that llvm was built from a newer revision than
> clang.

For some reason, every line in the logfile is printed twice.  Could you try and capture a new log?
Comment 17 Christoph Haag 2014-02-14 19:56:57 UTC
Created attachment 94100 [details]
crash with patches from the pyrit bug

Sorry about the double entries; I don't know why it does that. Maybe it's because of the ncurses user interface.

(In reply to comment #15)
> If you apply the patches from this comment, it should fix the crash:
> 
> https://bugs.freedesktop.org/show_bug.cgi?id=74679#c2

Indeed it does.

This is now with the two patches from there and the one from here, complete log attached:

LLVM ERROR: Cannot select: 0x7fac6d4530b0: i64 = add 0x7fac6d42a0a0, 0x7fac6d446bb0 [ORD=7] [ID=79]
  0x7fac6d42a0a0: i64,ch = CopyFromReg 0x7fac6c58c660, 0x7fac6d42aba0 [ORD=7] [ID=42]
    0x7fac6d42aba0: i64 = Register %vreg48 [ID=15]
  0x7fac6d446bb0: i64 = Constant<1> [ID=16]
In function: search
Comment 18 Luke-Jr 2014-02-14 20:12:12 UTC
Log output is duplicated to stderr if it is redirected. So, if you redirect stderr to a file, *only* redirect stderr (not stdout+stderr).
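For example: bfgminer --options 2>stderr.log captures only stderr, while bfgminer --options &>log.txt (bash) would send both streams into the same file.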
Comment 19 Tom Stellard 2014-02-14 20:13:51 UTC
(In reply to comment #17)
> Created attachment 94100 [details]
> crash with patches from the pyrit bug
> 
> Sorry, about the double entries, I don't know why it does that. Maybe it's
> because it's some ncurses user interface.
> 
> (In reply to comment #15)
> > If you apply the patches from this comment, it should fix the crash:
> > 
> > https://bugs.freedesktop.org/show_bug.cgi?id=74679#c2
> 
> Indeed it does.
> 
> This is now with the two patches from there and the one from here, complete
> log attached:
> 
> LLVM ERROR: Cannot select: 0x7fac6d4530b0: i64 = add 0x7fac6d42a0a0,
> 0x7fac6d446bb0 [ORD=7] [ID=79]
>   0x7fac6d42a0a0: i64,ch = CopyFromReg 0x7fac6c58c660, 0x7fac6d42aba0
> [ORD=7] [ID=42]
>     0x7fac6d42aba0: i64 = Register %vreg48 [ID=15]
>   0x7fac6d446bb0: i64 = Constant<1> [ID=16]
> In function: search

This patch should fix the new crash: http://people.freedesktop.org/~tstellar/0001-R600-SI-Custom-select-64-bit-ADD.patch
Comment 20 Christoph Haag 2014-02-14 20:43:56 UTC
@Luke-Jr

Thanks, I thought that didn't make sense, but then I noticed that I had written 2&> instead of 2>, so it was my fault after all.


It looks like it's actually running now with all your patches. AWESOME!

But 45 MH/s? Shouldn't it be more like 350 kH/s according to https://litecoin.info/Mining_hardware_comparison?



 bfgminer version 3.9.0 - Started: [2014-02-14 21:35:07] - [  0 days 00:06:07]
 [M]anage devices [P]ool management [S]ettings [D]isplay options  [H]elp [Q]uit 
 Connected to stratum11.dogehouse.org diff 0 with stratum as user XXXXXXXXX
 Block: ...53143b61 #100970  Diff:1.19k ( 8.54Gh/s)  Started: [21:40:36]
 ST:2  F:0  NB:7  AS:0  BW:[ 64/  9 B/s]  E:0.00  I: 2.03mBTC/hr  BS:0
 1            |  0.22/ 0.22/42.98Mh/s | A:4 R:12+0(none) HW:17/ 52%
--------------------------------------------------------------------------------
 OCL 0:       |  0.22/ 0.22/42.62Mh/s | A:4 R:12+0(none) HW:17/ 52%
--------------------------------------------------------------------------------
 [2014-02-14 21:37:55] ALL 5s: 0.22 avg: 0.21 u:51.35 Mh/s | A:2 R:4+0( 67%) HW:11/ 65%
 [2014-02-14 21:37:57] OCL0       | 5s: 0.22 avg: 0.21 u:50.63 Mh/s | A:2 R:4+0( 67%) HW:11/ 65%
 [2014-02-14 21:37:58] Stratum from pool 0 requested work update
 [2014-02-14 21:38:01] ALL 5s: 0.22 avg: 0.21 u:49.60 Mh/s | A:2 R:4+0( 67%) HW:11/ 65%
 [2014-02-14 21:38:03] OCL0       | 5s: 0.22 avg: 0.21 u:48.93 Mh/s | A:2 R:4+0( 67%) HW:11/ 65%
 [2014-02-14 21:38:06] ALL 5s: 0.22 avg: 0.21 u:48.18 Mh/s | A:2 R:4+0( 67%) HW:11/ 65%
 [2014-02-14 21:38:09] OCL0       | 5s: 0.22 avg: 0.21 u:47.34 Mh/s | A:2 R:4+0( 67%) HW:11/ 65%
 [2014-02-14 21:38:11] ALL 5s: 0.22 avg: 0.21 u:46.73 Mh/s | A:2 R:4+0( 67%) HW:11/ 65%
 [2014-02-14 21:38:15] OCL0       | 5s: 0.22 avg: 0.21 u:45.85 Mh/s | A:2 R:4+0( 67%) HW:11/ 65%
 [2014-02-14 21:38:17] ALL 5s: 0.22 avg: 0.21 u:45.27 Mh/s | A:2 R:4+0( 67%) HW:11/ 65%
 [2014-02-14 21:38:18] Stratum from pool 0 requested work update
Comment 21 Luke-Jr 2014-02-14 21:07:12 UTC
Old versions report the hashrate wrongly. Try the latest git, or at least a maintained version (3.5.x or 3.10.x).
Comment 22 Alex Deucher 2014-02-14 21:10:58 UTC
If you are using 3.14, you may have to force your performance level to high or revert this patch for SI parts:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ffcda352b569dcf5be5c8a5f57545794acf4adb9
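For reference (assuming the usual radeon DPM sysfs interface), forcing the performance level to high looks like this, as root:

echo high > /sys/class/drm/card0/device/power_dpm_force_performance_level

(writing "auto" restores the default behaviour)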
Comment 23 Christoph Haag 2014-02-14 21:30:22 UTC
Oh, I didn't realize the one in the Arch Linux repository was too old.

Anyway, HW errors and the percentage of invalid nonces were climbing to 70+%, so something was wrong anyway.

With bfgminer git I first got

 [2014-02-14 22:14:41] OCL 0: Idle for more than 60 seconds, declaring SICK!

but after restarting bfgminer it works again.

Now this is what I get. I guess it's not bad considering it's the first time it works without crashing. :)

 [2014-02-14 22:27:26] New block: ...3b4ec6ee #101023 diff 1.19k ( 8.54Gh/s)
 [2014-02-14 22:27:26] Stratum from pool 0 detected new block
 [2014-02-14 22:27:32] OCL0 86.0C | 20s:221.6 avg:218.1 u: 19.3 kh/s | A:1 R:1+0( 50%) HW:4/ 67%
 [2014-02-14 22:27:35] ALL 20s:217.2 avg:215.6 u: 19.1 kh/s | A:1 R:1+0( 50%) HW:5/ 71%
 [2014-02-14 22:27:46] Stratum from pool 0 requested work update
 [2014-02-14 22:27:53] New block: ...de20348d #101024 diff 1.19k ( 8.54Gh/s)
 [2014-02-14 22:27:53] Stratum from pool 0 detected new block
 [2014-02-14 22:27:55] ALL 20s:214.3 avg:215.1 u: 17.5 kh/s | A:1 R:1+0( 50%) HW:6/ 75%
 [2014-02-14 22:27:56] OCL0 88.0C | 20s:221.8 avg:218.5 u: 17.4 kh/s | A:1 R:1+0( 50%) HW:6/ 75%
 [2014-02-14 22:28:14] Stratum from pool 0 requested work update
 [2014-02-14 22:28:15] ALL 20s:232.8 avg:218.7 u: 16.1 kh/s | A:1 R:1+0( 50%) HW:10/ 83%
 [2014-02-14 22:28:19] OCL0 89.0C | 20s:221.9 avg:218.8 u: 15.8 kh/s | A:1 R:1+0( 50%) HW:10/ 83%
 [2014-02-14 22:28:33] Stratum from pool 0 requested work update
 [2014-02-14 22:28:35] ALL 20s:223.8 avg:218.0 u: 15.0 kh/s | A:1 R:1+0( 50%) HW:12/ 86%
 [2014-02-14 22:28:39] New block: ...74afb383 #101025 diff 1.19k ( 8.54Gh/s)
 [2014-02-14 22:28:39] Stratum from pool 0 detected new block
 [2014-02-14 22:28:43] OCL0 88.0C | 20s:221.9 avg:219.1 u: 14.5 kh/s | A:1 R:1+0( 50%) HW:12/ 86%
 [2014-02-14 22:28:55] ALL 20s:218.3 avg:217.5 u: 14.0 kh/s | A:1 R:1+0( 50%) HW:12/ 86%
 [2014-02-14 22:28:59] Stratum from pool 0 requested work update
 [2014-02-14 22:29:06] OCL0 86.0C | 20s:221.9 avg:219.3 u: 13.4 kh/s | A:1 R:1+0( 50%) HW:12/ 86%
 [2014-02-14 22:29:15] ALL 20s:214.9 avg:217.0 u: 13.1 kh/s | A:1 R:1+0( 50%) HW:12/ 86%
 [2014-02-14 22:29:19] Stratum from pool 0 requested work update
 [2014-02-14 22:29:30] OCL0 88.0C | 20s:221.9 avg:219.5 u: 12.5 kh/s | A:1 R:1+0( 50%) HW:12/ 86%
 [2014-02-14 22:29:35] ALL 20s:212.9 avg:216.6 u: 12.3 kh/s | A:1 R:1+0( 50%) HW:12/ 86%
Comment 24 darkbasic 2014-02-15 11:52:35 UTC
Sorry, the patch previously did nothing because when I upgraded bfgminer I forgot to add "-S opencl:auto" to the command line parameters.

I applied
0001-SelectionDAG-Factor-ISD-MUL-lowering-code-out-of-DAG.patch
0002-SelectionDAG-Use-helper-function-to-improve-legaliza.patch
0001-R600-SI-Custom-select-64-bit-ADD.patch

With "UP.Threshold = 500" it still doesn't work, while with "UP.Threshold = 100" it works but it's terribly slow: on my HD 7950 10 KH/s compared to 440 KH/s with Catalyst 14.1_beta.

GPU_MAX_ALLOC_PERCENT=100 GPU_USE_SYNC_OBJECTS=1 bfgminer -g 2 --scrypt -o stratum+tcp://hot.wemineltc.com:3335 -u user.1 -p password -I 13 -S opencl:auto

These are the parameters that give me the best results with Catalyst, and they simply hang my system with radeon.
Comment 25 Christoph Haag 2014-07-11 22:52:30 UTC
So with 3.16, the newest mesa git, llvm trunk from a few days ago, etc., I'm getting about 12.0 khash/sec out of my HD 7970M with all the settings one should use:

GPU_MAX_ALLOC_PERCENT=100 GPU_USE_SYNC_OBJECTS=1 bfgminer -v1 --scrypt -S opencl:auto --url=stratum+tcp://dogepool.pw:3334 --userpass=XXXXXX --intensity=8 --shaders=1280

from the debug output:

 [2014-07-12 00:47:12] [thread 0: 47872 hashes, 11.7 khash/sec]
 [2014-07-12 00:47:16] [thread 0: 47872 hashes, 11.7 khash/sec]
 [2014-07-12 00:47:20] [thread 0: 47872 hashes, 11.7 khash/sec]
 [2014-07-12 00:47:24] [thread 0: 47872 hashes, 11.7 khash/sec]
 [2014-07-12 00:47:28] [thread 0: 47872 hashes, 11.7 khash/sec]
 [2014-07-12 00:47:30] 20s: 11.9 avg: 11.5 u:  0.0 kh/s | A:0 R:0+0(none) HW:0/none

But it's not getting nearly as hot as it should (?) despite about 96% GPU usage according to radeontop (it's at 61°C, but I think it should be 80-90°C, and I also have it set to "high" and "performance").

And if I read that right, performance is about 1/30 of what it could be (350 kH/s).

So...

Close this bug because it does indeed run, or keep it open for the performance issues?
Comment 26 Tom Stellard 2014-07-14 14:36:50 UTC
(In reply to comment #25)
> So with 3.16, newest mesa git, llvm trunk from a few days ago etc. I'm
> getting about 12.0 khash/sec out of my HD 7970M with all the stuff one
> should set:
> 
> GPU_MAX_ALLOC_PERCENT=100 GPU_USE_SYNC_OBJECTS=1 bfgminer -v1 --scrypt -S
> opencl:auto --url=stratum+tcp://dogepool.pw:3334 --userpass=XXXXXX
> --intensity=8 --shaders=1280
> 
> from the debug output:
> 
>  [2014-07-12 00:47:12] [thread 0: 47872 hashes, 11.7 khash/sec]
>  [2014-07-12 00:47:16] [thread 0: 47872 hashes, 11.7 khash/sec]
>  [2014-07-12 00:47:20] [thread 0: 47872 hashes, 11.7 khash/sec]
>  [2014-07-12 00:47:24] [thread 0: 47872 hashes, 11.7 khash/sec]
>  [2014-07-12 00:47:28] [thread 0: 47872 hashes, 11.7 khash/sec]
>  [2014-07-12 00:47:30] 20s: 11.9 avg: 11.5 u:  0.0 kh/s | A:0 R:0+0(none)
> HW:0/none
> 
> But it's by far not getting as hot as it should (?) with about 96% gpu usage
> according to radeontop (it's 61°C but I think it should be 80-90°C and I
> also have it set to "high" and "performance").
> 
> And if I read that right performance is about 1/30 it could be (350 KH/S)
> 
> So...

Have you verified that scrypt actually works?  Do you ever get any accepted shares?  If it does work, you can close this bug and open another for the performance issues.
> 
> Close this bug because it does indeed run or keep it open for performance
> issues?
Comment 27 Christoph Haag 2014-07-14 15:18:36 UTC
(In reply to comment #26)
> Have you verified that scrypt actually works?  Do you ever get any accepted
> shares?  If it does work, you can close this bug and open another for the
> performance issues.

Hm...

I don't think so:


 [2014-07-14 17:15:40] 
Summary of runtime statistics:

 [2014-07-14 17:15:40] Started at [2014-07-14 16:40:41]
 [2014-07-14 17:15:40] Pool: stratum+tcp://dogepool.pw:3334
 [2014-07-14 17:15:40] Runtime: 0 hrs : 34 mins : 58 secs
 [2014-07-14 17:15:40] Average hashrate: 0.0 Megahash/s
 [2014-07-14 17:15:40] Solved blocks: 0
 [2014-07-14 17:15:40] Best share difficulty: 0
 [2014-07-14 17:15:40] Share submissions: 0
 [2014-07-14 17:15:40] Accepted shares: 0
 [2014-07-14 17:15:40] Rejected shares: 0 + 0 stale (-nan%)
 [2014-07-14 17:15:40] Accepted difficulty shares: 0
 [2014-07-14 17:15:40] Rejected difficulty shares: 0
 [2014-07-14 17:15:40] Hardware errors: 0
 [2014-07-14 17:15:40] Network transfer:  43.0 /   7.3 kB  ( 20.6 /   3.5  B/s)
 [2014-07-14 17:15:40] Efficiency (accepted shares * difficulty / 2 KB): 0.00
 [2014-07-14 17:15:40] Utility (accepted shares / min): 0.00/min

 [2014-07-14 17:15:40] Unable to get work from server occasions: 3
 [2014-07-14 17:15:40] Work items generated locally: 300
 [2014-07-14 17:15:40] Submitting work remotely delay occasions: 0
 [2014-07-14 17:15:40] New blocks detected on network: 30

 [2014-07-14 17:15:40] Summary of per device statistics:

 [2014-07-14 17:15:40] OCL0 66.0C | 20s: 12.2 avg: 12.1 u:  0.0 kh/s | A:0 R:0+0(none) HW:0/none
 [2014-07-14 17:15:40]  
GPU_MAX_ALLOC_PERCENT=100 GPU_USE_SYNC_OBJECTS=1 bfgminer -v1 --scrypt -S      29,40s user 26,23s system 2% cpu 35:00,31 total




For reference, to confirm that the pool works I also tried cpuminer:
minerd -a scrypt --url=stratum+tcp://dogepool.pw:3334 --userpass=XXXXXX
and I quickly get:
accepted: 1/1 (100.00%), 38.53 khash/s (yay!!!)
Comment 28 Tom Stellard 2014-08-21 16:30:31 UTC
Does this work with the latest Mesa and LLVM?  I've fixed some bugs recently that may help.
Comment 29 darkbasic 2014-08-21 17:20:30 UTC
I will try, but I have to upgrade my stack first because I switched to fglrx.
Comment 30 darkbasic 2014-08-22 13:49:31 UTC
I get no errors but a ridiculous hash rate and a 100% error rate:
 1      68.0C | 442.2/440.0/  0.0kh/s | A:0 R:0+0(none) HW:8/100%
Comment 31 darkbasic 2014-08-22 13:50:05 UTC
kernel 3.17-rc1 + drm-fixes, llvm git, mesa git
Comment 32 Christoph Haag 2014-08-23 09:21:32 UTC
Doesn't work for me either with llvm 216281 and mesa git

Oh, and maybe it's not known, at least it wasn't so clear to me: I earlier wrote that my GPU wouldn't get as warm as I expected, nor did it show the expected hashrate. This seems to be entirely controlled by the --intensity parameter. An intensity of 10 or 11 seems to work okay: not too many HW errors and higher hashrates.

But I let it run for about an hour and bfgminer didn't seem to produce a single usable result again.
Runtime: 0 hrs : 52 mins : 51 secs
Average hashrate: 0.1 Megahash/s
Solved blocks: 0
Best share difficulty: 0
Share submissions: 0
Accepted shares: 0
Rejected shares: 0 + 0 stale (-nan%)
Accepted difficulty shares: 0
Rejected difficulty shares: 0
Hardware errors: 77

(The problem is almost certainly not bfgminer. I replaced CL_DEVICE_TYPE_GPU with CL_DEVICE_TYPE_CPU, ran it via the Intel OpenCL runtime on my CPU, and there I got accepted shares within a few minutes.)
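The swap is just the device-type argument of the clGetDeviceIDs call; roughly, with illustrative variable names:

cl_platform_id platform;
cl_device_id device;
clGetPlatformIDs(1, &platform, NULL);
/* normal build: request a GPU device from the platform */
clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);
/* test build: request a CPU device instead, so a CPU OpenCL runtime picks it up */
clGetDeviceIDs(platform, CL_DEVICE_TYPE_CPU, 1, &device, NULL);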
Comment 33 Luke-Jr 2014-11-02 03:08:25 UTC
Looks like clEnqueueNDRangeKernel is re-compiling (or at least optimising?) the kernel, and taking a looooooooong time to do so. Isn't compiling supposed to be done *once* by clBuildProgram?
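For reference, the expected host-side flow (a sketch with illustrative names, not bfgminer's actual code) is build-once, enqueue-many:

#include <CL/cl.h>

/* The expensive compile happens once in clBuildProgram/clCreateKernel;
 * the mining loop afterwards should only set arguments and enqueue. */
static void run_search(cl_context ctx, cl_device_id dev, cl_command_queue q,
                       const char *src, cl_mem outbuf,
                       size_t global_ws, size_t local_ws)
{
    /* build once */
    cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, NULL);
    clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);
    cl_kernel k = clCreateKernel(prog, "search", NULL);

    /* enqueue many times -- no recompilation expected here */
    for (int i = 0; i < 1000; i++) {
        clSetKernelArg(k, 0, sizeof(cl_mem), &outbuf);
        clEnqueueNDRangeKernel(q, k, 1, NULL, &global_ws, &local_ws,
                               0, NULL, NULL);
        clFinish(q);
    }

    clReleaseKernel(k);
    clReleaseProgram(prog);
}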
Comment 34 Luke-Jr 2014-11-02 12:46:16 UTC
(In reply to Luke-Jr from comment #33)
> Looks like clEnqueueNDRangeKernel is re-compiling (or at least optimising?)
> the kernel, and taking a looooooooong time to do so. Isn't compiling
> supposed to be done *once* by clBuildProgram?

Also, it seems to upset the hardware pretty badly when it finishes (although I only have a 5850 and 6xxx for testing at the moment):

[  855.257659] radeon 0000:01:00.0: ring 3 stalled for more than 10000msec
[  855.257670] radeon 0000:01:00.0: GPU lockup (waiting for 0x000000000000963c last fence id 0x000000000000963b on ring 3)
[  855.283709] dmar: DRHD: handling fault status reg 3
[  855.283730] dmar: DMAR:[DMA Read] Request device [01:00.0] fault addr 401df000 
DMAR:[fault reason 06] PTE Read access is not set
[  855.283734] radeon 0000:01:00.0: Saved 514 dwords of commands on ring 0.
[  855.283757] radeon 0000:01:00.0: GPU softreset: 0x0000000D
[  855.283762] radeon 0000:01:00.0:   GRBM_STATUS               = 0xB0433828
[  855.283766] radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x08000007
[  855.283770] radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000007
[  855.283773] radeon 0000:01:00.0:   SRBM_STATUS               = 0x200000C0
[  855.283776] radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
[  855.283779] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[  855.283783] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x400C0000
[  855.283799] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00050002
[  855.283803] radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x80268647
[  855.283813] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44483146
[  855.298675] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x00007F6B
[  855.298731] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00100100
[  855.299894] radeon 0000:01:00.0:   GRBM_STATUS               = 0x00003828
[  855.299898] radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x00000007
[  855.299901] radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000007
[  855.299904] radeon 0000:01:00.0:   SRBM_STATUS               = 0x200000C0
[  855.299908] radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
[  855.299911] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[  855.299914] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
[  855.299918] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
[  855.299921] radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x00000000
[  855.299924] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[  855.299956] radeon 0000:01:00.0: GPU reset succeeded, trying to resume
[  855.322535] [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
[  855.323757] [drm] PCIE GART of 1024M enabled (table at 0x0000000000273000).
[  855.323855] radeon 0000:01:00.0: WB enabled
[  855.323859] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0xffff8800a7056c00
[  855.323862] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0xffff8800a7056c0c
[  855.325391] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000072118 and cpu addr 0xffffc900052b2118
[  855.341736] [drm] ring test on 0 succeeded in 3 usecs
[  855.341750] [drm] ring test on 3 succeeded in 1 usecs
[  855.519119] [drm] ring test on 5 succeeded in 2 usecs
[  855.519135] [drm] UVD initialized successfully.
[  855.523058] [drm] ib test on ring 0 succeeded in 0 usecs
[  855.523143] [drm] ib test on ring 3 succeeded in 1 usecs
[  855.675747] [drm] ib test on ring 5 succeeded
Comment 35 Luke-Jr 2014-11-22 04:27:59 UTC
(In reply to Luke-Jr from comment #34)
> [  855.257659] radeon 0000:01:00.0: ring 3 stalled for more than 10000msec
> [  855.257670] radeon 0000:01:00.0: GPU lockup (waiting for
> 0x000000000000963c last fence id 0x000000000000963b on ring 3)

Side note: The keccak kernel gave this same problem - oddly enough, unrolling the for loop going over each round fixed it O.o
Comment 36 Linux User 2014-11-24 17:43:03 UTC
Got similar issues when trying scrypt. I have a relatively recent bfgminer (~1-month-old git build) and a more or less recent graphics stack: 3.17 kernel, mesa 10.4 pre-release from the Oibaf PPA, and LLVM 3.5.1.

I have a bunch of HD5000-based cards, mostly HD5750/5770 and similar, plus an R9 270. Tests have shown that bfgminer performs correctly when computing SHA256: it reaches about 80% of Catalyst's speed on the 57xx cards and does even better than that on the R9 270.

However, things are much worse when it comes to scrypt. I got the impression the open driver can run into issues when something uses GPU VRAM aggressively.

Some observations about the open driver stack so far:
- If you set the scrypt intensity too high, there is a high risk the GPU will lock up fatally and crash (this can happen on both 57xx and R9).

- It is also exceptionally unsafe to try sha256 with vectors=4 on HD 57xx. It would be much slower than other vector settings anyway, but it is still an indication of some technical problem lurking around.

- If I set the intensity to reasonable levels, it does build the kernels and computes, BUT --benchmark ***NEVER*** accepts computed blocks: 100% reject rate. If I reduce the intensity, the error rate goes down, but there are still no accepted blocks. If I run the CPU at the same time, it computes several blocks in a time frame where the GPU gives no results at all. This indicates the computations are simply going wrong.

p.s. IMO bfgminer is a really worthwhile program to add to automated tests/regression checks, etc.

P.P.S. And what about better fan/intensity control? The problem is that if the ambient temperature is warm, about 25-26C, the 57xx cards can make really annoying noise. Heat can be reduced a bit via intensity, but that works poorly and is not really fine-grained. There is also the overheat + hysteresis setting, which can be (ab)used to cool the GPU down a bit, but it makes the fan speed up and down in an oscillating manner, which is also really annoying to hear.

Is there any proper tooling to hint DPM about a maximum acceptable fan speed or a maximum allowed GPU frequency, comparable to what Catalyst offers via ADL? The preferred behaviour would be to reduce GPU core clocks when the ambient temperature is high and raise them when it is low and the TDP/fan setup permits. This is possible with Catalyst, but Catalyst really stinks and I want to get rid of it.
Comment 37 Tom Stellard 2014-11-24 18:14:06 UTC
(In reply to Linux User from comment #36)
> Got similar issues when trying scrypt. I have relatively recent bfgminer (~1
> month old git build) and more or less recent graphics stack: 3.17 kernel,
> mesa 10.4 pre-release from Oibaf PPA and LLVM 3.5.1
> 
> I have bunch of HD5000 based cards, mostly HD5750/5770 and somesuch and R9
> 270. Tests have shown bfgminer performs correctly when computing SHA256 and
> goes about 80% of catalyst at 57xx and even better than that on R9 270. 
> 
> However things are much worse when it comes to scrypt. I got impression open
> driver can get issues if something aggressively using GPU VRAM.
> 
> Some observations about open drivers stack so far:
> - If you set scrypt intensity too high, there is high risk GPU would lock up
> in fatal way and crash (can happen on both 57xx and R9).
> 

Do you have an X server running?  One cause of lockups is bfgminer hogging the GPU, and X not being able to do anything.  I recommend trying without an X server.  For best results when not using an X server, enable render nodes in the kernel by adding drm.rnodes=1 to your kernel command line.
Comment 38 Linux User 2014-11-24 19:33:08 UTC
> Do you have an X server running?
I am (on one of the GPUs). And I understand that X could be slow, etc.; adjusting the intensity on the GPU that shares the X server is a well-known thing to me.

But I'm talking about GPU lockups. When the GPU crashes due to a ring stall and the driver has to restart it, isn't that likely something else failing? Somehow, attempts to run bfgminer --scrypt with high intensity can often provoke ring stalls.

> drm.rnodes=1 to your kernel command line.
Cool, but...
1) It would be nice to view the output on a separate monitor, and a graphical terminal looks better than a plain framebuffer console. At least for trial runs I would prefer to work in my favourite terminal, adjusting the intensity of the 1st GPU a bit.
2) I do not think apps should be able to cause fatal GPU deadlocks, effectively screwing up all graphics system-wide.

Thanks for the render-nodes hint, though - it sounds like a really valuable thing to try on headless machines, etc.

P.S. There is also another silly issue. If I just install Ubuntu and run bfgminer on a multi-GPU setup within an X session, it only sees the 1st GPU (the one the X server is running on). The remaining GPUs are not detected. The fix is either to run bfgminer as root (extremely unsafe!!!) or to create a new user with "video" as its primary group. The user who installs Ubuntu is a member of the "video" group, but "video" is his secondary group, which is very common. Somehow the kernel seems to disregard permissions in that case and returns -EPERM on a certain syscall, making bfgminer unable to find any GPUs except the one used by X. In practice it means a user can't use more than one GPU unless he is either root (very dangerous!) or video is his primary group (inconvenient and uncommon). I believe this is a bug and I should file it? I fail to understand how the average Joe is supposed to use an OpenCL program on a multi-GPU setup and get it working "by default" on all available GPUs. I guess I should file it as a new bug? Is it a kernel issue or Mesa, etc.?
Comment 39 Michel Dänzer 2014-11-25 02:46:49 UTC
(In reply to Linux User from comment #36)
> P.P.S. and what about better fan/intensity control?

See bug 73338.


(In reply to Linux User from comment #38)
> P.S. also there is another silly issue. If I just install Ubuntu and run
> bfgminer on multi-GPU setup within X session, it would only see 1st GPU
> (where X server running). Remaining GPUs are not detected. Fix is to either
> run bfgminer as root (extremely unsafe!!!) or create new user and make
> "video" it's primary group. The user who installs Ubuntu is a member of
> "video" group, but "video" is his secondary group, which is very common.
> Somehow, kernel seems to disregard permissions in such case and would issue
> -EPERM on certain syscall, making bfgminer unable to find GPUs except one
> used by X. Generally it means that user can't use more than 1 GPU unless he
> is either root (very dangerous!) or video is his primary group (inconvenient
> and uncommon). I believe it is a bug and I should file it? Since I fail to
> understand how average Joe would be able to use some OpenCL program in
> multi-GPU setup and get it working "by default" on all available GPUs. I
> guess I should file it as new bug? Is it kernel issue or MESA, etc?

You should report this to Ubuntu.
Comment 40 Tom Stellard 2014-11-26 17:07:34 UTC
(In reply to Linux User from comment #38)
> > Do you have an X server running?
> I am (on one of GPUs). And I can understand X could be slow, etc and
> adjusting intensity on GPU sharing X server is well known thing to me.
> 
> But I'm talking about GPU lockups. When GPU crashes due to ring stall and
> driver have to restart it, its likely something else failing? Somehow,
> attempts to run bfgminer --scrypt with high intensity often can provoke ring
> stalls.
> 
> > drm.rnodes=1 to your kernel command line.
> Cool, but...
> 1) It could be nice to view output on separate monitor and graphic terminal
> looks better than just framebuffer console. At least in trial runs I would
> prefer to deal with my favorite terminal, adjusting intensity of 1st GPU a
> bit.
> 2) I do not think apps should be cause fatal GPU deadlocks, effectively
> screwing all graphics, system-wide.
> 
> Though thanks for render nodes hint - sounds like it can be really valuable
> thing to try on some headless machines, etc.
> 
> P.S. also there is another silly issue. If I just install Ubuntu and run
> bfgminer on multi-GPU setup within X session, it would only see 1st GPU
> (where X server running). Remaining GPUs are not detected. Fix is to either
> run bfgminer as root (extremely unsafe!!!) or create new user and make
> "video" it's primary group. The user who installs Ubuntu is a member of
> "video" group, but "video" is his secondary group, which is very common.
> Somehow, kernel seems to disregard permissions in such case and would issue
> -EPERM on certain syscall, making bfgminer unable to find GPUs except one
> used by X. Generally it means that user can't use more than 1 GPU unless he
> is either root (very dangerous!) or video is his primary group (inconvenient
> and uncommon). I believe it is a bug and I should file it? Since I fail to
> understand how average Joe would be able to use some OpenCL program in
> multi-GPU setup and get it working "by default" on all available GPUs. I
> guess I should file it as new bug? Is it kernel issue or MESA, etc?


Enabling rendernodes should make both GPUs visible.
Comment 41 Vedran Miletić 2017-03-22 16:03:00 UTC
It has been a while. What is the current state of bfgminer on Clover?
Comment 42 Jan Vesely 2018-05-30 06:43:48 UTC
(In reply to Vedran Miletić from comment #41)
> It has been a while. What is the current state of bfgminer on Clover?

It works OK on my Carrizo/Topaz notebook:

./bfgminer -S opencl:auto --scrypt --benchmark

 [2018-05-30 02:38:36] Summary of per device statistics:

 [2018-05-30 02:38:36] OCL0        | 20s: 25.9 avg: 23.5 u: 10.6 kh/s | A:89 R:0+0(none) HW:0/none
 [2018-05-30 02:38:36] OCL1        | 20s: 31.1 avg: 28.1 u: 13.9 kh/s | A:117 R:0+0(none) HW:0/none
 [2018-05-30 02:38:36]

which is the same result as comment #25
