Created attachment 15133
Enable BSR in Log2 functions

This patch edits RADEONLog2 and ATILog2 to use the x86 BSR instruction instead of looping through bits. It should provide somewhat of a speed increase in these functions on x86 and AMD64 architectures.

Note: the BSR instruction was added with the 80386 CPU and is therefore not available on earlier CPUs, though I highly doubt it's even possible to use a 286 in conjunction with a Radeon. The inline assembly also works with Intel's compiler (icc).

Assembly output for current RADEONLog2:

RADEONLog2:
        testl   %edi, %edi
        movl    $-1, %eax
        je      .L4
        xorl    %eax, %eax
        .p2align 4,,7
.L5:
        addl    $1, %eax
        sarl    %edi
        jne     .L5
        subl    $1, %eax
.L4:
        rep ; ret

Assembly output for BSR-enabled RADEONLog2:

RADEONLog2:
        movl    %edi, %ecx
        bsrl    %ecx, %eax
        ret
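For readers without the attachment at hand, the idea can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the names `log2_loop` and `log2_bsr` are hypothetical, the unsigned argument type is an assumption, and the non-x86 fallback is added only so the sketch compiles everywhere. The zero check is kept because BSR leaves its destination undefined when the source operand is zero, whereas the looping version returns -1 in that case.

```c
/* Shift-and-count version, comparable in spirit to the looping code:
 * returns the index of the highest set bit, or -1 for an input of 0. */
static int log2_loop(unsigned int val)
{
    int bits;

    if (val == 0)
        return -1;
    for (bits = 0; val > 1; val >>= 1)
        bits++;
    return bits;
}

/* BSR-based version (hypothetical name, not taken from the patch). */
static int log2_bsr(unsigned int val)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned int index;

    /* BSR does not define its result for a zero source operand,
     * so handle that case before executing the instruction. */
    if (val == 0)
        return -1;
    /* bsrl writes the bit index of the most significant set bit. */
    __asm__ ("bsrl %1, %0" : "=r" (index) : "r" (val) : "cc");
    return (int)index;
#else
    /* Assumed fallback for other architectures or compilers. */
    return log2_loop(val);
#endif
}
```

Both functions should agree for every input; for example, an argument of 1024 yields 10 from either one.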
committed: 17cd42ed31814ba329a6a68edd0d75390a7da40e Thanks