Bug 27901

Summary:	GLSL cos/sin functions broken on Mesa R600 driver
Product:	Mesa	Reporter:	Alain Perrot <alain.perrot>
Component:	Drivers/DRI/R600	Assignee:	Default DRI bug account <dri-devel>
Status:	RESOLVED FIXED	QA Contact:
Severity:	normal
Priority:	medium	CC:	amaasikas
Version:	git
Hardware:	All
OS:	All
Whiteboard:
i915 platform:		i915 features:
Attachments:	r700_assembler.c Trig function patch r700_assembler.c Trig function & sincos patch Alternative assemble_TRIG fix Alternative assemble_TRIG and assemble_SCS fix patch to try

Description Alain Perrot 2010-04-29 12:09:06 UTC

As I was playing with OpenGL examples from Joe Groff's blog at http://duriansoftware.com/joe/An-intro-to-modern-OpenGL.-Table-of-Contents.html, I noticed that the cos and sin GLSL functions are broken on the Mesa R600 driver. Other trigonometric functions may be broken as well.

The blog entries provide two versions of a sample program that blend two images together using GLSL. The blend factor is computed from a timestamp using the sin function.

There's no problem with the example from chapter 2 where the blend factor is computed on the CPU (source code at http://github.com/jckarter/hello-gl).

Mesa R600 driver fails to properly run the example from chapter 3 where the blend factor is computed on the GPU using the GLSL sin function. Mesa software renderer has no problem running this example (source code at http://github.com/jckarter/hello-gl-ch3).


Looking at GLSL 1.10 spec, R600 ISA doc and R600 driver code (function assemble_TRIG in r700_assembler.c), I guess there is two problems :

1. both the GLSL cos/sin functions and R600 hardware expects the angle to be specified in radians, but the assemble_TRIG function divides the specified angle by 2 * PI (the result is no more radians).

2. R600 hardware expects the angle to be in the range [-PI, PI] while GLSL does not specify a range, and the assemble_TRIG function does not clip the angle in the required range.


A quick and dirty hack to make the example from chapter 3 work with R600 driver is to clip the timestamp to the [-PI, PI] range and then to multiply it by 2 * PI :

--- hello-gl.c.old      2010-04-29 20:55:55.000000000 +0200
+++ hello-gl.c  2010-04-29 20:56:12.000000000 +0200
@@ -202,7 +202,10 @@
 static void update_timer(void)
 {
     int milliseconds = glutGet(GLUT_ELAPSED_TIME);
-    g_resources.timer = (float)milliseconds * 0.001f;
+    g_resources.timer = fmodf((float)milliseconds * 0.001f, 2.0f * 3.1415926535897f);
+    if(g_resources.timer > 3.1415926535897f)
+        g_resources.timer -= 2.0f * 3.1415926535897f;
+    g_resources.timer *= 2.0f * 3.1415926535897f;
     glutPostRedisplay();
 }
 

My config:
RV670 (Radeon HD 3870)
Kubuntu 10.04 "Lucid Lynx" 64-bit
Linux kernel 2.6.34-rc5 from Ubuntu Mainline Kernel PPA
libdrm, radeon, Mesa, Xorg from Xorg-edgers PPA

Comment 1 Conn Clark 2010-05-17 15:10:30 UTC

A proper fix would be to fix the r700_assembler.c code so that it generates code to translate the radians input to a corresponding value between PI and -PI before passing it to the SIN or COS instructions.


I *think* that this could be achieved through the following r600 instructions

inst count  unit        opcode          DEST   SRC0, SRC1, SRC3
        0   x:          MUL             tmp.x, ang, 1/(2.0 * pi)
        1   x:          FRACT           tmp.x, tmp.x
        2   x:          ADD             tmp.x, tmp.x, -0.5F
            y:          MULADD          tmp.y, tmp.x, (2.0 * pi), -(2.0 * pi)
            z:          MUL             tmp.z, tmp.x, (2.0 * pi)
        3   x:          CMOVEGT         tmp.x, tmp.x, tmp.y, tmp.z
        4   t:          SIN or COS      dest,  tmp.x

Unfortunately I can't quite figure out how to do it in the code yet to try and see.

Comment 2 Conn Clark 2010-05-18 19:25:02 UTC

Created attachment 35743 [details] [review]
r700_assembler.c Trig function patch

Comment 3 Conn Clark 2010-05-18 19:33:26 UTC

Alain,

Okay, The patch I just posted might fix this bug. It doesn't cause any additional errors in piglit either.  I think its working right with http://github.com/jckarter/hello-gl-ch3 too. Of course I have only tested it on my RadeonHD 3100 and its my first attempt at r600 assembly so let me know if it works for you. 

Note: it could probably be done so its faster but I'm still not to sure on how everything works yet in the software

Conn

Comment 4 Alain Perrot 2010-05-19 15:58:12 UTC

(In reply to comment #3)
> Alain,
> 
> Okay, The patch I just posted might fix this bug. It doesn't cause any
> additional errors in piglit either.  I think its working right with
> http://github.com/jckarter/hello-gl-ch3 too. Of course I have only tested it on
> my RadeonHD 3100 and its my first attempt at r600 assembly so let me know if it
> works for you. 
> 
> Note: it could probably be done so its faster but I'm still not to sure on how
> everything works yet in the software
> 
> Conn

Thanks for your help.

Unfortunately, your patch does not fix the cos/sin functions (at least on my Radeon HD 3870 / RV670). The hello-gl-ch3 example works better but still not as expected, you can compare with Mesa software rendering.

I tried to play with your patch to make it work without success for now. A note is that there may be a mistake on the last operand to the CNDGT instruction :

+    setaddrmode_PVSSRC(&(pAsm->S[2].src), ADDR_ABSOLUTE);
+    pAsm->S[1].src.rtype = DST_REG_TEMPORARY;
+    pAsm->S[1].src.reg   = tmp2;
+    setswizzle_PVSSRC(&(pAsm->S[2].src), SQ_SEL_X);

should probably be :

+    setaddrmode_PVSSRC(&(pAsm->S[2].src), ADDR_ABSOLUTE);
+    pAsm->S[2].src.rtype = DST_REG_TEMPORARY;
+    pAsm->S[2].src.reg   = tmp2;
+    setswizzle_PVSSRC(&(pAsm->S[2].src), SQ_SEL_X);

But this update does not make the cos/sin functions work.

I will try again tomorrow.

Comment 5 Conn Clark 2010-05-19 16:46:22 UTC

On Wed, May 19, 2010 at 3:58 PM,  <bugzilla-daemon@freedesktop.org> wrote:
> https://bugs.freedesktop.org/show_bug.cgi?id=27901
>
> --- Comment #4 from Alain Perrot <alain.perrot@gmail.com> 2010-05-19 15:58:12 PDT ---
> (In reply to comment #3)
>> Alain,
>>
>> Okay, The patch I just posted might fix this bug. It doesn't cause any
>> additional errors in piglit either.  I think its working right with
>> http://github.com/jckarter/hello-gl-ch3 too. Of course I have only tested it on
>> my RadeonHD 3100 and its my first attempt at r600 assembly so let me know if it
>> works for you.
>>
>> Note: it could probably be done so its faster but I'm still not to sure on how
>> everything works yet in the software
>>
>> Conn
>
> Thanks for your help.
>
> Unfortunately, your patch does not fix the cos/sin functions (at least on my
> Radeon HD 3870 / RV670). The hello-gl-ch3 example works better but still not as
> expected, you can compare with Mesa software rendering.
>
> I tried to play with your patch to make it work without success for now. A note
> is that there may be a mistake on the last operand to the CNDGT instruction :
>
> +    setaddrmode_PVSSRC(&(pAsm->S[2].src), ADDR_ABSOLUTE);
> +    pAsm->S[1].src.rtype = DST_REG_TEMPORARY;
> +    pAsm->S[1].src.reg   = tmp2;
> +    setswizzle_PVSSRC(&(pAsm->S[2].src), SQ_SEL_X);
>
> should probably be :
>
> +    setaddrmode_PVSSRC(&(pAsm->S[2].src), ADDR_ABSOLUTE);
> +    pAsm->S[2].src.rtype = DST_REG_TEMPORARY;
> +    pAsm->S[2].src.reg   = tmp2;
> +    setswizzle_PVSSRC(&(pAsm->S[2].src), SQ_SEL_X);
>
> But this update does not make the cos/sin functions work.
>
> I will try again tomorrow.
>
> --
> Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
> ------- You are receiving this mail because: -------
> You are the assignee for the bug.
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>

Alain,

Okay I'll look at it some more tonight. At least I know I'm on the
right track. Thanks for testing.

Conn

Comment 6 Conn Clark 2010-05-20 12:39:38 UTC

Created attachment 35776 [details] [review]
r700_assembler.c Trig function & sincos patch

This patch fixes the assemble_TRIG function and eliminates two errors in piglit when running the glean/glsl1 test case.

I took the liberty of applying the same change to the assemble_SCS function as well since it to would suffer from the same problem.

Comment 7 Conn Clark 2010-05-20 12:46:03 UTC

Alain,

The patch I just posted fixes the problem on my machine completely. The problem with the previous version of the patch (other than the source mistake you found ;) )has something to do me coding in the 0.5 special constant. Replacing it with a literal 0.5 constant works.

I also extended the definition of PI 3 places to avoid drift of the values when it gets an input significantly greater or less than PI and -PI.


Let me know how it works for you.


Thanks.

Conn

Comment 8 Alain Perrot 2010-05-20 17:40:21 UTC

Created attachment 35777 [details] [review]
Alternative assemble_TRIG fix

I can confirm that your patch seems to work for me too.

By the way, you beat me at posting a working patch here :-)

I also figured out that the 0.5 special constant was an issue in your patch, and I managed to get a working assemble_TRIG function which implements the following instruction sequence (lightly different of yours) to normalize the angle:

MULADD  tmp.x, angle, 1/(2*PI), 0.5
FRACT   tmp.x, tmp.x
ADD     tmp.y, tmp.x, 1
CNDGE   tmp.x, tmp.x, tmp.x, tmp.y
MULADD  tmp.x, tmp.x, 2*PI, -PI

I don't known if it is better or worse than yours beside the fact that it use only one helper variable.

I attached my patch (updated to use the same extended value of PI than yours) which fix the assemble_TRIG function, but not the assemble_SCS one.

Comment 9 Conn Clark 2010-05-20 18:08:28 UTC

On Thu, May 20, 2010 at 5:40 PM,  <bugzilla-daemon@freedesktop.org> wrote:
> https://bugs.freedesktop.org/show_bug.cgi?id=27901
>
> --- Comment #8 from Alain Perrot <alain.perrot@gmail.com> 2010-05-20 17:40:21 PDT ---
> Created an attachment (id=35777)
>  View: https://bugs.freedesktop.org/attachment.cgi?id=35777
>  Review: https://bugs.freedesktop.org/review?bug=27901&attachment=35777
>
> Alternative assemble_TRIG fix
>
> I can confirm that your patch seems to work for me too.
>
> By the way, you beat me at posting a working patch here :-)
>
> I also figured out that the 0.5 special constant was an issue in your patch,
> and I managed to get a working assemble_TRIG function which implements the
> following instruction sequence (lightly different of yours) to normalize the
> angle:
>
> MULADD  tmp.x, angle, 1/(2*PI), 0.5
> FRACT   tmp.x, tmp.x
> ADD     tmp.y, tmp.x, 1
> CNDGE   tmp.x, tmp.x, tmp.x, tmp.y
> MULADD  tmp.x, tmp.x, 2*PI, -PI
>
> I don't known if it is better or worse than yours beside the fact that it use
> only one helper variable.
>
> I attached my patch (updated to use the same extended value of PI than yours)
> which fix the assemble_TRIG function, but not the assemble_SCS one.
>
> --
> Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
> ------- You are receiving this mail because: -------
> You are the assignee for the bug.
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>

Alain,

Its a tough call on who's is the better solution. Yours uses one less
temp reg and mine will allow for a couple of operations to be done in
parallel in the future. I guess we both deserve a pat on the back and
leave it to someone more experienced to make the call on which one to
choose.

Good job

Conn

Comment 10 Alain Perrot 2010-05-21 00:48:34 UTC

> Alain,
> 
> Its a tough call on who's is the better solution. Yours uses one less
> temp reg and mine will allow for a couple of operations to be done in
> parallel in the future. I guess we both deserve a pat on the back and
> leave it to someone more experienced to make the call on which one to
> choose.
> 
> Good job
> 
> Conn

You're probably right (and I known nothing about parallelization on GPUs).

In any way, many thanks for your help.

Comment 11 Conn Clark 2010-05-21 09:32:33 UTC

On Fri, May 21, 2010 at 12:48 AM,  <bugzilla-daemon@freedesktop.org> wrote:
> https://bugs.freedesktop.org/show_bug.cgi?id=27901
>
> --- Comment #10 from Alain Perrot <alain.perrot@gmail.com> 2010-05-21 00:48:34 PDT ---
>> Alain,
>>
>> Its a tough call on who's is the better solution. Yours uses one less
>> temp reg and mine will allow for a couple of operations to be done in
>> parallel in the future. I guess we both deserve a pat on the back and
>> leave it to someone more experienced to make the call on which one to
>> choose.
>>
>> Good job
>>
>> Conn
>
> You're probably right (and I known nothing about parallelization on GPUs).
>
> In any way, many thanks for your help.
>
> --
> Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
> ------- You are receiving this mail because: -------
> You are the assignee for the bug.
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>

Alain,

For the current code I think your patch a tad better. It uses one less
instruction. Short of benchmarking the two solutions I think yours is
the one that should go in. Would you please submit a patch that
includes the assemble_SCS function as well. After you submit it with
that change and a notice that you have singed off on it. I'll nominate
it over my own to go in.

Conn

Comment 12 Alain Perrot 2010-05-21 11:13:05 UTC

Created attachment 35787 [details] [review]
Alternative assemble_TRIG and assemble_SCS fix

Conn,

Attached is the updated patch which includes the assemble_SCS function.
If it is ok for you, I will submit it (I guess that it should be sent to the dri-devel mailing list ?)

Comment 13 Conn Clark 2010-05-21 11:56:36 UTC

On Fri, May 21, 2010 at 11:13 AM,  <bugzilla-daemon@freedesktop.org> wrote:
> https://bugs.freedesktop.org/show_bug.cgi?id=27901
>
> Alain Perrot <alain.perrot@gmail.com> changed:
>
>           What    |Removed                     |Added
> ----------------------------------------------------------------------------
>  Attachment #35777 [details]|0                           |1
>        is obsolete|                            |
>
> --- Comment #12 from Alain Perrot <alain.perrot@gmail.com> 2010-05-21 11:13:05 PDT ---
> Created an attachment (id=35787)
>  View: https://bugs.freedesktop.org/attachment.cgi?id=35787
>  Review: https://bugs.freedesktop.org/review?bug=27901&attachment=35787
>
> Alternative assemble_TRIG and assemble_SCS fix
>
> Conn,
>
> Attached is the updated patch which includes the assemble_SCS function.
> If it is ok for you, I will submit it (I guess that it should be sent to the
> dri-devel mailing list ?)
>
> --
> Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
> ------- You are receiving this mail because: -------
> You are the assignee for the bug.
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>

Alain,

Its on the mailing list.

I'll inform them to merge it after I run piglit and verify it works.

Conn

Comment 14 Conn Clark 2010-05-21 13:45:19 UTC

Alain,

Your patch passes piglit.

I'm going to change the status of this bug now to FIXED


Conn

Comment 15 Alain Perrot 2010-05-21 14:04:24 UTC

(In reply to comment #14)
> Alain,
> 
> Your patch passes piglit.
> 
> I'm going to change the status of this bug now to FIXED
> 
> 
> Conn

Ok, thanks.

Comment 16 Conn Clark 2010-06-07 09:17:41 UTC

Reopened because patch has not yet been applied

Comment 17 Alex Deucher 2010-06-07 09:30:18 UTC

Andre wanted to review this before applying.

Comment 18 Andre Maasikas 2010-06-08 05:53:36 UTC

dont' have much net this week to review/test:(
but i'm ok with it if you make last mul conditional on r700 as 
it has -1..1 range it seems, also amd shader analyzer gives this difference:

RV610 hd2400

; --------  Disassembly --------------------
00 ALU: ADDR(32) CNT(8)
      0  y: MULADD      R123.y,  R0.x,  (0x3E22F983, 0.1591549367f).x,
 0.5
         z: MOV         R0.z,  0.0f
         w: MOV         R0.w,  1.0f
      1  x: FRACT       ____,  PV0.y
      2  z: MULADD      R123.z,  PV1.x,  (0x40C90FDB, 6.283185482f).y,
-(0x40490FDB, 3.141592741f).x
      3  t: SIN         R0.x,  PV2.z
01 EXP_DONE: PIX0, R0.xzzw
END_OF_PROGRAM

4870 RV770
; --------  Disassembly --------------------
00 ALU: ADDR(32) CNT(10)
      0  y: MOV         R0.y,  0.0f
         z: MOV         R0.z,  1.0f
         w: MULADD      R123.w,  R0.x,  (0x3E22F983, 0.1591549367f).x,
 0.5
      1  y: FRACT       ____,  PV0.w
      2  x: MULADD      R123.x,  PV1.y,  (0x40C90FDB, 6.283185482f).y,
-(0x40490FDB, 3.141592741f).x
      3  z: MUL         ____,  PV2.x,  (0x3E22F983, 0.1591549367f).x
      4  t: SIN         R0.x,  PV3.z
01 EXP_DONE: PIX0, R0.xyyz
END_OF_PROGRAM

Comment 19 Conn Clark 2010-06-08 09:09:49 UTC

On Tue, Jun 8, 2010 at 5:53 AM,  <bugzilla-daemon@freedesktop.org> wrote:
> https://bugs.freedesktop.org/show_bug.cgi?id=27901
>
> --- Comment #18 from Andre Maasikas <amaasikas@gmail.com> 2010-06-08 05:53:36 PDT ---
> dont' have much net this week to review/test:(
> but i'm ok with it if you make last mul conditional on r700 as
> it has -1..1 range it seems, also amd shader analyzer gives this difference:
>
> RV610 hd2400
>
> ; --------  Disassembly --------------------
> 00 ALU: ADDR(32) CNT(8)
>      0  y: MULADD      R123.y,  R0.x,  (0x3E22F983, 0.1591549367f).x,
>  0.5
>         z: MOV         R0.z,  0.0f
>         w: MOV         R0.w,  1.0f
>      1  x: FRACT       ____,  PV0.y
>      2  z: MULADD      R123.z,  PV1.x,  (0x40C90FDB, 6.283185482f).y,
> -(0x40490FDB, 3.141592741f).x
>      3  t: SIN         R0.x,  PV2.z
> 01 EXP_DONE: PIX0, R0.xzzw
> END_OF_PROGRAM
>
> 4870 RV770
> ; --------  Disassembly --------------------
> 00 ALU: ADDR(32) CNT(10)
>      0  y: MOV         R0.y,  0.0f
>         z: MOV         R0.z,  1.0f
>         w: MULADD      R123.w,  R0.x,  (0x3E22F983, 0.1591549367f).x,
>  0.5
>      1  y: FRACT       ____,  PV0.w
>      2  x: MULADD      R123.x,  PV1.y,  (0x40C90FDB, 6.283185482f).y,
> -(0x40490FDB, 3.141592741f).x
>      3  z: MUL         ____,  PV2.x,  (0x3E22F983, 0.1591549367f).x
>      4  t: SIN         R0.x,  PV3.z
> 01 EXP_DONE: PIX0, R0.xyyz
> END_OF_PROGRAM
>
> --
> Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
> ------- You are receiving this mail because: -------
> You are the assignee for the bug.
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>

This is very very strange that amd would change the instruction. I
wonder if it is a bug in their code....   Perhaps we need someone with
an r700 to run the sin and cos tests in piglit . The proposed patch
passes on my rs780 (rv610) .

Comment 20 Conn Clark 2010-06-08 09:26:01 UTC

On Tue, Jun 8, 2010 at 9:09 AM,  <bugzilla-daemon@freedesktop.org> wrote:
> https://bugs.freedesktop.org/show_bug.cgi?id=27901
>
> --- Comment #19 from Conn Clark <conn.o.clark@gmail.com> 2010-06-08 09:09:49 PDT ---
> On Tue, Jun 8, 2010 at 5:53 AM,  <bugzilla-daemon@freedesktop.org> wrote:
>> https://bugs.freedesktop.org/show_bug.cgi?id=27901
>>
>> --- Comment #18 from Andre Maasikas <amaasikas@gmail.com> 2010-06-08 05:53:36 PDT ---
>> dont' have much net this week to review/test:(
>> but i'm ok with it if you make last mul conditional on r700 as
>> it has -1..1 range it seems, also amd shader analyzer gives this difference:
>>
>> RV610 hd2400
>>
>> ; --------  Disassembly --------------------
>> 00 ALU: ADDR(32) CNT(8)
>>      0  y: MULADD      R123.y,  R0.x,  (0x3E22F983, 0.1591549367f).x,
>>  0.5
>>         z: MOV         R0.z,  0.0f
>>         w: MOV         R0.w,  1.0f
>>      1  x: FRACT       ____,  PV0.y
>>      2  z: MULADD      R123.z,  PV1.x,  (0x40C90FDB, 6.283185482f).y,
>> -(0x40490FDB, 3.141592741f).x
>>      3  t: SIN         R0.x,  PV2.z
>> 01 EXP_DONE: PIX0, R0.xzzw
>> END_OF_PROGRAM
>>
>> 4870 RV770
>> ; --------  Disassembly --------------------
>> 00 ALU: ADDR(32) CNT(10)
>>      0  y: MOV         R0.y,  0.0f
>>         z: MOV         R0.z,  1.0f
>>         w: MULADD      R123.w,  R0.x,  (0x3E22F983, 0.1591549367f).x,
>>  0.5
>>      1  y: FRACT       ____,  PV0.w
>>      2  x: MULADD      R123.x,  PV1.y,  (0x40C90FDB, 6.283185482f).y,
>> -(0x40490FDB, 3.141592741f).x
>>      3  z: MUL         ____,  PV2.x,  (0x3E22F983, 0.1591549367f).x
>>      4  t: SIN         R0.x,  PV3.z
>> 01 EXP_DONE: PIX0, R0.xyyz
>> END_OF_PROGRAM
>>
>> --
>> Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
>> ------- You are receiving this mail because: -------
>> You are the assignee for the bug.
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>>
>
> This is very very strange that amd would change the instruction. I
> wonder if it is a bug in their code....   Perhaps we need someone with
> an r700 to run the sin and cos tests in piglit . The proposed patch
> passes on my rs780 (rv610) .
>
> --
> Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
> ------- You are receiving this mail because: -------
> You are the assignee for the bug.
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>


Nope Andre, is right.


the evergreen docs describe the sine function as follows

Scalar sine. Input must be normalized from radians by dividing by
2*PI. The valid input
domain is [-256, +256], which corresponds to an un-normalized input
domain [-512*PI,
+512*PI]. Out-of-range input results in float 0.


while the r600 describes the sine function as follows

Scalar sine. Valid input domain [-PI, +PI].

Comment 21 Conn Clark 2010-06-08 09:36:57 UTC

On Tue, Jun 8, 2010 at 9:25 AM, Conn Clark <conn.o.clark@gmail.com> wrote:
> On Tue, Jun 8, 2010 at 9:09 AM,  <bugzilla-daemon@freedesktop.org> wrote:
>> https://bugs.freedesktop.org/show_bug.cgi?id=27901
>>
>> --- Comment #19 from Conn Clark <conn.o.clark@gmail.com> 2010-06-08 09:09:49 PDT ---
>> On Tue, Jun 8, 2010 at 5:53 AM,  <bugzilla-daemon@freedesktop.org> wrote:
>>> https://bugs.freedesktop.org/show_bug.cgi?id=27901
>>>
>>> --- Comment #18 from Andre Maasikas <amaasikas@gmail.com> 2010-06-08 05:53:36 PDT ---
>>> dont' have much net this week to review/test:(
>>> but i'm ok with it if you make last mul conditional on r700 as
>>> it has -1..1 range it seems, also amd shader analyzer gives this difference:
>>>
>>> RV610 hd2400
>>>
>>> ; --------  Disassembly --------------------
>>> 00 ALU: ADDR(32) CNT(8)
>>>      0  y: MULADD      R123.y,  R0.x,  (0x3E22F983, 0.1591549367f).x,
>>>  0.5
>>>         z: MOV         R0.z,  0.0f
>>>         w: MOV         R0.w,  1.0f
>>>      1  x: FRACT       ____,  PV0.y
>>>      2  z: MULADD      R123.z,  PV1.x,  (0x40C90FDB, 6.283185482f).y,
>>> -(0x40490FDB, 3.141592741f).x
>>>      3  t: SIN         R0.x,  PV2.z
>>> 01 EXP_DONE: PIX0, R0.xzzw
>>> END_OF_PROGRAM
>>>
>>> 4870 RV770
>>> ; --------  Disassembly --------------------
>>> 00 ALU: ADDR(32) CNT(10)
>>>      0  y: MOV         R0.y,  0.0f
>>>         z: MOV         R0.z,  1.0f
>>>         w: MULADD      R123.w,  R0.x,  (0x3E22F983, 0.1591549367f).x,
>>>  0.5
>>>      1  y: FRACT       ____,  PV0.w
>>>      2  x: MULADD      R123.x,  PV1.y,  (0x40C90FDB, 6.283185482f).y,
>>> -(0x40490FDB, 3.141592741f).x
>>>      3  z: MUL         ____,  PV2.x,  (0x3E22F983, 0.1591549367f).x
>>>      4  t: SIN         R0.x,  PV3.z
>>> 01 EXP_DONE: PIX0, R0.xyyz
>>> END_OF_PROGRAM
>>>
>>> --
>>> Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
>>> ------- You are receiving this mail because: -------
>>> You are the assignee for the bug.
>>> _______________________________________________
>>> dri-devel mailing list
>>> dri-devel@lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>>>
>>
>> This is very very strange that amd would change the instruction. I
>> wonder if it is a bug in their code....   Perhaps we need someone with
>> an r700 to run the sin and cos tests in piglit . The proposed patch
>> passes on my rs780 (rv610) .
>>
>> --
>> Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
>> ------- You are receiving this mail because: -------
>> You are the assignee for the bug.
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>>
>
>
> Nope Andre, is right.
>
>
> the evergreen docs describe the sine function as follows
>
> Scalar sine. Input must be normalized from radians by dividing by
> 2*PI. The valid input
> domain is [-256, +256], which corresponds to an un-normalized input
> domain [-512*PI,
> +512*PI]. Out-of-range input results in float 0.
>
>
> while the r600 describes the sine function as follows
>
> Scalar sine. Valid input domain [-PI, +PI].
>
>
> --
>
> Conn O. Clark
>
> Observation: In formal computer science advances are made
> by standing on the shoulders of giants. Linux has proved
> that if there are enough of you, you can advance just as
> far by stepping on each others toes.
>

I guess I'll modify Alain's patch tonight. If r700 has a valid range
-256 to 256 we can get away with just using a FRACT instruction with
2*PI.

Comment 22 Andre Maasikas 2010-07-29 07:58:26 UTC

Created attachment 37437 [details] [review]
patch to try

If you could try the attached patch on r600,
it passes fine for me on r700.

Comment 23 Alain Perrot 2010-07-29 09:12:45 UTC

I have just tested your patch with my RV670 GPU, and the hello-gl-ch3 example program works fine here.

The piglit glean/glsl1-cos(vec4) and glean/glsl1-sin(vec4) tests are also fixed with this patch.

Thanks!

Comment 24 Alain Perrot 2010-08-03 10:05:39 UTC

I've seen that Andre's patch has been committed to Mesa repository (commit d6a5f94ea4d03b05c434fcad125d1f9c50c638e8), so I'm closing this bug as fixed.

Thanks again!

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.