Bug 49140 - r600_state_common.c:761:r600_draw_vbo: Assertion `0' failed
Status: RESOLVED MOVED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Gallium/r600
Version: git
Hardware: All Linux (All)
Importance: medium normal
Assignee: Default DRI bug account
 
Reported: 2012-04-25 05:21 UTC by Marko Srebre
Modified: 2019-09-18 18:59 UTC (History)

Attachments
R600_DUMP_SHADERS (44.66 KB, text/plain)
2012-04-25 05:21 UTC, Marko Srebre
lorentzTransform function (2.22 KB, text/plain)
2012-04-26 17:38 UTC, Vadim Girlin

Description Marko Srebre 2012-04-25 05:21:29 UTC
Created attachment 60568
R600_DUMP_SHADERS

Hello,

I have an application that uses a rather complex shader with some branching (while/if). The application fails with the r600 driver, giving the following error:

EE r600_shader.c:140 r600_pipe_shader_create - translation from TGSI failed !
r600_state_common.c:761:r600_draw_vbo: Assertion `0' failed.

It seems something goes wrong in the branching section, since the shader works if I comment that section out. The same shader works fine with either LIBGL_ALWAYS_SOFTWARE=1 or fglrx. I also remember it working fine with an older revision of r600; unfortunately I don't know exactly which one.

I have attached R600_DUMP_SHADERS output. If needed I can also provide links to source code or any other data that may be helpful in debugging.

Some relevant parts of glxinfo:

OpenGL vendor string: X.Org
OpenGL renderer string: Gallium 0.4 on AMD RV770
OpenGL version string: 2.1 Mesa 8.1-devel (git-1a33c1b precise-oibaf-ppa)

Best regards,
Marko
Comment 1 Vadim Girlin 2012-04-25 16:06:28 UTC
Probably the register limit. The shader uses 5 inputs + 8 outputs + 112 temps = 125 registers. I think it should work if you can get that below 120.
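
The count above can be reproduced with a minimal Python sketch. The names `register_usage`, `fits_budget` and `REGISTER_BUDGET` are hypothetical, and the value 120 is simply the "less than 120" figure from this comment, not a documented hardware constant.

```python
# Assumption: 120 is the threshold quoted in the comment above,
# not a value taken from hardware documentation.
REGISTER_BUDGET = 120


def register_usage(inputs, outputs, temps):
    """Total registers as counted in the comment: inputs + outputs + temps."""
    return inputs + outputs + temps


def fits_budget(inputs, outputs, temps, budget=REGISTER_BUDGET):
    """True if the total stays strictly below the assumed budget."""
    return register_usage(inputs, outputs, temps) < budget


if __name__ == "__main__":
    # Figures for the failing shader, taken from the comment above.
    total = register_usage(5, 8, 112)
    print(total, fits_budget(5, 8, 112))
```

With the inputs and outputs fixed, only the temporaries can shrink, so under this assumed budget the shader would need fewer than 107 of them.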
Comment 2 Marko Srebre 2012-04-26 00:27:46 UTC
Any suggestions on how to accomplish that? I tried turning my code around and around, but so far with no success. What should help with the register count?

On the other hand, why is the register limit set so low? I just tested the program with Mesa 7.11, same hardware and r600 drivers - it works. This seems like a regression.

If needed, there are binaries/source available for testing at http://thelarge.org
Comment 3 Vadim Girlin 2012-04-26 03:43:45 UTC
(In reply to comment #2)
> Any suggestions on how to accomplish that? I tried turning my code around and
> around, but so far with no success. What should help with the register count?
> 
> On the other hand, why is the register limit set so low? I just tested the
> program with Mesa 7.11, same hardware and r600 drivers - it works. This seems
> like a regression.
> If needed, there are binaries/source available for testing at
> http://thelarge.org

It's a hardware limit. In theory the compiler should optimize register allocation, but the problem is that r600g still lacks a real register allocator. Some changes since 7.11 have probably also increased register usage in the TGSI IR.

I'll see if I can help with that shader somehow, but generally r600g needs a better shader compiler. There is some work in progress on that, but I don't know when it will be completed.

There is also some experimental code that could probably help with that, but currently it works only with Evergreen GPUs. If you can use a GPU of the Evergreen class (IIRC that's all of the 5xxx and some of the 6xxx cards), then you might want to try the r600_shader_opt and r600_shader_opt_2 branches from the following repo: https://github.com/VadimGirlin/mesa
Comment 4 Vadim Girlin 2012-04-26 17:38:49 UTC
Created attachment 60642
lorentzTransform function

It seems you could replace the following lines in the lorentzTransform function:

    r.w = g*p.w - v.x*g*p.x - v.y*g*p.y - v.z*g*p.z;
    r.x = -v.x*g*p.w + (1.0 + gm1*v.x*v.x/v2)*p.x + (gm1*v.x*v.y/v2)*p.y + (gm1*v.x*v.z/v2)*p.z;
    r.y = -v.y*g*p.w + (gm1*v.y*v.x/v2)*p.x + (1.0 + gm1*v.y*v.y/v2)*p.y + (gm1*v.y*v.z/v2)*p.z;
    r.z = -v.z*g*p.w + (gm1*v.z*v.x/v2)*p.x + (gm1*v.z*v.y/v2)*p.y + (1.0+gm1*v.z*v.z/v2)*p.z;

with 

    vec3 p3 = vec3(p.x, p.y, p.z);
    float t = dot(v, p3);
    float t2 = gm1*t/v2 - g*p.w;
    r = vec4( v*t2 + p3, g * (p.w - t));

Attachment contains the complete text of the modified function with the separate steps of the transformation in the comments. Please check if all steps are correct. Anyway, it shows the direction.

The original shader uses 130 registers and 1262 VLIW ALU instructions on my system.
The modified version uses 81 registers and 778 instructions.
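
For anyone who wants to double-check the algebra off the GPU, here is a minimal Python sketch that evaluates both forms on the same input and compares them. Writing t = dot(v, p3), the spatial part of the original collapses to p3 + v*(gm1*t/v2 - g*p.w) and the time part to g*(p.w - t), which is what the compact version computes. The function names are hypothetical, and `boost_parameters` assumes v2 = dot(v, v), g = 1/sqrt(1 - v2) with c = 1, and gm1 = g - 1; the thread does not show those definitions, and the identity itself does not depend on them.

```python
import math


def dot3(a, b):
    """Dot product of two 3-component tuples."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def lorentz_original(v, p, g, gm1, v2):
    """Component-wise form from the original shader; p is (x, y, z, w)."""
    vx, vy, vz = v
    px, py, pz, pw = p
    rw = g * pw - vx * g * px - vy * g * py - vz * g * pz
    rx = (-vx * g * pw + (1.0 + gm1 * vx * vx / v2) * px
          + (gm1 * vx * vy / v2) * py + (gm1 * vx * vz / v2) * pz)
    ry = (-vy * g * pw + (gm1 * vy * vx / v2) * px
          + (1.0 + gm1 * vy * vy / v2) * py + (gm1 * vy * vz / v2) * pz)
    rz = (-vz * g * pw + (gm1 * vz * vx / v2) * px
          + (gm1 * vz * vy / v2) * py + (1.0 + gm1 * vz * vz / v2) * pz)
    return (rx, ry, rz, rw)


def lorentz_compact(v, p, g, gm1, v2):
    """Compact form from the comment above, using a single dot product."""
    p3 = p[:3]
    pw = p[3]
    t = dot3(v, p3)
    t2 = gm1 * t / v2 - g * pw
    return (v[0] * t2 + p3[0], v[1] * t2 + p3[1], v[2] * t2 + p3[2],
            g * (pw - t))


def boost_parameters(v):
    """Assumption: v2 = |v|^2, g = 1/sqrt(1 - v2) with c = 1, gm1 = g - 1."""
    v2 = dot3(v, v)
    g = 1.0 / math.sqrt(1.0 - v2)
    return g, g - 1.0, v2


def max_abs_difference(v, p):
    """Largest component-wise difference between the two forms."""
    g, gm1, v2 = boost_parameters(v)
    a = lorentz_original(v, p, g, gm1, v2)
    b = lorentz_compact(v, p, g, gm1, v2)
    return max(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    print(max_abs_difference((0.3, -0.4, 0.5), (1.0, 2.0, 3.0, 4.0)))
```

The difference should only be floating-point rounding noise. Both forms divide by v2, so neither is defined for v = 0.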
Comment 5 Marko Srebre 2012-05-02 22:42:10 UTC
 
>     vec3 p3 = vec3(p.x, p.y, p.z);
>     float t = dot(v, p3);
>     float t2 = gm1*t/v2 - g*p.w;
>     r = vec4( v*t2 + p3, g * (p.w - t));
> 

Vadim, this is really great and much appreciated. I went through the steps and it all seems fine to me; I also tested some examples and everything works great. This also makes the advanced path in the shader work again without my quirky workarounds in the while loop. I never really took much care with this kind of optimisation, somehow blindly hoping that the compiler would automagically optimise everything. I'll try to be more careful in future.

I'd like to put some comments in the code, like "optimised by Vadim" if that's ok with you or perhaps should I use your real name? Thanks.
Comment 6 Vadim Girlin 2012-05-03 03:16:19 UTC
(In reply to comment #5)
> >     vec3 p3 = vec3(p.x, p.y, p.z);
> >     float t = dot(v, p3);
> >     float t2 = gm1*t/v2 - g*p.w;
> >     r = vec4( v*t2 + p3, g * (p.w - t));
> > 
> 
> Vadim, this is really great and much appreciated. I went through the steps and
> it all seems fine to me, also tested some examples and everything works great.
> This also makes the advanced path in the shader work again without my quirky
> workarounds in the while loop. I never really took much care with these kind of
> optimisations, somehow blindly hoping that the compiler will automagically
> optimise everything. I'll try to be more careful in advance.
> 
> I'd like to put some comments in the code, like "optimised by Vadim" if that's
> ok with you or perhaps should I use your real name? Thanks.

Ah, yes, I forgot to set my full name in the account here. Updated now, so you can use it if you want.
Comment 7 GitLab Migration User 2019-09-18 18:59:20 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/408.

