Bug 85596

Summary: SB is used only after GPR check / GPR max is not dynamic
Product: Mesa
Reporter: Lauri Kasanen <cand>
Component: Drivers/Gallium/r600
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED MOVED
QA Contact:
Severity: normal    
Priority: medium    
Version: git   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Bug Depends on: 69623, 74868, 81683    
Bug Blocks:    

Description Lauri Kasanen 2014-10-29 11:37:25 UTC
Many big shaders that currently fail with
r600_shader_select - Failed to build shader variant (type=1) -12
r600_shader_from_tgsi - GPR limit exceeded - shader requires foo registers

would actually work if the GPR check was moved to after SB, as SB reduces the GPR usage quite nicely.

Another thing is that 128 is the lowest common denominator, and many cards have 192 or 256, which could be checked with the radeon_info ioctl.
Comment 1 Alex Deucher 2014-10-29 13:31:08 UTC
(In reply to Lauri Kasanen from comment #0)
> Many big shaders that currently fail with
> r600_shader_select - Failed to build shader variant (type=1) -12
> r600_shader_from_tgsi - GPR limit exceeded - shader requires foo registers
> 
> would actually work if the GPR check was moved to after SB, as SB reduces
> the GPR usage quite nicely.
> 
> Another thing is that 128 is the lowest common denominator, and many cards
> have 192 or 256, which could be checked with the radeon_info ioctl.

From an ISA perspective, there are only 128 GPRs.  The higher limits are hw internal details.
Comment 2 Vadim Girlin 2014-10-29 21:36:27 UTC
(In reply to Lauri Kasanen from comment #0)
> Many big shaders that currently fail with
> r600_shader_select - Failed to build shader variant (type=1) -12
> r600_shader_from_tgsi - GPR limit exceeded - shader requires foo registers
> 
> would actually work if the GPR check was moved to after SB, as SB reduces
> the GPR usage quite nicely.
> 
> Another thing is that 128 is the lowest common denominator, and many cards
> have 192 or 256, which could be checked with the radeon_info ioctl.

As Alex said, the ISA encoding doesn't allow instructions to address more than 128 registers. IIRC we also reserve 4 GPRs by default as temporaries (they are not preserved between ALU clauses), so the actual limit is 124 (or even 120?).
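
As a rough illustration of that arithmetic, here is a minimal C sketch. The constant and function names are made up for this example and are not the actual r600g identifiers, and the reserved count of 4 is the from-memory figure above, not a verified value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; the values come from the discussion above. */
enum {
    ISA_ADDRESSABLE_GPRS = 128, /* registers an instruction can address */
    RESERVED_TEMP_GPRS   = 4    /* assumed: clause temporaries reserved by default */
};

/* Number of GPRs left for the shader itself. */
static int usable_gprs(void)
{
    return ISA_ADDRESSABLE_GPRS - RESERVED_TEMP_GPRS;
}

/* True if a shader that needs `required` GPRs fits in the budget. */
static bool shader_fits(int required)
{
    assert(required >= 0);
    return required <= usable_gprs();
}
```

With these assumed numbers a shader needing 124 registers passes the check and one needing 125 fails it, regardless of how many physical registers the card has.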

That's also why we can't simply move the GPR check: the shader is passed from the TGSI translator to SB in the ISA encoding, which can't represent code that uses more than 128 registers.
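
To make the representation problem concrete, here is a small C sketch that assumes a 7-bit register index field, which is the width a 128-register limit implies. The macro and function names are hypothetical and do not reflect the real r600 bytecode layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed field width: 7 bits can hold indices 0..127. */
#define GPR_FIELD_BITS 7u
#define GPR_FIELD_MASK ((1u << GPR_FIELD_BITS) - 1u)

/*
 * Try to pack a register index into the field.
 * Returns false when the index cannot be represented, which is the
 * situation a translator hits before an optimizer ever sees the code.
 */
static bool encode_gpr(unsigned index, uint32_t *field)
{
    assert(field != NULL);
    if (index > GPR_FIELD_MASK)
        return false;
    *field = index & GPR_FIELD_MASK;
    return true;
}
```

Under this assumption index 127 encodes fine while index 128 has no encoding at all, so the intermediate form handed to SB cannot carry a shader that temporarily exceeds the limit, even if SB would later shrink it.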

If anyone would like to revive a direct TGSI->SB translator that solves the problem, here is the branch: http://cgit.freedesktop.org/~vadimg/mesa/log/?h=wip-sb-tgsi
There were no piglit regressions with that branch on evergreen when it was implemented, but now I suspect it's a bit outdated.
Comment 3 Lauri Kasanen 2014-10-30 12:34:57 UTC
Thanks, at least that info is now in one place.
Comment 4 GitLab Migration User 2019-09-18 19:17:44 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe to and participate further in the new issue on our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/530.
