111241 – Shadertoy shader causing hang

Bug 111241 - Shadertoy shader causing hang

Summary: Shadertoy shader causing hang

Status:	RESOLVED FIXED

Alias:	None

Product:	Mesa
Classification:	Unclassified
Component:	Drivers/Gallium/radeonsi (show other bugs)
Version:	19.1
Hardware:	Other All

Importance:	medium normal
Assignee:	Default DRI bug account
QA Contact:	Default DRI bug account

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2019-07-28 18:06 UTC by Felix Potthast
Modified:	2019-09-03 12:00 UTC (History)
CC List:	0 users

See Also:
i915 platform:
i915 features:

Attachments
tgsi version of the shader (47.00 KB, text/plain) 2019-08-09 16:22 UTC, Pierre-Eric Pelloux-Prayer	Details
nir version (52.81 KB, text/plain) 2019-08-09 16:22 UTC, Pierre-Eric Pelloux-Prayer	Details
Shadertoy-EOT-Grid-scene-2.png (775.54 KB, image/png) 2019-08-30 00:06 UTC, Dieter Nützel	Details
View All

Description Felix Potthast 2019-07-28 18:06:39 UTC

When opening https://www.shadertoy.com/view/3lsXDB on my Desktop PC
with Radeon HD 7870 Graphics card (Pitcairn) i get a freeze.

It works fine on my Laptop with Intel Graphics.

Both systems use Mesa 19.1.3

Comment 1 Pierre-Eric Pelloux-Prayer 2019-08-08 21:34:59 UTC

I could reproduce the issue on a Raven Ridge and a Navi10.

But when using NIR (radeonsi_enable_nir=true env variable) the shader is perfectly usable.

Comment 2 Dieter Nützel 2019-08-09 00:15:33 UTC

RX 580 / NIR
amd-staging-drm-next
Mesa git
Firefox 68.0.1

[42489.228053] [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences timed out or interrupted!                                                                                  
[42494.348053] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
[42508.171689] [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences timed out or interrupted!

[42513.035801] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered

[42556.811021] [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences timed out or interrupted!                                                                                  
[42561.418927] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered

[42571.658863] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered

Comment 3 Pierre-Eric Pelloux-Prayer 2019-08-09 16:22:09 UTC

Here's my understanding of the issue.

This shader uses 2 passes:
 - the first pass has BufferA as input and output and does:

if (first frame)
  // init bufferA content
else
  // do something useful

 - the 2nd pass has BufferA as input and does:

N = texelFetch(bufferA)
for(i=0; i < N; i++)
  // do something


The problem here is the "// init bufferA content": it fails to initialize the buffer content properly, leading to an infinite loop in the 2nd pass.

The exact code is:
   if (iFrame==0) { O -= O; return; }

If one replaces this line with:
   if (iFrame==0) { O = vec4(0.0f); return; }

The shader works fine (you can test the modified version here: https://www.shadertoy.com/view/wtSXzw ).

Comment 4 Pierre-Eric Pelloux-Prayer 2019-08-09 16:22:42 UTC

Created attachment 144993 [details]
tgsi version of the shader

Comment 5 Pierre-Eric Pelloux-Prayer 2019-08-09 16:22:59 UTC

Created attachment 144994 [details]
nir version

Comment 6 Pierre-Eric Pelloux-Prayer 2019-08-14 12:56:48 UTC

I've created a different shadertoy showing the problem: https://www.shadertoy.com/view/Wt2SW1 (but this one doesn't hang the GPU).

The shader for "Buffer A" is:

  0: MOV TEMP[0], SV[0]
  1: MAD TEMP[0].y, SV[0], CONST[0][2].xxxx, CONST[0][2].yyyy
  2: MOV OUT[0], IMM[0].xxxx
  3: USEQ TEMP[1].x, CONST[0][1].xxxx, IMM[1].xxxx
  4: UIF TEMP[1].xxxx
  5:   ADD TEMP[2], TEMP[2], -TEMP[2]
  6: ELSE
  [...]
 13: MOV OUT[0], TEMP[2]
 14: END

TEMP[2] is used before being assigned a value, so I suppose that's what allows LLVM to turn line 5 in:

   v_mov_b32_e32 v3, 0x7fc00000
   v_mov_b32_e32 v2, 0x7fc00000
   v_mov_b32_e32 v1, 0x7fc00000
   v_mov_b32_e32 v0, 0x7fc00000

(ie: output is NaN)

A possible way to fix this is to transform "dst = x - x" operations in "dst = 0" which is what nir does in its nir_opt_algebraic pass.

I've open a MR to fix/discuss this issue: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1681

Comment 7 Dieter Nützel 2019-08-29 23:57:01 UTC

Works for me with commit #

glsl: replace 'x + (-x)' with constant 0
https://cgit.freedesktop.org/mesa/mesa/commit/?id=47cc660d9c19572e5ef2dce7c8ae1766a2ac9885

Thanks Pierre-Eric!

Comment 8 Dieter Nützel 2019-08-30 00:05:57 UTC

BTW

Pierre-Eric can you look into this

Shadertoy shader corruption, too?
https://www.shadertoy.com/view/Xt3cWS

I get it with

Konqueror 5.0.97
KDE Frameworks 5.61.0
Qt 5.13.0

And

Firefox 68.0.1

Comment 9 Dieter Nützel 2019-08-30 00:06:57 UTC

Created attachment 145215 [details]
Shadertoy-EOT-Grid-scene-2.png

Comment 10 Dieter Nützel 2019-08-30 00:11:40 UTC

(In reply to Dieter Nützel from comment #9)
> Created attachment 145215 [details]
> Shadertoy-EOT-Grid-scene-2.png

Saw it first @ 2. November 2018

Comment 11 Pierre-Eric Pelloux-Prayer 2019-08-30 11:56:23 UTC

(In reply to Dieter Nützel from comment #8)
> BTW
> 
> Pierre-Eric can you look into this
> 
> Shadertoy shader corruption, too?
> https://www.shadertoy.com/view/Xt3cWS
>

The "Buffer A" shader doesn't write fragColor when uv.y is < 0.1 or > 0.9.

So the content is undefined and may be black on some platform or random.

radeonsi is correct here, but we might want to replace undef values with 0x0 to get a default value instead of random.

Comment 12 Dieter Nützel 2019-08-30 23:57:36 UTC

(In reply to Pierre-Eric Pelloux-Prayer from comment #11)
> (In reply to Dieter Nützel from comment #8)
> > BTW
> > 
> > Pierre-Eric can you look into this
> > 
> > Shadertoy shader corruption, too?
> > https://www.shadertoy.com/view/Xt3cWS
> >
> 
> The "Buffer A" shader doesn't write fragColor when uv.y is < 0.1 or > 0.9.
> 
> So the content is undefined and may be black on some platform or random.
> 
> radeonsi is correct here, but we might want to replace undef values with 0x0
> to get a default value instead of random.

Cool to have you around for bug hunting...;-)

Any hints where I shoud change 'undef values with 0x0' for testing?

And sorry that I 'hijacked' this thread - should I open a new ticket?

Comment 13 Timothy Arceri 2019-09-02 23:47:12 UTC

(In reply to Dieter Nützel from comment #12)
> (In reply to Pierre-Eric Pelloux-Prayer from comment #11)
> > (In reply to Dieter Nützel from comment #8)
> > > BTW
> > > 
> > > Pierre-Eric can you look into this
> > > 
> > > Shadertoy shader corruption, too?
> > > https://www.shadertoy.com/view/Xt3cWS
> > >
> > 
> > The "Buffer A" shader doesn't write fragColor when uv.y is < 0.1 or > 0.9.
> > 
> > So the content is undefined and may be black on some platform or random.
> > 
> > radeonsi is correct here, but we might want to replace undef values with 0x0
> > to get a default value instead of random.
> 
> Cool to have you around for bug hunting...;-)
> 
> Any hints where I shoud change 'undef values with 0x0' for testing?
> 
> And sorry that I 'hijacked' this thread - should I open a new ticket?

I don't think you need to open a bug for it at all. As its not a bug in Mesa its a shader bug.

Closing this bug report as it should be fixed by:

commit	47cc660d9c19572e5ef2dce7c8ae1766a2ac9885
glsl: replace 'x + (-x)' with constant 0
   This fixes a hang in shadertoy for radeonsi where a buffer was initialized with:

   value -= value

   with value being undefined.
   In this case LLVM replace the operation with an assignment to NaN.

   Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
   Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111241
   Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Comment 14 Pierre-Eric Pelloux-Prayer 2019-09-03 12:00:34 UTC

> > And sorry that I 'hijacked' this thread - should I open a new ticket?
> 
> I don't think you need to open a bug for it at all. As its not a bug in Mesa
> its a shader bug.
> 

It's not a bug in Mesa but if adopting the same behavior than other drivers is cheap performance-wise I think we should consider it.

For instance, adding a prolog to PS that always initializes v0, v1, v2 to 0 could do the trick... maybe there are other/better ways though.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.