Bug 83319 - [r600g] GPU lockup in gsraytrace (Mesa-demo-8.2.0) - RV730
Summary: [r600g] GPU lockup in gsraytrace (Mesa-demo-8.2.0) - RV730
Status: RESOLVED MOVED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Gallium/r600
Version: git
Hardware: x86 (IA32) Linux (All)
Importance: medium critical
Assignee: Default DRI bug account
 
Reported: 2014-08-31 22:59 UTC by Dieter Nützel
Modified: 2019-09-18 19:17 UTC



Attachments
dmesg-3.16.1-7.g90bc0f1-desktop-gsraytrace.log (56.71 KB, text/plain)
2014-08-31 22:59 UTC, Dieter Nützel
Xorg.0.log-3.16.1-7.g90bc0f1-desktop (53.98 KB, text/plain)
2014-08-31 23:00 UTC, Dieter Nützel
gsraytrace-R600_DEBUG-ps-vs.log (147.74 KB, text/plain)
2014-09-01 11:40 UTC, Dieter Nützel
gsraytrace-MESA_GLSL-dump.log (149.19 KB, text/plain)
2014-09-01 11:40 UTC, Dieter Nützel
kmsg: gsraytrace lockup on HD6850 (126.25 KB, text/plain)
2015-09-02 18:23 UTC, Heiko

Description Dieter Nützel 2014-08-31 22:59:35 UTC
Created attachment 105518
dmesg-3.16.1-7.g90bc0f1-desktop-gsraytrace.log

GL_RENDERER   = Gallium 0.4 on AMD RV730
GL_VERSION    = 3.0 Mesa 10.4.0-devel (git-5598458)
GL_VENDOR     = X.Org

A geometry shader problem?

mesa-demos/glsl> ./vsraytrace 
ATTENTION: default value of option vblank_mode overridden by environment.
GL_RENDERER = Gallium 0.4 on AMD RV730
47.590482 FPS (238 frames in 5.001000 seconds)
62.325210 FPS (312 frames in 5.006000 seconds)
65.407555 FPS (329 frames in 5.030000 seconds)

mesa-demos/glsl> ./fsraytrace 
ATTENTION: default value of option vblank_mode overridden by environment.
GL_RENDERER = Gallium 0.4 on AMD RV730
503.000000 FPS (2515 frames in 5.000000 seconds)
626.800000 FPS (3134 frames in 5.000000 seconds)
577.600000 FPS (2888 frames in 5.000000 seconds)

mesa-demos/glsl> ./gsraytrace 
ATTENTION: default value of option vblank_mode overridden by environment.
GL_RENDERER = Gallium 0.4 on AMD RV730

ESC                 = exit demo
left mouse + drag   = rotate camera

0.044960 FPS (1 frames in 22.242000 seconds)
0.047201 FPS (1 frames in 21.186000 seconds)

Pressed ESC / Ctrl+C (many times, possibly hundreds)

=> The display switched twice between a blank (black) full-screen console and the desktop (KDE 4.13.3) before the system came back.
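
To put the demo output above in perspective: each FPS line is the frame count divided by the elapsed seconds. The snippet below is only a sketch of that arithmetic; `frames_per_second` is a hypothetical helper name, not code taken from mesa-demos.

```c
#include <stdio.h>

/* Hypothetical helper (not from mesa-demos): average frame rate over
 * one reporting interval, as printed in lines such as
 * "47.590482 FPS (238 frames in 5.001000 seconds)". */
static double frames_per_second(unsigned frames, double seconds)
{
    /* Guard against a zero or negative interval. */
    if (seconds <= 0.0)
        return 0.0;
    return (double)frames / seconds;
}

/* Print one report line in the same shape as the demo output. */
static void print_fps_line(unsigned frames, double seconds)
{
    printf("%f FPS (%u frames in %f seconds)\n",
           frames_per_second(frames, seconds), frames, seconds);
}
```

With the figures reported above, `frames_per_second(238, 5.001)` is about 47.59 (vsraytrace) while `frames_per_second(1, 22.242)` is about 0.045 (gsraytrace), so the geometry shader demo is roughly a thousand times slower before it hangs.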

NO, no LLVM this time... (Michel?)

Maybe related: bug 76394
Comment 1 Dieter Nützel 2014-08-31 23:00:49 UTC
Created attachment 105519
Xorg.0.log-3.16.1-7.g90bc0f1-desktop
Comment 2 Marek Olšák 2014-09-01 11:17:48 UTC
Yeah, texturing in geometry shaders hangs. The problem might be that the GS RING constant buffer is bound to slot 16, which isn't a valid constant buffer slot (the last one is 15).
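
The suspicion above amounts to an out-of-range slot index. As a minimal sketch, assuming 16 hardware constant buffer slots numbered 0 to 15 (taken from the statement that the last valid slot is 15), a range check could look like this. The macro and function names are hypothetical and are not the actual r600g code.

```c
#include <stdbool.h>

/* Assumption from the comment above: valid hardware constant buffer
 * slots are 0..15, i.e. 16 slots in total. Hypothetical name. */
#define HW_CONST_BUFFER_SLOTS 16u

/* Hypothetical helper, not a Mesa function: returns true only if the
 * slot index falls inside the range the hardware accepts. */
static bool const_buffer_slot_is_valid(unsigned slot)
{
    return slot < HW_CONST_BUFFER_SLOTS;
}
```

Under that assumption, `const_buffer_slot_is_valid(15)` is true and `const_buffer_slot_is_valid(16)` is false, which is the case suspected here for the GS ring buffer binding. Whether this is really the cause of the hang is left open in this thread.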
Comment 3 Dieter Nützel 2014-09-01 11:37:37 UTC
(In reply to comment #2)
> Yeah, texturing in geometry shaders hangs. The problem might be that the GS
> RING constant buffer is bound to slot 16, which isn't a valid constant
> buffer slot (the last one is 15).

Hello Marek!

We're back from vacation, so...;-)

I read about it (R600_MAX_CONST_BUFFERS) in Glenn Kennard's <glenn.kennard@gmail.com> patch:
[Mesa-dev] [PATCH] r600g: Implement GL_ARB_sample_shading

Is that what you mean?
Either way, the shader dumps follow.

BTW, I also have a screenshot of the broken rendering (geom-outlining-150.png), if you need it.
Comment 4 Dieter Nützel 2014-09-01 11:40:18 UTC
Created attachment 105548
gsraytrace-R600_DEBUG-ps-vs.log
Comment 5 Dieter Nützel 2014-09-01 11:40:52 UTC
Created attachment 105549
gsraytrace-MESA_GLSL-dump.log
Comment 6 Marek Olšák 2014-09-01 12:48:41 UTC
(In reply to comment #3)
> (In reply to comment #2)
> > Yeah, texturing in geometry shaders hangs. The problem might be that the GS
> > RING constant buffer is bound to slot 16, which isn't a valid constant
> > buffer slot (the last one is 15).
> 
> Hello Marek!
> 
> We're back from vacation, so...;-)
> 
> Read about it (R600_MAX_CONST_BUFFERS) in Glenn Kennard
> <glenn.kennard@gmail.com> patch:
> [Mesa-dev] [PATCH] r600g: Implement GL_ARB_sample_shading
> 
> Is it that what you mean?

Yes, but it's just a guess. It may or may not have anything to do with this bug.
Comment 7 Dieter Nützel 2014-10-23 18:34:23 UTC
Ping!

What's needed?
Comment 8 Dieter Nützel 2015-07-30 02:50:45 UTC
Ping!

Any news?

Is this related to Bug 91503?
Comment 9 Heiko 2015-09-02 18:23:18 UTC
Created attachment 118054
kmsg: gsraytrace lockup on HD6850

I just noticed the problem today on my HD6850. Unfortunately, the system doesn't always recover. The result is either a complete lockup (requiring a reboot) or at least Xorg hung in process state D and gsraytrace in state Z. The attached log was produced when hitting the latter case (though sysrq still worked, and the terminal worked well after sysrq-r). The log contains output from some of the sysrq-show-xyz triggers, which may be of help (the blocked cat is from catting one of the radeon sysfs entries). The Xorg log only shows event overflows.

The GPU hang seems to be 100% reproducible with gsraytrace for me.

Software components:
media-libs/mesa-9999 (git-4de86e1)
sys-devel/llvm-3.6.2
sys-kernel/vanilla-sources-4.2.0
x11-libs/libdrm-2.4.64

options radeon audio=1 tv=0 dpm=1
Comment 10 Dieter Nützel 2015-09-03 13:36:14 UTC
(In reply to Heiko from comment #9)
> Created attachment 118054 [details]
> kmsg: gsraytrace lockup on HD6850
> 
> Just noticed the problem today on my HD6850. Unfortunately, it doesn't come
> back to life all the time. That is either complete lockup (and reboot) or at
> least Xorg being hung in process state D and gsraytrace in state Z. Attached
> log produced, when hitting the latter one (though sysrq still worked and
> terminal after sysrq-r worked well). The lock contains some of the
> sysrq-show-xyz triggers, which are of help maybe (the blocked cat is from
> catting one of the radeon sysfs entries). Xorg log only shows event
> overflows.
> 
> GPU hang seems to be 100% reproducable with gsraytrace for me.
> 
> Software components being:
> media-libs/mesa-9999 (git-4de86e1)
> sys-devel/llvm-3.6.2
> sys-kernel/vanilla-sources-4.2.0
> x11-libs/libdrm-2.4.64
> 
> options radeon audio=1 tv=0 dpm=1

Hello Heiko,

I opened this bug a year ago for _RV730_ (AGP), and it is still open for that chip.

But since you're on an HD6850 and I'm now on Turks (6670),
I've opened a new one for those here: bug 91865
I think you should move your logs there.

My impression is that they could be related.
Comment 11 Dieter Nützel 2015-11-25 01:46:30 UTC
(In reply to Dieter Nützel from comment #3)
> (In reply to comment #2)
> > Yeah, texturing in geometry shaders hangs. The problem might be that the GS
> > RING constant buffer is bound to slot 16, which isn't a valid constant
> > buffer slot (the last one is 15).
> 
> Hello Marek!
> 
> We're back from vacation, so...;-)
> 
> Read about it (R600_MAX_CONST_BUFFERS) in Glenn Kennard
> <glenn.kennard@gmail.com> patch:
> [Mesa-dev] [PATCH] r600g: Implement GL_ARB_sample_shading
> 
> Is it that what you mean?
> Either way, following the shader dumps.
> 
> BTW I have a broken screen shot for geom-outlining-150.png, too if you need.

Broken rendering for 'geom-outlining-150' and
ogl-samples: 'gl-320-primitive-shading'

is FIXED with Mesa-11.0.6 (git, too) on RV730 AGP at least.
Comment 12 Dieter Nützel 2015-11-25 01:50:01 UTC
'gsraytrace' GPU hang with

RV730 (AGP)

and

NI/Turks XT (6670)

still there.
Comment 13 Heiko 2015-11-25 06:34:59 UTC
Try disabling sb for the GS case. It seems to break things...
Comment 14 Dieter Nützel 2015-11-27 03:34:05 UTC
(In reply to Heiko from comment #13)
> Try disabling sb for the gs case. Seems to b0rken things...

Thanks, Heiko.
I normally try both variants. ;-)
See also Bug 91865.
I have an apitrace there.

Both lock up.
But with R600_DEBUG=nosb
it happens some seconds later (after some broken frames).
With sb it locks up immediately.
Comment 15 Dieter Nützel 2016-01-14 16:37:24 UTC
Update:

This one is NOT solved with current Mesa git.

For EG+ (Bug 91865) this has been _fixed_ by 'accident' in Mesa git since:

commit 2239f3eaff5c72c4cb1d4a5be97feb4af3d08d25
Author: Dave Airlie <airlied@redhat.com>
Date:   Mon Nov 30 15:48:22 2015 +1000

    r600/shader: emit tessellation factors to GDS at end of TCS.
    
    When we are finished the shader, we read back all the tess factors
    from LDS and write them to special global memory storage using
    GDS instructions.
    
    This also handles adding NOP when GDS or ENDLOOP end the TCS.
    
    Signed-off-by: Dave Airlie <airlied@redhat.com>

Dave and Marek, any hints that could point in the right direction?
What is different in this case between R600/R700 and EG+ (NI/Turks in my case), and what should I try next?
Comment 16 Lukáš Krejza 2016-01-14 21:58:50 UTC
I have reported what is probably a duplicate: https://bugs.freedesktop.org/show_bug.cgi?id=93706
It also has an apitrace trace. If confirmed, I will mark it as a duplicate.
Comment 17 GitLab Migration User 2019-09-18 19:17:16 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/525.

