Bug 81045

Summary: [r600] Unreal Engine 4 demo crashed kernel
Product: DRI
Reporter: Nikoli <nikoli>
Component: DRM/Radeon
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED MOVED
QA Contact:
Severity: major
Priority: medium
CC: commiethebeastie, linuxdonald, mzdunek, richard.llom, sa, vmerlet
Version: XOrg git
Hardware: x86-64 (AMD64)
OS: Linux (All)
URL: https://wiki.unrealengine.com/Linux_Demos
Whiteboard:
i915 platform:
i915 features:
Attachments:
Description      Flags
kernel messages  none
xorg.log         none

Description Nikoli 2014-07-08 12:04:35 UTC
Created attachment 102434
kernel messages

I tried running the Unreal Engine 4 demos from https://wiki.unrealengine.com/Linux_Demos
The 'Mobile Temple Demo' started but ran badly (the image was so dark that almost nothing was visible); it did not crash anything.
Next I tried the 'Effects Cave Demo'. A few seconds after the demo first started, the X server stopped responding (even switching consoles with Ctrl+Alt+F1 was impossible) and the screen flickered between a grey fill and the normal desktop (the kernel seemed to be trying to reset the GPU). Then the system became fully unresponsive: ssh did not work, and the power button did not suspend the system. The attached kernel messages were saved by syslog.

I never had GPU-related stability problems with 3.12.x-3.14.x kernels before: several OpenGL 3D games, the OpenGL-based mpv video output, glxgears and Stellarium all worked fine.

Did Unreal Engine try to use some unimplemented or unstable features? Why did the kernel allow itself to crash instead of killing the userspace application?

Hardware: Radeon HD 6770
Software: Gentoo hardened amd64 stable, kernel-3.14.10, libdrm-2.4.54, mesa-10.2.2, llvm-3.3, xf86-video-ati-7.3.0, xorg-server-1.15.0
Comment 1 Thomas Rohloff 2014-07-09 08:32:00 UTC
I tried the Effects Cave Demo on a Radeon HD 6950: after it loaded, the sound played but the screen went into standby and the system did not even react to Magic SysRq keys. I would love to provide logs, but they were wiped by the system reset.

OpenGL renderer string: Gallium 0.4 on AMD CAYMAN
OpenGL core profile version string: 3.3 (Core Profile) Mesa 10.3.0-devel (git-3c77d2a)

Kernel 3.15.3
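
When Magic SysRq appears dead, it may be worth ruling out that it is simply restricted by the kernel.sysrq setting before concluding that the machine is hard-hung. The following is a minimal Python sketch, not something used in this report: the function names are hypothetical, and the bit meanings are the ones listed in the kernel's sysrq documentation.

```python
# Bit meanings of kernel.sysrq values greater than 1, as listed in the
# kernel's sysrq documentation. 0 disables SysRq, 1 enables everything.
SYSRQ_BITS = {
    2: "console log level control",
    4: "keyboard control (SAK, unraw)",
    8: "debugging dumps of processes",
    16: "sync",
    32: "remount read-only",
    64: "signalling of processes (term, kill, oom-kill)",
    128: "reboot/poweroff",
    256: "nicing of all RT tasks",
}


def describe_sysrq(value):
    """Translate a kernel.sysrq value into the functions it allows."""
    if value == 0:
        return ["disabled"]
    if value == 1:
        return ["all functions enabled"]
    return [name for bit, name in SYSRQ_BITS.items() if value & bit]


def read_sysrq(path="/proc/sys/kernel/sysrq"):
    """Read the current setting (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())
```

Calling `describe_sysrq(read_sysrq())` on the affected machine shows which key combinations are allowed at all. If the needed functions are enabled and the keys still do nothing, the hang is more likely below the level that SysRq can reach.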
Comment 2 Thomas Rohloff 2014-07-09 08:51:24 UTC
Now I tried the Mobile Temple Demo too. It played no sound and showed only vertical blue lines (garbage, I guess) until the watchdog kicked in.
Comment 3 Marek Zdunek 2014-07-10 15:28:20 UTC
same here on Fedora 20

gpu     HD 6850 BARTS
kernel  3.15.3-200.fc20.x86_64
mesa    10.1.5-1.20140607.fc20 

Instant GPU lockup after demo startup; only the sound keeps playing.
Comment 4 Marek Zdunek 2014-07-10 15:28:46 UTC
Created attachment 102557
xorg.log
Comment 5 Knut Andre Tidemann 2014-09-24 07:52:07 UTC
I can also reproduce this with the effects demo on this card:
[AMD/ATI] Redwood XT [Radeon HD 5670/5690/5730]

I was able to ssh into the machine and kill the application, which restored the display on the X server.

The following output was seen repeating in the terminal output of the application:

EE r600_shader.c:157 r600_pipe_shader_create - translation from TGSI failed !
EE r600_state_common.c:751 r600_shader_select - Failed to build shader variant (type=1) -1

as well as GPU lockups in the kernel like this:

[3114375.931718] radeon 0000:01:00.0: ring 0 stalled for more than 10086msec
[3114375.931722] radeon 0000:01:00.0: ring 0 stalled for more than 10086msec
[3114375.931728] radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000004e7c894 last fence id 0x0000000004e7c891 on ring 0)
[3114375.931735] radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000004e7c894 last fence id 0x0000000004e7c891 on ring 0)
[3114375.931816] radeon 0000:01:00.0: failed to get a new IB (-35)
[3114375.931822] [drm:radeon_cs_ib_fill] *ERROR* Failed to get ib !
[3114375.931831] radeon 0000:01:00.0: failed to get a new IB (-35)
[3114375.931837] [drm:radeon_cs_ib_fill] *ERROR* Failed to get ib !
[3114375.943817] radeon 0000:01:00.0: Saved 1623 dwords of commands on ring 0.
[3114375.943831] radeon 0000:01:00.0: GPU softreset: 0x00000009
[3114375.943833] radeon 0000:01:00.0:   GRBM_STATUS               = 0xF0001828
[3114375.943834] radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x80000003
[3114375.943836] radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000007
[3114375.943838] radeon 0000:01:00.0:   SRBM_STATUS               = 0x20000AC0
[3114375.943840] radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
[3114375.943841] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[3114375.943843] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x40040000
[3114375.943845] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00048006
[3114375.943847] radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x80268647
[3114375.943848] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[3114375.944771] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x00007F6B
[3114375.944824] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100
[3114375.945982] radeon 0000:01:00.0:   GRBM_STATUS               = 0x00003828
[3114375.945984] radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x00000007
[3114375.945986] radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000007
[3114375.945987] radeon 0000:01:00.0:   SRBM_STATUS               = 0x200000C0
[3114375.945989] radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
[3114375.945991] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[3114375.945992] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
[3114375.945994] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
[3114375.945996] radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x00000000
[3114375.945998] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[3114375.946012] radeon 0000:01:00.0: GPU reset succeeded, trying to resume
[3114376.017053] [drm] PCIE gen 2 link speeds already enabled
[3114376.019321] [drm] PCIE GART of 1024M enabled (table at 0x000000000025D000).
[3114376.019441] radeon 0000:01:00.0: WB enabled
[3114376.019443] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0xffff8800b9bafc00
[3114376.019444] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0xffff8800b9bafc0c
[3114376.020040] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x000000000005c418 and cpu addr 0xffffc9001249c418
[3114376.036607] [drm] ring test on 0 succeeded in 1 usecs
[3114376.036665] [drm] ring test on 3 succeeded in 1 usecs
[3114376.213817] [drm] ring test on 5 succeeded in 1 usecs
[3114376.213821] [drm] UVD initialized successfully.

This was using kernel 3.16.1 and Mesa 10.4.0-devel from today (g45b104e).
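
For anyone comparing lockup logs across machines, the kernel messages above can be summarised mechanically. The following is a minimal Python sketch, assuming the line formats shown in this comment; the function names and regular expressions are hypothetical and not part of the radeon driver. It extracts how far the awaited fence is ahead of the last signalled one, and which status registers differ between the dump before the soft reset and the dump after it.

```python
import re

# Pattern for the radeon "GPU lockup" line as it appears in the log above.
LOCKUP_RE = re.compile(
    r"GPU lockup \(waiting for (0x[0-9a-fA-F]+) "
    r"last fence id (0x[0-9a-fA-F]+) on ring (\d+)\)"
)
# Pattern for register dump lines such as "GRBM_STATUS = 0xF0001828".
REG_RE = re.compile(r"radeon \S+:\s+(\w+)\s*=\s*(0x[0-9a-fA-F]+)\s*$")


def fence_lag(line):
    """Return (ring, awaited - last signalled) for a lockup line, else None."""
    m = LOCKUP_RE.search(line)
    if m is None:
        return None
    awaited = int(m.group(1), 16)
    last = int(m.group(2), 16)
    return int(m.group(3)), awaited - last


def register_changes(lines):
    """Map register name -> (first value, last value) where the two differ."""
    first, last = {}, {}
    for line in lines:
        m = REG_RE.search(line)
        if m:
            name, value = m.group(1), int(m.group(2), 16)
            first.setdefault(name, value)  # value before the reset
            last[name] = value             # overwritten by the later dump
    return {n: (first[n], last[n]) for n in first if first[n] != last[n]}


lockup = ("[3114375.931728] radeon 0000:01:00.0: GPU lockup (waiting for "
          "0x0000000004e7c894 last fence id 0x0000000004e7c891 on ring 0)")
print(fence_lag(lockup))  # (0, 3)
```

Feeding the full log through `register_changes` lists registers such as GRBM_STATUS, which reads 0xF0001828 before the reset and 0x00003828 after it, while unchanged ones such as SRBM_STATUS2 are left out.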
Comment 6 Thomas Kowaliczek 2014-10-07 14:21:15 UTC
EE r600_state_common.c:751 r600_shader_select - Failed to build shader variant (type=1) -1
[2014.10.07-16.08.17:512][840]LogLinuxWindow: TrackActivationChanges: false (Window: 0x4ac0fe0, CurrentlyActiveWindow: 0x4ac0fe0, Event: 1)
EE r600_shader.c:157 r600_pipe_shader_create - translation from TGSI failed !
EE r600_state_common.c:751 r600_shader_select - Failed to build shader variant (type=1) -1
EE r600_shader.c:157 r600_pipe_shader_create - translation from TGSI failed !
EE r600_state_common.c:751 r600_shader_select - Failed to build shader variant (type=1) -1
EE r600_shader.c:157 r600_pipe_shader_create - translation from TGSI failed !
EE r600_state_common.c:751 r600_shader_select - Failed to build shader variant (type=1) -1
[2014.10.07-16.08.17:572][843]Closing by request
[2014.10.07-16.08.17:572][843]LogGenericPlatformMisc: FPlatformMisc::RequestExit(0)
EE r600_shader.c:157 r600_pipe_shader_create - translation from TGSI failed !
EE r600_state_common.c:751 r600_shader_select - Failed to build shader variant (type=1) -1


Same error here
Arch Linux
Kernel: 3.16.4-1
Mesa 10.4
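
Since the same two Mesa error lines repeat many times in the application output, a tally can make reports easier to compare. The following is a minimal Python sketch, assuming the "EE file:line function - message" layout seen in the output above; the function name and the regular expression are hypothetical.

```python
import re
from collections import Counter

# Matches error lines of the form "EE file.c:LINE function - message".
EE_RE = re.compile(r"^EE (\S+):(\d+) (\w+) - (.*)$")


def count_driver_errors(lines):
    """Count error lines per (source file, function), ignoring other output."""
    counts = Counter()
    for line in lines:
        m = EE_RE.match(line.strip())
        if m:
            counts[(m.group(1), m.group(3))] += 1
    return counts
```

Running it over a saved copy of the demo's terminal output gives one count per failing function, for example for `r600_pipe_shader_create` and `r600_shader_select`, while the engine's own log lines are skipped.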
Comment 7 famo 2016-01-28 23:50:59 UTC
I have no problems running any demo from that site.

However - there is some severe and very annoying stuttering. Which I can also replicate on games using that engine...
Can someone else confirm?


Kernel: 4.2.6-1-CHAKRA x86_64 (64 bit)

Graphics:
Card: Advanced Micro Devices [AMD/ATI] Bonaire XTX [Radeon R7 260X]
Display Server: X.Org 1.17.4 drivers: ati,radeon (unloaded: vesa)
GLX Renderer: Gallium 0.4 on AMD BONAIRE (DRM 2.43.0, LLVM 3.7.0)
GLX Version: 3.0 Mesa 11.0.6
Comment 8 Michel Dänzer 2016-01-29 02:29:14 UTC
(In reply to famo from comment #7)
> However - there is some severe and very annoying stuttering. Which I can
> also replicate on games using that engine...
> Can someone else confirm?

Yes, it's because of shader recompiles. This is being addressed and will hopefully be fixed before too long. It's not related to this bug report though.
Comment 9 famo 2016-01-29 11:06:12 UTC
(In reply to Michel Dänzer from comment #8)
> (In reply to famo from comment #7)
> > However - there is some severe and very annoying stuttering. Which I can
> > also replicate on games using that engine...
> > Can someone else confirm?
> 
> Yes, it's because of shader recompiles. This is being addressed and will
> hopefully be fixed before too long. It's not related to this bug report
> though.
>
Thanks for the info. By addressed, do you mean in the engine or in the driver?
Also is there a bug report about this / can I track this somewhere?


@reporter:
Please check again, this bug can probably be closed...
Comment 10 Michel Dänzer 2016-02-05 08:47:38 UTC
(In reply to famo from comment #9)
> Thanks for the info. By addressed, do you mean in the engine or in the driver?

In the driver.

> Also is there a bug report about this / can I track this somewhere?

There is bug 92806.
Comment 11 Martin Peres 2019-11-19 08:53:37 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/510.
