Bug 109466

Summary: Frozen display with Radeon RX 580 and Open Source Drivers under GNU/Linux Debian Sid
Product: Mesa
Reporter: Yann Kervran <yann.kervran>
Component: Drivers/Gallium/radeonsi
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED MOVED
QA Contact: Default DRI bug account <dri-devel>
Severity: critical
Priority: medium
CC: bugs
Version: 18.3
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Attachments:
    Xorg.0.log
    dmesg
    New full dmesg with new drivers 10/02/2018

Description Yann Kervran 2019-01-27 20:20:24 UTC
Hello,
I am having trouble with my computer when I run 3D applications: the whole display freezes and I can no longer use any input device (keyboard or mouse). From time to time I can still move the mouse pointer, but I can’t do anything with it.
I can’t switch to another TTY, but I can still log in over SSH.
It happens with any 3D software but seems to occur sooner with games, with no consistency in the timing or triggering conditions: sometimes it takes 1 minute, sometimes 10. I have experienced it with Blender (latest 2.80 branch) and SuperTuxKart, for instance.
Below is some information about the crash and my computer.
I am running Debian Sid with the XFCE desktop. Feel free to ask me for details, ideally with the commands to run to get the information you need.
All the best from snowy France

 --- 
dmesg output at the time of the crash:
[ 4600.707439] gmc_v8_0_process_interrupt: 5 callbacks suppressed
[ 4600.707448] amdgpu 0000:01:00.0: GPU fault detected: 147 0x0c004801 for process supertuxkart pid 3736 thread supertuxka:cs0 pid 3737
[ 4600.707458] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x01105D80
[ 4600.707464] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x04048001
[ 4600.707472] amdgpu 0000:01:00.0: VM fault (0x01, vmid 2, pasid 32770) at page 17849728, read from 'TC4' (0x54433400) (72)
[ 5118.249900] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=471393, emitted seq=471395
[ 5118.249909] [drm] GPU recovery disabled.
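
The VM fault line above can be decoded by hand. The following is a minimal Python sketch, assuming the dmesg format shown in this report: the hex value in parentheses after the client name appears to be that name packed as ASCII bytes, and the page number is the VM_CONTEXT1_PROTECTION_FAULT_ADDR value printed in decimal. The 4 KiB page size used to derive a byte address is an assumption, and the helper names are hypothetical.

```python
import re

# Matches the "VM fault" line format seen in this report's dmesg output.
VM_FAULT_RE = re.compile(
    r"VM fault \((0x[0-9a-fA-F]+), vmid (\d+), pasid (\d+)\) at page (\d+), "
    r"(read|write) from '([^']*)' \((0x[0-9a-fA-F]+)\)"
)

def decode_client_tag(tag: int) -> str:
    """Unpack a 32-bit value such as 0x54433400 into its ASCII name ('TC4')."""
    raw = tag.to_bytes(4, "big")
    return raw.rstrip(b"\x00").decode("ascii")

def parse_vm_fault(line: str, page_shift: int = 12):
    """Return the fields of a VM fault line, or None if the line does not match.

    page_shift=12 (4 KiB pages) is an assumption, not something the log states.
    """
    m = VM_FAULT_RE.search(line)
    if m is None:
        return None
    page = int(m.group(4))
    return {
        "vmid": int(m.group(2)),
        "pasid": int(m.group(3)),
        "page": page,
        "page_hex": f"0x{page:08X}",
        "byte_address": page << page_shift,
        "access": m.group(5),
        "client": decode_client_tag(int(m.group(7), 16)),
    }

line = ("amdgpu 0000:01:00.0: VM fault (0x01, vmid 2, pasid 32770) at page 17849728, "
        "read from 'TC4' (0x54433400) (72)")
fault = parse_vm_fault(line)
print(fault["client"], fault["page_hex"])  # prints: TC4 0x01105D80
```

The printed page value matches the VM_CONTEXT1_PROTECTION_FAULT_ADDR line (0x01105D80), which is a quick sanity check that the fields were parsed correctly.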

Information about my computer:
 --- 
$ uname -a
Linux groskoui 4.19.0-2-amd64 #1 SMP Debian 4.19.16-1 (2019-01-17) x86_64 GNU/Linux
 --- 
$ glxinfo -B
name of display: :0.0
display: :0  screen: 0
direct rendering: Yes
Extended renderer info (GLX_MESA_query_renderer):
    Vendor: X.Org (0x1002)
    Device: Radeon RX 580 Series (POLARIS10, DRM 3.27.0, 4.19.0-2-amd64, LLVM 7.0.1) (0x67df)
    Version: 18.3.2
    Accelerated: yes
    Video memory: 8192MB
    Unified memory: no
    Preferred profile: core (0x1)
    Max core profile version: 4.5
    Max compat profile version: 4.5
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 3.2
Memory info (GL_ATI_meminfo):
    VBO free memory - total: 7976 MB, largest block: 7976 MB
    VBO free aux. memory - total: 8183 MB, largest block: 8183 MB
    Texture free memory - total: 7976 MB, largest block: 7976 MB
    Texture free aux. memory - total: 8183 MB, largest block: 8183 MB
    Renderbuffer free memory - total: 7976 MB, largest block: 7976 MB
    Renderbuffer free aux. memory - total: 8183 MB, largest block: 8183 MB
Memory info (GL_NVX_gpu_memory_info):
    Dedicated video memory: 8192 MB
    Total available memory: 16384 MB
    Currently available dedicated video memory: 7976 MB
OpenGL vendor string: X.Org
OpenGL renderer string: Radeon RX 580 Series (POLARIS10, DRM 3.27.0, 4.19.0-2-amd64, LLVM 7.0.1)
OpenGL core profile version string: 4.5 (Core Profile) Mesa 18.3.2
OpenGL core profile shading language version string: 4.50
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile

OpenGL version string: 4.5 (Compatibility Profile) Mesa 18.3.2
OpenGL shading language version string: 4.50
OpenGL context flags: (none)
OpenGL profile mask: compatibility profile

OpenGL ES profile version string: OpenGL ES 3.2 Mesa 18.3.2
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.20
Comment 1 Michel Dänzer 2019-01-28 10:01:52 UTC
Please attach the full output of dmesg and the Xorg log file (if you're using Xorg).
Comment 2 Yann Kervran 2019-01-28 13:07:26 UTC
Created attachment 143237 [details]
Xorg.0.log
Comment 3 Yann Kervran 2019-01-28 13:08:02 UTC
Created attachment 143238 [details]
dmesg
Comment 4 Yann Kervran 2019-01-28 13:10:34 UTC
Hello, here are the two files.
To produce them, I rebooted my computer and made it freeze quickly with Blender. No other software was running apart from the XFCE desktop.
When the computer froze, I could still move the mouse, but as usual it had no effect anywhere, and switching to another TTY was not possible. I retrieved these files over SSH.
Comment 5 Yann Kervran 2019-02-10 16:15:42 UTC
Hello,
I upgraded my system today, and it seems that it no longer freezes the whole system.
Anyway, I have
Comment 6 Yann Kervran 2019-02-10 16:17:48 UTC
Anyway, I have strange visual artifacts in a game (Planet Nomads), and when I launch Blender 2.80, it eventually freezes. But only Blender freezes.
Here is the corresponding dmesg:

[ 7448.526101] amdgpu 0000:02:00.0: GPU fault detected: 146 0x0bb0840c for process PlanetNomads.x8 pid 3657 thread PlanetNoma:cs0 pid 3659
[ 7448.526107] amdgpu 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00154F76
[ 7448.526111] amdgpu 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0408400C
[ 7448.526117] amdgpu 0000:02:00.0: VM fault (0x0c, vmid 2, pasid 32773) at page 1396598, read from 'TC7' (0x54433700) (132)
[ 7448.543477] amdgpu 0000:02:00.0: GPU fault detected: 146 0x0bb0840c for process PlanetNomads.x8 pid 3657 thread PlanetNoma:cs0 pid 3659
[ 7448.543483] amdgpu 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00154F76
[ 7448.543486] amdgpu 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0408400C
[ 7448.543490] amdgpu 0000:02:00.0: VM fault (0x0c, vmid 2, pasid 32773) at page 1396598, read from 'TC7' (0x54433700) (132)
[ 7448.554750] amdgpu 0000:02:00.0: GPU fault detected: 146 0x0bb0840c for process PlanetNomads.x8 pid 3657 thread PlanetNoma:cs0 pid 3659
[ 7448.554754] amdgpu 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00154F76
[ 7448.554756] amdgpu 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0408400C
[ 7448.554759] amdgpu 0000:02:00.0: VM fault (0x0c, vmid 2, pasid 32773) at page 1396598, read from 'TC7' (0x54433700) (132)
[ 7506.249798] amdgpu 0000:02:00.0: GPU fault detected: 147 0x00b84401 for process blender pid 4851 thread blender:cs0 pid 4863
[ 7506.249801] amdgpu 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00154F6A
[ 7506.249802] amdgpu 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x040C400C
[ 7506.249804] amdgpu 0000:02:00.0: VM fault (0x0c, vmid 2, pasid 32773) at page 1396586, read from 'TC3' (0x54433300) (196)
[ 7513.854924] amdgpu 0000:02:00.0: IH ring buffer overflow (0x00081A10, 0x00006AE0, 0x00001A20)
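
When a log contains many repeated faults like the ones above, it can help to tally them per process before reading individual entries. This is a minimal Python sketch, assuming the "GPU fault detected" line format shown in this report; the function name is hypothetical, and the sample text is abbreviated from the log above.

```python
import re
from collections import Counter

# Matches the "GPU fault detected" line format seen in this report.
# \S+ assumes the process name contains no spaces, as in the lines above.
GPU_FAULT_RE = re.compile(
    r"GPU fault detected: \d+ 0x[0-9a-fA-F]+ for process (\S+) pid (\d+)"
)

def count_faults_by_process(dmesg_text: str) -> Counter:
    """Count 'GPU fault detected' lines per (process name, pid)."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        m = GPU_FAULT_RE.search(line)
        if m:
            counts[(m.group(1), int(m.group(2)))] += 1
    return counts

sample = "\n".join([
    "[ 7448.526101] amdgpu 0000:02:00.0: GPU fault detected: 146 0x0bb0840c for process PlanetNomads.x8 pid 3657 thread PlanetNoma:cs0 pid 3659",
    "[ 7448.543477] amdgpu 0000:02:00.0: GPU fault detected: 146 0x0bb0840c for process PlanetNomads.x8 pid 3657 thread PlanetNoma:cs0 pid 3659",
    "[ 7448.554750] amdgpu 0000:02:00.0: GPU fault detected: 146 0x0bb0840c for process PlanetNomads.x8 pid 3657 thread PlanetNoma:cs0 pid 3659",
    "[ 7506.249798] amdgpu 0000:02:00.0: GPU fault detected: 147 0x00b84401 for process blender pid 4851 thread blender:cs0 pid 4863",
    "[ 7513.854924] amdgpu 0000:02:00.0: IH ring buffer overflow (0x00081A10, 0x00006AE0, 0x00001A20)",
])

counts = count_faults_by_process(sample)
for (name, pid), n in counts.most_common():
    print(f"{name} (pid {pid}): {n} fault(s)")
```

The full output of dmesg, saved to a file over SSH, could be passed to count_faults_by_process in the same way.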
Comment 7 Yann Kervran 2019-02-10 20:44:05 UTC
OK, the situation has changed, but not by much. The display still freezes completely after a while if I use 3D. It just takes longer.
I am attaching a new full dmesg with all the details.
I used Blender, then SuperTuxKart, then Planet Nomads. It froze with the last one, as it is the most demanding game I have.
Again, I retrieved the dmesg through SSH, as I couldn’t recover directly and had to power off the computer manually.
Regards.
Comment 8 Yann Kervran 2019-02-10 20:45:06 UTC
Created attachment 143356 [details]
New full dmesg with new drivers 10/02/2018
Comment 9 Hubert Kario 2019-04-12 10:30:56 UTC
I'm experiencing similar issues with an R9 290X.
But after trying to reproduce it a few times (crash in Witcher III, crash while running glmark2), I'm becoming convinced that it's actually a temperature problem: the time it crashed with glmark2, the GPU had reached 105°C.
And now that I have kept it under 95°C (by limiting the power to 100W), it hasn't crashed during the few times I ran glmark2.
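
To check whether temperature plays a role on a given machine, the GPU sensor can be polled while a 3D workload runs. This is a minimal Python sketch, assuming the driver exposes a hwmon temp1_input file (in millidegrees Celsius) under /sys/class/drm/card*/device/hwmon/; that layout is an assumption and varies with kernel and driver, and the function names are hypothetical. The 95°C default only mirrors the threshold mentioned above.

```python
from pathlib import Path

def millidegrees_to_celsius(raw: str) -> float:
    """hwmon temp*_input files report millidegrees Celsius as an integer string."""
    return int(raw.strip()) / 1000.0

def read_gpu_temps(drm_root: str = "/sys/class/drm"):
    """Yield (path, degrees C) for every temp1_input found under the DRM cards.

    The directory layout is an assumption; it varies with kernel and driver.
    """
    pattern = "card*/device/hwmon/hwmon*/temp1_input"
    for path in sorted(Path(drm_root).glob(pattern)):
        try:
            yield str(path), millidegrees_to_celsius(path.read_text())
        except (OSError, ValueError):
            continue  # sensor not readable on this system

def is_too_hot(celsius: float, limit: float = 95.0) -> bool:
    """limit=95.0 mirrors the threshold mentioned in the comment above."""
    return celsius >= limit

if __name__ == "__main__":
    for path, temp in read_gpu_temps():
        flag = "  <-- at or above limit" if is_too_hot(temp) else ""
        print(f"{path}: {temp:.1f} C{flag}")
```

Running this in a loop over SSH during a glmark2 run would show whether the freeze coincides with a temperature spike.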
Comment 10 GitLab Migration User 2019-09-25 18:48:53 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1370.
