Bug 74992

Summary: [NVAA] gnome-shell freezes with vram_list validate fail, texture traps
Product: xorg
Reporter: Martin Lukeš <martin.meridius>
Component: Driver/nouveau
Assignee: Nouveau Project <nouveau>
Status: RESOLVED MOVED
QA Contact: Xorg Project Team <xorg-team>
Severity: normal
Priority: medium
CC: andyrtr
Version: unspecified
Hardware: x86-64 (AMD64)
OS: All
Attachments:
Description        Flags
dmesg              none
journalctl_error   none
journalctl_boot    none
Xorg_0_log         none
Kernel log         none

Description Martin Lukeš 2014-02-14 16:59:48 UTC
Created attachment 94080 [details]
dmesg

Description:
After some time of normal use of GNOME Shell (GS) with the nouveau driver, the screen freezes and I have to restart GS from another TTY.


Additional info:
xf86-video-nouveau 1.0.10-2
xorg-server 1.15.0-5
linux 3.12.9-2
I use an NVIDIA GeForce 8200.
I'm including Xorg.0.log and the output of dmesg, `journalctl -b` and `journalctl --since=${time_of_freeze}`.


Steps to reproduce:
This behavior seems to happen randomly.
Comment 1 Martin Lukeš 2014-02-14 17:01:00 UTC
Created attachment 94081 [details]
journalctl_error
Comment 2 Martin Lukeš 2014-02-14 17:01:35 UTC
Created attachment 94082 [details]
journalctl_boot
Comment 3 Martin Lukeš 2014-02-14 17:02:07 UTC
Created attachment 94083 [details]
Xorg_0_log
Comment 4 Ilia Mirkin 2014-02-14 19:16:41 UTC
[30028.608886] nouveau E[gnome-shell[1661]] fail ttm_validate
[30028.608894] nouveau E[gnome-shell[1661]] validate vram_list
[30028.608938] nouveau E[gnome-shell[1661]] validate: -12
[30029.187282] nouveau E[gnome-shell[1661]] fail ttm_validate
[30029.187290] nouveau E[gnome-shell[1661]] validate vram_list
[30029.187306] nouveau E[gnome-shell[1661]] validate: -12
[30029.706353] nouveau E[  PGRAPH][0000:02:00.0] magic set 0:
[30029.706364] nouveau E[  PGRAPH][0000:02:00.0] 	0x00408604: 0x20081401
[30029.706368] nouveau E[  PGRAPH][0000:02:00.0] 	0x00408608: 0x00000000
[30029.706372] nouveau E[  PGRAPH][0000:02:00.0] 	0x0040860c: 0x40000432
[30029.706376] nouveau E[  PGRAPH][0000:02:00.0] 	0x00408610: 0x00000000
[30029.706380] nouveau E[  PGRAPH][0000:02:00.0] TRAP_TEXTURE - TP0: Unhandled ustatus 0x00000003
[30029.706383] nouveau E[  PGRAPH][0000:02:00.0]  TRAP
[30029.706390] nouveau E[  PGRAPH][0000:02:00.0] ch 4 [0x000797f000 gnome-shell[1661]] subc 3 class 0x8397 mthd 0x15e0 data 0x00000000
[30029.706400] nouveau E[     PFB][0000:02:00.0] trapped read at 0x0000000000 on channel 0x0000797f [gnome-shell[1661]] PGRAPH/TEXTURE/00 reason: PT_NOT_PRESENT
[30029.706563] nouveau E[  PGRAPH][0000:02:00.0] magic set 0:
[30029.706566] nouveau E[  PGRAPH][0000:02:00.0] 	0x00408604: 0x20086701
[30029.706570] nouveau E[  PGRAPH][0000:02:00.0] 	0x00408608: 0x00000000
[30029.706573] nouveau E[  PGRAPH][0000:02:00.0] 	0x0040860c: 0x40000432
[30029.706577] nouveau E[  PGRAPH][0000:02:00.0] 	0x00408610: 0x00000000
[30029.706580] nouveau E[  PGRAPH][0000:02:00.0] TRAP_TEXTURE - TP0: Unhandled ustatus 0x00000003
[30029.706583] nouveau E[  PGRAPH][0000:02:00.0]  TRAP
[30029.706588] nouveau E[  PGRAPH][0000:02:00.0] ch 4 [0x000797f000 gnome-shell[1661]] subc 3 class 0x8397 mthd 0x15e0 data 0x00000000


To me, this reads like "you ran out of vram and the driver did not handle this at all gracefully". "Unhandled ustatus 3" means "FAULT" as best I can tell. And of course since gnome-shell also acts as a compositor, that means your display is shot.
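
For reference, the `validate: -12` in the log above looks like a negated Linux errno value. The minimal Python sketch below assumes the driver follows the usual kernel convention of returning negative errno codes. `decode_kernel_error` is a hypothetical helper written for this illustration, not part of nouveau. It only uses the standard `errno` and `os` modules to show that 12 corresponds to ENOMEM, which fits the out-of-memory reading.

```python
import errno
import os


def decode_kernel_error(code):
    """Map a negative kernel-style return code to (symbolic name, message).

    Hypothetical helper for illustration; assumes the value is a negated errno.
    """
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


# -12 is the value printed after "validate:" in the dmesg excerpt.
name, message = decode_kernel_error(-12)
print(name, "-", message)
```

The message text comes from the local C library, so its exact wording can differ between systems.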

The question is how can this happen though... I guess something was requested to be migrated to vram and that request failed, but the driver didn't notice? Not sure. Your second log shows a pushbuf submission failure that caused a validate fail. I don't remember if that means that something was claimed to be in VRAM and wasn't or if it means that something was requested to be moved to vram and the move failed.
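
To check whether the validate failures and the page-table traps come from the same process, the log can be summarized mechanically. The sketch below is only an illustration: the regular expressions are assumptions derived from the handful of lines quoted in this report, not from any documented nouveau log format, and `summarize` is a hypothetical helper name.

```python
import re
from collections import Counter

# Sample lines copied from the dmesg excerpt quoted in this comment.
SAMPLE = """\
[30028.608886] nouveau E[gnome-shell[1661]] fail ttm_validate
[30028.608894] nouveau E[gnome-shell[1661]] validate vram_list
[30028.608938] nouveau E[gnome-shell[1661]] validate: -12
[30029.706380] nouveau E[  PGRAPH][0000:02:00.0] TRAP_TEXTURE - TP0: Unhandled ustatus 0x00000003
[30029.706400] nouveau E[     PFB][0000:02:00.0] trapped read at 0x0000000000 on channel 0x0000797f [gnome-shell[1661]] PGRAPH/TEXTURE/00 reason: PT_NOT_PRESENT
"""

# Assumed patterns, based only on the lines above.
VALIDATE_RE = re.compile(
    r"nouveau E\[(?P<proc>[^\[\]]+)\[(?P<pid>\d+)\]\] validate: (?P<err>-\d+)"
)
TRAP_RE = re.compile(r"trapped \w+ at .* reason: (?P<reason>\w+)")


def summarize(log_text):
    """Count validate errors per (process, pid, code) and PFB traps per reason."""
    validate_errors = Counter()
    trap_reasons = Counter()
    for line in log_text.splitlines():
        match = VALIDATE_RE.search(line)
        if match:
            key = (match["proc"], int(match["pid"]), int(match["err"]))
            validate_errors[key] += 1
            continue
        match = TRAP_RE.search(line)
        if match:
            trap_reasons[match["reason"]] += 1
    return validate_errors, trap_reasons


validate_errors, trap_reasons = summarize(SAMPLE)
print(dict(validate_errors))
print(dict(trap_reasons))
```

Running it over a full dmesg dump instead of `SAMPLE` would show how many validate failures precede each trap, though it cannot say which of the two explanations above is the right one.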

The fact that you can even go to another TTY and restart GS indicates that the card is in no way hung, which is nice.

You don't mention what version of mesa you're using... but I'm assuming something pretty recent, like 10.0.3 or maybe even 10.1-rc1?
Comment 5 Martin Lukeš 2014-02-14 22:50:11 UTC
Yes, I forgot to mention Mesa version, it is 10.0.3-1 on Arch Linux.
Comment 6 Roman Himmes 2014-05-23 09:02:10 UTC
Created attachment 99622 [details]
Kernel log
Comment 7 Roman Himmes 2014-05-23 09:53:04 UTC
The above comment is missing a description. Here it is:

On my system (Arch Linux) I run into texture problems in gnome-shell with very similar errors.
When I have a certain number of widgets open, the task switcher shows black squares instead of previews of my widgets.

Used versions:
- xf86-video-nouveau 1.0.10-2
- mesa 10.1.4-1
- nouveau-dri 10.1.4-1
- libdrm 2.4.54-1
- Linux 3.14.4-1 x86_64
Comment 8 Martin Lukeš 2014-05-24 12:15:33 UTC
So I gave up on Nouveau and went back to the NVIDIA driver, specifically its 304xx version on Arch Linux.

local/lib32-libcl 1.1-1
local/lib32-libvdpau 0.7-2
local/lib32-nvidia-304xx-libgl 304.121-2
local/lib32-nvidia-304xx-utils 304.121-2
local/lib32-opencl-nvidia 337.19-1
local/libcl 1.1-3
local/libvdpau 0.7-1
local/nvidia-304xx 304.121-3
local/nvidia-304xx-libgl 304.121-2
local/nvidia-304xx-utils 304.121-2
local/opencl-nvidia 337.19-1


I'm also using these Mesa libs:
local/glu 9.0.0-2
local/lib32-glu 9.0.0-2
local/lib32-mesa 10.1.3-1
local/lib32-mesa-demos 8.1.0-3
local/mesa 10.1.3-1
local/mesa-demos 8.1.0-2
Comment 9 Fabien Bourigault 2014-06-13 10:27:16 UTC
I also had this bug on Arch Linux with my onboard GeForce 7300. I increased the shared memory in the BIOS settings from 256M to 512M and the problems are gone!
Comment 10 Martin Peres 2019-12-04 08:43:05 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe to and participate further in the new bug via this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/issues/92.
