Bug 74992 - [NVAA] gnome-shell freezes with vram_list validate fail, texture traps
Summary: [NVAA] gnome-shell freezes with vram_list validate fail, texture traps
Status: NEW
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/nouveau
Version: unspecified
Hardware: x86-64 (AMD64) All
Importance: medium normal
Assignee: Nouveau Project
QA Contact: Xorg Project Team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-02-14 16:59 UTC by Martin Lukeš
Modified: 2014-06-13 10:27 UTC

Attachments
dmesg (340.59 KB, text/plain), 2014-02-14 16:59 UTC, Martin Lukeš
journalctl_error (74.16 KB, text/plain), 2014-02-14 17:01 UTC, Martin Lukeš
journalctl_boot (148.07 KB, text/plain), 2014-02-14 17:01 UTC, Martin Lukeš
Xorg_0_log (42.53 KB, text/plain), 2014-02-14 17:02 UTC, Martin Lukeš
Kernel log (327.79 KB, text/plain), 2014-05-23 09:02 UTC, Roman Himmes

Description Martin Lukeš 2014-02-14 16:59:48 UTC
Created attachment 94080 [details]
dmesg

Description:
After some time of normal use of GNOME Shell with the nouveau driver, the screen freezes and I have to restart GNOME Shell from another TTY.


Additional info:
xf86-video-nouveau 1.0.10-2
xorg-server 1.15.0-5
linux 3.12.9-2
I use nVidia GeForce 8200
I'm including Xorg.0.log and outputs from dmesg, `journalctl -b` and `journalctl --since=${time_of_freeze}`


Steps to reproduce:
This behavior seems to happen randomly.
Comment 1 Martin Lukeš 2014-02-14 17:01:00 UTC
Created attachment 94081 [details]
journalctl_error
Comment 2 Martin Lukeš 2014-02-14 17:01:35 UTC
Created attachment 94082 [details]
journalctl_boot
Comment 3 Martin Lukeš 2014-02-14 17:02:07 UTC
Created attachment 94083 [details]
Xorg_0_log
Comment 4 Ilia Mirkin 2014-02-14 19:16:41 UTC
[30028.608886] nouveau E[gnome-shell[1661]] fail ttm_validate
[30028.608894] nouveau E[gnome-shell[1661]] validate vram_list
[30028.608938] nouveau E[gnome-shell[1661]] validate: -12
[30029.187282] nouveau E[gnome-shell[1661]] fail ttm_validate
[30029.187290] nouveau E[gnome-shell[1661]] validate vram_list
[30029.187306] nouveau E[gnome-shell[1661]] validate: -12
[30029.706353] nouveau E[  PGRAPH][0000:02:00.0] magic set 0:
[30029.706364] nouveau E[  PGRAPH][0000:02:00.0] 	0x00408604: 0x20081401
[30029.706368] nouveau E[  PGRAPH][0000:02:00.0] 	0x00408608: 0x00000000
[30029.706372] nouveau E[  PGRAPH][0000:02:00.0] 	0x0040860c: 0x40000432
[30029.706376] nouveau E[  PGRAPH][0000:02:00.0] 	0x00408610: 0x00000000
[30029.706380] nouveau E[  PGRAPH][0000:02:00.0] TRAP_TEXTURE - TP0: Unhandled ustatus 0x00000003
[30029.706383] nouveau E[  PGRAPH][0000:02:00.0]  TRAP
[30029.706390] nouveau E[  PGRAPH][0000:02:00.0] ch 4 [0x000797f000 gnome-shell[1661]] subc 3 class 0x8397 mthd 0x15e0 data 0x00000000
[30029.706400] nouveau E[     PFB][0000:02:00.0] trapped read at 0x0000000000 on channel 0x0000797f [gnome-shell[1661]] PGRAPH/TEXTURE/00 reason: PT_NOT_PRESENT
[30029.706563] nouveau E[  PGRAPH][0000:02:00.0] magic set 0:
[30029.706566] nouveau E[  PGRAPH][0000:02:00.0] 	0x00408604: 0x20086701
[30029.706570] nouveau E[  PGRAPH][0000:02:00.0] 	0x00408608: 0x00000000
[30029.706573] nouveau E[  PGRAPH][0000:02:00.0] 	0x0040860c: 0x40000432
[30029.706577] nouveau E[  PGRAPH][0000:02:00.0] 	0x00408610: 0x00000000
[30029.706580] nouveau E[  PGRAPH][0000:02:00.0] TRAP_TEXTURE - TP0: Unhandled ustatus 0x00000003
[30029.706583] nouveau E[  PGRAPH][0000:02:00.0]  TRAP
[30029.706588] nouveau E[  PGRAPH][0000:02:00.0] ch 4 [0x000797f000 gnome-shell[1661]] subc 3 class 0x8397 mthd 0x15e0 data 0x00000000


To me, this reads like "you ran out of vram and the driver did not handle this at all gracefully". "Unhandled ustatus 3" means "FAULT" as best I can tell. And of course since gnome-shell also acts as a compositor, that means your display is shot.
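
As a rough illustration of the reading above: the `-12` in the `validate:` lines is a negated errno value, and errno 12 is ENOMEM on Linux. Below is a minimal Python sketch that decodes the code and tallies the failure types in an excerpt. The helper names `errno_name` and `summarize` are made up for this example, and the sample lines are copied from the log above; this is not part of any nouveau tooling.

```python
import errno
import re
from collections import Counter

# Sample lines copied from the log excerpt above.
SAMPLE = """\
[30028.608886] nouveau E[gnome-shell[1661]] fail ttm_validate
[30028.608894] nouveau E[gnome-shell[1661]] validate vram_list
[30028.608938] nouveau E[gnome-shell[1661]] validate: -12
[30029.706380] nouveau E[  PGRAPH][0000:02:00.0] TRAP_TEXTURE - TP0: Unhandled ustatus 0x00000003
[30029.706400] nouveau E[     PFB][0000:02:00.0] trapped read at 0x0000000000 on channel 0x0000797f [gnome-shell[1661]] PGRAPH/TEXTURE/00 reason: PT_NOT_PRESENT
"""

# Matches the "validate: <negative errno>" lines and captures the code.
VALIDATE_RE = re.compile(r"nouveau E\[.*\] validate: (-\d+)$")


def errno_name(code):
    """Map a kernel-style negative return code to its errno symbol."""
    return errno.errorcode.get(abs(code), "UNKNOWN")


def summarize(log_text):
    """Count validate failures (by errno), texture traps and page faults."""
    counts = Counter()
    for line in log_text.splitlines():
        match = VALIDATE_RE.search(line)
        if match:
            counts["validate:" + errno_name(int(match.group(1)))] += 1
        elif "TRAP_TEXTURE" in line:
            counts["texture_trap"] += 1
        elif "PT_NOT_PRESENT" in line:
            counts["page_not_present"] += 1
    return counts


if __name__ == "__main__":
    print(errno_name(-12))
    print(dict(summarize(SAMPLE)))
```

Feeding the full dmesg attachment to `summarize` in place of `SAMPLE` would show whether the texture traps are always preceded by ENOMEM validate failures, which is the pattern the excerpt suggests.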

The question is how can this happen though... I guess something was requested to be migrated to vram and that request failed, but the driver didn't notice? Not sure. Your second log shows a pushbuf submission failure that caused a validate fail. I don't remember if that means that something was claimed to be in VRAM and wasn't or if it means that something was requested to be moved to vram and the move failed.

The fact that you can even go to another TTY and restart GS indicates that the card is in no way hung, which is nice.

You don't mention what version of mesa you're using... but I'm assuming something pretty recent, like 10.0.3 or maybe even 10.1-rc1?
Comment 5 Martin Lukeš 2014-02-14 22:50:11 UTC
Yes, I forgot to mention Mesa version, it is 10.0.3-1 on Arch Linux.
Comment 6 Roman Himmes 2014-05-23 09:02:10 UTC
Created attachment 99622 [details]
Kernel log
Comment 7 Roman Himmes 2014-05-23 09:53:04 UTC
The above comment is missing a description. Here it is:

On my system (Arch Linux) I run into texture problems in gnome-shell with very similar errors.
When I have a certain number of widgets open, the task switcher shows black squares instead of previews of my widgets.

Used versions:
- xf86-video-nouveau 1.0.10-2
- mesa 10.1.4-1
- nouveau-dri 10.1.4-1
- libdrm 2.4.54-1
- Linux 3.14.4-1 x86_64
Comment 8 Martin Lukeš 2014-05-24 12:15:33 UTC
So I gave up on using Nouveau and went back to the nVidia driver, specifically its 304xx version on Arch Linux.

local/lib32-libcl 1.1-1
local/lib32-libvdpau 0.7-2
local/lib32-nvidia-304xx-libgl 304.121-2
local/lib32-nvidia-304xx-utils 304.121-2
local/lib32-opencl-nvidia 337.19-1
local/libcl 1.1-3
local/libvdpau 0.7-1
local/nvidia-304xx 304.121-3
local/nvidia-304xx-libgl 304.121-2
local/nvidia-304xx-utils 304.121-2
local/opencl-nvidia 337.19-1


I'm also using these Mesa libs:
local/glu 9.0.0-2
local/lib32-glu 9.0.0-2
local/lib32-mesa 10.1.3-1
local/lib32-mesa-demos 8.1.0-3
local/mesa 10.1.3-1
local/mesa-demos 8.1.0-2
Comment 9 Fabien Bourigault 2014-06-13 10:27:16 UTC
I also had this bug on Arch Linux with my onboard GeForce 7300. I increased the shared memory in the BIOS settings from 256M to 512M and the problems are gone!