Created attachment 50095: xorg log

Hi.

I have been experiencing quite frequent system freezes since I switched over to nouveau... :(

This system runs Debian unstable on AMD64; the GPU is a G86 [GeForce 8400M G].

The symptom is that X suddenly freezes (although the system seems to keep running in the background). The mouse continues to work; the keyboard doesn't, although SysRq is possible to some extent (well, sync/umount/boot). Going back to the console does not work.

I've attached an Xorg.log (although it didn't contain any errors).

Also, some kernel log messages, collected from various events:

kern.log.2.xz:Jul 30 20:19:44 heisenberg kernel: [16890.395990] [drm] nouveau 0000:01:00.0: fail ttm_validate
kern.log.2.xz:Jul 30 20:19:44 heisenberg kernel: [16890.395995] [drm] nouveau 0000:01:00.0: validate vram_list
kern.log.2.xz:Jul 30 20:19:44 heisenberg kernel: [16890.396031] [drm] nouveau 0000:01:00.0: validate: -12
kern.log.2.xz:Jul 30 20:21:38 heisenberg kernel: [17004.414063] [drm] nouveau 0000:01:00.0: EvoCh 2 Mthd 0x0080 Data 0x00000000 (0x000b 0x05)
kern.log.2.xz:Jul 30 20:22:55 heisenberg kernel: [17081.482834] [drm] nouveau 0000:01:00.0: fail ttm_validate
kern.log.2.xz:Jul 30 20:22:55 heisenberg kernel: [17081.482840] [drm] nouveau 0000:01:00.0: validate vram_list
kern.log.2.xz:Jul 30 20:22:55 heisenberg kernel: [17081.482867] [drm] nouveau 0000:01:00.0: validate: -12
kern.log.2.xz:Jul 30 20:22:55 heisenberg kernel: [17081.495997] [drm] nouveau 0000:01:00.0: fail ttm_validate
kern.log.2.xz:Jul 30 20:22:55 heisenberg kernel: [17081.496136] [drm] nouveau 0000:01:00.0: validate vram_list
kern.log.2.xz:Jul 30 20:22:55 heisenberg kernel: [17081.496155] [drm] nouveau 0000:01:00.0: validate: -12
kern.log.2.xz:Jul 30 20:22:58 heisenberg kernel: [17084.554033] [drm] nouveau 0000:01:00.0: fail ttm_validate
kern.log.2.xz:Jul 30 20:22:58 heisenberg kernel: [17084.554036] [drm] nouveau 0000:01:00.0: validate vram_list
kern.log.2.xz:Jul 30 20:22:58 heisenberg kernel: [17084.554056] [drm] nouveau 0000:01:00.0: validate: -16
kern.log.3.xz:Jul 21 20:12:56 heisenberg kernel: [29940.042744] [drm] nouveau 0000:01:00.0: Error allocating channel PRAMIN: -28
kern.log.3.xz:Jul 21 20:12:56 heisenberg kernel: [29940.042748] [drm] nouveau 0000:01:00.0: init pramin
kern.log.3.xz:Jul 21 20:12:56 heisenberg kernel: [29940.042750] [drm] nouveau 0000:01:00.0: gpuobj -28
kern.log.2.xz:Jul 29 20:48:19 heisenberg kernel: [ 5362.017674] [drm] nouveau 0000:01:00.0: PFIFO_CACHE_ERROR - Ch 4/1 Mthd 0x0060 Data 0xbeef0201
kern.log.2.xz:Jul 29 12:39:11 heisenberg kernel: [ 5129.943625] [drm] nouveau 0000:01:00.0: fail ttm_validate
kern.log.2.xz:Jul 29 12:39:11 heisenberg kernel: [ 5129.943627] [drm] nouveau 0000:01:00.0: validate vram_list
kern.log.2.xz:Jul 29 12:39:11 heisenberg kernel: [ 5129.943632] [drm] nouveau 0000:01:00.0: validate: -12
kern.log.2.xz:Jul 29 12:39:11 heisenberg kernel: [ 5129.981260] [drm] nouveau 0000:01:00.0: fail ttm_validate
kern.log.2.xz:Jul 29 12:39:11 heisenberg kernel: [ 5129.981261] [drm] nouveau 0000:01:00.0: validate vram_list
kern.log.2.xz:Jul 29 12:39:11 heisenberg kernel: [ 5129.981267] [drm] nouveau 0000:01:00.0: validate: -12
kern.log.1:Aug 5 21:14:12 heisenberg kernel: [11333.244292] [drm] nouveau 0000:01:00.0: PFIFO_CACHE_ERROR - Ch 2/1 Mthd 0x0060 Data 0xd8000001
kern.log.1:Aug 5 16:34:38 heisenberg kernel: [12607.195869] [drm] nouveau 0000:01:00.0: PFIFO_CACHE_ERROR - Ch 4/1 Mthd 0x0060 Data 0xbeef0201

In this case, it apparently just killed my window manager, falling back to metacity, and I could somehow cleanly reboot the system:

Aug 10 14:43:40 heisenberg kernel: [ 1061.085190] [drm] nouveau 0000:01:00.0: fail ttm_validate
Aug 10 14:43:40 heisenberg kernel: [ 1061.085195] [drm] nouveau 0000:01:00.0: validate vram_list
Aug 10 14:43:40 heisenberg kernel: [ 1061.085200] [drm] nouveau 0000:01:00.0: validate: -12
Aug 10 14:53:25 heisenberg kernel: [ 1646.803467] [drm] nouveau 0000:01:00.0: fail ttm_validate
Aug 10 14:53:25 heisenberg kernel: [ 1646.803472] [drm] nouveau 0000:01:00.0: validate vram_list
Aug 10 14:53:25 heisenberg kernel: [ 1646.803477] [drm] nouveau 0000:01:00.0: validate: -12
Aug 10 14:53:25 heisenberg kernel: [ 1646.861090] [drm] nouveau 0000:01:00.0: fail ttm_validate
Aug 10 14:53:25 heisenberg kernel: [ 1646.861096] [drm] nouveau 0000:01:00.0: validate vram_list
Aug 10 14:53:25 heisenberg kernel: [ 1646.861102] [drm] nouveau 0000:01:00.0: validate: -12
Aug 10 15:10:38 heisenberg kernel: [ 2679.960989] [drm] nouveau 0000:01:00.0: fail ttm_validate
Aug 10 15:10:38 heisenberg kernel: [ 2679.960994] [drm] nouveau 0000:01:00.0: validate vram_list
Aug 10 15:10:38 heisenberg kernel: [ 2679.961027] [drm] nouveau 0000:01:00.0: validate: -12
Aug 10 15:10:44 heisenberg kernel: [ 2685.310110] [drm] nouveau 0000:01:00.0: PGRAPH_TRAP_MP_EXEC - TP 0 MP 0: INVALID_OPCODE at 000000 warp 0, opcode 00000000 00000000
Aug 10 15:10:44 heisenberg kernel: [ 2685.310117] [drm] nouveau 0000:01:00.0: PGRAPH - TRAP
Aug 10 15:10:44 heisenberg kernel: [ 2685.310121] [drm] nouveau 0000:01:00.0: PGRAPH - ch 5 (0x0006496000) subc 7 class 0x8297 mthd 0x1a1c data 0x00001111
Aug 10 15:10:44 heisenberg kernel: [ 2685.346157] [drm] nouveau 0000:01:00.0: PGRAPH_TRAP_MP_EXEC - TP 0 MP 0: INVALID_OPCODE at 000000 warp 0, opcode 00000000 00000000
Aug 10 15:10:44 heisenberg kernel: [ 2685.346165] [drm] nouveau 0000:01:00.0: PGRAPH - TRAP
Aug 10 15:10:44 heisenberg kernel: [ 2685.346169] [drm] nouveau 0000:01:00.0: PGRAPH - ch 5 (0x0006496000) subc 7 class 0x8297 mthd 0x1a1c data 0x00001111

(I was just doing some LibreOffice stuff, so there shouldn't have been any 3D involved!?)

Some filesystem corruption usually follows from this, which makes the whole thing barely usable.

Not sure whether this is related, but I also had some severe nouveau freezes that seemed to be related to the system running out of main memory. (I had several VirtualBox VMs started.) This was quite easily reproducible.

Any ideas? Or is there at least some way to reliably kill nouveau/X and go back to a working console?

If you need further data, please tell me which.

Cheers,
Chris.
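A note for anyone reading the excerpts above: the trailing numbers on the "validate:", "PRAMIN:" and "gpuobj" lines look like negative errno values, which is the usual convention for kernel return codes. Below is a minimal Python sketch, assuming that convention holds for these nouveau messages, that tallies them by symbolic name. The regular expression and the helper names (errno_name, summarize) are made up for illustration and are not part of nouveau or any existing tool.

```python
import errno
import re
from collections import Counter

# Matches nouveau lines that end in a negative number, e.g.
# "... nouveau 0000:01:00.0: validate: -12" (assumed to be a negative errno).
ERRNO_RE = re.compile(r"nouveau [0-9a-f:.]+: .*?(-\d+)\s*$")


def errno_name(code: int) -> str:
    """Return the symbolic errno name for a negative kernel return code."""
    return errno.errorcode.get(abs(code), "UNKNOWN")


def summarize(lines):
    """Count symbolic errno names found at the end of nouveau log lines."""
    counts = Counter()
    for line in lines:
        match = ERRNO_RE.search(line)
        if match:
            counts[errno_name(int(match.group(1)))] += 1
    return counts


# A few lines taken from the report; the last one carries no error code.
sample = [
    "[16890.396031] [drm] nouveau 0000:01:00.0: validate: -12",
    "[17084.554056] [drm] nouveau 0000:01:00.0: validate: -16",
    "[29940.042750] [drm] nouveau 0000:01:00.0: gpuobj -28",
    "[16890.395990] [drm] nouveau 0000:01:00.0: fail ttm_validate",
]
print(summarize(sample))
```

On Linux, -12, -16 and -28 correspond to ENOMEM, EBUSY and ENOSPC. Feeding a whole kern.log through summarize() would show which of them dominates, which may help when comparing against the out-of-memory freezes mentioned above.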
Oh, and some versions:

linux 3.0
$ apt-cache show xserver-xorg | grep Version
Version: 1:7.6+7
$ apt-cache show xserver-xorg-core | grep Version
Version: 2:1.10.3-1
$ apt-cache show xserver-xorg-video-nouveau | grep Version
Version: 1:0.0.16+git20110411+8378443-1+b1
$ apt-cache show libdrm-nouveau1a | grep Version
Version: 2.4.26-1
$ apt-cache show libgl1-mesa-dri-experimental | grep Version
Version: 7.10.3-4
Hi Christoph,

Would you mind removing the following package, in order to establish whether it is related to or triggers the issue?

libgl1-mesa-dri-experimental

Note that by doing so your compositing manager will/should fall back to software rendering (you will not have any 3D/GL hardware acceleration), so your desktop experience is not going to be as smooth.

For future reference, would you mind attaching the whole log (dmesg) rather than pasting fragments of it in the report?

Can you please take a look at our Bugs [1] and FAQ [2] sections to eliminate the most common root causes?

Cheers,
Emil

[1] http://nouveau.freedesktop.org/wiki/Bugs
[2] http://nouveau.freedesktop.org/wiki/FAQ
Hi.

>Would you mind removing the following package in order to
>establish if it's related/triggers the issue
>libgl1-mesa-dri-experimental
I feared you'd ask this ^^... Well, I can give it a try; of course I'll lose compiz, which I'm particularly used to... It might very well be that the problem doesn't occur without it.

>For future reference would you mind attaching the whole log (dmesg)
>rather than pasting fragments of it in the report
OK... that was just a huge mess, and apart from the usual log messages about detecting the card etc. there was no DRI-related output at all.

I just had the same(?) issue on another machine, with basically the same software config but a G94 (GeForce 9600 GT) with two monitors attached. But this time there was no output to the logs at all, neither in xorg.log nor in the kernel log. And even the SysRq messages made it into the syslog before the system rebooted.
This particular system also shows a very easily reproducible bug, but I'll report that in a separate bug report.

>Can you please take a look at our Bugs [1] and FAQ [2]
>section to eliminate one of the most common root cause
I've actually read them, but honestly, some of the things you suggest there are rather "difficult" for end users to do (and I've studied computer science)... Using git head for everything in particular is quite an effort, especially when you want to stay reasonably in sync with your distro.

Cheers,
Chris.
Oh, btw: wouldn't it be possible to add functionality so that, if nouveau detects a longer-lasting lockup, it kills itself and goes back to some basic graphics mode? Magic SysRq + g doesn't seem to work :(
*** Bug 45230 has been marked as a duplicate of this bug. ***
This basically still persists.

Now with:
Linux 3.2.12
libgl1-mesa-dri/libgl1-mesa-glx 7.11.2-1
libdrm-nouveau1a 2.4.32-1
xserver-xorg-video-nouveau 1:0.0.16+g

One thing I've noticed: it's possible to see the problems approaching (even when not looking at the kernel output)... The screen starts to flicker in some areas (especially when switching virtual desktops, due to the compiz animation stuff). If you then continue to work, it will usually freeze.
I've also noticed that closing windows (e.g. some terminals) usually helps at that point and also stops the flickering.

When the freeze does happen, however, one now usually has a few seconds to switch to the console with Ctrl+Alt+F1 and then press Ctrl+Alt+Del or trigger an ACPI power button event. That way you at least shut down cleanly, though still losing all your work. :-(
Hello,

I have the same problem: X freezes, the mouse is working, the keyboard does (mostly) not work. But: I can still switch to the console with Ctrl+Alt+F1 or something like that.

In my syslog I find messages like these:

Apr 25 14:27:05 foo kernel: [1390386.065164] [drm] nouveau 0000:0f:00.0: fail ttm_validate
Apr 25 14:27:05 foo kernel: [1390386.065168] [drm] nouveau 0000:0f:00.0: validate vram_list
Apr 25 14:27:05 foo kernel: [1390386.065177] [drm] nouveau 0000:0f:00.0: validate: -12
Apr 25 14:28:00 foo kernel: [1390440.453306] [drm] nouveau 0000:0f:00.0: fail ttm_validate
Apr 25 14:28:00 foo kernel: [1390440.453310] [drm] nouveau 0000:0f:00.0: validate vram_list
Apr 25 14:28:00 foo kernel: [1390440.453337] [drm] nouveau 0000:0f:00.0: validate: -12
Apr 25 14:28:54 foo kernel: [1390494.359876] [drm] nouveau 0000:0f:00.0: fail ttm_validate
Apr 25 14:28:54 foo kernel: [1390494.359880] [drm] nouveau 0000:0f:00.0: validate vram_list
Apr 25 14:28:54 foo kernel: [1390494.359887] [drm] nouveau 0000:0f:00.0: validate: -12
Apr 25 14:34:01 foo gdm3][17808]: GLib-GIO-WARNING: Dropping signal ActiveSessionChanged of type (s) since the type from the expected interface is (o)
Apr 25 14:34:03 foo acpid: client 17732[0:0] has disconnected

My solution is then normally to restart gdm; after that it works again (for some time).

It's absolutely not reproducible; for me it's a totally random occurrence.

Some of my system settings:

Debian testing
% uname -a
Linux foo 3.2.0-4-amd64 #1 SMP Debian 3.2.41-2 x86_64 GNU/Linux
% lspci G NVIDIA
0f:00.0 VGA compatible controller: NVIDIA Corporation G98 [Quadro NVS 295] (rev a1)
% show xserver-xorg G Version
Version: 1:7.7+2
% show xserver-xorg-core G Version
Version: 2:1.12.4-6
% show xserver-xorg-video-nouveau G Version
Version: 1:1.0.1-5
% show libdrm-nouveau1a G Version
Version: 2.4.40-1~deb7u2
% show libgl1-mesa-dri-experimental G Version
Version: 8.0.5-4

I'll try to provide any information you need!

Thanks for any help!
Markus
Does this still happen with recent software (kernel 3.11, mesa 9.2, xf86-video-nouveau 1.0.9)? If so, please post fresh dmesg/xorg logs.
No response to re-test request after a month. Closing as invalid.
Sorry for not having responded,... forgot this somehow. Anyway, I no longer have nvidia cards, so I couldn't have tested it anymore :(