Bug 90967 - [NV92] System freeze using nouveau
Status: NEW
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/nouveau
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Nouveau Project
QA Contact: Xorg Project Team
Reported: 2015-06-13 10:10 UTC by Jérôme
Modified: 2015-07-17 08:21 UTC

Attachments
dmesg output (59.95 KB, text/plain)
2015-06-13 10:12 UTC, Jérôme
Xorg log (75.25 KB, text/plain)
2015-06-13 10:13 UTC, Jérôme

Description Jérôme 2015-06-13 10:10:54 UTC
Hi.

Sometimes my system freezes. The keyboard stops responding and CTRL+Fx does nothing, although I can still REISUB. The mouse pointer moves, but the screen is frozen, so I can't click anywhere.

I have tried waiting two or three minutes, but nothing happens, so I generally REISUB.

Is there any point in waiting longer? Is there a timeout or watchdog of any kind?

When I switch the computer back on, after logging in with my user and before the desktop appears (background, panels, etc.), I get a kind of broken mosaic of images from my session before the failure (not the images displayed on the frozen screen, but images from different browser pages that were open when it occurred).

It seems to happen randomly, so I can't reproduce it systematically. I think it always happens when I'm watching a video (using vlc) while my browser is running, maybe along with other apps such as a mail client. It could be linked to a specific user action (like moving a window or changing focus), because I don't remember it happening while I'm just sitting away from the keyboard, e.g. watching a movie.

Today, instead of REISUBing, I went to another computer to access mine through SSH. When I came back, lightdm was waiting for my username and password. Considering the uptime and the dates of /var/log/Xorg.0.log.old, I guess Xorg was restarted. When logging in, I didn't get the mosaic.

I doubt it is a different bug, so maybe Xorg was restarted because I waited longer than usual or, less likely, because of my remote login. Presumably I didn't get the mosaic because nouveau's buffers were flushed more cleanly than with REISUB.

Anyway, I don't know much about graphics, so I can't make more guesses and I can't tell whether this is a duplicate or not.

My system is an up-to-date Debian Jessie.

xserver-xorg-video-nouveau -> 1:1.0.11-1
Linux 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt11-1 (2015-05-24) x86_64 GNU/Linux

Attaching dmesg output (the same now as when I was logged in remotely; nothing about an Xorg restart or anything else since the freeze).

Also attaching /var/log/Xorg.0.log.old.
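
The restart inference in the description (uptime compared with the dates of /var/log/Xorg.0.log.old) can be sketched in Python. This is only an illustration, assuming that the rotated log's modification time marks the last write of the previous X server. The function names and the sample timestamps are hypothetical.

```python
import time

def boot_time():
    """Approximate boot time (seconds since the epoch) from /proc/uptime (Linux only)."""
    with open("/proc/uptime") as handle:
        uptime_seconds = float(handle.read().split()[0])
    return time.time() - uptime_seconds

def previous_server_ran_this_boot(old_log_mtime, booted_at):
    """True if the rotated X log was last written after the machine booted,
    i.e. an earlier X server ran (and ended) during the current boot."""
    return old_log_mtime > booted_at

# Hypothetical timestamps: the old log was last written one hour after boot.
print(previous_server_ran_this_boot(old_log_mtime=1_003_600.0, booted_at=1_000_000.0))
```

On a real system the first argument would come from `os.path.getmtime("/var/log/Xorg.0.log.old")` and the second from `boot_time()`.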
Comment 1 Jérôme 2015-06-13 10:12:22 UTC
Created attachment 116474
dmesg output
Comment 2 Jérôme 2015-06-13 10:13:02 UTC
Created attachment 116475
Xorg log
Comment 3 Jérôme 2015-06-13 10:18:10 UTC
Not sure it is needed, but I forgot to specify my hardware.

The video card is an NVIDIA Corporation G92 [GeForce 9800 GT]

--------------------------------------------------------------

lspci output:

00:00.0 Host bridge: NVIDIA Corporation C55 Host Bridge (rev a2)
00:00.1 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)
00:00.2 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)
00:00.3 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)
00:00.4 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)
00:00.5 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a2)
00:00.6 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)
00:00.7 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)
00:01.0 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)
00:01.1 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)
00:01.2 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)
00:01.3 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)
00:01.4 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)
00:01.5 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)
00:01.6 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)
00:02.0 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)
00:02.1 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)
00:02.2 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)
00:03.0 PCI bridge: NVIDIA Corporation C55 PCI Express bridge (rev a1)
00:09.0 RAM memory: NVIDIA Corporation MCP51 Host Bridge (rev a2)
00:0a.0 ISA bridge: NVIDIA Corporation MCP51 LPC Bridge (rev a3)
00:0a.1 SMBus: NVIDIA Corporation MCP51 SMBus (rev a3)
00:0a.2 RAM memory: NVIDIA Corporation MCP51 Memory Controller 0 (rev a3)
00:0b.0 USB controller: NVIDIA Corporation MCP51 USB Controller (rev a3)
00:0b.1 USB controller: NVIDIA Corporation MCP51 USB Controller (rev a3)
00:0d.0 IDE interface: NVIDIA Corporation MCP51 IDE (rev a1)
00:0e.0 RAID bus controller: NVIDIA Corporation MCP51 Serial ATA Controller (rev a1)
00:0f.0 RAID bus controller: NVIDIA Corporation MCP51 Serial ATA Controller (rev a1)
00:10.0 PCI bridge: NVIDIA Corporation MCP51 PCI Bridge (rev a2)
00:10.1 Audio device: NVIDIA Corporation MCP51 High Definition Audio (rev a2)
00:14.0 Bridge: NVIDIA Corporation MCP51 Ethernet Controller (rev a3)
01:00.0 VGA compatible controller: NVIDIA Corporation G92 [GeForce 9800 GT] (rev a2)
02:05.0 FireWire (IEEE 1394): LSI Corporation FW322/323 [TrueFire] 1394a Controller (rev 70)

--------------------------------------------------------------


I can't tell when this began to happen, so I can't relate it to any change of kernel version, nouveau version, or anything else.

I think it has been happening for a while (months, at least), but not very often. Each time, I would peek into the logs, grab a few keywords from the errors, and search the web, which led me nowhere, so I left it alone.
Comment 4 Ilia Mirkin 2015-06-13 14:31:24 UTC
G92 is known to hang when using video decoding acceleration... try removing the firmware if you have it? (/lib/firmware/nouveau) If you don't have the firmware, it's definitely something else though.
Comment 5 Jérôme 2015-06-13 14:46:08 UTC
I don't have it.

ls: cannot access /lib/firmware/nouveau: No such file or directory

I vaguely remember trying to install it once to see if it would solve the tearing issues I sometimes get, but I didn't succeed, so I removed everything. I can't remember exactly what the problem was.
Comment 6 Ilia Mirkin 2015-06-13 16:01:57 UTC
(In reply to Jérôme from comment #5)
> I don't have it.
> 
> ls: cannot access /lib/firmware/nouveau: No such file or directory
> 
> I vaguely remember trying to install it once to see if it would solve the
> tearing issues I sometimes get, but I didn't succeed, so I removed
> everything. I can't remember exactly what the problem was.

OK, that also makes sense... the symptom there was an insta-hang on vdpau usage.

Do you perchance have libdrm 2.4.60? If so, switch to a different version (either .59 or .61).
Comment 7 Jérôme 2015-06-13 16:08:02 UTC
I'm using libdrm-nouveau2 (2.4.58-2)

https://packages.debian.org/jessie/libdrm-nouveau2

I could try 2.4.60-3 from Debian Stretch. Would that be useful? It would be hard to draw conclusions anyway, since the bug is not easy to reproduce.
Comment 8 Ilia Mirkin 2015-06-13 16:10:40 UTC
(In reply to Jérôme from comment #7)
> I'm using libdrm-nouveau2 (2.4.58-2)
> 
> https://packages.debian.org/jessie/libdrm-nouveau2
> 
> I could try 2.4.60-3 from Debian Stretch. Would that be useful? It would be
> hard to draw conclusions anyway, since the bug is not easy to reproduce.

No, 2.4.60 is broken for nouveau :) Figured you might have been using it.
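
The version check discussed in the comments above can be sketched as a small Python helper. It strips the epoch and Debian revision from a package version string and compares the upstream part against 2.4.60, the release reported here as broken for nouveau. The helper names are hypothetical, and the parsing assumes a simple `major.minor.patch` upstream version.

```python
import re

# Upstream libdrm release reported in this thread as broken for nouveau.
KNOWN_BAD_UPSTREAM = (2, 4, 60)

def upstream_version(package_version):
    """Return (major, minor, patch) from a Debian-style version string."""
    # Drop an optional epoch ("1:") and the Debian revision ("-3", "-1+b1").
    without_epoch = package_version.split(":", 1)[-1]
    upstream = without_epoch.rsplit("-", 1)[0] if "-" in without_epoch else without_epoch
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", upstream)
    if match is None:
        raise ValueError(f"unrecognised version: {package_version!r}")
    return tuple(int(part) for part in match.groups())

def is_known_bad_libdrm(package_version):
    """True if the package is built from the known-bad upstream release."""
    return upstream_version(package_version) == KNOWN_BAD_UPSTREAM

# The two package versions mentioned in this thread.
for version in ("2.4.58-2", "2.4.60-3"):
    verdict = "known bad" if is_known_bad_libdrm(version) else "not the known-bad release"
    print(version, verdict)
```

On Debian, the installed version string can be obtained with `dpkg-query -W -f='${Version}' libdrm-nouveau2` and passed to `is_known_bad_libdrm`.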
Comment 9 homer242 2015-06-13 20:46:21 UTC
I think I have the same issue on my Debian sid system.

A week ago I was using the nvidia driver and did not have this kind of problem. It may be a coincidence, because I also upgraded some packages.

As with Jérôme, the problem occurs when I use vlc with "automatic output selection", after a while (I think it's random, but it's about 20 minutes). When I use "X11 video output (XCB)", the problem does not show up. It also happens when I play a YouTube video in iceweasel. Just a second before the computer freezes, there is a big slowdown, like a fork bomb :)

----
libdrm-nouveau2:amd64                     2.4.60-3
libdrm-nouveau2:i386                      2.4.60-3
xserver-xorg-video-nouveau                1:1.0.11-1+b1
----

I think the most relevant part is this excerpt from Xorg.log:

----
(EE) [mi] EQ overflow continuing.  1000 events have been dropped.
(EE) [mi] No further overflow reports will be reported until the clog is cleared.
(EE) 
(EE) Backtrace:
(EE) 0: /usr/bin/Xorg (xorg_backtrace+0x56) [0x7f8530944a36]
(EE) 1: /usr/bin/Xorg (QueuePointerEvents+0x52) [0x7f8530801902]
(EE) 2: /usr/lib/xorg/modules/input/evdev_drv.so (0x7f8526289000+0x60a7) [0x7f852628f0a7]
(EE) 3: /usr/lib/xorg/modules/input/evdev_drv.so (0x7f8526289000+0x687d) [0x7f852628f87d]
(EE) 4: /usr/bin/Xorg (0x7f8530792000+0x963a8) [0x7f85308283a8]
(EE) 5: /usr/bin/Xorg (0x7f8530792000+0xbf349) [0x7f8530851349]
(EE) 6: /lib/x86_64-linux-gnu/libc.so.6 (0x7f852e692000+0x35180) [0x7f852e6c7180]
(EE) 7: /lib/x86_64-linux-gnu/libc.so.6 (ioctl+0x7) [0x7f852e770be7]
(EE) 8: /usr/lib/x86_64-linux-gnu/libdrm.so.2 (drmIoctl+0x28) [0x7f852fa4fb48]
(EE) 9: /usr/lib/x86_64-linux-gnu/libdrm.so.2 (drmCommandWrite+0x1b) [0x7f852fa5280b]
(EE) 10: /usr/lib/x86_64-linux-gnu/libdrm_nouveau.so.2 (nouveau_bo_wait+0x8c) [0x7f852a5a276c]
(EE) 11: /usr/lib/xorg/modules/drivers/nouveau_drv.so (0x7f852a7a7000+0x7ef3) [0x7f852a7aeef3]
(EE) 12: /usr/lib/xorg/modules/libexa.so (0x7f8529f62000+0x569b) [0x7f8529f6769b]
(EE) 13: /usr/lib/xorg/modules/libexa.so (0x7f8529f62000+0x7e5f) [0x7f8529f69e5f]
(EE) 14: /usr/lib/xorg/modules/libexa.so (0x7f8529f62000+0x1194b) [0x7f8529f7394b]
(EE) 15: /usr/lib/xorg/modules/libexa.so (0x7f8529f62000+0xe810) [0x7f8529f70810]
(EE) 16: /usr/bin/Xorg (0x7f8530792000+0x13b771) [0x7f85308cd771]
(EE) 17: /usr/lib/xorg/modules/libexa.so (0x7f8529f62000+0xf72f) [0x7f8529f7172f]
(EE) 18: /usr/bin/Xorg (0x7f8530792000+0x13212c) [0x7f85308c412c]
(EE) 19: /usr/bin/Xorg (0x7f8530792000+0x57f37) [0x7f85307e9f37]
(EE) 20: /usr/bin/Xorg (0x7f8530792000+0x5c0bb) [0x7f85307ee0bb]
(EE) 21: /lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main+0xf5) [0x7f852e6b3b45]
(EE) 22: /usr/bin/Xorg (0x7f8530792000+0x464be) [0x7f85307d84be]
(EE) 

----

I can give you more information if you want. I can easily reproduce the problem.
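
To make the backtrace above easier to read, here is a minimal Python sketch that extracts the frames and lists the ones that carry symbol names. The regular expression and the function name are assumptions based only on the line format shown in this comment. The sample lines are copied from the log excerpt.

```python
import re

# Matches frames such as:
# (EE) 10: /usr/lib/.../libdrm_nouveau.so.2 (nouveau_bo_wait+0x8c) [0x7f852a5a276c]
FRAME_RE = re.compile(r"\(EE\)\s+(\d+):\s+(\S+)\s+\(([^)]*)\)\s+\[(0x[0-9a-f]+)\]")

def parse_backtrace(log_text):
    """Return one dict per backtrace frame found in the log text."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match is None:
            continue
        index, module, location, address = match.groups()
        # Symbolised frames look like "name+0xoff"; others like "0xbase+0xoff".
        name = location.split("+", 1)[0]
        symbol = None if name.startswith("0x") else name
        frames.append({"index": int(index), "module": module,
                       "symbol": symbol, "address": address})
    return frames

SAMPLE = """\
(EE) 1: /usr/bin/Xorg (QueuePointerEvents+0x52) [0x7f8530801902]
(EE) 7: /lib/x86_64-linux-gnu/libc.so.6 (ioctl+0x7) [0x7f852e770be7]
(EE) 10: /usr/lib/x86_64-linux-gnu/libdrm_nouveau.so.2 (nouveau_bo_wait+0x8c) [0x7f852a5a276c]
(EE) 11: /usr/lib/xorg/modules/drivers/nouveau_drv.so (0x7f852a7a7000+0x7ef3) [0x7f852a7aeef3]
"""

frames = parse_backtrace(SAMPLE)
print([frame["symbol"] for frame in frames if frame["symbol"]])
# → ['QueuePointerEvents', 'ioctl', 'nouveau_bo_wait']
```

Read this way, the named frames suggest the server was inside an `ioctl` issued from `nouveau_bo_wait` when the input event queue overflowed. That fits a server blocked waiting on the GPU, although the backtrace alone does not prove it.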
Comment 10 homer242 2015-06-13 20:47:24 UTC
I have this nvidia card:

01:00.0 VGA compatible controller: NVIDIA Corporation GT215 [GeForce GT 240] (rev a2)
Comment 11 Tobias Klausmann 2015-06-13 21:04:54 UTC
(In reply to homer242 from comment #9)
> I think I have the same issue on my Debian sid system.
> 
> A week ago I was using the nvidia driver and did not have this kind of
> problem. It may be a coincidence, because I also upgraded some packages.
> 
> As with Jérôme, the problem occurs when I use vlc with "automatic output
> selection", after a while (I think it's random, but it's about 20
> minutes). When I use "X11 video output (XCB)", the problem does not show up.
> It also happens when I play a YouTube video in iceweasel. Just a second
> before the computer freezes, there is a big slowdown, like a fork bomb :)
> 
> ----
> libdrm-nouveau2:amd64                     2.4.60-3
> libdrm-nouveau2:i386                      2.4.60-3
> xserver-xorg-video-nouveau                1:1.0.11-1+b1
> ----
> 
> I think the most relevant part is this excerpt from Xorg.log:

snip

> ----
> 
> I can give you more information if you want. I can easily reproduce the
> problem.

Try using libdrm != 2.4.60; that version causes problems. If the problem persists, you can come back! :)
Comment 12 homer242 2015-06-15 12:52:06 UTC
OK. I'm using Debian sid. Do you have a procedure to downgrade libdrm-nouveau2 to 2.4.58?

I tried adding the jessie repository and running apt-get install libdrm-nouveau2=2.4.58-2, but I ran into dependency problems with other packages (libdrm-dev, steam:i386, ...). Does anyone have a short sequence of commands to do this downgrade cleanly?

Otherwise, when do you think the problem will be solved? Is someone already working on this bug?

Thanks,
Anthony.
Comment 13 Sanford Rockowitz 2015-07-16 16:28:27 UTC
TL;DR: Replacing libdrm 2.4.60 with 2.4.58 appears to solve the problem for me.

I've seen the same problem intermittently. Xorg.0.log output is similar to that of homer242. The problem seems to be triggered by trying to navigate away from an unresponsive web page. The problem occurred with the 3.18, 3.19, and 4.0 kernels. Dropping back to the 3.16 kernel always resulted in a stable system. (3.17 had other, more severe problems and was unusable.)

Since reverting to libdrm 2.4.58 seven days ago, my Fedora system has been stable, first with Fedora kernel 4.0.6-200.fc21.x86_64 and then with 4.0.7-200.fc21.x86_64.

Environment: 
Fedora 21 64 bit 
nouveau driver
GTX660Ti video card
Comment 14 Jérôme 2015-07-17 08:21:03 UTC
The problem I described in the original post happens (or happened) with 2.4.58, so it has to be something different.

As I said, it does not occur so often. It has not happened since then.

Besides, I reinstalled my system recently (still a Jessie) since I changed my hard drives. It shouldn't make any difference, but you never know. Maybe I got rid of some long-forgotten parameter inherited from a manual change I had made years ago while fighting with Xorg. I doubt it.

Anyway, I don't mind this bug being closed, I can always reopen it if I want to bring new material.

Is there any logging I can enable in case it happens again? Or anything I should try to record over SSH from another machine when X freezes?
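
As one possible way to record state over SSH during a freeze, the following Python sketch saves the same two artifacts attached to this report (dmesg output and the Xorg log) into a timestamped directory. It is an assumption about what might be worth capturing, not advice from the nouveau developers. The function name, the `freeze-` directory prefix, and the default paths are hypothetical choices.

```python
import shutil
import subprocess
import time
from pathlib import Path

def snapshot_logs(dest_root, log_paths=("/var/log/Xorg.0.log",), dmesg_cmd=("dmesg",)):
    """Save kernel messages and the given log files into a new timestamped directory."""
    dest = Path(dest_root) / time.strftime("freeze-%Y%m%d-%H%M%S")
    dest.mkdir(parents=True, exist_ok=True)

    # Kernel ring buffer; reading it may require root on some systems.
    result = subprocess.run(list(dmesg_cmd), capture_output=True, text=True)
    (dest / "dmesg.txt").write_text(result.stdout + result.stderr)

    # Copy each log that exists; missing files are skipped silently.
    for path in map(Path, log_paths):
        if path.is_file():
            shutil.copy2(path, dest / path.name)
    return dest
```

Run from the SSH session while the screen is frozen, for example as `snapshot_logs(Path.home() / "freeze-logs")`, this keeps the logs from before any restart of X, so they can be attached later.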

