Created attachment 52113 [details]
Complete dmesg

Hi,

The system hung while I was watching a movie in VLC on one screen.

I have a dual-monitor setup: one 1920x1080 and one 768x1366 (rotated 90° left). The desktop environment is KDE 4.6.

The kernel kept working: I could log in through ssh, and audio kept playing.

I don't know how to reproduce the bug, but it happens from time to time. Also, the system sometimes fails to disable the video output when suspending, and I have to reboot manually, but this might be an unrelated bug.

If you need more info, please go ahead and ask.

Thanks in advance

athlonno ~ # uname -a
Linux athlonno 2.6.39-gentoo-r3 #6 SMP Sun Aug 21 22:54:31 CEST 2011 x86_64 AMD Athlon(tm) II X4 645 Processor AuthenticAMD GNU/Linux

athlonno ~ # emerge -s nouveau
Searching...
[ Results for search key : nouveau ]
[ Applications found : 2 ]

* x11-base/nouveau-drm [ Masked ]
    Latest version available: 20110820
    Latest version installed: [ Not Installed ]
    Size of files: 1,778 kB
    Homepage: http://nouveau.freedesktop.org/
    Description: Nouveau DRM Kernel Modules for X11
    License: MIT

* x11-drivers/xf86-video-nouveau
    Latest version available: 0.0.16_pre20110801
    Latest version installed: 0.0.16_pre20110801
    Size of files: 131 kB
    Homepage: http://nouveau.freedesktop.org/
    Description: Accelerated Open Source driver for nVidia cards
    License: MIT

athlonno ~ # lspci
00:00.0 Host bridge: Advanced Micro Devices [AMD] RS780 Host Bridge Alternate
00:02.0 PCI bridge: Advanced Micro Devices [AMD] RS780 PCI to PCI bridge (ext gfx port 0)
00:07.0 PCI bridge: Advanced Micro Devices [AMD] RS780 PCI to PCI bridge (PCIE port 3)
00:09.0 PCI bridge: Advanced Micro Devices [AMD] RS780 PCI to PCI bridge (PCIE port 4)
00:0a.0 PCI bridge: Advanced Micro Devices [AMD] RS780 PCI to PCI bridge (PCIE port 5)
00:11.0 SATA controller: ATI Technologies Inc SB700/SB800 SATA Controller [AHCI mode] (rev 40)
00:12.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller
00:12.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI Controller
00:13.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller
00:13.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI Controller
00:14.0 SMBus: ATI Technologies Inc SBx00 SMBus Controller (rev 42)
00:14.2 Audio device: ATI Technologies Inc SBx00 Azalia (Intel HDA) (rev 40)
00:14.3 ISA bridge: ATI Technologies Inc SB700/SB800 LPC host controller (rev 40)
00:14.4 PCI bridge: ATI Technologies Inc SBx00 PCI to PCI Bridge (rev 40)
00:14.5 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI2 Controller
00:16.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller
00:16.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI Controller
00:18.0 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron, Athlon64, Sempron] HyperTransport Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron, Athlon64, Sempron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron, Athlon64, Sempron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron, Athlon64, Sempron] Miscellaneous Control
00:18.4 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron, Athlon64, Sempron] Link Control
01:00.0 FireWire (IEEE 1394): VIA Technologies, Inc. Device 3403 (rev 01)
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 06)
03:00.0 USB Controller: Device 1b73:1000 (rev 01)
04:00.0 VGA compatible controller: nVidia Corporation GT200 [GeForce 210] (rev a2)
04:00.1 Audio device: nVidia Corporation High Definition Audio Controller (rev a1)
Created attachment 52150 [details] Complete dmesg after the freeze
Created attachment 52151 [details] Xorg log after the freeze
How reproducible is this?

It's not clear from the backtrace alone whether we're spinning in kernel land in the ioctl or whether that's just where we happened to be at the time. Could you use strace to see what syscalls are being generated? That will give an idea of whether it's spinning in the kernel or somewhere in userland:

sudo strace -p <pid of server process>
(In reply to comment #3)
> How reproducible is this?
>
> It's not clear based just on the backtrace if we're spinning in kernel land in
> the ioctl or if that's just where we happened to be at the time. Could you use
> strace to see what syscalls are being generated. That will give an idea if
> it's spinning in the kernel or somewhere in userland:
>
> sudo strace -p <pid of server process>

Thanks for the suggestion, but strace dumps a huge amount of data and the bug is very random.

Once it happened while watching a movie in VLC, another time while using the video chat function of Google Plus in Firefox 7.

Both cases happened after a couple of days of uptime (and suspends: I never turn the PC off).

(In reply to comment #3)
> How reproducible is this?
>
> It's not clear based just on the backtrace if we're spinning in kernel land in
> the ioctl or if that's just where we happened to be at the time. Could you use
> strace to see what syscalls are being generated. That will give an idea if
> it's spinning in the kernel or somewhere in userland:
>
> sudo strace -p <pid of server process>

I've been able to reproduce the freeze.

I have a full strace for it, but it's ~500 MB uncompressed, so I guess it doesn't make sense to post it. Is there a specific part you need?
This is the point where the freeze happened:

setitimer(ITIMER_REAL, {it_interval={0, 20000}, it_value={0, 20000}}, NULL) = 0
read(43, "\211\10\10\0\3\0\200\3\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 4096) = 32
ioctl(8, 0x40086482, 0x7fff7b42b790) = 0
ioctl(8, 0x40046483, 0x7fff7b42b7c0) = 0
ioctl(8, 0xc0406481, 0x7fff7b42b2f0) = 0
ioctl(8, 0xc0406481, 0x7fff7b42b920) = 0
writev(26, [{"f\0d\0\371\231p\1\10\232p\1\266(.\0\355\1Z\1\330\0\207\0\0\0@\1\262\4\0\4", 32}], 1) = 32
writev(43, [{"J%\220\3\2\0\0\0\3\0\200\3\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 96}], 1) = 96
read(43, 0x1887520, 4096) = -1 EAGAIN (Resource temporarily unavailable)
setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={0, 0}}, NULL) = 0
select(256, [1 3 6 8 9 10 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 33 34 35 36 37 38 39 40 41 42 43 44 45 47 48], NULL, NULL, {0, 0}) = 0 (Timeout)
setitimer(ITIMER_REAL, {it_interval={0, 20000}, it_value={0, 20000}}, NULL) = 0
ioctl(8, 0xc0406481, 0x7fff7b42b6a0) = 0
ioctl(8, 0x40086482, 0x7fff7b42b730) = ? ERESTARTSYS (To be restarted)
--- SIGALRM (Alarm clock) @ 0 (0) ---
rt_sigreturn(0xe) = -1 EINTR (Interrupted system call)
ioctl(8, 0x40086482, 0x7fff7b42b730) = ? ERESTARTSYS (To be restarted)
--- SIGALRM (Alarm clock) @ 0 (0) ---
rt_sigreturn(0xe) = -1 EINTR (Interrupted system call)
ioctl(8, 0x40086482, 0x7fff7b42b730) = ? ERESTARTSYS (To be restarted)

Attaching dmesg and Xorg.log.
Created attachment 52164 [details] Complete dmesg after the freeze
Created attachment 52165 [details] Xorg log after the freeze
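For a strace log too large to attach, the interesting part is the point where the same ioctl starts being restarted over and over, as in the excerpt above. The following is a minimal sketch, not a tool from the nouveau project: it streams a log and returns the lines leading up to the first such run. It assumes the plain `strace -p` line format shown in this report (no PID or timestamp prefixes); the function name, its parameters, and the thresholds are hypothetical.

```python
import re
from collections import deque

# Matches lines such as:
#   ioctl(8, 0x40086482, 0x7fff7b42b730) = ? ERESTARTSYS (To be restarted)
RESTARTED_IOCTL = re.compile(r"^ioctl\((\d+), (0x[0-9a-f]+), .*\?\s+ERESTARTSYS")


def find_restart_loop(lines, min_repeats=3, context=20):
    """Find the first ioctl that is restarted min_repeats times in a row.

    Returns (last `context` lines, fd, request number), or None if no such
    run exists. Signal-delivery and rt_sigreturn lines do not break a run;
    any other syscall does.
    """
    recent = deque(maxlen=context)  # rolling window of the latest lines
    last_key = None
    repeats = 0
    for raw in lines:
        line = raw.rstrip("\n")
        recent.append(line)
        match = RESTARTED_IOCTL.match(line)
        if match:
            key = (int(match.group(1)), int(match.group(2), 16))
            repeats = repeats + 1 if key == last_key else 1
            last_key = key
            if repeats >= min_repeats:
                return list(recent), key[0], key[1]
        elif line.startswith("---") or line.startswith("rt_sigreturn("):
            continue  # part of the interrupt/restart cycle
        else:
            last_key, repeats = None, 0  # a different syscall ends the run
    return None
```

Because the log is read line by line, a file of several hundred megabytes can be passed in as an open file object, for example `find_restart_loop(open("xorg.strace"))`, where the file name is a placeholder.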
It's looping on the DRM_NOUVEAU_GEM_CPU_PREP ioctl, which is periodically interrupted by the SIGALRM signal. On the kernel side, it's hanging on a fence in __nouveau_fence_wait. This is a typical symptom of a GPU lockup.

To fix this, someone needs to figure out why the GPU locked up and/or how to reset the GPU.
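For readers wondering how the raw request number 0x40086482 in the strace output maps to DRM_NOUVEAU_GEM_CPU_PREP, here is a small sketch that splits a request into its fields using the generic Linux ioctl bit layout on x86. The command table is an assumption recalled from the nouveau uapi header of that period and should be checked against the header in use; `decode_ioctl` is a hypothetical helper name.

```python
# Generic ioctl request layout (x86): bits 0-7 command number, bits 8-15
# type byte, bits 16-29 argument size, bits 30-31 transfer direction.
DRM_IOCTL_TYPE = ord("d")  # DRM ioctls use 'd' (0x64) as the type byte
DRM_COMMAND_BASE = 0x40    # driver-private DRM commands start here

# Assumption: numbers recalled from the nouveau uapi header of that era.
NOUVEAU_COMMANDS = {
    0x40: "DRM_NOUVEAU_GEM_NEW",
    0x41: "DRM_NOUVEAU_GEM_PUSHBUF",
    0x42: "DRM_NOUVEAU_GEM_CPU_PREP",
    0x43: "DRM_NOUVEAU_GEM_CPU_FINI",
    0x44: "DRM_NOUVEAU_GEM_INFO",
}

# Direction is from userspace's point of view: "write" means the kernel
# only reads the argument structure.
DIRECTIONS = {0: "none", 1: "write", 2: "read", 3: "read/write"}


def decode_ioctl(request):
    """Split an ioctl request number into name, direction, size and nr."""
    nr = request & 0xFF
    ioc_type = (request >> 8) & 0xFF
    size = (request >> 16) & 0x3FFF
    direction = DIRECTIONS[(request >> 30) & 0x3]
    name = None
    if ioc_type == DRM_IOCTL_TYPE and nr >= DRM_COMMAND_BASE:
        name = NOUVEAU_COMMANDS.get(nr - DRM_COMMAND_BASE)
    return {"name": name, "direction": direction, "size": size, "nr": nr}


if __name__ == "__main__":
    # The three request numbers that appear in the strace excerpt.
    for request in (0x40086482, 0x40046483, 0xC0406481):
        print(hex(request), decode_ioctl(request))
```

Under these assumptions, the request that keeps being restarted decodes to a write-direction DRM command with an 8-byte argument, and the ioctl issued just before it (0xc0406481) decodes to the pushbuf submission.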
(In reply to comment #7)
cut...
> To fix this, someone needs to figure out why GPU locked up and/or how to reset
> the GPU.

If needed, I have the strace dump, which is ~19 MB of compressed text. I can attach it here or make it available through one of my external servers.
It's not enough. Sorry.

You need to find a quick and reliable way to reproduce it, like running some rendercheck or piglit test.
I think I see the same bug.

uname -a
Linux tenchi-htpc 3.1.0-1-generic #1-Ubuntu SMP Tue Oct 18 21:46:02 UTC 2011 i686 athlon i386 GNU/Linux

System:
- Ubuntu oneiric with current xorg-edgers
- Dual-monitor setup (1680x1050, 1280x1024)
- Video card is an onboard 8200
- Several video player freezes (Totem), which happen sooner if I skip forwards/backwards
  -> updated to xorg-edgers, but the problem remained
- Filed a Launchpad bug: https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-nouveau/+bug/877626

(In reply to comment #9)
> It's not enough. Sorry.
>
> You need to find quick and reliable way to reproduce it, like running some
> rendercheck or piglit test.

I installed the libs and got piglit from the git repo:

env PIGLIT_BUILD_DIR=`pwd` ./piglit-run.py tests/sanity.tests results/sanity.results
[Sat Oct 29 14:26:02 2011] :: running :: glean/basic
[Sat Oct 29 14:26:15 2011] :: pass :: glean/basic
[Sat Oct 29 14:26:15 2011] :: running :: glean/readPixSanity
[Sat Oct 29 14:29:07 2011] :: fail :: glean/readPixSanity
Thank you for running Piglit!
Results have been written to results/sanity.results/main

Dmesg output:
[ 1079.152099] [drm] nouveau 0000:02:00.0: PGRAPH - TRAP_TPDMA - TP0: Unhandled ustatus 0x00020000
[ 1079.152105] [drm] nouveau 0000:02:00.0: PGRAPH - TRAP
[ 1079.152112] [drm] nouveau 0000:02:00.0: PGRAPH - ch 4 (0x0004d6c000) subc 5 class 0x8397 mthd 0x19d0 data 0x00000001
[ 1079.466136] [drm] nouveau 0000:02:00.0: PGRAPH - TRAP_TPDMA - TP0: Unhandled ustatus 0x00020000
[ 1079.466146] [drm] nouveau 0000:02:00.0: PGRAPH - TRAP
[ 1079.466158] [drm] nouveau 0000:02:00.0: PGRAPH - ch 4 (0x0004d6c000) subc 5 class 0x8397 mthd 0x19d0 data 0x00000001
[ 1079.764620] [drm] nouveau 0000:02:00.0: PGRAPH - TRAP_TPDMA - TP0: Unhandled ustatus 0x00020000
[ 1079.764631] [drm] nouveau 0000:02:00.0: PGRAPH - TRAP
[ 1079.764643] [drm] nouveau 0000:02:00.0: PGRAPH - ch 4 (0x0004d6c000) subc 5 class 0x8397 mthd 0x19d0 data 0x00000001
[ 1080.064785] [drm] nouveau 0000:02:00.0: PGRAPH - TRAP_TPDMA - TP0: Unhandled ustatus 0x00020000
and more identical lines.

Attaching sanity.results/main.

Are any new logs needed?
Created attachment 52884 [details] Log: Piglit sanity.tests failing
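When dmesg repeats the same PGRAPH trap many times, a per-trap count is easier to read in a report than the raw lines. The sketch below is only an illustration: it assumes the exact message format pasted above, which may differ between kernel versions, and `summarize_traps` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches the detail line of a trap, for example:
#   PGRAPH - ch 4 (0x0004d6c000) subc 5 class 0x8397 mthd 0x19d0 data 0x00000001
TRAP_DETAIL = re.compile(
    r"PGRAPH - ch (\d+) \(0x[0-9a-f]+\) subc (\d+) "
    r"class (0x[0-9a-f]+) mthd (0x[0-9a-f]+) data (0x[0-9a-f]+)"
)


def summarize_traps(lines):
    """Count trap detail lines per (channel, subc, class, mthd, data)."""
    counts = Counter()
    for line in lines:
        match = TRAP_DETAIL.search(line)
        if match:
            counts[match.groups()] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Example: dmesg | python3 this_script.py
    for key, count in summarize_traps(sys.stdin).most_common():
        channel, subc, cls, mthd, data = key
        print(f"{count}x ch {channel} subc {subc} class {cls} mthd {mthd} data {data}")
```

A single dominant entry, as in the output above, suggests one failing method call being retried rather than several unrelated faults.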
It appears that this bug report has lain dormant for quite a while. Sorry we haven't gotten to it. Since we fix bugs all the time, chances are pretty good that your issue has been fixed with the latest software. Please give it a shot. (Linux kernel 3.10.7, xf86-video-nouveau 1.0.9, mesa 9.1.6, or their git versions.) If upgrading to the latest isn't an option for you, your distro's bugzilla is probably the right destination for your bug report.

In an effort to clean up our bug list, we're pre-emptively closing all bugs that haven't seen updates since 2011. If the original issue remains, please make sure to provide fresh info, see http://nouveau.freedesktop.org/wiki/Bugs/ for what we need to see, and re-open this one.

Thanks,
The Nouveau Team