Summary: | X freeze and PGRAPH errors in dmesg
---|---
Product: | xorg
Component: | Driver/nouveau
Status: | RESOLVED INVALID
Severity: | major
Priority: | medium
Version: | unspecified
Hardware: | x86-64 (AMD64)
OS: | Linux (All)
Whiteboard: | 2011BRB_Reviewed
Reporter: | Marco Albarelli <motosauro>
Assignee: | Nouveau Project <nouveau>
QA Contact: | Xorg Project Team <xorg-team>
CC: | jeremyhu
Description
Marco Albarelli
2011-10-08 03:15:22 UTC
Created attachment 52150 [details]
Complete dmesg after the freeze
Created attachment 52151 [details]
Xorg log after the freeze
How reproducible is this?

It's not clear based just on the backtrace if we're spinning in kernel land in the ioctl or if that's just where we happened to be at the time. Could you use strace to see what syscalls are being generated? That will give an idea if it's spinning in the kernel or somewhere in userland:

sudo strace -p <pid of server process>

(In reply to comment #3)
> How reproducible is this?
>
> It's not clear based just on the backtrace if we're spinning in kernel land in
> the ioctl or if that's just where we happened to be at the time. Could you use
> strace to see what syscalls are being generated. That will give an idea if
> it's spinning in the kernel or somewhere in userland:
>
> sudo strace -p <pid of server process>

Thanks for the suggestion, but strace dumps a huge amount of data and the bug is very random. Once it happened while watching a movie in VLC, another time while using the video chat function of Google Plus in Firefox 7. Both cases happened after a couple of days of uptime (and suspension: I never turn the PC off).

(In reply to comment #3)
> How reproducible is this?
>
> It's not clear based just on the backtrace if we're spinning in kernel land in
> the ioctl or if that's just where we happened to be at the time. Could you use
> strace to see what syscalls are being generated. That will give an idea if
> it's spinning in the kernel or somewhere in userland:
>
> sudo strace -p <pid of server process>

I've been able to reproduce the freeze. I have a full strace for it, but it's ~500 MB uncompressed, so I guess it doesn't make sense to post it. Is there a specific part you need?
This is the point where the freeze happened:

setitimer(ITIMER_REAL, {it_interval={0, 20000}, it_value={0, 20000}}, NULL) = 0
read(43, "\211\10\10\0\3\0\200\3\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 4096) = 32
ioctl(8, 0x40086482, 0x7fff7b42b790) = 0
ioctl(8, 0x40046483, 0x7fff7b42b7c0) = 0
ioctl(8, 0xc0406481, 0x7fff7b42b2f0) = 0
ioctl(8, 0xc0406481, 0x7fff7b42b920) = 0
writev(26, [{"f\0d\0\371\231p\1\10\232p\1\266(.\0\355\1Z\1\330\0\207\0\0\0@\1\262\4\0\4", 32}], 1) = 32
writev(43, [{"J%\220\3\2\0\0\0\3\0\200\3\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 96}], 1) = 96
read(43, 0x1887520, 4096) = -1 EAGAIN (Resource temporarily unavailable)
setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={0, 0}}, NULL) = 0
select(256, [1 3 6 8 9 10 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 33 34 35 36 37 38 39 40 41 42 43 44 45 47 48], NULL, NULL, {0, 0}) = 0 (Timeout)
setitimer(ITIMER_REAL, {it_interval={0, 20000}, it_value={0, 20000}}, NULL) = 0
ioctl(8, 0xc0406481, 0x7fff7b42b6a0) = 0
ioctl(8, 0x40086482, 0x7fff7b42b730) = ? ERESTARTSYS (To be restarted)
--- SIGALRM (Alarm clock) @ 0 (0) ---
rt_sigreturn(0xe) = -1 EINTR (Interrupted system call)
ioctl(8, 0x40086482, 0x7fff7b42b730) = ? ERESTARTSYS (To be restarted)
--- SIGALRM (Alarm clock) @ 0 (0) ---
rt_sigreturn(0xe) = -1 EINTR (Interrupted system call)
ioctl(8, 0x40086482, 0x7fff7b42b730) = ? ERESTARTSYS (To be restarted)

Attaching dmesg and Xorg.log.

Created attachment 52164 [details]
Complete dmesg after the freeze
Created attachment 52165 [details]
Xorg log after the freeze
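Note: the raw request numbers in the strace excerpt above (0x40086482, 0x40046483, 0xc0406481) can be decoded by hand to see which ioctl the server is stuck in. The following is a minimal Python sketch, not a tool from this report. It assumes the generic Linux `_IOC` bit layout used on x86 (8-bit command number, 8-bit type, 14-bit size, 2-bit direction) and assumes the nouveau driver-private command offsets from the kernel's nouveau_drm.h (GEM_PUSHBUF 0x41, GEM_CPU_PREP 0x42, GEM_CPU_FINI 0x43, added to DRM_COMMAND_BASE 0x40). The function name and lookup table are illustrative.

```python
# Decode Linux ioctl request numbers as printed by strace.
# Assumption: generic _IOC layout (as on x86):
#   bits 0-7 command number, 8-15 type, 16-29 size, 30-31 direction.

DRM_IOCTL_TYPE = ord("d")   # 0x64, the ioctl "magic" used by DRM
DRM_COMMAND_BASE = 0x40     # driver-private DRM commands start here

# Assumption: nouveau driver-private command offsets (nouveau_drm.h).
NOUVEAU_COMMANDS = {
    0x41: "DRM_NOUVEAU_GEM_PUSHBUF",
    0x42: "DRM_NOUVEAU_GEM_CPU_PREP",
    0x43: "DRM_NOUVEAU_GEM_CPU_FINI",
}

DIRECTIONS = {0: "none", 1: "write", 2: "read", 3: "read/write"}


def decode_ioctl(request):
    """Split an ioctl request number into its _IOC fields."""
    nr = request & 0xFF
    ioc_type = (request >> 8) & 0xFF
    size = (request >> 16) & 0x3FFF
    direction = DIRECTIONS[(request >> 30) & 0x3]
    name = None
    # Only try to name the command if it is a driver-private DRM ioctl.
    if ioc_type == DRM_IOCTL_TYPE and nr >= DRM_COMMAND_BASE:
        name = NOUVEAU_COMMANDS.get(nr - DRM_COMMAND_BASE)
    return {"dir": direction, "size": size, "type": ioc_type,
            "nr": nr, "name": name}


if __name__ == "__main__":
    # Request numbers taken from the strace excerpt in this report.
    for req in (0x40086482, 0x40046483, 0xC0406481):
        print(hex(req), decode_ioctl(req))
```

Under these assumptions, 0x40086482 (the call that keeps returning ERESTARTSYS) decodes to command 0x82 with an 8-byte argument, i.e. DRM_COMMAND_BASE + 0x42.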
It's looping on the DRM_NOUVEAU_GEM_CPU_PREP ioctl, which is periodically interrupted by the SIGALRM signal. On the kernel side it's hanging on a fence in __nouveau_fence_wait. This is a typical symptom of a GPU lockup.

To fix this, someone needs to figure out why the GPU locked up and/or how to reset the GPU.

(In reply to comment #7)
cut...
> To fix this, someone needs to figure out why GPU locked up and/or how to reset
> the GPU.

If needed, I have the strace dump, which is ~19 MB of compressed text. I can attach it here or make it available through one of my external servers.

It's not enough. Sorry.

You need to find a quick and reliable way to reproduce it, like running some rendercheck or piglit test.

I think I see the same bug.

uname -a
Linux tenchi-htpc 3.1.0-1-generic #1-Ubuntu SMP Tue Oct 18 21:46:02 UTC 2011 i686 athlon i386 GNU/Linux

System:
- Ubuntu oneiric with current xorg-edgers
- Dual monitor setup (1680x1050, 1280x1024)
- Video card is onboard 8200
- Several video player freezes (totem), faster if I skip forwards/backwards -> updated to xorg-edgers but the problem remained
- Filed a Launchpad bug: https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-nouveau/+bug/877626

(In reply to comment #9)
> It's not enough. Sorry.
>
> You need to find quick and reliable way to reproduce it, like running some
> rendercheck or piglit test.

I installed the libs and got piglit from that git repo.

env PIGLIT_BUILD_DIR=`pwd` ./piglit-run.py tests/sanity.tests results/sanity.results
[Sat Oct 29 14:26:02 2011] :: running :: glean/basic
[Sat Oct 29 14:26:15 2011] :: pass :: glean/basic
[Sat Oct 29 14:26:15 2011] :: running :: glean/readPixSanity
[Sat Oct 29 14:29:07 2011] :: fail :: glean/readPixSanity
Thank you for running Piglit!
Results have been written to results/sanity.results/main

Dmesg output:
[ 1079.152099] [drm] nouveau 0000:02:00.0: PGRAPH - TRAP_TPDMA - TP0: Unhandled ustatus 0x00020000
[ 1079.152105] [drm] nouveau 0000:02:00.0: PGRAPH - TRAP
[ 1079.152112] [drm] nouveau 0000:02:00.0: PGRAPH - ch 4 (0x0004d6c000) subc 5 class 0x8397 mthd 0x19d0 data 0x00000001
[ 1079.466136] [drm] nouveau 0000:02:00.0: PGRAPH - TRAP_TPDMA - TP0: Unhandled ustatus 0x00020000
[ 1079.466146] [drm] nouveau 0000:02:00.0: PGRAPH - TRAP
[ 1079.466158] [drm] nouveau 0000:02:00.0: PGRAPH - ch 4 (0x0004d6c000) subc 5 class 0x8397 mthd 0x19d0 data 0x00000001
[ 1079.764620] [drm] nouveau 0000:02:00.0: PGRAPH - TRAP_TPDMA - TP0: Unhandled ustatus 0x00020000
[ 1079.764631] [drm] nouveau 0000:02:00.0: PGRAPH - TRAP
[ 1079.764643] [drm] nouveau 0000:02:00.0: PGRAPH - ch 4 (0x0004d6c000) subc 5 class 0x8397 mthd 0x19d0 data 0x00000001
[ 1080.064785] [drm] nouveau 0000:02:00.0: PGRAPH - TRAP_TPDMA - TP0: Unhandled ustatus 0x00020000
and more identical lines.

Attaching sanity.results/main.
Are any new logs needed?

Created attachment 52884 [details]
Log: Piglit sanity.tests failing
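Note: when the same PGRAPH trap repeats many times, as in the dmesg excerpt above, a short summary is easier to compare between runs than the raw log. The following is a rough Python sketch, assuming the "PGRAPH - ch ... subc ... class ... mthd ... data ..." line format exactly as it appears in this report. The regular expression and function name are illustrative and not part of nouveau or any existing tool.

```python
import re
import sys
from collections import Counter

# Matches the faulting-method lines from the excerpt in this report, e.g.:
#   [ 1079.152112] [drm] nouveau 0000:02:00.0: PGRAPH - ch 4 (0x0004d6c000)
#   subc 5 class 0x8397 mthd 0x19d0 data 0x00000001
# Assumption: other kernel versions may format these lines differently.
METHOD_RE = re.compile(
    r"PGRAPH - ch (?P<ch>\d+) \(0x[0-9a-f]+\) subc (?P<subc>\d+) "
    r"class (?P<cls>0x[0-9a-f]+) mthd (?P<mthd>0x[0-9a-f]+) "
    r"data (?P<data>0x[0-9a-f]+)"
)


def summarize_pgraph(lines):
    """Count identical (channel, subchannel, class, method, data) tuples."""
    counts = Counter()
    for line in lines:
        m = METHOD_RE.search(line)
        if m:
            key = (int(m["ch"]), int(m["subc"]),
                   m["cls"], m["mthd"], m["data"])
            counts[key] += 1
    return counts


if __name__ == "__main__":
    # Reads a dmesg dump from standard input and prints one line per
    # distinct faulting method, most frequent first.
    for key, n in summarize_pgraph(sys.stdin).most_common():
        print(n, key)
```

Saved under a hypothetical name such as pgraph_summary.py, it could be run as `dmesg | python3 pgraph_summary.py`.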
It appears that this bug report has lain dormant for quite a while. Sorry we haven't gotten to it. Since we fix bugs all the time, chances are pretty good that your issue has been fixed with the latest software. Please give it a shot. (Linux kernel 3.10.7, xf86-video-nouveau 1.0.9, mesa 9.1.6, or their git versions.) If upgrading to the latest isn't an option for you, your distro's bugzilla is probably the right destination for your bug report.

In an effort to clean up our bug list, we're pre-emptively closing all bugs that haven't seen updates since 2011. If the original issue remains, please make sure to provide fresh info, see http://nouveau.freedesktop.org/wiki/Bugs/ for what we need to see, and re-open this one.

Thanks,
The Nouveau Team