Summary: | VLC fullscreen, progress bar flickers, and then halts weston, xwayland | ||
---|---|---|---|
Product: | Wayland | Reporter: | soloturn |
Component: | weston | Assignee: | Wayland bug list <wayland-bugs> |
Status: | RESOLVED WONTFIX | QA Contact: | |
Severity: | critical | ||
Priority: | medium | ||
Version: | unspecified | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: |
Description
soloturn
2017-01-18 07:34:59 UTC
on gnome using xwayland, vlc in fullscreen mode crashes the display, blocks it, or makes it close to unresponsive. see https://bugzilla.gnome.org/show_bug.cgi?id=777428.

Let's look at the hang with Weston first. When you say hard reset is necessary, did you also try remote access via ssh? Does an already open ssh connection also freeze? If the ssh connection freezes, then it is surely a kernel driver bug.

If the ssh connection does not freeze, could you gather some logs, please?
- start weston (or weston-launch if you need to use that) with: weston > weston-log.txt 2>&1
- start also vlc with '> vlc-log.txt 2>&1'
- dmesg output
- output of 'ps auxw -H'

Gather these after the problem has triggered.

You could also try adding the --use-pixman argument to Weston, which forces Weston, Xwayland and X11 apps to not use the GPU, and see what happens.

If the flickering remains after the freeze is fixed, it will be a different bug and possibly harder to track down. Does VLC draw the progress bar into the video image or is it another X11 window?

i started weston now from a gnome command line, not on a text tty:
weston > weston-log.txt 2>&1
which opens a weston window. then i open a terminal in this weston window and start vlc. then i make this vlc full screen, i.e. filling only the whole weston window. then i click around on this vlc progress bar until it freezes.

i pressed ctrl-c in the command line where i started weston and it did not react. i opened a second command shell to analyze, and vlc is not there any more, but weston still displays its contents. i can kill -9 8723, see below for the process tree.

weston.log ends with:
[07:18:15.697] Spawned Xwayland server, pid 8732
glamor: EGL version 1.4 (DRI2):
[07:18:15.874] xfixes version: 5.0
[07:18:15.890] created wm, root 624
(EE) Fatal server error:
(EE) failed to write to XWayland fd: Broken pipe

PID TTY COMMAND
1 ? /sbin/init \EFI\arch\vmlinuz-linux
355 ? login -- rt
6631 tty1 -bash
6657 tty1 dbus-run-session gnome-session
6665 tty1 dbus-daemon --nofork --print-address 4 --session
6666 tty1 /usr/lib/gnome-session/gnome-session-binary
6787 6 tty1 /usr/bin/gnome-shell
7127 tty1 /usr/bin/Xwayland :0 -rootless -noreset -listen 4 -listen 5 -displayfd 6
7267 tty1 /usr/lib/gnome-settings-daemon/gnome-settings-daemon
7346 tty1 clipit
7375 tty1 /usr/lib/gnome-terminal/gnome-terminal-server
7388 pts/0 bash
8723 pts/0 weston
8724 pts/0 [weston-keyboard] <defunct>
8725 pts/0 [weston-desktop-] <defunct>
8732 pts/0 /usr/bin/Xwayland :1 -rootless -listen 37 -listen 38 -wm 39 -terminate
8757 pts/1 bash
8789 pts/1 ps auxw -H

That's a very nice way to reproduce.

Is weston consuming 100% CPU when it is frozen? Could you attach gdb to weston and get a backtrace?

List manipulation bugs in Weston have a tendency to cause an endless loop, eating 100% CPU. This might be a problem to be fixed by:
https://patchwork.freedesktop.org/patch/125594/

The reproducing steps remind me of https://bugzilla.redhat.com/show_bug.cgi?id=1278159 where the tooltip in the vlc progress bar would cause a large amount of X11 traffic and eventually cause a deadlock between Xwayland and the compositor.

Commit https://cgit.freedesktop.org/xorg/xserver/commit/?id=b79eaf1 was meant to mitigate this and try to avoid the deadlock (although I am not sure it's entirely possible to avoid it) but the code in vlc causing this is probably still there...

However, weston was not affected as it doesn't do X11 roundtrips, so maybe it's a different issue...

interesting, thanks for the pointers. this time i started both applications from gnome terminals, using:
nohup weston &
cd tmp
export DISPLAY=:1
nohup vlc &

i have vlc running with minimal interface so i do not see vlc's progress bar unless in full screen. so i tried to show it via right click to test if this happens when not in full screen. vlc hung weston's display after displaying its menu. weston takes 100% of one core.
one cannot move the weston window any more, the mouse pointer is still shown on the menu. both applications still run. the vlc log shows:
$ tail -f nohup.out
[0000000001758148] core libvlc: Running vlc with the default interface. Use 'cvlc' to use vlc without interface.
Failed to open VDPAU backend libvdpau_nvidia.so: cannot open shared object file: No such file or directory
Failed to open VDPAU backend libvdpau_nvidia.so: cannot open shared object file: No such file or directory
Failed to open VDPAU backend libvdpau_nvidia.so: cannot open shared object file: No such file or directory

https://bbs.archlinux.org/viewtopic.php?id=205292 says the microcode could be a problem as well, i have not installed it, but will try later just to be sure.

i tried to debug weston - but i never really did something like this tbh. strace shows nothing. gdb shows:
$ sudo gdb -p 1170
...
Attaching to process 1170
Reading symbols from /usr/bin/weston...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libweston-1.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libwayland-server.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libunwind.so.8...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libdl.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libinput.so.10...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libc.so.6...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libpixman-1.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxkbcommon.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libm.so.6...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libffi.so.6...(no debugging symbols found)...done.
Reading symbols from /usr/lib/librt.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/liblzma.so.5...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libmtdev.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libudev.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libevdev.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libwacom.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libresolv.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libcap.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libgudev-1.0.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libgobject-2.0.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libglib-2.0.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libgio-2.0.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libpcre.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libgmodule-2.0.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libz.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libmount.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libblkid.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libuuid.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libweston-1/wayland-backend.so...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libwayland-egl.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libwayland-client.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libwayland-cursor.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libcairo.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libpng16.so.16...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libjpeg.so.8...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libfontconfig.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libfreetype.so.6...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libEGL.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxcb-shm.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxcb.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxcb-render.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libXrender.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libX11.so.6...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libXext.so.6...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libGL.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libexpat.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libbz2.so.1.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libharfbuzz.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libX11-xcb.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxcb-dri2.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxcb-xfixes.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxcb-dri3.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxcb-present.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxcb-sync.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxshmfence.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libgbm.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libdrm.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libXau.so.6...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libXdmcp.so.6...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libglapi.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libXdamage.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libXfixes.so.3...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxcb-glx.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libXxf86vm.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libgraphite2.so.3...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libweston-1/gl-renderer.so...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libGLESv2.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/xorg/modules/dri/i965_dri.so...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libgcrypt.so.20...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libdrm_intel.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libdrm_nouveau.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libdrm_radeon.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libstdc++.so.6...done.
Reading symbols from /usr/lib/libgcc_s.so.1...done.
Reading symbols from /usr/lib/libgpg-error.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libpciaccess.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libtxc_dxtn.so...(no debugging symbols found)...done.
Reading symbols from /usr/lib/weston/desktop-shell.so...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libweston-desktop-1.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libweston-1/xwayland.so...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxcb-composite.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libXcursor.so.1...(no debugging symbols found)...done.
0x00007f6b6d59c36f in wl_list_remove () from /usr/lib/libwayland-server.so.0
(gdb) frame
#0 0x00007f6b6d59c36f in wl_list_remove () from /usr/lib/libwayland-server.so.0
(gdb) info registers
rax 0x858dd8 8752600
rbx 0x858d90 8752528
rcx 0xf0bdc000 4038967296
rdx 0x858dd8 8752600
rsi 0x0 0
rdi 0x858dd8 8752600
rbp 0x858dd8 0x858dd8
rsp 0x7ffd658dfba8 0x7ffd658dfba8
r8 0x9d7ea0 10321568
r9 0xa12700 10561280
r10 0x7ffd658df958 140726307256664
r11 0x7ffd658df900 140726307256576
r12 0x0 0
r13 0xf0bdc000 4038967296
r14 0xf0bdc000 4038967296
r15 0x0 0
rip 0x7f6b6d59c36f 0x7f6b6d59c36f <wl_list_remove+15>
eflags 0x246 [ PF ZF IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
(gdb) next
Single stepping until exit from function wl_list_remove, which has no line number information.
0x00007f6b6d7bb3b0 in weston_pointer_set_focus () from /usr/lib/libweston-1.so.0
(gdb) next
Single stepping until exit from function weston_pointer_set_focus, which has no line number information.
0x00007f6b6d596e44 in ?? () from /usr/lib/libwayland-server.so.0
(gdb) next
Cannot find bounds of current function
(gdb) next
Cannot find bounds of current function

then vlc:
Attaching to process 1547
[New LWP 1548]
[New LWP 1549]
[New LWP 1551]
[New LWP 1553]
[New LWP 1557]
[New LWP 1562]
[New LWP 2379]
[New LWP 2381]
[New LWP 2382]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
0x00007f7b7ebcac37 in do_sigwait () from /usr/lib/libpthread.so.0
(gdb) frame
#0 0x00007f7b7ebcac37 in do_sigwait () from /usr/lib/libpthread.so.0

$ ps -ef | grep weston
1170 808 7 Jan19 pts/0 00:23:31 weston
1171 1170 0 Jan19 pts/0 00:00:00 /usr/lib/weston/weston-keyboard
1172 1170 0 Jan19 pts/0 00:00:00 /usr/lib/weston/weston-desktop-shell

then i killed vlc. weston still uses 100% of a core. i still cannot interact with its window.

$ ps -ef | grep weston
1170 808 12 Jan19 pts/0 00:46:20 weston
1171 1170 0 Jan19 pts/0 00:00:00 [weston-keyboard] <defunct>
1172 1170 0 Jan19 pts/0 00:00:00 [weston-desktop-] <defunct>
2428 808 0 05:55 pts/0 00:00:00 grep weston

vlc's nohup then ends with:
Failed to open VDPAU backend libvdpau_nvidia.so: cannot open shared object file: No such file or directory
[00007f7b60c38ee8] core input error: ES_OUT_SET_(GROUP_)PCR is called too late (pts_delay increased to 300 ms)
[00007f7b60c38ee8] core input error: ES_OUT_RESET_PCR called
[mpeg4 @ 0x7f7b48c75f40] warning: first frame is no keyframe
[00007f7b60c38ee8] core input error: ES_OUT_SET_(GROUP_)PCR is called too late (jitter of 12176 ms ignored)

i did it again because i forgot to get a backtrace. it hangs when pressing on the popped up menu.
for weston:
(gdb) backtrace
#0 0x00007f348b3f7e44 in () at /usr/lib/libwayland-server.so.0
#1 0x00007f348b3f87e4 in wl_resource_destroy () at /usr/lib/libwayland-server.so.0
#2 0x00007f348a0151c8 in ffi_call_unix64 () at /usr/lib/libffi.so.6
#3 0x00007f348a014c2a in ffi_call () at /usr/lib/libffi.so.6
#4 0x00007f348b3fcabe in () at /usr/lib/libwayland-server.so.0
#5 0x00007f348b3f8cb7 in () at /usr/lib/libwayland-server.so.0
#6 0x00007f348b3fad32 in wl_event_loop_dispatch () at /usr/lib/libwayland-server.so.0
#7 0x00007f348b3f91da in wl_display_run () at /usr/lib/libwayland-server.so.0
#8 0x0000000000405207 in ()
#9 0x00007f348aa24291 in __libc_start_main () at /usr/lib/libc.so.6
#10 0x00000000004059ca in _start ()

for gnome the behaviour was different, it would just crash on right click and choosing a menu entry.

to be complete, weston-keyboard:
(gdb) backtrace
#0 0x00007f7980b88db3 in __epoll_wait_nocancel () from /usr/lib/libc.so.6
#1 0x000000000040e034 in ?? ()
#2 0x0000000000404f42 in ?? ()
#3 0x00007f7980ac0291 in __libc_start_main () from /usr/lib/libc.so.6
#4 0x0000000000404fda in ?? ()

weston-desktop-shell:
(gdb) backtrace
#0 0x00007fbfa97f7db3 in __epoll_wait_nocancel () from /usr/lib/libc.so.6
#1 0x000000000040f034 in ?? ()
#2 0x00000000004053dd in ?? ()
#3 0x00007fbfa972f291 in __libc_start_main () from /usr/lib/libc.so.6
#4 0x00000000004054da in ?? ()

Ok, thanks. Yes, it all fits with my guess that Weston has a corrupted list and is going into an endless loop.

The patch I referred to is actually a Wayland patch that fixes at least one case of this kind. It still needs some improvements before getting merged, but would it be possible for you to try the patch? If not, don't worry.

It might also be another list manipulation bug in Weston, but I'm betting on the patch to fix it.

I cannot say anything about GNOME, though. Was it vlc that crashed there?

no, gnome crashed.
the stock weston build looks like this:
https://www.archlinux.org/packages/community/x86_64/weston/
https://git.archlinux.org/svntogit/community.git/tree/trunk/PKGBUILD?h=packages/weston

this would then mean building the git source:
https://cgit.freedesktop.org/wayland/weston
into the PKGBUILD, a little bit along the lines of the arch linux AUR vlc-git package, which uses the current state of the git repository and builds the current VLC 3.0.0:
https://aur.archlinux.org/packages/vlc-git/
https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=vlc-git
and then doing a makepkg, described here:
https://wiki.archlinux.org/index.php/Arch_User_Repository

there is also the xwayland coming out of x:
https://www.archlinux.org/packages/extra/x86_64/xorg-server/
but this is not the one, if i understand you right?

to be clear, there are two problems in gnome as well: one making it crash (vlc right mouse button menu i think), and one making it freeze (vlc progress bar i think).

For testing https://patchwork.freedesktop.org/patch/125594/ you would apply it to libwayland and build that. Rebuilding Weston is not necessary.

Let's concentrate on the Weston freeze in this bug report. I'm changing the component to Weston for now. If there are bugs in Xwayland, we'd only see them after Weston runs without freezing or crashing. Weston does not need Xwayland like mutter does, so Xwayland should never be able to cause Weston to freeze, crash or die.

I think it is possible for Xwayland to quit if the Wayland compositor stops processing requests. Weston freezing might therefore cause Xwayland to quit soon after, which in turn should cause X11 apps to quit.
ok, you mean this one:
https://www.archlinux.org/packages/extra/i686/wayland/

i have difficulties building it, am on 56f2dad6d2a:
make[3]: Entering directory '/home/rt/builds/wayland/src/wayland/doc/doxygen'
GEN xml
GEN xml/wayland-architecture.png
GEN xml/x-architecture.png
Warning: flat edge between adjacent nodes one of which has a record shape - replace records with HTML-like labels
Edge xserver -> comp
Error: getsplinepoints: no spline points available for edge (xserver,comp)
Error: lost xserver comp edge
Error: lost xserver comp edge
Error: lost comp xserver edge
Error: lost comp xserver edge
make[3]: *** [Makefile:626: xml/x-architecture.png] Error 1
make[3]: Leaving directory '/home/rt/builds/wayland/src/wayland/doc/doxygen'
make[2]: *** [Makefile:376: all-recursive] Error 1
make[2]: Leaving directory '/home/rt/builds/wayland/src/wayland/doc'
make[1]: *** [Makefile:1893: all-recursive] Error 1
make[1]: Leaving directory '/home/rt/builds/wayland/src/wayland'
make: *** [Makefile:1141: all] Error 2
==> ERROR: A failure occurred in build().

i do:
build() {
  cd "${srcdir}/${_name}"
  ./autogen.sh
  ./configure \
    --prefix=/usr \
    --disable-static
  make
}

i built now using:
https://aur.archlinux.org/packages/wayland-git/
and applied the patch:
git apply ../server-use-a-safer-signal-type-for-the-wl_resource-destruction-signals.patch

but i could not see a difference when starting weston in gnome. i restarted gnome before doing the test.

Ok, thanks. It seems one of the devs would need to reproduce and look into it.

let's close it as wontfix - this was with vlc-2, now vlc-3 is there. i have not seen any other application which could bring down or halt wayland or weston like this. vlc-3 is still dumping a core with wayland - but this looks like qt or some misunderstanding: wayland's documentation says "should" where it means "must", and qt implemented it as if it really were only a "should":
* https://codereview.qt-project.org/#/c/184278/
* https://trac.videolan.org/vlc/ticket/17910

If there is any kind of a program that can crash or hang a display server, even if that program was already obsolete and deprecated, it would still be worthwhile to at least understand why that happened. It could be a bug in the display server that needs fixing anyway. As such I'd propose to keep this bug open.

OTOH if there is no-one to look into it and it cannot be reproduced it might not be worth it.

I'm still seeing this bug with vlc 3.0.3 / GNOME Shell 3.28.2 on debian. Is there a particular reason why it has been marked wont fix? Is there anything I can do to help diagnose the problem?

(In reply to Jeremy Lakeman from comment #19)
> I'm still seeing this bug with vlc 3.0.3 / GNOME Shell 3.28.2 on debian.

If you are seeing this bug in GNOME, then it cannot be a Weston bug. It could be a libwayland bug, but we cannot know without diagnosing it on the compositor first. So if it is the compositor that freezes, it would be good to report the bug to the compositor project, that is, GNOME.

Open Wayland and Weston bug reports have been migrated to https://gitlab.freedesktop.org/ so if you want to file a Wayland or Weston bug, Gitlab is the place now.