Summary: | Weston-desktop-shell hangs. | ||
---|---|---|---|
Product: | Wayland | Reporter: | nerdopolis1 |
Component: | weston | Assignee: | Wayland bug list <wayland-bugs> |
Status: | RESOLVED NOTOURBUG | QA Contact: | |
Severity: | normal | ||
Priority: | medium | CC: | pochu27 |
Version: | unspecified | ||
Hardware: | Other | ||
OS: | All | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: |
backtrace of a hung weston-desktop-shell
bt full of a hung weston-desktop-shell |
Description
nerdopolis1
2014-02-09 16:33:08 UTC
How is this reproduced? What version (better, commit ids)?

This is weston commit dfaf65ba1636e49b850adff34f31de00b5f06bba

(In reply to comment #2)
> This is weston commit dfaf65ba1636e49b850adff34f31de00b5f06bba

Hmmm... seems to work fine for me on the following stack:

wayland (master) heads/master-0-ga18e344
drm (master) heads/master-0-g128e74c
mesa (master) heads/master-0-g5125165
libva (master) heads/master-0-gb4a4f9b
intel-driver (master) heads/master-0-g54cb60f
weston (master) heads/master-0-gdfaf65b

and tested on Intel Ivybridge, Fedora 20, X11 backend and desktop-shell using cairo-glesv2 (v1.12.14). I'll try rebuilding Mesa...

Can you run weston-desktop-shell under valgrind so we see where the error comes from?

I had it set to follow all pids when I called Weston. I am seeing the output below, but under valgrind it doesn't hang, so I'm not sure if this is the same issue:

==345== Invalid read of size 4
==345==    at 0x4221D48: _cairo_gl_surface_resolve_multisampling (cairo-gl-surface.c:1311)
==345==    by 0x8: ???
==345==  Address 0x4c46610 is 8 bytes before a block of size 80 alloc'd
==345==    at 0x402B965: calloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==345==    by 0x50BAF5A: ralloc_size (ralloc.c:113)
==345==    by 0x50BAFD4: rzalloc_size (ralloc.c:134)
==345==    by 0x4F1E8C0: _mesa_hash_table_rehash (hash_table.c:223)
==345==    by 0x4F1EA89: _mesa_hash_table_insert (hash_table.c:261)
==345==    by 0x4F1E139: _mesa_HashInsert (hash.c:226)
==345==    by 0x4F72048: _mesa_GenTextures (texobj.c:1029)
==345==    by 0x4221503: _create_scratch_internal (cairo-gl-surface.c:454)
==345==    by 0x422161F: _cairo_gl_surface_create_and_clear_scratch (cairo-gl-surface.c:509)
==345==    by 0x42218BA: cairo_gl_surface_create (cairo-gl-surface.c:612)
==345==    by 0x4218C7C: _cairo_gl_composite_glyphs_with_clip (cairo-gl-glyphs.c:365)
==345==    by 0x4218F1E: _cairo_gl_composite_glyphs (cairo-gl-glyphs.c:482)
==345==
==345== Conditional jump or move depends on uninitialised value(s)
==345==    at 0x421522E: _cairo_gl_context_setup_operand (cairo-gl-composite.c:225)
==345==    by 0x4215756: _cairo_gl_set_operands_and_operator (cairo-gl-composite.c:724)
==345==    by 0x42159CF: _cairo_gl_composite_begin (cairo-gl-composite.c:760)
==345==    by 0x421EA7E: composite_boxes (cairo-gl-spans-compositor.c:409)
==345==    by 0x41C6CE5: clip_and_composite_boxes.part.10 (cairo-spans-compositor.c:683)
==345==    by 0x41C71F5: clip_and_composite_boxes (cairo-spans-compositor.c:901)
==345==    by 0x1: ???
==345==

I forgot to mention, I am compiling Weston with --with-cairo=gl

It's been a while since I last looked at my weston build script.

(In reply to comment #7)
> I forgot to mention, I am compiling Weston with --with-cairo=gl

Ookay, that's pretty rare I think.

Do wayland clients use egl_dri2 or egl_gallium in your system?

Fbdev backend does not initialize server-side EGL, so it does not advertise wl_drm. DRM and x11 backends do.
This means that on client side, weston-desktop-shell on fbdev-compositor will either complain and fall back to wl_shm, or it will use egl_gallium with a software renderer, depending on your Mesa build. On the other compositors, weston-desktop-shell will attempt to use wl_drm if advertised, which means you hit either egl_dri2 or egl_gallium, again depending on your Mesa build.

What is your gfx card flavour, intel, nouveau or radeon?

I believe the untested and mostly unmaintained combination is egl_gallium with nouveau and radeon drivers (with wl_drm), so problems are not unexpected. Do you hit this case?

Unfortunately egl_gallium is atm the only way to use software rendered GL, and Mesa prefers egl_gallium over egl_dri2. You could override that with the EGL_DRIVER env var.

Hi. This is an Intel card. I build my mesa with

./autogen.sh --prefix=$INSTALLDIR --enable-driglx-direct --enable-dri \
    --with-dri-drivers=r200,radeon,nouveau,i915,i965,swrast --enable-osmesa \
    --enable-xa --enable-glx-tls --enable-shared-dricore --enable-gles2 \
    --with-gallium-drivers=nouveau,svga,r300,r600,swrast,radeonsi,ilo \
    --with-egl-platforms=x11,wayland,drm --enable-gbm --enable-shared-glapi \
    --enable-gallium-egl --with-llvm-prefix=/usr/lib/llvm-3.4/ --disable-dri3 \
    --with-llvm-shared-libs \
    --libdir=$INSTALLDIR/lib/$(dpkg-architecture -qDEB_HOST_MULTIARCH)

The reason I enable so many options is that this is a Live CD distribution.

I'll try EGL_DRIVER=egl_gallium, but I'll also see how this works on vbox...

Created attachment 93890 [details]
backtrace of a hung weston-desktop-shell
Weston built with --with-cairo=gl
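
For anyone wanting to try the EGL_DRIVER override suggested earlier in the thread, here is a minimal sketch. The helper name run_with_egl_driver is hypothetical (it is not part of Weston or Mesa); the values egl_dri2 and egl_gallium are the driver names mentioned in the comments above, and whether they apply depends on how Mesa was built.

```shell
# Sketch: run a command with Mesa's EGL_DRIVER environment variable set
# for that command only. run_with_egl_driver is a hypothetical helper.
run_with_egl_driver() {
    driver=$1
    shift
    EGL_DRIVER="$driver" "$@"
}

# Demonstration with a harmless command that just echoes the variable;
# in practice the command would be something like weston-desktop-shell.
run_with_egl_driver egl_dri2 sh -c 'echo "EGL_DRIVER=$EGL_DRIVER"'
```

Setting the variable per command, rather than exporting it, keeps the rest of the session on Mesa's default driver choice.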
I am now getting this without --with-cairo=gl, having set it to --with-cairo=image instead. I am still getting the hang, unless I call it with --use-pixman, if these details help.

(In reply to comment #10)
> Created attachment 93890 [details]
> backtrace of a hung weston-desktop-shell
>
> Weston built with --with-cairo=gl

That... does not look like any specific problem. It looks like memory corruption: it detects an error inside malloc() and then hangs trying to report it, because reporting involves some init-once dlopen of a library, which deadlocks on a mutex. Something that I would describe as "wtf".

Unfortunately this doesn't tell much. But since you say it does not happen when run under Valgrind, it might involve a race which leads to e.g. use of freed memory or something else corrupting memory. Gaah...

Created attachment 94003 [details]
bt full of a hung weston-desktop-shell
Does a bt full trace help?
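
For reference, a full backtrace of a hung process like the one attached can typically be captured with gdb in batch mode. This is only a sketch: dump_bt_full is a hypothetical helper name, and it assumes gdb is installed and that you are allowed to attach to the process.

```shell
# Sketch: print full backtraces (with locals) for every thread of a
# running, possibly hung, process. dump_bt_full is a hypothetical helper.
dump_bt_full() {
    pid=$1
    gdb -p "$pid" -batch -ex "thread apply all bt full"
}

# Usage (not run here):
#   dump_bt_full "$(pidof weston-desktop-shell)" > bt-full.txt
```

Dumping all threads matters for a hang like this one, since a mutex deadlock may only be visible by comparing what each thread is waiting on.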
Would it be worth trying to see if this happens when using Mesa 10.0? We are seeing another regression caused by Mesa master in bug 74689, which is why I ask.

Hi.

Sorry about the delay.
I tested with Mesa 10.0, and it seems like I am NOT getting the hang.

(In reply to comment #15)
> Hi.
>
> Sorry about the delay.
> I tested with Mesa 10.0, and it seems like I am NOT getting the hang.

ok, perhaps this issue you're seeing is caused by the Mesa commit mentioned in bug 74689#c5, too.

(In reply to comment #16)
> (In reply to comment #15)
> > Hi.
> >
> > Sorry about the delay.
> > I tested with Mesa 10.0, and it seems like I am NOT getting the hang.
>
> ok, perhaps this issue you're seeing is caused by the Mesa commit mentioned
> in bug 74689#c5, too.

That is, this one:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=11baad35088dfd4bdabc1710df650

I tried building mesa master, running

git revert 11baad35088dfd4bdabc1710df650 -n

It still seems to hang.

As it turns out, I had ilo enabled in my mesa. This is what was causing the hang.
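
As a follow-up sketch of that resolution: the snippet below takes the Gallium driver list from the autogen.sh line quoted earlier and drops ilo from it. The variable names are made up for illustration, and the remaining configure options are assumed to stay as the reporter had them.

```shell
# Gallium driver list as given in the reporter's Mesa configure line.
drivers="nouveau,svga,r300,r600,swrast,radeonsi,ilo"

# Split on commas, drop the exact entry "ilo", and join with commas again.
drivers_no_ilo=$(printf '%s\n' "$drivers" | tr ',' '\n' | grep -vx 'ilo' | paste -sd, -)

# Print the option to pass to autogen.sh / configure.
echo "--with-gallium-drivers=$drivers_no_ilo"
# prints: --with-gallium-drivers=nouveau,svga,r300,r600,swrast,radeonsi
```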