Bug 100432 - Program terminates with SIGBUS if running in Docker
Status: RESOLVED NOTABUG
Alias: None
Product: cairo
Classification: Unclassified
Component: general
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: Chris Wilson
QA Contact: cairo-bugs mailing list
Reported: 2017-03-28 10:24 UTC by Luka Napotnik
Modified: 2018-01-18 07:41 UTC

Attachments
gdb backtrace (2.29 KB, text/plain), 2017-03-28 10:24 UTC, Luka Napotnik
gdb backtrace (4.71 KB, text/plain), 2017-03-28 11:17 UTC, Luka Napotnik
gdb backtrace (5.70 KB, text/plain), 2017-03-29 07:48 UTC, Luka Napotnik

Description Luka Napotnik 2017-03-28 10:24:12 UTC
Created attachment 130497
gdb backtrace

I'm running a gtk3 app inside an Ubuntu Docker container. The app loads a page via WebKit and creates a snapshot of it.

The problem is that on some web pages the app terminates with a SIGBUS signal. I've attached the app's backtrace.

I've ruled out gtk3 and webkit as the cause, since the offending memory allocation happens inside cairo.

I've done some browsing and found that pointer arithmetic in _fill_xrgb32_lerp_opaque_spans (src/cairo-image-compositor.c) causes unaligned memory access.

Particularly, assignments like:

*d = lerp8x4 (r->u.fill.pixel, a, *d);

and:

uint32_t *d = (uint32_t *)(r->u.fill.data + r->u.fill.stride*yy + spans[0].x*4);

in the mentioned function seem to be problematic, as the resulting pointers end up non-aligned.
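
One way to test this hypothesis is to recompute the address the same way and check it against the 4-byte alignment of uint32_t. This is only a minimal sketch: pixel_ptr_is_aligned is a hypothetical helper, not a cairo function, and its parameters merely mirror the expression above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of cairo): returns 1 if the address of
 * pixel (x, y) in a 32-bit-per-pixel buffer is 4-byte aligned, else 0. */
static int
pixel_ptr_is_aligned (const uint8_t *data, ptrdiff_t stride, int y, int x)
{
    assert (data != NULL);

    /* Same arithmetic as in the span filler: base + row offset + x * 4. */
    uintptr_t addr = (uintptr_t) (data + stride * y + (ptrdiff_t) x * 4);

    return addr % sizeof (uint32_t) == 0;
}
```

If the check passes for every span, misalignment is unlikely to be the cause and the fault probably lies elsewhere.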

The interesting thing is that the exact same app doesn't fail on the host system, even after running for a long time.

If you're interested, I can also provide a working test case in the form of a Docker image.

Software used:
- x86-64 Ubuntu 16.04.1 on host
- phusion/baseimage as container baseimage 
- Docker 1.12.3
- updated to Cairo 1.14.8 in container
Comment 1 Chris Wilson 2017-03-28 10:33:08 UTC
Stride and data should be aligned. At a rough guess it seems that the data is not aligned. Was that cairo_image_surface_create_for_data perchance?
Comment 2 Luka Napotnik 2017-03-28 10:39:17 UTC
If you mean how do I get the surface ... I get it from gtk_offscreen_window_get_surface(window).
Comment 3 Luka Napotnik 2017-03-28 10:54:50 UTC
But I don't know which cairo operations webkitgtk calls. I'm using webkit2gtk-4.0.
Comment 4 Chris Wilson 2017-03-28 10:55:30 UTC
Why would that be giving you an image surface? Mysteries.

Anyway, recompile the lot (cairo/pixman definitely) with -O0 -g3 and get decent locals so we can see the source of the misalignment.
Comment 5 Luka Napotnik 2017-03-28 11:17:09 UTC
Created attachment 130498
gdb backtrace

Here's a more complete GDB backtrace.
Comment 6 Luka Napotnik 2017-03-29 07:48:04 UTC
Created attachment 130527
gdb backtrace

Here's another backtrace, showing misalignment when using SSE2... Is it possible that this code doesn't work so well under Docker?
Comment 7 Sirshak Das 2018-01-18 07:34:36 UTC
Is there any known solution for this? I am also seeing this error on my Ubuntu Docker instance, and I don't see it on a regular Ubuntu Linux installation with the same compiler and build tools.
Comment 8 Luka Napotnik 2018-01-18 07:41:07 UTC
The problem is the shared memory device (/dev/shm) that is mounted inside the Docker container. By default, Docker mounts it as a 64 MB (correct me if I'm wrong) filesystem, which might be too small for some applications.

The solution is to increase the size by specifying --shm-size when running the container, or to share the host's /dev/shm by adding it as a volume. The crashes vanished after resizing /dev/shm in the container.

