Created attachment 93749 [details] Code to reproduce the crash When a sufficiently complex path, consisting of multiple subpaths, is used both for clip and for stroke, cairo sometimes crashes in a segmentation fault. The crash seems to happen in the function active_edges, where in the innermost do...while loop the "right" variable for some reason gets a NULL value: Program received signal SIGSEGV, Segmentation fault. active_edges (polygon=0x7fffffffd440, top=32768, left=0x6b9a80) at /build/buildd/cairo-1.12.16/src/cairo-polygon-intersect.c:1235 1235 /build/buildd/cairo-1.12.16/src/cairo-polygon-intersect.c: No such file or directory. (gdb) where #0 active_edges (polygon=0x7fffffffd440, top=32768, left=0x6b9a80) at /build/buildd/cairo-1.12.16/src/cairo-polygon-intersect.c:1235 #1 intersection_sweep (polygon=0x7fffffffd440, num_events=<optimized out>, start_events=<optimized out>) at /build/buildd/cairo-1.12.16/src/cairo-polygon-intersect.c:1271 #2 _cairo_polygon_intersect (a=a@entry=0x7fffffffd440, winding_a=winding_a@entry=0, b=b@entry=0x7fffffffcff0, winding_b=<optimized out>) at /build/buildd/cairo-1.12.16/src/cairo-polygon-intersect.c:1466 #3 0x00007ffff7b3812a in clip_and_composite_polygon (compositor=compositor@entry=0x7ffff7dd8000 <spans.11982>, extents=extents@entry=0x7fffffffd880, polygon=polygon@entry=0x7fffffffd440, fill_rule=CAIRO_FILL_RULE_WINDING, antialias=antialias@entry=CAIRO_ANTIALIAS_DEFAULT) at /build/buildd/cairo-1.12.16/src/cairo-spans-compositor.c:937 #4 0x00007ffff7b38c77 in _cairo_spans_compositor_stroke (_compositor=0x7ffff7dd8000 <spans.11982>, extents=0x7fffffffd880, path=<optimized out>, style=0x7fffffffdc70, ctm=0x604c70, ctm_inverse=0x604ca0, tolerance=0,10000000000000001, antialias=CAIRO_ANTIALIAS_DEFAULT) at /build/buildd/cairo-1.12.16/src/cairo-spans-compositor.c:1074 #5 0x00007ffff7af7974 in _cairo_compositor_stroke (compositor=0x7ffff7dd8000 <spans.11982>, surface=0x6049c0, op=CAIRO_OPERATOR_OVER, source=0x7fffffffdca0, path=0x604eb8, style=0x7fffffffdc70, ctm=0x604c70, ctm_inverse=ctm_inverse@entry=0x604ca0, tolerance=0,10000000000000001, antialias=antialias@entry=CAIRO_ANTIALIAS_DEFAULT, clip=clip@entry=0x6089f0) at /build/buildd/cairo-1.12.16/src/cairo-compositor.c:157 #6 0x00007ffff7b07953 in _cairo_image_surface_stroke (abstract_surface=<optimized out>, op=<optimized out>, source=<optimized out>, path=<optimized out>, style=<optimized out>, ctm=<optimized out>, ctm_inverse=0x604ca0, tolerance=<optimized out>, antialias=CAIRO_ANTIALIAS_DEFAULT, clip=0x6089f0) at /build/buildd/cairo-1.12.16/src/cairo-image-surface.c:961 #7 0x00007ffff7b3bd42 in _cairo_surface_stroke (surface=0x6049c0, op=CAIRO_OPERATOR_OVER, source=0x7fffffffdca0, path=0x604eb8, stroke_style=0x7fffffffdc70, ctm=0x604c70, ctm_inverse=0x604ca0, tolerance=0,10000000000000001, antialias=CAIRO_ANTIALIAS_DEFAULT, clip=0x6089f0) at /build/buildd/cairo-1.12.16/src/cairo-surface.c:2210 #8 0x00007ffff7aff05f in _cairo_gstate_stroke (gstate=0x604b80, path=path@entry=0x604eb8) at /build/buildd/cairo-1.12.16/src/cairo-gstate.c:1185 #9 0x00007ffff7af9079 in _cairo_default_context_stroke (abstract_cr=0x604b50) at /build/buildd/cairo-1.12.16/src/cairo-default-context.c:1013 #10 0x00007ffff7af2875 in INT_cairo_stroke (cr=0x604b50) at /build/buildd/cairo-1.12.16/src/cairo.c:2146 #11 0x0000000000400b8f in main () at bug.c:28 Can be reproduced reliably with the attached code.
Had to get this working (at least to some extent), so I did a bit of research on the problem. The root cause seems to be that in certain circumstances the edges in the sweep list end up in a wrong order. When after that new edges are inserted they may end up at _very_ wrong positions, and before long the whole list is all messed up. Unfortunately, I did not have the time to find a real fix, but I worked around the problem by adding some NULL checks to the active_edges function (see the attached workaround patch), which at least makes the SIGSEGV disappear. The result is probably not pixel perfect, as the edges are still in the wrong order, but that's not that big an issue for me. I am also attaching another test case, reproduced from real live data this time, which demonstrates hopefully a little less pathological case for examination.
Created attachment 95645 [details] [review] Patch to work around the problem
Created attachment 95646 [details] Live data test case for reproducing the bug
Created attachment 95665 [details] Reduced version of live test case for reproducing the bug (In reply to comment #3) > Created attachment 95646 [details] > Live data test case for reproducing the bug I let creduce[0] play with this for a while (The interesting measure was "crashes accessing address 0x30"). The result is the attached, reduced test case which is a lot smaller. [0]: http://embed.cs.utah.edu/creduce/
Created attachment 104163 [details] [review] quick hack This is a patch that fixes the crashes in the three attached test cases on my machine, it is meant to show what appears to me as the problem, the solution probably should be decided by someone familiar with the code. 1) not returning the pointer difference as a signed int fixes the 'creduced' test. 2) computing the IMHO correctly rounded quotient fixes the #93749 attachment and inserting the new edge sorted at current_y+1 fixes the live data test. This last fix could possibly introduce the same problem for the horizontally specular case, that is with an edge having an intersection at the same y as its starting end-point, but at a greater x, in that case the intersection event would be added, that is the edge 'contains' the intersection, and when the event is processed the edges would be correctly swapped preserving the order in the active list. HTH
Created attachment 104273 [details] [review] patch to instrument cairo-polygon-intersect.c Adding comments to describe the easy parts of this bug: instrumenting src/cairo-polygon-intersect.c as in the attached patch and executing the 'creduced' test case prints on the console lines like: > 0x00000001a3ef00 0x00000001a3fae0 intptr_t diff: fffffffffffff420 -3040 int diff: 55555458 1431655512 a < b -1 > 0x00000001a3f9f0 0x00000001a3f950 intptr_t diff: 00000000000000a0 160 int diff: aaaaaab8 -1431655752 a < b +1 > 0x00000001a3f590 0x00000001a3f9f0 intptr_t diff: fffffffffffffba0 -1120 int diff: 555554f8 1431655672 a < b -1 > 0x00000001a3f0e0 0x00000001a3f040 intptr_t diff: 00000000000000a0 160 int diff: aaaaaab8 -1431655752 a < b +1 > 0x00000001a3ed70 0x00000001a3ee10 intptr_t diff: ffffffffffffff60 -160 int diff: 55555548 1431655752 a < b -1 > 0x00000001a3f040 0x00000001a3f0e0 intptr_t diff: ffffffffffffff60 -160 int diff: 55555548 1431655752 a < b -1 > 0x00000001a3f590 0x00000001a3f9f0 intptr_t diff: fffffffffffffba0 -1120 int diff: 555554f8 1431655672 a < b -1 > 0x00000001a3f9f0 0x00000001a3f950 intptr_t diff: 00000000000000a0 160 int diff: aaaaaab8 -1431655752 a < b +1 > 0x00000001a3f040 0x00000001a3f0e0 intptr_t diff: ffffffffffffff60 -160 int diff: 55555548 1431655752 a < b -1 > 0x00000001a3ed70 0x00000001a3ee10 intptr_t diff: ffffffffffffff60 -160 int diff: 55555548 1431655752 a < b -1 > 0x00000001a3f950 0x00000001a3f9f0 intptr_t diff: ffffffffffffff60 -160 int diff: 55555548 1431655752 a < b -1 > 0x007fff5ea37418 0x007fff5ea37518 intptr_t diff: ffffffffffffff00 -256 int diff: 55555540 1431655744 a < b -1 > 0x007fff5ea37678 0x007fff5ea37458 intptr_t diff: 0000000000000220 544 int diff: aaaaaad8 -1431655720 a < b +1 > 0x007fff5ea37658 0x007fff5ea373d8 intptr_t diff: 0000000000000280 640 int diff: aaaaaae0 -1431655712 a < b +1 http://cgit.freedesktop.org/cairo/tree/src/cairo-polygon-intersect.c?id=7e856071a27b06a6ae35b6445635da9276975c69#n798 this shows that the last statement in 'cairo_bo_event_compare' is executed and on 64 bit platforms it invokes undefined behaviour (the compiler assumes that it must not happen and is entitled to emit whatever) For the record I'm using gcc from fc20: gcc (GCC) 4.8.3 20140624 (Red Hat 4.8.3-1) Copyright (C) 2013 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Created attachment 104274 [details] [review] patch to add an assert to verify an assumption Judging from the comment and the code at (where 'exactness' is used): http://cgit.freedesktop.org/cairo/tree/src/cairo-polygon-intersect.c?id=7e856071a27b06a6ae35b6445635da9276975c69#n665 it seems you want to compute the ordinate of an intersection rounded toward -oo with the unused remainder non-negative. adding an assert to verify this happens (as in the attached patch) num = q * den + r -> r = num - q * den -> r >= 0 -> num >= q * den it is evident from its failure when running the 'creduced' test case intersect_lines: Assertion `(!((det64x32_128 (a_det, dx1, b_det, dx2)) < (((int128_t) (qr.quo) * (den_det)))))' failed. that when the computed remainder is not zero not always a > b In case comments are old and misleading and you want to round to the nearest integer I suspect the code computing the ordinate is anyway wrong: for example when http://cgit.freedesktop.org/cairo/tree/src/cairo-polygon-intersect.c?id=7e856071a27b06a6ae35b6445635da9276975c69#n608 den64x32 = 16 den_det = -5 16 / -5 = -3.2 qr initially is {-3, 1} because 16 == -3 * -5 + 1 but den_det and qr.rem have different signs so qr.rem is negated and doubled -> qr = { -3, -2 } because now qr.rem >= den_det (-2 >= -5) qr.quo is decremented to -4, which is not the nearest integer approximating -3.2 ~
Re the problem caused by using the difference between ptrs: initially I thought that the problem was that the difference between raw pointers to stack/heap allocated objects could exceed 2^31 * sizeof(cairo_bo_event_t) and so not representable as an int. But reading better the code the problem is that you're computing the difference between addresses of objects of different types (cairo_bo_start_event_t and cairo_bo_queue_event_t) using pointers to another type (cairo_bo_event_t). When the raw difference is not divisible by 'sizeof(cairo_bo_event_t)' the result is random. In C, technically, relational operations between pointers are defined only for pointers to objects from the same array etc. (in C++ it is possible to use std::less that hides the magic necessary to make it well defined). To be on the safe side you'd better cast the pointers to uintptr_t and return -1 or 0 or 1 for <, ==, >.
Created attachment 105530 [details] [review] proposed patch This patch superesedes the previous. The problem seems to be that flooring (or rounding to the nearest) intersections and processing START events after INTERSECTION events causes an inversion in the active edge list, that is it happens that introducing new edges at the same y of a floored intersection finds two or more edges already swapped and event_queue_insert_if_intersect_below_current_y enqueues intersections events above the current sweep_line position because slope_compare is incorrectly assuming left to be the left edge and right the right one. The patched implementation of the sweep_line instead considers edges as translated upward and shrinked by 1 and 2 epsilon. It means that all edge lower vertices are considered positioned at an y = nominal y - 2 * epsilon, the upper points at y - epsilon. It follows that events having the same y are processed in the order: STOP prior than START prior than INTERSECTION (floored to integer). This way START and STOP events should find the active-edges list correctly sorted, and sorting the new edges also using slope_compare should not miss intersection, nor move the sweep line back and forth. Here it fixes the 3 test cases attached to this report and the 2 scripts attached to bug 59098, plus I can open in evince the pdf mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1134832 with the side panel (thumb pages) visible.
Two links to bug reports with attached pdf files that crash evince and are rendered after patching cairo with the patch in my previous comment: http://lists.opensuse.org/opensuse-bugs/2013-11/msg03431.html and https://bugzilla.redhat.com/show_bug.cgi?id=1058503
It is also possible to round intersections to their nearest integer, it is a little more elaborate: it is necessary to properly round the integer quotient to the nearest integer, change the cairo_bo_intersect_ordinate 'exactness' member to hold say {EXCESS, EXACT, DEFECT } values, and split INTERSECTION events in INTERSECTION_ABOVE and INTERSECTION_BELOW events which should be processed INTERSECTION_ABOVE < STOP < START < INTERSECTION_ABOVE with that approach few image tests that fail with the patch above do pass.
Created attachment 106407 [details] [review] proposed patch The attached patch achieves the same result (processing events in a valid order) by including the approximation of the intersection coordinates in the event->point member. It is thus easy to adjust the event comparison function to avoid that intersection events having y coordinate approximated in default are processed before start events having that same exact y. To simplify the rounding computation it exploits the fact that in intersect_lines den_det is always positive, because intersect_lines is only called after _slope_compare returned > 0 and _slope_compare is returning the sign of den_det. And it also simplifies _cairo_bo_edge_contains_intersect_point because comparing at top is useless as it is explained in the comment and the quadratic-time finder mentioned there is only present (but unusable) in src/cairo-bentley-ottman.c, whereas the comparison at bottom is necessary when edge.p2.y != edge.bottom, but comparing only the y is sufficient. On my laptop running: DISPLAY=:2 make -s test TARGETS=image,xlib,xcb reports the same number of failures with/without the patch and also replicating the changes in the patch to src/cairo-{bentley-ottman,polygon-reduce}.c Well the big number of failures in the test suite could hide newly introduced bugs, so considering the target image.argb32 tests executing intersection_sweep (in src/cairo-polygon-intersect.c) are bug-bo-ricotz clip-disjoint clip-disjoint-hatching clip-disjoint-quad clip-stroke-unbounded clip-fill-nz-unbounded clip-fill-eo-unbounded clip-fill clip-group-shapes-circles clip-operator clip-polygons clip-stroke clip-twice hatchings random-clip record90-paint-alpha-clip-mask rotated-clip tighten-bounds trap-clip of which only trap-clip and clip-operator are failing here, so anyway the test-suite is exercising and verifying the code modified.
Hi Massimo, mind breaking this patch up into some simpler chunks to make reviewing easier?
Created attachment 106722 [details] [review] patchset patch split in 4.
Thanks Massimo, we'll land this when the tree opens for 1.4.1.
Bryce, is that still on your list?
report of similar issues against evince https://bugs.launchpad.net/debian/+source/cairo/+bug/1404715 https://bugzilla.gnome.org/show_bug.cgi?id=745302 https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=708120
Thanks for splitting the patches up. The logic is still a bit dense for me to grok but based on the thoroughness of the testing and the pervasiveness of the issue I've taken the liberty of landing it at this point, with hope this will give us additional time for testing it before we get to the 1.14.4 release. Massimo, thanks again for investigating this bug! To ssh://git.cairographics.org/git/cairo b9ada81..1ed318c master -> master
*** Bug 89488 has been marked as a duplicate of this bug. ***
Bug 59098 is a duplicate of this?
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.