Bug 8801 - text rendering lacking locking in multithreaded apps [PATCH]
Summary: text rendering lacking locking in multithreaded apps [PATCH]
Status: RESOLVED FIXED
Alias: None
Product: cairo
Classification: Unclassified
Component: general (show other bugs)
Version: 1.2.5
Hardware: x86 (IA32) Linux (All)
: high normal
Assignee: Carl Worth
QA Contact: cairo-bugs mailing list
URL:
Whiteboard:
Keywords: patch
Depends on:
Blocks:
 
Reported: 2006-10-27 17:45 UTC by Monty Montgomery
Modified: 2007-02-14 00:12 UTC (History)
1 user (show)

See Also:
i915 platform:
i915 features:


Attachments
add locking to surface/pattern/cache refcounting (6.14 KB, patch)
2006-10-27 17:45 UTC, Monty Montgomery
Details | Splinter Review
One file hadn't been saved out of the emacs buffer in the previous patch incarnation (dang you emacs and your random safety interlocks!) (1.44 KB, patch)
2006-10-27 18:25 UTC, Monty Montgomery
Details | Splinter Review
Compined Monty's patches + fix for typo (see comment #9) + win32 complation fix (7.90 KB, patch)
2007-01-20 18:01 UTC, Jan Słupski
Details | Splinter Review
Lock more cairo code with mutex to prevent remaining crashes (8.80 KB, patch)
2007-01-20 19:05 UTC, Jan Słupski
Details | Splinter Review

Description Monty Montgomery 2006-10-27 17:45:01 UTC
Although some elements of text rendering are locked for use in multithreaded
applications, surface, pattern and cache refcounting are not; this is causing
multithreaded apps to crash even when no external cairo type instances are being
shared between threads.
Comment 1 Monty Montgomery 2006-10-27 17:45:53 UTC
Created attachment 7562 [details] [review]
add locking to surface/pattern/cache refcounting
Comment 2 Monty Montgomery 2006-10-27 18:25:24 UTC
Created attachment 7563 [details] [review]
One file hadn't been saved out of the emacs buffer in the previous patch incarnation (dang you emacs and your random safety interlocks!)
Comment 3 Behdad Esfahbod 2006-11-01 11:24:47 UTC
Monty, can you attach a backtrace?
I want to see if this is the same bug as:
  http://bugzilla.gnome.org/show_bug.cgi?id=360691
Comment 4 Monty Montgomery 2006-11-01 11:36:41 UTC
(In reply to comment #3)
> Monty, can you attach a backtrace?
> I want to see if this is the same bug as:
>   http://bugzilla.gnome.org/show_bug.cgi?id=360691

The exact point of the 'crash' was different each time because the problem was
unlocked concurrency leaking refcounts, then later throwing an assert error when
the refcounts dipped under zero.  The 'crash' happened long after the original
cause.

It's not the backtrace you really wanted from the user-- you wanted the assert
abort message. 

Regardless, I'll get you a backtrace in a few moments.

Monty
Comment 5 Monty Montgomery 2006-11-01 12:51:17 UTC
(In reply to comment #3)
> Monty, can you attach a backtrace?
> I want to see if this is the same bug as:
>   http://bugzilla.gnome.org/show_bug.cgi?id=360691

Here are the relevant asserts and backtraces.  Specifically, I saw two different
possible assert failures, both are detailed below.

First assert:
sushivision: cairo-cache.c:213: _cairo_cache_thaw: Assertion
`cache->freeze_count > 0' failed.

Program received signal SIGABRT, Aborted.
0x00002b06e3ac307b in raise () from /lib/libc.so.6
(gdb) bt
#0  0x00002b06e3ac307b in raise () from /lib/libc.so.6
#1  0x00002b06e3ac484e in abort () from /lib/libc.so.6
#2  0x00002b06e3abcaf4 in __assert_fail () from /lib/libc.so.6
#3  0x00002b06e39334da in _cairo_cache_thaw (cache=<value optimized out>)
    at cairo-cache.c:213
#4  0x00002b06e393eb74 in _cairo_scaled_font_show_glyphs (
    scaled_font=0x6d1f00, op=CAIRO_OPERATOR_OVER, pattern=0x4084eb80, 
    surface=0x2aaaab7011a0, source_x=781, source_y=171, dest_x=781, 
    dest_y=171, width=7, height=26, glyphs=0x2aaaab701c50, num_glyphs=5)
    at cairo-scaled-font.c:1008
#5  0x00002b06e3942b7f in _cairo_surface_old_show_glyphs_draw_func (
    closure=0x4084eae0, op=CAIRO_OPERATOR_OVER, src=0x4084eb80, 
    dst=0x2aaaab7011a0, dst_x=0, dst_y=0, extents=0x4084eb10)
    at cairo-surface-fallback.c:890
#6  0x00002b06e394268f in _clip_and_composite (clip=0x0, 
    op=CAIRO_OPERATOR_OVER, src=0x4084eb80, 
    draw_func=0x2b06e3942a10 <_cairo_surface_old_show_glyphs_draw_func>, 
    draw_closure=0x4084eae0, dst=0x2aaaab7011a0, extents=0x4084eb10)
    at cairo-surface-fallback.c:391
#7  0x00002b06e3942a05 in _cairo_surface_fallback_show_glyphs (
    surface=0x2aaaab7011a0, op=CAIRO_OPERATOR_OVER, source=0x4084eb80, 
    glyphs=0x2aaaab701c50, num_glyphs=5, scaled_font=0x6d1f00)
    at cairo-surface-fallback.c:941
#8  0x00002b06e3941494 in _cairo_surface_show_glyphs (surface=0x2aaaab7011a0, 
    op=CAIRO_OPERATOR_OVER, source=<value optimized out>, 
    glyphs=0x2aaaab701c50, num_glyphs=5, scaled_font=0x6d1f00)
    at cairo-surface.c:1847
#9  0x00002b06e3935fd7 in _cairo_gstate_show_glyphs (gstate=0x2aaaab7013c0, 
    glyphs=0x2aaaab701900, num_glyphs=5) at cairo-gstate.c:1503
#10 0x00002b06e3931c06 in cairo_show_text (cr=0x2aaaab7017c0, 
    utf8=0x4084ee10 "0.230") at cairo.c:2681
#11 0x000000000040490b in draw_scales_work (s=0x2aaaab7011a0, xs=
      {lo = 0.099285714285714297, hi = 0.255, first_val = 100, first_pixel = 4,
step_val = 5, step_pixel = 30, decimal_exponent = -3, m = 0.001, init = 1,
pixels = 946}, ys=
      {lo = -96, hi = 96, first_val = -8, first_pixel = 17, step_val = 2,
step_pixel = 21, decimal_exponent = 1, m = 10, init = 1, pixels = 200})
    at plot.c:129
#12 0x0000000000404a17 in plot_draw_scales (p=0x566650) at plot.c:149
#13 0x000000000040c6dd in _sushiv_panel_cooperative_compute_2d (p=0x5431d0)
    at panel-2d.c:709
#14 0x00000000004096d7 in _sushiv_panel_cooperative_compute (p=0x5431d0)
    at panel.c:110
#15 0x0000000000403630 in worker_thread (dummy=0x0) at main.c:106
#16 0x00002b06e1e9cf1a in start_thread () from /lib/libpthread.so.0
#17 0x00002b06e3b5d5d2 in clone () from /lib/libc.so.6
#18 0x0000000000000000 in ?? ()
(gdb) 

other threads:
(gdb) thread 1
[Switching to thread 1 (Thread 47308603799168 (LWP 11843))]#0 
0x00002b06e3b54cc6 in poll () from /lib/libc.so.6
(gdb) bt
#0  0x00002b06e3b54cc6 in poll () from /lib/libc.so.6
#1  0x00002b06e37b967e in g_main_context_check ()
   from /usr/lib/libglib-2.0.so.0
#2  0x00002b06e37b9b16 in g_main_loop_run () from /usr/lib/libglib-2.0.so.0
#3  0x00002b06e20d77e2 in gtk_main () from /usr/lib/libgtk-x11-2.0.so.0
#4  0x000000000040381f in main (argc=1, argv=0x7fffc8d287e8) at main.c:179
(gdb) thread 3
[Switching to thread 3 (Thread 1091180896 (LWP 11845))]#0  0x00002b06e3b079e0 in
mempcpy () from /lib/libc.so.6
(gdb) bt
#0  0x00002b06e3b079e0 in mempcpy () from /lib/libc.so.6
#1  0x00002b06e3ac5043 in bsearch () from /lib/libc.so.6
#2  0x00002b06e3ac4f93 in bsearch () from /lib/libc.so.6
#3  0x00002b06e3ac5230 in qsort () from /lib/libc.so.6
#4  0x00002b06e3943e54 in _cairo_traps_tessellate_polygon (traps=0x4109fb10, 
    poly=<value optimized out>, fill_rule=CAIRO_FILL_RULE_WINDING)
    at cairo-traps.c:742
#5  0x00002b06e393ce34 in _cairo_pen_stroke_spline (pen=0x4109f8e0, 
    spline=0x4109f820, tolerance=0.10000000000000001, traps=0x4109fb10)
    at cairo-pen.c:451
#6  0x00002b06e393c116 in _cairo_stroker_curve_to (closure=0x4109f9f0, 
    b=<value optimized out>, c=<value optimized out>, d=0x4109f990)
    at cairo-path-stroke.c:860
#7  0x00002b06e3939d99 in _cairo_path_fixed_interpret (
    path=<value optimized out>, dir=<value optimized out>, 
    move_to=0x2b06e393b820 <_cairo_stroker_move_to>, 
    line_to=0x2b06e393c5f0 <_cairo_stroker_line_to>, 
    curve_to=0x2b06e393be80 <_cairo_stroker_curve_to>, 
    close_path=0x2b06e393c780 <_cairo_stroker_close_path>, closure=0x4109f9f0)
    at cairo-path.c:564
#8  0x00002b06e393b9d3 in _cairo_path_fixed_stroke_to_traps (
    path=0x2aaaab601198, stroke_style=<value optimized out>, 
    ctm=<value optimized out>, ctm_inverse=<value optimized out>, 
    tolerance=<value optimized out>, traps=<value optimized out>)
    at cairo-path-stroke.c:995
#9  0x00002b06e3943353 in _cairo_surface_fallback_stroke (
    surface=0x2aaaab601460, op=CAIRO_OPERATOR_OVER, source=0x4109fba0, 
    path=0x2aaaab601198, stroke_style=0x2aaaab600b08, ctm=0x4109fc80, 
    ctm_inverse=0x4109fc50, tolerance=0.10000000000000001, 
    antialias=CAIRO_ANTIALIAS_DEFAULT) at cairo-surface-fallback.c:783
#10 0x00002b06e39410ef in _cairo_surface_stroke (surface=0x2aaaab601460, 
    op=CAIRO_OPERATOR_OVER, source=<value optimized out>, 
    path=0x2aaaab601198, stroke_style=0x2aaaab600b08, 
    ctm=<value optimized out>, ctm_inverse=0x2aaaab600c08, 
    tolerance=0.10000000000000001, antialias=CAIRO_ANTIALIAS_DEFAULT)
    at cairo-surface.c:1375
#11 0x00002b06e39355d0 in _cairo_gstate_stroke (gstate=0x2aaaab600af0, 
    path=0x2aaaab601198) at cairo-gstate.c:931
#12 0x00002b06e393123d in *INT_cairo_stroke_preserve (cr=0x2aaaab601190)
    at cairo.c:1920
#13 0x00002b06e3931259 in cairo_stroke (cr=0x4109f5ed) at cairo.c:1896
#14 0x000000000040486e in draw_scales_work (s=0x2aaaab601460, xs=
      {lo = 0.098571428571428588, hi = 0.255, first_val = 100, first_pixel = 9,
step_val = 5, step_pixel = 30, decimal_exponent = -3, m = 0.001, init = 1,
pixels = 946}, ys=
      {lo = -96, hi = 96, first_val = -8, first_pixel = 17, step_val = 2,
step_pixel = 21, decimal_exponent = 1, m = 10, init = 1, pixels = 200})
    at plot.c:125
#15 0x0000000000404a17 in plot_draw_scales (p=0x566650) at plot.c:149
#16 0x000000000040c6dd in _sushiv_panel_cooperative_compute_2d (p=0x5431d0)
    at panel-2d.c:709
#17 0x00000000004096d7 in _sushiv_panel_cooperative_compute (p=0x5431d0)
    at panel.c:110
#18 0x0000000000403630 in worker_thread (dummy=0x0) at main.c:106
#19 0x00002b06e1e9cf1a in start_thread () from /lib/libpthread.so.0
#20 0x00002b06e3b5d5d2 in clone () from /lib/libc.so.6
#21 0x0000000000000000 in ?? ()

Second assert:
sushivision: cairo-surface.c:441: cairo_surface_destroy: Assertion
`surface->ref_count > 0' failed.

Program received signal SIGABRT, Aborted.
[Switching to Thread 1082460512 (LWP 11860)]
0x00002ba8bc5f107b in raise () from /lib/libc.so.6
(gdb) bt
#0  0x00002ba8bc5f107b in raise () from /lib/libc.so.6
#1  0x00002ba8bc5f284e in abort () from /lib/libc.so.6
#2  0x00002ba8bc5eaaf4 in __assert_fail () from /lib/libc.so.6
#3  0x00002ba8bc46f93d in *INT_cairo_surface_destroy (surface=0x6cf520)
    at cairo-surface.c:441
#4  0x00002ba8bc46ca08 in _cairo_scaled_font_show_glyphs (
    scaled_font=0x6d1f00, op=CAIRO_OPERATOR_OVER, pattern=0x4084dc50, 
    surface=0x2aaaab816830, source_x=54, source_y=171, dest_x=54, dest_y=171, 
    width=7, height=26, glyphs=0x756070, num_glyphs=5)
    at cairo-scaled-font.c:987
#5  0x00002ba8bc470b7f in _cairo_surface_old_show_glyphs_draw_func (
    closure=0x4084dbb0, op=CAIRO_OPERATOR_OVER, src=0x4084dc50, 
    dst=0x2aaaab816830, dst_x=0, dst_y=0, extents=0x4084dbe0)
    at cairo-surface-fallback.c:890
#6  0x00002ba8bc47068f in _clip_and_composite (clip=0x0, 
    op=CAIRO_OPERATOR_OVER, src=0x4084dc50, 
    draw_func=0x2ba8bc470a10 <_cairo_surface_old_show_glyphs_draw_func>, 
    draw_closure=0x4084dbb0, dst=0x2aaaab816830, extents=0x4084dbe0)
    at cairo-surface-fallback.c:391
#7  0x00002ba8bc470a05 in _cairo_surface_fallback_show_glyphs (
    surface=0x2aaaab816830, op=CAIRO_OPERATOR_OVER, source=0x4084dc50, 
    glyphs=0x756070, num_glyphs=5, scaled_font=0x6d1f00)
    at cairo-surface-fallback.c:941
#8  0x00002ba8bc46f494 in _cairo_surface_show_glyphs (surface=0x2aaaab816830, 
    op=CAIRO_OPERATOR_OVER, source=<value optimized out>, glyphs=0x756070, 
    num_glyphs=5, scaled_font=0x6d1f00) at cairo-surface.c:1847
#9  0x00002ba8bc463fd7 in _cairo_gstate_show_glyphs (gstate=0x2aaaab816940, 
    glyphs=0x754ea0, num_glyphs=5) at cairo-gstate.c:1503
#10 0x00002ba8bc45fc06 in cairo_show_text (cr=0x2aaaab801970, 
    utf8=0x4084dee0 "0.160") at cairo.c:2681
#11 0x000000000040490b in draw_scales_work (s=0x2aaaab816830, xs=
      {lo = 0.14940119760479043, hi = 0.5, first_val = 150, first_pixel = 3,
step_val = 5, step_pixel = 27, decimal_exponent = -3, m = 0.001, init = 1,
pixels = 1917}, ys=
      {lo = -96, hi = 96, first_val = -8, first_pixel = 17, step_val = 2,
step_pixel = 21, decimal_exponent = 1, m = 10, init = 1, pixels = 200})
    at plot.c:129
#12 0x0000000000404a17 in plot_draw_scales (p=0x566650) at plot.c:149
#13 0x000000000040c6dd in _sushiv_panel_cooperative_compute_2d (p=0x5431d0)
    at panel-2d.c:709
#14 0x00000000004096d7 in _sushiv_panel_cooperative_compute (p=0x5431d0)
    at panel.c:110
#15 0x0000000000403630 in worker_thread (dummy=0x0) at main.c:106
#16 0x00002ba8ba9caf1a in start_thread () from /lib/libpthread.so.0
#17 0x00002ba8bc68b5d2 in clone () from /lib/libc.so.6
#18 0x0000000000000000 in ?? ()

other threads:
(gdb) thread 1
[Switching to thread 1 (Thread 48003729135232 (LWP 11859))]#0 
0x00002ba8bc4a4ad8 in _cairo_pixman_render_edge_init (e=0x7ffff01f88c0, n=8,
y_start=67720, 
    x_top=172998, y_top=66009, x_bot=156614, y_bot=70399) at renderedge.c:152
(gdb) bt
#0  0x00002ba8bc4a4ad8 in _cairo_pixman_render_edge_init (e=0x7ffff01f88c0, 
    n=8, y_start=67720, x_top=172998, y_top=66009, x_bot=156614, y_bot=70399)
    at renderedge.c:152
#1  0x00002ba8bc4a4b31 in _cairo_pixman_render_line_fixed_edge_init (
    e=0x1126, n=8, y=3214, line=<value optimized out>, 
    x_off=<value optimized out>, y_off=<value optimized out>)
    at renderedge.c:189
#2  0x00002ba8bc49ada0 in fbRasterizeTrapezoid (
    pPicture=<value optimized out>, trap=0x735698, x_off=-761, y_off=0)
    at fbtrap.c:135
#3  0x00002ba8bc4992fe in _cairo_pixman_add_trapezoids (dst=0x7532b0, 
    x_off=-761, y_off=0, traps=0x735698, ntraps=36) at ictrap.c:204
#4  0x00002ba8bc465a2d in _cairo_image_surface_composite_trapezoids (
    op=CAIRO_OPERATOR_OVER, pattern=0x7ffff01f9060, abstract_dst=0x6f2e00, 
    antialias=<value optimized out>, src_x=761, src_y=0, dst_x=761, dst_y=0, 
    width=15, height=14, traps=0x734f90, num_traps=81)
    at cairo-image-surface.c:1013
#5  0x00002ba8bc46e3a1 in _cairo_surface_composite_trapezoids (op=4390, 
    pattern=0x7ffff01f9060, dst=0x6f2e00, antialias=CAIRO_ANTIALIAS_DEFAULT, 
    src_x=761, src_y=0, dst_x=761, dst_y=0, width=15, height=14, 
    traps=0x734f90, num_traps=81) at cairo-surface.c:1459
#6  0x00002ba8bc471280 in _composite_traps_draw_func (closure=0x7ffff01f8f50, 
    op=CAIRO_OPERATOR_OVER, src=0x7ffff01f9060, dst=0x6f2e00, dst_x=0, 
    dst_y=0, extents=0x7ffff01f8f60) at cairo-surface-fallback.c:492
#7  0x00002ba8bc47068f in _clip_and_composite (clip=0x0, 
    op=CAIRO_OPERATOR_OVER, src=0x7ffff01f9060, 
    draw_func=0x2ba8bc4711c0 <_composite_traps_draw_func>, 
    draw_closure=0x7ffff01f8f50, dst=0x6f2e00, extents=0x7ffff01f8f60)
    at cairo-surface-fallback.c:391
#8  0x00002ba8bc470fb4 in _clip_and_composite_trapezoids (src=0x7ffff01f8eb0, 
    op=32767, dst=0x6f2e00, traps=0x7ffff01f8fd0, clip=0x0, 
    antialias=CAIRO_ANTIALIAS_DEFAULT) at cairo-surface-fallback.c:636
#9  0x00002ba8bc47137c in _cairo_surface_fallback_stroke (surface=0x6f2e00, 
    op=CAIRO_OPERATOR_OVER, source=0x7ffff01f9060, path=0x69b628, 
    stroke_style=0x754218, ctm=0x7ffff01f9140, ctm_inverse=0x7ffff01f9110, 
    tolerance=<value optimized out>, antialias=CAIRO_ANTIALIAS_DEFAULT)
    at cairo-surface-fallback.c:793
#10 0x00002ba8bc46f0ef in _cairo_surface_stroke (surface=0x6f2e00, 
    op=CAIRO_OPERATOR_OVER, source=<value optimized out>, path=0x69b628, 
    stroke_style=0x754218, ctm=<value optimized out>, ctm_inverse=0x754318, 
    tolerance=0.10000000000000001, antialias=CAIRO_ANTIALIAS_DEFAULT)
    at cairo-surface.c:1375
#11 0x00002ba8bc4635d0 in _cairo_gstate_stroke (gstate=0x754200, 
    path=0x69b628) at cairo-gstate.c:931
#12 0x00002ba8bc45f23d in *INT_cairo_stroke_preserve (cr=0x69b620)
    at cairo.c:1920
#13 0x0000000000406fb7 in slider_draw (s=0x69fea0) at slider.c:320
#14 0x0000000000408999 in slider_motion (s=0x69fea0, slicenum=0, x=767, y=18)
    at slider.c:680
#15 0x0000000000408df5 in slice_motion (widget=0x69cd00, event=0x6cc670)
    at slice.c:66
#16 0x00002ba8bac0a77d in _gtk_marshal_BOOLEAN__BOXED ()
   from /usr/lib/libgtk-x11-2.0.so.0
#17 0x00002ba8bb527589 in g_closure_invoke ()
   from /usr/lib/libgobject-2.0.so.0
#18 0x00002ba8bb536d8f in g_signal_chain_from_overridden ()
   from /usr/lib/libgobject-2.0.so.0
#19 0x00002ba8bb537c6e in g_signal_emit_valist ()
   from /usr/lib/libgobject-2.0.so.0
#20 0x00002ba8bb538083 in g_signal_emit () from /usr/lib/libgobject-2.0.so.0
#21 0x00002ba8bace0e8e in gtk_widget_get_default_style ()
   from /usr/lib/libgtk-x11-2.0.so.0
#22 0x00002ba8bac043de in gtk_propagate_event ()
   from /usr/lib/libgtk-x11-2.0.so.0
#23 0x00002ba8bac05487 in gtk_main_do_event ()
   from /usr/lib/libgtk-x11-2.0.so.0
#24 0x00002ba8baf4d4dc in _gdk_events_init ()
   from /usr/lib/libgdk-x11-2.0.so.0
#25 0x00002ba8bc2e49e3 in g_main_context_dispatch ()
   from /usr/lib/libglib-2.0.so.0
#26 0x00002ba8bc2e782d in g_main_context_check ()
   from /usr/lib/libglib-2.0.so.0
#27 0x00002ba8bc2e7b16 in g_main_loop_run () from /usr/lib/libglib-2.0.so.0
#28 0x00002ba8bac057e2 in gtk_main () from /usr/lib/libgtk-x11-2.0.so.0
#29 0x000000000040381f in main (argc=1, argv=0x7ffff01f9cb8) at main.c:179
(gdb) thread 3
[Switching to thread 3 (Thread 1091180896 (LWP 11861))]#0  0x00002ba8ba9cfeab in
__lll_mutex_lock_wait () from /lib/libpthread.so.0
(gdb) bt
#0  0x00002ba8ba9cfeab in __lll_mutex_lock_wait () from /lib/libpthread.so.0
#1  0x0000000000000000 in ?? ()

...so the backtraces are not similar to the referenced bugs-- but that does not
rule out the cause being the same.  The crashes above are not coming from a
standard Gtk+ widget, but rather a custom widget implemented by Sushivision.

Monty
Comment 6 Behdad Esfahbod 2006-11-01 13:11:21 UTC
Thanks.
Comment 7 Monty Montgomery 2006-11-01 13:14:53 UTC
(In reply to comment #5)

I also should add that although I only saw two assert failures, I added or
adjusted locking in five places that came up in code review: Surface refcounts,
Pattern refcounts, font cache 'freeze' counts, generic cache refcounts, and
lastly fixing a race in glyph rendering where it would first check for glyphs,
then freeze the cache instead of the other way around.

So it's possible the above patch fixes more concurrency clobbers than just the
two assert failures I saw.

Monty
Comment 8 Behdad Esfahbod 2006-11-01 14:29:16 UTC
This bug suggests that the same is happening in cairo_ft_scaled_font_lock/unlock:

  http://bugzilla.gnome.org/show_bug.cgi?id=357939

We should fix that one too.
Comment 9 Jan Słupski 2007-01-10 12:00:07 UTC
Looking at the bottom of the 2nd patch:

@@ -347,6 +351,7 @@ static void
 _cairo_cache_remove (cairo_cache_t	 *cache,
 		     cairo_cache_entry_t *entry)
 {
+    CAIRO_MUTEX_LOCK (cairo_cache_refcount_mutex);
     cache->size -= entry->size;
 
     _cairo_hash_table_remove (cache->hash_table,
@@ -354,6 +359,7 @@ _cairo_cache_remove (cairo_cache_t	 *cac
 
     if (cache->entry_destroy)
 	cache->entry_destroy (entry);
+    CAIRO_MUTEX_LOCK (cairo_cache_refcount_mutex);
 }

Shouldn't the second mutex operation be UNLOCK?
Comment 10 Carl Worth 2007-01-20 01:08:27 UTC
(In reply to comment #9)
> Shouldn't the second mutex operation be UNLOCK?

Definitely. Thanks for the catch.

Could anyone testing this stuff please post a single, good patch containing
everything of interest? I seem to have broken something in attempting to
generate one.

Actually, what would be even better would be multiple patches for the
independent pieces, (for example, separating the missing freeze/thaw from the
addition of locking around all reference counting).

And while I'm making a wishlist, I'd love to have the stuff available as a git
patchset with commit messages, etc. And I'd really love to see some new
cairo_atomic_increment/decrement inline functions used for the reference
counting, (even if initially implemented with just a mutex---and likely just one
mutex for all reference count locking).

But please don't let my long list of wishes keep anyone from doing one piece
just because they don't plan to do all of them.

Thanks,

-Carl

Comment 11 Monty Montgomery 2007-01-20 01:21:39 UTC
What is it that you broke Carl?  I don't mind making this mostly my problem as I
seem to have the most immediate need for the fix.

Monty
Comment 12 Carl Worth 2007-01-20 01:45:20 UTC
(In reply to comment #11)
> What is it that you broke Carl?  I don't mind making this mostly my problem as I
> seem to have the most immediate need for the fix.

I seemed to see tests hanging when I ran "make test". Could you give that a try?

The other things that would help would be a single correct patch known to apply
to the latest cairo code.

Thanks,

-Carl
Comment 13 Jan Słupski 2007-01-20 18:01:36 UTC
Created attachment 8470 [details] [review]
Compined Monty's patches + fix for typo (see comment #9) + win32 complation fix

Patch attached is combined patch 7562 and 7673.
On top of it CAIRO_MUTEX_LOCK is changed to CAIRO_MUTEX_UNLOCK in
cairo-cache.c, that is obvious typo (see comment #9).
Then added declaration and initialization of mutex used in this patch in win32
code. (Are there any other platforms that needs that too?)
Comment 14 Jan Słupski 2007-01-20 19:05:19 UTC
Created attachment 8471 [details] [review]
Lock more cairo code with mutex to prevent remaining crashes

Monty's patches helped a lot, but not all cases were solved. 

Examples of crashes:

threads: cairo-ft-font.c:562: _cairo_ft_unscaled_font_unlock_face: Assertion
`unscaled->lock > 0' failed.

threads: cairo-hash.c:196: _cairo_hash_table_destroy: Assertion
`hash_table->live_entries == 0' failed.

It seems that these crashes are easiest reproducible when threads tie while
executing some instructions. I.e. I'm running two threads application on a
CoreDuo processor. Each thread runs the same code in a loop, so (I guess) they
are often executing the same code block in the same moment.

As a solution I have added 5 more mutex in different parts of code
After that all problems seems to be resolved, although it is quite possible
that I've added more locks than really needed, and that the code is
sub-optimal.

I've added:
- cairo_refcount_mutex for cairo_reference/cairo_destroy
- cairo_clip_path_refcount_mutex for
_cairo_clip_path_reference/_cairo_clip_path_destroy
- cairo_unscaled_font_refcount_mutex for
_cairo_unscaled_font_reference/_cairo_unscaled_font_destroy
- cairo_ft_font_unscaled_lock_mutex for
_cairo_ft_unscaled_font_lock_face/_cairo_ft_unscaled_font_unlock_face
- cairo_font_face_refcount_mutex for
cairo_font_face_reference/cairo_font_face_destroy

The last one is the most tricky - especially unlocking/locking around
"font_face->backend->destroy (font_face);". But without this I was getting
deadlock.
Comment 15 Jan Słupski 2007-01-22 18:30:50 UTC
See also: https://bugs.freedesktop.org/show_bug.cgi?id=4299
Comment 16 Carl Worth 2007-01-30 11:13:32 UTC
(In reply to comment #13)
> Created an attachment (id=8470) [details]
> Compined Monty's patches + fix for typo (see comment #9) + win32 complation fix

This still isn't applying cleanly to the latest cairo.

Oh, looks from the patch that it was generated against 1.2.6.

It would be much more helpful to have patches generated from the
latest cairo, (as made available through git). And if you are in
a position to feed me patches generated from git that could help
a lot too.

> On top of it CAIRO_MUTEX_LOCK is changed to CAIRO_MUTEX_UNLOCK in
> cairo-cache.c, that is obvious typo (see comment #9).

Thanks.

> Then added declaration and initialization of mutex used in this patch in win32
> code. (Are there any other platforms that needs that too?)

Also, thanks. There might be another platform that needs it, but we won't
hear about it here in bugzilla-land where we get basically just a pair-
wise conversation. Please
Comment 17 Carl Worth 2007-01-30 11:24:36 UTC
(In reply to comment #16)
> This still isn't applying cleanly to the latest cairo.

Beyond that, I'm still getting deadlock with "make test" after applying
(and massaging) the patch. Something's very wrong.

I'll start by breaking the patch up into minimal pieces, but I'll need
some help from someone who can easily replicate the original bugs
to make sure my new stuff still catches them all.

I'll post a git branch with the new work soon, (I refuse to post
patches as bugzilla attachments---yuck!).

-Carl
Comment 18 Carl Worth 2007-01-30 12:18:05 UTC
So here's a web-view of a git branch I made, (8801-attachments) with
the two most recent attachments as git commits:

http://gitweb.freedesktop.org/?p=users/cworth/cairo;a=shortlog;h=8801-attachments

One should be able to explore that by doing the following for a new clone:

    git clone git://people.freedesktop.org/~cworth/cairo
    cd cairo
    git checkout -b tmp 8801-attachments

or from an existing clone of cairo's repository:

    git fetch git://people.freedesktop.org/~cworth/cairo 8801-attachments:8801-attachments
    git checkout -b tmp 8801-attachments

If you could test to see if you con't also get deadlock with "make test" that
would be helpful.

Then, I'll also follow up with a new branch that breaks up each minimal piece
of new functionality.

Thanks,

-Carl
Comment 19 Carl Worth 2007-01-30 23:11:35 UTC
(In reply to comment #18)
> Then, I'll also follow up with a new branch that breaks up each minimal piece
> of new functionality.

The first thing I tried separating out were the changes to cairo-scaled-font.c, (moving the scaled_font_freeze up before the call to backend->show_glyphs).

That's definitely not what we want. It's even been proposed before and rejected, see:

https://bugs.freedesktop.org/show_bug.cgi?id=6955#c12

So you'll see that the freeze is happening as the first substantive thing in
_cairo_xlib_surface_show_glyphs. So now I'm really curious as to what prompted
this change in the patch. I'd really like to see a test case that fails
without this part of the patch and passes with it.

The ref_count stuff is a bit more straightforward, but I still want to change
it in a couple of ways:

1. I think I might unify all of the reference counting throughout cairo

2. I'd still like to see the protection for this implemented with atomic
   instructions.

More to come on those fronts...

-Carl
Comment 20 Carl Worth 2007-01-31 01:02:53 UTC
(In reply to comment #19)
> The first thing I tried separating out were the changes to cairo-scaled-font.c,
> (moving the scaled_font_freeze up before the call to backend->show_glyphs).
> 
> That's definitely not what we want. It's even been proposed before and
> rejected, see:
> 
> https://bugs.freedesktop.org/show_bug.cgi?id=6955#c12

Looking closer, what was rejected was doing generic freeze/thaw to fix the
problem of glyphs disappearing from the X server glyph cache due to cache
pressure from other glyphs in the same call to _cairo_surface_show_glyphs.

But the question of how to ensure that the results of _cairo_scaled_glyph_lookup
remain valid until the caller is done looking at them has not been properly
addressed. I think we're missing freeze/thaw not only here, but around
every call into _cairo_scaled_glyph_lookup.

And _cairo_scaled_glyph_lookup is also currently abusing the font_map_mutex
when the scaled_font->glyphs cache really should have its own mutex.

So, basically, it looks like the locking situtation inside cairo-scaled-font.c
is quite miserable currently, and I'm not surprised that you've both had
to do a fair amount of whack-a-mole fixing on it.

I'd like to attempt a more complete fix addressing the issues described here,
but I'm definitely too tired to finish that up tonight.

More tomorrow, (and probably moving over to the cairo mailing list at this
point).

-Carl
Comment 21 Carl Worth 2007-02-05 17:08:15 UTC
See my recent mail to the cairo list for an attempted patch to address this bug:

http://article.gmane.org/gmane.comp.lib.cairo/9360

Thanks,

-Carl
Comment 22 Carl Worth 2007-02-06 19:04:10 UTC
As described in the mail linked to above, the patch that Keith and I put together solves the bug without introducing any locking overhead for reference counting.

Monty confirmed that the patch fixes his bug. And the patch also doesn't cause any deadlocks in "make test".

Finally, the new locking that the patch adds to the text rendering code could be made more narrow for better performance, but that's obviously a different bug.

Anyway, the patch is committed now, (during 1.3.13 as available through git), and will appear in the next (1.3.14 snapshot). So that's a version number you can key off of Monty.

Jan, if you could retest with the latest code that would be helpful. And if you still see problems you may be seeing an independent bug---so let us know.

-Carl
Comment 23 Carl Worth 2007-02-06 23:55:15 UTC
Ok, this one isn't quite done yet.

The pthread-show-text test case is being useful for exposing the remaining problems, (particularly with a multi-core or multi-processor machine).

More later, (sleep first).

-Carl
Comment 24 Carl Worth 2007-02-14 00:12:02 UTC
The scaled_font locking added previously helped a lot, but as Jan noted, it wasn't complete. Since then, with Jan's help we tracked down the need to add locking to the unscaled_font object as well, and triggered some interesting discussions about the way this locking should interact with cairo_ft_scaled_font_lock_face, (which is a separate bug---how to ensure cairo and cairo-using applications don't call into the non-thread-safe freetype library at the same time).

Anyway, the new locking (and subsequent fixes for the several things I broke while adding it), are available in commits 7e1301ffb066b04d95dc71144d86e660f0155bba through 9966551dc7640ae7901ffed0e15f0fbf7e603d5d .

Hopefully, we can keep this bug closed this time.

Again, thanks for all your help with this.

-Carl



Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.