Bug 49719

Summary: Assert failures in cairo-surface.c on Ubuntu 11.10/12.04
Product: cairo Reporter: John Beaumont <john>
Component: general Assignee: Carl Worth <cworth>
Status: RESOLVED MOVED QA Contact: cairo-bugs mailing list <cairo-bugs>
Severity: normal    
Priority: medium    
Version: 1.10.2   
Hardware: x86 (IA32)   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Attachments: claws mail backtrace

Description John Beaumont 2012-05-10 01:25:48 UTC
Created attachment 61324
claws mail backtrace

Some users of Ubuntu 11.10 and 12.04 are getting program crashes along with this error.

claws-mail: /build/buildd/cairo-1.10.2/src/cairo-surface.c:1287:
cairo_surface_set_device_offset: Assertion `status == CAIRO_STATUS_SUCCESS'
failed.
Aborted (core dumped)

The above one is specific to claws mail but the error is basically the same for other software.

This is likely affecting Chromium browser, Pidgin, Banshee, Gwibber and Claws Mail.

Main thread for bug:
https://bugs.launchpad.net/ubuntu/+source/chromium-browser/+bug/887850

Others:
http://www.thewildbeast.co.uk/claws-mail/bugzilla/show_bug.cgi?id=2656
http://developer.pidgin.im/ticket/13810
Comment 1 Chris Wilson 2012-05-10 01:36:46 UTC
Already fixed and tell the applications to stop being buggy as well.
Comment 2 John Beaumont 2012-05-10 02:51:18 UTC
Could you elaborate on that please?

Which bug number and version is this already fixed in?

What do you mean by "tell the applications to stop being buggy as well"? Surely if this is a cairo bug, then that has no relevance.
Comment 3 Chris Wilson 2012-05-10 03:17:23 UTC
No, the applications are not detecting errors (which result from their own bugs) correctly, consider this a friendly segfault.
Comment 4 John Beaumont 2012-05-10 03:23:50 UTC
Lovely. So I'm left with nowhere to go now?

The developers of the other software blame cairo, and the developer of cairo blames the other software.

From my point of view so far, this is a cairo or gtk bug, since it affects multiple applications that use cairo. It would seem unlikely that they are all suddenly having the same bug on 11.10/12.04 at the same time.

Please could you be more forthcoming and at least explain to me what this bug is about and tell me where this was fixed in cairo?
Comment 5 John Beaumont 2012-05-10 03:45:53 UTC
Additionally here are the comments from Pidgin and claws

"doing a quick search for this cairo error message confirms my suspicions (or at
least make me more confident) that the bug is in cairo/pango"

and

"This isn't crashing in Pidgin, it's an upstream bug in Pango/Cairo, you should report this to your distro. "
Comment 6 Kalle Vahlman 2012-05-10 06:03:14 UTC
(In reply to comment #4)
> Lovely. So I'm left with nowhere to go now?
> 
> The developers of the other software blame cairo, and the developer of cairo
> blames the other software.

That happens more than you'd think ;)

But in this case, the particular error message and the resulting abort is due to the application (or a library it's using) calling cairo_surface_set_device_offset() when the cairo context is already in an error state.

So unless it's a memory corruption or something similar in cairo, checking for the error state (or not causing it in the first place) is definitely something the application developers should look into.
Comment 7 John Beaumont 2012-05-10 07:17:02 UTC
(In reply to comment #6)
> So unless it's a memory corruption or something similar in cairo, checking for
> the error state (or not causing it in the first place) is definitely something
> the application developers should look into.

Thank you Kalle for a much more helpful reply.

I could certainly go with what you said. My only sticking point is that this has suddenly affected a number of programs since 11.10. If it were only one then I could believe it is not a problem with cairo. But to have multiple programs affected in the same way, when there was no problem at all in 11.04, would point the finger at cairo.

Also Chris alluded to the fact that he already knew about the problem with cairo when he said "already fixed", but in the next breath blamed other software. That is something I asked him to elaborate on, which he seemed reluctant to do.

All I ask is that if it is "already fixed", could someone give me a bug number.
Comment 8 Benjamin Otte 2012-05-10 07:49:54 UTC
What Chris meant to say was this:

This problem can only happen with broken applications.

Cairo 1.8 was written in a way to cope with these breakages. But somewhere along the way we lost that feature, because we don't test broken applications (and I don't think we intend to).
Comment 9 Uli Schlachter 2012-05-10 08:00:13 UTC
Can you get cairo 1.12.2 and check if this also happens with that cairo version?
I don't have any bug number for you, but there were lots of fixes since 1.10 and I can't easily tell how many of those affect cairo 1.10, too.

On your backtrace:
When this happens again, could you figure out the surface's state? In gdb, do "frame 5" and "print *surface". However, this needs debug symbols for cairo, which your earlier backtrace doesn't have (as can be seen from the missing function arguments in the backtrace: "#5  0x00705f0a in cairo_surface_set_device_offset ()"). I think/hope there should be a package libcairo2-dbg for that.

I'm especially interested in surface->device_transform.
Comment 10 John Beaumont 2012-05-10 08:07:20 UTC
Thanks, I will get back to you as soon as I can.
Comment 11 Bill Spitzak 2012-05-10 11:50:01 UTC
Isn't Cairo supposed to silently ignore all further commands when a cairo_t is in an error state?

It seems it is not, so I would say this is a Cairo bug, right?
Comment 12 Uli Schlachter 2012-05-10 11:55:34 UTC
Let me quote the (shortened) source:


void
cairo_surface_set_device_offset (cairo_surface_t *surface, double x_offset, double y_offset)
{
    cairo_status_t status;

    if (unlikely (surface->status))
	return;

[...]

    surface->device_transform.x0 = x_offset;
    surface->device_transform.y0 = y_offset;

    surface->device_transform_inverse = surface->device_transform;
    status = cairo_matrix_invert (&surface->device_transform_inverse);
    /* should always be invertible unless given pathological input */
    assert (status == CAIRO_STATUS_SUCCESS);

[...]
}

You can see two things:
- This function doesn't do anything on an error surface (=> cairo correctly ignores operations on error surfaces)
- The only assert() in there has a comment which says "it's virtually impossible for this to fail"

I agree with this assert(). The device_transform should always be a translation matrix and those are always invertible. So unless I missed something, this leaves "random memory corruption" as the most likely case for this assert() to trigger (and debugging random memory corruption is hard and most likely not a bug in cairo).

Also, this is why I asked for someone to ask gdb which values the device_transform contains after a crash.
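
The invertibility argument can be checked with a small standalone sketch. It does not use cairo: matrix_t, translation() and matrix_invert() are hypothetical names made up for illustration, and the determinant-based inversion is the textbook method for 2D affine matrices, not necessarily what cairo_matrix_invert() does internally. The point is only that a pure translation has determinant 1 whatever the offsets are, so inversion can fail only if the other matrix entries hold unexpected values.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical stand-in for a 2D affine matrix (not cairo's own type):
 *   x' = xx*x + xy*y + x0
 *   y' = yx*x + yy*y + y0
 */
typedef struct {
    double xx, yx;
    double xy, yy;
    double x0, y0;
} matrix_t;

/* Identity matrix plus a translation by (tx, ty). */
matrix_t
translation (double tx, double ty)
{
    matrix_t m = { 1.0, 0.0, 0.0, 1.0, tx, ty };
    return m;
}

/* Textbook inversion: inv(A) = adj(A) / det(A).
 * Returns false and leaves *m untouched when the determinant is
 * zero or not finite; otherwise overwrites *m with its inverse. */
bool
matrix_invert (matrix_t *m)
{
    double det = m->xx * m->yy - m->yx * m->xy;
    matrix_t r;

    if (!isfinite (det) || det == 0.0)
        return false;

    r.xx =  m->yy / det;
    r.yx = -m->yx / det;
    r.xy = -m->xy / det;
    r.yy =  m->xx / det;
    r.x0 = (m->xy * m->y0 - m->yy * m->x0) / det;
    r.y0 = (m->yx * m->x0 - m->xx * m->y0) / det;

    *m = r;
    return true;
}
```

With these definitions, inverting translation (3.0, -4.0) succeeds and yields the offsets (-3.0, 4.0), while a matrix whose fields have all been zeroed, or whose xx entry is NaN, makes matrix_invert() return false. That is the kind of state the gdb output for device_transform would reveal.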
Comment 13 Bill Spitzak 2012-05-10 12:23:40 UTC
I think you are correct.

I thought it was doing the assert with surface->status. But it is doing it to the return value of a function which should not fail.
Comment 14 Sergio 2012-09-18 19:49:42 UTC
Hello.
webkitgtk browser (and browsers that use it like xxxterm - which changed name recently - and Midori) have been having this bug for quite a while (at least one year).
I have personally experienced it for a long time, both when I used Debian sid and now that I'm using Fedora 17 i686.

Here's the backtrace of /usr/libexec/webkitgtk/GtkLauncher when trying to open https://launchpad.net

http://pastebin.com/nvseMqbA

$ cat /proc/cpuinfo 
processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 6
model		: 6
model name	: AMD Athlon(tm) XP 1700+
stepping	: 2
cpu MHz		: 1466.842
cache size	: 256 KB
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 1
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow up
bogomips	: 2933.68
clflush size	: 32
cache_alignment	: 32
address sizes	: 34 bits physical, 32 bits virtual
power management: ts
Comment 15 Sergio 2012-09-18 19:51:48 UTC
cairo package here is 1.10.2-7.fc17
Comment 16 Sergio 2012-09-18 21:47:33 UTC
Sorry, I see this is marked as invalid.
I tested the Fedora 18 live-cd with cairo 1.12.2-3.fc18 and webkitgtk's browser crashed with

GtkLauncher: cairo-surface.c:1591: cairo_surface_set_device_offset: Assertion `status == CAIRO_STATUS_SUCCESS' failed.

But now back in my F17 installed system I tried webkitgtk3 browser and this one works properly in https://launchpad.net [note it's the gtk3 browser instead of the gtk(2)]

I suspect I must bother someone in webkit.org about this then but I'm posting just in case anyone has any additional info on this.

Thanks.
Comment 17 dev@diy-biogas.eu 2013-04-18 07:36:24 UTC
Unfortunately the problem still persists with 1.12.14:
http://www.thewildbeast.co.uk/claws-mail/bugzilla/show_bug.cgi?id=2656#c17
Comment 18 Uli Schlachter 2013-04-19 19:33:27 UTC
Since everyone seems to be happy to provide more information:

Could you get debug symbols for gtk+ and figure out the local state in gdk_window_begin_paint_region()? So "frame 6" and "print *paint", "print *implicit_paint" and "print *paint->region", "print *implicit_paint->flushed", perhaps even "print *windows" and "print clip_box" (and for completeness: "print *paint->surface").

If you want to be really, really helpful, it would be great to have a stand-alone test case for this, but I doubt that this will show up.
Comment 19 GitLab Migration User 2018-08-25 13:51:32 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/cairo/cairo/issues/234.
