Summary: | evince crashed with SIGSEGV in CairoOutputDev::restoreState() | ||
---|---|---|---|
Product: | poppler | Reporter: | Pedro Villavicencio <pvillavi> |
Component: | cairo backend | Assignee: | poppler-bugs <poppler-bugs> |
Status: | RESOLVED FIXED | QA Contact: | |
Severity: | normal | ||
Priority: | medium | CC: | davidben |
Version: | unspecified | ||
Hardware: | x86 (IA32) | ||
OS: | All | ||
URL: | https://bugs.edge.launchpad.net/ubuntu/+source/poppler/+bug/430887 | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: | Do not restoreState on empty stack |
Description
Pedro Villavicencio
2009-10-16 12:13:42 UTC
Created attachment 31363 [details] [review] Do not restoreState on empty stack This PDF is horrendously malformed. As far as I can tell, several of the pages have command streams missing spaces and who knows what else. Even acroread will display an error. That said, acroread does not crash and does a much better job rendering it than we do. The confused command stream results in a restoreState without the right number of saveStates. The above patch causes that to result in a warning. We probably want to try being lenient about parsing the command stream to allow one to omit the spaces before/after commands? I suspect this is what acroread is doing. This may be tricky as some commands (d0 and d1) end in numbers. Yes that'd would be the end goal, we already do that this space thing when parsing obj number, see XRef.cc line 963 About this patch, i'd prefer to move it to the CairoOutputDev, the other *OutputDev already do this check, and in a regular project i'd suggest "upstreaming" the checks but here i prefer not to touch existing code to ease future merges with xpdf code (if he ever does a new release) so moving to CairoOutputDev is as good for fixing the bug and eases a bit the merging so if you agree Carlos or I can commit the fix there. Yes, go ahead. Pushed, should we open a different bug for the rendering thing? It might be worth checking the PDF more carefully if it's really just a spaces thing. I see a block 442.5542 466.4069 l 442.4207 466.2510 442.2421 466.0923 442.0182 465.9284 c 441.7940 46.63657 441.5888 465.6430 441.4008 465.5602 c 441.4008 466.2391 l 441.7194 466.4217 441.9995 466.6445 442.2436 466.9070 c 442.4862 467.1711 442.6580 46 h2.71 442.7566 46 h6777 c 443.1314 46 h6777 l h h 444.6473 466.9030 m which only makes sense as far as argument counts go if we just ignore the h2.71 and h6777 tokens. A first attempt at quickly implementing something like that didn't seem to make the PDF render right. Though, I could have decompressed it wrong --- what I did was rather hacky. Is there a tool somewhere that will decode all the streams of a PDF so that one may easily view the command streams in plain text? you can try podofobrowser, i'm working on a "uncompress" tool based on poppler since we have all the bits, just never found the time :D Now, I see the error message with any PDF document: "Error: restoreState on an empty state stack" so, I'm not sure the patch is right. It looks like that got introduced when porting the patch down to CairoOutputDev. I get this error on a random PDF with master, but not the original patch. I think it's because we call state->restore() before the output device sees anything. void Gfx::restoreState() { state = state->restore(); out->restoreState(state); } What about just something like this? - mask = ms->mask; - maskStack = ms->next; - delete ms; + if (ms) { + mask = ms->mask; + maskStack = ms->next; + delete ms; + } that's what we already do with other stacks. Fianlly fixed it in git master and poppler-0.12 branch. I'm... not convinced this actually works. I think the patch as it is still can cause a crash. Say we're at a point where the state stack is empty, but we have set a mask. So out->mask is not null. Then we call Gfx::restoreState. In the current code, we call CairoOutputDev::restoreState which reaches this code: MaskStack* ms = maskStack; if (mask) cairo_pattern_destroy(mask); if (ms) { mask = ms->mask; maskStack = ms->next; delete ms; } This frees the current mask, but fails to replace it with a different one because ms is NULL. Future uses of mask will then access freed memory and cause problems. We could instead write if (mask) { cairo_pattern_destroy(mask); mask = NULL; } which wouldn't crash, but then an erroneous restoreState would clear your mask --- something I'm not convinced is desirable. The main problem being that, if we do a Gfx::restoreState on an empty stack, it really should be a no-op[1]. However, without the original patch of checking in the generic layer, we still call an OutputDev::restoreState, which is then responsible for enforcing this. This is messy, and, more importantly, actually difficult to detect without extra state. In both the case of restoring to an empty stack and restoring to a 1-element stack, OutputDev::restoreState receives the same GfxState object and has no way of distinguishing between the two without maintaining its own state, which seems rather poor. In the case of CairoOutputDev, on any restoreState, we forcibly pop the mask, cairo context and cairo_shape context[2]. I think it's worth adding the check to the output-independent part even if future merges with xpdf may be somewhat of a nuisance. Especially if it turns out acroread does something fancy on empty restoreState[1], we'll need to add code there anyway. Also, it looks like the last xpdf release was almost 3 years ago. [1] Or maybe a restore to default state? Might be worth seeing exactly what acroread does here. [2] Actually, I think the logic here doesn't work if cairo_shape is introduced/removed on anything other than the bottom-most stack. I'll try to come up with a test case and file a bug if it indeed doesn't work. |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.