Bug 23074 - buffer overflow in the Abiword backend
Summary: buffer overflow in the Abiword backend
Status: RESOLVED WONTFIX
Alias: None
Product: poppler
Classification: Unclassified
Component: general (show other bugs)
Version: unspecified
Hardware: x86 (IA32) Linux (All)
: medium normal
Assignee: poppler-bugs
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-08-01 06:27 UTC by Jakub Wilk
Modified: 2011-03-22 15:50 UTC (History)
5 users (show)

See Also:
i915 platform:
i915 features:


Attachments
Fix described issue, and other similar ones (5.09 KB, patch)
2009-10-21 08:15 UTC, Vincent Untz
Details | Splinter Review

Description Jakub Wilk 2009-08-01 06:27:59 UTC
(Tested with poppler 0.10.6)

The Abiword backend is susceptible to buffer overflows.

Proof of concept:

$ gdb pdftoabw
[snip]
(gdb) break ABWOutputDev::endWord()
Function "ABWOutputDev::endWord()" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (ABWOutputDev::endWord()) pending.
(gdb) run test.pdf /tmp/tmp
[snip]

Breakpoint 1, ABWOutputDev::endWord (this=0x97a12d0) at ABWOutputDev.cc:424
424       if (N_word) {
(gdb) cont
Continuing.

Breakpoint 1, ABWOutputDev::endWord (this=0x97a12d0) at ABWOutputDev.cc:424
424       if (N_word) {
(gdb) n
425         sprintf(buf, "%f", X2);    xmlNewProp(N_word, BAD_CAST "X2", BAD_CAST buf);
(gdb) n
426         sprintf(buf, "%f", Y2);    xmlNewProp(N_word, BAD_CAST "Y2", BAD_CAST buf);
(gdb) print sizeof buf
$1 = 20
(gdb) print strlen(buf)
$2 = 20
Comment 1 Albert Astals Cid 2009-08-01 06:31:47 UTC
Jauco could you have a look?
Comment 2 Jauco Noordzij 2009-08-20 10:17:26 UTC
I'd like to, but I don't know that much about buffer overflows. Which part of the gdb dump shows that a buffer overflow exists? And do you know of any good documentation on this matter. (I wanted to learn more about it anyway, so this is a nice opportunity :-)
Comment 3 Jakub Wilk 2009-08-23 06:48:34 UTC
String length is 20, which means that it occupies 21 bytes, but the buffer is 20 bytes only.

To format a double with "%f" you need:
- DBL_MAX_10_EXP + 1 bytes for an integer part;
- 1 byte for the decimal separator;
- 6 bytes for a fractional part;
- 1 byte for the null character.
Comment 4 Albert Astals Cid 2009-08-31 15:03:51 UTC
@Jauco: Did you understand the problem? Need more help?
Comment 5 Jauco Noordzij 2009-09-01 02:02:19 UTC
Yes I think I understand it. I'll try and commit a fix this weekend. 
Comment 6 Jakub Wilk 2009-10-10 10:49:30 UTC
I forgot: 1 additional byte for the minus sign.
Comment 7 Vincent Untz 2009-10-21 08:15:18 UTC
Created attachment 30599 [details] [review]
Fix described issue, and other similar ones

It turns out it's not the only place where we can have an overflow.

The attached patch tries to fix all this, but I'm not really sure how to test things to validate it...
Comment 8 Albert Astals Cid 2009-10-21 15:41:46 UTC
There is a 160 buffer moved to much less, is that ok? Also i have no idea how to test, maybe we should just move to using GooString for formatting and that way be sure there is no possible overflow at all?
Comment 9 Jakub Wilk 2009-10-22 00:54:20 UTC
Just nitpicking: theoretically int can be bigger than 32-bits. I'd define BUFLEN_FOR_INT as 2 + sizeof (int) * CHAR_BIT / 3.
Comment 10 BinLi 2009-10-26 03:36:48 UTC
(In reply to comment #9)
> Just nitpicking: theoretically int can be bigger than 32-bits. I'd define
> BUFLEN_FOR_INT as 2 + sizeof (int) * CHAR_BIT / 3.
> 
Jakub,

 Why "sizeof (int) * CHAR_BIT / 3 " could be for the length of INT_MAX ?

 Thanks!
Comment 11 Jakub Wilk 2009-10-26 05:11:44 UTC
Well, `sizeof (int) * CHAR_BIT / 3` is enough essentially because:
log(2**N) = N ** log(2) < N/3.

I am too lazy to write a formal proof. ;)
Comment 12 Jakub Wilk 2009-10-26 05:12:12 UTC
Erm,
log(2**N) = N * log(2) < N/3.
Comment 13 Michael Gilbert 2009-11-11 21:06:59 UTC
fyi, log(2.0) is not less than one third.  log(2.0) ~= 0.69 and 1.0/3.0 ~= 0.33.
Comment 14 Michael Gilbert 2009-11-11 21:10:11 UTC
or did you mean log10(2.0)?  log10(2.0) ~= 0.30.  where does the log(2**N) come from in the first place?
Comment 15 Jakub Wilk 2009-11-12 00:02:19 UTC
Yes, I meant log10. (log is an alias for log10 in my part of the world, sorry.)
Comment 16 Marc Deslauriers 2010-03-24 10:08:15 UTC
What's the status on this patch?

This issue was assigned CVE-2009-3938.

Thanks!
Comment 17 Albert Astals Cid 2010-03-24 13:21:18 UTC
Well, it seems that Vincent wasn't really sure the patch was correct and noone is really really interested in fixing the code so it has not been commited.

So well, if you guys want i can commit it, i really do not have an opinion, not sure if anyone really uses that code.
Comment 18 Albert Astals Cid 2011-03-22 15:50:25 UTC
pdftoabw was just removed from poppler as it was unmaintained so this won't be fixed. Sorry. If you were using it, this is the moment to step up and be its maintainer.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.