(Tested with poppler 0.10.6) The Abiword backend is susceptible to buffer overflows. Proof of concept: $ gdb pdftoabw [snip] (gdb) break ABWOutputDev::endWord() Function "ABWOutputDev::endWord()" not defined. Make breakpoint pending on future shared library load? (y or [n]) y Breakpoint 1 (ABWOutputDev::endWord()) pending. (gdb) run test.pdf /tmp/tmp [snip] Breakpoint 1, ABWOutputDev::endWord (this=0x97a12d0) at ABWOutputDev.cc:424 424 if (N_word) { (gdb) cont Continuing. Breakpoint 1, ABWOutputDev::endWord (this=0x97a12d0) at ABWOutputDev.cc:424 424 if (N_word) { (gdb) n 425 sprintf(buf, "%f", X2); xmlNewProp(N_word, BAD_CAST "X2", BAD_CAST buf); (gdb) n 426 sprintf(buf, "%f", Y2); xmlNewProp(N_word, BAD_CAST "Y2", BAD_CAST buf); (gdb) print sizeof buf $1 = 20 (gdb) print strlen(buf) $2 = 20
Jauco could you have a look?
I'd like to, but I don't know that much about buffer overflows. Which part of the gdb dump shows that a buffer overflow exists? And do you know of any good documentation on this matter. (I wanted to learn more about it anyway, so this is a nice opportunity :-)
String length is 20, which means that it occupies 21 bytes, but the buffer is 20 bytes only. To format a double with "%f" you need: - DBL_MAX_10_EXP + 1 bytes for an integer part; - 1 byte for the decimal separator; - 6 bytes for a fractional part; - 1 byte for the null character.
@Jauco: Did you understand the problem? Need more help?
Yes I think I understand it. I'll try and commit a fix this weekend.
I forgot: 1 additional byte for the minus sign.
Created attachment 30599 [details] [review] Fix described issue, and other similar ones It turns out it's not the only place where we can have an overflow. The attached patch tries to fix all this, but I'm not really sure how to test things to validate it...
There is a 160 buffer moved to much less, is that ok? Also i have no idea how to test, maybe we should just move to using GooString for formatting and that way be sure there is no possible overflow at all?
Just nitpicking: theoretically int can be bigger than 32-bits. I'd define BUFLEN_FOR_INT as 2 + sizeof (int) * CHAR_BIT / 3.
(In reply to comment #9) > Just nitpicking: theoretically int can be bigger than 32-bits. I'd define > BUFLEN_FOR_INT as 2 + sizeof (int) * CHAR_BIT / 3. > Jakub, Why "sizeof (int) * CHAR_BIT / 3 " could be for the length of INT_MAX ? Thanks!
Well, `sizeof (int) * CHAR_BIT / 3` is enough essentially because: log(2**N) = N ** log(2) < N/3. I am too lazy to write a formal proof. ;)
Erm, log(2**N) = N * log(2) < N/3.
fyi, log(2.0) is not less than one third. log(2.0) ~= 0.69 and 1.0/3.0 ~= 0.33.
or did you mean log10(2.0)? log10(2.0) ~= 0.30. where does the log(2**N) come from in the first place?
Yes, I meant log10. (log is an alias for log10 in my part of the world, sorry.)
What's the status on this patch? This issue was assigned CVE-2009-3938. Thanks!
Well, it seems that Vincent wasn't really sure the patch was correct and noone is really really interested in fixing the code so it has not been commited. So well, if you guys want i can commit it, i really do not have an opinion, not sure if anyone really uses that code.
pdftoabw was just removed from poppler as it was unmaintained so this won't be fixed. Sorry. If you were using it, this is the moment to step up and be its maintainer.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.