Steps to reproduce: * Call poppler_document_new_from_data with data = (char*) 0x1 and len = 0 (0x1 is the dummy pointer that Rust uses when not allocating heap memory in a zero-size vector of bytes Vec<u8>.) Expected result: * A null pointer is returned and an error is set Actual result: * Segfault because this method of MemStream: int getChar() override { return (bufPtr < bufEnd) ? (*bufPtr++ & 0xff) : EOF; } … dereferences a null pointer. It was entered with bufEnd == 0x1 (as expected) and bufPtr == 0x0 (incorrect, should be 0x1). It looks like this MemStream comes from str->makeSubStream() being called with an incorrect start position of -1 in XRef::readXRef(). This value came from getStartXRef() which apparently uses it to signal that "startxref" wasn’t found. But the caller PDFDoc::setup() doesn’t check for that error and uses the value as a position for the XRef constructor.
Don't do that? I mean what's the reason you're passing a zero bytes buffer? Anyhow actual compilable code to reproduce the problem would be welcome
Is “don’t do that” really the best answer to an out-of-bounds memory access bug in a library routinely used with untrusted input from the Internet? The reason I stumbled upon this bug is that I’d like to use Poppler to test a library that generates PDF files. I started writing a test harness before I wrote any library code, so the first input I tried with my Rust bindings to Poppler was the empty byte vector. Steps to reproduce: compile with gcc $(pkg-config --cflags --libs poppler-glib) #include "poppler.h" void main() { poppler_document_new_from_data((char*) 1, 0, NULL, NULL); }
(In reply to simon-freedesktop from comment #2) > Is “don’t do that” really the best answer to an out-of-bounds memory access > bug in a library routinely used with untrusted input from the Internet? Yeah, seriously, it's in your realm of duty to give proper data to the libraries you use. > > The reason I stumbled upon this bug is that I’d like to use Poppler to test > a library that generates PDF files. I started writing a test harness before > I wrote any library code, so the first input I tried with my Rust bindings > to Poppler was the empty byte vector. > > Steps to reproduce: compile with gcc $(pkg-config --cflags --libs > poppler-glib) > > #include "poppler.h" > > void main() { > poppler_document_new_from_data((char*) 1, 0, NULL, NULL); > } It works fine if you pass a null pointer in the first parameter, i'll let the glib frontend maintainers decide if they actually care about your use case or not
There is no use case. Passing 0x1 only makes this bug visible with a segfault but that’s not the point. NULL probably triggers an explicit check early. Some other pointers might be preceded by memory that happens to be valid and so reading there silently "works", but using a -1 error code as an index still causes an out-of-bounds memory access. I’m not good at creating exploits, but I believe this is undefined behavior that could potentially lead to a vulnerability. This bug is not in the glib frontend. It’s in poppler/PDFDoc.cc that the return value of getStartXRef() is used without checking for errors. In addition to checking there, another good change might be to assert in makeSubStream that the 'start' parameter is not negative.
> I’m not good at creating exploits, but I believe this is undefined behavior that could potentially lead to a vulnerability. I have real bugs to fix here, unless you can prove this crash happens when using the code correctly i'm going to ignore your belief
Created attachment 136622 [details] poppler_oob.c Here's a proof-of-concept. Compile with "clang -o poppler_oob poppler_oob.c `pkg-config poppler-glib --cflags --libs`" or similar.
I don't think that code makes sense whatsoever, but anyway i fixed the crash.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.