Created attachment 113383 [details] [review] patch to GooString::appendfv() In the test file in https://bugs.freedesktop.org/show_bug.cgi?id=89076 pdftops spends the most time in GooString::formatDouble() and GooString::appendfv(). GooString::appendfv() calls gmallocn() to allocate a small buffer of arguments to print each time that it is called. pdftops can call it millions of times on some PDFs. These patches change appendfv() to use a local buffer with 8 items (the same as it used to allocate). It now allocates a buffer only if it has more than 8 items. I tested the change by setting the initial size to 1 and running a file with valgrind to exercise cases that require multiple reallocations. I have profiles before and after showing that the number of calls to gmallocn() was reduced from almost 11M to 4.8M and the total time spent inside appendfv() and gmallocn() was reduced. pdftops before the patch (total from 3 runs of "pdftops 2-DESCR_648853-IT-EN-DE_MANITOU_RIGENERATO.pdf x.ps") % cumulative self self total time seconds seconds calls ms/call ms/call name 15.25 0.93 0.93 8905797 0.00 0.00 GooString::formatDouble(double, char*, int, int, bool, char**, int*) 9.34 1.50 0.57 6172503 0.00 0.00 GooString::appendfv(char const*, __va_list_tag*) 7.87 1.98 0.48 34515024 0.00 0.00 FlateStream::getHuffmanCodeWord(FlateHuffmanTab*) 7.05 2.41 0.43 22730064 0.00 0.00 FlateStream::readSome() 6.48 2.81 0.40 18514623 0.00 0.00 Lexer::getObj(Object*, int) 5.90 3.17 0.36 16607355 0.00 0.00 Parser::getObj(Object*, bool, unsigned char*, CryptAlgorithm, int, int, int, int, bool) 5.90 3.53 0.36 106901274 0.00 0.00 JBIG2Segment::~JBIG2Segment() 4.92 3.83 0.30 111594801 0.00 0.00 FlateStream::getChar() 4.34 4.09 0.27 38266749 0.00 0.00 Object::free() 3.69 4.32 0.23 87202734 0.00 0.00 Lexer::lookChar() 3.61 4.54 0.22 19199391 0.00 0.00 GooString::append(char const*, int) 2.79 4.71 0.17 438 0.39 12.47 Gfx::go(bool) 2.62 4.87 0.16 38745 0.00 0.00 FlateStream::compHuffmanCodes(int*, int, FlateHuffmanTab*) 1.39 4.95 0.09 10967463 0.00 0.00 gmallocn pdftops after GooString, appendfv reduced 0.57s to 0.50s, gmalloc reduced 0.09s to 0.02s, 13.07 0.75 0.75 8905797 0.00 0.00 GooString::formatDouble(double, char*, int, int, bool, char**, int*) 8.77 1.25 0.50 6172503 0.00 0.00 GooString::appendfv(char const*, __va_list_tag*) 8.60 1.74 0.49 22730064 0.00 0.00 FlateStream::readSome() 8.07 2.20 0.46 18514623 0.00 0.00 Lexer::getObj(Object*, int) 7.19 2.61 0.41 111594801 0.00 0.00 FlateStream::getChar() ... 0.35 5.14 0.02 4794960 0.00 0.00 gmallocn
Can you attach one file you used to test this ?
I put my copy of the file at http://williambader.com/2-DESCR_648853-IT-EN-DE_MANITOU_RIGENERATO.pdf pdftops has to write a lot of formatted strings, especially for PDFs that draw a lot of lines. I think that the calls to write formatted strings (which end up as calls to appendfv()) are necessary. My patch just saves appendfv() from making a malloc call for a small, fixed size chunk of memory each time that it is called. In the test file, all of the calls have 8 arguments or less, and appendfv() never needs to reallocate the argument buffer.
Ok, the code is not much harder to read so if noone disagrees i'll commit it next week.
(In reply to Albert Astals Cid from comment #3) > Ok, the code is not much harder to read so if noone disagrees i'll commit it > next week. LGTM
Pushed
Created attachment 113584 [details] [review] patch to the patch When I looked at the copy of the patch in the email, I noticed that the memcpy() length needs to be multiplied by sizeof(GooStringFormatArg)
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.