Summary: | Optimizations | ||
---|---|---|---|
Product: | fontconfig | Reporter: | Kevin Puetz <puetzk> |
Component: | library | Assignee: | Keith Packard <keithp> |
Status: | VERIFIED FIXED | QA Contact: | |
Severity: | enhancement | ||
Priority: | high | CC: | pzb, savitha.gatec, vswadeyar |
Version: | 2.1 | ||
Hardware: | x86 (IA32) | ||
OS: | Linux (All) | ||
Whiteboard: | |||
Bug Depends on: | |||
Bug Blocks: | 99934, 108283 | ||
Attachments: |
various speedups
fontconfig-opt-cache.patch fontconfig-opt-constants.patch fontconfig-opt-hashed-lang.patch fontconfig-opt-name-parse.patch fontconfig-opt-string.patch |
Description
Kevin Puetz
2003-01-09 11:22:48 UTC
Created attachment 1: various speedups

Looks like some useful improvements here; let's get them split into pieces so we can take them one at a time without breaking things.

Improved and split-out patches, and the timings which show the benefit of each. Attachments coming...

All times were taken using a modified fc-list which loads the cache 100 times instead of once, to reduce timing jitter. This probably understates the benefit of fontconfig-opt-cache.patch: the disk cache warms up quite well, after which there is less difference between requesting I/O from the kernel in blocks and as one long read.

plain:
./fc-list 3.86s user 0.12s system 99% cpu 3.999 total

fontconfig-opt-cache.patch:
./fc-list 3.67s user 0.15s system 99% cpu 3.826 total

fontconfig-opt-constants.patch: the tolower -> FcToLower changes are also correctness fixes; the overall speedup is quite minor, if visible at all.
./fc-list/fc-list 3.82s user 0.13s system 99% cpu 3.958 total

fontconfig-opt-hashed-lang.patch: uses a hash table to find start/end offsets. Very significant.
./fc-list 3.13s user 0.13s system 99% cpu 3.262 total

fontconfig-opt-name-parse.patch: bitmask test for parsing out delimiters.
./fc-list/fc-list 3.50s user 0.12s system 99% cpu 3.623 total

fontconfig-opt-string.patch: shortcut in the case-insensitive comparison. Check for strict equality first, then, if that fails, compare insensitively. Since we are usually seeing all lowercase, this is worth quite a bit.
./fc-list/fc-list 3.79s user 0.14s system 99% cpu 3.939 total

with all:
./fc-list/fc-list 2.63s user 0.15s system 99% cpu 2.792 total

Some oprofile data for a KDE HEAD login sequence. Columns are #samples, %total time, %app time, library.

std:
<snip non-kdeinit stuff - some of which does call fontconfig, but...>
55806  59.9336   0.0000  /usr/local/src/kde/parts/HEAD/.kdelibs/usr/local/kdeHEAD/bin/kdeinit
<snip other libs>
 1633   1.7538   2.9262  /lib/libpthread-0.10.so
 1659   1.7817   2.9728  /usr/lib/libstdc++.so.5.0.2
 5099   5.4761   9.1370  /usr/local/src/kde/parts/HEAD/.kdelibs/usr/local/kdeHEAD/lib/libkdecore.so.4.2.0
 7392   7.9387  13.2459  /usr/local/src/kde/parts/misc/.qt-copy/usr/local/kde3.1/lib/libqt-mt.so.3.1.1
 8355   8.9730  14.9715  /lib/libc-2.3.1.so
15122  16.2405  27.0974  /usr/lib/libfontconfig.so.1.0
15978  17.1598  28.6313  /lib/ld-2.3.1.so

with all patches:
<snip non-kdeinit stuff - some of which does call fontconfig, but...>
43772  56.0633   0.0000  /usr/local/src/kde/parts/HEAD/.kdelibs/usr/local/kdeHEAD/bin/kdeinit
<snip other libs>
 1569   2.0096   3.5845  /usr/lib/libstdc++.so.5.0.2
 2952   3.7809   6.7440  /lib/libpthread-0.10.so
 4443   5.6906  10.1503  /usr/local/src/kde/parts/HEAD/.kdelibs/usr/local/kdeHEAD/lib/libkdecore.so.4.2.0
 5211   6.6743  11.9049  /lib/libc-2.3.1.so
 5958   7.6310  13.6114  /usr/lib/libfontconfig.so.1.0
 6520   8.3508  14.8954  /usr/local/src/kde/parts/misc/.qt-copy/usr/local/kde3.1/lib/libqt-mt.so.3.1.1
15595  19.9741  35.6278  /lib/ld-2.3.1.so

If you want to see the full oprofile data, I can attach the logs.
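For readers who have not opened fontconfig-opt-string.patch, the shortcut it describes could look roughly like the sketch below. This is not the patch itself: `ascii_lower` and `cmp_ignore_case` are hypothetical names, the fast path is applied per character (the patch may apply it differently), and only ASCII case folding is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: ASCII-only lowercasing that does not depend on the
 * locale, unlike tolower(). Not fontconfig's FcToLower. */
static unsigned char ascii_lower(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c - 'A' + 'a') : c;
}

/* Compare two strings ignoring ASCII case; returns <0, 0 or >0 like strcmp.
 * Bytes are compared exactly first, and case folding is only done when the
 * exact comparison fails. */
int cmp_ignore_case(const char *s1, const char *s2)
{
    assert(s1 != NULL && s2 != NULL);
    for (;;) {
        unsigned char c1 = (unsigned char)*s1++;
        unsigned char c2 = (unsigned char)*s2++;
        if (c1 != c2) {
            /* Slow path: the bytes differ, so fold case and retry. */
            c1 = ascii_lower(c1);
            c2 = ascii_lower(c2);
            if (c1 != c2)
                return (int)c1 - (int)c2;
        }
        if (c1 == 0)
            return 0; /* both strings ended together */
    }
}
```

When both inputs are already lowercase, which the comment above says is the usual case, the loop never calls `ascii_lower` on matching bytes.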
I don't think

Created attachment 3: fontconfig-opt-cache.patch
Created attachment 4: fontconfig-opt-constants.patch
Created attachment 5: fontconfig-opt-hashed-lang.patch
Created attachment 6: fontconfig-opt-name-parse.patch
Created attachment 7: fontconfig-opt-string.patch

Any chance these will get in before 2.2 is released?

I'd rather get 2.2 out the door as the bits released with XFree86 are quite buggy. I'd also like to see if we can't squeeze significantly more than a factor of two out of the speedups; perhaps something like mmap'ing the cache files would help here, and then parsing the names in place instead of copying strings around. That's a lot more ambitious, but I think it'll reduce cache thrashing.

I've included the short-circuited language lookups, a couple of changes of tolower to FcToLower and the string compare optimizations (fixed). I'm discarding the remaining patches as I suggest we look at reformatting the cache files to make loading them more efficient.

*** Bug 82557 has been marked as a duplicate of this bug. ***
*** Bug 110491 has been marked as a duplicate of this bug. ***
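The mmap suggestion in the discussion above (map the cache file, then parse names in place rather than copying strings) might be sketched as follows. Everything here is an assumption for illustration: the `slice` type, `map_file` and `next_record` are hypothetical names, the record format is taken to be simple newline-delimited text rather than fontconfig's real cache layout, and the code relies on POSIX `open`/`fstat`/`mmap`.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical view into a buffer: a pointer plus a length, no copy made. */
typedef struct {
    const char *ptr;
    size_t len;
} slice;

/* Map a whole file read-only. Returns NULL on failure or for an empty file.
 * The caller would release the mapping with munmap(). */
const char *map_file(const char *path, size_t *len_out)
{
    struct stat st;
    void *p;
    int fd;

    assert(path != NULL && len_out != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }
    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping remains valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;
    *len_out = (size_t)st.st_size;
    return (const char *)p;
}

/* Split the next newline-delimited record off the front of *rest.
 * Returns 1 and fills *rec, or 0 when nothing is left. The record points
 * into the original buffer, so no string is copied. */
int next_record(slice *rest, slice *rec)
{
    const char *nl;
    size_t consumed;

    assert(rest != NULL && rec != NULL);
    if (rest->len == 0)
        return 0;
    nl = memchr(rest->ptr, '\n', rest->len);
    rec->ptr = rest->ptr;
    rec->len = nl ? (size_t)(nl - rest->ptr) : rest->len;
    consumed = nl ? rec->len + 1 : rec->len; /* also skip the newline */
    rest->ptr += consumed;
    rest->len -= consumed;
    return 1;
}
```

A caller would pass the pointer and length from `map_file` to `next_record` in a loop. Note that records are not NUL-terminated, so any code consuming them has to work with explicit lengths.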