There seems the kind of the race condition issue when updating the fontconfig cache for directory. it is, somehow the directory cache is missing a directory 'a' when creating a directory, installing a font and generating a cache.
That seems most likely to happen when installing OS newly.
I'm not quite sure the details and root-cause but the scenario what I can imagine is:
1. installing 'a' font package
2. generating a cache for 'a'
3. installing 'b' font package at the same time
4. generating a cache for 'b'
5. somehow updating the parent directory cache from 'b' first
6. updating the parent directory cache from 'a'
If there are no 'b' directory when processing 'a', this might happens.
possible some idea to address this situation:
a) remember the time when starting to generate the cache and write it into the cache. so if it's newer than one has, they can just stop to proceed.
b) when failed to lock the cache, restart the process from reading the directory
any other idea?
Ah, the scenario is wrong. one of them must be something like the process updating the parent directory cache.
Created attachment 86722 [details]
modified fc-cache.c and fc-match.c to reproduce this issue
Created attachment 86822 [details]
To make sure the cache in FcConfig is up-to-dated.
more possible solution is, to watch the directory and when creating a directory, just add it into the cache?
I haven't got to fully study this issue. Do you want me to?
I have no strong idea how to fix this yet. so I should bring this up to discuss on the list perhaps.
Anyway, the attached test case can 100% reproduces this issue. additional changes on my local tree to test it:
@@ -16,7 +16,7 @@ TESTDATA=4x6.pcf 8x16.pcf out.expected fonts.conf.in
AM_CPPFLAGS = -I$(top_srcdir) -I$(top_builddir)
-check_PROGRAMS = test-migration
+check_PROGRAMS = test-migration test-bz69845
check_PROGRAMS += test-pthread
test_pthread_LDADD = $(top_builddir)/src/libfontconfig.la
@@ -27,6 +27,8 @@ endif
noinst_PROGRAMS = $(check_PROGRAMS)
test_migration_LDADD = $(top_builddir)/src/libfontconfig.la
+test_bz69845_CFLAGS = -DSRCDIR="\"$(abs_srcdir)\""
+test_bz69845_LDADD = $(top_builddir)/src/libfontconfig.la
and copy PCF file on test directory as bz69845.pcf
or I can just commit it into git for testing.
Worked around in git but it's not perfect. we may need to think about more robust solution for this.
About the issue being reported at the list, http://lists.freedesktop.org/archives/fontconfig/2013-November/004983.html
In current implementation, given that a directory contains sub-directories and font files, fontconfig doesn't support to update a cache for either direstories or files. it will be done in one API call. so it needs to be split up to allow updating either them at least.
furthermore we may need to take an action either:
a) support the deserialization of the cache and update the directory cache only
b) have the directory entry cache and the file entry cache separately
b) looks like simpler to me though, any comments and/or any other idea are welcome.
another proposal for fixing the performance issue:
Updating the directory cache only when rescanning
Why this change:
- memset (new->map, '\0', sizeof (new->map));
- memcpy (new->map, ls->map, FC_MIN (sizeof (new->map), ls->map_size * sizeof (ls->map)));
+ memcpy (new->map, ls->map, sizeof (ls->map));
(In reply to comment #10)
> Why this change:
> - memset (new->map, '\0', sizeof (new->map));
> - memcpy (new->map, ls->map, FC_MIN (sizeof (new->map), ls->map_size *
> sizeof (ls->map)));
> + memcpy (new->map, ls->map, sizeof (ls->map));
> It's wrong.
unrelated to this matter but isn't it a duplicate? is there any assumption that the size of new->map and ls->map is different? map_size is always set to NUM_LANG_SET_MAP and no reallocation on map too. hmm, architecture related thing? that said we have separate caches per architectures. there shouldn't be the case which we deal with different size of FcChar32.
That would be appreciated if you can explain a bit so that I may missing something else.
(In reply to comment #11)
> (In reply to comment #10)
> > Why this change:
> > - memset (new->map, '\0', sizeof (new->map));
> > - memcpy (new->map, ls->map, FC_MIN (sizeof (new->map), ls->map_size *
> > sizeof (ls->map)));
> > + memcpy (new->map, ls->map, sizeof (ls->map));
> > It's wrong.
> unrelated to this matter but isn't it a duplicate? is there any assumption
> that the size of new->map and ls->map is different? map_size is always set
> to NUM_LANG_SET_MAP and no reallocation on map too. hmm, architecture
> related thing? that said we have separate caches per architectures. there
> shouldn't be the case which we deal with different size of FcChar32.
The idea is that whenever we add new orth files, we don't have to bump up the cache version. So, the langset may be coming from a cache file generated by older fontconfig and hence having fewer entries. I added this after hours of very hard to debug issues when this actually happened a few years ago. I don't think it's a bad assumption. Bumping cache version every time we add a language sounds a bit excessive.
(In reply to comment #12)
> The idea is that whenever we add new orth files, we don't have to bump up
> the cache version. So, the langset may be coming from a cache file
> generated by older fontconfig and hence having fewer entries. I added this
> after hours of very hard to debug issues when this actually happened a few
> years ago. I don't think it's a bad assumption. Bumping cache version
> every time we add a language sounds a bit excessive.
Ah... I see. and just realized I did mistake on adding quz.orth in 2.10.94 :(
even though it wasn't a first time to add new orth. doh!
Anyway reverted that change. thanks
Rescanning directories only has been pushed to git. closing.
current approach seems not perfect. it looks like there are a pitfall fontconfig creates( or updates perhaps) a cache without containing certain sub-directories.
Created attachment 100378 [details] [review]
I'm not sure if the patch gets rid of the issue completely but it looks like improved after applying. increasing the amount of syscall and disk I/O definitely affects the performance though, I have no idea other than at this moment.
forgot to add some progress. still not perfect to avoid this issue.
I'm not quite sure what happened. need to investigate it hard.
Created attachment 116870 [details] [review]
need a review to be merged
going back here again. this issue seems still happens. I'm not quite sure how we can fix this completely. meanwhile I wrote a patch to detect the change of files in directory to update the cache.
To reduce the footprint to scan font files, added the number of files in the directory to the cache and compare if it is same. it can be obtained during updating cache and when it is changed, mtime is usually changed. so it shouldn't affect the performance no matter if the directory contains new fonts or updated.
Fixed with the latest commit in git for platforms that support st_mtim in struct stat. for platforms that doesn't recognize the changes less than 1 second, we might need to change the code to compare the mtime if the mtime of directories is newer than the cache rather than comparing it if it is same.
-- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/fontconfig/fontconfig/issues/88.