Bug 69845 - race condition issue?
Summary: race condition issue?
Status: RESOLVED MOVED
Alias: None
Product: fontconfig
Classification: Unclassified
Component: fc-cache (show other bugs)
Version: unspecified
Hardware: Other All
: medium normal
Assignee: Akira TAGOH
QA Contact: Behdad Esfahbod
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-09-26 11:11 UTC by Akira TAGOH
Modified: 2018-08-20 21:51 UTC (History)
3 users (show)

See Also:
i915 platform:
i915 features:


Attachments
test case (10.26 KB, text/plain)
2013-09-27 13:21 UTC, Akira TAGOH
Details
test case (10.56 KB, text/plain)
2013-09-30 02:49 UTC, Akira TAGOH
Details
patch (6.01 KB, patch)
2014-06-04 02:44 UTC, Akira TAGOH
Details | Splinter Review
need a review to be merged (6.09 KB, patch)
2015-07-02 09:00 UTC, Akira TAGOH
Details | Splinter Review

Note You need to log in before you can comment on or make changes to this bug.
Description Akira TAGOH 2013-09-26 11:11:19 UTC
There seems the kind of the race condition issue when updating the fontconfig cache for directory. it is, somehow the directory cache is missing a directory 'a'  when creating a directory, installing a font and generating a cache.
That seems most likely to happen when installing OS newly.

I'm not quite sure the details and root-cause but the scenario what I can imagine is:

1. installing 'a' font package
2. generating a cache for 'a'
3. installing 'b' font package at the same time
4. generating a cache for 'b'
5. somehow updating the parent directory cache from 'b' first
6. updating the parent directory cache from 'a'

If there are no 'b' directory when processing 'a', this might happens.

possible some idea to address this situation:
a) remember the time when starting to generate the cache and write it into the cache. so if it's newer than one has, they can just stop to proceed.

b) when failed to lock the cache, restart the process from reading the directory

any other idea?
Comment 1 Akira TAGOH 2013-09-27 10:34:26 UTC
Ah, the scenario is wrong. one of them must be something like the process updating the parent directory cache.
Comment 2 Akira TAGOH 2013-09-27 13:21:51 UTC
Created attachment 86722 [details]
test case

modified fc-cache.c and fc-match.c to reproduce this issue
Comment 3 Akira TAGOH 2013-09-30 02:49:28 UTC
Created attachment 86822 [details]
test case

To make sure the cache in FcConfig is up-to-dated.
Comment 4 Akira TAGOH 2013-09-30 03:19:04 UTC
more possible solution is, to watch the directory and when creating a directory, just add it into the cache?
Comment 5 Behdad Esfahbod 2013-09-30 03:23:48 UTC
I haven't got to fully study this issue.  Do you want me to?
Comment 6 Akira TAGOH 2013-09-30 03:32:43 UTC
I have no strong idea how to fix this yet. so I should bring this up to discuss on the list perhaps.
Anyway, the attached test case can 100% reproduces this issue. additional changes on my local tree to test it:

--- a/test/Makefile.am
+++ b/test/Makefile.am
@@ -16,7 +16,7 @@ TESTDATA=4x6.pcf 8x16.pcf out.expected fonts.conf.in
 
 AM_CPPFLAGS = -I$(top_srcdir) -I$(top_builddir)
 
-check_PROGRAMS = test-migration
+check_PROGRAMS = test-migration test-bz69845
 if HAVE_PTHREAD
 check_PROGRAMS += test-pthread
 test_pthread_LDADD = $(top_builddir)/src/libfontconfig.la
@@ -27,6 +27,8 @@ endif
 noinst_PROGRAMS = $(check_PROGRAMS)
 
 test_migration_LDADD = $(top_builddir)/src/libfontconfig.la
+test_bz69845_CFLAGS = -DSRCDIR="\"$(abs_srcdir)\""
+test_bz69845_LDADD = $(top_builddir)/src/libfontconfig.la
 
 EXTRA_DIST=$(check_SCRIPTS) $(TESTDATA)
 
and copy PCF file on test directory as bz69845.pcf

or I can just commit it into git for testing.
Comment 7 Akira TAGOH 2013-10-02 10:41:05 UTC
Worked around in git but it's not perfect. we may need to think about more robust solution for this.
Comment 8 Akira TAGOH 2013-11-14 09:35:09 UTC
About the issue being reported at the list, http://lists.freedesktop.org/archives/fontconfig/2013-November/004983.html

In current implementation, given that a directory contains sub-directories and font files, fontconfig doesn't support to update a cache for either direstories or files. it will be done in one API call. so it needs to be split up to allow updating either them at least.

furthermore we may need to take an action either:

a) support the deserialization of the cache and update the directory cache only

b) have the directory entry cache and the file entry cache separately


b) looks like simpler to me though, any comments and/or any other idea are welcome.
Comment 9 Akira TAGOH 2013-12-05 10:18:58 UTC
another proposal for fixing the performance issue:
http://cgit.freedesktop.org/~tagoh/fontconfig/commit/?h=bz69845

Updating the directory cache only when rescanning
Comment 10 Behdad Esfahbod 2013-12-05 19:48:50 UTC
Why this change:

- memset (new->map, '\0', sizeof (new->map));
- memcpy (new->map, ls->map, FC_MIN (sizeof (new->map), ls->map_size * sizeof (ls->map[0])));
+ memcpy (new->map, ls->map, sizeof (ls->map));

It's wrong.
Comment 11 Akira TAGOH 2013-12-06 03:21:25 UTC
(In reply to comment #10)
> Why this change:
> 
> - memset (new->map, '\0', sizeof (new->map));
> - memcpy (new->map, ls->map, FC_MIN (sizeof (new->map), ls->map_size *
> sizeof (ls->map[0])));
> + memcpy (new->map, ls->map, sizeof (ls->map));
> 
> It's wrong.

unrelated to this matter but isn't it a duplicate? is there any assumption that the size of new->map and ls->map is different? map_size is always set to NUM_LANG_SET_MAP and no reallocation on map too. hmm, architecture related thing? that said we have separate caches per architectures. there shouldn't be the case which we deal with different size of FcChar32.

That would be appreciated if you can explain a bit so that I may missing something else.
Comment 12 Behdad Esfahbod 2013-12-06 04:13:22 UTC
(In reply to comment #11)
> (In reply to comment #10)
> > Why this change:
> > 
> > - memset (new->map, '\0', sizeof (new->map));
> > - memcpy (new->map, ls->map, FC_MIN (sizeof (new->map), ls->map_size *
> > sizeof (ls->map[0])));
> > + memcpy (new->map, ls->map, sizeof (ls->map));
> > 
> > It's wrong.
> 
> unrelated to this matter but isn't it a duplicate? is there any assumption
> that the size of new->map and ls->map is different? map_size is always set
> to NUM_LANG_SET_MAP and no reallocation on map too. hmm, architecture
> related thing? that said we have separate caches per architectures. there
> shouldn't be the case which we deal with different size of FcChar32.

The idea is that whenever we add new orth files, we don't have to bump up the cache version.  So, the langset may be coming from a cache file generated by older fontconfig and hence having fewer entries.  I added this after hours of very hard to debug issues when this actually happened a few years ago.  I don't think it's a bad assumption.  Bumping cache version every time we add a language sounds a bit excessive.
Comment 13 Akira TAGOH 2013-12-06 04:31:55 UTC
(In reply to comment #12)
> The idea is that whenever we add new orth files, we don't have to bump up
> the cache version.  So, the langset may be coming from a cache file
> generated by older fontconfig and hence having fewer entries.  I added this
> after hours of very hard to debug issues when this actually happened a few
> years ago.  I don't think it's a bad assumption.  Bumping cache version
> every time we add a language sounds a bit excessive.

Ah... I see. and just realized I did mistake on adding quz.orth in 2.10.94 :(
even though it wasn't a first time to add new orth. doh!
Comment 14 Akira TAGOH 2013-12-06 04:35:10 UTC
Anyway reverted that change. thanks
Comment 15 Akira TAGOH 2014-02-07 08:30:56 UTC
Rescanning directories only has been pushed to git. closing.
Comment 16 Akira TAGOH 2014-06-04 02:33:37 UTC
current approach seems not perfect. it looks like there are a pitfall fontconfig creates( or updates perhaps) a cache without containing certain sub-directories.

see https://bugzilla.redhat.com/show_bug.cgi?id=921706
Comment 17 Akira TAGOH 2014-06-04 02:44:11 UTC
Created attachment 100378 [details] [review]
patch

I'm not sure if the patch gets rid of the issue completely but it looks like improved after applying. increasing the amount of syscall and disk I/O definitely affects the performance though, I have no idea other than at this moment.
Comment 18 Akira TAGOH 2015-02-09 05:17:20 UTC
forgot to add some progress. still not perfect to avoid this issue.

https://bugzilla.redhat.com/show_bug.cgi?id=1169979

I'm not quite sure what happened. need to investigate it hard.
Comment 19 Akira TAGOH 2015-07-02 09:00:15 UTC
Created attachment 116870 [details] [review]
need a review to be merged

going back here again. this issue seems still happens. I'm not quite sure how we can fix this completely. meanwhile I wrote a patch to detect the change of files in directory to update the cache.
To reduce the footprint to scan font files, added the number of files in the directory to the cache and compare if it is same. it can be obtained during updating cache and when it is changed, mtime is usually changed. so it shouldn't affect the performance no matter if the directory contains new fonts or updated.
Comment 20 Akira TAGOH 2015-08-17 02:42:18 UTC
Fixed with the latest commit in git for platforms that support st_mtim in struct stat. for platforms that doesn't recognize the changes less than 1 second, we might need to change the code to compare the mtime if the mtime of directories is newer than the cache rather than comparing it if it is same.
Comment 21 GitLab Migration User 2018-08-20 21:51:30 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/fontconfig/fontconfig/issues/88.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.