Bug 97546

Summary: fc-cache failure with /System/Library/Fonts
Product: fontconfig
Component: fc-cache
Version: 2.12
Hardware: x86-64 (AMD64)
OS: Mac OS X (All)
Status: RESOLVED FIXED
Severity: major
Priority: medium
Reporter: Ed Porras <dev>
Assignee: fontconfig-bugs
QA Contact: Behdad Esfahbod <freedesktop>
CC: akira, dickie, japuzzo, qliu, sci-fi, silas-freedesktop, xfacter, zmwangx
Attachments:
  log output and cache files
  Cleaned up traces with additional font traces in the loop that causes FcCacheOffsetsValid() to fail
  proposed
  I am still seeing "failed to write cache"
  rc2 failure log
  attachment-9604-0.html

Description Ed Porras 2016-08-31 08:25:09 UTC
Mac folks are seeing major performance issues with fontconfig 2.12.1 and this post to the fontconfig list by Zhiming Wang has some details about the issue:

  https://lists.freedesktop.org/archives/fontconfig/2016-August/005811.html

In short, fc-cache fails to cache fonts in /System/Library/Fonts and, from then on, every application that needs to read the cache triggers a cache build (which fails again). 

I first reported this to the homebrew folks. You can see how this is impacting performance from sample output here:

  https://github.com/Homebrew/homebrew-core/issues/4172

Other posts I've found:

  https://lists.freedesktop.org/archives/fontconfig/2016-August/005810.html
  https://trac.macports.org/ticket/52088

(FYI: "2.12" is not available in the Version field of this bug form).

Let me know if you need anything to help debug this.

Thank you.
Comment 1 Zhiming Wang 2016-08-31 08:38:45 UTC
Two extra pieces of crucial information from the aforementioned bug
report on the mailing list [1]:

1. The offending commit is 7a4a5bd [2] "Properly validate offsets in
   cache files."

2. A previously valid cache of /System/Library/Fonts (created by fc
   2.12.0) can be downloaded from [3].

[1] https://lists.freedesktop.org/archives/fontconfig/2016-August/005811.html
[2] https://cgit.freedesktop.org/fontconfig/commit/?id=7a4a5bd7897d216f0794ca9dbce0a4a5c9d14940
[3] https://dl.bintray.com/zmwangx/generic/fontconfig-2.12.0-cache.tar.gz
Comment 2 Akira TAGOH 2016-08-31 09:22:18 UTC
I need help investigating, in a debugger, which check is failing in FcCacheOffsetsValid() or FcDirCacheMapFd().
Comment 3 Ed Porras 2016-09-02 13:04:15 UTC
Apologies for the lack of response. I'm traveling this week and got stuck overnight in Atlanta due to the storm. I'll try running with a debug version asap.
Comment 4 Ed Porras 2016-09-08 03:01:53 UTC
Created attachment 126290 [details]
log output and cache files
Comment 5 Ed Porras 2016-09-08 03:09:08 UTC
Ugh.. I had some comments typed out but they were lost when I attached the file. 

Anyway, the issue is happening in FcCacheOffsetsValid(). It returns false towards the end because the final loop results in j < 0.
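
The failure mode described here can be illustrated with a minimal sketch. This is not fontconfig's code: the report does not show the loop itself, so the node layout and the names `toy_node` and `toy_list_valid` are hypothetical. The sketch only shows how a bounded walk over offsets stored in a flat buffer can finish with j < 0 and so reject data that is otherwise well-formed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical node stored in a flat cache buffer. `next` is a byte
 * offset from the start of the buffer; 0 marks the end of the list.
 * This is an assumed layout for illustration, not fontconfig's. */
typedef struct {
    uint32_t next;
    uint32_t value;
} toy_node;

/* Walk the list that starts at byte offset `first`, allowing at most
 * `budget` nodes. Returns 1 when every node lies inside the buffer and
 * the list ends within the budget, 0 otherwise. */
int toy_list_valid(const unsigned char *buf, size_t size,
                   uint32_t first, int budget)
{
    uint32_t off = first;
    int j;

    for (j = budget; j >= 0 && off != 0; j--) {
        toy_node node;

        /* Reject an offset whose node would extend past the buffer. */
        if (off > size || size - off < sizeof node)
            return 0;

        /* Copy out the node so that unaligned offsets are harmless. */
        memcpy(&node, buf + off, sizeof node);
        off = node.next;
    }

    /* j < 0 means the list held more nodes than the budget allowed. */
    return j >= 0;
}
```

In this sketch a list of three in-bounds nodes passes with a budget of 3 and fails with a budget of 2, even though every offset is valid. A validator of this shape is only as good as the bound it is given.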

I've included some output setting FC_DEBUG to 100 and 1024 plus some custom traces I've added myself (with 3 comments indicating where execution hangs). It took me a while to get things set up so I could debug this but now I'm ready if you'd like me to try any steps to help you track this down.

Note that there also seems to be an issue with fc-cache as it tries to open a cache that does not exist (because it found it to be invalid in the previous run and, therefore, deleted it). I disabled the clean up code so I could grab the file and I have included it in the archive (along with the other contents of the cache directory).

If you'd like to try a Skype session or something of the sort, I'm in EST for the next 10 days and will be in CET after that.
Comment 6 Ed Porras 2016-09-08 15:29:01 UTC
Created attachment 126308 [details]
Cleaned up traces with additional font traces in the loop that causes FcCacheOffsetsValid() to fail

First off, sorry about last night's messy post. I've been quite busy these past two weeks between work and travel and I wanted to get a response out to help but losing all the text I typed in when I tried attaching a file just killed me.

Anyway, I've cleaned up the traces a bit in the attached output to be more clear as to what is causing errors. I'm having to do this from my admin account as I ran into problems from my regular user since the install path is on the system. As a result, I'm not able to debug with gdb so I've been stuck with lldb.

I added the traces to start getting some understanding of the flow. The thing that threw me off at first is why fc-cache is trying to open a non-existent cache. In the attached output you'll see I delete the cache from /usr/local/var/cache/fontconfig (/usr/local/var was passed as the localstatedir option to configure since it's where Homebrew puts it) and, upon running fc-cache again, it tries to open one of the files that was there. Where is it getting that from? I can't find this file anywhere else and I've checked the target directory (/usr/local/Cellar/fontconfig/2.12.1-ep) and everywhere I can think of in /usr/local.

Any ideas?
Comment 7 Akira TAGOH 2016-09-12 01:57:18 UTC
Created attachment 126464 [details] [review]
proposed

Thank you for the detailed report. Does the attached patch work?
Comment 8 Ed Porras 2016-09-15 00:11:27 UTC
Hi Akira,

Looks like it works. Ran diff-pdf and pdftoedn and both ran as expected. 

Do I mark this "resolved" or do you take care of that?

Thank you.
Comment 9 Zhiming Wang 2016-09-15 05:36:32 UTC
Akira: Patch confirmed to work here, too.
Ed: Thanks for all your work that went into this!
Comment 10 Jeremy Huddleston Sequoia 2016-10-24 16:17:29 UTC
*** Bug 98403 has been marked as a duplicate of this bug. ***
Comment 11 Jeremy Huddleston Sequoia 2016-10-24 16:17:51 UTC
*** Bug 98338 has been marked as a duplicate of this bug. ***
Comment 12 Jeremy Huddleston Sequoia 2016-10-24 16:30:14 UTC
Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Tested-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>

Review comments:
  The inner-most predicate is written with a side-effect and is otherwise rather complex.  I'd suggest breaking the c-- out.
  Your whitespace usage for casting is not self-consistent.
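
The first review point can be illustrated with a generic before/after pair. The patch itself is not reproduced in this thread, so the functions below (`walk_before`, `walk_after`) and the index-chain data layout are hypothetical and show only the style being suggested: move the decrement out of the loop condition and into the loop body.

```c
/* Both functions follow a chain of indexes stored in `next`, where a
 * negative entry ends the chain, for at most `c` steps. They return 1
 * when the chain ends within that limit and 0 otherwise. The caller is
 * assumed to pass a chain whose non-negative entries are valid indexes. */

/* Before: the counter is decremented inside the predicate, so reading
 * the condition also changes state. */
int walk_before(const int *next, int start, int c)
{
    int i = start;

    while (i >= 0 && c-- > 0)
        i = next[i];

    return i < 0;
}

/* After: the predicate is a pure test and the decrement is an explicit
 * statement in the loop body. */
int walk_after(const int *next, int start, int c)
{
    int i = start;

    while (i >= 0 && c > 0) {
        i = next[i];
        c--;
    }

    return i < 0;
}
```

The two versions return the same result for the same input. They differ only in the final value of the local counter, which is never observed.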
Comment 13 Jeremy Huddleston Sequoia 2016-10-24 21:23:05 UTC
*** Bug 98419 has been marked as a duplicate of this bug. ***
Comment 14 SciFi 2016-10-25 10:01:05 UTC
Created attachment 127535 [details]
I am still seeing "failed to write cache"
Comment 15 SciFi 2016-10-25 10:02:51 UTC
.

Hi,

I just installed XQuartz-2.7.11_rc1.dmg which mentions this report/patch/etc (upgrading 2.7.10[final]).

I am still on OSX-10.6.8 (still on disability etc and only have a model "imac6,1"[1] here, still unable to afford a new model which is the *only* way to """upgrade""" to a supported o.s. etc [and yes I am *quite* aware even the Zilla™folk don't care about us-po'ppl also, don't get me started………]).

Along with this bug (which I've had to put-up-with for several years, and mentioned it on the xquartz-devel maillist some time ago), even with 2.7.11_rc1, I am still seeing e.g.,
> installer[98140]: ./postinstall: /System/Library/Fonts: failed to write cache
for every subdir under the various Fonts dirs following this line in the installer log,
> installer[98140]: ./postinstall: font_cache: Updating FC cache
but we end up with a "good" run as shown,
> installer[98140]: ./postinstall: font_cache: Done
> installer[98140]: PackageKit: Registered bundle file://localhost/Applications/Utilities/XQuartz.app/
> installer[98140]: PackageKit: Registered bundle file://localhost/Applications/Utilities/XQuartz.app/Contents/Frameworks/Sparkle.framework/Versions/Current/Resources/Autoupdate.app/
> installer[98140]: Installed "XQuartz 2.7.11_rc1" ()
> installer[98140]: PackageKit: ----- End install -----
> […]

I'll attach a zip'd Terminal log to this report.

Yes my local login-id has full admin privs etc etc etc, being the first/only one generated since day-1, honest.  ;p

FWIW
I use the Pan newsreader with GNOME/GLib/etc (not current code), browsing the Gmane.org system that holds mailing lists (newsgroups), some 500+ of which I've subscribed to, etc.  I notice the fontconfig list has been helping to alleviate the cache problem, and indeed I'm impressed so far with the work on this.

Another FWIW
I do not use any of the port systems (Macports, Fink, etc) since I have upgraded several subsystems e.g. Perl etc (with the help of ActiveState.com and others) to where the port systems won't work as expected.  I have resorted to doing all this by hand with deep study+planning including having to use their OSX-mandated patches if needed.  Yes this is mainly how I spend my time nowadays, related to my paid job (mainframe shop with state-wide wired network before the Internet) -- all this should help keep my brain sharp.

Thanks for putting up with me.

I'll stay tuned for anything further.

[1]:  http://www.everymac.com/systems/apple/imac/specs/imac-core-2-duo-2.16-24-inch-specs.html

btw I've "lost" almost everything else I've owned, including several PPC models, am carless now, sleep problems, etc.

.
Comment 16 Silas S. Brown 2016-10-25 10:57:04 UTC
(In reply to SciFi from comment #15)
If all else fails there is always the option to run GNU/Linux on your iMac natively, see https://help.ubuntu.com/community/Intel_iMac although the setup is awkward and carries a certain element of risk (but if you get completely left in the dark then it might be the only option).  You may also be able to run a recent GNU/Linux in a VirtualBox, although you might need to find an older version of VirtualBox to install.  (Be sure to enable PXE Boot if you want to run a 64-bit distro, otherwise stick with 32-bit distros.)  But I know it would be nice to carry on in MacOS for now, especially if you want the option of compiling MacOS binaries for others.

Developers not supporting old Mac versions does not always mean the developer doesn't care.  They might feel forced by Apple.  Apple do not make it easy for developers to continue to compile binaries for old versions of the OS.  A developer might upgrade their own OS and then suddenly find they can no longer reliably compile for older OS versions (say, two or three major releases before their own), and they weren't warned about that before upgrading their development machine and now they've done it it's too late.  And then somebody else might say "oh but I still have an older Mac so I can compile the binaries for you", which is great until the main developer accidentally introduces a feature that doesn't compile on the older Mac without even realising it.  Then the somebody else either suddenly has to do a whole bunch of extra patching work (which is not what they signed up for when they said they'll help compile the binaries) or else just report bugs and hope the main developer has time to handle them even though said main developer no longer has any means of directly testing their changes for compatibility on the older Mac.

My 10.7 box can be upgraded but I have been afraid to do so lest I lose the ability to ship and test binaries for older Macs.  But right now there's a couple of projects I'm involved with that I can't compile and I haven't yet been able to figure out how to patch them so they compile again on my older Mac (well in one case I do have a working fork of an earlier version and I've been able to backport some of the updates, but not all and it's generally a mess; I need to sort it out somehow but don't hold your breath).  Does Apple think we developers are so rich we can afford to have an array of 10 different Macs all running different versions?  Or maybe they just don't mind there being a situation where users of older versions tend to be left out in the cold and are under pressure to spend more money upgrading.

It's also true that older versions of Mac OS don't have all the security patches, so some developers might take the line of "oh well it would be irresponsible to carry on supporting it and thereby encouraging users to carry on using insecure stuff".  But as you say some people just can't upgrade, and Mac security problems were never as big as old Windows was, plus if the box is behind a good-enough GNU/Linux firewall (I have a host of iptables rules on the Raspberry Pi) and the user has sufficient competence then the security risks are hopefully not TOO bad.
Comment 17 Jeremy Huddleston Sequoia 2016-10-26 16:23:17 UTC
For XQuartz users landing here, this patch is included in XQuartz 2.7.11_rc2, which I just released today.  Please give it a try.
Comment 18 Alex 2016-10-26 18:05:20 UTC
Created attachment 127555 [details]
rc2 failure log

Still does not work for me with rc2. It installs, but XQuartz hangs and never finishes starting up.
Comment 19 Zhiming Wang 2016-10-26 18:17:10 UTC
SciFi: The bug here was introduced in an August 2016 commit, so if you've been putting up with it for several years, it must be a different bug. Please be so kind as to not take over the thread and instead report a different one, thanks.

Silas: I'm afraid a bug tracker isn't the best place for general computing advice.
Comment 20 Dick Riegner 2016-10-26 21:07:18 UTC
So is XQuartz 2.7.11_rc2 safe to install on 10.11.6 (El Capitan)?  Feedback
here says XQuartz now hangs and will not start, but an update in 98338 says it 
resolves the problem.

I really need XQuartz to run reliably for work and do not have an easy way to revert to a previous XQuartz release without restoring a full system backup from a week ago.

Any guidance on whether or not XQuartz 2.7.11_rc2 is safe to install would be appreciated.
Comment 21 Alex 2016-10-26 21:31:48 UTC
> Feedback here says XQuartz now hangs and will not start

I should add that I didn't restart or log out/in, but I don't think that should be an issue since XQuartz was already installed and working when I tried updating to rc2. 

If it's mission critical to have XQuartz working, and it is currently working, you should obviously wait for a stable release.
Comment 22 Jeremy Huddleston Sequoia 2016-10-26 21:32:41 UTC
Alex, please ask for help on the x11-users mailing list.  Note that the installer does take a while to complete.  Also, a logout is needed (and you should have gotten a prompt) if you're upgrading from 2.7.9 or earlier.

Dick, XQuartz 2.7.11_rc2 should be good to install on Snow Leopard and later.  If you want general support, please join the x11-users mailing list.  This ticket is tracking this specific problem, not support with installer problems.
Comment 23 Dick Riegner 2016-10-26 21:37:48 UTC
Jeremy,

Understood about general installer issues.

I read the problem updating to XQuartz 2.7.11_rc2 as a problem running with that update rather than an unrelated installation problem.  Hence I posted my concern to this bug.

I want to help validate XQuartz 2.7.11_rc2, but just want to avoid any serious regressions.  So would it be helpful for me to install XQuartz 2.7.11_rc2, and what do you see as the risk?

Dick
Comment 24 Alex 2016-10-26 21:39:20 UTC
Created attachment 127556 [details]
attachment-9604-0.html

Dick, you can simply reinstall 2.7.9 if it does not work. I have had no issues doing that, just takes some time. 

Comment 25 Alex 2016-10-26 21:40:32 UTC
Comment on attachment 127556 [details]
attachment-9604-0.html

Comment 26 Alex 2016-10-26 22:00:50 UTC
Rebooting after install fixed the issue for me. (Sorry about the spam..)
Comment 27 Dick Riegner 2016-10-26 23:25:21 UTC
Yes, XQuartz 2.7.11_rc2 fixed the slow xclock launch for me.  A local launch of xclock now takes less than a second.  Subsequent local launches are also less than a second.

Thanks for the fix.
Comment 28 whitebob 2016-10-27 15:30:30 UTC
*** Bug 98422 has been marked as a duplicate of this bug. ***
Comment 29 Jeremy Huddleston Sequoia 2016-10-31 03:20:57 UTC
*** Bug 98419 has been marked as a duplicate of this bug. ***
