17107 – fc-cat, fc-list, fc-match, fc-query output formats

Status:	RESOLVED FIXED

Product:	fontconfig
Classification:	Unclassified
Component:	library
Version:	2.4
Hardware:	Other All

Importance:	medium enhancement
Assignee:	Keith Packard

Reported:	2008-08-13 00:12 UTC by Behdad Esfahbod
Modified:	2009-06-24 12:35 UTC

Description Behdad Esfahbod 2008-08-13 00:12:58 UTC

Each of these tools uses a different output format right now.  Worse, --verbose changes their format too.

With FcPatternFilter(), which I added in bug 13016, we can now easily switch between different formats.  If --verbose is not specified, only certain elements are printed.  fc-list supports a list of elements; the rest don't yet.  Adding that to fc-match is bug 13017.  I'm not sure whether it's useful to add it to fc-cat and fc-query.

We should add a --format parameter to all of them.  The values mean:

  - "verbose": Use FcPatternPrint().  

  - "name": Use FcNameUnparse().

  - Anything else: Will be a printf-like format string.  For example, fc-match's default format is:

        "%(basefilename): \"%{family}\" \"%{style}\"\n".

The tags in %{} are looked up from the pattern.  There are also a few extra tags, those in %(), that exist for convenience and backward compatibility (so all current formats can be expressed using this format).  Those will be %(basename), %(dirname), %(verbose), and %(name), the last two referring to the formats defined above.

Need to add a FcPatternFormat() function for this.

Sounds good?

(I modelled this after rpm's --queryformat)

Comment 1 Behdad Esfahbod 2008-08-13 11:31:16 UTC

I also have some crazier syntax ideas here.  I need to get around to implementing them.

Comment 2 Behdad Esfahbod 2008-12-29 18:07:28 UTC

An initial version is in my tree:
commit f52290c475a49e600605502a2e0cb2283f77bfd7
Author: Behdad Esfahbod <behdad@behdad.org>
Date:   Mon Dec 29 20:00:26 2008 -0500

    Implement FcPatternFormat and use it in cmdline tools (bug #17107)
    
    Still need to add more features, but the API is there, and used
    by cmdline tools with -f or --format.

Comment 3 Behdad Esfahbod 2009-02-13 16:51:53 UTC

I actually added various extensions to the language.  Needed for font autoinstallation tag generation.  In my tree.

commit 8914d1b001c0850105ae47c33fd450a5e47a70cc
Author: Behdad Esfahbod <behdad@behdad.org>
Date:   Fri Feb 13 16:41:37 2009 -0800

    [fcformat] Add a 'pkgkit' builtin that prints tags for font packages
    
    For DejaVu Sans Condensed it generates:
    
    Font(dejavusans)
    Font(dejavusanscondensed)
    Font(:lang=aa)
    Font(:lang=ab)
    ...
    Font(:lang=yo)
    Font(:lang=zu)

commit 4edfb43575c08cbbf2226e6b856097396ce7d0c4
Author: Behdad Esfahbod <behdad@behdad.org>
Date:   Fri Feb 13 16:33:58 2009 -0800

    [fcformat] Enumerate langsets like we do arrays of values
    
    If one asks for a format like '%{[]elt{expr}}' and the first value
    for elt is a langset, we enumerate the langset languages in expr.

commit 69a351fa138fe7e71e29316698b0ca4ce532631d
Author: Behdad Esfahbod <behdad@behdad.org>
Date:   Fri Feb 13 16:30:43 2009 -0800

    [fclang] Implement FcLangSetGetLangs() (#18846)

commit eb082c4689648ea2afdd23a265176ccad903749c
Author: Behdad Esfahbod <behdad@behdad.org>
Date:   Thu Feb 12 21:48:22 2009 -0600

    [fcformat] Implement array enumeration
    
    The format '%{[]family,familylang{expr}}' expands expr once for the first
    value of family and familylang, then for the second, etc, until both lists
    are exhausted.
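
The parallel walk described in this commit message can be sketched in plain C. This is only an illustration, not fontconfig code: enumerate_pairs and append are hypothetical helpers, the "family (familylang)" layout is borrowed from the documentation example later in this thread, and printing an empty string for a list that runs out early is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `piece` to `out`, return the new end. */
static char *
append (char *out, const char *piece)
{
    size_t n = strlen (piece);

    memcpy (out, piece, n);
    return out + n;
}

/* Hypothetical sketch, not fontconfig code: walk two value lists in
 * parallel and write "family (familylang)\n" once per index until both
 * lists are exhausted.  A list that runs out early contributes an empty
 * string (an assumption).  The caller must size `out` generously. */
size_t
enumerate_pairs (char *out,
                 const char *const *family, size_t nfamily,
                 const char *const *lang, size_t nlang)
{
    size_t rounds = nfamily > nlang ? nfamily : nlang;

    assert (out != NULL);
    for (size_t i = 0; i < rounds; i++)
    {
        out = append (out, i < nfamily ? family[i] : "");
        out = append (out, " (");
        out = append (out, i < nlang ? lang[i] : "");
        out = append (out, ")\n");
    }
    *out = '\0';
    return rounds;          /* number of times the expression expanded */
}
```

With two family values and one familylang value, the expression is expanded twice, and the second expansion has no language to print.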

commit 45e9fd6a0791dfb78189547d4b35295ce3adfc16
Author: Behdad Esfahbod <behdad@behdad.org>
Date:   Wed Feb 11 23:55:11 2009 -0500

    [fcformat] Support 'default value' for simple tags
    
    The format '%{family:-XXX}' prints XXX if the element family is not defined.
    Also works for things like '%{family[1]:-XXX}'.

commit d771c5ccdc57857d1a4c786c836c5e0e39d925b9
Author: Behdad Esfahbod <behdad@behdad.org>
Date:   Wed Feb 11 23:44:36 2009 -0500

    [fcformat] Support indexing simple tags
    
    The format '%{family[0]}' will only output the first value for element family.

commit 1c3f15b40a635db0f8ba65b1ab01928782637ee4
Author: Behdad Esfahbod <behdad@behdad.org>
Date:   Tue Feb 10 20:56:39 2009 -0500

    [fcformat] Add support for builtin formats
    
    The '%{=unparse}' format expands to the FcNameUnparse() result on the
    pattern.  Need to add '%{=verbose}' for FcPatternPrint() output but
    need to change that function to output to a string first.
    
    Also added the '%{=fclist}' and '%{=fcmatch}' which format like the
    default format of fc-list and fc-match respectively.

commit efc412e770733bab2fc42005a5c543f2bebcd0b2
Author: Behdad Esfahbod <behdad@behdad.org>
Date:   Tue Feb 10 18:57:34 2009 -0500

    [fcformat] Refactor code to avoid malloc

commit f515404c53ed5bf35abbce574fb34f1777cc9a62
Author: Behdad Esfahbod <behdad@behdad.org>
Date:   Tue Feb 10 06:22:55 2009 -0500

    [fcformat] Start adding builtins

commit 1920c8b5533e683db4b82be270892af4e4303159
Author: Behdad Esfahbod <behdad@behdad.org>
Date:   Tue Feb 10 05:57:10 2009 -0500

    [fcformat] Implement 'delete', 'escape', and 'translate' filter functions
    
    The format '%{family|delete( )}' expands to family values with space removed.
    The format '%{family|translate( ,-)}' expands to family values with space
    replaced by dash.  Multiple chars are supported, like tr(1).
    The format '%{family|escape(\\ )}' expands to family values with space
    escaped using backslash.
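
A byte-wise C sketch of what these three filter functions do, assuming the semantics given in this commit message and in the documentation later in this thread. The helper names delete_chars, translate_chars and escape_chars are hypothetical and this is not the fontconfig implementation; like the real converters at the time, it is not UTF-8 aware.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers, not fontconfig code.  Each writes a
 * NUL-terminated result to `out`; the caller must size `out`
 * generously (escape_chars can double the input length). */

/* delete(chars): drop every byte of `text` that appears in `chars`. */
void
delete_chars (char *out, const char *text, const char *chars)
{
    assert (out && text && chars);
    for (; *text; text++)
        if (!strchr (chars, *text))
            *out++ = *text;
    *out = '\0';
}

/* translate(from,to): map each byte found in `from` to the byte at the
 * same index in `to`.  A short `to` is extended by repeating its last
 * byte.  An empty `to` leaves the text unchanged (an assumption). */
void
translate_chars (char *out, const char *text, const char *from, const char *to)
{
    size_t to_len;

    assert (out && text && from && to);
    to_len = strlen (to);
    for (; *text; text++)
    {
        const char *hit = strchr (from, *text);

        if (hit && to_len > 0)
        {
            size_t idx = (size_t) (hit - from);

            *out++ = to[idx < to_len ? idx : to_len - 1];
        }
        else
            *out++ = *text;
    }
    *out = '\0';
}

/* escape(chars): prefix every byte of `text` that appears in `chars`
 * with the first byte of `chars`. */
void
escape_chars (char *out, const char *text, const char *chars)
{
    assert (out && text && chars);
    for (; *text; text++)
    {
        if (chars[0] && strchr (chars, *text))
            *out++ = chars[0];
        *out++ = *text;
    }
    *out = '\0';
}
```

For the family value "DejaVu Sans Mono", delete_chars with " " yields "DejaVuSansMono" and translate_chars with " " and "-" yields "DejaVu-Sans-Mono".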

commit 60bd84d54b97abc109d1978bc718e20bbbb47c30
Author: Behdad Esfahbod <behdad@behdad.org>
Date:   Tue Feb 10 05:05:53 2009 -0500

    [fcformat] Add value-count syntax
    
    The format '%{#family}' expands to the number of values for the element
    'family', or '0' if no such element exists in the pattern.

commit 8f169dd0eb0bbb5ff0bf0095d96056b446f5263e
Author: Behdad Esfahbod <behdad@behdad.org>
Date:   Tue Feb 10 04:44:54 2009 -0500

    [fcformat] Implement 'cescape', 'shescape', and 'xmlescape' converters
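
This commit message gives no examples, so here is a rough C sketch based on the converter descriptions in the documentation later in this thread. The function names c_escape, sh_escape and xml_escape are hypothetical, and the exact escape sequences chosen ('\'' for the shell, the predefined XML entities) are assumptions rather than a copy of fontconfig's code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `piece` to `out`, return the new end. */
static char *
append (char *out, const char *piece)
{
    size_t n = strlen (piece);

    memcpy (out, piece, n);
    return out + n;
}

/* cescape: backslash-escape backslashes and double quotes.
 * The caller must size `out` generously in all three functions. */
void
c_escape (char *out, const char *text)
{
    assert (out && text);
    for (; *text; text++)
    {
        if (*text == '\\' || *text == '"')
            *out++ = '\\';
        *out++ = *text;
    }
    *out = '\0';
}

/* shescape: wrap the text in single quotes; an embedded single quote
 * closes the quoting, emits an escaped quote, and reopens it. */
void
sh_escape (char *out, const char *text)
{
    assert (out && text);
    *out++ = '\'';
    for (; *text; text++)
    {
        if (*text == '\'')
            out = append (out, "'\\''");
        else
            *out++ = *text;
    }
    *out++ = '\'';
    *out = '\0';
}

/* xmlescape: replace &, < and > with their predefined entities. */
void
xml_escape (char *out, const char *text)
{
    assert (out && text);
    for (; *text; text++)
    {
        switch (*text)
        {
        case '&': out = append (out, "&amp;"); break;
        case '<': out = append (out, "&lt;");  break;
        case '>': out = append (out, "&gt;");  break;
        default:  *out++ = *text;              break;
        }
    }
    *out = '\0';
}
```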

commit 808ec01a88c458c8270cb9740f163bf205dc156d
Author: Behdad Esfahbod <behdad@behdad.org>
Date:   Tue Feb 10 03:38:22 2009 -0500

    [fcformat] Add simple converters
    
    The format '%{family|downcase}' for example prints the lowercase of
    the family element.  Three converters are defined right now:
    'downcase', 'basename', and 'dirname'.

commit e916f59f9d012d77c7f16b1cd352da435d1c450e
Author: Behdad Esfahbod <behdad@behdad.org>
Date:   Tue Feb 10 00:15:08 2009 -0500

    [fcformat] Add conditionals
    
    The conditional '%{?elt1,elt2,!elt3{expr1}{expr2}}' will evaluate
    expr1 if elt1 and elt2 exist in pattern and elt3 doesn't exist, and
    expr2 otherwise.  The '{expr2}' part is optional.
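
The test performed by a conditional can be sketched in C as follows. This is an illustration only: the "pattern" is reduced to a plain array of element names, and has_element and conditions_pass are hypothetical helpers, not fontconfig functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a pattern lookup: `present` lists the names
 * of the elements the pattern defines.  Compares by length and bytes so
 * that "size" does not match "pixelsize". */
bool
has_element (const char *const *present, size_t n,
             const char *name, size_t len)
{
    for (size_t i = 0; i < n; i++)
        if (strlen (present[i]) == len && memcmp (present[i], name, len) == 0)
            return true;
    return false;
}

/* Evaluate a comma-separated condition list such as "elt1,elt2,!elt3":
 * a bare name must be present, a name prefixed with '!' must be absent,
 * and the list passes only if every condition passes. */
bool
conditions_pass (const char *conds, const char *const *present, size_t n)
{
    assert (conds != NULL);
    while (*conds)
    {
        bool negate = (*conds == '!');
        size_t len;

        if (negate)
            conds++;
        len = strcspn (conds, ",");
        if (has_element (present, n, conds, len) == negate)
            return false;
        conds += len;
        if (*conds == ',')
            conds++;
    }
    return true;
}
```

When conditions_pass returns true the formatter would expand expr1, and otherwise expr2 if one was given.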

commit ffc5a1678aae643b2d3dd261b184c54d56182e06
Author: Behdad Esfahbod <behdad@behdad.org>
Date:   Mon Feb 9 23:08:08 2009 -0500

    [fcformat] Add element filtering and deletion
    
    The filtering, '%{+elt1,elt2,elt3{subexpr}}' will evaluate subexpr
    with a pattern only having the listed elements from the surrounding
    pattern.
    
    The deletion, '%{-elt1,elt2,elt3{subexpr}}' will evaluate subexpr
    with the surrounding pattern sans the listed elements.

commit 97326401b560bae54b1d0e49011f3866cd399753
Author: Behdad Esfahbod <behdad@behdad.org>
Date:   Mon Feb 9 20:49:45 2009 -0500

    [fcformat] Add support for subexpressions
    
    The syntax is '{{expr}}'.  Can be used for aligning/justifying an entire
    subexpr for example.

commit ae46e6d9395ebf364c8718422dfd377005b62f58
Author: Behdad Esfahbod <behdad@behdad.org>
Date:   Mon Feb 9 19:13:07 2009 -0500

    [fcformat] Refactor and restructure code for upcoming changes
    
    Also makes it thread-safe.

commit d6543dfa92613e0d45658c89a377a27a6cc72ea6
Author: Behdad Esfahbod <behdad@behdad.org>
Date:   Mon Feb 9 18:18:59 2009 -0500

    [fcformat] Add support for width modifiers
    
    One can do '%30{family}' for example.  Or '%-30{family}' for the
    left-aligned version.
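
The width convention is the same one printf uses: a positive width right-aligns and a negative width left-aligns. A minimal C sketch of that padding rule, assuming a hypothetical helper pad_field (not a fontconfig function) that simply leans on printf's own handling of a negative '*' width:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of fontconfig: pad `text` to at least
 * |width| columns.  A positive width right-aligns; a negative width
 * left-aligns, because printf treats a negative `*` argument as the
 * `-` flag followed by a positive width.  Returns what snprintf
 * returns: the length the full result needs, excluding the NUL. */
int
pad_field (char *out, size_t outsz, const char *text, int width)
{
    assert (out != NULL && text != NULL);
    return snprintf (out, outsz, "%*s", width, text);
}
```

Text that is already wider than the requested width is copied unchanged, since the width is only a minimum.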

Comment 4 Behdad Esfahbod 2009-03-13 16:01:00 UTC

Also committed 300 lines worth of docs for the syntax.  See FcPatternFormat(3).

Comment 5 Behdad Esfahbod 2009-03-13 16:05:24 UTC

Here are the docs for the impatient:

@RET@		FcChar8 *
@FUNC@		FcPatternFormat
@TYPE1@		FcPattern *			@ARG1@		pat
@TYPE2@		const FcChar8 *			@ARG2@		format
@PURPOSE@	Format a pattern into a string according to a format specifier
@DESC@

Converts given pattern <parameter>pat</parameter> into text described by
the format specifier <parameter>format</parameter>.
The return value refers to newly allocated memory which should be freed by the
caller using free(), or NULL if <parameter>format</parameter> is invalid.

</para><para>

The format is loosely modelled after a printf-style format string.
The format string is composed of zero or more directives: ordinary
characters (not "%"), which are copied unchanged to the output stream;
and tags which are interpreted to construct text from the pattern in a
variety of ways (explained below).
Special characters can be escaped
using backslash.  C-string style special characters like \n and \r are
also supported (this is useful when the format string is not a C string
literal).
It is advisable to always escape curly braces that
are meant to be copied to the output as ordinary characters.

</para><para>

Each tag is introduced by the character "%",
followed by an optional minimum field width,
followed by tag contents in curly braces ({}).
If the minimum field width value is provided the tag
will be expanded and the result padded to achieve the minimum width.
If the minimum field width is positive, the padding will right-align
the text.  Negative field width will left-align.
The rest of this section describes various supported tag contents
and their expansion.

</para><para>

A <firstterm>simple</firstterm> tag
is one where the content is an identifier.  When simple
tags are expanded, the named identifier will be looked up in
<parameter>pattern</parameter> and the resulting list of values returned,
joined together using comma.  For example, to print the family name and style of the
pattern, use the format "%{family} %{style}\n".  To extend the family column
to forty characters use "%-40{family}%{style}\n".

</para><para>

Simple tags expand to the list of all values for an element.  To only choose
one of the values, one can index using the syntax "%{elt[idx]}".  For example,
to get the first family name only, use "%{family[0]}".

</para><para>

If a simple tag ends with "=" and the element is found in the pattern, the
name of the element followed by "=" will be output before the list of values.
For example, "%{weight=}" may expand to the string "weight=80".  Or to the empty
string if <parameter>pattern</parameter> does not have weight set.

</para><para>

If a simple tag starts with ":" and the element is found in the pattern, ":"
will be printed first.  For example, combining this with the =, the format
"%{:weight=}" may expand to ":weight=80" or to the empty string
if <parameter>pattern</parameter> does not have weight set.

</para><para>

If a simple tag contains the string ":-", the rest of the tag contents
will be used as a default string.  The default string is output if the element
is not found in the pattern.  For example, the format
"%{:weight=:-123}" may expand to ":weight=80" or to the string
":weight=123" if <parameter>pattern</parameter> does not have weight set.

</para><para>

A <firstterm>count</firstterm> tag
is one that starts with the character "#" followed by an element
name, and expands to the number of values for the element in the pattern.
For example, "%{#family}" expands to the number of family names
<parameter>pattern</parameter> has set, which may be zero.

</para><para>

A <firstterm>subexpression</firstterm> tag
is one that expands a subexpression.  The tag contents
are the subexpression to expand placed inside another set of curly braces.
Subexpression tags are useful for aligning an entire subexpression, or to
apply converters (explained later) on an entire subexpression.
For example, the format "%40{{%{family} %{style}}}" expands the subexpression
to construct the family name followed by the style, then takes the entire
string and pads it on the left to be at least forty characters.

</para><para>

A <firstterm>filter-out</firstterm> tag
is one starting with the character "-" followed by a
comma-separated list of element names, followed by a subexpression enclosed
in curly braces.  The subexpression will be expanded but with a pattern that
has the listed elements removed from it.
For example, the format "%{-size,pixelsize{subexpr}}" will expand "subexpr"
with <parameter>pattern</parameter> sans the size and pixelsize elements.

</para><para>

A <firstterm>filter-in</firstterm> tag
is one starting with the character "+" followed by a
comma-separated list of element names, followed by a subexpression enclosed
in curly braces.  The subexpression will be expanded but with a pattern that
only has the listed elements from the surrounding pattern.
For example, the format "%{+family,familylang{subexpr}}" will expand "subexpr"
with a sub-pattern consisting only of the family and familylang elements of
<parameter>pattern</parameter>.

</para><para>

A <firstterm>conditional</firstterm> tag
is one starting with the character "?" followed by a
comma-separated list of element conditions, followed by two subexpressions
enclosed in curly braces.  An element condition can be an element name,
in which case it tests whether the element is defined in pattern, or
the character "!" followed by an element name, in which case the test
is negated.  The conditional passes if all the element conditions pass.
The tag expands the first subexpression if the conditional passes, and
expands the second subexpression otherwise.
For example, the format "%{?size,dpi,!pixelsize{pass}{fail}}" will expand
to "pass" if <parameter>pattern</parameter> has size and dpi elements but
no pixelsize element, and to "fail" otherwise.

</para><para>

An <firstterm>enumerate</firstterm> tag
is one starting with the string "[]" followed by a
comma-separated list of element names, followed by a subexpression enclosed
in curly braces.  The list of values for the named elements are walked in
parallel and the subexpression expanded each time with a pattern just having
a single value for those elements, starting from the first value and
continuing as long as any of those elements has a value.
For example, the format "%{[]family,familylang{%{family} (%{familylang})\n}}"
will expand the pattern "%{family} (%{familylang})\n" with a pattern
having only the first value of the family and familylang elements, then expands
it with the second values, then the third, etc.

</para><para>

As a special case, if an enumerate tag has only one element, and that element
has only one value in the pattern, and that value is of type FcLangSet, the
individual languages in the language set are enumerated.

</para><para>

A <firstterm>builtin</firstterm> tag
is one starting with the character "=" followed by a builtin
name.  The following builtins are defined:

<variablelist>

<varlistentry><term>
unparse
</term><listitem><para>
Expands to the result of calling FcNameUnparse() on the pattern.
</para></listitem></varlistentry>

<varlistentry><term>
fcmatch
</term><listitem><para>
Expands to the output of the default output format of the fc-match
command on the pattern, without the final newline.
</para></listitem></varlistentry>

<varlistentry><term>
fclist
</term><listitem><para>
Expands to the output of the default output format of the fc-list
command on the pattern, without the final newline.
</para></listitem></varlistentry>

<varlistentry><term>
pkgkit
</term><listitem><para>
Expands to the list of PackageKit font() tags for the pattern.
Currently this includes tags for each family name, and each language
from the pattern, enumerated and sanitized into a set of tags terminated
by newline.  Package management systems can use these tags to tag their
packages accordingly.
</para></listitem></varlistentry>

</variablelist>

For example, the format "%{+family,style{%{=unparse}}}\n" will expand
to an unparsed name containing only the family and style element values
from <parameter>pattern</parameter>.

</para><para>

The contents of any tag can be followed by a set of zero or more
<firstterm>converter</firstterm>s.  A converter is specified by the
character "|" followed by the converter name and arguments.  The
following converters are defined:

<variablelist>

<varlistentry><term>
basename
</term><listitem><para>
Replaces text with the results of calling FcStrBasename() on it.
</para></listitem></varlistentry>

<varlistentry><term>
dirname
</term><listitem><para>
Replaces text with the results of calling FcStrDirname() on it.
</para></listitem></varlistentry>

<varlistentry><term>
downcase
</term><listitem><para>
Replaces text with the results of calling FcStrDowncase() on it.
</para></listitem></varlistentry>

<varlistentry><term>
shescape
</term><listitem><para>
Escapes text for one level of shell expansion.
(Escapes single-quotes, also encloses text in single-quotes.)
</para></listitem></varlistentry>

<varlistentry><term>
cescape
</term><listitem><para>
Escapes text such that it can be used as part of a C string literal.
(Escapes backslash and double-quotes.)
</para></listitem></varlistentry>

<varlistentry><term>
xmlescape
</term><listitem><para>
Escapes text such that it can be used in XML and HTML.
(Escapes less-than, greater-than, and ampersand.)
</para></listitem></varlistentry>

<varlistentry><term>
delete(<parameter>chars</parameter>)
</term><listitem><para>
Deletes all occurrences of each of the characters in <parameter>chars</parameter>
from the text.
FIXME: This converter is not UTF-8 aware yet.
</para></listitem></varlistentry>

<varlistentry><term>
escape(<parameter>chars</parameter>)
</term><listitem><para>
Escapes all occurrences of each of the characters in <parameter>chars</parameter>
by prepending each with the first character in <parameter>chars</parameter>.
FIXME: This converter is not UTF-8 aware yet.
</para></listitem></varlistentry>

<varlistentry><term>
translate(<parameter>from</parameter>,<parameter>to</parameter>)
</term><listitem><para>
Translates all occurrences of each of the characters in <parameter>from</parameter>
by replacing them with their corresponding character in <parameter>to</parameter>.
If <parameter>to</parameter> has fewer characters than
<parameter>from</parameter>, it will be extended by repeating its last
character.
FIXME: This converter is not UTF-8 aware yet.
</para></listitem></varlistentry>

</variablelist>

For example, the format "%{family|downcase|delete( )}\n" will expand
to the values of the family element in <parameter>pattern</parameter>,
lower-cased and with spaces removed.

Comment 6 Behdad Esfahbod 2009-06-24 12:35:25 UTC

I believe I've fixed this in 2.7.0.  Please reopen otherwise.
