64176 – EDS: create new address books with the right configuration

Summary: EDS: create new address books with the right configuration

Status:	RESOLVED FIXED

Alias:	None

Product:	SyncEvolution
Classification:	Unclassified
Component:	SyncEvolution
Version:	1.3.99.3
Hardware:	Other All

Importance:	highest normal
Assignee:	SyncEvolution Community
QA Contact:

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:	55918

Reported:	2013-05-03 07:15 UTC by Patrick Ohly
Modified:	2013-06-11 19:01 UTC
CC List:	3 users

Description Patrick Ohly 2013-05-03 07:15:22 UTC

Sorting and searching in PBAP address book caches must be fast, which depends on adding the right fields to the summary tables.

For address books created by EDS, this needs to be done by SyncEvolution. There is no good way of using text template files for the configuration (something that Matthew Barnes wants to add, but didn't get around to), so hard-code it in C++?

For the system address book, the distro needs to choose how to configure it. That must be a distro patch for EDS.

TODO: determine the right configuration.

Comment 1 Patrick Ohly 2013-05-13 12:07:35 UTC

Tristan, can you please define how a local file backend should be configured with source extensions to work well for:
- sorting by names
- searching by name
- looking up caller ID (= normalized phone number)?

In the openismus-work branch for EDS 3.6, the system address book must be configured like that out of the box.

Comment 2 Patrick Ohly 2013-05-13 12:08:33 UTC

It probably would be best to just clone the config of the system address book when creating new databases via SyncEvolution.

Comment 3 Tristan Van Berkom 2013-05-13 12:33:08 UTC

Here is the .source file which would put the following fields
in the summary:
  E_CONTACT_FULL_NAME,
  E_CONTACT_FAMILY_NAME,
  E_CONTACT_NICKNAME,
  E_CONTACT_GIVEN_NAME,
  E_CONTACT_TEL

And all of them configured to have prefix search indexes,
as well as normalized phone number values for E_CONTACT_TEL:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Data Source]
DisplayName=Unnamed
Enabled=true
Parent=

[Address Book]
BackendName=local

[Backend Summary Setup]
SummaryFields=full_name:family_name:nickname:given_name:phone
IndexedFields=phone,prefix:phone,phone:full_name,prefix:family_name,prefix:nickname,prefix:given_name,prefix
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
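
As a rough illustration of how the IndexedFields value above is structured (entries separated by ':', each entry a "field,index_type" pair), here is a minimal sketch in plain C. It is not EDS code: the struct and function names are hypothetical, and the only assumption about the format is what the example value itself shows.

```c
#include <assert.h>
#include <string.h>

#define MAX_NAME 32

/* One "field,index_type" pair from an IndexedFields value (hypothetical type). */
struct index_entry {
    char field[MAX_NAME];
    char type[MAX_NAME];
};

/*
 * Split a value such as "phone,prefix:full_name,prefix" into entries.
 * Returns the number of entries stored, or -1 on malformed input
 * (missing comma, empty name, name too long, too many entries).
 */
static int parse_indexed_fields(const char *value,
                                struct index_entry *out, int max_entries)
{
    int count = 0;
    const char *p = value;

    assert(value != NULL && out != NULL);

    while (*p != '\0') {
        const char *end = strchr(p, ':');            /* end of this entry */
        size_t len = end ? (size_t)(end - p) : strlen(p);
        const char *comma = (const char *)memchr(p, ',', len);
        size_t flen, tlen;

        if (comma == NULL || count >= max_entries)
            return -1;
        flen = (size_t)(comma - p);                  /* field name length */
        tlen = len - flen - 1;                       /* index type length */
        if (flen == 0 || tlen == 0 || flen >= MAX_NAME || tlen >= MAX_NAME)
            return -1;

        memcpy(out[count].field, p, flen);
        out[count].field[flen] = '\0';
        memcpy(out[count].type, comma + 1, tlen);
        out[count].type[tlen] = '\0';
        count++;

        p += len;                                    /* skip entry ...        */
        if (*p == ':')
            p++;                                     /* ... and its separator */
    }
    return count;
}
```

Note that "phone" appears twice in the value above, once with a prefix index and once with a phone (normalized number) index, so a parser has to keep entries as a list rather than a map keyed by field.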

My work-in-progress branch will eventually add the option to
set collation types for some fields. For example, to set "phonebook"
collation on given_name and family_name, the configuration would look like this:

CollationFields=family_name,phonebook:given_name,phonebook

Also I'm pretty sure that if EDS has been freshly installed, you
can configure this using the ESourceBackendSummarySetup apis directly
on the default addressbook.

The code would look something like:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
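/* Take a reference to the default address book source. */
source = e_source_registry_ref_default_address_book (registry);
/* Fetch (or create) its "Backend Summary Setup" extension. */
setup  = e_source_get_extension (source, E_SOURCE_EXTENSION_BACKEND_SUMMARY_SETUP);

/* ... call e_source_backend_summary_setup_*() APIs ... */

/* Write the modified source back to the registry, then drop our reference. */
e_source_registry_commit_source_sync (registry, source, NULL, NULL);
g_object_unref (source);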
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Perhaps an alternative to a downstream patch would be to create
a small program which runs on the first boot of the target device,
before the default addressbook is accessed for the first time.

Comment 4 Patrick Ohly 2013-05-13 12:53:11 UTC

(In reply to comment #3)
> Here is the .source file which would put the following fields
> in the summary:
>   E_CONTACT_FULL_NAME,
>   E_CONTACT_FAMILY_NAME,
>   E_CONTACT_NICKNAME,
>   E_CONTACT_GIVEN_NAME,
>   E_CONTACT_TEL
> 
> And all of them configured to have prefix search indexes,
> as well as normalized phone number values for E_CONTACT_TEL:
> 
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> [Data Source]
> DisplayName=Unnamed
> Enabled=true
> Parent=
> 
> [Address Book]
> BackendName=local
> 
> [Backend Summary Setup]
> SummaryFields=full_name:family_name:nickname:given_name:phone
> IndexedFields=phone,prefix:phone,phone:full_name,prefix:family_name,prefix:
> nickname,prefix:given_name,prefix
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Ack.

> My work-in-progress branch will eventually add the option to
> set collation types for some fields, for example to set "phonebook"
> collation on given_name and family_name, the configuration looks like this:
> 
> CollationFields=family_name,phonebook:given_name,phonebook

I doubt that this will be practical. Choosing the "phonebook" collation only makes sense in some locales. Who decides to add "phonebook" in the EDS source config and when?

What happens if the user changes the locale from one where "phonebook" is right to one where it is not?

IMHO, overriding the default collation belongs in an additional env variable that then gets changed together with the LANG/LC_ env variables.

In the meantime, EDS should hard-code sane defaults for several "relevant" locales.

> Also I'm pretty sure that if EDS has been freshly installed, you
> can configure this using the ESourceBackendSummarySetup apis directly
> on the default addressbook.

True, however, which app is allowed to do that? SyncEvolution would be one candidate, but it may run too late.

> Perhaps an alternative to a downstream patch, would be to create
> a small program which runs in the first boot of the target device
> before the default addressbook is ever accessed for the first time.

I agree that this is the right solution for upstream and a general-purpose distro packaging of EDS. In the openismus-work branch, doing it directly in EDS is easier for your customer...

Comment 5 Tristan Van Berkom 2013-05-13 13:31:19 UTC

(In reply to comment #4)
> (In reply to comment #3)
> > Here is the .source file which would put the following fields
> > in the summary:
> >   E_CONTACT_FULL_NAME,
> >   E_CONTACT_FAMILY_NAME,
> >   E_CONTACT_NICKNAME,
> >   E_CONTACT_GIVEN_NAME,
> >   E_CONTACT_TEL
> > 
> > And all of them configured to have prefix search indexes,
> > as well as normalized phone number values for E_CONTACT_TEL:
> > 
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > [Data Source]
> > DisplayName=Unnamed
> > Enabled=true
> > Parent=
> > 
> > [Address Book]
> > BackendName=local
> > 
> > [Backend Summary Setup]
> > SummaryFields=full_name:family_name:nickname:given_name:phone
> > IndexedFields=phone,prefix:phone,phone:full_name,prefix:family_name,prefix:
> > nickname,prefix:given_name,prefix
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> Ack.
> 
> > My work-in-progress branch will eventually add the option to
> > set collation types for some fields, for example to set "phonebook"
> > collation on given_name and family_name, the configuration looks like this:
> > 
> > CollationFields=family_name,phonebook:given_name,phonebook
> 
> I doubt that this will be practical. Choosing the "phonebook" collation only
> makes sense in some locales. Who decides to add "phonebook" in the EDS
> source config and when?
> 
> What happens if the user changes the locale from one where "phonebook" is
> right to one where it is not?
> 
> IMHO, overriding the default collation belongs into an additional env
> variable that then gets changed together with the LANG/LC_ env variables.
> 
> In the meantime, EDS should hard-code sane defaults for several "relevant"
> locales.

The idea is that "phonebook" collation is a preference that is either
available under the current locale, or not.

When a locale change has been detected, the sort keys will be generated
under the "phonebook" collation of the new locale... this will fall back
to the default ICU rules for the given locale if "phonebook" tailoring
is not available.

The problem with this currently is only that ICU has bugs falling back
to the correct locale (i.e. this bug:
   http://bugs.icu-project.org/trac/ticket/10149)

So to work around that bug, I'm only using "phonebook" sort order
under locales where the language code is either 'de' or 'fi' (as
Markus pointed out those are the only 2 locales with phonebook tailoring).

This workaround is currently working well with my tests, i.e. an
addressbook is configured to sort the "family_name" in "phonebook"
order... after restarting EDS under new locales and testing the
sort results I get the expected ordering under the new locales.
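
The workaround described above could be sketched roughly as follows: request "phonebook" collation only when the language code is 'de' or 'fi', and otherwise use the plain locale. This is an illustrative sketch, not the code from the branch; the function names are hypothetical, and it assumes a POSIX-style locale name (such as de_DE.UTF-8) as input and an ICU locale ID with a collation keyword as output.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the locale's language code is "de" or "fi", else 0. */
static int has_phonebook_tailoring(const char *locale)
{
    size_t lang_len;

    assert(locale != NULL);
    /* The language code ends at '_', '.', '@' or the end of the string. */
    lang_len = strcspn(locale, "_.@");
    return lang_len == 2 &&
           (strncmp(locale, "de", 2) == 0 || strncmp(locale, "fi", 2) == 0);
}

/*
 * Writes the locale ID to use for generating sort keys into buf.
 * Any POSIX ".codeset" or "@modifier" suffix is dropped first.
 */
static void collation_locale(const char *locale, char *buf, size_t size)
{
    size_t base_len = strcspn(locale, ".@");

    if (has_phonebook_tailoring(locale))
        snprintf(buf, size, "%.*s@collation=phonebook", (int)base_len, locale);
    else
        snprintf(buf, size, "%.*s", (int)base_len, locale);
}
```

Once the ICU fallback bug is fixed, the language check could in principle be dropped and the keyword appended unconditionally, leaving the fallback to ICU.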

> > Also I'm pretty sure that if EDS has been freshly installed, you
> > can configure this using the ESourceBackendSummarySetup apis directly
> > on the default addressbook.
> 
> True, however, which app is allowed to do that? SyncEvolution would be one
> candidate, but it may run too late.
> 
> > Perhaps an alternative to a downstream patch, would be to create
> > a small program which runs in the first boot of the target device
> > before the default addressbook is ever accessed for the first time.
> 
> I agree that this is the right solution for upstream and a general-purpose
> distro packaging of EDS. In the openismus-work branch, doing it directly in
> EDS is easier for your customer...

Sure, whatever is easier.

Comment 6 Patrick Ohly 2013-05-13 13:43:08 UTC

(In reply to comment #5)
> > > My work-in-progress branch will eventually add the option to
> > > set collation types for some fields, for example to set "phonebook"
> > > collation on given_name and family_name, the configuration looks like this:
> > > 
> > > CollationFields=family_name,phonebook:given_name,phonebook
> > 
> > I doubt that this will be practical. Choosing the "phonebook" collation only
> > makes sense in some locales. Who decides to add "phonebook" in the EDS
> > source config and when?
> > 
> > What happens if the user changes the locale from one where "phonebook" is
> > right to one where it is not?
> > 
> > IMHO, overriding the default collation belongs into an additional env
> > variable that then gets changed together with the LANG/LC_ env variables.
> > 
> > In the meantime, EDS should hard-code sane defaults for several "relevant"
> > locales.
> 
> The idea is that "phonebook" collation is a preference that is either
> available under the current locale, or not.
> 
> When a locale change has been detected, the sort keys will be generated
> under the "phonebook" collation of the new locale... this will fallback
> to the default ICU rules for the given locale if "phonebook" tailoring
> is not available.

So the "phonebook" keyword is just a hint that this field is meant to be sorted according to phone book rules if applicable? Is there value in having to configure this, instead of hard-coding it for each field? Well, perhaps, because then it can be enabled for other fields, too, without having to modify the source code.

I don't see much value in being able to turn off "phonebook" for name fields, though.

How would this work for Pinyin? In other words, what would CollationFields=family_name,pinyin:given_name,pinyin do when switching from zh_CN (where Pinyin is expected) to de_DE (where phone book is expected)?

It looks to me that "phonebook" in EDS should be merely a flag for certain fields which inside EDS gets translated into a suitable (hard-coded?!) collation per current locale (ICU "phonebook" in de and fi, ICU "pinyin" in Chinese locales). See also bug #64173.

Note that Pinyin may require additional work, see bug #64173 comment #3.

Comment 7 Tristan Van Berkom 2013-05-13 14:10:14 UTC

(In reply to comment #6)
> (In reply to comment #5)
[...]
> > 
> > The idea is that "phonebook" collation is a preference that is either
> > available under the current locale, or not.
> > 
> > When a locale change has been detected, the sort keys will be generated
> > under the "phonebook" collation of the new locale... this will fallback
> > to the default ICU rules for the given locale if "phonebook" tailoring
> > is not available.
> 
> So the "phonebook" keyword is just a hint that this field is meant to be
> sorted according to phone book rules if applicable? Is there value in having
> to configure this, instead of hard-coding it for each field? Well, perhaps,
> because then it can be enabled for other fields, too, without having to
> modify the source code.

I'm asking myself the opposite question actually.

Does it make sense to assume that an addressbook should be sorted
using "phonebook" rules, and to apply that everywhere?

> I don't see much value in being able to turn off "phonebook" for name
> fields, though.
> 
> How would this work for Pinyin? In other words, what would
> CollationFields=family_name,pinyin:given_name,pinyin do when switching from
> zh_CN (where Pinyin is expected) to de_DE (where phone book is expected)?

If "pinyin" was explicitly specified as the collation rule for a given
field, then "phonebook" would definitely not be expected for de_DE.

What would be expected is de_DE@collation=pinyin, and a fallback to de_DE
if no "pinyin" tailoring rule exists for de_DE.

So I refer to the previous question: on what grounds do we claim that
"phonebook" order is expected for the de_DE locale?

If there is any logic behind these expectations, then perhaps we should
indeed hard code a list into EDS. Since I don't see any rationale in
hard coding it right now, I went with a configuration instead.

> It looks to me that "phonebook" in EDS should be merely a flag for certain
> fields which inside EDS gets translated into a suitable (hard-coded?!)
> collation per current locale (ICU "phonebook" in de and fi, ICU "pinyin" in
> Chinese locales). See also bug #64173.
> 
> Note that Pinyin may require additional work, see bug #64173 comment #3.

Nod, also indexing anything by first letter/character seems to need the same
type of attention. ICU has AlphabeticIndex, which can be used for this:
    www.icu-project.org/apiref/icu4c/classicu_1_1AlphabeticIndex.html

My thoughts on this are that, for sorting, one could use 2 sort keys for any
given field: the primary sort key would be the 'bucket' (character) which a
given word falls under according to AlphabeticIndex, and the actual sort key
of the word would be the secondary sort key.
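
To make the two-key idea concrete, here is a small self-contained sketch in plain C. It does not use ICU: bucket_for() is a hypothetical stand-in for AlphabeticIndex that only looks at the first ASCII letter, and strcmp() stands in for comparing real collation sort keys. Only the ordering logic (bucket first, word key second) is the point.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

struct sort_entry {
    char bucket;        /* primary key: index label, 'A'..'Z' or '#' */
    const char *key;    /* secondary key: sort key of the whole word */
};

/* Hypothetical stand-in for AlphabeticIndex: ASCII first letter, else '#'. */
static char bucket_for(const char *word)
{
    unsigned char c;

    assert(word != NULL);
    c = (unsigned char)word[0];
    return isalpha(c) ? (char)toupper(c) : '#';
}

/* qsort() comparator: compare buckets first, then the words' own keys. */
static int compare_entries(const void *a, const void *b)
{
    const struct sort_entry *x = (const struct sort_entry *)a;
    const struct sort_entry *y = (const struct sort_entry *)b;

    if (x->bucket != y->bucket)
        return (x->bucket > y->bucket) - (x->bucket < y->bucket);
    return strcmp(x->key, y->key);
}

/* Fills in the bucket of every entry and sorts the array in place. */
static void sort_by_bucket(struct sort_entry *entries, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        entries[i].bucket = bucket_for(entries[i].key);
    qsort(entries, n, sizeof entries[0], compare_entries);
}
```

With the entries "Bob", "alice" and "42nd", a plain strcmp() sort would put "Bob" before "alice"; with the bucket as primary key, "alice" (bucket A) comes before "Bob" (bucket B), and "42nd" lands in the '#' bucket.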

This would also require some API churn at higher levels, i.e. any view
which wants to display a sorted list with letter/character grouping, would
need to first query the addressbook to know which characters are valid
for the current locale (and in what order those groups should be displayed).

Comment 8 Patrick Ohly 2013-05-13 15:34:59 UTC

(In reply to comment #7)
> (In reply to comment #6)
> > (In reply to comment #5)
> [...]
> > > 
> > > The idea is that "phonebook" collation is a preference that is either
> > > available under the current locale, or not.
> > > 
> > > When a locale change has been detected, the sort keys will be generated
> > > under the "phonebook" collation of the new locale... this will fallback
> > > to the default ICU rules for the given locale if "phonebook" tailoring
> > > is not available.
> > 
> > So the "phonebook" keyword is just a hint that this field is meant to be
> > sorted according to phone book rules if applicable? Is there value in having
> > to configure this, instead of hard-coding it for each field? Well, perhaps,
> > because then it can be enabled for other fields, too, without having to
> > modify the source code.
> 
> I'm asking myself the opposite question actually.
> 
> Does it make sense to assume that an addressbook should be sorted
> using "phonebook" rules ? and apply that everywhere ?

I am less sure whether it should be applied to non-name fields. Not that sorting by address is useful, but in theory it is possible.

But in general I agree, always using "phonebook" (in de and fi) and "pinyin" (in China) sounds right.

The only reason for making it configurable is to accommodate personal user preferences (both Pinyin and non-Pinyin sorting are valid), which (as far as I can tell) is not a useful use case (because Pinyin is sufficiently popular, and explaining to the user that they have the choice is too confusing).

> > I don't see much value in being able to turn off "phonebook" for name
> > fields, though.
> > 
> > How would this work for Pinyin? In other words, what would
> > CollationFields=family_name,pinyin:given_name,pinyin do when switching from
> > zh_CN (where Pinyin is expected) to de_DE (where phone book is expected)?
> 
> If "pinyin" was explicitly specified as the collation rule for a given
> field, then "phonebook" would definitely not be expected for de_DE.

Then that's where configuring the collation in the source config breaks down, because we don't have some entity which would rewrite the config when the locale changes - other than EDS itself, of course. And if EDS rewrites the config, the option becomes useless.

> > It looks to me that "phonebook" in EDS should be merely a flag for certain
> > fields which inside EDS gets translated into a suitable (hard-coded?!)
> > collation per current locale (ICU "phonebook" in de and fi, ICU "pinyin" in
> > Chinese locales). See also bug #64173.
> > 
> > Note that Pinyin may require additional work, see bug #64173 comment #3.
> 
> Nod, also indexing anything by first letter/character seems to need the same
> type of attention, there is AlphabeticIndex which can be used for this:
>     www.icu-project.org/apiref/icu4c/classicu_1_1AlphabeticIndex.html
> 
> My thoughts on this are, for sorting, one could use 2 sort keys for any
> given field, the primary sort key would be the 'bucket' (character) which a
> given word falls under according to AlphabeticIndex, and the actual sort key
> of the word would be the secondary sort key.
> 
> This would also require some API churn at higher levels, i.e. any view
> which wants to display a sorted list with letter/character grouping, would
> need to first query the addressbook to know which characters are valid
> for the current locale (and in what order those groups should be displayed).

This grouping is a separate problem. Let's keep it out of the discussion for now.

Comment 9 Tristan Van Berkom 2013-05-14 10:57:43 UTC

(In reply to comment #8)
> (In reply to comment #7)
> > (In reply to comment #6)
> > > (In reply to comment #5)
> > [...]
> > > > 
> > > > The idea is that "phonebook" collation is a preference that is either
> > > > available under the current locale, or not.
> > > > 
> > > > When a locale change has been detected, the sort keys will be generated
> > > > under the "phonebook" collation of the new locale... this will fallback
> > > > to the default ICU rules for the given locale if "phonebook" tailoring
> > > > is not available.
> > > 
> > > So the "phonebook" keyword is just a hint that this field is meant to be
> > > sorted according to phone book rules if applicable? Is there value in having
> > > to configure this, instead of hard-coding it for each field? Well, perhaps,
> > > because then it can be enabled for other fields, too, without having to
> > > modify the source code.
> > 
> > I'm asking myself the opposite question actually.
> > 
> > Does it make sense to assume that an addressbook should be sorted
> > using "phonebook" rules ? and apply that everywhere ?
> 
> I am less sure whether it should be applied to non-name fields. Not that
> sorting by address is useful, but in theory it is possible.
> 
> But in general I agree, always using "phonebook" (in de and fi) and "pinyin"
> (in China) sounds right.
> 
> > The only reason for making it configurable is to accommodate personal
> user preferences (both Pinyin and non-Pinyin sorting are valid), which (as
> far as I can tell) is not a useful use case (because Pinyin is sufficiently
> popular, and explaining to the user that they have the choice is too
> confusing).
> 
> > > I don't see much value in being able to turn off "phonebook" for name
> > > fields, though.
> > > 
> > > How would this work for Pinyin? In other words, what would
> > > CollationFields=family_name,pinyin:given_name,pinyin do when switching from
> > > zh_CN (where Pinyin is expected) to de_DE (where phone book is expected)?
> > 
> > If "pinyin" was explicitly specified as the collation rule for a given
> > field, then "phonebook" would definitely not be expected for de_DE.
> 
> Then that's where configuring the collation in the source config breaks
> down, because we don't have some entity which would rewrite the config when
> the locale changes - other than EDS itself, of course. And if EDS rewrites
> the config, the option becomes useless.

I'm not convinced of that. From what I understand, your requirements
are described by ICU as "phonebook" order, where "phonebook" order
does not exist in 'zh' locales, and the default in 'zh' locales
is already "pinyin" (if your requirement was instead "pinyin", then
you would not expect "phonebook" order in de_DE locale at all).

It may even be that one day "phonebook" order is added to 'zh' locales,
in which case you would want to automatically benefit from the new
"phonebook" tailoring (in whichever locale it gets added to).

Anyway, this discussion is not productive if we just have two sets
of opinions, so let's try to be more productive.

My perspective here is that you have a list of expected behaviours,
and I'm just not convinced that the expected behaviors for you,
are going to be the same as every other expected behavior of EDS
(which is why I think it would be safer to have something configurable,
which would probably be easier to convince upstream EDS to accept).

If the expected behaviours are indeed universal, or if we have some
very good arguments at least as to why these expected behaviours would be
universal for anything that sorts 'person names' in the context of an
'addressbook', then there may be a chance I can sell this hard coded
list to upstream.

In any case, I could start taking down a list of what collation rule
you want to be effective for which language code / country code, I'm
just not convinced that this approach is appropriate for EDS, or if
we can land it upstream this way.

> > > It looks to me that "phonebook" in EDS should be merely a flag for certain
> > > fields which inside EDS gets translated into a suitable (hard-coded?!)
> > > collation per current locale (ICU "phonebook" in de and fi, ICU "pinyin" in
> > > Chinese locales). See also bug #64173.
> > > 
> > > Note that Pinyin may require additional work, see bug #64173 comment #3.
> > 
> > Nod, also indexing anything by first letter/character seems to need the same
> > > type of attention, there is AlphabeticIndex which can be used for this:
> >     www.icu-project.org/apiref/icu4c/classicu_1_1AlphabeticIndex.html
> > 
> > My thoughts on this are, for sorting, one could use 2 sort keys for any
> > given field, the primary sort key would be the 'bucket' (character) which a
> > > given word falls under according to AlphabeticIndex, and the actual sort key
> > of the word would be the secondary sort key.
> > 
> > This would also require some API churn at higher levels, i.e. any view
> > which wants to display a sorted list with letter/character grouping, would
> > need to first query the addressbook to know which characters are valid
> > for the current locale (and in what order those groups should be displayed).
> 
> This grouping is a separate problem. Let's keep it out of the discussion for
> now.

(In reply to comment #8)
> (In reply to comment #7)
> > (In reply to comment #6)
> > > (In reply to comment #5)
> > [...]
> > > > 
> > > > The idea is that "phonebook" collation is a preference that is either
> > > > available under the current locale, or not.
> > > > 
> > > > When a locale change has been detected, the sort keys will be generated
> > > > under the "phonebook" collation of the new locale... this will fallback
> > > > to the default ICU rules for the given locale if "phonebook" tailoring
> > > > is not available.
> > > 
> > > So the "phonebook" keyword is just a hint that this field is meant to be
> > > sorted according to phone book rules if applicable? Is there value in having
> > > to configure this, instead of hard-coding it for each field? Well, perhaps,
> > > because then it can be enabled for other fields, too, without having to
> > > modify the source code.
> > 
> > I'm asking myself the opposite question actually.
> > 
> > Does it make sense to assume that an addressbook should be sorted
> > using "phonebook" rules ? and apply that everywhere ?
> 
> I am less sure whether it should be applied to non-name fields. Not that
> sorting by address is useful, but in theory it is possible.
> 
> But in general I agree, always using "phonebook" (in de and fi) and "pinyin"
> (in China) sounds right.
> 
> > The only reason for making it configurable is to accommodate personal
> user preferences (both Pinyin and non-Pinyin sorting are valid), which (as
> far as I can tell) is not a useful use case (because Pinyin is sufficiently
> popular, and explaining to the user that they have the choice is too
> confusing).
> 
> > > I don't see much value in being able to turn off "phonebook" for name
> > > fields, though.
> > > 
> > > How would this work for Pinyin? In other words, what would
> > > CollationFields=family_name,pinyin:given_name,pinyin do when switching from
> > > zh_CN (where Pinyin is expected) to de_DE (where phone book is expected)?
> > 
> > If "pinyin" was explicitly specified as the collation rule for a given
> > field, then "phonebook" would definitely not be expected for de_DE.
> 
> Then that's where configuring the collation in the source config breaks
> down, because we don't have some entity which would rewrite the config when
> the locale changes - other than EDS itself, of course. And if EDS rewrites
> the config, the option becomes useless.
> 

I'm not convinced of that. From what I understand, your requirements
are described by ICU as "phonebook" order, where "phonebook" order
does not exist in 'zh' locales, and the default in 'zh' locales
is already "pinyin" (if your requirement was instead "pinyin", then
you would definitely not expect "phonebook" order in de_DE locale at
all).

It may even be that one day "phonebook" order is added to 'zh' locales,
in which case you would want to automatically benefit from the new
"phonebook" tailoring (in whichever locale it gets added to).

Anyway, this discussion is not productive if we just have two sets
of opinions, so I'll try to be more productive and to the point.

My perspective here is that you have a list of expected behaviours,
and I'm just not convinced that the expected behaviors for you,
are going to be the same as every other expected behavior of EDS
(which is why I think it would be safer to have something configurable,
which would probably be easier to convince upstream EDS to accept).

If the expected behaviours are indeed universal, or if we have some
very good arguments at least as to why these expected behaviours would be
universal for anything that sorts 'person names' in the context of an
'addressbook', then there may be a chance I can sell this hard coded
list to upstream.

In any case, I could start taking down a list of what collation rule
you want to be effective for which language code / country code, I'm
just not convinced that this approach is appropriate for EDS, or if
we can land it upstream this way.

> > > It looks to me that "phonebook" in EDS should be merely a flag for certain
> > > fields which inside EDS gets translated into a suitable (hard-coded?!)
> > > collation per current locale (ICU "phonebook" in de and fi, ICU "pinyin" in
> > > Chinese locales). See also bug #64173.
> > > 
> > > Note that Pinyin may require additional work, see bug #64173 comment #3.
> > 
> > Nod, also indexing anything by first letter/character seems to need the same
> > > type of attention, there is AlphabeticIndex which can be used for this:
> >     www.icu-project.org/apiref/icu4c/classicu_1_1AlphabeticIndex.html
> > 
> > My thoughts on this are, for sorting, one could use 2 sort keys for any
> > given field, the primary sort key would be the 'bucket' (character) which a
> > > given word falls under according to AlphabeticIndex, and the actual sort key
> > of the word would be the secondary sort key.
> > 
> > This would also require some API churn at higher levels, i.e. any view
> > which wants to display a sorted list with letter/character grouping, would
> > need to first query the addressbook to know which characters are valid
> > for the current locale (and in what order those groups should be displayed).
> 
> This grouping is a separate problem. Let's keep it out of the discussion for
> now.

Comment 10 Tristan Van Berkom 2013-05-14 11:02:37 UTC

(In reply to comment #9)
> (In reply to comment #8)
> > (In reply to comment #7)
> > > (In reply to comment #6)
> > > > (In reply to comment #5)
[...]
> 
> I'm not convinced of that, from what I understand your requirements
> are described by ICU as "phonebook" order, where "phonebook" order
> does not exist in 'zh' locales, and the default in 'zh' locales
> is already "pinyin" (if your requirement was instead "pinyin", then
> you would definitely not expect "phonebook" order in de_DE locale at
> all).
> 
> It may even be that one day "phonebook" order is added to 'zh' locales,
> in which case you would want to automatically benefit from the new
> "phonebook" tailoring (in whichever locale it gets added to).
> 
> Anyway, this discussion is not productive if we just have two sets
> of opinions, so I'll try to be more productive and to the point.
> 
> My perspective here is that you have a list of expected behaviours,
> and I'm just not convinced that the expected behaviors for you,
> are going to be the same as every other expected behavior of EDS
> (which is why I think it would be safer to have something configurable,
> which would probably be easier to convince upstream EDS to accept).
> 
> If the expected behaviours are indeed universal, or if we have some
> very good arguments at least as to why these expected behaviours would be
> universal for anything that sorts 'person names' in the context of an
> 'addressbook', then there may be a chance I can sell this hard coded
> list to upstream.
> 
> In any case, I could start taking down a list of what collation rule
> you want to be effective for which language code / country code, I'm
> just not convinced that this approach is appropriate for EDS, or if
> we can land it upstream this way.
> 

Err, sorry I have no idea how the above was commented twice, there was
some mid-air collision, anyway, only the above was intended ;-)

Comment 11 Patrick Ohly 2013-05-14 12:55:09 UTC

(In reply to comment #9)
> > Then that's where configuring the collation in the source config breaks
> > down, because we don't have some entity which would rewrite the config when
> > the locale changes - other than EDS itself, of course. And if EDS rewrites
> > the config, the option becomes useless.
> 
> I'm not convinced of that, from what I understand your requirements
> are described by ICU as "phonebook" order, where "phonebook" order
> > does not exist in 'zh' locales, and the default in 'zh' locales
> is already "pinyin" (if your requirement was instead "pinyin", then
> you would not expect "phonebook" order in de_DE locale at all).

My expectation is that EDS does "the right thing" by default for each field, without some non-existent entity having to tell it what that is. By all means, let's add overrides for the default behavior in EDS, but let's keep these overrides optional.

The overrides would be:
- "phonebook" - enable phonebook
- "no-phonebook" - disable phonebook
- "pinyin" - enable pinyin
- "no-pinyin" - disable pinyin

And the defaults should be:
- use "phonebook" for name fields (and only for name fields) in
  "de" and "fi" locales
- use "pinyin" with Chinese Pinyin and Latin characters mixed (which is
  *not* the default in ICU) for all strings in Chinese

It's quite likely that this knowledgebase will grow over time as more localization experts tell us about the preferences in their country. EDS might not be the best place to store such a knowledgebase, but where else can we put it?

By not storing the default in the source config, we can fix the wrong default by upgrading EDS. If we were to put the setting into the config, it would be difficult to change later, because it would be uncertain whether it merely represents the old, incorrect default value or was chosen intentionally.
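
A minimal sketch of that idea, assuming a hypothetical helper effective_collation() and made-up field names (neither is an EDS API): the defaults come from a built-in table evaluated at runtime, and the source config only ever carries explicit overrides. Treating "strings in Chinese" as "any zh locale" is a simplification made for this sketch.

```python
from typing import Optional, Set

# Assumed built-in knowledge base; the entries mirror the defaults listed
# above and would be extended over time.
PHONEBOOK_LANGUAGES = {"de", "fi"}
NAME_FIELDS = {"full_name", "family_name", "given_name"}  # hypothetical names


def effective_collation(locale: str, field: str,
                        overrides: Optional[Set[str]] = None) -> Set[str]:
    """Return the collation tweaks to apply to `field` in `locale`."""
    overrides = overrides or set()
    # "de_DE.UTF-8" -> "de"
    language = locale.split("_")[0].split(".")[0].lower()

    # Defaults are computed at runtime and never written to the config.
    tweaks = set()
    if language in PHONEBOOK_LANGUAGES and field in NAME_FIELDS:
        tweaks.add("phonebook")
    if language == "zh":  # simplification: locale stands in for "Chinese text"
        tweaks.add("pinyin")

    # Optional overrides from the source config win over the defaults.
    for name in ("phonebook", "pinyin"):
        if name in overrides:
            tweaks.add(name)
        if "no-" + name in overrides:
            tweaks.discard(name)
    return tweaks


print(effective_collation("de_DE.UTF-8", "family_name"))
print(effective_collation("de_DE.UTF-8", "family_name", {"no-phonebook"}))
```

Because an empty override set means "use whatever the current EDS considers right", a corrected default takes effect on upgrade without touching any existing config.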

Comment 12 Tristan Van Berkom 2013-05-14 14:09:44 UTC

(In reply to comment #11)
> (In reply to comment #9)
> > > Then that's where configuring the collation in the source config breaks
> > > down, because we don't have some entity which would rewrite the config when
> > > the locale changes - other than EDS itself, of course. And if EDS rewrites
> > > the config, the option becomes useless.
> > 
> > I'm not convinced of that, from what I understand your requirements
> > are described by ICU as "phonebook" order, where "phonebook" order
> > > does not exist in 'zh' locales, and the default in 'zh' locales
> > is already "pinyin" (if your requirement was instead "pinyin", then
> > you would not expect "phonebook" order in de_DE locale at all).
> 
> My expectation is that EDS does "the right thing" by default for each field,
> without some non-existent entity having to tell it what that is. By all
> means, let's add overrides for the default behavior in EDS, but let's keep
> these overrides optional.
> 
> The overrides would be:
> - "phonebook" - enable phonebook
> - "no-phonebook" - disable phonebook
> - "pinyin" - enable pinyin
> - "no-pinyin" - disable pinyin
> 
> And the defaults should be:
> - use "phonebook" for name fields (and only for name fields) in
>   "de" and "fi" locales
> - use "pinyin" with Chinese Pinyin and Latin characters mixed (which is
>   *not* the default in ICU) for all strings in Chinese

Right, this is not exactly 'pinyin' according to ICU (as mentioned in
https://bugs.freedesktop.org/show_bug.cgi?id=64173#c3)

It might even mean using a different API than UCollator altogether
(I'm not sure yet what exactly a 'Han-Latin transliterator' is), but it
sounds like something doable underneath an abstraction API (such as
the ECollator API I've been working on).

Also, I know you would rather not mix subjects, but there is overlap here:
what happens when we want this specific interleaved brand of 'pinyin' to be
the sort order, and we also want to navigate to the results starting
with '江', or the results starting with 'J'?

If the results are interleaved with Latin characters, it's unclear at
this point whether this can work properly when combined with AlphabetIndex.

> It's quite likely that this knowledgebase will grow over time as more
> localization experts tell us about the preferences in their country. EDS
> might not be the best place to store such a knowledgebase, but where else
> can we put it?

I've been thinking, if there is a consensus on what this knowledgebase is,
it might be very appropriate to have this behaviour in:

https://developer.gnome.org/glib/unstable/glib-Unicode-Manipulation.html#g-utf8-collate

Which is already documented as:
  "Compares two strings for ordering using the linguistically correct rules for the current locale"

And it looks like this knowledgebase is exactly that, i.e. what are the "linguistically correct rules".

Perhaps it might even make sense to add something like:
  g_utf8_collate_key_for_addressbook()

Which might be acceptable for GLib, seeing as there is already an existing function:
  g_utf8_collate_key_for_filename()

(also, we can't very well suddenly change the behaviour of the existing
g_utf8_collate_key() function, as that would certainly break existing
applications).

Of course, we'd have to be very sure about the value of this knowledgebase
before proposing something like that for GLib, and it might help to have
the code live in EDS as a proof of concept before trying to push it into
GLib.
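
To illustrate what an address-book-specific collation key would do differently from a generic one, here is a toy sketch of the German phonebook convention (DIN 5007 variant 2), where umlauts sort as their two-letter expansions. phonebook_key_de() is a hypothetical name; this is neither ICU nor GLib code and ignores everything else a real collator handles.

```python
import unicodedata

# German phonebook convention: ä, ö, ü sort as ae, oe, ue.
# (casefold() below already maps ß to ss.)
_PHONEBOOK_EXPANSIONS = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue"})


def phonebook_key_de(name: str) -> str:
    """Toy sort key for German names in phonebook order."""
    # Compose "u" + combining diaeresis into "ü" before expanding it.
    folded = unicodedata.normalize("NFC", name).casefold()
    return folded.translate(_PHONEBOOK_EXPANSIONS)


names = ["Muller", "Müller", "Mueller", "Mufti"]
print(sorted(names, key=phonebook_key_de))
# → ['Müller', 'Mueller', 'Mufti', 'Muller']
```

In dictionary order "Müller" would sort next to "Muller"; in phonebook order it lands next to "Mueller", which is the behaviour wanted for name fields only.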

> 
> By not storing the default in the source config, we can fix the wrong
> default by upgrading EDS. If we were to put the setting into the config, it
> would be difficult to change later, because it would be uncertain whether it
> merely represents the old, incorrect default value or was chosen
> intentionally.

Comment 13 Patrick Ohly 2013-06-11 19:01:47 UTC

(In reply to comment #2)
> It probably would be best to just clone the config of the system address
> book when creating new databases via SyncEvolution.

This was implemented:

commit 5925ccee7f3b22d72af117db25fa21280433fa29
Author: Patrick Ohly <patrick.ohly@intel.com>
Date:   Thu May 23 17:38:04 2013 +0200

    EDS: create new databases by cloning the builtin ones (FDO #64176)
    
    Instead of hard-coding a specific "Backend Summary Setup" in
    SyncEvolution, copy the config of the system database. That way
    special flags (like the desired "Backend Summary Setup" for local
    address books) can be set on a system-wide basis and without having to
    modify or configure SyncEvolution.
    
    Because EDS has no APIs to clone an ESource or turn a .source file
    into a new ESource, SyncEvolution has to resort to manipulating and
    creating the keyfile directly.

The "openismus work" branches already include a suitably patched system DB, so I consider the issue resolved.
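
The cloning described in the commit message can be sketched roughly as follows. This is not the SyncEvolution code (which is C++); clone_source_keyfile() is a hypothetical helper, the sample keyfile content and its SummaryFields value are placeholders, and dropping localized DisplayName[xx] entries is an assumption of this sketch.

```python
import configparser
import io


def clone_source_keyfile(system_keyfile: str, display_name: str) -> str:
    """Copy a .source keyfile, changing only the display name."""
    parser = configparser.ConfigParser(interpolation=None,
                                       delimiters=("=",),
                                       comment_prefixes=("#",))
    parser.optionxform = str  # keyfile keys are case-sensitive
    parser.read_string(system_keyfile)

    section = parser["Data Source"]
    # Assumption: localized names of the system DB do not apply to the clone.
    for key in [k for k in section if k.startswith("DisplayName")]:
        del section[key]
    section["DisplayName"] = display_name

    # Every other group, including "Backend Summary Setup", is copied as-is.
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


# Placeholder content standing in for the system address book's keyfile.
SYSTEM = """\
[Data Source]
DisplayName=Personal
DisplayName[de]=Persönlich
Enabled=true

[Address Book]
BackendName=local

[Backend Summary Setup]
SummaryFields=full-name;family-name
"""

cloned = clone_source_keyfile(SYSTEM, "PBAP cache")
print(cloned)
```

The point of the approach is that the summary configuration is never named in the cloning code, so a distro can change it in the system address book alone.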
