Bug 34978

Summary: Add international encodings to C.I.Cellular
Product: Telepathy
Reporter: Marco Barisione <marco.barisione>
Component: tp-spec
Assignee: Telepathy bugs list <telepathy-bugs>
Status: RESOLVED FIXED
QA Contact: Telepathy bugs list <telepathy-bugs>
Severity: normal
Priority: medium
CC: lassi.syrjala
Version: git master
Keywords: patch
Hardware: Other
OS: All
URL: http://git.collabora.co.uk/?p=user/bari/telepathy-spec.git;a=shortlog;h=refs/heads/international-charsets
Whiteboard:

Description Marco Barisione 2011-03-03 10:51:26 UTC
At the moment the Cellular interface lets you specify whether to use the 7-bit GSM encoding or UCS-2 (meaning that you get roughly half as many characters in an SMS).
We need to add another property to specify the national shift table to use as an alternative to 7-bit GSM (basically a way to encode other characters), while still keeping the ability to use UCS-2 if needed.
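
To put numbers on the "half of the characters" remark, here is a minimal sketch of the capacity arithmetic. It assumes a single, non-concatenated SMS with a 140-octet payload and no user data header; the function and constant names are illustrative and do not come from the spec.

```python
# Assumption: user-data size of a single, non-concatenated SMS without a header.
SMS_PAYLOAD_OCTETS = 140


def max_chars(bits_per_char: int, payload_octets: int = SMS_PAYLOAD_OCTETS) -> int:
    """Return how many fixed-width characters fit in the payload."""
    return (payload_octets * 8) // bits_per_char


# 7-bit GSM: one septet per character (extension-table characters cost two).
gsm7_capacity = max_chars(7)
# UCS-2: two octets per character.
ucs2_capacity = max_chars(16)

print(gsm7_capacity, ucs2_capacity)  # 160 70
```

Selecting a national shift table requires a user data header, so the usable capacity in that case is somewhat lower than the 7-bit figure above.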
Comment 1 Marco Barisione 2011-03-03 10:56:30 UTC
The branch at http://git.collabora.co.uk/?p=user/bari/telepathy-spec.git;a=shortlog;h=refs/heads/international-charsets adds a C.I.Cellular.MessageNationalCharacterSet property.

The reason for having a string property is that new charsets are being added with every standard release, and we should avoid requiring a spec update every time.
I'm calling this property MessageNationalCharset as the word "national" is used in the standard (even though multiple nations can share the same language...).

The behaviour I expect from a CM is:
* If MessageNationalCharset is 'gsm':
  * We can fit the text in the 7-bit GSM encoding:
    * Send it unchanged in GSM encoding
  * We cannot:
    * MessageReducedCharacterSet is true:
      * Force the conversion to 7-bit GSM
    * MessageReducedCharacterSet is false:
      * Use UCS-2
* If MessageNationalCharset is another encoding we understand:
  * We can fit the text in the encoding:
    * Send it encoded in the national charset (single shift or locking
      shift? probably the CM should choose the best one)
  * We cannot:
    * MessageReducedCharacterSet is true:
      * Force the conversion to the national encoding
    * MessageReducedCharacterSet is false:
      * Use UCS-2
* If MessageNationalCharset is an encoding we don't know:
  * Just behave as if the encoding was GSM. (Or better, the CM should not
    accept setting the value to something it doesn't understand)
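
The decision tree above can be sketched roughly as follows. This is only an illustration of the expected behaviour, not code from any connection manager: `choose_sms_encoding` and `TOY_REPERTOIRES` are hypothetical names, and the character sets are deliberately tiny stand-ins rather than the real 3GPP 23.038 tables.

```python
import string
from typing import AbstractSet, Mapping, Tuple

# Assumption: toy repertoires for illustration only, NOT the real GSM tables.
_TOY_GSM = frozenset(string.ascii_letters + string.digits + " .,!?")
TOY_REPERTOIRES: Mapping[str, AbstractSet[str]] = {
    "gsm": _TOY_GSM,
    "turkey": _TOY_GSM | frozenset("çğışĞİŞ"),
}


def choose_sms_encoding(
    text: str,
    national_charset: str,
    reduced_charset: bool,
    repertoires: Mapping[str, AbstractSet[str]] = TOY_REPERTOIRES,
) -> Tuple[str, bool]:
    """Return (encoding name, lossy), where lossy means a forced conversion."""
    # Unknown charset: behave as if it were GSM. (A stricter CM could
    # instead refuse to set the property in the first place.)
    if national_charset not in repertoires:
        national_charset = "gsm"

    # The text fits: send it unchanged in the selected charset.
    if all(ch in repertoires[national_charset] for ch in text):
        return (national_charset, False)

    # The text does not fit: either force the conversion or use UCS-2,
    # depending on MessageReducedCharacterSet.
    if reduced_charset:
        return (national_charset, True)
    return ("ucs-2", False)


print(choose_sms_encoding("ışık", "turkey", False))  # ('turkey', False)
print(choose_sms_encoding("ışık", "gsm", False))     # ('ucs-2', False)
```

How a forced conversion substitutes unrepresentable characters is left out on purpose, since the comment above does not specify it.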

In case anybody wants to know more about GSM encodings, look at http://www.3gpp.org/ftp/specs/archive/23_series/23.038/
Comment 2 Will Thompson 2011-03-07 09:19:50 UTC
+          Other valid character set are specified in the GSM standard and are,
+          for instance, “turkey”, “spain” or “portugal”.

You should link to the GSM standard, ideally… Which of the many zip files in http://www.3gpp.org/ftp/specs/archive/23_series/23.038/ should one look at?

I think the strings should be written as <code>"turkey"</code> in the spec, rather than with curly quotes.

 SMSes will be
+          encoded in the normal 7-bit GSM character set or in UCS-2, based on
+          the value of the
+          <tp:member-ref>MessageReducedCharacterSet</tp:member-ref> property.

… and whether the message can be represented by the normal 7-bit character set?

I think it looks fine otherwise.
Comment 3 Marco Barisione 2011-03-07 10:59:53 UTC
(In reply to comment #2)
> +          Other valid character set are specified in the GSM standard and are,
> +          for instance, “turkey”, “spain” or “portugal”.
> 
> You should link to the GSM standard, ideally… Which of the many zip files in
> http://www.3gpp.org/ftp/specs/archive/23_series/23.038/ should one look at?

The most recent one. They are different revisions of the same document, so I cannot just link to one as it will become outdated.

> I think the strings should be written as <code>"turkey"</code> in the spec,
> rather than with curly quotes.
>
Done.

>  SMSes will be
> +          encoded in the normal 7-bit GSM character set or in UCS-2, based on
> +          the value of the
> +          <tp:member-ref>MessageReducedCharacterSet</tp:member-ref> property.
> 
> … and whether the message can be represented by the normal 7-bit character set?

Done.

> I think it looks fine otherwise.

I will push it tomorrow if you don't have objections.
Comment 4 Will Thompson 2011-03-08 06:26:33 UTC
+          to “gsm”.</p>

you missed one. But if you fix this, ship it.
Comment 5 Will Thompson 2011-03-14 12:56:16 UTC
This was shipped in 0.21.12.
