Bug 62473 - MEDIA/SUBTYPE.xml files should be stored in lowercase
Status: RESOLVED FIXED
Alias: None
Product: shared-mime-info
Classification: Unclassified
Component: general
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: Shared Mime Info group
Reported: 2013-03-18 14:21 UTC by Thomas Kluyver
Modified: 2014-04-22 22:04 UTC

Attachments
Patch (1.42 KB, patch)
2013-03-19 20:30 UTC, Thomas Kluyver
Store MEDIA/SUBTYPE.xml files in lowercase (1.54 KB, patch)
2014-03-31 14:33 UTC, Bastien Nocera

Description Thomas Kluyver 2013-03-18 14:21:18 UTC
RFC 2045 says that MIME type names (e.g. text/plain) are case-insensitive. Most are written by convention in lowercase, although there are a few exceptions currently in the freedesktop.org.xml database.

At present, the MEDIA/SUBTYPE.xml files are named using the case as defined in the master XML files. This has two problems:

- Implementations have to store the original spelling, so that they can find these files on case-sensitive filesystems.
- There may be conflicting files that differ only in case. E.g. on my system, various MS Office formats are defined with both 'macroEnabled' and 'macroenabled'.
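
The second problem can be checked for mechanically. Here is a minimal Python sketch, assuming you already have the MIME type names as a list of strings; `find_case_conflicts` and the sample names are hypothetical and not part of shared-mime-info:

```python
from collections import defaultdict


def find_case_conflicts(mime_types):
    """Group MIME type names that differ only in letter case."""
    groups = defaultdict(set)
    for name in mime_types:
        # MIME type names are ASCII, so str.lower() is enough here.
        groups[name.lower()].add(name)
    # Keep only the groups that have more than one distinct spelling.
    return {
        key: sorted(spellings)
        for key, spellings in groups.items()
        if len(spellings) > 1
    }


# Hypothetical sample, loosely based on the names mentioned in this report.
names = [
    "application/vnd.ms-excel.sheet.macroEnabled.12",
    "application/vnd.ms-excel.sheet.macroenabled.12",
    "audio/AMR",
    "text/plain",
]
conflicts = find_case_conflicts(names)
```

Any key that shows up in `conflicts` would map to two different files on a case-sensitive filesystem.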

I suggest storing all these files with lowercase filenames. This is a minimally invasive change, because almost all of them are already lowercase. Besides a dozen 'macroEnabled' formats, the only exceptions are audio/AMR, audio/AMR-WB and text/x-iMelody.
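
To illustrate the proposal, this is roughly the mapping I have in mind, as a Python sketch. `subtype_xml_path` is a hypothetical helper, not the actual update-mime-database code, and the validation is deliberately minimal:

```python
from pathlib import PurePosixPath


def subtype_xml_path(mime_type):
    """Return the lowercase MEDIA/SUBTYPE.xml path for a MIME type name."""
    media, sep, subtype = mime_type.partition("/")
    if not sep or not media or not subtype or "/" in subtype:
        raise ValueError(f"not a MEDIA/SUBTYPE name: {mime_type!r}")
    # Lowercase both halves so every spelling maps to the same file.
    return PurePosixPath(media.lower()) / (subtype.lower() + ".xml")


amr_path = subtype_xml_path("audio/AMR-WB")
imelody_path = subtype_xml_path("text/x-iMelody")
```

With this, both 'macroEnabled' and 'macroenabled' spellings of a type end up with the same filename.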

This would require changes to the shared-mime-info spec and the code that creates these files.

Mailing list discussion: http://lists.freedesktop.org/archives/xdg/2013-March/012711.html
Comment 1 David Faure 2013-03-19 13:42:22 UTC
I am very much tempted to say that libreoffice.xml is buggy, instead:
The IANA registration for the macroEnabled stuff uses uppercase 'E'.
http://www.iana.org/assignments/media-types/application/vnd.ms-excel.sheet.macroEnabled.12

(Ideally libreoffice.xml should not define this mimetype at all, since it's already defined in freedesktop.org.xml, provided that they can rely on a recent enough version of it).


However I see that Bastien quoted in a recent commit http://tools.ietf.org/html/rfc2045#section-5.1, which says that mimetypes are not case sensitive. If we agree to that, then the mime spec should say so explicitly, and we have quite some code to fix... at least I do :)
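
For what it's worth, the kind of fix meant here would look something like the following Python sketch: a lookup table that normalises the name on both insert and lookup. `MimeTable` is a hypothetical class for illustration only, not code from any existing implementation:

```python
class MimeTable:
    """Hypothetical lookup table that treats MIME type names case-insensitively."""

    def __init__(self):
        self._entries = {}

    def add(self, mime_type, comment):
        # Normalise on insert so only one spelling is ever stored as a key.
        self._entries[mime_type.lower()] = comment

    def get(self, mime_type):
        # Normalise on lookup too; returns None for unknown types.
        return self._entries.get(mime_type.lower())


table = MimeTable()
table.add("audio/AMR", "AMR audio")
```

After this, `table.get("audio/amr")` and `table.get("AUDIO/AMR")` find the same entry, so the original spelling no longer needs to be remembered.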
Comment 2 Bastien Nocera 2013-03-19 14:59:49 UTC
(In reply to comment #1)
<snip>
> However I see that Bastien quoted in a recent commit
> http://tools.ietf.org/html/rfc2045#section-5.1, which says that mimetypes
> are not case sensitive. If we agree to that, then the mime spec should say
> so explicitly, and we have quite some code to fix... at least I do :)

We never actually *define* mime-type anywhere in the spec. We also don't define URI schemes. I don't think that it needs to "say so explicitly", when the specs we rely on already do.
Comment 3 Thomas Kluyver 2013-03-19 20:30:40 UTC
Created attachment 76776
Patch

Here's a shot at a patch. I haven't done much C, so it probably needs some polish.

I first tried changing the name just before it saves the MEDIA/SUBTYPE.xml file, but when I tried that, I found that it didn't remove the old uppercase files, because the code that checks whether to delete them was still finding their names as MIME types. So this patch lowercases the name earlier, before it puts a MIME type into the hash table.
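
To show the difference between the two approaches, here is a toy Python model of one database rebuild. It is only a sketch of the behaviour described above, under the assumption that the cleanup pass keeps any existing file whose name is still a known type; `rebuild` and its parameters are hypothetical and do not mirror the real C code:

```python
def rebuild(existing_files, source_types, normalize_early):
    """Toy model of one rebuild; returns the set of files present afterwards."""
    # Table of known types, as the cleanup pass will see them.
    if normalize_early:
        table = {t.lower() for t in source_types}
    else:
        table = set(source_types)

    # Cleanup pass: an existing file survives if its name is a known type.
    suffix = ".xml"
    kept = {
        f for f in existing_files
        if f.endswith(suffix) and f[:-len(suffix)] in table
    }

    # Write pass: filenames are lowercased at this point in both variants.
    written = {t.lower() + suffix for t in table}
    return kept | written


late = rebuild({"audio/AMR.xml"}, ["audio/AMR"], normalize_early=False)
early = rebuild({"audio/AMR.xml"}, ["audio/AMR"], normalize_early=True)
```

In this model `late` still contains the old audio/AMR.xml next to the new audio/amr.xml, while `early` contains only audio/amr.xml.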
Comment 4 Thomas Kluyver 2013-04-29 09:09:30 UTC
Any more thoughts on this? I'm aiming for a 1.0 release of PyXDG. If this is what we're going to do, I'll just store all the names in lower case. If not, I'll need to rework things.
Comment 5 Bastien Nocera 2014-03-31 14:33:51 UTC
Created attachment 96664
Store MEDIA/SUBTYPE.xml files in lowercase

RFC 2045 says that MIME type names (e.g. text/plain) are
case-insensitive. Most are written by convention in lowercase, although
there are a few exceptions currently in the freedesktop.org.xml
database.

Store the separate mime files as lower-case to make them easily findable
(eg. the synonymous application/vnd.*macroEnabled* and *macroenabled*
mime-type should have the same filename).
Comment 6 Bastien Nocera 2014-03-31 14:54:37 UTC
Instead of lower-casing the mime-type itself, I lower-cased only the filename, at the point where the file is written. Can you check whether that works for you?
Comment 7 Bastien Nocera 2014-04-01 12:16:07 UTC
commit 3805d0bcf22b6344fb4a4a36ad4e15e30d17b624
Author: Bastien Nocera <hadess@hadess.net>
Date:   Mon Mar 31 16:22:33 2014 +0200

    Store MEDIA/SUBTYPE.xml files in lowercase
    
    RFC 2045 says that MIME type names (e.g. text/plain) are
    case-insensitive. Most are written by convention in lowercase, although
    there are a few exceptions currently in the freedesktop.org.xml
    database.
    
    Store the separate mime files as lower-case to make them easily findable
    (eg. the synonymous application/vnd.*macroEnabled* and *macroenabled*
    mime-type should have the same filename).
    
    https://bugs.freedesktop.org/show_bug.cgi?id=62473
Comment 8 Thomas Kluyver 2014-04-22 22:04:08 UTC
Sorry for the delayed reply.

When I tried that approach, I found that updating the mime db cache didn't remove existing .xml files with capitalised letters in their name - i.e. if beforehand you had audio/AMR.xml, afterwards you'd have audio/AMR.xml and audio/amr.xml. I consider that undesirable, although it's not a major problem.

