Bug 12890

Summary: Xiph mime glob/magic handling needs updating (application/ogg etc.)
Product: shared-mime-info Reporter: Ed Catmur <ed>
Component: freedesktop.org.xml Assignee: Jonathan Blandford <jrb>
Status: RESOLVED FIXED QA Contact:
Severity: normal    
Priority: medium CC: bugzilla, ghepeu, justivo, robin
Version: unspecified   
Hardware: All   
OS: All   
URL: http://wiki.xiph.org/index.php/MIME_Types_and_File_Extensions
Whiteboard:
Attachments: fdo-ogg-mime.patch

Description Ed Catmur 2007-10-22 16:38:10 UTC
According to http://wiki.xiph.org/index.php/MIME_Types_and_File_Extensions , application/ogg should be used for the *.ogx glob, but *.ogg should now resolve to audio/ogg (or audio/x-ogg).

There may be other stuff that needs to be changed.
Comment 1 Ed Catmur 2007-10-22 16:49:34 UTC
Created attachment 12157
fdo-ogg-mime.patch
Comment 2 Bastien Nocera 2007-10-23 01:44:43 UTC
And break all the existing ogg videos?
Comment 3 Giacomo Perale 2007-10-23 08:14:45 UTC
Well, I think that most .ogg files are audio files anyway and they'll keep working; AFAIK ogg+theora isn't widely used (and if I remember correctly some years ago most ogg videos were distributed as .ogm files, so this wouldn't be the first extension change).
Comment 4 Bastien Nocera 2007-10-23 08:20:23 UTC
(In reply to comment #3)
> Well, I think that most .ogg files are audio files anyway and they'll keep
> working; AFAIK ogg+theora isn't widely used (and if I remember correctly some
> years ago most ogg videos were distributed as .ogm files, so this wouldn't be
> the first extension change).

Depends which domain you work in. There are loads of screencasts and videos available which use .ogg, and are Ogg Theora. OGM files aren't proper Ogg files (see http://en.wikipedia.org/wiki/Ogm).

I don't think that only allowing audio files to be named .ogg is a good idea. The Xiph people made that mistake despite what people were telling them. Now the users will have to live with it.

I might change my mind in a couple of years when .ogg files are only used for audio (again).
Comment 5 Giacomo Perale 2007-10-23 17:39:41 UTC
(In reply to comment #4)
> There are loads of screencasts and videos available which use .ogg, and are Ogg
> Theora.

Screencasts are not exactly something you keep forever and back up on CD or DVD; they're transient files that are quickly deleted. However, this is a valid point.

> 
> I might change my mind in a couple of years when .ogg files are only used for
> audio (again).
> 

This is one of those situations in which nobody starts because nobody started.
I suggest applying a slightly modified version of the patch that adds all the new extensions (.ogv, .ogx, .spx, .oga) but keeps .ogg as application/ogg. This way, when applications start producing .ogv files (or someone renames their files) they'll be immediately recognized as Ogg Video, AND the current videos will work too. OT, but since you also maintain totem, adding a new schema for an ogv thumbnailer should be easy.

By the way, I applied the patch and this new approach fixes some minor glitches: for example, when you open a directory containing ogg audio files, totem-video-thumbnailer tries to open them and fails, so for a while you see the "clock" icon instead of the music icons. Now this doesn't happen anymore. Not to mention the preview problem that prompted the opening of this bug report.
Comment 6 Ivo Emanuel Gonçalves 2007-10-25 08:25:46 UTC
Although this may appear to be only a wiki page, it is not.  It is a full-fledged Internet Draft[1] that will be submitted to the IETF in two or three days.  The developers here are advised to read it, as all implementations dealing with Ogg files that do not follow the new guidelines will be marked as non-compliant.

Here are some brief details of what you need to know:
* Ogg files are now divided into Ogg Video, Ogg Audio, and Ogg Application.
* The codecs are defined by name only (e.g. no "Ogg Theora", but Theora).
* application/ogg is for cases where a basic media file like a video is not suited; imagine a large bunch of video streams and timed metadata being changed constantly.
* video/ogg and audio/ogg files SHOULD have a Skeleton stream.  This will make it easier to detect what's inside an Ogg file.  Skeleton is an identifier track, which is now a requirement for most Ogg content, except as noted below.
* There's no such thing as OGM.  It is not a Xiph thing and should never be considered, as besides being an Ogg hack, it usually contains MPEG-4 video, which in most countries means Linux cannot play it legally.
* Vorbis- and Speex-only files (usually .ogg and .spx) will not have a Skeleton stream.  They will also not use the new .oga extension.  This is for backwards-compatibility.  We care about that.
* Legacy Theora files, those without Skeleton and using .ogg, will require a compatibility mode that is beyond the reach of media types and file extensions.  Please do not use application/ogg to identify these files.
* Besides the new media types and file extensions, Macintosh File Type Codes have been changed to OggA, OggV, and OggX.  If this is relevant to the GNOME case or not, I do not know, but it may help in the identification process.

All these changes are to help you developers and the users.  The transition phase may be hard, but stick with us and it will improve the multimedia situation in Linux (and other free OS's).

However, the patch in this thread does not seem to be up for the task.

[1] https://trac.xiph.org/browser/experimental/ivo/drafts/draft-xiph-rfc3534bis.txt
Comment 7 Ed Catmur 2007-10-29 01:09:42 UTC
(In reply to comment #6)
> However, the patch in this thread does not seem to be up for the task.

What makes you say that?  How could it be improved?
Comment 8 Ivo Emanuel Gonçalves 2007-10-29 21:39:08 UTC
(In reply to comment #7)
> What makes you say that?  How could it be improved?

Correct me if I'm wrong, but this is my take on the patch:

It is mentioning something called audio/x-ogg, which doesn't exist, and it's not recommended to be used anywhere.  Same for video/x-ogm+ogg.  What is this?  It's not anything recommended by Xiph.

There are some Theora legacy files (without Skeleton) and using the file extension .ogg.  They should use the video/ogg type, but there should be a distinction of course, as .ogg is now only for Vorbis files.  In that sense, this patch does not address the problem.  My suggestion is that when dealing with a .ogg file, make the application look deeper to see if there's no Skeleton track (which should be the case) and if it's either Vorbis (normal) or Theora (legacy).  If the patch only looks at the OggS magic code, it will not make a distinction, and then the problems mentioned above by Bastien Nocera will happen.
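
A minimal sketch of the "look deeper" check described above, assuming the first Ogg page is available in memory. The function names, the helper that builds a fake page, and the signature table are illustrative assumptions and are not taken from the patch; the byte signatures reflect the commonly documented first-packet headers of each codec.

```python
# Signatures expected at the start of the first packet of a logical stream.
# Assumption: these are the usual identification-header prefixes; they are
# not copied from the patch discussed in this bug.
SIGNATURES = [
    (b"fishead\x00", "skeleton"),
    (b"\x01vorbis", "vorbis"),
    (b"\x80theora", "theora"),
    (b"Speex   ", "speex"),
]


def first_packet_codec(data: bytes):
    """Return a codec name for the first Ogg page in `data`, or None if not Ogg."""
    # An Ogg page starts with the 4-byte capture pattern "OggS"; the fixed
    # header is 27 bytes long and byte 26 holds the number of segments.
    if len(data) < 27 or data[:4] != b"OggS":
        return None
    n_segments = data[26]
    # The segment table (one byte per segment) follows the fixed header.
    payload = data[27 + n_segments:]
    for magic, name in SIGNATURES:
        if payload.startswith(magic):
            return name
    return "unknown"


def make_page(payload: bytes) -> bytes:
    """Build a fake single-segment first page (CRC left as zero), for demonstration only."""
    assert len(payload) < 255
    header = (
        b"OggS"
        + bytes([0, 0x02])        # version, header type (beginning of stream)
        + bytes(8)                # granule position
        + bytes(4)                # stream serial number
        + bytes(4)                # page sequence number
        + bytes(4)                # CRC (not computed in this sketch)
        + bytes([1, len(payload)])  # segment count, then a one-entry segment table
    )
    return header + payload
```

A caller handling a .ogg file could treat a "vorbis" result as the normal case and a "theora" result as the legacy case; a file with several multiplexed streams would need more than the first page to be inspected.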

I see .ogx, but I do not see any mention of application/ogg.  Was it cut out from the patch but is in the code?  Or is it missing altogether?

Speex files do not have a specific media type (e.g. audio/x-speex) _unless_ they are being transported through RTP or are otherwise "Oggless".  If they are in Ogg (common), they are audio/ogg, even though for backwards-compatibility those files will use the .spx extension.  Speex files do not use the .ogg extension.  I'm not aware of any that does so.

If you are wondering about OGM, even though it's not something from Xiph, I recommend that it should use the video/ogg type, albeit it should continue to use the .ogm file extension to make a distinction between free formats and non-free formats (like MPEG-4, which is common in OGM files).  Why should it use the video/ogg type?  Because as far as I see, it's still the Ogg container we are dealing with here, and adding more complexity through extra media types sounds counter-productive.

I'm CC'd on this bug.  Any questions, comments, anything, feel free to state them.  I will be glad to help.
Comment 9 Ed Catmur 2007-10-30 10:02:02 UTC
(In reply to comment #8)
> It is mentioning something called audio/x-ogg, which doesn't exist, and it's
> not recommended to be used anywhere.
audio/x-ogg is provided as an alias.  This allows software that wishes to respect RFC 2045[1] to use an x-token form extension MIME type, querying for "audio/x-ogg" and transparently interoperating with software that uses the not-yet-standard "audio/ogg".

> Same for video/x-ogm+ogg.  What is this? 
This is a subtype of video/ogg.  f.d.o uses subtypes, analogously to the "+xml" subtypes standardised in [4], to allow software to handle file types with an appropriate degree of granularity for the application.  Again, it uses an x-token because it is not standardised, but can be identified as a quasi-standard within the free desktop.

> It's not anything recommended by Xiph.
Xiph recommends audio/ogg, but this is not a valid MIME type as it has not yet been registered with IANA or in a standards track RFC.
Anyway, [2] states:
# DISCLAIMER: currently, only application/ogg is a registered MIME type.
# Registration for the others is being undertaken. During this process, the
# "x-" versions of these unregistered MIME types may be used. 

> There are some Theora legacy files (without Skeleton) and using the file
> extension .ogg.  They should use the video/ogg type, but there should be a
> distinction of course, as .ogg is now only for Vorbis files.  In that sense,
> this patch does not address the problem.  My suggestion is that when dealing
> with a .ogg file, make the application look deeper to see if there's no
> Skeleton track (which should be the case) and if it's either Vorbis (normal) or
> Theora (legacy).  If the patch only looks at the OggS magic code, it will not
> make a distinction, and then the problems mentioned above by Bastien Nocera
> will happen.
This patch is to the f.d.o base MIME data file.  It's up to applications what to do with the data, but most will use globs for preliminary (fast) classification and magic for full content sniffing.  Legacy Theora files with a .ogg extension would get identified as audio/ogg initially but sniffed as video/x-theora+ogg (a subtype of video/ogg) as soon as their content is examined (e.g. when launching from Nautilus).

Files with glob *.ogg need to be fast-classified without content sniffing.  Since [2] reserves .ogg for audio/ogg only, f.d.o should respect this.  Users are still able to force content sniffing of deprecated Theora-in-.ogg files (e.g. in Nautilus, by selecting the relevant files).

Currently in f.d.o, *.ogg files are fast-classified as application/ogg; since this does not identify the content media type, it breaks both audio/ogg and video/ogg files.
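
The two-stage behaviour described above (globs first, magic when content is examined) can be sketched as follows. This is a hedged illustration, not how any particular application or the f.d.o database is implemented: the tables combine type names mentioned in this thread, the function names are hypothetical, and the magic offset of 28 assumes a first page with a single-entry segment table.

```python
from fnmatch import fnmatchcase

# Glob table: fast classification by file name only.
# Assumption: entries are assembled from names mentioned in this thread,
# not copied from the patch.
GLOBS = [
    ("*.ogg", "audio/ogg"),
    ("*.oga", "audio/ogg"),
    ("*.ogv", "video/ogg"),
    ("*.ogx", "application/ogg"),
    ("*.spx", "audio/x-speex+ogg"),
]

# Magic table: (offset, bytes, type), most specific entries first.
# Assumption: offset 28 = 27-byte page header + one segment-table byte.
MAGIC = [
    (28, b"\x80theora", "video/x-theora+ogg"),
    (28, b"Speex   ", "audio/x-speex+ogg"),
    (0, b"OggS", "application/ogg"),
]


def classify_by_name(filename: str):
    """Fast path: match the file name against the glob table."""
    lowered = filename.lower()
    for pattern, mime in GLOBS:
        if fnmatchcase(lowered, pattern):
            return mime
    return None


def classify_by_content(data: bytes):
    """Slow path: sniff the leading bytes against the magic table."""
    for offset, magic, mime in MAGIC:
        if data[offset:offset + len(magic)] == magic:
            return mime
    return None
```

With these tables, a legacy Theora file named clip.ogg is reported as audio/ogg by `classify_by_name` and as video/x-theora+ogg by `classify_by_content`, which is the behaviour the comment describes.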

> I see .ogx, but I do not see any mention of application/ogg.  Was it cut out
> from the patch but is in the code?  Or is it missing altogether?
The first chunk applies to the application/ogg type.  Perhaps you might examine the intended base file for the patch [3]: freedesktop.org.xml.in is converted at compile time (to include localisations) into a f.d.o XML mime file, which is then parsed at package install time into the f.d.o mime database.

> Speex files do not have a specific media type (e.g. audio/x-speex) _unless_
> they are being transported through RTP or are otherwise "Oggless".  If they are
> in Ogg (common), they are audio/ogg, even though for backwards-compatibility
> those files will use the .spx extension.  Speex files do not use the .ogg
> extension.  I'm not aware of any that does so.
Yes; the patch adds the *.spx glob to the audio/x-speex+ogg subtype.  audio/x-speex is here reserved for Oggless Speex.

> If you are wondering about OGM, even though it's not something from Xiph, I
> recommend that it should use the video/ogg type, albeit it should continue to
> use the .ogm file extension to make a distinction between free formats and
> non-free formats (like MPEG-4, which is common in OGM files).  Why should it
> use the video/ogg type?  Because as far as I see, it's still the Ogg container
> we are dealing with here, and adding more complexity through extra media types
> sounds counter-productive.
OGM here uses video/x-ogm+ogg, a subtype of video/ogg.  Applications on the free desktop use MIME types, not file extensions, as the general, flexible file-type vocabulary, so for the distinction between free and non-free formats to be understood it needs to be expressed in terms of MIME types.  The subtype mechanism in f.d.o allows the inherent complexity of container and contained types to be expressed and used in a natural manner.
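
The alias and subtype mechanisms mentioned in this comment can be sketched as a small lookup. This is an illustration under stated assumptions, not the f.d.o implementation: the dictionaries and function names are hypothetical, and the parent of audio/x-speex+ogg is assumed to be audio/ogg.

```python
# Alias table: alternative names that resolve to one canonical type.
ALIASES = {
    "audio/x-ogg": "audio/ogg",
}

# Subtype table: each type maps to its parent type.
# Assumption: audio/x-speex+ogg is taken to be a subtype of audio/ogg.
PARENTS = {
    "video/x-theora+ogg": "video/ogg",
    "video/x-ogm+ogg": "video/ogg",
    "audio/x-speex+ogg": "audio/ogg",
}


def canonical(mime: str) -> str:
    """Resolve an alias to its canonical type name."""
    return ALIASES.get(mime, mime)


def is_a(mime: str, wanted: str) -> bool:
    """Return True if `mime` is `wanted` or one of its subtypes."""
    current = canonical(mime)
    target = canonical(wanted)
    while current is not None:
        if current == target:
            return True
        current = PARENTS.get(current)  # walk up to the parent, if any
    return False
```

An application that only cares about "some Ogg video" can ask `is_a(t, "video/ogg")`, while one that needs to single out OGM can compare against video/x-ogm+ogg directly.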

1. http://tools.ietf.org/html/rfc2045#section-5.1
2. http://wiki.xiph.org/index.php/MIME_Types_and_File_Extensions
3. http://webcvs.freedesktop.org/mime/shared-mime-info/freedesktop.org.xml.in?view=markup
4. http://tools.ietf.org/html/rfc3023
Comment 10 Ivo Emanuel Gonçalves 2008-08-26 21:35:59 UTC
Take note that video/ogg and audio/ogg have been registered with the IANA.  Refer to IANA's index and/or RFC 5334.  Also note that audio/vorbis and audio/vorbis-config were registered for use on RTP (RFC 5215).

You are invited to use these appropriate media types as described in RFC 5334.  A quick briefing:

.ogg .oga .spx > audio/ogg
.ogv > video/ogg
.ogx > application/ogg

Note that .ogg is now only used for single-stream Vorbis files; all other kinds of data in Ogg should use one of the other file extensions.  application/ogg is now ONLY used for applications and complex multimedia.

Also note that these three media types support an optional parameter called "codecs".  Here's an example: video/ogg; codecs="theora, vorbis, kate".  I'm unsure if this is of any relevance for this component.  RFC 5334 goes further into detail about these things, including the magic of each stream known to be encapsulated in Ogg.
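
As an illustration of the "codecs" parameter, here is a minimal sketch that splits such a media type string into its type and codec list using Python's standard email package. The function name is hypothetical, and whether a MIME database component would ever need this is left open.

```python
from email.message import Message


def parse_media_type(value: str):
    """Split a media type string into (type, list of codecs)."""
    msg = Message()
    msg["Content-Type"] = value
    mime = msg.get_content_type()      # e.g. "video/ogg"
    codecs = msg.get_param("codecs")   # quotes are removed; None if absent
    if codecs is None:
        return mime, []
    return mime, [c.strip() for c in codecs.split(",") if c.strip()]
```

For the example above, `parse_media_type('video/ogg; codecs="theora, vorbis, kate"')` yields the type video/ogg and the three codec names as separate list entries.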

Finally, I can submit a patch if needed.
Comment 11 Bastien Nocera 2008-08-27 06:42:18 UTC
Fixed using the info at:
http://tools.ietf.org/id/draft-goncalves-rfc3534bis-07.txt

* freedesktop.org.xml.in: Only use *.ogg for audio Oggs
(Closes: #12890)
* tests/list:
* tests/test.ogg: Add a test ogg vorbis file

If there are any more problems with specific mime-types, please file separate bugs.
