Bug 94423 - Improved identification of JPEG 2000 format family
Status: RESOLVED FIXED
Alias: None
Product: shared-mime-info
Classification: Unclassified
Component: freedesktop.org.xml
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: Shared Mime Info group
Reported: 2016-03-07 11:50 UTC by Johan van der Knijff
Modified: 2017-06-11 00:05 UTC

Attachments
Distinguish JPEG 2000 MIME subtypes (5.69 KB, patch)
2016-05-22 19:07 UTC, Martin Blanchard
attachment-30783-0.html (4.51 KB, text/html)
2016-05-27 23:12 UTC, Johan van der Knijff
attachment-4900-0.html (14.66 KB, text/html)
2016-06-03 19:46 UTC, Johan van der Knijff
Distinguish JPEG 2000 MIME subtypes (3.47 MB, patch)
2017-06-10 22:02 UTC, Martin Blanchard

Description Johan van der Knijff 2016-03-07 11:50:16 UTC
The current database entry for JPEG 2000 doesn't distinguish between the different sub-formats of the JPEG 2000 standard. I suggest replacing it with the more specific entry given in the link below:

https://issues.apache.org/jira/browse/TIKA-970

A set of test files that covers all the sub-formats can be found here (I'm the author of this dataset; I originally prepared it to support a patch for Unix File):

https://github.com/bitsgalore/jp2kMagic

I don't know if using Tika definitions might introduce any license issues. In any case the author of the Tika definitions adapted them from my own earlier Unix File magic.
Comment 1 Bastien Nocera 2016-03-07 11:56:13 UTC
See the HACKING file for how to propose additions and changes to mime-types.
Comment 2 Martin Blanchard 2016-05-22 19:07:29 UTC
Created attachment 123969
Distinguish JPEG 2000 MIME subtypes

Attached is a patch improving JPEG 2000 MIME subtypes detection based on jp2kMagic magic strings, enhanced using a bit mask. A test.jpc is included for JPC; JPX, JPM and MJ2 don't have one so far...

Here is the print-mime-data tool (xdgmime) output for the sample files from the jp2kMagic repo:

>balloon.jp2:
>	name: image/jp2
>	data: image/jp2
>	file: image/jp2

>balloon.jpm:
>	name: image/jpm
>	data: image/jpm
>	file: image/jpm

>Speedway.mj2:
>	name: video/mj2
>	data: video/mj2
>	file: video/mj2

>balloon.j2c:
>	name: image/x-jp2-codestream
>	data: image/x-jp2-codestream
>	file: image/x-jp2-codestream

>balloon.jpf:
>	name: image/jpx
>	data: image/jpx
>	file: image/jpx
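
For readers unfamiliar with how the five types above can be told apart by content, here is a minimal Python sketch. It is not the attached patch: it uses plain byte comparisons rather than the bit mask mentioned above, and the function name `sniff_jpeg2000` is made up for illustration. It assumes the usual layout of the box-based formats (a 12-byte signature box, then a File Type box whose brand sits at offset 20) and that a raw codestream starts with the SOC and SIZ markers.

```python
# Hypothetical helper, not part of shared-mime-info or the attached patch.
JP2_SIGNATURE = bytes.fromhex("0000000c6a5020200d0a870a")  # JPEG 2000 signature box
CODESTREAM_START = bytes.fromhex("ff4fff51")               # SOC marker followed by SIZ marker

# Brand field of the File Type ('ftyp') box mapped to the MIME type reported above.
BRAND_TO_MIME = {
    b"jp2 ": "image/jp2",
    b"jpx ": "image/jpx",
    b"jpm ": "image/jpm",
    b"mjp2": "video/mj2",
}


def sniff_jpeg2000(header: bytes):
    """Return a MIME type for the first bytes of a file, or None if unrecognised."""
    if header.startswith(CODESTREAM_START):
        return "image/x-jp2-codestream"
    # Box-based formats: signature box (12 bytes), box length (4 bytes),
    # box type 'ftyp' at offset 16, brand at offset 20.
    if header.startswith(JP2_SIGNATURE) and header[16:20] == b"ftyp":
        return BRAND_TO_MIME.get(header[20:24])
    return None


if __name__ == "__main__":
    # Synthetic header for a JPX file: signature box + minimal 'ftyp' box.
    demo = JP2_SIGNATURE + (20).to_bytes(4, "big") + b"ftyp" + b"jpx " + bytes(4) + b"jpx "
    print(sniff_jpeg2000(demo))
```

A real implementation would probably also need to look at the compatibility list in the File Type box, since a file's brand alone may not tell the whole story; the sketch ignores that.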
Comment 3 Bastien Nocera 2016-05-27 16:29:53 UTC
What is the use for differentiating those sub-types? Apart from "correctness".

If we had different applications handling them, or the applications used only shared-mime-info to differentiate the different types, then it would be a good reason to split them up. Otherwise we'd just add the sub-types as aliases for the jp2 format, and be done with it.
Comment 4 Johan van der Knijff 2016-05-27 23:12:41 UTC
Created attachment 124136
attachment-30783-0.html

The reason is that we actually *do* have different applications for handling them. They're not aliases of JP2, they really are separate formats, with different format specs.

E.g. most applications that can read JP2 cannot handle (all aspects of) a JPX file. Support for JPM is even more limited, and MJ2 is a video format which needs an entirely different class of applications.

Comment 5 Bastien Nocera 2016-05-31 11:38:48 UTC
(In reply to Johan van der Knijff from comment #4)
> Created attachment 124136 [details]
> attachment-30783-0.html
> 
> The  reason is that we actually *do* have different applications for
> handling them. They're not aliases of JP2, they really are separate formats,
> with different format specs.
> 
> E.g. most applications that can read JP2 cannot handle (all aspects of) a
> JPX file. Support for JPM is even more limited, and MJ2 is a video format 
> which needs an entirely different class of applications.

Which applications? I can understand the video format being separate (even though I doubt we'd find MJ2 videos without a container in the wild, I've seen them packed in MPEG-4 containers), but I'll need more justification for adding for file formats than "because they exist".
Comment 6 Martin Blanchard 2016-06-03 19:40:43 UTC
(In reply to Bastien Nocera from comment #5)
> Which applications? I can understand the video format being separate (even
> though I doubt we'd find MJ2 videos without a container in the wild, I've
> seen them packed in MPEG-4 containers), but I'll need more justification for
> adding for file formats than "because they exist".

The current shared-mime-info situation/state is, in my opinion, unfortunate: JPX files (.jpx or .jpf) are reported as image/jp2, which they are NOT. Thus, for example, every Jasper-based loader (gdk-pixbuf, GEGL at least) will systematically fail to load such files (Jasper only handles JP2 files).

GEGL would also benefit from the distinction: in an effort to port the JPEG 2000 loader from Jasper to OpenJPEG, we would really like to be able to rely on better content type detection based on shared-mime-info magic strings! See:

https://bugzilla.gnome.org/show_bug.cgi?id=764746
Comment 7 Johan van der Knijff 2016-06-03 19:46:27 UTC
Created attachment 124308
attachment-4900-0.html


I'm currently away for holidays. I will be back the 13th of June.

Best regards,

Johan

Comment 8 Bastien Nocera 2017-05-15 18:59:02 UTC
Comment on attachment 123969
Distinguish JPEG 2000 MIME subtypes

Review of attachment 123969:
-----------------------------------------------------------------

::: freedesktop.org.xml.in
@@ +4267,5 @@
>      <glob pattern="*.jpe"/>
>      <alias type="image/pjpeg"/>
>    </mime-type>
> +  <mime-type type="image/x-jp2-codestream">
> +    <_comment>JPEG 2000 codestream</_comment>

JPEG-2000 as per previous mime-types.

@@ +4268,5 @@
>      <alias type="image/pjpeg"/>
>    </mime-type>
> +  <mime-type type="image/x-jp2-codestream">
> +    <_comment>JPEG 2000 codestream</_comment>
> +    <acronym>JPC</acronym>

The acronym needs to be present in the "comment". As JPC isn't present there, drop the acronym and the expanded acronym.

@@ +4281,2 @@
>    <mime-type type="image/jp2">
> +    <_comment>JPEG 2000 JP2 image</_comment>

JPEG-2000

::: tests/list
@@ +24,4 @@
>  test.ilbm image/x-ilbm
>  test.im1 image/x-sun-raster x
>  test.jp2 image/jp2
> +test.jpc image/x-jp2-codestream

Your tests only cover 2 of the 5 mime-types, you'll need to add more.
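
As an illustration of the coverage check requested here, the following Python sketch reports which of the five types have no entry in a tests/list-style file. It assumes each entry is a whitespace-separated line of file name, expected MIME type and optional flags, as in the quoted hunk; the function names and the `EXPECTED_TYPES` set are hypothetical and not part of the shared-mime-info test suite.

```python
# Hypothetical coverage check; names are made up for illustration.
EXPECTED_TYPES = {
    "image/jp2",
    "image/jpx",
    "image/jpm",
    "video/mj2",
    "image/x-jp2-codestream",
}


def covered_types(list_text):
    """Collect the MIME types named in 'file type [flags]' lines."""
    covered = set()
    for line in list_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) >= 2:
            covered.add(fields[1])
    return covered


def missing_types(list_text, expected=EXPECTED_TYPES):
    """Return the expected MIME types that have no test entry, sorted."""
    return sorted(set(expected) - covered_types(list_text))


if __name__ == "__main__":
    # The two JPEG 2000 entries present after the reviewed patch.
    sample = "test.jp2 image/jp2\ntest.jpc image/x-jp2-codestream\n"
    print(missing_types(sample))  # ['image/jpm', 'image/jpx', 'video/mj2']
```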
Comment 9 Martin Blanchard 2017-06-10 22:02:07 UTC
Created attachment 131847
Distinguish JPEG 2000 MIME subtypes
Comment 10 Bastien Nocera 2017-06-11 00:05:24 UTC
Pushed with minimal changes to the commit message, thanks for the many iterations :)

