Bug 93549

Summary: RAR archive detected by content as PDF
Product: shared-mime-info
Reporter: Elvis Angelaccio <elvis.angelaccio>
Component: freedesktop.org.xml
Assignee: Shared Mime Info group <shared_mime_info>
Status: RESOLVED FIXED
QA Contact:
Severity: normal    
Priority: medium    
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:

Description Elvis Angelaccio 2015-12-31 16:00:08 UTC
Consider this RAR archive: https://goo.gl/LDoaJ7

Apparently it is detected by content as a PDF. FWIW, the archive contains a bunch of PDF files. Steps to reproduce:

1. Download the archive above
2. Remove the .rar extension, e.g. rename it to archive_foo
3. $ xdg-mime query filetype archive_foo
   application/pdf

If the archive does have the .rar extension, it's correctly detected as application/x-rar.

Downstream bug report: https://bugs.kde.org/show_bug.cgi?id=357134

I'm on Archlinux with shared-mime-info 1.5.1.
Comment 1 Bastien Nocera 2016-01-04 16:10:08 UTC
(In reply to Elvis Angelaccio from comment #0)
> Consider this RAR archive: https://goo.gl/LDoaJ7

In the future, please consider sharing a truncated version of the file, because we really don't need 200 megs of data for testing the magic.
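
One way to produce such a truncated test file is sketched below. The helper name truncate_copy and the 4096-byte default are assumptions made for illustration, not something taken from this report; the size only needs to be large enough to include the bytes that trigger the wrong match.

```python
from pathlib import Path


def truncate_copy(src, dst, size=4096):
    """Copy the first `size` bytes of src to dst and return the byte count.

    Hypothetical helper: the default size is an arbitrary assumption.
    """
    with open(src, "rb") as fin:
        head = fin.read(size)
    Path(dst).write_bytes(head)
    return len(head)
```

The truncated copy can then be checked with the same xdg-mime query to confirm that it still reproduces the wrong result before attaching it.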

> Apparently it is detected by content as a PDF. FWIW, the archive contains
> a bunch of PDF files. Steps to reproduce:
> 
> 1. Download the archive above
> 2. Remove the .rar extension, e.g. rename it to archive_foo
> 3. $ xdg-mime query filetype archive_foo
>    application/pdf
> 
> If the archive does have the .rar extension, it's correctly detected as
> application/x-rar.
> 
> Downstream bug report: https://bugs.kde.org/show_bug.cgi?id=357134
> 
> I'm on Archlinux with shared-mime-info 1.5.1.

commit 10fc17ec61c5b5c6c0e6ea2c8d7a2123271d07e3
Author: Bastien Nocera <hadess@hadess.net>
Date:   Mon Jan 4 17:07:09 2016 +0100

    Bump priority for archives mime-types magic
    
    When a long enough magic is available (4 characters in this case),
    prefer the magic of the archive type to the one of the
    maybe-not-compressed header of the first file in the archive.
    
    For example, in https://bugs.freedesktop.org/show_bug.cgi?id=93549
    a PDF inside the RAR archive made the archive be detected as a PDF.
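
The effect of the priority bump can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the MagicRule class, the detect function, the priority values 50 and 60, the 0 to 1024 search range for the PDF marker, the sample bytes, and the rule that ties fall back to rule order. It is not the actual matching code or the actual contents of freedesktop.org.xml.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class MagicRule:
    mime: str
    priority: int
    pattern: bytes
    start: int  # first offset at which the pattern may begin
    end: int    # last offset at which the pattern may begin

    def matches(self, data: bytes) -> bool:
        # Search only the window in which the pattern is allowed to start.
        window = data[self.start:self.end + len(self.pattern)]
        return self.pattern in window


def detect(data: bytes, rules: Sequence[MagicRule]) -> Optional[str]:
    hits = [r for r in rules if r.matches(data)]
    if not hits:
        return None
    # max() keeps the first maximal element, so ties fall back to rule order
    # (an assumption of this sketch, not documented behaviour).
    return max(hits, key=lambda r: r.priority).mime


# Hypothetical bytes: a RAR signature followed by a stored (uncompressed) PDF header.
sample = b"Rar!\x1a\x07\x00" + b"\x00" * 32 + b"%PDF-1.4\n"

pdf = MagicRule("application/pdf", 50, b"%PDF-", 0, 1024)
rar_before = MagicRule("application/x-rar", 50, b"Rar!", 0, 0)
rar_after = MagicRule("application/x-rar", 60, b"Rar!", 0, 0)

print(detect(sample, [pdf, rar_before]))  # tie: the PDF rule can win
print(detect(sample, [pdf, rar_after]))   # the higher-priority RAR rule wins
```

With equal priorities both rules match and the outcome depends on tie-breaking; once the archive rule has the higher priority, the archive type is chosen regardless of what the first stored file looks like.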
Comment 2 Bastien Nocera 2016-02-23 15:03:59 UTC
*** Bug 94089 has been marked as a duplicate of this bug. ***
Comment 3 Stefan Radermacher 2016-05-27 16:35:46 UTC
It seems zip archives were forgotten in the fix for this bug, and so I have various zip files that detect as PDF.

Upping the priority property in line 3657 of freedesktop.org.xml fixes this problem.
Comment 4 Bastien Nocera 2016-05-27 16:44:38 UTC
(In reply to Stefan Radermacher from comment #3)
> It seems zip archives were forgotten in the fix for this bug, and so I have
> various zip files that detect as PDF.

This bug is about RAR files. If there are other problems, file a new bug and attach a test case.

> Upping the priority property in line 3657 of freedesktop.org.xml fixes this
> problem.

And would likely break the detection of all the file formats that are sub-classes of zip.

In any case, file a new bug with a test case.
