On my system (Debian jessie), the two MIME types 'application/vnd.ms-powerpoint.presentation.macroenabled.12' and 'application/vnd.ms-powerpoint.presentation.macroEnabled.12', which differ only in case, both exist, along with a number of other such pairs:

$ grep -iRF application/vnd.ms-powerpoint.presentation.macroenabled.12 /usr/share/mime
/usr/share/mime/generic-icons:application/vnd.ms-powerpoint.presentation.macroEnabled.12:x-office-presentation
/usr/share/mime/packages/libreoffice.xml: <mime-type type="application/vnd.ms-powerpoint.presentation.macroenabled.12">
/usr/share/mime/packages/freedesktop.org.xml: <mime-type type="application/vnd.ms-powerpoint.presentation.macroEnabled.12">
/usr/share/mime/types:application/vnd.ms-powerpoint.presentation.macroEnabled.12
/usr/share/mime/types:application/vnd.ms-powerpoint.presentation.macroenabled.12
/usr/share/mime/subclasses:application/vnd.ms-powerpoint.presentation.macroEnabled.12 application/vnd.openxmlformats-officedocument.presentationml.presentation
/usr/share/mime/globs2:50:application/vnd.ms-powerpoint.presentation.macroEnabled.12:*.pptm
/usr/share/mime/globs2:50:application/vnd.ms-powerpoint.presentation.macroenabled.12:*.pptm
Binary file /usr/share/mime/mime.cache matches
/usr/share/mime/application/vnd.ms-powerpoint.presentation.macroenabled.12.xml:<mime-type xmlns="http://www.freedesktop.org/standards/shared-mime-info" type="application/vnd.ms-powerpoint.presentation.macroEnabled.12">
/usr/share/mime/globs:application/vnd.ms-powerpoint.presentation.macroEnabled.12:*.pptm
/usr/share/mime/globs:application/vnd.ms-powerpoint.presentation.macroenabled.12:*.pptm

update-mime-database stores type names in the case-sensitive hashtable "types". However, before using them as filenames, it forces them to lowercase, so both spellings map to the same output file.
The result is this:

$ strace -f update-mime-database ./mime_copy/ 2>&1 | grep -Fi -A4 application/vnd.ms-powerpoint.presentation.macroenabled.12
open("./mime_copy/application/vnd.ms-powerpoint.presentation.macroenabled.12.xml.new", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd94ec49000
write(3, "<?xml version=\"1.0\" encoding=\"ut"..., 4096) = 4096
write(3, "ment>\n <comment xml:lang=\"nb\">M"..., 3090) = 3090
--
open("./mime_copy/application/vnd.ms-powerpoint.presentation.macroenabled.12.xml.new", O_RDWR) = 3
fdatasync(3) = 0
close(3) = 0
rename("./mime_copy/application/vnd.ms-powerpoint.presentation.macroenabled.12.xml.new", "./mime_copy/application/vnd.ms-powerpoint.presentation.macroenabled.12.xml") = 0
mkdir("./mime_copy/application", 0755) = -1 EEXIST (File exists)
open("./mime_copy/application/x-krita.xml.new", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd94ec49000
--
open("./mime_copy/application/vnd.ms-powerpoint.presentation.macroenabled.12.xml.new", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd94ec49000
write(3, "<?xml version=\"1.0\" encoding=\"ut"..., 2917) = 2917
close(3) = 0
--
open("./mime_copy/application/vnd.ms-powerpoint.presentation.macroenabled.12.xml.new", O_RDWR) = 3
fdatasync(3) = 0
close(3) = 0
rename("./mime_copy/application/vnd.ms-powerpoint.presentation.macroenabled.12.xml.new", "./mime_copy/application/vnd.ms-powerpoint.presentation.macroenabled.12.xml") = 0
mkdir("./mime_copy/application", 0755) = -1 EEXIST (File exists)
open("./mime_copy/application/x-applix-word.xml.new", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd94ec49000

As you can see, the same file is created, written, synced and renamed into place twice. This means that the first file (4096 + 3090 = 7186 bytes) was immediately overwritten with the second one (2917 bytes).

Here are the two files: https://gist.github.com/thejh/46f8e6621a7f51b0a484

I'm not very familiar with the whole MIME system, but that's a bug, isn't it?
If types that only differ in case should be treated the same, what would be the best way to fix it? Create functions g_str_equal_nocase and g_str_hash_nocase or so that call g_str_equal and g_str_hash with the argument lowercased, then pass pointers to those functions to g_hash_table_new?
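To make that idea concrete, here is a minimal sketch of such a pair of functions. It is not taken from the update-mime-database source. The names mime_str_hash_nocase and mime_str_equal_nocase are hypothetical, and the sketch folds only ASCII letters and hashes and compares in place rather than building a lowercased copy first. The signatures are written in plain C but correspond to GLib's GHashFunc and GEqualFunc, since gconstpointer is const void *, guint is unsigned int and gboolean is int.

```c
#include <stddef.h>

/* Fold ASCII upper-case letters to lower case; leave other bytes alone.
 * Locale-independent on purpose, unlike tolower(). */
static unsigned char ascii_lower(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + ('a' - 'A')) : c;
}

/* Hypothetical hash function: djb2-style hash over the lowercased bytes,
 * so strings that differ only in ASCII case get the same hash. */
unsigned int mime_str_hash_nocase(const void *key)
{
    const unsigned char *p = key;
    unsigned int h = 5381;

    for (; *p != '\0'; p++)
        h = h * 33u + ascii_lower(*p);
    return h;
}

/* Hypothetical equality function: returns 1 if the two strings are equal
 * when ASCII case is ignored, 0 otherwise. */
int mime_str_equal_nocase(const void *a, const void *b)
{
    const unsigned char *s = a;
    const unsigned char *t = b;

    while (*s != '\0' && ascii_lower(*s) == ascii_lower(*t)) {
        s++;
        t++;
    }
    /* Equal only if both strings ended at the same position. */
    return ascii_lower(*s) == ascii_lower(*t);
}
```

With GLib available, these could then be passed as g_hash_table_new(mime_str_hash_nocase, mime_str_equal_nocase), so that both spellings of a type resolve to a single table entry.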
I would write a patch, but I'm not sure what the expected behavior is here. Can some developer please comment on that?
Adding self to CC if not already on
That's not a bug. MIME types are case-insensitive. See https://bugs.freedesktop.org/show_bug.cgi?id=62473 for details.

> On my system (Debian jessie), the two MIME types
> 'application/vnd.ms-powerpoint.presentation.macroenabled.12' and
> 'application/vnd.ms-powerpoint.presentation.macroEnabled.12' that
> only differ in case exist

They're not two different MIME types; it's one MIME type that has been duplicated.
So what exactly is your position on duplicate MIME types?
"They must not exist and if they do, that invokes undefined behavior"?
"They must not exist and if they do, one copy wins and the others are discarded silently"?

In my opinion, if a tool can't cope with its input, it should emit an error message, or at least a warning, instead of blindly soldiering on.

> That's not a bug. MIME types are case-insensitive.

And what I complained about here specifically is that the hashtable "types" is case-sensitive (while filenames are lowercased). If MIME types are case-insensitive, shouldn't the hashtable be case-insensitive, too?
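As an illustration of the warning I have in mind, here is a minimal sketch. It does not reflect how update-mime-database is actually structured. The function warn_case_duplicates, its array-based interface and the message text are all hypothetical, and the quadratic scan is only there to keep the example short.

```c
#include <stdio.h>
#include <string.h>

/* Fold ASCII upper-case letters to lower case (locale-independent). */
static int lower_ascii(int c)
{
    return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

/* Return 1 if a and b are equal when ASCII case is ignored. */
static int equal_ignoring_ascii_case(const char *a, const char *b)
{
    while (*a != '\0' &&
           lower_ascii((unsigned char)*a) == lower_ascii((unsigned char)*b)) {
        a++;
        b++;
    }
    return lower_ascii((unsigned char)*a) == lower_ascii((unsigned char)*b);
}

/* Hypothetical check: warn on stderr about every pair of type names that
 * are spelled differently but collide once lowercased, and return the
 * number of such pairs. O(n^2), for illustration only. */
size_t warn_case_duplicates(const char **types, size_t n)
{
    size_t found = 0;

    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            if (strcmp(types[i], types[j]) != 0 &&
                equal_ignoring_ascii_case(types[i], types[j])) {
                fprintf(stderr,
                        "warning: MIME types '%s' and '%s' differ only in case\n",
                        types[i], types[j]);
                found++;
            }
        }
    }
    return found;
}
```

Run over the type names from my system, a check like this would flag the two macroEnabled spellings instead of letting one silently replace the other.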
I opened this bug because it blocks the patch in bug #82711. The problem is that update-mime-database ends up writing a new file, then immediately writes over the same file again. Certainly that's not desirable behavior?
(In reply to Jann Horn from comment #5)
> So what exactly is your position on duplicate mimetypes?
> "They must not exist and if they do, that invokes undefined behavior"?
> "They must not exist and if they do, one copy wins and the others are
> discarded silently?"
>
> In my opinion, if a tool can't cope with its input, it should throw an error
> message, or at least a warning, instead of blindly soldiering on.

My "position" is the status quo. Duplicates are not mentioned in the spec, so the behaviour is undefined. In our case, we don't merge, we override. So, yes, the others would be discarded silently.

> > That's not a bug. MIME types are case-insensitive.
>
> And what I complained about here specifically is that the hashtable "types"
> is case-sensitive (while filenames are lowercased). If MIME types are
> case-insensitive, shouldn't the hashtable be case-insensitive, too?

Feel free to submit a test case that would fail before that change and pass afterwards.

If you want a change of behaviour, then it must first be defined, and those discussions happen on the xdg mailing list.