Summary: | Problem with magic "host32" attribute and friends. | ||
---|---|---|---|
Product: | shared-mime-info | Reporter: | Yevgen Muntyan <muntyan> |
Component: | general | Assignee: | Jonathan Blandford <jrb> |
Status: | RESOLVED NOTABUG | QA Contact: | |
Severity: | normal | ||
Priority: | high | CC: | faure |
Version: | unspecified | ||
Hardware: | x86 (IA32) | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: |
test
Extract from the unit tests I wrote for kde |
Description
Yevgen Muntyan
2007-01-06 13:56:48 UTC
Created attachment 8314 [details]
test
Attached are a few files with magic inside, and test.xml with their mime types.
test16-BE and test32-BE are the files which should be detected as
text/x-test-mime-N; testXX-native should not. If it's the other way around, there's
a bug.
I am only someone trying to implement the spec (in kde, so I don't know the gnome implementation you are referring to), but I think I can bring my own interpretation here.

host32 means "this integer is to be interpreted in this host's order". host means native. test16-BE should _not_ match on little-endian hosts, and test16-native _should_ match, since host means native.

If big endian should match and little endian shouldn't match, then the xml snippet should say big32, not host32.

All this being said... there might be a problem with the jar file magic indeed. They start with \xCA \xFE, i.e. 0xFECA as little-endian. This means type="host16" value="0xcafe" is wrong indeed... It was a bug in an old version of the file(1) magic file... It said "short 0xcafe". It has been fixed in more recent versions to say: "belong 0xcafebabe", which looks much more correct to me. It does mean \xCA \xFE \xBA \xBE on any host, which is correct.

(In reply to comment #2)
> I am only someone trying to implement the spec (in kde, so I don't know the
> gnome implementation you are referring to),

It's xdgmime, the implementation used in GTK and Gnome. It lives in CVS here, mime/xdgmime/. But to see a version with bug fixes, use http://svn.gnome.org/viewcvs/gtk%2B/trunk/gtk/xdgmime/ .

> but I think I can bring my own
> interpretation here. host32 means "this integer is to be interpreted in this
> host's order". host means native. test16-BE should _not_ match on
> little-endian hosts, and test16-native _should_ match, since host means native.
>
> If big endian should match and little endian shouldn't match, then the xml
> snippet should say big32, not host32.

This is also how I understand "big", "host", and "little", and I'd think it's the only sensible interpretation. The problem is it's not quite what xdgmime and update-mime-database do. I guess it just should be fixed, and the xml file should be fixed too.
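A minimal sketch of the interpretation described in the comments above, using Python's standard `struct` module. The helper name `magic_bytes` and its explicit `host_order` parameter are illustrative assumptions, not part of shared-mime-info or xdgmime; the host order is passed in so that both cases can be shown on one machine.

```python
import struct

# struct format codes for the two integer widths discussed in this thread
_WIDTH = {16: "H", 32: "I"}

def magic_bytes(match_type, value, host_order="little"):
    """Return the file bytes that a <match> of this type corresponds to.

    match_type: "big16", "little32", "host16", ... (type names from the spec).
    host_order: byte order of the machine doing the match ("little" or "big");
                a hypothetical parameter, used here only for illustration.
    """
    kind, bits = match_type[:-2], int(match_type[-2:])
    order = host_order if kind == "host" else kind
    prefix = "<" if order == "little" else ">"
    return struct.pack(prefix + _WIDTH[bits], value)

# host16 0x1234 corresponds to different bytes depending on the host
print(magic_bytes("host16", 0x1234, "little").hex())  # 3412
print(magic_bytes("host16", 0x1234, "big").hex())     # 1234

# A Java class file starts with CA FE BA BE on every host
java_header = bytes.fromhex("cafebabe")
print(java_header.startswith(magic_bytes("big32", 0xCAFEBABE)))         # True
print(java_header.startswith(magic_bytes("host16", 0xCAFE, "little")))  # False
```

Under this reading, a fixed on-disk signature such as the Java class header is only described correctly by a big or little type, never by a host type.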
I said spec should be fixed because of problems with xdgmime (it's not released as a library, so everybody uses kind-of-private-branch), so I simply said whatever junk I had in mind; not because there are real correct reasons. I based my observations, among other things, on the data I saw in the magic file generated by update-mime-database.

From my tests (on a little-endian machine only) I don't see a bug in update-mime-database related to endianness. I do see a bug in the jar magic though, which needs to be fixed. Are you sure there's a bug in update-mime-database? Can you explain which one exactly?

(In reply to comment #4)
> Are you sure there's a bug in update-mime-database? Can you explain which one
> exactly?

The relevant places are match_word_size() and parse_value() in http://webcvs.freedesktop.org/mime/shared-mime-info/update-mime-database.c?revision=1.41&view=markup

parse_value() treats hostXX and bigXX in the same way.

By the way, what I said in comment #1 is wrong, it's all backwards. text/x-test-mime-2 type has <match value="0x1234" type="host16" offset="0"/>, i.e. it should match bytes 0x34 0x12 on little-endian machine. But, xdgmime matches 0x12 0x34 (on little-endian machine, no Sun here). I am not sure whose bug it is, maybe xdgmime, maybe update-mime-database, maybe both.

Looks like it's update-mime-database fault.

> parse_value() treats hostXX and bigXX in the same way.

Yes, but match_word_size doesn't, and this is why this is no bug in update-mime-database, which overall treats those two differently: host16 gives ">0=\0\x12\x34~2" in the generated magic file, while big16 gives ">0=\0\x12\x34" (and little16 gives ">0=\0\x34\x12").

When no word size (~2) is given, the data in the generated magic file is matched byte-per-byte with the data in the file, so big16 and little16 are generating correct output.
When a word size is given (as happens when using host16), the data from the generated magic file is swapped on little-endian hosts, so we end up with \x34\x12, which is correct as well.

I added all the above cases to my unit tests, and I can say that update-mime-database behaves just like I expect, now that I understand how it's supposed to work [the spec could certainly be much more verbose about this].

You said:
> xdgmime matches 0x12 0x34

Then this is a bug in xdg mime, if you're sure. Its code looks correct though, so I guess the bug is simply that LITTLE_ENDIAN is not being defined?

Created attachment 9206 [details]
Extract from the unit tests I wrote for kde
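The matching rule described above (compare byte by byte when no word size is given; swap each word of the stored value on little-endian hosts when one is) can be sketched as follows. This is a hedged illustration, not xdgmime's code: `swap_words`, `matches`, and the `host_is_little_endian` parameter are hypothetical names, and the magic file syntax itself is not parsed here.

```python
def swap_words(value, word_size):
    """Reverse the bytes inside each word_size-byte word of value."""
    return b"".join(
        value[i:i + word_size][::-1]
        for i in range(0, len(value), word_size)
    )

def matches(data, offset, value, word_size=1, host_is_little_endian=True):
    """Compare a stored magic value against file data at offset.

    value is stored most significant byte first, as in the examples quoted
    in this thread. word_size > 1 marks a host-order entry (host16 -> 2).
    """
    if word_size > 1 and host_is_little_endian:
        # Host-order entry on a little-endian host: swap before comparing.
        value = swap_words(value, word_size)
    return data[offset:offset + len(value)] == value

stored = b"\x12\x34"  # value bytes written for 0x1234

# host16 (word size 2) on a little-endian host matches 34 12 in the file
print(matches(b"\x34\x12", 0, stored, word_size=2))  # True
print(matches(b"\x12\x34", 0, stored, word_size=2))  # False

# big16 (no word size) is compared byte by byte on any host
print(matches(b"\x12\x34", 0, stored))               # True
```

A matcher that skips the swap step, comparing the stored bytes directly even when a word size is present, would treat hostXX entries exactly like bigXX entries.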
(In reply to comment #6)
> > parse_value() treats hostXX and bigXX in the same way.
> Yes, but match_word_size doesn't, and this is why this is no bug in
> update-mime-database, which overall treats those two differently:
> host16 gives ">0=\0\x12\x34~2" in the generated magic file, while big16 gives
> ">0=\0\x12\x34" (and little16 gives ">0=\0\x34\x12").

Thanks for explanation!

> You said:
> > xdgmime matches 0x12 0x34
> Then this is a bug in xdg mime, if you're sure. Its code looks correct though,
> so I guess the bug is simply that LITTLE_ENDIAN is not being defined?

In case of magic file, looks so, yes. But the bug I see here is with the cache, and code looks like it actually compares byte by byte whatever is written in cache file with data from file, i.e. it treats hostXX entries as bigXX. I may be wrong again though (or my copy of xdgmime may not be in sync with what you see). In any case, looks like indeed there is no problem with update-mime-database, so this one is NOTABUG.

Ah, when I was talking about the xdgmime code, I wasn't talking about the code that deals with the cache. If you see a bug there, better report it indeed. Anyway. I'll write a patch for the bug in the jar magic and I'll create a bug report for it.

Sorry, I meant application/x-java, not jar. jar is correct. https://bugs.freedesktop.org/show_bug.cgi?id=10334 |