I have some Excel files (saved using gnumeric and oocalc) which are incorrectly
sniffed as application/msword files. The coming patch removes the magic string
responsible for that. I don't know enough about MS file formats to know whether
this magic pattern should be removed or replaced with something more accurate.
Created attachment 89
remove the problematic magic string
Jody, can you comment?
The signature in question has two problems:
1) It is incomplete. There should be a trailing \032 \341
2) It corresponds to an OLE2 file and hence will match
- Quattro Pro
and lots of others.
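To illustrate both points, here is a minimal Python sketch. The names OLE2_MAGIC, TRUNCATED_MAGIC and looks_like_ole2 are made up for this example and are not taken from the patch or from shared-mime-info. It assumes the well-known 8-byte OLE2 header, whose last two bytes are the \032 \341 mentioned above.

```python
# Full 8-byte OLE2 compound-document signature, written with octal escapes
# so the trailing \032 \341 is visible (hex: D0 CF 11 E0 A1 B1 1A E1).
OLE2_MAGIC = b"\320\317\021\340\241\261\032\341"

# Hypothetical truncated pattern, i.e. the same signature without the
# trailing two bytes. This is an assumption about what "incomplete" means.
TRUNCATED_MAGIC = OLE2_MAGIC[:6]


def looks_like_ole2(data: bytes) -> bool:
    """Return True if the data starts with the full OLE2 signature."""
    return data[:8] == OLE2_MAGIC


# The signature only identifies the container. A Word file and an Excel
# file both start with these same 8 bytes, so this check cannot tell
# them apart. The payload here is placeholder padding, not a real file.
fake_container = OLE2_MAGIC + b"\x00" * 504
```

Even with the full signature, this says nothing about which application wrote the file, which is why the pattern cannot safely map to application/msword.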
Recognizing OLE2 files gets hard. You need to know more about the file format
than magic is going to give you, in order to look up the names of the streams it contains.
Frankly, I'd like to see some special structured file match operations that would
allow a specification where magic is merely one subtree. It could recognize
things like OLE2. Then we could have a distinct subtree for
or something like it
There are several related instances of this.
What if we make the sniffed mimetype something like application/x-ole2-stream
(and fix up the signature), and then handle that manually by preferring the
extension for it, like we do with gz and zip?
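A rough sketch of that fallback, in Python. The function name resolve_mime and the GENERIC_CONTAINERS set are hypothetical and only illustrate the idea; this is not how any existing library implements it, and the gz and zip type names are assumed.

```python
from typing import Optional

# Sniffed types that only identify a container, not the content.
# application/x-ole2-stream is the name proposed above; the gz and zip
# entries are assumed names for the existing special cases.
GENERIC_CONTAINERS = {
    "application/x-ole2-stream",
    "application/x-gzip",
    "application/zip",
}


def resolve_mime(sniffed: Optional[str], by_extension: Optional[str]) -> Optional[str]:
    """Pick a mimetype, preferring the extension for generic containers."""
    # For a generic container, the extension knows more than the magic does.
    if sniffed in GENERIC_CONTAINERS and by_extension is not None:
        return by_extension
    # Otherwise trust the sniffed type, falling back to the extension.
    return sniffed if sniffed is not None else by_extension
```

With this, an .xls file sniffed as application/x-ole2-stream would resolve to whatever the extension maps to (for example application/vnd.ms-excel), while a file with no known extension would keep the generic container type.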
Seems like this is a pretty important mimetype to get right.
CVS has application/x-ole-storage.