I have some Excel files (saved using Gnumeric and oocalc) which are incorrectly sniffed as application/msword files. The patch attached below removes the magic string responsible for that. I don't know enough about MS file formats to know whether this magic pattern should be removed or replaced with something more accurate.
Created attachment 89: remove the problematic magic string
Jody, can you comment?
The signature in question has two problems:
1) It is incomplete. There should be a trailing \032 \341.
2) It corresponds to an OLE2 file and hence will match Excel, PowerPoint, Quattro Pro, and lots of others.
Recognizing OLE2 files gets hard. You need to know more about the file format than magic is going to give you in order to look up the names of the streams it contains. Frankly, I'd like to see some special structured-file match operations that would allow a specification where magic is merely one subtree. It could recognize things like OLE2. Then we could have a distinct subtree for
<ole2> <match stream="{Book,BOOK,book,Workbook,WORKBOOK,workbook}">xls</match>
or something like it. There are several related instances of this - tar - xml - zip i
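To make the two points in the comment above concrete, here is a minimal Python sketch. It assumes the full 8-byte OLE2 signature ends in the \032 \341 bytes mentioned above, and it assumes the list of stream names comes from some OLE2 directory parser that is not shown here. The names `is_ole2` and `classify_ole2` are hypothetical and are not part of any existing magic database or library.

```python
# Full 8-byte OLE2 compound-file signature, written in octal so the
# trailing \032 \341 from the comment above is visible.
OLE2_MAGIC = bytes([0o320, 0o317, 0o021, 0o340, 0o241, 0o261, 0o032, 0o341])

# Stream names suggested in the comment above as marking an Excel workbook.
XLS_STREAMS = {"Book", "BOOK", "book", "Workbook", "WORKBOOK", "workbook"}


def is_ole2(header: bytes) -> bool:
    """Return True if the data starts with the complete OLE2 signature."""
    return header[:8] == OLE2_MAGIC


def classify_ole2(stream_names):
    """Guess a type from top-level stream names.

    `stream_names` is assumed to be supplied by an OLE2 parser (hypothetical,
    not implemented here). Only the xls rule from the thread is encoded.
    """
    if XLS_STREAMS.intersection(stream_names):
        return "xls"
    return None  # an OLE2 container, but of unknown kind
```

The signature alone only says "this is an OLE2 container"; the stream lookup is the extra step that plain magic matching cannot express.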
What if we make the sniffed mimetype something like application/x-ole2-stream (and fix up the signature), and then handle that manually by preferring the extension for it, like we do with gz and zip? This seems like a pretty important mimetype to get right.
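The fallback proposed in the comment above could look roughly like the following Python sketch. The container type name is the one suggested in that comment; the extension table and the function name `resolve_mime` are illustrative assumptions and do not reflect how any existing MIME library is implemented.

```python
import os

# Sniffed types that only identify a container, not the real document type.
# The name is the one proposed in the comment above.
CONTAINER_TYPES = {"application/x-ole2-stream"}

# Hypothetical extension table, for illustration only.
EXTENSION_TYPES = {
    ".xls": "application/vnd.ms-excel",
    ".doc": "application/msword",
    ".ppt": "application/vnd.ms-powerpoint",
}


def resolve_mime(sniffed: str, filename: str) -> str:
    """Prefer the file extension when sniffing found only a generic container."""
    if sniffed in CONTAINER_TYPES:
        ext = os.path.splitext(filename)[1].lower()
        # Fall back to the container type if the extension is unknown.
        return EXTENSION_TYPES.get(ext, sniffed)
    return sniffed
```

For example, `resolve_mime("application/x-ole2-stream", "budget.xls")` would yield the Excel type, while a sniffed type outside the container set is returned unchanged regardless of extension.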
cvs has application/x-ole-storage