Bug 335 - Chemical MIME-types database extension
Status: RESOLVED NOTOURBUG
Alias: None
Product: shared-mime-info
Classification: Unclassified
Component: freedesktop.org.xml
Version: unspecified
Hardware: x86 (IA32) All
Importance: high enhancement
Assignee: Shared Mime Info group
Reported: 2004-03-16 10:03 UTC by Niels
Modified: 2012-04-30 02:54 UTC

Attachments
Chemical mimetypes (10.70 KB, patch)
2005-11-01 07:13 UTC, Niels
Chemical mimetypes (8.24 KB, text/xml)
2007-01-10 14:59 UTC, Andrew Sobala

Description Niels 2004-03-16 10:03:39 UTC
For my software to run properly, I had to add the chemical MIME types
(unofficial, as far as I know, but widely accepted and used) to my MIME
database. The file has been in use for a few months now and is used to look up
file extensions from MIME types. Many of these file types are ASCII files with
no magic string.

Maybe this could be published as an optional download to be merged with the
main database.

Information on the chemical MIME types was compiled from:

http://www.ch.ic.ac.uk/chemime/
http://www.edcenter.sdsu.edu/repository/Mbcw/chemmime.html
http://hackberry.chem.trinity.edu/IJC/Text/
http://www.geocrawler.com/archives/3/319/1996/3/0/1766928/
http://zabib.chemie.uni-erlangen.de/external/cic/tagungen/workshop95/heuer/

Niels

-------------------

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE mime-info [
	<!ELEMENT mime-info (mime-type)+>
	<!ATTLIST mime-info xmlns CDATA #FIXED
"http://www.freedesktop.org/standards/shared-mime-info">

	<!ELEMENT mime-type (comment|glob|magic|root-XML)*>
	<!ATTLIST mime-type type CDATA #REQUIRED>

	<!ELEMENT comment (#PCDATA)>
	<!ATTLIST comment xml:lang CDATA #IMPLIED>

	<!ELEMENT glob EMPTY>
	<!ATTLIST glob pattern CDATA #REQUIRED>

	<!ELEMENT magic (match)+>
	<!ATTLIST magic priority CDATA #IMPLIED>

	<!ELEMENT match (match)*>
	<!ATTLIST match offset CDATA #REQUIRED>
	<!ATTLIST match type (string|big16|big32|little16|little32|host16|host32|byte)
#REQUIRED>
	<!ATTLIST match value CDATA #REQUIRED>
	<!ATTLIST match mask CDATA #IMPLIED>

	<!ELEMENT root-XML EMPTY>
	<!ATTLIST root-XML
		namespaceURI CDATA #REQUIRED
		localName CDATA #REQUIRED>
]>

<!--
The freedesktop.org shared MIME database (this file) was created by merging
several existing MIME databases (all released under the GPL).

It comes with ABSOLUTELY NO WARRANTY, to the extent permitted by law. You may
redistribute copies of update-mime-database under the terms of the GNU General
Public License. For more information about these matters, see the file named
COPYING.

The latest version is available from:

	http://www.freedesktop.org/standards/shared-mime-info.html

To extend this database, users and applications should create additional
XML files in the 'packages' directory and run the update-mime-database
command to generate the output files.
-->

<!--
Information on the chemical MIME types was compiled from:

http://www.ch.ic.ac.uk/chemime/
http://www.edcenter.sdsu.edu/repository/Mbcw/chemmime.html
http://hackberry.chem.trinity.edu/IJC/Text/
http://www.geocrawler.com/archives/3/319/1996/3/0/1766928/
http://zabib.chemie.uni-erlangen.de/external/cic/tagungen/workshop95/heuer/

copy into /usr/local/share/mime/packages and run /usr/local/bin/update-mime-database

ignore Warning: Unknown media type in type 'chemical/xyz'
-->

<mime-info xmlns="http://www.freedesktop.org/standards/shared-mime-info">

<!--
Alchemy file
-->
	<mime-type type="chemical/x-alchemy">
		<comment>Alchemy file</comment>
		<glob pattern="*.alc"/>
	</mime-type>

<!--
CACTVS binary file
-->
	<mime-type type="chemical/x-cactvs-binary">
		<comment>CACTVS binary file</comment>
		<glob pattern="*.cbin"/>
	</mime-type>

<!--
CACTVS ascii file
-->
	<mime-type type="chemical/x-cactvs-binary">
		<comment>CACTVS ascii file</comment>
		<glob pattern="*.cascii"/>
	</mime-type>
<!--
CACTVS table file
-->
	<mime-type type="chemical/x-cactvs-binary">
		<comment>CACTVS table file</comment>
		<glob pattern="*.ctab"/>
	</mime-type>

<!--
ChemDraw file
http://sdk.cambridgesoft.com/chemdraw/cdx/Clipboard.htm
-->
	<mime-type type="chemical/x-cdx">
		<comment>ChemDraw file</comment>
		<magic priority="50">
			<match type="string" value="VjCD0100" offset="0"/>
		</magic>
		<glob pattern="*.cdx"/>
	</mime-type>

<!--
MSI Cerius II file
-->
	<mime-type type="chemical/x-cerius">
		<comment>MSI Cerius II file</comment>
		<glob pattern="*.cer"/>
	</mime-type>

<!--
ChemDraw file
-->
	<mime-type type="chemical/x-chemdraw">
		<comment>ChemDraw file</comment>
		<glob pattern="*.chm"/>
	</mime-type>

<!--
Crystallographic Interchange file
-->
	<mime-type type="chemical/x-cif">
		<comment>Crystallographic Interchange file</comment>
		<glob pattern="*.cif"/>
	</mime-type>

<!--
Macromolecular Crystallographic Interchange file
-->
	<mime-type type="chemical/x-mmcif">
		<comment>Macromolecular Crystallographic Interchange file</comment>
		<glob pattern="*.mcif"/>
	</mime-type>

<!--
Chem3D file
-->
	<mime-type type="chemical/x-chem3d">
		<comment>Chem3D file</comment>
		<glob pattern="*.c3d"/>
	</mime-type>

<!--
CrystalMaker file
-->
	<mime-type type="chemical/x-cmdf">
		<comment>CrystalMaker file</comment>
		<glob pattern="*.cmdf"/>
	</mime-type>

<!--
Compass Takahashi file
-->
	<mime-type type="chemical/x-compass">
		<comment>Compass Takahashi file</comment>
		<glob pattern="*.cpa"/>
	</mime-type>

<!--
Crossfire file
-->
	<mime-type type="chemical/x-crossfire">
		<comment>Crossfire file</comment>
		<glob pattern="*.bsd"/>
	</mime-type>

<!--
Chemical Markup Language file
-->
	<mime-type type="chemical/x-cml">
		<comment>Chemical Markup Language file</comment>
		<glob pattern="*.cml"/>
	</mime-type>

<!--
Chemical Style Markup Language file
-->
	<mime-type type="chemical/x-csml">
		<comment>Chemical Style Markup Language file</comment>
		<glob pattern="*.csml"/>
		<glob pattern="*.csm"/>
	</mime-type>

<!--
Gasteiger group file
-->
	<mime-type type="chemical/x-ctx">
		<comment>Gasteiger group file</comment>
		<glob pattern="*.ctx"/>
	</mime-type>

<!--
SMILES file
-->
	<mime-type type="chemical/x-daylight-smiles">
		<comment>SMILES file</comment>
		<glob pattern="*.smi"/>
	</mime-type>

<!--
EMBL nucleotide file
-->
	<mime-type type="chemical/x-embl-dl-nucleotide">
		<comment>EMBL nucleotide file</comment>
		<glob pattern="*.emb"/>
		<glob pattern="*.embl"/>
	</mime-type>

<!--
SPC spectral and chromatographic data file
-->
	<mime-type type="chemical/x-galactic-spc">
		<comment>SPC spectral and chromatographic data file</comment>
		<glob pattern="*.spc"/>
	</mime-type>

<!--
GAMESS input file
-->
	<mime-type type="chemical/x-gamess-input">
		<comment>GAMESS Input file</comment>
		<glob pattern="*.inp"/>
		<glob pattern="*.gam"/>
	</mime-type>

<!--
Gaussian input file
-->
	<mime-type type="chemical/x-gaussian-input">
		<comment>Gaussian input file</comment>
		<glob pattern="*.gau"/>
	</mime-type>

<!--
Gaussian Checkpoint file
-->
	<mime-type type="chemical/x-gaussian-checkpoint">
		<comment>Gaussian Checkpoint file</comment>
		<glob pattern="*.fch"/>
		<glob pattern="*.fchk"/>
	</mime-type>

<!--
Gaussian Cube (Wavefunction) file
-->
	<mime-type type="chemical/x-gaussian-cube">
		<comment>Gaussian Cube (Wavefunction) file</comment>
		<glob pattern="*.cub"/>
	</mime-type>

<!--
ToGenBank file
-->
	<mime-type type="chemical/x-genbank">
		<comment>ToGenBank file</comment>
		<glob pattern="*.gen"/>
	</mime-type>

<!--
IsoStar Library of intermolecular interactions file
-->
	<mime-type type="chemical/x-isostar">
		<comment>IsoStar Library of intermolecular interactions file</comment>
		<glob pattern="*.istr"/>
		<glob pattern="*.ist"/>
	</mime-type>

<!--
JCAMP Spectroscopic Data Exchange
-->
	<mime-type type="chemical/x-jcamp-dx">
		<comment>JCAMP Spectroscopic Data Exchange file</comment>
		<glob pattern="*.jdx"/>
		<glob pattern="*.dx"/>
	</mime-type>

<!--
Kinetic (Protein Structure) Images file
-->
	<mime-type type="chemical/x-kinemage">
		<comment>Kinetic (Protein Structure) Images file</comment>
		<glob pattern="*.kin"/>
	</mime-type>

<!--
MacMolecule file
-->
	<mime-type type="chemical/x-macmolecule">
		<comment>MacMolecule file</comment>
		<glob pattern="*.mcm"/>
	</mime-type>

<!--
MacroModel molecular mechanics file
-->
	<mime-type type="chemical/x-macromodel-input">
		<comment>MacroModel molecular mechanics file</comment>
		<glob pattern="*.mmd"/>
		<glob pattern="*.mmod"/>
	</mime-type>

<!--
MDL molfile (e.g. Rasmol, IsisDraw)
text file, no pattern
-->
	<mime-type type="chemical/x-mdl-molfile">
		<comment>MDL molfile</comment>
		<glob pattern="*.mol"/>
	</mime-type>

<!--
Reaction-data file
-->
	<mime-type type="chemical/x-mdl-rdfile">
		<comment>Reaction-data file</comment>
		<glob pattern="*.rd"/>
	</mime-type>

<!--
MDL reaction file
-->
	<mime-type type="chemical/x-mdl-rxnfile">
		<comment>MDL reaction file</comment>
		<magic priority="50">
			<match type="string" value="$RXN\x0D\x0A" offset="0"/>
		</magic>
		<glob pattern="*.rxn"/>
	</mime-type>

<!--
MDL structure file
-->
	<mime-type type="chemical/x-mdl-sdfile">
		<comment>MDL structure file</comment>
		<glob pattern="*.sd"/>
	</mime-type>

<!--
ISIS-Draw transportable graphics file
textfile
-->
	<mime-type type="chemical/x-mdl-tgf">
		<comment>ISIS-Draw transportable graphics file</comment>
		<glob pattern="*.tgf"/>
	</mime-type>

<!--
CA molecular information file
-->
	<mime-type type="chemical/x-mif">
		<comment>CA molecular information file</comment>
		<glob pattern="*.mif"/>
	</mime-type>

<!--
SYBYL molecule file
-->
	<mime-type type="chemical/x-mol2">
		<comment>SYBYL molecule file</comment>
		<glob pattern="*.mol2"/>
	</mime-type>

<!--
Molconn-Z file
-->
	<mime-type type="chemical/x-molconn-Z">
		<comment>Molconn-Z file</comment>
		<glob pattern="*.b"/>
	</mime-type>

<!--
MOPAC input file
-->
	<mime-type type="chemical/x-mopac-input">
		<comment>MOPAC input file</comment>
		<glob pattern="*.mop"/>
	</mime-type>

<!--
MOPAC graph file
-->
	<mime-type type="chemical/x-mopac-graph">
		<comment>MOPAC graph file</comment>
		<glob pattern="*.gpt"/>
	</mime-type>

<!--
NCBI asn1 file (old form)
-->
	<mime-type type="chemical/x-ncbi-asn1">
		<comment>NCBI asn1 file (old form)</comment>
		<glob pattern="*.asn"/>
	</mime-type>

<!--
NCBI asn1 file
-->
	<mime-type type="chemical/x-ncbi-asn1-binary">
		<comment>NCBI asn1 file</comment>
		<glob pattern="*.val"/>
	</mime-type>

<!--
Protein DataBank (PDB) file
-->
	<mime-type type="chemical/x-pdb">
		<comment>Protein DataBank (PDB) file</comment>
		<glob pattern="*.pdb"/>
	</mime-type>

<!--
SWISS-PROT protein sequence file
-->
	<mime-type type="chemical/x-swissprot">
		<comment>SWISS-PROT protein sequence file</comment>
		<glob pattern="*.sw"/>
	</mime-type>

<!--
VMS file (Versailles Agreement on Materials and Standards)
-->
	<mime-type type="chemical/x-vamas-iso14976">
		<comment>VMS file (Versailles Agreement on Materials and Standards)</comment>
		<glob pattern="*.vms"/>
	</mime-type>

<!--
Visual Molecular Dynamics file
-->
	<mime-type type="chemical/x-vmd">
		<comment>Visual Molecular Dynamics file</comment>
		<glob pattern="*.vmd"/>
	</mime-type>

<!--
Xtelplot file
-->
	<mime-type type="chemical/x-xtel">
		<comment>Xtelplot file</comment>
		<glob pattern="*.xtel"/>
	</mime-type>

<!--
Co-ordinate animation file
-->
	<mime-type type="chemical/x-xyz">
		<comment>Co-ordinate animation file</comment>
		<glob pattern="*.xyz"/>
	</mime-type>

<!--
GCG file
-->
	<mime-type type="chemical/x-gcg8-sequence">
		<comment>GCG file</comment>
		<glob pattern="*.gcg"/>
	</mime-type>

<!--
XMol file
-->
	<mime-type type="chemical/x-cxf">
		<comment>XMol file</comment>
		<glob pattern="*.cxf"/>
	</mime-type>


<!--
JDX spectra file
-->
	<mime-type type="chemical/x-jcamp-dx">
		<comment>JDX spectra file</comment>
		<glob pattern="*.jdx"/>
	</mime-type>

<!--
Beilstein Rosdal file
-->
	<mime-type type="chemical/x-rosdal">
		<comment>Beilstein Rosdal file</comment>
		<glob pattern="*.ros"/>
	</mime-type>

<!--
HPGL vector graphic file
-->
	<mime-type type="application/x-hgl">
		<comment>HPGL vector graphic file</comment>
		<glob pattern="*.hgl"/>
		<glob pattern="*.hpgl"/>
	</mime-type>

<!--
	<mime-type type="">
		<comment> file</comment>
		<glob pattern="*."/>
	</mime-type>
-->

</mime-info>
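
The description says the database is used to generate extensions from MIME types. A minimal sketch of that lookup, in Python and using only the standard library, might look like the following. The function name `extensions_by_type` and the two-entry `SAMPLE` document are illustrative assumptions; only simple `*.ext` glob patterns are handled.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespace used by the mime-info files in this report.
NS = {"smi": "http://www.freedesktop.org/standards/shared-mime-info"}

# Two entries copied from the database above, used as sample input.
SAMPLE = """<mime-info xmlns="http://www.freedesktop.org/standards/shared-mime-info">
  <mime-type type="chemical/x-csml">
    <comment>Chemical Style Markup Language file</comment>
    <glob pattern="*.csml"/>
    <glob pattern="*.csm"/>
  </mime-type>
  <mime-type type="chemical/x-pdb">
    <comment>Protein DataBank (PDB) file</comment>
    <glob pattern="*.pdb"/>
  </mime-type>
</mime-info>
"""


def extensions_by_type(xml_text):
    """Return {mime type: [extensions]} for every simple '*.ext' glob."""
    result = defaultdict(list)
    root = ET.fromstring(xml_text)
    for mime_type in root.findall("smi:mime-type", NS):
        name = mime_type.get("type")
        for glob in mime_type.findall("smi:glob", NS):
            pattern = glob.get("pattern", "")
            if pattern.startswith("*.") and len(pattern) > 2:
                extension = pattern[1:]  # "*.pdb" -> ".pdb"
                if extension not in result[name]:
                    result[name].append(extension)
    return dict(result)


if __name__ == "__main__":
    for name, extensions in extensions_by_type(SAMPLE).items():
        print(name, " ".join(extensions))
```

Types that are declared more than once simply have their globs merged under one key.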
Comment 1 Abel Cheung 2005-02-12 15:28:20 UTC
ping? Chemistry software has already been using some or all of these file
extensions.
Comment 2 Christian - Manny Calavera - Neumair 2005-10-30 01:41:24 UTC
Should we include this in freedesktop.org.xml or add a new
freedesktop.org.chemistry.xml file, which would also be distributed in the
shared-mime-info package?
Comment 3 Bastien Nocera 2005-11-01 03:18:31 UTC
Adding another file altogether might be easier, as long as the build work is
done as well. Niels, could you attach the file, rather than inline it?
Comment 4 Niels 2005-11-01 07:13:47 UTC
Created attachment 3675
Chemical mimetypes
Comment 5 Bastien Nocera 2006-07-09 02:48:14 UTC
Comment on attachment 3675
Chemical mimetypes

><?xml version="1.0" encoding="UTF-8"?>
><!DOCTYPE mime-info [
>	<!ELEMENT mime-info (mime-type)+>
>	<!ATTLIST mime-info xmlns CDATA #FIXED "http://www.freedesktop.org/standards/shared-mime-info">
>
>	<!ELEMENT mime-type (comment|glob|magic|root-XML)*>
<snip>
>	localName CDATA #REQUIRED>
>]>

I'm not sure whether repeating the DTD here is necessary.

><!--
>The freedesktop.org shared MIME database (this file) was created by merging
>several existing MIME databases (all released under the GPL).
>
>It comes with ABSOLUTELY NO WARRANTY, to the extent permitted by law. You may
>redistribute copies of update-mime-database under the terms of the GNU General
>Public License. For more information about these matters, see the file named
>COPYING.
>
>The latest version is available from:
>
>	http://www.freedesktop.org/standards/shared-mime-info.html
>
>To extend this database, users and applications should create additional
>XML files in the 'packages' directory and run the update-mime-database
>command to generate the output files.
>-->

The comment can go altogether.

><!--
>Informations for chemical mime types from:
>http://www.ch.ic.ac.uk/chemime/
>http://www.edcenter.sdsu.edu/repository/Mbcw/chemmime.html
>http://hackberry.chem.trinity.edu/IJC/Text/
>http://www.geocrawler.com/archives/3/319/1996/3/0/1766928/
>http://zabib.chemie.uni-erlangen.de/external/cic/tagungen/workshop95/heuer/
>
>copy into /usr/local/share/mime/packages and run /usr/local/bin/update-mime-database

That should be removed.

>ignore Warning: Unknown media type in type 'chemical/xyz'

Why do we get this warning, and why would we ignore it?

>-->
>
><mime-info xmlns="http://www.freedesktop.org/standards/shared-mime-info">
>
><!--
>Alchemy file
>-->
>	<mime-type type="chemical/x-alchemy">
>		<comment>Alchemy file</comment>
>		<glob pattern="*.alc"/>
>	</mime-type>

For every mime-type, you need to:
- Mark the comment as translatable (the element should be _comment, not comment)
- Show the acronym, and expand it where necessary
- Add magic data if it is available
- Remove the comment above each entry that says which type it is; that is
already in the text itself.
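
Two of the requested changes are mechanical: renaming comment to _comment and dropping the XML comments above each entry. A rough Python sketch of both is shown here. The helper name `apply_review_changes` is made up for illustration, and the approach relies on the standard-library parser discarding XML comments by default; expanding acronyms and adding magic data still have to be done by hand.

```python
import xml.etree.ElementTree as ET

NS_URI = "http://www.freedesktop.org/standards/shared-mime-info"

# One entry from the attachment, including the redundant XML comment.
SAMPLE = """<mime-info xmlns="http://www.freedesktop.org/standards/shared-mime-info">
<!--
Alchemy file
-->
  <mime-type type="chemical/x-alchemy">
    <comment>Alchemy file</comment>
    <glob pattern="*.alc"/>
  </mime-type>
</mime-info>
"""


def apply_review_changes(xml_text):
    """Rename <comment> to <_comment>; XML comments are lost on parsing."""
    # Serialize the shared-mime-info namespace as the default namespace
    # instead of an auto-generated "ns0:" prefix.
    ET.register_namespace("", NS_URI)
    root = ET.fromstring(xml_text)  # the default parser drops <!-- ... -->
    for element in root.iter("{%s}comment" % NS_URI):
        element.tag = "{%s}_comment" % NS_URI
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(apply_review_changes(SAMPLE))
```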

<snip>
><!--
>	<mime-type type="">
>		<comment> file</comment>
>		<glob pattern="*."/>
>	</mime-type>
>-->

And that should obviously go.

></mime-info>
Comment 6 Andrew Sobala 2007-01-10 14:56:21 UTC
On "Warning: Unknown media type in type 'chemical/xyz'"

> Why do we get this warning, and why would we ignore it?

We get this warning because the list of top-level media types (i.e. "application",
"text", "inode", etc.) is hardcoded in update-mime-database, and it doesn't
include "chemical". This should be easy to patch.
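
To illustrate the kind of check being described, here is a small Python sketch (update-mime-database itself is written in C, so this is not its actual code). The contents of `KNOWN_MEDIA_TYPES` are an assumption for illustration; the comment above only names "application", "text" and "inode". The `extra` parameter stands in for the proposed patch.

```python
# Assumed list of accepted top-level media types, for illustration only.
KNOWN_MEDIA_TYPES = frozenset({
    "application", "audio", "image", "inode",
    "message", "model", "multipart", "text", "video",
})


def check_media_type(mime_type, extra=frozenset()):
    """Return a warning string for an unknown top-level type, else None."""
    media = mime_type.split("/", 1)[0]
    if media in KNOWN_MEDIA_TYPES or media in extra:
        return None
    return "Unknown media type in type '%s'" % mime_type


if __name__ == "__main__":
    print(check_media_type("chemical/x-xyz"))
    # Accepting "chemical" is a one-line change in this sketch.
    print(check_media_type("chemical/x-xyz", extra={"chemical"}))
```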
Comment 7 Andrew Sobala 2007-01-10 14:59:17 UTC
Created attachment 8354
Chemical mimetypes

chemical_mimetypes.xml with the following changes:

* Made descriptions translatable
* Removed extraneous comments
* Added missing NCBI mimetypes for flat sequence files
(http://msdlocal.ebi.ac.uk/docs/mimetype.html from 1996, now widely in use)
* Added sub-type-of text/plain for a small selection of files that I know are
flat text

I haven't expanded all the acronyms, because I don't know what all the acronyms
are. I also don't know exhaustively which formats are based on text/plain.
Maybe the original submitter does?
Comment 8 Bastien Nocera 2007-08-01 08:09:43 UTC
Niels, any news on the comments made by Andrew?
Comment 9 Daniel Leidert 2008-01-15 10:42:18 UTC
The chemical MIME types are currently (to be honest, for a while now) collected in an external project: http://chemical-mime.sf.net (package: chemical-mime-data). The project generates the shared-mime-info entries and also contains the magic patterns for other systems, such as libmagic, gnome-mime-data, and KDE3. I also discussed this on the xdg list some time ago.

We are still discussing how to proceed with chemical/* in the future, because it has never been registered and there is currently no RFC for it. That is why the discussion on the xdg list concluded that it should not be added to the shared-mime-info database officially. So I'm a bit surprised that this bug is still open.

But I would like to request that update-mime-database not print a warning for every MIME type when it runs. I could imagine, for example, the following possible solutions:

- accept chemical/ as primary and do not print warnings
- collect the warnings and then print
  Warning: foo (x times)

  maybe add a verbose option to output a warning per MIME type (including the
  MIME type in the warning) - like it is currently
- just print one warning, that chemical/* has been found, which is not official,
  but do this just once, not for every MIME type

Because chemical/* has been adopted but is not officially accepted, I would probably prefer the second or third solution.

The chemical-mime project has also already been accepted to be hosted on fd.o. However, I left it on sf.net, but I could move it to fd.o, if someone thinks, it might be better to have it near shared-mime-info. Please don't hesitate to tell me your opinion!
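
The second proposal above (collect the warnings, then print one line per unknown media type with a count) could be sketched in Python as follows. The function name `summarize_unknown` and the exact message wording are assumptions; the set of known media types is passed in rather than taken from update-mime-database.

```python
from collections import Counter


def summarize_unknown(mime_types, known_media_types):
    """Return one warning line per unknown top-level type, with a count."""
    counts = Counter()
    for mime_type in mime_types:
        media = mime_type.split("/", 1)[0]
        if media not in known_media_types:
            counts[media] += 1
    return [
        "Warning: unknown media type '%s' (%d times)" % (media, count)
        for media, count in sorted(counts.items())
    ]


if __name__ == "__main__":
    seen = ["chemical/x-pdb", "chemical/x-xyz", "text/plain", "uri/mms"]
    for line in summarize_unknown(seen, {"application", "text", "inode"}):
        print(line)
```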
Comment 10 Daniel Leidert 2008-10-09 04:09:03 UTC
The list of MIME types that update-mime-database complains about gets longer and longer [1], and people seem to be annoyed by it. I wonder if we could turn the

g_warning("Unknown media type in type ...", name);

into a

g_message();

so that output is only enabled if -V is used? Do you really need the warning here? I understand that a warning could be useful for finding typos in media types. Maybe you could collect the warnings and then say something like this:

Unknown media type in types: ...list of all types, for which this warning was reported...

so the message appears just once?

[1] KDE defined some unusual MIME types, e.g.: uri/*, fonts/package.
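
As a rough illustration of this request, the following Python sketch uses the standard logging module as a stand-in for GLib's message functions; it does not model how g_message or -V actually behave in update-mime-database. Per-type messages go out at a lower level that is only shown in verbose mode, and a single combined warning is always emitted. The names `ListHandler`, `make_logger` and `report_unknown` are hypothetical.

```python
import logging


class ListHandler(logging.Handler):
    """Collect formatted messages in a list so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(record.getMessage())


def make_logger(verbose):
    """Create a logger that shows per-type messages only when verbose."""
    name = "mime-sketch-verbose" if verbose else "mime-sketch-quiet"
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO if verbose else logging.WARNING)
    logger.propagate = False
    handler = ListHandler()
    logger.handlers = [handler]
    return logger, handler


def report_unknown(logger, unknown_types):
    """Log each type at INFO level, then one combined warning."""
    for name in unknown_types:
        logger.info("Unknown media type in type '%s'", name)
    if unknown_types:
        logger.warning("Unknown media type in types: %s",
                       ", ".join(unknown_types))


if __name__ == "__main__":
    logger, handler = make_logger(verbose=False)
    report_unknown(logger, ["chemical/x-pdb", "uri/mms"])
    print("\n".join(handler.lines))
```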
Comment 11 Bastien Nocera 2011-05-25 07:09:42 UTC
(In reply to comment #10)
> The list of MIME types, update-mime-database complains about gets longer and
> longer [1] and people seem to be annoyed by it. I wonder, if we could turn the
> 
> g_warning("Unknown media type in type ...", name);
> 
> into a
> 
> g_message();
> 
> so output is only enabled, if -V is used? Do you really need the warning here?
> I understand, that a warning could be useful to find typos in media types.
> Maybe you could collect the warnings and then say something like this:
> 
> Unknown media type in types: ...list of all types, for which this warning was
> reported...
> 
> so the message appears just once?

I'd much rather you made the effort of registering chemical/ as a top-level media type, or changed the names of the mime-types.

> [1] KDE defined some unusual MIME types, e.g.: uri/*, fonts/package.

That doesn't mean it's correct to do so...
Comment 12 Gökçen Eraslan 2012-04-29 23:50:52 UTC
Any news on this? Without those MIME types it's not possible to open chemical data files in the corresponding viewer applications by double-clicking on them.
Comment 13 Daniel Leidert 2012-04-30 00:19:09 UTC
The chemical MIME types are part of chemical-mime (http://chemical-mime.sf.net), which has been packaged for several Linux distributions.
Comment 14 Bastien Nocera 2012-04-30 02:54:21 UTC
I'll close this as fixed then, following comment 13.

