SMILES is a user-friendly chemical language that allows input of molecular structures in molecular editors, databases, search engines and property estimation tools. For example, ChemSpider [2] accepts SMILES entries (as well as nomenclature-based names, registry numbers and InChI strings). Looking for aspirin (2-(acetyloxy)benzoic acid)? Type the SMILES notation
CC(=O)Oc1ccccc1C(=O)O
into the search field. Notice that the six carbon atoms of the aromatic benzene ring have been entered in lower case to identify them as aromatic-ring members. Also, hydrogen atoms have not been explicitly specified, since their occurrence is deduced from valence rules.
What is “hiding” behind the following notation?
FC12C3(F)C4(F)C1(F)C5(F)C4(F)C3(F)C25F
Correct, it is perfluorocubane (1,2,3,4,5,6,7,8-octafluorocubane). ChemSpider finds it either way, so it is your choice whether to type the SMILES notation or a name.
Computers “love” SMILES since they can automatically derive a connection table from the linear notation, which is essential for drawing the associated structure or calculating molecular descriptors. A name such as 2-(acetyloxy)benzoic acid, by contrast, would require automatic structure interpretation on a much higher level, which is not so easy for a computer. And entering just aspirin will give a computer a headache, speaking in human terms.
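To make the idea of "deriving a connection table" concrete, here is a minimal Python sketch. It is not how ChemSpider or any particular toolkit does it; the function names connection_table and implicit_hydrogens are made up for this illustration, and the parser assumes a small SMILES subset only: unbracketed organic-subset atoms, the bond symbols -, = and #, branches in parentheses, and single-digit ring closures. Charges, isotopes, stereochemistry and bracket atoms are not handled, and any bond between two lower-case atoms is simply treated as aromatic (order 1.5).

```python
import re

# Tokens of the assumed subset: two-letter halogens first, then one-letter
# atoms (lower case = aromatic), bond symbols, parentheses, ring digits.
TOKEN = re.compile(r"Cl|Br|[BCNOSPFI]|[bcnosp]|[-=#]|[()]|\d")
BOND_ORDER = {"-": 1, "=": 2, "#": 3}
# Assumption: only the lowest normal valence of each element is considered.
NORMAL_VALENCE = {"B": 3, "C": 4, "N": 3, "O": 2, "S": 2, "P": 3,
                  "F": 1, "Cl": 1, "Br": 1, "I": 1}


def connection_table(smiles):
    """Return (atoms, bonds): atom symbols and (i, j, order) index tuples."""
    atoms, bonds = [], []
    branch_stack = []   # atoms to return to when a branch closes
    open_rings = {}     # ring label -> (atom index, explicit bond order or None)
    prev = None         # atom that the next atom will be bonded to
    pending = None      # explicit bond order waiting for its second atom

    def order_between(i, j, explicit):
        if explicit is not None:
            return explicit
        # Simplification: two lower-case neighbours share an aromatic bond.
        return 1.5 if atoms[i].islower() and atoms[j].islower() else 1

    pos = 0
    while pos < len(smiles):
        match = TOKEN.match(smiles, pos)
        if match is None:
            raise ValueError(f"unsupported SMILES syntax at position {pos}")
        tok, pos = match.group(), match.end()
        if tok in BOND_ORDER:
            pending = BOND_ORDER[tok]
        elif tok == "(":
            branch_stack.append(prev)
        elif tok == ")":
            prev = branch_stack.pop()
        elif tok.isdigit():
            if tok in open_rings:   # second occurrence closes the ring
                start, start_order = open_rings.pop(tok)
                explicit = pending if pending is not None else start_order
                bonds.append((start, prev, order_between(start, prev, explicit)))
            else:                   # first occurrence opens the ring
                open_rings[tok] = (prev, pending)
            pending = None
        else:                       # an atom symbol
            atoms.append(tok)
            idx = len(atoms) - 1
            if prev is not None:
                bonds.append((prev, idx, order_between(prev, idx, pending)))
            prev, pending = idx, None
    if open_rings or branch_stack:
        raise ValueError("unclosed ring or branch")
    return atoms, bonds


def implicit_hydrogens(atoms, bonds):
    """Hydrogens implied per atom: normal valence minus summed bond orders."""
    used = [0.0] * len(atoms)
    for i, j, order in bonds:
        used[i] += order
        used[j] += order
    return [max(0, round(NORMAL_VALENCE[a.capitalize()] - u))
            for a, u in zip(atoms, used)]


if __name__ == "__main__":
    for smi in ("CC(=O)Oc1ccccc1C(=O)O",
                "FC12C3(F)C4(F)C1(F)C5(F)C4(F)C3(F)C25F"):
        atoms, bonds = connection_table(smi)
        print(smi, len(atoms), len(bonds), sum(implicit_hydrogens(atoms, bonds)))
```

Under these assumptions, the aspirin string gives 13 non-hydrogen atoms, 13 bonds and 8 implied hydrogens (consistent with the formula C9H8O4), while the perfluorocubane string gives 16 atoms, 20 bonds and no hydrogens at all. A real cheminformatics toolkit does far more (full grammar, aromaticity perception, stereochemistry), but the principle of reading the string left to right while tracking branches and open rings is the same.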
Keywords: molecular graphs, molecular connectivity, molecular data exchange
References
[1] David Weininger: SMILES, a Chemical Language and Information System. 1. Introduction to Methodology and Encoding Rules. J. Chem. Inf. Comput. Sci. 1988, 28, 31-36. DOI: 10.1021/ci00057a005.
[2] ChemSpider starting guide: How do you find compounds? [www.chemspider.com/GettingStarted.aspx].