CANGEN is a two-stage algorithm in computational molecular graph theory that converts any valid SMILES notation of a chemical structure, however it was entered, into a unique one. The first stage is a canonicalization procedure that labels each atomic node of the molecular graph so that a canonical order for the nodes can be derived. The second stage generates the unique linear notation of the graph, starting from the lowest-labeled atomic node.
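The two-stage idea can be illustrated with a small sketch. This is not the published CANGEN procedure: it assumes an acyclic molecule with single bonds only and implicit hydrogens, and it uses a simplified neighbor-rank refinement and tie-break in place of the invariants described in the paper. The function names (`dense_ranks`, `canonical_ranks`, `unique_notation`) and the atoms/bonds input format are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def dense_ranks(keys):
    """Map each key to its 0-based position among the sorted distinct keys."""
    order = {k: i for i, k in enumerate(sorted(set(keys)))}
    return [order[k] for k in keys]


def neighbours(bonds):
    """Build an adjacency list from a list of (atom_index, atom_index) bonds."""
    nbrs = defaultdict(list)
    for a, b in bonds:
        nbrs[a].append(b)
        nbrs[b].append(a)
    return nbrs


def canonical_ranks(atoms, bonds):
    """Stage 1 (simplified): refine ranks until every atom has a unique rank."""
    n = len(atoms)
    nbrs = neighbours(bonds)
    # Initial invariant (an assumption of this sketch): degree and element symbol.
    ranks = dense_ranks([(len(nbrs[i]), atoms[i]) for i in range(n)])
    while True:
        # Refine each rank with the sorted ranks of its neighbours until stable.
        while True:
            new = dense_ranks(
                [(ranks[i], tuple(sorted(ranks[j] for j in nbrs[i])))
                 for i in range(n)]
            )
            if new == ranks:
                break
            ranks = new
        if len(set(ranks)) == n:
            return ranks
        # Tie-break: double all ranks, then lower one atom of the lowest tied class.
        tied = min(r for r in ranks if ranks.count(r) > 1)
        chosen = ranks.index(tied)
        ranks = [2 * r for r in ranks]
        ranks[chosen] -= 1
        ranks = dense_ranks(ranks)


def unique_notation(atoms, bonds):
    """Stage 2 (simplified): depth-first walk from the lowest-ranked atom."""
    ranks = canonical_ranks(atoms, bonds)
    nbrs = neighbours(bonds)

    def visit(i, parent):
        children = sorted((j for j in nbrs[i] if j != parent),
                          key=lambda j: ranks[j])
        out = atoms[i]
        for k, j in enumerate(children):
            sub = visit(j, i)
            # All but the last child are written as parenthesized branches.
            out += sub if k == len(children) - 1 else "(" + sub + ")"
        return out

    return visit(ranks.index(0), None)


# Ethanol entered in two different atom orders yields the same string.
print(unique_notation(["C", "C", "O"], [(0, 1), (1, 2)]))
print(unique_notation(["O", "C", "C"], [(0, 1), (1, 2)]))
```

Both calls print the same notation because the ranks depend only on the graph structure, not on the order in which the atoms were listed. Rings, bond orders, charges and stereochemistry would all need additional handling that this sketch leaves out.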
A CANGEN-derived notation of a molecular structure is an efficient search key for locating information about the encoded structure in a database or via the Internet. Because the key itself encodes the structure, it also carries the structural information along with it.
Keywords: molecular graphs, molecular connectivity, molecular data exchange, disambiguation, semantic web
Reference
D. Weininger, A. Weininger and J. L. Weininger: SMILES 2. Algorithm for Generation of Unique SMILES Notation. J. Chem. Inf. Comput. Sci. 1989, 29, 97-101.
DOI: 10.1021/ci00062a008.
Per Google Scholar, here's a source:
http://www.iocd.unam.mx/organica/seminario/2-3.pdf
Thanks! I've been looking to do this.