Since most cheminformaticians do not regularly read JMB, I thought it might be useful to write a word or two about a recent publication of mine, which makes the following claims:
- it presents the first method to measure the similarity of two explicit reaction mechanisms
- it shows that this is a useful thing to do, especially in the context of biological reactions
The method used can be applied to any comparison of two sequences of sets. A pairwise similarity matrix of mechanism steps is calculated using Tanimoto coefficients or Euclidean distance (see paper for details), and the mechanisms are then aligned and scored using the Needleman-Wunsch algorithm.
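To make the idea concrete, here is a minimal Python sketch of this kind of comparison. It is not the code from the paper: the step descriptors, the zero gap penalty, the normalisation by the longer mechanism, and the function names `tanimoto`, `needleman_wunsch` and `mechanism_similarity` are all assumptions made for illustration. Each mechanism is represented as a list of steps, and each step as a set of features.

```python
from typing import FrozenSet, Sequence

Step = FrozenSet[str]  # one mechanism step, described as a set of features


def tanimoto(a: Step, b: Step) -> float:
    """Tanimoto coefficient of two sets: |A & B| / |A | B|."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty steps are identical
    return len(a & b) / len(a | b)


def needleman_wunsch(mech_a: Sequence[Step], mech_b: Sequence[Step],
                     gap: float = 0.0) -> float:
    """Score of the best global alignment of two mechanisms.

    Aligned steps score their Tanimoto similarity; every unaligned step
    adds `gap` (an assumed value, not taken from the paper).
    """
    n, m = len(mech_a), len(mech_b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] + gap
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            align = score[i - 1][j - 1] + tanimoto(mech_a[i - 1], mech_b[j - 1])
            skip_a = score[i - 1][j] + gap  # step of mech_a left unaligned
            skip_b = score[i][j - 1] + gap  # step of mech_b left unaligned
            score[i][j] = max(align, skip_a, skip_b)
    return score[n][m]


def mechanism_similarity(mech_a: Sequence[Step], mech_b: Sequence[Step],
                         gap: float = 0.0) -> float:
    """Alignment score divided by the length of the longer mechanism.

    This normalisation is an assumption of the sketch.
    """
    if not mech_a or not mech_b:
        return 0.0
    return needleman_wunsch(mech_a, mech_b, gap) / max(len(mech_a), len(mech_b))


# Hypothetical step descriptors, for illustration only.
mech_a = [frozenset({"proton transfer"}),
          frozenset({"nucleophilic attack", "bond formation"}),
          frozenset({"elimination"})]
mech_b = [frozenset({"nucleophilic attack", "bond formation"}),
          frozenset({"elimination"})]

print(mechanism_similarity(mech_a, mech_b))
```

With a zero gap penalty, the last two steps of the first mechanism align perfectly with the two steps of the second, so the score is 2 out of a possible 3. A negative gap penalty would additionally punish mechanisms of different lengths.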
The mechanisms of biological reactions are taken from the MACiE database [1] of enzyme reaction mechanisms, which has been developed over several years by the Mitchell Group in collaboration with Prof. Janet Thornton (EBI, UK) and Prof. Peter Murray-Rust (Uni. of Cambridge, UK). Reaction mechanism data is taken directly from the literature, mainly from experimental work, although in some cases from theoretical studies. GLH and DEA have put in a lot of hours entering this data, standardising it, developing a dictionary of terms, and so on.
The principal results were that:
- the proposed method could identify similar mechanisms, and found some cases of convergent evolution of chemical mechanism
- similar EC numbers indicate similar mechanisms, but only once the subclass is considered (that is, the class on its own is not very indicative of the mechanism)
Both of these studies highlight the fact that applying cheminformatic techniques to biological problems can yield some interesting results.
[1] G.L. Holliday, G.J. Bartlett, D.E. Almonacid, N.M. O'Boyle, P. Murray-Rust, J.M. Thornton and J.B.O. Mitchell, MACiE: a database of enzyme reaction mechanisms, Bioinformatics, 2005, 21, 4315–4316. [Open Access]
[2] D.A.R.S. Latino and J. Aires-de-Sousa, Genome-scale classification of metabolic reactions: a cheminformatics approach, Angew. Chem. Int. Ed., 2006, 45, 2066–2069.