Reaction Searching with WebReactions
A major activity for synthetic chemists is to find literature precedents for a reaction query, either a particular transformation leading to a specified compound or a more general search for matching reaction analogs at various levels of similarity. Since reaction databases became available for the computer, the traditional searching mode has been via the structures of reactant and product.
These (sub)structure-based searches often retrieved either no hits at all or too many to examine in a reasonable time. The query reaction then had to be made more general or more specific and run again. The procedure was cumbersome for the user and costly in CPU time on the database server.
WebReactions introduces a new concept for the retrieval of reactions from a large database in which reactions are indexed by the bond changes that occur. When a synthetic chemist thinks of a reaction, he envisions first the making and breaking of bonds at the reaction center as the defining nature of the reaction. Subsequently he considers the effects of surrounding groups, i.e., on rate, hindrance, or resistance to change under the reaction conditions. WebReactions mirrors this approach for indexing reaction entries in any database.
The solution to the problem of retrieval speed is to generalize and abstract the reaction entries first with a rough but rigorous description of the reaction change itself. This then affords a basis for an index of all the entries ordered by the groupings of this generalized format. Hence when a query is also formulated in the same terms, the first search instantly brings up all matched entries since they are all grouped together in the index.
In this way a relatively tiny subset containing all relevant entries is captured very quickly, irrespective of which structures are attached unchanged around the reaction center. This in turn provides a much smaller field for further searching, so that subsequent retrieval in more detail is much faster.
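The grouping idea can be illustrated with a minimal sketch. The encoding of a reaction center as a set of (atom, atom, change) triples, the function names, and the sample entries are all assumptions made for illustration; they are not the actual WebReactions format.

```python
from collections import defaultdict

# Hypothetical encoding: a reaction center is a collection of bond changes,
# each written as (atom_a, atom_b, change) with change "made" or "broken".
def center_key(bond_changes):
    """Return an order-independent, hashable key for the bond changes."""
    normalized = []
    for a, b, change in bond_changes:
        first, second = sorted((a, b))  # C-Br and Br-C are the same bond
        normalized.append((first, second, change))
    # Simplification: a frozenset ignores how often a bond change occurs.
    return frozenset(normalized)

def build_index(entries):
    """Group entry ids under the key of their generalized reaction center."""
    index = defaultdict(list)
    for entry_id, bond_changes in entries:
        index[center_key(bond_changes)].append(entry_id)
    return index

# Invented sample entries; attached, unchanged structure is deliberately absent.
entries = [
    ("rxn-001", [("C", "Br", "broken"), ("C", "O", "made")]),
    ("rxn-002", [("O", "C", "made"), ("Br", "C", "broken")]),
    ("rxn-003", [("C", "C", "made")]),
]
index = build_index(entries)

# The query is generalized in the same way, so one lookup finds the whole group.
query = [("C", "O", "made"), ("C", "Br", "broken")]
hits = index.get(center_key(query), [])
print(hits)
```

Because entries and query share one key format, the first search is a single dictionary lookup instead of a scan over all structures.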
In WebReactions the database entries are taxonomically indexed with these successively nested subheadings:
- rigorous digital generalization of the reaction class and type
- the nature of substitution surrounding the reaction center
- the nature of entering and/or leaving groups
- features in the reactant which remain unchanged in the reaction
The program then formulates the query reaction in the same terms as those above for the database entries, and so instantly locates all entries with the same features as the query. It then directly shows the number of hits on the screen.
In fact WebReactions initially moves down the nested matching criteria only as far as necessary to provide about 10-20 hits, enough for easy examination by the user. The user can subsequently fine-tune for himself the extent of similarity desired between the query and the entries found, to afford either more or less closely refined pruning of the matches. Thus tuning to a manageable number of hits is fast and facile.
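The descent through the nested criteria might be sketched as follows. The four levels follow the list above, but the tuple representation, the function names, the stopping rule, and the sample data are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical key: one value per nested level, most general first:
# (reaction class/type, substitution at center, entering/leaving groups,
#  unchanged features in the reactant)
def build_prefix_index(entries):
    """Map every key prefix (depth 1..4) to the entry ids that share it."""
    index = defaultdict(list)
    for entry_id, key in entries:
        for depth in range(1, len(key) + 1):
            index[key[:depth]].append(entry_id)
    return index

def search(index, query_key, max_hits=20):
    """Descend only as far as needed to get a manageable number of hits."""
    depth, hits = 0, []
    for d in range(1, len(query_key) + 1):
        candidates = index.get(query_key[:d], [])
        if not candidates:
            break  # nothing matches this closely; keep the previous level
        depth, hits = d, candidates
        if len(hits) <= max_hits:
            break  # few enough hits for easy examination
    return depth, hits

def hits_at(index, query_key, depth):
    """User fine-tuning: ask for a looser or tighter match explicitly."""
    return index.get(query_key[:depth], [])

# Invented data: 30 entries of one reaction class, 12 with a primary center.
entries = []
for i in range(30):
    substitution = "primary" if i < 12 else "secondary"
    key = ("C-O made, C-Br broken", substitution, "Br leaves", "none")
    entries.append((f"rxn-{i:03d}", key))
index = build_prefix_index(entries)

query = ("C-O made, C-Br broken", "primary", "Br leaves", "ester")
depth, hits = search(index, query)
print(depth, len(hits))
```

In this sketch the search stops at the second level because the 12 matches there already fall within the target range, and `hits_at` lets the user move to a shallower or deeper level afterwards.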
Practical Hints for Using WebReactions
Unlike conventional reaction database systems, WebReactions does not run a reaction substructure search. Instead it performs a customizable reaction similarity search focused on the reaction center. As a consequence, you don't have to devise a substructure that is small enough to retrieve a decent number of hits and yet specific enough not to retrieve half of the database. In WebReactions you simply draw a complete reaction, i.e. the reaction as you might draw it in a lab journal. WebReactions automatically detects any reaction centers for you and then retrieves about a dozen of the most similar reactions from the database.
Query reactions should be stoichiometric, i.e. every atom on the reactant side should also be present on the product side and vice versa. This is particularly important for carbon atoms that are part of a reaction center. Refunctionalization reactions are an exception to this rule: incoming or leaving groups attached via a hetero atom may be omitted from one side of the reaction (e.g. bromine as reactant in a bromination, or HBr as product of an elimination).
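A drawn query can be checked for atom balance before submission. The following is only a rough sketch: molecules are given as hypothetical element-count dictionaries rather than drawn structures, and the rule "carbon must balance, other elements are only reported" is a simplifying reading of the guideline above, not the check WebReactions itself performs.

```python
from collections import Counter

def atom_imbalance(reactants, products):
    """Return {element: product count - reactant count} for unbalanced elements."""
    left = sum((Counter(m) for m in reactants), Counter())
    right = sum((Counter(m) for m in products), Counter())
    elements = sorted(set(left) | set(right))
    return {e: right[e] - left[e] for e in elements if right[e] != left[e]}

def carbon_balanced(imbalance):
    """Assumed rule: a carbon mismatch is treated as an error in the query."""
    return "C" not in imbalance

# Bromination of ethene drawn without Br2 on the reactant side.
bromination = atom_imbalance(
    reactants=[{"C": 2, "H": 4}],
    products=[{"C": 2, "H": 4, "Br": 2}],
)

# A query in which a carbon atom is missing on the reactant side.
faulty = atom_imbalance(
    reactants=[{"C": 2, "H": 4}],
    products=[{"C": 3, "H": 6}],
)

print(bromination, carbon_balanced(bromination))
print(faulty, carbon_balanced(faulty))
```

The first query is unbalanced only in bromine, which this sketch tolerates; the second is flagged because a carbon atom appears on one side only.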