Blue Gene 2002

 IBM and NeSC workshop on Protein Science

    National e-Science Centre, Edinburgh, March 15-16 2002

   
   
The Quantum Biochemical Database
Lance M. Westerhoff and Kenneth M. Merz Jr.
The Pennsylvania State University - Department of Chemistry


In this project we are populating a database with the electronic structures (in both the gas and aqueous phases) of ~5000 high-resolution (<2 Å) protein, DNA, and RNA X-ray structures contained within the Protein Data Bank (PDB). With recent advances in linear-scaling quantum mechanics algorithms, one can create a database encompassing quantum mechanical data derived from the structures of a range of biological macromolecules in a straightforward manner. The potential applications for the data contained within such a quantum bioinformatics database (QBD) are manifest. For example, with the aid of such a database, we could provide better charge models and gain a much better understanding of the general electrostatic characteristics of biomolecules. Furthermore, through the use of energy and interaction decomposition schemes, we will be able to obtain a more detailed picture of the long- and short-range interactions that govern the structure and function of biomolecules. This information could then be used to design next-generation force fields and to provide insights into protein/small-molecule interactions and biomolecular stability and folding. Overall, the number and types of possible queries are too large to list once one merges sequence, geometric structure, environment (gas phase versus aqueous phase), and electronic structure into one addressable data structure. Recently, we successfully constructed the infrastructure and Version 1.0 interface to this database, available at http://qbiodb.chem.psu.edu/, and we are now carrying out the large number of quantum chemical calculations required to populate the database with the aid of the National Center for Supercomputing Applications (NCSA). Once we have completed this initial set of PDB structures, we will continue on to structures of lower resolution, and with time the sophistication of the database will evolve (e.g., semiempirical calculations will give way to DFT calculations), until we have characterized the electronic structures of most, if not all, of the biomolecules present within the PDB. Clearly, this is an ambitious effort, but we are convinced that its return will have a profound effect on our understanding of the structure and function of biomolecular systems.
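
To make the idea of one addressable data structure concrete, the following is a minimal sketch in Python of how sequence, geometric resolution, environment, and electronic-structure data might sit in a single queryable record. It is an illustration only and not the schema of the database described above: the names QbdEntry, Environment, and select, all field names, and the entry values are assumptions introduced here, and the charges are placeholders rather than calculated results.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class Environment(Enum):
    GAS = "gas"
    AQUEOUS = "aqueous"


@dataclass(frozen=True)
class QbdEntry:
    """Hypothetical record; field names are illustrative, not the real schema."""
    structure_id: str                  # identifier of the source structure
    molecule_type: str                 # "protein", "DNA", or "RNA"
    sequence: str                      # one-letter residue or base sequence
    resolution: float                  # X-ray resolution in angstroms
    environment: Environment           # phase the calculation was run in
    atomic_charges: Tuple[float, ...]  # one partial charge per atom

    def net_charge(self) -> float:
        """Sum of the per-atom partial charges."""
        return sum(self.atomic_charges)


def select(
    entries: List[QbdEntry],
    *,
    molecule_type: Optional[str] = None,
    environment: Optional[Environment] = None,
    max_resolution: Optional[float] = None,
) -> List[QbdEntry]:
    """Return the entries that match every criterion that is not None."""
    return [
        e for e in entries
        if (molecule_type is None or e.molecule_type == molecule_type)
        and (environment is None or e.environment == environment)
        and (max_resolution is None or e.resolution < max_resolution)
    ]


# Placeholder entries for illustration only; not real calculated data.
entries = [
    QbdEntry("demo1", "protein", "ACDK", 1.5, Environment.GAS, (0.25, -0.5, 0.25)),
    QbdEntry("demo1", "protein", "ACDK", 1.5, Environment.AQUEOUS, (0.5, -1.0, 0.5)),
    QbdEntry("demo2", "DNA", "ACGT", 2.4, Environment.GAS, (-0.5, -0.5)),
]

# One query that combines molecule type, environment, and resolution.
hits = select(
    entries,
    molecule_type="protein",
    environment=Environment.AQUEOUS,
    max_resolution=2.0,
)
```

Keeping the gas-phase and aqueous-phase results as separate records for the same structure is one possible design choice; it lets a single filter combine criteria on sequence, geometry, environment, and electronic structure.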
SPONSORS
National e-Science Centre (NeSC)
The University of Edinburgh