With the availability of complete sequences for genomes of numerous eubacteria, archaea, and eukaryotes, parallel structure/function studies of enzymes derived from a common progenitor are the best strategy for elucidating structure/function relationships for enzyme-catalyzed reactions. This approach, "genomic enzymology", allows efficient and precise identification of the structure/function relationships that are essential for catalysis. It also reveals the strategies used by Nature to evolve "new" enzymes and, therefore, provides principles for the in vitro design of novel enzymes that catalyze unnatural reactions.
We are studying three superfamilies of enzymes that are derived from common ancestors and share the ubiquitous (β/α)8-barrel fold. The members of the enolase superfamily catalyze different overall reactions initiated by abstraction of the α-proton of a carboxylate anion to form an enediolate anion intermediate that is stabilized by a Mg2+ in the active site. The members of the D-ribulose 1,5-bisphosphate carboxylase/oxygenase (RuBisCO) superfamily also utilize a Mg2+ to stabilize enediolate anion intermediates, although these are derived from phosphorylated ketoses. Finally, the members of the orotidine 5'-monophosphate (OMP) decarboxylase suprafamily catalyze different reactions that do not share any mechanistic features; our discovery of this suprafamily supports the hypothesis that Nature is opportunistic and can both identify and utilize functionally versatile active site templates in the evolution of new enzymatic activities.
The active sites of members of the enolase superfamily contain a conserved binding site for a catalytically essential Mg2+. The acid/base groups are located on the rims of the active site "cups" at the C-terminal ends of the (β/α)8-barrel domains, with one functional group located at the end of each of the eight β-strands. We have identified 20 functions within the enolase superfamily, each of which is initiated by abstraction of the α-proton of a carboxylate to generate an enolate anion that is stabilized by the Mg2+; many of the substrates are acid sugars or dipeptides. Members of the superfamily are often functionally promiscuous, i.e., in addition to their natural reaction they also catalyze an "accidental" reaction that may be exploited for the evolution of a new activity. Understanding how a conserved structure delivers different functions has allowed us to 1) predict the functions of unknown homologues in the sequence databases; and 2) redesign "old" enzymes to catalyze "new" reactions.
The members of the RuBisCO superfamily catalyze abstraction of the α-proton of a ketone to generate an enolate that is stabilized by an essential Mg2+. Most members of the superfamily catalyze the carboxylation of D-ribulose 1,5-bisphosphate to yield two molecules of 3-phosphoglycerate. However, genome sequencing projects have allowed identification of proteins that lack the active site residues required for CO2 fixation; these are called "RuBisCO-like proteins" or RLPs. We have established structure/function relationships for a family of RLPs that catalyzes a tautomerization reaction in the methionine salvage pathway and have also identified a novel isomerization reaction, involving an intermediate in methionine salvage, that likely participates in a novel, uncharacterized salvage pathway. Our efforts are focused on establishing structure/function relationships for members of the superfamily as well as deciphering the metabolic pathways in which they participate.
OMP decarboxylase, which catalyzes the penultimate step in pyrimidine nucleotide biosynthesis, is one of Nature's most efficient catalysts. The decarboxylation reaction involves the metal ion-independent formation of a vinyl carbanion intermediate. We are studying the mechanism of the reaction, with a focus on understanding how the active site both destabilizes the substrate and stabilizes the carbanion intermediate so that it can be kinetically competent. We are also studying the structural mechanism by which the "intrinsic binding energy" associated with the 5'-phosphate group of the substrate is used to enhance catalysis by promoting the required conformational changes. These studies are being conducted in collaboration with Professors John P. Richard and Tina L. Amyes, University at Buffalo.
The OMP decarboxylase suprafamily includes not only OMP decarboxylase but also 3-keto-L-gulonate 6-phosphate decarboxylase and D-arabino-hex-3-ulose 6-phosphate synthase. In contrast to OMP decarboxylase, both of these enzymes utilize Mg2+ to stabilize an enolate anion intermediate. The active site residues are remarkably conserved in the suprafamily, although the mechanisms of the reactions are not. The growing sequence databases now appear to contain novel members of the suprafamily that likely catalyze "new" reactions; we are working to identify these reactions.
As the sequence databases expand (the TrEMBL database now contains >10,000,000 unique protein sequences), we aim to identify members of all three superfamilies that have unknown functions; these sequences are determined in genome projects without regard to biological and biochemical function. Indeed, functional assignment of proteins of unknown function is a major challenge in postgenomic biology, one that limits both our understanding of the scope of Nature's diversity and our ability to exploit that diversity for biomedical and industrial applications.
To address this problem, we are leading two NIH-funded projects that are developing an integrated sequence/structure-based strategy to predict the substrates (and, therefore, functions) of unknown members of functionally diverse enzyme superfamilies that are discovered in genome projects. In these multidisciplinary projects, experts in functional biology (mechanistic enzymology), structural biology (X-ray crystallography), computational biology (bioinformatics, modeling, and in silico ligand docking), and microbiology (genetics, transcriptomics, and metabolomics) are brought together to address the problem of functional assignment of unknown proteins. This multidisciplinary approach reflects the intellectual and practical demands of assigning functions to unknown/uncharacterized enzymes.
A Program Project (P01GM071790, "Deciphering Enzyme Specificity") is focused on the functionally diverse enolase and amidohydrolase (Frank Raushel, Texas A&M) superfamilies. A new Glue Grant (U54GM093342, "Enzyme Function Initiative") is additionally focused on the functionally diverse glutathione transferase (Richard Armstrong, Vanderbilt University School of Medicine), haloalkanoic acid dehalogenase (Karen Allen, Boston University; Debra Dunaway-Mariano, New Mexico), and isoprenoid synthase (Dale Poulter, Utah) superfamilies. Protein production and structure determination are led by Steve Almo (Albert Einstein College of Medicine). Collaborators at UCSF provide expertise in bioinformatics (Patsy Babbitt) and homology modeling/in silico ligand docking (Matt Jacobson, Andrej Sali, and Brian Shoichet). John Cronan (genetics) and Jonathan Sweedler (metabolomics) at Illinois are investigating the physiological roles of novel enzymatic activities that are discovered by the in silico and enzymological approaches.