GENOMES AND EVOLUTION
The evolution of molecular architecture and phylogenomics
___________________________________________________
The principal focus of research for the next several years will be at the crossroads of genomics and evolution. One important challenge in the now incipient post-genomic era relates to the ‘mapping’ of genotype, phenotype, function and fitness to each other. We are particularly interested in the origins of molecular diversification, transcript networks, biological processes that are linked to co-evolutionary phenomena (such as plant pathogenesis and symbiosis), and the study of levels and patterns of genome-wide mutation. Our research program will focus on the molecular evolution of macromolecular structure and on phylogenomics.
In the past few years, genome, proteome and transcriptome research has resulted in the rapid acquisition of nucleic acid and protein sequences. While the acquired information has been analyzed largely at the level of polymer sequence, there is continuing recognition that higher-order structure is fundamental to establishing structure-function relationships in biological macromolecules. This has led, for example, to structural genomics initiatives (e.g., the creation of a complete inventory of protein folds from structural information). Similarly, advances in crystallography have provided unusual mechanistic views of complex macromolecular ensembles, such as the RNA polymerase complex, the nucleosome, and the ribosome. The increasing acquisition of structural information therefore promises to reveal details of the function, interaction and evolution of nucleic acid and protein molecules. However, this requires the development of comparative analysis tools that focus on higher-order macromolecular structure.
The idea that biological entities can be related through a history of common descent constitutes a general and powerful organizing principle in biology and the basis for phylogenetic analysis of molecules and organisms. Phylogenies can be traced at different levels, from nucleic acid sequences, genes, and molecules to features in individuals, populations, lineages, and species. Since most functional constraints on the evolutionary divergence of molecules operate at the level of tertiary structure, three-dimensional structures are generally more evolutionarily conserved than sequences. We have therefore chosen to reconstruct phylogenetic history directly from the structure of proteins and nucleic acids. Using cladistic analysis, we have compared RNA structures at a wide range of phylogenetic levels, from the subspecies analysis of a fungal tree pathogen (Caetano-Anolles et al. 2001) to the universal tree of life (Caetano-Anolles 2002). In these studies, structural attributes were treated as ordered multi-state cladistic characters, and these characters were polarized by a state transformation sequence (grounded in the principles of statistical mechanics) that assumes that molecules are optimized by a process that increases molecular order. This phylogenetic approach has been extended to the study of a wide variety of macromolecules, and can be used to unravel evolutionary processes and uncover functional relationships in transcript RNA and protein molecules.
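As a rough illustration of what coding structure into ordered multi-state characters can look like, the following minimal Python sketch derives a few attributes from RNA secondary structures and scores differences between them. It rests on several assumptions that do not come from the studies cited above: structures are given in dot-bracket notation, the three attributes (base pairs, stems, unpaired bases) are chosen only for illustration, the linear rescaling into states is one simple option among many, and all function names are hypothetical. It performs neither character polarization nor tree search; it only shows that, for an ordered character, a change between states i and j costs |i - j| steps.

```python
def pair_table(db):
    """Map each opening-bracket index to its closing partner (dot-bracket input)."""
    stack, pairs = [], {}
    for i, ch in enumerate(db):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: " + db)
            pairs[stack.pop()] = i
    if stack:
        raise ValueError("unbalanced structure: " + db)
    return pairs


def count_stems(db):
    """Count stems, defined here as uninterrupted runs of stacked base pairs."""
    pairs = pair_table(db)
    # Pair (i, j) starts a new stem unless (i - 1, j + 1) is also a pair.
    return sum(1 for i, j in pairs.items() if pairs.get(i - 1) != j + 1)


def raw_attributes(db):
    """Illustrative attributes: number of base pairs, stems, unpaired bases."""
    return [db.count("("), count_stems(db), db.count(".")]


def encode_ordered(values, n_states):
    """Rescale integer values linearly into ordered states 0 .. n_states - 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)  # invariant character
    return [(v - lo) * (n_states - 1) // (hi - lo) for v in values]


def build_matrix(structures, n_states=3):
    """Return {taxon: [state, ...]} with one ordered character per attribute."""
    names = list(structures)
    raw = [raw_attributes(structures[n]) for n in names]
    columns = [encode_ordered(list(col), n_states) for col in zip(*raw)]
    return {n: [col[k] for col in columns] for k, n in enumerate(names)}


def ordered_steps(a, b):
    """Steps separating two taxa when every character is ordered: sum of |i - j|."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Toy structures, invented for this example only.
structures = {
    "A": "((..))",
    "B": "((..))((..))",
    "C": "(((...)))",
}
matrix = build_matrix(structures)
steps_ab = ordered_steps(matrix["A"], matrix["B"])
steps_ac = ordered_steps(matrix["A"], matrix["C"])
```

A matrix of this kind would normally be exported to a parsimony program, with the character ordering and the polarization scheme declared there.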
Current studies: (1) systematically compare the structure of proteins and nucleic acids at different evolutionary levels, (2) establish which ‘contextual’ constraints are imposed by the function and inherent properties of these molecules, and (3) delimit a structural morphospace for phylogenomic analysis. Characters that describe the degree to which macromolecules are folded, branched, plastic, modular and stable are used to infer models of molecular change and to explore the origin and diversification of life, the existence of lateral gene transfer, and the role of mRNA structure in transcript networks.