Integrative Biology 335:
Systematics of Plants
Molecular Systematics
Announcements:
The first Lecture Exam is on Wednesday, Feb. 25th and will be worth 10% of your final course grade. It will cover lectures 1 – 15, text readings, and the first two lecture assignments. For information on how to study and what to do if you miss an exam, click here.
As a study guide, you can download a PDF copy of an old exam by clicking here. This year, the same exam is reproduced in the back of your Class Notes (pp. 251-259). Answers will not be posted (click here to find out why). Of course, we will gladly go over your answers with you, should you have concerns.
There is the possibility of a brief lecture review on Monday, Feb. 23, after family coverage has been completed. This will not be a formal review of material, but rather an opportunity for the class to ask questions. Anybody interested?
Text:
Plant Systematics, A Phylogenetic Approach (Third Edition) by Judd, Campbell, Kellogg, Stevens and Donoghue:
Chapter 5 (Molecular Systematics), pp. 103-117. You will be responsible for relevant material covered in these pages.
General Objectives:
After studying this material you should be able to:
- Explain how the three genomes present in plant cells differ. (This information will come from your textbook reading.)
- Know the important structural and evolutionary characteristics of the chloroplast genome that make it especially favorable for phylogenetic analyses.
- Explain how different types of molecular data can be used to infer evolutionary relationships.
- Describe the differences between structural and point mutations and how each can be used in a phylogenetic analysis.
- Construct or interpret a cladogram inferred on the basis of phylogenetic analysis of molecular data.
"We are the products of the genes of our ancestors"
- The genetic material itself (the macromolecule DNA; specifically, DNA sequences) provides the most basic or fundamental characters that can be employed for phylogenetic inference and classification.
- Of the three genomes present in plant cells (nucleus, mitochondrion, chloroplast), the chloroplast genome has been the most widely used for phylogenetic inference. (In animals, the mitochondrial genome is more widely used.) These three plant genomes differ dramatically in their size, structure, and tempo and mode of evolution. Your textbook reading has more information on this subject.
- Individual genes or other DNA sequences from different species can be sequenced and compared. Indeed, whole chloroplast genomes can now be sequenced and compared!
- Basically, the more similar the DNA, the closer the evolutionary relationship (chimp and human DNA sequences are 98.4% similar, for example).
- Many more characters are available for analysis than from other types of evidence (and their interpretation is generally easier).
- Molecular data have revolutionized our view of plant relationships (see APG II results) and are now widely used for generating phylogenetic hypotheses at all hierarchical levels.
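The idea that "the more similar the DNA, the closer the relationship" can be made concrete by counting matching bases between two aligned sequences. The sketch below is only an illustration: the function name `percent_identity` and the three short sequences are made up for this example and do not come from any real organism.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions at which two sequences carry the same base."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


# Made-up 20-base sequences, for illustration only.
species_1 = "ATGGCTTCAGGATCCTTAGC"
species_2 = "ATGGCTTCAGGATCCTTAGA"  # differs from species_1 at 1 base
species_3 = "ATGACTTCGGGATCATTAGA"  # differs from species_1 at 4 bases

print(percent_identity(species_1, species_2))  # 95.0
print(percent_identity(species_1, species_3))  # 80.0
```

Under this simple measure, species_1 would be judged closer to species_2 than to species_3. Real analyses rely on shared derived characters rather than raw similarity alone, as discussed later in these notes.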
PCR: The Polymerase Chain Reaction
The automatic replication of DNA using an enzyme and repeated cooling and heating.
Produces millions of copies of DNA from one copy in under four hours!
Molecular systematics progressed greatly with the invention of the PCR technique (and with current advances in automated DNA sequencing).
A PCR vial contains all the necessary components for DNA duplication: a piece of template DNA, large quantities of the four nucleotides, large quantities of the primer sequences, and DNA polymerase. The polymerase is Taq polymerase, named for Thermus aquaticus, the bacterium from which it was isolated. T. aquaticus was discovered in Yellowstone National Park.
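The "millions of copies from one copy" figure follows from simple doubling. The sketch below assumes an idealized reaction in which every heating and cooling cycle exactly doubles the number of DNA molecules; real reactions are less efficient, and the function names are hypothetical.

```python
def copies_after(cycles: int, starting_copies: int = 1) -> int:
    """Ideal PCR yield, assuming every cycle doubles each template molecule."""
    return starting_copies * 2 ** cycles


def cycles_needed(target_copies: int, starting_copies: int = 1) -> int:
    """Smallest number of ideal cycles that reaches at least target_copies."""
    cycles = 0
    while copies_after(cycles, starting_copies) < target_copies:
        cycles += 1
    return cycles


for n in (10, 20, 30):
    print(n, copies_after(n))
# 10 1024
# 20 1048576
# 30 1073741824

print(cycles_needed(1_000_000))  # 20
```

Under the perfect-doubling assumption, about 20 cycles are enough to pass one million copies from a single starting molecule.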
Sources of Variation and Approaches Used
Mutations in DNA are of two general kinds:
1. Structural Rearrangements
- Inversions, and loss of genes and introns
2. Single nucleotide substitutions (or point mutations)
- INDIRECT INFERENCE: Restriction fragment comparisons (or restriction site mapping)
DNAs are cut with restriction endonucleases at specific places. Patterns of restriction fragments, separated by gel electrophoresis and visualized directly under UV light or by hybridization to radioactive DNA probes, can be compared and scored for phylogenetic analysis.
A sample matrix of restriction site presence and absence and the resulting phylogenetic tree
Restriction site analysis is used less widely today than it was initially, but it remains common for studying variation among closely related taxa.
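As a rough illustration of how restriction sites become a presence/absence matrix, the sketch below searches short sequences for the recognition sequences of three common enzymes (EcoRI, BamHI, HindIII). The taxon sequences are made up, the function names are hypothetical, and scoring a site by its position in ungapped sequences of equal length is a simplifying assumption; actual restriction site mapping works from fragment patterns on gels.

```python
# Recognition sequences of three common restriction enzymes.
ENZYMES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}

# Made-up homologous sequences for three taxa (illustration only).
TAXA = {
    "A": "TTGAATTCAAGGATCCTTAAGCTTAA",
    "B": "TTGAGTTCAAGGATCCTTAAGCTTAA",  # EcoRI site lost by a point mutation
    "C": "TTGAGTTCAAGGATCCTTAAGCTAAA",  # EcoRI and HindIII sites lost
}


def site_positions(seq, recognition):
    """0-based start positions of every occurrence of the recognition sequence."""
    positions = []
    start = seq.find(recognition)
    while start != -1:
        positions.append(start)
        start = seq.find(recognition, start + 1)
    return positions


def fragment_lengths(seq, recognition, cut_offset=1):
    """Fragment lengths for a linear molecule cut at every site.

    cut_offset=1 means the enzyme cuts after the first base of its site,
    which is the case for the three enzymes listed above.
    """
    cuts = [p + cut_offset for p in site_positions(seq, recognition)]
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


def site_matrix(taxa):
    """Score each (enzyme, position) site as present (1) or absent (0) per taxon."""
    sites = sorted(
        {
            (name, pos)
            for seq in taxa.values()
            for name, recognition in ENZYMES.items()
            for pos in site_positions(seq, recognition)
        }
    )
    matrix = {
        taxon: [1 if seq.startswith(ENZYMES[name], pos) else 0 for name, pos in sites]
        for taxon, seq in taxa.items()
    }
    return sites, matrix


sites, matrix = site_matrix(TAXA)
print(sites)
for taxon, row in matrix.items():
    print(taxon, row)
print(fragment_lengths(TAXA["A"], ENZYMES["EcoRI"]))
```

Each column of the resulting matrix is one character (a restriction site) and each row is one taxon, which is the form of data that a phylogenetic analysis program takes as input.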
- DIRECT INFERENCE: The Sequencing of DNA
DNA sequencing of genes, parts of genes, or noncoding regions is now widely used. This method determines the precise order of nucleotides (A, C, G, T) in a stretch of DNA.
Thousands of nucleotides (or more) can be compared, and computers are used to analyze the data and construct the phylogeny. The DNA used can be from any organism, living or dead, and even from fossils.
The Alignment of DNA Sequences. Once sequences are generated, they must be aligned. This critical step determines which bases will be compared. "Gaps" are added to infer insertion and deletion events during the evolution of these DNA regions. Gaps themselves can also be scored for phylogenetic inference.
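To show how gaps enter an alignment, here is a minimal sketch of pairwise global alignment using the Needleman-Wunsch dynamic programming method. This is one standard technique, not necessarily the one used for the alignments in this course; the scoring values (match +1, mismatch -1, gap -2), the function name, and the two short sequences are assumptions chosen for illustration.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap          # a aligned against leading gaps
    for j in range(1, cols):
        score[0][j] = j * gap          # b aligned against leading gaps

    def pair_score(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table: best score for aligning a[:i] with b[:j].
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i, j),  # align two bases
                score[i - 1][j] + gap,                   # gap in b
                score[i][j - 1] + gap,                   # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


aligned_a, aligned_b = global_align("GATTACA", "GATACA")
print(aligned_a)
print(aligned_b)
```

In this example the shorter sequence receives a single gap, which would be interpreted as one insertion or deletion event. Several equally good placements of that gap can exist, which is one reason alignment is such a critical step.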
Phylogeny of 8 species based on DNA sequencing and phylogenetic analysis of point mutations using the method of maximum parsimony
Interpretation:
- Species relationships are based on shared derived characters (synapomorphies).
- Species E is more closely related to species F than to any other species; species C is more closely related to species D than to any other species; species group E & F is more closely related to species group G & H than it is to any other species group; species group E & F is sister group to species group G & H; species group A & B is sister group to species group C, D, E, F, G & H.
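Maximum parsimony prefers the tree that requires the fewest character changes. The sketch below uses Fitch's algorithm to count the minimum number of changes a given tree requires, and compares the tree described above with an alternative that separates E from F. The nested-tuple representation of the tree, the function names, and the four-site data matrix are all made up for illustration. The search over all possible trees, which a real parsimony analysis performs, is not shown.

```python
# Tree described in the interpretation above, written as nested pairs.
TREE = (("A", "B"), (("C", "D"), (("E", "F"), ("G", "H"))))

# An alternative tree in which E and F are not sister species.
ALT_TREE = (("A", "B"), (("C", "E"), (("D", "F"), ("G", "H"))))

# Made-up aligned sequences, four sites per species (illustration only).
ALIGNMENT = {
    "A": "ACGC", "B": "ACGT", "C": "ATGC", "D": "ATGC",
    "E": "GTAC", "F": "GTAC", "G": "ATAC", "H": "ATAT",
}


def fitch(node, states):
    """Return (possible ancestral states, minimum changes) for one character."""
    if isinstance(node, str):            # a tip: its observed state, no changes
        return {states[node]}, 0
    left_set, left_cost = fitch(node[0], states)
    right_set, right_cost = fitch(node[1], states)
    shared = left_set & right_set
    if shared:                           # descendants can agree: no extra change
        return shared, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


def tree_length(tree, alignment):
    """Total number of changes the tree requires, summed over all sites."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for site in range(n_sites):
        states = {taxon: seq[site] for taxon, seq in alignment.items()}
        total += fitch(tree, states)[1]
    return total


print(tree_length(TREE, ALIGNMENT))      # 5
print(tree_length(ALT_TREE, ALIGNMENT))  # 8
```

With these made-up data the first tree needs 5 changes and the alternative needs 8, so parsimony favors the first. Sites 1 to 3 each act as a synapomorphy for one group on the first tree, while site 4 (shared by B and H) must change twice on either tree.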
The Chloroplast Genome
Diagram of the chloroplast genome
- A small, compact, circular genome of about 135-160 kbp (the smallest of the three plant genomes)
- Multicopy (20-200 copies in every chloroplast; several thousand copies in each green leaf cell; constitutes one-fourth of all DNA in a cell)
- Consists of large (LSC) and small (SSC) single-copy regions separated by two inverted repeat regions
- Inherited uniparentally from the maternal (seed) parent
- Contains some 113 genes, 20 of which contain introns
- Structural rearrangements of the genome are rare (but when they occur, they are useful phylogenetically; e.g., losses of introns, inversions)
- Completely sequenced for many flowering plant species
- Different genome regions and loci evolve at different rates, so the chloroplast genome has a wide array of taxonomic applications (more rapidly mutating loci are used to assess relationships among more closely related taxa; loci that mutate more slowly are more helpful in studies of older groups)
Gene-by-Gene Sequencing
- A gene or non-coding region of interest is chosen, isolated from a large number of plant species, and sequenced
- By far the more common (and less expensive) approach
DNA Sequencing: rbcL data matrix. The chloroplast gene rbcL (about 1428 bp in size), which encodes the large subunit of the photosynthetic enzyme rubisco, was the first gene to be sequenced widely in flowering plants, and the resultant phylogenetic trees had an enormous influence on our view of relationships among angiosperm families.
Many other chloroplast (and nuclear and mitochondrial) genes are now routinely sequenced and are refining our views of angiosperm relationships. The classification of flowering plants presented in your text and supported by the APG II is based on nucleotide sequence data from all three plant genomes (predominantly rbcL, ndhF, rpoA, rpoC2, matK, and atpB from the chloroplast genome).
Whole-Genome Sequencing
- An entire chloroplast (or nuclear) genome is sequenced and the sequences of many genes from that genome are analyzed
- Technology is continuously improving and becoming increasingly automated
- Currently, whole chloroplast genomes can be PCR-amplified and sequenced using "next generation sequencing technologies"
- The fields of genomics and phylogenetic analysis of genomic data have developed, along with bioinformatics: the use of computers to manipulate and analyze sequence data from genes and genomes
Tools and Resources for Molecular Systematic Analyses
- Phylogenetic Analysis Computer Programs
- Phylogeny Programs. A gallery of some 386 available software packages for phylogenetic analysis!
- Databases
- NCBI Taxonomy Browser. A searchable database for all known DNA sequences.
- TreeBASE. A searchable database of data matrices and phylogenetic trees maintained by the Yale Peabody Museum.
- NCBI Blast. Searching nucleotide databases using a nucleotide query
What do we do?
- Obtain plant material (field collections, greenhouse material, herbarium specimens, fossils, other systematists)
- Extract DNA using a commercial kit from minute amounts of leaf tissue
- Select method and/or genome and/or gene region to uncover source of variation depending upon questions to be addressed (species or higher-level phylogeny, infraspecific studies, population differences: RAPDs, microsatellites, minisatellites, AFLPs). If comparing DNA sequences, preliminary studies of multiple loci are carried out to determine which genes will have a level of variation appropriate for the question being asked
- If comparing genes, use PCR to amplify gene(s) of choice
- Purify PCR product and prepare for DNA sequencing
- Carry out DNA sequencing reactions using a commercial kit and/or have it sequenced at a biotechnology facility, like the W.M. Keck Center for Comparative and Functional Genomics on campus
- Process the DNA sequences for phylogenetic analysis (edit, confirm, align)
- Repeat
- Build cladograms using a variety of phylogenetic methods to infer evolutionary relationships
- Compare cladograms inferred using different molecular loci, or different kinds of systematic data
- Knowing the evolutionary history of a group permits testing of other evolutionary hypotheses, such as phenotypic character evolution, patterns and processes of diversification, historical biogeography, rates of evolution, and co-evolutionary phenomena, and may also result in the production of a phylogenetic-based classification for the group.
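The preliminary step of checking whether a locus has an appropriate level of variation can be sketched by counting variable and parsimony-informative sites in an alignment. A site is treated here as parsimony-informative when at least two different states each occur in at least two taxa. The two four-taxon alignments and the function names are made up for illustration, and every symbol (a gap character included) is simply treated as a state.

```python
from collections import Counter


def is_variable(column):
    """True if the aligned site shows more than one state."""
    return len(set(column)) > 1


def is_parsimony_informative(column):
    """True if at least two states each occur in at least two taxa."""
    counts = Counter(column)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def summarize_locus(alignment):
    """Count total, variable, and parsimony-informative sites in an alignment."""
    columns = list(zip(*alignment))      # one tuple of states per aligned site
    return {
        "sites": len(columns),
        "variable": sum(is_variable(c) for c in columns),
        "informative": sum(is_parsimony_informative(c) for c in columns),
    }


# Made-up alignments of the same four taxa at two loci (illustration only).
slow_locus = ["ATGCCGTA", "ATGCCGTA", "ATGCCGTA", "ATGCAGTA"]
fast_locus = ["ATGCCGTA", "ATGTCGTA", "ACGTCGAA", "ACGCCGAT"]

print(summarize_locus(slow_locus))
print(summarize_locus(fast_locus))
```

In this toy comparison the slowly evolving locus has one variable site and none that are informative, so it could not resolve relationships among these four taxa, whereas the faster locus has three informative sites.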