Molecular Markers and Mapping Populations
I. Molecular Markers:
-largely neutral sites of variation at the DNA sequence level
-unlike morphological markers - do not show themselves in the phenotype
-advantages - much more numerous than morphological markers, do not disturb physiology (plants remain viable)
-discovered in 1980 by Botstein, Davis et al.
Restriction fragment length polymorphism (RFLP) - the original molecular marker. RFLP markers are DNA fragments that, used as probes on Southern blots, detect restriction site polymorphisms. They are co-dominant (both homozygotes and heterozygotes can be distinguished) and can be used in all members of a species or even in related species.
-other examples of molecular markers to be discussed in class:
Randomly amplified polymorphic DNA (RAPD) - these are rarely used any more because results are not consistent from lab to lab.
Microsatellites (also called simple sequence repeats or SSRs) - these are tremendously popular now because they are co-dominant (can tell the difference between homozygotes and heterozygotes) and can be used in virtually any member of the species. SSR markers are available as PCR primers that flank a variable, repeated sequence.
Amplified fragment length polymorphism (AFLP)
AFLPs are used primarily to generate hundreds or thousands of markers in a population that is segregating for a trait of interest, and then to identify the few markers that are tightly linked to the trait.
Transposon display (TD) (modification of AFLP) to be discussed in class
Cleaved amplified polymorphic sequences (CAPS)
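The co-dominant scoring described above for RFLPs and SSRs can be illustrated with a small sketch. The allele sizes, the tolerance, and the function names below are hypothetical and chosen only for illustration; the contrast with presence/absence scoring reflects how RAPD and AFLP bands are generally scored, not a point made explicitly in these notes.

```python
# Hypothetical PCR product sizes (bp) for one SSR locus in the two parents.
PARENT_A_SIZE = 150
PARENT_B_SIZE = 162


def call_codominant(band_sizes, tolerance=1):
    """Call an SSR genotype from the band sizes (bp) seen in one plant.

    Returns "A" (homozygous for the parent A allele), "B" (homozygous for
    the parent B allele), "H" (heterozygote) or "?" (no parental band seen).
    """
    has_a = any(abs(size - PARENT_A_SIZE) <= tolerance for size in band_sizes)
    has_b = any(abs(size - PARENT_B_SIZE) <= tolerance for size in band_sizes)
    if has_a and has_b:
        return "H"
    if has_a:
        return "A"
    if has_b:
        return "B"
    return "?"


def call_dominant(band_present):
    """Score a presence/absence marker: a band cannot separate the
    homozygote from the heterozygote, so two genotypes share one class."""
    if band_present:
        return "band present (homozygote or heterozygote)"
    return "band absent (homozygous for the no-band allele)"


print(call_codominant([150]))       # one band at the parent A size
print(call_codominant([150, 162]))  # both parental bands
print(call_codominant([162]))       # one band at the parent B size
print(call_dominant(True))
```

With a co-dominant marker every F2 plant falls into one of three classes, whereas a presence/absence marker only gives two.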
-ESTs - expressed sequence tags - actually cDNAs
-many EST projects go hand in hand with genome sequencing projects
-isolate mRNA from a variety of organs, developmental times, and induction regimes (pathogen attack, high or low temperature, drought etc.); from each of these mRNA collections, make partial or complete cDNAs, then subclone or sequence directly. EST collections represent the so-called "transcriptome" - the transcribed part of the genome. Extremely valuable for organisms with huge genomes like wheat, barley, even maize (get the "most bang for your sequencing buck")
Utility of ESTs - for example, for maize there is a public collection of ~50,000 ESTs and a private (DuPont) collection of ~600,000 ESTs. These can be used in many ways, including (1) determining the number of expressed genes for a particular protein (e.g. actin, histones), (2) designing primers, based on any EST sequence, for TUSC (Mutator) knock-outs, and (3) as molecular markers - since ESTs among related organisms are also related and frequently syntenic (at the same chromosomal location), ESTs make excellent anchor markers (see below).
Anchor markers - landmarks selected from the developed marker map; selection is based on even distribution across the genome and the ability to cross-hybridize with related species. They are usually RFLPs but can also be ESTs. They are only rarely SSRs and never AFLPs or RAPDs (it is important that you understand why this is so; we will discuss in class).
Adding genetic molecular markers to a physical map:
-RFLP markers are DNA fragments that have been shown to detect polymorphism, usually on Southern blots. As such, their position on a BAC can be determined, and the position of the BAC on the physical map will already be known. The same is true for SSR markers or for ESTs.
-Correlating physical and genetic map distances: If we have 2 RFLP markers that are 2 cM apart, we can determine which BACs they reside on and from that determine the physical distance between these RFLP markers.
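The correlation between physical and genetic distance can be sketched as a small calculation. All marker names, BAC coordinates and offsets below are hypothetical values picked for illustration; only the 2 cM spacing comes from the example above.

```python
from dataclasses import dataclass


@dataclass
class MarkerPlacement:
    """One marker placed on both maps (all values here are hypothetical)."""
    name: str
    bac_start_kb: float      # start of the BAC on the physical map
    offset_in_bac_kb: float  # position of the marker within that BAC
    genetic_pos_cm: float    # position on the genetic map

    @property
    def physical_pos_kb(self) -> float:
        return self.bac_start_kb + self.offset_in_bac_kb


def kb_per_cm(a: MarkerPlacement, b: MarkerPlacement) -> float:
    """Ratio of physical to genetic distance between two markers."""
    genetic = abs(a.genetic_pos_cm - b.genetic_pos_cm)
    if genetic == 0:
        raise ValueError("markers are at the same genetic position")
    physical = abs(a.physical_pos_kb - b.physical_pos_kb)
    return physical / genetic


# Two RFLP markers 2 cM apart, located on two different BACs.
m1 = MarkerPlacement("rflp1", bac_start_kb=1200.0, offset_in_bac_kb=40.0, genetic_pos_cm=10.0)
m2 = MarkerPlacement("rflp2", bac_start_kb=1750.0, offset_in_bac_kb=90.0, genetic_pos_cm=12.0)

print(f"physical distance: {abs(m2.physical_pos_kb - m1.physical_pos_kb):.0f} kb")
print(f"ratio: {kb_per_cm(m1, m2):.0f} kb per cM")
```

The ratio is only a local estimate for the interval between the two markers; it should not be assumed to hold elsewhere in the genome.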
II. Mapping Populations
With our current level of knowledge, most of the traits in complex organisms reflect the action of unknown genes. To find (isolate, sequence etc.) those genes, you need to first map the trait to a chromosomal location (locus). Then, genes are cloned based on their map position - this is called positional cloning. Thus far, virtually all of the genes that have been cloned are single genes responsible for a trait (so-called Mendelian segregating traits). However, keep in mind that most traits (e.g. drought tolerance, yield, height) are caused by the action of many genes (so-called polygenes or quantitative trait loci or QTLs). Only in the last 2 years have QTLs been cloned. To identify the genes responsible for either Mendelian or polygenic traits, a mapping population is needed. There are several kinds of mapping populations, which serve different purposes for the plant geneticist. The major ones are described below:
F2 population: Parents with contrasting traits are crossed, giving rise to F1 progeny, which are then selfed to give an F2 population in which the trait segregates. For example, one parent with the trait of interest (like a mutant flower) is crossed with another parent with normal flowers. If the mutation is recessive, the F1 plants have normal flowers and the F2 segregates for the mutant flower (usually in a 3:1 ratio, normal to mutant). It is the F2 progeny that are analyzed for their molecular marker content. If more tissue is required than you can obtain from an F2 plant, you can self individual F2 plants and pool the F3 progeny.
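Whether an observed F2 really fits the 3:1 ratio mentioned above can be checked with a chi-square goodness-of-fit test. The sketch below uses hypothetical plant counts and a hand-written test statistic; 3.841 is the standard 5% critical value for one degree of freedom.

```python
def chi_square_3_to_1(normal: int, mutant: int) -> float:
    """Chi-square statistic for an observed F2 against a 3:1 expectation."""
    total = normal + mutant
    if total == 0:
        raise ValueError("no plants scored")
    expected_normal = 0.75 * total
    expected_mutant = 0.25 * total
    return ((normal - expected_normal) ** 2 / expected_normal
            + (mutant - expected_mutant) ** 2 / expected_mutant)


# Critical value of the chi-square distribution, 1 degree of freedom, alpha = 0.05.
CRITICAL_5PCT_DF1 = 3.841

# Hypothetical F2 counts: 152 normal-flowered and 48 mutant-flowered plants.
chi2 = chi_square_3_to_1(normal=152, mutant=48)
consistent = chi2 < CRITICAL_5PCT_DF1

print(f"chi-square = {chi2:.3f}")
print("consistent with 3:1" if consistent else "deviates from 3:1")
```

A statistic below the critical value means the counts give no reason to reject single-gene recessive segregation; it does not prove it.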
The goal is to establish linkage between the mutant flower phenotype and molecular markers. The more F2 plants you analyze, the tighter the linkage you can establish between certain markers and the trait. If you are working with a plant that has a poor genetic map, a popular way to identify markers is with AFLP. This will be described in class.
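One common way to put a number on marker-trait linkage in an F2 is sketched below, under stated assumptions: the trait is recessive (so every mutant plant carries two copies of the mutant allele), the marker is co-dominant, "A" is the marker allele from the mutant parent and "B" the allele from the normal parent. The genotype counts are hypothetical.

```python
from collections import Counter


def recombination_fraction(marker_genotypes):
    """Estimate recombination between a marker and a recessive mutation.

    marker_genotypes: marker calls ("AA", "AB" or "BB") for mutant F2 plants.
    Each "B" allele in a mutant plant marks one recombinant chromosome.
    """
    counts = Counter(marker_genotypes)
    plants = counts["AA"] + counts["AB"] + counts["BB"]
    if plants == 0:
        raise ValueError("no scored plants")
    recombinant_chromosomes = counts["AB"] + 2 * counts["BB"]
    return recombinant_chromosomes / (2 * plants)


def smallest_detectable_fraction(plants: int) -> float:
    """Smallest non-zero recombination fraction that this many mutant
    plants can reveal (one recombinant among 2 * plants chromosomes)."""
    return 1 / (2 * plants)


# Hypothetical data: 50 mutant F2 plants scored for one marker.
genotypes = ["AA"] * 46 + ["AB"] * 4
r = recombination_fraction(genotypes)

print(f"recombination fraction = {r:.2f}")
print(f"resolution with 50 plants = {smallest_detectable_fraction(50):.3f}")
print(f"resolution with 500 plants = {smallest_detectable_fraction(500):.3f}")
```

For small values the recombination fraction in percent is roughly the distance in cM. The resolution lines show why analyzing more F2 plants lets you establish tighter linkage.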
Near isogenic lines (NILs): these are usually the product of plant breeding programs in which a desirable trait is introgressed from a donor plant (sometimes a wild relative) into an agronomically acceptable cultivar (the recurrent parent, P2). The donor and recurrent parent are crossed. If the trait is dominant, it is selected for in the F1. If it is recessive, the F1 is selfed, plants showing the trait are selected in the segregating F2, and these plants are crossed back to the recurrent parent (called a backcross, F2 X P2, giving BC individuals). Progeny are again selected for the trait of interest and these plants are crossed back to P2. This cycle is repeated 7 or 8 times. It takes years! The result is a pair of strains (recurrent and NIL) that are essentially identical at all loci except for the region surrounding the gene under selection.
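Why 7 or 8 backcrosses are enough can be seen from the expected share of donor genome left at loci unlinked to the selected gene, which halves with each backcross. The sketch below is a simplification: it ignores the donor segment dragged along with the selected gene, and the function name is made up for illustration.

```python
def expected_donor_fraction(backcrosses: int) -> float:
    """Expected donor share of the genome at loci unlinked to the selected
    gene. The F1 (0 backcrosses) is 1/2 donor; each backcross halves it."""
    if backcrosses < 0:
        raise ValueError("number of backcrosses cannot be negative")
    return 0.5 ** (backcrosses + 1)


for n in range(0, 9):
    donor = expected_donor_fraction(n)
    label = "F1 " if n == 0 else f"BC{n}"
    print(f"{label}: donor {donor * 100:6.2f}%  recurrent {(1 - donor) * 100:6.2f}%")
```

After 7 backcrosses the expected donor share outside the selected region is below half a percent, which is why a polymorphism between the NIL and the recurrent parent most likely lies near the selected locus.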
Any polymorphism detected between the NIL and recurrent parent is likely to be near the selected locus.
Again, a quick way to identify markers in this region is to use AFLP.
Recombinant inbred lines (RIs): RI populations are usually created for a very different purpose than NILs or F2 populations. In the above cases, you usually (although not always) have a particular trait that you want to follow. However, it is also very important to create a single genetic map for the entire community - that is, a map that everyone can add to. So, let's say that you have cloned a gene by transposon tagging and want to know where it maps. Alternatively, you have SSR markers and you want to add them to the community map. Or, you have mapped a gene (using an F2 population, see above) and want to obtain markers nearby to initiate positional cloning. Read the Burr and Burr paper for a description of RIs.