Document Detail

Rapid isolation of gene homologs across taxa: Efficient identification and isolation of gene orthologs from non-model organism genomes, a technical report.
Jump to Full Text
MedLine Citation:
PMID:  21362165     Owner:  NLM     Status:  PubMed-not-MEDLINE    
BACKGROUND: Tremendous progress has been made in the field of evo-devo through comparisons of related genes from diverse taxa. While the vast number of species in nature precludes a complete analysis of the molecular evolution of even one single gene family, this would not be necessary to understand fundamental mechanisms underlying gene evolution if experiments could be designed to systematically sample representative points along the path of established phylogenies to trace changes in regulatory and coding gene sequence. This isolation of homologous genes from phylogenetically diverse, representative species can be challenging, especially if the gene is under weak selective pressure and evolving rapidly.
RESULTS: Here we present an approach - Rapid Isolation of Gene Homologs across Taxa (RIGHT) - to efficiently isolate specific members of gene families. RIGHT is based upon modification and a combination of degenerate polymerase chain reaction (PCR) and gene-specific amplified fragment length polymorphism (AFLP). It allows targeted isolation of specific gene family members from any organism, only requiring genomic DNA. We describe this approach and how we used it to isolate members of several different gene families from diverse arthropods spanning millions of years of evolution.
CONCLUSIONS: RIGHT facilitates systematic isolation of one gene from large gene families. It allows for efficient gene isolation without whole genome sequencing, RNA extraction, or culturing of non-model organisms. RIGHT will be a generally useful method for isolation of orthologs from both distant and closely related species, increasing sample size and facilitating the tracking of molecular evolution of gene families and regulatory networks across the tree of life.
Alison Heffer; Leslie Pick
Related Documents :
2341155 - Evolution of human alpha 1-acid glycoprotein genes and surrounding alu repeats.
20004235 - Characterization of luciferases and its paralogue in the panamanian luminous click beet...
19001295 - High nucleotide divergence in developmental regulatory genes contrasts with the structu...
20544345 - The complete inventory of receptors encoded by the rat natural killer cell gene complex.
21765995 - The role of epigenetic alterations in papillary thyroid carcinogenesis.
23186325 - Analyses of expressed sequence tags in neurospora reveal rapid evolution of genes assoc...
Publication Detail:
Type:  Journal Article     Date:  2011-03-01
Journal Detail:
Title:  EvoDevo     Volume:  2     ISSN:  2041-9139     ISO Abbreviation:  Evodevo     Publication Date:  2011  
Date Detail:
Created Date:  2011-03-14     Completed Date:  2011-07-14     Revised Date:  2011-11-21    
Medline Journal Info:
Nlm Unique ID:  101525836     Medline TA:  Evodevo     Country:  England    
Other Details:
Languages:  eng     Pagination:  7     Citation Subset:  -    
Program in Molecular & Cell Biology and Department of Entomology, University of Maryland, 4112 Plant Sciences Building, College Park, MD 20742, USA.
Export Citation:
APA/MLA Format     Download EndNote     Download BibTex
MeSH Terms

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine

Full Text
Journal Information
Journal ID (nlm-ta): EvoDevo
ISSN: 2041-9139
Publisher: BioMed Central
Article Information
Download PDF
Copyright ©2011 Heffer and Pick; licensee BioMed Central Ltd.
Received Day: 16 Month: 12 Year: 2010
Accepted Day: 1 Month: 3 Year: 2011
collection publication date: Year: 2011
Electronic publication date: Day: 1 Month: 3 Year: 2011
Volume: 2First Page: 7 Last Page: 7
ID: 3055837
Publisher Id: 2041-9139-2-7
PubMed Id: 21362165
DOI: 10.1186/2041-9139-2-7

Rapid isolation of gene homologs across taxa: Efficient identification and isolation of gene orthologs from non-model organism genomes, a technical report
Alison Heffer1 Email:
Leslie Pick1 Email:
1Program in Molecular & Cell Biology and Department of Entomology, University of Maryland, 4112 Plant Sciences Building, College Park, MD 20742, USA


One focus of evolutionary biologists is to understand how changes in regulatory and coding regions of genes contribute to species evolution and adaptation [1,2]. This requires sequence comparisons across distantly related taxa as well as among closely related species. A major limitation in studying molecular evolution is the amount of comprehensive sequence data available to track these changes in genes and their networks. Standard approaches include comparisons across widely divergent model organisms, comparison of gene sequences that have been deposited in databases, and comparisons of whole genome sequences. This can result in an incomplete matrix of information about the lineages of particular gene families, making it difficult to trace steps leading to functional changes in regulatory and coding sequences. Additionally, the sequence conservation of duplicated and diverged genes within gene families [3,4] poses a challenge: How can we identify a particular member of a gene family without isolating and screening through closely-related homologs? Here we report a strategy to efficiently isolate genes from genomic DNA that can be used to obtain sequence information from un-sequenced genomes and non-model organisms not easily reared in the laboratory. Rapid Isolation of Gene Homologs across Taxa (RIGHT) is based on the fact that homologous genes (both paralogs and orthologs) generally show conservation of at least one domain, even if other parts of the sequence are under weaker selective pressure. For example, the Hox proteins have retained the conserved DNA binding domain after duplication and divergence [5,6]. While not forging fundamentally new technology, this approach combines and modifies existing procedures to facilitate the rapid isolation of genes, allowing sampling of a large number of taxa.

RIGHT Methodology

RIGHT methodology utilizes degenerate polymerase chain reaction (PCR) and gene-specific amplified-length fragment polymorphism (AFLP) to allow for rapid gene isolation. First, degenerate primers are designed to amplify a small region of less than 200 bases of the conserved domain characteristic of the gene family (Figure 1, Step 1). One primer is derived from the unique signature motif in the homolog of interest, while the other can be shared with other family members. A variation of touchdown PCR is used that is optimized for these degenerate primers (see attached protocol, Additional file 1). Nested PCR is done to ensure that the correct gene-family member is amplified. The PCR product is run on an agarose gel, purified and sequenced. The product can be positively identified by characteristic residues in the homolog of interest, along with BLAST searches against other species. Together, this allows for increased confidence that the isolated gene product corresponds to the gene of interest. From this short DNA sequence obtained from degenerate PCR, gene-specific primers are designed for subsequent reactions.

Sequence up- and downstream of the conserved region (obtained in Step 1, Figure 1) is next isolated by modifications of AFLP and TE-display techniques [7-12] that allow selective amplification of only the gene sequence of interest. Traditional AFLP uses restriction enzymes to digest genomic DNA followed by ligation of adapters of known sequence to DNA ends. Adapter-specific primers are used in subsequent PCRs to amplify DNA fragments, which are then separated on a gel and analyzed. RIGHT uses the basic idea of AFLP up to the amplification step; however, instead of amplifying DNA fragments using adapter sequences as both primers (which generates many fragments), an adapter-specific primer is used as one primer and a gene-specific primer (derived from degenerate PCR used in Figure 1, Step 1) as the other primer. Thus, only a sequence from the gene of interest is isolated. The digestion of genomic DNA and ligation of adapters is done in a single step (Figure 1, Step 2). Adapter sequences are designed to anneal to, but destroy, restriction sites in order to avoid re-digestion in this combined restriction/ligation reaction. Several different restriction digests are set up in parallel to provide different-length PCR templates covering the gene of interest. This is also beneficial because restriction site locations are not known for genomes that have not been sequenced. The digestion/ligation is followed by two rounds of nested PCR (Figure 1, Step 3), which functions to increase specificity of primer binding and the amount of product. After the PCR product is amplified and sequenced, new gene-specific primers are designed at the sequence ends to repeat PCRs with a different restriction digest/ligation as template in order to extend the sequence. By repeating this process, one can "walk" along the genomic sequence to isolate the entire coding sequence (Figure 1, Step 4).

In most cases only one clear product was observed after nested PCR; however, occasionally there were several. In these situations, either all products were sequenced or products were re-amplified using the same primers or another nested set to reduce the number of products. In cases where multiple bands persisted, it was usually due to restriction sites that were very close together in the genome and almost all of the sequenced regions overlapped. After a new sequence has been isolated, its continuity is always checked by PCR with primers at extreme opposite ends of the sequence that has been obtained to make sure the sequence being isolated is contiguous with that upstream and/or downstream (Figure 1, Step 5). This is very important because, although infrequent, ligation may occur between genomic DNA fragments in Step 2. As demonstrated, RIGHT provides efficiency and saves time when compared to other protocols. This combination is a powerful method for obtaining full gene sequence information, including coding and regulatory regions.

RIGHT isolation of homeobox and nuclear receptor genes

RIGHT has been used successfully in our laboratory to efficiently isolate specific members of several large gene families. The technique was first developed to isolate a rapidly-evolving member of the Hox gene family, fushi tarazu (ftz) [13-15]. First, degenerate PCR primers were designed based upon signature residues encoding the amino-terminal end of the DNA-binding homeodomain and another highly conserved region with low degeneracy approximately 150 bases downstream (Figure 2A, arrows). Ftz homologs were positively identified by characteristic residues in the homeodomain. Next, gene-specific AFLP was carried out using a combination of unique restriction/ligation templates for PCR with one ftz-specific primer and one adapter-specific primer (Figure 2B). All products were sequenced and gene-sequence continuity verified by PCR with genomic DNA and primers designed to the extreme 5' and 3' ends of sequence that had been isolated. Full-length ftz sequences, including putative introns, were isolated by genomic walking until translation initiation and stop codons were identified. Using RIGHT allowed us to isolate ftz genes from diverse arthropods representing approximately 550 million years of evolutionary divergence [16], including the dermapteran Forficula auricularia and archaeognathan Pedetontus saltator (Figure 2C). Additionally, genomic DNA of two non-model beetles was used for degenerate PCR to obtain the ftz homeobox, and in combination with RACE on embryonic cDNA, full-length ftz sequences were obtained [16]. To date, we have isolated 2 full-length and 10 partial ftz genes from a range of non-model organisms using RIGHT.

In addition to ftz, we isolated other homeobox-containing genes such as extradenticle (exd) and the orphan nuclear receptor ftz-f1 from multiple species with great success (unpublished). RIGHT was used to isolate partial exd sequences from Thermobia domestica (firebrat), Callosobruchus maculatus (beetle), and Folsomia candida (collembolan). In combination with RACE, full-length exd coding regions were isolated from these species. Several partial ftz-f1 sequences were isolated, including Artemia salina (brine shrimp), Folsomia candida (collembolan), Thermobia domestica (firebrat), Callosobruchus maculatus (beetle), Dermestes maculatus (beetle) Oncopeltus fasciatus (milkweed bug), and Acyrthosiphon pisum (aphid). As for exd, full-length ftz-f1 sequences have been obtained from many of these organisms in combination with RACE. For this work, as per experimental design, sequences were obtained from species representing key points in arthropod phylogeny to allow for systematic analysis of a small network of functionally related genes from different families (ftz, ftz-f1, exd). Thus far, every gene that we have attempted to isolate from any chosen species using RIGHT has been obtained.


The ability to isolate homologous genes from diverse taxa will empower studies of molecular evolution of genes, families and gene networks. In the past, these approaches were limited by absence of genomic information. Even though genome sequencing is now practical for a larger number of species, it is unlikely to make a dent in the millions of species on Earth. Similarly, investments are being made in developing new model systems, to expand on the standard fly, mouse and worm systems. However, the investment to bring a new model system up to speed is substantial and it is neither necessary nor practical to fully develop hundreds of genetic model systems. We suggest that these approaches, while enormously important for the field of evo-devo, are not always necessary to answer specific evolutionary questions. RIGHT provides a fast and efficient way to isolate genes, including coding regions and candidate cis-regulatory regions, and overcomes many practical constraints, realistically allowing for the isolation of 10s if not 100s of genes from families or gene networks to study molecular evolution across divergent taxa or within specific clades. This approach obviates common limitations, such as genome sequence availability or rearing species in the lab. It has been used successfully to isolate specific members of several large gene families, allowing for a comparative analysis over millions of years of evolutionary time.


AFLP: Amplified Fragment Length Polymorphism; Antp: Antennapedia; exd: extradenticle; ftz: fushi tarazu; ftz-f1: fushi tarazu factor 1; PCR: Polymerase Chain Reaction; RACE: Rapid Amplification of cDNA Ends; RIGHT: Rapid Isolation of Gene Homologs across Taxa; Scr: Sex combs reduced; TE-display: Transposable Element-display

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

AH participated in experimental design, carried out experiments and participated in writing the paper. LP participated in experimental design, analysis of results and writing the paper.

Supplementary Material Additional file 1

Heffer and Pick Supplementary Information Protocol. A detailed description of protocol for the technique described in this manuscript.

Click here for additional data file (2041-9139-2-7-S1.PDF)


We thank Arun Subramanian, Jerry Regier, Dave Hawthorne and Jeff Shultz for helpful suggestions and David O'Brochta for comments on the manuscript. AH acknowledges support from the University of Maryland's Graduate Student Summer Fellowship. This work was supported by NSF grant IBN0641717 to L.P.

Schlosser G,Wagner GP,Modularity in Development and EvolutionYear: 2004Chicago: University of Chicago Press
Carroll SB,Grenier JK,Weatherbee SD,From DNA to diversity: molecular genetics and the evolution of animal designYear: 2005SecondOxford: Blackwell Science Ltd
Ohno S,Evolution by gene duplicationYear: 1970Berlin: Springer-Verlag
Lynch M,Force A,The probability of duplicate gene preservation by subfunctionalizationGeneticsYear: 200015445947310629003
McGinnis W,Krumlauf R,Homeobox genes and axial patterningCellYear: 19926828330210.1016/0092-8674(92)90471-N1346368
Gehring WJ,Kloter U,Suga H,Evolution of the Hox gene complex from an evolutionary ground stateCurr Top Dev BiolYear: 2009883561 full_text. 19651301
Vos P,Hogers R,Bleeker M,Reijans M,van de Lee T,Hornes M,Frijters A,Pot J,Peleman J,Kuiper M,et al. AFLP: a new technique for DNA fingerprintingNucleic Acids ResYear: 1995234407441410.1093/nar/23.21.44077501463
Beeman RW,Stauth DM,Rapid cloning of insect transposon insertion junctions using 'universal' PCRInsect Mol BiolYear: 19976838810.1046/j.1365-2583.1997.00159.x9013259
Casa AM,Brouwer C,Nagel A,Wang L,Zhang Q,Kresovich S,Wessler SR,Inaugural article: the MITE family heartbreaker (Hbr): molecular markers in maizeProc Natl Acad Sci USAYear: 200097100831008910.1073/pnas.97.18.1008310963671
Biedler J,Qi Y,Holligan D,della Torre A,Wessler S,Tu Z,Transposable element (TE) display and rapid detection of TE insertion polymorphism in the Anopheles gambiae species complexInsect Mol BiolYear: 20031221121610.1046/j.1365-2583.2003.00403.x12752653
Hawthorne DJ,AFLP-based genetic linkage map of the Colorado potato beetle Leptinotarsa decemlineata: sex chromosomes and a pyrethroid-resistance candidate geneGeneticsYear: 200115869570011404333
Subramanian RA,Arensburger P,Atkinson PW,O'Brochta DA,Transposable element dynamics of the hAT element Herves in the human malaria vector Anopheles gambiae s.sGeneticsYear: 20071762477248710.1534/genetics.107.071811
Telford MJ,Evidence for the derivation of the Drosophila fushi tarazu gene from a Hox gene orthologous to lophotrochozoan Lox5Curr BiolYear: 20001034935210.1016/S0960-9822(00)00387-010744975
Lohr U,Yussa M,Pick L,Drosophila fushi tarazu: a gene on the border of homeotic functionCurr BiolYear: 2001111403141210.1016/S0960-9822(01)00443-211566098
Lohr U,Pick L,Cofactor-interaction motifs and the cooption of a homeotic Hox protein into the segmentation pathway of Drosophila melanogasterCurr BiolYear: 20051564364910.1016/j.cub.2005.02.04815823536
Heffer A,Shultz J,Pick L,Surprising flexibility in a conserved Hox transcription factor over 550 million years of evolutionProc. Natl. Acad. Sci. USAYear: 2010107180401804510.1073/pnas.1010746107


[Figure ID: F1]
Figure 1 

Overview of RIGHT technique used to isolate homologous genes from large gene families. All steps are described in the text. Oligonucleotides that were annealed to make the adapter destroyed the restriction site. All reverse primers were ordered with the 5' end phosphorylated, or were phosphorylated before annealing with an appropriate enzyme. For example, MseI digest/ligation: F-5' GACGATGAGTCTTGAGTTCAGTCTGTA, R-5'PhosTATACAGACTGAACTCAAGACTCATC; XhoI: F-5' GACGATGAGTCTTGAGTTCAGTCTGTA, R-5'PhosTCGATACAGACTGAACTCAAGACTCATC

[Figure ID: F2]
Figure 2 

Isolation of ftz homologs using RIGHT. A) Homolog-specific residues guide degenerate PCR design. Different Hox proteins have different N-terminal regions (grey shaded region, with differences highlighted in yellow) that can be used for isolation of one family member. The arrows indicate the regions used for degenerate primer design to isolate ftz. The forward degenerate primer makes use of the signature motifs in the N-terminal region, allowing for specific amplification of one member of the Hox gene family. (Drosophila melanogaster Antp: Dm-Antp; D. melanogaster Scr: Dm-Scr; D. melanogaster Ftz: Dm-Ftz; Tribolium castaneum Ftz: Tc-Ftz; Schistocerca gregaria Ftz: Sg-Ftz). B) Isolation of ftz from genomic DNA of non-model organism. A schematic of one application of our approach to isolate new homologs is shown. C) Degenerate PCR was used to isolate the ftz homebox of P.salt and F.auri, and full-length ftz sequences were obtained using different restriction digests/ligations and subsequent PCRs. For P. salt, three fragments were obtained by RIGHT and sequenced after degenerate PCR identified the ftz homeobox; fragment sizes are (from left to right): MseI-EcoRI 320 bp, EcoRI-MspI 114 bp, MspI-EcoRI 945 bp. For F. auri, three fragments were also obtained by RIGHT and sequenced after degenerate PCR identified the ftz homeobox; fragment sizes are (from left to right): EcoRI-MseI 273 bp, MseI-MspI 383 bp, MspI-XhoI 875 bp. Homeobox regions are shown in grey, and other coding regions in blue.

Article Categories:
  • Short Report

Previous Document:  Maternal care practices among the ultra poor households in rural Bangladesh: a qualitative explorato...
Next Document:  Confirmed local endemicity and putative high transmission of Schistosoma mansoni in the Sesse Island...