


Finding "corresponding" genes

How to identify “corresponding” genes across species


Genes in two species that originated from the same ancestral gene in the last common ancestor of the two species are good candidates for being “corresponding” genes (orthologs).

Orthologs are typically the genes in two species that are mutually most similar to each other.

Analysis

In this course, we will follow the steps of a comparative genomics analysis of plant cell wall-degrading enzymes (PCDs). Plant cell walls consist largely of cellulose and hemicellulose, which makes these two polymers some of the most abundant organic molecules on Earth and key components of the carbon cycle.

It was long thought that PCDs are produced only by fungi and certain bacteria, but in recent years, evidence has accumulated that some invertebrate animals may be able to degrade plant cell walls as well.

To find out how widespread the ability to degrade plant cell walls really is, we trace the distribution of 235 potential PCDs across all eukaryotic datasets available in the RefSeq database.

Task 1: Exploration

  1. Open a visualization of the results generated using this interactive web-viewer
  2. Wait a moment for the data to load. Once you have been redirected to the “Main profile” page, select a taxonomic rank and click the red PLOT button
  3. Explore the “Main plot” on different taxonomic levels
    1. Which information is displayed in the rows, which in the columns?
    2. What patterns can you observe in the plot and how do you interpret them?
  4. Select the “Dimension reduction” plot from the top menu and explore the plot with “Phylum”-level labels

Task 2: Finding corresponding genes

Having explored the results, let's find out how to generate them. The task at hand is to find out which genes “correspond” to each other in different species and should therefore be displayed together in one row.

Have another look at the assumptions at the top of the page. Our best bet to find “corresponding” genes between species is to identify orthologs. In practice, this means finding genes in two species that are mutually most similar to each other.

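Finding genes with significantly similar sequences is typically done with a BLAST search. To identify orthologs, we will perform a “reciprocal best hit search”: a gene from species A is searched against species B, the best hit in species B is then searched back against species A, and the two genes are accepted as orthologs only if the reverse search returns the original gene as its best hit. As an example, we will use the GH45 type cellulase of Rhizoctonia solani (XP_043186466.1).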
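
The logic of a reciprocal best hit search can be sketched in a few lines of Python. This is only a minimal illustration, not the pipeline used to produce the course results: it assumes the BLAST hits of both search directions have already been reduced to (query, subject, bit score) triples, and it uses the bit score as the only ranking criterion. All identifiers except XP_043186466.1, and all scores, are invented toy values.

```python
from typing import Dict, Iterable, List, Tuple

Hit = Tuple[str, str, float]  # (query id, subject id, bit score)


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its highest-scoring subject (ranked by bit score)."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, bitscore in hits:
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward: Iterable[Hit],
                         reverse: Iterable[Hit]) -> List[Tuple[str, str]]:
    """Return pairs (a, b) where b is the best hit of a AND a is the best hit of b."""
    fwd = best_hits(forward)   # species A gene -> best hit in species B
    rev = best_hits(reverse)   # species B gene -> best hit in species A
    return sorted((a, b) for a, b in fwd.items() if rev.get(b) == a)


# Toy data (invented for illustration): species A searched against species B ...
forward = [
    ("XP_043186466.1", "speciesB_gene1", 250.0),
    ("XP_043186466.1", "speciesB_gene2", 120.0),
]
# ... and the species B hits searched back against species A.
reverse = [
    ("speciesB_gene1", "XP_043186466.1", 245.0),
    ("speciesB_gene1", "speciesA_other", 90.0),
    ("speciesB_gene2", "speciesA_other", 130.0),
    ("speciesB_gene2", "XP_043186466.1", 118.0),
]

rbh_pairs = reciprocal_best_hits(forward, reverse)
print(rbh_pairs)
```

In this toy example only speciesB_gene1 is accepted as an ortholog candidate of the query. speciesB_gene2 is also similar to the query, but its own best hit in species A is a different gene, so the pair is not reciprocal.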

