KEGG

KEGG is a collection of databases (Figure 1) and associated software tools that integrates genes into pathways and links them to functions, diseases, drugs and the like. The identifiers used for the various KEGG databases are listed in Figure 2.

Figure 1: Overview of the databases hosted by KEGG. The table is taken from the KEGG overview page.
Figure 2: Overview of the identifiers used in the various KEGG databases. The table is taken from the KEGG overview page.
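
As a rough illustration of how such identifiers can be told apart, the following sketch classifies an identifier by its prefix. The prefix table covers only a few widely used KEGG conventions (a prefix followed by five digits) and is an assumption made for this example, not a transcription of Figure 2. The function name `classify_kegg_id` is hypothetical.

```python
import re
from typing import Optional

# Assumed subset of KEGG identifier prefixes, for illustration only.
# Figure 2 remains the authoritative list.
PREFIXES = {
    "map": "pathway",
    "K": "orthology",
    "C": "compound",
    "D": "drug",
    "R": "reaction",
    "M": "module",
    "G": "glycan",
    "H": "disease",
}

# A prefix ("map" or a single capital letter) followed by exactly five digits.
_ID_PATTERN = re.compile(r"^(map|[A-Z])(\d{5})$")


def classify_kegg_id(identifier: str) -> Optional[str]:
    """Return the database name for an identifier, or None if it is not recognised."""
    match = _ID_PATTERN.match(identifier.strip())
    if match is None:
        return None
    return PREFIXES.get(match.group(1))


if __name__ == "__main__":
    for example in ("K00001", "map00010", "C00031", "not_an_id"):
        print(example, "->", classify_kegg_id(example))
```

Organism-specific identifiers (for example pathway maps prefixed with an organism code) are deliberately left out of this sketch.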

KEGG integrates this information via KEGG ortholog (KO) groups across a comprehensive set of species. In addition, KEGG provides the annotation tool BlastKOALA (Kanehisa et al. 2016) to propagate the information stored in KEGG to sequences from further species. In a nutshell, BlastKOALA makes heavy use of phylogenetic relationships among sequences to propagate the functional annotation of KO groups to further sequences. It is a BLAST-based tool that additionally compares domain architectures between the sequences in a KO group and a new candidate.

Figure 3: Workflow of the two tools for assigning KO numbers to an unannotated protein sequence. While KAAS relies on orthology assignments, BlastKOALA uses unidirectional BLAST searches.
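
Whichever tool is used, the result is essentially a mapping from query sequences to KO numbers, with some queries left unassigned. The sketch below shows one way such a result could be summarised. It assumes a tab-separated layout with the query ID in the first column and an optional KO number in the second. This layout, the function name `group_queries_by_ko` and the sample IDs are assumptions for illustration and not a documented output format of either tool.

```python
from collections import defaultdict


def group_queries_by_ko(lines):
    """Group query IDs by their assigned KO number.

    Each line is assumed to hold a query ID, optionally followed by a tab
    and a KO number. Queries without a KO number are collected separately.
    """
    ko_to_queries = defaultdict(list)
    unassigned = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip empty lines
        fields = line.split("\t")
        query = fields[0].strip()
        ko = fields[1].strip() if len(fields) > 1 else ""
        if ko:
            ko_to_queries[ko].append(query)
        else:
            unassigned.append(query)
    return dict(ko_to_queries), unassigned


# Hypothetical sample data: seq2 received no KO assignment.
sample = ["seq1\tK00001", "seq2", "seq3\tK00001", "seq4\tK00002"]
assigned, unassigned = group_queries_by_ko(sample)
print(assigned)
print(unassigned)
```

Grouping by KO number rather than by query makes it easy to see which ortholog groups are represented in the annotated set and how many sequences remain without an assignment.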