=== KEGG ===

[[https://en.wikipedia.org/wiki/KEGG|KEGG]] is a collection of databases (Figure {{ref>KEGG1}}) and associated software tools that integrates genes into pathways and links them to functions, diseases, drugs and the like. The KEGG identifiers used for the various databases are listed in Figure {{ref>KEGG2}}.
{{ :physaliacg:kegg.png?400 }} Overview of the databases hosted by KEGG. The table is taken from the KEGG [[http://www.kegg.net/api/overview.html|overview page]].
{{ :physaliacg:kegg2.png?400 }} Overview of the identifiers used in the various KEGG databases. The table is taken from the KEGG [[http://www.kegg.net/api/overview.html|overview page]].
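To make the identifier scheme more tangible, the following minimal sketch guesses the KEGG database from the shape of an identifier. The prefix table is deliberately partial and is an assumption made for illustration (one letter followed by five digits); the function name ''classify_kegg_id'' is hypothetical and not part of any KEGG tool. The overview page remains the authoritative list.

```python
import re

# Partial, illustrative mapping of identifier shapes to KEGG databases.
# Assumption: each of these identifiers is one prefix letter plus five digits.
KEGG_ID_PATTERNS = {
    "orthology": re.compile(r"K\d{5}"),
    "compound": re.compile(r"C\d{5}"),
    "glycan": re.compile(r"G\d{5}"),
    "reaction": re.compile(r"R\d{5}"),
    "drug": re.compile(r"D\d{5}"),
    "disease": re.compile(r"H\d{5}"),
    "module": re.compile(r"M\d{5}"),
}


def classify_kegg_id(identifier):
    """Return the database name for a known identifier shape, else None."""
    for database, pattern in KEGG_ID_PATTERNS.items():
        # fullmatch ensures the whole string has the expected shape
        if pattern.fullmatch(identifier):
            return database
    return None


for example_id in ["K00001", "C00031", "hsa:10458"]:
    print(example_id, classify_kegg_id(example_id))
```

Identifiers that do not fit one of the listed shapes, such as organism-specific gene identifiers, simply return ''None'' in this sketch.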
KEGG integrates this information via KEGG Orthology (KO) groups across a comprehensive yet necessarily limited set of species. To propagate the information stored in KEGG to sequences from species that are not represented in the database, KEGG provides the annotation tool [[https://www.kegg.jp/blastkoala/|BlastKOALA]] ([[https://www.ncbi.nlm.nih.gov/pubmed/26585406|Kanehisa et al. 2016]]). In a nutshell, BlastKOALA makes heavy use of phylogenetic relationships among sequences to propagate the functional annotation of KO groups to further sequences. It is a BLAST-based tool that additionally compares domain architectures between the sequences in a KO group and a new candidate.
{{ :physaliacg:blastkoala.png?600 }} Workflow of the two tools for assigning KO numbers to an unannotated protein sequence. While KAAS relies on orthology assignments, BlastKOALA uses unidirectional BLAST searches.
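Once KO numbers have been assigned, a typical next step is to collect the query sequences per KO group. The sketch below assumes a tab-separated result list with the query identifier in the first column and an optional KO number in the second; this layout, the function names and the example records are assumptions for illustration and should be checked against the file actually downloaded from the annotation tool.

```python
from collections import defaultdict


def parse_ko_assignments(lines):
    """Map each query ID to its KO number, or to None if unassigned.

    Assumed layout: query_id<TAB>KO, where the KO column may be missing.
    """
    assignments = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        query = fields[0].strip()
        ko = fields[1].strip() if len(fields) > 1 and fields[1].strip() else None
        assignments[query] = ko
    return assignments


def group_by_ko(assignments):
    """Invert the mapping: KO number -> list of query IDs."""
    groups = defaultdict(list)
    for query, ko in assignments.items():
        if ko is not None:
            groups[ko].append(query)
    return dict(groups)


# Made-up example records, not real annotation results
example = ["seq1\tK00001\n", "seq2\n", "seq3\tK00001\n", "seq4\tK02030\n"]
assignments = parse_ko_assignments(example)
groups = group_by_ko(assignments)
print(groups)
```

Queries without a KO assignment are kept in ''assignments'' with the value ''None'', so the fraction of annotated sequences can still be computed from the same dictionary.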