===== Software list =====

Below you will find a diverse toolbox comprising numerous tools for the analysis of biological sequences. Not all of the tools will be used in the course, but you may want to look around to see what is currently available.

^Type^Name^Description^%%GUI%%^Installation^%%URL%%^
^Read simulation|ART|Illumina, 454 and SOLiD read simulator|yes|[[https://bioconda.github.io/recipes/art/README.html|bioconda]]|[[https://www.niehs.nih.gov/research/resources/software/biostatistics/art/index.cfm|Link]]|
^Read processing|FastQC|Generates summary statistics and overview information for DNA-seq and RNA-seq data|yes|[[https://anaconda.org/bioconda/fastqc|bioconda]]|[[https://www.bioinformatics.babraham.ac.uk/projects/fastqc/|Link]]|
| |SequelTools|Software package for the quality control, filtering and subread selection of PacBio reads|no|[[https://github.com/ISUgenomics/SequelTools#installation|manual]]|[[https://github.com/ISUgenomics/SequelTools|Link]]|
| |Trimmomatic|Adapter clipping and quality trimming of short read data|no|[[https://anaconda.org/bioconda/trimmomatic|bioconda/conda-forge]]|[[http://www.usadellab.org/cms/?page=trimmomatic|Link]]|
^Transcriptome assembly|Trinity|De novo reconstruction of transcriptomes from RNA-seq data|no|[[https://anaconda.org/bioconda/trinity|bioconda]]|[[https://github.com/trinityrnaseq/trinityrnaseq/wiki|Link]]|
^Genome assembly|Flye|Fast and accurate de novo assembler for single-molecule sequencing reads (suitable for PacBio HiFi reads)|no|[[https://anaconda.org/bioconda/flye|bioconda]]|[[https://github.com/fenderglass/Flye|Link]]|
| |Canu|Assembler for high-noise single-molecule sequencing data (overlap-based)|no|[[https://anaconda.org/bioconda/canu|bioconda]]|[[https://github.com/marbl/canu|Link]]|
| |Velvet|Sequence assembler for short reads (de Bruijn graph)|no|[[https://anaconda.org/bioconda/velvet|bioconda]]|[[https://www.ebi.ac.uk/~zerbino/velvet/|Link]]|
| |SPAdes|Intended for both standard isolates and single-cell MDA bacteria assemblies (de Bruijn graph)|no|[[https://anaconda.org/bioconda/spades|bioconda]]|[[http://cab.spbu.ru/software/spades/|Link]]|
| |MIRA|Whole genome shotgun and EST sequence assembler for Sanger, 454, Solexa (Illumina), IonTorrent and PacBio data (overlap-based)|no|[[https://anaconda.org/bioconda/mira|bioconda]]|[[https://sourceforge.net/p/mira-assembler/wiki/Home/|Link]]|
^Database search|NCBI BLAST|Heuristic for the rapid identification of significantly similar sequences in a sequence database using local alignments|no|[[https://anaconda.org/bioconda/blast|bioconda]]|[[https://blast.ncbi.nlm.nih.gov/Blast.cgi|Link]]|
| |NCBI Legacy BLAST|Heuristic for the rapid identification of significantly similar sequences in a sequence database using local alignments. These C toolkit binaries are no longer maintained or supported|no|[[https://anaconda.org/biocore/blast-legacy|biocore]]|[[ftp://ftp.ncbi.nlm.nih.gov/blast/executables/legacy.NOTSUPPORTED/|Link]]|
| |DIAMOND|Accelerated BLAST-compatible local sequence aligner|no|[[https://anaconda.org/bioconda/diamond|bioconda]]|[[https://github.com/bbuchfink/diamond|Link]]|
^Gene prediction|MAKER|A comprehensive pipeline for genome annotation|no|[[https://anaconda.org/bioconda/maker|bioconda]]|[[http://www.yandell-lab.org/software/maker.html|Link]]|
| |funannotate|A pipeline for gene annotation in fungal genomes|no|[[https://anaconda.org/bioconda/funannotate|bioconda]]|[[https://github.com/nextgenusfs/funannotate|Link]]|
| |TransDecoder|Identifies candidate coding regions within transcript sequences|no|[[https://anaconda.org/bioconda/transdecoder|bioconda]]|[[https://github.com/TransDecoder/TransDecoder/wiki|Link]]|
^Gene set completeness|BUSCO|Assesses genome assembly and annotation completeness with Benchmarking Universal Single-Copy Orthologs|no|[[https://anaconda.org/bioconda/busco|bioconda]]|[[https://busco.ezlab.org/|Link]]|
| |fCAT|Gene set completeness assessment tool using domain-architecture-aware targeted ortholog searches|no|[[https://github.com/BIONF/fcat#how-to-install|pip]]|[[https://github.com/BIONF/fcat|Link]]|
^Repeat annotation|RepeatMasker|Smith-Waterman based identification and optional masking of repeats provided in a repeat database|no|[[https://anaconda.org/bioconda/repeatmasker|bioconda]]|[[http://www.repeatmasker.org/|Link]]|
^Genome visualization|JBrowse Desktop|Desktop version of JBrowse that does not need any web server configuration|yes|[[https://jbrowse.org/blog/|manually]]|[[http://gmod.org/wiki/JBrowse_Desktop|Link]]|
| |JBrowse|A browser-based viewer for genomes and genome-wide annotation|yes|[[https://anaconda.org/bioconda/jbrowse|bioconda]]|[[https://jbrowse.org/blog/|Link]]|
| |IGV|Integrative Genomics Viewer. Fast, efficient, scalable visualization tool for genomics data and annotations|yes|[[https://anaconda.org/bioconda/igv|bioconda]]|[[http://software.broadinstitute.org/software/igv/home|Link]]|
| |igvtools|A set of tools for pre-processing data files for IGV|no|[[https://anaconda.org/bioconda/igvtools|bioconda]]|[[https://software.broadinstitute.org/software/igv/igvtools|Link]]|
^Structural variant detection|LUMPY|A general probabilistic framework for structural variant discovery|no|[[https://anaconda.org/bioconda/lumpy-sv|bioconda]]|[[https://github.com/arq5x/lumpy-sv|Link]]|
| |Delly|An SV caller leveraging multiple signals|no|[[https://anaconda.org/bioconda/delly|bioconda]]|[[https://github.com/dellytools/delly|Link]]|
| |Manta|An SV caller leveraging multiple signals|no|[[https://anaconda.org/bioconda/manta|bioconda]]|[[https://github.com/Illumina/manta|Link]]|
^Structural variation comparison|SURVIVOR|A toolkit to compare, merge, and generate statistics for SV VCF files|no|[[https://anaconda.org/bioconda/survivor|bioconda]]|[[https://github.com/fritzsedlazeck/SURVIVOR|Link]]|
^SAM/BAM manipulation|Samtools|A toolkit for working with SAM/BAM files|no|[[https://anaconda.org/bioconda/samtools|bioconda]]|[[https://github.com/samtools/samtools|Link]]|
| |BCFtools|A set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF|no|[[https://anaconda.org/bioconda/bcftools|bioconda]]|[[https://github.com/samtools/bcftools|Link]]|
^Read mapping|BWA|A method to align short reads|no|[[https://anaconda.org/bioconda/bwa|bioconda]]|[[https://github.com/lh3/bwa|Link]]|
| |Bowtie2|Ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences|no|[[https://anaconda.org/bioconda/bowtie2|bioconda]]|[[http://bowtie-bio.sourceforge.net/bowtie2/index.shtml|Link]]|
^SNP caller|xAtlas|A method to detect SNPs and indels|no|[[https://anaconda.org/bioconda/xatlas|bioconda]]|[[https://github.com/jfarek/xatlas|Link]]|
^RNA-Seq mapping|STAR|A method to align RNA-Seq data|no|[[https://anaconda.org/bioconda/star|bioconda]]|[[https://github.com/alexdobin/STAR|Link]]|
| |HISAT2|A method to align RNA-Seq data|no|[[https://anaconda.org/bioconda/hisat2|bioconda]]|[[https://github.com/infphilo/hisat2|Link]]|
| |Kallisto|A method to quantify transcript abundances from RNA-Seq data using pseudoalignment|no|[[https://anaconda.org/bioconda/kallisto|bioconda]]|[[https://pachterlab.github.io/kallisto/download|Link]]|
| |TopHat|A spliced read mapper for RNA-Seq|no|[[https://anaconda.org/bioconda/tophat|bioconda]]|[[http://ccb.jhu.edu/software/tophat/index.shtml|Link]]|
^Assembly evaluation|QUAST|Quality Assessment Tool for Genome Assemblies|yes|[[https://anaconda.org/bioconda/quast|bioconda]]|[[http://quast.sourceforge.net/|Link]]|
| |rnaQUAST|Quality evaluation tool for assembled transcripts| |[[https://anaconda.org/bioconda/rnaquast|bioconda]]|[[http://cab.spbu.ru/software/rnaquast/|Link]]|
^Taxonomic assignment|MEGAN|Interactive microbiome analysis tool|no|[[http://ab.inf.uni-tuebingen.de/software/megan6/|manually]]|[[http://ab.inf.uni-tuebingen.de/software/megan6/|Link]]|
| |Krona Tools|Interactive viewer for metagenome composition|yes|[[https://anaconda.org/bioconda/krona|bioconda]]|[[https://github.com/marbl/Krona/wiki|Link]]|
^Ortholog search|InParanoid|Pairwise ortholog search tool|no|[[http://software.sbc.su.se/cgi-bin/request.cgi?project=inparanoid|manually]]|[[http://software.sbc.su.se/cgi-bin/request.cgi?project=inparanoid|Link]]|
| |OMA|Orthologous Matrix project|no|[[https://omabrowser.org/standalone/#downloads|manually]]|[[https://omabrowser.org/standalone/|Link]]|
| |fDOG|Targeted ortholog search tool|no|[[https://github.com/BIONF/fDOG|pip]]|[[https://github.com/BIONF/fDOG|Link]]|
^Phylogenetic profiling|PhyloProfile|A browser-based tool for visualizing and exploring phylogenetic profiles|yes|[[https://github.com/BIONF/PhyloProfile|Bioconductor]]|[[https://github.com/BIONF/PhyloProfile|Link]]|
^Phylogeny reconstruction|RAxML|Phylogenetics - Randomized Axelerated Maximum Likelihood|no|[[https://anaconda.org/bioconda/raxml|bioconda]]|[[http://sco.h-its.org/exelixis/web/software/raxml/index.html|Link]]|
| |ProtTest3|Tool for selecting the best-fit model of amino acid replacement for the data at hand|yes|[[https://github.com/ddarriba/prottest3|manually]]|[[https://github.com/ddarriba/prottest3|Link]]|
^Tree visualization|FigTree|Graphical viewer of phylogenetic trees|yes|[[https://github.com/rambaut/figtree/|manually]]|[[https://github.com/rambaut/figtree/|Link]]|
| |iTOL|Interactive software for the visualisation and annotation of phylogenetic trees|yes|[[https://itol.embl.de|Web tool]]|[[https://itol.embl.de|Link]]|
| |matt|Interactive tree visualisation, modification and topology testing|yes|[[https://github.com/BIONF/matt#installation|pip]]|[[https://github.com/BIONF/matt|Link]]|
^Sequence alignment|MUSCLE|Multiple sequence alignment|no|[[https://anaconda.org/bioconda/muscle|bioconda]]|[[https://www.drive5.com/muscle/manual/index.html|Link]]|

===== Additional installation information =====

==== RepeatMasker ====

Once you have installed RepeatMasker, either directly or via the installation of the MAKER pipeline, you will need to install the [[https://www.girinst.org/server/RepBase/index.php|RepBase repeat library]]. To do so, identify the location of your RepeatMasker installation and download the current release of the [[https://www.girinst.org/server/RepBase/index.php|RepBase repeat library]]. Note that you will have to complete a free registration before you can download the file. Move the archive file, named //RepBaseRepeatMaskerEdition-20170127.tar.gz//, into the RepeatMasker directory in ''~/anaconda/envs/compgen/share/RepeatMasker'' and unpack it by typing ''tar -xzf RepBaseRepeatMaskerEdition-20170127.tar.gz''.

==== JBrowse ====

In case you want to use the [[http://gmod.org/wiki/JBrowse_Desktop|desktop version of JBrowse]] for your work, download the version that matches your operating system from the [[https://jbrowse.org/blog/|JBrowse website]] and follow the installation guidelines provided with the software.

===== Databases =====

**Databases and web tools**:

  * NCBI - [[https://www.ncbi.nlm.nih.gov/|https://www.ncbi.nlm.nih.gov/]]
  * KEGG - [[http://www.genome.jp/kegg/|http://www.genome.jp/kegg/]]
  * ENSEMBL - [[http://www.ensembl.org/index.html|http://www.ensembl.org/index.html]]
  * Uniprot - [[http://www.uniprot.org/|http://www.uniprot.org/]]
  * PFAM - [[http://pfam.xfam.org/|http://pfam.xfam.org/]]
  * InterPro - [[https://www.ebi.ac.uk/interpro/]]
  * HMMER - [[http://hmmer.org/|http://hmmer.org/]]
  * gNOME - [[http://ghubs.izn-ffm.intern:5000/g-nom/assemblies/list]] :!: This is an ongoing project, and we will be working with an alpha version of this web tool. The %%URL%% is reachable only from within our network.

  * [[ecoevo_molevol:course_introduction|Back to EcoEvo course]]
  * [[pbioc_basics:start|Back to PBioC course]]
  * [[digikomp_bio:course_introduction|Back to DigiKomp course]]
  * [[mbw_bioinf:mastermbw|Back to MBW course]]
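**Appendix: RepBase unpacking sketch.** The RepeatMasker step described under "Additional installation information" can be wrapped in a small shell function. This is only a minimal sketch: the function name ''install_repbase'' is a hypothetical helper introduced for illustration, and the example paths are the ones quoted in the RepeatMasker section, which may differ on your system.

```shell
#!/usr/bin/env bash
# Sketch: move a downloaded RepBase archive into the RepeatMasker directory
# and unpack it there. install_repbase is a hypothetical helper name, not
# part of RepeatMasker or RepBase.

install_repbase() {
    local archive="$1"   # e.g. RepBaseRepeatMaskerEdition-20170127.tar.gz
    local rm_dir="$2"    # e.g. ~/anaconda/envs/compgen/share/RepeatMasker

    # Refuse to continue if either the archive or the target is missing.
    if [ ! -f "$archive" ]; then
        echo "Archive not found: $archive" >&2
        return 1
    fi
    if [ ! -d "$rm_dir" ]; then
        echo "RepeatMasker directory not found: $rm_dir" >&2
        return 1
    fi

    # Move the archive into the RepeatMasker directory ...
    mv "$archive" "$rm_dir"/ || return 1
    # ... and unpack it in place (subshell keeps the caller's working directory).
    (cd "$rm_dir" && tar -xzf "$(basename "$archive")")
}
```

Under these assumptions, a call would look like ''install_repbase RepBaseRepeatMaskerEdition-20170127.tar.gz ~/anaconda/envs/compgen/share/RepeatMasker''. The checks up front are there so that a mistyped path produces a clear message instead of a half-finished installation.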