One of the main challenges in bioinformatics analyses, though frequently neglected, is setting up the computing environment so that your analyses run smoothly. Rather than providing you with a nice and shiny pre-installed environment, we will guide you through the main steps of setting up your system on your own. Please note that installing software for comparative genomics works quite well on Linux and macOS, but if you run Microsoft Windows you are likely to encounter considerable issues. This is because much of the software for biological sequence analysis has not been implemented for this operating system. In this case, you may want to consider installing a Linux distribution on a virtual machine.
Whenever possible, we will use the conda package manager to install the software we need for the individual analysis steps. The main advantage of this package manager is that it installs not only the program you wish to use, but also all the dependencies you are often not aware of. The conda package manager, which can be easily installed on your system, provides access to the Anaconda cloud. Using the conda package manager is fairly straightforward, and we will guide you through the first steps. However, reading the documentation and using cheat sheets certainly cannot hurt. Sometimes, it may also help to take a look at the Anaconda glossary to understand some of the terms used in the context of conda.
We will be using Miniconda in the course; this is sufficient for our purposes.
The standard installation of Anaconda becomes considerably slow after a few different environments have been set up. We therefore recommend using Mamba, which is much faster than Anaconda but otherwise behaves just as Anaconda does. In fact, Mamba is built upon Anaconda, so installing Mamba will also install a bare-bones version of Anaconda.
After installing mamba, you can replace the conda command you use for creating environments and installing packages with mamba to drastically improve speed. Throughout this wiki, the conda and mamba commands will be used interchangeably. You are free to use either one, but mamba will be faster most of the time.
Download the x86_64 Linux installer from the miniforge GitHub page:
wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
bash Miniforge3-Linux-x86_64.sh
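The file name used in the download step is specific to 64-bit Linux. As a minimal sketch, assuming the release assets follow the Miniforge3-<OS>-<architecture>.sh naming pattern that the miniforge README uses, the name can be derived from uname so that the same two commands also fit macOS. The helper name miniforge_installer_name is made up for this illustration, and the commands are only printed so that you can check them before running anything.

```shell
# Hypothetical helper: build the installer file name for the current machine.
# Assumes assets are named Miniforge3-<OS>-<arch>.sh, for example
# Miniforge3-Linux-x86_64.sh as used in this guide.
miniforge_installer_name() {
    printf 'Miniforge3-%s-%s.sh\n' "$(uname)" "$(uname -m)"
}

installer="$(miniforge_installer_name)"
url="https://github.com/conda-forge/miniforge/releases/latest/download/${installer}"

# Print the commands instead of executing them.
echo "wget ${url}"
echo "bash ${installer}"
```

If the printed file name does not appear on the miniforge releases page, fall back to picking the installer by hand.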
Do you wish the installer to initialize Anaconda3 by running conda init? [yes|no]
Answer with:
yes
This directly activates conda's base environment whenever you open a terminal.
source <path to conda>/bin/activate
conda init
For changes to take effect, close and re-open your current shell.
So close your terminal and open a new one.
(base) francisca@workstation1:~$
The “(base)” prefix indicates that the base conda environment is activated. Now type:
conda -h
This should print some basic help to your terminal. If all is good, proceed to the next step.
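If the help text does not appear, a quick way to narrow down the problem is to ask the shell whether it can find the commands at all. The following is a small sketch using the POSIX command -v lookup; the function name have_tool is an assumption made for this example.

```shell
# Return success if the given command (program or shell function) is known
# to the current shell. "command -v" looks it up without running it.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

for tool in conda mamba; do
    if have_tool "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: not found, try opening a new terminal"
    fi
done
```

A "not found" for conda right after installation usually means the shell has not been re-opened since conda init was run.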
mamba create --name compgen python=3.9
conda activate compgen
conda deactivate
If you want to switch between environments, please deactivate the current environment first, and only then activate the new environment!
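Before switching, it helps to know which environment is currently active. Besides the prompt prefix, conda exports the variable CONDA_DEFAULT_ENV while an environment is activated. The sketch below reads that variable; the function name active_env and the environment name otherenv are placeholders chosen for this example.

```shell
# Print the name of the active conda environment, or "none" if the
# variable is unset (no environment activated in this shell).
active_env() {
    echo "${CONDA_DEFAULT_ENV:-none}"
}

echo "currently active: $(active_env)"

# Switching from compgen to a second environment (placeholder name
# "otherenv") then follows the order described above:
#   conda deactivate
#   conda activate otherenv
```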
conda env remove -n EnvNAME
Please replace EnvNAME with the name of your environment.
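Removing an environment cannot be undone, so a small guard around the command can prevent accidents. The following is only a sketch: the wrapper name remove_env_safely is hypothetical, and it assumes that you want to refuse an empty name and the environment you are currently working in (detected through CONDA_DEFAULT_ENV).

```shell
# Hypothetical wrapper around "conda env remove -n".
# Refuses to act on an empty name or on the currently active environment.
remove_env_safely() {
    if [ -z "$1" ]; then
        echo "no environment name given" >&2
        return 1
    fi
    if [ "$1" = "${CONDA_DEFAULT_ENV:-}" ]; then
        echo "'$1' is active, run 'conda deactivate' first" >&2
        return 1
    fi
    conda env remove -n "$1"
}

# Example call (not executed here):
#   remove_env_safely compgen
```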
Type conda info into a shell. You will obtain something like the following:
active environment : compgen
active env location : /Users/ingo/anaconda/envs/compgen
shell level : 1
user config file : /Users/ingo/.condarc
populated config files : /Users/ingo/.condarc
conda version : 4.5.11
conda-build version : 2.0.2
python version : 2.7.12.final.0
base environment : /Users/ingo/anaconda (writable)
channel URLs : https://conda.anaconda.org/conda-forge/osx-64
conda install --help
Do not install packages into the base environment. This might cause Mamba/Anaconda to malfunction. Always create a new environment and install packages in there:
mamba install CHANNELNAME::PACKAGENAME
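The advice about the base environment can be turned into a habit with a small wrapper. This is a sketch under assumptions: the function name install_outside_base is invented for this example, it relies on conda setting CONDA_DEFAULT_ENV for the active environment, and it treats a shell without any active environment like base. The package bioconda::mafft in the comment is just an illustrative choice.

```shell
# Hypothetical wrapper: only call "mamba install" when a non-base
# environment is active.
install_outside_base() {
    if [ "${CONDA_DEFAULT_ENV:-base}" = "base" ]; then
        echo "activate a non-base environment first, e.g. 'conda activate compgen'" >&2
        return 1
    fi
    mamba install "$@"
}

# Example call after "conda activate compgen" (not executed here):
#   install_outside_base bioconda::mafft
```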
conda config --add channels bioconda
conda config --add channels conda-forge
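The order of these two commands matters, because conda config --add places each new channel at the top of the channel list, and channels higher in the list take priority. The first part of the sketch below only mimics this behaviour with a plain shell variable (the add_channel function is an illustration, not part of conda); the second part asks conda for the real list if conda is available.

```shell
# Illustration only: mimic how "--add" puts each new channel at the top.
channels=""
add_channel() {
    channels="$1${channels:+ $channels}"
}

add_channel bioconda
add_channel conda-forge
echo "resulting priority order: $channels"

# Show the channel list conda actually uses, if conda is installed.
if command -v conda >/dev/null 2>&1; then
    conda config --show channels
fi
```

With the two commands as given, conda-forge therefore ends up above bioconda.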
If you completed all of the steps above, then you should be ready to make full use of the conda package manager.