Working with the command line
Introduction
We trust that all of you have worked with computers before in some way. The programs you are already familiar with provide a graphical user interface (GUI) that makes them fairly easy to use. Probably one of the most widely known examples is the word processor Microsoft Word. As one part of our course, however, we will introduce you to working with the shell. For an average user, this is the most direct way of interacting with the computer. Using the shell will probably feel a bit cryptic at the beginning. However, once you get used to it, you will appreciate the 'power of the shell'. But let's get started.
The command line prompt generally looks something like this:
username@computername:~$
Code documentation
Sometimes we will provide you with short pieces of code that you can copy and paste into your terminal. Commands like that will appear in grey boxes like this:
# this is only a comment
$ mkdir <dirname>
- Lines starting with a hash sign # are comments and are not part of the code.
- The $ sign represents the prompt in your shell (see the example above). Please do not copy it when copy-pasting commands from the DokuWiki to the command line.
- Words in-between <these signs> are placeholders and must be replaced with an actual value when typing the command.
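As a small illustration of these conventions, the placeholder command from the grey box might be typed as shown below. The directory name my_first_dir is only a made-up example, not a name used elsewhere in the course.

```shell
# Replace the placeholder <dirname> with a real name and leave out the leading $.
# my_first_dir is a hypothetical name chosen for this example.
new_dir="my_first_dir"
mkdir -p "$new_dir"   # -p avoids an error if the directory already exists
ls -d "$new_dir"      # prints the directory name if it exists
```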
Prerequisites
To complete this set of exercises, you should be familiar with
- how to open a shell on your computer (need info?)
Task List
Once you know how to open a BASH shell on the system you are using, it is time to learn how to use it. We have compiled a selection of resources and exercises for you after which you will be comfortably working with the command line in no time.
1. Command Line Bootcamp
If you are not familiar with the command line, the command line bootcamp is a nice introduction. You can spend some time walking through the tutorial in a shell on your system. If you are within the AppliedBionformaticsFrankfurt network, you also have access to an interactive environment.
Memorize the individual commands. It might be a good idea to write short wiki pages for yourself that outline the individual functions together with their most relevant options. See the following pages as examples:
- Changing directories: cd
- Locating your position in the directory tree: pwd
- Looking into files: less
- Linking files: ln
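The commands listed above could be tried out in a short session like the following. This is only a sketch: the directory and file names are invented for the example, and everything happens in a temporary directory so that nothing in your home directory is changed.

```shell
# Work in a throwaway directory (all names below are hypothetical).
workdir=$(mktemp -d)
cd "$workdir"

mkdir -p project/data            # create a small directory tree
cd project/data                  # cd: change into the new directory
pwd                              # pwd: print your position in the directory tree

printf 'line 1\nline 2\n' > notes.txt
ln -s notes.txt notes_link.txt   # ln -s: create a symbolic link to notes.txt
cat notes_link.txt               # reading the link shows the content of notes.txt

# less notes.txt                 # less opens an interactive pager; quit it with q
```

The less call is commented out because it waits for keyboard input. Try it directly in your terminal.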
Remember that the DokuWiki is a shared resource and that you can work together when creating these notes.
2. Custom exercises
2.1 Anaconda and Jupyter
We have compiled a set of tasks for you that will deepen your knowledge of working with the BASH shell and introduce some principles and data formats that are common in bioinformatics.
These exercises come as Jupyter notebooks, which are a great way of making analyses reproducible and easy to share. If you don't have a working version of Jupyter Notebook on your computer system, you can install it via Anaconda. Please set up Anaconda with the tutorial in our wiki. Now you can install Jupyter Notebook by typing:
conda install -c anaconda jupyter
Go ahead and download our exercises from GitHub via this LINK. The easiest way to start the download is to click on the green “Code” button in the top right corner and select “Download ZIP” (Figure 2). Don't forget to unpack the downloaded ZIP file with a file manager of your choice.
2.2 Exercises
Open a terminal on your system and navigate to the directory you have just downloaded and extracted. Now, you can start a Jupyter notebook by simply typing:
jupyter notebook
This will open a window in your browser in which you can navigate to the `.ipynb` files of each exercise. The notebooks contain a set of instructions and some tasks. They also contain code cells in which you should document the commands that solve each task.
You can also use the code cells to experiment and find your solution, but we encourage you to try out all commands in your local terminal as well.
2.3 The final boss
Professor Ebersberger prepared a heartwarming message, but the tutor decided to corrupt it:
In order to decode the secret message, the following will be needed:
- Files will be downloaded to your “~/Downloads”.
- Use the codex to restore the words from the first column into the second one. (Make a backup. Try a combination of while-read + sed -i "")
- Get rid of the lines with numbers
- Instead of capital letter K, we need the letter e
- Instead of $, we need the letter a
- Sort the lines
- The message should be in the 20th column.
- Read them in one line
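The while-read + sed hint from the list above can be sketched on toy data as follows. The file names, the two-column layout of the codex, and the direction of the replacement (first column replaced by second column) are assumptions made for this illustration. The real files may differ, and words containing special characters such as $ would need escaping before being used in a sed pattern. Note that sed -i "" is the BSD/macOS form, while GNU sed uses sed -i without an argument. The suffix form sed -i.bak used here works with both.

```shell
# Throwaway directory with hypothetical toy files.
workdir=$(mktemp -d)
cd "$workdir"

# Assumed codex layout: first column = corrupted word, second column = replacement.
printf 'xqz hello\nbrr world\n' > codex.txt
printf 'xqz brr\n' > message.txt

# Keep an untouched copy before editing in place.
cp message.txt message.orig.txt

# Read the codex line by line; each line is split into two variables.
while read -r corrupted restored; do
    # Replace every occurrence of the corrupted word in the file itself.
    sed -i.bak "s/${corrupted}/${restored}/g" message.txt
done < codex.txt

cat message.txt
```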
3. Using a computer cluster
In the previous exercises you have learned to write commands and pipelines in the BASH shell. Now we want to look at how we can scale our analyses up to large datasets. For such resource-heavy jobs we have a computer cluster available, which is managed by the SLURM workload manager. Please read through the information about SLURM and then solve the task below.
- Have a look at the FASTQ file stored here:
/share/project/mscmbw2/data/C_1.2/ForwardFile.fq
- Check the size of the file using
ls -lh
- Count the number of header lines in the file and measure how long your command takes with the time command
- Create a SLURM script file that executes the same command and run it on the cluster
- Discuss with the other people in the course when best to use the computer cluster
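One possible way to approach the counting and the SLURM script is sketched below on a tiny, made-up FASTQ file. The file names, job name, and resource values are assumptions for illustration only. The course cluster may require additional options (for example a partition), so please check the SLURM information page. The sketch counts every fourth line instead of lines starting with @, because quality strings can also begin with @.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Toy FASTQ file with two records (four lines per record); read names are invented.
printf '@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\n@III\n' > toy.fq

# Header lines are lines 1, 5, 9, ... so count every fourth line starting at line 1.
header_count=$(awk 'NR % 4 == 1' toy.fq | wc -l | tr -d ' ')
echo "headers: $header_count"

# Write a minimal SLURM batch script (resource values are placeholders).
cat > count_headers.slurm <<'EOF'
#!/bin/bash
#SBATCH --job-name=count_headers
#SBATCH --output=count_headers.out
#SBATCH --cpus-per-task=1
#SBATCH --time=00:10:00

# time reports how long the pipeline takes
time awk 'NR % 4 == 1' toy.fq | wc -l
EOF

# On the cluster you would submit the script with: sbatch count_headers.slurm
# The #SBATCH lines are plain comments for bash, so the script can be tested locally first:
bash count_headers.slurm
```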
4. Environments
.bashrc
The .bashrc file is loaded and executed every time a new interactive BASH session starts, for example when you open a terminal (on most systems this also happens when you log in). It contains a series of configurations for the terminal session, such as settings for completion, shell history, command aliases, paths to computer programs, and more. The .bashrc is a hidden file and will not be listed with a normal ls command. You can make it visible in your home directory with the following commands:
cd ~
ls -a
Alias
If you often use the same long command you can simplify your life by adding an alias to the end of .bashrc.
alias alias_name="command_to_run"
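As an illustration of the syntax above, here is a minimal sketch that defines an alias and then asks the shell to display it. The alias name ll and the command behind it are only examples. To keep an alias permanently you would put the alias line at the end of your .bashrc.

```shell
# Hypothetical example: ll as a shortcut for a long listing with human-readable sizes.
alias ll="ls -lh"

# Show how the shell stored the alias.
alias ll

# Remove the alias again if you no longer want it:
# unalias ll
```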
Exercise:
Create an alias called … that navigates you two folders up in the folder tree.
$PATH
By using commands like ls or less you are telling the shell to run an executable file. (A few commands, such as cd, are built into the shell itself.) These files are usually stored in different folders on your computer system, which is why the variable $PATH exists. When you type a command, the shell searches all locations saved in $PATH for an executable file with the matching name. You can learn here how to add a new path to $PATH to simplify your life.
If you want to have a look at which paths are already stored in $PATH you can use the following command:
echo $PATH
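The output of echo $PATH is a single long line with entries separated by colons. The following sketch prints one entry per line and shows how to ask the shell where it finds a command. The variable name ls_location is just chosen for this example.

```shell
# Print each directory in $PATH on its own line (entries are separated by colons).
echo "$PATH" | tr ':' '\n'

# Ask the shell which executable it would run for a given command name.
ls_location=$(command -v ls)
echo "ls is found at: $ls_location"

# cd is built into the shell itself, so no file is looked up for it.
type cd
```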
A new path can be added through the following command:
export PATH="<new path>:$PATH"
The export command makes the modified variable $PATH available to the child processes of the shell. But this change is only temporary. If you want to make the change permanent, you have to add the same command at the end of .bashrc. After saving, you have to reload .bashrc:
source ~/.bashrc
Exercise:
Sometimes you will have to install programs without using a package manager like Anaconda. You can always start installed packages by using the absolute path to their main script, but adding them to your $PATH will save you a lot of typing. Follow the next steps to add an example script to your $PATH.
- Create a new folder that you will add to $PATH later on
mkdir scripts
- Open a new file named today in that folder and add this example code
# will use the date command to print out some information
echo This is a `date +"%A %d in %B of %Y (%r)"`
- Add the path of the folder containing the newly created file to $PATH (replace /home/hannah with your own home directory)
export PATH="/home/hannah/scripts:$PATH"
- Check if your path was added correctly
today
This is a Tuesday 12 in October of 2021 (04:38:31 PM)
If you get a permission denied error, you have to make the file executable:
chmod +x today
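The steps of this exercise can be put together in one short sketch. It uses a temporary directory as a stand-in for your home directory, so the paths are assumptions and differ from the /home/hannah example. The first line of the script (the shebang) is an addition that is not part of the original example. It tells the system to run the file with bash.

```shell
# Throwaway directory used instead of your real home directory.
workdir=$(mktemp -d)
mkdir -p "$workdir/scripts"

# Write the example script into a file named today.
cat > "$workdir/scripts/today" <<'EOF'
#!/usr/bin/env bash
# will use the date command to print out some information
echo "This is a $(date +"%A %d in %B of %Y (%r)")"
EOF

chmod +x "$workdir/scripts/today"      # make the file executable
export PATH="$workdir/scripts:$PATH"   # prepend the folder (not the file) to $PATH

today                                  # the shell now finds the script by name
```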
Enjoy your new power
Additional resources
- Python for biologists for people interested in learning Python
