====== A non-comprehensive list of bash commands ======

I have listed a number of common bash commands that come in handy when working in a shell.\\
You can try the [[http://applbio.biologie.uni-frankfurt.de/mscmbw/lib/exe/fetch.php?media=wiki:mbw:shell_basic_commands.txt|following basic commands]] on the [[http://applbio.biologie.uni-frankfurt.de/mscmbw/lib/exe/fetch.php?media=wiki:mbw:ejemplo.txt|example file]] to understand what they do. Feel free to complement your notes on the {{ :general:computerenvironment:bash_spellbook.xlsx |compilation of basic commands}}.

===== Slightly more advanced bash =====

=== AWK ===

awk is a command that is also considered a programming language in its own right. It is particularly useful when you need to process the elements of a table. The basic syntax is as follows:

  awk -F "\t" '{print $2 "\t" $1}' file.txt

  * -F sets the field delimiter of your table, here a tab (by default, fields are split on any run of spaces or tabs).
  * '{print}' selects the columns you want in the output, written as "$" followed by the column number.
  * Text added to the output must be written inside double quotation marks; in this case, the text is just a tab.
  * You can also apply mathematical operations to columns. For instance, '{print $1 + $2}'

  awk -F "\t" '$3 > 10 {print $0}' file.txt

  * You can keep only the rows in which a column meets a certain condition. In this example, column 3 must hold a value greater than 10.
    * Equal ==
    * Not equal !=
    * Greater than >
  * '{print $0}' prints all the columns of the table

----

=== Sort ===

  sort -g -u file.txt

  * -r to sort in descending order
  * -g to sort by numeric value (it also understands scientific notation such as 1e-5)
  * -k1,1 to sort by a specific column, in this example the first column only.
  * -u to keep only one line per unique value
  * -t ' ' to select a non-default field separator (by default, fields are separated by runs of blanks, that is spaces or tabs; in this case I changed it to a single space)

----

=== Translate ===

  tr 'ATGC' 'TACG' | rev

  * This is a trick if you are working on the complementary strand.
  * It will convert each "A" into "T" and so on.
  * rev will make everything read backwards.
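
As a quick check of the reverse-complement trick, here is a minimal sketch that wraps the two commands in a small function. The name revcomp and the test sequence are made up for illustration, and only uppercase A, T, G and C are handled.

```bash
# revcomp is a hypothetical helper name, not a standard command.
# It reads DNA sequences from standard input, one per line.
revcomp() {
    # tr swaps each base for its complement, rev reverses each line.
    tr 'ATGC' 'TACG' | rev
}

# Made-up example sequence: the complement of ATGGCC is TACCGG,
# and read backwards this gives GGCCAT.
printf 'ATGGCC\n' | revcomp
```

Any character that is not listed in the first set of tr (for example N) is passed through unchanged, so it is only reversed.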
----

=== sed (slightly more advanced) ===

  sed -n -e '/AAA/,/BBB/ p' file.txt

  * This will find a line matching AAA and print every line from there up to and including the next line matching BBB. Pro tip: use this one to extract a sequence from a multi-line fasta.
  * Note that using shell variables inside a sed command requires double quotation marks " instead of single '.

===== Working with lists and tables =====

Play with the following sample files. {{ :general:computerenvironment:comm_join_example.tar.gz |}}

=== comm ===

To compare the contents of two files (in this case, the identifiers in the first column of each file):

  comm <() <()

  * Within each "<()" we place the command that produces the input to compare.
  * Both inputs must be sorted.
  * I use the second part of the command (the sed and the cut) to pad the output so that every line has the correct number of columns.

  comm <(cut -f1 1_table.txt | sort) <(cut -f1 2_table.txt | sort) | sed -e 's/$/\t\t/' | cut -f1,2,3

  * Output:
    * First column: identifiers exclusive to the table in input 1
    * Second column: identifiers exclusive to the table in input 2
    * Third column: identifiers present in both tables

----

=== join ===

Join two tables based on a column in common.

  join -t $'\t' <() <()

  * The input within each "<()" must be sorted on the join column.
  * I highly recommend joining only on the first column, the one with the identifiers.
  * If one of the tables has repeated identifiers, the output will contain all possible combinations.
  * By default, the output contains only the lines whose identifier is present in both tables. We can add option -a1 or -a2 to also include the unmatched entries of one of the tables, with no joined values from the other. Do not use both.

  join -t $'\t' -a1 <(sort -k1,1 1_table.txt) <(sort -k1,1 2_table.txt)
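
The comm and join steps above can be tried end to end on two tiny tables. This is only a sketch under stated assumptions: the gene names and values are made up, the sorted copies are written to temporary files instead of being passed with "<()" so that each intermediate file can be inspected, and LC_ALL=C is set so that sort, comm and join agree on the ordering.

```bash
# Work in a throwaway directory so that no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two made-up, tab-separated tables (identifier, value).
printf 'geneB\t5\ngeneA\t12\ngeneC\t7\n' > 1_table.txt
printf 'geneC\tkinase\ngeneD\tligase\ngeneA\thelicase\n' > 2_table.txt

# Use the same byte-wise ordering in sort, comm and join.
export LC_ALL=C
tab=$(printf '\t')

# Sort both tables on the first column, the identifier.
sort -t "$tab" -k1,1 1_table.txt > 1_sorted.txt
sort -t "$tab" -k1,1 2_table.txt > 2_sorted.txt

# Keep only the identifiers for comm.
cut -f1 1_sorted.txt > 1_ids.txt
cut -f1 2_sorted.txt > 2_ids.txt

# -12 hides columns 1 and 2, leaving identifiers present in both tables.
shared=$(comm -12 1_ids.txt 2_ids.txt)
# -23 hides columns 2 and 3, leaving identifiers found only in table 1.
only_first=$(comm -23 1_ids.txt 2_ids.txt)

# Default join: only identifiers present in both tables.
joined=$(join -t "$tab" 1_sorted.txt 2_sorted.txt)
# -a1 also keeps the unmatched lines of table 1.
left_joined=$(join -t "$tab" -a1 1_sorted.txt 2_sorted.txt)

printf 'shared:\n%s\n' "$shared"
printf 'only in table 1:\n%s\n' "$only_first"
printf 'join:\n%s\n' "$joined"
printf 'join -a1:\n%s\n' "$left_joined"
```

In bash, each sorted temporary file can be replaced by the matching "<(sort ...)" expression, which gives the one-line form used above.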