Differences

This shows you the differences between two versions of the page.

--- general:computerenvironment:bash [2019/04/24 12:36] – [A non-comprehensive list of bash commands] ruben
+++ general:computerenvironment:bash [2023/04/11 13:17] (current) – felix
@@ Line 4: / Line 4: @@
 You can try the [[http://applbio.biologie.uni-frankfurt.de/mscmbw/lib/exe/fetch.php?media=wiki:mbw:shell_basic_commands.txt|following basic commands]] on the [[http://applbio.biologie.uni-frankfurt.de/mscmbw/lib/exe/fetch.php?media=wiki:mbw:ejemplo.txt|example file]] to understand what they are doing.
-Feel free to complement your notes on the [[media=https://applbio.biologie.uni-frankfurt.de/teaching/wiki/lib/exe/fetch.php?media=general:computerenvironment:bash_spellbook.xlsx|compilation of basic commands]].
+Feel free to complement your notes on the {{ :general:computerenvironment:bash_spellbook.xlsx |compilation of basic commands}}.
 ===== Slightly more advanced bash =====
@@ Line 11: / Line 11: @@
 This command is also considered a programming language on its own. It is particularly useful when you need to process the elements of a table. The basic syntax is as follows:
-<font 14px/Courier New,Courier,monospace;;inherit;;inherit>awk</font><font 14px/Courier New,Courier,monospace;;#27ae60;;inherit>**-**</font>  <font inherit/Courier New,Courier,monospace;;#27ae60;;inherit>**F**</font><font inherit/Courier New,Courier,monospace;;#27ae60;;inherit>** "\t"**</font> '{print **<font inherit/inherit;;#2980b9;;inherit>$2</font>**  **<font inherit/inherit;;#8e44ad;;inherit>"\t"</font>**  $1}'
+<code>awk -F "\t" '{print $2 "\t" $1}' file.txt</code>
-  * use after reading the contents of your table, i.e. after cat file.txt
+  * -F sets the field delimiter of your table, a tab in this case (by default awk splits on runs of spaces and tabs).
-  * indicate the **<font inherit/inherit;;#27ae60;;inherit>delimiter</font>** **<font inherit/inherit;;#27ae60;;inherit>of your table</font>**  to tabs in this case (default is space)
+  * '{print ...}' prints the columns you select; "$" followed by a number picks the column, so $2 is the second column.
-  * **<font inherit/inherit;;#2980b9;;inherit>$ indicates the column</font>** **<font inherit/inherit;;#2980b9;;inherit>number</font>**you want to return
+  * Text added to the output must be written inside quotation marks; in this case, the text is just the addition of a new tab.
-  * **<font inherit/inherit;;#8e44ad;;inherit>Text added to your</font>** **<font inherit/inherit;;#8e44ad;;inherit>table</font>**must be written inside quotation marks; in this case, the text is just the addition of a new tab.
+  * You can also process columns by mathematical operations. For instance, '{print $1 + $2}'
-  * You can also process columns by mathematical operations. For instance, print $1+ $2
-<font 16px/Courier New,Courier,monospace;;inherit;;inherit>awk -F "\t"</font><font 16px/Courier New,Courier,monospace;;#e74c3c;;inherit>'**$3=="+"**</font>  {print **<font inherit/inherit;;#f39c12;;inherit>$0</font>**}'
-  * You can indicate you want only the rows in which a **<font inherit/inherit;;#e74c3c;;inherit>column meets certain condition.</font>**In this case the column 3 has to be exactly a plus sign.
+<code>awk -F "\t" '$3 > 10 {print $0}'</code>
+  * You can keep only the rows in which a column meets a certain condition. In this example, column 3 must hold a value greater than 10.
       * Equal ==
       * Not equal !=
       * Greater than >
-  * In this example $0 will **<font inherit/inherit;;#f39c12;;inherit>print</font>** **<font inherit/inherit;;#f39c12;;inherit>all columns of the table.</font>**
+  * '{print $0}' will print all the columns of the table
 ----
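The awk patterns above can be tried on a small table. This is only a sketch: the three-column table (name, count, strand) is made up for illustration and is not one of the course files, so the filter is applied to column 2 here rather than column 3.

```shell
# Hypothetical tab-separated table: name, count, strand (illustration only).
tmp=$(mktemp)
printf 'geneA\t5\t+\ngeneB\t12\t-\ngeneC\t20\t+\n' > "$tmp"

# Swap the first two columns, writing a tab between them.
swapped=$(awk -F "\t" '{print $2 "\t" $1}' "$tmp")

# Keep only the rows whose second column is greater than 10.
filtered=$(awk -F "\t" '$2 > 10 {print $0}' "$tmp")

# Keep the rows whose third column is exactly a plus sign; print the name only.
plus=$(awk -F "\t" '$3 == "+" {print $1}' "$tmp")

printf '%s\n' "$swapped" "---" "$filtered" "---" "$plus"
rm -f "$tmp"
```

Storing each result in a variable is just a convenience for printing them together; in daily use you would run each awk command directly on your file.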
@@ Line 31: / Line 31: @@
 === Sort ===
-<font inherit/Courier New,Courier,monospace;;inherit;;inherit>sort -r -g -k2</font>
+<code>sort -g -u file.txt</code>
-  * -r if you want descending order
+  * -r to sort in descending order
-  * -g if you want to sort by number
+  * -g to sort by number
-  * -k sort by a specific column
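+  * -k1,1 to sort by a specific column; in this example, the first column only.
+  * -u to keep one line per distinct value. Combined with -k1,1, uniqueness is judged on the first column only, but the rest of the row is kept.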
-----
+  * -t ' ' to set the field separator, here a single space (by default, sort starts a new field at each transition from non-blank to blank characters, so both spaces and tabs separate fields).
-=== Sort more ===
-<font inherit/Courier New,Courier,monospace;;inherit;;inherit>sort -u -t ' ' -k1,1</font>
-  * -u for unique
-  * -t to select a delimiter as space (default is tab)
-  * -k1,1 to apply the unique only for the value of the first column, but still keeping the rest of the row.
 ----
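A short sketch of how the sort options combine. The space-separated file is invented for illustration; only the options listed above are used.

```shell
# Hypothetical space-separated file: a label and a number (illustration only).
tmp=$(mktemp)
printf 'b 10\na 2\nb 3\na 33\n' > "$tmp"

# Sort by the second column as numbers, in descending order.
by_number=$(sort -t ' ' -k2,2 -g -r "$tmp")

# Keep a single row per distinct value of the first column.
unique_first=$(sort -t ' ' -k1,1 -u "$tmp")

printf '%s\n' "$by_number" "---" "$unique_first"
rm -f "$tmp"
```

Which of the rows sharing the same first column survives -u depends on the sort implementation, so do not rely on it if the other columns matter.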
@@ Line 51: / Line 43: @@
 === Translate ===
-<font inherit/Courier New,Courier,monospace;;inherit;;inherit>tr '[ATGC]' '[TACG]' | rev</font>
+<code>tr '[ATGC]' '[TACG]' | rev</code>
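   * This is a trick to get the reverse complement of a DNA sequence: tr swaps each base for its complement and rev reverses the line, giving the complementary strand read in the opposite direction.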
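A minimal sketch of the translate trick on a made-up six-base sequence. The square brackets are left out here because tr does not need them; the command otherwise matches the one above.

```shell
# Complement each base with tr, then reverse the line with rev.
revcomp=$(printf 'ATGCCG\n' | tr 'ATGC' 'TACG' | rev)
echo "$revcomp"   # prints CGGCAT
```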
@@ Line 61: / Line 53: @@
 === sed (slightly more advanced) ===
-<font inherit/Courier New,Courier,monospace;;inherit;;inherit>sed -n -e '/AAA/,/BBB/ p'</font>
+<code>sed -n -e '/AAA/,/BBB/ p' file.txt</code>
+  * This prints every line from the first one matching AAA through the next one matching BBB, both included. Pro tip: use this one to extract a sequence in a multi-line fasta.
+  * Note that using shell variables inside a sed command requires double quotation marks " instead of single ', because the shell does not expand variables inside single quotes.
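As a sketch of both points, the range below is built from shell variables, so the sed script is written in double quotes. The fasta content and the variable names start and stop are hypothetical.

```shell
# Hypothetical multi-line fasta with three records (illustration only).
tmp=$(mktemp)
printf '>seq1\nAAAA\nCCCC\n>seq2\nGGGG\nTTTT\n>seq3\nACGT\n' > "$tmp"

# Headers that delimit the record we want.
start="seq2"
stop="seq3"

# Double quotes let the shell expand the variables; \$ passes a literal $
# to sed, where it anchors the pattern to the end of the line.
record=$(sed -n -e "/^>${start}\$/,/^>${stop}\$/ p" "$tmp")

# The range includes both matching lines, so drop the first and last
# line to keep only the sequence itself.
sequence=$(printf '%s\n' "$record" | sed -e '1d' -e '$d')

printf '%s\n' "$sequence"
rm -f "$tmp"
```

Because the closing header is part of the range, the last record of a file has no BBB to stop at; in that case sed simply prints until the end of the file.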
+===== Working with lists and tables =====
+Play with the following sample files: {{ :general:computerenvironment:comm_join_example.tar.gz |}}
+=== comm ===
+To compare contents of both files (in this case, the identifiers of the first column of the two files):
+<code>comm <() <()</code>
+  * Within each "<()" we place the command that produces one of the two inputs to compare.
+  * Both inputs must be sorted.
+  * In the full command, the trailing sed and cut pad every line so that the output always has exactly three columns.
+<code>comm <(cut -f1 1_table.txt | sort) <(cut -f1 2_table.txt | sort) | sed -e 's/$/\t\t/' | cut -f1,2,3</code>
+  * Output:
+  * First column: identifiers found only in the table of input 1
+  * Second column: identifiers found only in the table of input 2
+  * Third column: identifiers present in both tables
+----
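A sketch of comm on two tiny tables. The contents are invented and only reuse the file names from the command above; the real sample files may look different. The options -23, -13 and -12 are not mentioned on this page: they are standard comm options that suppress the columns you do not want, which is an alternative to reading the three-column output.

```shell
# Hypothetical tab-separated tables sharing some identifiers (illustration only).
dir=$(mktemp -d)
printf 'id1\tx\nid2\ty\nid3\tz\n' > "$dir/1_table.txt"
printf 'id2\tp\nid3\tq\nid4\tr\n' > "$dir/2_table.txt"

# Identifiers found only in the first table (columns 2 and 3 suppressed).
only_first=$(comm -23 <(cut -f1 "$dir/1_table.txt" | sort) <(cut -f1 "$dir/2_table.txt" | sort))

# Identifiers found only in the second table (columns 1 and 3 suppressed).
only_second=$(comm -13 <(cut -f1 "$dir/1_table.txt" | sort) <(cut -f1 "$dir/2_table.txt" | sort))

# Identifiers present in both tables (columns 1 and 2 suppressed).
shared=$(comm -12 <(cut -f1 "$dir/1_table.txt" | sort) <(cut -f1 "$dir/2_table.txt" | sort))

printf '%s\n' "$only_first" "---" "$only_second" "---" "$shared"
rm -rf "$dir"
```

The "<()" syntax is a bash feature, so run this with bash rather than plain sh.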
+=== join ===
+Join two tables based on a column in common.
+<code>join -t $'\t' <() <()</code>
+  * The input within each "<()" must be sorted on the column used to join.
+  * I highly recommend joining on the first column, the one with the identifiers.
+  * If one of the tables has repeated identifiers, the output contains all possible combinations.
+  * By default, the output shows only the lines whose identifier appears in both tables. We can add option -a1 or -a2 to also include the entries of one of the tables that have no match in the other. Do not use both.
+<code>join -t $'\t' -a1 <(sort -k1,1 1_table.txt) <(sort -k1,1 2_table.txt)</code>
-  * <font inherit/Arial,Helvetica,sans-serif;;inherit;;inherit>This will find AAA, and keep all the lines in</font><font inherit/Arial,Helvetica,sans-serif;;inherit;;inherit>a</font><font inherit/Arial,Helvetica,sans-serif;;inherit;;inherit>file</font><font inherit/Arial,Helvetica,sans-serif;;inherit;;inherit>until</font><font inherit/Arial,Helvetica,sans-serif;;inherit;;inherit>it</font><font inherit/Arial,Helvetica,sans-serif;;inherit;;inherit>reaches</font>BBB.
-  * <font inherit/Arial,Helvetica,sans-serif;;inherit;;inherit>Pro tip: use this one to extract a sequence in a multi-line fasta.</font>*
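The join options can be sketched on two small tables. The contents are made up and only reuse the file names from the join command; the real sample files may differ.

```shell
# Hypothetical unsorted, tab-separated tables (illustration only).
dir=$(mktemp -d)
printf 'id2\ty\nid1\tx\nid3\tz\n' > "$dir/1_table.txt"
printf 'id3\tq\nid2\tp\nid4\tr\n' > "$dir/2_table.txt"

# Default behaviour: only identifiers present in both tables are reported.
inner=$(join -t $'\t' <(sort -t $'\t' -k1,1 "$dir/1_table.txt") <(sort -t $'\t' -k1,1 "$dir/2_table.txt"))

# With -a1, rows of the first table without a partner are kept as well.
left=$(join -t $'\t' -a1 <(sort -t $'\t' -k1,1 "$dir/1_table.txt") <(sort -t $'\t' -k1,1 "$dir/2_table.txt"))

printf '%s\n' "$inner" "---" "$left"
rm -rf "$dir"
```

Rows kept by -a1 have fewer columns than the joined rows, so check the column count before passing the result to awk.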
