Differences
This shows you the differences between two versions of the page.
| general:legacy_backup [2023/10/16 11:21] – created felix | general:legacy_backup [2023/10/16 11:22] (current) – felix | ||
|---|---|---|---|
| Line 116: | Line 116: | ||
| The taxon set we will use for this analysis consists of //L. hispanica//, | The taxon set we will use for this analysis consists of //L. hispanica//, | ||
| - | ==== Install fDOG and PhyloProfile ==== | ||
| - | fDOG has been installed in the environment **/ | ||
| - | |||
| - | ==== Perform fDOG search ==== | ||
| - | - Collecting data | ||
| - | - Seed proteins: Identify the 4 different types of **candidate //L. hispanica// proteins** and save __each sequence in a separate FASTA file__, with __each group in its own directory__. They will be the seed genes/ | ||
| - | # please find the solution yourself ;) You will need at least these commands: for, grep, cut and less / cat | ||
| - | </ | ||
| - | - Taxon data (you can check [[https:// | ||
| - | - Create your own folder for storing the data for fDOG < | ||
| - | - Add //**U. muehlenbergii**// | ||
| - | fdog.addTaxon -f / | ||
| - | </ | ||
| - | - Check in your fDOG data folder (can be found at the end of '' | ||
| - | mkdir / | ||
| - | ln -s / | ||
| - | </ | ||
| - | - Do the same for your own //**L. hispanica**// | ||
| - | fdog.addTaxon -f / | ||
| - | ln -s / | ||
| - | fdog.addTaxon -f / | ||
| - | ln -s / | ||
| - | </ | ||
| - | - We also need the data for the 78 QfO taxa < | ||
| - | cd / | ||
| - | ln -s / | ||
| - | ln -s / | ||
| - | ln -s / | ||
| - | </ | ||
| - | - Now you can run fDOG (using Slurm with 8 CPUs and 8 GB of memory). Please make sure that the input folder for '' | ||
| - | # add one command like this for each group of candidate genes to your Slurm script (or you can make 4 Slurm scripts, one for each seed directory) | ||
| - | fdogs.run --input / | ||
| - | </ | ||
| - | - The output for fDOG will be '' | ||
| - | - Upload *.phyloprofile and *_forward.domains into PhyloProfile | ||
| - | - Select //Lasallia hispanica// as the reference taxon and plot the profiles | ||
| - | - Apply clustering to bring similar profiles together | ||
| - | - You can use the function //" | ||
| - | - In the //" | ||
| - | - Also from the //" | ||
| - | |||
| - | <hidden OPTIONAL but highly recommended :-P> | ||
| - | //**This is an example of analysing one specific gene of interest**// | ||
| - | - Searching for DHFR (Dihydrofolate reductase) in //L. pustulata// and //L. hispanica// | ||
| - | - First, we need a DHFR protein as the seed sequence. Search the UniProt database for the human DHFR protein, note its UniProt ID, and check that this ID is present in the human gene set of your fDOG data. The human gene set of fDOG can be found at ''/ | ||
| - | - Make sure that you have already added //L. pustulata// and //L. hispanica// to the fDOG data set (check for LASHI@580046@ecoevo22 and LASPU@136370@ecoevo22 in / | ||
| - | - Now we can run fDOG with the DHFR protein. **// | ||
| - | fdog.run --seqFile / | ||
| - | </ | ||
| - | - Finally, check the phylogenetic profile of the DHFR protein to see whether an ortholog was found in //L. hispanica// and in //L. pustulata// | ||
| - | </ | ||
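The removed text asks you to check that LASHI@580046@ecoevo22 and LASPU@136370@ecoevo22 are present in the fDOG data folder before running the search. A minimal shell sketch of such a check is shown here. The function name `check_taxa` is hypothetical and not part of fDOG, and the sketch assumes that each taxon appears as a directory (or symlink) named after its ID directly inside the folder you pass in; adapt it to the layout of your own data folder.

```shell
#!/usr/bin/env bash
# Sketch only: check_taxa is a hypothetical helper, not an fDOG command.
# Usage: check_taxa DATA_SUBDIR TAXON [TAXON ...]
# DATA_SUBDIR is whichever folder of your fDOG data holds one entry per taxon
# (assumption: entries are named after the taxon ID).
check_taxa() {
    local data_dir=$1
    shift
    local missing=0
    local taxon
    for taxon in "$@"; do
        # -e follows symlinks, so a taxon added with `ln -s` counts as present
        # only if the link target actually exists.
        if [ -e "$data_dir/$taxon" ]; then
            echo "found: $taxon"
        else
            echo "missing: $taxon"
            missing=$((missing + 1))
        fi
    done
    # Return a non-zero status if at least one taxon is missing.
    [ "$missing" -eq 0 ]
}

# Example call (replace the first argument with your own folder):
# check_taxa /path/to/your/fdog_data_subfolder LASHI@580046@ecoevo22 LASPU@136370@ecoevo22
```

Because the function returns a non-zero status when something is missing, it can be placed at the top of a Slurm script to stop the job before fDOG starts with an incomplete taxon set.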