Differences

This shows you the differences between two versions of the page.

general:bioseqanalysis:genesetanalysis:fcat [2025/04/08 15:55] – [Preparing the fCAT run] ingo
general:bioseqanalysis:genesetanalysis:fcat [2025/04/08 16:34] (current) – [fCAT analysis - Output visualization and interpretation] ingo
Line 63: Line 63:
     - Annotate the protein domains in the gene set of interest:<WRAP>
 <code>
-fdog.addTaxon -f Crypto_Metaeuk.fas -n Cryho_metaeuk -i 237895 -o $HOME/Analysis/fCAT --annopath $HOME/Analysis/fcat/annotation_dir/ --replace
+fdog.addTaxon -f Crypto_Metaeuk.fas -n Crypa_metaeuk -i 5807 -o $HOME/Analysis/fCAT --annopath $HOME/Analysis/fcat/annotation_dir/ --replace
 </code>The second command will annotate protein features, such as PFAM and SMART domains, low complexity regions, transmembrane domains, etc. If you do not want to run the annotation, which will take a couple of minutes using 8 cores, copy it from ''/home/ubuntu/Share/ProteinSets/fcat/CRYPA_METAEUK2@5807@240209.json'' into $HOME/Analysis/fcat/annotation_dir/<WRAP>
 <hidden CommonErrors>
Line 101: Line 101:
  
 ==== fCAT analysis - Output visualization and interpretation ====
-fCAT in combination with [[https://bioconductor.org/packages/release/bioc/html/PhyloProfile.html|PhyloProfile]] allows to visualize and explore the results of the geneset completeness analysis. Follow the steps below to :!:  {{ :physaliacg:2024:data:eukaryota.tar.gz |download the data}} to your local computer and :!: to open it in PhyloProfile.
+fCAT in combination with [[https://bioconductor.org/packages/release/bioc/html/PhyloProfile.html|PhyloProfile]] allows to visualize and explore the results of the geneset completeness analysis. Follow the steps below to :!:  {{ :physaliacg:2025:data:CRYPA_Metaeuk-fcat.tar.gz|download the data}} to your local computer and :!: to open it in PhyloProfile.
 <hidden PrecomputedFiles>
 You will find all pre-computed fCAT results at ''/home/ubuntu/Share/Analysis/fCAT/fcatOutput/eukaryota''. Use these if your analysis did not complete in time.
 </hidden>
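 If your own run did not finish, the pre-computed results can be copied into your analysis folder. The following is only a minimal shell sketch and not part of the course material: the function name ''use_precomputed'' and the destination folder are assumptions, only the source path is taken from the text above.

```shell
# Sketch: copy pre-computed fCAT results into your own output folder.
# use_precomputed is a hypothetical helper; both paths are passed as arguments.
use_precomputed() {
    src="$1"    # folder holding the pre-computed results
    dest="$2"   # your own fCAT output folder (assumed location)

    # Never overwrite: do nothing if the destination already contains files.
    if [ -d "$dest" ] && [ -n "$(ls -A "$dest" 2>/dev/null)" ]; then
        echo "Results already present in $dest, nothing copied"
        return 0
    fi

    # Fail clearly if the pre-computed folder is not available.
    if [ ! -d "$src" ]; then
        echo "Pre-computed results not found at $src" >&2
        return 1
    fi

    # Copy the folder contents, including subfolders, into the destination.
    mkdir -p "$dest" && cp -r "$src"/. "$dest"/
}
```

 A call could look like ''use_precomputed /home/ubuntu/Share/Analysis/fCAT/fcatOutput/eukaryota "$HOME/Analyses/fcat/fcatOutput/eukaryota"''; adjust the second argument to wherever your fCAT output actually lives. An existing, non-empty destination is left untouched so that your own results are not replaced.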
 === Downloading the data ===
-Download the following three files from the fcat output folder, e.g. ''$HOME/Analyses/fcat/fcatOutput/eukaryota/CRYHO@237895@220307/phyloprofileOutput'' for the //eukaryota// dataset.
+Download the following three files from the fcat output folder, e.g. ''$HOME/Analyses/fcat/fcatOutput/eukaryota/CRYPA@5807@250408/phyloprofileOutput'' for the //eukaryota// dataset.
   - *.phyloprofile :!: These files contain the information about the presence/absence of orthologs to the genes in your coreset together with the domain architecture similarity scores. You will find the information for both your taxon of interest **and** the core taxa. **It is the main input file for PhyloProfile**. :!: Choose the one that represents the fCAT scoring mode you are interested in.
   - *.mod.fa :!: This file contains the sequences of the orthologs in FASTA format
Line 121: Line 121:
   - upload the *domains file into the field at the lower left
   - specify the origin of group IDs you are using
-    - Dataset //alveolata//: select **OMA** 
     - Dataset //eukaryota//: select **OrthoDB**
   - plot the results by clicking on ''Plot''
Line 136: Line 135:
   - redo the selection, this time selecting all genes from the //eukaryota// dataset that are present in all core species but are absent in your //C. parvum// gene set((This requires some experimenting to find the correct clade in the tree, unfortunately))
   - if you do not find a single clade comprising all the genes that are missing in //C. parvum// do the following:
-    - Look for the file ''missing.txt'' in your fCat output folder
+    - Look for the file ''{{ :physaliacg:2025:data:crypa_metaeuk-fcat_missing.txt.gz |missing.txt}}'' in your fCat output folder
     - go to the tab ''Customised profile''
     - find the button to upload a gene list for selecting a gene set of interest<WRAP>