InterProScan
General information
To run InterProScan locally, you need to download the InterPro databases and modify the program's configuration file. Because the databases are large, installing them separately for every user would create a lot of redundant data. For this reason, we prepared a ready-to-use installation of InterProScan in an Anaconda environment. Environments can be shared, so you can activate it with:
conda activate /home/freya/miniconda3/envs/interproscan_conda
Because InterProScan uses a lot of memory, you should send this job to our cluster. Note that InterProScan cannot create folders for the output. You therefore need to use an existing directory for the output, and its name must not contain special characters such as dots.
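The directory requirement above can be checked before a job is submitted. The following is a minimal sketch, not part of InterProScan: `prepare_outdir` is a hypothetical helper, and the rule it enforces (only letters, digits, `_` and `-` in the directory name) is an assumption that is stricter than "no dots".

```shell
# Hypothetical helper: create the output directory up front and refuse
# names containing anything other than letters, digits, '_' or '-'.
prepare_outdir() {
    outdir="$1"
    name=$(basename "$outdir")
    case "$name" in
        ""|*[!A-Za-z0-9_-]*)
            echo "Unsafe output directory name: $name" >&2
            return 1
            ;;
    esac
    # -p: no error if the directory already exists
    mkdir -p "$outdir"
}

# Example (the directory name is a placeholder):
# prepare_outdir ./interproscan_results
```

Calling the helper at the top of a job script means that an unsuitable directory name stops the job immediately, before InterProScan starts.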
Running InterProScan
interproscan.sh -cpu 8 -i <YOURINPUTFILE> -b ./interproscan -goterms -appl TIGRFAM,PANTHER,Coils,SMART,Pfam,MobiDBLite,CDD
Options:
- -cpu - specify the number of CPU cores InterProScan may use
- -i - specify the input file containing the sequences in FASTA format
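- -b - specify the base path and name of the output files; the extension of each output format is appended automatically, and the directory part of the path must already exist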
- -goterms - extract the GO terms from the annotated domains and features and assign them to the query sequence
- -appl - specify the analyses you want to include in your InterProScan run; selecting fewer applications will speed up the search
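Since a run can take a long time, it may be worth checking the file passed to `-i` before submitting. The sketch below only confirms that the file is non-empty and contains FASTA header lines, and prints the number of sequences. `check_fasta` is a hypothetical helper, not an InterProScan feature.

```shell
# Hypothetical pre-flight check for the file passed to -i
check_fasta() {
    fasta="$1"
    # -s: file exists and is not empty
    if [ ! -s "$fasta" ]; then
        echo "Input file missing or empty: $fasta" >&2
        return 1
    fi
    # Header lines in FASTA format start with '>'.
    # grep -c exits non-zero when it counts 0 lines, hence '|| true'.
    n_seqs=$(grep -c '^>' "$fasta" || true)
    if [ "$n_seqs" -eq 0 ]; then
        echo "No FASTA headers found in $fasta" >&2
        return 1
    fi
    # Print the number of sequences
    echo "$n_seqs"
}

# Example (replace the placeholder with your own file):
# check_fasta <YOURINPUTFILE>
```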
For further information about how to run InterProScan, please see the online wiki provided by the developers. Once everything is set up, run the analysis on all sequences. The analysis will take some time, so it is best to run it overnight.
- RunInterpro.sh
#!/bin/bash
#SBATCH --partition=all,inteli7
#SBATCH --cpus-per-task=8
#SBATCH --mem=10GB
#SBATCH --job-name="InterProTest"
interproscan.sh -cpu 8 -i <YOURINPUTFILE> -b ./interproscan -goterms -appl TIGRFAM,PANTHER,Coils,SMART,Pfam,MobiDBLite,CDD
Download and modify this file, then submit it to the cluster with:
sbatch RunInterpro.sh
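Once the job has finished, a quick summary of the results can show whether every requested analysis produced hits. The sketch below assumes that TSV output was written to `./interproscan.tsv` (the name that `-b ./interproscan` would be expected to give) and that the fourth tab-separated column holds the name of the analysis, as in the standard InterProScan TSV layout. `summarize_hits` is a hypothetical helper.

```shell
# Hypothetical helper: count hits per analysis in an InterProScan TSV file.
summarize_hits() {
    # Column 4 is assumed to hold the analysis name (e.g. Pfam, SMART).
    # Output: one "<analysis><TAB><count>" line per analysis, sorted by name.
    awk -F'\t' '{ n[$4]++ } END { for (a in n) printf "%s\t%d\n", a, n[a] }' "$1" | sort
}

# Only summarize if the assumed result file is present
if [ -f ./interproscan.tsv ]; then
    summarize_hits ./interproscan.tsv
fi
```

While the job is still running, `squeue -u $USER` lists your queued and running jobs on the cluster.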