====== InterProScan ======
===== General information =====
To run InterProScan locally, you need to download the InterPro databases and modify the configuration file of the program. As the databases are large files, installing them individually would mean a large amount of data redundancy. For this reason, we prepared a ready-to-use version of InterProScan with Anaconda. Environments can be shared, so you can simply activate the prepared environment.
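A minimal sketch of the activation, assuming the shared environment is registered under the hypothetical name ''interproscan'' (check ''conda env list'' for the actual name on our system):
<code bash>
# activate the shared conda environment (environment name is an assumption)
conda activate interproscan
</code>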
Because InterProScan takes up a lot of memory, you should send this job to our cluster. Note that InterProScan cannot create folders for the output, so you need to use an existing output directory whose name does not contain special characters such as dots.
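For example, the output directory could be created up front; the path below is only an illustration:
<code bash>
# create the output directory before submitting the job;
# avoid dots and other special characters in its name
mkdir -p /path/to/interpro_results
</code>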
InterProScan needs the following tools to be available in the environment:
  - Python 2.7
    - you can check your Python version with the command ''python --version''
  - Java
    - install Java with Anaconda (see the sketch after this list)
  - Hmmer
  - Blast
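A rough sketch of how these prerequisites could be checked or installed in a conda environment; the package names and channels (''openjdk'' from conda-forge, ''hmmer'' and ''blast'' from Bioconda) are assumptions and not necessarily what the shared environment uses:
<code bash>
# check the installed Python version
python --version

# install Java via conda (assumed package name and channel)
conda install -c conda-forge openjdk

# install HMMER and BLAST (assumed Bioconda packages)
conda install -c bioconda hmmer blast

# quick sanity checks that the tools are on the PATH
hmmsearch -h | head -n 3
blastp -version
</code>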
**Running InterProScan**

The program is started with the ''interproscan.sh'' script.
**Options:**
  * **-i** - specify the input file containing the sequences in FASTA format
  * **-b** - specify the name of the output directory
  * **-dp** - disable using the online lookup services - all analyses are run locally
  * **-goterms** - extract the GO terms from the annotated domains and features and assign them to the query sequence
  * **-appl** - specify the analyses you want to include in your InterProScan run; selecting fewer applications will speed up the search (a combined example is shown below)
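Putting these options together, a call could look like the sketch below; the input file name, output prefix, and application list are placeholders only:
<code bash>
interproscan.sh \
    -i my_proteins.fasta \
    -b interpro_results/my_proteins \
    -dp \
    -goterms \
    -appl Pfam,PANTHER
</code>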
For further information about how to run InterProScan, have a look at the official InterProScan documentation.
Once everything is set, run the analysis on all sequences. This analysis will take some time, so it is best to run it overnight.
<file bash RunInterpro.sh>
#!/bin/bash
#SBATCH --partition=all,
#SBATCH --cpus-per-task=8
#SBATCH --mem=10GB
#SBATCH --job-name="
interproscan.sh -cpu 8 -i <
</file>
/* removed from interproscan run due to failure: PROSITEPROFILES
local installation that is no longer usable due to missing global installations of hmmer, java, etc.: /
*/
Just download and modify this file, then send it to the cluster with
<code>
sbatch RunInterpro.sh
</code>
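Several lines of ''RunInterpro.sh'' are site-specific; as a rough sketch, a complete submission script could look like the following, where the partition, job name, environment name, input FASTA, and output prefix are assumptions to be adapted:
<code bash>
#!/bin/bash
# partition and job name are assumptions - adapt them to our cluster
#SBATCH --partition=all
#SBATCH --cpus-per-task=8
#SBATCH --mem=10GB
#SBATCH --job-name=interproscan

# activate the shared environment if it is not loaded automatically (name is an assumption)
# conda activate interproscan

# input file and output prefix are placeholders; the output directory must already exist
interproscan.sh -cpu 8 \
    -i my_proteins.fasta \
    -b interpro_results/my_proteins \
    -dp -goterms
</code>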

/*
===== InterProScan configuration =====
We have tweaked the configuration of InterProScan a bit to improve the performance.
<code>
/
</code>
Specifically,
  * We have set the max number of workers to 2
<code>
## Master/
##

# Set the number of embedded workers to the number of processors that you would like to employ
# on the machine you are using to run InterProScan.
#number of embedded workers
number.of.embedded.workers=1
maxnumber.of.embedded.workers=2
</code>
  * and start hmmsearch and pfamscan with 3 cpus each
<code>
## cpu options for parallel processing
##

#hmmer cpu options for the different jobs
hmmer3.hmmsearch.cpu.switch.gene3d=--cpu 3
hmmer3.hmmsearch.cpu.switch.panther=--cpu 3
hmmer3.hmmsearch.cpu.switch.pfama=--cpu 3
hmmer3.hmmsearch.cpu.switch.pirsf=--cpu 3
hmmer3.hmmsearch.cpu.switch.sfld=--cpu 3
hmmer3.hmmsearch.cpu.switch.superfamily=--cpu 3
hmmer3.hmmsearch.cpu.switch.tigrfam=--cpu 3

hmmer3.hmmsearch.cpu.switch.hmmfilter=--cpu 3

hmmer2.hmmpfam.cpu.switch.smart=--cpu 3


#panther binary cpu options (for blastall and hmmsearch)
panther.binary.cpu.switch=-c 3

#pirsf binary cpu options (for hmmscan)
pirsf.pl.binary.cpu.switch=-cpu 3
</code>
*/