Running the pipeline
Check
First, it is recommended to perform a dry run of the analysis:
$ snakemake -nrp
Run locally
This will check all the rules and the parameters in the config.yaml
file and print all the command lines that would be executed.
If there are no errors, you can then execute the pipeline with:
$ snakemake
If you have multiple CPUs available on your computer, you can choose to use them.
For example, if you want to use up to 8 CPUs in parallel, you can run:
$ snakemake -j 8
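The `-j` option caps the total number of CPU cores used at once; each rule declares how many threads it can use, and Snakemake schedules jobs within that limit. A minimal rule sketch illustrating this (the rule name, file paths, and tool invocation below are hypothetical, not taken from this pipeline):

```
# Hypothetical Snakefile rule: with "snakemake -j 8", up to two
# instances of this 4-thread rule can run in parallel.
# Snakemake scales {threads} down if it exceeds the -j limit.
rule align_reads:
    input:
        "data/{sample}.fastq"
    output:
        "mapped/{sample}.bam"
    threads: 4
    shell:
        "some_aligner --threads {threads} {input} > {output}"
```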
Run on cluster
If you are on a computer cluster with a job scheduler, you can tell the pipeline to use this scheduler instead of running all the processes on the local machine:
$ snakemake -j 32 --cluster sbatch
$ snakemake -j 32 --cluster "sbatch -J {cluster.jobName} -c {cluster.c} --mem {cluster.mem} -e {cluster.error} -o {cluster.output} -p gdec" --verbose
This allows at most 32 jobs to run concurrently through the SLURM scheduler with sbatch.
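The {cluster.*} placeholders in the sbatch command above are filled from a cluster configuration file passed with --cluster-config. A minimal sketch of such a file (the file name cluster.json and the concrete values below are assumptions for illustration; adjust them to your cluster):

```json
{
    "__default__": {
        "jobName": "snakemake_job",
        "c": 1,
        "mem": "4G",
        "error": "logs/slurm-%j.err",
        "output": "logs/slurm-%j.out"
    }
}
```

You would then add --cluster-config cluster.json to the snakemake call above; per-rule entries keyed by rule name can override the "__default__" values.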
It is possible to force Snakemake to wait for a defined amount of time in case of latency on the filesystem of your cluster/server.
# waiting 30 seconds after each job to check for output files
$ snakemake --latency-wait 30 [...]
Diagrams and graphs
You can generate a diagram of all the processes and dependencies of your analysis:
$ snakemake --dag | dot -Tpng > dag.png
This will generate a PNG file of your diagram.
If you only want the overall structure of the pipeline (one node per rule), you may run:
$ snakemake --rulegraph | dot -Tpng > rulegraph.png
This will generate a PNG file of the rule graph.