Basic Structure of a Snakemake Pipeline Run

/user/snakemake-demo$ ls
config.json data envs scripts slurm-240702.out Snakefile
  • data = mock data for the snakefile to use
  • Snakefile = name of the snakemake “formula” file
    • Note: By default, snakemake looks for a file named Snakefile in the current working directory. To use a different file, pass its path with the -s option:
      • snakemake -s snakefile.py
  • envs = directory for storing the conda environments that the workflow will use.
  • scripts = directory for storing Python scripts called by the snakemake formula.
  • config.json = JSON file with extra parameters for the Snakefile to use.
  • cluster.json = JSON file with the specifications for running on the HPC.
  • samples.txt = file we will use later in connection with config.json.

Run the Snakefile as a dry run (using the example workflow shown above).

  • This builds the DAG (directed acyclic graph) of jobs to be run without actually executing them.
  • snakemake --dry-run

You can also name a rule of interest as the target, so that only that rule and the rules it depends on are scheduled. Compare:

  • snakemake --dry-run all
  • snakemake --dry-run call
  • snakemake --dry-run bwa
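
To make the difference between these targets concrete, here is a minimal Python sketch of how a requested target pulls in only the rules it depends on. The rule names all, call and bwa come from the commands above; the dependency edges (all needs call, call needs bwa) are an assumption made for illustration and may not match the demo Snakefile. Snakemake's real resolution works on input and output files, not on a hand-written table like this one.

```python
# Toy model of target selection. Rule names are taken from the commands above;
# the dependency edges are ASSUMED for illustration only.
DEPENDS_ON = {
    "bwa": [],        # assumed: alignment needs no upstream rule
    "call": ["bwa"],  # assumed: calling consumes the alignments
    "all": ["call"],  # assumed: `all` requests the final outputs
}


def jobs_for_target(target, depends_on=DEPENDS_ON):
    """Return the rules needed to build `target`, upstream rules first."""
    ordered = []
    seen = set()

    def visit(rule):
        if rule in seen:
            return
        seen.add(rule)
        for upstream in depends_on[rule]:
            visit(upstream)
        # A rule is appended only after everything it depends on.
        ordered.append(rule)

    visit(target)
    return ordered


if __name__ == "__main__":
    for target in ("all", "call", "bwa"):
        print(target, "->", jobs_for_target(target))
```

Under these assumed edges, asking for bwa schedules a single rule, while asking for all schedules the whole chain.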

Render the DAG of jobs to be run as an image (the dot command is part of Graphviz).

  • snakemake --dag | dot -Tsvg > dag.svg
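
The pipe above works because snakemake --dag prints a graph description in the DOT language, which dot then renders. The Python sketch below shows roughly what such a description looks like for a toy graph. The function name to_dot and the example edges are hypothetical, and Snakemake's real output carries more detail (node labels, colors, styles) than this simplified version.

```python
def to_dot(depends_on):
    """Return a DOT-language description of a rule dependency table.

    `depends_on` maps each rule name to the list of rules it depends on.
    """
    lines = ["digraph snakemake_dag {"]
    # Declare every rule as a node, so isolated rules still show up.
    for rule in depends_on:
        lines.append(f'    "{rule}";')
    # Draw one edge from each upstream rule to the rule that needs it.
    for rule, upstream_rules in depends_on.items():
        for upstream in upstream_rules:
            lines.append(f'    "{upstream}" -> "{rule}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # ASSUMED edges, for illustration only.
    example = {"bwa": [], "call": ["bwa"], "all": ["call"]}
    print(to_dot(example))
```

Text in this form can be saved to a file and rendered with dot -Tsvg in the same way as the real snakemake --dag output.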

Run the workflow for real (this time not as a dry run).

  • snakemake --use-conda
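
If the same invocations need to be launched from a Python script, they can be assembled programmatically. The sketch below is one possible wrapper, not part of Snakemake itself: the helper names snakemake_command and run_snakemake are hypothetical, and only the flags already shown in this walkthrough (-s, --dry-run, --use-conda) are used.

```python
import subprocess


def snakemake_command(targets=(), snakefile=None, dry_run=False, use_conda=False):
    """Build the argument list for a snakemake call (hypothetical helper)."""
    cmd = ["snakemake"]
    if snakefile is not None:
        cmd += ["-s", snakefile]    # override the default Snakefile
    if dry_run:
        cmd.append("--dry-run")     # plan the jobs without executing them
    if use_conda:
        cmd.append("--use-conda")   # run rules inside their conda environments
    cmd += list(targets)            # optional target rules, e.g. "bwa"
    return cmd


def run_snakemake(**kwargs):
    """Run snakemake and raise CalledProcessError if it exits non-zero."""
    return subprocess.run(snakemake_command(**kwargs), check=True)


if __name__ == "__main__":
    # Only prints the commands; call run_snakemake(...) to actually execute.
    print(snakemake_command(dry_run=True, targets=["bwa"]))
    print(snakemake_command(use_conda=True))
```

Depending on the installed Snakemake version, a real run may also need a core count such as --cores; snakemake --help lists what your version expects.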