This document describes the parameter options used by the pipeline.
- Running the pipeline
- Inputs and outputs
- Modifying parameters
- Trimming parameters
- Additional parameters
- Software dependencies
- Other command line parameters
The main command for running the pipeline is as follows:
nextflow run ecSeq/DNAseq [OPTIONS]
Note that the pipeline will create files in your working directory:
work/ # Directory containing the nextflow working files
.nextflow.log # Log file from Nextflow
.nextflow/ # Nextflow cache and history information
Specify the path to the directory containing input reads in either "*_{1,2}.fastq.gz" format (paired-end) or "*.fastq.gz" format (single-end).
Specify the path to the reference genome in fasta format. NB: there must also be a corresponding fasta index file "*.fai".
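A missing index is an easy mistake to make before launching, so a small pre-flight check can help. The sketch below assumes the index sits next to the fasta file and is named by appending ".fai" to the fasta filename (the naming that samtools faidx produces). The function name check_reference is hypothetical and not part of the pipeline.

```shell
#!/bin/sh
# check_reference FASTA: succeed only if the fasta file and its index both exist.
# Assumption: the index is named "<FASTA>.fai", as written by `samtools faidx`.
check_reference() {
    ref="$1"
    if [ ! -f "$ref" ]; then
        echo "missing reference: $ref" >&2
        return 1
    fi
    if [ ! -f "$ref.fai" ]; then
        echo "missing index: $ref.fai (create it with: samtools faidx $ref)" >&2
        return 1
    fi
    echo "ok: $ref"
}

# Example usage (hypothetical path):
#   check_reference /path/to/genome.fa
```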
Name of the output directory that will contain the final results. [default: "./"]
Indicate to the pipeline whether input reads should be expected in single-end format (i.e. "*.fastq.gz"). [default: off]
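To decide whether single-end mode is needed, it can help to inspect the read directory first. The following is a minimal sketch that classifies a directory using the two filename patterns described above. The function name read_layout is hypothetical, and the check only looks at filenames, not file contents.

```shell
#!/bin/sh
# read_layout DIR: print "paired", "single" or "none" based on filenames in DIR.
read_layout() {
    dir="$1"
    # Paired-end: at least one *_1.fastq.gz file with a matching *_2.fastq.gz.
    for r1 in "$dir"/*_1.fastq.gz; do
        [ -f "$r1" ] || continue
        r2="${r1%_1.fastq.gz}_2.fastq.gz"
        if [ -f "$r2" ]; then
            echo paired
            return 0
        fi
    done
    # Otherwise any *.fastq.gz file is treated as single-end input.
    for r in "$dir"/*.fastq.gz; do
        if [ -f "$r" ]; then
            echo single
            return 0
        fi
    done
    echo none
    return 1
}

# Example usage (hypothetical path):
#   read_layout /path/to/reads
```

If the function prints "single", the single-end option described above is the one to enable.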
Specify in order to generate QC reports of trimmed reads with FastQC. [default: off]
Specify in order to produce QC reports of alignments using Qualimap bamQC. [default: off]
Specify in order to keep trimmed fastq reads as well as alignments. [default: off]
Forward adapter sequence. [default: "GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG"]
Reverse adapter sequence. [default: "ACACTCTTTCCCTACACGACGCTCTTCCGATCT"]
Minimum base quality threshold. [default: 20]
Minimum read length threshold. [default: 25]
Minimum adapter overlap threshold. [default: 3]
Specify in order to prevent Nextflow from clearing the work dir cache following a successful pipeline completion. [default: off]
When called with nextflow run ecseq/dnaseq --version, this will display the pipeline version and quit.
When called with nextflow run ecseq/dnaseq --help, this will display the parameter options and quit.
There are different ways to provide the required software dependencies for the pipeline. The recommended method is to use the Conda, Docker or Singularity profiles as provided by the pipeline.
Use this parameter to choose a preset configuration profile. Profiles available with the pipeline are:
standard
- The default profile, used if -profile is not specified.
- Uses sensible resource allocation, runs using the local executor (native system calls) and expects all software to be installed and available on the $PATH.
- This profile is mainly designed to be used as a starting point for other configurations and is inherited by most of the other profiles below.
conda
- Builds a conda environment from the environment.yml file provided by the pipeline
- Requires conda to be installed on your system.
docker
- Launches a docker image pulled from ecseq/dnaseq
- Requires docker to be installed on your system.
singularity
- Launches a singularity image pulled from ecseq/dnaseq
- Requires singularity to be installed on your system.
custom
- No configuration at all. Useful if you want to build your own config from scratch and want to avoid loading in the default base config for process resource allocation.
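As a rough illustration, a profile is chosen by passing one of the names above to -profile. The helper profile_cmd below is hypothetical: it only prints the launch command for a known profile name and rejects anything else, so it can be tried without Nextflow installed.

```shell
#!/bin/sh
# profile_cmd NAME: print (not run) the launch command for a listed profile.
profile_cmd() {
    case "$1" in
        standard|conda|docker|singularity|custom)
            echo "nextflow run ecSeq/DNAseq -profile $1"
            ;;
        *)
            echo "unknown profile: $1" >&2
            return 1
            ;;
    esac
}

profile_cmd conda
# prints: nextflow run ecSeq/DNAseq -profile conda
```

Pasting the printed line into a terminal launches the pipeline with that profile.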
If you wish to provide your own package containers, it is possible to do so by setting the standard or custom profile and then providing your custom package with the command line flags below. These are not required with the other profiles.
Flag to enable conda. You can provide either a pre-built environment or a *.yaml file.
Flag to enable docker. The image will automatically be pulled from Dockerhub.
Flag to enable use of singularity. The image will automatically be pulled from the internet. If running offline, follow the option with the full path to the image file.
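The three flags above could be combined with the standard profile roughly as follows. This sketch assumes the flags in question are Nextflow's own -with-conda, -with-docker and -with-singularity options. The helper container_cmd and the example paths are hypothetical, and the function only prints the command rather than running it.

```shell
#!/bin/sh
# container_cmd ENGINE [PATH]: print a launch command using a custom package.
# ENGINE is conda, docker or singularity; PATH is an optional environment
# file or image file (needed, for example, when running singularity offline).
container_cmd() {
    base="nextflow run ecSeq/DNAseq -profile standard"
    case "$1" in
        conda|docker|singularity)
            if [ -n "${2:-}" ]; then
                echo "$base -with-$1 $2"
            else
                echo "$base -with-$1"
            fi
            ;;
        *)
            echo "unknown engine: $1" >&2
            return 1
            ;;
    esac
}

container_cmd docker
# prints: nextflow run ecSeq/DNAseq -profile standard -with-docker
container_cmd singularity /path/to/image.sif
# prints: nextflow run ecSeq/DNAseq -profile standard -with-singularity /path/to/image.sif
```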
Specify the path to a custom work directory for the pipeline to run with (e.g. on a scratch directory).
Provide a file with specified parameters to avoid typing them out on the command line. This is useful for carrying out repeated analyses. A template params file, assets/params.config, has been made available in the pipeline repository.
Provide a custom config file for adapting the pipeline to run on your own computing infrastructure. A template config file, assets/custom.config, has been made available in the pipeline repository. This file can be used as a boilerplate for building your own custom config.
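For orientation, a very small custom config might look like the one written by the sketch below. The resource values and the function name write_custom_config are hypothetical and are not taken from assets/custom.config, which remains the better starting point. The sketch also assumes the config file is passed with Nextflow's -c option.

```shell
#!/bin/sh
# write_custom_config FILE: write a small, hypothetical Nextflow config to FILE.
write_custom_config() {
    cat > "$1" <<'EOF'
// Hypothetical resource settings; adjust to your own infrastructure.
process {
    executor = 'local'
    cpus = 2
    memory = '8 GB'
}
EOF
}

# Example usage (hypothetical filename):
#   write_custom_config my_custom.config
#   nextflow run ecSeq/DNAseq -profile custom -c my_custom.config
```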
Specify this when restarting a pipeline. Nextflow will use cached results from any pipeline steps where the inputs are the same, continuing from where it previously stopped. Give a specific pipeline name as an argument to resume it; otherwise Nextflow will resume the most recent run. NOTE: This will not work if the specified run finished successfully and the cache was automatically cleared (see: --debug).
Name for the pipeline run. If not specified, Nextflow will automatically generate a random mnemonic.
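Naming a run and resuming it later fit together, which the following sketch tries to show. It assumes the two options described above are Nextflow's -name and -resume. The helpers named_run_cmd and resume_cmd and the run name my_run are hypothetical, and both functions only print the command. The first command includes --debug so that the cache is kept after a successful run and the named run can still be resumed.

```shell
#!/bin/sh
# named_run_cmd NAME: print a launch command that names the run and keeps
# the work dir cache (--debug) so that it can be resumed later.
named_run_cmd() {
    echo "nextflow run ecSeq/DNAseq -name $1 --debug"
}

# resume_cmd [NAME]: print a resume command for a named run, or for the
# most recent run when no name is given.
resume_cmd() {
    if [ -n "${1:-}" ]; then
        echo "nextflow run ecSeq/DNAseq -resume $1"
    else
        echo "nextflow run ecSeq/DNAseq -resume"
    fi
}

named_run_cmd my_run
# prints: nextflow run ecSeq/DNAseq -name my_run --debug
resume_cmd my_run
# prints: nextflow run ecSeq/DNAseq -resume my_run
```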