Troubleshooting Nextflow run
- Learn basic troubleshooting with nextflow log
- Learn the structure of nextflow work directory
- Examine the run command stitched together by nextflow for manual debugging
2.2.1. Nextflow log
It is important to keep a record of the commands you have used to generate your results. Nextflow helps with this by creating and storing metadata and logs about each run in hidden files and folders in your current directory (unless otherwise specified). Nextflow can use this data to generate reports, and you can query it with the nextflow log command:
nextflow log
The log
command has multiple options to facilitate the queries and is especially useful while debugging a workflow and inspecting execution metadata. You can view all of the possible log options with the -h flag:
nextflow log -h
To query a specific execution you can use the RUN NAME
or a SESSION ID
:
nextflow log <run name>
To get more information, you can use the -f
option with named fields. For example:
nextflow log <run name> -f process,hash,duration
There are many other fields you can query. You can view a full list of fields with the -l
option:
nextflow log -l
Exercise: Use the log command to view the process, hash, and script fields for the tasks from your most recent Nextflow execution.
First, use the log
command to get a list of your recent executions:
nextflow log
TIMESTAMP DURATION RUN NAME STATUS REVISION ID SESSION ID COMMAND
2025-05-27 15:23:39 2m 16s hopeful_dubinsky OK b89fac3265 54c06115-6867-45e3-86b3-8566a69f7406 nextflow run nf-core/rnaseq -r 3.14.0 -profile apptainer -params-file ./workshop-params.yaml
2025-05-27 15:56:36 37.1s tiny_bartik OK b89fac3265 54c06115-6867-45e3-86b3-8566a69f7406 nextflow run nf-core/rnaseq -r 3.14.0 -profile apptainer -params-file ./workshop-params.yaml -resume
2025-05-27 15:57:23 35.6s angry_mcclintock OK b89fac3265 54c06115-6867-45e3-86b3-8566a69f7406 nextflow run nf-core/rnaseq -r 3.14.0 -profile apptainer -params-file ./workshop-params.yaml -resume --outdir my_results
Query the process, hash, and script fields using the -f option for the most recent run, substituting your own run name for the one shown:
nextflow log marvelous_shannon -f process,hash,script
[... truncated ...]
NFCORE_RNASEQ:RNASEQ:QUANTIFY_PSEUDO_ALIGNMENT:SALMON_QUANT d7/01e251
salmon quant \
--geneMap genome.filtered.gtf \
--threads 2 \
--libType=ISR \
--index salmon \
-1 SRR6357071_1_val_1.fq.gz -2 SRR6357071_2_val_2.fq.gz \
\
-o SRR6357071
if [ -f SRR6357071/aux_info/meta_info.json ]; then
cp SRR6357071/aux_info/meta_info.json "SRR6357071_meta_info.json"
fi
cat <<-END_VERSIONS > versions.yml
"NFCORE_RNASEQ:RNASEQ:QUANTIFY_PSEUDO_ALIGNMENT:SALMON_QUANT":
salmon: $(echo $(salmon --version) | sed -e "s/salmon //g")
END_VERSIONS
[... truncated ... ]
NFCORE_RNASEQ:RNASEQ:MULTIQC 7c/b0bbc5
multiqc \
-n multiqc_report.html \
-f \
\
\
.
cat <<-END_VERSIONS > versions.yml
"NFCORE_RNASEQ:RNASEQ:MULTIQC":
multiqc: $( multiqc --version | sed -e "s/multiqc, version //g" )
END_VERSIONS
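Copying the run name by hand is error-prone when a directory holds many runs. The sketch below is one possible shortcut and is not part of the original exercise: a small helper (the name latest_run is hypothetical) that picks the last entry from a list of run names. It assumes that nextflow log -q prints one run name per line, oldest first, in the same order as the table above.

```shell
# Print the last non-blank line read from standard input.
# Intended input: a list of run names, one per line, oldest first (assumption).
latest_run() {
    # Drop blank lines, then keep only the final (most recent) entry.
    grep -v '^[[:space:]]*$' | tail -n 1
}

# Usage on a real system (requires Nextflow and at least one previous run):
#   run_name=$(nextflow log -q | latest_run)
#   nextflow log "$run_name" -f process,hash,script
```

Keeping the helper separate from the nextflow call means it can be tried on any list of names before it is used on a real run history.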
2.2.2. Execution cache and resume
Task execution caching is an essential feature of modern workflow managers. As such, Nextflow provides an automated caching mechanism for every execution. When using the Nextflow -resume
option, successfully completed tasks from previous executions are skipped and the previously cached results are used in downstream tasks.
Nextflow's caching mechanism works by assigning a unique ID to each task. The task ID is a 128-bit hash value computed from the task's inputs, including the complete file path, file size, and last modified timestamp of each input file. These IDs are used to create a separate execution directory where each task is executed and its outputs are stored. Nextflow takes care of the inputs and outputs in these folders for you.
You can re-launch the previously executed nf-core/rnaseq
workflow again using -resume
, and observe the progress. Change the output directory to be my_results
. Notice the time it takes to complete the workflow.
nextflow run nf-core/rnaseq -r 3.14.0 \
-profile apptainer \
-params-file ./workshop-params.yaml \
-resume \
--outdir my_results
[6d/10f0b4] process > NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GTF_FILTER (genome.fasta) [100%] 1 of 1, cached: 1 ✔
[77/74cbf2] process > NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GTF2BED (genome.filtered.gtf) [100%] 1 of 1, cached: 1 ✔
[02/f7e668] process > NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:MAKE_TRANSCRIPTS_FASTA (rsem/genome.fasta) [100%] 1 of 1, cached: 1 ✔
...
Executing this workflow will create a my_results directory that contains selected results files, as well as the work directory, which contains further sub-directories.
In the output above, the hexadecimal numbers, such as 6d/10f0b4, identify the unique task execution. These numbers are also the prefix of the work subdirectories where each task is executed.
You can inspect the files produced by a task by looking inside the work
directory and using these numbers to find the task-specific execution path. Use tab
to autocomplete the full file path:
ls work/6d/10f0b4d0e6cf920e35657ce78feb1d/
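When tab completion is not available, for example inside a script, the short hash can be expanded with a shell glob instead. This is a minimal sketch: find_task_dir is a hypothetical helper name, and the default location of ./work is an assumption about where the pipeline was launched.

```shell
# Resolve a short task hash such as "6d/10f0b4" to its full work directory.
# Arguments: $1 = short hash, $2 = work directory (defaults to ./work)
find_task_dir() {
    short_hash=$1
    work_dir=${2:-work}
    # The trailing * stands in for the remaining characters of the hash.
    for dir in "$work_dir"/"$short_hash"*/; do
        # If nothing matched, the pattern is left as-is and is not a directory.
        if [ -d "$dir" ]; then
            printf '%s\n' "${dir%/}"
        fi
    done
}

# Usage: ls -la "$(find_task_dir 6d/10f0b4)"
```

The function prints nothing when no directory matches, so an empty result is a quick sign that the hash was mistyped or belongs to a different work directory.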
If you look inside the work
directory of a FASTQC
process, you will find the files that were staged and created when this task was executed:
ls -la work/e9/60b2e80b2835a3e1ad595d55ac5bf5/
total 1940
drwxrwxr-x 2 larigan larigan 4096 May 27 15:24 .
drwxrwxr-x 3 larigan larigan 4096 May 27 15:23 ..
-rw-rw-r-- 1 larigan larigan 0 May 27 15:24 .command.begin
-rw-rw-r-- 1 larigan larigan 0 May 27 15:24 .command.err
-rw-rw-r-- 1 larigan larigan 34 May 27 15:24 .command.log
-rw-rw-r-- 1 larigan larigan 34 May 27 15:24 .command.out
-rw-rw-r-- 1 larigan larigan 10394 May 27 15:24 .command.run
-rw-rw-r-- 1 larigan larigan 468 May 27 15:24 .command.sh
-rw-rw-r-- 1 larigan larigan 261 May 27 15:24 .command.trace
-rw-rw-r-- 1 larigan larigan 1 May 27 15:24 .exitcode
-rw-rw-r-- 1 larigan larigan 598884 May 27 15:24 SRR6357071_1_fastqc.html
-rw-rw-r-- 1 larigan larigan 365752 May 27 15:24 SRR6357071_1_fastqc.zip
lrwxrwxrwx 1 larigan larigan 66 May 27 15:24 SRR6357071_1.fastq.gz -> /home/larigan/rnaseq_data/testdata/GSE110004/SRR6357071_1.fastq.gz
lrwxrwxrwx 1 larigan larigan 21 May 27 15:24 SRR6357071_1.gz -> SRR6357071_1.fastq.gz
-rw-rw-r-- 1 larigan larigan 604569 May 27 15:24 SRR6357071_2_fastqc.html
-rw-rw-r-- 1 larigan larigan 355487 May 27 15:24 SRR6357071_2_fastqc.zip
lrwxrwxrwx 1 larigan larigan 66 May 27 15:24 SRR6357071_2.fastq.gz -> /home/larigan/rnaseq_data/testdata/GSE110004/SRR6357071_2.fastq.gz
lrwxrwxrwx 1 larigan larigan 21 May 27 15:24 SRR6357071_2.gz -> SRR6357071_2.fastq.gz
-rw-rw-r-- 1 larigan larigan 83 May 27 15:24 versions.yml
The FASTQC process runs twice, executing in a different work directory for each set of inputs. Therefore, in the previous example, the work directory [e9/60b2e8] represents just one of the two sets of input data that were processed.
It’s very likely you will execute a workflow multiple times as you find the parameters that best suit your data. You can save a lot of storage space (and time) by resuming a workflow from the last step that was completed successfully and/or unmodified.
In practical terms, the workflow is executed from the beginning. However, before launching the execution of a process, Nextflow uses the unique task ID to check if the work directory already exists and that it contains a valid command exit state with the expected output files. If this condition is satisfied, the task execution is skipped and previously computed results are saved in the output directory.
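The exit-state part of this check can be illustrated by hand. The helper below (task_completed_ok is a hypothetical name) only inspects the .exitcode file in a task directory. It is a simplified illustration and does not reproduce Nextflow's actual cache validation, which also relies on the task ID and the expected output files.

```shell
# Return success (0) only when a task directory records a zero exit code.
# Argument: $1 = path to a task work directory
task_completed_ok() {
    task_dir=$1
    # No .exitcode file means no exit status was recorded for this task.
    [ -f "$task_dir/.exitcode" ] || return 1
    # Read the recorded exit code, ignoring any surrounding whitespace.
    code=$(tr -d '[:space:]' < "$task_dir/.exitcode")
    [ "$code" = "0" ]
}

# Usage:
#   if task_completed_ok work/<task-dir>; then echo "exit code 0"; fi
```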
Notably, the -resume
functionality is very sensitive. Even touching a file in the work directory can invalidate the cache.
Exercise: Invalidate the cache by touching a .fastq.gz file inside the FASTQC task work directory (you can use the touch command). Execute the workflow again with the -resume option. Has the cache been invalidated?
Execute the workflow for the first time (if you have not already).
Use the task ID shown for the FASTQC process to find and touch a .fastq.gz file:
touch work/ff/21abfa87cc7cdec037ce4f36807d32/SRR6357071_1.fastq.gz
Execute the workflow again with the -resume
command option:
nextflow run nf-core/rnaseq -r 3.14.0 \
-profile apptainer \
-params-file ./workshop-params.yaml \
-resume \
--outdir my_results
You should see that some tasks were invalidated and executed again.
Why did this happen?
In this example, the cache of one of the two FASTQC tasks was invalidated. The fastq file we touched is used in multiple places in the pipeline. Thus, touching the symbolic link for this file and changing its date of last modification invalidated that task execution and its related downstream processes.
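The effect can be reproduced outside Nextflow. The following sketch uses throwaway files in a temporary directory (all file names are made up for illustration) and relies on the default behaviour of touch, which follows symbolic links: touching the link updates the timestamp of the file it points to, which is why every task that reads that file is affected.

```shell
# Create a stand-in input file with an old modification time.
demo_dir=$(mktemp -d)
printf 'reads\n' > "$demo_dir/original.fastq.gz"   # plain text, only the name matters here
touch -t 202001010000 "$demo_dir/original.fastq.gz"

# Keep a copy with the same timestamp (-p preserves it) as a reference point.
cp -p "$demo_dir/original.fastq.gz" "$demo_dir/untouched_copy"

# Stage the file through a symbolic link, as Nextflow does in a task directory.
ln -s "$demo_dir/original.fastq.gz" "$demo_dir/staged_link.fastq.gz"

# Touch the link, not the original file.
touch "$demo_dir/staged_link.fastq.gz"

# -nt is true when the first file has a newer modification time than the second.
if [ "$demo_dir/original.fastq.gz" -nt "$demo_dir/untouched_copy" ]; then
    link_touch_result="target timestamp updated"
else
    link_touch_result="target timestamp unchanged"
fi
echo "$link_touch_result"
```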
2.2.3. Troubleshoot warning and error messages
If we go back to our last exercise (exercise_rnaseq
output), you might recall that while that workflow execution completed successfully, there were a couple of warning messages that may be cause for concern:
-[nf-core/rnaseq] Pipeline completed successfully with skipped sampl(es)-
-[nf-core/rnaseq] Please check MultiQC report: 2/2 samples failed strandedness check.-
Completed at: 04-May-2025 15:03:01
Duration : 7m 59s
CPU hours : 0.8
Succeeded : 66
The first warning message isn’t very descriptive (see this pull request). You might come across issues like this when running nf-core pipelines, too. Bug reports and user feedback are very important to open source software communities like nf-core. If you come across any issues, submit a GitHub issue or start a discussion in the relevant nf-core Slack channel so that others are aware and the pipeline’s developers can address it.
➤ Take a look at the MultiQC report, as directed by the second message. You can find the MultiQC report in the exercise_rnaseq
directory:
ls -la exercise_rnaseq/multiqc/star_salmon/
total 1402
drwxrwxr-x 4 rlupat rlupat 4096 Nov 22 00:29 .
drwxrwxr-x 3 rlupat rlupat 4096 Nov 22 00:29 ..
drwxrwxr-x 2 rlupat rlupat 8192 Nov 22 00:29 multiqc_data
drwxrwxr-x 5 rlupat rlupat 4096 Nov 22 00:29 multiqc_plots
-rw-rw-r-- 1 rlupat rlupat 1419998 Nov 22 00:29 multiqc_report.html
➤ Download the multiqc_report.html file using the file navigator panel on the left side of your VS Code window. Right-click the file in the navigator, then select Download. Open the file on your computer.
Take a look at the section labelled WARNING: Fail Strand Check
The warning indicates that the read strandedness we specified in our samplesheet.csv and the strandedness inferred by the RSeQC process in the pipeline do not match. In the samplesheet.csv, it seems we have incorrectly specified strandedness as forward, when our raw reads actually show an equal distribution of sense and antisense reads.
For those who are not familiar with RNAseq data, incorrectly specified strandedness may negatively impact the read quantification step (process: Salmon quant) and give us inaccurate results. So, let’s clarify how the Salmon quant process is gathering strandedness information for our input files by default and find a way to address this with the parameters provided by the nf-core/rnaseq pipeline.
2.2.4. Identify the run command for a process
To observe the exact command used by a process, we can attempt to infer this information from the module’s main.nf
script in the modules/
directory. However, given all the different parameters that may be applied at the process level, this may not be very clear.
➤ Take a look at the Salmon quant main.nf file.
This file contains many function definitions within the process, variable substitutions, and internal parameters determined based on strandedness. This makes it very hard to see what is actually happening in the code, given all the different variables and conditional arguments inside this script.
Above the script block, we can see strandedness is being applied using a few different conditional arguments. Instead of trying to infer how the $strandedness
variable is being defined and applied to the process, let’s use the hidden command files saved for this process in its work
execution directory.
Remember that the pipeline’s results are cached in the work
directory. In addition to the cached files, each task execution directory inside the work directory contains a number of hidden files:
- .command.sh: The command used for the task.
- .command.run: Specifying resources, executor, software management profiles to use.
- .command.out: The task’s standard output log.
- .command.err: The task’s standard error log.
- .command.log: A wrapper for the execution output.
- .command.begin: A file created as soon as the job is launched.
- .exitcode: A file containing the task exit code (0 if successful).
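A quick way to see which of these hidden files a given task actually produced is to loop over the names listed above. This is a small sketch with a hypothetical helper name (list_task_files); it checks only for the files named in this section.

```shell
# Report which of the hidden files described above exist in a task directory.
# Argument: $1 = path to a task work directory
list_task_files() {
    task_dir=$1
    for name in .command.sh .command.run .command.out .command.err \
                .command.log .command.begin .exitcode; do
        if [ -e "$task_dir/$name" ]; then
            printf '%s present\n' "$name"
        else
            printf '%s missing\n' "$name"
        fi
    done
}

# Usage: list_task_files work/<task-dir>
```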
Within the nextflow log
command that we discussed previously, there are multiple options to facilitate pipeline debugging and inspecting pipeline execution metadata. To understand how Salmon is interpreting strandedness, we’re going to use this command to determine the full path to hidden .command.sh
scripts for each Salmon quant task that was run. This will allow us to investigate how Salmon handles strandedness and if there is a way for us to override this.
➤ Use the nextflow log
command to get the unique run name information of previously executed pipelines. Then, add that run name to your command:
nextflow log <run-name>
After running the command, we can see that it provides a list of all the work subdirectories created for each process when the pipeline was executed. How do we use this information to find the specific hidden .command.sh for Salmon tasks?
➤ Let’s use Bash to query a Nextflow run with the run name from the previous lesson. First, save your run name in a Bash variable run_name
. For example:
run_name=marvelous_shannon
➤ And let’s save the tool of interest (salmon
) in another Bash variable tool
:
tool=salmon
➤ Next, run the following bash command:
nextflow log "${run_name}" | while read -r line; do
    cmd="${line}/.command.sh"
    # Only search task directories that contain a .command.sh script
    if [ -f "$cmd" ] && grep -q "$tool" "$cmd"; then
        echo "$cmd"
    fi
done
This will list all process .command.sh
scripts containing the word ‘salmon’. Notice that there are a few different processes that run Salmon to perform other steps in the workflow. We are looking for Salmon quant which performs the read quantification:
/home/larigan/lesson2.1/work/9c/9cdaec01c009a4fef6de3b50b0d2c9/.command.sh
/home/larigan/lesson2.1/work/d9/e696aa3903f2f7bef3ead3852a7d51/.command.sh
/home/larigan/lesson2.1/work/57/95f7806c62313c5788780d1fadc89a/.command.sh
/home/larigan/lesson2.1/work/2f/bef610318ab85c7fdbb3a773c568d5/.command.sh
/home/larigan/lesson2.1/work/ec/d8a46743dc73214b04e09c7ae9ecb4/.command.sh
/home/larigan/lesson2.1/work/f2/cc71ed58bbfba78ea034a26bd48370/.command.sh
/home/larigan/lesson2.1/work/3b/0a2737be44be977e4b695c5bade23f/.command.sh
/home/larigan/lesson2.1/work/8d/fed78effdd1b28435a698d8a6efb7a/.command.sh
/home/larigan/lesson2.1/work/65/bd7329a29aaf2136f25173a22918ae/.command.sh
Compared with the salmon quant main.nf
file, we get a lot more fine scale details from the .command.sh
process scripts:
main.nf
:
salmon quant \\
--geneMap $gtf \\
--threads $task.cpus \\
--libType=$strandedness \\
$reference \\
$input_reads \\
$args \\
-o $prefix
.command.sh
:
salmon quant \
--geneMap genome.filtered.gtf \
--threads 2 \
--libType=ISF \
-t genome.transcripts.fa \
-a SRR6357071.Aligned.toTranscriptome.out.bam \
\
-o SRR6357071
From .command.sh
, we see that --libType
has been set to ISF
(i.e. forward strandedness), based on our samplesheet.
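To compare several tasks without opening each script, the --libType value can be pulled out of a .command.sh file with sed. This is a sketch under the assumption that the option is written as --libType=VALUE, as in the script above; extract_libtype is a hypothetical helper name.

```shell
# Print the value passed to --libType in a task script.
# Argument: $1 = path to a .command.sh file
extract_libtype() {
    # -n suppresses default output; the trailing p prints only lines where
    # the substitution matched, leaving just the captured letters.
    sed -n 's/.*--libType=\([A-Za-z]*\).*/\1/p' "$1"
}

# Usage: extract_libtype work/<task-dir>/.command.sh
```

The function prints nothing for scripts that do not set --libType, so it can be run safely over every path returned by the loop above.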
Exercise: We would like --libType to be ISR (i.e. reverse strandedness) instead. Besides changing the samplesheet input, we can use parameter settings to override the --libType. Use the pipeline Parameters documentation to determine which parameter has to be changed. How can we do this?
From the pipeline documentation, the --salmon_quant_libtype parameter can be changed. To change the libType passed to Salmon to ISR, we can specify --salmon_quant_libtype ISR on the command line or in a parameter file.
2.2.5. Using a parameter file
From the previous section we learned that Nextflow can accept a yaml
parameter file. Any of the pipeline-specific parameters can be supplied to a Nextflow pipeline in this way.
Exercise: Set the Salmon libType
to ISR
, inside the workshop-params.yaml
file we created previously.
- Strings need to be inside double quotes
- Booleans (true/false) and numbers do not require quotes
workshop-params.yaml
should now contain one additional parameter:
salmon_quant_libtype: "ISR"
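Before relaunching, it can be worth confirming that the parameter was saved as intended. The helper below (yaml_param is a hypothetical name) reads a top-level key from a flat key: value file such as the one above. It is not a YAML parser and will not handle nested or multi-line values.

```shell
# Print the value of a top-level "key: value" entry in a simple params file.
# Arguments: $1 = key name, $2 = path to the params file
yaml_param() {
    key=$1
    file=$2
    # First sed: keep only the line starting with "key:" and drop the key.
    # Second sed: trim trailing whitespace, then strip surrounding double quotes.
    sed -n "s/^${key}:[[:space:]]*//p" "$file" |
        sed -e 's/[[:space:]]*$//' -e 's/^"\(.*\)"$/\1/'
}

# Usage: yaml_param salmon_quant_libtype workshop-params.yaml
```

If the file contains the line shown above, the call in the usage comment should print ISR; an empty result means the key was not found at the start of a line.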
➤ Now that our params file has been saved, we can rerun the pipeline:
nextflow run nf-core/rnaseq -r 3.14.0 \
-resume \
-profile apptainer \
-params-file workshop-params.yaml \
--outdir exercise_rnaseq
As the workflow runs a second time, you will notice 4 things:
- The nextflow run command is much tidier, due to the use of a -params-file that stores pipeline parameters used in a Nextflow run
- The -resume flag. Nextflow has many run options including the ability to use cached output
- Some processes will be pulled from the cache. These processes remain unaffected by our addition of the new parameter.
- This run of the pipeline will complete in a much shorter time compared to starting the pipeline from the beginning, due to pipeline caching.
-[nf-core/rnaseq] Pipeline completed successfully with skipped sampl(es)-
-[nf-core/rnaseq] Please check MultiQC report: 2/2 samples failed strandedness check.-
Completed at: 27-May-2025 17:13:48
Duration : 1m 55s
CPU hours : 0.2 (70.8% cached)
Succeeded : 11
Cached : 41
We still seem to be getting the warning Please check MultiQC report: 2/2 samples failed strandedness check. Let’s check what --libType has been used inside the salmon process.
Exercise: Determine the hexadecimal code output by Nextflow for a SALMON_QUANT
process in your most recent run. Use this code to determine the work
execution directory for SALMON_QUANT
, and look inside the .command.sh
. What --libType
has been used? Is it the one we specified in our parameter file?
In the Nextflow output, the following line provides the hexadecimal code for a SALMON_QUANT process.
[d9/69e2a7] process > NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:SALMON_QUANT (SRR6357071) [100%] 2 of 2 ✔
The work
execution directory for this process is located in (using tab
to autocomplete the folder name):
ls -a work/d9/69e2a7563de5e6046f140a1317c1d2/
Inside the .command.sh
file, the --libType
parameter matches the one specified in our parameter file:
#!/bin/bash -euo pipefail
salmon quant \
--geneMap genome.filtered.gtf \
--threads 2 \
--libType=ISR \
-t genome.transcripts.fa \
-a SRR6357071.Aligned.toTranscriptome.out.bam \
\
-o SRR6357071
if [ -f SRR6357071/aux_info/meta_info.json ]; then
cp SRR6357071/aux_info/meta_info.json "SRR6357071_meta_info.json"
fi
cat <<-END_VERSIONS > versions.yml
"NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:SALMON_QUANT":
salmon: $(echo $(salmon --version) | sed -e "s/salmon //g")
END_VERSIONS
The tool is working as we expected!
If we want to remove the warning message Please check MultiQC report: 2/2 samples failed strandedness check
, we will have to change the strandedness fields in our samplesheet.csv
. Keep in mind, doing this will invalidate the pipeline’s cache and will cause the pipeline to run from the beginning.
- Use nextflow log to query the record of commands used in the pipeline
- Use -resume to re-launch previously executed workflows so that Nextflow makes use of its task execution caching feature
- Examine the .command.sh inside the work directory to troubleshoot the command that Nextflow uses to run a particular task
This workshop is adapted from various nextflow training materials, including: