ERROR: Please check design file header #135

Closed
bschilder opened this issue Dec 17, 2020 · 1 comment
Labels
bug Something isn't working
Comments

@bschilder

Reposting here from the nf-core Slack for posterity. Thanks to @drpatelh for figuring this out!

Problem

The nf-core/atacseq pipeline does not recognize the design.csv file, even though it follows the structure indicated here.

The bash script below is a modified version of the command shown here.

Bash script

#!/bin/bash

source ~/.bashrc
module load nextflow

export repo_dir=$HOME/neurogenomics/GitRepos/CUT_n_TAG
export project_id=HK5M2BBXY
mkdir -p $repo_dir/processed_data/$project_id


nextflow run nf-core/atacseq \
    --input $repo_dir/raw_data/$project_id/design_noindex.csv \
    --genome GRCh37 \
    --narrow_peak \
    --outdir $repo_dir/processed_data/$project_id \
    -with-singularity $HOME/atacseq_latest.sif \
    -c $repo_dir/hpc_config \
    -r 1.2.1 

Error output

...
Error executing process > 'CHECK_DESIGN (design.csv)'

Caused by:
  Process `CHECK_DESIGN (design.csv)` terminated with an error exit status (1)

Command executed:

  check_design.py design.csv design_reads.csv

Command exit status:
  1

Command output:
  ERROR: Please check design file header: group,replicate,fastq_1,fastq_2 != group,replicate,fastq_1,fastq_2

Command wrapper:
  singularity/default :: no need to load a module to use singularity
  ERROR: Please check design file header: group,replicate,fastq_1,fastq_2 != group,replicate,fastq_1,fastq_2
  
  ============================================
  
          Job resource usage summary 
  
                   Memory (GB)    NCPUs
   Requested  :         6             1
   Used       :         0 (peak)   0.50 (ave)
  
  ============================================

Work dir:
  /rds/general/user/bms20/ephemeral/tmp/b8/e3357840d49b488c3f691f8db4592d

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

Solution

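The design.csv file MUST be saved as plain CSV, not as "CSV UTF-8", which is the default in Excel. The UTF-8 variant starts the file with an invisible byte-order mark (BOM), so the header that check_design.py reads does not match the expected one, even though both look identical in the error message.
See the attached screenshot for how to save in the correct format.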

Screenshot 2020-12-17 at 17 52 43
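
If re-exporting from Excel is not convenient, the file can also be cleaned up after the fact. Excel's "CSV UTF-8" export prepends a byte-order mark (BOM) to the file, and removing those three bytes should be enough. This is a minimal sketch, assuming the leading BOM is the only problem with the file; `strip_utf8_bom` is a hypothetical helper name and not part of nf-core/atacseq.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(path):
    """Remove a leading UTF-8 BOM from the file in place.

    Hypothetical helper for illustration only.
    Returns True if a BOM was found and removed, False otherwise.
    """
    p = Path(path)
    data = p.read_bytes()
    if data.startswith(codecs.BOM_UTF8):
        # Rewrite the file without the first three bytes (EF BB BF).
        p.write_bytes(data[len(codecs.BOM_UTF8):])
        return True
    return False
```

Calling `strip_utf8_bom("design.csv")` before launching the pipeline leaves a file without a BOM untouched, so it is safe to run more than once.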

@drpatelh drpatelh added the bug Something isn't working label Jan 25, 2021
@drpatelh drpatelh added this to the 2.0 milestone Nov 15, 2022
@drpatelh
Member

Hi @bschilder! Thanks for reporting, and apologies for the delay in responding! We are about to release a heavily updated version of the pipeline that has been completely refactored in Nextflow DSL2.

I believe this should be fixed now because I added the encoding below to the nf-core pipeline template, which has now been incorporated into this pipeline too:

with open(file_in, "r", encoding="utf-8-sig") as fin:
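
The effect of that encoding argument can be sketched as follows. This is an illustration only, not the actual contents of check_design.py: `EXPECTED_HEADER`, `read_header`, and the sample bytes are assumptions made up for the example. With plain `utf-8` the BOM survives decoding as the character `\ufeff` glued to the first column name, while `utf-8-sig` drops it.

```python
# Expected header, taken from the error message in this issue.
EXPECTED_HEADER = ["group", "replicate", "fastq_1", "fastq_2"]

# Hypothetical file contents as written by an Excel "CSV UTF-8" export:
# the first three bytes are the UTF-8 byte-order mark.
RAW = b"\xef\xbb\xbfgroup,replicate,fastq_1,fastq_2\nctrl,1,a.fastq.gz,b.fastq.gz\n"


def read_header(raw_bytes, encoding):
    """Decode the bytes and return the first line split on commas."""
    text = raw_bytes.decode(encoding)
    return text.splitlines()[0].split(",")


plain = read_header(RAW, "utf-8")      # first field is "\ufeffgroup"
sig = read_header(RAW, "utf-8-sig")    # first field is "group"

print(plain == EXPECTED_HEADER)  # False: the invisible BOM breaks the comparison
print(sig == EXPECTED_HEADER)    # True
```

Because `\ufeff` has no visible glyph in most terminals, the mismatching headers print as identical strings, which matches the confusing error message reported above. `utf-8-sig` also reads files without a BOM correctly, so it is a safe default for input.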

x-ref: nf-core/rnaseq@3e4d35b

Please feel free to re-open if the issue persists. I will close this issue for now.

@drpatelh drpatelh changed the title ERROR: Please check design file header: group,replicate,fastq_1,fastq_2 != group,replicate,fastq_1,fastq_2 ERROR: Please check design file header Nov 30, 2022