General information

  • Whenever you see something in angle brackets, such as <your_name>, it is just a placeholder. Replace it with the actual value!


Task 0

Create a local MS Word file and name it 024_ngs_phd_<surname>.docx, replacing <surname> with your actual surname. Copy commands, useful information, links, etc. into this file. It should help you reproduce the steps from home after the class is over.


Task 1

Connect to the server with your username and password using Windows PowerShell.

Task 2

Try navigating in Bash: create a folder, change into it, and move around the file system.

Task 3

eGFR_SNPs.csv and HDL_SNPs.txt are located in the folder teaching/ngs/data/unix/snp_lookup. How many lines does each file contain? (Tip: You can either navigate to the folder with cd or run the command directly from your home directory by giving the full path to each file.)
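A minimal sketch of counting lines, assuming two small throwaway files created in a temporary directory. The file names and contents below are made up for illustration and are not the course data:

```shell
# Create a scratch directory with two toy files (hypothetical content).
workdir=$(mktemp -d)
printf 'rs1,1,100\nrs2,1,200\nrs3,2,300\n' > "$workdir/toy_a.csv"
printf 'rs1\nrs3\n' > "$workdir/toy_b.txt"

# wc -l prints the number of lines; reading from stdin omits the file name.
lines_a=$(wc -l < "$workdir/toy_a.csv")
lines_b=$(wc -l < "$workdir/toy_b.txt")
echo "toy_a.csv: $lines_a lines, toy_b.txt: $lines_b lines"
```

The same command works from any directory as long as the path to the file is given in full.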

Task 4

Grep the SNPs rs13326165 and rs17173637 from eGFR_SNPs.csv and write the matching lines to a file.
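One possible way to search for two patterns at once and save the result is sketched below on a toy file. The SNP IDs and columns are hypothetical, chosen only to show the mechanics:

```shell
# Toy CSV with made-up SNP IDs (not the course data).
workdir=$(mktemp -d)
printf 'rs111,1,100\nrs222,1,200\nrs333,2,300\n' > "$workdir/toy.csv"

# Each -e adds one pattern; a line is printed if it matches any of them.
# The > redirection writes the matching lines to a file instead of the screen.
grep -e 'rs111' -e 'rs333' "$workdir/toy.csv" > "$workdir/selected.csv"
cat "$workdir/selected.csv"
```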

Task 5

Now grep the SNP rs133299 from eGFR_SNPs.csv. How many lines are displayed? How would you interpret the output? What happens if you use the following grep command instead, and what is the difference? grep -w rs133299 teaching/ngs/data/unix/snp_lookup/eGFR_SNPs.csv
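The effect of -w can be seen on a toy file in which one ID is a prefix of the others. The IDs below are invented for this sketch:

```shell
# Toy CSV: rs100 is a prefix of rs1001 and rs10012 (made-up IDs).
workdir=$(mktemp -d)
printf 'rs100,1,10\nrs1001,1,20\nrs10012,2,30\n' > "$workdir/toy.csv"

# Plain grep matches the pattern anywhere in the line, including inside longer IDs.
plain_hits=$(grep -c 'rs100' "$workdir/toy.csv")
# -w only accepts matches that form a whole word, so the longer IDs are skipped.
word_hits=$(grep -cw 'rs100' "$workdir/toy.csv")
echo "without -w: $plain_hits lines, with -w: $word_hits line"
```

Here -c makes grep print the number of matching lines rather than the lines themselves.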

Task 6 (with Stefan)

Now, try to find the SNPs your boss asked you for. Use the grep command to output the matching lines from eGFR_SNPs.csv, using HDL_SNPs.txt as the pattern file. Also add the -w option. Why do we need to add -w? How many SNPs did you find? Write them to a file and copy it to Windows.
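As a rough illustration of a pattern file, the sketch below uses two toy files with invented IDs. The option -f tells grep to read one pattern per line from a file:

```shell
# Toy data file and toy pattern file (hypothetical IDs, not the course data).
workdir=$(mktemp -d)
printf 'rs100,1,10\nrs1001,1,20\nrs300,2,30\nrs400,3,40\n' > "$workdir/toy_data.csv"
printf 'rs100\nrs300\n' > "$workdir/toy_patterns.txt"

# -f reads the patterns from a file; -w keeps rs100 from also matching rs1001.
grep -w -f "$workdir/toy_patterns.txt" "$workdir/toy_data.csv" > "$workdir/found.csv"
found=$(wc -l < "$workdir/found.csv")
# For comparison: the same search without -w also picks up the longer ID.
found_loose=$(grep -c -f "$workdir/toy_patterns.txt" "$workdir/toy_data.csv")
echo "with -w: $found lines, without -w: $found_loose lines"
```

Note that an empty line in a pattern file matches every input line, so the pattern file should not contain blank lines.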

Data: teaching/ngs/data/unix/snp_lookup


Task 1

In the first exercise we align data with bwa mem:

  • Create a folder mapping under teaching/students/<q-number> and change to this folder.
  • Copy the files 4153_S13_L001_R1_001.fastq.gz and 4153_S13_L001_R2_001.fastq.gz from ~/teaching/ngs/data/fastq/exercises/miseq using cp <path_to_file> . (the dot at the end of the command means that the file is copied to the current directory).
  • Reference is available here: ~/teaching/ngs/data/ref/kiv2_6.fasta (No need to execute bwa index)
  • Update the following command from the Getting Started Guide: ./bwa mem ref.fa read1.fq read2.fq | gzip -3 > aln-pe.sam.gz
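The copy step in the list above relies on the dot as shorthand for the current directory. A small sketch with throwaway directories and a made-up file name:

```shell
# Two scratch directories and a toy file (all names are hypothetical).
src=$(mktemp -d)
dest=$(mktemp -d)
echo 'toy content' > "$src/toy_reads.txt"

# Change into the destination, then copy the file "here" using the dot.
cd "$dest"
cp "$src/toy_reads.txt" .
ls
```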

Task 2

Now, we convert the file to the BAM format.

  • Use samtools to convert and sort a SAM file to a BAM file. Ask Google or ChatGPT for help.
  • Create an index with samtools index <bam_file>. This will create an index file. Why is an index needed?

Task 3

Run samtools depth <aligned-file-sorted.bam> on the file and interpret the output. Learn about the -a parameter and add it to your command. Write the output to a file.
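To help with interpreting the output: samtools depth writes three tab-separated columns (reference name, position, depth), and positions with zero depth are only listed when -a is given. The sketch below works on a hand-written stand-in for such a file, so the numbers are invented and samtools itself is not called:

```shell
# Toy stand-in for depth output: reference name, 1-based position, depth.
workdir=$(mktemp -d)
printf 'ref1\t1\t10\nref1\t2\t20\nref1\t3\t0\n' > "$workdir/toy_depth.txt"

# Average the third column over all reported positions.
mean_depth=$(LC_ALL=C awk '{ sum += $3 } END { if (NR > 0) printf "%.1f", sum / NR }' "$workdir/toy_depth.txt")
# Count positions without any coverage.
zero_positions=$(awk '$3 == 0' "$workdir/toy_depth.txt" | wc -l)
echo "mean depth: $mean_depth, uncovered positions: $zero_positions"
```

Because uncovered positions are missing without -a, a mean computed from the default output is taken over covered positions only.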

Task 4

Download the file to Windows with WinSCP.

Task 5

Install "Tablet" (*.exe available in the Shared Drive) and load the BAM file via Open Assembly. You also need to specify the reference; you can find the KIV_2.fasta reference in the Shared Drive.

Variant Calling

Task 1 - Use the aligned file and call variants

Check out freebayes and call your variants. The aligned file (aligned.bam) is required as input. Write the output to a file ending in .vcf (freebayes > out.vcf).

You can also use a different variant caller.

Task 2 - Use a second variant caller

Similar to our "minimal variant calling experiment" you can also combine "bcftools mpileup" with "bcftools call".

bcftools mpileup -f <ref.fa> <input.bam> | bcftools call -m -v -Ov -o <out.vcf> -
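Whichever caller produced the file, a quick sanity check is to count the variant records. In a VCF, header lines start with #, and every other line describes one variant. The sketch below uses a tiny hand-written VCF with invented positions rather than real caller output:

```shell
# Minimal toy VCF: two header lines followed by two variant records (made up).
workdir=$(mktemp -d)
{
  printf '##fileformat=VCFv4.2\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf 'ref1\t100\t.\tA\tG\t50\t.\t.\n'
  printf 'ref1\t250\t.\tC\tT\t99\t.\t.\n'
} > "$workdir/toy.vcf"

# -v inverts the match (drop header lines), -c counts the remaining lines.
n_variants=$(grep -vc '^#' "$workdir/toy.vcf")
echo "variant records: $n_variants"
```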

Task 3 - Learn bcftools

BCFtools is a set of utilities for variant calling and for manipulating VCF and BCF files. Try to learn the bcftools convert command and extract a region from the VCF file.

Task 4 - Learn three commands

Again, go to the bcftools website and learn about three bcftools commands from the "list of commands".


Task 1

Execute the pipeline with a test profile.

git clone
cd ngs-class/nf-preprocess
export NXF_SINGULARITY_CACHEDIR=/mnt/genepi-lehre/teaching/ngs/singualarity/
nextflow run -profile singularity,test

Task 2

Write a config file for your project (e.g. projectXY.config). The file looks similar to this one, but you need to adapt the paths to your data.

params {
    input         = "test-data/*.fastq"
    output        = "fastp-test"
}

Execute it as follows:

export NXF_SINGULARITY_CACHEDIR=/mnt/genepi-lehre/teaching/ngs/singualarity/
nextflow run -c projectXY.config -profile singularity

Task 3

Go to and find a pipeline which could be useful in your field. Try to understand how to execute it!