
AlignCov

AlignCov is a Python package that extracts a) alignment summary statistics and b) per-position read depths from sorted BAM files and writes them as tidy tab-separated tables.

Introduction

AlignCov takes a sorted BAM file as input and uses SAMtools and pandas to generate two tables:

  • _stats.tsv: A table of alignment summary statistics, including fold-coverages (fold_cov) and proportions of target lengths covered by mapped reads (prop_cov).
    • target: Name of the target.
    • seqlen: Length of the target sequence (bp).
    • depth: Total number of base pairs mapped to the target.
    • len_cov: Total number of base pairs within the target that are covered by at least one mapped read.
    • prop_cov: Proportion of the target length covered by at least one mapped read (len_cov / seqlen).
    • fold_cov: Fold-coverage of the target by mapped reads, i.e. the average number of times each base of the target is covered by mapped reads (depth / seqlen).
  • _depth.tsv: A table of read depths for each bp position of each target.
    • target: Name of the target.
    • position: Base pair position within the target.
    • depth: Total number of reads aligned to the base pair position within the target.
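The relationship between the two tables can be sketched in a few lines of plain Python. This is an illustration of the column definitions above, not AlignCov's own implementation (which relies on SAMtools and pandas). The function name `summarize_depth`, the toy targets `geneA` and `geneB`, and the assumption that the `depth` column of the stats table equals the sum of per-position depths are all hypothetical.

```python
from collections import defaultdict


def summarize_depth(depth_rows, seqlens):
    """Derive stats-table columns from per-position depth records.

    depth_rows: iterable of (target, position, depth) tuples.
    seqlens: dict mapping each target name to its length in bp.
    """
    total = defaultdict(int)    # summed depth per target
    covered = defaultdict(int)  # positions with depth >= 1 per target
    for target, _position, depth in depth_rows:
        total[target] += depth
        if depth > 0:
            covered[target] += 1

    stats = []
    for target, seqlen in seqlens.items():
        stats.append({
            "target": target,
            "seqlen": seqlen,
            "depth": total[target],
            "len_cov": covered[target],
            "prop_cov": covered[target] / seqlen,  # len_cov / seqlen
            "fold_cov": total[target] / seqlen,    # depth / seqlen
        })
    return stats


# Toy data (hypothetical): geneA is 4 bp long, geneB is 2 bp long.
depth_rows = [
    ("geneA", 1, 2), ("geneA", 2, 0), ("geneA", 3, 3), ("geneA", 4, 1),
    ("geneB", 1, 0), ("geneB", 2, 0),
]
stats = summarize_depth(depth_rows, {"geneA": 4, "geneB": 2})
for row in stats:
    print(row)
```

For `geneA`, three of four positions are covered (prop_cov 0.75) and six bases are mapped in total (fold_cov 1.5), which shows that a target can have a fold-coverage above 1 without being completely covered.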

Dependencies

  • samtools>=1.15
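When installing with Pip, it may help to confirm that a suitable SAMtools is on the PATH. The following is a minimal sketch, assuming that `samtools --version` prints a first line of the form `samtools 1.15.1`; the helper names are hypothetical and not part of AlignCov.

```python
import re
import shutil
import subprocess

MINIMUM = (1, 15)  # samtools>=1.15, as listed above


def parse_samtools_version(text):
    """Extract (major, minor) from the output of 'samtools --version'."""
    match = re.search(r"samtools (\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_samtools_version():
    """Return the installed (major, minor) version, or None if unavailable."""
    if shutil.which("samtools") is None:
        return None
    result = subprocess.run(
        ["samtools", "--version"], capture_output=True, text=True, check=False
    )
    return parse_samtools_version(result.stdout)


def meets_minimum(version, minimum=MINIMUM):
    """True if a version tuple is present and at least the minimum."""
    return version is not None and version >= minimum


if __name__ == "__main__":
    version = installed_samtools_version()
    print("samtools version:", version, "ok:", meets_minimum(version))
```

Versions are compared as integer tuples rather than strings, so that 1.9 is correctly treated as older than 1.15.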

Installation

Conda

The recommended way to install AlignCov is with the Conda package manager. After adding the Bioconda channel to your Conda installation, install AlignCov into a new Conda environment named aligncov with the following command:

conda create -n aligncov aligncov

The aligncov Conda environment can then be activated with conda activate aligncov.

Pip

Alternatively, AlignCov can be installed into a Python environment using Pip with the following command:

pip install aligncov

Usage

Quick start

For a sorted BAM file named 'bacillus.bam', compute alignment statistics and read depths, and save results to files named 'subtilis_stats.tsv' and 'subtilis_depth.tsv':

$ aligncov -i bacillus.bam -o subtilis
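The resulting stats table can then be read with any TSV reader. The sketch below uses Python's standard `csv` module and assumes that the file begins with a header row carrying the column names described in the Introduction; that layout, the helper names, the 0.9 threshold, and the toy rows are assumptions for illustration only.

```python
import csv
import os
import tempfile

NUMERIC_COLUMNS = ("seqlen", "depth", "len_cov", "prop_cov", "fold_cov")


def load_stats(path):
    """Read a '_stats.tsv' table into a list of dicts with numeric values parsed."""
    with open(path, newline="") as handle:
        rows = list(csv.DictReader(handle, delimiter="\t"))
    for row in rows:
        for column in NUMERIC_COLUMNS:
            row[column] = float(row[column])
    return rows


def well_covered(rows, min_prop_cov=0.9):
    """Return names of targets whose prop_cov is at least min_prop_cov."""
    return [row["target"] for row in rows if row["prop_cov"] >= min_prop_cov]


# Toy demonstration with a hypothetical stats file written to a temp directory.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "subtilis_stats.tsv")
    with open(path, "w", newline="") as handle:
        handle.write("target\tseqlen\tdepth\tlen_cov\tprop_cov\tfold_cov\n")
        handle.write("geneA\t100\t950\t95\t0.95\t9.5\n")
        handle.write("geneB\t200\t40\t20\t0.1\t0.2\n")
    kept = well_covered(load_stats(path))

print(kept)
```

With the toy rows above, only `geneA` passes the threshold.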

More options

To show the program's help message:

$ aligncov -h
usage: aligncov [-h] -i INPUT [-o OUTPUT]

Parse a sorted BAM file to generate two tables: a table of alignment summary statistics ('_stats.tsv'), including fold-coverages (fold_cov) and proportions of target lengths covered by mapped reads (prop_cov), and a table of read
depths ('_depth.tsv') for each bp position of each target.

options:
  -h, --help            show this help message and exit

Required:
  -i INPUT, --input INPUT
                        Path to sorted BAM file to process.

Optional:
  -o OUTPUT, --output OUTPUT
                        Path and base name of files to save as tab-separated tables ('[output]_stats.tsv', '[output]_depth.tsv'). Default: 'sample'
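Because `-o` takes a path plus base name, several BAM files can be processed in a loop with one output prefix per sample. The following is a minimal sketch that calls the `aligncov` command through `subprocess`, using only the `-i` and `-o` options shown in the help message; the function names and directory layout are hypothetical.

```python
import subprocess
from pathlib import Path


def build_command(bam_path, out_dir):
    """Build the aligncov command for one BAM file.

    The output prefix is the BAM file name without its extension,
    placed inside out_dir (e.g. results/bacillus).
    """
    bam_path = Path(bam_path)
    prefix = Path(out_dir) / bam_path.stem
    return ["aligncov", "-i", str(bam_path), "-o", str(prefix)]


def run_batch(bam_dir, out_dir):
    """Run aligncov on every '*.bam' file in bam_dir (assumed sorted BAMs)."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for bam_path in sorted(Path(bam_dir).glob("*.bam")):
        # check=True stops the batch at the first failing sample.
        subprocess.run(build_command(bam_path, out_dir), check=True)
```

Calling `run_batch("bams", "results")` would, under these assumptions, write `results/<sample>_stats.tsv` and `results/<sample>_depth.tsv` for each input file.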

Credits

Packages

  • Pandas: McKinney W. 2011. Pandas: A foundation python library for data analysis and statistics. Python for High Performance and Scientific Computing 1–9.

Dependencies

  • SAMtools: Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, Whitwham A, Keane T, McCarthy SA, Davies RM, Li H. 2021. Twelve years of SAMtools and BCFtools. GigaScience 10(2) giab008. doi: 10.1093/gigascience/giab008

Project structure

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
