Commit 5863e0a

Initial commit: adding latest HERVx pipeline and resources
1 parent 3ab14de commit 5863e0a

8 files changed: +189710 -0 lines changed

Dockerfile

+108
@@ -0,0 +1,108 @@
FROM ubuntu:18.04

MAINTAINER Skyler Kuhn <[email protected]>

RUN mkdir -p /data2
RUN mkdir -p /opt2

# Apt-get packages
# Install python (3.6), bowtie2=2.3.4.1-1 (ubuntu:18.04 default)
RUN apt-get update && apt-get -y upgrade
RUN DEBIAN_FRONTEND=noninteractive apt-get install --yes \
    build-essential \
    apt-utils \
    git-all \
    python3 \
    python3-pip \
    bowtie2 \
    wget

WORKDIR /opt2

# Build SAMtools 1.9: Telescope requires htslib=1.9 (must install SAMtools=1.9)
# SAMtools installation information: https://github.com/samtools/samtools/blob/develop/INSTALL
# HTSlib installation information: https://github.com/samtools/htslib/blob/1.9/INSTALL
# Apt-get remaining dependencies
RUN DEBIAN_FRONTEND=noninteractive apt-get install --yes \
    gcc \
    make \
    perl \
    bzip2 \
    zlibc \
    libssl-dev \
    libbz2-dev \
    zlib1g-dev \
    libncurses5-dev \
    libncursesw5-dev \
    libcurl4-gnutls-dev \
    liblzma-dev \
    locales \
    pigz && \
    apt-get clean && apt-get purge && \
    rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

# Build SAMtools 1.9
RUN wget https://github.com/samtools/samtools/releases/download/1.9/samtools-1.9.tar.bz2 && \
    tar -xjvf samtools-1.9.tar.bz2 && \
    rm samtools-1.9.tar.bz2 && \
    cd samtools-1.9 && \
    ./configure --prefix $(pwd) && \
    make

# Build HTSlib 1.9 (required by telescope)
RUN wget https://github.com/samtools/htslib/releases/download/1.9/htslib-1.9.tar.bz2 && \
    tar -vxjf htslib-1.9.tar.bz2 && \
    rm htslib-1.9.tar.bz2 && \
    cd htslib-1.9 && \
    ./configure --prefix $(pwd) && \
    make

# Add SAMtools and HTSlib to PATH
ENV PATH=${PATH}:/opt2/samtools-1.9
ENV PATH=${PATH}:/opt2/htslib-1.9
ENV HTSLIB_INCLUDE_DIR="/opt2/htslib-1.9"

# pip install: Cutadapt, Telescope dependencies, and then Telescope
RUN pip3 install --upgrade pip
RUN pip3 install cutadapt==2.10

# Python requirements from Telescope's GitHub page; some are needed to compile Telescope, so install them first
RUN pip3 install future pyyaml cython==0.29.7 numpy==1.16.3 scipy==1.2.1 pysam==0.15.2 intervaltree==3.0.2
RUN pip3 install git+https://github.com/mlbendall/telescope.git
73+
# Adpater sequences for cutadapt and HERV reference files
74+
RUN mkdir -p /opt2/refs
75+
COPY refs/trimmonatic_TruSeqv3_adapters.fa /opt2/refs
76+
COPY refs/HERV_rmsk.hg38.v2.genes.gtf /opt2/refs
77+
COPY refs/HERV_rmsk.hg38.v2.transcripts.gtf /opt2/refs
78+
COPY refs/L1Base.hg38.v1.transcripts.gtf /opt2/refs
79+
COPY refs/retro.hg38.v1.transcripts.gtf /opt2/refs
80+
81+
82+
# hg38 bowtie2 indices
83+
RUN mkdir -p /opt2/bowtie2/
84+
COPY refs/hg38.1.bt2 /opt2/bowtie2/
85+
COPY refs/hg38.2.bt2 /opt2/bowtie2/
86+
COPY refs/hg38.3.bt2 /opt2/bowtie2/
87+
COPY refs/hg38.4.bt2 /opt2/bowtie2/
88+
COPY refs/hg38.rev.1.bt2 /opt2/bowtie2/
89+
COPY refs/hg38.rev.2.bt2 /opt2/bowtie2/


# Set environment variable(s)
# Configure "locale", see https://github.com/rocker-org/rocker/issues/19
# Adding pigz for cutadapt
RUN echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen \
    && locale-gen en_US.utf8 \
    && /usr/sbin/update-locale LANG=en_US.UTF-8


# Add HERVx pipeline PATH
RUN mkdir -p /opt2/HERVx/
COPY src/HERVx /opt2/HERVx/
ENV PATH=${PATH}:/opt2/HERVx


# Copy the Dockerfile used to create image in /opt2
COPY Dockerfile /opt2
WORKDIR /data2

README.md

+70
@@ -0,0 +1,70 @@
# Telescope

Characterization of Human Endogenous Retrovirus (HERV) expression within the transcriptomic landscape using RNA-seq is complicated by uncertainty in fragment assignment because of sequence similarity. Telescope is a computational method that provides accurate estimation of transposable element expression (retrotranscriptome) resolved to specific genomic locations. Telescope directly addresses uncertainty in fragment assignment by reassigning ambiguously mapped fragments to the most probable source transcript as determined within a Bayesian statistical model.

Telescope can be installed from [GitHub](https://github.com/mlbendall/telescope). It can also be installed using Conda, but I did not go down that route. This repository contains the Dockerfile to build Telescope from scratch along with a few other tools.

The Dockerfile will install Cutadapt, bowtie2, SAMtools, HTSlib, and Telescope. Small reference files are located in `/opt2/refs/` in the container's filesystem.

Currently, the following files are located in `/opt2/refs/`:
- trimmonatic_TruSeqv3_adapters.fa
- HERV_rmsk.hg38.v2.genes.gtf
- HERV_rmsk.hg38.v2.transcripts.gtf
- L1Base.hg38.v1.transcripts.gtf
- retro.hg38.v1.transcripts.gtf


> **Please Note:** Bowtie2 indices for `hg38` are bundled in the container's filesystem in `/opt2/bowtie2/`. Other indices can be provided by mounting a host directory at this path, which overrides the bundled hg38 indices.
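
As a rough sketch of the override described in the note, the helper below checks that a host directory holds a complete six-file Bowtie2 index and prints the Singularity bind argument that would overlay `/opt2/bowtie2`. The function name `bowtie2_bind_arg` and the example path are hypothetical. The README does not say which index prefix HERVx expects at that location, so the prefix is left as a parameter.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of HERVx): print a Singularity bind argument
# that mounts a host index directory over the bundled /opt2/bowtie2.
# Usage: bowtie2_bind_arg <host_index_dir> <index_prefix>
bowtie2_bind_arg() {
    local dir="$1" prefix="$2" suffix
    # A standard (small) Bowtie2 index consists of these six files
    for suffix in 1.bt2 2.bt2 3.bt2 4.bt2 rev.1.bt2 rev.2.bt2; do
        if [ ! -f "${dir}/${prefix}.${suffix}" ]; then
            echo "missing index file: ${dir}/${prefix}.${suffix}" >&2
            return 1
        fi
    done
    # -B <src>:<dest> is Singularity's bind-mount syntax
    echo "-B ${dir}:/opt2/bowtie2"
}

# Example (placeholder path; assumes the path contains no spaces):
# singularity exec -B $PWD:$PWD $(bowtie2_bind_arg /data/indices/hg38 hg38) \
#     ccbr_telescope_latest.sif HERVx -h
```

Checking the files up front turns a confusing alignment failure inside the container into an early, explicit error on the host.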

### Build from Dockerfile

In the example below, replace `skchronicles` with your DockerHub username.

```bash
# See listing of images on computer
docker image ls

# Build
docker build --tag=ccbr_telescope:v0.0.1 .

# Updating tag(s) before pushing to DockerHub
docker tag ccbr_telescope:v0.0.1 skchronicles/ccbr_telescope:v0.0.1
docker tag ccbr_telescope:v0.0.1 skchronicles/ccbr_telescope         # latest
docker tag ccbr_telescope:v0.0.1 nciccbr/ccbr_telescope:v0.0.1
docker tag ccbr_telescope:v0.0.1 nciccbr/ccbr_telescope              # latest

# Check out new tag(s)
docker image ls

# Peek around the container: verify things run correctly
docker run -ti ccbr_telescope:v0.0.1 /bin/bash

# Push new tagged image to DockerHub
docker push skchronicles/ccbr_telescope:v0.0.1
docker push skchronicles/ccbr_telescope:latest
docker push nciccbr/ccbr_telescope:v0.0.1
docker push nciccbr/ccbr_telescope:latest
```
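
To make the username substitution less error-prone, the tag and push commands above can be generated from two parameters. This is only a minimal sketch: the function name `tag_and_push_commands` is hypothetical, and it prints the commands as a dry run rather than invoking Docker.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the docker tag/push commands for one DockerHub user.
# Usage: tag_and_push_commands <dockerhub_user> <version>
tag_and_push_commands() {
    local user="$1" version="$2" image="ccbr_telescope"
    # Tag the locally built image with a versioned and a latest tag
    echo "docker tag ${image}:${version} ${user}/${image}:${version}"
    echo "docker tag ${image}:${version} ${user}/${image}:latest"
    # Push both tags
    echo "docker push ${user}/${image}:${version}"
    echo "docker push ${user}/${image}:latest"
}

# Review the output first, then pipe it to bash to run it:
# tag_and_push_commands myuser v0.0.1
# tag_and_push_commands myuser v0.0.1 | bash
```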

### Run using Singularity
```bash
module load singularity
# Pull from DockerHub
SINGULARITY_CACHEDIR=$PWD singularity pull -F docker://nciccbr/ccbr_telescope
# Display usage and help information
singularity exec -B $PWD:$PWD ccbr_telescope_latest.sif HERVx -h
# Run HERVx pipeline
singularity exec -B $PWD:$PWD ccbr_telescope_latest.sif HERVx -r1 small_S25_1.fastq -r2 small_S25_2.fastq -o ERV_hg38
```
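
For more than one sample, the run command above could be generated per read pair. The sketch below assumes paired files follow the `<sample>_1.fastq` / `<sample>_2.fastq` naming used in the example. The function names and the per-sample output directory `<sample>_hg38` are assumptions for illustration, not HERVx conventions. It only prints commands and does not run Singularity.

```bash
#!/usr/bin/env bash
# Hypothetical helpers: derive the sample name and the R2 file from an R1 file
# named <sample>_1.fastq (naming assumed from the README example).
sample_name() { basename "$1" _1.fastq; }
mate_file() { echo "${1%_1.fastq}_2.fastq"; }

# Print one HERVx command per R1 file given as an argument (dry run).
hervx_commands() {
    local r1
    for r1 in "$@"; do
        # $PWD is kept literal so it expands when the printed command is run
        printf 'singularity exec -B "$PWD:$PWD" ccbr_telescope_latest.sif HERVx -r1 %s -r2 %s -o %s\n' \
            "$r1" "$(mate_file "$r1")" "$(sample_name "$r1")_hg38"
    done
}

# Example: list the commands for every R1 file in the current directory
# hervx_commands *_1.fastq
```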

### Build bowtie2 indices
```bash
# Get UCSC hg38 genome
wget http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz
zcat hg38.fa.gz > hg38.fa

# Build the indices
module load singularity
SINGULARITY_CACHEDIR=$PWD singularity pull -F docker://nciccbr/ccbr_telescope
singularity exec -B $PWD:$PWD ccbr_telescope_latest.sif bowtie2-build hg38.fa hg38
```
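
The Dockerfile in this commit copies the six `hg38.*.bt2` files from `refs/`, so the indices built above need to be in the build context before `docker build` is run. As a hedged sketch, the hypothetical function `missing_copy_sources` below lists `COPY` sources that are absent from the context. It only understands the simple `COPY <src> <dest>` form used in this Dockerfile, not flags or multiple sources.

```bash
#!/usr/bin/env bash
# Hypothetical pre-build check: print every COPY source named in a Dockerfile
# that does not exist in the build context directory.
# Usage: missing_copy_sources <dockerfile> <context_dir>
missing_copy_sources() {
    local dockerfile="$1" context="$2" keyword src rest
    # The "|| [ -n ... ]" clause also handles a last line without a trailing newline
    while read -r keyword src rest || [ -n "$keyword" ]; do
        if [ "$keyword" = "COPY" ] && [ ! -e "${context}/${src}" ]; then
            echo "$src"
        fi
    done < "$dockerfile"
}

# Example: run from the repository root before building
# missing_copy_sources Dockerfile .
```

An empty result means every copied file or directory is in place.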
