TiNDA

Tumor in Normal Detection Analysis

Overview

TiNDA is an R package that rescues somatic variants miscalled as germline because of tumor DNA contamination in the patient's blood/control sample.

TiNDA uses Canopy's EM-cluster function to partition the variants into clusters. It then classifies these clusters as somatic or germline based on the following assumptions:

The variant allele frequency (VAF) of somatic variants in the tumor sample will be higher than the VAF of the same variants in the contaminated control sample.
Contamination exceeding a certain threshold (max_control_af: 0.25) will be difficult to separate from the germline VAF.

An area of interest (AOI) is defined in the control vs. tumor VAF 2D space. Clusters with a majority (min_clst_members: 0.85) of their members within this AOI are defined as 'somatic rescue'.
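The cluster-level rule can be sketched in base R as follows. This is a minimal illustration, not TiNDA's implementation: the `in_aoi` test is a simplified stand-in for the real AOI polygon (it encodes only the two assumptions above and ignores the CHiP region), and the function name `classify_clusters`, the `cluster` column, and the 'germline' label for non-rescued clusters are hypothetical.

```r
# Hypothetical sketch: label each cluster by the fraction of its members
# that fall inside a simplified AOI (NOT the polygon TiNDA actually uses).
classify_clusters <- function(df, max_control_af = 0.25, min_clst_members = 0.85) {
  control_vaf <- df$Control_ALT_DP / df$Control_DP
  tumor_vaf <- df$Tumor_ALT_DP / df$Tumor_DP
  # Simplified AOI: low control VAF, and tumor VAF above control VAF
  in_aoi <- control_vaf < max_control_af & tumor_vaf > control_vaf
  # Fraction of members inside the AOI, computed per cluster
  frac_in_aoi <- tapply(in_aoi, df$cluster, mean)
  ifelse(frac_in_aoi >= min_clst_members, "somatic rescue", "germline")
}

# Toy data: cluster 1 resembles contaminated somatic variants,
# cluster 2 resembles heterozygous germline variants
toy <- data.frame(
  Control_ALT_DP = c(2, 3, 1, 20, 25),
  Control_DP     = c(40, 50, 45, 40, 50),
  Tumor_ALT_DP   = c(15, 20, 18, 21, 24),
  Tumor_DP       = c(40, 50, 45, 42, 48),
  cluster        = c(1, 1, 1, 2, 2)
)
labels <- classify_clusters(toy)
print(labels)
```

Working at the cluster level rather than per variant means a few stray points outside the AOI do not change the label, as long as the cluster stays above the min_clst_members fraction.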

Area of Interest

In the tumor VAF vs. control VAF space, the AOIs for somatic and CHiP variants are defined in the following image. The "golden" polygon defines the somatic region and the "red" polygon defines the CHiP region; the remaining area corresponds to germline variants.

Key Features

Rescuing Misclassified Variants: TiNDA rescues somatic variants that are misclassified due to tumor-in-normal contamination.
Detecting CHiP Clusters: TiNDA identifies CHiP clusters by distinguishing germline variants from genuine somatic mutations in blood.
Visualization: TiNDA provides visualization tools to help users assess the quality of the clustering.

Installation

Install directly from GitHub:

devtools::install_github("nagacombio/tinda")

Usage

Workflow

The TiNDA input consists of read counts for rare and private variants, including both germline and somatic variants. These variants should be identified through the joint analysis of tumor and control samples, and they must be filtered to remove common SNPs and technical artifacts. If the dataset is still too large, consider using only exonic variants to expedite clustering and plotting.

An ideal workflow with TiNDA:

Input data format

The input data for TiNDA is a data frame with the following columns:

CHR - Chromosome name
POS - Variant position
Control_ALT_DP - Read depth of the variant's alternate allele in the control sample
Control_DP - Total read depth of the variant in the control sample
Tumor_ALT_DP - Read depth of the variant's alternate allele in the tumor sample
Tumor_DP - Total read depth of the variant in the tumor sample

Note: The input table must use exactly these column names.

An example table:

CHR	POS	Control_ALT_DP	Control_DP	Tumor_ALT_DP	Tumor_DP
1	1039001	20	40	23	46
1	2123023	12	32	14	23
1	3343543	23	56	34	67
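A table like this can be built and sanity-checked in base R before it is passed to TiNDA. The sketch below is only an illustration: the helper `check_tinda_input` is hypothetical (not part of the package), its checks reflect only the column descriptions above, and storing CHR as a character column is an assumption.

```r
# Required column names, as listed above
required_cols <- c("CHR", "POS", "Control_ALT_DP", "Control_DP",
                   "Tumor_ALT_DP", "Tumor_DP")

# Hypothetical helper: stop on missing columns or inconsistent read depths
check_tinda_input <- function(df) {
  missing_cols <- setdiff(required_cols, colnames(df))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  # The alternate-allele depth cannot exceed the total depth
  stopifnot(all(df$Control_ALT_DP <= df$Control_DP),
            all(df$Tumor_ALT_DP <= df$Tumor_DP))
  invisible(TRUE)
}

# The example table, with CHR stored as character (an assumption)
input_df <- data.frame(
  CHR            = c("1", "1", "1"),
  POS            = c(1039001, 2123023, 3343543),
  Control_ALT_DP = c(20, 12, 23),
  Control_DP     = c(40, 32, 56),
  Tumor_ALT_DP   = c(23, 14, 34),
  Tumor_DP       = c(46, 23, 67)
)
check_tinda_input(input_df)

# VAFs derived from the read counts
control_vaf <- input_df$Control_ALT_DP / input_df$Control_DP
tumor_vaf <- input_df$Tumor_ALT_DP / input_df$Tumor_DP
```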

Example TiNDA analysis

# Generate data to test the package
library(TiNDA)
data(hg19_length)
test_df <- generate_test_data(hg19_length, num_variants = 500)

Run the TiNDA function

# Check the documentation for the parameters
tinda_object <- TiNDA(test_df)

Plotting the results

# Plot the results of the canopy cluster analysis
canopy_clst_plot(tinda_object)

# Plot the TiNDA cluster assignment
tinda_clst_plot(tinda_object)

# Plot the linear plot of the TiNDA results
tinda_linear_plot(tinda_object)

# Plot the summary of the TiNDA results - includes canopy clusters, TiNDA cluster assignment and linear plots
tinda_summary_plot(tinda_object)
