Add feature dispersion script #201

rbeiko · 2024-07-05T18:38:47Z

The purpose of Feature_Dispersion.py is to quantify how sparsely each feature is distributed across the phylogenetic tree. For example, a feature that is restricted to one part of the tree but is present in most or all of the genomes in that part will have a low dispersion value, whereas a feature that is present only in two very distantly related genomes will have a high dispersion value. This captures the idea that transferred genes should have a patchy distribution on a phylogenetic tree.
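To make the intuition concrete, here is a minimal sketch that scores a feature by the mean tree distance between the genomes carrying it. This is only one plausible way to express the idea and is not necessarily the statistic that Feature_Dispersion.py computes; the function name `mean_pairwise_distance` and the toy distances are assumptions made up for illustration.

```python
from itertools import combinations


def mean_pairwise_distance(carriers, dist):
    """Mean tree distance over all pairs of genomes that carry a feature.

    Illustrative stand-in for a dispersion score, not the script's own formula.
    """
    pairs = list(combinations(sorted(carriers), 2))
    if not pairs:
        return 0.0  # fewer than two carriers: there is no pair to measure
    return sum(dist[frozenset(pair)] for pair in pairs) / len(pairs)


# Made-up patristic distances for a four-genome tree shaped like ((A,B),(C,D)).
dist = {
    frozenset(("A", "B")): 0.1,
    frozenset(("C", "D")): 0.1,
    frozenset(("A", "C")): 1.0,
    frozenset(("A", "D")): 1.0,
    frozenset(("B", "C")): 1.0,
    frozenset(("B", "D")): 1.0,
}

# Feature present in both members of one clade: closely related carriers.
clade_restricted = mean_pairwise_distance({"A", "B"}, dist)

# Feature present in two distantly related genomes: patchy distribution.
patchy = mean_pairwise_distance({"A", "C"}, dist)
```

Under this toy measure the clade-restricted feature scores lower than the patchy one, which matches the behaviour described above.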

The Python script requires:

The reference tree (core_gene_alignment.tre) that is produced in the phylogenomics workflow
The feature profile that is output by the annotation workflow

The script calculates the dispersion statistic along with some additional per-feature information, and writes the results to a .tsv file. It also produces a heatmap showing the relationship between the dispersion statistic and the number of genomes in which a feature is present.
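As a rough illustration of the tabular output, the sketch below writes per-feature rows as tab-separated text using the standard library. The column names (`feature`, `genome_count`, `dispersion`) and the values are hypothetical; the actual columns in the script's .tsv may differ.

```python
import csv
import io

# Hypothetical column names; the real .tsv may use different ones.
FIELDS = ["feature", "genome_count", "dispersion"]


def write_dispersion_tsv(rows, handle):
    """Write one tab-separated line per feature, preceded by a header line."""
    writer = csv.DictWriter(
        handle, fieldnames=FIELDS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)


# Made-up example rows.
rows = [
    {"feature": "geneX", "genome_count": 2, "dispersion": 0.1},
    {"feature": "geneY", "genome_count": 2, "dispersion": 1.0},
]

buffer = io.StringIO()  # stands in for an open output file
write_dispersion_tsv(rows, buffer)
tsv_text = buffer.getvalue()
```

The same two columns, genome count and dispersion, are the axes the heatmap relates to each other.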

Optionally:

The original samplesheet

If the samplesheet is provided, the user can name one or more of its columns, and dispersion statistics will also be calculated based on those columns. These statistics are included in the .tsv, but no corresponding heatmap is produced.
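One way the samplesheet columns could be used is to partition genomes by a column's values before computing a statistic per group. The sketch below shows only that grouping step. It assumes a comma-separated samplesheet, and the column names `sample` and `country` are hypothetical; how Feature_Dispersion.py actually uses the selected columns is not specified here.

```python
import csv
import io
from collections import defaultdict


def group_samples(samplesheet_handle, id_column, group_column):
    """Map each value of group_column to the set of sample IDs that have it."""
    groups = defaultdict(set)
    for row in csv.DictReader(samplesheet_handle):
        groups[row[group_column]].add(row[id_column])
    return dict(groups)


# Made-up samplesheet content; stands in for an open samplesheet file.
samplesheet = io.StringIO("sample,country\nA,CA\nB,CA\nC,BR\n")
groups = group_samples(samplesheet, "sample", "country")

# Genomes carrying some feature, split by the chosen samplesheet column.
carriers = {"A", "C"}
carriers_by_group = {name: members & carriers for name, members in groups.items()}
```

A per-group statistic could then be computed from `carriers_by_group` in the same way as the tree-wide one.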

FeatureDispersion.tar.gz

The script should be run immediately after the phylogenomics subworkflow completes, provided that a feature profile is accessible in the annotation directory (along with the samplesheet, if one was specified).

jvfe added the enhancement New feature or request label Jul 11, 2024

jvfe self-assigned this Jul 22, 2024

jvfe added this to ARETE Planning Jul 22, 2024

jvfe moved this to Todo in ARETE Planning Jul 22, 2024

jvfe moved this from Todo to In Progress in ARETE Planning Jul 22, 2024

jvfe mentioned this issue Jul 22, 2024

feat: Add feature dispersion script #203

Merged

jvfe closed this as completed in #203 Jul 23, 2024

github-project-automation bot moved this from In Progress to Done in ARETE Planning Jul 23, 2024
