Add feature dispersion script #201

Closed
rbeiko opened this issue Jul 5, 2024 · 0 comments · Fixed by #203
rbeiko commented Jul 5, 2024

The purpose of Feature_Dispersion.py is to compute how sparsely features are distributed across the phylogenetic tree. For example, a feature that is restricted to one part of the tree but is present in most or all of the genomes in that part will have a low dispersion value, whereas a feature that is present only in two very distantly related genomes will have a high dispersion value. This captures the idea that transferred genes should have a patchy distribution on a phylogenetic tree.
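To make the intuition concrete, here is a minimal sketch of one statistic that behaves this way: the mean pairwise patristic distance among the genomes that carry a feature. This is an assumption for illustration only and is not necessarily the statistic that Feature_Dispersion.py computes. The toy tree, the tip names, and the function names (`path_to_root`, `patristic`, `toy_dispersion`) are all hypothetical.

```python
from itertools import combinations

# Toy rooted tree stored as child -> (parent, branch_length).
# Tips A and B form one clade, C and D another; all names are hypothetical.
TREE = {
    "A": ("n1", 0.1),
    "B": ("n1", 0.1),
    "C": ("n2", 0.1),
    "D": ("n2", 0.1),
    "n1": ("root", 1.0),
    "n2": ("root", 1.0),
}


def path_to_root(tree, node):
    """Return {ancestor: distance from node} for the node and all its ancestors."""
    dist = 0.0
    out = {node: 0.0}
    while node in tree:
        parent, length = tree[node]
        dist += length
        out[parent] = dist
        node = parent
    return out


def patristic(tree, a, b):
    """Path length between two tips through their most recent common ancestor."""
    up_a = path_to_root(tree, a)
    up_b = path_to_root(tree, b)
    # With non-negative branch lengths, the shared ancestor that minimises
    # the summed distance is the most recent common ancestor.
    return min(up_a[n] + up_b[n] for n in up_a if n in up_b)


def toy_dispersion(tree, carriers):
    """Mean pairwise patristic distance among genomes carrying the feature.

    Illustrative stand-in only, not the statistic from the actual script.
    """
    pairs = list(combinations(sorted(carriers), 2))
    if not pairs:
        return 0.0
    return sum(patristic(tree, a, b) for a, b in pairs) / len(pairs)


clustered = toy_dispersion(TREE, {"A", "B"})  # both carriers in one clade
scattered = toy_dispersion(TREE, {"A", "C"})  # carriers in distant clades
print(clustered, scattered)
```

In this toy case the feature confined to a single clade scores lower than the one split across two distant clades, which is the qualitative behaviour described above.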

The Python script requires:

  • The reference tree (core_gene_alignment.tre) that is produced in the phylogenomics workflow
  • The feature profile that is output by the annotation workflow

The script calculates the dispersion statistic along with some additional information and writes these to a .tsv file. It also produces a heatmap showing the relationship between the dispersion statistic and the count of genomes.

Optionally:

  • The original samplesheet

If the samplesheet is provided, the user can specify which of its columns should be used to calculate additional dispersion statistics. These statistics are included in the .tsv, but no corresponding heatmap is produced for them.

FeatureDispersion.tar.gz

The script should be run immediately after the phylogenomics subworkflow completes, provided that a feature profile is accessible in the annotation directory (and, if one was specified, that the samplesheet is accessible too).
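The run condition above could be sketched as a small pre-flight check. This is only an illustration of the logic, not how the pipeline wires the step in. The function name `can_run_dispersion` and the file name `feature_profile.tsv` are hypothetical; only `core_gene_alignment.tre` comes from the description.

```python
import tempfile
from pathlib import Path
from typing import Optional


def can_run_dispersion(
    tree: Path,
    feature_profile: Path,
    samplesheet: Optional[Path] = None,
) -> bool:
    """Return True when every input the dispersion step needs is present.

    The tree and the feature profile are required; the samplesheet is
    checked only if the user specified one.
    """
    if not tree.is_file() or not feature_profile.is_file():
        return False
    return samplesheet is None or samplesheet.is_file()


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    tree = Path(tmp) / "core_gene_alignment.tre"
    profile = Path(tmp) / "feature_profile.tsv"  # hypothetical file name
    tree.write_text("(A:0.1,B:0.1);\n")

    before = can_run_dispersion(tree, profile)  # feature profile missing
    profile.write_text("placeholder\n")
    after = can_run_dispersion(tree, profile)  # both required inputs present

    missing_sheet = Path(tmp) / "samplesheet.csv"  # specified but absent
    with_missing_sheet = can_run_dispersion(tree, profile, missing_sheet)

print(before, after, with_missing_sheet)
```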

@jvfe jvfe added the enhancement New feature or request label Jul 11, 2024
@jvfe jvfe self-assigned this Jul 22, 2024
@jvfe jvfe moved this to Todo in ARETE Planning Jul 22, 2024
@jvfe jvfe moved this from Todo to In Progress in ARETE Planning Jul 22, 2024
@jvfe jvfe closed this as completed in #203 Jul 23, 2024
@github-project-automation github-project-automation bot moved this from In Progress to Done in ARETE Planning Jul 23, 2024