feat: use GFF3 genemap instead of genbank reference so that genenames are OPG #69
Note: this requires forthcoming PRs in augur and auspice
Previously, we would backfill `rule_traversal` with wildcards ("*") if we could not find a matching rule for a particular `raw_geolocation`. This would NOT work for cases where there are partial rule matches AND wildcard rules that match. I realized the flaw in this logic while responding to @victorlin's post-merge review: #41 (comment). This commit updates the logic for when there are no matching rules. The `rule_traversal` is reset to end at the last index that is currently not a wildcard, and the value at that index is then changed to a wildcard. This allows the recursive function to try different iterations of `rule_traversal` with different combinations of raw values and wildcards.
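To illustrate the backtracking described in this commit message, here is a minimal sketch, not the actual `apply-geolocation-rules` code. It assumes the rules are stored as a nested dict keyed by region/country/division/location with a tuple at the leaf; the function name `find_annotation` and the example rule values are hypothetical.

```python
WILDCARD = "*"

def find_annotation(rules, raw_geolocation, rule_traversal=None):
    """Return the annotated tuple for raw_geolocation, or None if no rule applies."""
    if rule_traversal is None:
        rule_traversal = []

    # Walk the nested rules along the current traversal path.
    current = rules
    for key in rule_traversal:
        current = current.get(key)
        if current is None:
            break

    # Reached a leaf: this is the annotated geolocation.
    if isinstance(current, tuple):
        return current

    # Reached another level of rules: try the next raw value first.
    if isinstance(current, dict):
        next_value = raw_geolocation[len(rule_traversal)]
        return find_annotation(rules, raw_geolocation, rule_traversal + [next_value])

    # No rule matched. If every position is already a wildcard, give up.
    if all(value == WILDCARD for value in rule_traversal):
        return None

    # Drop trailing wildcards so the path ends at the last non-wildcard value,
    # e.g. [A, B, *, *] => [A, B], then turn that value into a wildcard.
    while rule_traversal[-1] == WILDCARD:
        rule_traversal = rule_traversal[:-1]
    rule_traversal = rule_traversal[:-1] + [WILDCARD]
    return find_annotation(rules, raw_geolocation, rule_traversal)


# Hypothetical rules: one partial match ("r" -> "c" -> "d") and one wildcard rule.
rules = {
    "r": {
        "c": {"d": {"known-loc": ("R", "C", "D", "Known")}},
        "*": {"d": {"*": ("R", "", "D", "")}},
    }
}
print(find_annotation(rules, ["r", "c", "d", "other-loc"]))
```

In this example the specific branch matches the first three fields and then fails on the location, so the search backs up to the country level and finds the wildcard rule. Backfilling the rest of the path with wildcards would have missed it.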
This is necessary for the temporal colour scale introduced in the preceding commit. For searchability, I'll paste the error message that you would get when running with <16.0.0 (which shouldn't be reachable, as the snakemake pipeline should exit up front): ERROR: 'temporal' is not one of ['continuous', 'ordinal', 'categorical', 'boolean'] Validation of config/auspice_config_mpxv.json failed.
The previous strategy of assigning colors used the entire metadata.tsv to assign country colors, whereas the hmpxv1 build target has a subset of 18 of these 30 countries and the mpxv build target has a subset of 26 of these 30 countries. Consequently, we weren't using the color spectrum as efficiently as we could, especially for the hmpxv1 target. This commit updates augur filter to output filtered metadata for both hmpxv1 and mpxv and uses this filtered metadata to assign colors. Additionally, this commit updates the country order in update_colours.py to start with Africa rather than Asia. For ncov, I had started with Asia as this was the basal region. For monkeypox this is Africa.
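The benefit of assigning colors from the filtered metadata can be sketched as follows. This is an illustration only, not the logic in update_colours.py: the function `assign_colors`, the palette, and the country placeholders are all hypothetical, and the palette is assumed to hold at least as many colors as there are countries present.

```python
def assign_colors(ordered_countries, present, palette):
    """Spread the countries that are actually present evenly across the palette."""
    kept = [c for c in ordered_countries if c in present]
    if not kept:
        return {}
    if len(kept) == 1:
        return {kept[0]: palette[0]}
    # Evenly spaced palette indices from the first color to the last.
    step = (len(palette) - 1) / (len(kept) - 1)
    return {c: palette[round(i * step)] for i, c in enumerate(kept)}


palette = ["#1", "#2", "#3", "#4", "#5"]           # placeholder color ramp
ordered = ["country_a", "country_b", "country_c", "country_d"]
present = {"country_a", "country_c", "country_d"}  # countries left after filtering
colors = assign_colors(ordered, present, palette)
print(colors)
```

Because only the three countries present in the filtered set are considered, they span the whole ramp instead of sharing it with a country that never appears in the build.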
Rather than replacing "accession" with "strain" in the initial wrangle_metadata.py script, keep the "accession" column and create an additional "strain" column. This will allow "augur export" to recognize the "accession" column and include a properly linked field in tooltips. Remove "genbank_accession_rev" as the "Accession" coloring. Now "Accession" will be automatically recognized.
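A minimal sketch of the column handling described here, assuming metadata records are plain dicts and that "strain" is simply a copy of "accession". The helper `add_strain_column` is hypothetical and is not taken from wrangle_metadata.py.

```python
def add_strain_column(records):
    """Keep "accession" and add a "strain" column holding the same value."""
    return [{**record, "strain": record["accession"]} for record in records]


records = [{"accession": "ACC0001", "country": "country_a"}]  # placeholder record
wrangled = add_strain_column(records)
print(wrangled)
```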
Provision colors separately for mpxv and hmpxv1 build targets
Include "accession" in metadata
…eolocation-rules Fix typos in `apply-geolocation-rules`
Co-authored-by: Victor Lin <[email protected]>
ingest/apply-geolocation-rules: update general rules logic
mpxv temporal colouring scale
feat: rename nextalign -> nextalign2 to work with base image
Tested using nextstrain/augur#976 and all is working 👍
This fixes the issue with having a number of gray "outbreak_associated" tips. We may want to drop the coloring at some point, but I was still finding it valuable when putting together presentations to easily highlight the outbreak clade B.1
Note one difference. Current builds have "gene" names such as |
WIP: add nextclade annotations to metadata
…om ingest This way we don't get clade_x and clade_y when merging nextclade in ingest
…eak-association-column ingest(fix/feat): remove old clade and outbreak association column from ingest
Indeed @jameshadfield that's on purpose, too - the UK gene names were not good, that was just arbitrary, whereas OPG is the standard suggested by NCBI recently.
feat(ingest): reverse reverse-complemented sequences
feat: reverse complement sequences if annotated in metadata
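For readers unfamiliar with the operation named in these commits, reverse complementing a nucleotide sequence can be sketched as below. This is an assumption-laden illustration, not the ingest code: the helper names and the boolean `is_reverse` flag standing in for the metadata annotation are hypothetical.

```python
# Complement table for unambiguous bases; other characters (e.g. N, -) pass through.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(sequence):
    """Complement each base, then reverse the whole sequence."""
    return sequence.translate(COMPLEMENT)[::-1]

def orient(sequence, is_reverse):
    """Reverse complement only when the (hypothetical) metadata flag says so."""
    return reverse_complement(sequence) if is_reverse else sequence


print(orient("ATGC", True))
print(orient("ATGC", False))
```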
Needs `augur translate` to be patched first before merging. @rneher