feat: Explain our QC, clades and alignment better #3592

Open
1 of 3 tasks
anna-parker opened this issue Jan 31, 2025 · 5 comments
Labels
high_priority Work on this as soon as possible (potentially post-MVP)

Comments

@anna-parker
Contributor

anna-parker commented Jan 31, 2025

We should at least link somewhere to the Nextclade dataset that we use. Even better, as @emmahodcroft suggested, would be adding an info page where we describe our annotation process in detail.

@anna-parker anna-parker added the high_priority Work on this as soon as possible (potentially post-MVP) label Jan 31, 2025
@emmahodcroft
Member

Do we have public Nextclade datasets available for each pathogen? I can't remember. I guess 'step 1' for any where we don't would be to actually put them on Nextclade.

@emmahodcroft
Member

emmahodcroft commented Jan 31, 2025

As another idea (though it may be too manual or too much hard work): at Nextstrain we have this figure for SC2, which I use all the time. It doesn't work ideally, but something like it for each pathogen, if we could get it automatically generated and updated, would probably be the most useful thing for a naive user:

https://github.com/nextstrain/ncov-clades-schema

@anna-parker
Contributor Author

anna-parker commented Jan 31, 2025

> Do we have public nextclade datasets available for each pathogen? I can't remember - I guess that would be a 'step 1' for any where we don't (actually put them on Nextclade)

They are all on Nextclade, but some are only on branches and not yet publicly merged into main.

Datasets to merge:

@emmahodcroft
Member

Slightly related to this, we may want to prioritize #3194:
pulling in more QC scores so that people have more metrics by which to judge data quality. This was requested specifically by Aine/Andrew in the mpox context, but probably applies more broadly.

(But I'm posting this mostly for visibility; we don't have to try to tackle all of this at once.)

@j23414

j23414 commented Feb 4, 2025

+1 for documenting and merging the WNV dataset.

In nextstrain/WNV, we've been using the Pathoplexus API to get the draft WNV global lineage calls:

but would love to have more documentation on the reference, QC scores, and other Nextclade-specific parameters for the WNV dataset.


3 participants