Request for additional attributes for Virus Data Report #310

Closed
joverlee521 opened this issue Feb 2, 2024 · 3 comments

@joverlee521

Hi NCBI datasets team,

Is there an official way to request additional metadata attributes to include in the Virus Data Report?
If possible, I'd like to request the report to include the strain attribute.


Currently we are trying to use the NCBI Datasets CLI to download measles genomes via

 datasets download virus genome taxon 11234 --no-progressbar --filename data/ncbi_dataset.zip

The downloaded Virus Data Report does not include the strain attribute, which we can potentially use to parse out dates for records as @jameshadfield commented:

For strain names following the WHO schema, the "XX.YY" corresponds to

Date of rash onset if known, otherwise date of specimen collection by epidemiological week (1-53) and year.

So, e.g., Accession JX187583, which doesn't have a date in our workflow, has sample name MVs/Parma.ITA/47.08 in nuccore so it'd have date of 2008-11-XX | 2008-12-XX
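
As a rough illustration of the idea above, here is a minimal Python sketch of how a week range could be pulled out of a WHO-style strain name once the strain attribute is available. It is not part of any NCBI or Nextstrain tooling. The function name `week_range_from_strain`, the regular expression, and the two-digit-year `pivot` are all hypothetical choices, and the sketch assumes ISO-8601 weeks (Monday start), which may not match the epidemiological week definition used by a given submitter.

```python
import re
from datetime import date, timedelta
from typing import Optional, Tuple

# Matches the "/WW.YY" segment of a WHO-style measles strain name,
# e.g. "MVs/Parma.ITA/47.08" -> week 47, two-digit year 08.
WEEK_YEAR = re.compile(r"/(\d{1,2})\.(\d{2})(?=$|[/\[\s])")


def week_range_from_strain(strain: str, pivot: int = 50) -> Optional[Tuple[date, date]]:
    """Return (first_day, last_day) of the week encoded in a strain name.

    Assumptions (not confirmed by the thread):
    - weeks are ISO-8601 weeks starting on Monday;
    - two-digit years below `pivot` are 20YY, all others are 19YY.
    Returns None when no valid week.year segment is found.
    """
    match = WEEK_YEAR.search(strain)
    if match is None:
        return None
    week, yy = int(match.group(1)), int(match.group(2))
    year = 2000 + yy if yy < pivot else 1900 + yy
    try:
        # date.fromisocalendar requires Python 3.8 or newer.
        start = date.fromisocalendar(year, week, 1)
    except ValueError:
        # Week number out of range for that year (e.g. 0 or 60).
        return None
    return start, start + timedelta(days=6)
```

Under these assumptions, `week_range_from_strain("MVs/Parma.ITA/47.08")` returns 2008-11-17 through 2008-11-23, so the record for JX187583 could be given an ambiguous date within November 2008. Names without a `WW.YY` segment return `None` and would need to be handled separately.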

@olearyna
Contributor

olearyna commented Feb 2, 2024

Hi joverlee521,

NCBI Datasets retrieves its data directly from NCBI Virus. I've forwarded your message to them. They intend to incorporate strain information from GenBank records into their data sometime this year. It will then be added to the NCBI Datasets virus data report.

Thanks

Nuala

@joverlee521
Author

@syntheticgio Has the strain attribute been added to the outputs?

I don't see any new columns in the output when running datasets download and then dataformat.

@olearyna
Contributor

Hi joverlee521,

The NCBI Datasets virus data report will automatically include the data from NCBI Virus once it becomes available, so no further action is required from our end. I also sent another message to the Virus team with your request.

Let me know if you have any other questions.

Nuala

Nuala A. O'Leary, PhD
Product Owner, NCBI Datasets
National Center for Biotechnology Information, NLM, NIH, DHHS
