Do we still need the global cases counts? #131

Open
joverlee521 opened this issue Mar 3, 2025 · 1 comment
Labels
bug Something isn't working

Comments

@joverlee521
Contributor

While checking #130, I noticed that there hasn't been a Slack notification about the global case counts TSV for a while. I then realized that the global case counts file at s3://nextstrain-data/files/workflows/forecasts-ncov/cases/global.tsv.gz has not been updated since 2024-08-17! The upstream OWID COVID-19 data has moved, so we need to update fetch-ncov-global-case-counts.
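As a rough illustration of how stale the file is, the gap between the last update (2024-08-17) and the day this issue was opened (2025-03-03) can be computed with a small helper. This is only a sketch: `age_in_days` is a hypothetical name, not something in the repository, and it assumes GNU `date` (the `-d` option), which differs from BSD/macOS `date`.

```shell
# Hypothetical helper: whole days between two YYYY-MM-DD dates, in UTC.
# Assumes GNU date (-d); BSD/macOS date needs different options.
age_in_days() {
    local start end
    start="$(date -u -d "$1" +%s)"
    end="$(date -u -d "$2" +%s)"
    echo $(( (end - start) / 86400 ))
}

# Last update of global.tsv.gz vs. the date this issue was opened.
age_in_days 2024-08-17 2025-03-03   # prints 198
```

The last-modified date itself could be read from the listing that `aws s3 ls` prints for the object. A check like this could flag a stalled upload without relying on someone noticing a missing Slack notification.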

However, the bigger question is whether we still need the case counts. From what I can see, they are only used by the renewal model, which we are no longer running.

Possible solutions

  1. Update fetch-ncov-global-case-counts.
  2. Remove the case-count-related scripts and workflows.
@joverlee521 joverlee521 added the bug Something isn't working label Mar 3, 2025
@joverlee521
Contributor Author

Updating fetch-ncov-global-case-counts should be pretty straightforward:

diff --git i/ingest/bin/fetch-ncov-global-case-counts w/ingest/bin/fetch-ncov-global-case-counts
index ef68fc9..35be204 100755
--- i/ingest/bin/fetch-ncov-global-case-counts
+++ w/ingest/bin/fetch-ncov-global-case-counts
@@ -2,13 +2,13 @@
 set -euo pipefail
 
 # Fetch CSV from Our World in Data
-curl https://covid.ourworldindata.org/data/owid-covid-data.csv \
+curl https://catalog.ourworldindata.org/garden/covid/latest/compact/compact.csv \
     --fail --silent --show-error --location \
     --header 'User-Agent: https://github.com/nextstrain/counts ([email protected])' |
-    # Only keep the date, location, and new_cases columns
-    csvtk cut -f location,date,new_cases |
-    # Rename new_cases to cases
-    csvtk rename -f new_cases -n cases |
+    # Only keep the date, country, and new_cases columns
+    csvtk cut -f country,date,new_cases |
+    # Rename new_cases to cases and country to location
+    csvtk rename -f new_cases,country -n cases,location |
     # Only keep rows that have more than 0 cases
     csvtk filter -f "cases>0" |
     # Remove decimals from case counts
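Since the breakage here came from the upstream file moving and renaming a column (`location` became `country`), one option is to check the header before processing so that a future schema change fails loudly. The sketch below is an assumption, not part of the script: `has_columns` is a hypothetical helper, the sample header is made up for illustration, and it assumes a simple unquoted, comma-separated header line.

```shell
# Hypothetical helper: succeed only if every named column appears in the
# CSV header line passed as the first argument.
# Assumes an unquoted, comma-separated header with no embedded commas.
has_columns() {
    local header="$1"
    shift
    local col
    for col in "$@"; do
        # Wrap both sides in commas so "date" cannot match "update_date".
        case ",${header}," in
            *",${col},"*) ;;
            *) echo "missing column: ${col}" >&2; return 1 ;;
        esac
    done
}

# Made-up sample header; the real file may contain many more columns.
header='country,date,total_cases,new_cases'
has_columns "$header" country date new_cases && echo "header OK"
```

Run against the old column name (`location`), the same check would print `missing column: location` to stderr and return non-zero, which is the failure mode this change had to work around.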
