ST_profiles from pubMLST not updating #197

Open
eliottBo opened this issue Mar 10, 2025 · 2 comments
eliottBo commented Mar 10, 2025

Describe the bug
Whenever the microSALT pipeline starts, it fetches ST_profiles from PubMLST if new profiles are available. On hasta, the most recent profile for S. aureus is ST 9541 (checked 2025-03-10), whereas the latest profile in PubMLST on that date is ST 9704. Note that ST 9541 was added on 2024-12-23, which suggests that updates have not worked since then.
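
A minimal sketch of the comparison described above, assuming the profiles table is tab-delimited with an `ST` column header. The function name `latest_st` and the toy tables are hypothetical and only illustrate the check; this is not microSALT code.

```python
import csv
import io


def latest_st(profiles_text: str) -> int:
    """Return the highest numeric ST in a tab-delimited profiles table.

    Assumes a header row that contains an 'ST' column (an assumption
    made for this sketch).
    """
    reader = csv.DictReader(io.StringIO(profiles_text), delimiter="\t")
    # Skip rows whose ST is missing or not a plain integer.
    sts = [int(row["ST"]) for row in reader if (row.get("ST") or "").isdigit()]
    if not sts:
        raise ValueError("no numeric ST values found")
    return max(sts)


# Toy data only; real profile tables have one column per locus.
local = "ST\tarcC\taroE\n9540\t1\t2\n9541\t3\t4\n"
remote = "ST\tarcC\taroE\n9541\t3\t4\n9704\t5\t6\n"

# True means the local copy is behind the remote one.
print(latest_st(local) < latest_st(remote))
```

Comparing only the highest ST is a coarse staleness check, but it is enough to spot the gap reported here.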

This issue seems to affect all organisms.

Related update from PubMLST from 2025-01-02:

Change of data access policy - Please note that registration is now necessary to access allele, profile, and isolate data added after 31 December 2024. This includes access via the application programming interface (API) and may affect the results of your queries as a non-authenticated user.

@ahdamin worked on microSALT to fix the request issue related to this PubMLST update.

Investigation with KN:
The problem seems to be that when we access the data on PubMLST, we are restricted to data submitted on or before 2024-12-31.

{'records': 9666, 'message': 'Please note that you are currently restricted to accessing data that was submitted on or prior to 2024-12-31. Please authenticate to access the full dataset.', 'primary_key_field': 'https://rest.pubmlst.org/db/pubmlst_saureus_seqdef/schemes/1/fields/ST', 'profiles_csv':
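
A small sketch of how a client could detect this situation from the parsed response, assuming the restriction is always signalled through the `message` key as in the output above. The helper name `is_restricted` is hypothetical and not part of microSALT or the PubMLST API.

```python
def is_restricted(response: dict) -> bool:
    """Return True if the parsed API response carries a restriction notice.

    Assumes the notice appears under the 'message' key and contains the
    word 'restricted', as in the response quoted in this issue.
    """
    message = response.get("message") or ""
    return "restricted" in message.lower()


# Trimmed copy of the response shown above (other keys omitted).
response = {
    "records": 9666,
    "message": (
        "Please note that you are currently restricted to accessing data "
        "that was submitted on or prior to 2024-12-31. "
        "Please authenticate to access the full dataset."
    ),
}

if is_restricted(response):
    print("Unauthenticated access: profile list may be incomplete")
```

Failing loudly on this message, instead of silently accepting the truncated profile list, would have surfaced the problem as soon as the policy changed.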

eliottBo added the bug label Mar 10, 2025
ahdamin commented Mar 10, 2025

@eliottBo I tried to refresh the credentials. Would you please check if this makes any difference?

[15:22] [hiseq.clinical@hasta:~] [base]  $ up
[15:22] [hiseq.clinical@hasta:~] [P_base]  $ conda activate P_microSALT
[15:23] [hiseq.clinical@hasta:~] [P_microSALT]  $ cd /home/proj/production/bin/git/microSALT/
[15:23] [hiseq.clinical@hasta:/home/proj/production/bin/git/microSALT] [P_microSALT]  (master) $ python -m microSALT.utils.pubmlst.get_credentials
Please log in using your user account at https://pubmlst.org/bigsdb?db=pubmlst_test_seqdef&page=authorizeClient&oauth_token=<redacted> using a web browser to obtain a verification code.
Please enter verification code: <redacted>

Access Token: <redacted>
Access Token Secret: <redacted>
Tokens saved to /home/proj/production/microbial/credentials/pubmlst_credentials.env
[15:24] [hiseq.clinical@hasta:/home/proj/production/bin/git/microSALT] [P_microSALT]  (master) $

ahdamin commented Mar 10, 2025

Status update:

The PR should fix the authentication issue. However, we get the warning messages below for the isolates databases, since their responses don't include the last_updated field. By contrast, the same endpoint for the seqdef databases returns the last_updated field:

WARNING - No 'last_updated' field found for db: pubmlst_spyogenes_isolates, scheme_id: 1
WARNING - No 'last_updated' field found for db: pubmlst_spyogenes_isolates, scheme_id: 1
WARNING - Could not get version for staphylococcus_aureus, skipping version check
WARNING - Could not get version for staphylococcus_aureus, skipping version check

To check the expected responses, you can refer to the documentation

I tested pubmlst_spyogenes_seqdef and pubmlst_saureus_seqdef, and they seemed fine, while pubmlst_saureus_isolates and pubmlst_spyogenes_isolates triggered warning messages similar to the above.
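
One possible way to handle this, sketched under the assumption that each `_isolates` database has a `_seqdef` counterpart with the same prefix (true for the two organisms tested above, unverified for others). The helpers `seqdef_db_name` and `get_last_updated` are hypothetical names for illustration, not existing microSALT functions.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def seqdef_db_name(db: str) -> str:
    """Map an isolates database name to its assumed seqdef counterpart."""
    suffix = "_isolates"
    if db.endswith(suffix):
        return db[: -len(suffix)] + "_seqdef"
    return db


def get_last_updated(scheme_info: dict, db: str, scheme_id: int) -> Optional[str]:
    """Return the scheme's last_updated value, or None with a warning."""
    value = scheme_info.get("last_updated")
    if value is None:
        logger.warning(
            "No 'last_updated' field found for db: %s, scheme_id: %s",
            db,
            scheme_id,
        )
    return value


# Toy scheme responses; the date is illustrative, not real PubMLST data.
isolates_info = {"description": "MLST"}
seqdef_info = {"description": "MLST", "last_updated": "2025-03-10"}

db = "pubmlst_saureus_isolates"
print(seqdef_db_name(db))
print(get_last_updated(seqdef_info, seqdef_db_name(db), 1))
```

Querying the seqdef database for the version check would avoid the warnings, at the cost of relying on the naming convention assumed here.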
