add error handling to fastq clean #4206

Open
wants to merge 3 commits into
base: master

Conversation

diitaz93
Contributor

@diitaz93 diitaz93 commented Feb 14, 2025

Description

Closes #3817
Adds error handling to the CLI command cg compress clean fastq so that any exception raised when cleaning a single sample is caught and logged without affecting the cleaning of other samples.
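
A minimal sketch of the pattern described above, not the actual cg implementation: each sample is cleaned inside its own try/except so that one failure is logged and the loop moves on. The names `clean_samples` and `clean_sample` are hypothetical and exist only for illustration.

```python
import logging
from typing import Callable, Iterable

LOG = logging.getLogger(__name__)


def clean_samples(
    sample_ids: Iterable[str], clean_sample: Callable[[str], None]
) -> list[str]:
    """Clean every sample; log failures and keep going.

    `clean_sample` is a hypothetical callable standing in for whatever
    cleans the FASTQ files of a single sample.
    Returns the IDs of the samples that could not be cleaned.
    """
    failed: list[str] = []
    for sample_id in sample_ids:
        try:
            clean_sample(sample_id)
        except Exception as error:
            # Catch broadly on purpose: one bad sample must not stop the rest.
            LOG.error("Could not clean FASTQ files for sample %s: %s", sample_id, error)
            failed.append(sample_id)
    return failed
```

Returning the failed IDs is a design choice of this sketch; it lets the caller report a summary or exit with a non-zero code after all samples have been attempted.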

Added

  • Error handling

How to prepare for test

  • SSH to the relevant server (depending on type of change)
  • Use stage: us
  • Paxa the environment: paxa
  • Install on stage (example for Hasta):
    bash /home/proj/production/servers/resources/hasta.scilifelab.se/update-tool-stage.sh -e S_cg -t cg -b fix-clean-fastq -a

How to test

  • Find a case with some of its samples ready to be cleaned but at least one sample with uncompressed fastq files
  • Run cg compress clean fastq --case-id <some-case>

Expected test outcome

  • Check that ...
  • Take a screenshot and attach or copy/paste the output.

Review

  • Tests executed by
  • "Merge and deploy" approved by
    Thanks for filling in who performed the code review and the test!

This version is a

  • MAJOR - when you make incompatible API changes
  • MINOR - when you add functionality in a backwards compatible manner
  • PATCH - when you make backwards compatible bug fixes or documentation/instructions

Implementation Plan

  • Document in ...
  • Deploy this branch on ...
  • Inform to ...

@diitaz93 diitaz93 requested a review from a team as a code owner February 14, 2025 09:18
@@ -86,20 +86,24 @@ def clean_fastq(context: CGConfig, case_id: str | None, days_back: int, dry_run:

    cases: list[Case] = get_cases_to_process(case_id=case_id, days_back=days_back, store=store)
    if not cases:
        LOG.info("Did not find any FASTQ files to clean. Closing")
        return

    cleaned_inds = 0
    for case in cases:
        sample_ids: Iterable[str] = store.get_sample_ids_by_case_id(case_id=case.internal_id)
        for sample_id in sample_ids:

Contributor

Completely unrelated to anything in this PR but is this logic not off? Should we not look for old samples rather than old cases? What happens if we have a re-run so the sample is in multiple cases?

Contributor Author

Maybe we should open a new issue to reproduce this situation (a sample belonging to two cases, one more than 60 days old and one ongoing) and evaluate the consequences.

Contributor

There are more issues around the case-centric handling of sample properties. I agree with you, Isak: since a sample can be in multiple cases, it could receive multiple states ("compression", "cleaning", and so on) depending on which case is currently being evaluated.
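
To make the concern in this thread concrete, here is a rough sketch of a sample-centric check: a sample is only eligible for cleaning if every case it belongs to is older than the cutoff. The function name and the plain-dict inputs are assumptions made for illustration; they do not reflect the cg store API.

```python
from datetime import datetime, timedelta


def samples_safe_to_clean(
    case_dates: dict[str, datetime],
    case_samples: dict[str, list[str]],
    days_back: int,
    now: datetime,
) -> set[str]:
    """Return samples whose cases are all older than `days_back` days.

    Hypothetical inputs: `case_dates` maps case ID to its date and
    `case_samples` maps case ID to the sample IDs in that case.
    """
    cutoff = now - timedelta(days=days_back)
    in_old_cases: set[str] = set()
    in_recent_cases: set[str] = set()
    for case_id, sample_ids in case_samples.items():
        if case_dates[case_id] < cutoff:
            in_old_cases.update(sample_ids)
        else:
            in_recent_cases.update(sample_ids)
    # A sample shared with a recent (e.g. re-run) case is excluded.
    return in_old_cases - in_recent_cases
```

With one case older than 60 days containing samples A and B, and an ongoing case containing B, only A would be selected, whereas a purely case-centric loop would also pick up B through the old case.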

Development

Successfully merging this pull request may close these issues.

Catch Errors and continue in clean FASTQ command
3 participants