Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enh: add zero parameter download cli #96

Merged
merged 6 commits into from
Jul 4, 2024
Merged

enh: add zero parameter download cli #96

merged 6 commits into from
Jul 4, 2024

Conversation

fedorov
Copy link
Member

@fedorov fedorov commented Jul 2, 2024

This should address most common usage scenario.

Follow up on #33.

@fedorov fedorov force-pushed the add-smart-download branch 5 times, most recently from c5c88c6 to 5f791c9 Compare July 2, 2024 21:47
This should address most common usage scenario.
@fedorov fedorov force-pushed the add-smart-download branch from 5f791c9 to 4370ebb Compare July 2, 2024 21:50
@fedorov fedorov requested a review from vkt1414 July 2, 2024 21:53
idc_index/cli.py Outdated
else:
item_ids = [generic_argument]
# this is a streamlined command, we will only check the first item, and will assume all other items are of the same kind
if client.index["collection_id"].str.contains(item_ids[0]).any():
Copy link
Collaborator

@vkt1414 vkt1414 Jul 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not see a reason why we should not validate every value passed on. This may go back to the philosophical debate of whether we inform the user for passing some invalid values. In my opinion, we must. So I would change this to..

else:
    # Split the input string and filter out any empty values
    item_ids = [item for item in generic_argument.split(",") if item]

    if not item_ids:
        logger_cli.error("No valid IDs provided.")
        raise ValueError("No valid IDs provided.")

    index_df = client.index

    def check_and_download(column_name, item_ids, download_dir, kwarg_name):
        matches = index_df[column_name].isin(item_ids)
        matched_ids = index_df[column_name][matches].tolist()
        if not matched_ids:
            return False
        unmatched_ids = list(set(item_ids) - set(matched_ids))
        if unmatched_ids:
            raise ValueError(f"Partial match for {column_name}: matched {matched_ids}, unmatched {unmatched_ids}")
        logger_cli.debug(f"Downloading from {column_name}")
        client.download_from_selection(**{kwarg_name: matched_ids, 'downloadDir': download_dir})
        return True

    # Check for matches in each column and download if matches found
    if not (
        check_and_download("collection_id", item_ids, download_dir, 'collection_id') or
        check_and_download("PatientID", item_ids, download_dir, 'patientId') or
        check_and_download("StudyInstanceUID", item_ids, download_dir, 'studyInstanceUID') or
        check_and_download("SeriesInstanceUID", item_ids, download_dir, 'seriesInstanceUID')
    ):
        logger_cli.error("None of the values passed matched any of the four UUIDs: collection_id, PatientID, StudyInstanceUID, SeriesInstanceUID.")
        raise ValueError("None of the values passed matched any of the four UUIDs: collection_id, PatientID, StudyInstanceUID, SeriesInstanceUID.")

If you think it's an overkill, I would at least add a raise exception at the end to catch if the first value passed does not match any of the four UUIDs

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea. Can you add this in a commit?

I do not like the exception though, I don't think it is helpful. I think console error is sufficient.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not know what a console error is. Also what do you not like about the exception message? Isn't it exactly what we trying to do and failing if it reaches the last else condition?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, you simply wanted to remove raising exceptions but warn the user from logger? If so, I made the change now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not know what a console error is.

I mean logger error.

Also what do you not like about the exception message? Isn't it exactly what we trying to do and failing if it reaches the last else condition?

Raising exception aborts the command line execution and prints execution stack. I see zero benefit in showing execution stack to the user. It is sufficient to let them know none of the identifiers was matched.

vkt1414 and others added 5 commits July 3, 2024 20:38
since we call IDCClient first, logging level will be set to info
by default, as it is the level in index.py
To get around it, we can set the desired log level in cli.py
- invoke download in sequence, to allow for mixing different kinds of identifiers
- revisit logging output types to reduce console clutter
- change default logging level to info
@fedorov fedorov merged commit fa56355 into main Jul 4, 2024
10 checks passed
@fedorov fedorov deleted the add-smart-download branch July 4, 2024 16:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants