
Python-irodsclient v2.x raises permission errors in BatchMoveDataObjectsTask #1955

Closed
mikkonie opened this issue Jun 25, 2024 · 7 comments
Assignees
mikkonie
Labels
app: taskflowbackend (Issue in the taskflowbackend app) · bug (Something isn't working) · ongoing (Ongoing issue, needs observation or pending on other projects)

Comments

@mikkonie
Contributor

Something I noticed happening in production. A user had a landing_zone_move flow fail with the following error:

Error running flow: BatchMoveDataObjectsTask failed: Could not receive server response (Error getting permissions of target "")

I have not experienced this in practice and the tests have not encountered it. Possibly something to do with the python-irodsclient v2.0 update?

The user afterwards modified the zone and received a more regular error. Still, this needs to be looked into ASAP.

@mikkonie mikkonie added bug Something isn't working app: taskflowbackend Issue in the taskflowbackend app labels Jun 25, 2024
@mikkonie mikkonie added this to the v0.14.3 milestone Jun 25, 2024
@mikkonie mikkonie self-assigned this Jun 25, 2024
@mikkonie mikkonie added the high priority A decidedly high priority issue label Jun 25, 2024
@mikkonie
Contributor Author

mikkonie commented Jun 25, 2024

Dump of the error. It seems the iRODS connection crashes when performing a specific ACL operation.

sodar-celeryd-default_1  | 2024-06-25 12:34:52,679 [ERROR] taskflowbackend.api: Error running flow: BatchMoveDataObjectsTask failed: Could not receive server response (Error getting permissions of target "<iRODSDataObject xxx yyy.vcf.gz>")
sodar-celeryd-default_1  | [2024-06-25 12:35:03,530: INFO/MainProcess] Task landingzones.tasks_celery.trigger_zone_move[7821ab7d-1b47-4341-886e-9e3e5080b66d] received
sodar-celeryd-default_1  | [2024-06-25 12:35:04,281: ERROR/ForkPoolWorker-16] Unable to send message. Connection to remote host may have closed. Releasing connection from pool.
sodar-celeryd-default_1  | [2024-06-25 12:35:04,287: WARNING/ForkPoolWorker-16] Exception ignored in: <function Connection.__del__ at 0x7f91069963a0>
sodar-celeryd-default_1  | [2024-06-25 12:35:04,287: WARNING/ForkPoolWorker-16] Traceback (most recent call last):
sodar-celeryd-default_1  | [2024-06-25 12:35:04,287: WARNING/ForkPoolWorker-16]   File "/usr/local/lib/python3.8/site-packages/irods/connection.py", line 109, in __del__
sodar-celeryd-default_1  | [2024-06-25 12:35:04,288: WARNING/ForkPoolWorker-16]     self.disconnect()
sodar-celeryd-default_1  | [2024-06-25 12:35:04,288: WARNING/ForkPoolWorker-16]   File "/usr/local/lib/python3.8/site-packages/irods/connection.py", line 322, in disconnect
sodar-celeryd-default_1  | [2024-06-25 12:35:04,288: WARNING/ForkPoolWorker-16]     self.send(disconnect_msg)
sodar-celeryd-default_1  | [2024-06-25 12:35:04,288: WARNING/ForkPoolWorker-16]   File "/usr/local/lib/python3.8/site-packages/irods/connection.py", line 125, in send
sodar-celeryd-default_1  | [2024-06-25 12:35:04,289: WARNING/ForkPoolWorker-16]     raise NetworkException("Unable to send message")
sodar-celeryd-default_1  | [2024-06-25 12:35:04,289: WARNING/ForkPoolWorker-16] irods.exception.NetworkException: Unable to send message

@mikkonie
Contributor Author

For the record, the crash happens (consistently) here:

            try:
                target_access = self.irods.acls.get(target=target)
            except Exception as ex:
                self._raise_irods_exception(
                    ex,
                    'Error getting permissions of target "{}"'.format(target),
                )
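For reference, a minimal standalone sketch of the same ACL query with python-irodsclient v2.x. The hostname, credentials and object path below are placeholders, not values from the report above:

    # Minimal repro sketch; host, credentials and path are placeholders.
    from irods.session import iRODSSession

    with iRODSSession(
        host='irods.example.org', port=1247, user='rods',
        password='secret', zone='exampleZone',
    ) as session:
        obj = session.data_objects.get('/exampleZone/projects/example/yyy.vcf.gz')
        # In BatchMoveDataObjectsTask this is the call that dies with
        # "Could not receive server response"; run standalone it should
        # simply return the ACL list of the object.
        for access in session.acls.get(obj):
            print(access)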

@mikkonie mikkonie changed the title Permission error raised by BatchMoveDataObjectsTask in landing_zone_move Python-irodsclient v2.x raises permission errors in BatchMoveDataObjectsTask Jun 25, 2024
@mikkonie
Contributor Author

Temporarily fixed by rolling back to v1.1.x. I need to try to reproduce this in dev to figure out what is going on.
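Until the root cause is found, the rollback can be expressed as a dependency pin. The constraint below is illustrative, not the exact SODAR requirements entry:

    # requirements: keep python-irodsclient on the 1.1.x line for now
    python-irodsclient<2.0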

@mikkonie mikkonie added ongoing Ongoing issue, needs observation or pending on other projects and removed high priority A decidedly high priority issue labels Jun 25, 2024
@mikkonie mikkonie removed this from the v0.14.3 milestone Jun 25, 2024
@mikkonie
Contributor Author

Personal note: I also need to check the actions preceding the point of the actual crash. It's possible the previous step fails silently and we try to impose ACLs on a file which is not in the expected place or state.
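If that turns out to be the case, a pre-check along these lines would make a silent earlier failure visible before the ACL query runs. The helper is illustrative only, not part of the actual task code:

    # Hypothetical pre-check sketch; not taken from BatchMoveDataObjectsTask.
    def target_exists(session, path):
        # True if the path is an existing data object or collection in iRODS
        return session.data_objects.exists(path) or session.collections.exists(path)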

@mikkonie
Contributor Author

The repo has now been upgraded to iRODS 4.3.3 and python-irodsclient 2.1.0. I will look into whether this still happens when testing the new release in staging.

@mikkonie
Contributor Author

One thought: could this be caused by the same redirect issue in python-irodsclient v1.1.9+ that caused the problems I detailed in #2007 and #2022?

If so, we may have to wait for the release of python-irodsclient v2.1.1 before deploying this release. See #2023.
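If the redirect behaviour is indeed the culprit, one possible interim workaround would be disabling the client-side redirect before sessions are opened. Note that the setting name below is an assumption about the python-irodsclient 2.x client configuration and should be verified against the installed version:

    # Assumed workaround, NOT verified: turn off data object redirects to the
    # resource server via python-irodsclient's client configuration module.
    import irods.client_configuration as cfg
    cfg.data_objects.allow_redirect = False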

@mikkonie
Contributor Author

This has not occurred again with python-irodsclient v2.2.0 and iRODS v4.3.3. Closing.
