Skip to content

Conversation

@korydraughn
Copy link

@korydraughn korydraughn commented Aug 14, 2025

This PR updates support for iRODS by replacing Jargon (legacy iRODS library) with irods4j.

The foundational work was implemented by @MINGYJ, a recent iRODS intern. My commits are mainly polish and corrections around the use of the irods4j library.

Basic functionality is working - i.e. single stream uploads/downloads, renames, editing, etc.

The parallel transfer implementation doesn't appear to be working as intended. This is likely due to not having a full understanding of how the Cyberduck components fit together - i.e. Read/WriteFeature vs Upload/DownloadFeature.

Here are the steps for performing parallel transfer (i.e. multipart uploads) in iRODS.

Putting in draft for now. Feedback and guidance on how to implement proper support for parallel transfer would be greatly appreciated.

Resolves #14449.

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@korydraughn
Copy link
Author

@dkocher Can you provide the Entity CLA?

@dkocher dkocher changed the title [#14449] Update support for iRODS Update support for iRODS Sep 1, 2025
@dkocher
Copy link
Contributor

dkocher commented Sep 1, 2025

Putting in draft for now. Feedback and guidance on how to implement proper support for parallel transfer would be greatly appreciated.

Discussed in #14449 (comment)

@korydraughn
Copy link
Author

TODO: Update documentation for iRODS - https://docs.cyberduck.io/protocols/irods/

@trel
Copy link

trel commented Oct 9, 2025

updating the docs is a separate repo, would have its own PR

@korydraughn
Copy link
Author

TODO: Release new version of irods4j for checksum fixes.

@korydraughn
Copy link
Author

korydraughn commented Oct 9, 2025

Rebased PR on top of master.

Verified the following:

  • Parallel/Single stream uploads and downloads works
  • Config options defined in new iRODS profile are honored
  • PAM authentication works via the pam_password auth scheme
  • Other operations work too - rename, delete, etc.

With this PR, users will not be able to calculate checksums on data in an iRODS zone. Cyberduck will report checksums in their hex form if they exist in iRODS. We could add an option to the iRODS profile which allows users to instruct Cyberduck to calculate checksums following an upload, but that didn't feel like the correct approach, mainly because profiles seem to be loaded once on program start and never reread.

As for the unit/integration tests, I'm not sure how the implementation can be tested without a real iRODS server.

Finally, this PR bumps the minimum iRODS version requirement to 4.3.2. A new version of irods4j will be needed before this is merged. See irods/irods4j#126. Will get a new version released this week soon.

@korydraughn
Copy link
Author

@dkocher When I move a file in or out of a directory, I see a new iRODS connection get created. It disappears eventually, but not cleanly.

Can you explain why Cyberduck creates a new connection every time a file is moved?

The move class is very similar to other implementations so it's not clear to me what causes the new connection.

@dkocher
Copy link
Contributor

dkocher commented Oct 14, 2025

@dkocher When I move a file in or out of a directory, I see a new iRODS connection get created. It disappears eventually, but not cleanly.

Can you explain why Cyberduck creates a new connection every time a file is moved?

The move class is very similar to other implementations so it's not clear to me what causes the new connection.

You will need to return Statefulness.stateless in IRODSProtocol#getStatefulness.

@korydraughn
Copy link
Author

@dkocher When would someone choose stateless or stateful?
If I change the protocol to stateless, how will that affect other operations?
If I continue using stateful, what needs to change to make sure the additional connections are closed immediately?

To provide some context, a single iRODS connection cannot be used to execute multiple API operations in parallel. A single request is sent and the server returns a response.

@dkocher
Copy link
Contributor

dkocher commented Oct 15, 2025

If you can share credentials with us for your test environment we can run daily integration tests from our CI.

@dkocher
Copy link
Contributor

dkocher commented Oct 15, 2025

@dkocher When would someone choose stateless or stateful? If I change the protocol to stateless, how will that affect other operations? If I continue using stateful, what needs to change to make sure the additional connections are closed immediately?

To provide some context, a single iRODS connection cannot be used to execute multiple API operations in parallel. A single request is sent and the server returns a response.

We will have to keep it stateful then as otherwise it will be attempted to use a single connection for multiple actions in parallel, i.e. when the user is browsing folders or a file transfer with multiple files in parallel.

I will need to review how we can still support the native copy feature implementation for iRODS that would not require a new connection.

@korydraughn
Copy link
Author

If you can share credentials with us for your test environment we can run daily integration tests from our CI.

We don't have a CI system for people to hook into yet. We have a small set of tools which make it easy to launch one or more iRODS servers for testing. It's likely overkill for your needs though.

With that said, building a containerized environment for testing iRODS is pretty easy. I'm happy to put together a Docker compose project for the iRODS component. That will allow you to launch it on a local computer and have it available for testing.

We will have to keep it stateful then as otherwise it will be attempted to use a single connection for multiple actions in parallel, i.e. when the user is browsing folders or a file transfer with multiple files in parallel.

I will need to review how we can still support the native copy feature implementation for iRODS that would not require a new connection.

So, because the IRODSProtocol class reports iRODS as being stateful, does that mean every operation results in a new iRODS connection being created?

If that's true, why is it that the Move operation results in connections which do not disconnect? No other operation shares that behavior.

I added some log statements to my local build to try and box in what is leading to the additional connections, but it didn't help. However, it did reveal a high number of instantiations of IRODSMoveFeature. Any idea why the move operation is instantiated so much?

@dkocher
Copy link
Contributor

dkocher commented Oct 15, 2025

We have other usages of Docker Compose containers in integration tests, thus that should be feasible.

@korydraughn
Copy link
Author

Where should I place the Docker Compose project? Is the test directory for irods appropriate?

I figure you can move things around if needed.

@korydraughn
Copy link
Author

@dkocher What triggers the Copy implementation?

Using the Duplicate option within the context menu doesn't appear to trigger it.

@korydraughn
Copy link
Author

Docker compose project added under test directory.

Squashing everything down.

@korydraughn
Copy link
Author

@dkocher Any updates regarding the CLA?

@korydraughn
Copy link
Author

TODO: Release new version of irods4j for checksum fixes.

Released irods4j 0.5.0 and updated PR to use it.

@dkocher
Copy link
Contributor

dkocher commented Oct 16, 2025

@dkocher What triggers the Copy implementation?

Using the Duplicate option within the context menu doesn't appear to trigger it.

The Duplicate… option in Cyberduck is expected to trigger the Copy feature.

@dkocher
Copy link
Contributor

dkocher commented Oct 16, 2025

@dkocher Any updates regarding the CLA?

I have received your signed CLA and I should be able to merge this regardless of the auto check failing.

@dkocher
Copy link
Contributor

dkocher commented Oct 16, 2025

Docker compose project added under test directory.

Squashing everything down.

  • We previously had integration tests running with iPlant Collaborative which are still functioning. Can you confirm this is still the case and there are no regressions?
  • We have tests that use a Testcontainers/Docker setup in the smb and the sts package of the s3 module. Ideally the iRODS container would be launched similarly to run tests.

@dkocher
Copy link
Contributor

dkocher commented Oct 16, 2025

So, because the IRODSProtocol class reports iRODS as being stateful, does that mean every operation results in a new iRODS connection being created?

That is true but the connections are pooled using the DefaultSessionPool implementation. We only have FTP that uses the same model, all other implementations are stateless.

@korydraughn
Copy link
Author

The Duplicate… option in Cyberduck is expected to trigger the Copy feature.

I do see references to CopyWorker in the log file. Maybe I didn't have any logging in the IRODSCopyFeature class. I'll take another look at it.

  • We previously had integration tests running with iPlant Collaborative which are still functioning. Can you confirm this is still the case and there are no regressions?

mvn test -DskipITs, as documented in the README, results in passing tests, but it only covers tests which do not require interaction with a server.

Whenever I tried running the tests without -DskipITs, they failed immediately due to vault/permission issues. I think I lack credentials to do that level of testing.

So, I can't confirm the state of the tests. I can say that clicking the buttons within the GUI have shown the desired behavior and do work with a live iRODS 4.3.4 server.

  • We have tests that use a Testcontainers/Docker setup in the smb and the sts package of the s3 module. Ideally the iRODS container would be launched similarly to run tests.

I'll take a look at those and see if I can mirror them. I may not be able to resolve the vault/permission related issues, but I can at least document the necessary changes so that you can wire things up in the way that you like.

That is true but the connections are pooled using the DefaultSessionPool implementation. We only have FTP that uses the same model, all other implementations are stateless.

I'll take a look at the FTP implementation. Given the connection management requires a deeper understanding of the Cyberduck internals, I might end up needing to let someone more familiar with that handle it.

@korydraughn
Copy link
Author

@dkocher Any thoughts on my previous comment? Specifically, the vault/permission issues?

@dkocher
Copy link
Contributor

dkocher commented Oct 29, 2025

Unfortunately the secrets to access the credentials for integration tests are only accessible in our CI enviroment. Running the tests against data.iplantcollaborative.org gives error [ -12000 ].

@trel
Copy link

trel commented Oct 29, 2025

-12000 is SYS_UNMATCHED_API_NUM

We know the iplant / CyVerse Zone is too old (iRODS 4.2.8). This is okay from their perspective, but something we'll need to remedy before the tests pass again.

@korydraughn
Copy link
Author

I've been updating the tests to use the docker compose project I added, so the API issues are no longer an issue. This work has not been pushed yet because I'm still working through a few test failures.

@dkocher I'm reviewing the test, IRODSUploadFeatureTest#testInterruptStatus, and trying to determine what the behavior should be. The test is failing because it doesn't expect the upload() method to throw an exception. Can you point me to a test for a different protocol which checks interruption/cancellation of an upload? I want to compare what I have with other protocols.

@korydraughn
Copy link
Author

Nevermind, I fixed the upload test failure.

@korydraughn
Copy link
Author

korydraughn commented Oct 30, 2025

All tests now rely on a Docker Compose setup and pass. Tests were run on Linux.

I've also confirmed that this PR still works on Windows.

During the development of this PR, I noted that move operations would result in unexpected new iRODS connections. I did not see that behavior when running the unit/integration tests. That indicates there might be a bug in the GUI code.

I believe the current state of this PR is in a good place. The Cyberduck team should have everything they need to test and fix any bugs related to iRODS given the addition of the iRODS Docker Compose setup.

@dkocher Please advise on next steps.

@korydraughn korydraughn marked this pull request as ready for review October 30, 2025 23:20
@korydraughn korydraughn requested a review from a team as a code owner October 30, 2025 23:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

PAM passwords not handled correctly

4 participants