
Conversation

@james-rms
Contributor

Which issue does this PR close?

Closes #.

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

crepererum and others added 30 commits June 17, 2025 12:54
See apache#339. The old name is kept as a deprecated alias for now.
This used to work once but got broken during refactoring. Added a test
to catch regressions.

See apache#48 where -- I
think -- we decided that instead of printing entire error chains in
`Display`, API users should walk the error chain themselves. This is
especially relevant for error types that we do NOT control, like upstream
`request` types.
This status code is what HTTP uses to indicate that the client has sent too
many requests in a given amount of time. It may sound counter-intuitive to
retry in this situation, but that is exactly why we use an exponential
backoff mechanism: it gives the server the opportunity to recover instead of
failing the request immediately.

The retry mechanism already works for object stores like S3 because they
return a server error. Without this change, however, GCS is not handled
properly: it returns the client error `429 Too Many Requests` instead. This
change enables retries on that response as well.

A more advanced retry mechanism would use the optional response header
`Retry-After`, but that is beyond the scope of this PR.

Closes: apache#309
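
The retry decision described above can be sketched as follows; the function names and the backoff parameters are illustrative assumptions, not object_store's actual internals:

```rust
// Minimal sketch of the retry policy: retry on 5xx server errors and on
// 429 Too Many Requests, with exponentially growing, capped delays.
use std::time::Duration;

/// Server errors (5xx) are already retried; 429 is the client error
/// this change adds, because GCS uses it to signal rate limiting.
fn should_retry(status: u16) -> bool {
    status >= 500 || status == 429
}

/// Exponential backoff: base * 2^attempt, capped at `max` so repeated
/// failures do not grow the delay without bound.
fn backoff_delay(base: Duration, attempt: u32, max: Duration) -> Duration {
    base.checked_mul(2u32.saturating_pow(attempt))
        .map(|d| d.min(max))
        .unwrap_or(max)
}

fn main() {
    assert!(should_retry(429)); // the case this change adds
    assert!(should_retry(503)); // already retried (server error)
    assert!(!should_retry(404)); // genuine client errors are not retried
    assert_eq!(
        backoff_delay(Duration::from_millis(100), 3, Duration::from_secs(10)),
        Duration::from_millis(800)
    );
}
```

Capping the delay keeps a long run of 429s from stalling a request indefinitely while still giving the server room to recover.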
* fix: update links in release docs and script

* Remove another reference to the previous crate setup
* Add (failing) test for retrying connection errors

* Fix not retrying connection errors

closes apache#368

* Fix clippy error

---------

Co-authored-by: John Garland <john.garland@vivcourt.com>
Re-export `HeaderMap`, `HeaderValue`, and `Extensions` from the http crate
to avoid forcing users to add an http dependency when using the object_store
public API.

Fixes apache#263
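
The change itself is essentially a `pub use`. Below is a self-contained sketch in which `http_stub` stands in for the real `http` crate so the snippet compiles on its own:

```rust
// Sketch of the re-export pattern. `http_stub` is a stand-in; in
// object_store's lib.rs the equivalent line would read
// `pub use http::{HeaderMap, HeaderValue, Extensions};`.
mod http_stub {
    #[derive(Default, Debug)]
    pub struct HeaderMap;
}

// Users now reach the type through this crate's public API instead of
// declaring a separate `http` dependency themselves.
pub use http_stub::HeaderMap;

fn main() {
    let headers = HeaderMap::default();
    assert_eq!(format!("{headers:?}"), "HeaderMap");
}
```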
…32 (apache#468)

* chore: fix some clippy 1.89 warnings

* fix another warning

* Skip some doctests for wasm32
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 5.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](actions/checkout@v4...v5)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [actions/setup-python](https://github.com/actions/setup-python) from 5 to 6.
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](actions/setup-python@v5...v6)

---
updated-dependencies:
- dependency-name: actions/setup-python
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [actions/setup-node](https://github.com/actions/setup-node) from 4 to 5.
- [Release notes](https://github.com/actions/setup-node/releases)
- [Commits](actions/setup-node@v4...v5)

---
updated-dependencies:
- dependency-name: actions/setup-node
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [actions/github-script](https://github.com/actions/github-script) from 7 to 8.
- [Release notes](https://github.com/actions/github-script/releases)
- [Commits](actions/github-script@v7...v8)

---
updated-dependencies:
- dependency-name: actions/github-script
  dependency-version: '8'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…ache#487)

On a request retry, an info message is logged stating that an error was
encountered, along with information about the retry process, but it has not
included any details about the error that caused the retry. This PR updates
the logging to include the status code if it is a server error, and the HTTP
error kind if a transport error occurred. While the last error when retries
are exhausted is returned up the call stack, the intermediate errors are not
necessarily the same, so it is helpful to log at least minimal information
about which error triggered each retry.
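A sketch of what such an enriched log line could look like; the `RetryReason` type and the message format are illustrative, not the PR's exact code:

```rust
// Hypothetical model of the information the retry loop now logs: either
// the server status or the transport error kind, plus retry progress.
use std::fmt;

enum RetryReason {
    /// Server returned an HTTP status worth retrying.
    Status(u16),
    /// The transport layer failed (connect, timeout, ...).
    Transport(&'static str),
}

impl fmt::Display for RetryReason {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            RetryReason::Status(code) => write!(f, "server error, status {code}"),
            RetryReason::Transport(kind) => write!(f, "transport error: {kind}"),
        }
    }
}

fn retry_message(reason: &RetryReason, attempt: usize, max_retries: usize) -> String {
    format!("Encountered {reason}, retrying ({attempt}/{max_retries})")
}

fn main() {
    assert_eq!(
        retry_message(&RetryReason::Status(503), 1, 10),
        "Encountered server error, status 503, retrying (1/10)"
    );
    assert_eq!(
        retry_message(&RetryReason::Transport("connect"), 2, 10),
        "Encountered transport error: connect, retrying (2/10)"
    );
}
```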
* Add storage class for aws and gcp

* Add azure storage class attribute

* Update attribute docs

* Update http client
…guration (apache#480)

* Allow setting STS endpoint via env var

* Properly use AmazonS3Builder::credentials_from_env for AssumeRoleWithWebIdentity auth flow

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* Update version to 0.12.4

* Update update_changelog.sh script

* Update changelog

* Last touchups

* Update changelog
Comment on lines +1196 to +1205
let multipart_copy_threshold = self
.multipart_copy_threshold
.map(|val| val.get())
.transpose()?
.unwrap_or(MAX_SINGLE_REQUEST_COPY_SIZE);
let multipart_copy_part_size = self
.multipart_copy_part_size
.map(|val| val.get())
.transpose()?
.unwrap_or(MAX_SINGLE_REQUEST_COPY_SIZE);
Clamp to 5GiB because that's the documented maximum?

Contributor Author

I think if someone wants to push it over 5GB, they should be able to. There are many "s3-compatible" object stores that might not share the same limitations.
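
When a copy does exceed the threshold, the source is split into parts. A sketch of the byte-range math, assuming inclusive ranges as S3's `UploadPartCopy` `CopySourceRange` expects; the helper name is illustrative, while the constant mirrors the diff and AWS's documented 5 GiB limit:

```rust
// Sketch: split a copy source of `total_size` bytes into inclusive
// (start, end) byte ranges of at most `part_size` bytes each.
const MAX_SINGLE_REQUEST_COPY_SIZE: u64 = 5 * 1024 * 1024 * 1024; // 5 GiB

fn copy_part_ranges(total_size: u64, part_size: u64) -> Vec<(u64, u64)> {
    assert!(part_size > 0);
    (0..total_size)
        .step_by(part_size as usize)
        .map(|start| (start, (start + part_size - 1).min(total_size - 1)))
        .collect()
}

fn main() {
    // A 10-byte object copied in 4-byte parts: two full parts and a tail.
    assert_eq!(copy_part_ranges(10, 4), vec![(0, 3), (4, 7), (8, 9)]);
    // An object at exactly the single-request ceiling fits in one part.
    assert_eq!(
        copy_part_ranges(MAX_SINGLE_REQUEST_COPY_SIZE, MAX_SINGLE_REQUEST_COPY_SIZE).len(),
        1
    );
}
```

Leaving the threshold configurable, as the diff does, lets "s3-compatible" stores with different limits opt out of the 5 GiB assumption.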

Comment on lines +369 to +387
// Determine source size to decide between single CopyObject and multipart copy
let head_meta = self
.client
.get_opts(
from,
GetOptions {
head: true,
..Default::default()
},
)
.await?
.meta;

Should this first try to use CopyObject and then fall back if it fails due to size?

Contributor Author

Good point - let me see if that's straightforward.

Contributor Author

The issue with that approach is that on error, AWS does not respond with anything more specific than "InvalidRequest":

<Error><Code>InvalidRequest</Code><Message>The specified copy source is larger than the maximum allowable size for a copy source: 5368709120</Message><RequestId>8550KAYYHRYF33SM</RequestId><HostId>R7zaiPWt96z/yQm2PtDT+pyFmYF76YCBcW0AeukdrXpS4qlSuO1nmXTFI4Ak2YcHMsBoymw33j4=</HostId></Error>

So there's not really a stable API for determining that the request is invalid because of the size of the source.
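
Given that, checking the source size up front (via the HEAD request in the diff) is the more reliable option. A minimal sketch of the resulting decision; the function name is illustrative and the constant mirrors AWS's documented limit:

```rust
// Sketch: choose single CopyObject vs multipart copy from the HEAD size,
// rather than trying to parse a generic InvalidRequest error after the fact.
const MAX_SINGLE_REQUEST_COPY_SIZE: u64 = 5 * 1024 * 1024 * 1024; // 5 GiB

fn needs_multipart_copy(source_size: u64, threshold: u64) -> bool {
    source_size > threshold
}

fn main() {
    // Exactly at the limit: a single CopyObject still works.
    assert!(!needs_multipart_copy(
        MAX_SINGLE_REQUEST_COPY_SIZE,
        MAX_SINGLE_REQUEST_COPY_SIZE
    ));
    // One byte over: fall back to multipart copy.
    assert!(needs_multipart_copy(
        MAX_SINGLE_REQUEST_COPY_SIZE + 1,
        MAX_SINGLE_REQUEST_COPY_SIZE
    ));
}
```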

@james-rms james-rms force-pushed the jrms/0.12.4-aws-copy-etc branch from 5608942 to 68b24de Compare December 5, 2025 05:10
@james-rms james-rms changed the base branch from main to release/0.12 December 5, 2025 05:14