Commit c8e681e

Proxy asset HREFs (#991)
* feat: initial, untested implementation
* tests: WIP, not all running
* refactor: second iteration of initial implementation
* refactor: use app.local state, sundry fixups, working tests
* docs: update openapi.yaml to reflect asset proxy endpoints
* docs: update README
* chore: update CHANGELOG
* review: move appInstance initialization out of function scope so it occurs during Lambda init (rather than execution)
* review: initialize assetProxy outside function so it runs during Lambda init phase
* review: move from v2 to v3 of AWS SDK
* review: remove unnecessary S3 client caching in AssetProxy
* review: 403 to 404 when asset proxy is disabled
* review: remove redundant asset proxy isEnabled check
* review: significant refactor to improve bucket caching and region determination
* docs: update docs
* docs: minor README update and logging improvement in asset-proxy.js
* review: remove commented code
* review: pull asset proxy bucket management into its own class
* chore: update CHANGELOG
* refactor: extract AssetBuckets class from asset-proxy.js
* review: correct requester pay information in README
* fix: correct errors revealed by testing a deployment
1 parent c999648 commit c8e681e

29 files changed, +2184 -453 lines changed

CHANGELOG.md

Lines changed: 17 additions & 2 deletions
@@ -5,6 +5,22 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
 and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).

+## [Unreleased]
+
+### Added
+
+- Asset proxying for generating pre-signed S3 URLs through proxy endpoints `GET
+  /collections/{collectionId}/items/{itemId}/assets/{assetKey}` and `GET
+  /collections/{collectionId}/assets/{assetKey}`.
+- Environment variables `ASSET_PROXY_BUCKET_OPTION`, `ASSET_PROXY_BUCKET_LIST`, and
+  `ASSET_PROXY_URL_EXPIRY` to configure asset proxying.
+
+### Changed
+
+- When asset proxying is enabled and a STAC Item or Collection is served, asset S3 hrefs
+  are replaced with proxy endpoint URLs and the original S3 URLs are preserved in
+  `alternate.s3.href` using the Alternate Assets Extension.
+
 ## [4.4.0] - 2025-09-10

 ## Changed
@@ -579,8 +595,7 @@ Initial release, forked from [sat-api](https://github.com/sat-utils/sat-api/tree

 Compliant with STAC 0.9.0

-<!-- [unreleased]: https://github.com/stac-utils/stac-api/compare/v3.6.0...main -->
-
+[unreleased]: https://github.com/stac-utils/stac-server/compare/v4.4.0...main
 [4.4.0]: https://github.com/stac-utils/stac-api/compare/v4.3.0...v4.4.0
 [4.3.0]: https://github.com/stac-utils/stac-api/compare/v4.2.0...v4.3.0
 [4.2.0]: https://github.com/stac-utils/stac-api/compare/v4.1.0...v4.2.0

README.md

Lines changed: 146 additions & 0 deletions
@@ -53,6 +53,7 @@
 - [Filter Extension](#filter-extension)
 - [Query Extension](#query-extension)
 - [Aggregation](#aggregation)
+- [Asset Proxy](#asset-proxy)
 - [Collections and filter parameters for authorization](#collections-and-filter-parameters-for-authorization)
 - [Collections](#collections)
 - [CQL2 Filter](#cql2-filter)
@@ -617,6 +618,9 @@ There are some settings that should be reviewed and updated as needed in the se
 | ENABLE_INGEST_ACTION_TRUNCATE | Enables support for ingest action "truncate". | none (not enabled) |
 | ENABLE_RESPONSE_COMPRESSION | Enables response compression. Set to 'false' to disable. | enabled |
 | ITEMS_MAX_LIMIT | The maximum limit for the number of items returned from the /search and /collections/{collection_id}/items endpoints. It is recommended that this be set to 100. There is an absolute max limit of 10000 for this. | 10000 |
+| ASSET_PROXY_BUCKET_OPTION | Control which S3 buckets are proxied through the API. Options: `NONE` (disabled), `ALL` (all S3 assets), `ALL_BUCKETS_IN_ACCOUNT` (all buckets in AWS account), `LIST` (specific buckets only). | NONE |
+| ASSET_PROXY_BUCKET_LIST | Comma-separated list of S3 bucket names to proxy. Required when `ASSET_PROXY_BUCKET_OPTION` is `LIST`. | |
+| ASSET_PROXY_URL_EXPIRY | Pre-signed URL expiry time in seconds for proxied assets. | 300 |
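
As an illustration of how the three settings in the table fit together, the following is a minimal sketch of reading them from an environment object. The function name `readAssetProxyConfig`, the returned shape, and the validation rules (exact upper-case option names, a positive integer expiry) are assumptions made for this example; only the variable names, the allowed options, and the defaults come from the table.

```javascript
// Hypothetical helper, not stac-server's implementation.
const BUCKET_OPTIONS = ['NONE', 'ALL', 'ALL_BUCKETS_IN_ACCOUNT', 'LIST'];
const DEFAULT_URL_EXPIRY_SECONDS = 300;

function readAssetProxyConfig(env) {
  // An unset option falls back to the documented default, NONE.
  const bucketOption = env.ASSET_PROXY_BUCKET_OPTION || 'NONE';
  if (!BUCKET_OPTIONS.includes(bucketOption)) {
    throw new Error(`Invalid ASSET_PROXY_BUCKET_OPTION: ${bucketOption}`);
  }

  // Split the comma-separated list, trimming whitespace and dropping empty entries.
  const bucketList = (env.ASSET_PROXY_BUCKET_LIST || '')
    .split(',')
    .map((name) => name.trim())
    .filter((name) => name.length > 0);
  if (bucketOption === 'LIST' && bucketList.length === 0) {
    throw new Error('ASSET_PROXY_BUCKET_LIST is required when the option is LIST');
  }

  // An unset expiry falls back to the documented default of 300 seconds.
  const rawExpiry = env.ASSET_PROXY_URL_EXPIRY;
  const urlExpiry = rawExpiry === undefined || rawExpiry === ''
    ? DEFAULT_URL_EXPIRY_SECONDS
    : Number(rawExpiry);
  if (!Number.isInteger(urlExpiry) || urlExpiry <= 0) {
    throw new Error(`Invalid ASSET_PROXY_URL_EXPIRY: ${rawExpiry}`);
  }

  return { bucketOption, bucketList, urlExpiry };
}
```

In a Lambda this would typically be called once with `process.env` during initialization, so that a misconfiguration fails early rather than on the first request.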

 Additionally, the credential for OpenSearch must be configured, as described in the
 section [Populating and accessing credentials](#populating-and-accessing-credentials).
@@ -1124,6 +1128,144 @@ Available aggregations are:
 - geometry_geohash_grid_frequency ([geohash grid](https://opensearch.org/docs/latest/aggregations/bucket/geohash-grid/) on Item.geometry)
 - geometry_geotile_grid_frequency ([geotile grid](https://opensearch.org/docs/latest/aggregations/bucket/geotile-grid/) on Item.geometry)

+## Asset Proxy
+
+The Asset Proxy feature enables stac-server to proxy access to S3 assets through the STAC
+API by generating pre-signed URLs. Only assets with S3 URIs (`s3://` prefix) are proxied;
+other URL schemes are ignored. When the Asset Proxy feature is enabled, asset `href`
+values pointing to S3 are replaced with proxy endpoint URLs when an Item or Collection is
+served, while the original S3 URLs are preserved in the `alternate.s3.href` field using
+the [Alternate Assets Extension](https://github.com/stac-extensions/alternate-assets).
+Subsequent GET requests to the proxy endpoint URLs are redirected to pre-signed S3 URLs
+for download. Note that the AWS account that stac-server is running under must have
+permission to access the S3 buckets containing the assets and that the stac-server AWS
+account will be charged for the S3 egress, regardless of whether the bucket is a
+"Requester Pays" bucket or not (the stac-server AWS account is the requester when
+generating the pre-signed URL).
+
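To make the "only `s3://` hrefs are proxied" rule concrete, here is a small sketch of splitting an S3 URI into bucket and key, returning `null` for anything else. The name `parseS3Uri` and the decision to reject URIs without a key are assumptions for this example and are not taken from the commit.

```javascript
// Hypothetical helper: splits "s3://bucket/key" into its parts.
// Returns null for other schemes or for URIs missing a bucket or key.
function parseS3Uri(href) {
  const prefix = 's3://';
  if (typeof href !== 'string' || !href.startsWith(prefix)) return null;

  const rest = href.slice(prefix.length);
  const slash = rest.indexOf('/');
  // slash <= 0 means no bucket name; a trailing slash means no object key.
  if (slash <= 0 || slash === rest.length - 1) return null;

  return { bucket: rest.slice(0, slash), key: rest.slice(slash + 1) };
}
```

For example, `parseS3Uri('s3://my-bucket/path/to/thumbnail.png')` yields the bucket `my-bucket` and the key `path/to/thumbnail.png`, while an `https://` href yields `null` and would be left unchanged.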
+### Configuration
+
+Asset proxying uses three environment variables:
+
+- **`ASSET_PROXY_BUCKET_OPTION`** - Specifies one of four modes to control which S3 buckets are proxied.
+
+  - **NONE** (default): Asset proxy is disabled. All asset hrefs are returned unchanged.
+  - **ALL**: Proxy all S3 assets regardless of which bucket they are in.
+  - **ALL_BUCKETS_IN_ACCOUNT**: Proxy assets from any S3 bucket accessible to the AWS account credentials. The list of buckets is fetched at Lambda startup.
+  - **LIST**: Only proxy assets from specific buckets listed in `ASSET_PROXY_BUCKET_LIST`.
+
+- **`ASSET_PROXY_BUCKET_LIST`** - Comma-separated list of bucket names (required only when the `ASSET_PROXY_BUCKET_OPTION` environment variable is set to `LIST`)
+
+  ```yaml
+  ASSET_PROXY_BUCKET_OPTION: "LIST"
+  ASSET_PROXY_BUCKET_LIST: "my-bucket-1,my-bucket-2,my-bucket-3"
+  ```
+
+- **`ASSET_PROXY_URL_EXPIRY`** - Pre-signed URL expiry in seconds (default: `300`)
+
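The four modes above amount to a per-bucket decision. The sketch below shows one way that decision could be expressed; `shouldProxyBucket`, the `config` shape, and the `accountBuckets` set (standing in for the bucket list fetched at Lambda startup) are hypothetical names chosen for this example, not the commit's `AssetBuckets` class.

```javascript
// Hypothetical decision function for the four ASSET_PROXY_BUCKET_OPTION modes.
// `accountBuckets` is assumed to be a Set of bucket names fetched at startup.
function shouldProxyBucket(bucket, config, accountBuckets = new Set()) {
  switch (config.bucketOption) {
    case 'ALL':
      return true; // every S3 asset is proxied
    case 'ALL_BUCKETS_IN_ACCOUNT':
      return accountBuckets.has(bucket);
    case 'LIST':
      return config.bucketList.includes(bucket);
    default:
      return false; // NONE, or any unrecognized value, disables proxying
  }
}
```

Treating an unrecognized value like `NONE` is a conservative choice made here so that a typo in the configuration never exposes more buckets than intended.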
+### Endpoints
+
+When asset proxying is enabled, two endpoints are available for accessing proxied assets:
+
+- `GET /collections/{collectionId}/items/{itemId}/assets/{assetKey}` - Redirects (HTTP 302) to a pre-signed S3 URL for an item asset
+- `GET /collections/{collectionId}/assets/{assetKey}` - Redirects (HTTP 302) to a pre-signed S3 URL for a collection asset
+
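The two routes and their 302 response can be sketched without any web framework. In the example below, `matchAssetRoute` and `redirectTo` are hypothetical helpers; the actual server registers its routes in its own application code, and the pre-signed URL passed to `redirectTo` is assumed to have been produced elsewhere (for instance with the AWS SDK).

```javascript
// Hypothetical route matcher for the two asset proxy endpoints.
const ITEM_ASSET_ROUTE = /^\/collections\/([^/]+)\/items\/([^/]+)\/assets\/([^/]+)$/;
const COLLECTION_ASSET_ROUTE = /^\/collections\/([^/]+)\/assets\/([^/]+)$/;

function matchAssetRoute(path) {
  const item = ITEM_ASSET_ROUTE.exec(path);
  if (item) {
    // Drop the full match, then decode each captured path segment.
    const [collectionId, itemId, assetKey] = item.slice(1).map((s) => decodeURIComponent(s));
    return { kind: 'item', collectionId, itemId, assetKey };
  }
  const collection = COLLECTION_ASSET_ROUTE.exec(path);
  if (collection) {
    const [collectionId, assetKey] = collection.slice(1).map((s) => decodeURIComponent(s));
    return { kind: 'collection', collectionId, assetKey };
  }
  return null; // not an asset proxy route
}

// Builds the redirect response described above from an already pre-signed URL.
function redirectTo(presignedUrl) {
  return { statusCode: 302, headers: { Location: presignedUrl } };
}
```

A client that follows redirects, as most HTTP clients do by default, downloads the asset directly from S3 after this response.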
+### IAM Permissions
+
+For the Asset Proxy feature to generate pre-signed URLs, the API and ingest Lambdas must
+be assigned permissions for the S3 buckets containing the assets. Add the following to the
+IAM role statements in your `serverless.yml` file, adjusting the resources as needed:
+
+For the `LIST` mode, you can specify the buckets listed in `ASSET_PROXY_BUCKET_LIST`:
+
+```yaml
+- Effect: Allow
+  Action:
+    - s3:GetObject
+  Resource:
+    - "arn:aws:s3:::my-bucket-1/*"
+    - "arn:aws:s3:::my-bucket-2/*"
+- Effect: Allow
+  Action:
+    - s3:HeadBucket
+    - s3:ListBucket
+  Resource:
+    - "arn:aws:s3:::my-bucket-1"
+    - "arn:aws:s3:::my-bucket-2"
+```
+
+For the `ALL` mode, use wildcards:
+
+```yaml
+- Effect: Allow
+  Action:
+    - s3:GetObject
+  Resource: "arn:aws:s3:::*/*"
+- Effect: Allow
+  Action:
+    - s3:HeadBucket
+    - s3:ListBucket
+  Resource: "arn:aws:s3:::*"
+```
+
+When using `ALL_BUCKETS_IN_ACCOUNT` mode, the Lambda also needs permission to list the
+account buckets:
+
+```yaml
+- Effect: Allow
+  Action:
+    - s3:GetObject
+  Resource: "arn:aws:s3:::*/*"
+- Effect: Allow
+  Action:
+    - s3:HeadBucket
+    - s3:ListBucket
+  Resource: "arn:aws:s3:::*"
+- Effect: Allow
+  Action:
+    - s3:ListAllMyBuckets
+  Resource: "*"
+```
+
+### Asset Transformation
+
+When asset proxying is enabled and an asset's `href` points to an S3 URL, the asset is transformed as follows:
+
+**Original asset:**
+```json
+{
+  "thumbnail": {
+    "href": "s3://my-bucket/path/to/thumbnail.png",
+    "type": "image/png",
+    "roles": ["thumbnail"]
+  }
+}
+```
+
+**Transformed asset:**
+```json
+{
+  "thumbnail": {
+    "href": "https://api.example.com/collections/my-collection/items/my-item/assets/thumbnail",
+    "type": "image/png",
+    "roles": ["thumbnail"],
+    "alternate": {
+      "s3": {
+        "href": "s3://my-bucket/path/to/thumbnail.png"
+      }
+    }
+  }
+}
+```
+
+The item or collection will also have the Alternate Assets Extension added to its `stac_extensions` array:
+
+```json
+"stac_extensions": [
+  "https://stac-extensions.github.io/alternate-assets/v1.2.0/schema.json"
+]
+```
+
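The transformation shown above can be sketched as a pure function over an Item. This is an illustration only: `proxyItemAssets` and its `apiRoot` parameter are hypothetical names, and for brevity the sketch proxies every `s3://` href (the behavior of the `ALL` mode) without consulting a bucket list.

```javascript
// Schema URL as given in the README text above.
const ALTERNATE_ASSETS_SCHEMA =
  'https://stac-extensions.github.io/alternate-assets/v1.2.0/schema.json';

// Hypothetical sketch: returns a copy of `item` with s3:// asset hrefs replaced by
// proxy endpoint URLs and the originals preserved under alternate.s3.href.
function proxyItemAssets(item, apiRoot) {
  const assets = {};
  let changed = false;

  for (const [key, asset] of Object.entries(item.assets || {})) {
    if (typeof asset.href === 'string' && asset.href.startsWith('s3://')) {
      const path = ['collections', item.collection, 'items', item.id, 'assets', key]
        .map((segment) => encodeURIComponent(segment))
        .join('/');
      assets[key] = {
        ...asset,
        href: `${apiRoot}/${path}`,
        // Keep any existing alternates and add the original S3 location.
        alternate: { ...asset.alternate, s3: { href: asset.href } },
      };
      changed = true;
    } else {
      assets[key] = asset; // non-S3 hrefs are left untouched
    }
  }

  if (!changed) return item;

  // Add the extension schema once, preserving any extensions already declared.
  const extensions = new Set([...(item.stac_extensions || []), ALTERNATE_ASSETS_SCHEMA]);
  return { ...item, assets, stac_extensions: [...extensions] };
}
```

Returning a new object rather than mutating the stored Item keeps the original `s3://` hrefs available to the code that later resolves the proxy endpoint.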
 ## Collections and filter parameters for authorization

 One key concern in stac-server is how to restrict users' access to items. These
@@ -1160,6 +1302,8 @@ The endpoints this applies to are:
 - /collections/:collectionId/items
 - /collections/:collectionId/items/:itemId
 - /collections/:collectionId/items/:itemId/thumbnail
+- /collections/:collectionId/items/:itemId/assets/:assetKey
+- /collections/:collectionId/assets/:assetKey
 - /search
 - /aggregate

@@ -1187,6 +1331,8 @@ The endpoints this applies to are:
 - /collections/:collectionId/items
 - /collections/:collectionId/items/:itemId
 - /collections/:collectionId/items/:itemId/thumbnail
+- /collections/:collectionId/items/:itemId/assets/:assetKey
+- /collections/:collectionId/assets/:assetKey
 - /search
 - /aggregate

serverless.example.yml

Lines changed: 4 additions & 0 deletions
@@ -34,6 +34,10 @@ provider:
     STAC_API_URL: "https://some-stac-server.example.com"
     CORS_ORIGIN: "https://ui.example.com"
     CORS_CREDENTIALS: true
+    # Asset Proxy Environment Variables
+    # ASSET_PROXY_BUCKET_OPTION: "NONE" # Options: NONE, ALL, ALL_BUCKETS_IN_ACCOUNT, LIST
+    # ASSET_PROXY_BUCKET_LIST: "bucket1,bucket2,bucket3" # Required only when ASSET_PROXY_BUCKET_OPTION is LIST
+    # ASSET_PROXY_URL_EXPIRY: 300 # Pre-signed URL expiry in seconds (default: 300)
   iam:
     role:
       statements:
