Merged
21 commits
a20715a
feat(router): Subgraph Timeout Configuration
ardatan Oct 31, 2025
c6f2f54
I guess?
kamilkisiela Nov 25, 2025
6951e90
Make sure Duration in config is treated as String
kamilkisiela Nov 25, 2025
c77bbe1
e2e tests, fix config and adjust naming
kamilkisiela Nov 25, 2025
08b1d20
Compile timeout expressions during config load
kamilkisiela Nov 25, 2025
4e850e5
Use a link to the vector.dev documentation for the VRL error messages
kamilkisiela Nov 25, 2025
b36912e
syntax
kamilkisiela Nov 25, 2025
a66386e
Introduce `.default` value as a fallback
kamilkisiela Nov 26, 2025
54a9540
Refactor expression handling into a unified, generic module
kamilkisiela Nov 26, 2025
da9ac80
Use VRL formatter and make executor error transparent
kamilkisiela Nov 26, 2025
ed403d8
Actually, bring back the old logic as I forgot to do subgraph url
kamilkisiela Nov 26, 2025
e089162
fmt
kamilkisiela Nov 26, 2025
ac94878
Avoid client override when pool timeout unchanged
kamilkisiela Nov 26, 2025
5eb9e41
Set tracing::instrument level to trace
kamilkisiela Nov 26, 2025
87d1a2a
Move ProgramResolutionError and improve parsing
kamilkisiela Nov 26, 2025
0121247
Fix changeset
kamilkisiela Nov 26, 2025
395e5d4
Fix RequestTimeout duration being s instead of ms
kamilkisiela Nov 26, 2025
97a59c8
changeset
kamilkisiela Nov 26, 2025
4e90e07
Remove obsolete changeset about pool_idle_timeout
kamilkisiela Nov 27, 2025
416918f
Use u128 for request timeout duration
kamilkisiela Nov 27, 2025
a1dea83
Fix duration expression documentation
kamilkisiela Nov 27, 2025
9 changes: 9 additions & 0 deletions .changeset/asd.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
---
config: minor
router: minor
executor: minor
---

# Subgraph Request Timeout Feature

Adds support for configurable subgraph request timeouts via the `traffic_shaping` configuration. The `request_timeout` option allows you to specify the maximum time the router will wait for a response from a subgraph before timing out the request. You can set a static timeout (e.g., `30s`) globally or per-subgraph, or use dynamic timeouts with VRL expressions to vary timeout values based on request characteristics. This helps protect your router from hanging requests and enables fine-grained control over how long requests to different subgraphs should be allowed to run.
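As a rough illustration of what a request timeout does, the sketch below runs a piece of work on a separate thread and stops waiting once the deadline passes. It uses only the Rust standard library. `call_with_timeout` is a hypothetical helper written for this example, not the router's implementation, which this changeset does not show.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical helper: runs `work` on a worker thread and waits at most
/// `timeout` for its result. Returns `None` if the deadline passes first.
fn call_with_timeout<T, F>(work: F, timeout: Duration) -> Option<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone if the caller timed out; ignore that.
        let _ = tx.send(work());
    });
    // `recv_timeout` returns an error once `timeout` elapses without a message.
    rx.recv_timeout(timeout).ok()
}
```

A fast closure returns `Some(value)`, while a closure that outlives the deadline yields `None`. In this sketch the worker thread keeps running after the timeout; a real HTTP client would normally cancel the in-flight request instead.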
27 changes: 27 additions & 0 deletions .gemini/styleguide.md
@@ -81,6 +81,33 @@ async fn handle(user: &User, req: &Request) -> Result<Response> {
Ok(...)
}

---

## `std::time::Duration` in `router-config` Crate

When using `std::time::Duration` in the `router-config` crate **only**, you **must** add both serde and schemars attributes:

```rust
use std::time::Duration;

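// `schemars::JsonSchema` must be derived for the `#[schemars(...)]` attribute to be recognized.
#[derive(serde::Serialize, serde::Deserialize, schemars::JsonSchema)]
struct Config {
    // Read and write the duration in human-readable form (e.g., "30s").
    #[serde(
        deserialize_with = "humantime_serde::deserialize",
        serialize_with = "humantime_serde::serialize",
    )]
    // Describe the field as a string in the generated JSON schema.
    #[schemars(with = "String")]
    timeout: Duration,
}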
```

- **`#[serde(...)]`** enables human-readable time formats (e.g., `"30s"`, `"1m30s"`) in config files.
- **`#[schemars(with = "String")]`** ensures the JSON schema correctly represents the field as a string, not as a numeric value.

**Important:** This pattern applies **only** to the `router-config` crate.
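
To make the human-readable formats mentioned above concrete, here is a minimal, standard-library-only sketch of how strings such as `"30s"` or `"1m30s"` map onto a `Duration`. `parse_simple_duration` is a hypothetical function for illustration. It is not the parser used by `humantime_serde`, and it only understands the `ms`, `s`, `m`, and `h` units.

```rust
use std::time::Duration;

/// Hypothetical, simplified parser for strings like "30s", "1m30s", or "500ms".
/// Returns `None` for empty input, unknown units, or a number without a unit.
fn parse_simple_duration(input: &str) -> Option<Duration> {
    let mut rest = input.trim();
    if rest.is_empty() {
        return None;
    }
    let mut total = Duration::ZERO;
    while !rest.is_empty() {
        // Split off the leading run of digits.
        let digits_end = rest
            .find(|c: char| !c.is_ascii_digit())
            .unwrap_or(rest.len());
        if digits_end == 0 {
            return None;
        }
        let value: u64 = rest[..digits_end].parse().ok()?;
        rest = &rest[digits_end..];

        // Split off the unit, which runs until the next digit or the end.
        let unit_end = rest
            .find(|c: char| c.is_ascii_digit())
            .unwrap_or(rest.len());
        let unit = &rest[..unit_end];
        rest = &rest[unit_end..];

        let part = match unit {
            "ms" => Duration::from_millis(value),
            "s" => Duration::from_secs(value),
            "m" => Duration::from_secs(value.checked_mul(60)?),
            "h" => Duration::from_secs(value.checked_mul(3600)?),
            _ => return None,
        };
        total = total.checked_add(part)?;
    }
    Some(total)
}
```

Because a bare number such as `"30"` carries no unit, this sketch rejects it, which is one reason the schema should describe the field as a string rather than a number.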

---

## Releasing

We are using `knope` with changesets for declaring changes. If you detect a new file in a PR under `.changeset/` directory, please confirm the following rules:
1 change: 1 addition & 0 deletions Cargo.lock

Some generated files are not rendered by default.

4 changes: 2 additions & 2 deletions bin/router/src/pipeline/progressive_override.rs
@@ -2,7 +2,7 @@ use std::collections::{BTreeMap, HashMap, HashSet};

use hive_router_config::override_labels::{LabelOverrideValue, OverrideLabelsConfig};
use hive_router_plan_executor::{
-    execution::client_request_details::ClientRequestDetails, utils::expression::compile_expression,
+    execution::client_request_details::ClientRequestDetails, expressions::CompileExpression,
};
use hive_router_query_planner::{
graph::{PlannerOverrideContext, PERCENTAGE_SCALE_FACTOR},
@@ -135,7 +135,7 @@ impl OverrideLabelsEvaluator {
static_enabled_labels.insert(label.clone());
}
LabelOverrideValue::Expression { expression } => {
-    let program = compile_expression(expression, None).map_err(|err| {
+    let program = expression.compile_expression(None).map_err(|err| {
OverrideLabelsCompileError {
label: label.clone(),
error: err.to_string(),
2 changes: 1 addition & 1 deletion bin/router/src/schema_state.rs
@@ -54,7 +54,7 @@ pub enum SupergraphManagerError {
PlannerBuilderError(#[from] PlannerError),
#[error("Failed to build authorization: {0}")]
AuthorizationMetadataError(#[from] AuthorizationMetadataError),
-    #[error("Failed to init executor: {0}")]
+    #[error(transparent)]
ExecutorInitError(#[from] SubgraphExecutorError),
#[error("Unexpected: failed to load initial supergraph")]
FailedToLoadInitialSupergraph,
125 changes: 47 additions & 78 deletions docs/README.md
@@ -16,7 +16,7 @@
|[**override\_subgraph\_urls**](#override_subgraph_urls)|`object`|Configuration for overriding subgraph URLs.<br/>Default: `{}`<br/>||
|[**query\_planner**](#query_planner)|`object`|Query planning configuration.<br/>Default: `{"allow_expose":false,"timeout":"10s"}`<br/>||
|[**supergraph**](#supergraph)|`object`|Configuration for the Federation supergraph source. By default, the router will use a local file-based supergraph source (`./supergraph.graphql`).<br/>||
-|[**traffic\_shaping**](#traffic_shaping)|`object`|Configuration for the traffic-shaping of the executor. Use these configurations to control how requests are being executed to subgraphs.<br/>Default: `{"dedupe_enabled":true,"max_connections_per_host":100,"pool_idle_timeout":"50s"}`<br/>||
+|[**traffic\_shaping**](#traffic_shaping)|`object`|Configuration for the traffic-shaping of the executor. Use these configurations to control how requests are being executed to subgraphs.<br/>Default: `{"all":{"dedupe_enabled":true,"pool_idle_timeout":"50s","request_timeout":"30s"},"max_connections_per_host":100}`<br/>||

**Additional Properties:** not allowed
**Example**
@@ -113,9 +113,11 @@ query_planner:
   timeout: 10s
 supergraph: {}
 traffic_shaping:
-  dedupe_enabled: true
+  all:
+    dedupe_enabled: true
+    pool_idle_timeout: 50s
+    request_timeout: 30s
   max_connections_per_host: 100
-  pool_idle_timeout: 50s

```

@@ -1744,7 +1746,7 @@ The path can be either absolute or relative to the router's working directory.
|Name|Type|Description|Required|
|----|----|-----------|--------|
|**path**|`string`|The path to the supergraph file.<br/><br/>Can also be set using the `SUPERGRAPH_FILE_PATH` environment variable.<br/>Format: `"path"`<br/>|yes|
-|[**poll\_interval**](#option1poll_interval)|`object`, `null`|Optional interval at which the file should be polled for changes.<br/>|yes|
+|**poll\_interval**|`string`|Optional interval at which the file should be polled for changes.<br/>If not provided, the file will only be loaded once when the router starts.<br/>|no|
|**source**|`string`|Constant Value: `"file"`<br/>|yes|

**Additional Properties:** not allowed
@@ -1766,11 +1768,11 @@ Loads a supergraph from Hive Console CDN.
|Name|Type|Description|Required|
|----|----|-----------|--------|
|**accept\_invalid\_certs**|`boolean`|Whether to accept invalid TLS certificates when connecting to the Hive Console CDN.<br/>Default: `false`<br/>|no|
-|[**connect\_timeout**](#option2connect_timeout)|`object`|Connect timeout for the Hive Console CDN requests.<br/>Default: `"10s"`<br/>|yes|
+|**connect\_timeout**|`string`|Connect timeout for the Hive Console CDN requests.<br/>Default: `"10s"`<br/>|no|
|**endpoint**|`string`|The CDN endpoint from Hive Console target.<br/><br/>Can also be set using the `HIVE_CDN_ENDPOINT` environment variable.<br/>|yes|
|**key**|`string`|The CDN Access Token from the Hive Console target.<br/><br/>Can also be set using the `HIVE_CDN_KEY` environment variable.<br/>|yes|
-|[**poll\_interval**](#option2poll_interval)|`object`|Interval at which the Hive Console should be polled for changes.<br/>Default: `"10s"`<br/>|yes|
-|[**request\_timeout**](#option2request_timeout)|`object`|Request timeout for the Hive Console CDN requests.<br/>Default: `"1m"`<br/>|yes|
+|**poll\_interval**|`string`|Interval at which the Hive Console should be polled for changes.<br/><br/>Can also be set using the `HIVE_CDN_POLL_INTERVAL` environment variable.<br/>Default: `"10s"`<br/>|no|
+|**request\_timeout**|`string`|Request timeout for the Hive Console CDN requests.<br/>Default: `"1m"`<br/>|no|
|[**retry\_policy**](#option2retry_policy)|`object`|Interval at which the Hive Console should be polled for changes.<br/>Default: `{"max_retries":10}`<br/>|yes|
|**source**|`string`|Constant Value: `"hive"`<br/>|yes|

@@ -1788,132 +1790,99 @@ retry_policy:
```


<a name="option1poll_interval"></a>
## Option 1: poll\_interval: object,null

Optional interval at which the file should be polled for changes.
If not provided, the file will only be loaded once when the router starts.


**Properties**

|Name|Type|Description|Required|
|----|----|-----------|--------|
|**nanos**|`integer`|Format: `"uint32"`<br/>Minimum: `0`<br/>|yes|
|**secs**|`integer`|Format: `"uint64"`<br/>Minimum: `0`<br/>|yes|

**Example**

```yaml
{}

```
<a name="option2retry_policy"></a>
## Option 2: retry\_policy: object

<a name="option2connect_timeout"></a>
## Option 2: connect\_timeout: object
Interval at which the Hive Console should be polled for changes.

Connect timeout for the Hive Console CDN requests.
By default, an exponential backoff retry policy is used, with 10 attempts.


**Properties**

|Name|Type|Description|Required|
|----|----|-----------|--------|
|**nanos**|`integer`|Format: `"uint32"`<br/>Minimum: `0`<br/>|yes|
|**secs**|`integer`|Format: `"uint64"`<br/>Minimum: `0`<br/>|yes|
|**max\_retries**|`integer`|The maximum number of retries to attempt.<br/><br/>Retry mechanism is based on exponential backoff, see https://docs.rs/retry-policies/latest/retry_policies/policies/struct.ExponentialBackoff.html for additional details.<br/>Format: `"uint32"`<br/>Minimum: `0`<br/>|yes|

**Example**

```yaml
10s
max_retries: 10

```

<a name="option2poll_interval"></a>
## Option 2: poll\_interval: object

Interval at which the Hive Console should be polled for changes.
<a name="traffic_shaping"></a>
## traffic\_shaping: object

Can also be set using the `HIVE_CDN_POLL_INTERVAL` environment variable.
Configuration for the traffic-shaping of the executor. Use these configurations to control how requests are being executed to subgraphs.


**Properties**

|Name|Type|Description|Required|
|----|----|-----------|--------|
|**nanos**|`integer`|Format: `"uint32"`<br/>Minimum: `0`<br/>|yes|
|**secs**|`integer`|Format: `"uint64"`<br/>Minimum: `0`<br/>|yes|
|[**all**](#traffic_shapingall)|`object`|The default configuration that will be applied to all subgraphs, unless overridden by a specific subgraph configuration.<br/>Default: `{"dedupe_enabled":true,"pool_idle_timeout":"50s","request_timeout":"30s"}`<br/>||
|**max\_connections\_per\_host**|`integer`|Limits the concurrent amount of requests/connections per host/subgraph.<br/>Default: `100`<br/>Format: `"uint"`<br/>Minimum: `0`<br/>||
|[**subgraphs**](#traffic_shapingsubgraphs)|`object`|Optional per-subgraph configurations that will override the default configuration for specific subgraphs.<br/>||

**Additional Properties:** not allowed
**Example**

```yaml
10s
all:
dedupe_enabled: true
pool_idle_timeout: 50s
request_timeout: 30s
max_connections_per_host: 100

```

<a name="option2request_timeout"></a>
## Option 2: request\_timeout: object
<a name="traffic_shapingall"></a>
### traffic\_shaping\.all: object

Request timeout for the Hive Console CDN requests.
The default configuration that will be applied to all subgraphs, unless overridden by a specific subgraph configuration.


**Properties**

|Name|Type|Description|Required|
|----|----|-----------|--------|
|**nanos**|`integer`|Format: `"uint32"`<br/>Minimum: `0`<br/>|yes|
|**secs**|`integer`|Format: `"uint64"`<br/>Minimum: `0`<br/>|yes|
|**dedupe\_enabled**|`boolean`|Enables/disables request deduplication to subgraphs.<br/><br/>When requests exactly match the hashing mechanism (e.g., subgraph name, URL, headers, query, variables), and are executed at the same time, they will<br/>be deduplicated by sharing the response of other in-flight requests.<br/>Default: `true`<br/>||
|**pool\_idle\_timeout**|`string`|Timeout for idle sockets being kept-alive.<br/>Default: `"50s"`<br/>||
|**request\_timeout**||Optional timeout configuration for requests to subgraphs.<br/><br/>Example with a fixed duration:<br/>```yaml<br/> timeout:<br/> duration: 5s<br/>```<br/><br/>Or with a VRL expression that can return a duration based on the operation kind:<br/>```yaml<br/> timeout:<br/> expression: \|<br/> if (.request.operation.type == "mutation") {<br/> "10s"<br/> } else {<br/> "15s"<br/> }<br/>```<br/>Default: `"30s"`<br/>||

**Additional Properties:** not allowed
**Example**

```yaml
1m
dedupe_enabled: true
pool_idle_timeout: 50s
request_timeout: 30s

```

<a name="option2retry_policy"></a>
## Option 2: retry\_policy: object
<a name="traffic_shapingsubgraphs"></a>
### traffic\_shaping\.subgraphs: object

Interval at which the Hive Console should be polled for changes.

By default, an exponential backoff retry policy is used, with 10 attempts.
Optional per-subgraph configurations that will override the default configuration for specific subgraphs.


**Properties**
**Additional Properties**

|Name|Type|Description|Required|
|----|----|-----------|--------|
|**max\_retries**|`integer`|The maximum number of retries to attempt.<br/><br/>Retry mechanism is based on exponential backoff, see https://docs.rs/retry-policies/latest/retry_policies/policies/struct.ExponentialBackoff.html for additional details.<br/>Format: `"uint32"`<br/>Minimum: `0`<br/>|yes|

**Example**

```yaml
max_retries: 10

```

<a name="traffic_shaping"></a>
## traffic\_shaping: object

Configuration for the traffic-shaping of the executor. Use these configurations to control how requests are being executed to subgraphs.
|[**Additional Properties**](#traffic_shapingsubgraphsadditionalproperties)|`object`|||

<a name="traffic_shapingsubgraphsadditionalproperties"></a>
#### traffic\_shaping\.subgraphs\.additionalProperties: object

**Properties**

|Name|Type|Description|Required|
|----|----|-----------|--------|
|**dedupe\_enabled**|`boolean`|Enables/disables request deduplication to subgraphs.<br/><br/>When requests exactly match the hashing mechanism (e.g., subgraph name, URL, headers, query, variables), and are executed at the same time, they will<br/>be deduplicated by sharing the response of other in-flight requests.<br/>Default: `true`<br/>||
|**max\_connections\_per\_host**|`integer`|Limits the concurrent amount of requests/connections per host/subgraph.<br/>Default: `100`<br/>Format: `"uint"`<br/>Minimum: `0`<br/>||
|**pool\_idle\_timeout**|`string`|Timeout for idle sockets being kept-alive.<br/>Default: `"50s"`<br/>||
|**dedupe\_enabled**|`boolean`, `null`|Enables/disables request deduplication to subgraphs.<br/><br/>When requests exactly match the hashing mechanism (e.g., subgraph name, URL, headers, query, variables), and are executed at the same time, they will<br/>be deduplicated by sharing the response of other in-flight requests.<br/>||
|**pool\_idle\_timeout**|`string`, `null`|Timeout for idle sockets being kept-alive.<br/>||
|**request\_timeout**||Optional timeout configuration for requests to subgraphs.<br/><br/>Example with a fixed duration:<br/>```yaml<br/> timeout:<br/> duration: 5s<br/>```<br/><br/>Or with a VRL expression that can return a duration based on the operation kind:<br/>```yaml<br/> timeout:<br/> expression: \|<br/> if (.request.operation.type == "mutation") {<br/> "10s"<br/> } else {<br/> "15s"<br/> }<br/>```<br/>||

**Additional Properties:** not allowed
**Example**

```yaml
dedupe_enabled: true
max_connections_per_host: 100
pool_idle_timeout: 50s

```


18 changes: 18 additions & 0 deletions e2e/configs/timeout_per_subgraph_dynamic.router.yaml
@@ -0,0 +1,18 @@
# yaml-language-server: $schema=../../router-config.schema.json
supergraph:
  source: file
  path: ../supergraph.graphql
traffic_shaping:
  all:
    request_timeout: 2s
    # Disable deduplication to better hunt for deadlocks in tests
    dedupe_enabled: false
  subgraphs:
    accounts:
      request_timeout:
        expression: |
          if (.request.headers."x-timeout" == "short") {
            "10s"
          } else {
            .default
          }
12 changes: 12 additions & 0 deletions e2e/configs/timeout_per_subgraph_static.router.yaml
@@ -0,0 +1,12 @@
# yaml-language-server: $schema=../../router-config.schema.json
supergraph:
  source: file
  path: ../supergraph.graphql
traffic_shaping:
  all:
    request_timeout: 2s
    # Disable deduplication to better hunt for deadlocks in tests
    dedupe_enabled: false
  subgraphs:
    accounts:
      request_timeout: 5s
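
The override behaviour this config exercises (a per-subgraph value wins, otherwise the `all` value applies) can be sketched as follows. The `TrafficShaping` struct and its field names are hypothetical simplifications for illustration and do not mirror the actual types in the `router-config` crate.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Hypothetical, simplified view of the `traffic_shaping` settings.
struct TrafficShaping {
    /// Value of `traffic_shaping.all.request_timeout`.
    all_request_timeout: Duration,
    /// Values of `traffic_shaping.subgraphs.<name>.request_timeout`, when set.
    subgraph_request_timeouts: HashMap<String, Duration>,
}

impl TrafficShaping {
    /// Returns the subgraph-specific timeout if one is configured,
    /// otherwise the timeout that applies to all subgraphs.
    fn request_timeout_for(&self, subgraph: &str) -> Duration {
        self.subgraph_request_timeouts
            .get(subgraph)
            .copied()
            .unwrap_or(self.all_request_timeout)
    }
}
```

With the values from this config, `accounts` resolves to 5s, and any subgraph without an entry (for example a hypothetical `products`) resolves to 2s.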
2 changes: 2 additions & 0 deletions e2e/src/lib.rs
@@ -16,3 +16,5 @@ mod probes;
mod supergraph;
#[cfg(test)]
mod testkit;
#[cfg(test)]
mod timeout_per_subgraph;