Skip to content

Conversation

@haacked
Copy link
Collaborator

@haacked haacked commented Dec 16, 2025

Problem

Feature flag requests can fail due to transient issues like read timeouts, connection errors, or temporary server problems. Currently these failures bubble up immediately without any retry attempt.

Example error:

[FEATURE FLAGS] Unable to get flag remotely: HTTPSConnectionPool(host='us.i.posthog.com', port=443): Read timed out. (read timeout=3)

Solution

Create a dedicated HTTP session for feature flag requests configured with urllib3's Retry:

  • 2 retries (3 total attempts) with exponential backoff
  • Retries on: network failures (connect/read errors), transient server errors (408, 500, 502, 503, 504)
  • Does NOT retry on: rate limits (429), quota errors (402), client errors (4xx)

This leverages urllib3's battle-tested retry logic rather than adding application-level retry code.

Why not retry 429?

Rate limits indicate the client should back off, not immediately retry. Retrying would make the problem worse.

Related

Examples

I also updated example.py so it can run without a personal api key set. This allows testing the sdk against /flags.

This comment was marked as outdated.

@haacked haacked force-pushed the haacked/retries branch 2 times, most recently from 9bb9617 to 4b16ae9 Compare December 16, 2025 19:31
@haacked haacked changed the title fix(flags): Add retry logic with jitter for feature flag requests feat(flags): Add retry support for feature flag requests Dec 16, 2025
@haacked haacked requested a review from Copilot December 16, 2025 20:01
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@haacked haacked force-pushed the haacked/retries branch 2 times, most recently from 8fb57c9 to 08687d7 Compare December 16, 2025 21:38
@haacked haacked requested a review from Copilot December 16, 2025 21:41
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Use urllib3's built-in Retry mechanism for feature flag POST requests
instead of application-level retry logic. This is simpler and leverages
well-tested library code.

Key changes:
- Add `RETRY_STATUS_FORCELIST` = [408, 500, 502, 503, 504]
- Add `_build_flags_session()` with POST retries and `status_forcelist`
- Update `flags()` to use dedicated flags session
- Add tests for retry configuration and session usage

The flags session retries on:
- Network failures (connect/read errors)
- Transient server errors (408, 500, 502, 503, 504)

It does NOT retry on:
- 429 (rate limit) - need to wait, not hammer
- 402 (quota limit) - won't resolve with retries
@haacked haacked marked this pull request as ready for review December 16, 2025 22:00
@haacked haacked requested a review from a team December 16, 2025 22:00
@posthog-project-board-bot posthog-project-board-bot bot moved this to In Review in Feature Flags Dec 16, 2025
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

Copy link
Contributor

@dustinbyrne dustinbyrne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good - one comment about the network failures 👍

@github-project-automation github-project-automation bot moved this from In Review to Approved in Feature Flags Dec 16, 2025
Add tests that verify actual retry behavior, not just configuration:

- test_retries_on_503_then_succeeds: Spins up a local HTTP server that
  returns 503 twice then 200, verifying 3 requests are made
- test_connection_errors_are_retried: Verifies connection errors trigger
  retries by measuring elapsed time with backoff

Both tests use dynamically allocated ports for CI safety.
@haacked haacked requested a review from dustinbyrne December 16, 2025 23:38
@posthog-project-board-bot posthog-project-board-bot bot moved this from Approved to In Review in Feature Flags Dec 16, 2025
@haacked haacked enabled auto-merge (squash) December 16, 2025 23:39
@haacked haacked merged commit 5d0bae1 into master Dec 16, 2025
16 checks passed
@haacked haacked deleted the haacked/retries branch December 16, 2025 23:41
@github-project-automation github-project-automation bot moved this from In Review to Done in Feature Flags Dec 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

3 participants