Invoke the endpoints health check in web o11y and render result in toolbar #615
🦋 Changeset detected
Latest commit: 61aefad
The changes in this PR will be included in the next version bump. This PR includes changesets to release 5 packages.
🧪 E2E Test Results
❌ Some tests failed
❌ Failed Tests
🪟 Windows (27 failed)
- nextjs-turbopack (27 failed)
🌍 Community Worlds (33 failed)
- mongodb (1 failed)
- redis (1 failed)
- starter-dev (3 failed)
- starter (27 failed)
- turso (1 failed)
Details by Category
- ✅ ▲ Vercel Production
- ✅ 💻 Local Development
- ✅ 📦 Local Production
- ✅ 🐘 Local Postgres
- ❌ 🪟 Windows
- ❌ 🌍 Community Worlds
📊 Benchmark Results
Each benchmark below was run in 💻 Local Development and ▲ Production (Vercel), with 🔍 Observability links for Express, Next.js (Turbopack), and Nitro:
- workflow with no steps
- workflow with 1 step
- workflow with 10 sequential steps
- Promise.all with 10 concurrent steps
- Promise.all with 25 concurrent steps
- Promise.race with 10 concurrent steps
- Promise.race with 25 concurrent steps
- Stream Benchmarks (includes TTFB metrics): workflow with stream
Summary: Fastest Framework by World and Fastest World by Framework, with the winner determined by most benchmark wins.
useEffect(() => {
  const configKey = getConfigKey(config);
  const cached = getSessionHealthCheck(configKey);

  // If we have a cached result from this session, use it
  if (cached) {
    setHealthCheck(cached);
    return;
  }

  // Otherwise, perform the health check
  const performHealthCheck = async () => {
    setIsChecking(true);

    // Determine base URL based on config
    const port = config.port || '3000';
    const baseUrl = `http://localhost:${port}`;

    const [flowResult, stepResult] = await Promise.all([
      checkEndpointHealth(baseUrl, 'flow'),
      checkEndpointHealth(baseUrl, 'step'),
    ]);

    const result: HealthCheckResult = {
      flow: flowResult.success ? 'success' : 'error',
      step: stepResult.success ? 'success' : 'error',
      flowMessage: flowResult.message,
      stepMessage: stepResult.message,
      checkedAt: new Date().toISOString(),
    };

    setHealthCheck(result);
    setSessionHealthCheck(configKey, result);
    setIsChecking(false);
  };

  performHealthCheck();
}, [config]);
The useEffect hook starts an async health check operation but lacks a cleanup function to handle component unmounting. This can cause React state update warnings if the component unmounts before the async operation completes.
📝 Patch Details
diff --git a/packages/web/src/components/display-utils/endpoints-health-status.tsx b/packages/web/src/components/display-utils/endpoints-health-status.tsx
index bad42f3..c0b19f2 100644
--- a/packages/web/src/components/display-utils/endpoints-health-status.tsx
+++ b/packages/web/src/components/display-utils/endpoints-health-status.tsx
@@ -58,17 +58,23 @@ function getConfigKey(config: WorldConfig): string {
 async function checkEndpointHealth(
   baseUrl: string,
-  endpoint: 'flow' | 'step'
+  endpoint: 'flow' | 'step',
+  signal?: AbortSignal
 ): Promise<{ success: boolean; message: string }> {
   try {
     const url = new URL(
       `/.well-known/workflow/v1/${endpoint}?__health`,
       baseUrl
     );
+    // Combine provided signal with timeout signal
+    const timeoutSignal = AbortSignal.timeout(5000);
+    const combinedSignal = signal
+      ? AbortSignal.any([signal, timeoutSignal])
+      : timeoutSignal;
+
     const response = await fetch(url.toString(), {
       method: 'POST',
-      // Short timeout for health checks
-      signal: AbortSignal.timeout(5000),
+      signal: combinedSignal,
     });
     if (response.ok) {
@@ -103,8 +109,13 @@ export function EndpointsHealthStatus({ config }: EndpointsHealthStatusProps) {
       return;
     }
+    // Track whether the component is still mounted
+    let isMounted = true;
+    const abortController = new AbortController();
+
     // Otherwise, perform the health check
     const performHealthCheck = async () => {
+      if (!isMounted || abortController.signal.aborted) return;
       setIsChecking(true);
       // Determine base URL based on config
@@ -112,10 +123,13 @@ export function EndpointsHealthStatus({ config }: EndpointsHealthStatusProps) {
       const baseUrl = `http://localhost:${port}`;
       const [flowResult, stepResult] = await Promise.all([
-        checkEndpointHealth(baseUrl, 'flow'),
-        checkEndpointHealth(baseUrl, 'step'),
+        checkEndpointHealth(baseUrl, 'flow', abortController.signal),
+        checkEndpointHealth(baseUrl, 'step', abortController.signal),
       ]);
+      // Only update state if the component is still mounted
+      if (!isMounted || abortController.signal.aborted) return;
+
       const result: HealthCheckResult = {
         flow: flowResult.success ? 'success' : 'error',
         step: stepResult.success ? 'success' : 'error',
@@ -130,6 +144,12 @@ export function EndpointsHealthStatus({ config }: EndpointsHealthStatusProps) {
     };
     performHealthCheck();
+
+    // Cleanup function: cancel pending requests and mark component as unmounted
+    return () => {
+      isMounted = false;
+      abortController.abort();
+    };
   }, [config]);
   const allSuccess =
Analysis
Missing cleanup function in useEffect allows state updates on unmounted component
What fails: The EndpointsHealthStatus component's useEffect hook (lines 96-133 in packages/web/src/components/display-utils/endpoints-health-status.tsx) initiates async fetch operations without a cleanup function. When the component unmounts before the async operation completes, the subsequent setHealthCheck() and setIsChecking() calls attempt to update state on an unmounted component.
How to reproduce:
- Navigate to a page containing the EndpointsHealthStatus component
- Immediately navigate away before the health check completes (within 5 seconds)
- In development mode with React's StrictMode, observe the warning in browser console
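Result: On React versions before 18, the console shows the warning: "Can't perform a React state update on an unmounted component. This is a no-op, but it indicates a memory leak in your application." (React 18 removed this warning, although the in-flight requests are still left running after unmount.)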
Expected: The component should implement proper cleanup to cancel pending async operations when unmounting, preventing state updates on unmounted components. Per React documentation on removing effect dependencies, effects with async operations should return cleanup functions that abort pending requests.
Fix implemented:
- Added AbortController within the effect to manage async operation lifecycles
- Added isMounted flag to track component mount state
- Updated checkEndpointHealth() function to accept optional AbortSignal parameter
- Combined component's abort signal with existing timeout signal using AbortSignal.any()
- Added guard checks before state updates to prevent updates on unmounted components
- Implemented cleanup function that aborts pending requests and marks component as unmounted
This ensures all pending fetch requests are cancelled when the component unmounts or the effect re-runs, preventing "state update on unmounted component" warnings and memory leaks.
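The signal-combining part of this fix can be tried outside React. The following is a minimal sketch, assuming a runtime that provides `AbortSignal.any` and `AbortSignal.timeout` (recent browsers, or Node.js 20.3 and later). `withTimeout`, `unmountController`, and `fetchSignal` are hypothetical names chosen for illustration, not identifiers from the PR.

```typescript
// Hypothetical helper: returns a signal that aborts when either the caller's
// signal aborts or the timeout elapses, whichever happens first.
function withTimeout(signal: AbortSignal | undefined, ms: number): AbortSignal {
  const timeoutSignal = AbortSignal.timeout(ms);
  return signal ? AbortSignal.any([signal, timeoutSignal]) : timeoutSignal;
}

// The effect would create one controller per run and abort it in its cleanup.
const unmountController = new AbortController();
const fetchSignal = withTimeout(unmountController.signal, 5000);

// Before cleanup the combined signal is still live.
const abortedBeforeCleanup = fetchSignal.aborted;

// Simulates the cleanup function running on unmount or on an effect re-run.
unmountController.abort();

// The combined signal follows the controller, so a fetch using it is cancelled.
const abortedAfterCleanup = fetchSignal.aborted;
```

Passing `fetchSignal` to `fetch` means one abort call in the cleanup cancels both endpoint requests, while the 5 second timeout keeps working when no unmount happens.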
Pull request overview
This PR adds endpoint health monitoring to the web observability UI by introducing a new health status component that actively checks the workflow and step endpoints and displays their status in the toolbar. The implementation adds CORS support to the health check endpoints in the core runtime to enable cross-origin health checks from the web UI.
- Added a new EndpointsHealthStatus component that performs health checks on workflow endpoints and caches results in session storage
- Enhanced the core runtime's health check handler with CORS headers and OPTIONS preflight support
- Integrated the health status display into the web UI toolbar alongside the existing connection status
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| packages/web/src/components/display-utils/endpoints-health-status.tsx | New component that checks workflow/step endpoint health, caches results in sessionStorage, and displays status with detailed tooltips |
| packages/web/src/app/layout-client.tsx | Integrates EndpointsHealthStatus component into the toolbar with appropriate spacing |
| packages/core/src/runtime.ts | Adds CORS headers to health check responses and handles OPTIONS preflight requests for cross-origin support |
| .changeset/twenty-parents-type.md | Documents CORS headers addition to core package |
| .changeset/empty-yaks-follow.md | Documents endpoint health check feature addition to web package |
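
The session-storage caching described in the overview could look roughly like the sketch below. It is not the PR's implementation: `KeyValueStore`, `CachedHealthCheck`, `HEALTH_CACHE_PREFIX`, and the three functions are hypothetical names, and an in-memory store stands in for `window.sessionStorage` so the sketch runs outside a browser.

```typescript
// Minimal subset of the Web Storage API used here; window.sessionStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

interface CachedHealthCheck {
  flow: 'success' | 'error';
  step: 'success' | 'error';
  checkedAt: string;
}

// Hypothetical key prefix, chosen for this sketch.
const HEALTH_CACHE_PREFIX = 'health-check:';

function readCachedHealthCheck(
  store: KeyValueStore,
  configKey: string
): CachedHealthCheck | null {
  const raw = store.getItem(HEALTH_CACHE_PREFIX + configKey);
  if (raw === null) return null;
  try {
    return JSON.parse(raw) as CachedHealthCheck;
  } catch {
    return null; // Treat a corrupt entry as a cache miss.
  }
}

function writeCachedHealthCheck(
  store: KeyValueStore,
  configKey: string,
  result: CachedHealthCheck
): void {
  store.setItem(HEALTH_CACHE_PREFIX + configKey, JSON.stringify(result));
}

// In-memory stand-in for sessionStorage.
function createMemoryStore(): KeyValueStore {
  const entries = new Map<string, string>();
  return {
    getItem: (key) => entries.get(key) ?? null,
    setItem: (key, value) => {
      entries.set(key, value);
    },
  };
}
```

Taking the store as a parameter keeps the cache logic testable and avoids touching `sessionStorage` during server-side rendering.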
// Determine base URL based on config
const port = config.port || '3000';
const baseUrl = `http://localhost:${port}`;
Copilot
AI
Dec 15, 2025
The health check always uses localhost regardless of the backend configuration. When config.backend is 'vercel' or another non-local backend, this will fail to check the actual endpoints being used. The component should derive the appropriate base URL based on the backend type, similar to how other components like ConnectionStatus handle different backends.
packages/core/src/runtime.ts
Outdated
 * Allows the observability UI to check endpoint health from a different origin.
 */
const HEALTH_CHECK_CORS_HEADERS = {
  'Access-Control-Allow-Origin': '*',
Copilot
AI
Dec 15, 2025
The CORS configuration uses a wildcard 'Access-Control-Allow-Origin: *' which allows any origin to access the health check endpoints. While this might be acceptable for non-sensitive health checks, it's worth considering whether this should be more restrictive, especially if the endpoints could leak information about the system's state or configuration. Consider documenting the security implications or restricting to known origins.
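One way to restrict the header to known origins, as this comment suggests, is sketched below. It is an illustration under assumptions, not code from the PR: `healthCheckCorsHeaders` and its allow-list parameter are hypothetical, and the `POST, OPTIONS` method list is inferred from the health check using POST with an OPTIONS preflight.

```typescript
// Hypothetical helper: echo the request's Origin only when it is on the allow-list.
function healthCheckCorsHeaders(
  requestOrigin: string | null,
  allowedOrigins: readonly string[]
): Record<string, string> {
  if (requestOrigin === null || !allowedOrigins.includes(requestOrigin)) {
    // No CORS headers: the browser will not expose the response to that origin.
    return {};
  }
  return {
    'Access-Control-Allow-Origin': requestOrigin,
    'Access-Control-Allow-Methods': 'POST, OPTIONS',
    // The response now depends on the Origin header, so caches must key on it.
    Vary: 'Origin',
  };
}
```

The trade-off is that the runtime then needs to know where the observability UI is hosted, which the wildcard avoids.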
// Create a unique key based on relevant config values
return `${config.backend || 'local'}-${config.port || '3000'}`;
Copilot
AI
Dec 15, 2025
The config key generation only considers 'backend' and 'port' fields, but this may not uniquely identify all possible backend configurations. For example, Vercel backends with different env/project/team combinations or Postgres backends with different database URLs would share the same cache key if they happen to use the same port. This could lead to cached health check results being incorrectly reused across different backend configurations. Consider including all relevant config fields that uniquely identify the backend (e.g., env, project, team, dataDir, postgresUrl).
- // Create a unique key based on relevant config values
- return `${config.backend || 'local'}-${config.port || '3000'}`;
+ // Create a unique key based on all relevant config values that uniquely identify the backend
+ // Include backend, port, and backend-specific fields
+ const keyObj: Record<string, unknown> = {
+   backend: config.backend || 'local',
+   port: config.port || '3000',
+ };
+ // Add backend-specific fields
+ if ('env' in config && config.env) keyObj.env = config.env;
+ if ('project' in config && config.project) keyObj.project = config.project;
+ if ('team' in config && config.team) keyObj.team = config.team;
+ if ('dataDir' in config && config.dataDir) keyObj.dataDir = config.dataDir;
+ if ('postgresUrl' in config && config.postgresUrl) keyObj.postgresUrl = config.postgresUrl;
+ // Add any other fields that may uniquely identify the backend as needed
+ return JSON.stringify(keyObj);
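
To see the collision this comment describes, the sketch below compares the original two-field key with a key built from all identifying fields. `SketchWorldConfig` is a hypothetical shape (the real WorldConfig type may differ), and `legacyConfigKey` and `sketchConfigKey` are illustrative names rather than functions from the PR.

```typescript
// Hypothetical config shape; the real WorldConfig type in the PR may differ.
interface SketchWorldConfig {
  backend?: string;
  port?: string;
  env?: string;
  project?: string;
  team?: string;
  dataDir?: string;
  postgresUrl?: string;
}

// Key format from the original getConfigKey: only backend and port.
function legacyConfigKey(config: SketchWorldConfig): string {
  return `${config.backend || 'local'}-${config.port || '3000'}`;
}

// Fields that identify a backend, listed in a fixed order.
const KEY_FIELDS = [
  'backend', 'port', 'env', 'project', 'team', 'dataDir', 'postgresUrl',
] as const;

function sketchConfigKey(config: SketchWorldConfig): string {
  const normalized = {
    ...config,
    backend: config.backend || 'local',
    port: config.port || '3000',
  };
  // Walk the fields in a fixed order so the key does not depend on the
  // property order of the incoming object; skip unset fields.
  const parts = KEY_FIELDS
    .filter((field) => normalized[field])
    .map((field) => [field, normalized[field]]);
  return JSON.stringify(parts);
}

const projectA: SketchWorldConfig = { backend: 'vercel', project: 'a' };
const projectB: SketchWorldConfig = { backend: 'vercel', project: 'b' };

// The two-field key cannot tell the two projects apart; the full key can.
const legacyKeysCollide = legacyConfigKey(projectA) === legacyConfigKey(projectB);
const fullKeysCollide = sketchConfigKey(projectA) === sketchConfigKey(projectB);
```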
useEffect(() => {
  const configKey = getConfigKey(config);
  const cached = getSessionHealthCheck(configKey);

  // If we have a cached result from this session, use it
  if (cached) {
    setHealthCheck(cached);
    return;
  }

  // Otherwise, perform the health check
  const performHealthCheck = async () => {
    setIsChecking(true);

    // Determine base URL based on config
    const port = config.port || '3000';
    const baseUrl = `http://localhost:${port}`;

    const [flowResult, stepResult] = await Promise.all([
      checkEndpointHealth(baseUrl, 'flow'),
      checkEndpointHealth(baseUrl, 'step'),
    ]);

    const result: HealthCheckResult = {
      flow: flowResult.success ? 'success' : 'error',
      step: stepResult.success ? 'success' : 'error',
      flowMessage: flowResult.message,
      stepMessage: stepResult.message,
      checkedAt: new Date().toISOString(),
    };

    setHealthCheck(result);
    setSessionHealthCheck(configKey, result);
    setIsChecking(false);
  };

  performHealthCheck();
}, [config]);
Copilot
AI
Dec 15, 2025
The useEffect hook has 'config' as a dependency, but it references the entire config object. In React, this will trigger the health check every time any field in the config object changes (even if it's a new object reference with the same values). Consider using a more stable dependency like a memoized config key from getConfigKey(config), or add specific config fields that actually affect the health check (backend, port) to the dependency array.
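The difference between depending on the object and depending on a derived key can be shown without React. The sketch below is a simplified imitation of React's dependency comparison (each entry compared with `Object.is`), not React's actual code; `depsChanged`, `keyOf`, and the two config constants are hypothetical names, and the key format is the one from the `getConfigKey` shown in this review.

```typescript
// Simplified imitation of how React decides whether an effect's dependencies
// changed between renders: entries are compared pairwise with Object.is.
function depsChanged(prev: readonly unknown[], next: readonly unknown[]): boolean {
  return (
    prev.length !== next.length ||
    prev.some((value, index) => !Object.is(value, next[index]))
  );
}

// Two renders that receive equal-valued but distinct config objects.
const firstRenderConfig = { backend: 'local', port: '3000' };
const secondRenderConfig = { backend: 'local', port: '3000' };

// Key format taken from the getConfigKey shown in this review.
const keyOf = (c: { backend?: string; port?: string }): string =>
  `${c.backend || 'local'}-${c.port || '3000'}`;

// Object identity differs, so an effect depending on [config] would re-run.
const rerunsWithObjectDep = depsChanged([firstRenderConfig], [secondRenderConfig]);

// The derived strings are equal, so an effect depending on [configKey] would not.
const rerunsWithKeyDep = depsChanged(
  [keyOf(firstRenderConfig)],
  [keyOf(secondRenderConfig)]
);
```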
const performHealthCheck = async () => {
  setIsChecking(true);

  // Determine base URL based on config
  const port = config.port || '3000';
  const baseUrl = `http://localhost:${port}`;

  const [flowResult, stepResult] = await Promise.all([
    checkEndpointHealth(baseUrl, 'flow'),
    checkEndpointHealth(baseUrl, 'step'),
  ]);

  const result: HealthCheckResult = {
    flow: flowResult.success ? 'success' : 'error',
    step: stepResult.success ? 'success' : 'error',
    flowMessage: flowResult.message,
    stepMessage: stepResult.message,
    checkedAt: new Date().toISOString(),
  };

  setHealthCheck(result);
  setSessionHealthCheck(configKey, result);
  setIsChecking(false);
};

performHealthCheck();
Copilot
AI
Dec 15, 2025
The useEffect that performs health checks does not handle cleanup when the component unmounts or when the config changes mid-check. If the component unmounts or config changes while async health checks are in progress, the state updates on lines 127-129 will execute on an unmounted component, causing a React warning. Consider using an AbortController to cancel in-flight requests and check if the component is still mounted before calling setState.
@copilot open a new pull request to apply changes based on this feedback, this feedback, this feedback, and this feedback.
@TooTallNate I've opened a new pull request, #617, to work on those changes. Once the pull request is ready, I'll request review from you.
If the health check fails, we should make sure the component visually stands out and shows a link to a docs page that talks about enabling the proxy, maybe a new error slug page, "routes-not-reachable".
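A rough sketch of how a result could be mapped to a more prominent error state that carries the proposed slug is shown below. Everything here is hypothetical: `HealthBadge`, `toHealthBadge`, and `ROUTES_NOT_REACHABLE_SLUG` are illustrative names, and the docs URL the slug would resolve to is left out because it has not been decided in this thread.

```typescript
type EndpointStatus = 'success' | 'error';

// Hypothetical view model for the toolbar indicator.
interface HealthBadge {
  level: 'ok' | 'error';
  failedEndpoints: string[];
  // Docs error slug to link to when a check fails, otherwise null.
  errorSlug: string | null;
}

// Slug proposed in the comment above; the page itself does not exist yet.
const ROUTES_NOT_REACHABLE_SLUG = 'routes-not-reachable';

function toHealthBadge(result: {
  flow: EndpointStatus;
  step: EndpointStatus;
}): HealthBadge {
  const failedEndpoints = (['flow', 'step'] as const).filter(
    (endpoint) => result[endpoint] === 'error'
  );
  if (failedEndpoints.length === 0) {
    return { level: 'ok', failedEndpoints: [], errorSlug: null };
  }
  // Any failing endpoint switches the badge to the error level with a docs link.
  return { level: 'error', failedEndpoints, errorSlug: ROUTES_NOT_REACHABLE_SLUG };
}
```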
// Determine base URL based on config
const port = config.port || '3000';
const baseUrl = `http://localhost:${port}`;
this would fail if the user deployed this observability app on vercel right? could we use an env var like NEXT_PUBLIC_URL or something?
Seems to me like the World/runtime should have a canonical way of getting its public URL, which is partly in the versioning spec, but in the meantime, we should disable this feature for non-local worlds
The main motivation for this feature is to diagnose issues with Vercel deployments where workflows are not getting past the start() function (stuck in pending), so we should not disable for non-local worlds.
@TooTallNate sure, but this PR doesn't do that, since it only calls localhost from the web UI, which isn't on the same host as the vercel deployment. This PR would need to be extended to include a conditional for world-vercel that checks for DEPLOYMENT_ID or DEPLOYMENT_URL or whatever we use, and ping against that, and probably would also need to use the vercel auth to bypass the preview environment protection
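The conditional discussed in this thread might be sketched as a small resolver that returns no URL when none is known, so the caller can skip or disable the check. This is an assumption-laden illustration: `SketchConfig`, its `deploymentUrl` field, and `resolveHealthCheckBaseUrl` are hypothetical, and it does not cover authentication for protected preview deployments.

```typescript
// Hypothetical config shape for this sketch.
interface SketchConfig {
  backend?: string;
  port?: string;
  // Hypothetical field: public URL of the deployment, if the world exposes one.
  deploymentUrl?: string;
}

// Returns the base URL to probe, or null when no reachable URL is known,
// in which case the caller can skip the health check instead of pinging localhost.
function resolveHealthCheckBaseUrl(config: SketchConfig): string | null {
  const backend = config.backend || 'local';
  if (backend === 'local') {
    return `http://localhost:${config.port || '3000'}`;
  }
  return config.deploymentUrl ?? null;
}
```

Returning `null` rather than falling back to localhost keeps a missing deployment URL from being reported as an endpoint failure.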
VaguelySerious
left a comment
The idea looks good. Whatever iteration Copilot is doing, please merge it back into this PR so it can be reviewed as a single package.
Then, we can either disable this for world-vercel (initially) and merge, then follow-up with world-vercel code, or fix the world-vercel use here directly and then merge
VaguelySerious
left a comment
I also think we can merge the CORS part early - I extracted that to #624 if you want to ship that sooner
Merged the CORS changes in #624 so this PR can focus on web changes only
…eb_o11y_and_render_result_in_toolbar
// Determine base URL based on config
const port = config.port || '3000';
const baseUrl = `http://localhost:${port}`;
- // Determine base URL based on config
- const port = config.port || '3000';
- const baseUrl = `http://localhost:${port}`;
+ // Determine base URL based on backend type
+ // For local backend: use localhost with configured port
+ // For deployed backends (vercel, postgres): use current origin
+ let baseUrl: string;
+ const backend = config.backend || 'local';
+ if (backend === 'local') {
+   const port = config.port || '3000';
+   baseUrl = `http://localhost:${port}`;
+ } else {
+   // For deployed backends, use the current window origin
+   baseUrl = typeof window !== 'undefined' ? window.location.origin : '';
+ }
The health check component hard-codes http://localhost:${port} for all backend types, but this will fail for deployed backends (Vercel, Postgres) where the server is not running on localhost.
Analysis
Health check hardcodes localhost, fails for deployed Vercel/Postgres backends
What fails: The EndpointsHealthStatus component constructs health check URLs using a hardcoded http://localhost:${port} base, which fails when the frontend is deployed to Vercel or to a separate deployment. The component needs to check whether the endpoints at /.well-known/workflow/v1/flow and /.well-known/workflow/v1/step are available, but it only attempts to reach them on localhost.
How to reproduce:
- Deploy the observability UI frontend to Vercel (or any non-localhost domain)
- Set backend config to 'vercel' or 'postgres'
- Load the UI in a browser
- Observe the health status indicator showing "Endpoint issues" even though the endpoints are actually healthy
Result: Browser attempts to fetch from http://localhost:3000/.well-known/workflow/v1/flow which results in connection failures (localhost refers to the user's machine, not the server). The UI always shows "Endpoint issues" in the tooltip.
Expected: For 'vercel' and 'postgres' backends, the health check should use window.location.origin to construct the URL, so it checks endpoints on the current domain where the frontend is deployed. The localhost approach should only be used for the 'local' backend where the frontend actually runs on localhost.
Fix: Modified EndpointsHealthStatus to check the backend config type: if it is 'local', use http://localhost:${port} (original behavior); otherwise, use window.location.origin to access endpoints on the current deployment domain.
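
How the health check URL resolves against different base URLs can be checked in isolation. The sketch below mirrors the `new URL(path, baseUrl)` call from `checkEndpointHealth` in the patch; `buildHealthCheckUrl` is a hypothetical wrapper name, and `https://example.com/app/` is a made-up base used only to show path handling.

```typescript
// Hypothetical wrapper around the URL construction used by checkEndpointHealth.
function buildHealthCheckUrl(baseUrl: string, endpoint: 'flow' | 'step'): string {
  return new URL(`/.well-known/workflow/v1/${endpoint}?__health`, baseUrl).toString();
}

// Local backend: the path is appended to the localhost origin.
const localFlowUrl = buildHealthCheckUrl('http://localhost:3000', 'flow');

// A base URL with its own path: the leading "/" replaces that path, so the
// check always targets the origin root.
const prefixedStepUrl = buildHealthCheckUrl('https://example.com/app/', 'step');
```

An empty base URL, which the suggested change produces when `window` is undefined, makes `new URL` throw a `TypeError`, so that branch relies on the surrounding `try` in `checkEndpointHealth` to report an error result.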

Added endpoint health checks to the web UI and improved CORS support for health check endpoints.
What changed?
- EndpointsHealthStatus component that checks and displays the health of workflow endpoints
Why make this change?
This change improves observability by providing immediate visual feedback about the health of critical workflow endpoints directly in the UI. It helps users quickly identify connectivity issues between the frontend and backend services, making troubleshooting easier. The CORS headers enable the health checks to work properly across different origins.