@pranaygp pranaygp commented Dec 16, 2025

Pranay:

corresponding workflow-server PR: https://github.com/vercel/workflow-server/pull/154

Important: This is a big change to the way workflows work. Everything is now event sourced, I introduced new event types, and I changed the shape of the step object (lastKnownError -> error and startedAt -> firstStartedAt). Event logs written by this published version of workflow will be incompatible with event logs from previous workflow versions. This doesn't affect the runtime of workflows, since those are pegged to a deployment, but it does affect observability, since the event shape looks different and the world spec has changed. For this to work correctly, the web-shared package just needs to remain compatible with viewing workflow runs that use the old schema (which I believe it does, but please double-check @VaguelySerious in case I missed anything).
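
To make the compatibility requirement concrete, here is a minimal sketch of how a viewer could read steps written under either schema. Only the two field renames (lastKnownError -> error, startedAt -> firstStartedAt) come from this PR; the `StoredStep` and `NormalizedStep` types, the `stepId` field, and the `normalizeStep` helper are hypothetical names used for illustration, not the actual web-shared code.

```typescript
// Hypothetical shapes: only the field renames are taken from this PR.
type StoredStep = { stepId: string } & Partial<{
  error: unknown; // new field name
  firstStartedAt: string; // new field name
  lastKnownError: unknown; // old field name
  startedAt: string; // old field name
}>;

interface NormalizedStep {
  stepId: string;
  error?: unknown;
  firstStartedAt?: string;
}

// Prefer the new field names and fall back to the old ones, so steps
// stored under either schema render the same way.
function normalizeStep(step: StoredStep): NormalizedStep {
  return {
    stepId: step.stepId,
    error: step.error ?? step.lastKnownError,
    firstStartedAt: step.firstStartedAt ?? step.startedAt,
  };
}
```

A fallback like this keeps the old field names out of the rest of the UI code, assuming the renames are the only difference that matters for display.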

I believe the currently failing e2e tests on the Vercel world are related to the CLI (slack x-ref). However, once we merge the workflow-server PR, we can drop the env var changes on the Vercel deployments for this PR so that it points to the main prod deployment again, and then I'll re-run the e2e tests to make sure they work :)

I also added a new docs page with diagrams to explain the event sourcing and state machine lifecycles (preview link):

Docs preview

Small: to simplify things, I also removed the run paused/resumed code, which we've never used.

Summary

Implement event-sourced architecture for runs, steps, and hooks:

  • Add run lifecycle events (run_created, run_started, run_completed, run_failed, run_cancelled)
  • Add step_retrying event for non-fatal step failures that will be retried
  • Remove fatal field from step_failed event (step_failed now implies terminal failure)
  • Rename step's lastKnownError to error for consistency with server
  • Update world-local, world-postgres, and world-vercel to create/update entities from events via events.create()
  • Entities (runs, steps, hooks) are now materializations of the event log
  • Fix hook token conflict error to use WorkflowAPIError with status 409
  • Move event log corruption check to step_created event for earlier detection
  • BREAKING CHANGE: Remove unused run_paused/run_resumed events and paused status

This makes the system faster, easier to reason about, and resilient to data inconsistencies.
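
As a rough illustration of "entities are materializations of the event log", the sketch below folds run lifecycle events into a run status. The event names are the ones listed in the summary, and the terminal statuses (completed, failed, cancelled) match the ones named later in this thread. The `pending` and `running` statuses, the `RunEvent` and `RunView` types, and the `materializeRun` function are assumptions for illustration and are not the real world implementations.

```typescript
type RunEventType =
  | "run_created"
  | "run_started"
  | "run_completed"
  | "run_failed"
  | "run_cancelled";

// "pending" and "running" are assumed names for the non-terminal statuses.
type RunStatus = "pending" | "running" | "completed" | "failed" | "cancelled";

interface RunEvent {
  eventType: RunEventType;
}

interface RunView {
  status: RunStatus;
}

const TERMINAL = new Set<RunStatus>(["completed", "failed", "cancelled"]);

// Replay the log in order: the entity is whatever the events add up to.
function materializeRun(events: readonly RunEvent[]): RunView | undefined {
  let run: RunView | undefined;
  for (const event of events) {
    if (event.eventType === "run_created") {
      if (!run) run = { status: "pending" };
      continue;
    }
    // Ignore events that arrive before creation or after a terminal state.
    if (!run || TERMINAL.has(run.status)) continue;
    switch (event.eventType) {
      case "run_started":
        run = { status: "running" };
        break;
      case "run_completed":
        run = { status: "completed" };
        break;
      case "run_failed":
        run = { status: "failed" };
        break;
      case "run_cancelled":
        run = { status: "cancelled" };
        break;
    }
  }
  return run;
}
```

Because the status is derived only from the ordered events, replaying the same log always yields the same entity, which is what makes the stored entities safe to rebuild.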

Test plan

  • TypeScript compiles
  • Unit tests pass
  • E2E tests pass

🤖 Generated with Claude Code

changeset-bot bot commented Dec 16, 2025

🦋 Changeset detected

Latest commit: 08ff4d1

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 18 packages
Name Type
@workflow/core Patch
@workflow/world Patch
@workflow/world-local Patch
@workflow/world-postgres Patch
@workflow/world-vercel Patch
@workflow/web-shared Patch
@workflow/cli Patch
@workflow/web Patch
@workflow/builders Patch
@workflow/docs-typecheck Patch
@workflow/next Patch
@workflow/nitro Patch
workflow Patch
@workflow/world-testing Patch
@workflow/astro Patch
@workflow/sveltekit Patch
@workflow/nuxt Patch
@workflow/ai Patch

vercel bot commented Dec 16, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Review Updated (UTC)
example-nextjs-workflow-turbopack Ready Ready Preview, Comment Jan 3, 2026 3:15am
example-nextjs-workflow-webpack Ready Ready Preview, Comment Jan 3, 2026 3:15am
example-workflow Ready Ready Preview, Comment Jan 3, 2026 3:15am
workbench-astro-workflow Ready Ready Preview, Comment Jan 3, 2026 3:15am
workbench-express-workflow Ready Ready Preview, Comment Jan 3, 2026 3:15am
workbench-fastify-workflow Ready Ready Preview, Comment Jan 3, 2026 3:15am
workbench-hono-workflow Ready Ready Preview, Comment Jan 3, 2026 3:15am
workbench-nitro-workflow Ready Ready Preview, Comment Jan 3, 2026 3:15am
workbench-nuxt-workflow Ready Ready Preview, Comment Jan 3, 2026 3:15am
workbench-sveltekit-workflow Ready Ready Preview, Comment Jan 3, 2026 3:15am
workbench-vite-workflow Ready Ready Preview, Comment Jan 3, 2026 3:15am
workflow-docs Ready Ready Preview, Comment Jan 3, 2026 3:15am

github-actions bot commented Dec 16, 2025

🧪 E2E Test Results

Some tests failed

Summary

Category Passed Failed Skipped Total
❌ ▲ Vercel Production 299 9 11 319
✅ 💻 Local Development 282 0 8 290
✅ 📦 Local Production 282 0 8 290
✅ 🐘 Local Postgres 282 0 8 290
✅ 🪟 Windows 29 0 0 29
❌ 🌍 Community Worlds 24 104 0 128
Total 1198 113 35 1346

❌ Failed Tests

▲ Vercel Production (9 failed)

vite (9 failed):

  • addTenWorkflow
  • addTenWorkflow
  • retryAttemptCounterWorkflow
  • crossFileErrorWorkflow - stack traces work across imported modules
  • hookCleanupTestWorkflow - hook token reuse after workflow completion
  • stepFunctionPassingWorkflow - step function references can be passed as arguments (without closure vars)
  • stepFunctionWithClosureWorkflow - step function with closure variables passed as argument
  • spawnWorkflowFromStepWorkflow - spawning a child workflow using start() inside a step
  • pathsAliasWorkflow - TypeScript path aliases resolve correctly
🌍 Community Worlds (104 failed)

mongodb (26 failed):

  • addTenWorkflow
  • addTenWorkflow
  • should work with react rendering in step
  • promiseAllWorkflow
  • promiseRaceWorkflow
  • promiseAnyWorkflow
  • readableStreamWorkflow
  • hookWorkflow
  • webhookWorkflow
  • sleepingWorkflow
  • nullByteWorkflow
  • workflowAndStepMetadataWorkflow
  • outputStreamWorkflow
  • outputStreamInsideStepWorkflow - getWritable() called inside step functions
  • fetchWorkflow
  • promiseRaceStressTestWorkflow
  • retryAttemptCounterWorkflow
  • retryableAndFatalErrorWorkflow
  • maxRetriesZeroWorkflow - maxRetries=0 runs once without retrying
  • crossFileErrorWorkflow - stack traces work across imported modules
  • hookCleanupTestWorkflow - hook token reuse after workflow completion
  • stepFunctionPassingWorkflow - step function references can be passed as arguments (without closure vars)
  • stepFunctionWithClosureWorkflow - step function with closure variables passed as argument
  • closureVariableWorkflow - nested step functions with closure variables
  • spawnWorkflowFromStepWorkflow - spawning a child workflow using start() inside a step
  • pathsAliasWorkflow - TypeScript path aliases resolve correctly

redis (26 failed):

  • addTenWorkflow
  • addTenWorkflow
  • should work with react rendering in step
  • promiseAllWorkflow
  • promiseRaceWorkflow
  • promiseAnyWorkflow
  • readableStreamWorkflow
  • hookWorkflow
  • webhookWorkflow
  • sleepingWorkflow
  • nullByteWorkflow
  • workflowAndStepMetadataWorkflow
  • outputStreamWorkflow
  • outputStreamInsideStepWorkflow - getWritable() called inside step functions
  • fetchWorkflow
  • promiseRaceStressTestWorkflow
  • retryAttemptCounterWorkflow
  • retryableAndFatalErrorWorkflow
  • maxRetriesZeroWorkflow - maxRetries=0 runs once without retrying
  • crossFileErrorWorkflow - stack traces work across imported modules
  • hookCleanupTestWorkflow - hook token reuse after workflow completion
  • stepFunctionPassingWorkflow - step function references can be passed as arguments (without closure vars)
  • stepFunctionWithClosureWorkflow - step function with closure variables passed as argument
  • closureVariableWorkflow - nested step functions with closure variables
  • spawnWorkflowFromStepWorkflow - spawning a child workflow using start() inside a step
  • pathsAliasWorkflow - TypeScript path aliases resolve correctly

starter (26 failed):

  • addTenWorkflow
  • addTenWorkflow
  • should work with react rendering in step
  • promiseAllWorkflow
  • promiseRaceWorkflow
  • promiseAnyWorkflow
  • readableStreamWorkflow
  • hookWorkflow
  • webhookWorkflow
  • sleepingWorkflow
  • nullByteWorkflow
  • workflowAndStepMetadataWorkflow
  • outputStreamWorkflow
  • outputStreamInsideStepWorkflow - getWritable() called inside step functions
  • fetchWorkflow
  • promiseRaceStressTestWorkflow
  • retryAttemptCounterWorkflow
  • retryableAndFatalErrorWorkflow
  • maxRetriesZeroWorkflow - maxRetries=0 runs once without retrying
  • crossFileErrorWorkflow - stack traces work across imported modules
  • hookCleanupTestWorkflow - hook token reuse after workflow completion
  • stepFunctionPassingWorkflow - step function references can be passed as arguments (without closure vars)
  • stepFunctionWithClosureWorkflow - step function with closure variables passed as argument
  • closureVariableWorkflow - nested step functions with closure variables
  • spawnWorkflowFromStepWorkflow - spawning a child workflow using start() inside a step
  • pathsAliasWorkflow - TypeScript path aliases resolve correctly

turso (26 failed):

  • addTenWorkflow
  • addTenWorkflow
  • should work with react rendering in step
  • promiseAllWorkflow
  • promiseRaceWorkflow
  • promiseAnyWorkflow
  • readableStreamWorkflow
  • hookWorkflow
  • webhookWorkflow
  • sleepingWorkflow
  • nullByteWorkflow
  • workflowAndStepMetadataWorkflow
  • outputStreamWorkflow
  • outputStreamInsideStepWorkflow - getWritable() called inside step functions
  • fetchWorkflow
  • promiseRaceStressTestWorkflow
  • retryAttemptCounterWorkflow
  • retryableAndFatalErrorWorkflow
  • maxRetriesZeroWorkflow - maxRetries=0 runs once without retrying
  • crossFileErrorWorkflow - stack traces work across imported modules
  • hookCleanupTestWorkflow - hook token reuse after workflow completion
  • stepFunctionPassingWorkflow - step function references can be passed as arguments (without closure vars)
  • stepFunctionWithClosureWorkflow - step function with closure variables passed as argument
  • closureVariableWorkflow - nested step functions with closure variables
  • spawnWorkflowFromStepWorkflow - spawning a child workflow using start() inside a step
  • pathsAliasWorkflow - TypeScript path aliases resolve correctly

Details by Category

❌ ▲ Vercel Production
App Passed Failed Skipped
✅ astro 28 0 1
✅ example 28 0 1
✅ express 28 0 1
✅ fastify 28 0 1
✅ hono 28 0 1
✅ nextjs-turbopack 28 0 1
✅ nextjs-webpack 28 0 1
✅ nitro 28 0 1
✅ nuxt 28 0 1
✅ sveltekit 28 0 1
❌ vite 19 9 1
✅ 💻 Local Development
App Passed Failed Skipped
✅ astro-stable 28 0 1
✅ express-stable 28 0 1
✅ fastify-stable 28 0 1
✅ hono-stable 28 0 1
✅ nextjs-turbopack-stable 29 0 0
✅ nextjs-webpack-stable 29 0 0
✅ nitro-stable 28 0 1
✅ nuxt-stable 28 0 1
✅ sveltekit-stable 28 0 1
✅ vite-stable 28 0 1
✅ 📦 Local Production
App Passed Failed Skipped
✅ astro-stable 28 0 1
✅ express-stable 28 0 1
✅ fastify-stable 28 0 1
✅ hono-stable 28 0 1
✅ nextjs-turbopack-stable 29 0 0
✅ nextjs-webpack-stable 29 0 0
✅ nitro-stable 28 0 1
✅ nuxt-stable 28 0 1
✅ sveltekit-stable 28 0 1
✅ vite-stable 28 0 1
✅ 🐘 Local Postgres
App Passed Failed Skipped
✅ astro-stable 28 0 1
✅ express-stable 28 0 1
✅ fastify-stable 28 0 1
✅ hono-stable 28 0 1
✅ nextjs-turbopack-stable 29 0 0
✅ nextjs-webpack-stable 29 0 0
✅ nitro-stable 28 0 1
✅ nuxt-stable 28 0 1
✅ sveltekit-stable 28 0 1
✅ vite-stable 28 0 1
✅ 🪟 Windows
App Passed Failed Skipped
✅ nextjs-turbopack 29 0 0
❌ 🌍 Community Worlds
App Passed Failed Skipped
✅ mongodb-dev 3 0 0
❌ mongodb 3 26 0
✅ redis-dev 3 0 0
❌ redis 3 26 0
✅ starter-dev 3 0 0
❌ starter 3 26 0
✅ turso-dev 3 0 0
❌ turso 3 26 0

📋 View full workflow run


Some E2E test jobs failed:

  • Vercel Prod: failure
  • Local Dev: success
  • Local Prod: success
  • Local Postgres: success
  • Windows: success

Check the workflow run for details.

pranaygp commented Dec 16, 2025

@pranaygp pranaygp force-pushed the pranaygp/perf-phase-3b-atomic-events branch from 6ebd4c5 to 2e46b8a Compare December 16, 2025 05:45
@pranaygp pranaygp force-pushed the pranaygp/12-04-perf_parallelize_suspension_handler_for_high-concurrency branch from eece359 to 290e879 Compare December 16, 2025 05:45
Copilot AI left a comment

Pull request overview

This PR introduces a performance optimization for event creation by adding a createBatch() method to the World interface. The implementation enables atomic batch creation of multiple events, significantly improving the wait completion logic in the runtime from O(n²) to O(n) complexity.

Key Changes

  • Added events.createBatch() method to the World interface for creating multiple events in a single operation
  • Implemented batch creation across three storage backends (world-vercel, world-postgres, world-local) with backend-specific optimizations
  • Optimized runtime wait completion logic using Set-based correlation ID lookup and batch event creation
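
The Set-based lookup described above can be sketched roughly as follows. This is not the code from packages/core/src/runtime.ts: the `wait_completed` event name, the `EventLike` and `PendingWait` types, and the `waitsToComplete` function are hypothetical, chosen only to show why building a Set of correlation IDs once turns a nested scan into a single pass.

```typescript
interface EventLike {
  eventType: string;
  correlationId?: string;
}

interface PendingWait {
  correlationId: string;
  resumeAt: Date;
}

// Build the Set once (O(n)), then each membership check is O(1) on average,
// instead of rescanning the whole event list for every wait (O(n^2)).
function waitsToComplete(
  events: readonly EventLike[],
  waits: readonly PendingWait[],
  now: Date,
): PendingWait[] {
  const completed = new Set<string>();
  for (const event of events) {
    // "wait_completed" is an assumed event name for this sketch.
    if (event.eventType === "wait_completed" && event.correlationId !== undefined) {
      completed.add(event.correlationId);
    }
  }
  return waits.filter(
    (wait) =>
      wait.resumeAt.getTime() <= now.getTime() &&
      !completed.has(wait.correlationId),
  );
}
```

The returned list is what a caller would then hand to a single batch call such as events.createBatch(), rather than creating one event per wait.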

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 8 comments.

File Description
packages/world/src/interfaces.ts Added createBatch() method signature with JSDoc documentation to the Storage events interface
packages/world-vercel/src/storage.ts Integrated batch event creation into the storage adapter
packages/world-vercel/src/events.ts Implemented createWorkflowRunEventBatch() using parallel API calls via Promise.all
packages/world-postgres/src/storage.ts Implemented batch creation using a single INSERT query with multiple values for optimal database performance
packages/world-local/src/storage.ts Implemented sequential batch creation to maintain monotonic ULID ordering for filesystem storage
packages/core/src/runtime.ts Refactored wait completion to use Set-based lookup and batch event creation, improving from O(n²) to O(n) complexity
.changeset/brave-dots-bake.md Added changeset documenting the performance improvement across all affected packages

data.eventType === 'step_created' ||
data.eventType === 'hook_created'
) {
throw new WorkflowAPIError(

The status code thrown when creating entities on a terminal run is incorrect. The code throws 409, but the suspension handler expects 410.

📝 Patch Details
diff --git a/packages/world-local/src/storage.ts b/packages/world-local/src/storage.ts
index 1a2ead1..7135f36 100644
--- a/packages/world-local/src/storage.ts
+++ b/packages/world-local/src/storage.ts
@@ -398,7 +398,7 @@ export function createStorage(basedir: string): Storage {
           ) {
             throw new WorkflowAPIError(
               `Cannot create new entities on run in terminal state "${currentRun.status}"`,
-              { status: 409 }
+              { status: 410 }
             );
           }
         }
diff --git a/packages/world-postgres/src/storage.ts b/packages/world-postgres/src/storage.ts
index 026e9e2..17e6697 100644
--- a/packages/world-postgres/src/storage.ts
+++ b/packages/world-postgres/src/storage.ts
@@ -309,7 +309,7 @@ export function createEventsStorage(drizzle: Drizzle): Storage['events'] {
         ) {
           throw new WorkflowAPIError(
             `Cannot create new entities on run in terminal state "${currentRun.status}"`,
-            { status: 409 }
+            { status: 410 }
           );
         }
       }

Analysis

Incorrect HTTP status code when creating entities on terminal runs

What fails: When events.create() is called with step_created or hook_created event types on a run in terminal state (completed, failed, or cancelled), both packages/world-local/src/storage.ts (line 401) and packages/world-postgres/src/storage.ts (line 312) throw WorkflowAPIError with status 409, but the suspension handler in packages/core/src/runtime/suspension-handler.ts (lines 83-87) expects status 410.

How to reproduce:

// Create a workflow run and complete it
const run = await storage.events.create(null, {
  eventType: 'run_created',
  eventData: { deploymentId: 'test', workflowName: 'test', input: [] }
});

// Move run to terminal state
await storage.events.create(run.run.runId, {
  eventType: 'run_completed',
  eventData: { output: 'done' }
});

// Try to create a step on the completed run
try {
  await storage.events.create(run.run.runId, {
    eventType: 'step_created',
    correlationId: 'step1',
    eventData: { stepName: 'test-step', input: [] }
  });
} catch (err) {
  console.log(err.status); // Outputs 409, but handler expects 410
}

Result: The suspension handler receives status 409 and logs "Hook already exists, continuing" instead of the correct "Workflow run has already completed, skipping hook" message, resulting in misleading log messages when hooks/steps cannot be created due to the run being in a terminal state.

Expected: Status code 410 is thrown for terminal run entity creation, while status 409 remains for duplicate token conflicts (verified: the duplicate token checks at line 776 in world-local and line 704 in world-postgres correctly use 409).

Files fixed:

  • packages/world-local/src/storage.ts line 401: Changed { status: 409 } to { status: 410 }
  • packages/world-postgres/src/storage.ts line 312: Changed { status: 409 } to { status: 410 }
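
To show why the two status codes need to stay distinct, here is a minimal sketch of a handler that branches on them. It is not the actual code in packages/core/src/runtime/suspension-handler.ts. The `WorkflowAPIError` class below is a local stand-in that only mirrors the `new WorkflowAPIError(message, { status })` call shape visible in the patch, and `describeCreateFailure` is a hypothetical helper. The two log messages are the ones quoted in the analysis above.

```typescript
// Local stand-in for illustration; the real class lives in the workflow packages.
class WorkflowAPIError extends Error {
  readonly status?: number;

  constructor(message: string, options: { status?: number } = {}) {
    super(message);
    // Keep instanceof working even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = "WorkflowAPIError";
    this.status = options.status;
  }
}

// Map the status code to the message the handler should log.
// Anything that is not a recognised conflict is rethrown unchanged.
function describeCreateFailure(err: unknown): string {
  if (err instanceof WorkflowAPIError) {
    if (err.status === 409) return "Hook already exists, continuing";
    if (err.status === 410) {
      return "Workflow run has already completed, skipping hook";
    }
  }
  throw err;
}
```

With the terminal-run check throwing 409, the first branch would be taken for a completed run, which is the misleading log message this fix avoids.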
