@mo-radwan1
Collaborator

Summary

Completes the backend architecture split for the Platform Service, making it fully independent of the WebUI Gateway.

Stories Completed:

  • ✅ DATAGO-118497: Platform service receives agent heartbeats
  • ✅ DATAGO-118496: Platform validates bearer tokens with OAuth
  • ✅ Architecture improvement: Shared utilities layer

Platform Service Enhancements (DATAGO-118497)

Background Tasks Migrated

  • Added broker connection for heartbeat monitoring and agent registry
  • Implemented heartbeat listener (monitors deployer status)
  • Implemented deployment status checker (runs every 60s)
  • Added agent registry for deployment verification
  • All background tasks now run in Platform Service, not WebUI Gateway

Message Publishing Added

  • Implemented publish_a2a() method
  • Sends deployment commands to deployer
  • Topics: {namespace}/deployer/agent/{id}/{deploy|update|undeploy}
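
The topic layout above can be sketched as a small helper. This is only an illustration, not code from the PR: `deployer_topic` and `DEPLOYER_ACTIONS` are hypothetical names, and only the `{namespace}/deployer/agent/{id}/{action}` pattern comes from the description.

```python
# Actions listed in the PR description's topic pattern.
DEPLOYER_ACTIONS = ("deploy", "update", "undeploy")


def deployer_topic(namespace: str, agent_id: str, action: str) -> str:
    """Build a deployer command topic (hypothetical helper, not from the PR)."""
    if action not in DEPLOYER_ACTIONS:
        raise ValueError(f"unsupported deployer action: {action!r}")
    # Strip stray slashes so a namespace such as "myorg/dev/" does not yield "//".
    return f"{namespace.strip('/')}/deployer/agent/{agent_id}/{action}"
```

Rejecting unknown actions up front keeps a typo from silently publishing to a topic that no deployer subscribes to.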

Message Receiving

  • Receives deployer heartbeats via HeartbeatListener
  • Receives agent cards via AgentRegistry
  • Complete deployment monitoring capability

Configuration Added

  • deployment_timeout_minutes (default: 5)
  • heartbeat_timeout_seconds (default: 90)
  • deployment_check_interval_seconds (default: 60)
  • Created examples/services/platform_service_example.yaml
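
One way to read the three settings together is sketched below. The default values are taken from the list above; the `MonitoringConfig` class, both helper functions, and the exact timeout semantics are assumptions made for illustration and may differ from the Platform Service's actual logic.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class MonitoringConfig:
    """Defaults from the PR description; the class itself is hypothetical."""

    deployment_timeout_minutes: int = 5
    heartbeat_timeout_seconds: int = 90
    deployment_check_interval_seconds: int = 60


def heartbeat_is_stale(last_seen: datetime, now: datetime, cfg: MonitoringConfig) -> bool:
    # Assumed rule: the deployer is considered offline once no heartbeat has
    # arrived for longer than heartbeat_timeout_seconds.
    return now - last_seen > timedelta(seconds=cfg.heartbeat_timeout_seconds)


def deployment_timed_out(started_at: datetime, now: datetime, cfg: MonitoringConfig) -> bool:
    # Assumed rule: a deployment still pending after deployment_timeout_minutes
    # is treated as failed by the periodic status checker.
    return now - started_at > timedelta(minutes=cfg.deployment_timeout_minutes)
```

With the defaults, a checker that runs every 60 seconds would flag a pending deployment on the first pass after the five-minute mark.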

OAuth Implementation (DATAGO-118496)

  • Extracted OAuth middleware from http_sse to shared/auth/middleware.py
  • Platform Service now validates bearer tokens with external OAuth service
  • Same auth implementation for both gateways and services
  • Supports production mode (real validation) and dev mode

Shared Utilities Layer

Created src/solace_agent_mesh/shared/ Module

Moved 13 files from http_sse/shared/ to enable code reuse:

  • api/: pagination, response_utils, auth_utils
  • database/: base_repository, database_exceptions, database_helpers
  • exceptions/: exceptions, exception_handlers, error_dto
  • auth/: dependencies (ValidatedUserConfig), middleware (OAuth)
  • utils/: timestamp_utils, enums, types, utils

Benefits

  • Gateways and services import from shared/ (no cross-dependencies)
  • Platform Service completely independent of http_sse
  • Clean architectural boundary
  • Enables future services (billing, monitoring, etc.)

WebUI Gateway Cleanup (Chat-Only)

Removed from http_sse

  • ❌ Enterprise router mounting (no more /api/v1/enterprise/*)
  • ❌ Background task startup/shutdown
  • ❌ Platform migration calls
  • ❌ Deployment monitoring dependencies

Result

  • WebUI Gateway serves chat endpoints only
  • Single database (WEB_UI_GATEWAY_DATABASE_URL)
  • No platform functionality

Breaking Changes

  • http_sse/shared/ module removed (raises ImportError with migration guide)
  • Enterprise routers no longer mounted in WebUI Gateway
  • Platform database no longer used by WebUI Gateway
  • Users must run Platform Service separately for agent management
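
For downstream code that still imports from the removed module, the rewrite could look roughly like the sketch below. The two package prefixes are assumptions inferred from the paths mentioned in this PR (`gateway/http_sse/` and `src/solace_agent_mesh/shared/`), and `migrate_import_line` is a hypothetical helper, not a tool shipped with the PR.

```python
import re

# Assumed package prefixes; the PR only states "http_sse.shared.* -> shared.*".
OLD_PREFIX = "solace_agent_mesh.gateway.http_sse.shared"
NEW_PREFIX = "solace_agent_mesh.shared"

# Word boundaries keep names such as "...shared_extras" from matching.
_OLD_IMPORT = re.compile(rf"\b{re.escape(OLD_PREFIX)}\b")


def migrate_import_line(line: str) -> str:
    """Rewrite one source line from the old shared path to the new one."""
    return _OLD_IMPORT.sub(NEW_PREFIX, line)
```

A plain prefix swap may not be enough for every file: modules that moved into one of the new subdirectories (api, database, exceptions, auth, utils) would also need that extra path segment.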

Architecture: Service vs Gateway

SERVICES (Platform Service):

  • Provide internal platform functionality
  • Admin-facing CRUD operations
  • Background processing
  • Example: services/platform/

GATEWAYS (WebUI, Slack, etc.):

  • Handle external communication channels
  • User-facing interactions
  • Protocol-specific implementations
  • Example: gateway/http_sse/

Migration Path

Before

# Single service on port 8000
sam run configs/gateways/webui.yaml

After

# Terminal 1: Chat service
sam run configs/gateways/webui.yaml  # Port 8000

# Terminal 2: Platform service
sam run configs/services/platform.yaml  # Port 8001

Test Plan

  • Platform Service starts with broker connection
  • Background tasks initialize (heartbeat listener, deployment checker)
  • OAuth middleware validates tokens correctly
  • Deployment operations publish messages to deployer
  • Agent registry tracks agent presence
  • WebUI Gateway starts without platform functionality
  • Shared utilities import correctly
  • All unit tests pass
  • Integration tests pass

Related PRs

  • Enterprise PR: (to be created with matching changes)
  • Frontend PR: DATAGO-118499 (routing to platform service)

Next Steps

  1. Frontend routing to Platform Service (DATAGO-118499)
  2. Helm chart updates (DATAGO-118498)
  3. Docker/K8s configurations (DATAGO-118706)
  4. Documentation (DATAGO-118501, DATAGO-118708)

@mo-radwan1 mo-radwan1 force-pushed the mradwan/DATAGO-118497-platform-service-heartbeats-and-shared-utilities branch from 545db31 to cc0ca2b Compare December 4, 2025 19:49
@mo-radwan1 mo-radwan1 marked this pull request as draft December 4, 2025 20:18
"description": "Configuration for the Platform Service (enterprise features: agents, connectors, deployments).",
"description": "Configuration for connecting to the Platform Service (runs separately on port 8001).",
"dict_schema": {
"database_url": {
Contributor

Just confirming that removing the config doesn't break anything.

Collaborator Author

Yeah, it shouldn't break anything; it would just be ignored if it exists.

log.debug("%s Publishing A2A message to topic: %s", self.log_identifier, topic)

try:
    super().publish_a2a_message(payload, topic, user_properties)
Contributor

I believe we need to extend SamComponentBase to call this?

Collaborator Author

You are right, good catch. Let me fix that.

… shared utilities

This PR completes the backend architecture split for Platform Service,
making it fully independent from WebUI Gateway.

## Platform Service Enhancements (DATAGO-118497)

### Background Task Migration
- Added broker connection to Platform Service component
- Initialized agent registry for deployment monitoring
- Implemented heartbeat listener for deployer status tracking
- Implemented deployment status checker (runs every 60s)
- Added graceful cleanup for all background tasks

### Message Publishing
- Added publish_a2a() method to Platform Service
- Can now send deployment commands to deployer
- Topics: {namespace}/deployer/agent/{id}/{deploy|update|undeploy}

### Message Receiving
- Receives deployer heartbeats: {namespace}/deployer/heartbeat
- Receives agent cards: {namespace}/a2a/agent-cards (via AgentRegistry)

### Configuration
- Added deployment_timeout_minutes (default: 5)
- Added heartbeat_timeout_seconds (default: 90)
- Added deployment_check_interval_seconds (default: 60)
- Created examples/services/platform_service_example.yaml

## OAuth Implementation (DATAGO-118496)

- Extracted OAuth middleware to shared/auth/middleware.py
- Platform Service now validates bearer tokens with OAuth service
- Supports both production (use_authorization=true) and dev mode
- Same auth implementation for gateways and services

## Shared Utilities Layer (Architecture Improvement)

### Created shared/ Module
- Moved 13 files from http_sse/shared/ to shared/
- Created 5 subdirectories: api, database, exceptions, auth, utils
- Extracted auth dependencies from http_sse to shared/auth
- Total: ~2,000 lines of reusable code

### Benefits
- Gateways and services import from shared/ (no cross-dependencies)
- Platform Service no longer depends on http_sse
- Clean architectural boundary
- Enables future services (billing, monitoring, etc.)

## WebUI Gateway Cleanup (Chat-Only)

### Removed
- Enterprise router mounting (no more /api/v1/enterprise/*)
- Background task startup/shutdown
- Platform migration calls
- Enterprise background task dependencies

### Result
- WebUI Gateway now serves chat endpoints only
- Single database (WEB_UI_GATEWAY_DATABASE_URL)
- No platform functionality

## Enterprise Import Updates

- Updated imports: http_sse.shared.* → shared.*
- Updated imports: webui_backend → platform_service
- Uses shared auth dependencies
- 31 files updated (54 import replacements)

## Breaking Changes

- http_sse/shared/ module removed (raises ImportError)
- Enterprise routers no longer mounted in WebUI Gateway
- Platform database no longer used by WebUI Gateway
- Users must run Platform Service separately for agent management

## Migration Path

### Before (Merged)
- Single service on port 8000 (chat + platform)

### After (Split)
- WebUI Gateway on port 8000 (chat only)
- Platform Service on port 8001 (platform management)

## Testing

- Verified Platform Service starts with broker connection
- Verified background tasks initialize successfully
- Verified OAuth middleware loads
- All imports migrated successfully
@mo-radwan1 mo-radwan1 force-pushed the mradwan/DATAGO-118497-platform-service-heartbeats-and-shared-utilities branch from cc0ca2b to 458ba5b Compare December 5, 2025 14:35
…message

SamComponentBase provides:
- publish_a2a_message() method with size validation
- Async event loop management
- Timer callback registry

ComponentBase does not have these SAM-specific features.

Added max_message_size_bytes config parameter (required by SamComponentBase).

Platform Service now uses direct message publisher for deployer commands,
which is semantically correct since:
- Deployer is a service, not an A2A agent
- No A2A protocol features needed (no JSON-RPC, correlation IDs)
- Deployment commands are simple service-to-service communication

Changes:
- Extends ComponentBase instead of SamComponentBase
- Removed max_message_size_bytes config (deployment YAMLs are small)
- Implemented direct message publisher initialization
- Updated publish_a2a() to use direct publishing
- Added cleanup for direct publisher
- Updated documentation to clarify direct messaging vs A2A

Benefits:
- Semantically correct (not pretending to be A2A)
- Simpler implementation (no unnecessary overhead)
- Clear service vs gateway distinction
- Sets pattern for future services

Risk: None - deployer only reads JSON payload, doesn't use A2A features.
…ttern)

Follow exact same pattern as WebUI Gateway for enterprise background tasks.

Community Platform Service:
- Provides framework only
- Calls enterprise hook after routers loaded
- Gracefully degrades without enterprise

Enterprise init_enterprise.py:
- Added start_platform_background_tasks(component)
- Added _start_platform_heartbeat_listener(component)
- Owns ALL background task logic

Pattern:
- Community: Calls enterprise function
- Enterprise: Owns initialization
- Same proven pattern as WebUI Gateway

Benefits:
- No enterprise table access in community (safe for community users)
- Enterprise controls its features entirely
- Future-proof for community platform features
- Matches working WebUI Gateway pattern

This ensures community Platform Service never crashes trying to access
enterprise deployment tables.

Added Kubernetes-ready health check endpoints:

1. /health (liveness) - Simple check that service is running
2. /health/live (liveness alias) - Same as /health
3. /health/ready (readiness) - Dependency checks:
   - Database connectivity
   - Direct publisher initialized
   - Enterprise package loaded

The readiness probe returns 503 if any dependency is not ready,
allowing Kubernetes to delay traffic until the service is fully initialized.
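
The readiness behaviour described above can be sketched without any web framework. This is a minimal illustration under stated assumptions: the `readiness` function, the shape of its response body, and the check names used in the test are hypothetical; only the 200/503 split and the three dependency categories come from the commit message.

```python
from typing import Callable, Dict, Tuple


def readiness(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, dict]:
    """Run each dependency check and map the outcome to an HTTP status code."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A check that raises counts as "not ready" instead of crashing the probe.
            results[name] = False
    ready = all(results.values())
    status = "ready" if ready else "not_ready"
    return (200 if ready else 503), {"status": status, "checks": results}
```

An endpoint handler would return the integer as the response status and the dictionary as the JSON body, so an operator can see which dependency is holding the pod out of rotation.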

Updated sam-kubernetes deployment to use:
- livenessProbe: /health/live
- readinessProbe: /health/ready

This commit consolidates all fixes for the shared module migration and Platform Service architecture split:

**Import Fixes (main.py):**
- Added missing Alembic imports (Config, command)
- Added missing FastAPI imports (RequestValidationError, JSONResponse, CORSMiddleware)
- Added missing Starlette imports (SessionMiddleware, StaticFiles)
- Added missing a2a SDK imports (InternalError, InvalidRequestError, JSONRPCError, JSONRPCResponse)
- Added missing dependencies import
- Added all router imports
- Restored uvicorn.Config() usage (reverted broken "fix")

**Import Fixes (shared module migration):**
- Updated 30+ files to use new shared module paths
- Fixed http_sse repository, routers, services, DTOs to import from solace_agent_mesh.shared.*
- Fixed shared module internal imports (database, exceptions)
- Removed references to deleted http_sse/shared directory
- Updated import paths: types, enums, pagination, timestamp_utils, etc.

**Platform Service Fixes:**
- Programmatically create components in PlatformServiceApp
- Configure broker connection (input_enabled/output_enabled)
- Move background task startup to component._start_background_tasks()
- Make direct publisher initialization non-fatal
- Remove await from synchronous setup_dependencies()
- Fix enterprise background tasks to be synchronous with async task spawning

**Configuration:**
- Migrate logging config from YAML to INI format
- Update launch.json to use logging_config.ini
- Add Platform Service to launch configurations

All services (WebUI Gateway, Platform Service, Agents) now start successfully.
@sonarqube-solacecloud

Quality Gate failed

Failed conditions
0.0% Coverage on New Code (required ≥ 70%)

mo-radwan1 and others added 7 commits December 9, 2025 11:24
…ation

Platform Service now extends SamComponentBase to gain full message processing
infrastructure, matching WebUI Gateway's architecture. This enables Platform
Service to discover agents and populate its AgentRegistry for deployment monitoring.

Changes:
- PlatformServiceComponent now extends SamComponentBase (was ComponentBase)
- Added agent discovery subscription to broker configuration
- Implemented _handle_message_async() for async message processing
- Added CoreA2AService with component_id for log differentiation
- Implemented _late_init() hook for broker-dependent service initialization
- Added required SamComponentBase abstract methods
- Added get_config() override to access nested app_config
- Moved direct publisher and background tasks to _late_init()
- Added max_message_size_bytes to example YAML configuration
- CoreA2AService now accepts optional component_id for clear logging

Component identification:
- CoreA2AService logs now show [CoreA2AService-Platform] vs [CoreA2AService-WebUI]
- Makes it clear which component is discovering agents
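
The log prefixes above suggest a helper along these lines. It is a sketch only: `core_a2a_log_identifier` is a hypothetical function, and the fallback used when no `component_id` is given is an assumption, since the commit message only shows the two suffixed forms.

```python
from typing import Optional


def core_a2a_log_identifier(component_id: Optional[str] = None) -> str:
    """Return the log prefix for a CoreA2AService instance (hypothetical helper)."""
    if component_id:
        return f"[CoreA2AService-{component_id}]"
    # Assumed fallback for callers that do not pass a component_id.
    return "[CoreA2AService]"
```

Keeping `component_id` optional means existing callers that never pass it continue to work unchanged.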

Architecture:
- Platform Service and WebUI Gateway now use identical message handling infrastructure
- Both independently discover and track agents
- Consistent patterns across gateway and service components

This allows the API calls to be routed correctly based on config,
providing better support and flexibility in the split-platform
system.
@mo-radwan1 mo-radwan1 force-pushed the mradwan/DATAGO-118497-platform-service-heartbeats-and-shared-utilities branch from 8b7f89d to 77b12f2 Compare December 9, 2025 16:43
mo-radwan1 and others added 8 commits December 9, 2025 14:01
Replaces verbose `configServerUrl` pattern with clean API client for better
maintainability and developer experience.

## Changes

**New API Client** (src/lib/api/client.ts):
- Module-level singleton (works in components, hooks, utils)
- Clear namespace separation: `api.chat.*` vs `api.platform.*`
- Automatic JSON serialization for POST/PUT/PATCH
- Handles undefined/null body edge cases
- Base URL caching for performance (21 object allocations saved)
- Reconfiguration warnings for debugging

**Config Fields Renamed**:
- `configServerUrl` → `chatServerUrl`
- `configPlatformServerUrl` → `platformServerUrl`
- Deprecated fields removed after full migration

**Migration Impact** (26 files):
- Removed 78 instances of manual URL construction
- 70% less boilerplate code
- 207 insertions, 222 deletions (net -15 lines)

**Critical Fixes**:
- Fixed JSON.stringify(undefined) returning undefined instead of string
- Proper Content-Type header handling only when body exists
- Safe handling of null/undefined in POST/PUT/PATCH requests

**Performance**:
- Added getBaseUrls() result caching
- Invalidates cache on reconfiguration

## Examples

Before:
```typescript
const { configServerUrl } = useConfigContext();
await fetchJsonWithError(`${configServerUrl}/api/v1/prompts`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(data),
});
```

After:
```typescript
import { api } from "@/lib/api";
await api.chat.post('/api/v1/prompts', data);
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
…rvice

Both services now use the shared OAuth middleware for consistent authentication
behavior across community and enterprise modes.

## Changes

**WebUI Gateway:**
- Added OAuth config fields to component (external_auth_service_url, external_auth_provider, use_authorization)
- Removed local _create_auth_middleware function (295 lines of duplicate code)
- Now uses shared create_oauth_middleware from shared/auth/middleware.py
- Updated config router to use component.use_authorization (unified field)

**Configuration:**
- Added OAuth config to webui_gateway_example.yaml
- Documented community vs enterprise mode
- Default: use_authorization=false (community/dev mode)

**Shared Middleware Features:**
- Community mode (use_authorization=false): Uses dev user (sam_dev_user)
- Enterprise mode (use_authorization=true): Validates OAuth tokens
- Same auth behavior for both WebUI Gateway and Platform Service
- Single source of truth for auth logic
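
The two modes can be summarised in a framework-free sketch. Everything here is illustrative rather than the shared middleware's real interface: `resolve_user_id` and the `validate_token` callback are hypothetical, with the callback standing in for the call to the external OAuth service. Only the `use_authorization` flag and the `sam_dev_user` name come from the commit message.

```python
from typing import Callable, Mapping, Optional

DEV_USER_ID = "sam_dev_user"  # dev user named in the commit message


def resolve_user_id(
    headers: Mapping[str, str],
    use_authorization: bool,
    validate_token: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return the authenticated user ID, or None if the request is rejected."""
    if not use_authorization:
        # Community/dev mode: skip validation and use the fixed dev user.
        return DEV_USER_ID
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    # Enterprise mode: delegate to the (hypothetical) token validator.
    return validate_token(token.strip())
```

A caller would translate a `None` result into a 401 response. Keeping the validator injectable is what lets both the gateway and the service reuse the same logic.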

## Benefits

✅ DRY - Removed 295 lines of duplicate auth middleware code
✅ Consistency - Both services authenticate identically
✅ Maintainability - Fix auth bugs once, applies to both services
✅ Community-safe - Defaults to dev mode (no breaking changes)
✅ Enterprise-ready - OAuth validation when enabled

## Net Impact

Files changed: 4
Lines added: 21
Lines deleted: 300
Net change: -279 lines (cleaner codebase!)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
Updates templates/webui.yaml to include both WebUI Gateway and Platform Service
so Docker deployments have full functionality out of the box.

## Problem

After the service split, the Docker template only included WebUI Gateway.
This meant Docker deployments would have:
- ✅ Chat functionality (port 8000)
- ❌ NO agent management (Platform Service missing)
- ❌ Frontend calls to port 8001 would fail

## Solution

Added Platform Service as second app in templates/webui.yaml:
- App 1: WebUI Gateway (port 8000) - Chat endpoints
- App 2: Platform Service (port 8001) - Management endpoints

## Configuration

**Platform Service Added:**
- Module: solace_agent_mesh.services.platform.app
- Port: 8001 (configurable via PLATFORM_API_PORT)
- Database: sqlite:///platform-service.db (separate from chat DB)
- OAuth: Unified with WebUI Gateway (same USE_AUTHORIZATION)
- CORS: Allows localhost:3000, localhost:8000 (cross-service calls)

**WebUI Gateway Updated:**
- Added platform_service.url config
- Added OAuth config (use_authorization)
- Points to localhost:8001 for platform calls

## Docker Run Behavior

```bash
docker run solace-agent-mesh templates/webui.yaml
```

Now starts BOTH services:
- ✅ WebUI Gateway on port 8000
- ✅ Platform Service on port 8001
- ✅ Frontend can reach both
- ✅ All features work (chat + agent management)

## Testing

✅ Template has 2 apps defined
✅ OAuth unified across both services
✅ Platform service URL configured
✅ CORS allows cross-service communication
✅ Both services use same broker connection

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
Separates Platform Service into its own template file for cleaner deployment
architecture and future community feature additions.

## Changes

**Created templates/platform.yaml:**
- Standalone Platform Service template
- Port 8001 (configurable)
- Separate database (PLATFORM_DATABASE_URL)
- OAuth configuration
- Background task configuration
- Ready for future community endpoints

**Updated templates/webui.yaml:**
- Removed Platform Service app (was 216 lines, now 182)
- Kept platform_service.url reference (frontend routing)
- WebUI Gateway only (chat endpoints)
- Clean community-focused template

## Deployment Patterns

**Community Users (Chat Only):**
```bash
sam run templates/webui.yaml
# WebUI Gateway on port 8000
# All chat features work
```

**Enterprise Users (Full Stack):**
```bash
# Terminal 1:
sam run templates/webui.yaml  # Port 8000

# Terminal 2:
sam run templates/platform.yaml  # Port 8001
```

**Future: Community + Platform:**
When community platform features are added, community users can optionally run:
```bash
sam run templates/webui.yaml &
sam run templates/platform.yaml &
```

## Benefits

✅ Clean separation - each service in its own template
✅ Community simple - single template for chat
✅ Enterprise flexible - run both as needed
✅ Future-ready - platform.yaml ready for community endpoints
✅ No forced dependencies - community doesn't need empty platform service

## Architecture

- templates/webui.yaml: WebUI Gateway (chat, prompts, speech, projects)
- templates/platform.yaml: Platform Service (agents, deployments, connectors - enterprise/future)
- Both can run independently or together

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
Both services now default to use_authorization=false and provider=generic
for consistency and community-safety.

Changes Platform Service defaults from:
- use_authorization: True → False
- external_auth_provider: 'azure' → 'generic'

Now matches WebUI Gateway defaults exactly.

The SAM (simple) debug config was missing platform_service_example.yaml,
causing Platform Service not to start when debugging.

Now includes all necessary services:
- Orchestrator agent
- WebUI Gateway (port 8000)
- Test agent
- Platform Service (port 8001)

This matches the main SAM configuration and ensures full stack runs.

Platform Service now defaults to sqlite:///platform-service.db if
PLATFORM_DATABASE_URL is not provided, matching WebUI Gateway pattern.

This prevents the warning: 'No database URL provided - platform service will not function'

Now works out of the box for development without requiring the env var.
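
The fallback can be expressed in a few lines. This is a sketch, not the PR's code: `platform_database_url` is a hypothetical helper, and treating an empty value like an unset variable is an assumption. The variable name and the SQLite default come from the commit message.

```python
import os
from typing import Mapping, Optional

DEFAULT_PLATFORM_DATABASE_URL = "sqlite:///platform-service.db"


def platform_database_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the configured database URL, falling back to the local SQLite file."""
    env = os.environ if env is None else env
    # Assumption: an empty value is treated the same as an unset variable.
    return env.get("PLATFORM_DATABASE_URL") or DEFAULT_PLATFORM_DATABASE_URL
```

Passing the environment as a parameter keeps the function easy to test without mutating `os.environ`.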
…beats-and-shared-utilities

Resolved conflicts by keeping our API client refactoring while incorporating
main's new features (background tasks, file validation, effectiveSessionId).
4 participants