@slyt3 slyt3 commented Dec 8, 2025

Summary

Implements platform-agnostic status reporting infrastructure for Virtual MCP Servers, enabling runtime operational visibility in both Kubernetes and CLI environments.

Motivation

While exploring Issue #2853 (Health Monitoring), I identified that vMCP lacks a mechanism to report operational status. The Virtual MCP Server proposal explicitly requires "status reporting includes backend health summary" in its Integration Points section (line 6).

This PR provides the reporting foundation that Issue #2853's health monitoring will use, without implementing the health checking logic itself.

What This PR Implements

Core Components

  • Reporter interface: Platform-agnostic contract (Report/Start/Stop/SetStatusCallback)
  • K8sReporter: Updates the VirtualMCPServer.status CRD subresource in Kubernetes
  • LogReporter: Logs status to stdout for CLI environments
  • Factory pattern: Automatic environment detection via KUBERNETES_SERVICE_HOST
  • RuntimeStatus: Data structure for operational state (phase, backends, tool counts)

Integration

  • Server lifecycle integration (reporter starts with server, stops on shutdown)
  • Environment variables: VMCP_NAME and VMCP_NAMESPACE (set by operator)
  • RBAC permissions: Added virtualmcpservers/status read/write access
  • Graceful degradation: Server continues if reporter initialization fails
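
The lifecycle and graceful-degradation behavior described above could look roughly like the following. This is a sketch under assumptions: the `server` type, the `newServer` constructor, and the `failingFactory` helper are hypothetical and only illustrate the idea that a reporter failure is logged rather than treated as fatal.

```go
package main

import (
	"errors"
	"fmt"
)

// reporter is a reduced, hypothetical view of the status reporter.
type reporter interface {
	Start() error
	Stop()
}

// server is a hypothetical stand-in for the vMCP server.
type server struct {
	reporter reporter // nil when status reporting is unavailable
	running  bool
}

// newServer tries to create a reporter and continues without one on failure.
func newServer(newReporter func() (reporter, error)) *server {
	s := &server{}
	r, err := newReporter()
	if err != nil {
		fmt.Printf("status reporter disabled: %v\n", err)
		return s
	}
	s.reporter = r
	return s
}

// Start brings up the reporter (if any) together with the server.
func (s *server) Start() {
	if s.reporter != nil {
		if err := s.reporter.Start(); err != nil {
			fmt.Printf("status reporter failed to start: %v\n", err)
			s.reporter = nil
		}
	}
	s.running = true
}

// Stop shuts the reporter down together with the server.
func (s *server) Stop() {
	if s.reporter != nil {
		s.reporter.Stop()
	}
	s.running = false
}

// failingFactory simulates a reporter that cannot be initialized.
func failingFactory() (reporter, error) {
	return nil, errors.New("VMCP_NAME is not set")
}

func main() {
	s := newServer(failingFactory)
	s.Start()
	fmt.Println("server running:", s.running)
	s.Stop()
}
```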

Status Data Model

type Phase string

const (
    PhaseReady    Phase = "Ready"    // All backends healthy
    PhaseDegraded Phase = "Degraded" // Some backends unhealthy
    PhaseFailed   Phase = "Failed"   // Server not operational
    PhasePending  Phase = "Pending"  // Starting up
)

type RuntimeStatus struct {
    Phase             Phase
    Message           string
    Backends          []BackendHealthReport  // Ready for health data
    TotalToolCount    int
    HealthyBackends   int
    UnhealthyBackends int
    LastDiscoveryTime time.Time
}
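
As an illustration of how the phase values relate to the backend counters, the sketch below derives a Phase from healthy and unhealthy counts. The `derivePhase` helper is hypothetical and not part of this PR. Mapping "no healthy backends" to Failed is an assumption, since the PR only describes Failed as "server not operational".

```go
package main

import "fmt"

type Phase string

const (
	PhaseReady    Phase = "Ready"
	PhaseDegraded Phase = "Degraded"
	PhaseFailed   Phase = "Failed"
	PhasePending  Phase = "Pending"
)

// derivePhase is a hypothetical helper mapping counters to a phase.
func derivePhase(started bool, healthy, unhealthy int) Phase {
	switch {
	case !started:
		return PhasePending
	case unhealthy == 0:
		// Also covers zero backends, matching the "phase=Ready, backends=0/0" log.
		return PhaseReady
	case healthy == 0:
		// Assumption: every backend being unhealthy counts as Failed.
		return PhaseFailed
	default:
		return PhaseDegraded
	}
}

func main() {
	fmt.Println(derivePhase(true, 2, 1))
}
```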

What This PR Does NOT Implement

  • Health checking logic (covered by Issue #2853)
  • Circuit breaker patterns
  • Backend health monitoring
  • Metrics emission (future work)

Why Merge This Separately From #2853

Standalone Value

  • Reports "Ready" status immediately (operational visibility today)
  • Enables monitoring of vMCP deployment lifecycle
  • Provides foundation without health checking complexity

Foundation for #2853

Testing

Unit Tests

Added 15 unit tests; all pass.

Test coverage:

  • Factory pattern and environment detection (4 tests)
  • LogReporter functionality (7 tests)
  • Type definitions and validation (4 tests)

Integration Testing

Deployed to a kind cluster and verified:

  • Factory detects the Kubernetes environment correctly
  • K8sReporter is created with the correct name/namespace from env vars
  • Reporter starts successfully and reports every 30s
  • Status updates call client.Status().Update() successfully
  • RBAC permissions allow status updates
  • No goroutine leaks on shutdown

Evidence:

{"msg":"Detected Kubernetes environment, using K8sReporter"}
{"msg":"Starting status reporter (interval: 30s)"}
{"msg":"[toolhive-system/test-vmcp] Updated K8s status: phase=Ready, backends=0/0"}

Relationship to Virtual MCP Proposal

From the Virtual MCP Server Proposal:

Integration Points:

  • Status reporting includes backend health summary

Backend Unavailability:

  • Notify monitoring systems of state change
  • Log backend unavailability events

VirtualMCPServer.status structure:

status:
  phase: Ready
  message: "..."
  backendCount: 2
  discoveredBackends: [...]

This PR implements the reporting mechanism. Issue #2853 will populate it with health data.

Current Behavior

CLI Environment

$ vmcp serve --config config.yaml
INFO: Detected CLI environment, using LogReporter
INFO: Starting status reporter (interval: 30s)
INFO: [my-vmcp] Status Report:
INFO:   Phase: Ready
INFO:   Message: No status callback configured
INFO:   Total Tools: 0
INFO:   Healthy Backends: 0/0

Kubernetes Environment

$ kubectl logs <vmcp-pod>
{"msg":"Detected Kubernetes environment, using K8sReporter"}
{"msg":"Starting status reporter (interval: 30s)"}
{"msg":"[toolhive-system/test-vmcp] Updated K8s status: phase=Ready, backends=0/0"}

$ kubectl get virtualmcpserver test-vmcp -o jsonpath='{.status.phase}'
Ready

Known Limitations & Future Work

StatusReporter is Foundation, Not Complete Feature

Currently reports dummy status because:

  1. Server lacks GetStatus() method to collect real-time data
  2. Backends lack a health check implementation (Issue #2853, "Add health monitoring and circuit breaker to vMCP server")
  3. No circuit breaker state to report

This is intentional: the PR provides the infrastructure that future work will build on.

Operator Status Ownership

The operator controller also updates VirtualMCPServer.status fields. Status ownership between operator and StatusReporter should be clarified:

  • Option A: StatusReporter owns runtime fields (phase, message, backendCount)
  • Option B: Operator owns all status, StatusReporter becomes metrics-only
  • Option C: Separate status sections (operator for deployment, reporter for runtime)

This design question is deferred to follow-up discussion.

Follow-up Work (Separate PRs)

  • Server status collection: Add Server.GetStatus() to collect real backend data
  • Health check integration (Issue #2853, "Add health monitoring and circuit breaker to vMCP server"): Wire the health checker to RuntimeStatus
  • Metrics emission: Export status to Prometheus
  • Status ownership clarification: Decide operator vs. reporter responsibilities

Files Changed

New Package: pkg/vmcp/status/ (8 files)

  • factory.go (31 lines): Environment detection and reporter creation
  • factory_test.go (180 lines): Factory pattern tests
  • k8s_reporter.go (146 lines): Kubernetes implementation
  • log_reporter.go (94 lines): CLI implementation
  • log_reporter_test.go (196 lines): LogReporter tests
  • reporter.go (15 lines): Interface definition
  • types.go (45 lines): Data structures
  • types_test.go (102 lines): Type validation tests

Modified Files

  • pkg/vmcp/server/server.go: Added reporter field, start/stop integration
  • cmd/vmcp/app/commands.go: Reporter creation from environment variables
  • cmd/thv-operator/controllers/virtualmcpserver_deployment.go: RBAC permissions

Breaking Changes

None. This is purely additive functionality.

Migration Guide

No migration needed. Reporter is optional and automatically enabled.

Questions for Reviewers

  1. Scope: Is this the right separation from Issue #2853 ("Add health monitoring and circuit breaker to vMCP server")? Should anything be added or removed?
  2. Status ownership: Should operator or reporter own status fields? Or separate sections?
  3. API design: Is the Reporter interface appropriate?
  4. RBAC: Are the permissions correct?
  5. Future work: What should be prioritized next? I would really like to keep working on this.

Related Issues


I'm happy to adjust scope, address design questions, or split this further based on feedback. Thanks for reviewing! I enjoyed working on this so much that, having started on Sunday morning, I didn't notice it was already Monday.
