diff --git a/docs/book/getting-started/zenml-pro/.gitbook/assets/pro-workload-managers.png b/docs/book/getting-started/zenml-pro/.gitbook/assets/pro-workload-managers.png
new file mode 100644
index 00000000000..257e4bc998b
Binary files /dev/null and b/docs/book/getting-started/zenml-pro/.gitbook/assets/pro-workload-managers.png differ
diff --git a/docs/book/getting-started/zenml-pro/README.md b/docs/book/getting-started/zenml-pro/README.md
index 0e16f4503cf..0efa96f445e 100644
--- a/docs/book/getting-started/zenml-pro/README.md
+++ b/docs/book/getting-started/zenml-pro/README.md
@@ -25,7 +25,7 @@ The [Pro version of ZenML](https://zenml.io/pro) extends the Open Source product

{% hint style="info" %}
-To try ZenML Pro or to learn more [book a call](https://www.zenml.io/book-your-demo).
+To get access to ZenML Pro, [book a call](https://www.zenml.io/book-your-demo).
{% endhint %}
## ZenML OSS vs Pro Feature Comparison
@@ -35,66 +35,20 @@ To try ZenML Pro or to learn more [book a call](https://www.zenml.io/book-your-d
| **User Management** | Single-user mode | Multi-user support with [SSO](self-hosted.md#identity-provider), [organizations](organization.md), and [teams](teams.md) |
| **Access Control** | No RBAC | Full [role-based access control](roles.md) with customizable permissions |
| **Multi-tenancy** | No workspaces/projects | [Workspaces](workspaces.md) and [projects](projects.md) for team and resource isolation |
-| **Dashboard** | Basic pipeline and run visualization | Pro dashboard with [Model Control Plane](https://docs.zenml.io/user-guides/starter-guide/track-ml-models), [Artifact Control Plane](https://docs.zenml.io/user-guides/starter-guide/manage-artifacts), and comparison views |
-| **Pipeline Execution** | Run pipelines via SDK/CLI | Run pipelines from the dashboard, manage schedules via UI, [triggers](https://docs.zenml.io/concepts/triggers) |
+| **ZenML Web UI** | Basic pipeline and run visualization | Pro UI with [Model Control Plane](https://docs.zenml.io/concepts/models), [Artifact Control Plane](https://docs.zenml.io/concepts/artifacts), and comparison views |
+| **Pipeline Execution** | Run pipelines via SDK/CLI | Run pipelines from the UI, manage schedules through the UI, [triggers](https://docs.zenml.io/concepts/triggers) |
| **Stack Configuration** | User-managed stacks | Advanced stack configurations with workspace/project-level restrictions for platform teams |
| **Security** | Community updates | Prioritized security patches, SOC 2 and ISO 27001 certification |
-| **Deployment** | Self-hosted only | [SaaS](#saas-deployment), [Hybrid SaaS](#hybrid-saas-deployment), or [Air-gapped](#air-gapped-deployment) options |
+| **Deployment** | Self-hosted only | [SaaS](#saas-deployment), [Hybrid SaaS](#hybrid-saas-deployment), or [Self-hosted](#self-hosted-deployment) options |
| **Support** | Community support | Professional support included (SaaS deployments) |
| **Reporting** | Basic run tracking | Advanced usage reports and analytics |
| **Core Features** | ✅ Run pipelines on stacks ✅ Full observability over runs ✅ Artifact tracking ✅ Model versioning | ✅ All OSS features ✅ [Run Snapshots](https://docs.zenml.io/concepts/snapshots) ✅ Enhanced filtering and search |
-## Deployment Scenarios Comparison
+## Deployment Scenarios
-| Deployment Aspect | SaaS | Hybrid SaaS | Air-gapped |
-|-------------------|------|-------------|------------|
-| **ZenML Server** | ZenML infrastructure | Customer infrastructure | Customer infrastructure |
-| **Control Plane** | ZenML infrastructure | ZenML infrastructure | Customer infrastructure |
-| **Metadata & RBAC** | ZenML infrastructure | RBAC: ZenML infrastructure Run metadata: Customer infrastructure | Customer infrastructure |
-| **Compute & Data** | Customer infrastructure | Customer infrastructure | Customer infrastructure |
-| **Setup Time** | ⚡ Fastest (minutes) | Moderate | Longer (requires full deployment) |
-| **Maintenance** | ✅ Fully managed | Partially managed (workspace maintenance required) | Customer managed |
-| **Production Ready** | ✅ Day 1 | ✅ Day 1 | ✅ Day 1 |
-| **Best For** | Teams wanting minimal infrastructure overhead and fastest time-to-value | Organizations with security/compliance requirements but wanting simplified user management | Organizations requiring complete data isolation and air-gapped environments |
+ZenML Pro offers three flexible deployment options to match your organization's needs: **SaaS**, **Hybrid**, and **Self-hosted**.
-### SaaS Deployment
-
-The ZenML-managed SaaS deployment provides the fastest path to production with zero infrastructure overhead. All ZenML server components run on ZenML infrastructure, while your compute resources and data remain in your environment.
-
-**What runs where:**
-- ZenML Server: ZenML infrastructure
-- Metadata and RBAC: ZenML infrastructure
-- Compute and Data: Customer infrastructure
-
-**Ideal for:** Teams that want to get started immediately without managing infrastructure, while keeping sensitive ML data in their own environment.
-
-[Learn more about SaaS architecture →](../system-architectures.md#zenml-pro-saas-architecture)
-
-### Hybrid SaaS Deployment
-
-The Hybrid deployment balances control with convenience. The ZenML control plane (handling user management, authentication, and RBAC) runs on ZenML infrastructure, while the ZenML server and all metadata run in your environment.
-
-**What runs where:**
-- ZenML Management Plane: ZenML infrastructure
-- ZenML Server: Customer infrastructure
-- RBAC: ZenML infrastructure
-- Run metadata: Customer infrastructure
-- Compute and Data: Customer infrastructure
-
-**Ideal for:** Organizations with security or compliance requirements that mandate keeping metadata and credentials within their infrastructure, while benefiting from centralized user management.
-
-[Learn more about Hybrid architecture →](../system-architectures.md#zenml-pro-hybrid-saas)
-
-### Air-gapped Deployment
-
-The fully self-hosted, air-gapped deployment gives you complete control and data sovereignty. All ZenML components run entirely within your infrastructure with no external dependencies.
-
-**What runs where:**
-- All components: Customer infrastructure (completely isolated)
-
-**Ideal for:** Organizations with the strictest security requirements, regulated industries, or environments that must operate without external network access.
-
-[Learn more about self-hosted architecture →](../system-architectures.md#zenml-pro-self-hosted-architecture) | [Self-hosting setup guide →](self-hosted.md)
+[Explore all deployment scenarios →](deployments-overview.md)
## Security & Compliance
@@ -105,8 +59,8 @@ All ZenML Pro deployments include:
- ✅ **Vulnerability Assessment Reports** available on request
- ✅ **Software Bill of Materials (SBOM)** available on request
-For software deployed on customer infrastructure (Hybrid and Air-gapped scenarios), ZenML provides comprehensive security documentation to support your compliance requirements.
+For software deployed on your infrastructure (Hybrid and Self-hosted scenarios), ZenML provides comprehensive security documentation to support your compliance requirements.
## Pro Feature Details
-
diff --git a/docs/book/getting-started/zenml-pro/deployments-overview.md b/docs/book/getting-started/zenml-pro/deployments-overview.md
new file mode 100644
index 00000000000..14c8c5017a4
--- /dev/null
+++ b/docs/book/getting-started/zenml-pro/deployments-overview.md
@@ -0,0 +1,187 @@
+---
+description: Compare ZenML Pro deployment scenarios to find the right fit for your organization.
+icon: code-merge
+layout:
+ title:
+ visible: true
+ description:
+ visible: true
+ tableOfContents:
+ visible: true
+ outline:
+ visible: true
+ pagination:
+ visible: true
+---
+
+# Deployment Scenarios
+
+ZenML Pro offers three flexible deployment options to match your organization's security, compliance, and operational needs. This page helps you understand the differences and choose the right scenario for your use case.
+
+## Quick Comparison
+
+| Deployment Aspect | Purpose | SaaS | Hybrid SaaS | Self-hosted |
+|-------------------|---------|------|-------------|-------------|
+| **ZenML Server** | Stores pipeline metadata and serves the API that your SDK and UI connect to | ZenML infrastructure | Your infrastructure | Your infrastructure |
+| **Pipeline/Artifact Metadata** | Records of your pipeline runs, step executions, and artifact locations | ZenML infrastructure | Your infrastructure | Your infrastructure |
+| **ZenML Control Plane** | Manages authentication, RBAC, and organization-level settings across workspaces | ZenML infrastructure | ZenML infrastructure | Your infrastructure |
+| **ZenML Pro UI** | Web UI for visualizing pipelines, artifacts, and managing your ML workflows | ZenML infrastructure | ZenML infrastructure | Your infrastructure |
+| **Compute & Data** | Your ML training infrastructure, models, datasets, and artifacts | Your infrastructure | Your infrastructure | Your infrastructure |
+| **Setup Time** | Time to get your first pipeline running | ⚡ ~1 hour | ~4 hours | ~8 hours |
+| **Maintenance** | Ongoing operational responsibility | Fully managed | Partially managed (workspace maintenance required) | Customer managed |
+| **Best For** | Recommended use case | Teams wanting minimal infrastructure overhead and fastest time-to-value | Organizations with security/compliance requirements but wanting simplified user management | Organizations requiring complete data isolation and on-premises control |
+
+{% hint style="info" %}
+In all of these scenarios, the client SDK that you `pip install` into your development environment is the same package published on PyPI: https://pypi.org/project/zenml/
+{% endhint %}
+
+## Which Scenario is Right for You?
+
+### SaaS Deployment
+
+Choose **SaaS** if you want to get started immediately with zero infrastructure overhead.
+
+**What runs where:**
+- ZenML Server: ZenML infrastructure
+- Metadata and RBAC: ZenML infrastructure
+- Compute and Data: Your infrastructure
+
+**Key Benefits:**
+- ⚡ Fastest setup (~1 hour)
+- ✅ Fully managed by ZenML
+- 🚀 Immediate production readiness
+- 💰 Minimal operational overhead
+
+**Ideal for:** Startups, teams prioritizing time-to-value and operational simplicity, organizations comfortable leveraging managed cloud services.
+
+[Learn more about SaaS deployment →](saas-deployment.md)
+
+### Hybrid SaaS Deployment
+
+Choose **Hybrid** if you need to keep sensitive metadata in your infrastructure while benefiting from centralized user management.
+
+**What runs where:**
+- ZenML Control Plane: ZenML infrastructure
+- ZenML Pro UI: ZenML infrastructure
+- ZenML Pro Server: Your infrastructure
+- Run metadata: Your infrastructure
+- Compute and Data: Your infrastructure
+
+**Key Benefits:**
+- 🔐 Metadata stays in your infrastructure
+- 👥 Centralized user management
+- ⚖️ Balance of control and convenience
+- 🏢 Control plane and UI fully maintained and patched by ZenML
+- ✅ Day 1 production ready
+
+**Ideal for:** Organizations with security policies requiring metadata sovereignty, teams wanting simplified identity management without full infrastructure control.
+
+[Learn more about Hybrid deployment →](hybrid-deployment.md)
+
+### Self-hosted Deployment
+
+Choose **Self-hosted** if you need complete control with no external dependencies.
+
+**What runs where:**
+- All components: Your infrastructure (completely isolated)
+
+**Key Benefits:**
+- 🔒 Complete data sovereignty
+- 🚫 No external network dependencies
+- 🛡️ Maximum security posture
+- 📋 Full audit trail control
+
+**Ideal for:** Regulated industries (healthcare, finance, defense), government organizations, enterprises with strict data residency requirements, environments requiring offline operation.
+
+[Learn more about Self-hosted deployment →](self-hosted-deployment.md)
+
+## Common Pipeline Execution Data Flow
+
+All three deployment scenarios follow a similar pipeline execution pattern, with differences in where authentication happens and where data resides:
+
+### Standard Data Flow Steps
+
+1. **Code Execution**: You write pipeline code in Python and run it with the ZenML client SDK
+
+2. **Token Acquisition**: The ZenML client fetches short-lived tokens from your ZenML workspace for:
+ - Pushing Docker images to your container registry
+ - Communicating with your artifact store
+ - Submitting workloads to your orchestrator
+ - *Note: Your local Python environment needs the client libraries for your stack components*
+
+3. **Image & Workload Submission**: The client automatically builds and pushes Docker images (and optionally code if no code repository is configured) to your container registry, then submits the workload to your orchestrator
+
+4. **Orchestrator Execution**: In the orchestrator environment:
+ - The Docker image is pulled from your container registry
+ - The necessary code is pulled in
+ - A connection to your ZenML workspace is established
+ - The relevant pipeline/step code is executed
+
+5. **Runtime Data Flow**: During execution:
+ - Pipeline and step run metadata is logged to your ZenML workspace
+ - Logs are streamed to your log backend
+ - Artifacts are written to your artifact store
+ - Metadata pointing to these artifacts is persisted
+
+6. **Observability**: The ZenML UI connects to your workspace and uses all persisted metadata to provide you with a complete observability plane
+
+### Deployment-Specific Differences
+
+**SaaS**: Metadata is stored in ZenML infrastructure. Your ML data and compute remain in your infrastructure.
+
+**Hybrid**: The control plane and the metadata are split. Authentication and RBAC are handled by the ZenML-hosted control plane, while all run metadata, artifacts, and compute stay in your infrastructure.
+
+**Self-hosted**: All components (control plane, metadata, authentication, compute) run entirely within your infrastructure with zero external dependencies.
+
+## Making Your Choice
+
+Consider these factors when deciding:
+
+1. **Data Location Requirements**: Where must your ML metadata and run data reside?
+ - Cloud-hosted is acceptable → **SaaS**
+ - Must stay in your infrastructure → **Hybrid**
+ - Must be completely isolated on-premises → **Self-hosted**
+
+2. **Infrastructure Complexity**: How much infrastructure control do you want?
+ - Minimal → **SaaS**
+ - Moderate → **Hybrid**
+ - Full control → **Self-hosted**
+
+3. **Time to Value**: How quickly do you need to be productive?
+ - Within 1 hour → **SaaS**
+ - Within 4 hours → **Hybrid**
+ - Within 8 hours (or longer planning period) → **Self-hosted**
+
+4. **Compliance Requirements**: What regulations apply to your organization?
+ - General business → **SaaS**
+ - Data residency rules → **Hybrid**
+ - Strict isolation requirements → **Self-hosted**
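+
+The factors above amount to a simple rule: pick the least operationally demanding scenario that still satisfies your strictest requirement. The sketch below only illustrates that rule. The `Requirements` fields and the `choose_deployment` helper are hypothetical names and are not part of any ZenML API.
+
+```python
+from dataclasses import dataclass
+
+
+@dataclass
+class Requirements:
+    """Hypothetical inputs mirroring the decision factors listed above."""
+
+    metadata_must_stay_in_house: bool = False
+    must_be_fully_isolated: bool = False
+
+
+def choose_deployment(req: Requirements) -> str:
+    """Return the least demanding scenario that meets the strictest need."""
+    if req.must_be_fully_isolated:
+        return "Self-hosted"
+    if req.metadata_must_stay_in_house:
+        return "Hybrid"
+    return "SaaS"
+
+
+if __name__ == "__main__":
+    # A team with data residency rules but no isolation mandate.
+    print(choose_deployment(Requirements(metadata_must_stay_in_house=True)))
+```
+
+Checking the strictest requirement first keeps the rule unambiguous when several factors apply at once.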
+
+## Security & Compliance
+
+All ZenML Pro deployments include:
+
+- ✅ **SOC 2 Type II** certification
+- ✅ **ISO 27001** certification
+- ✅ **Vulnerability Assessment Reports** available on request
+- ✅ **Software Bill of Materials (SBOM)** available on request
+
+For software deployed on your infrastructure (Hybrid and Self-hosted scenarios), ZenML provides comprehensive security documentation to support your compliance requirements.
+
+## Running Pipelines from the Web UI
+
+All deployment scenarios support running pipeline snapshots from the UI through [workload managers](workload-managers.md). Workload managers are built into the ZenML Pro workspace and can be configured to orchestrate pipeline execution on your Kubernetes cluster, AWS ECS, or GCP infrastructure.
+
+Learn more: [Understanding Workload Managers](workload-managers.md)
+
+## Next Steps
+
+- **Ready to start?** [Choose SaaS Deployment](saas-deployment.md)
+- **Need metadata control?** [Set up Hybrid Deployment](hybrid-deployment.md)
+- **Require complete isolation?** [Configure Self-hosted Deployment](self-hosted-deployment.md)
+- **Deploying on your own infrastructure?** [See Self-hosted Deployment Guide](self-hosted.md)
+- **Want to run pipelines from the UI?** [Configure Workload Managers](workload-managers.md)
+
+{% hint style="info" %}
+Not sure which option is right for you? [Book a call](https://www.zenml.io/book-your-demo) with our team to discuss your specific requirements.
+{% endhint %}
diff --git a/docs/book/getting-started/zenml-pro/hybrid-deployment-ecs.md b/docs/book/getting-started/zenml-pro/hybrid-deployment-ecs.md
new file mode 100644
index 00000000000..219bdceaf78
--- /dev/null
+++ b/docs/book/getting-started/zenml-pro/hybrid-deployment-ecs.md
@@ -0,0 +1,514 @@
+---
+description: Deploy ZenML Pro Hybrid on AWS ECS with a managed control plane.
+layout:
+ title:
+ visible: true
+ description:
+ visible: true
+ tableOfContents:
+ visible: true
+ outline:
+ visible: true
+ pagination:
+ visible: true
+---
+
+# Hybrid Deployment on AWS ECS
+
+This guide provides high-level instructions for deploying ZenML Pro in a Hybrid setup on AWS ECS (Elastic Container Service).
+
+## Architecture Overview
+
+In this setup:
+- **ZenML workspace** runs in ECS tasks within your VPC
+- **Load balancer** handles HTTPS traffic and routes to ECS tasks
+- **Database** stores workspace metadata in AWS RDS
+- **Secrets manager** stores Pro credentials securely
+- **NAT gateway** enables outbound access to the ZenML Cloud control plane
+
+## Prerequisites
+
+Before starting, complete the setup described in [Hybrid Deployment Overview](hybrid-deployment.md):
+- Step 1: Set up ZenML Pro organization
+- Step 2: Configure your infrastructure (database, networking, TLS)
+- Step 3: Obtain Pro credentials from ZenML Support
+
+You'll also need:
+- AWS Account with appropriate IAM permissions
+- Basic familiarity with AWS ECS, VPC, and RDS
+
+## Step 1: Set Up AWS Infrastructure
+
+### VPC and Subnets
+
+Create a VPC with:
+- **Public subnets** (at least 2 across different availability zones) - for the Application Load Balancer
+- **Private subnets** (at least 2 across different availability zones) - for ECS tasks and RDS
+
+### Security Groups
+
+Create three security groups:
+
+1. **ALB Security Group**
+ - Inbound: HTTPS (443) and HTTP (80) from `0.0.0.0/0`
+ - Outbound: HTTP (8000) to the ECS security group
+
+2. **ECS Security Group**
+ - Inbound: HTTP (8000) from the ALB security group
+ - Outbound: HTTPS (443) to `0.0.0.0/0` (for ZenML Cloud access)
+ - Outbound: TCP (3306 for MySQL) to the RDS security group
+
+3. **RDS Security Group**
+ - Inbound: TCP (3306 for MySQL) from the ECS security group
+ - Outbound: Not restricted
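+
+Because each group-to-group rule needs a counterpart on the other side, it can help to write the rules down as data and check them before creating anything in AWS. This is a minimal sketch, assuming the short group labels `alb`, `ecs`, and `rds`; the `unmatched_outbound` helper is hypothetical and does not call any AWS API.
+
+```python
+# Each rule: (group, direction, port, peer). Labels are illustrative only.
+RULES = [
+    ("alb", "in", 443, "0.0.0.0/0"),
+    ("alb", "in", 80, "0.0.0.0/0"),
+    ("alb", "out", 8000, "ecs"),
+    ("ecs", "in", 8000, "alb"),
+    ("ecs", "out", 443, "0.0.0.0/0"),
+    ("ecs", "out", 3306, "rds"),
+    ("rds", "in", 3306, "ecs"),
+]
+
+
+def unmatched_outbound(rules):
+    """Return group-to-group outbound rules with no matching inbound rule on the peer."""
+    inbound = {(group, port, peer) for group, direction, port, peer in rules if direction == "in"}
+    return [
+        (group, port, peer)
+        for group, direction, port, peer in rules
+        if direction == "out"
+        and peer != "0.0.0.0/0"
+        and (peer, port, group) not in inbound
+    ]
+
+
+if __name__ == "__main__":
+    # An empty list means every outbound rule has its inbound counterpart.
+    print(unmatched_outbound(RULES))
+```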
+
+### NAT Gateway
+
+To enable ECS tasks to reach ZenML Cloud:
+
+1. Create an Elastic IP in your AWS region
+2. Create a NAT Gateway in one of your public subnets
+3. Wait for the NAT Gateway to be available
+
+### Route Tables
+
+For your private subnets (where ECS tasks run):
+1. Create a route table
+2. Add a default route (`0.0.0.0/0`) pointing to the NAT Gateway
+3. Associate this route table with your private subnets
+
+## Step 2: Set Up RDS Database
+
+Create an RDS database instance. **Important**: Workspace servers only support MySQL, not PostgreSQL.
+
+**Configuration:**
+- **DB Engine**: MySQL 8.0+ (PostgreSQL is not supported for workspace servers)
+- **Instance Class**: `db.t3.micro` or larger depending on expected load
+- **Storage**: 100 GB initial (with automatic scaling enabled)
+- **Multi-AZ**: Enable for production deployments
+- **VPC**: Your ZenML VPC
+- **Subnet Group**: Create a DB subnet group with your private subnets
+- **Security Group**: RDS security group created above
+- **Backups**: 30 days retention minimum
+- **Logs**: Enable error, general, and slow query logs to CloudWatch
+
+**After creation:**
+1. Note the database endpoint (hostname)
+2. Create the initial database: `zenml_hybrid`
+3. Create a database user with full permissions on the database
+
+## Step 3: Store Secrets in AWS Secrets Manager
+
+Store your Pro credentials securely:
+
+1. **OAuth2 Client Secret**
+ - Secret name: `zenml/pro/oauth2-client-secret`
+ - Value: Your `ZENML_SERVER_PRO_OAUTH2_CLIENT_SECRET` from ZenML
+
+2. (Optional) **Database Password**
+ - Secret name: `zenml/rds/password`
+ - Value: Your RDS database password
+
+Note the ARN of your OAuth2 secret - you'll reference it in the task definition.
+
+## Step 4: Create ECS IAM Roles
+
+Create two IAM roles:
+
+### Task Execution Role
+
+This role allows ECS to pull images and manage logs:
+- Attach: `AmazonECSTaskExecutionRolePolicy`
+- Add inline policy for Secrets Manager access:
+ - Action: `secretsmanager:GetSecretValue`
+ - Resource: Your OAuth2 secret ARN
+ - Action: `logs:CreateLogGroup`, `logs:CreateLogStream`, `logs:PutLogEvents`
+ - Resource: Your CloudWatch log group
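+
+The inline policy described above could be rendered as the following JSON document. This is a sketch under stated assumptions: the `build_execution_policy` helper and both example ARNs are hypothetical placeholders, and granting the log actions on the log group ARN plus a `:*` suffix for its streams is a common pattern that you should confirm against your own setup.
+
+```python
+import json
+
+
+def build_execution_policy(secret_arn: str, log_group_arn: str) -> dict:
+    """Build an IAM policy document for the ECS task execution role."""
+    return {
+        "Version": "2012-10-17",
+        "Statement": [
+            {
+                # Lets ECS inject the OAuth2 client secret into the container.
+                "Effect": "Allow",
+                "Action": ["secretsmanager:GetSecretValue"],
+                "Resource": secret_arn,
+            },
+            {
+                # Lets ECS create the log group/streams and write log events.
+                "Effect": "Allow",
+                "Action": [
+                    "logs:CreateLogGroup",
+                    "logs:CreateLogStream",
+                    "logs:PutLogEvents",
+                ],
+                "Resource": [log_group_arn, f"{log_group_arn}:*"],
+            },
+        ],
+    }
+
+
+if __name__ == "__main__":
+    # Placeholder ARNs; substitute the values from your own account.
+    policy = build_execution_policy(
+        "arn:aws:secretsmanager:region:account:secret:zenml/pro/oauth2-client-secret",
+        "arn:aws:logs:region:account:log-group:/ecs/zenml-hybrid",
+    )
+    print(json.dumps(policy, indent=2))
+```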
+
+### Task Role
+
+This role is for application-level permissions (optional for basic setup):
+- Leave empty for now, or add policies if your tasks need to access other AWS services
+
+## Step 5: Create ECS Task Definition
+
+In the AWS Console or using AWS CLI/Terraform, create a task definition with:
+
+**Task Configuration:**
+- **Compatibility**: FARGATE
+- **CPU**: 512 (0.5 vCPU)
+- **Memory**: 1024 MB
+- **Network Mode**: awsvpc
+- **Execution Role**: Task execution role created above
+- **Task Role**: Task role created above
+
+**Container Configuration:**
+- **Image**: `715803424590.dkr.ecr.eu-central-1.amazonaws.com/zenml-pro-server:`
+- **Port Mapping**: Container port 8000 to port 8000
+- **Essential**: Yes
+
+**Environment Variables:**
+
+Set these in the task definition:
+
+| Variable | Value |
+|----------|-------|
+| `ZENML_SERVER_DEPLOYMENT_TYPE` | `cloud` |
+| `ZENML_SERVER_PRO_API_URL` | `https://cloudapi.zenml.io` |
+| `ZENML_SERVER_PRO_DASHBOARD_URL` | `https://cloud.zenml.io` |
+| `ZENML_SERVER_PRO_ORGANIZATION_ID` | Your organization ID from Step 1 |
+| `ZENML_SERVER_PRO_ORGANIZATION_NAME` | Your organization name from Step 1 |
+| `ZENML_SERVER_PRO_WORKSPACE_ID` | From ZenML Support |
+| `ZENML_SERVER_PRO_WORKSPACE_NAME` | Your workspace name |
+| `ZENML_SERVER_PRO_OAUTH2_AUDIENCE` | `https://cloudapi.zenml.io` |
+| `ZENML_SERVER_SERVER_URL` | `https://zenml.mycompany.com` |
+| `ZENML_DATABASE_URL` | `mysql://user:password@hostname:3306/zenml_hybrid` (MySQL only - PostgreSQL not supported) |
+| `ZENML_SERVER_HOSTNAME` | `0.0.0.0` |
+| `ZENML_SERVER_PORT` | `8000` |
+| `ZENML_LOGGING_LEVEL` | `INFO` |
+
+**Secrets:**
+
+Reference your secret from Secrets Manager:
+
+| Variable | Secret |
+|----------|--------|
+| `ZENML_SERVER_PRO_OAUTH2_CLIENT_SECRET` | `arn:aws:secretsmanager:region:account:secret:zenml/pro/oauth2-client-secret` |
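+
+In a task definition, plain values go into the container's `environment` list and Secrets Manager references go into its `secrets` list. The sketch below renders both from dictionaries and builds the database URL with percent-encoded credentials. The helper names, the example user, password, and hostname are hypothetical, and the encoding step assumes the server parses the URL with standard URL rules.
+
+```python
+import json
+from urllib.parse import quote
+
+
+def mysql_url(user: str, password: str, host: str, database: str, port: int = 3306) -> str:
+    """Percent-encode credentials so characters like '@' or '/' do not break the URL."""
+    return f"mysql://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}/{database}"
+
+
+def container_settings(env: dict, secrets: dict) -> dict:
+    """Render plain values and secret references in ECS container-definition form."""
+    return {
+        "environment": [{"name": key, "value": value} for key, value in env.items()],
+        "secrets": [{"name": key, "valueFrom": arn} for key, arn in secrets.items()],
+    }
+
+
+if __name__ == "__main__":
+    env = {
+        "ZENML_SERVER_DEPLOYMENT_TYPE": "cloud",
+        "ZENML_SERVER_PRO_API_URL": "https://cloudapi.zenml.io",
+        "ZENML_SERVER_SERVER_URL": "https://zenml.mycompany.com",
+        # Example credentials and hostname are placeholders.
+        "ZENML_DATABASE_URL": mysql_url("zenml", "p@ss/word", "db.example.internal", "zenml_hybrid"),
+    }
+    secrets = {
+        "ZENML_SERVER_PRO_OAUTH2_CLIENT_SECRET": (
+            "arn:aws:secretsmanager:region:account:secret:zenml/pro/oauth2-client-secret"
+        ),
+    }
+    print(json.dumps(container_settings(env, secrets), indent=2))
+```
+
+Keeping the secret in `secrets` rather than `environment` means its value is resolved by ECS at task start and never appears in the task definition itself.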
+
+**Logging:**
+
+Configure CloudWatch logs:
+- **Log Group**: `/ecs/zenml-hybrid`
+- **Log Stream Prefix**: `ecs`
+- **Region**: Your AWS region
+
+## Step 6: Create ECS Cluster and Service
+
+Create an ECS cluster named `zenml-hybrid`.
+
+Then create an ECS service within this cluster:
+
+**Service Configuration:**
+- **Cluster**: zenml-hybrid
+- **Task Definition**: zenml-hybrid (latest revision)
+- **Launch Type**: FARGATE
+- **Desired Count**: 1 (or more for high availability)
+- **Platform Version**: LATEST
+
+**Network Configuration:**
+- **VPC**: Your ZenML VPC
+- **Subnets**: Your private subnets
+- **Security Group**: ECS security group
+- **Public IP**: Disabled (tasks don't need public IPs)
+
+**Load Balancing:**
+- **Load Balancer Type**: Application Load Balancer
+- **Container**: zenml-server
+- **Container Port**: 8000
+- (Leave the target group selection for the next step)
+
+## Step 7: Set Up Application Load Balancer
+
+Create an Application Load Balancer (ALB):
+
+**Configuration:**
+- **Subnets**: Your public subnets
+- **Security Group**: ALB security group
+
+### Target Group
+
+Create a target group for your ECS service:
+
+**Health Check Configuration:**
+- **Protocol**: HTTP
+- **Path**: `/health`
+- **Port**: 8000
+- **Interval**: 30 seconds
+- **Timeout**: 5 seconds
+- **Healthy Threshold**: 2
+- **Unhealthy Threshold**: 3
+
+### Listeners
+
+Create two listeners on your ALB:
+
+1. **HTTPS Listener (Port 443)**
+ - **Certificate**: Your TLS certificate from ACM or imported
+ - **Default Action**: Forward to your target group
+
+2. **HTTP Listener (Port 80)**
+ - **Default Action**: Redirect to HTTPS (port 443)
+
+## Step 8: Configure DNS
+
+In your DNS provider (Route 53 or external):
+
+1. Create a DNS record pointing to your ALB's DNS name
+ - **Name**: `zenml.mycompany.com`
+ - **Target**: Your ALB's DNS name (ALB IP addresses can change, so do not target an IP directly)
+ - **Type**: Alias A record in Route 53, or CNAME with an external DNS provider
+
+2. Allow time for DNS propagation (typically 5-15 minutes)
+
+## Step 9: Verify the Deployment
+
+1. **Check ECS Service Status**
+ - Go to ECS console → Clusters → zenml-hybrid → Services
+ - Verify the service shows "Active"
+ - Check that desired and running task counts match
+
+2. **Check Task Logs**
+ - Go to CloudWatch → Log Groups → `/ecs/zenml-hybrid`
+ - View log stream to look for startup messages
+ - Verify no critical errors appear
+
+3. **Test HTTPS Access**
+ - Visit `https://zenml.mycompany.com` in your browser
+ - You should be redirected to the ZenML Pro login at cloud.zenml.io
+
+4. **Verify Control Plane Connection**
+ - In CloudWatch logs, look for messages indicating successful connection to ZenML Cloud
+ - Check for any authentication or SSL errors
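+
+The HTTPS check can also be scripted, for example as a smoke test after each deployment. This is a minimal sketch using only the Python standard library; the `probe` helper is a hypothetical name, and it assumes the server exposes the same `/health` path that the ALB target group uses.
+
+```python
+import urllib.error
+import urllib.request
+
+
+def probe(url: str, timeout: float = 5.0) -> bool:
+    """Return True if the URL answers with a 2xx status within the timeout."""
+    try:
+        with urllib.request.urlopen(url, timeout=timeout) as response:
+            return 200 <= response.status < 300
+    except (urllib.error.URLError, OSError):
+        # Covers HTTP error statuses, DNS failures, TLS errors, and timeouts.
+        return False
+
+
+# Example call, with the hostname used throughout this guide:
+# probe("https://zenml.mycompany.com/health")
+```
+
+A `False` result does not say which layer failed, so follow up with the DNS, ALB, and task checks in the Troubleshooting section.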
+
+## Network & Firewall Requirements
+
+### Outbound Access to ZenML Cloud
+
+Your ECS tasks need HTTPS (port 443) outbound access to:
+- `cloudapi.zenml.io` - For control plane authentication
+
+This is enabled by the NAT Gateway and ECS security group configuration.
+
+### Inbound Access from Clients
+
+Clients need HTTPS (port 443) inbound access to:
+- `zenml.mycompany.com` - Your ALB endpoint
+
+This is enabled by the ALB and ALB security group configuration.
+
+### Database Access
+
+ECS tasks need TCP access to:
+- Your RDS instance on port 3306 (MySQL)
+
+This is enabled by the ECS security group egress rule and RDS security group ingress rule.
+
+## Scaling & High Availability
+
+### Multiple Tasks
+
+For high availability:
+1. Update the ECS service's desired count to 2 or more
+2. ECS will distribute tasks across availability zones
+3. The ALB automatically distributes traffic to all healthy tasks
+
+### Auto Scaling (Optional)
+
+To automatically scale based on CPU or memory usage:
+1. Register a scalable target (your ECS service)
+2. Create a target tracking scaling policy
+3. Set target CPU utilization (e.g., 70%)
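+
+The three steps above map onto two Application Auto Scaling payloads: a scalable target and a target tracking policy. The sketch below builds both as plain dictionaries. The `scaling_config` helper and the capacity bounds of 1 to 4 tasks are assumptions for illustration, while the cluster and service names follow the ones used in this guide.
+
+```python
+import json
+
+
+def scaling_config(cluster: str, service: str, min_tasks: int = 1,
+                   max_tasks: int = 4, target_cpu: float = 70.0):
+    """Return (scalable_target, target_tracking_policy) for an ECS service."""
+    scalable_target = {
+        "ServiceNamespace": "ecs",
+        "ResourceId": f"service/{cluster}/{service}",
+        "ScalableDimension": "ecs:service:DesiredCount",
+        "MinCapacity": min_tasks,
+        "MaxCapacity": max_tasks,
+    }
+    policy = {
+        # Keep average CPU across the service's tasks near the target value.
+        "TargetValue": target_cpu,
+        "PredefinedMetricSpecification": {
+            "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
+        },
+    }
+    return scalable_target, policy
+
+
+if __name__ == "__main__":
+    target, policy = scaling_config("zenml-hybrid", "zenml-server")
+    print(json.dumps({"target": target, "policy": policy}, indent=2))
+```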
+
+## Monitoring & Logging
+
+### CloudWatch Logs
+
+Monitor your deployment:
+1. Go to CloudWatch → Log Groups → `/ecs/zenml-hybrid`
+2. Set up log filters to find errors: filter for `ERROR` or `CRITICAL`
+3. Create metric filters if needed
+
+### CloudWatch Alarms
+
+Create alarms for:
+- **High CPU Utilization**: Alert when average CPU > 80%
+- **Failed Tasks**: Alert when tasks exit unexpectedly
+- **Unhealthy Targets**: Alert when ALB marks tasks as unhealthy
+
+### Application Logs
+
+For production deployments:
+1. Forward CloudWatch logs to your centralized logging system (ELK, Datadog, etc.)
+2. Set up alerts for authentication failures to ZenML Cloud
+3. Monitor database connection errors
+
+## Database Maintenance
+
+### Backups
+
+Automated backups are configured, but:
+1. Verify backup retention is set to at least 30 days
+2. Test backup restoration periodically
+3. Store backups in a different region for disaster recovery
+
+### Monitoring
+
+Monitor database health:
+1. Check RDS Performance Insights for slow queries
+2. Review CloudWatch metrics for connection count and CPU
+3. Monitor free storage space and create alerts
+
+## (Optional) Enable Snapshot Support / Workload Manager
+
+Pipeline snapshots (running pipelines from the UI) require a workload manager. This guide covers two Kubernetes-based options for ECS-deployed workspaces: the AWS Kubernetes implementation (Option A) and the plain Kubernetes implementation (Option B). Both need a Kubernetes cluster where the workload manager can run jobs.
+
+### Prerequisites for Workload Manager
+
+To enable snapshots on ECS-deployed ZenML workspaces:
+
+1. **Kubernetes Cluster Access** - You'll need a Kubernetes cluster where the workload manager can run jobs. This could be:
+ - The same EKS cluster as your other infrastructure
+ - A separate EKS cluster dedicated to workloads
+ - Another Kubernetes distribution in your environment
+
+2. **Container Registry Access** - The workload manager needs access to your container registry to:
+ - Pull base ZenML images
+ - Push/pull runner images (if building them)
+
+3. **Storage Access** - For AWS implementation:
+ - S3 bucket for logs storage
+ - IAM permissions to read/write to the bucket
+
+### Configuration Options
+
+**Option A: AWS Kubernetes Workload Manager (Recommended for ECS)**
+
+If you have an EKS cluster or other Kubernetes cluster available:
+
+1. Create a dedicated namespace and service account:
+ ```
+ kubectl create namespace zenml-workload-manager
+ kubectl -n zenml-workload-manager create serviceaccount zenml-runner
+ ```
+
+2. Add these environment variables to your ECS task definition:
+
+ | Variable | Value |
+ |----------|-------|
+ | `ZENML_SERVER_WORKLOAD_MANAGER_IMPLEMENTATION_SOURCE` | `zenml_cloud_plugins.aws_kubernetes_workload_manager.AWSKubernetesWorkloadManager` |
+ | `ZENML_KUBERNETES_WORKLOAD_MANAGER_NAMESPACE` | `zenml-workload-manager` |
+ | `ZENML_KUBERNETES_WORKLOAD_MANAGER_SERVICE_ACCOUNT` | `zenml-runner` |
+ | `ZENML_KUBERNETES_WORKLOAD_MANAGER_BUILD_RUNNER_IMAGE` | `true` |
+ | `ZENML_KUBERNETES_WORKLOAD_MANAGER_DOCKER_REGISTRY` | Your ECR registry URI |
+ | `ZENML_KUBERNETES_WORKLOAD_MANAGER_ENABLE_EXTERNAL_LOGS` | `true` |
+ | `ZENML_AWS_KUBERNETES_WORKLOAD_MANAGER_BUCKET` | Your S3 bucket for logs |
+ | `ZENML_AWS_KUBERNETES_WORKLOAD_MANAGER_REGION` | Your AWS region |
+ | `ZENML_SERVER_MAX_CONCURRENT_TEMPLATE_RUNS` | `2` (or higher) |
+ | `ZENML_KUBERNETES_WORKLOAD_MANAGER_POD_RESOURCES` | `{"requests": {"cpu": "500m", "memory": "512Mi"}, "limits": {"cpu": "2000m", "memory": "2Gi"}}` |
+
+3. Ensure the ECS task has permissions to access:
+ - The Kubernetes cluster (kubeconfig/IAM role)
+ - Your ECR registry
+ - Your S3 bucket for logs
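+
+The `ZENML_KUBERNETES_WORKLOAD_MANAGER_POD_RESOURCES` value is a JSON string, so a typo only surfaces once the server tries to use it. The sketch below checks the value before you paste it into the task definition. All helper names are hypothetical, and the quantity parsing handles only the suffixes used in this guide (`m` for CPU; `Ki`, `Mi`, `Gi` for memory), not the full Kubernetes quantity syntax.
+
+```python
+import json
+
+_MEMORY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}
+
+
+def cpu_millicores(value: str) -> int:
+    """Convert a CPU quantity such as '500m' or '2' to millicores."""
+    return int(value[:-1]) if value.endswith("m") else int(float(value) * 1000)
+
+
+def memory_bytes(value: str) -> int:
+    """Convert a memory quantity such as '512Mi' or '2Gi' to bytes."""
+    for suffix, factor in _MEMORY_UNITS.items():
+        if value.endswith(suffix):
+            return int(value[: -len(suffix)]) * factor
+    return int(value)
+
+
+def check_pod_resources(raw: str) -> dict:
+    """Parse the JSON value and ensure requests do not exceed limits."""
+    resources = json.loads(raw)
+    requests, limits = resources["requests"], resources["limits"]
+    if cpu_millicores(requests["cpu"]) > cpu_millicores(limits["cpu"]):
+        raise ValueError("CPU request exceeds CPU limit")
+    if memory_bytes(requests["memory"]) > memory_bytes(limits["memory"]):
+        raise ValueError("memory request exceeds memory limit")
+    return resources
+
+
+if __name__ == "__main__":
+    value = (
+        '{"requests": {"cpu": "500m", "memory": "512Mi"}, '
+        '"limits": {"cpu": "2000m", "memory": "2Gi"}}'
+    )
+    print(check_pod_resources(value))
+```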
+
+**Option B: Kubernetes-based (Simpler Alternative)**
+
+If you prefer a basic setup without AWS-specific features:
+
+Add these environment variables to your ECS task definition:
+
+| Variable | Value |
+|----------|-------|
+| `ZENML_SERVER_WORKLOAD_MANAGER_IMPLEMENTATION_SOURCE` | `zenml_cloud_plugins.kubernetes_workload_manager.KubernetesWorkloadManager` |
+| `ZENML_KUBERNETES_WORKLOAD_MANAGER_NAMESPACE` | `zenml-workload-manager` |
+| `ZENML_KUBERNETES_WORKLOAD_MANAGER_SERVICE_ACCOUNT` | `zenml-runner` |
+| `ZENML_KUBERNETES_WORKLOAD_MANAGER_RUNNER_IMAGE` | Your prebuilt ZenML image URI |
+
+### Updating Task Definition
+
+After configuring the workload manager environment variables:
+
+1. Create a new task definition revision with the updated environment variables
+2. Update your ECS service to use the new task definition
+3. ECS will gradually replace running tasks with the new version
+4. Monitor CloudWatch logs to verify the workload manager is operational
+
+## Troubleshooting
+
+### Task Won't Start
+
+Check ECS task logs in CloudWatch:
+1. Go to `/ecs/zenml-hybrid` log group
+2. Look for error messages about image pull failures or environment variable issues
+3. Verify IAM execution role has correct permissions
+
+### Database Connection Failed
+
+1. Verify database is running and accessible
+2. Check ECS security group allows outbound to RDS security group
+3. Verify `ZENML_DATABASE_URL` has correct hostname, port, and credentials
+4. Test connectivity from an ECS task using a MySQL client
+
+### Can't Reach Server via HTTPS
+
+1. Verify ALB is in "Active" state
+2. Check ALB target group - tasks should show "Healthy"
+3. Verify TLS certificate is valid for your domain
+4. Check DNS resolution: `nslookup zenml.mycompany.com`
+
+### Control Plane Connection Issues
+
+Check CloudWatch logs for:
+1. OAuth2 authentication errors - verify `ZENML_SERVER_PRO_OAUTH2_CLIENT_SECRET` is correct
+2. Network connectivity errors - verify NAT Gateway is operational
+3. Certificate validation errors - verify outbound HTTPS to cloudapi.zenml.io works
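+
+To tell network problems apart from authentication problems, it can help to confirm that the task can reach the control plane over HTTPS at all. The sketch below is one way to do that with `curl`; the function name `check_https_egress` is hypothetical, and any HTTP status (even an error status) is treated as proof that the connection and TLS handshake succeeded.
+
+```bash
+# Succeed if any HTTP response comes back from the URL.
+# curl reports status 000 when the connection or TLS handshake fails.
+check_https_egress() {
+  local url="${1:-https://cloudapi.zenml.io}"
+  local code
+  code="$(curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$url")" || true
+  if [ -z "$code" ] || [ "$code" = "000" ]; then
+    echo "no HTTPS response from ${url}" >&2
+    return 1
+  fi
+  echo "reached ${url} (HTTP ${code})"
+}
+```
+
+Run `check_https_egress` from a shell inside the task. A failure points at the NAT Gateway, routing, or certificate validation rather than at the OAuth2 secret.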
+
+## Updating the Deployment
+
+### Update Configuration
+
+1. Modify environment variables in the task definition
+2. Create a new task definition revision
+3. Update the ECS service to use the new task definition
+4. ECS will gradually replace old tasks with new ones
+
+### Upgrade ZenML Version
+
+1. Update the container image in the task definition
+2. Create a new task definition revision
+3. Update the ECS service
+4. Monitor CloudWatch logs during the update
+
+## Cleanup
+
+To remove the deployment:
+
+1. **Delete ECS Service**
+   - Go to ECS → Clusters → zenml-hybrid → Services
+   - Set the desired count of the zenml-server service to 0
+   - Delete the zenml-server service
+
+2. **Delete ECS Cluster**
+   - Delete the cluster once the service is removed
+
+3. **Delete ALB**
+   - Go to EC2 → Load Balancers
+   - Delete the ALB and associated target groups
+
+4. **Delete RDS Instance**
+   - Go to RDS → Databases
+   - Delete the zenml-hybrid-db instance
+   - Skip final snapshot if you don't need a backup
+
+5. **Delete VPC and Related Resources**
+   - Delete NAT Gateway (releases Elastic IP)
+   - Delete subnets, route tables, security groups
+   - Delete VPC
+
+6. **Clean Up Secrets**
+   - Go to Secrets Manager
+   - Delete zenml/pro/oauth2-client-secret
+
+## Next Steps
+
+- [Configure your organization in ZenML Cloud](https://cloud.zenml.io)
+- [Set up users and teams](../organization.md)
+- [Configure stacks and service connectors](https://docs.zenml.io/stacks)
+- [Run your first pipeline](https://docs.zenml.io/getting-started/quickstart)
+
+## Related Documentation
+
+- [Hybrid Deployment Overview](hybrid-deployment.md)
+- [Self-hosted Deployment Guide](self-hosted.md)
+- [AWS ECS Documentation](https://docs.aws.amazon.com/ecs/)
+- [AWS RDS Documentation](https://docs.aws.amazon.com/rds/)
diff --git a/docs/book/getting-started/zenml-pro/hybrid-deployment-helm.md b/docs/book/getting-started/zenml-pro/hybrid-deployment-helm.md
new file mode 100644
index 00000000000..66be88dfc42
--- /dev/null
+++ b/docs/book/getting-started/zenml-pro/hybrid-deployment-helm.md
@@ -0,0 +1,670 @@
+---
+description: Deploy ZenML Pro Hybrid using Kubernetes and Helm charts.
+layout:
+ title:
+ visible: true
+ description:
+ visible: true
+ tableOfContents:
+ visible: true
+ outline:
+ visible: true
+ pagination:
+ visible: true
+---
+
+# Hybrid Deployment on Kubernetes with Helm
+
+This guide provides step-by-step instructions for deploying ZenML Pro in a Hybrid setup using Kubernetes and Helm charts.
+
+## Prerequisites
+
+- Kubernetes cluster (1.24+) - EKS, GKE, AKS, or self-managed
+- `kubectl` configured to access your cluster
+- `helm` CLI (3.0+) installed
+- A domain name and TLS certificate for your ZenML server
+- MySQL database (managed or self-hosted)
+- Outbound HTTPS access to `cloudapi.zenml.io`
+
+Before starting, complete the setup described in [Hybrid Deployment Overview](hybrid-deployment.md):
+- Step 1: Set up ZenML Pro organization
+- Step 2: Configure your infrastructure (database, networking, TLS)
+- Step 3: Obtain Pro credentials from ZenML Support
+
+## Step 1: Prepare Helm Chart
+
+For OCI-based Helm charts, you can either pull the chart or install directly. To pull the chart first:
+
+```bash
+helm pull oci://public.ecr.aws/zenml/zenml --version
+```
+
+Alternatively, you can install directly from OCI (see Step 5 below).
+
+## Step 2: Create Kubernetes Namespace
+
+```bash
+kubectl create namespace zenml-hybrid
+```
+
+## Step 3: Create Secrets for Credentials
+
+Create a secret for your Pro OAuth2 credentials. Ask your ZenML Solutions Architect to send you the client secret:
+
+```bash
+kubectl -n zenml-hybrid create secret generic zenml-pro-credentials \
+ --from-literal=ZENML_SERVER_PRO_OAUTH2_CLIENT_SECRET=
+```
+
+
+If using a custom TLS certificate (self-signed or from a CA), create a secret:
+
+```bash
+kubectl -n zenml-hybrid create secret tls zenml-tls \
+ --cert=/path/to/tls.crt \
+ --key=/path/to/tls.key
+```
+
+## Step 4: Create Helm Values File
+
+Create a file `zenml-hybrid-values.yaml` with your configuration:
+
+```yaml
+# ZenML Server Configuration
+zenml:
+  # Server metadata
+  serverURL: https://zenml.mycompany.com
+
+  # Pro Hybrid Configuration
+  pro:
+    enabled: true
+    deploymentType: cloud
+
+    # ZenML Control Plane endpoints
+    apiURL: https://cloudapi.zenml.io
+    dashboardURL: https://cloud.zenml.io
+
+    # Your organization details
+    organizationID:
+    organizationName:
+
+    # Workspace details (provided by ZenML)
+    workspaceID:
+    workspaceName:
+
+    # OAuth2 authentication (stored in secret)
+    oauth2:
+      audience: https://cloudapi.zenml.io
+      clientSecretRef:
+        name: zenml-pro-credentials
+        key: ZENML_SERVER_PRO_OAUTH2_CLIENT_SECRET
+
+  # Database Configuration
+  # Note: Workspace servers only support MySQL, not PostgreSQL
+  database:
+    external:
+      type: mysql
+      host: mysql.mycompany.com
+      port: 3306
+      username: zenml_user
+      password:
+      database: zenml_hybrid
+
+  # Image Configuration
+  image:
+    repository: 715803424590.dkr.ecr.eu-central-1.amazonaws.com/zenml-pro-server
+    tag: "" # e.g., "0.73.0" - Match your ZenML OSS version
+    pullPolicy: IfNotPresent
+
+  # Ingress Configuration
+  ingress:
+    enabled: true
+    className: nginx # or your ingress class
+    host: zenml.mycompany.com
+
+    # TLS Configuration
+    tls:
+      enabled: true
+      secretName: zenml-tls
+
+    # Annotations for your ingress controller
+    annotations:
+      cert-manager.io/cluster-issuer: "letsencrypt-prod" # if using cert-manager
+
+  # Service Configuration
+  service:
+    type: ClusterIP
+    port: 80
+    targetPort: 8000
+
+  # Resource Limits
+  resources:
+    requests:
+      cpu: 250m
+      memory: 512Mi
+    limits:
+      cpu: 1000m
+      memory: 2Gi
+
+  # Analytics (optional)
+  analyticsOptIn: false
+
+  # Replica count
+  replicaCount: 1
+
+# Image pull secrets (if using private registry)
+imagePullSecrets: []
+
+# Pod Security Context
+podSecurityContext:
+  fsGroup: 1000
+  runAsNonRoot: true
+  runAsUser: 1000
+
+# Container Security Context
+securityContext:
+  allowPrivilegeEscalation: false
+  readOnlyRootFilesystem: true
+  runAsNonRoot: true
+  runAsUser: 1000
+  capabilities:
+    drop:
+      - ALL
+```
+
+## Step 5: Deploy with Helm
+
+Install the ZenML chart directly from OCI:
+
+```bash
+helm install zenml oci://public.ecr.aws/zenml/zenml \
+ --namespace zenml-hybrid \
+ --values zenml-hybrid-values.yaml \
+ --version
+```
+
+Or if you pulled the chart in Step 1, install from the local file:
+
+```bash
+helm install zenml ./zenml-.tgz \
+ --namespace zenml-hybrid \
+ --values zenml-hybrid-values.yaml
+```
+
+Monitor the deployment:
+
+```bash
+kubectl -n zenml-hybrid get pods -w
+```
+
+Wait for the pod to be running:
+
+```bash
+kubectl -n zenml-hybrid get pods
+# Output should show:
+# NAME                     READY   STATUS    RESTARTS   AGE
+# zenml-5c4b6d9dcd-7bhfp   1/1     Running   0          2m
+```
+
+## Step 6: Verify the Deployment
+
+### Check Service is Running
+
+```bash
+kubectl -n zenml-hybrid get svc
+kubectl -n zenml-hybrid get ingress
+```
+
+### Verify Control Plane Connection
+
+```bash
+kubectl -n zenml-hybrid logs deployment/zenml | tail -20
+```
+
+Look for messages indicating successful connection to the control plane.
+
+### Test HTTPS Connectivity
+
+```bash
+curl -k https://zenml.mycompany.com/health
+# Should return 200 OK with a JSON response
+```
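+
+Right after an install, the ingress and certificate can take a while to become ready, so a single request may fail even though the deployment is fine. The sketch below polls the health endpoint until it returns 200. The function name `wait_for_health` and its default retry values are assumptions, and unlike the command above it does not pass `-k`, so certificate problems show up as failures.
+
+```bash
+# Poll a URL until it returns HTTP 200 or the attempts run out.
+# Usage: wait_for_health URL [ATTEMPTS] [DELAY_SECONDS]
+wait_for_health() {
+  local url="$1" attempts="${2:-30}" delay="${3:-10}"
+  local i code
+  for ((i = 1; i <= attempts; i++)); do
+    # curl prints 000 as the status when the connection fails.
+    code="$(curl -s -o /dev/null -w '%{http_code}' "$url" || true)"
+    if [ "$code" = "200" ]; then
+      echo "healthy after ${i} attempt(s)"
+      return 0
+    fi
+    sleep "$delay"
+  done
+  echo "not healthy after ${attempts} attempts (last status: ${code})" >&2
+  return 1
+}
+```
+
+For example: `wait_for_health https://zenml.mycompany.com/health 30 10`.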
+
+### Access the Dashboard
+
+1. Navigate to `https://zenml.mycompany.com` in your browser
+2. You should be redirected to ZenML Cloud login
+3. Sign in with your organization credentials
+4. You should see your workspace listed
+
+## Step 7: (Optional) Enable Snapshot Support / Workload Manager
+
+Pipeline snapshots (running pipelines from the dashboard) require a workload manager. To enable one in a hybrid deployment, complete the steps below, choosing one of the implementation options in step 2:
+
+### 1. Create Kubernetes Resources for Workload Manager
+
+Create a dedicated namespace and service account:
+
+```bash
+kubectl create namespace zenml-workload-manager
+kubectl -n zenml-workload-manager create serviceaccount zenml-runner
+```
+
+### 2. Configure Workload Manager in Helm Values
+
+Add environment variables to your `zenml-hybrid-values.yaml`:
+
+**Option A: Kubernetes-based (Simplest)**
+
+```yaml
+zenml:
+  environment:
+    ZENML_SERVER_WORKLOAD_MANAGER_IMPLEMENTATION_SOURCE: zenml_cloud_plugins.kubernetes_workload_manager.KubernetesWorkloadManager
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_NAMESPACE: zenml-workload-manager
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_SERVICE_ACCOUNT: zenml-runner
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_RUNNER_IMAGE: 715803424590.dkr.ecr.eu-central-1.amazonaws.com/zenml-pro-server:
+```
+
+**Option B: AWS-based (if running on EKS)**
+
+```yaml
+zenml:
+  environment:
+    ZENML_SERVER_WORKLOAD_MANAGER_IMPLEMENTATION_SOURCE: zenml_cloud_plugins.aws_kubernetes_workload_manager.AWSKubernetesWorkloadManager
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_NAMESPACE: zenml-workload-manager
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_SERVICE_ACCOUNT: zenml-runner
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_BUILD_RUNNER_IMAGE: "true"
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_DOCKER_REGISTRY:
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_ENABLE_EXTERNAL_LOGS: "true"
+    ZENML_AWS_KUBERNETES_WORKLOAD_MANAGER_BUCKET: s3://your-bucket/zenml-logs
+    ZENML_AWS_KUBERNETES_WORKLOAD_MANAGER_REGION: us-east-1
+```
+
+**Option C: GCP-based (if running on GKE)**
+
+```yaml
+zenml:
+  environment:
+    ZENML_SERVER_WORKLOAD_MANAGER_IMPLEMENTATION_SOURCE: zenml_cloud_plugins.gcp_kubernetes_workload_manager.GCPKubernetesWorkloadManager
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_NAMESPACE: zenml-workload-manager
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_SERVICE_ACCOUNT: zenml-runner
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_BUILD_RUNNER_IMAGE: "true"
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_DOCKER_REGISTRY:
+```
+
+### 3. Configure Pod Resources (Optional but Recommended)
+
+```yaml
+zenml:
+  environment:
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_POD_RESOURCES: '{"requests": {"cpu": "500m", "memory": "512Mi"}, "limits": {"cpu": "2000m", "memory": "2Gi"}}'
+    # Quoted so the values are passed as strings, like the other environment variables
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_TTL_SECONDS_AFTER_FINISHED: "86400"
+    ZENML_SERVER_MAX_CONCURRENT_TEMPLATE_RUNS: "5"
+```
+
+### 4. Redeploy with Updated Values
+
+```bash
+helm upgrade zenml oci://public.ecr.aws/zenml/zenml \
+ --namespace zenml-hybrid \
+ --values zenml-hybrid-values.yaml
+```
+
+## Step 8: Configure Environment Variables (Advanced)
+
+For advanced configurations, you can set additional environment variables in your Helm values:
+
+```yaml
+zenml:
+  environment:
+    ZENML_LOGGING_LEVEL: INFO
+    ZENML_ANALYTICS_OPT_IN: "false"
+    # Add other environment variables as needed
+```
+
+## Database Configuration Examples
+
+### AWS RDS MySQL
+
+```yaml
+zenml:
+  database:
+    external:
+      type: mysql
+      host: zenml-db.123456789.us-east-1.rds.amazonaws.com
+      port: 3306
+      username: admin
+      password:
+      database: zenml_hybrid
+```
+
+### Google Cloud SQL MySQL
+
+```yaml
+zenml:
+  database:
+    external:
+      type: mysql
+      host: 34.123.45.67
+      port: 3306
+      username: root
+      password:
+      database: zenml_hybrid
+```
+
+### Self-Managed MySQL
+
+```yaml
+zenml:
+  database:
+    external:
+      type: mysql
+      host: mysql.internal.mycompany.com
+      port: 3306
+      username: zenml_user
+      password:
+      database: zenml_hybrid
+```
+
+## Networking & Firewall Configuration
+
+### Kubernetes Network Policy
+
+If your cluster enforces network policies, allow the egress traffic the server needs (DNS, outbound HTTPS, and the database):
+
+```yaml
+apiVersion: networking.k8s.io/v1
+kind: NetworkPolicy
+metadata:
+  name: zenml-egress
+  namespace: zenml-hybrid
+spec:
+  podSelector:
+    matchLabels:
+      app: zenml
+  policyTypes:
+    - Egress
+  egress:
+    # Allow DNS
+    - to:
+        - namespaceSelector: {}
+      ports:
+        - protocol: UDP
+          port: 53
+    # Allow outbound to ZenML Cloud
+    - to:
+        - ipBlock:
+            cidr: 0.0.0.0/0
+      ports:
+        - protocol: TCP
+          port: 443
+    # Allow database access (matches MySQL pods in this namespace; use an ipBlock rule for an external database)
+    - to:
+        - podSelector:
+            matchLabels:
+              app: mysql
+      ports:
+        - protocol: TCP
+          port: 3306
+```
+
+### Firewall Rules
+
+Ensure your infrastructure firewall allows:
+
+**Egress:**
+- Destination: `cloudapi.zenml.io` (HTTPS port 443)
+- Destination: Your database server (e.g., port 3306 for MySQL)
+
+**Ingress:**
+- Source: Your organization's networks or public internet
+- Destination: Your ZenML server domain (HTTPS port 443)
+
+## Ingress Controller Setup
+
+### Using NGINX Ingress Controller
+
+If not already installed:
+
+```bash
+helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
+helm repo update
+helm install nginx-ingress ingress-nginx/ingress-nginx \
+ --namespace ingress-nginx \
+ --create-namespace
+```
+
+Configure your Helm values:
+
+```yaml
+zenml:
+  ingress:
+    enabled: true
+    className: nginx
+    host: zenml.mycompany.com
+    annotations:
+      nginx.ingress.kubernetes.io/ssl-redirect: "true"
+      nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
+    tls:
+      enabled: true
+      secretName: zenml-tls
+```
+
+### Using Traefik
+
+```yaml
+zenml:
+  ingress:
+    enabled: true
+    className: traefik
+    host: zenml.mycompany.com
+    annotations:
+      traefik.ingress.kubernetes.io/router.entrypoints: websecure
+      traefik.ingress.kubernetes.io/router.tls: "true"
+    tls:
+      enabled: true
+      secretName: zenml-tls
+```
+
+## TLS Certificate Management
+
+### Self-Signed Certificates (Development Only)
+
+```bash
+# Generate certificate
+openssl req -x509 -newkey rsa:4096 -keyout tls.key -out tls.crt -days 365 -nodes \
+ -subj "/CN=zenml.mycompany.com"
+
+# Create secret
+kubectl -n zenml-hybrid create secret tls zenml-tls \
+ --cert=tls.crt --key=tls.key
+```
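+
+Before creating or rotating the `zenml-tls` secret, it can be useful to confirm what a certificate file actually contains. The sketch below uses standard `openssl x509` options; the helper names `cert_valid_for_days` and `show_cert_summary` are hypothetical.
+
+```bash
+# Succeed if the certificate is still valid N days from now (default 30).
+cert_valid_for_days() {
+  local cert_file="$1" days="${2:-30}"
+  openssl x509 -in "$cert_file" -noout -checkend "$((days * 86400))" >/dev/null
+}
+
+# Print the subject and expiry date for a quick visual check.
+show_cert_summary() {
+  openssl x509 -in "$1" -noout -subject -enddate
+}
+```
+
+For example, `show_cert_summary tls.crt` should list `zenml.mycompany.com` in the subject, and `cert_valid_for_days tls.crt 30 || echo "renew soon"` flags a certificate that is close to expiry.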
+
+### Using cert-manager with Let's Encrypt
+
+1. Install cert-manager:
+
+```bash
+helm repo add jetstack https://charts.jetstack.io
+helm repo update
+helm install cert-manager jetstack/cert-manager \
+ --namespace cert-manager \
+ --create-namespace \
+ --set installCRDs=true
+```
+
+2. Create ClusterIssuer:
+
+```yaml
+apiVersion: cert-manager.io/v1
+kind: ClusterIssuer
+metadata:
+  name: letsencrypt-prod
+spec:
+  acme:
+    server: https://acme-v02.api.letsencrypt.org/directory
+    email: admin@mycompany.com
+    privateKeySecretRef:
+      name: letsencrypt-prod
+    solvers:
+      - http01:
+          ingress:
+            class: nginx
+```
+
+3. Update Helm values:
+
+```yaml
+zenml:
+  ingress:
+    annotations:
+      cert-manager.io/cluster-issuer: letsencrypt-prod
+    tls:
+      enabled: true
+```
+
+## Persistent Storage (Optional)
+
+If you need persistent storage for the ZenML server:
+
+```yaml
+persistence:
+  enabled: true
+  storageClassName: standard
+  accessMode: ReadWriteOnce
+  size: 10Gi
+```
+
+## Scaling & High Availability
+
+### Multiple Replicas
+
+```yaml
+zenml:
+  replicaCount: 3
+```
+
+### Pod Disruption Budget
+
+```yaml
+podDisruptionBudget:
+  enabled: true
+  minAvailable: 1
+```
+
+### Horizontal Pod Autoscaler
+
+```yaml
+autoscaling:
+  enabled: true
+  minReplicas: 2
+  maxReplicas: 5
+  targetCPUUtilizationPercentage: 80
+```
+
+## Monitoring & Logging
+
+### Prometheus Metrics
+
+```yaml
+zenml:
+  metrics:
+    enabled: true
+    port: 8001
+```
+
+### Logging Configuration
+
+```yaml
+zenml:
+  logging:
+    level: INFO
+    format: json
+```
+
+Collect logs with:
+
+```bash
+kubectl -n zenml-hybrid logs deployment/zenml -f
+```
+
+## Updating the Deployment
+
+### Update Configuration
+
+1. Modify `zenml-hybrid-values.yaml`
+2. Upgrade with Helm:
+
+```bash
+helm upgrade zenml oci://public.ecr.aws/zenml/zenml \
+ --namespace zenml-hybrid \
+ --values zenml-hybrid-values.yaml \
+ --version
+```
+
+### Upgrade ZenML Version
+
+1. Check available versions:
+
+For OCI charts, you can check available versions by attempting to pull different versions, or contact ZenML Support for the latest version information.
+
+2. Update values file with new version
+3. Upgrade:
+
+```bash
+helm upgrade zenml oci://public.ecr.aws/zenml/zenml \
+ --namespace zenml-hybrid \
+ --values zenml-hybrid-values.yaml \
+ --version
+```
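+
+One way to check whether a chart version exists before upgrading is to ask Helm for its metadata, which fetches the chart from the OCI registry without installing anything. The sketch below assumes a Helm release with OCI support; the helper names and the candidate version numbers in the usage note are hypothetical.
+
+```bash
+# Chart reference used throughout this guide.
+CHART="${CHART:-oci://public.ecr.aws/zenml/zenml}"
+
+# Succeed if the chart metadata for the given version can be fetched.
+chart_version_exists() {
+  local version="$1"
+  helm show chart "$CHART" --version "$version" >/dev/null 2>&1
+}
+
+# Print the candidate versions that exist, one per line.
+list_existing_versions() {
+  local v
+  for v in "$@"; do
+    if chart_version_exists "$v"; then
+      echo "$v"
+    fi
+  done
+}
+```
+
+For example, `list_existing_versions 0.1.0 0.2.0` prints only those of the two placeholder versions that the registry actually serves.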
+
+## Troubleshooting
+
+### Pod won't start
+
+```bash
+kubectl -n zenml-hybrid describe pod zenml-xxxxx
+kubectl -n zenml-hybrid logs zenml-xxxxx
+```
+
+### Database connection errors
+
+```bash
+# Test database connectivity from pod
+kubectl -n zenml-hybrid exec -it zenml-xxxxx -- \
+ mysql -h -u -p -e "SELECT 1"
+```
+
+### Control plane connection issues
+
+```bash
+# Check logs for auth errors
+kubectl -n zenml-hybrid logs zenml-xxxxx | grep -i "oauth\|auth\|control"
+```
+
+### Ingress not working
+
+```bash
+kubectl -n zenml-hybrid get ingress
+kubectl -n zenml-hybrid describe ingress zenml
+```
+
+## Uninstalling
+
+```bash
+helm uninstall zenml --namespace zenml-hybrid
+kubectl delete namespace zenml-hybrid
+```
+
+## Next Steps
+
+- [Configure your organization in ZenML Cloud](https://cloud.zenml.io)
+- [Set up users and teams](../organization.md)
+- [Configure stacks and service connectors](https://docs.zenml.io/stacks)
+- [Run your first pipeline](https://docs.zenml.io/getting-started/quickstart)
+
+## Related Documentation
+
+- [Hybrid Deployment Overview](hybrid-deployment.md)
+- [Self-hosted Deployment Guide](self-hosted.md)
+- [ZenML Helm Chart Documentation](https://artifacthub.io/packages/helm/zenml/zenml)
diff --git a/docs/book/getting-started/zenml-pro/hybrid-deployment.md b/docs/book/getting-started/zenml-pro/hybrid-deployment.md
new file mode 100644
index 00000000000..e3789d75c82
--- /dev/null
+++ b/docs/book/getting-started/zenml-pro/hybrid-deployment.md
@@ -0,0 +1,411 @@
+---
+description: Learn about ZenML Pro Hybrid SaaS deployment - balancing control with convenience for enterprise MLOps.
+icon: building-shield
+---
+
+# Hybrid SaaS Deployment
+
+ZenML Pro Hybrid SaaS balances control and convenience: ZenML manages user authentication and RBAC through a cloud-hosted control plane, while all your data, metadata, and workspaces run securely within your own infrastructure.
+
+{% hint style="info" %}
+To learn more about Hybrid SaaS deployment, [book a call](https://www.zenml.io/book-your-demo).
+{% endhint %}
+
+## Overview
+
+The Hybrid deployment model is designed for organizations that need to keep sensitive data and metadata within their infrastructure boundaries while still benefiting from centralized user management and simplified operations.
+
+
+
+## Architecture
+
+### What Runs Where
+
+| Component | Location | Purpose |
+|-----------|----------|---------|
+| **Pro Control Plane** | ZenML Infrastructure | Manages authentication, RBAC, and global workspace coordination |
+| **ZenML Pro Server(s)** | Your Infrastructure | Handles pipeline orchestration and execution |
+| **Metadata Store** | Your Infrastructure | Stores all pipeline runs, model metadata, and tracking information |
+| **Secrets Store** | Your Infrastructure | Stores all credentials and sensitive configuration |
+| **Compute Resources** | Your infrastructure through [stacks](https://docs.zenml.io/stacks) | Executes pipeline steps and training jobs |
+| **Data & Artifacts** | Your infrastructure through [stacks](https://docs.zenml.io/stacks) | Stores datasets, models, and pipeline artifacts |
+
+### Data Flow
+
+For a detailed explanation of the common pipeline execution data flow across all deployment scenarios, see [Common Pipeline Execution Data Flow](deployments-overview.md#common-pipeline-execution-data-flow) in the Deployment Scenarios Overview.
+
+In a Hybrid deployment, users authenticate via the ZenML-hosted control plane (SSO), and RBAC policies are enforced there before token issuance.
+
+{% hint style="success" %}
+**Complete data sovereignty**: All metadata, secrets, and ML artifacts remain within your infrastructure. Only authentication and authorization data flows to the ZenML control plane.
+{% endhint %}
+
+## Key Benefits
+
+### 🔒 Enhanced Security & Compliance
+
+- **Data sovereignty**: All metadata and artifacts stay within your infrastructure
+- **Secret isolation**: Credentials never leave your environment
+- **VPN/Firewall compatible**: Workspaces operate behind your security perimeter
+- **Audit trails**: Complete logging within your infrastructure
+- **SOC 2 & ISO 27001 certified software**: Meets enterprise security and compliance benchmarks for your peace of mind
+
+### 🎯 Centralized Governance
+
+- **Unified user management**: Single control plane for all workspaces
+- **Consistent RBAC**: Centrally managed permissions across teams
+- **SSO integration**: Connect with your identity provider once
+- **Global visibility**: Platform teams see across all workspaces
+- **Standardized policies**: Enforce organizational standards
+
+### ⚖️ Balanced Control
+
+- **Infrastructure control**: Full control over workspace configuration and resources
+- **Reduced operational overhead**: ZenML manages the control plane
+- **Customization freedom**: Configure workspaces to specific team needs
+- **Network isolation**: Workspaces can be fully isolated per team/department
+- **Cost optimization**: Pay only for what you use in your infrastructure
+
+### 🚀 Production Ready
+
+- **Automatic updates**: Control plane and UI maintained by ZenML
+- **Professional support**: Direct access to ZenML experts
+
+## Ideal Use Cases
+
+Hybrid SaaS is perfect for:
+
+- **Regulated industries** (finance, healthcare, government) with strict data residency requirements
+- **Organizations with centralized MLOps teams** managing multiple business units
+- **Companies with existing VPN/firewall policies** that restrict inbound connections
+- **Enterprises requiring audit trails** of all data access within their infrastructure
+- **Teams needing customization** while maintaining centralized user management
+- **Organizations with compliance requirements** mandating on-premises metadata storage
+
+## Architecture Details
+
+### Network Security
+
+#### Outbound-Only Connections
+Workspaces initiate outbound-only connections to the control plane:
+- No inbound connections required to your infrastructure
+- Compatible with strict firewall policies
+
+#### Multi-Workspace Isolation
+Each workspace can be:
+- Deployed in separate VPCs/networks
+- Isolated per team, department, or customer
+- Configured with different security policies
+- Managed independently by different teams
+
+### Authentication & Authorization Flow
+
+```mermaid
+graph LR
+ A[User] -->|1. Login| B[Control Plane ZenML Infrastructure]
+ B -->|2. Auth Token| A
+ A -->|3. Access Workspace| C[Workspace Your Infrastructure]
+ C -->|4. Validate Token| B
+ B -->|5. Authorization| C
+ C -->|6. Execute| D[Your Resources]
+```
+
+1. User authenticates with ZenML control plane (SSO)
+2. Control plane issues authentication credentials
+3. User accesses workspace with credentials
+4. Workspace validates credentials with control plane
+5. Control plane confirms authentication and authorization (RBAC)
+6. Workspace executes operations on your infrastructure
+
+### Data Residency
+
+| Data Type | Storage Location | Purpose |
+|-----------|-----------------|---------|
+| User metadata | Control Plane | Authentication only |
+| RBAC policies | Control Plane | Authorization decisions |
+| Pipeline metadata | Your Infrastructure | Run history, metrics, parameters |
+| Model metadata | Your Infrastructure | Model versions, stages, annotations |
+| Artifacts | Your Infrastructure | Datasets, models, visualizations |
+| Secrets | Your Infrastructure | Cloud credentials, API keys |
+| Logs | Your Infrastructure | Step outputs, debug information |
+
+## Deployment Architecture
+
+### Single Organization, Multiple Workspaces
+
+```mermaid
+graph TB
+ subgraph clients["Client Machines (Developer Laptops/CI)"]
+ C1[Data Scientist]
+ C2[ML Engineer]
+ C3[CI/CD Pipeline]
+ end
+
+ subgraph zenml["ZenML Infrastructure"]
+ CP[Control Plane - Authentication SSO - RBAC Management - Workspace Registry]
+ end
+
+ subgraph customer["Your Infrastructure"]
+ subgraph ws1["Workspace 1 - Team A"]
+ W1[ZenML Server Metadata DB Secrets Store]
+ R1[Your Resources Orchestrator Artifact Store]
+ end
+
+ subgraph ws2["Workspace 2 - Team B"]
+ W2[ZenML Server Metadata DB Secrets Store]
+ R2[Your Resources Orchestrator Artifact Store]
+ end
+
+ subgraph wsn["Workspace N - Platform"]
+ WN[ZenML Server Metadata DB Secrets Store]
+ RN[Your Resources Orchestrator Artifact Store]
+ end
+ end
+
+ C1 -->|1. Authenticate| CP
+ C2 -->|1. Authenticate| CP
+ C3 -->|1. Authenticate| CP
+
+ CP -->|2. RBAC Token| C1
+ CP -->|2. RBAC Token| C2
+ CP -->|2. RBAC Token| C3
+
+ C1 -->|3. Run Pipeline| W1
+ C2 -->|3. Run Pipeline| W2
+ C3 -->|3. Run Pipeline| WN
+
+ W1 -.->|Validate Token| CP
+ W2 -.->|Validate Token| CP
+ WN -.->|Validate Token| CP
+
+ W1 -->|Execute| R1
+ W2 -->|Execute| R2
+ WN -->|Execute| RN
+
+ style zenml fill:#e1f5ff
+ style customer fill:#f0f0f0
+ style clients fill:#fff4e6
+```
+
+**Connection Flow:**
+1. **Clients authenticate** with ZenML Control Plane (SSO) - hosted by ZenML
+2. **Control Plane issues** RBAC-validated tokens to clients
+3. **Clients connect** to their assigned workspace(s) in your infrastructure
+4. **Workspaces validate** tokens with Control Plane (outbound-only connection)
+5. **Pipelines execute** on your infrastructure resources
+
+### Multi-Region Support
+
+Deploy workspaces across different regions while maintaining centralized control:
+- Workspaces in US, EU, APAC regions
+- Data residency compliance per region
+- Centralized user management
+- Consistent RBAC across regions
+
+## Setup Process
+
+### 1. Initial Configuration
+
+[Book a demo](https://www.zenml.io/book-your-demo) to get started. The ZenML team will:
+- Help set up your organization in the control plane
+- Establish secure communication channels
+- (optional) Configure SSO integration
+
+### 2. Workspace Deployment
+
+Deploy ZenML workspaces in your infrastructure. Workspaces can be deployed on:
+
+**Supported Deployment Backends:**
+- **Kubernetes** (Recommended) - EKS, GKE, AKS, or self-managed clusters
+- **AWS ECS** - Elastic Container Service
+- **Container orchestration alternatives** - Other Kubernetes distributions
+
+**Requirements:**
+- **Database**: MySQL database in your infrastructure (workspace servers do not support PostgreSQL)
+- **Network**: Egress HTTPS access to `cloudapi.zenml.io` (for Control Plane communication)
+- **Resources**: Compute resources for the ZenML server container
+
+**Deployment Tools:**
+- **Kubernetes**: We provide officially supported Helm charts
+- **Non-Kubernetes environments**: We recommend using infrastructure-as-code tools like Terraform, Pulumi, or CloudFormation to manage server lifecycle
+
+
+### 3. Configure Infrastructure Access
+
+Once your workspace is deployed, configure access to your cloud resources using ZenML's infrastructure abstractions:
+
+**Stack Components**: Individual infrastructure elements that your pipelines need to run - orchestrators (Kubernetes, Airflow, etc.), artifact stores (S3, GCS, Azure Blob), container registries, experiment trackers, model deployers, and more. Each component type has multiple "flavors" supporting different technologies.
+
+**Stacks**: A stack is a named collection of components that define where and how your pipelines run. By combining different components into stacks, you can easily switch between environments (development, staging, production) or infrastructure providers without changing your pipeline code.
+
+**Service Connectors**: Service connectors provide secure, reusable authentication to cloud providers and services. Instead of managing credentials manually in each component, connectors handle authentication centrally and can be shared across your team with appropriate access controls.
+
+Learn more:
+- [Stack Components Documentation](https://docs.zenml.io/stacks) - Available components and how to configure them
+- [Stacks Documentation](https://docs.zenml.io/user-guide/production-guide/understand-stacks) - Complete guide to configuring and managing stacks
+- [Service Connectors Documentation](https://docs.zenml.io/how-to/auth-management/service-connectors-guide) - How to set up authentication to cloud providers
+
+### 4. Set Up Users & Teams
+
+Manage users through the control plane:
+- Invite team members via email
+- Assign roles and permissions
+- Create teams for different departments
+- Configure workspace access
+
+
+## Organizational Structure
+
+### Recommended Hierarchy
+
+```mermaid
+graph TB
+ subgraph cp["Control Plane (ZenML Infrastructure)"]
+ ORG[Organization]
+ PT[Platform Team Org Admins]
+ end
+
+ subgraph infra["Your Infrastructure"]
+ subgraph ws1["DS Team 1 Workspace"]
+ W1[ZenML Server Metadata DB]
+ T1[Team Members]
+ S1[Stacks managed by Platform Team]
+ end
+
+ subgraph ws2["DS Team 2 Workspace"]
+ W2[ZenML Server Metadata DB]
+ T2[Team Members]
+ S2[Stacks managed by Platform Team]
+ end
+ end
+
+ ORG --> PT
+ PT -.->|Configure & Manage| S1
+ PT -.->|Configure & Manage| S2
+ PT -.->|Cross-workspace Admin Access| W1
+ PT -.->|Cross-workspace Admin Access| W2
+
+ T1 -->|Use Stacks Run Pipelines Create Projects| W1
+ T2 -->|Use Stacks Run Pipelines Create Projects| W2
+
+ style cp fill:#e1f5ff
+ style infra fill:#f0f0f0
+ style PT fill:#ffd700
+ style T1 fill:#98fb98
+ style T2 fill:#98fb98
+```
+
+**Access Model:**
+- **Platform Team**: Organization admins with cross-workspace access. They configure and manage stacks, service connectors, and infrastructure across all workspaces
+- **DS/ML Teams**: Limited workspace-level access. Can use pre-configured stacks to run pipelines, create projects, and manage workspace-level secrets, but cannot modify stack configurations or global settings
+- **Workspace Isolation**: Each workspace runs independently in your infrastructure with its own ZenML server and metadata store
+
+## Cost Considerations
+
+### Infrastructure Costs
+You control costs by managing:
+- Compute resources (scale up/down as needed)
+- Storage (artifact stores, databases)
+- Networking (data transfer, load balancers)
+- Backups and disaster recovery
+
+### ZenML Costs
+ZenML provides:
+- Control plane management (included)
+- Professional support (included)
+- Regular updates and security patches
+- Usage-based pricing per workspace
+
+## Security Documentation
+
+For software deployed on your infrastructure, ZenML provides:
+
+- **Vulnerability Assessment Reports**: Comprehensive security analysis available on request
+- **Software Bill of Materials (SBOM)**: Complete dependency inventory for compliance
+- **Compliance documentation**: Support for your security audits and certifications
+- **Architecture review**: Security team consultation for deployment planning
+
+Contact [cloud@zenml.io](mailto:cloud@zenml.io) to request security documentation.
+
+## Monitoring & Maintenance
+
+### Control Plane (ZenML Managed)
+- ✅ Automatic updates
+- ✅ Security patches
+- ✅ Uptime monitoring
+- ✅ Backup and recovery
+
+### Workspaces (Your Responsibility)
+- Database maintenance and backups
+- Workspace version updates (with ZenML guidance)
+- Infrastructure scaling
+- Resource monitoring
+
+### Support Included
+- Professional support with SLA
+- Architecture consultation
+- Migration assistance
+- Security advisory updates
+
+## Comparison with Other Deployments
+
+| Feature | SaaS | Hybrid SaaS | Self-hosted |
+|---------|------|-------------|------------|
+| Setup Time | Minutes | Hours to Days | Days to Weeks |
+| Metadata Location | ZenML Infra | Your Infra | Your Infra |
+| Secret Management | ZenML or Yours | Your Infra | Your Infra |
+| User Management | ZenML Managed | ZenML Managed | Self-Managed |
+| Maintenance | Zero | Workspace Only | Full Stack |
+| Control | Minimal | Moderate | Complete |
+| Best For | Fast start | Security + Convenience | Strictest compliance |
+
+[Compare all deployment options →](README.md#deployment-scenarios)
+
+## Migration Paths
+
+### From ZenML OSS
+1. Deploy a ZenML Pro-compatible workspace in your own infrastructure (you can start from your existing ZenML OSS workspace deployment).
+   - **Update your Docker image**: Replace the OSS ZenML server image with the latest Pro Hybrid image provided by ZenML.
+   - **Set required environment variables**: Add or update environment variables according to the ZenML Pro documentation (for example: `ZENML_PRO_CONTROL_PLANE_URL`, `ZENML_PRO_CONTROL_PLANE_CLIENT_ID`, secrets, and SSO configuration as instructed by ZenML).
+   - **Restart your deployment** to apply these changes.
+2. Migrate users and teams
+3. Run `zenml login` to authenticate via [cloud.zenml.io](https://cloud.zenml.io) and connect your SDK clients to the new workspace
+
+### From SaaS to Hybrid
+
+If you're interested in migrating from the ZenML Pro SaaS deployment to a Hybrid SaaS setup, we're here to help guide you through every step of the process. Because migration paths can vary depending on your organization’s size, data residency requirements, and current ZenML setup, we recommend discussing your plans with a ZenML solutions architect.
+
+**Next steps:**
+
+- [Book a migration consultation →](https://www.zenml.io/book-your-demo)
+- Or email us at [cloud@zenml.io](mailto:cloud@zenml.io)
+
+Your ZenML representative will provide you with a tailored migration checklist, technical documentation, and direct support to ensure a smooth transition with minimal downtime.
+
+
+### Between Workspaces
+
+A workspace deep copy feature for migrating pipelines and artifacts between workspaces is coming soon.
+
+## Detailed Architecture Diagram
+
+
+
+## Related Resources
+
+- [System Architecture Overview](../system-architectures.md#zenml-pro-hybrid-saas)
+- [Deployment Scenarios Overview](deployments-overview.md)
+- [SaaS Deployment](saas-deployment.md)
+- [Self-hosted Deployment](self-hosted-deployment.md)
+- [Workload Managers](workload-managers.md)
+- [Self-hosted Deployment Guide](self-hosted.md)
+- [Workspaces](workspaces.md)
+- [Organizations](organization.md)
+
+## Get Started
+
+Ready to deploy ZenML Pro in Hybrid mode?
+
+[Book a Demo](https://www.zenml.io/book-your-demo){ .md-button .md-button--primary }
+
+Have questions? [Contact us](mailto:cloud@zenml.io) or check out our [documentation](https://docs.zenml.io).
diff --git a/docs/book/getting-started/zenml-pro/on-prem-deployment-helm.md b/docs/book/getting-started/zenml-pro/on-prem-deployment-helm.md
new file mode 100644
index 00000000000..b47c88c839d
--- /dev/null
+++ b/docs/book/getting-started/zenml-pro/on-prem-deployment-helm.md
@@ -0,0 +1,863 @@
+---
+description: Deploy ZenML Pro Air-gapped on Kubernetes with Helm - complete self-hosted setup with no external dependencies.
+layout:
+ title:
+ visible: true
+ description:
+ visible: true
+ tableOfContents:
+ visible: true
+ outline:
+ visible: true
+ pagination:
+ visible: true
+---
+
+# Self-hosted Deployment on Kubernetes with Helm
+
+This guide provides step-by-step instructions for deploying ZenML Pro in a fully air-gapped setup on Kubernetes using Helm charts. In an air-gapped deployment, all components run within your infrastructure with zero external dependencies.
+
+## Architecture Overview
+
+All components run entirely within your Kubernetes cluster and infrastructure:
+
+```
+┌──────────────────────────────────────────────────┐
+│         Your Air-gapped Infrastructure           │
+│                                                  │
+│  ┌────────────────────────────────────────────┐  │
+│  │         Kubernetes Cluster                 │  │
+│  │                                            │  │
+│  │  ┌─────────────────────────────────────┐   │  │
+│  │  │  ZenML Pro Control Plane            │   │  │
+│  │  │  - Authentication & Authorization   │   │  │
+│  │  │  - RBAC Management                  │   │  │
+│  │  │  - Dashboard                        │   │  │
+│  │  └─────────────────────────────────────┘   │  │
+│  │                                            │  │
+│  │  ┌─────────────────────────────────────┐   │  │
+│  │  │  ZenML Workspace Servers            │   │  │
+│  │  │  (one or more)                      │   │  │
+│  │  └─────────────────────────────────────┘   │  │
+│  │                                            │  │
+│  │  ┌─────────────────────────────────────┐   │  │
+│  │  │  Load Balancer / Ingress            │   │  │
+│  │  │  (HTTPS with internal CA)           │   │  │
+│  │  └─────────────────────────────────────┘   │  │
+│  └────────────────────────────────────────────┘  │
+│                                                  │
+│  ┌────────────────────────────────────────────┐  │
+│  │  PostgreSQL Database                       │  │
+│  │  (for metadata storage)                    │  │
+│  └────────────────────────────────────────────┘  │
+│                                                  │
+│  ┌────────────────────────────────────────────┐  │
+│  │  Internal Docker Registry                  │  │
+│  │  (for container images)                    │  │
+│  └────────────────────────────────────────────┘  │
+│                                                  │
+│  ┌────────────────────────────────────────────┐  │
+│  │  Object Storage / NFS                      │  │
+│  │  (for artifacts & backups)                 │  │
+│  └────────────────────────────────────────────┘  │
+│                                                  │
+└──────────────────────────────────────────────────┘
+     🔒 Completely Isolated - No External Access
+```
+
+## Prerequisites
+
+Before starting, you need:
+
+**Infrastructure:**
+- Kubernetes cluster (1.24+) within your air-gapped network
+- Databases for metadata storage: MySQL for workspace servers, and PostgreSQL (12+) or MySQL for the control plane (see Step 5)
+- Internal Docker registry (Harbor, Quay, Artifactory, etc.)
+- Load balancer or Ingress controller for HTTPS
+- NFS or object storage for artifacts (optional)
+
+**Network:**
+- Internal DNS resolution
+- TLS certificates signed by your internal CA
+- Network connectivity between cluster components
+
+**Tools (on a machine with internet access for initial setup):**
+- Docker
+- Helm (3.8+ recommended, since the charts are pulled from `oci://` registries)
+- Access to pull ZenML Pro images from private registries (credentials from ZenML)
+
+## Step 1: Prepare Offline Artifacts
+
+This step is performed on a machine with internet access, then transferred to your air-gapped environment.
+
+### 1.1 Pull Container Images
+
+On a machine with internet access and access to the ZenML Pro container registries:
+
+1. Authenticate to the ZenML Pro container registries (AWS ECR or GCP Artifact Registry)
+   - Use credentials provided by ZenML Support
+   - Follow registry-specific authentication procedures
+
+2. Pull all required images:
+   - **Pro Control Plane images:**
+     - `715803424590.dkr.ecr.eu-west-1.amazonaws.com/zenml-pro-api:<version>`
+     - `715803424590.dkr.ecr.eu-west-1.amazonaws.com/zenml-pro-dashboard:<version>`
+   - **Workspace Server image:**
+     - `715803424590.dkr.ecr.eu-central-1.amazonaws.com/zenml-pro-server:<version>`
+   - **Client image (for pipelines):**
+     - `zenmldocker/zenml:<version>`
+
+   Example pull commands:
+   ```bash
+   docker pull 715803424590.dkr.ecr.eu-west-1.amazonaws.com/zenml-pro-api:<version>
+   docker pull 715803424590.dkr.ecr.eu-west-1.amazonaws.com/zenml-pro-dashboard:<version>
+   docker pull 715803424590.dkr.ecr.eu-central-1.amazonaws.com/zenml-pro-server:<version>
+   docker pull zenmldocker/zenml:<version>
+   ```
+
+3. Decide on the names the images will have in your internal registry (the images are re-tagged and pushed in Step 3, after they have been loaded in the air-gapped environment):
+   ```
+   internal-registry.mycompany.com/zenml/zenml-pro-api:<version>
+   internal-registry.mycompany.com/zenml/zenml-pro-dashboard:<version>
+   internal-registry.mycompany.com/zenml/zenml-pro-server:<version>
+   internal-registry.mycompany.com/zenml/zenml:<version>
+   ```
+
+4. Save images to tar files for transfer:
+   ```bash
+   docker save 715803424590.dkr.ecr.eu-west-1.amazonaws.com/zenml-pro-api:<version> > zenml-pro-api.tar
+   docker save 715803424590.dkr.ecr.eu-west-1.amazonaws.com/zenml-pro-dashboard:<version> > zenml-pro-dashboard.tar
+   docker save 715803424590.dkr.ecr.eu-central-1.amazonaws.com/zenml-pro-server:<version> > zenml-pro-server.tar
+   docker save zenmldocker/zenml:<version> > zenml-client.tar
+   ```
+
+### 1.2 Download Helm Charts
+
+On the same machine with internet access:
+
+1. Pull the Helm charts:
+   - ZenML Pro Control Plane: `oci://public.ecr.aws/zenml/zenml-pro`
+   - ZenML Workspace Server: `oci://public.ecr.aws/zenml/zenml`
+
+2. Save charts as `.tgz` files for transfer
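+
+As an illustration, the charts can be fetched with `helm pull`, which downloads a chart as a versioned `.tgz` archive. This is a minimal sketch rather than an official procedure: `pull_chart` and the `HELM_BIN` override are hypothetical names introduced here, and the chart versions are the ones ZenML Support gives you. It also assumes a Helm release with OCI registry support (generally available since Helm 3.8).
+
+```shell
+# Hypothetical helper: pull one ZenML chart from the public OCI registry.
+#   $1 = chart name (zenml-pro or zenml)
+#   $2 = chart version
+#   $3 = directory to store the .tgz archive in
+pull_chart() {
+  mkdir -p "$3" || return 1
+  "${HELM_BIN:-helm}" pull "oci://public.ecr.aws/zenml/$1" \
+    --version "$2" --destination "$3"
+}
+```
+
+Calling `pull_chart zenml-pro <version> charts` and `pull_chart zenml <version> charts` would leave both archives in `charts/`, ready to be added to the bundle.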
+
+### 1.3 Create Offline Bundle
+
+Create a bundle containing all artifacts:
+
+```
+zenml-air-gapped-bundle/
+├── images/
+│   ├── zenml-pro-api.tar
+│   ├── zenml-pro-dashboard.tar
+│   ├── zenml-pro-server.tar
+│   └── zenml-client.tar
+├── charts/
+│   ├── zenml-pro-<version>.tgz
+│   └── zenml-<version>.tgz
+└── manifest.txt
+```
+
+The manifest should document:
+- All image names and versions
+- Helm chart versions
+- Date of bundle creation
+- Required internal registry URLs
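+
+One possible way to generate such a manifest is sketched below. The function name `write_manifest`, the `INTERNAL_REGISTRY` variable, and the manifest layout are assumptions made for illustration. The SHA-256 checksums are an optional extra beyond the list above, included so the bundle can be verified after transfer; the sketch assumes `sha256sum` from GNU coreutils is available.
+
+```shell
+# Hypothetical helper: write manifest.txt for an offline bundle.
+#   $1    = bundle directory (containing images/ and charts/)
+#   $2... = full image references (name:version) included in the bundle
+write_manifest() {
+  bundle_dir=$1
+  shift
+  {
+    echo "Bundle created: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
+    echo "Internal registry: ${INTERNAL_REGISTRY:-internal-registry.mycompany.com/zenml}"
+    echo "Images:"
+    for ref in "$@"; do
+      echo "  $ref"
+    done
+    echo "Files (SHA-256):"
+    # Record a checksum for every image archive and chart in the bundle.
+    (
+      cd "$bundle_dir" || exit 1
+      find images charts -type f | sort | while IFS= read -r f; do
+        sha256sum "$f"
+      done
+    )
+  } > "$bundle_dir/manifest.txt"
+}
+```
+
+The chart versions are captured through the chart file names, and the image versions through the references passed as arguments.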
+
+## Step 2: Transfer to Air-gapped Environment
+
+Transfer the bundle to your air-gapped environment using approved methods:
+- Physical media (USB drive, external drive)
+- Approved secure file transfer system
+- Air-gap transfer appliances
+- Any method compliant with your security policies
+
+## Step 3: Load Images into Internal Registry
+
+In your air-gapped environment, load the images:
+
+1. Load all image archives into the local Docker daemon:
+   ```bash
+   cd images/
+   for file in *.tar; do docker load < "$file"; done
+   ```
+
+2. Tag images for your internal registry:
+   ```bash
+   docker tag 715803424590.dkr.ecr.eu-west-1.amazonaws.com/zenml-pro-api:<version> internal-registry.mycompany.com/zenml/zenml-pro-api:<version>
+   docker tag 715803424590.dkr.ecr.eu-west-1.amazonaws.com/zenml-pro-dashboard:<version> internal-registry.mycompany.com/zenml/zenml-pro-dashboard:<version>
+   docker tag 715803424590.dkr.ecr.eu-central-1.amazonaws.com/zenml-pro-server:<version> internal-registry.mycompany.com/zenml/zenml-pro-server:<version>
+   docker tag zenmldocker/zenml:<version> internal-registry.mycompany.com/zenml/zenml:<version>
+   ```
+
+3. Push images to your internal registry:
+   ```bash
+   docker push internal-registry.mycompany.com/zenml/zenml-pro-api:<version>
+   docker push internal-registry.mycompany.com/zenml/zenml-pro-dashboard:<version>
+   docker push internal-registry.mycompany.com/zenml/zenml-pro-server:<version>
+   docker push internal-registry.mycompany.com/zenml/zenml:<version>
+   ```
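+
+The tag-and-push steps follow one pattern: keep the final `name:tag` component of each source reference and prefix it with the internal registry. A rough sketch of that mapping is shown below; `internal_ref`, `retag_and_push`, `INTERNAL_REGISTRY`, and `DOCKER_BIN` are hypothetical names used only for this example.
+
+```shell
+# Hypothetical helper: map a source image reference to its name in the
+# internal registry by keeping only the final "name:tag" component.
+internal_ref() {
+  registry=${INTERNAL_REGISTRY:-internal-registry.mycompany.com/zenml}
+  echo "$registry/${1##*/}"
+}
+
+# Retag and push every image reference passed as an argument.
+retag_and_push() {
+  for src in "$@"; do
+    dst=$(internal_ref "$src")
+    "${DOCKER_BIN:-docker}" tag "$src" "$dst" || return 1
+    "${DOCKER_BIN:-docker}" push "$dst" || return 1
+  done
+}
+```
+
+With this mapping, `zenmldocker/zenml:<version>` becomes `internal-registry.mycompany.com/zenml/zenml:<version>`, matching the target names used above.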
+
+## Step 4: Create Kubernetes Namespace and Secrets
+
+```bash
+# Create namespace for ZenML Pro
+kubectl create namespace zenml-pro
+
+# Create secret for internal registry credentials (if needed)
+kubectl -n zenml-pro create secret docker-registry internal-registry-secret \
+  --docker-server=internal-registry.mycompany.com \
+  --docker-username=<your-username> \
+  --docker-password=<your-password>
+
+# Create secret for TLS certificate
+kubectl -n zenml-pro create secret tls zenml-tls \
+ --cert=/path/to/tls.crt \
+ --key=/path/to/tls.key
+```
+
+## Step 5: Set Up Databases
+
+Create database instances (within your air-gapped network):
+
+**Important Database Support:**
+- **Control Plane**: Supports both PostgreSQL and MySQL
+- **Workspace Servers**: Only support MySQL (PostgreSQL is not supported)
+
+**Configuration:**
+- **Accessibility**: Reachable from your Kubernetes cluster
+- **Databases**: At least 2 (one for control plane, one for workspace)
+- **Users**: Create dedicated database users with permissions
+- **Backups**: Configure automated backups to local storage
+- **Monitoring**: Enable local log aggregation
+
+**Connection strings needed for later:**
+- Control Plane DB (PostgreSQL or MySQL): `postgresql://user:password@db-host:5432/zenml_pro` or `mysql://user:password@db-host:3306/zenml_pro`
+- Workspace DB (MySQL only): `mysql://user:password@db-host:3306/zenml_workspace`
+
+## Step 6: Configure Helm Values for Control Plane
+
+Create a file `zenml-pro-values.yaml`:
+
+```yaml
+# ZenML Pro Control Plane Values
+
+zenml:
+  # Image configuration - use your internal registry
+  image:
+    api:
+      repository: internal-registry.mycompany.com/zenml/zenml-pro-api
+      tag: "<version>"  # e.g., "0.10.24"
+    dashboard:
+      repository: internal-registry.mycompany.com/zenml/zenml-pro-dashboard
+      tag: "<version>"  # e.g., "0.10.24"
+
+  # Server URL - use your internal domain
+  serverURL: https://zenml-pro.internal.mycompany.com
+
+  # Database for Control Plane
+  database:
+    external:
+      type: postgresql
+      host: postgres.internal.mycompany.com
+      port: 5432
+      username: zenml_pro_user
+      password: <password>
+      database: zenml_pro
+
+  # Ingress configuration
+  ingress:
+    enabled: true
+    className: nginx  # or your ingress controller
+    host: zenml-pro.internal.mycompany.com
+    tls:
+      enabled: true
+      secretName: zenml-tls
+
+  # Authentication (no external IdP needed for air-gap)
+  auth:
+    password: <password>
+
+  # Resource constraints
+  resources:
+    requests:
+      cpu: 500m
+      memory: 1Gi
+    limits:
+      cpu: 2000m
+      memory: 4Gi
+
+# Image pull secrets for internal registry
+imagePullSecrets:
+  - name: internal-registry-secret
+
+# Pod security context
+podSecurityContext:
+  fsGroup: 1000
+  runAsNonRoot: true
+  runAsUser: 1000
+```
+
+## Step 7: Deploy ZenML Pro Control Plane
+
+Using the local Helm chart:
+
+```bash
+helm install zenml-pro ./zenml-pro-<version>.tgz \
+  --namespace zenml-pro \
+  --values zenml-pro-values.yaml
+```
+
+Verify deployment:
+
+```bash
+kubectl -n zenml-pro get pods
+kubectl -n zenml-pro get svc
+kubectl -n zenml-pro get ingress
+```
+
+Wait for all pods to be running and healthy.
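+
+Instead of polling `kubectl get pods` by hand, the wait can be scripted with `kubectl wait`. The sketch below is one way to do that; `wait_for_pods` and `KUBECTL_BIN` are hypothetical names, and the 300-second default timeout is an arbitrary assumption. If the namespace also contains pods from completed Jobs, those never report `Ready`, so the selection may need to be narrowed.
+
+```shell
+# Hypothetical helper: block until every pod in a namespace reports Ready.
+#   $1 = namespace
+#   $2 = timeout (optional, defaults to 300s)
+wait_for_pods() {
+  "${KUBECTL_BIN:-kubectl}" --namespace "$1" wait pods --all \
+    --for=condition=Ready --timeout="${2:-300s}"
+}
+```
+
+For the control plane this would be invoked as `wait_for_pods zenml-pro`; a non-zero exit status means at least one pod did not become ready in time.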
+
+## Step 8: Enroll Workspace in Control Plane
+
+Before deploying the workspace server, you must enroll it in the control plane to obtain the necessary enrollment credentials.
+
+1. **Access the Control Plane Dashboard**
+   - Navigate to `https://zenml-pro.internal.mycompany.com`
+   - Log in with your admin credentials
+
+2. **Create an Organization** (if not already created)
+   - Go to Organization settings
+   - Create a new organization or use an existing one
+   - Note the Organization ID and Name
+
+3. **Enroll the Workspace**
+   - Use the enrollment script from the [Self-hosted Deployment Guide](self-hosted.md#enrolling-a-workspace), or
+   - Create a workspace through the dashboard and obtain:
+     - Enrollment Key
+     - Organization ID
+     - Organization Name
+     - Workspace ID
+     - Workspace Name
+
+4. **Save these values** - you'll need them in the next step
+
+## Step 9: Configure Helm Values for Workspace Server
+
+Create a file `zenml-workspace-values.yaml`:
+
+```yaml
+zenml:
+  # Image configuration - use your internal registry
+  image:
+    repository: internal-registry.mycompany.com/zenml/zenml-pro-server
+    tag: "<version>"  # e.g., "0.73.0"
+
+  # Server URL
+  serverURL: https://zenml-workspace.internal.mycompany.com
+
+  # Database for Workspace
+  # Note: Workspace servers only support MySQL, not PostgreSQL
+  database:
+    external:
+      type: mysql
+      host: mysql.internal.mycompany.com
+      port: 3306
+      username: zenml_workspace_user
+      password: <password>
+      database: zenml_workspace
+
+  # Pro configuration - connect to local control plane
+  pro:
+    enabled: true
+    apiURL: https://zenml-pro.internal.mycompany.com/api/v1
+    dashboardURL: https://zenml-pro.internal.mycompany.com
+    enrollmentKey: <enrollment-key>
+    organizationID: <organization-id>
+    organizationName: <organization-name>
+    workspaceID: <workspace-id>
+    workspaceName: <workspace-name>
+
+  # Ingress configuration
+  ingress:
+    enabled: true
+    className: nginx
+    host: zenml-workspace.internal.mycompany.com
+    tls:
+      enabled: true
+      secretName: zenml-tls
+
+  # Resource constraints
+  resources:
+    requests:
+      cpu: 250m
+      memory: 512Mi
+    limits:
+      cpu: 1000m
+      memory: 2Gi
+
+# Image pull secrets
+imagePullSecrets:
+  - name: internal-registry-secret
+
+# Pod security context
+podSecurityContext:
+  fsGroup: 1000
+  runAsNonRoot: true
+  runAsUser: 1000
+```
+
+## Step 10: Deploy ZenML Workspace Server
+
+```bash
+# Create namespace
+kubectl create namespace zenml-workspace
+
+# Deploy workspace
+helm install zenml ./zenml-<version>.tgz \
+  --namespace zenml-workspace \
+  --values zenml-workspace-values.yaml
+```
+
+Verify deployment:
+
+```bash
+kubectl -n zenml-workspace get pods
+kubectl -n zenml-workspace get svc
+kubectl -n zenml-workspace get ingress
+```
+
+## Step 11: Configure Internal DNS
+
+Update your internal DNS to resolve:
+- `zenml-pro.internal.mycompany.com` → Your load balancer/Ingress IP
+- `zenml-workspace.internal.mycompany.com` → Your load balancer/Ingress IP
+
+## Step 12: Install Internal CA Certificate
+
+On all client machines that will access ZenML:
+
+1. Obtain your internal CA certificate
+2. Install it in the system certificate store:
+   - **Linux** (Debian/Ubuntu-based): Copy it to `/usr/local/share/ca-certificates/` with a `.crt` extension and run `update-ca-certificates`
+   - **macOS**: Use `sudo security add-trusted-cert -d -r trustRoot -k /Library/Keychains/System.keychain <ca-certificate-file>`
+   - **Windows**: Use `certutil -addstore "Root" cert.pem`
+
+3. For Python/ZenML client:
+   ```bash
+   export REQUESTS_CA_BUNDLE=/path/to/ca-bundle.crt
+   ```
+
+4. For containerized pipelines, include the CA certificate in your custom ZenML image
+
+## Step 13: Verify the Deployment
+
+1. **Check Control Plane Health**
+   ```bash
+   curl -k https://zenml-pro.internal.mycompany.com/health
+   ```
+
+2. **Check Workspace Health**
+   ```bash
+   curl -k https://zenml-workspace.internal.mycompany.com/health
+   ```
+
+3. **Access the Dashboard**
+   - Navigate to `https://zenml-pro.internal.mycompany.com` in your browser
+   - Log in with admin credentials
+
+4. **Check Logs**
+   ```bash
+   kubectl -n zenml-pro logs deployment/zenml-pro
+   kubectl -n zenml-workspace logs deployment/zenml
+   ```
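+
+Note that `curl -k` skips certificate verification, so the health checks above pass even if the internal CA from Step 12 is not trusted. A stricter variant, sketched here under the assumption that the CA bundle is available as a file, verifies the server certificate as well. `check_health` and `CURL_BIN` are hypothetical names introduced for this example.
+
+```shell
+# Hypothetical helper: probe the /health endpoint of a ZenML service while
+# verifying the server certificate against the internal CA bundle.
+#   $1 = base URL of the service
+#   $2 = path to the CA bundle file
+check_health() {
+  "${CURL_BIN:-curl}" --fail --silent --show-error \
+    --cacert "$2" "$1/health"
+}
+```
+
+For example, `check_health https://zenml-pro.internal.mycompany.com /path/to/ca-bundle.crt` returns a non-zero exit status if the endpoint is unhealthy or the certificate chain cannot be validated.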
+
+## Step 14: (Optional) Enable Snapshot Support / Workload Manager
+
+Pipeline snapshots (running pipelines from the dashboard) require additional configuration:
+
+### 1. Create Kubernetes Resources for Workload Manager
+
+Create a dedicated namespace and service account for runner jobs:
+
+```bash
+# Create namespace
+kubectl create namespace zenml-workload-manager
+
+# Create service account
+kubectl -n zenml-workload-manager create serviceaccount zenml-runner
+
+# Create role with permissions to create jobs and access registry
+# (Specific permissions depend on your implementation choice below)
+```
+
+### 2. Choose Implementation
+
+**Option A: Kubernetes Implementation (Simplest)**
+
+Use the built-in Kubernetes implementation for running snapshots:
+
+```yaml
+zenml:
+  environment:
+    ZENML_SERVER_WORKLOAD_MANAGER_IMPLEMENTATION_SOURCE: zenml_cloud_plugins.kubernetes_workload_manager.KubernetesWorkloadManager
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_NAMESPACE: zenml-workload-manager
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_SERVICE_ACCOUNT: zenml-runner
+```
+
+**Option B: GCP Implementation (if using GCP)**
+
+For GCP-specific features:
+
+```yaml
+zenml:
+  environment:
+    ZENML_SERVER_WORKLOAD_MANAGER_IMPLEMENTATION_SOURCE: zenml_cloud_plugins.gcp_kubernetes_workload_manager.GCPKubernetesWorkloadManager
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_NAMESPACE: zenml-workload-manager
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_SERVICE_ACCOUNT: zenml-runner
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_BUILD_RUNNER_IMAGE: "true"
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_DOCKER_REGISTRY: <your-registry>/zenml
+```
+
+### 3. Configure Runner Image
+
+Choose how runner images are managed:
+
+**Option A: Use Pre-built Runner Image (Simpler for Air-gap)**
+
+```yaml
+zenml:
+  environment:
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_BUILD_RUNNER_IMAGE: "false"
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_RUNNER_IMAGE: internal-registry.mycompany.com/zenml/zenml:<version>
+```
+
+Pre-build your runner image and push it to your internal registry.
+
+**Option B: Have ZenML Build Runner Images**
+
+Requires access to internal Docker registry with push permissions:
+
+```yaml
+zenml:
+  environment:
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_BUILD_RUNNER_IMAGE: "true"
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_DOCKER_REGISTRY: internal-registry.mycompany.com/zenml
+```
+
+### 4. Configure Pod Resources and Policies
+
+```yaml
+zenml:
+  environment:
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_POD_RESOURCES: '{"requests": {"cpu": "100m", "memory": "400Mi"}, "limits": {"memory": "700Mi"}}'
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_TTL_SECONDS_AFTER_FINISHED: 86400  # 1 day
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_NODE_SELECTOR: '{"node-pool": "compute"}'
+    ZENML_SERVER_MAX_CONCURRENT_TEMPLATE_RUNS: 2
+```
+
+### 5. Update Workspace Deployment
+
+Update your workspace server Helm values with workload manager configuration and redeploy:
+
+```bash
+helm upgrade zenml ./zenml-<version>.tgz \
+  --namespace zenml-workspace \
+  --values zenml-workspace-values.yaml
+```
+
+## Step 15: Create Users and Organizations
+
+In the ZenML Pro dashboard:
+
+1. Create an organization
+2. Create users for your team
+3. Assign roles and permissions
+4. Configure teams
+
+## Network Requirements Summary
+
+| Traffic | Source | Destination | Port | Direction |
+|---------|--------|-------------|------|-----------|
+| Web Access | Client Machines | Ingress Controller | 443 | Inbound |
+| API Access | ZenML Client | Workspace Server | 443 | Inbound |
+| Database | Kubernetes Pods | PostgreSQL / MySQL | 5432 / 3306 | Outbound |
+| Registry | Kubernetes | Internal Registry | 443 | Outbound |
+| Inter-service | Kubernetes Internal | Kubernetes Services | 443 | Internal |
+
+## Scaling & High Availability
+
+### Multiple Control Plane Replicas
+
+```yaml
+zenml:
+ replicaCount: 3
+```
+
+### Multiple Workspace Replicas
+
+```yaml
+zenml:
+ replicaCount: 2
+```
+
+### Pod Disruption Budgets
+
+Protect against accidental disruptions:
+
+```yaml
+podDisruptionBudget:
+ enabled: true
+ minAvailable: 1
+```
+
+### Database Replication
+
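+For HA, configure replication for your databases (for example, PostgreSQL streaming replication for a PostgreSQL control plane database, and the equivalent MySQL replication setup for workspace databases):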
+1. Set up a standby database
+2. Configure continuous archiving
+3. Test failover procedures
+
+## Backup & Recovery
+
+### Automated Backups
+
+Configure automated backups for every database in use (the control plane database and the MySQL workspace databases):
+- **Frequency**: Daily or more frequent
+- **Retention**: 30+ days
+- **Location**: Internal storage (not external)
+- **Testing**: Test restore procedures regularly
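+
+Because workspace servers use MySQL (see Step 5), a scheduled backup job for a workspace database could look roughly like the sketch below. The function name `backup_mysql`, the `MYSQLDUMP_BIN` override, and the file naming scheme are assumptions for illustration; the password is expected to come from a MySQL client option file (or the `MYSQL_PWD` environment variable) rather than the command line.
+
+```shell
+# Hypothetical helper: dump one MySQL database to a dated, compressed file
+# and print the path of the resulting archive.
+#   $1 = database host
+#   $2 = database user
+#   $3 = database name
+#   $4 = backup directory on internal storage
+backup_mysql() {
+  out="$4/$3-$(date -u +%Y%m%dT%H%M%SZ).sql"
+  # --single-transaction takes a consistent snapshot of InnoDB tables
+  # without locking them for the duration of the dump.
+  "${MYSQLDUMP_BIN:-mysqldump}" --host="$1" --user="$2" \
+    --single-transaction --routines "$3" > "$out" || return 1
+  gzip "$out" || return 1
+  echo "$out.gz"
+}
+```
+
+A cron job or Kubernetes CronJob could call `backup_mysql mysql.internal.mycompany.com zenml_workspace_user zenml_workspace /backups` daily and prune archives older than the retention period.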
+
+### Backup Checklist
+
+1. Database backups (automated)
+2. Configuration backups (values.yaml files, versioned)
+3. TLS certificates (secure storage)
+4. Custom CA certificate (backup copy)
+5. Helm chart versions (archived)
+
+### Recovery Procedure
+
+Your documented recovery procedure should cover:
+1. Database restoration steps
+2. Helm redeployment steps
+3. Data validation after restore
+4. User communication plan
+
+## Monitoring & Logging
+
+### Internal Monitoring
+
+Set up internal monitoring for:
+- CPU and memory usage
+- Pod restart count
+- Database connection count
+- Ingress error rates
+- Certificate expiration dates
+
+### Log Aggregation
+
+Forward logs to your internal log aggregation system:
+- Application logs from ZenML pods
+- Ingress logs
+- Database logs
+- Kubernetes events
+
+### Alerting
+
+Create alerts for:
+- Pod failures
+- High resource usage
+- Database connection errors
+- Certificate near expiration
+- Disk space warnings
+
+## Maintenance
+
+### Regular Tasks
+
+- Monitor disk space (databases, artifact storage)
+- Review and manage user access
+- Update internal CA certificate before expiration
+- Test backup and recovery procedures
+- Monitor pod logs for warnings
+
+### Periodic Updates
+
+When updating to a new ZenML version:
+
+1. Pull new images on an internet-connected machine
+2. Create a new offline bundle with the images and updated Helm charts
+3. Transfer the bundle to the air-gapped environment
+4. Load the images and push them to the internal registry
+5. Update Helm charts in the air-gapped environment
+6. Update image tags in values.yaml
+7. Perform helm upgrade on control plane
+8. Perform helm upgrade on workspace servers
+9. Verify health after upgrade
+10. Update client images in your custom ZenML container
+
+## Troubleshooting
+
+### Pods Won't Start
+
+Check pod logs and events:
+```bash
+kubectl -n zenml-pro describe pod zenml-pro-xxxxx
+kubectl -n zenml-pro logs zenml-pro-xxxxx
+```
+
+Common issues:
+- Image pull failures (check registry access)
+- Database connectivity (verify connection string)
+- Certificate issues (verify CA is trusted)
+
+### Database Connection Failed
+
+```bash
+# Test from pod
+kubectl -n zenml-pro exec -it zenml-pro-xxxxx -- \
+ psql -h postgres.internal.mycompany.com -U zenml_pro_user -d zenml_pro
+```
+
+### Can't Access via HTTPS
+
+1. Verify certificate validity
+2. Verify DNS resolution
+3. Check Ingress status
+4. Verify CA certificate is installed on client
+
+### Image Pull Errors
+
+1. Verify images are in internal registry
+2. Check registry credentials in secret
+3. Verify imagePullSecrets configured correctly
+
+## Day 2 Operations: Updates and Upgrades
+
+### Receiving New Versions
+
+When new ZenML versions are released:
+
+1. **Request offline bundle** from ZenML Support containing:
+   - Updated container images
+   - Updated Helm charts
+   - Release notes and migration guide
+   - Vulnerability assessment (if applicable)
+
+2. **Review release notes** for:
+   - Breaking changes
+   - Database migration requirements
+   - New features and configuration options
+   - Security updates
+
+3. **Transfer bundle** to your air-gapped environment using approved methods
+
+### Upgrade Process
+
+1. **Back up current state:**
+   - Database backup
+   - Values.yaml files
+   - TLS certificates
+
+2. **Update container images in internal registry:**
+   - Extract and load new images
+   - Tag and push to your internal registry
+
+3. **Update Helm charts:**
+   - Extract new chart versions
+   - Review any changes to values schema
+
+4. **Upgrade control plane first:**
+   ```bash
+   helm upgrade zenml-pro ./zenml-pro-<version>.tgz \
+     --namespace zenml-pro \
+     --values zenml-pro-values.yaml
+   ```
+
+5. **Verify control plane:**
+   - Check pod status
+   - Review logs
+   - Test connectivity
+
+6. **Upgrade workspace servers:**
+   ```bash
+   helm upgrade zenml ./zenml-<version>.tgz \
+     --namespace zenml-workspace \
+     --values zenml-workspace-values.yaml
+   ```
+
+7. **Verify workspaces:**
+   - Check all pods are running
+   - Review logs
+   - Run health checks
+   - Test dashboard access
+
+### Database Migrations
+
+Some updates may require database migrations:
+
+1. **Review migration guide** in release notes
+2. **Back up database** before upgrading
+3. **Monitor logs** for any migration-related errors
+4. **Verify data integrity** after upgrade
+5. **Test key features** (workspace access, pipeline runs, etc.)
+
+## Disaster Recovery & Backup Strategy
+
+### Backup Components
+
+Regular backups should include:
+
+1. **Databases (PostgreSQL and MySQL):**
+   - Schedule automated backups (daily minimum)
+   - Test restore procedures regularly
+   - Store backups in a different location (second disk, external storage)
+   - Retain for 30+ days
+
+2. **Configuration:**
+   - Version control Helm values files
+   - Store TLS certificates securely
+   - Document any manual customizations
+
+3. **Container Images:**
+   - Keep copies of all images used
+   - Maintain manifest of images and versions
+
+### Recovery Procedures
+
+Document and test:
+
+1. **Database Recovery:**
+   - Steps to restore from backup
+   - Verification procedures
+   - Estimated recovery time
+
+2. **Full Cluster Recovery:**
+   - How to redeploy from scratch
+   - Image and chart preparation
+   - Restore order (databases first, then control plane, then workspaces)
+
+3. **Partial Recovery:**
+   - Recovering single workspace
+   - Recovering specific components
+
+## Related Resources
+
+- [Self-hosted Deployment Overview](self-hosted-deployment.md)
+- [Self-hosted Deployment Guide](self-hosted.md) - Comprehensive deployment reference
+- [Kubernetes Documentation](https://kubernetes.io/docs/)
+- [PostgreSQL Documentation](https://www.postgresql.org/docs/)
+- [Helm Documentation](https://helm.sh/docs/)
+
+## Support
+
+For air-gapped deployments, contact ZenML Support:
+- Email: [cloud@zenml.io](mailto:cloud@zenml.io)
+- Provide: Your offline bundle, deployment status, and any error logs
+
+Request from ZenML Support:
+- Pre-deployment architecture consultation
+- Offline support packages
+- Update bundles and release notes
+- Security documentation (SBOM, vulnerability reports)
diff --git a/docs/book/getting-started/zenml-pro/saas-deployment.md b/docs/book/getting-started/zenml-pro/saas-deployment.md
new file mode 100644
index 00000000000..b2114366fab
--- /dev/null
+++ b/docs/book/getting-started/zenml-pro/saas-deployment.md
@@ -0,0 +1,187 @@
+---
+description: Learn about ZenML Pro SaaS deployment - the fastest way to get started with production-ready MLOps.
+icon: cloud
+---
+
+# SaaS Deployment
+
+ZenML Pro SaaS is the fastest and easiest way to get started with enterprise-grade MLOps. With zero infrastructure setup required, you can be running production pipelines within minutes while maintaining full control over your data and compute resources.
+
+{% hint style="info" %}
+To get access to ZenML Pro, [book a call](https://www.zenml.io/book-your-demo).
+{% endhint %}
+
+## Overview
+
+In a SaaS deployment, ZenML manages all server infrastructure while your sensitive data and compute resources remain in your own cloud environment. This architecture provides the fastest time-to-value while maintaining data sovereignty for your ML workloads.
+
+
+
+## Architecture
+
+### What Runs Where
+
+| Component | Location | Purpose |
+|-----------|----------|---------|
+| **ZenML Pro Server** | ZenML Infrastructure | Manages pipeline orchestration and metadata |
+| **Pro Control Plane** | ZenML Infrastructure | Handles authentication, RBAC, and workspace management |
+| **Metadata Store** | ZenML Infrastructure | Stores pipeline runs, model metadata, and tracking information |
+| **Secrets Store** | ZenML Infrastructure (default) | Stores credentials for accessing your infrastructure |
+| **Compute Resources** | Your infrastructure through [stacks](https://docs.zenml.io/stacks) | Executes pipeline steps and training jobs |
+| **Data & Artifacts** | Your infrastructure through [stacks](https://docs.zenml.io/stacks) | Stores datasets, models, and pipeline artifacts |
+
+### Data Flow
+
+For a detailed explanation of the common pipeline execution data flow across all deployment scenarios, see [Common Pipeline Execution Data Flow](deployments-overview.md#common-pipeline-execution-data-flow) in the Deployment Scenarios Overview.
+
+{% hint style="success" %}
+**Your ML data never leaves your infrastructure.** Only metadata about runs and pipelines is stored on ZenML infrastructure.
+{% endhint %}
+
+## Key Benefits
+
+### ⚡ Fastest Setup
+- **Minutes to production**: No infrastructure provisioning required for ZenML services
+- **Low maintenance**: Updates and patches handled automatically
+- **Instant scaling**: Infrastructure scales with your needs
+
+### 🛡️ Security & Compliance
+- **SOC 2 Type II certified**: Enterprise-grade security controls
+- **ISO 27001 certified**: International security management standards
+- **Data sovereignty**: Your ML data stays in your infrastructure
+- **Encrypted communications**: All data in transit is encrypted
+- **Custom secret stores**: Optionally use your own secret management solution
+
+### 🚀 Production Ready from Day 1
+- **High availability**: Built-in redundancy and failover
+- **Automatic backups**: Metadata backed up continuously
+- **Monitoring included**: Health checks and alerting configured
+- **Professional support**: Direct access to ZenML experts
+
+### 👥 Collaboration Features
+- **Multi-user support**: Full team collaboration capabilities
+- **SSO integration**: Connect with your identity provider
+- **Role-based access control**: Granular permissions management
+- **Workspaces & projects**: Organize teams and resources
+
+## Ideal Use Cases
+
+ZenML Pro SaaS is perfect for:
+
+- **Startups and scale-ups** that need production MLOps quickly without infrastructure overhead
+- **Teams without dedicated DevOps** that want managed infrastructure and support
+- **Organizations with existing cloud infrastructure** comfortable with SaaS tools
+- **Teams prioritizing velocity** over complete infrastructure control
+- **POC and pilot projects** that need to demonstrate value quickly
+
+## Secret Management Options
+
+### Default: ZenML-Managed Secrets Store
+
+By default, ZenML Pro SaaS stores your cloud credentials securely in our managed secrets store. This provides:
+- Zero configuration required
+- Automatic encryption at rest and in transit
+- Access controls via RBAC
+
+### Alternative: Customer-Managed Secrets Store
+
+For organizations with strict security requirements, you can configure ZenML to use your own [secrets management](../deploying-zenml/secret-management) solution:
+- AWS Secrets Manager
+- Google Cloud Secret Manager
+- Azure Key Vault
+- HashiCorp Vault
+
+
+
+This keeps all credentials within your infrastructure while still benefiting from managed ZenML services. [Book a call](https://www.zenml.io/book-your-demo) with us if you want this set up.
+
+## Network Architecture
+
+### Outbound-Only Communication
+
+ZenML Pro SaaS uses outbound-only connections from your infrastructure to ZenML services:
+- No inbound connections required to your infrastructure
+- Limited compatibility with firewall and VPN restrictions
+
+### Artifact Store Access
+
+The ZenML UI requires read access to your artifact store to display:
+- Pipeline visualizations
+- Model comparison views
+- Artifact lineage graphs
+- Step logs and outputs
+
+You control this access by configuring appropriate cloud IAM permissions.
+
+## Getting Started
+
+### 1. Sign Up
+
+[Book a demo](https://www.zenml.io/book-your-demo) to get started with ZenML Pro SaaS.
+
+### 2. Connect Your Cloud
+
+Configure access to your cloud infrastructure:
+- Set up an artifact store (S3, GCS, Azure Blob, etc.)
+- Configure compute resources (AWS, GCP, Azure, or Kubernetes)
+- Provide necessary credentials via secrets
+
+### 3. Run Your Pipelines
+
+You're ready to run your pipelines and monitor them through the ZenML UI.
+
+## Security Documentation
+
+For software deployed on your infrastructure, ZenML provides:
+
+- **Vulnerability Assessment Reports**: Comprehensive security analysis available on request
+- **Software Bill of Materials (SBOM)**: Complete dependency inventory for compliance
+- **Compliance documentation**: Support for your security audits and certifications
+
+Contact [cloud@zenml.io](mailto:cloud@zenml.io) to request security documentation.
+
+## Pricing & Support
+
+ZenML Pro SaaS includes:
+- Managed infrastructure and updates
+- Professional support with SLA
+- Regular security patches and updates
+- Access to pro-exclusive features
+- Usage-based pricing model
+
+[Contact us](https://www.zenml.io/book-your-demo) for pricing details and custom plans.
+
+## Comparison with Other Deployments
+
+| Feature | SaaS | Hybrid SaaS | Self-hosted |
+|---------|------|-------------|------------|
+| Setup Time | ⚡ Minutes | Hours | Days |
+| Maintenance | Zero | Workspace only | Full stack |
+| Infrastructure Control | Minimal | Moderate | Complete |
+| Data Sovereignty | Metadata on ZenML | Full | Full |
+| Best For | Fast time-to-value | Security requirements | Strictest compliance |
+
+[Compare all deployment options →](README.md#deployment-scenarios)
+
+## Migration Path
+
+Already running ZenML OSS? Migrating to SaaS is possible with the assistance of the ZenML support team. Reach out to us at [hello@zenml.io](mailto:hello@zenml.io) or on [Slack](https://zenml.io/slack) to learn more.
+
+## Detailed Architecture Diagram
+
+
+
+## Related Resources
+
+- [System Architecture Overview](../system-architectures.md#zenml-pro-saas-architecture)
+- [Deployment Scenarios Overview](deployments-overview.md)
+- [Hybrid SaaS Deployment](hybrid-deployment.md)
+- [Self-hosted Deployment](self-hosted-deployment.md)
+- [Workload Managers](workload-managers.md)
+- [Security & Compliance](README.md#security--compliance)
+
+## Get Started
+
+Ready to get started with ZenML Pro SaaS?
+
+[Book a Demo](https://www.zenml.io/book-your-demo)
+
+Have questions? [Contact us](mailto:cloud@zenml.io) or check out our [documentation](https://docs.zenml.io).
diff --git a/docs/book/getting-started/zenml-pro/self-hosted-deployment-helm.md b/docs/book/getting-started/zenml-pro/self-hosted-deployment-helm.md
new file mode 100644
index 00000000000..7d1eaf956a8
--- /dev/null
+++ b/docs/book/getting-started/zenml-pro/self-hosted-deployment-helm.md
@@ -0,0 +1,863 @@
+---
+description: Deploy ZenML Pro Self-hosted on Kubernetes with Helm - complete self-hosted setup with no external dependencies.
+layout:
+  title:
+    visible: true
+  description:
+    visible: true
+  tableOfContents:
+    visible: true
+  outline:
+    visible: true
+  pagination:
+    visible: true
+---
+
+# Self-hosted Deployment on Kubernetes with Helm
+
+This guide provides step-by-step instructions for deploying ZenML Pro in a fully air-gapped setup on Kubernetes using Helm charts. In an air-gapped deployment, all components run within your infrastructure with zero external dependencies.
+
+## Architecture Overview
+
+All components run entirely within your Kubernetes cluster and infrastructure:
+
+```
+┌──────────────────────────────────────────────────┐
+│          Your Air-gapped Infrastructure          │
+│                                                  │
+│  ┌────────────────────────────────────────────┐  │
+│  │             Kubernetes Cluster             │  │
+│  │                                            │  │
+│  │  ┌──────────────────────────────────────┐  │  │
+│  │  │  ZenML Pro Control Plane             │  │  │
+│  │  │  - Authentication & Authorization    │  │  │
+│  │  │  - RBAC Management                   │  │  │
+│  │  │  - Dashboard                         │  │  │
+│  │  └──────────────────────────────────────┘  │  │
+│  │                                            │  │
+│  │  ┌──────────────────────────────────────┐  │  │
+│  │  │  ZenML Workspace Servers             │  │  │
+│  │  │  (one or more)                       │  │  │
+│  │  └──────────────────────────────────────┘  │  │
+│  │                                            │  │
+│  │  ┌──────────────────────────────────────┐  │  │
+│  │  │  Load Balancer / Ingress             │  │  │
+│  │  │  (HTTPS with internal CA)            │  │  │
+│  │  └──────────────────────────────────────┘  │  │
+│  └────────────────────────────────────────────┘  │
+│                                                  │
+│  ┌────────────────────────────────────────────┐  │
+│  │  PostgreSQL Database                       │  │
+│  │  (for metadata storage)                    │  │
+│  └────────────────────────────────────────────┘  │
+│                                                  │
+│  ┌────────────────────────────────────────────┐  │
+│  │  Internal Docker Registry                  │  │
+│  │  (for container images)                    │  │
+│  └────────────────────────────────────────────┘  │
+│                                                  │
+│  ┌────────────────────────────────────────────┐  │
+│  │  Object Storage / NFS                      │  │
+│  │  (for artifacts & backups)                 │  │
+│  └────────────────────────────────────────────┘  │
+│                                                  │
+└──────────────────────────────────────────────────┘
+    🔒 Completely Isolated - No External Access
+```
+
+## Prerequisites
+
+Before starting, you need:
+
+**Infrastructure:**
+- Kubernetes cluster (1.24+) within your air-gapped network
+- Database server(s) for metadata storage: PostgreSQL (12+) or MySQL for the control plane, MySQL for workspace servers (see Step 5)
+- Internal Docker registry (Harbor, Quay, Artifactory, etc.)
+- Load balancer or Ingress controller for HTTPS
+- NFS or object storage for artifacts (optional)
+
+**Network:**
+- Internal DNS resolution
+- TLS certificates signed by your internal CA
+- Network connectivity between cluster components
+
+**Tools (on a machine with internet access for initial setup):**
+- Docker
+- Helm (3.8+ recommended, since the charts are pulled from an OCI registry)
+- Access to pull ZenML Pro images from private registries (credentials from ZenML)
+
+## Step 1: Prepare Offline Artifacts
+
+This step is performed on a machine with internet access, then transferred to your air-gapped environment.
+
+### 1.1 Pull Container Images
+
+On a machine with internet access and access to the ZenML Pro container registries:
+
+1. Authenticate to the ZenML Pro container registries (AWS ECR or GCP Artifact Registry)
+   - Use credentials provided by ZenML Support
+   - Follow registry-specific authentication procedures
+
+2. Pull all required images (replace `<version>` with the versions you are deploying):
+   - **Pro Control Plane images:**
+     - `715803424590.dkr.ecr.eu-west-1.amazonaws.com/zenml-pro-api:<version>`
+     - `715803424590.dkr.ecr.eu-west-1.amazonaws.com/zenml-pro-dashboard:<version>`
+   - **Workspace Server image:**
+     - `715803424590.dkr.ecr.eu-central-1.amazonaws.com/zenml-pro-server:<version>`
+   - **Client image (for pipelines):**
+     - `zenmldocker/zenml:<version>`
+
+   Example pull commands:
+   ```bash
+   docker pull 715803424590.dkr.ecr.eu-west-1.amazonaws.com/zenml-pro-api:<version>
+   docker pull 715803424590.dkr.ecr.eu-west-1.amazonaws.com/zenml-pro-dashboard:<version>
+   docker pull 715803424590.dkr.ecr.eu-central-1.amazonaws.com/zenml-pro-server:<version>
+   docker pull zenmldocker/zenml:<version>
+   ```
+
+3. Decide on the names the images will have in your internal registry (the tags themselves are applied in Step 3, after the images are loaded):
+   ```
+   internal-registry.mycompany.com/zenml/zenml-pro-api:<version>
+   internal-registry.mycompany.com/zenml/zenml-pro-dashboard:<version>
+   internal-registry.mycompany.com/zenml/zenml-pro-server:<version>
+   internal-registry.mycompany.com/zenml/zenml:<version>
+   ```
+
+4. Save images to tar files for transfer:
+   ```bash
+   docker save 715803424590.dkr.ecr.eu-west-1.amazonaws.com/zenml-pro-api:<version> > zenml-pro-api.tar
+   docker save 715803424590.dkr.ecr.eu-west-1.amazonaws.com/zenml-pro-dashboard:<version> > zenml-pro-dashboard.tar
+   docker save 715803424590.dkr.ecr.eu-central-1.amazonaws.com/zenml-pro-server:<version> > zenml-pro-server.tar
+   docker save zenmldocker/zenml:<version> > zenml-client.tar
+   ```
+
+### 1.2 Download Helm Charts
+
+On the same machine with internet access:
+
+1. Pull the Helm charts:
+   - ZenML Pro Control Plane: `oci://public.ecr.aws/zenml/zenml-pro`
+   - ZenML Workspace Server: `oci://public.ecr.aws/zenml/zenml`
+
+2. Save charts as `.tgz` files for transfer
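+
+The chart download could be scripted roughly as follows. This is a minimal sketch, not an official procedure: the `run` helper, the `DRY_RUN` switch, and the two `*_CHART_VERSION` variables are illustrative names, and it assumes a Helm release with OCI registry support. By default it only prints the commands it would run.
+
```bash
#!/usr/bin/env sh
set -eu

# Hypothetical placeholders: set these to the chart versions you want to bundle.
PRO_CHART_VERSION="${PRO_CHART_VERSION:-CHANGE_ME}"
WORKSPACE_CHART_VERSION="${WORKSPACE_CHART_VERSION:-CHANGE_ME}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Download both charts as .tgz archives into the given directory.
pull_charts() {
  dest="$1"
  run mkdir -p "$dest"
  run helm pull oci://public.ecr.aws/zenml/zenml-pro \
    --version "$PRO_CHART_VERSION" --destination "$dest"
  run helm pull oci://public.ecr.aws/zenml/zenml \
    --version "$WORKSPACE_CHART_VERSION" --destination "$dest"
}

# Example: pull_charts zenml-air-gapped-bundle/charts
```
+
+`helm pull` already stores each chart as a versioned `.tgz` file, so the output directory can be copied into the bundle as-is.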
+
+### 1.3 Create Offline Bundle
+
+Create a bundle containing all artifacts:
+
+```
+zenml-air-gapped-bundle/
+├── images/
+│   ├── zenml-pro-api.tar
+│   ├── zenml-pro-dashboard.tar
+│   ├── zenml-pro-server.tar
+│   └── zenml-client.tar
+├── charts/
+│   ├── zenml-pro-<version>.tgz
+│   └── zenml-<version>.tgz
+└── manifest.txt
+```
+
+The manifest should document:
+- All image names and versions
+- Helm chart versions
+- Date of bundle creation
+- Required internal registry URLs
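+
+A manifest covering the items above could be generated with a small helper like the one below. It is only a sketch: the `write_manifest` function, the `INTERNAL_REGISTRY` variable, and the manifest layout are assumptions made for illustration, and it relies on `sha256sum` from GNU coreutils being available.
+
```bash
#!/usr/bin/env sh
set -eu

# Write manifest.txt into a bundle directory that contains images/ and charts/.
# Usage: write_manifest <bundle-dir> <image-ref>...
write_manifest() {
  bundle_dir="$1"
  shift
  (
    cd "$bundle_dir"
    {
      echo "created: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
      echo "internal_registry: ${INTERNAL_REGISTRY:-internal-registry.mycompany.com}"
      # Image names and versions, exactly as passed on the command line.
      for image in "$@"; do
        echo "image: $image"
      done
      # Chart archives; the version is part of each file name.
      for chart in charts/*.tgz; do
        [ -e "$chart" ] || continue
        echo "chart: ${chart#charts/}"
      done
      # Checksums let the receiving side detect corruption after transfer.
      echo "sha256:"
      find images charts -type f | sort | xargs sha256sum
    } > manifest.txt
  )
}
```
+
+Recording checksums next to the names and versions means the same file can be used to check the bundle after it has been moved into the air-gapped environment.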
+
+## Step 2: Transfer to Air-gapped Environment
+
+Transfer the bundle to your air-gapped environment using approved methods:
+- Physical media (USB drive, external drive)
+- Approved secure file transfer system
+- Air-gap transfer appliances
+- Any method compliant with your security policies
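+
+Whichever transfer method you use, it is worth confirming that the bundle arrived intact. The sketch below assumes a checksum file in `sha256sum` format was created inside the bundle directory before the transfer; the file name `SHA256SUMS` and the `verify_bundle` helper are hypothetical.
+
```bash
#!/usr/bin/env sh
set -eu

# Verify every file listed in the checksum file; returns non-zero on mismatch.
# Usage: verify_bundle <bundle-dir> <checksum-file-relative-to-bundle-dir>
verify_bundle() {
  bundle_dir="$1"
  checksum_file="$2"
  ( cd "$bundle_dir" && sha256sum -c "$checksum_file" )
}

# Example: verify_bundle zenml-air-gapped-bundle SHA256SUMS
```
+
+A non-zero exit status means at least one file differs from what was recorded before transfer, in which case the bundle should be transferred again rather than loaded.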
+
+## Step 3: Load Images into Internal Registry
+
+In your air-gapped environment, load the images:
+
+1. Load all image archives into the local Docker daemon:
+   ```bash
+   cd images/
+   for file in *.tar; do docker load < "$file"; done
+   ```
+
+2. Tag images for your internal registry:
+   ```bash
+   docker tag 715803424590.dkr.ecr.eu-west-1.amazonaws.com/zenml-pro-api:<version> internal-registry.mycompany.com/zenml/zenml-pro-api:<version>
+   docker tag 715803424590.dkr.ecr.eu-west-1.amazonaws.com/zenml-pro-dashboard:<version> internal-registry.mycompany.com/zenml/zenml-pro-dashboard:<version>
+   docker tag 715803424590.dkr.ecr.eu-central-1.amazonaws.com/zenml-pro-server:<version> internal-registry.mycompany.com/zenml/zenml-pro-server:<version>
+   docker tag zenmldocker/zenml:<version> internal-registry.mycompany.com/zenml/zenml:<version>
+   ```
+
+3. Push images to your internal registry:
+   ```bash
+   docker push internal-registry.mycompany.com/zenml/zenml-pro-api:<version>
+   docker push internal-registry.mycompany.com/zenml/zenml-pro-dashboard:<version>
+   docker push internal-registry.mycompany.com/zenml/zenml-pro-server:<version>
+   docker push internal-registry.mycompany.com/zenml/zenml:<version>
+   ```
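+
+The tag-and-push steps follow one pattern: keep the final `name:tag` part of each source image and prefix it with the internal registry path. A sketch of that pattern is shown below; `internal_name`, `retag_and_push`, `run`, and `DRY_RUN` are illustrative names, and the default only prints the `docker` commands.
+
```bash
#!/usr/bin/env sh
set -eu

INTERNAL_REGISTRY="${INTERNAL_REGISTRY:-internal-registry.mycompany.com/zenml}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Map a source image reference to its internal name by keeping only the
# last path component ("name:tag") and prefixing the internal registry.
internal_name() {
  echo "$INTERNAL_REGISTRY/${1##*/}"
}

# Tag and push every image reference passed as an argument.
retag_and_push() {
  for src in "$@"; do
    dst="$(internal_name "$src")"
    run docker tag "$src" "$dst"
    run docker push "$dst"
  done
}

# Example: retag_and_push zenmldocker/zenml:<version>
```
+
+Deriving the target name from the source reference keeps the version identical on both sides, which avoids a mismatch between the images you push and the tags in your Helm values.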
+
+## Step 4: Create Kubernetes Namespace and Secrets
+
+```bash
+# Create namespace for ZenML Pro
+kubectl create namespace zenml-pro
+
+# Create secret for internal registry credentials (if needed)
+kubectl -n zenml-pro create secret docker-registry internal-registry-secret \
+  --docker-server=internal-registry.mycompany.com \
+  --docker-username=<your-username> \
+  --docker-password=<your-password>
+
+# Create secret for TLS certificate
+kubectl -n zenml-pro create secret tls zenml-tls \
+ --cert=/path/to/tls.crt \
+ --key=/path/to/tls.key
+```
+
+## Step 5: Set Up Databases
+
+Create database instances (within your air-gapped network):
+
+**Database support:**
+- **Control Plane**: Supports both PostgreSQL and MySQL
+- **Workspace Servers**: Only support MySQL (PostgreSQL is not supported)
+
+**Configuration:**
+- **Accessibility**: Reachable from your Kubernetes cluster
+- **Databases**: At least 2 (one for control plane, one for workspace)
+- **Users**: Create dedicated database users with permissions
+- **Backups**: Configure automated backups to local storage
+- **Monitoring**: Enable local log aggregation
+
+**Connection strings needed for later:**
+- Control Plane DB (PostgreSQL or MySQL): `postgresql://user:password@db-host:5432/zenml_pro` or `mysql://user:password@db-host:3306/zenml_pro`
+- Workspace DB (MySQL only): `mysql://user:password@db-host:3306/zenml_workspace`
+
+## Step 6: Configure Helm Values for Control Plane
+
+Create a file `zenml-pro-values.yaml`:
+
+```yaml
+# ZenML Pro Control Plane Values
+
+zenml:
+  # Image configuration - use your internal registry
+  image:
+    api:
+      repository: internal-registry.mycompany.com/zenml/zenml-pro-api
+      tag: ""  # e.g., "0.10.24"
+    dashboard:
+      repository: internal-registry.mycompany.com/zenml/zenml-pro-dashboard
+      tag: ""  # e.g., "0.10.24"
+
+  # Server URL - use your internal domain
+  serverURL: https://zenml-pro.internal.mycompany.com
+
+  # Database for Control Plane
+  database:
+    external:
+      type: postgresql
+      host: postgres.internal.mycompany.com
+      port: 5432
+      username: zenml_pro_user
+      password: <your-database-password>
+      database: zenml_pro
+
+  # Ingress configuration
+  ingress:
+    enabled: true
+    className: nginx  # or your ingress controller
+    host: zenml-pro.internal.mycompany.com
+    tls:
+      enabled: true
+      secretName: zenml-tls
+
+  # Authentication (no external IdP needed for air-gap)
+  auth:
+    password: <your-admin-password>
+
+  # Resource constraints
+  resources:
+    requests:
+      cpu: 500m
+      memory: 1Gi
+    limits:
+      cpu: 2000m
+      memory: 4Gi
+
+# Image pull secrets for internal registry
+imagePullSecrets:
+  - name: internal-registry-secret
+
+# Pod security context
+podSecurityContext:
+  fsGroup: 1000
+  runAsNonRoot: true
+  runAsUser: 1000
+```
+
+## Step 7: Deploy ZenML Pro Control Plane
+
+Using the local Helm chart:
+
+```bash
+helm install zenml-pro ./zenml-pro-<version>.tgz \
+  --namespace zenml-pro \
+  --values zenml-pro-values.yaml
+```
+
+Verify deployment:
+
+```bash
+kubectl -n zenml-pro get pods
+kubectl -n zenml-pro get svc
+kubectl -n zenml-pro get ingress
+```
+
+Wait for all pods to be running and healthy.
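+
+Instead of polling `kubectl get pods` by hand, the wait can be expressed with `kubectl wait`. The sketch below is illustrative: `wait_for_pods`, `run`, and `DRY_RUN` are made-up helper names and the 300-second timeout is an arbitrary choice. Note that `kubectl wait --for=condition=Ready` never succeeds for pods that have already completed (for example, one-off job pods), so it is only a fit when the namespace holds long-running pods.
+
```bash
#!/usr/bin/env sh
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Block until every pod in the namespace reports Ready, or the timeout expires.
# Usage: wait_for_pods <namespace> [timeout]
wait_for_pods() {
  namespace="$1"
  timeout="${2:-300s}"
  run kubectl -n "$namespace" wait --for=condition=Ready pods --all --timeout="$timeout"
}

# Example: DRY_RUN=0 wait_for_pods zenml-pro
```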
+
+## Step 8: Enroll Workspace in Control Plane
+
+Before deploying the workspace server, you must enroll it in the control plane to obtain the necessary enrollment credentials.
+
+1. **Access the Control Plane Dashboard**
+   - Navigate to `https://zenml-pro.internal.mycompany.com`
+   - Log in with your admin credentials
+
+2. **Create an Organization** (if not already created)
+   - Go to Organization settings
+   - Create a new organization or use an existing one
+   - Note the Organization ID and Name
+
+3. **Enroll the Workspace**
+   - Use the enrollment script from the [Self-hosted Deployment Guide](self-hosted.md#enrolling-a-workspace), or
+   - Create a workspace through the dashboard and obtain:
+     - Enrollment Key
+     - Organization ID
+     - Organization Name
+     - Workspace ID
+     - Workspace Name
+
+4. **Save these values** - you'll need them in the next step
+
+## Step 9: Configure Helm Values for Workspace Server
+
+Create a file `zenml-workspace-values.yaml`:
+
+```yaml
+zenml:
+  # Image configuration - use your internal registry
+  image:
+    repository: internal-registry.mycompany.com/zenml/zenml-pro-server
+    tag: ""  # e.g., "0.73.0"
+
+  # Server URL
+  serverURL: https://zenml-workspace.internal.mycompany.com
+
+  # Database for Workspace
+  # Note: Workspace servers only support MySQL, not PostgreSQL
+  database:
+    external:
+      type: mysql
+      host: mysql.internal.mycompany.com
+      port: 3306
+      username: zenml_workspace_user
+      password: <your-database-password>
+      database: zenml_workspace
+
+  # Pro configuration - connect to local control plane
+  pro:
+    enabled: true
+    apiURL: https://zenml-pro.internal.mycompany.com/api/v1
+    dashboardURL: https://zenml-pro.internal.mycompany.com
+    enrollmentKey: <enrollment-key>
+    organizationID: <organization-id>
+    organizationName: <organization-name>
+    workspaceID: <workspace-id>
+    workspaceName: <workspace-name>
+
+  # Ingress configuration
+  ingress:
+    enabled: true
+    className: nginx
+    host: zenml-workspace.internal.mycompany.com
+    tls:
+      enabled: true
+      secretName: zenml-tls
+
+  # Resource constraints
+  resources:
+    requests:
+      cpu: 250m
+      memory: 512Mi
+    limits:
+      cpu: 1000m
+      memory: 2Gi
+
+# Image pull secrets
+imagePullSecrets:
+  - name: internal-registry-secret
+
+# Pod security context
+podSecurityContext:
+  fsGroup: 1000
+  runAsNonRoot: true
+  runAsUser: 1000
+```
+
+## Step 10: Deploy ZenML Workspace Server
+
+```bash
+# Create namespace
+kubectl create namespace zenml-workspace
+
+# Deploy workspace
+helm install zenml ./zenml-<version>.tgz \
+  --namespace zenml-workspace \
+  --values zenml-workspace-values.yaml
+```
+
+Verify deployment:
+
+```bash
+kubectl -n zenml-workspace get pods
+kubectl -n zenml-workspace get svc
+kubectl -n zenml-workspace get ingress
+```
+
+## Step 11: Configure Internal DNS
+
+Update your internal DNS to resolve:
+- `zenml-pro.internal.mycompany.com` → Your load balancer/Ingress IP
+- `zenml-workspace.internal.mycompany.com` → Your load balancer/Ingress IP
+
+## Step 12: Install Internal CA Certificate
+
+On all client machines that will access ZenML:
+
+1. Obtain your internal CA certificate
+2. Install it in the system certificate store:
+   - **Linux**: Copy to `/usr/local/share/ca-certificates/` and run `update-ca-certificates`
+   - **macOS**: Use `sudo security add-trusted-cert -d -r trustRoot -k /Library/Keychains/System.keychain /path/to/ca.crt`
+   - **Windows**: Use `certutil -addstore "Root" cert.pem`
+
+3. For Python/ZenML client:
+   ```bash
+   export REQUESTS_CA_BUNDLE=/path/to/ca-bundle.crt
+   ```
+
+4. For containerized pipelines, include the CA certificate in your custom ZenML image
+
+## Step 13: Verify the Deployment
+
+1. **Check Control Plane Health**
+   ```bash
+   curl -k https://zenml-pro.internal.mycompany.com/health
+   ```
+
+2. **Check Workspace Health**
+   ```bash
+   curl -k https://zenml-workspace.internal.mycompany.com/health
+   ```
+
+3. **Access the Dashboard**
+   - Navigate to `https://zenml-pro.internal.mycompany.com` in your browser
+   - Log in with admin credentials
+
+4. **Check Logs**
+   ```bash
+   kubectl -n zenml-pro logs deployment/zenml-pro
+   kubectl -n zenml-workspace logs deployment/zenml
+   ```
+
+## Step 14: (Optional) Enable Snapshot Support / Workload Manager
+
+Pipeline snapshots (running pipelines from the dashboard) require additional configuration:
+
+### 1. Create Kubernetes Resources for Workload Manager
+
+Create a dedicated namespace and service account for runner jobs:
+
+```bash
+# Create namespace
+kubectl create namespace zenml-workload-manager
+
+# Create service account
+kubectl -n zenml-workload-manager create serviceaccount zenml-runner
+
+# Create role with permissions to create jobs and access registry
+# (Specific permissions depend on your implementation choice below)
+```
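+
+The role mentioned in the comment above is not spelled out in this guide. One possible shape is sketched below, purely as an assumption: the role and binding names, the verbs, and the resource list (`jobs`, `pods`, `pods/log`) are illustrative and should be adjusted to what your chosen implementation actually needs. The commands are printed rather than executed unless `DRY_RUN=0` is set.
+
```bash
#!/usr/bin/env sh
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Create a namespaced Role and bind it to the runner service account.
# The permission set here is a guess, not a documented requirement.
setup_runner_rbac() {
  ns="${1:-zenml-workload-manager}"
  sa="${2:-zenml-runner}"
  run kubectl -n "$ns" create role zenml-runner-role \
    --verb=create,get,list,watch,delete \
    --resource=jobs.batch,pods,pods/log
  run kubectl -n "$ns" create rolebinding zenml-runner-binding \
    --role=zenml-runner-role \
    --serviceaccount="$ns:$sa"
}

# Example: DRY_RUN=0 setup_runner_rbac zenml-workload-manager zenml-runner
```
+
+Using a namespaced `Role` rather than a `ClusterRole` keeps the runner's permissions confined to the workload manager namespace.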
+
+### 2. Choose Implementation
+
+**Option A: Kubernetes Implementation (Simplest)**
+
+Use the built-in Kubernetes implementation for running snapshots:
+
+```yaml
+zenml:
+  environment:
+    ZENML_SERVER_WORKLOAD_MANAGER_IMPLEMENTATION_SOURCE: zenml_cloud_plugins.kubernetes_workload_manager.KubernetesWorkloadManager
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_NAMESPACE: zenml-workload-manager
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_SERVICE_ACCOUNT: zenml-runner
+```
+
+**Option B: GCP Implementation (if using GCP)**
+
+For GCP-specific features:
+
+```yaml
+zenml:
+  environment:
+    ZENML_SERVER_WORKLOAD_MANAGER_IMPLEMENTATION_SOURCE: zenml_cloud_plugins.gcp_kubernetes_workload_manager.GCPKubernetesWorkloadManager
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_NAMESPACE: zenml-workload-manager
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_SERVICE_ACCOUNT: zenml-runner
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_BUILD_RUNNER_IMAGE: "true"
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_DOCKER_REGISTRY: <your-registry>/zenml
+```
+
+### 3. Configure Runner Image
+
+Choose how runner images are managed:
+
+**Option A: Use Pre-built Runner Image (Simpler for Air-gap)**
+
+```yaml
+zenml:
+  environment:
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_BUILD_RUNNER_IMAGE: "false"
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_RUNNER_IMAGE: internal-registry.mycompany.com/zenml/zenml:<version>
+```
+
+Pre-build your runner image and push to your internal registry.
+
+**Option B: Have ZenML Build Runner Images**
+
+Requires access to internal Docker registry with push permissions:
+
+```yaml
+zenml:
+  environment:
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_BUILD_RUNNER_IMAGE: "true"
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_DOCKER_REGISTRY: internal-registry.mycompany.com/zenml
+```
+
+### 4. Configure Pod Resources and Policies
+
+```yaml
+zenml:
+  environment:
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_POD_RESOURCES: '{"requests": {"cpu": "100m", "memory": "400Mi"}, "limits": {"memory": "700Mi"}}'
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_TTL_SECONDS_AFTER_FINISHED: "86400"  # 1 day
+    ZENML_KUBERNETES_WORKLOAD_MANAGER_NODE_SELECTOR: '{"node-pool": "compute"}'
+    ZENML_SERVER_MAX_CONCURRENT_TEMPLATE_RUNS: "2"
+```
+
+### 5. Update Workspace Deployment
+
+Update your workspace server Helm values with workload manager configuration and redeploy:
+
+```bash
+helm upgrade zenml ./zenml-<version>.tgz \
+  --namespace zenml-workspace \
+  --values zenml-workspace-values.yaml
+```
+
+## Step 15: Create Users and Organizations
+
+In the ZenML Pro dashboard:
+
+1. Create an organization
+2. Create users for your team
+3. Assign roles and permissions
+4. Configure teams
+
+## Network Requirements Summary
+
+| Traffic | Source | Destination | Port | Direction |
+|---------|--------|-------------|------|-----------|
+| Web Access | Client Machines | Ingress Controller | 443 | Inbound |
+| API Access | ZenML Client | Workspace Server | 443 | Inbound |
+| Database | Kubernetes Pods | PostgreSQL (control plane) / MySQL (workspace) | 5432 / 3306 | Outbound |
+| Registry | Kubernetes | Internal Registry | 443 | Outbound |
+| Inter-service | Kubernetes Internal | Kubernetes Services | 443 | Internal |
+
+## Scaling & High Availability
+
+### Multiple Control Plane Replicas
+
+```yaml
+zenml:
+  replicaCount: 3
+```
+
+### Multiple Workspace Replicas
+
+```yaml
+zenml:
+  replicaCount: 2
+```
+
+### Pod Disruption Budgets
+
+Protect against accidental disruptions:
+
+```yaml
+podDisruptionBudget:
+  enabled: true
+  minAvailable: 1
+```
+
+### Database Replication
+
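+For HA, configure replication for your databases. For PostgreSQL, this means streaming replication (use the equivalent replication setup for the MySQL workspace database):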
+1. Set up a standby database
+2. Configure continuous archiving
+3. Test failover procedures
+
+## Backup & Recovery
+
+### Automated Backups
+
+Configure automated backups for every ZenML database (PostgreSQL and/or MySQL):
+- **Frequency**: Daily or more frequent
+- **Retention**: 30+ days
+- **Location**: Internal storage (not external)
+- **Testing**: Test restore procedures regularly
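+
+For a PostgreSQL control plane database, a scheduled backup with a retention window might look like the sketch below. The function names, the backup directory, and the 30-day default are assumptions for illustration; `pg_dump` reads the password from `PGPASSWORD` or `~/.pgpass`, and a MySQL workspace database would need `mysqldump` or a similar tool instead. The dump command is only printed unless `DRY_RUN=0` is set.
+
```bash
#!/usr/bin/env sh
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Dump one database in pg_dump's custom format with a UTC timestamp in the name.
# Usage: backup_database <host> <user> <database> <backup-dir>
backup_database() {
  host="$1"; user="$2"; db="$3"; backup_dir="$4"
  stamp="$(date -u +%Y%m%dT%H%M%SZ)"
  run pg_dump -h "$host" -U "$user" -d "$db" -Fc -f "$backup_dir/$db-$stamp.dump"
}

# Delete dumps whose modification time is older than the retention window.
# Usage: prune_backups <backup-dir> [days]
prune_backups() {
  backup_dir="$1"; days="${2:-30}"
  find "$backup_dir" -type f -name '*.dump' -mtime +"$days" -exec rm -f {} +
}
```
+
+Both functions can be called from a daily cron job or a Kubernetes `CronJob`. Custom-format dumps are restored with `pg_restore`, which is the command to exercise when testing restores.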
+
+### Backup Checklist
+
+1. Database backups (automated)
+2. Configuration backups (values.yaml files, versioned)
+3. TLS certificates (secure storage)
+4. Custom CA certificate (backup copy)
+5. Helm chart versions (archived)
+
+### Recovery Procedure
+
+Documented recovery procedure should cover:
+1. Database restoration steps
+2. Helm redeployment steps
+3. Data validation after restore
+4. User communication plan
+
+## Monitoring & Logging
+
+### Internal Monitoring
+
+Set up internal monitoring for:
+- CPU and memory usage
+- Pod restart count
+- Database connection count
+- Ingress error rates
+- Certificate expiration dates
+
+### Log Aggregation
+
+Forward logs to your internal log aggregation system:
+- Application logs from ZenML pods
+- Ingress logs
+- Database logs
+- Kubernetes events
+
+### Alerting
+
+Create alerts for:
+- Pod failures
+- High resource usage
+- Database connection errors
+- Certificate near expiration
+- Disk space warnings
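+
+As one example of such an alert, certificate expiry can be checked with `openssl x509 -checkend`, which exits non-zero when the certificate expires within the given number of seconds. The helper name and the 30-day threshold below are illustrative choices, not part of ZenML.
+
```bash
#!/usr/bin/env sh
set -eu

# Return 0 if the certificate is still valid <days> days from now, non-zero otherwise.
# Usage: cert_valid_for_days <cert-file> [days]
cert_valid_for_days() {
  cert_file="$1"
  days="${2:-30}"
  openssl x509 -in "$cert_file" -noout -checkend "$((days * 86400))" >/dev/null
}

# Example:
#   cert_valid_for_days /path/to/tls.crt 30 || echo "TLS certificate expires within 30 days"
```
+
+Running this check from a scheduled job and forwarding failures to your alerting system gives advance warning before the ingress certificate or internal CA lapses.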
+
+## Maintenance
+
+### Regular Tasks
+
+- Monitor disk space (databases, artifact storage)
+- Review and manage user access
+- Update internal CA certificate before expiration
+- Test backup and recovery procedures
+- Monitor pod logs for warnings
+
+### Periodic Updates
+
+When updating to a new ZenML version:
+
+1. Pull new images on internet-connected machine
+2. Push to internal registry
+3. Create new offline bundle with updated Helm charts
+4. Transfer bundle to air-gapped environment
+5. Update Helm charts in air-gapped environment
+6. Update image tags in values.yaml
+7. Perform helm upgrade on control plane
+8. Perform helm upgrade on workspace servers
+9. Verify health after upgrade
+10. Update client images in your custom ZenML container
+
+## Troubleshooting
+
+### Pods Won't Start
+
+Check pod logs and events:
+```bash
+kubectl -n zenml-pro describe pod zenml-pro-xxxxx
+kubectl -n zenml-pro logs zenml-pro-xxxxx
+```
+
+Common issues:
+- Image pull failures (check registry access)
+- Database connectivity (verify connection string)
+- Certificate issues (verify CA is trusted)
+
+### Database Connection Failed
+
+```bash
+# Test from a pod (assumes the psql client is available in the container image)
+kubectl -n zenml-pro exec -it zenml-pro-xxxxx -- \
+  psql -h postgres.internal.mycompany.com -U zenml_pro_user -d zenml_pro
+```
+
+### Can't Access via HTTPS
+
+1. Verify certificate validity
+2. Verify DNS resolution
+3. Check Ingress status
+4. Verify CA certificate is installed on client
+
+### Image Pull Errors
+
+1. Verify images are in internal registry
+2. Check registry credentials in secret
+3. Verify imagePullSecrets configured correctly
+
+## Day 2 Operations: Updates and Upgrades
+
+### Receiving New Versions
+
+When new ZenML versions are released:
+
+1. **Request offline bundle** from ZenML Support containing:
+   - Updated container images
+   - Updated Helm charts
+   - Release notes and migration guide
+   - Vulnerability assessment (if applicable)
+
+2. **Review release notes** for:
+   - Breaking changes
+   - Database migration requirements
+   - New features and configuration options
+   - Security updates
+
+3. **Transfer bundle** to your air-gapped environment using approved methods
+
+### Upgrade Process
+
+1. **Back up current state:**
+   - Database backup
+   - Values.yaml files
+   - TLS certificates
+
+2. **Update container images in internal registry:**
+   - Extract and load new images
+   - Tag and push to your internal registry
+
+3. **Update Helm charts:**
+   - Extract new chart versions
+   - Review any changes to values schema
+
+4. **Upgrade control plane first:**
+   ```bash
+   helm upgrade zenml-pro ./zenml-pro-<version>.tgz \
+     --namespace zenml-pro \
+     --values zenml-pro-values.yaml
+   ```
+
+5. **Verify control plane:**
+   - Check pod status
+   - Review logs
+   - Test connectivity
+
+6. **Upgrade workspace servers:**
+   ```bash
+   helm upgrade zenml ./zenml-<version>.tgz \
+     --namespace zenml-workspace \
+     --values zenml-workspace-values.yaml
+   ```
+
+7. **Verify workspaces:**
+   - Check all pods are running
+   - Review logs
+   - Run health checks
+   - Test dashboard access
+ - Test dashboard access
+
+### Database Migrations
+
+Some updates may require database migrations:
+
+1. **Review migration guide** in release notes
+2. **Back up database** before upgrading
+3. **Monitor logs** for any migration-related errors
+4. **Verify data integrity** after upgrade
+5. **Test key features** (workspace access, pipeline runs, etc.)
+
+## Disaster Recovery & Backup Strategy
+
+### Backup Components
+
+Regular backups should include:
+
+1. **PostgreSQL Databases:**
+   - Schedule automated backups (daily minimum)
+   - Test restore procedures regularly
+   - Store backups in a different location (second disk, external storage)
+   - Retain for 30+ days
+
+2. **Configuration:**
+   - Version control Helm values files
+   - Store TLS certificates securely
+   - Document any manual customizations
+
+3. **Container Images:**
+   - Keep copies of all images used
+   - Maintain manifest of images and versions
+
+### Recovery Procedures
+
+Document and test:
+
+1. **Database Recovery:**
+   - Steps to restore from backup
+   - Verification procedures
+   - Estimated recovery time
+
+2. **Full Cluster Recovery:**
+   - How to redeploy from scratch
+   - Image and chart preparation
+   - Restore order (databases first, then control plane, then workspaces)
+
+3. **Partial Recovery:**
+   - Recovering single workspace
+   - Recovering specific components
+
+## Related Resources
+
+- [Self-hosted Deployment Overview](self-hosted-deployment.md)
+- [Self-hosted Deployment Guide](self-hosted.md) - Comprehensive deployment reference
+- [Kubernetes Documentation](https://kubernetes.io/docs/)
+- [PostgreSQL Documentation](https://www.postgresql.org/docs/)
+- [Helm Documentation](https://helm.sh/docs/)
+
+## Support
+
+For air-gapped deployments, contact ZenML Support:
+- Email: [cloud@zenml.io](mailto:cloud@zenml.io)
+- Provide: Your offline bundle, deployment status, and any error logs
+
+Request from ZenML Support:
+- Pre-deployment architecture consultation
+- Offline support packages
+- Update bundles and release notes
+- Security documentation (SBOM, vulnerability reports)
diff --git a/docs/book/getting-started/zenml-pro/self-hosted-deployment.md b/docs/book/getting-started/zenml-pro/self-hosted-deployment.md
new file mode 100644
index 00000000000..b343705be70
--- /dev/null
+++ b/docs/book/getting-started/zenml-pro/self-hosted-deployment.md
@@ -0,0 +1,412 @@
+---
+description: Learn about ZenML Pro Self-hosted deployment - complete control and data sovereignty for the strictest security requirements.
+icon: shield-halved
+---
+
+# Self-hosted Deployment
+
+ZenML Pro Self-hosted deployment provides complete control and data sovereignty for organizations with the strictest security, compliance, or regulatory requirements. All ZenML components run entirely within your infrastructure with no external dependencies or internet connectivity required.
+
+{% hint style="info" %}
+To learn more about Self-hosted deployment, [book a call](https://www.zenml.io/book-your-demo).
+{% endhint %}
+
+## Overview
+
+In a Self-hosted deployment, every component of ZenML Pro runs within your isolated network environment. This architecture is designed for organizations that must operate in completely disconnected environments or have regulatory requirements preventing any external communication.
+
+
+
+## Architecture
+
+### What Runs Where
+
+| Component | Location | Purpose |
+|-----------|----------|---------|
+| **Pro Control Plane** | Your Infrastructure | Manages authentication, RBAC, and workspace coordination |
+| **ZenML Pro Server(s)** | Your Infrastructure | Handles pipeline orchestration and execution |
+| **Pro Metadata Store** | Your Infrastructure | Stores user management, RBAC, and organizational data |
+| **Workspace Metadata Store** | Your Infrastructure | Stores pipeline runs, model metadata, and tracking information |
+| **Secrets Store** | Your Infrastructure | Stores all credentials and sensitive configuration |
+| **Identity Provider** | Your Infrastructure | Handles authentication (OIDC/LDAP/SAML) |
+| **Pro Dashboard** | Your Infrastructure | Web interface for all ZenML Pro features |
+| **Compute Resources** | Your infrastructure through [stacks](https://docs.zenml.io/stacks) | Executes pipeline steps and training jobs |
+| **Data & Artifacts** | Your infrastructure through [stacks](https://docs.zenml.io/stacks) | Stores datasets, models, and pipeline artifacts |
+
+### Complete Isolation
+
+```mermaid
+flowchart TB
+ subgraph infra["Your Infrastructure (Self-hosted)"]
+ direction TB
+
+ control_plane["ZenML Pro Control Plane - Authentication & Authorization - RBAC Management - Workspace Coordination - Pro Metadata Store"]
+
+ subgraph workspaces[" "]
+ direction LR
+ ws1["Workspace 1 - Server - Metadata - Secrets"]
+ ws2["Workspace 2 - Server - Metadata - Secrets"]
+ wsn["... - Server - Metadata - Secrets"]
+ end
+
+ compute["Your Compute & Storage Resources - Kubernetes / VMs / Cloud - Artifact Stores - ML Data & Models"]
+ end
+
+ control_plane --> ws1
+ control_plane --> ws2
+ control_plane --> wsn
+ ws1 --> compute
+ ws2 --> compute
+ wsn --> compute
+
+ note[/"⚠️ No External Communication Required"/]
+ infra --- note
+```
+
+{% hint style="success" %}
+**Complete data sovereignty**: Zero data leaves your environment. All components, metadata, and ML artifacts remain within your infrastructure boundaries.
+{% endhint %}
+
+### Data Flow
+
+For a detailed explanation of the common pipeline execution data flow across all deployment scenarios, see [Common Pipeline Execution Data Flow](deployments-overview.md#common-pipeline-execution-data-flow) in the Deployment Scenarios Overview.
+
+In Self-hosted deployment, users authenticate via your internal identity provider (LDAP/AD/OIDC), and the control plane (running in your infrastructure) handles both authentication and RBAC. All communication happens entirely within your infrastructure boundary with zero external dependencies or internet connectivity required.
+
+## Key Benefits
+
+### 🔒 Maximum Security & Control
+
+- **Complete air-gap**: No internet connectivity required for operation
+- **Zero external dependencies**: All components self-contained
+- **Custom security policies**: Full control over all security configurations
+- **Network isolation**: Operates within your security perimeter
+- **Audit compliance**: Complete logging and monitoring within your infrastructure
+
+### 🏛️ Regulatory Compliance
+
+- **Data residency**: All data stays within your jurisdiction
+- **ITAR/EAR compliance**: Suitable for controlled data environments
+- **HIPAA/GDPR ready**: Meet healthcare and privacy regulations
+- **Government/Defense**: Suitable for classified environments
+- **Financial services**: Meet banking and financial regulations
+
+### 🎯 Enterprise Control
+
+- **Custom identity provider**: Integrate with your LDAP/AD/OIDC
+- **Infrastructure flexibility**: Deploy on any infrastructure (cloud, on-prem, edge)
+- **Version control**: Control update schedules and versions
+- **Backup strategy**: Implement your own backup and DR policies
+- **Resource optimization**: Full control over resource allocation and costs
+
+### 🛡️ Certified & Documented
+
+- **SOC 2 & ISO 27001 certified software**: Meets enterprise security and compliance benchmarks for your peace of mind
+- **Vulnerability Assessment Reports**: Available on request
+- **Software Bill of Materials (SBOM)**: Complete dependency inventory
+- **Architecture documentation**: Comprehensive deployment guides
+
+## Ideal Use Cases
+
+Self-hosted deployment is essential for:
+
+- **Government and defense** organizations with classified data requirements
+- **Regulated industries** (healthcare, finance) with strict data residency requirements
+- **Organizations in restricted regions** with limited or no internet connectivity
+- **Research institutions** handling sensitive or proprietary research data
+- **Critical infrastructure** operators requiring isolated systems
+- **Companies with ITAR/EAR compliance** requirements
+- **Enterprises with zero-trust policies** prohibiting external communication
+- **Organizations requiring full control** over all aspects of their MLOps platform
+
+## Deployment Options
+
+### On-Premises Data Center
+
+Deploy on your own hardware:
+- Physical servers or private cloud
+- Complete infrastructure control
+- Integration with existing systems
+- Custom hardware configurations
+
+### Private Cloud (AWS, Azure, GCP)
+
+Deploy in isolated cloud VPC:
+- No internet gateway
+- Private networking only
+- Use cloud-native services
+- Leverage cloud scalability within your boundary
+
+### Hybrid Multi-Cloud
+
+Deploy across multiple environments:
+- On-premises + private cloud
+- Multi-region for DR
+- Edge + datacenter hybrid
+- Maintain complete isolation across all environments
+
+### Edge Deployments
+
+Deploy at edge locations:
+- Manufacturing facilities
+- Remote research stations
+- Mobile/tactical deployments
+- Disconnected field operations
+
+## Deployment Architecture
+
+### Architecture Diagram
+
+
+
+The diagram above illustrates a complete Self-hosted ZenML Pro deployment with all components running within your organization's VPC. This architecture ensures zero external communication while providing full enterprise MLOps capabilities.
+
+### Architecture Components
+
+**Client SDK** (top center):
+The ZenML Python SDK runs on developer laptops, CI/CD systems, or notebooks. It communicates with all layers to:
+- Authenticate users via your Identity Provider
+- Submit pipeline runs to workspaces
+- Push Docker images to your Container Registry
+- Access the Organization Platform Layer components
+
+**Organization Platform Layer** (left, pink):
+Your existing ML infrastructure components that ZenML integrates with:
+- **Container Registry**: Store pipeline Docker images (AWS ECR, Docker Hub, Google Artifact Registry, Azure Container Registry)
+- **Artifact Store**: Store ML artifacts, models, and datasets (S3, GCS, Azure Blob Storage, ADLS)
+- **Code Repository**: Version control for pipeline code (GitHub Enterprise, GitLab, Bitbucket)
+- **Orchestrator**: Execute pipeline workloads (Vertex AI, SageMaker, AzureML, Kubernetes)
+
+**Infrastructure Layer** (top, cyan):
+Core infrastructure services:
+- **Identity Provider**: LDAP, Active Directory, or OIDC provider for user authentication
+- **Load Balancer**: Distributes traffic to ZenML services for high availability
+
+**ZenML Control Plane** (center, blue):
+The management layer running in Kubernetes:
+- **ZenML FE**: React-based Pro dashboard for pipeline visualization and model management
+- **ZenML Control Plane**: Coordinates workspaces, handles authentication/RBAC, manages organization settings
+
+**ZenML Application Plane** (center, purple):
+Individual workspace servers running in Kubernetes:
+- **Multiple Workspaces**: Isolated environments for different teams (DS Team 1, DS Team 2, etc.)
+- Each workspace has its own server instance, metadata database, and secrets store
+- Workspaces are orchestrated by the Control Plane but run independently
+
+**ZenML Storage Plane** (bottom, pink):
+Persistent storage for ZenML services:
+- **Secret Store**: Vault or cloud secrets manager for storing credentials securely
+- **Database**: PostgreSQL or MySQL for storing workspace metadata, pipeline runs, and control plane data
+
+### Data Flow
+
+All arrows in the diagram represent communication flows that occur entirely within your VPC:
+1. Client SDK authenticates with Identity Provider
+2. Client SDK connects to ZenML Control Plane for workspace access
+3. Control Plane manages and coordinates workspaces
+4. Workspaces orchestrate pipeline execution on your Orchestrator
+5. Pipelines write artifacts to your Artifact Store
+6. Workspaces store metadata in the Database
+7. All components access secrets from the Secret Store
+
+**Key Security Feature**: The entire system operates without any external internet connectivity. All Docker images, dependencies, and updates are transferred to your environment through secure offline channels.
+
+### High Availability Configuration
+
+For mission-critical deployments:
+- **Active-active** control plane for zero downtime
+- **Database replication** for metadata stores
+- **Load balancers** for workspace servers
+- **Backup sites** for disaster recovery
+- **Monitoring and alerting** for all components
+
+## Pre-requisites
+
+Before deployment, ensure you have:
+
+#### Infrastructure Requirements
+- Kubernetes cluster (recommended) or VM infrastructure
+- PostgreSQL database(s) for metadata storage
+- Object storage or NFS for artifacts
+- Load balancer for HA configurations
+- Identity provider (LDAP/AD/OIDC)
+
+#### Network Requirements
+- Internal DNS resolution
+- SSL/TLS certificates (internal CA)
+- Network connectivity between components
+- Firewall rules for inter-component communication
+
+#### Resource Requirements
+```yaml
+# Minimum requirements
+Control Plane:
+ CPU: 4 cores
+ Memory: 16GB RAM
+ Storage: 100GB
+
+Per Workspace:
+ CPU: 2 cores
+ Memory: 8GB RAM
+ Storage: 50GB + metadata
+
+Database:
+ CPU: 4 cores
+ Memory: 16GB RAM
+ Storage: 500GB (scalable)
+```
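
As a rough sizing aid, the minimums above can be summed for a given number of workspaces. The sketch below is illustrative arithmetic only: the function name and the choice to give every workspace exactly the minimum allocation are assumptions, and the "+ metadata" portion of per-workspace storage is not modeled.

```python
# Minimum figures copied from the resource listing above
# (CPU cores, GB of RAM, GB of storage).
CONTROL_PLANE = {"cpu": 4, "memory_gb": 16, "storage_gb": 100}
PER_WORKSPACE = {"cpu": 2, "memory_gb": 8, "storage_gb": 50}  # excludes "+ metadata"
DATABASE = {"cpu": 4, "memory_gb": 16, "storage_gb": 500}


def minimum_footprint(workspaces: int) -> dict:
    """Sum the minimums for one control plane, one database, and N workspaces."""
    if workspaces < 1:
        raise ValueError("at least one workspace is required")
    return {
        key: CONTROL_PLANE[key] + DATABASE[key] + workspaces * PER_WORKSPACE[key]
        for key in CONTROL_PLANE
    }


print(minimum_footprint(3))
```

For three workspaces this comes to 14 cores, 56 GB of RAM, and 750 GB of storage, before any headroom for pipeline workloads or metadata growth.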
+
+## Operations & Maintenance
+
+### Updates & Upgrades
+
+ZenML provides new versions as offline bundles:
+
+1. **Receive new bundle**: Typically by pulling our Docker images via your approved transfer method
+2. **Review release and compatibility notes**: Read the release notes and any migration instructions included in the offline bundle. Identify required infrastructure or configuration changes, as well as changes to CI/CD actions or deployment processes, before proceeding.
+3. **Test in staging**: Deploy to test environment first
+4. **Backup current state**: Database and configuration backups
+5. **Apply updates**: Run Helm upgrade commands, or update the deployment with Terraform or another Infrastructure-as-Code (IaC) tool.
+6. **Verify functionality**: Run health checks and tests
+7. **Monitor**: Watch for any issues post-upgrade
+
+
+### Disaster Recovery
+
+Plan for disaster scenarios:
+
+1. **Database replication**: PostgreSQL streaming replication to backup site
+2. **Artifact replication**: Sync artifact stores to DR location
+3. **Configuration backup**: Version-controlled infrastructure as code
+4. **Runbook**: Document DR procedures
+5. **Regular testing**: Test DR procedures quarterly
+
+## Security Hardening
+
+### Network Security
+
+- **Network segmentation**: Isolate ZenML components in dedicated network segments
+- **Firewall rules**: Restrict traffic to only required ports
+- **TLS everywhere**: Encrypt all communication
+- **Certificate management**: Use internal CA for certificate issuance
+
+
+### Access Control
+
+- **Principle of least privilege**: Grant minimal required permissions
+- **Service accounts**: Use dedicated service accounts for automation
+- **Audit logging**: Log all authentication and authorization events
+
+### Container Security
+
+- **Image scanning**: Scan all container images before deployment
+- **Runtime security**: Monitor container behavior
+- **Pod security standards**: Enforce pod-level security standards (PodSecurityPolicy was removed in Kubernetes 1.25 in favor of Pod Security Admission)
+- **Resource limits**: Prevent resource exhaustion attacks
+
+## Support & Documentation
+
+### What ZenML Provides
+
+- **Deployment packages**: Complete offline installation bundles
+- **Documentation**: Comprehensive setup and operation guides
+- **SBOM**: Full software bill of materials for compliance
+- **Vulnerability reports**: Security assessment documentation
+- **Architecture consultation**: Pre-deployment planning support
+- **Deployment assistance**: Guidance during initial setup
+- **Update packages**: New versions as offline bundles
+
+### What You Manage
+
+- **Infrastructure**: Hardware, networking, storage
+- **Day-to-day operations**: Monitoring, backups, user management
+- **Security policies**: Firewall rules, access controls
+- **Compliance**: Audit logs, security assessments
+- **Updates**: Applying new versions using provided bundles
+
+### Support Model
+
+Contact [cloud@zenml.io](mailto:cloud@zenml.io) for:
+- Pre-sales architecture consultation
+- Deployment planning and sizing
+- Security documentation requests
+- Offline support packages
+- Update and upgrade assistance
+
+## Licensing
+
+Self-hosted deployments are provided under commercial software license agreements, with license fees and terms defined per customer. Each contract includes license terms and conditions appropriate to the deployment.
+
+## Security Documentation
+
+Available on request for compliance and security reviews:
+
+- ✅ **Vulnerability Assessment Reports**: Full security analysis
+- ✅ **Software Bill of Materials (SBOM)**: Complete dependency list
+- ✅ **Architecture security review**: Threat model and mitigations
+- ✅ **Compliance mappings**: NIST, CIS, GDPR, HIPAA guidance
+- ✅ **Security hardening guide**: Best practices for your deployment
+
+## Comparison with Other Deployments
+
+| Feature | SaaS | Hybrid SaaS | Self-hosted |
+|---------|------|-------------|------------|
+| Internet Required | Yes (metadata) | Yes (control plane) | **No** |
+| Setup Time | Minutes | Hours/Days | Days/Weeks |
+| Maintenance | Zero | Partial | **Full control** |
+| Data Location | Mixed | Your infra | **100% yours** |
+| User Management | ZenML | ZenML | **Your IDP** |
+| Update Control | Automatic | Automatic (control plane) | **You decide** |
+| Customization | Limited | Moderate | **Complete** |
+| Best For | Fast start | Balance | **Max security** |
+
+[Compare all deployment options →](README.md#deployment-scenarios)
+
+## Migration Path
+
+### From ZenML OSS to Self-hosted Pro
+
+If you're interested in migrating from ZenML OSS to a Self-hosted Pro deployment, we're here to help guide you through every step of the process. Migration paths are highly dependent on your specific environment, infrastructure setup, and current ZenML OSS deployment configuration.
+
+Existing stacks, and even metadata, can be migrated from an OSS deployment. We can work out what to migrate, and how, together on a call.
+
+**Next steps:**
+
+- [Book a migration consultation →](https://www.zenml.io/book-your-demo)
+- Or email us at [cloud@zenml.io](mailto:cloud@zenml.io)
+
+Your ZenML representative will work with you to assess your current setup, understand your Self-hosted requirements, and provide a tailored migration plan that fits your environment.
+
+### From Other Pro Deployments
+
+If you're moving from SaaS or Hybrid to Self-hosted, migration paths can vary significantly depending on your organization's size, data residency requirements, and current ZenML setup. We recommend discussing your plans with a ZenML solutions architect.
+
+**Next steps:**
+
+- [Book a migration consultation →](https://www.zenml.io/book-your-demo)
+- Or email us at [cloud@zenml.io](mailto:cloud@zenml.io)
+
+Your ZenML representative will provide you with a tailored migration checklist, technical documentation, and direct support to ensure a smooth transition with minimal downtime.
+
+## Detailed Architecture Diagram
+
+
+
+
+## Related Resources
+
+- [System Architecture Overview](../system-architectures.md#zenml-pro-self-hosted-architecture)
+- [Deployment Scenarios Overview](deployments-overview.md)
+- [SaaS Deployment](saas-deployment.md)
+- [Hybrid SaaS Deployment](hybrid-deployment.md)
+- [Workload Managers](workload-managers.md)
+- [Self-hosted Deployment Guide](self-hosted.md)
+- [Security & Compliance](README.md#security--compliance)
+
+## Get Started
+
+Ready to deploy ZenML Pro in a Self-hosted environment?
+
+[Book a Demo](https://www.zenml.io/book-your-demo)
+
+Have questions? [Contact us](mailto:cloud@zenml.io) for detailed deployment planning.
diff --git a/docs/book/getting-started/zenml-pro/toc.md b/docs/book/getting-started/zenml-pro/toc.md
index 79f4ff07e21..f0fc262337f 100644
--- a/docs/book/getting-started/zenml-pro/toc.md
+++ b/docs/book/getting-started/zenml-pro/toc.md
@@ -4,7 +4,13 @@
## Deployments
-* [Self-hosted deployment](self-hosted.md)
+* [Deployment Scenarios Overview](deployments-overview.md)
+* [SaaS Deployment](saas-deployment.md)
+* [Hybrid SaaS Deployment](hybrid-deployment.md)
+ * [Kubernetes with Helm](hybrid-deployment-helm.md)
+ * [AWS ECS](hybrid-deployment-ecs.md)
+* [Self-hosted Deployment](self-hosted-deployment.md)
+ * [Kubernetes with Helm](self-hosted-deployment-helm.md)
## Core Concepts
@@ -13,6 +19,7 @@
* [Workspaces](workspaces.md)
* [Projects](projects.md)
* [Teams](teams.md)
+* [Workload Managers](workload-managers.md)
## Access Management
diff --git a/docs/book/getting-started/zenml-pro/workload-managers.md b/docs/book/getting-started/zenml-pro/workload-managers.md
new file mode 100644
index 00000000000..2c1857b75b3
--- /dev/null
+++ b/docs/book/getting-started/zenml-pro/workload-managers.md
@@ -0,0 +1,391 @@
+---
+description: Understand workload managers and how they enable running pipelines from the ZenML Pro UI.
+icon: microchip
+---
+
+# Workload Managers
+
+Workload managers are built into the ZenML Pro workspace container. They enable you to run pipeline snapshots directly from the ZenML Pro UI by allowing the workspace to orchestrate pipeline execution on your infrastructure. Without a workload manager configured, your workspace can only be used for monitoring and analyzing completed pipeline runs. With one configured, you gain the ability to trigger and execute pipelines interactively.
+
+{% hint style="info" %}
+This feature is available in [all ZenML Pro deployment scenarios](deployments-overview.md) (SaaS, Hybrid, and Self-hosted).
+{% endhint %}
+
+## Architecture
+
+The ZenML Pro workspace container includes workload manager implementations. You configure which implementation to use through environment variables passed to the workspace. The workspace then uses that implementation to coordinate pipeline execution with your infrastructure.
+
+### Execution Flow
+
+1. **User triggers a snapshot from the ZenML Pro UI**: You select a pipeline snapshot and click "Run"
+2. **ZenML workspace receives the request**: Your ZenML Pro workspace (wherever it runs: SaaS, Hybrid, or Self-hosted) receives the execution request.
+3. **Workload manager implementation handles orchestration**: The configured workload manager implementation (Kubernetes, AWS, or GCP) translates the request into infrastructure-specific commands.
+4. **Runner pod/task is created**: The workload manager creates a Kubernetes pod, ECS task, or equivalent compute unit on your infrastructure.
+5. **Pipeline executes**: The runner pulls the pipeline code, executes the steps, and streams logs back to the workspace.
+6. **Results are captured**: Artifacts, metrics, and run metadata are stored in your configured artifact store and metadata backend.
+
+## How Workload Managers Are Configured
+
+Workload managers are enabled by setting environment variables on the ZenML Pro workspace container. Each implementation requires a specific set of environment variables that tell the workspace:
+
+- Which workload manager implementation to use
+- Where to create runner pods/tasks (namespace, cluster, region)
+- How to access container registries and storage
+- What permissions and resources runners should have
+
+All configuration happens within a single workspace deployment—no separate services are needed.
+
+## Supported Implementations
+
+ZenML Pro supports three workload manager implementations:
+
+### 1. Kubernetes Workload Manager
+
+The simplest implementation, suitable for any Kubernetes cluster (EKS, GKE, AKS, self-managed).
+
+**Environment Variables:**
+
+```yaml
+ZENML_SERVER_WORKLOAD_MANAGER_IMPLEMENTATION_SOURCE: zenml_cloud_plugins.kubernetes_workload_manager.KubernetesWorkloadManager
+ZENML_KUBERNETES_WORKLOAD_MANAGER_NAMESPACE: zenml-workload-manager
+ZENML_KUBERNETES_WORKLOAD_MANAGER_SERVICE_ACCOUNT: zenml-runner
+ZENML_KUBERNETES_WORKLOAD_MANAGER_RUNNER_IMAGE: 715803424590.dkr.ecr.eu-central-1.amazonaws.com/zenml-pro-server:0.73.0
+ZENML_KUBERNETES_WORKLOAD_MANAGER_POD_RESOURCES: '{"requests": {"cpu": "500m", "memory": "512Mi"}, "limits": {"cpu": "2000m", "memory": "2Gi"}}'
+ZENML_KUBERNETES_WORKLOAD_MANAGER_TTL_SECONDS_AFTER_FINISHED: 86400
+ZENML_SERVER_MAX_CONCURRENT_TEMPLATE_RUNS: 5
+```
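
Because several of these values are strings that must parse as JSON or integers, a quick pre-deployment check can catch typos early. The following is a minimal sketch, not part of ZenML: the function name and the validation rules (non-empty namespace and service account, `requests`/`limits` objects, positive integers) are assumptions chosen for illustration.

```python
import json

PREFIX = "ZENML_KUBERNETES_WORKLOAD_MANAGER_"

# Values copied from the example above (image variables omitted for brevity).
EXAMPLE_ENV = {
    PREFIX + "NAMESPACE": "zenml-workload-manager",
    PREFIX + "SERVICE_ACCOUNT": "zenml-runner",
    PREFIX + "POD_RESOURCES": (
        '{"requests": {"cpu": "500m", "memory": "512Mi"}, '
        '"limits": {"cpu": "2000m", "memory": "2Gi"}}'
    ),
    PREFIX + "TTL_SECONDS_AFTER_FINISHED": "86400",
    "ZENML_SERVER_MAX_CONCURRENT_TEMPLATE_RUNS": "5",
}


def check_workload_manager_env(env: dict) -> list:
    """Return a list of problems; an empty list means the values parse cleanly."""
    problems = []
    for name in ("NAMESPACE", "SERVICE_ACCOUNT"):
        if not env.get(PREFIX + name):
            problems.append(f"{PREFIX}{name} is missing or empty")
    try:
        resources = json.loads(env.get(PREFIX + "POD_RESOURCES", ""))
    except json.JSONDecodeError:
        problems.append(f"{PREFIX}POD_RESOURCES is not valid JSON")
    else:
        for section in ("requests", "limits"):
            ok = isinstance(resources, dict) and isinstance(resources.get(section), dict)
            if not ok:
                problems.append(f"{PREFIX}POD_RESOURCES lacks a '{section}' object")
    for key in (
        PREFIX + "TTL_SECONDS_AFTER_FINISHED",
        "ZENML_SERVER_MAX_CONCURRENT_TEMPLATE_RUNS",
    ):
        value = str(env.get(key, ""))
        if not value.isdigit() or int(value) == 0:
            problems.append(f"{key} must be a positive integer")
    return problems


print(check_workload_manager_env(EXAMPLE_ENV))
```

A check like this could run in CI against the values file before a Helm upgrade; it only verifies shape, not whether the namespace or service account actually exists.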
+
+**Requirements:**
+- Kubernetes cluster (1.24+)
+- Service account with permissions to create/manage pods in a dedicated namespace
+- Network connectivity from cluster to your ZenML workspace
+- Access to a container registry with ZenML runner images
+
+**How it works:**
+- The workspace uses the Kubernetes API to create pods in the specified namespace
+- Pods run under the specified service account, inheriting cluster network access
+- Completed pods are automatically cleaned up after the TTL expires
+
+**Use cases:**
+- Self-managed ZenML workspaces on Kubernetes (Hybrid or Self-hosted)
+- Teams already running Kubernetes infrastructure
+- Minimal setup complexity
+
+### 2. AWS Kubernetes Workload Manager
+
+A specialized implementation for EKS that integrates with AWS services (ECR for images, S3 for logs, IAM for permissions).
+
+**Environment Variables:**
+
+```yaml
+ZENML_SERVER_WORKLOAD_MANAGER_IMPLEMENTATION_SOURCE: zenml_cloud_plugins.aws_kubernetes_workload_manager.AWSKubernetesWorkloadManager
+ZENML_KUBERNETES_WORKLOAD_MANAGER_NAMESPACE: zenml-workload-manager
+ZENML_KUBERNETES_WORKLOAD_MANAGER_SERVICE_ACCOUNT: zenml-runner
+ZENML_KUBERNETES_WORKLOAD_MANAGER_BUILD_RUNNER_IMAGE: "true"
+ZENML_KUBERNETES_WORKLOAD_MANAGER_DOCKER_REGISTRY:
+ZENML_KUBERNETES_WORKLOAD_MANAGER_ENABLE_EXTERNAL_LOGS: "true"
+ZENML_AWS_KUBERNETES_WORKLOAD_MANAGER_BUCKET: s3://your-bucket/zenml-logs
+ZENML_AWS_KUBERNETES_WORKLOAD_MANAGER_REGION: us-east-1
+ZENML_KUBERNETES_WORKLOAD_MANAGER_POD_RESOURCES: '{"requests": {"cpu": "500m", "memory": "512Mi"}, "limits": {"cpu": "2000m", "memory": "2Gi"}}'
+ZENML_KUBERNETES_WORKLOAD_MANAGER_TTL_SECONDS_AFTER_FINISHED: 86400
+ZENML_SERVER_MAX_CONCURRENT_TEMPLATE_RUNS: 5
+```
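
The bucket value in this example combines a bucket name and a key prefix in one S3 URI, whereas IAM policies refer to the bucket by ARN. The sketch below shows one way to derive the two ARNs from the URI; the helper names are hypothetical, and scoping object access to the prefix is an assumption rather than a documented requirement.

```python
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> tuple:
    """Split an s3://bucket/prefix URI into (bucket, prefix)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"expected an s3://bucket[/prefix] URI, got {uri!r}")
    return parsed.netloc, parsed.path.strip("/")


def s3_policy_resources(uri: str) -> list:
    """Return the bucket ARN and the object ARN pattern for an S3 log location."""
    bucket, prefix = split_s3_uri(uri)
    bucket_arn = f"arn:aws:s3:::{bucket}"
    objects_arn = f"{bucket_arn}/{prefix}/*" if prefix else f"{bucket_arn}/*"
    return [bucket_arn, objects_arn]


print(s3_policy_resources("s3://your-bucket/zenml-logs"))
```

The bucket ARN is the resource for `s3:ListBucket`, while the object pattern is the resource for `s3:PutObject` and `s3:GetObject`.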
+
+**Requirements:**
+- EKS cluster
+- IAM role for the workspace with permissions to access EKS, ECR, and S3
+- Docker registry (ECR) for storing runner images
+- S3 bucket for exporting logs
+
+**How it works:**
+- The workspace assumes an IAM role to access AWS services
+- Runner images are stored and pulled from ECR
+- Pod permissions are managed through IAM roles for service accounts (IRSA)
+- Logs are streamed to S3 for long-term retention and analysis
+
+**Use cases:**
+- AWS-centric environments with EKS
+- Need for image building and custom runner management
+- Centralized log aggregation in S3
+- Fine-grained IAM-based access control
+
+### 3. GCP Kubernetes Workload Manager
+
+Similar to the AWS implementation, but integrated with GCP services (GCR for images, Cloud Logging for logs).
+
+**Environment Variables:**
+
+```yaml
+ZENML_SERVER_WORKLOAD_MANAGER_IMPLEMENTATION_SOURCE: zenml_cloud_plugins.gcp_kubernetes_workload_manager.GCPKubernetesWorkloadManager
+ZENML_KUBERNETES_WORKLOAD_MANAGER_NAMESPACE: zenml-workload-manager
+ZENML_KUBERNETES_WORKLOAD_MANAGER_SERVICE_ACCOUNT: zenml-runner
+ZENML_KUBERNETES_WORKLOAD_MANAGER_BUILD_RUNNER_IMAGE: "true"
+ZENML_KUBERNETES_WORKLOAD_MANAGER_DOCKER_REGISTRY:
+ZENML_KUBERNETES_WORKLOAD_MANAGER_POD_RESOURCES: '{"requests": {"cpu": "500m", "memory": "512Mi"}, "limits": {"cpu": "2000m", "memory": "2Gi"}}'
+ZENML_KUBERNETES_WORKLOAD_MANAGER_TTL_SECONDS_AFTER_FINISHED: 86400
+ZENML_SERVER_MAX_CONCURRENT_TEMPLATE_RUNS: 5
+```
+
+**Requirements:**
+- GKE cluster
+- Service account with permissions to access GKE, GCR, and Cloud Logging
+- Docker registry (GCR) for storing runner images
+
+**How it works:**
+- The workspace authenticates to GCP using a service account
+- Runner images are stored and pulled from GCR
+- Pod permissions are managed through Workload Identity
+- Logs are automatically sent to Cloud Logging
+
+**Use cases:**
+- GCP-centric environments with GKE
+- Leveraging GCP's managed services and Cloud Logging
+- Integration with Google Cloud monitoring and observability tools
+
+## IAM Permissions and Service Accounts
+
+Proper permission configuration is critical for workload managers to function correctly. The ZenML Pro workspace needs sufficient permissions to create and manage runner pods without being overly permissive.
+
+### Kubernetes Service Account
+
+For Kubernetes-based implementations, the workspace uses a Kubernetes service account to interact with your cluster.
+
+**Required RBAC permissions:**
+- Create pods in the designated namespace
+- Get/list pods (for monitoring runner status)
+- Delete pods (for cleanup after runs complete)
+- Get pod logs
+- Create, get, patch, delete persistent volume claims (if using persistent storage)
+- Get secrets in the namespace (for accessing runner credentials)
+
+**Example RBAC role:**
+
+```yaml
+apiVersion: rbac.authorization.k8s.io/v1
+kind: Role
+metadata:
+  name: zenml-workload-manager
+  namespace: zenml-workload-manager
+rules:
+- apiGroups: [""]
+  resources: ["pods"]
+  verbs: ["create", "get", "list", "delete", "patch"]
+- apiGroups: [""]
+  resources: ["pods/log"]
+  verbs: ["get"]
+- apiGroups: [""]
+  resources: ["secrets"]
+  verbs: ["get"]
+- apiGroups: [""]
+  resources: ["persistentvolumeclaims"]
+  verbs: ["create", "get", "patch", "delete"]
+```
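
To compare a Role against the required permissions before applying it, the `rules` list can be checked offline. This is a simplified sketch under stated assumptions: the function name is hypothetical, API groups and `resourceNames` are ignored, and the persistent volume claim entries are left out for brevity. Note that Kubernetes names the log subresource `pods/log`.

```python
# (resource, verb) pairs taken from the "Required RBAC permissions" list above.
REQUIRED = [
    ("pods", "create"), ("pods", "get"), ("pods", "list"), ("pods", "delete"),
    ("pods/log", "get"),
    ("secrets", "get"),
]


def missing_permissions(rules: list, required: list) -> list:
    """Return the required (resource, verb) pairs that no rule grants."""
    granted = set()
    for rule in rules:
        for resource in rule.get("resources", []):
            for verb in rule.get("verbs", []):
                granted.add((resource, verb))
    return [
        (resource, verb)
        for resource, verb in required
        if (resource, verb) not in granted and (resource, "*") not in granted
    ]


# Hypothetical, deliberately incomplete rules in the same shape as `rules:` in a Role.
incomplete_rules = [
    {"apiGroups": [""], "resources": ["pods"], "verbs": ["create", "get", "list"]},
    {"apiGroups": [""], "resources": ["secrets"], "verbs": ["get"]},
]
print(missing_permissions(incomplete_rules, REQUIRED))
```

Here the check reports that pod deletion and log access are missing. It complements, rather than replaces, testing the bound service account against a live cluster.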
+
+### AWS IAM Role
+
+For AWS-based implementations, the ZenML Pro workspace container needs an IAM role (typically via IRSA—IAM roles for service accounts) to access EKS and related AWS services.
+
+**Required permissions:**
+
+**EKS cluster access:**
+- `eks:DescribeCluster` - Retrieve cluster details
+- `eks:ListClusters` - List available clusters
+
+**Pod creation and management (via Kubernetes API using IRSA):**
+- The IAM role must be associated with a Kubernetes service account
+- The role is assumed by pods running under that service account
+- This allows the ZenML workspace to access the Kubernetes API
+
+**ECR (if building images):**
+- `ecr:DescribeRepositories` - List image repositories
+- `ecr:BatchCheckLayerAvailability` - Check for existing layers
+- `ecr:GetDownloadUrlForLayer` - Download layer data
+- `ecr:BatchGetImage` - Retrieve images
+- `ecr:PutImage` - Push images to registry
+
+**S3 (for log export):**
+- `s3:PutObject` - Write logs to bucket
+- `s3:GetObject` - Read logs from bucket
+- `s3:ListBucket` - List log files
+
+**Example IAM policy:**
+
+```json
+{
+ "Version": "2012-10-17",
+ "Statement": [
+ {
+ "Effect": "Allow",
+ "Action": [
+ "eks:DescribeCluster",
+ "eks:ListClusters"
+ ],
+ "Resource": "*"
+ },
+ {
+ "Effect": "Allow",
+ "Action": [
+ "ecr:BatchCheckLayerAvailability",
+ "ecr:GetDownloadUrlForLayer",
+ "ecr:BatchGetImage",
+ "ecr:PutImage",
+ "ecr:DescribeRepositories"
+ ],
+ "Resource": "arn:aws:ecr:region:account:repository/zenml-*"
+ },
+ {
+ "Effect": "Allow",
+ "Action": [
+ "s3:PutObject",
+ "s3:GetObject",
+ "s3:ListBucket"
+ ],
+ "Resource": [
+ "arn:aws:s3:::your-log-bucket",
+ "arn:aws:s3:::your-log-bucket/*"
+ ]
+ }
+ ]
+}
+```
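
A policy document can be compared against the required actions listed above before it is attached. The sketch below is a simplified check, not an IAM evaluator: the function name is hypothetical, the embedded policy is a trimmed sample containing only the S3 statement, and `Resource`, `Condition`, and `Deny` statements are ignored.

```python
import json
from fnmatch import fnmatchcase

# Trimmed sample policy: only the S3 statement from the example above.
POLICY_JSON = """
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject", "s3:ListBucket"],
      "Resource": ["arn:aws:s3:::your-log-bucket", "arn:aws:s3:::your-log-bucket/*"]
    }
  ]
}
"""


def uncovered_actions(policy: dict, required: list) -> list:
    """Return required actions that no Allow statement names (wildcards honored)."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be given as an object
        statements = [statements]
    patterns = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        patterns.extend([actions] if isinstance(actions, str) else actions)
    # IAM action names are case-insensitive, so compare in lower case.
    return [
        action
        for action in required
        if not any(fnmatchcase(action.lower(), p.lower()) for p in patterns)
    ]


policy = json.loads(POLICY_JSON)
print(uncovered_actions(policy, ["s3:PutObject", "s3:ListBucket", "ecr:PutImage"]))
```

With the trimmed sample, only `ecr:PutImage` is reported as uncovered, because the sample carries no ECR statement.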
+
+### GCP Service Account
+
+For GCP-based implementations, the ZenML Pro workspace uses a GCP service account with appropriate roles.
+
+**Required roles:**
+- `roles/container.developer` - Access to create and manage pods in GKE
+- `roles/storage.admin` (or more restrictive) - Access to GCR for image operations
+- `roles/logging.logWriter` - Write logs to Cloud Logging
+
+**Permissions by service:**
+
+**GKE pod management:**
+- `container.operations.create`
+- `container.operations.get`
+- `container.pods.create`
+- `container.pods.get`
+- `container.pods.list`
+- `container.pods.delete`
+
+**GCR image access:**
+- `storage.buckets.get`
+- `storage.objects.create`
+- `storage.objects.get`
+- `storage.objects.list`
+- `storage.objects.delete`
+
+**Cloud Logging:**
+- `logging.logEntries.create`
+
+## General Considerations
+
+When configuring workload managers, keep these factors in mind:
+
+### Network Connectivity
+
+- **Egress from workspace to Kubernetes API**: The ZenML Pro workspace must have network access to your Kubernetes cluster's API server (typically port 443 on managed services such as EKS, GKE, and AKS; 6443 on many self-managed clusters)
+- **Egress from runners to workspace**: Runner pods must have network access to your ZenML workspace (cloud.zenml.io for SaaS, your custom domain for Hybrid/Self-hosted, port 443)
+- **Artifact storage access**: Runners need network access to your artifact store (S3, GCS, Azure Blob, local NFS, etc.)
+- **Metadata backend access**: Runners need to reach your database for metadata operations
+- **Container registry access**: Runners need to pull images from your container registry
+
+For Self-hosted deployments, ensure all dependencies are available internally:
+- Private container registry with runner images
+- Internal artifact storage accessible from runners
+- Internal database (no external connectivity required)
+- Kubernetes API accessible from the workspace container
+
+### Resource Configuration
+
+Configure appropriate resources for runner pods:
+
+- **CPU requests/limits**: Depends on pipeline complexity; start with 500m requests and 2000m limits, adjust based on workload profiling
+- **Memory requests/limits**: Typical range is 512Mi to 2Gi; larger for data-intensive workloads
+- **Ephemeral storage**: Consider temporary storage for intermediate pipeline data
+- **Pod disruption budget**: For production deployments, define minimum available pods to prevent service disruption
+
+The `ZENML_KUBERNETES_WORKLOAD_MANAGER_POD_RESOURCES` environment variable controls these settings for all runner pods.
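
Kubernetes rejects a pod whose request exceeds its limit, so it can help to sanity-check the JSON value before deploying. The helper names below are hypothetical, and the parser is a partial sketch: it handles only millicore CPU values and binary memory suffixes (`Ki`, `Mi`, `Gi`, `Ti`), not the full Kubernetes quantity grammar.

```python
import json

_BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}


def cpu_cores(quantity: str) -> float:
    """Convert a CPU quantity such as "500m" or "2" to cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def memory_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "512Mi" or "2Gi" to bytes."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: plain byte count


def requests_within_limits(pod_resources_json: str) -> bool:
    """Check that CPU and memory requests do not exceed their limits."""
    resources = json.loads(pod_resources_json)
    requests, limits = resources["requests"], resources["limits"]
    return (
        cpu_cores(requests["cpu"]) <= cpu_cores(limits["cpu"])
        and memory_bytes(requests["memory"]) <= memory_bytes(limits["memory"])
    )


EXAMPLE = (
    '{"requests": {"cpu": "500m", "memory": "512Mi"}, '
    '"limits": {"cpu": "2000m", "memory": "2Gi"}}'
)
print(requests_within_limits(EXAMPLE))
```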
+
+### Image Management
+
+Runner pods need access to container images:
+
+- **Pre-built images**: ZenML provides official runner images in its public ECR registry (715803424590.dkr.ecr.eu-central-1.amazonaws.com)
+- **Custom images**: For Self-hosted setups, pull images into your private registry before deployment
+- **Image pull secrets**: Configure if your registry requires authentication
+- **Regular updates**: Keep runner images up-to-date for security and compatibility
+- **Image building**: For AWS and GCP implementations, set `ZENML_KUBERNETES_WORKLOAD_MANAGER_BUILD_RUNNER_IMAGE: "true"` to allow the workspace to build custom images
+
+### Logging and Observability
+
+- **Log collection**: Logs can be streamed to S3 (AWS), Cloud Logging (GCP), or local storage
+- **Monitoring**: Use your infrastructure's native monitoring (CloudWatch, Cloud Monitoring, Prometheus)
+- **Pod events**: Kubernetes events track pod creation, scheduling, and termination
+- **Execution tracing**: ZenML captures step-level execution metadata for debugging
+- **Enable external logs**: Set `ZENML_KUBERNETES_WORKLOAD_MANAGER_ENABLE_EXTERNAL_LOGS: "true"` for the AWS implementation
+
+### Isolation and Security
+
+- **Namespace isolation**: Use dedicated namespaces (e.g., `zenml-workload-manager`) to separate runner pods from other workloads
+- **Network policies**: Apply network policies to restrict pod communication
+- **Secret management**: Use Kubernetes secrets or cloud-native secret managers for runner credentials
+- **Service account scoping**: Limit workspace permissions to only what's needed for runner management
+- **Image scanning**: Scan runner images for vulnerabilities before deployment
+- **RBAC enforcement**: Ensure Kubernetes RBAC policies prevent unauthorized pod creation
+
+### Scaling and Concurrency
+
+Configure limits to prevent resource exhaustion:
+
+- **Concurrent runs**: Set `ZENML_SERVER_MAX_CONCURRENT_TEMPLATE_RUNS` to limit simultaneous executions (typical: 2-10 depending on runner resources and cluster capacity)
+- **TTL for completed pods**: Clean up finished pods automatically using `ZENML_KUBERNETES_WORKLOAD_MANAGER_TTL_SECONDS_AFTER_FINISHED` (e.g., 86400 seconds = 24 hours)
+- **Pod disruption budgets**: For HA setups, define minimum available pods to ensure service continuity
+- **Horizontal Pod Autoscaler (HPA)**: For the ZenML workspace itself (not runners), consider HPA if handling many concurrent run submissions
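
The concurrency limit and the per-pod limits together bound how much capacity runner pods can claim at once. The arithmetic below is a sketch that assumes one runner pod per concurrent run, each allowed to reach its configured limits; the function names are hypothetical, and pods launched by the pipeline's own orchestrator are not counted.

```python
def worst_case_demand(max_concurrent_runs: int, cpu_limit_cores: float,
                      memory_limit_gib: float) -> dict:
    """Upper bound on runner-pod usage if every concurrent run hits its limits."""
    return {
        "cpu_cores": max_concurrent_runs * cpu_limit_cores,
        "memory_gib": max_concurrent_runs * memory_limit_gib,
    }


def ttl_hours(ttl_seconds: int) -> float:
    """Convert a TTL in seconds to hours."""
    return ttl_seconds / 3600


# Values from the examples in this document: 5 runs, limits of 2000m CPU and 2Gi memory.
print(worst_case_demand(5, 2.0, 2.0))
print(ttl_hours(86400))
```

With five concurrent runs and the example limits, the namespace needs headroom for up to 10 cores and 10 GiB of memory, and finished pods linger for 24 hours before cleanup.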
+
+### Troubleshooting Common Issues
+
+**Pods fail to start:**
+- Check RBAC permissions for the service account: `kubectl auth can-i create pods --as=system:serviceaccount:zenml-workload-manager:zenml-runner -n zenml-workload-manager`
+- Verify image pull secrets if using private registries
+- Check resource availability (CPU, memory) in cluster
+- Review pod events: `kubectl describe pod -n zenml-workload-manager`
+- Check workspace logs for workload manager errors: `kubectl logs -n zenml-workspace deployment/zenml`
+
+**Logs not appearing:**
+- Verify workspace can reach artifact store and database
+- Check network connectivity between cluster and workspace
+- Ensure S3/Cloud Logging permissions are correct
+- Review pod logs for pipeline execution errors: `kubectl logs -n zenml-workload-manager`
+
+**Server can't reach cluster:**
+- Verify network connectivity to Kubernetes API server
+- Check credentials/RBAC permissions (especially for Hybrid deployments with OAuth2)
+- Confirm service account role bindings are in place
+- Test cluster connectivity: `kubectl cluster-info`
+
+**Runners can't reach workspace:**
+- Verify egress network policies allow outbound HTTPS (port 443)
+- Check firewall rules for ingress/egress to ZenML workspace
+- Confirm workspace URL is resolvable and reachable from pods
+- Test from pod: `kubectl run -it --rm debug --image=curlimages/curl --restart=Never -- curl https:///health`
+
+## Next Steps
+
+- [Set up workload managers in Hybrid deployments](hybrid-deployment-helm.md#step-7-optional-enable-snapshot-support--workload-manager)
+- [Configure workload managers in Self-hosted environments](self-hosted-deployment-helm.md#step-14-optional-enable-snapshot-support--workload-manager)
+- [Learn about pipeline snapshots](https://docs.zenml.io/concepts/snapshots)
+
+## Related Resources
+
+**Deployment & Infrastructure:**
+- [Deployment Scenarios Overview](deployments-overview.md) - Compare SaaS, Hybrid, and Self-hosted options
+- [Hybrid SaaS Deployment](hybrid-deployment.md) - Balance control with convenience
+- [Self-hosted Deployment](self-hosted-deployment.md) - Complete control and data sovereignty
+- [Self-hosted Deployment Guide](self-hosted.md) - Comprehensive deployment reference
+
+**Core Concepts:**
+- [Workspaces](workspaces.md) - Isolated environments for teams and projects
+- [Organizations](organization.md) - Top-level entity for managing users and teams
+- [Roles & Permissions](roles.md) - Control access to workload manager configuration
+
+