10 changes: 10 additions & 0 deletions charts/deepgram-self-hosted/CHANGELOG.md
@@ -10,6 +10,16 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),

- Exposed the ability to add custom TOML sections in api.toml and engine.toml via `customToml`
- Added `nodeSelector` support for all components (API, Engine, License Proxy) to allow scheduling pods on specific nodes.
- Added configurable service types for API, Engine, and License Proxy services with ClusterIP as the default
- Added support for service annotations when using LoadBalancer service type
- Added `loadBalancerSourceRanges` configuration for LoadBalancer services to restrict access to specific IP CIDR ranges
- Added `externalTrafficPolicy` configuration for LoadBalancer services to control traffic routing behavior
- Updated sample configurations to demonstrate service configuration options including LoadBalancer security settings

### Changed

- Changed default service type from NodePort to ClusterIP for all services (API external, Engine metrics, License Proxy status)
- Updated service templates to support configurable service types and annotations
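
For clusters that relied on the previous `NodePort` default, a values override along these lines should restore the old behavior after upgrading. This is a sketch based on the `service.type` keys introduced in this release, not a tested migration recipe, and the file name in the comment is hypothetical:

```yaml
# values-nodeport.yaml (hypothetical file name): opt back in to NodePort
api:
  service:
    type: NodePort  # API external service
engine:
  service:
    type: NodePort  # Engine metrics service
licenseProxy:
  service:
    type: NodePort  # License Proxy status service
```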

## [0.19.0] - 2025-09-12

69 changes: 69 additions & 0 deletions charts/deepgram-self-hosted/README.md
@@ -92,6 +92,60 @@ To configure a specific storage option, see the `engine.modelManager.volumes` [c

For detailed instructions on setting up and configuring each storage option, refer to the [Deepgram self-hosted guides](https://developers.deepgram.com/docs/kubernetes) and the respective cloud provider's documentation.

### Service Configuration

The Deepgram Helm chart lets you configure how the API, Engine, and License Proxy services are exposed. By default, all services use the `ClusterIP` type, which makes them reachable only from inside the cluster.

#### Service Types

- **ClusterIP** (default): Exposes the service on a cluster-internal IP, reachable only from within the cluster. Recommended for most deployments.
- **NodePort**: Exposes the service on each Node's IP at a static port. Useful for development or when you need direct node access.
- **LoadBalancer**: Exposes the service externally using a cloud provider's load balancer. Recommended for production deployments requiring external access.

#### Configuration Examples

**API Service with LoadBalancer (with security restrictions):**
```yaml
api:
  service:
    type: LoadBalancer
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
      service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
    loadBalancerSourceRanges:
      - "10.0.0.0/8"  # Allow access from private networks
      - "192.168.1.0/24"  # Allow access from specific subnet
    externalTrafficPolicy: "Local"  # Preserve source IP and reduce hops
```

**Engine Metrics Service with NodePort:**
```yaml
engine:
  service:
    type: NodePort
```

**License Proxy Service with LoadBalancer (restricted access):**
```yaml
licenseProxy:
  service:
    type: LoadBalancer
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    loadBalancerSourceRanges:
      - "10.0.0.0/8"  # Only allow internal network access
    externalTrafficPolicy: "Cluster"  # Allow traffic from any node
```

#### LoadBalancer Security Options

When using the `LoadBalancer` service type, you can configure additional security and performance options:

- **`loadBalancerSourceRanges`**: Restricts access to the listed IP CIDR ranges. Traffic from any other source address is rejected at the network level.
- **`externalTrafficPolicy`**: Controls how external traffic is routed:
  - `Cluster` (default): Traffic can be routed to any node in the cluster and is then forwarded to the target pod.
  - `Local`: Traffic is only routed to nodes running the target pod, which preserves client source IP addresses.
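
As a rough illustration, the API LoadBalancer example above should render to a Service fragment like the one below, judging from the chart's API service template. This is a hand-written sketch rather than captured `helm template` output: `metadata.name`, labels, the selector, and the remaining port fields are omitted, and the port assumes the default `api.server.port` of `8080`.

```yaml
# Approximate rendered fragment for the API LoadBalancer example (assumed, not captured output)
metadata:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
spec:
  type: LoadBalancer
  loadBalancerSourceRanges:
    - 10.0.0.0/8
    - 192.168.1.0/24
  externalTrafficPolicy: Local
  ports:
    - name: "primary"
      port: 8080
```

The service templates only emit `annotations`, `loadBalancerSourceRanges`, and `externalTrafficPolicy` when `type` is `LoadBalancer`, so setting them alongside `ClusterIP` or `NodePort` has no effect.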

### Autoscaling

Autoscaling your cluster's capacity to meet incoming traffic demands involves both node autoscaling and pod autoscaling. Node autoscaling for supported cloud providers is set up by default when using this Helm chart and creating your cluster with the [Deepgram self-hosted guides](https://developers.deepgram.com/docs/kubernetes). Pod autoscaling can be enabled via the `scaling.auto.enabled` configuration option in this chart.
@@ -230,6 +284,11 @@ If you encounter issues while deploying or using Deepgram, consider the followin
| api.server.fetchTimeout | string | `"60s"` | fetchTimeout configures how long to wait for a response from a fetch URL. The value should be a humantime duration. A fetch URL is a URL passed in an inference request from which a payload should be downloaded. |
| api.server.host | string | `"0.0.0.0"` | host is the IP address to listen on. You will want to listen on all interfaces to interact with other pods in the cluster. |
| api.server.port | int | `8080` | port to listen on. |
| api.service | object | `` | Service configuration for the API external service |
| api.service.annotations | object | `` | Additional annotations to add to the service when type is LoadBalancer |
| api.service.externalTrafficPolicy | string | `` | External traffic policy for the LoadBalancer service. Options: Cluster, Local. Only applies when the service type is LoadBalancer. |
| api.service.loadBalancerSourceRanges | list | `` | List of IP CIDR ranges allowed to access the LoadBalancer service. Only applies when the service type is LoadBalancer. |
| api.service.type | string | `ClusterIP` | Service type for the API external service. Options: ClusterIP, NodePort, LoadBalancer |
| api.serviceAccount.create | bool | `true` | Specifies whether to create a default service account for the Deepgram API Deployment. |
| api.serviceAccount.name | string | `nil` | Allows providing a custom service account name for the API component. If left empty, the default service account name will be used. If specified, and `api.serviceAccount.create = true`, this defines the name of the default service account. If specified, and `api.serviceAccount.create = false`, this provides the name of a preconfigured service account you wish to attach to the API deployment. |
| api.tolerations | list | `[]` | [Tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/) to apply to API pods. |
@@ -290,6 +349,11 @@ If you encounter issues while deploying or using Deepgram, consider the followin
| engine.server | object | `` | Configure Engine containers to listen for requests from API containers. |
| engine.server.host | string | `"0.0.0.0"` | host is the IP address to listen on for inference requests. You will want to listen on all interfaces to interact with other pods in the cluster. |
| engine.server.port | int | `8080` | port to listen on for inference requests |
| engine.service | object | `` | Service configuration for the Engine metrics service |
| engine.service.annotations | object | `` | Additional annotations to add to the service when type is LoadBalancer |
| engine.service.externalTrafficPolicy | string | `` | External traffic policy for the LoadBalancer service. Options: Cluster, Local. Only applies when the service type is LoadBalancer. |
| engine.service.loadBalancerSourceRanges | list | `` | List of IP CIDR ranges allowed to access the LoadBalancer service. Only applies when the service type is LoadBalancer. |
| engine.service.type | string | `ClusterIP` | Service type for the Engine metrics service. Options: ClusterIP, NodePort, LoadBalancer |
| engine.serviceAccount.create | bool | `true` | Specifies whether to create a default service account for the Deepgram Engine Deployment. |
| engine.serviceAccount.name | string | `nil` | Allows providing a custom service account name for the Engine component. If left empty, the default service account name will be used. If specified, and `engine.serviceAccount.create = true`, this defines the name of the default service account. If specified, and `engine.serviceAccount.create = false`, this provides the name of a preconfigured service account you wish to attach to the Engine deployment. |
| engine.startupProbe | object | `` | The startupProbe combination of `periodSeconds` and `failureThreshold` allows time for the container to load all models and start listening for incoming requests. Model load time can be affected by hardware I/O speeds, as well as network speeds if you are using a network volume mount for the models. If you are hitting the failure threshold before models are finished loading, you may want to extend the startup probe. However, this will also extend the time it takes to detect a pod that can't establish a network connection to validate its license. |
@@ -331,6 +395,11 @@ If you encounter issues while deploying or using Deepgram, consider the followin
| licenseProxy.server.host | string | `"0.0.0.0"` | host is the IP address to listen on. You will want to listen on all interfaces to interact with other pods in the cluster. |
| licenseProxy.server.port | int | `8443` | port to listen on. |
| licenseProxy.server.statusPort | int | `8080` | statusPort is the port to listen on for the status/health endpoint. |
| licenseProxy.service | object | `` | Service configuration for the License Proxy status service |
| licenseProxy.service.annotations | object | `` | Additional annotations to add to the service when type is LoadBalancer |
| licenseProxy.service.externalTrafficPolicy | string | `` | External traffic policy for the LoadBalancer service. Options: Cluster, Local. Only applies when the service type is LoadBalancer. |
| licenseProxy.service.loadBalancerSourceRanges | list | `` | List of IP CIDR ranges allowed to access the LoadBalancer service. Only applies when the service type is LoadBalancer. |
| licenseProxy.service.type | string | `ClusterIP` | Service type for the License Proxy status service. Options: ClusterIP, NodePort, LoadBalancer |
| licenseProxy.serviceAccount.create | bool | `true` | Specifies whether to create a default service account for the Deepgram License Proxy Deployment. |
| licenseProxy.serviceAccount.name | string | `nil` | Allows providing a custom service account name for the LicenseProxy component. If left empty, the default service account name will be used. If specified, and `licenseProxy.serviceAccount.create = true`, this defines the name of the default service account. If specified, and `licenseProxy.serviceAccount.create = false`, this provides the name of a preconfigured service account you wish to attach to the License Proxy deployment. |
| licenseProxy.tolerations | list | `[]` | [Tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/) to apply to License Proxy pods. |
54 changes: 54 additions & 0 deletions charts/deepgram-self-hosted/README.md.gotmpl
@@ -82,6 +82,60 @@ To configure a specific storage option, see the `engine.modelManager.volumes` [c

For detailed instructions on setting up and configuring each storage option, refer to the [Deepgram self-hosted guides](https://developers.deepgram.com/docs/kubernetes) and the respective cloud provider's documentation.

### Service Configuration

The Deepgram Helm chart lets you configure how the API, Engine, and License Proxy services are exposed. By default, all services use the `ClusterIP` type, which makes them reachable only from inside the cluster.

#### Service Types

- **ClusterIP** (default): Exposes the service on a cluster-internal IP, reachable only from within the cluster. Recommended for most deployments.
- **NodePort**: Exposes the service on each Node's IP at a static port. Useful for development or when you need direct node access.
- **LoadBalancer**: Exposes the service externally using a cloud provider's load balancer. Recommended for production deployments requiring external access.

#### Configuration Examples

**API Service with LoadBalancer (with security restrictions):**
```yaml
api:
  service:
    type: LoadBalancer
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
      service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
    loadBalancerSourceRanges:
      - "10.0.0.0/8"  # Allow access from private networks
      - "192.168.1.0/24"  # Allow access from specific subnet
    externalTrafficPolicy: "Local"  # Preserve source IP and reduce hops
```

**Engine Metrics Service with NodePort:**
```yaml
engine:
  service:
    type: NodePort
```

**License Proxy Service with LoadBalancer (restricted access):**
```yaml
licenseProxy:
  service:
    type: LoadBalancer
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    loadBalancerSourceRanges:
      - "10.0.0.0/8"  # Only allow internal network access
    externalTrafficPolicy: "Cluster"  # Allow traffic from any node
```

#### LoadBalancer Security Options

When using the `LoadBalancer` service type, you can configure additional security and performance options:

- **`loadBalancerSourceRanges`**: Restricts access to the listed IP CIDR ranges. Traffic from any other source address is rejected at the network level.
- **`externalTrafficPolicy`**: Controls how external traffic is routed:
  - `Cluster` (default): Traffic can be routed to any node in the cluster and is then forwarded to the target pod.
  - `Local`: Traffic is only routed to nodes running the target pod, which preserves client source IP addresses.

### Autoscaling

Autoscaling your cluster's capacity to meet incoming traffic demands involves both node autoscaling and pod autoscaling. Node autoscaling for supported cloud providers is set up by default when using this Helm chart and creating your cluster with the [Deepgram self-hosted guides](https://developers.deepgram.com/docs/kubernetes). Pod autoscaling can be enabled via the `scaling.auto.enabled` configuration option in this chart.
@@ -69,6 +69,22 @@ api:
    limits:
      memory: "8Gi"
      cpu: "4000m"

  # -- Service configuration for the API external service
  # Uncomment and modify the service type as needed for your deployment
  # service:
  #   # Options: ClusterIP (default), NodePort, LoadBalancer
  #   type: LoadBalancer
  #   # Add annotations when using LoadBalancer type (e.g., for AWS ELB configuration)
  #   annotations:
  #     service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
  #     service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
  #   # Restrict access to specific IP ranges (optional)
  #   loadBalancerSourceRanges:
  #     - "10.0.0.0/8"  # Allow access from private networks
  #     - "192.168.1.0/24"  # Allow access from specific subnet
  #   # External traffic policy: Cluster (default) or Local
  #   externalTrafficPolicy: "Local"  # Preserve source IP and reduce hops

  # -- Custom TOML sections can be added here to extend api.toml
  # customToml: |
17 changes: 16 additions & 1 deletion charts/deepgram-self-hosted/templates/api/api.service.yaml
@@ -7,11 +7,26 @@ metadata:
    {{- range $key, $val := .Values.engine.additionalLabels }}
    {{ $key }}: {{ $val | quote }}
    {{- end}}
  {{- if and (eq .Values.api.service.type "LoadBalancer") .Values.api.service.annotations }}
  annotations:
    {{- range $key, $val := .Values.api.service.annotations }}
    {{ $key }}: {{ $val | quote }}
    {{- end}}
  {{- end }}
spec:
  selector:
    app: deepgram-api
    {{ include "deepgram-self-hosted.selectorLabels" . }}
-  type: NodePort
+  type: {{ .Values.api.service.type | default "ClusterIP" }}
  {{- if and (eq .Values.api.service.type "LoadBalancer") .Values.api.service.loadBalancerSourceRanges }}
  loadBalancerSourceRanges:
    {{- range .Values.api.service.loadBalancerSourceRanges }}
    - {{ . }}
    {{- end }}
  {{- end }}
  {{- if and (eq .Values.api.service.type "LoadBalancer") .Values.api.service.externalTrafficPolicy }}
  externalTrafficPolicy: {{ .Values.api.service.externalTrafficPolicy }}
  {{- end }}
  ports:
    - name: "primary"
      port: {{ .Values.api.server.port }}
@@ -19,14 +19,29 @@ metadata:
    {{- range $key, $val := $.Values.engine.additionalLabels }}
    {{ $key }}: {{ $val | quote }}
    {{- end}}
  {{- if and (eq $.Values.engine.service.type "LoadBalancer") $.Values.engine.service.annotations }}
  annotations:
    {{- range $key, $val := $.Values.engine.service.annotations }}
    {{ $key }}: {{ $val | quote }}
    {{- end}}
  {{- end }}
spec:
  selector:
    app: deepgram-engine
    {{- if $type }}
    engine-type: {{ $type }}
    {{- end }}
    {{ include "deepgram-self-hosted.selectorLabels" $ }}
-  type: NodePort
+  type: {{ $.Values.engine.service.type | default "ClusterIP" }}
  {{- if and (eq $.Values.engine.service.type "LoadBalancer") $.Values.engine.service.loadBalancerSourceRanges }}
  loadBalancerSourceRanges:
    {{- range $.Values.engine.service.loadBalancerSourceRanges }}
    - {{ . }}
    {{- end }}
  {{- end }}
  {{- if and (eq $.Values.engine.service.type "LoadBalancer") $.Values.engine.service.externalTrafficPolicy }}
  externalTrafficPolicy: {{ $.Values.engine.service.externalTrafficPolicy }}
  {{- end }}
  ports:
    - name: "metrics"
      port: {{ $.Values.engine.metricsServer.port }}
@@ -53,6 +68,7 @@ spec:
    engine-type: {{ $type }}
    {{- end }}
    {{ include "deepgram-self-hosted.selectorLabels" $ }}
  type: ClusterIP
  ports:
    - name: "primary"
      port: {{ $.Values.engine.server.port }}
@@ -8,11 +8,26 @@ metadata:
    {{- range $key, $val := .Values.licenseProxy.additionalLabels }}
    {{ $key }}: {{ $val | quote }}
    {{- end}}
  {{- if and (eq .Values.licenseProxy.service.type "LoadBalancer") .Values.licenseProxy.service.annotations }}
  annotations:
    {{- range $key, $val := .Values.licenseProxy.service.annotations }}
    {{ $key }}: {{ $val | quote }}
    {{- end}}
  {{- end }}
spec:
  selector:
    app: deepgram-license-proxy
    {{ include "deepgram-self-hosted.selectorLabels" . }}
-  type: NodePort
+  type: {{ .Values.licenseProxy.service.type | default "ClusterIP" }}
  {{- if and (eq .Values.licenseProxy.service.type "LoadBalancer") .Values.licenseProxy.service.loadBalancerSourceRanges }}
  loadBalancerSourceRanges:
    {{- range .Values.licenseProxy.service.loadBalancerSourceRanges }}
    - {{ . }}
    {{- end }}
  {{- end }}
  {{- if and (eq .Values.licenseProxy.service.type "LoadBalancer") .Values.licenseProxy.service.externalTrafficPolicy }}
  externalTrafficPolicy: {{ .Values.licenseProxy.service.externalTrafficPolicy }}
  {{- end }}
  ports:
    - name: "status"
      port: {{ .Values.licenseProxy.server.statusPort }}
@@ -33,6 +48,7 @@ spec:
  selector:
    app: deepgram-license-proxy
    {{ include "deepgram-self-hosted.selectorLabels" . }}
  type: ClusterIP
  ports:
    - name: "primary"
      port: {{ .Values.licenseProxy.server.port }}