charts/deepgram-self-hosted/README.md
15 additions & 12 deletions
@@ -284,12 +284,13 @@ If you encounter issues while deploying or using Deepgram, consider the followin
284
284
| api.server.fetchTimeout | string | `"60s"` | fetchTimeout configures how long to wait for a response from a fetch URL. The value should be a humantime duration. A fetch URL is a URL passed in an inference request from which a payload should be downloaded. |
285
285
| api.server.host | string | `"0.0.0.0"` | host is the IP address to listen on. You will want to listen on all interfaces to interact with other pods in the cluster. |
286
286
| api.server.port | int | `8080` | port to listen on. |
287
+
| api.service | object | `` | Service configuration for the API external service |
288
+
| api.service.annotations | object | `` | Additional annotations to add to the service when type is LoadBalancer |
289
+
| api.service.externalTrafficPolicy | string | `` | External traffic policy for LoadBalancer service. Options: Cluster, Local. Only applies when service type is LoadBalancer. |
290
+
| api.service.loadBalancerSourceRanges | list | `` | List of IP CIDR ranges allowed to access the LoadBalancer service. Only applies when service type is LoadBalancer. |
291
+
| api.service.type | string | `ClusterIP` | Service type for the API external service. Options: ClusterIP, NodePort, LoadBalancer |
287
292
| api.serviceAccount.create | bool | `true` | Specifies whether to create a default service account for the Deepgram API Deployment. |
288
293
| api.serviceAccount.name | string | `nil` | Allows providing a custom service account name for the API component. If left empty, the default service account name will be used. If specified, and `api.serviceAccount.create = true`, this defines the name of the default service account. If specified, and `api.serviceAccount.create = false`, this provides the name of a preconfigured service account you wish to attach to the API deployment. |
289
-
| api.service.annotations | object | `{}` | Additional annotations to add to the service when type is LoadBalancer |
290
-
| api.service.externalTrafficPolicy | string | `""` | External traffic policy for LoadBalancer service. Options: Cluster, Local |
291
-
| api.service.loadBalancerSourceRanges | list | `[]` | List of IP CIDR ranges allowed to access the LoadBalancer service |
292
-
| api.service.type | string | `"ClusterIP"` | Service type for the API external service. Options: ClusterIP, NodePort, LoadBalancer |
293
294
| api.tolerations | list | `[]` | [Tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/) to apply to API pods. |
294
295
| api.updateStrategy.rollingUpdate.maxSurge | int | `1` | The maximum number of extra API pods that can be created during a rollingUpdate, relative to the number of replicas. See the [Kubernetes documentation](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#max-surge) for more details. |
295
296
| api.updateStrategy.rollingUpdate.maxUnavailable | int | `0` | The maximum number of API pods, relative to the number of replicas, that can go offline during a rolling update. See the [Kubernetes documentation](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#max-unavailable) for more details. |
@@ -348,12 +349,13 @@ If you encounter issues while deploying or using Deepgram, consider the followin
348
349
| engine.server | object |``| Configure Engine containers to listen for requests from API containers. |
349
350
| engine.server.host | string | `"0.0.0.0"` | host is the IP address to listen on for inference requests. You will want to listen on all interfaces to interact with other pods in the cluster. |
350
351
| engine.server.port | int |`8080`| port to listen on for inference requests |
352
+
| engine.service | object |``| Service configuration for the Engine metrics service |
353
+
| engine.service.annotations | object |``| Additional annotations to add to the service when type is LoadBalancer |
354
+
| engine.service.externalTrafficPolicy | string | `` | External traffic policy for LoadBalancer service. Options: Cluster, Local. Only applies when service type is LoadBalancer. |
355
+
| engine.service.loadBalancerSourceRanges | list | `` | List of IP CIDR ranges allowed to access the LoadBalancer service. Only applies when service type is LoadBalancer. |
356
+
| engine.service.type | string | `ClusterIP` | Service type for the Engine metrics service. Options: ClusterIP, NodePort, LoadBalancer |
351
357
| engine.serviceAccount.create | bool |`true`| Specifies whether to create a default service account for the Deepgram Engine Deployment. |
352
358
| engine.serviceAccount.name | string |`nil`| Allows providing a custom service account name for the Engine component. If left empty, the default service account name will be used. If specified, and `engine.serviceAccount.create = true`, this defines the name of the default service account. If specified, and `engine.serviceAccount.create = false`, this provides the name of a preconfigured service account you wish to attach to the Engine deployment. |
353
-
| engine.service.annotations | object |`{}`| Additional annotations to add to the service when type is LoadBalancer |
354
-
| engine.service.externalTrafficPolicy | string |`""`| External traffic policy for LoadBalancer service. Options: Cluster, Local |
355
-
| engine.service.loadBalancerSourceRanges | list |`[]`| List of IP CIDR ranges allowed to access the LoadBalancer service |
356
-
| engine.service.type | string | `"ClusterIP"` | Service type for the Engine metrics service. Options: ClusterIP, NodePort, LoadBalancer |
357
359
| engine.startupProbe | object | `` | The startupProbe combination of `periodSeconds` and `failureThreshold` allows time for the container to load all models and start listening for incoming requests. Model load time can be affected by hardware I/O speeds, as well as network speeds if you are using a network volume mount for the models. If you are hitting the failure threshold before models are finished loading, you may want to extend the startup probe. However, this will also extend the time it takes to detect a pod that can't establish a network connection to validate its license. |
358
360
| engine.startupProbe.failureThreshold | int | `60` | failureThreshold defines how many unsuccessful startup probe attempts are allowed before the container will be marked as Failed |
359
361
| engine.startupProbe.periodSeconds | int | `10` | periodSeconds defines how often to execute the probe. |
@@ -393,12 +395,13 @@ If you encounter issues while deploying or using Deepgram, consider the followin
393
395
| licenseProxy.server.host | string | `"0.0.0.0"` | host is the IP address to listen on. You will want to listen on all interfaces to interact with other pods in the cluster. |
394
396
| licenseProxy.server.port | int | `8443` | port to listen on. |
395
397
| licenseProxy.server.statusPort | int | `8080` | statusPort is the port to listen on for the status/health endpoint. |
398
+
| licenseProxy.service | object | `` | Service configuration for the License Proxy status service |
399
+
| licenseProxy.service.annotations | object | `` | Additional annotations to add to the service when type is LoadBalancer |
400
+
| licenseProxy.service.externalTrafficPolicy | string | `` | External traffic policy for LoadBalancer service. Options: Cluster, Local. Only applies when service type is LoadBalancer. |
401
+
| licenseProxy.service.loadBalancerSourceRanges | list | `` | List of IP CIDR ranges allowed to access the LoadBalancer service. Only applies when service type is LoadBalancer. |
402
+
| licenseProxy.service.type | string | `ClusterIP` | Service type for the License Proxy status service. Options: ClusterIP, NodePort, LoadBalancer |
396
403
| licenseProxy.serviceAccount.create | bool | `true` | Specifies whether to create a default service account for the Deepgram License Proxy Deployment. |
397
404
| licenseProxy.serviceAccount.name | string | `nil` | Allows providing a custom service account name for the LicenseProxy component. If left empty, the default service account name will be used. If specified, and `licenseProxy.serviceAccount.create = true`, this defines the name of the default service account. If specified, and `licenseProxy.serviceAccount.create = false`, this provides the name of a preconfigured service account you wish to attach to the License Proxy deployment. |
398
-
| licenseProxy.service.annotations | object | `{}` | Additional annotations to add to the service when type is LoadBalancer |
399
-
| licenseProxy.service.externalTrafficPolicy | string | `""` | External traffic policy for LoadBalancer service. Options: Cluster, Local |
400
-
| licenseProxy.service.loadBalancerSourceRanges | list | `[]` | List of IP CIDR ranges allowed to access the LoadBalancer service |
401
-
| licenseProxy.service.type | string | `"ClusterIP"` | Service type for the License Proxy status service. Options: ClusterIP, NodePort, LoadBalancer |
402
405
| licenseProxy.tolerations | list | `[]` | [Tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/) to apply to License Proxy pods. |
403
406
| licenseProxy.updateStrategy.rollingUpdate | object | `` | For the LicenseProxy, we only expose maxSurge and not maxUnavailable. This is to avoid accidentally having all LicenseProxy nodes go offline during upgrades, which could impact the entire cluster's connection to the Deepgram License Server. |
404
407
| licenseProxy.updateStrategy.rollingUpdate.maxSurge | int |`1`| The maximum number of extra License Proxy pods that can be created during a rollingUpdate, relative to the number of replicas. See the [Kubernetes documentation](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#max-surge) for more details. |
charts/deepgram-self-hosted/README.md.gotmpl
54 additions & 0 deletions
@@ -82,6 +82,60 @@ To configure a specific storage option, see the `engine.modelManager.volumes` [c
82
82
83
83
For detailed instructions on setting up and configuring each storage option, refer to the [Deepgram self-hosted guides](https://developers.deepgram.com/docs/kubernetes) and the respective cloud provider's documentation.
84
84
85
+
### Service Configuration
86
+
87
+
The Deepgram Helm chart provides flexible service configuration options for exposing the API, Engine, and License Proxy services. By default, all services use `ClusterIP` type, which provides internal cluster access only.
88
+
89
+
#### Service Types
90
+
91
+
- **ClusterIP** (default): Exposes the service on a cluster-internal IP only. Recommended for most deployments.
92
+
- **NodePort**: Exposes the service on each Node's IP at a static port. Useful for development or when you need direct node access.
93
+
- **LoadBalancer**: Exposes the service externally using a cloud provider's load balancer. Recommended for production deployments requiring external access.
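As a minimal sketch, a values override that changes only the API service type might look like the following. The `api.service.type` key is taken from this chart's values table; the file name `my-values.yaml` is hypothetical.

```yaml
# my-values.yaml (hypothetical file name)
api:
  service:
    # One of ClusterIP (chart default), NodePort, or LoadBalancer
    type: NodePort
```

The Engine metrics service and the License Proxy status service take the same setting under `engine.service.type` and `licenseProxy.service.type`.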
94
+
95
+
#### Configuration Examples
96
+
97
+
**API Service with LoadBalancer (with security restrictions):**
- "10.0.0.0/8" # Only allow internal network access
127
+
externalTrafficPolicy: "Cluster" # Allow traffic from any node
128
+
```
129
+
130
+
#### LoadBalancer Security Options
131
+
132
+
When using `LoadBalancer` service type, you can configure additional security and performance options:
133
+
134
+
- **`loadBalancerSourceRanges`**: Restrict access to specific IP CIDR ranges. This provides network-level security by only allowing traffic from specified IP ranges.
135
+
- **`externalTrafficPolicy`**: Controls how external traffic is routed:
136
+
  - `Cluster` (default): Traffic can be routed to any node in the cluster, then forwarded to the target pod
137
+
  - `Local`: Traffic is only routed to nodes that have the target pod running, preserving source IP addresses
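The two options above can be combined. The following is a sketch only, assuming the API service is the one being exposed. The CIDR range is a placeholder from the documentation address space and should be replaced with your own.

```yaml
api:
  service:
    type: LoadBalancer
    # Placeholder range (assumption): replace with the CIDR blocks of your clients
    loadBalancerSourceRanges:
      - "203.0.113.0/24"
    # Local preserves the client source IP, but only nodes running an API pod receive traffic
    externalTrafficPolicy: "Local"
```

With `Local`, traffic is not forwarded between nodes, so load may be spread unevenly if API pods are not distributed evenly across nodes.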
138
+
85
139
### Autoscaling
86
140
87
141
Autoscaling your cluster's capacity to meet incoming traffic demands involves both node autoscaling and pod autoscaling. Node autoscaling for supported cloud providers is set up by default when using this Helm chart and creating your cluster with the [Deepgram self-hosted guides](https://developers.deepgram.com/docs/kubernetes). Pod autoscaling can be enabled via the `scaling.auto.enabled` configuration option in this chart.