1 change: 1 addition & 0 deletions charts/deepgram-self-hosted/CHANGELOG.md
@@ -15,6 +15,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 - Added `loadBalancerSourceRanges` configuration for LoadBalancer services to restrict access to specific IP CIDR ranges
 - Added `externalTrafficPolicy` configuration for LoadBalancer services to control traffic routing behavior
 - Updated sample configurations to demonstrate service configuration options including LoadBalancer security settings
+- Container-level security context support to Helm templates

### Changed

9 changes: 6 additions & 3 deletions charts/deepgram-self-hosted/README.md
@@ -253,6 +253,7 @@ If you encounter issues while deploying or using Deepgram, consider the followin
| api.additionalAnnotations | object | `nil` | Additional annotations to add to the API deployment |
| api.additionalLabels | object | `{}` | Additional labels to add to API resources |
| api.affinity | object | `{}` | [Affinity and anti-affinity](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#affinity-and-anti-affinity) to apply for API pods. |
+| api.containerSecurityContext | object | `{}` | [Container-level security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-container) for API containers. |
| api.customToml | string | `nil` | Custom TOML sections can be added to extend api.toml |
| api.driverPool | object | `` | driverPool configures the backend pool of speech engines (generically referred to as "drivers" here). The API will load-balance among drivers in the standard pool; if one standard driver fails, the next one will be tried. |
| api.driverPool.standard | object | `` | standard is the main driver pool to use. |
@@ -276,7 +277,7 @@ If you encounter issues while deploying or using Deepgram, consider the followin
| api.resolver.maxTTL | int | `nil` | maxTTL sets the DNS TTL value if specifying a custom DNS nameserver. |
| api.resolver.nameservers | list | `[]` | nameservers allows for specifying custom domain name server(s). A valid list item's format is "{IP} {PORT} {PROTOCOL (tcp or udp)}", e.g. `"127.0.0.1 53 udp"`. |
| api.resources | object | `` | Configure resource limits per API container. See [Deepgram's documentation](https://developers.deepgram.com/docs/self-hosted-deployment-environments#api) for more details. |
-| api.securityContext | object | `{}` | [Security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/) for API pods. |
+| api.securityContext | object | `{}` | [Pod-level security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod) for API pods. |
| api.server | object | `` | Configure how the API will listen for your requests |
| api.server.callbackConnTimeout | string | `"1s"` | callbackConnTimeout configures how long to wait for a connection to a callback URL. See [Deepgram's callback documentation](https://developers.deepgram.com/docs/callback) for more details. The value should be a humantime duration. |
| api.server.callbackTimeout | string | `"10s"` | callbackTimeout configures how long to wait for a response from a callback URL. See [Deepgram's callback documentation](https://developers.deepgram.com/docs/callback) for more details. The value should be a humantime duration. |
@@ -313,6 +314,7 @@ If you encounter issues while deploying or using Deepgram, consider the followin
| engine.chunking.speechToText.streaming.minDuration | float | `nil` | minDuration is the minimum audio duration for a STT chunk size for a streaming request |
| engine.chunking.speechToText.streaming.step | float | `1` | step defines how often to return interim results, in seconds. This value may be lowered to increase the frequency of interim results. However, this also causes a significant decrease in the number of concurrent streams supported by a single GPU. Please contact your Deepgram Account representative for more details. |
| engine.concurrencyLimit.activeRequests | int | `nil` | activeRequests limits the number of active requests handled by a single Engine container. If additional requests beyond the limit are sent, the API container forming the request will try a different Engine pod. If no Engine pods are able to accept the request, the API will return a 429 HTTP response to the client. The `nil` default means no limit will be set. |
+| engine.containerSecurityContext | object | `{}` | [Container-level security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-container) for Engine containers. |
| engine.customToml | string | `nil` | Custom TOML sections can be added to extend engine.toml |
| engine.features.streamingNer | bool | `true` | Enables format entity tags on streaming audio *if* a valid NER model is available. |
| engine.halfPrecision.state | string | `"auto"` | Engine will automatically enable half precision operations if your GPU supports them. You can explicitly enable or disable this behavior with the state parameter which supports `"enable"`, `"disabled"`, and `"auto"`. |
@@ -345,7 +347,7 @@ If you encounter issues while deploying or using Deepgram, consider the followin
| engine.resources | object | `` | Configure resource limits per Engine container. See [Deepgram's documentation](https://developers.deepgram.com/docs/self-hosted-deployment-environments#engine) for more details. |
| engine.resources.limits.gpu | int | `1` | gpu maps to the nvidia.com/gpu resource parameter |
| engine.resources.requests.gpu | int | `1` | gpu maps to the nvidia.com/gpu resource parameter |
-| engine.securityContext | object | `{}` | [Security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/) for Engine pods. |
+| engine.securityContext | object | `{}` | [Pod-level security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod) for Engine pods. |
| engine.server | object | `` | Configure Engine containers to listen for requests from API containers. |
| engine.server.host | string | `"0.0.0.0"` | host is the IP address to listen on for inference requests. You will want to listen on all interfaces to interact with other pods in the cluster. |
| engine.server.port | int | `8080` | port to listen on for inference requests |
@@ -378,6 +380,7 @@ If you encounter issues while deploying or using Deepgram, consider the followin
| licenseProxy.additionalAnnotations | object | `nil` | Additional annotations to add to the LicenseProxy deployment |
| licenseProxy.additionalLabels | object | `{}` | Additional labels to add to License Proxy resources |
| licenseProxy.affinity | object | `{}` | [Affinity and anti-affinity](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#affinity-and-anti-affinity) to apply for License Proxy pods. |
+| licenseProxy.containerSecurityContext | object | `{}` | [Container-level security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-container) for License Proxy containers. |
| licenseProxy.deploySecondReplica | bool | `false` | If the License Proxy is deployed, one replica should be sufficient to support many API/Engine pods. Highly available environments may wish to deploy a second replica to ensure uptime, which can be toggled with this option. |
| licenseProxy.enabled | bool | `false` | The License Proxy is optional, but highly recommended to be deployed in production to enable highly available environments. |
| licenseProxy.image.path | string | `"quay.io/deepgram/self-hosted-license-proxy"` | path configures the image path to use for creating License Proxy containers. You may change this from the public Quay image path if you have imported Deepgram images into a private container registry. |
@@ -389,7 +392,7 @@ If you encounter issues while deploying or using Deepgram, consider the followin
| licenseProxy.nodeSelector | object | `{}` | [Node selector](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector) to apply to License Proxy pods. |
| licenseProxy.readinessProbe | object | `` | Readiness probe customization for License Proxy pods. |
| licenseProxy.resources | object | `` | Configure resource limits per License Proxy container. See [Deepgram's documentation](https://developers.deepgram.com/docs/license-proxy#system-requirements) for more details. |
-| licenseProxy.securityContext | object | `{}` | [Security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/) for License Proxy pods. |
+| licenseProxy.securityContext | object | `{}` | [Pod-level security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod) for License Proxy pods. |
| licenseProxy.server | object | `` | Configure how the license proxy will listen for licensing requests. |
| licenseProxy.server.baseUrl | string | `"/"` | baseUrl is the prefix for incoming license verification requests. |
| licenseProxy.server.host | string | `"0.0.0.0"` | host is the IP address to listen on. You will want to listen on all interfaces to interact with other pods in the cluster. |
@@ -52,6 +52,10 @@ spec:
 {{- end }}
 containers:
 - name: {{ .Values.api.namePrefix }}
+{{- with .Values.api.containerSecurityContext }}
+securityContext:
+{{- toYaml . | nindent 10 }}
+{{- end }}
 image: {{ .Values.api.image.path }}:{{ .Values.api.image.tag }}
 imagePullPolicy: {{ .Values.api.image.pullPolicy }}
 envFrom:
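For reference, a sketch of what the `with` block in this hunk is expected to render. The override values and the `deepgram-api` name prefix are assumptions chosen for illustration, not taken from this PR, and the indentation is inferred from the `nindent 10` argument. With the default `containerSecurityContext: {}`, `with` treats the empty map as false, so no `securityContext:` key is emitted on the container.

```yaml
# Hypothetical values override (illustrative settings only)
api:
  containerSecurityContext:
    allowPrivilegeEscalation: false
    readOnlyRootFilesystem: true
---
# Expected shape of the rendered API Deployment fragment, assuming
# api.namePrefix is "deepgram-api"; other container fields are omitted
spec:
  template:
    spec:
      containers:
      - name: deepgram-api
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
```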
@@ -63,13 +63,19 @@ spec:
 {{- toYaml $.Values.engine.tolerations | nindent 8 }}
 nodeSelector:
 {{- toYaml $.Values.engine.nodeSelector | nindent 8 }}
+{{- with $.Values.engine.securityContext }}
 securityContext:
-{{- toYaml $.Values.engine.securityContext | nindent 8 }}
+{{- toYaml . | nindent 8 }}
+{{- end}}
 {{- if or $.Values.engine.serviceAccount.create $.Values.engine.serviceAccount.name }}
 serviceAccountName: {{ default (printf "%s-sa" $.Values.engine.namePrefix) $.Values.engine.serviceAccount.name }}
 {{- end }}
 containers:
 - name: {{ $.Values.engine.namePrefix }}{{- if $type }}-{{ $type }}{{- end }}
+{{- with $.Values.engine.containerSecurityContext }}
+securityContext:
+{{- toYaml . | nindent 10 }}
+{{- end }}
 image: {{ $.Values.engine.image.path }}:{{ $.Values.engine.image.tag }}
 imagePullPolicy: {{ $.Values.engine.image.pullPolicy }}
 envFrom:
@@ -83,24 +89,24 @@ spec:
 value: "void"
 {{- end }}
 {{- if $.Values.aura2.enabled }}
-{{- if .Values.aura2.english.enabled }}
+{{- if $.Values.aura2.english.enabled }}
 - name: IMPELLER_AURA2_MAX_BATCH_SIZE
-value: "{{ .Values.aura2.english.maxBatchSize }}"
+value: "{{ $.Values.aura2.english.maxBatchSize }}"
 - name: IMPELLER_AURA2_T2C_UUID
-value: "{{ .Values.aura2.english.t2cUuid }}"
+value: "{{ $.Values.aura2.english.t2cUuid }}"
 - name: IMPELLER_AURA2_C2A_UUID
-value: "{{ .Values.aura2.english.c2aUuid }}"
+value: "{{ $.Values.aura2.english.c2aUuid }}"
 - name: CUDA_VISIBLE_DEVICES
-value: "{{ .Values.aura2.english.cudaVisibleDevices }}"
-{{- else if .Values.aura2.spanish.enabled }}
+value: "{{ $.Values.aura2.english.cudaVisibleDevices }}"
+{{- else if $.Values.aura2.spanish.enabled }}
 - name: IMPELLER_AURA2_MAX_BATCH_SIZE
-value: "{{ .Values.aura2.spanish.maxBatchSize }}"
+value: "{{ $.Values.aura2.spanish.maxBatchSize }}"
 - name: IMPELLER_AURA2_T2C_UUID
-value: "{{ .Values.aura2.spanish.t2cUuid }}"
+value: "{{ $.Values.aura2.spanish.t2cUuid }}"
 - name: IMPELLER_AURA2_C2A_UUID
-value: "{{ .Values.aura2.spanish.c2aUuid }}"
+value: "{{ $.Values.aura2.spanish.c2aUuid }}"
 - name: CUDA_VISIBLE_DEVICES
-value: "{{ .Values.aura2.spanish.cudaVisibleDevices }}"
+value: "{{ $.Values.aura2.spanish.cudaVisibleDevices }}"
 {{- end }}
 {{- end }}
 command: [ "impeller" ]
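The `.Values` to `$.Values` changes in this hunk are consistent with the template body running inside a `range` (the surrounding lines already use `$type` and `$.Values`). Inside `range`, `.` is rebound to the current item, so `.Values` no longer resolves against the chart root, while `$` always refers to the root context. A minimal sketch, assuming a hypothetical `exampleTypes` list that is not part of this chart:

```yaml
{{- /* exampleTypes is a hypothetical values key used only for this illustration */}}
{{- range $type := $.Values.exampleTypes }}
- name: engine-{{ $type }}
  env:
  # "." is the current list item here, so the root must be reached through "$"
  - name: IMPELLER_AURA2_MAX_BATCH_SIZE
    value: "{{ $.Values.aura2.english.maxBatchSize }}"
{{- end }}
```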
@@ -157,13 +163,13 @@ spec:
{{- $gcpGpdEnabled := $.Values.engine.modelManager.volumes.gcp.gpd.enabled }}

{{- $enabledCount := (int $customClaimEnabled) | add (int $awsEfsEnabled) | add (int $gcpGpdEnabled) }}

{{- if eq $enabledCount 0 }}
{{- fail "Error: At least one of customVolumeClaim.enabled, aws.efs.enabled, or gcp.gpd.enabled must be set to true." }}
{{- else if gt $enabledCount 1 }}
{{- fail "Error: Only one of customVolumeClaim.enabled, aws.efs.enabled, or gcp.gpd.enabled can be set to true." }}
{{- end }}

{{- if $customClaimEnabled }}
{{- if not $customClaimName }}
{{- fail "Error: customVolumeClaim.name must be set when customVolumeClaim.enabled is true." }}
@@ -50,6 +50,10 @@ spec:
 {{- end }}
 containers:
 - name: {{ .Values.licenseProxy.namePrefix }}
+{{- with .Values.licenseProxy.containerSecurityContext }}
+securityContext:
+{{- toYaml . | nindent 10 }}
+{{- end }}
 image: {{ .Values.licenseProxy.image.path }}:{{ .Values.licenseProxy.image.tag }}
 imagePullPolicy: {{ .Values.licenseProxy.image.pullPolicy }}
 envFrom:
@@ -16,10 +16,18 @@ spec:
 {{- toYaml .Values.engine.affinity | nindent 8 }}
 tolerations:
 {{- toYaml .Values.engine.tolerations | nindent 8 }}
+{{- with .Values.engine.securityContext }}
+securityContext:
+{{- toYaml . | nindent 8 }}
+{{- end }}
 nodeSelector:
 {{- toYaml .Values.engine.nodeSelector | nindent 8 }}
 containers:
 - name: model-management
+{{- with .Values.engine.containerSecurityContext }}
+securityContext:
+{{- toYaml . | nindent 10 }}
+{{- end }}
 image: alpine
 command:
 - /bin/sh
15 changes: 12 additions & 3 deletions charts/deepgram-self-hosted/values.yaml
@@ -293,9 +293,12 @@ api:
# to apply to API pods.
nodeSelector: {}

-# -- [Security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/) for API pods.
+# -- [Pod-level security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod) for API pods.
securityContext: {}

+# -- [Container-level security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-container) for API containers.
+containerSecurityContext: {}
+
serviceAccount:
# -- Specifies whether to create a default service account for the Deepgram API Deployment.
create: true
@@ -507,9 +510,12 @@ engine:
# to apply to Engine pods.
nodeSelector: {}

-# -- [Security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/) for Engine pods.
+# -- [Pod-level security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod) for Engine pods.
securityContext: {}

+# -- [Container-level security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-container) for Engine containers.
+containerSecurityContext: {}
+
serviceAccount:
# -- Specifies whether to create a default service account for the Deepgram Engine Deployment.
create: true
@@ -769,9 +775,12 @@ licenseProxy:
# to apply to License Proxy pods.
nodeSelector: {}

-# -- [Security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/) for License Proxy pods.
+# -- [Pod-level security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod) for License Proxy pods.
securityContext: {}

+# -- [Container-level security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-container) for License Proxy containers.
+containerSecurityContext: {}
+
serviceAccount:
# -- Specifies whether to create a default service account for the Deepgram License Proxy Deployment.
create: true
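A sketch of how the pod-level and container-level settings might be combined in a values override file. Only the key names `securityContext` and `containerSecurityContext` come from this PR; the file name and the field values below are illustrative assumptions and have not been verified against the Deepgram images.

```yaml
# my-values.yaml (hypothetical override file)
api:
  # Pod-level: applies to every container in the API pod
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 1000
  # Container-level: applies to the API container only
  containerSecurityContext:
    allowPrivilegeEscalation: false
    capabilities:
      drop:
        - ALL

engine:
  # Per the templates in this PR, this is also applied to the model-management job container
  containerSecurityContext:
    allowPrivilegeEscalation: false

licenseProxy:
  containerSecurityContext:
    allowPrivilegeEscalation: false
    readOnlyRootFilesystem: true
```

Where both levels set the same field (for example `runAsUser`), Kubernetes gives the container-level value precedence.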