diff --git a/charts/deepgram-self-hosted/CHANGELOG.md b/charts/deepgram-self-hosted/CHANGELOG.md index d57d472..a929f07 100644 --- a/charts/deepgram-self-hosted/CHANGELOG.md +++ b/charts/deepgram-self-hosted/CHANGELOG.md @@ -15,6 +15,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), - Added `loadBalancerSourceRanges` configuration for LoadBalancer services to restrict access to specific IP CIDR ranges - Added `externalTrafficPolicy` configuration for LoadBalancer services to control traffic routing behavior - Updated sample configurations to demonstrate service configuration options including LoadBalancer security settings +- Container-level security context support to Helm templates ### Changed diff --git a/charts/deepgram-self-hosted/README.md b/charts/deepgram-self-hosted/README.md index 127e920..7634781 100644 --- a/charts/deepgram-self-hosted/README.md +++ b/charts/deepgram-self-hosted/README.md @@ -253,6 +253,7 @@ If you encounter issues while deploying or using Deepgram, consider the followin | api.additionalAnnotations | object | `nil` | Additional annotations to add to the API deployment | | api.additionalLabels | object | `{}` | Additional labels to add to API resources | | api.affinity | object | `{}` | [Affinity and anti-affinity](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#affinity-and-anti-affinity) to apply for API pods. | +| api.containerSecurityContext | object | `{}` | [Container-level security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-container) for API containers. | | api.customToml | string | `nil` | Custom TOML sections can be added to extend api.toml | | api.driverPool | object | `` | driverPool configures the backend pool of speech engines (generically referred to as "drivers" here). The API will load-balance among drivers in the standard pool; if one standard driver fails, the next one will be tried. 
| | api.driverPool.standard | object | `` | standard is the main driver pool to use. | @@ -276,7 +277,7 @@ If you encounter issues while deploying or using Deepgram, consider the followin | api.resolver.maxTTL | int | `nil` | maxTTL sets the DNS TTL value if specifying a custom DNS nameserver. | | api.resolver.nameservers | list | `[]` | nameservers allows for specifying custom domain name server(s). A valid list item's format is "{IP} {PORT} {PROTOCOL (tcp or udp)}", e.g. `"127.0.0.1 53 udp"`. | | api.resources | object | `` | Configure resource limits per API container. See [Deepgram's documentation](https://developers.deepgram.com/docs/self-hosted-deployment-environments#api) for more details. | -| api.securityContext | object | `{}` | [Security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/) for API pods. | +| api.securityContext | object | `{}` | [Pod-level security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod) for API pods. | | api.server | object | `` | Configure how the API will listen for your requests | | api.server.callbackConnTimeout | string | `"1s"` | callbackConnTimeout configures how long to wait for a connection to a callback URL. See [Deepgram's callback documentation](https://developers.deepgram.com/docs/callback) for more details. The value should be a humantime duration. | | api.server.callbackTimeout | string | `"10s"` | callbackTimeout configures how long to wait for a response from a callback URL. See [Deepgram's callback documentation](https://developers.deepgram.com/docs/callback) for more details. The value should be a humantime duration. 
| @@ -313,6 +314,7 @@ If you encounter issues while deploying or using Deepgram, consider the followin | engine.chunking.speechToText.streaming.minDuration | float | `nil` | minDuration is the minimum audio duration for a STT chunk size for a streaming request | | engine.chunking.speechToText.streaming.step | float | `1` | step defines how often to return interim results, in seconds. This value may be lowered to increase the frequency of interim results. However, this also causes a significant decrease in the number of concurrent streams supported by a single GPU. Please contact your Deepgram Account representative for more details. | | engine.concurrencyLimit.activeRequests | int | `nil` | activeRequests limits the number of active requests handled by a single Engine container. If additional requests beyond the limit are sent, the API container forming the request will try a different Engine pod. If no Engine pods are able to accept the request, the API will return a 429 HTTP response to the client. The `nil` default means no limit will be set. | +| engine.containerSecurityContext | object | `{}` | [Container-level security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-container) for Engine containers. | | engine.customToml | string | `nil` | Custom TOML sections can be added to extend engine.toml | | engine.features.streamingNer | bool | `true` | Enables format entity tags on streaming audio *if* a valid NER model is available. | | engine.halfPrecision.state | string | `"auto"` | Engine will automatically enable half precision operations if your GPU supports them. You can explicitly enable or disable this behavior with the state parameter which supports `"enable"`, `"disabled"`, and `"auto"`. | @@ -345,7 +347,7 @@ If you encounter issues while deploying or using Deepgram, consider the followin | engine.resources | object | `` | Configure resource limits per Engine container. 
See [Deepgram's documentation](https://developers.deepgram.com/docs/self-hosted-deployment-environments#engine) for more details. | | engine.resources.limits.gpu | int | `1` | gpu maps to the nvidia.com/gpu resource parameter | | engine.resources.requests.gpu | int | `1` | gpu maps to the nvidia.com/gpu resource parameter | -| engine.securityContext | object | `{}` | [Security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/) for Engine pods. | +| engine.securityContext | object | `{}` | [Pod-level security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod) for Engine pods. | | engine.server | object | `` | Configure Engine containers to listen for requests from API containers. | | engine.server.host | string | `"0.0.0.0"` | host is the IP address to listen on for inference requests. You will want to listen on all interfaces to interact with other pods in the cluster. | | engine.server.port | int | `8080` | port to listen on for inference requests | @@ -378,6 +380,7 @@ If you encounter issues while deploying or using Deepgram, consider the followin | licenseProxy.additionalAnnotations | object | `nil` | Additional annotations to add to the LicenseProxy deployment | | licenseProxy.additionalLabels | object | `{}` | Additional labels to add to License Proxy resources | | licenseProxy.affinity | object | `{}` | [Affinity and anti-affinity](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#affinity-and-anti-affinity) to apply for License Proxy pods. | +| licenseProxy.containerSecurityContext | object | `{}` | [Container-level security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-container) for License Proxy containers. | | licenseProxy.deploySecondReplica | bool | `false` | If the License Proxy is deployed, one replica should be sufficient to support many API/Engine pods. 
Highly available environments may wish to deploy a second replica to ensure uptime, which can be toggled with this option. | | licenseProxy.enabled | bool | `false` | The License Proxy is optional, but highly recommended to be deployed in production to enable highly available environments. | | licenseProxy.image.path | string | `"quay.io/deepgram/self-hosted-license-proxy"` | path configures the image path to use for creating License Proxy containers. You may change this from the public Quay image path if you have imported Deepgram images into a private container registry. | @@ -389,7 +392,7 @@ If you encounter issues while deploying or using Deepgram, consider the followin | licenseProxy.nodeSelector | object | `{}` | [Node selector](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector) to apply to License Proxy pods. | | licenseProxy.readinessProbe | object | `` | Readiness probe customization for License Proxy pods. | | licenseProxy.resources | object | `` | Configure resource limits per License Proxy container. See [Deepgram's documentation](https://developers.deepgram.com/docs/license-proxy#system-requirements) for more details. | -| licenseProxy.securityContext | object | `{}` | [Security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/) for License Proxy pods. | +| licenseProxy.securityContext | object | `{}` | [Pod-level security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod) for License Proxy pods. | | licenseProxy.server | object | `` | Configure how the license proxy will listen for licensing requests. | | licenseProxy.server.baseUrl | string | `"/"` | baseUrl is the prefix for incoming license verification requests. | | licenseProxy.server.host | string | `"0.0.0.0"` | host is the IP address to listen on. You will want to listen on all interfaces to interact with other pods in the cluster. 
| diff --git a/charts/deepgram-self-hosted/templates/api/api.deployment.yaml b/charts/deepgram-self-hosted/templates/api/api.deployment.yaml index bb365c7..d14b5d7 100644 --- a/charts/deepgram-self-hosted/templates/api/api.deployment.yaml +++ b/charts/deepgram-self-hosted/templates/api/api.deployment.yaml @@ -52,6 +52,10 @@ spec: {{- end }} containers: - name: {{ .Values.api.namePrefix }} + {{- with .Values.api.containerSecurityContext }} + securityContext: + {{- toYaml . | nindent 10 }} + {{- end }} image: {{ .Values.api.image.path }}:{{ .Values.api.image.tag }} imagePullPolicy: {{ .Values.api.image.pullPolicy }} envFrom: diff --git a/charts/deepgram-self-hosted/templates/engine/engine.deployment.yaml b/charts/deepgram-self-hosted/templates/engine/engine.deployment.yaml index df91dd9..b728134 100644 --- a/charts/deepgram-self-hosted/templates/engine/engine.deployment.yaml +++ b/charts/deepgram-self-hosted/templates/engine/engine.deployment.yaml @@ -63,13 +63,19 @@ spec: {{- toYaml $.Values.engine.tolerations | nindent 8 }} nodeSelector: {{- toYaml $.Values.engine.nodeSelector | nindent 8 }} + {{- with $.Values.engine.securityContext }} securityContext: - {{- toYaml $.Values.engine.securityContext | nindent 8 }} + {{- toYaml . | nindent 8 }} + {{- end}} {{- if or $.Values.engine.serviceAccount.create $.Values.engine.serviceAccount.name }} serviceAccountName: {{ default (printf "%s-sa" $.Values.engine.namePrefix) $.Values.engine.serviceAccount.name }} {{- end }} containers: - name: {{ $.Values.engine.namePrefix }}{{- if $type }}-{{ $type }}{{- end }} + {{- with $.Values.engine.containerSecurityContext }} + securityContext: + {{- toYaml . 
| nindent 10 }} + {{- end }} image: {{ $.Values.engine.image.path }}:{{ $.Values.engine.image.tag }} imagePullPolicy: {{ $.Values.engine.image.pullPolicy }} envFrom: @@ -83,24 +89,24 @@ spec: value: "void" {{- end }} {{- if $.Values.aura2.enabled }} - {{- if .Values.aura2.english.enabled }} + {{- if $.Values.aura2.english.enabled }} - name: IMPELLER_AURA2_MAX_BATCH_SIZE - value: "{{ .Values.aura2.english.maxBatchSize }}" + value: "{{ $.Values.aura2.english.maxBatchSize }}" - name: IMPELLER_AURA2_T2C_UUID - value: "{{ .Values.aura2.english.t2cUuid }}" + value: "{{ $.Values.aura2.english.t2cUuid }}" - name: IMPELLER_AURA2_C2A_UUID - value: "{{ .Values.aura2.english.c2aUuid }}" + value: "{{ $.Values.aura2.english.c2aUuid }}" - name: CUDA_VISIBLE_DEVICES - value: "{{ .Values.aura2.english.cudaVisibleDevices }}" - {{- else if .Values.aura2.spanish.enabled }} + value: "{{ $.Values.aura2.english.cudaVisibleDevices }}" + {{- else if $.Values.aura2.spanish.enabled }} - name: IMPELLER_AURA2_MAX_BATCH_SIZE - value: "{{ .Values.aura2.spanish.maxBatchSize }}" + value: "{{ $.Values.aura2.spanish.maxBatchSize }}" - name: IMPELLER_AURA2_T2C_UUID - value: "{{ .Values.aura2.spanish.t2cUuid }}" + value: "{{ $.Values.aura2.spanish.t2cUuid }}" - name: IMPELLER_AURA2_C2A_UUID - value: "{{ .Values.aura2.spanish.c2aUuid }}" + value: "{{ $.Values.aura2.spanish.c2aUuid }}" - name: CUDA_VISIBLE_DEVICES - value: "{{ .Values.aura2.spanish.cudaVisibleDevices }}" + value: "{{ $.Values.aura2.spanish.cudaVisibleDevices }}" {{- end }} {{- end }} command: [ "impeller" ] @@ -157,13 +163,13 @@ spec: {{- $gcpGpdEnabled := $.Values.engine.modelManager.volumes.gcp.gpd.enabled }} {{- $enabledCount := (int $customClaimEnabled) | add (int $awsEfsEnabled) | add (int $gcpGpdEnabled) }} - + {{- if eq $enabledCount 0 }} {{- fail "Error: At least one of customVolumeClaim.enabled, aws.efs.enabled, or gcp.gpd.enabled must be set to true." 
}} {{- else if gt $enabledCount 1 }} {{- fail "Error: Only one of customVolumeClaim.enabled, aws.efs.enabled, or gcp.gpd.enabled can be set to true." }} {{- end }} - + {{- if $customClaimEnabled }} {{- if not $customClaimName }} {{- fail "Error: customVolumeClaim.name must be set when customVolumeClaim.enabled is true." }} diff --git a/charts/deepgram-self-hosted/templates/license-proxy/license-proxy.deployment.yaml b/charts/deepgram-self-hosted/templates/license-proxy/license-proxy.deployment.yaml index b976d94..fcb8cea 100644 --- a/charts/deepgram-self-hosted/templates/license-proxy/license-proxy.deployment.yaml +++ b/charts/deepgram-self-hosted/templates/license-proxy/license-proxy.deployment.yaml @@ -50,6 +50,10 @@ spec: {{- end }} containers: - name: {{ .Values.licenseProxy.namePrefix }} + {{- with .Values.licenseProxy.containerSecurityContext }} + securityContext: + {{- toYaml . | nindent 10 }} + {{- end }} image: {{ .Values.licenseProxy.image.path }}:{{ .Values.licenseProxy.image.tag }} imagePullPolicy: {{ .Values.licenseProxy.image.pullPolicy }} envFrom: diff --git a/charts/deepgram-self-hosted/templates/volumes/aws/efs-model-download.job.yaml b/charts/deepgram-self-hosted/templates/volumes/aws/efs-model-download.job.yaml index d9dd5ec..1e32e1e 100644 --- a/charts/deepgram-self-hosted/templates/volumes/aws/efs-model-download.job.yaml +++ b/charts/deepgram-self-hosted/templates/volumes/aws/efs-model-download.job.yaml @@ -16,10 +16,18 @@ spec: {{- toYaml .Values.engine.affinity | nindent 8 }} tolerations: {{- toYaml .Values.engine.tolerations | nindent 8 }} + {{- with .Values.engine.securityContext }} + securityContext: + {{- toYaml . | nindent 8 }} + {{- end }} nodeSelector: {{- toYaml .Values.engine.nodeSelector | nindent 8 }} containers: - name: model-management + {{- with .Values.engine.containerSecurityContext }} + securityContext: + {{- toYaml . 
| nindent 10 }} + {{- end }} image: alpine command: - /bin/sh diff --git a/charts/deepgram-self-hosted/values.yaml b/charts/deepgram-self-hosted/values.yaml index d8e75f7..fced4e1 100644 --- a/charts/deepgram-self-hosted/values.yaml +++ b/charts/deepgram-self-hosted/values.yaml @@ -293,9 +293,12 @@ api: # to apply to API pods. nodeSelector: {} - # -- [Security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/) for API pods. + # -- [Pod-level security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod) for API pods. securityContext: {} + # -- [Container-level security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-container) for API containers. + containerSecurityContext: {} + serviceAccount: # -- Specifies whether to create a default service account for the Deepgram API Deployment. create: true @@ -507,9 +510,12 @@ engine: # to apply to Engine pods. nodeSelector: {} - # -- [Security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/) for Engine pods. + # -- [Pod-level security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod) for Engine pods. securityContext: {} + # -- [Container-level security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-container) for Engine containers. + containerSecurityContext: {} + serviceAccount: # -- Specifies whether to create a default service account for the Deepgram Engine Deployment. create: true @@ -769,9 +775,12 @@ licenseProxy: # to apply to License Proxy pods. nodeSelector: {} - # -- [Security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/) for License Proxy pods. 
+ # -- [Pod-level security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod) for License Proxy pods. securityContext: {} + # -- [Container-level security context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-container) for License Proxy containers. + containerSecurityContext: {} + serviceAccount: # -- Specifies whether to create a default service account for the Deepgram License Proxy Deployment. create: true
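
Usage note for the new values: the diff above adds a `containerSecurityContext` key next to the existing `securityContext` key for `api`, `engine`, and `licenseProxy`. A minimal sketch of a values override that sets both levels might look like the following. The specific field values (`runAsUser: 1000`, `fsGroup: 1000`, dropping all capabilities) are illustrative assumptions and are not taken from this change; whether the Deepgram images run correctly under them has not been verified here.

```yaml
# Hypothetical override file (e.g. my-values.yaml); all field values are assumptions.
api:
  # Pod-level: rendered under spec.template.spec.securityContext
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 1000
  # Container-level: rendered under the API container's securityContext
  containerSecurityContext:
    allowPrivilegeEscalation: false
    capabilities:
      drop:
        - ALL

licenseProxy:
  containerSecurityContext:
    allowPrivilegeEscalation: false

engine:
  # Per the diff, these two engine keys are also applied to the
  # AWS EFS model download job, which runs an alpine image.
  securityContext: {}
  containerSecurityContext:
    allowPrivilegeEscalation: false
```

Because every new block is wrapped in `with`, leaving a key at its `{}` default renders no `securityContext` block at all, so existing deployments should be unaffected unless these values are set. Note that the `engine` settings are shared with the EFS model download job, so a restrictive context should be checked against that job as well as the Engine pods.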