52 commits
eedd8f5
Import template app from waldur example
JohnGarbutt Jan 24, 2025
7604cea
Add initial README file
JohnGarbutt Jan 24, 2025
3c45ec7
Turn on apps
JohnGarbutt Jan 24, 2025
1246d54
Remove secret
JohnGarbutt Jan 24, 2025
9dbe2ff
Attempt to fix juypterhub deploy
JohnGarbutt Jan 24, 2025
c4e7645
Actually remove secret
JohnGarbutt Jan 24, 2025
417c624
Initial slurm-operator deploy
JohnGarbutt Jan 27, 2025
b2562b8
Get initial helm deploy of slurm operator up
JohnGarbutt Jan 27, 2025
d8a576c
Initial attempt to deploy slurm control plane
JohnGarbutt Jan 27, 2025
729c6eb
Fix storage class for slurm pvc
JohnGarbutt Jan 27, 2025
7cc4a62
Add nodesets
JohnGarbutt Jan 27, 2025
cc7f536
Update slurm deploy timeout to 10mins
JohnGarbutt Jan 27, 2025
b4d0885
Fix up where nodesets are defined
JohnGarbutt Jan 27, 2025
fd0f6c2
Apply the example values
JohnGarbutt Jan 27, 2025
90a9df2
Reset to the default namespaces and release names
JohnGarbutt Jan 27, 2025
52250ad
Increase the slurm timeout a bit more
JohnGarbutt Jan 27, 2025
a7c55a8
Fix up JuypterHub
JohnGarbutt Jan 27, 2025
97f579e
Simplify values in juypterhub
JohnGarbutt Jan 29, 2025
7925dc9
Fix up storage class with new name
JohnGarbutt Jan 30, 2025
0cb5684
Expose XPUs
JohnGarbutt Jan 30, 2025
60805e0
Add customer juypter image build
JohnGarbutt Jan 31, 2025
be2d370
Fix typo in dockerfile
JohnGarbutt Jan 31, 2025
6421ba1
Add the missing -U upgrade flag
JohnGarbutt Jan 31, 2025
1a74c1c
Fix up the workflow permissions
JohnGarbutt Jan 31, 2025
df5f475
Add examples into notebook image
JohnGarbutt Jan 31, 2025
effd9b5
Initial xpu profiles
JohnGarbutt Jan 31, 2025
0ca30f5
Adding inital rdma test
JohnGarbutt Feb 3, 2025
f251782
Attempt to reduce privilage of juypterhub pods
JohnGarbutt Feb 6, 2025
09a4ef6
Add kube-perftest
JohnGarbutt Feb 6, 2025
5494585
fixed slurm control plane getting deleted by helm
wtripp180901 Feb 6, 2025
50e2993
added operator as dependency of control plane
wtripp180901 Feb 7, 2025
e32748e
now slinky compute nodes now preempted by jupyter labs
wtripp180901 Feb 6, 2025
618dd10
Added keda autoscaling
wtripp180901 Feb 7, 2025
96b306e
Merge pull request #6 from JohnGarbutt/slinky-autoscale
JohnGarbutt Feb 10, 2025
d36851e
Merge pull request #5 from JohnGarbutt/slinky-preempt
JohnGarbutt Feb 10, 2025
60cecfe
Merge pull request #4 from JohnGarbutt/fix-flux-slurm-delete
JohnGarbutt Feb 10, 2025
9ffac64
Be sure to install keda with slinky
JohnGarbutt Feb 11, 2025
aab92f4
Restore JuypterHub while lockdown is wip
JohnGarbutt Feb 11, 2025
f0863a3
Use kubeperf test 0.1.0 release
JohnGarbutt Feb 11, 2025
3f68024
Remote prometheus bits, we have service monitor already
JohnGarbutt Feb 11, 2025
de8a237
Add extra namespace to slinky
JohnGarbutt Feb 11, 2025
c59f169
Update build-images.yml
JohnGarbutt Feb 13, 2025
f2eecd3
Update README.md
JohnGarbutt Feb 13, 2025
c5e5813
disabled privileged initcontainers
wtripp180901 Feb 14, 2025
5485934
Merge pull request #8 from JohnGarbutt/jupyter-privilege-fix
JohnGarbutt Feb 14, 2025
6670396
Move to baseline in juypterhub
JohnGarbutt Feb 14, 2025
4a59e82
Add in opencost
JohnGarbutt Feb 16, 2025
ebbc1ac
Add tetragon example
JohnGarbutt Mar 10, 2025
616e0bb
Add tetragon example
JohnGarbutt Mar 10, 2025
7aacc4f
Add tetragon example
JohnGarbutt Mar 10, 2025
2ae374c
Merge remote-tracking branch 'origin/main' into HEAD
JohnGarbutt Mar 10, 2025
ffb6142
Merge pull request #11 from JohnGarbutt/fix-up-terragon
JohnGarbutt Mar 10, 2025
50 changes: 50 additions & 0 deletions .github/workflows/build-images.yml
@@ -0,0 +1,50 @@
name: Publish Container Images
on:
  push:
    paths:
      - images/**
jobs:
  build_push_images:
    name: Build and push images
    permissions:
      contents: read
      id-token: write # needed for signing the images with GitHub OIDC Token
      packages: write # required for pushing container images
      security-events: write # required for pushing SARIF files
    runs-on: ubuntu-22.04
    strategy:
      matrix:
        include:
          - image: jupyterhub-intel-gpu
    steps:
      - name: Check out the repository
        uses: actions/checkout@v4

      - name: Login to GitHub Container Registry
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Calculate metadata for image
        id: image-meta
        uses: docker/metadata-action@v5
        with:
          images: ghcr.io/${{ github.repository_owner }}/${{ matrix.image }}
          # Produce the branch name or tag and the SHA as tags
          tags: |
            type=ref,event=branch
            type=ref,event=tag
            type=sha,prefix=

      - name: Build and push image
        uses: azimuth-cloud/github-actions/docker-multiarch-build-push@master
        with:
          cache-key: ${{ matrix.image }}
          context: ./images/${{ matrix.image }}
          platforms: linux/amd64
          push: true
          tags: ${{ steps.image-meta.outputs.tags }}
          labels: ${{ steps.image-meta.outputs.labels }}
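
Note on the tags: the `type=sha,prefix=` rule appears to be what yields bare short-SHA tags such as the `6421ba1` pinned in `apps/jupyterhub/configmap.yaml`. A minimal sketch of how such a reference is composed, assuming `docker/metadata-action` keeps its default 7-character short SHA and lowercases the image name. `image_ref` is a hypothetical helper for illustration, not part of the workflow:

```sh
#!/bin/sh
# Hypothetical helper: compose the image reference the workflow is expected to push.
# Assumes a 7-character short SHA and a lowercased owner name.
image_ref() {
  owner=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  image="$2"
  short_sha=$(printf '%s' "$3" | cut -c1-7)
  printf 'ghcr.io/%s/%s:%s\n' "$owner" "$image" "$short_sha"
}

# In a checkout, "$(git rev-parse HEAD)" could be passed as the third argument.
image_ref JohnGarbutt jupyterhub-intel-gpu 6421ba1
```

This prints `ghcr.io/johngarbutt/jupyterhub-intel-gpu:6421ba1`, which matches the image used by the XPU profiles.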

20 changes: 20 additions & 0 deletions README.md
@@ -1,2 +1,22 @@
# fluxcd-demo-apps
A repository of example apps deployed and managed using Flux CD

> [!CAUTION]
> This is very much a work in progress!!

## Creating Sealed Secrets

We assume the use of sealed secrets.

TODO: add more instructions!
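
Until fuller instructions land, here is a minimal sketch of the intended flow, based on the example command in `apps/jupyterhub/example-secret.yaml`. The controller name and namespace are copied from that example and may differ on your cluster. The output file name and the `RUN_KUBESEAL` guard are hypothetical, added only so the script is safe to run without a cluster:

```sh
#!/bin/sh
# Sketch: write an unsealed Secret locally, then seal it before committing.
set -eu

workdir=$(mktemp -d)
plain="$workdir/secret.yaml"          # never commit this file
sealed="$workdir/sealed-secret.yaml"  # hypothetical output name; this is the file to commit

# Same shape as apps/jupyterhub/example-secret.yaml; replace the placeholder value.
cat > "$plain" <<'EOF'
apiVersion: v1
kind: Secret
metadata:
  name: jupyterhub-keycloak-config
  namespace: jupyterhub
stringData:
  keycloakClientSecret: <keycloak-client-secret>
EOF

# Controller name and namespace are assumptions taken from the example command.
seal_secret() {
  kubeseal \
    --format yaml \
    --controller-name sealed-secrets \
    --controller-namespace sealed-secrets-system \
    --secret-file "$1" \
    --sealed-secret-file "$2"
}

# RUN_KUBESEAL is a hypothetical opt-in switch; sealing needs cluster access.
if [ "${RUN_KUBESEAL:-0}" = "1" ]; then
  seal_secret "$plain" "$sealed"
fi
```

Commit only the sealed file; the plain `secret.yaml` should stay out of git.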

## How to install

The host cluster must have the [Flux CD](https://fluxcd.io/) controllers installed.

Configuring Flux to manage the apps defined in the repository is a one-time operation:

```sh
flux create source git myapps --url=<giturl> --branch=main
flux create kustomization myapps --source=GitRepository/myapps --prune=true
```
9 changes: 9 additions & 0 deletions apps/cert-manager/configmap.yaml
@@ -0,0 +1,9 @@
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: cert-manager-config
  namespace: cert-manager
data:
  values.yaml: |
    installCRDs: true
13 changes: 13 additions & 0 deletions apps/cert-manager/helmchart.yaml
@@ -0,0 +1,13 @@
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmChart
metadata:
  name: cert-manager
  namespace: cert-manager
spec:
  chart: cert-manager
  version: v1.16.1
  sourceRef:
    kind: HelmRepository
    name: jetstack
  interval: 1h
25 changes: 25 additions & 0 deletions apps/cert-manager/helmrelease.yaml
@@ -0,0 +1,25 @@
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: cert-manager
  namespace: cert-manager
spec:
  chartRef:
    kind: HelmChart
    name: cert-manager
  releaseName: cert-manager
  valuesFrom:
    - kind: ConfigMap
      name: cert-manager-config
      valuesKey: values.yaml
  install:
    createNamespace: true
    remediation:
      retries: -1
  upgrade:
    remediation:
      retries: -1
  driftDetection:
    mode: enabled
  interval: 5m
9 changes: 9 additions & 0 deletions apps/cert-manager/helmrepository.yaml
@@ -0,0 +1,9 @@
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: jetstack
  namespace: cert-manager
spec:
  url: https://charts.jetstack.io
  interval: 1h
6 changes: 6 additions & 0 deletions apps/cert-manager/kustomization.yaml
@@ -0,0 +1,6 @@
resources:
- namespace.yaml
- configmap.yaml
- helmrepository.yaml
- helmchart.yaml
- helmrelease.yaml
5 changes: 5 additions & 0 deletions apps/cert-manager/namespace.yaml
@@ -0,0 +1,5 @@
---
apiVersion: v1
kind: Namespace
metadata:
  name: cert-manager
137 changes: 137 additions & 0 deletions apps/jupyterhub/configmap.yaml
@@ -0,0 +1,137 @@
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: jupyterhub-config
  namespace: jupyterhub
data:
  values.yaml: |

    # Add JupyterHub customisations here
    # See https://artifacthub.io/packages/helm/jupyterhub/jupyterhub

    # We don't need a load balancer for the proxy
    # since we want to use ingress instead.
    #
    # To access manually try:
    # kubectl port-forward -n jupyterhub svc/proxy-public 8080:80
    proxy:
      service:
        type: ClusterIP

    # Make JupyterHub accessible via ingress
    ingress:
      enabled: false
      ingressClassName: nginx
      annotations:
        cert-manager.io/cluster-issuer: letsencrypt-prod
      hosts:
        # IP must match NGINX ingress controller's
        # load balancer IP.
        # See `kubectl get svc -n ingress-nginx`
        - &host jh.dawntest.128-232-224-75.nip.io
      pathSuffix: ""
      tls:
        - hosts:
            - *host
          secretName: jupyterhub-ingress-cert

    hub:
      allowNamedServers: true
      namedServerLimitPerUser: 3
      activeServerLimit: 5
      # Server startup fails with default
      # restrictive network policy.
      networkPolicy:
        enabled: false

      # # Configure Keycloak auth
      # config:
      #   JupyterHub:
      #     authenticator_class: generic-oauth
      #   GenericOAuthenticator:
      #     client_id: scott-jupyterhub-test
      #     # client_secret: <stored-in-sealed-secret>
      #     # Must match ingress host
      #     oauth_callback_url: https://128-232-226-29.sslip.io/hub/oauth_callback
      #     authorize_url: https://identity.apps.hpc.cam.ac.uk/realms/az-rcp-cloud-portal-demo/protocol/openid-connect/auth
      #     token_url: https://identity.apps.hpc.cam.ac.uk/realms/az-rcp-cloud-portal-demo/protocol/openid-connect/token
      #     userdata_url: https://identity.apps.hpc.cam.ac.uk/realms/az-rcp-cloud-portal-demo/protocol/openid-connect/userinfo
      #     scope:
      #       - openid
      #       - groups
      #     username_claim: preferred_username
      #     claim_groups_key: groups
      #     userdata_params:
      #       state: state

      #     # Limit access to specific keycloak groups
      #     allowed_groups:
      #       - /admins
      #       - /platform-users

      #     # Allow hub admin access to keycloak users/groups
      #     # admin_groups:
      #     #   - /admins
      #     admin_users:
      #       - scottd_stack

      #     # Label for the 'Sign in with ___' button
      #     login_service: Keycloak

    # turn this off for now
    prePuller:
      hook:
        enabled: false
      continuous:
        enabled: false

    singleuser:
      image:
        name: quay.io/jupyter/minimal-notebook
        tag: "2025-01-28"
      cloudMetadata:
        blockWithIptables: false
      profileList:
        - display_name: "Minimal environment"
          description: "To avoid too many bells and whistles: Python."
          default: true
        - display_name: "Datascience environment"
          description: "If you want the additional bells and whistles: Python, R, and Julia."
          kubespawner_override:
            image: quay.io/jupyter/datascience-notebook:2025-01-28
        - display_name: "Pytorch environment with 1 x Intel XPUs"
          description: "Pytorch Jupyter Stacks image!"
          kubespawner_override:
            #image: quay.io/jupyter/pytorch-notebook:2025-01-28
            #image: ghcr.io/stackhpc/jupyterhub-pytorch-intel-gpu:v0.0.1
            image: ghcr.io/johngarbutt/jupyterhub-intel-gpu:6421ba1
            extra_resource_limits:
              "gpu.intel.com/i915": "1"
              # "nvidia.com/hostdev": "1"
            supplemental_gids:
              - "110" # Ubuntu render group GID, required for permission to use Intel GPU device
            # privileged: false
            # container_security_context:
            #   allowPrivilegeEscalation: false
            #   capabilities:
            #     drop:
            #       - ALL
        - display_name: "Pytorch environment with 2 x Intel XPUs"
          description: "Pytorch Jupyter Stacks image!"
          kubespawner_override:
            image: ghcr.io/johngarbutt/jupyterhub-intel-gpu:6421ba1
            extra_resource_limits:
              "gpu.intel.com/i915": "2"
              "nvidia.com/hostdev": "2"
            supplemental_gids:
              - "110" # Ubuntu render group GID, required for permission to use Intel GPU device
        - display_name: "Pytorch environment with 4 x Intel XPUs"
          description: "Pytorch Jupyter Stacks image!"
          kubespawner_override:
            image: ghcr.io/johngarbutt/jupyterhub-intel-gpu:6421ba1
            extra_resource_limits:
              "gpu.intel.com/i915": "4"
              "nvidia.com/hostdev": "4"
            supplemental_gids:
              - "110" # Ubuntu render group GID, required for permission to use Intel GPU device
19 changes: 19 additions & 0 deletions apps/jupyterhub/example-secret.yaml
@@ -0,0 +1,19 @@
---
############
# IMPORTANT: Make sure you run kubeseal against any secret before committing it to git!
# Example command:
#   kubeseal \
#     --kubeconfig clusters/jupyterhub/kubeconfig \
#     --format yaml \
#     --controller-name sealed-secrets \
#     --controller-namespace sealed-secrets-system \
#     --secret-file components/jupyterhub/secret.yaml \
#     --sealed-secret-file components/jupyterhub/secret.yaml
############
apiVersion: v1
kind: Secret
metadata:
  name: jupyterhub-keycloak-config
  namespace: jupyterhub
stringData:
  keycloakClientSecret: <keycloak-client-secret>
25 changes: 25 additions & 0 deletions apps/jupyterhub/extra-rbac.yaml
@@ -0,0 +1,25 @@
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: jupyterhub-node-list
rules:
  - apiGroups:
      - ""
    resources:
      - nodes
    verbs:
      - list
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: jupyterhub-node-list
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: jupyterhub-node-list
subjects:
  - kind: ServiceAccount
    name: hub
    namespace: jupyterhub
13 changes: 13 additions & 0 deletions apps/jupyterhub/helmchart.yaml
@@ -0,0 +1,13 @@
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmChart
metadata:
  name: jupyterhub
  namespace: jupyterhub
spec:
  chart: jupyterhub
  version: "4.1.0"
  sourceRef:
    kind: HelmRepository
    name: jupyterhub
  interval: 10m0s
28 changes: 28 additions & 0 deletions apps/jupyterhub/helmrelease.yaml
@@ -0,0 +1,28 @@
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: jupyterhub
  namespace: jupyterhub
spec:
  chartRef:
    kind: HelmChart
    name: jupyterhub
  releaseName: jupyterhub
  valuesFrom:
    - kind: ConfigMap
      name: jupyterhub-config
    # - kind: Secret
    #   name: jupyterhub-keycloak-config
    #   valuesKey: keycloakClientSecret
    #   targetPath: hub.config.GenericOAuthenticator.client_secret
  install:
    createNamespace: true
    remediation:
      retries: 3
  upgrade:
    remediation:
      retries: 3
  driftDetection:
    mode: enabled
  interval: 5m
9 changes: 9 additions & 0 deletions apps/jupyterhub/helmrepository.yaml
@@ -0,0 +1,9 @@
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: jupyterhub
  namespace: jupyterhub
spec:
  url: https://jupyterhub.github.io/helm-chart
  interval: 1h
9 changes: 9 additions & 0 deletions apps/jupyterhub/kustomization.yaml
@@ -0,0 +1,9 @@
resources:
- namespace.yaml
- helmrepository.yaml
- helmchart.yaml
- helmrelease.yaml
- configmap.yaml
# - secret.yaml
# TODO - restore the auto profile detection
# - extra-rbac.yaml