Commit 11f5cce

edmundmiller and claude authored
Update Seqerakit Megatests (#157)
* docs: overview

* chore: Add snapshots

* build: Fix from_op

* fix: Use fusionSnapshots field for snapshots support

- Update README.md to use correct fusionSnapshots field name instead of snapshots
- All three compute environments now use fusionSnapshots: true in JSON configs
- Verified ARM environment deployment with proper snapshots enablement
- Clean configuration without embedding snapshots in nextflowConfig

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* chore: Clean up seqerakit directory structure

- Remove legacy compute-envs/ directory with old YAML configurations
- Update README.md to reflect current production environment structure
- Remove temporary verification files from testing
- Update environment IDs to current deployed instances
- Simplify documentation to focus on active configurations

The directory now contains only production-ready files:
- Three current YAML configs: *_current.yml
- Three JSON configurations: current-env-*.json
- Updated documentation

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* feat: Add GitHub secrets automation with 1Password and seqerakit integration

- Add Pulumi GitHub provider to create org-level secrets automatically
- Integrate 1Password CLI to retrieve tokens securely
- Use Tower CLI to dynamically extract compute environment IDs from deployed seqerakit configs
- Create GitHub secrets: TOWER_ACCESS_TOKEN, TOWER_WORKSPACE_ID, TOWER_COMPUTE_ENV_*
- Add pulumi-github and pulumi-command dependencies

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
1 parent 583747c commit 11f5cce

12 files changed, +333 -94 lines changed

CLAUDE.md

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+## Nextflow Best Practices
+
+- Do NOT embed the configuration in nextflowConfig instead of using the snapshots field in seqerakit

pulumi/AWSMegatests/__main__.py

Lines changed: 125 additions & 0 deletions
@@ -1,10 +1,135 @@
 """An AWS Python Pulumi program"""
 
 import pulumi
+import pulumi_github as github
+import pulumi_command as command
 from pulumi_aws import s3
 
 # Create an AWS resource (S3 Bucket)
 bucket = s3.Bucket("my-bucket")
 
 # Export the name of the bucket
 pulumi.export("bucket_name", bucket.id)  # type: ignore[attr-defined]
+
+
+# Get secrets from 1Password using the CLI
+def get_1password_secret(secret_ref: str) -> str:
+    """Get a secret from 1Password using the CLI"""
+    get_secret_cmd = command.local.Command(
+        f"get-1password-{secret_ref.replace('/', '-').replace(' ', '-')}",
+        create=f"op read '{secret_ref}'",
+        opts=pulumi.ResourceOptions(additional_secret_outputs=["stdout"]),
+    )
+    return get_secret_cmd.stdout
+
+
+# Get secrets from 1Password
+tower_access_token = get_1password_secret(
+    "op://Employee/Seqera Platform Token/credential"
+)
+github_token = get_1password_secret("op://Employee/Github Token nf-core/credential")
+
+# Get workspace ID from Tower CLI
+workspace_cmd = command.local.Command(
+    "get-workspace-id",
+    create="tw -o nf-core workspaces list --format json | jq -r '.[] | select(.name==\"AWSmegatests\") | .id'",
+    environment={
+        "TOWER_ACCESS_TOKEN": tower_access_token,
+        "ORGANIZATION_NAME": "nf-core",
+    },
+    opts=pulumi.ResourceOptions(additional_secret_outputs=["stdout"]),
+)
+workspace_id = workspace_cmd.stdout
+
+
+# Get compute environment IDs using Tower CLI
+def get_compute_env_id(env_name: str, display_name: str) -> str:
+    """Get compute environment ID by name"""
+    get_env_cmd = command.local.Command(
+        f"get-compute-env-{env_name}",
+        create=f"tw -o nf-core -w AWSmegatests compute-envs list --format json | jq -r '.[] | select(.name==\"{display_name}\") | .id'",
+        environment={
+            "TOWER_ACCESS_TOKEN": tower_access_token,
+            "ORGANIZATION_NAME": "nf-core",
+            "WORKSPACE_NAME": "AWSmegatests",
+        },
+        opts=pulumi.ResourceOptions(additional_secret_outputs=["stdout"]),
+    )
+    return get_env_cmd.stdout
+
+
+# Get compute environment IDs for each environment
+cpu_compute_env_id = get_compute_env_id("cpu", "aws_ireland_fusionv2_nvme_cpu")
+gpu_compute_env_id = get_compute_env_id(
+    "gpu", "aws_ireland_fusionv2_nvme_gpu_snapshots"
+)
+arm_compute_env_id = get_compute_env_id(
+    "arm", "aws_ireland_fusionv2_nvme_cpu_ARM_snapshots"
+)
+
+# Create GitHub provider
+github_provider = github.Provider("github", token=github_token)
+
+# Create org-level GitHub secrets for compute environment IDs
+cpu_secret = github.ActionsOrganizationSecret(
+    "tower-compute-env-cpu",
+    visibility="private",
+    secret_name="TOWER_COMPUTE_ENV_CPU",
+    plaintext_value=cpu_compute_env_id,
+    opts=pulumi.ResourceOptions(provider=github_provider),
+)
+
+gpu_secret = github.ActionsOrganizationSecret(
+    "tower-compute-env-gpu",
+    visibility="private",
+    secret_name="TOWER_COMPUTE_ENV_GPU",
+    plaintext_value=gpu_compute_env_id,
+    opts=pulumi.ResourceOptions(provider=github_provider),
+)
+
+arm_secret = github.ActionsOrganizationSecret(
+    "tower-compute-env-arm",
+    visibility="private",
+    secret_name="TOWER_COMPUTE_ENV_ARM",
+    plaintext_value=arm_compute_env_id,
+    opts=pulumi.ResourceOptions(provider=github_provider),
+)
+
+# Create org-level GitHub secret for Seqera Platform API token
+seqera_token_secret = github.ActionsOrganizationSecret(
+    "tower-access-token",
+    visibility="private",
+    secret_name="TOWER_ACCESS_TOKEN",
+    plaintext_value=tower_access_token,
+    opts=pulumi.ResourceOptions(provider=github_provider),
+)
+
+# Create org-level GitHub secret for workspace ID
+workspace_id_secret = github.ActionsOrganizationSecret(
+    "tower-workspace-id",
+    visibility="private",
+    secret_name="TOWER_WORKSPACE_ID",
+    plaintext_value=workspace_id,
+    opts=pulumi.ResourceOptions(provider=github_provider),
+)
+
+# Export the created secrets
+pulumi.export(
+    "github_secrets",
+    {
+        "compute_env_cpu": cpu_secret.secret_name,
+        "compute_env_gpu": gpu_secret.secret_name,
+        "compute_env_arm": arm_secret.secret_name,
+        "tower_access_token": seqera_token_secret.secret_name,
+        "tower_workspace_id": workspace_id_secret.secret_name,
+    },
+)
+
+# Export compute environment IDs for reference
+pulumi.export(
+    "compute_env_ids",
+    {"cpu": cpu_compute_env_id, "gpu": gpu_compute_env_id, "arm": arm_compute_env_id},
+)
+
+# Export workspace ID for reference
+pulumi.export("workspace_id", workspace_id)
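
Review note: `get_1password_secret` interpolates the secret reference directly into both a shell command and a Pulumi resource name. The following is a minimal sketch of building the two strings more defensively with the standard library only. `op_read_command` and `resource_name_for` are hypothetical helper names introduced here for illustration; they are not part of this commit.

```python
import re
import shlex


def op_read_command(secret_ref: str) -> str:
    """Build the `op read` shell command with the reference shell-quoted."""
    return f"op read {shlex.quote(secret_ref)}"


def resource_name_for(secret_ref: str) -> str:
    """Derive a resource name made only of letters, digits and hyphens."""
    # Collapse every run of other characters (':', '/', spaces) into one hyphen.
    slug = re.sub(r"[^A-Za-z0-9]+", "-", secret_ref).strip("-")
    return f"get-1password-{slug}"


ref = "op://Employee/Seqera Platform Token/credential"
print(op_read_command(ref))
print(resource_name_for(ref))
```

Quoting with `shlex.quote` keeps the command intact even if a vault or item name ever contains a single quote, which the hand-written `'...'` wrapping would not.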

pulumi/AWSMegatests/pyproject.toml

Lines changed: 2 additions & 0 deletions
@@ -7,4 +7,6 @@ requires-python = ">=3.12"
 dependencies = [
     "pulumi>=3.173.0,<4.0.0",
     "pulumi-aws>=6.81.0,<7.0.0",
+    "pulumi-github>=6.4.0,<7.0.0",
+    "pulumi-command>=1.0.1,<2.0.0",
 ]

pulumi/AWSMegatests/seqerakit/.envrc

Lines changed: 4 additions & 6 deletions
@@ -5,16 +5,14 @@ source_url "https://github.com/tmatilai/direnv-1password/raw/v1.0.1/1password.sh
 "sha256-4dmKkmlPBNXimznxeehplDfiV+CvJiIzg7H1Pik4oqY="
 
 # Load secrets from 1Password
-from_op <<OP
-TOWER_ACCESS_TOKEN="op://Dev/Tower nf-core Access Token/password"
-AWS_ACCESS_KEY_ID="op://Dev/AWS Tower Test Credentials/access key id"
-AWS_SECRET_ACCESS_KEY="op://Dev/AWS Tower Test Credentials/secret access key"
-OP
+from_op TOWER_ACCESS_TOKEN="op://Employee/Seqera Platform Token/credential"
+from_op AWS_ACCESS_KEY_ID="op://Dev/AWS Tower Test Credentials/access key id"
+from_op AWS_SECRET_ACCESS_KEY="op://Dev/AWS Tower Test Credentials/secret access key"
 
 # Static configuration variables
 export ORGANIZATION_NAME="nf-core"
 export WORKSPACE_NAME="AWSmegatests"
 export AWS_CREDENTIALS_NAME="tower-awstest"
 export AWS_REGION="eu-west-1"
 export AWS_WORK_DIR="s3://nf-core-awsmegatests"
-export AWS_COMPUTE_ENV_ALLOWED_BUCKETS="s3://ngi-igenomes,s3://annotation-cache"
+export AWS_COMPUTE_ENV_ALLOWED_BUCKETS="s3://ngi-igenomes,s3://annotation-cache"
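
Review note: `AWS_COMPUTE_ENV_ALLOWED_BUCKETS` packs several S3 URIs into one comma-separated string. As a rough sketch of how a consumer might split and sanity-check that value (the function name `parse_allowed_buckets` is an assumption for illustration, not something this repository defines):

```python
def parse_allowed_buckets(value: str) -> list[str]:
    """Split a comma-separated bucket list and check each entry is an S3 URI."""
    buckets = [item.strip() for item in value.split(",") if item.strip()]
    for bucket in buckets:
        if not bucket.startswith("s3://"):
            raise ValueError(f"not an S3 URI: {bucket!r}")
    return buckets


# Value copied from the .envrc above.
print(parse_allowed_buckets("s3://ngi-igenomes,s3://annotation-cache"))
```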
Lines changed: 184 additions & 13 deletions
@@ -1,38 +1,209 @@
-# megatest-seqerakit
+# nf-core megatest seqerakit
 
-Contains the seqerakit scripts used to stand up the nf-core megatest workspace
+Contains the seqerakit configurations for the three core compute environments used in nf-core megatests on the Seqera Platform.
 
 ## Quick Start
 
 1. Install seqerakit: `pip install seqerakit`
 2. Install direnv: `brew install direnv`
 3. Allow environment loading: `direnv allow`
-4. Deploy compute environments: `seqerakit aws_ireland_fusionv2_nvme_*_current.yml`
+4. Deploy compute environments:
+```bash
+seqerakit aws_ireland_fusionv2_nvme_cpu_current.yml
+seqerakit aws_ireland_fusionv2_nvme_cpu_arm_current.yml
+seqerakit aws_ireland_fusionv2_nvme_gpu_current.yml
+```
+
+## Architecture
+
+### Three Core Compute Environments
+
+This repository manages **three compute environments** on AWS Batch, all with fusion snapshots enabled:
+
+#### 1. **CPU Environment** (`aws_ireland_fusionv2_nvme_cpu`)
+
+- **Instance Types**: `c6id`, `m6id`, `r6id` (Intel x86_64 with NVMe storage)
+- **Features**: Fusion v2, Wave, NVMe storage, **snapshots enabled**
+- **Provisioning**: SPOT instances
+- **Max CPUs**: 500
+- **Use Case**: Standard CPU-intensive workflows
+
+#### 2. **ARM Environment** (`aws_ireland_fusionv2_nvme_cpu_ARM_snapshots`)
+
+- **Instance Types**: `m6gd`, `r6gd`, `c6gd` (ARM Graviton with NVMe storage)
+- **Features**: Fusion v2, Wave, NVMe storage, **snapshots enabled**
+- **Provisioning**: SPOT instances
+- **Max CPUs**: 1000
+- **Use Case**: ARM-optimized workflows and cost optimization
+
+#### 3. **GPU Environment** (`aws_ireland_fusionv2_nvme_gpu_snapshots`)
+
+- **Instance Types**: `g4dn`, `g5` (GPU) + `c6id`, `m6id`, `r6id` (CPU fallback)
+- **Features**: GPU enabled, Fusion v2, Wave, NVMe storage, **snapshots enabled**
+- **Provisioning**: SPOT instances
+- **Max CPUs**: 500
+- **Use Case**: GPU-accelerated workflows (ML, bioinformatics tools)
+
+### Common Configuration
+
+All environments share these settings:
+
+- **Type**: aws-batch
+- **Region**: eu-west-1 (Ireland)
+- **Provisioning**: SPOT instances
+- **Wave**: Enabled for container optimization
+- **Fusion v2**: Enabled for high-performance I/O
+- **NVMe Storage**: Enabled for fast local storage
+- **Snapshots**: **Enabled** for all environments
+- **Wait State**: AVAILABLE
+- **Overwrite**: Enabled
+
+## Snapshots Configuration
+
+All three environments have fusion snapshots enabled using the seqerakit `fusionSnapshots` field:
+
+```json
+{
+  "fusionSnapshots": true,
+  "fusion2Enabled": true,
+  "waveEnabled": true,
+  "nvnmeStorageEnabled": true,
+  "nextflowConfig": "aws.batch.maxSpotAttempts=5\nprocess {\n maxRetries = 2\n errorStrategy = { task.exitStatus in ((130..145) + 104 + 175) ? 'retry' : 'terminate' }\n}\n"
+}
+```
+
+This approach keeps snapshots configuration separate from Nextflow configuration, making it cleaner and more maintainable.
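
Review note: the convention described in this section (snapshots enabled through `fusionSnapshots`, not through `nextflowConfig`) is easy to check mechanically. Below is a minimal sketch of such a check. The function name `check_snapshot_config` and the exact substring test are assumptions for illustration; the two sample configs are abbreviated from the new JSON above and from the removed "Resolved Issues" snippet.

```python
import json


def check_snapshot_config(config: dict) -> list[str]:
    """Return a list of problems; an empty list means the convention is met."""
    problems = []
    if config.get("fusionSnapshots") is not True:
        problems.append("fusionSnapshots is not set to true")
    # The old approach embedded the setting inside the Nextflow config string.
    if "fusion.snapshots" in (config.get("nextflowConfig") or ""):
        problems.append("nextflowConfig embeds fusion.snapshots")
    return problems


good = json.loads(
    '{"fusionSnapshots": true, "fusion2Enabled": true, '
    '"nextflowConfig": "aws.batch.maxSpotAttempts=5\\n"}'
)
legacy = {"nextflowConfig": "fusion.enabled = true\nfusion.snapshots = true\n"}

print(check_snapshot_config(good))
print(check_snapshot_config(legacy))
```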
+
+## Seqerakit Deployment
+
+### Why Seqerakit?
+
+We use seqerakit for Infrastructure as Code management of compute environments because:
+
+- **Native snapshots support**: Supports the `fusionSnapshots` field directly
+- **Clean configuration**: No need to embed snapshots in `nextflowConfig`
+- **GitOps workflow**: Infrastructure managed through version control
+- **Validation**: Built-in `--dryrun` support for testing configurations
+
+### Deployment Commands
+
+```bash
+# Deploy individual environments
+seqerakit aws_ireland_fusionv2_nvme_cpu_current.yml
+seqerakit aws_ireland_fusionv2_nvme_cpu_arm_current.yml
+seqerakit aws_ireland_fusionv2_nvme_gpu_current.yml
+
+# Validate configurations (dry run)
+seqerakit aws_ireland_fusionv2_nvme_cpu_current.yml --dryrun
+seqerakit aws_ireland_fusionv2_nvme_cpu_arm_current.yml --dryrun
+seqerakit aws_ireland_fusionv2_nvme_gpu_current.yml --dryrun
+
+# Delete environments
+seqerakit aws_ireland_fusionv2_nvme_cpu_current.yml --delete
+seqerakit aws_ireland_fusionv2_nvme_cpu_arm_current.yml --delete
+seqerakit aws_ireland_fusionv2_nvme_gpu_current.yml --delete
+```
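
Review note: the three dry-run commands differ only in the file name, so they could be driven from a short script. The sketch below assumes `seqerakit` is on `PATH` and uses only the `--dryrun` flag shown above; `build_commands` and `run_all` are hypothetical names, and the actual subprocess call is left commented out so the block is safe to run anywhere.

```python
import subprocess

CONFIG_FILES = [
    "aws_ireland_fusionv2_nvme_cpu_current.yml",
    "aws_ireland_fusionv2_nvme_cpu_arm_current.yml",
    "aws_ireland_fusionv2_nvme_gpu_current.yml",
]


def build_commands(files, dry_run=True):
    """Return one argv list per config file, optionally with --dryrun."""
    extra = ["--dryrun"] if dry_run else []
    return [["seqerakit", path, *extra] for path in files]


def run_all(files, dry_run=True):
    """Run each command in turn and stop at the first failure."""
    for cmd in build_commands(files, dry_run):
        subprocess.run(cmd, check=True)


# Show what would be executed; call run_all(CONFIG_FILES) to actually run it.
for cmd in build_commands(CONFIG_FILES):
    print(" ".join(cmd))
```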
 
 ## GitOps Workflow
 
-This repository now implements GitOps with GitHub Actions:
+This repository implements GitOps with GitHub Actions:
 
 - **Pull Requests**: Automatically validate configurations with `--dryrun`
 - **Main Branch**: Automatically deploy infrastructure changes
 - **1Password Integration**: Secure credential management with `.envrc`
 
 ## Infrastructure Files
 
-- **Current Production**: `*_current.yml` files reference exported JSON configurations
-- **Legacy Templates**: `compute-envs/*.yml` files for reference
-- **Exported Configs**: `current-env-*.json` files from Tower CLI export
+### Current Production Files
 
-## Resolved Issues
+- `aws_ireland_fusionv2_nvme_cpu_current.yml` → `current-env-cpu.json`
+- `aws_ireland_fusionv2_nvme_cpu_arm_current.yml` → `current-env-cpu-arm.json`
+- `aws_ireland_fusionv2_nvme_gpu_current.yml` → `current-env-gpu.json`
 
-**Snapshots with seqerakit**: Implemented in ARM environment (`current-env-cpu-arm.json`) with:
+### Configuration Structure
 
-```json
-"nextflowConfig": "fusion.enabled = true\nfusion.snapshots = true\nfusion.containerConfigUrl = '...'"
+Each YAML file references an exported JSON configuration:
+
+```yaml
+compute-envs:
+  - name: "environment_name"
+    workspace: "$ORGANIZATION_NAME/$WORKSPACE_NAME"
+    credentials: "$AWS_CREDENTIALS_NAME"
+    wait: "AVAILABLE"
+    file-path: "./current-env-[type].json"
+    overwrite: True
 ```
 
-**GPU-enabled compute environments**: Implemented in GPU environment (`current-env-gpu.json`) with:
+### JSON Configuration Structure
+
+Each JSON file contains the complete compute environment configuration:
 
 ```json
-"forge": { "gpuEnabled": true, "instanceTypes": ["g4dn", "g5", "c6id", "m6id", "r6id"] }
+{
+  "discriminator": "aws-batch",
+  "region": "eu-west-1",
+  "executionRole": "arn:aws:iam::...:role/TowerForge-...-ExecutionRole",
+  "headJobRole": "arn:aws:iam::...:role/TowerForge-...-FargateRole",
+  "workDir": "s3://nf-core-awsmegatests",
+  "headJobCpus": 4,
+  "headJobMemoryMb": 16384,
+  "waveEnabled": true,
+  "fusion2Enabled": true,
+  "nvnmeStorageEnabled": true,
+  "fusionSnapshots": true,
+  "nextflowConfig": "aws.batch.maxSpotAttempts=5\nprocess {\n maxRetries = 2\n errorStrategy = { task.exitStatus in ((130..145) + 104 + 175) ? 'retry' : 'terminate' }\n}\n",
+  "forge": {
+    "type": "SPOT",
+    "minCpus": 0,
+    "maxCpus": 500,
+    "gpuEnabled": false,
+    "instanceTypes": ["c6id", "m6id", "r6id"],
+    "allowBuckets": [
+      "s3://ngi-igenomes",
+      "s3://nf-core-awsmegatests",
+      "s3://annotation-cache/"
+    ],
+    "fargateHeadEnabled": true
+  }
+}
 ```
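
Review note: the `errorStrategy` closure embedded in `nextflowConfig` is dense. Assuming standard Groovy semantics (an inclusive range `130..145` with `104` and `175` appended as extra list elements), the decision it encodes can be sketched in Python as follows. The names `RETRY_EXIT_CODES` and `error_strategy` are illustrative only.

```python
# Exit codes that trigger a retry: 130 through 145 inclusive, plus 104 and 175.
RETRY_EXIT_CODES = set(range(130, 146)) | {104, 175}


def error_strategy(exit_status: int) -> str:
    """Mirror the closure: retry on the listed exit codes, otherwise terminate."""
    return "retry" if exit_status in RETRY_EXIT_CODES else "terminate"


for status in (1, 104, 130, 145, 146, 175):
    print(status, error_strategy(status))
```

In the real configuration the number of retries is additionally capped by `maxRetries = 2`, which this sketch does not model.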
+
+## Environment Variables
+
+The `.envrc` file defines key configuration variables:
+
+```bash
+export ORGANIZATION_NAME="nf-core"
+export WORKSPACE_NAME="AWSmegatests"
+export AWS_CREDENTIALS_NAME="tower-awstest"
+export AWS_REGION="eu-west-1"
+export AWS_WORK_DIR="s3://nf-core-awsmegatests"
+export AWS_COMPUTE_ENV_ALLOWED_BUCKETS="s3://ngi-igenomes,s3://annotation-cache"
+```
+
+## Current Environment Status
+
+All three compute environments are successfully deployed with fusion snapshots enabled:
+
+**CPU Environment**: Standard x86_64 instances with fusion snapshots
+**ARM Environment**: ARM Graviton instances with fusion snapshots
+**GPU Environment**: GPU + CPU instances with fusion snapshots
+
+## Environment IDs
+
+For reference, the current environment IDs are:
+
+- CPU: `53ljSqphNKjm6jjmuB6T9b` → `aws_ireland_fusionv2_nvme_cpu`
+- ARM: `7eC1zALvNGIaFXbybVohP1` → `aws_ireland_fusionv2_nvme_cpu_ARM_snapshots`
+- GPU: `2SRyFNKtLVAJCxMhcZRMfx` → `aws_ireland_fusionv2_nvme_gpu_snapshots`
+
+## Technical Details
+
+- **Cloud Provider**: AWS
+- **Region**: eu-west-1 (Ireland)
+- **Compute Backend**: AWS Batch
+- **Container Technology**: Docker with Wave optimization
+- **Storage**: S3 for work directory, NVMe for fast local storage
+- **Networking**: Managed by Seqera Platform forge mode
+- **Cost Optimization**: SPOT instances for all environments
+- **Snapshots**: Enabled for optimized container layer caching using seqerakit's native `fusionSnapshots` field
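
Review note: the README hard-codes the environment IDs, while the Pulumi program in this commit looks them up by name with a `jq` filter of the form `.[] | select(.name=="...") | .id`. A rough Python equivalent of that filter is sketched below. It assumes the listing is a top-level JSON array of objects with `name` and `id` keys, which is what the filter implies but is not confirmed here against the CLI's actual output; `find_id_by_name` is a hypothetical helper, and the sample data reuses the IDs listed in the README.

```python
import json
from typing import Optional


def find_id_by_name(listing_json: str, name: str) -> Optional[str]:
    """Return the id of the first entry whose name matches, or None."""
    for item in json.loads(listing_json):
        if item.get("name") == name:
            return item.get("id")
    return None


sample_listing = json.dumps(
    [
        {"id": "53ljSqphNKjm6jjmuB6T9b", "name": "aws_ireland_fusionv2_nvme_cpu"},
        {"id": "7eC1zALvNGIaFXbybVohP1", "name": "aws_ireland_fusionv2_nvme_cpu_ARM_snapshots"},
        {"id": "2SRyFNKtLVAJCxMhcZRMfx", "name": "aws_ireland_fusionv2_nvme_gpu_snapshots"},
    ]
)

print(find_id_by_name(sample_listing, "aws_ireland_fusionv2_nvme_gpu_snapshots"))
```

Unlike the `jq` filter, which prints every match, this sketch returns only the first one and makes a missing name explicit by returning `None`.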