Commit 3d14d3b

add ollama example
1 parent a4920f6 commit 3d14d3b

3 files changed: 154 additions, 49 deletions

.env

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+# Add any desired environment variables here

README.md

Lines changed: 140 additions & 49 deletions
@@ -1,85 +1,176 @@
1-
# A CHANGEME-FRAMEWORK App Running On AWS Lambda
1+
# Ollama Running On AWS Lambda
22

3-
![GitHub Actions Workflow Status](https://img.shields.io/github/actions/workflow/status/scaffoldly/scaffoldly-examples/scaffoldly.yml?branch=CHANGEME-BRANCHNAME&link=https%3A%2F%2Fgithub.com%2Fscaffoldly%2Fscaffoldly-examples%2Factions)
3+
![GitHub Actions Workflow Status](https://img.shields.io/github/actions/workflow/status/scaffoldly/scaffoldly-examples/scaffoldly.yml?branch=ollama&link=https%3A%2F%2Fgithub.com%2Fscaffoldly%2Fscaffoldly-examples%2Factions)
44

5-
## ✨ Quickstart
5+
## 🚀 Working Example
66

7-
Run the following command to create your own copy of this application:
7+
> [!NOTE]
8+
>
9+
> - AWS Lambda runs on **CPUs** only (no GPUs), so `generate` / `chat` requests are slow.
10+
> - The first deployment takes **~5m** while the container is built and models are cached; subsequent deployments take **~1m**.
11+
> - The first request **takes ~20s** while the model loads; subsequent requests take **~5-20s**.
12+
> - While this is **not production grade**, it is **a cost-effective way** to serve models.
813
914
```bash
10-
npx scaffoldly create app --template CHANGEME-BRANCHNAME
15+
curl https://wm4s6cxkwua4ncx3skpdtdx27a0qzbnd.lambda-url.us-east-1.on.aws/api/generate -d '{
16+
"model": "llama3.2:1b",
17+
"prompt":"Why is the sky blue?"
18+
}'
1119
```
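
The same request can be made from a script. The sketch below assumes Ollama's default behavior of streaming newline-delimited JSON objects from `/api/generate`, each carrying a `response` fragment and a `done` flag. `FUNCTION_URL` is a placeholder for your own deployment, and the helper names `join_stream` and `generate` are illustrative, not part of Scaffoldly or Ollama.

```python
import json
import urllib.request
from typing import Iterable

# Placeholder: replace with the Function URL printed by `npx scaffoldly deploy`.
FUNCTION_URL = "https://example.lambda-url.us-east-1.on.aws"


def join_stream(lines: Iterable[bytes]) -> str:
    """Concatenate the `response` fragments of a newline-delimited JSON stream."""
    parts = []
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines between chunks
        chunk = json.loads(raw)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break  # the final chunk is marked with "done": true
    return "".join(parts)


def generate(prompt: str, model: str = "llama3.2:1b", base_url: str = FUNCTION_URL) -> str:
    """POST a prompt to /api/generate and return the full completion text."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Generous timeout: the first request may take ~20s while the model loads.
    with urllib.request.urlopen(request, timeout=120) as response:
        return join_stream(response)
```

Calling `generate("Why is the sky blue?")` against your own deployment should return the completion as a single string.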
1220

13-
## Manual Setup
21+
- 🙏 Please, please, please don't abuse this endpoint, **Scaffoldly is Open Source** (a.k.a. cash strapped 🤣) and we're hosting it for demonstration purposes only!
22+
- Please [consider donating](https://github.com/sponsors/scaffoldly) if you like what Scaffoldly is doing!
23+
- Check out our [other examples](https://github.com/scaffoldly/scaffoldly-examples)
24+
- Give our [Tooling](https://github.com/scaffoldly/scaffoldly) and [Examples](https://github.com/scaffoldly/scaffoldly-examples) repositories a ⭐️ if you like what you see!
1425

15-
This application was generated with the following command:
26+
## ✨ Host Your Own!
27+
28+
> [!TIP]
29+
> To use a different model than [`llama3.2:1b`](https://ollama.com/library/llama3.2:1b), update [`scaffoldly.json`](./scaffoldly.json) with the desired model(s).
30+
31+
1. Run the following command to create your own copy of this application:
1632

1733
```bash
18-
CHANGEME-CREATECOMMAND
34+
npx scaffoldly create app --template ollama
1935
```
2036

21-
✨ No modifications or SDKs were made or added to the code to "make it work" in AWS Lambda.
37+
2. Create an [EFS Filesystem in AWS](https://console.aws.amazon.com/efs/home) and give it a `Name` of `.cache` (to match [`scaffoldly.json`](scaffoldly.json)).
2238

23-
Check out our other [examples](https://github.com/scaffoldly/scaffoldly-examples) and Learn more at [scaffoldly.dev](https://scaffoldly.dev)!
39+
3. Finally, deploy:
2440

25-
### Working example
41+
```bash
42+
cd my-app
43+
npx scaffoldly deploy
44+
```
2645

27-
[CHANGEME-URL](CHANGEME-URL)
46+
You will see output that looks like:
2847

29-
## First, Scaffoldly Config was added...
48+
```
49+
🟠 App framework not detected. Using `scaffoldly.json` for configuration.
50+
51+
✅ Updated Identity: arn:aws:sts::123456789012:assumed-role/aws-examples@scaffold.ly/cnuss
52+
✅ Updated ECR Repository: 123456789012.dkr.ecr.us-east-1.amazonaws.com/ollama
53+
✅ Updated Local Image Digest: sha256:f7ee27705d66c64a250982d6ee8282d5338a4989ae95c5ac4453a15c264efc97
54+
✅ Updated Secret: arn:aws:secretsmanager:us-east-1:123456789012:secret:ollama@ollama-yaVNCp
55+
✅ Updated EFS Access Point: arn:aws:elasticfilesystem:us-east-1:123456789012:access-point/fsap-0b0e5506324efd541
56+
✅ Updated IAM Role: ollama-0447aaae
57+
✅ Updated IAM Role Policy: ollama
58+
✅ Updated Lambda Function: ollama
59+
✅ Updated Function URL: https://wm4s6cxkwua4ncx3skpdtdx27a0qzbnd.lambda-url.us-east-1.on.aws
60+
✅ Updated Schedule Group: ollama-0447aaae
61+
✅ Updated Local Image: 123456789012.dkr.ecr.us-east-1.amazonaws.com/ollama:0.0.0-0-0447aaae
62+
✅ Updated Local Image Digest: sha256:320447c49d08d109c4fc1702acc24768657a9a09e4e0eb90f8b32051500664ba
63+
✅ Updated Secret: arn:aws:secretsmanager:us-east-1:123456789012:secret:ollama@ollama-yaVNCp
64+
✅ Updated Lambda Function: ollama
65+
✅ Updated Function Code: ollama@sha256:320447c49d08d109c4fc1702acc24768657a9a09e4e0eb90f8b32051500664ba
66+
✅ Updated Function Alias: ollama (version: 4)
67+
✅ Updated Function Policies: InvokeFunctionUrl
68+
✅ Updated Function URL: https://wm4s6cxkwua4ncx3skpdtdx27a0qzbnd.lambda-url.us-east-1.on.aws
69+
✅ Updated Network Interface: eni-0dc0e11444fa19715
70+
✅ Created Invocation of `( HOME=$XDG_CACHE_HOME OLLAMA_HOST=$URL ollama pull llama3.2:1b )`:
71+
pulling manifest
72+
==> pulling 74701a8c35f6... 100% ▕████████████████▏ 1.3 GB
73+
==> pulling 966de95ca8a6... 100% ▕████████████████▏ 1.4 KB
74+
==> pulling fcc5a6bec9da... 100% ▕████████████████▏ 7.7 KB
75+
==> pulling a70ff7e570d9... 100% ▕████████████████▏ 6.0 KB
76+
==> pulling 4f659a1e86d7... 100% ▕████████████████▏ 485 B
77+
==> verifying sha256 digest
78+
==> writing manifest
79+
==> success
80+
✅ Updated HTTP GET on https://wm4s6cx...s-east-1.on.aws: 200 OK
3081
31-
In the project's [`CHANGEME-CONFIGFILE`](CHANGEME-CONFIGFILE) file, the `scaffoldly` configuration was added:
82+
🚀 Deployment Complete!
83+
🆔 App Identity: arn:aws:iam::123456789012:role/ollama-0447aaae
84+
📄 Env Files: .env.ollama, .env.main, .env
85+
📦 Image Size: 4.81 GB
86+
🌎 URL: https://wm4s6cxkwua4ncx3skpdtdx27a0qzbnd.lambda-url.us-east-1.on.aws
87+
```
3288

33-
- Note 1
34-
- Note 2
89+
## 🤨 How It Works
3590

36-
```
37-
CHANGEME-CONFIG
38-
```
91+
- The [`scaffoldly.json`](scaffoldly.json) is converted into a [Multi-Stage Docker Build](#multi-stage-docker-build)
92+
- A docker build is pushed to [Amazon ECR](#amazon-ecr)
93+
- A [Lambda Function](#lambda-function) is created to serve the image
94+
- Models are [cached](#model-caching) to Amazon EFS
95+
- Requests are [proxied](#request-proxy) to the underlying Ollama server
3996

40-
See the [Scaffoldly Docs](https://scaffoldly.dev/docs/config/) for additional configuration directives.
97+
> [!TIP]
98+
> This repository also comes with a [GitHub Action](.github/workflows/scaffoldly.yml) so that deployments can run from GitHub instead of being executed manually!
4199
42-
## Then, deployed to AWS Lambda
100+
### Multi-Stage Docker Build
43101

44-
```bash
45-
npx scaffoldly deploy
46-
```
102+
After the [project has been created](#-host-your-own), run `npx scaffoldly show dockerfile` to see the resultant Dockerfile:
47103

48-
See the [Scaffoldly Docs](https://scaffoldly.dev/docs/cli/#scaffoldly-deploy) for details on the `scaffoldly deploy` command.
104+
```Dockerfile
105+
FROM ollama/ollama:0.4.7 AS install-base
106+
WORKDIR /var/task
49107

50-
### After deploy the app is available on a public URL
108+
FROM install-base AS build-base
109+
WORKDIR /var/task
110+
ENV PATH="/var/task:$PATH"
111+
COPY . /var/task/
51112

52-
```bash
53-
🚀 Deployment Complete!
54-
🆔 App Identity: CHANGEME-IDENTITY
55-
📄 Env Files: .env.main, .env
56-
📦 Image Size: CHANGEME-IMAGESIZE MB
57-
🌎 URL: CHANGEME-URL
113+
FROM install-base AS package-base
114+
WORKDIR /var/task
115+
ENV PATH="/var/task:$PATH"
116+
117+
FROM install-base AS runtime
118+
WORKDIR /var/task
119+
ENV PATH="/var/task:$PATH"
120+
COPY --from=scaffoldly/scaffoldly:1 /linux/arm64/awslambda-entrypoint /var/task/.entrypoint
121+
CMD [ "( HOME=$XDG_CACHE_HOME ollama serve )" ]
58122
```
59123

60-
## GitHub Action added for CI/CD
124+
Running `npx scaffoldly deploy` will:
61125

62-
A [`scaffoldly.yml`](.github/workflows/scaffoldly.yml) was added to `.github/workflows` so that a push will trigger a deploy
126+
- Convert [`scaffoldly.json`](scaffoldly.json) into a Multi-Stage Docker Build
127+
- Run the equivalent of `docker build`
128+
- Set up [Amazon ECR](#amazon-ecr)
129+
- Create a [Lambda Function](#lambda-function)
63130

64-
```
65-
name: Scaffoldly Deploy
131+
### Amazon ECR
66132

67-
# ... snip ...
133+
AWS Lambda can only run container images stored in private Amazon ECR repositories; it cannot run public images such as `ollama/ollama` directly.
68134

69-
jobs:
70-
deploy:
71-
runs-on: ubuntu-latest
72-
steps:
73-
- name: Checkout
74-
uses: actions/checkout@v4
135+
Running `npx scaffoldly deploy` will:
75136

76-
- name: Deploy
77-
uses: scaffoldly/scaffoldly@v1
78-
with:
79-
secrets: ${{ toJSON(secrets) }}
80-
```
137+
- Pull `ollama/ollama:0.4.7`, re-tag it, and push it to Amazon ECR as a private image
138+
- Create an ECR Repository if it doesn't already exist
139+
- Run the equivalent of `docker push`
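
Scaffoldly performs these steps for you. As a rough illustration of the manual equivalent, the sketch below builds a private image reference in the same shape as the one in the deploy output and lists the Docker and AWS CLI commands that would re-tag and push a public image. The function names are hypothetical and the commands are only assembled as strings, not executed.

```python
from typing import List


def ecr_registry(account_id: str, region: str) -> str:
    """Return the hostname of an account's private ECR registry."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com"


def ecr_image_uri(account_id: str, region: str, repository: str, tag: str) -> str:
    """Build the private ECR image reference for a repository and tag."""
    return f"{ecr_registry(account_id, region)}/{repository}:{tag}"


def retag_commands(source_image: str, account_id: str, region: str,
                   repository: str, tag: str) -> List[str]:
    """List the shell commands that copy a public image into private ECR."""
    registry = ecr_registry(account_id, region)
    target = ecr_image_uri(account_id, region, repository, tag)
    return [
        # Authenticate the local Docker client against the private registry.
        f"aws ecr get-login-password --region {region} "
        f"| docker login --username AWS --password-stdin {registry}",
        f"docker pull {source_image}",
        f"docker tag {source_image} {target}",
        f"docker push {target}",
    ]
```

For example, `retag_commands("ollama/ollama:0.4.7", "123456789012", "us-east-1", "ollama", "0.4.7")` yields four commands ending in a `docker push` to the private repository.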
140+
141+
### Lambda Function
142+
143+
An AWS Lambda Function is created from the configuration in the [`scaffoldly.json`](scaffoldly.json) file.
144+
145+
Running `npx scaffoldly deploy` will:
146+
147+
- Set up Function Environment Variables from `.env`
148+
- Deploy the Function with a VPC Configuration and EFS Mounts inferred from [Amazon EFS](#model-caching)
149+
- Create Lambda Versions and Aliases
150+
- Set an `ENTRYPOINT` which routes [AWS Lambda HTTP Requests to Ollama](#request-proxy)
151+
- Create a Lambda Function URL and expose it to the function as the `URL` environment variable
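
To make the `.env` step concrete, here is a generic reader for simple `KEY=VALUE` files. It is a sketch only: Scaffoldly's actual parsing rules, and how it layers `.env.ollama`, `.env.main`, and `.env`, are not described in this README, so `parse_env` is a hypothetical helper rather than the tool's implementation.

```python
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and `#` comments."""
    env: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=VALUE pair
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        env[key.strip()] = value
    return env
```

The `.env` file added in this commit contains only a comment, so `parse_env` would return an empty dictionary for it.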
152+
153+
### Model Caching
154+
155+
Model files are large, so they are cached in Amazon EFS. The `@immediately` option in the `schedules` directive of [`scaffoldly.json`](scaffoldly.json) pre-downloads the model right after deployment.
156+
157+
Running `npx scaffoldly deploy` will:
158+
159+
- Set the `XDG_CACHE_HOME` environment variable to the EFS mount on the Lambda Function
160+
- Use the `OLLAMA_HOST=$URL` environment variable to trigger a remote download (on itself)
161+
- Use `HOME=$XDG_CACHE_HOME` to tell Ollama where to store files
162+
- Invoke `ollama pull` once the AWS Lambda Function is finished deploying
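
The `HOME=$XDG_CACHE_HOME` override works because Ollama keeps its model store under the home directory by default (`$HOME/.ollama/models`, unless `OLLAMA_MODELS` is set). The sketch below models that lookup. The `/mnt/efs` mount path in the usage note is a made-up example, and both helper names are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Dict, Mapping


def with_cache_home(env: Mapping[str, str]) -> Dict[str, str]:
    """Mimic the `HOME=$XDG_CACHE_HOME` prefix used in scaffoldly.json."""
    updated = dict(env)
    updated["HOME"] = env["XDG_CACHE_HOME"]
    return updated


def ollama_models_dir(env: Mapping[str, str]) -> PurePosixPath:
    """Return where Ollama would store models for the given environment."""
    override = env.get("OLLAMA_MODELS")
    if override:
        return PurePosixPath(override)  # an explicit override takes priority
    return PurePosixPath(env["HOME"]) / ".ollama" / "models"
```

With `XDG_CACHE_HOME` pointing at a hypothetical EFS mount such as `/mnt/efs`, `ollama_models_dir(with_cache_home(env))` resolves to `/mnt/efs/.ollama/models`, so pulled models survive across Lambda invocations.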
163+
164+
### Request Proxy
165+
166+
Finally, Scaffoldly uses the `start` option in the `scripts` directive of [`scaffoldly.json`](scaffoldly.json) to run `ollama serve`.
167+
168+
Running `npx scaffoldly deploy` will:
81169

82-
See the [Scaffoldly Docs](https://scaffoldly.dev/docs/gha/) for additional GitHub Actions directives.
170+
- Copy the [`awslambda-entrypoint`](https://github.com/scaffoldly/scaffoldly/blob/main/src/awslambda-entrypoint.ts) into the image
171+
- The `awslambda-entrypoint` reads the `SLY_ROUTES` and `SLY_SERVE` environment variables to start and route requests
172+
- Requests are converted from the AWS Lambda HTTP Request format back into an HTTP Request and forwarded to the Ollama Server.
173+
- The Ollama Server response is streamed back to the requestor.
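
The conversion step can be pictured with the sketch below, which turns a Lambda Function URL event (payload format 2.0) into a plain HTTP request aimed at the `handler` address from `scaffoldly.json`. This is not the actual `awslambda-entrypoint` code; `event_to_request` and `SKIPPED_HEADERS` are hypothetical names, and cookies and response streaming are left out.

```python
import base64
import urllib.request
from typing import Any, Dict

# Headers that urllib recomputes for the upstream connection.
SKIPPED_HEADERS = {"host", "connection", "content-length", "transfer-encoding"}


def event_to_request(event: Dict[str, Any],
                     handler: str = "localhost:11434") -> urllib.request.Request:
    """Rebuild an HTTP request for the local Ollama server from a Lambda event."""
    method = event["requestContext"]["http"]["method"]
    path = event.get("rawPath", "/")
    query = event.get("rawQueryString", "")
    url = f"http://{handler}{path}" + (f"?{query}" if query else "")

    body = event.get("body")
    data = None
    if body is not None:
        # Lambda base64-encodes binary bodies and flags them accordingly.
        if event.get("isBase64Encoded"):
            data = base64.b64decode(body)
        else:
            data = body.encode("utf-8")

    headers = {
        name: value
        for name, value in event.get("headers", {}).items()
        if name.lower() not in SKIPPED_HEADERS
    }
    return urllib.request.Request(url, data=data, headers=headers, method=method)
```

Passing the result to `urllib.request.urlopen` inside the function would forward the call to `ollama serve`; a real proxy would also stream the response back to the caller rather than buffering it.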
83174

84175
## Questions, Feedback, and Help
85176

scaffoldly.json

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
+{
+  "name": "ollama",
+  "runtime": "ollama/ollama:0.4.7",
+  "handler": "localhost:11434",
+  "resources": ["arn::elasticfilesystem:::file-system/.cache"],
+  "schedules": {
+    "@immediately": "HOME=$XDG_CACHE_HOME OLLAMA_HOST=$URL ollama pull llama3.2:1b"
+  },
+  "scripts": {
+    "start": "HOME=$XDG_CACHE_HOME ollama serve"
+  },
+  "memorySize": 3008
+}
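
Since the README suggests editing `scaffoldly.json` to serve a different model, the sketch below shows one way to do that programmatically. It assumes the model name appears only in `ollama pull <model>` commands under `schedules`, as in the file above; `pulled_models` and `with_model` are hypothetical helpers, not Scaffoldly APIs.

```python
import copy
import re
from typing import Any, Dict, List

PULL_PATTERN = re.compile(r"ollama pull (\S+)")


def pulled_models(config: Dict[str, Any]) -> List[str]:
    """List every model named in an `ollama pull` schedule command."""
    return [
        model
        for command in config.get("schedules", {}).values()
        for model in PULL_PATTERN.findall(command)
    ]


def with_model(config: Dict[str, Any], model: str) -> Dict[str, Any]:
    """Return a copy of the config whose pull commands fetch `model` instead."""
    updated = copy.deepcopy(config)
    updated["schedules"] = {
        when: PULL_PATTERN.sub(lambda _match: f"ollama pull {model}", command)
        for when, command in config.get("schedules", {}).items()
    }
    return updated
```

A larger model may also need a higher `memorySize` than 3008; that is an assumption to verify against the model's requirements rather than something this commit states.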
