|
1 | | -# A CHANGEME-FRAMEWORK App Running On AWS Lambda |
| 1 | +# Ollama Running On AWS Lambda |
2 | 2 |
|
3 | | - |
| 3 | + |
4 | 4 |
|
5 | | -## ✨ Quickstart |
| 5 | +## 🚀 Working Example |
6 | 6 |
|
7 | | -Run the following command to create your own copy of this application: |
| 7 | +> [!NOTE] |
| 8 | +> |
| 9 | +> - AWS Lambda runs on **CPUs**, so running `generate` / `chat` is a little slow.
| 10 | +> - The first deployment takes **~5m** while the container is built and models are cached; subsequent deployments take **~1m**.
| 11 | +> - The first request, while the model is being loaded, **takes ~20s**; subsequent requests take **~5-20s**.
| 12 | +> - While this is **not production grade**, it is **a cost-effective way** to serve models.
8 | 13 |
|
9 | 14 | ```bash |
10 | | -npx scaffoldly create app --template CHANGEME-BRANCHNAME |
| 15 | +curl https://wm4s6cxkwua4ncx3skpdtdx27a0qzbnd.lambda-url.us-east-1.on.aws/api/generate -d '{ |
| 16 | + "model": "llama3.2:1b", |
| 17 | + "prompt":"Why is the sky blue?" |
| 18 | +}' |
11 | 19 | ``` |
12 | 20 |
|
13 | | -## Manual Setup |
| 21 | +- 🙏 Please, please, please don't abuse this endpoint, **Scaffoldly is Open Source** (a.k.a. cash strapped 🤣) and we're hosting it for demonstration purposes only! |
| 22 | +- Please [consider donating](https://github.com/sponsors/scaffoldly) if you like what Scaffoldly is doing! |
| 23 | +- Check out our [other examples](https://github.com/scaffoldly/scaffoldly-examples) |
| 24 | +- Give our [Tooling](https://github.com/scaffoldly/scaffoldly) and [Examples](https://github.com/scaffoldly/scaffoldly-examples) repositories a ⭐️ if you like what you see! |
14 | 25 |
|
15 | | -This application was generated with the following command: |
| 26 | +## ✨ Host Your Own! |
| 27 | + |
| 28 | +> [!TIP] |
| 29 | +> To use a different model than [`llama3.2:1b`](https://ollama.com/library/llama3.2:1b), update [`scaffoldly.json`](./scaffoldly.json) with the desired model(s). |
| 30 | +
|
| 31 | +1. Run the following command to create your own copy of this application: |
16 | 32 |
|
17 | 33 | ```bash |
18 | | -CHANGEME-CREATECOMMAND |
| 34 | +npx scaffoldly create app --template ollama |
19 | 35 | ``` |
20 | 36 |
|
21 | | -✨ No modifications or SDKs were made or added to the code to "make it work" in AWS Lambda. |
| 37 | +2. Create an [EFS Filesystem in AWS](https://console.aws.amazon.com/efs/home) and give it a `Name` of `.cache` (to match [`scaffoldly.json`](scaffoldly.json)).
22 | 38 |
|
23 | | -Check out our other [examples](https://github.com/scaffoldly/scaffoldly-examples) and Learn more at [scaffoldly.dev](https://scaffoldly.dev)! |
| 39 | +3. Finally, deploy: |
24 | 40 |
|
25 | | -### Working example |
| 41 | +```bash |
| 42 | +cd my-app |
| 43 | +npx scaffoldly deploy |
| 44 | +``` |
26 | 45 |
|
27 | | -[CHANGEME-URL](CHANGEME-URL) |
| 46 | +You will see output that looks like: |
28 | 47 |
|
29 | | -## First, Scaffoldly Config was added... |
| 48 | +``` |
| 49 | +🟠 App framework not detected. Using `scaffoldly.json` for configuration. |
| 50 | +
|
| 51 | +✅ Updated Identity: arn:aws:sts::123456789012:assumed-role/aws-examples@scaffold.ly/cnuss |
| 52 | +✅ Updated ECR Repository: 123456789012.dkr.ecr.us-east-1.amazonaws.com/ollama |
| 53 | +✅ Updated Local Image Digest: sha256:f7ee27705d66c64a250982d6ee8282d5338a4989ae95c5ac4453a15c264efc97 |
| 54 | +✅ Updated Secret: arn:aws:secretsmanager:us-east-1:123456789012:secret:ollama@ollama-yaVNCp |
| 55 | +✅ Updated EFS Access Point: arn:aws:elasticfilesystem:us-east-1:123456789012:access-point/fsap-0b0e5506324efd541 |
| 56 | +✅ Updated IAM Role: ollama-0447aaae |
| 57 | +✅ Updated IAM Role Policy: ollama |
| 58 | +✅ Updated Lambda Function: ollama |
| 59 | +✅ Updated Function URL: https://wm4s6cxkwua4ncx3skpdtdx27a0qzbnd.lambda-url.us-east-1.on.aws |
| 60 | +✅ Updated Schedule Group: ollama-0447aaae |
| 61 | +✅ Updated Local Image: 123456789012.dkr.ecr.us-east-1.amazonaws.com/ollama:0.0.0-0-0447aaae |
| 62 | +✅ Updated Local Image Digest: sha256:320447c49d08d109c4fc1702acc24768657a9a09e4e0eb90f8b32051500664ba |
| 63 | +✅ Updated Secret: arn:aws:secretsmanager:us-east-1:123456789012:secret:ollama@ollama-yaVNCp |
| 64 | +✅ Updated Lambda Function: ollama |
| 65 | +✅ Updated Function Code: ollama@sha256:320447c49d08d109c4fc1702acc24768657a9a09e4e0eb90f8b32051500664ba |
| 66 | +✅ Updated Function Alias: ollama (version: 4) |
| 67 | +✅ Updated Function Policies: InvokeFunctionUrl |
| 68 | +✅ Updated Function URL: https://wm4s6cxkwua4ncx3skpdtdx27a0qzbnd.lambda-url.us-east-1.on.aws |
| 69 | +✅ Updated Network Interface: eni-0dc0e11444fa19715 |
| 70 | +✅ Created Invocation of `( HOME=$XDG_CACHE_HOME OLLAMA_HOST=$URL ollama pull llama3.2:1b )`: |
| 71 | +pulling manifest |
| 72 | + ==> pulling 74701a8c35f6... 100% ▕████████████████▏ 1.3 GB |
| 73 | + ==> pulling 966de95ca8a6... 100% ▕████████████████▏ 1.4 KB |
| 74 | + ==> pulling fcc5a6bec9da... 100% ▕████████████████▏ 7.7 KB |
| 75 | + ==> pulling a70ff7e570d9... 100% ▕████████████████▏ 6.0 KB |
| 76 | + ==> pulling 4f659a1e86d7... 100% ▕████████████████▏ 485 B |
| 77 | + ==> verifying sha256 digest |
| 78 | + ==> writing manifest |
| 79 | + ==> success |
| 80 | +✅ Updated HTTP GET on https://wm4s6cx...s-east-1.on.aws: 200 OK |
30 | 81 |
|
31 | | -In the project's [`CHANGEME-CONFIGFILE`](CHANGEME-CONFIGFILE) file, the `scaffoldly` configuration was added: |
| 82 | +🚀 Deployment Complete! |
| 83 | + 🆔 App Identity: arn:aws:iam::123456789012:role/ollama-0447aaae |
| 84 | + 📄 Env Files: .env.ollama, .env.main, .env |
| 85 | + 📦 Image Size: 4.81 GB |
| 86 | + 🌎 URL: https://wm4s6cxkwua4ncx3skpdtdx27a0qzbnd.lambda-url.us-east-1.on.aws |
| 87 | +``` |
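
As a minimal sketch, the Function URL printed at the end of the deployment can be queried the same way as the working example. The snippet below builds a request body for `/api/generate` with `"stream": false`, which asks Ollama for a single JSON object rather than a stream of chunks. `OLLAMA_URL` and `generate_payload` are illustrative names, not part of Scaffoldly or Ollama; when `OLLAMA_URL` is unset, the script only prints the body it would send.

```bash
#!/usr/bin/env bash

# Build a JSON body for Ollama's /api/generate endpoint.
# generate_payload is a hypothetical helper written for this example.
generate_payload() {
  local model="$1" prompt="$2"
  # Escape backslashes and double quotes so the prompt stays valid JSON.
  prompt=${prompt//\\/\\\\}
  prompt=${prompt//\"/\\\"}
  printf '{"model":"%s","prompt":"%s","stream":false}' "$model" "$prompt"
}

# OLLAMA_URL is an assumed variable name; set it to your own Function URL.
if [ -n "${OLLAMA_URL:-}" ]; then
  curl -sS "$OLLAMA_URL/api/generate" \
    -d "$(generate_payload "llama3.2:1b" "Why is the sky blue?")"
else
  # No URL configured: print the request body instead of sending it.
  generate_payload "llama3.2:1b" "Why is the sky blue?"
  echo
fi
```

Pointing `OLLAMA_URL` at your own deployment, rather than the shared demo endpoint, keeps test traffic on your own account.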
32 | 88 |
|
33 | | -- Note 1 |
34 | | -- Note 2 |
| 89 | +## 🤨 How It Works |
35 | 90 |
|
36 | | -``` |
37 | | -CHANGEME-CONFIG |
38 | | -``` |
| 91 | +- The [`scaffoldly.json`](scaffoldly.json) is converted into a [Multi-Stage Docker Build](#multi-stage-docker-build) |
| 92 | +- A docker build is pushed to [Amazon ECR](#amazon-ecr) |
| 93 | +- A [Lambda Function](#lambda-function) is created to serve the image |
| 94 | +- Models are [cached](#model-caching) to Amazon EFS |
| 95 | +- Requests are [proxied](#request-proxy) to the underlying Ollama server |
39 | 96 |
|
40 | | -See the [Scaffoldly Docs](https://scaffoldly.dev/docs/config/) for additional configuration directives. |
| 97 | +> [!TIP] |
| 98 | +> This repository also comes with a [GitHub Action](.github/workflows/scaffoldly.yml) so that deployments can occur from GitHub instead of being executed manually!
41 | 99 |
|
42 | | -## Then, deployed to AWS Lambda |
| 100 | +### Multi-Stage Docker Build |
43 | 101 |
|
44 | | -```bash |
45 | | -npx scaffoldly deploy |
46 | | -``` |
| 102 | +After the [project has been created](#-host-your-own), run `npx scaffoldly show dockerfile` to see the resultant Dockerfile: |
47 | 103 |
|
48 | | -See the [Scaffoldly Docs](https://scaffoldly.dev/docs/cli/#scaffoldly-deploy) for details on the `scaffoldly deploy` command. |
| 104 | +```Dockerfile |
| 105 | +FROM ollama/ollama:0.4.7 AS install-base |
| 106 | +WORKDIR /var/task |
49 | 107 |
|
50 | | -### After deploy the app is available on a public URL |
| 108 | +FROM install-base AS build-base |
| 109 | +WORKDIR /var/task |
| 110 | +ENV PATH="/var/task:$PATH" |
| 111 | +COPY . /var/task/ |
51 | 112 |
|
52 | | -```bash |
53 | | -🚀 Deployment Complete! |
54 | | - 🆔 App Identity: CHANGEME-IDENTITY |
55 | | - 📄 Env Files: .env.main, .env |
56 | | - 📦 Image Size: CHANGEME-IMAGESIZE MB |
57 | | - 🌎 URL: CHANGEME-URL |
| 113 | +FROM install-base AS package-base |
| 114 | +WORKDIR /var/task |
| 115 | +ENV PATH="/var/task:$PATH" |
| 116 | + |
| 117 | +FROM install-base AS runtime |
| 118 | +WORKDIR /var/task |
| 119 | +ENV PATH="/var/task:$PATH" |
| 120 | +COPY --from=scaffoldly/scaffoldly:1 /linux/arm64/awslambda-entrypoint /var/task/.entrypoint |
| 121 | +CMD [ "( HOME=$XDG_CACHE_HOME ollama serve )" ] |
58 | 122 | ``` |
59 | 123 |
|
60 | | -## GitHub Action added for CI/CD |
| 124 | +Running `npx scaffoldly deploy` will: |
61 | 125 |
|
62 | | -A [`scaffoldly.yml`](.github/workflows/scaffoldly.yml) was added to `.github/workflows` so that a push will trigger a deploy |
| 126 | +- Convert [`scaffoldly.json`](scaffoldly.json) into a Multi-Stage Docker Build
| 127 | +- Run the equivalent of `docker build`
| 128 | +- Set up [Amazon ECR](#amazon-ecr)
| 129 | +- Create a [Lambda Function](#lambda-function) |
63 | 130 |
|
64 | | -``` |
65 | | -name: Scaffoldly Deploy |
| 131 | +### Amazon ECR |
66 | 132 |
|
67 | | -# ... snip ... |
| 133 | +AWS Lambda requires that container images come from a private Amazon ECR repository; it can't run images directly from public registries.
68 | 134 |
|
69 | | -jobs: |
70 | | - deploy: |
71 | | - runs-on: ubuntu-latest |
72 | | - steps: |
73 | | - - name: Checkout |
74 | | - uses: actions/checkout@v4 |
| 135 | +Running `npx scaffoldly deploy` will: |
75 | 136 |
|
76 | | - - name: Deploy |
77 | | - uses: scaffoldly/scaffoldly@v1 |
78 | | - with: |
79 | | - secrets: ${{ toJSON(secrets) }} |
80 | | -``` |
| 137 | +- Pull `ollama/ollama:0.4.7`, re-tag it, and push it to Amazon ECR as a private image
| 138 | +- Create an ECR Repository if it doesn't already exist |
| 139 | +- Run the equivalent of `docker push` |
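
The steps above can be sketched with plain Docker commands. This is an illustration under stated assumptions, not what Scaffoldly runs internally: `ecr_uri` and `mirror_commands` are hypothetical helpers, the `latest` tag is chosen only for the example, and the script prints the commands instead of executing them. Authenticating Docker to ECR (for example with `aws ecr get-login-password`) is left out.

```bash
#!/usr/bin/env bash

# Compose a private ECR image URI from an account ID, region and repository.
# The format matches the repository shown in the sample deploy output.
ecr_uri() {
  printf '%s.dkr.ecr.%s.amazonaws.com/%s' "$1" "$2" "$3"
}

# Print (do not run) the docker commands that mirror a public image into ECR.
mirror_commands() {
  local source_image="$1" target_image="$2"
  printf 'docker pull %s\n' "$source_image"
  printf 'docker tag %s %s\n' "$source_image" "$target_image"
  printf 'docker push %s\n' "$target_image"
}

# 123456789012 is the placeholder account ID from the sample output;
# the "latest" tag is an assumption made for this sketch.
target="$(ecr_uri 123456789012 us-east-1 ollama):latest"
mirror_commands "ollama/ollama:0.4.7" "$target"
```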
| 140 | + |
| 141 | +### Lambda Function |
| 142 | + |
| 143 | +An AWS Lambda Function is created with the configuration in the [`scaffoldly.json`](scaffoldly.json) file.
| 144 | + |
| 145 | +Running `npx scaffoldly deploy` will: |
| 146 | + |
| 147 | +- Set up Function Environment Variables from `.env`
| 148 | +- Deploy the Function with a VPC Configuration and EFS Mounts inferred from [Amazon EFS](#model-caching) |
| 149 | +- Create Lambda Versions and Aliases |
| 150 | +- Set an `ENTRYPOINT` which routes [AWS Lambda HTTP Requests to Ollama](#request-proxy) |
| 151 | +- Create a Lambda Function URL and set it as an environment variable as `URL` |
| 152 | + |
| 153 | +### Model Caching |
| 154 | + |
| 155 | +Model files are large, so they are cached in Amazon EFS. Using the `@immediately` option in the `schedules` directive of [`scaffoldly.json`](scaffoldly.json), the model is pre-downloaded after the deployment.
| 156 | + |
| 157 | +Running `npx scaffoldly deploy` will: |
| 158 | + |
| 159 | +- Set the `XDG_CACHE_HOME` environment variable to the EFS Mount on the Lambda Function
| 160 | +- Use the `OLLAMA_HOST=$URL` environment variable to trigger a remote download (on itself)
| 161 | +- Use `HOME=$XDG_CACHE_HOME` to tell Ollama where to store files
| 162 | +- Invoke `ollama pull` once the AWS Lambda Function is finished deploying |
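
The `HOME=$XDG_CACHE_HOME` prefix works because a `VAR=value command` assignment applies only to that one command. A minimal sketch, assuming Ollama's default model directory of `$HOME/.ollama/models`; the `/mnt/.cache` fallback path and the `model_dir` helper are hypothetical and exist only for this example:

```bash
#!/usr/bin/env bash

# Hypothetical fallback path; on Lambda the real value is the EFS mount.
XDG_CACHE_HOME="${XDG_CACHE_HOME:-/mnt/.cache}"

# Show where a child process would place Ollama's default model directory
# when HOME is overridden for that single command only.
model_dir() {
  HOME="$XDG_CACHE_HOME" sh -c 'printf "%s\n" "$HOME/.ollama/models"'
}

model_dir

# The parent shell's HOME is unchanged after the call.
printf 'parent HOME is still %s\n' "$HOME"
```

Because the override is scoped to the child process, the same pattern lets `ollama serve` and `ollama pull` share one model store on EFS without changing `HOME` for anything else.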
| 163 | + |
| 164 | +### Request Proxy |
| 165 | + |
| 166 | +Finally, Scaffoldly uses the `start` option in the `scripts` directive of [`scaffoldly.json`](scaffoldly.json) to run `ollama serve`. |
| 167 | + |
| 168 | +Running `npx scaffoldly deploy` will: |
81 | 169 |
|
82 | | -See the [Scaffoldly Docs](https://scaffoldly.dev/docs/gha/) for additional GitHub Actions directives. |
| 170 | +- Copy the [`awslambda-entrypoint`](https://github.com/scaffoldly/scaffoldly/blob/main/src/awslambda-entrypoint.ts) into the image as `/var/task/.entrypoint`
| 171 | +- Configure the `awslambda-entrypoint`, which reads the `SLY_ROUTES` and `SLY_SERVE` environment variables to start the server and route requests
| 172 | +- Convert requests from the AWS Lambda HTTP Request format back into an HTTP Request and forward them to the Ollama Server
| 173 | +- Stream the Ollama Server response back to the requestor
83 | 174 |
|
84 | 175 | ## Questions, Feedback, and Help |
85 | 176 |
|
|