# Getting Started

LiteLLM provides a unified SDK to call 100+ LLM providers using OpenAI-compatible formats.

## Core Functions

| Function | OpenAI Endpoint | Use Case |
|----------|-----------------|----------|
| `completion()` | `/chat/completions` | Chat & text generation |
| `responses()` | `/responses` | Reasoning models (o1, o3) |
| `embedding()` | `/embeddings` | Vector embeddings |
| `image_generation()` | `/images/generations` | Image creation |
| `transcription()` | `/audio/transcriptions` | Speech-to-text |

All functions use the format `provider/model` and return OpenAI-compatible responses.
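
For example, the same `provider/model` naming applies to every function in the table. As a minimal sketch (it assumes `litellm` is installed and the relevant provider keys are set, as covered in the next sections, and the model names are just examples), image generation and transcription look like this:

```python
from litellm import image_generation, transcription

# Same "provider/model" convention as completion(); responses mirror OpenAI's formats.
img = image_generation(model="openai/dall-e-3", prompt="A lighthouse at dusk")
print(img.data[0].url)

# Transcribe a local audio file (the path here is a placeholder).
with open("speech.mp3", "rb") as audio_file:
    transcript = transcription(model="openai/whisper-1", file=audio_file)
print(transcript.text)
```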

## Installation

```shell
pip install litellm
```
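
If you also plan to run the LiteLLM Proxy mentioned under Next Steps, the optional proxy extra (an assumption; check the proxy docs for the current install command) pulls in its additional dependencies:

```shell
pip install 'litellm[proxy]'
```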

## Basic Usage

```python
from litellm import completion
import os

# Set your API key
os.environ["OPENAI_API_KEY"] = "your-api-key"

messages = [{"role": "user", "content": "Hello, how are you?"}]

# Use provider/model format
response = completion(
    model="openai/gpt-4o",
    messages=messages
)

print(response.choices[0].message.content)
```

### Switch Providers with One Line

```python
import os
from litellm import completion

os.environ["OPENAI_API_KEY"] = "your-openai-key"
os.environ["ANTHROPIC_API_KEY"] = "your-anthropic-key"

messages = [{"role": "user", "content": "Hello!"}]

# Same code, different providers
response = completion(model="openai/gpt-4o", messages=messages)
response = completion(model="anthropic/claude-sonnet-4-20250514", messages=messages)

# Both return the same OpenAI-compatible format
print(response.choices[0].message.content)
```

## Streaming

Pass `stream=True` to get a streaming iterator:

```python
from litellm import completion

response = completion(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Write a short poem"}],
    stream=True
)

for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")
```

## Async

Use `acompletion()` for async operations:

```python
from litellm import acompletion
import asyncio

async def main():
    response = await acompletion(
        model="openai/gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}]
    )
    print(response.choices[0].message.content)

asyncio.run(main())
```
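
Streaming composes with async as well. A minimal sketch, assuming the same `OPENAI_API_KEY` is set as above:

```python
from litellm import acompletion
import asyncio

async def stream_demo():
    # With stream=True, acompletion returns an async iterator of chunks
    response = await acompletion(
        model="openai/gpt-4o",
        messages=[{"role": "user", "content": "Write a haiku"}],
        stream=True
    )
    async for chunk in response:
        print(chunk.choices[0].delta.content or "", end="")

asyncio.run(stream_demo())
```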

## Responses API

For reasoning models (o1, o3) or OpenAI's `/responses` request format:

```python
import litellm

response = litellm.responses(
    model="openai/gpt-4o",
    input="Tell me a bedtime story"
)

print(response.output[0].content[0].text)
```

This works with all providers; LiteLLM handles the translation automatically.
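
For instance, the same call shape with a non-OpenAI model (illustrative model name):

```python
import litellm

# LiteLLM translates the /responses request to the provider's native API
response = litellm.responses(
    model="anthropic/claude-sonnet-4-20250514",
    input="Tell me a bedtime story"
)

print(response.output[0].content[0].text)
```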

## Embeddings

```python
from litellm import embedding

response = embedding(
    model="openai/text-embedding-3-small",
    input=["Hello world", "How are you?"]
)

print(response.data[0]["embedding"][:5])
```

## Exception Handling

LiteLLM maps all provider exceptions to OpenAI-compatible exception types, so existing OpenAI error handling works unchanged:

```python
from openai import OpenAIError
from litellm import completion

try:
    response = completion(
        model="anthropic/claude-sonnet-4-20250514",
        messages=[{"role": "user", "content": "Hello"}]
    )
except OpenAIError as e:
    print(f"Error: {e}")
```
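
If you need finer-grained handling, LiteLLM also raises specific OpenAI-style classes. A sketch, under the assumption that these classes are exported at the package level as in recent LiteLLM versions:

```python
import litellm
from litellm import completion

try:
    response = completion(
        model="anthropic/claude-sonnet-4-20250514",
        messages=[{"role": "user", "content": "Hello"}]
    )
except litellm.AuthenticationError as e:
    # Missing or invalid provider API key
    print(f"Auth error: {e}")
except litellm.RateLimitError as e:
    # Provider-side rate limit; a good place for retries or fallbacks
    print(f"Rate limited: {e}")
```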

## Observability

Log LLM input/output to Langfuse, Lunary, MLflow, Helicone, and more by setting pre-defined callbacks:

```python
import litellm
from litellm import completion

# Set callbacks (each logging tool reads its own credentials from env vars,
# e.g. LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY, LUNARY_PUBLIC_KEY)
litellm.success_callback = ["langfuse", "lunary"]

response = completion(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
```
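
You can also mix in your own callback function alongside the built-in integrations. A minimal sketch; the four-argument signature below is what recent LiteLLM versions pass to custom success callbacks:

```python
import litellm
from litellm import completion

def log_success(kwargs, completion_response, start_time, end_time):
    # kwargs holds the request (model, messages, ...); completion_response is the result.
    # start_time/end_time are assumed to be datetime objects here.
    latency = (end_time - start_time).total_seconds()
    print(f"model={kwargs.get('model')} latency={latency:.2f}s")

litellm.success_callback = [log_success]

response = completion(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
```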

More details: [Observability & Logging](./observability/callbacks)

## Next Steps

- [All Providers](./providers/) - Provider-specific documentation
- [Supported Endpoints](./supported_endpoints) - Full list of supported endpoints
- [LiteLLM Proxy](./simple_proxy) - Deploy as an API gateway with load balancing
- [Router](./routing) - Load balancing & fallbacks
- [Caching](./caching/) - Response caching