Merged
@@ -24,6 +24,13 @@ Our API supports the most popular models for [Chat](/generative-apis/how-to/quer
|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|
| Mistral | `voxtral-small-24b-2507` | 32k | 8192 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/mistralai/Voxtral-Small-24B-2507) |

### Audio transcription models

| Provider | Model string | Maximum audio duration (Minutes) | Chunk size (Seconds) | Maximum file size (MB) | License | Model card |
|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|
| Mistral | `voxtral-small-24b-2507` | 30 | 30 | 25 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/mistralai/Voxtral-Small-24B-2507) |
| OpenAI | `whisper-large-v3` | - | 30 | 25 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/openai/whisper-large-v3) |
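The limits in the table above can be checked on the client before uploading. The following is a minimal sketch, not part of the documented API: `TranscriptionLimits`, `LIMITS`, and `check_audio` are hypothetical names, a `-` in the table is treated as "no documented limit", and 1 MB is assumed to mean 1,000,000 bytes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TranscriptionLimits:
    # None means the table shows "-" (assumed: no documented limit).
    max_duration_minutes: Optional[float]
    max_file_size_mb: float


# Values copied from the table above; the dictionary itself is illustrative.
LIMITS = {
    "voxtral-small-24b-2507": TranscriptionLimits(30, 25),
    "whisper-large-v3": TranscriptionLimits(None, 25),
}


def check_audio(model: str, duration_seconds: float, size_bytes: int) -> List[str]:
    """Return a list of limit violations for one audio file (empty if none)."""
    limits = LIMITS[model]
    problems = []
    # Assumption: 1 MB = 1,000,000 bytes.
    if size_bytes > limits.max_file_size_mb * 1_000_000:
        problems.append(f"file exceeds {limits.max_file_size_mb} MB")
    if (
        limits.max_duration_minutes is not None
        and duration_seconds > limits.max_duration_minutes * 60
    ):
        problems.append(f"audio exceeds {limits.max_duration_minutes} minutes")
    return problems
```

For example, `check_audio("voxtral-small-24b-2507", 1801, 26_000_000)` reports both a size and a duration violation, while a one-hour, 10 MB file passes for `whisper-large-v3` because no duration limit is listed for it.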

## Chat models

| Provider | Model string | Context window (Tokens) | Maximum output (Tokens)| License | Model card |
22 changes: 22 additions & 0 deletions pages/managed-inference/reference-content/model-catalog.mdx
@@ -17,6 +17,7 @@ A quick overview of available models in Scaleway's catalog and their core attrib
| Model name | Provider | Maximum Context length (tokens) | Modalities | Compatible Instances (Max Context in tokens\*) | License |
|------------|----------|--------------|------------|-----------|---------|
| [`gpt-oss-120b`](#gpt-oss-120b) | OpenAI | 128k | Text | H100 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
| [`whisper-large-v3`](#whisper-large-v3) | OpenAI | - | Audio transcription | L4, L40S, H100, H100-SXM-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
| [`qwen3-235b-a22b-instruct-2507`](#qwen3-235b-a22b-instruct-2507) | Qwen | 40k | Text | H100-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
| [`gemma-3-27b-it`](#gemma-3-27b-it) | Google | 40k | Text, Vision | H100, H100-2 | [Gemma](https://ai.google.dev/gemma/terms) |
| [`llama-3.3-70b-instruct`](#llama-33-70b-instruct) | Meta | 128k | Text | H100 (15k), H100-2 | [Llama 3.3 Community](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct) |
@@ -48,6 +49,7 @@ A quick overview of available models in Scaleway's catalog and their core attrib
| Model name | Structured output supported | Function calling | Supported languages |
| --- | --- | --- | --- |
| `gpt-oss-120b` | Yes | Yes | English |
| `whisper-large-v3` | - | - | English, French, German, Chinese, Japanese, Korean and 81 additional languages |
| `qwen3-235b-a22b-instruct-2507` | Yes | Yes | English, French, German, Chinese, Japanese, Korean and 113 additional languages and dialects |
| `gemma-3-27b-it` | Yes | Partial | English, Chinese, Japanese, Korean and 31 additional languages |
| `llama-3.3-70b-instruct` | Yes | Yes | English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai |
@@ -192,6 +194,26 @@ mistral/voxtral-small-24b-2507:fp8
- If the audio sent is shorter than 30 seconds, the rest of the chunk is treated as silence.
- 80 ms of audio equals 1 input token.
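The two rules above suggest a rough way to estimate audio input tokens. This is a sketch under one explicit assumption that the text does not confirm: the silent padding of the last 30-second chunk is counted as input tokens too. The function name `estimate_audio_input_tokens` is hypothetical.

```python
import math

CHUNK_MS = 30_000   # audio is processed in 30-second chunks
MS_PER_TOKEN = 80   # 80 ms of audio equals 1 input token


def estimate_audio_input_tokens(duration_ms: int) -> int:
    """Estimate input tokens for an audio clip of the given length.

    Assumption: every chunk, including the silence-padded last one,
    is billed in full, i.e. 30_000 / 80 = 375 tokens per chunk.
    """
    if duration_ms <= 0:
        return 0
    chunks = math.ceil(duration_ms / CHUNK_MS)
    return chunks * (CHUNK_MS // MS_PER_TOKEN)
```

Under that assumption, a 10-second clip and a 30-second clip both cost 375 tokens, and a 45-second clip costs 750.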

## Audio transcription models

### Whisper-large-v3
Whisper-large-v3 is a model developed by OpenAI and optimized for transcribing audio in many languages.

| Attribute | Value |
|-----------|-------|
| Supported audio formats | WAV and MP3 |
| Audio chunk duration | 30 seconds |

#### Model names
```
openai/whisper-large-v3:bf16
```

- Mono and stereo audio formats are supported. For stereo formats, left and right channels are merged before being processed.
- Audio files are processed in 30-second chunks:
  - If the audio sent is shorter than 30 seconds, the rest of the chunk is treated as silence.
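To make the chunking rule concrete, the sketch below splits a clip into 30-second spans and reports how much silence would pad each one. `chunk_spans` is a hypothetical helper for illustration only; it mirrors the description above and does not reflect how the service is implemented.

```python
from typing import List, Tuple


def chunk_spans(duration_ms: int, chunk_ms: int = 30_000) -> List[Tuple[int, int, int]]:
    """Return (start_ms, end_ms, padded_silence_ms) for each chunk of a clip."""
    spans = []
    for start in range(0, duration_ms, chunk_ms):
        end = min(start + chunk_ms, duration_ms)
        # Only the last chunk can be shorter than chunk_ms and need padding.
        silence = chunk_ms - (end - start)
        spans.append((start, end, silence))
    return spans
```

A 70-second clip yields three chunks, the last of which holds 10 seconds of audio and 20 seconds of silence.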

## Text models

### Qwen3-235b-a22b-instruct-2507
@@ -210,6 +210,11 @@ Generative APIs are rate limited based on:
| gpt-oss-120b | 200k | 400k |
| bge-multilingual-gemma2 | 200k | 400k |

| Audio seconds per minute | [Payment method validated](/billing/how-to/add-payment-method/#how-to-add-a-credit-card) | Payment method and [identity validated](/account/how-to/verify-identity/) |
|-------------|:----------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------:|
| voxtral-small-24b-2507 | 1800 | 3600 |
| whisper-large-v3 | 1800 | 3600 |
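A client can group files so that each one-minute window stays within the audio-seconds budget above. The sketch below is a simple greedy planner. It assumes that a file's full duration counts against the minute in which it is submitted, which this table does not specify, and `plan_batches` is a hypothetical name.

```python
from typing import List


def plan_batches(durations_s: List[int], budget_s: int = 1800) -> List[List[int]]:
    """Greedily group audio durations into batches of at most budget_s seconds.

    Each batch is meant to be submitted in its own one-minute window.
    A single file longer than the budget is placed alone in a batch.
    """
    batches: List[List[int]] = []
    current: List[int] = []
    used = 0
    for duration in durations_s:
        # Start a new batch when adding this file would exceed the budget.
        if current and used + duration > budget_s:
            batches.append(current)
            current, used = [], 0
        current.append(duration)
        used += duration
    if current:
        batches.append(current)
    return batches
```

With the default 1800-second budget, `plan_batches([900, 900, 600, 1200, 100])` produces three batches: `[900, 900]`, `[600, 1200]`, and `[100]`. Pass `budget_s=3600` for the identity-validated tier.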


| Requests per minute | [Payment method validated](/billing/how-to/add-payment-method/#how-to-add-a-credit-card) | Payment method and [identity validated](/account/how-to/verify-identity/) |
|-------------|:----------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------:|
@@ -228,6 +233,7 @@ Generative APIs are rate limited based on:
| qwen3-coder-30b-a3b-instruct | 300 | 600 |
| gpt-oss-120b | 300 | 600 |
| bge-multilingual-gemma2 | 300 | 600 |
| whisper-large-v3 | 300 | 600 |

| Concurrent requests | [Payment method validated](/billing/how-to/add-payment-method/#how-to-add-a-credit-card) | Payment method and [identity validated](/account/how-to/verify-identity/) |
|-------------|:----------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------:|