scaleway · bene2k1 · Oct 27, 2025 · Oct 24, 2025
@@ -24,6 +24,13 @@ Our API supports the most popular models for [Chat](/generative-apis/how-to/quer
 |-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|
 | Mistral | `voxtral-small-24b-2507`  | 32k  | 8192 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/mistralai/Voxtral-Small-24B-2507) |
 
+### Audio transcription models
+
+| Provider | Model string | Maximum audio duration (Minutes) | Chunk size (Seconds) | Maximum file size (MB) | License | Model card |
+|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|
+| Mistral | `voxtral-small-24b-2507`  | 30 | 30 | 25 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/mistralai/Voxtral-Small-24B-2507) |
+| OpenAI | `whisper-large-v3`  | - | 30 | 25 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/openai/whisper-large-v3) |
+
 ## Chat models
 
 | Provider | Model string | Context window (Tokens) | Maximum output (Tokens)| License | Model card |

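The limits in the audio transcription table above can be turned into a client-side pre-flight check. The following is a minimal sketch, not part of the Scaleway API: the `TranscriptionLimits` class, the `LIMITS` mapping and the `check_audio` helper are hypothetical names, and treating 1 MB as 1,000,000 bytes is an assumption because the table does not say whether MB or MiB is meant.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TranscriptionLimits:
    """Per-model limits copied from the audio transcription table (hypothetical helper)."""
    max_duration_min: Optional[int]  # None where the table shows "-"
    max_file_size_mb: int


# Values taken from the table in this diff.
LIMITS = {
    "voxtral-small-24b-2507": TranscriptionLimits(max_duration_min=30, max_file_size_mb=25),
    "whisper-large-v3": TranscriptionLimits(max_duration_min=None, max_file_size_mb=25),
}


def check_audio(model: str, size_bytes: int, duration_s: float) -> list[str]:
    """Return a list of problems; an empty list means the file fits the documented limits."""
    limits = LIMITS[model]
    problems = []
    # Assumption: 1 MB = 1,000,000 bytes.
    if size_bytes > limits.max_file_size_mb * 1_000_000:
        problems.append(f"file exceeds {limits.max_file_size_mb} MB")
    # Only check duration when the table lists a maximum for this model.
    if limits.max_duration_min is not None and duration_s > limits.max_duration_min * 60:
        problems.append(f"audio exceeds {limits.max_duration_min} minutes")
    return problems
```

For example, `check_audio("voxtral-small-24b-2507", 30_000_000, 1900)` reports both a size and a duration problem, while the same duration passes for `whisper-large-v3` because the table lists no maximum duration for it.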
@@ -17,6 +17,7 @@ A quick overview of available models in Scaleway's catalog and their core attrib
 | Model name | Provider | Maximum Context length (tokens) | Modalities | Compatible Instances (Max Context in tokens\*) | License |
 |------------|----------|--------------|------------|-----------|---------|
 | [`gpt-oss-120b`](#gpt-oss-120b) | OpenAI | 128k | Text | H100 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
+| [`whisper-large-v3`](#whisper-large-v3) | OpenAI | - | Audio transcription | L4, L40S, H100, H100-SXM-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
 | [`qwen3-235b-a22b-instruct-2507`](#qwen3-235b-a22b-instruct-2507) | Qwen | 40k | Text | H100-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
 | [`gemma-3-27b-it`](#gemma-3-27b-it) | Google | 40k | Text, Vision | H100, H100-2 | [Gemma](https://ai.google.dev/gemma/terms) |
 | [`llama-3.3-70b-instruct`](#llama-33-70b-instruct) | Meta | 128k | Text | H100 (15k), H100-2 | [Llama 3.3 Community](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct) |
@@ -48,6 +49,7 @@ A quick overview of available models in Scaleway's catalog and their core attrib
 | Model name | Structured output supported | Function calling | Supported languages |
 | --- | --- | --- | --- |
 | `gpt-oss-120b` | Yes | Yes | English |
+| `whisper-large-v3` | - | - | English, French, German, Chinese, Japanese, Korean and 81 additional languages  |
 | `qwen3-235b-a22b-instruct-2507` | Yes | Yes | English, French, German, Chinese, Japanese, Korean and 113 additional languages and dialects |
 | `gemma-3-27b-it` | Yes | Partial | English, Chinese, Japanese, Korean and 31 additional languages |
 | `llama-3.3-70b-instruct` | Yes | Yes | English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai |
@@ -192,6 +194,26 @@ mistral/voxtral-small-24b-2507:fp8
   - If audio sent is less than 30 seconds, the rest of the chunk will be considered silent. 
   - 80ms is equal to 1 input token
 
+## Audio transcription models
+
+### Whisper-large-v3
+Whisper-large-v3 is a speech recognition model developed by OpenAI.
+It is optimized for transcribing audio in many languages.
+
+| Attribute | Value |
+|-----------|-------|
+| Supported audio formats | WAV and MP3 |
+| Audio chunk duration | 30 seconds |
+
+#### Model names
+```
+openai/whisper-large-v3:bf16
+```
+
+- Mono and stereo audio formats are supported. For stereo formats, left and right channels are merged before being processed.
+- Audio files are processed in 30-second chunks:
+  - If the audio sent is shorter than 30 seconds, the remainder of the chunk is treated as silence.
+
 ## Text models
 
 ### Qwen3-235b-a22b-instruct-2507

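The 30-second chunking rule described in the model sections above can be sketched in a few lines. This is an illustration under stated assumptions, not the service's implementation: the function names are hypothetical, the final partial chunk is assumed to be padded with silence to a full 30 seconds, and the token estimate assumes that the padded silence is counted at the same "80ms is equal to 1 input token" rate, which the documentation states for `voxtral-small-24b-2507` only.

```python
import math

CHUNK_SECONDS = 30       # chunk size stated in the Voxtral and Whisper sections
MS_PER_INPUT_TOKEN = 80  # stated for voxtral-small-24b-2507 only


def chunk_count(duration_s: float) -> int:
    """Number of 30-second chunks needed to cover the audio."""
    if duration_s <= 0:
        return 0
    return math.ceil(duration_s / CHUNK_SECONDS)


def padded_seconds(duration_s: float) -> int:
    """Audio length after padding the last chunk with silence (assumption)."""
    return chunk_count(duration_s) * CHUNK_SECONDS


def voxtral_input_tokens(duration_s: float) -> int:
    """Estimated input tokens, assuming padded silence is also tokenized."""
    return padded_seconds(duration_s) * 1000 // MS_PER_INPUT_TOKEN
```

Under these assumptions, a 45-second file occupies two chunks (60 padded seconds), and a single full chunk corresponds to 30,000 ms / 80 ms = 375 input tokens for Voxtral.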
diff --git a/pages/organizations-and-projects/additional-content/organization-quotas.mdx b/pages/organizations-and-projects/additional-content/organization-quotas.mdx
@@ -210,6 +210,11 @@ Generative APIs are rate limited based on:
 | gpt-oss-120b	 | 200k | 400k   |
 | bge-multilingual-gemma2		  | 200k | 400k  |
 
+|  Audio seconds per minute   | [Payment method validated](/billing/how-to/add-payment-method/#how-to-add-a-credit-card) | Payment method and [identity validated](/account/how-to/verify-identity/) |
+|-------------|:----------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------:|
+| voxtral-small-24b-2507 | 1800 | 3600 |
+| whisper-large-v3 | 1800 | 3600 |
+
 
 |  Requests per minute   | [Payment method validated](/billing/how-to/add-payment-method/#how-to-add-a-credit-card) | Payment method and [identity validated](/account/how-to/verify-identity/) |
 |-------------|:----------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------:|
@@ -228,6 +233,7 @@ Generative APIs are rate limited based on:
 | qwen3-coder-30b-a3b-instruct	 | 300 | 600   |
 | gpt-oss-120b	  | 300 | 600   |
 | bge-multilingual-gemma2		  | 300 | 600     |
+| whisper-large-v3 | 300 | 600 |
 
 |  Concurrent requests   | [Payment method validated](/billing/how-to/add-payment-method/#how-to-add-a-credit-card) | Payment method and [identity validated](/account/how-to/verify-identity/) |
 |-------------|:----------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------:|