diff --git a/pages/generative-apis/reference-content/supported-models.mdx b/pages/generative-apis/reference-content/supported-models.mdx
index 02581655ad..551cc0a309 100644
--- a/pages/generative-apis/reference-content/supported-models.mdx
+++ b/pages/generative-apis/reference-content/supported-models.mdx
@@ -24,6 +24,13 @@ Our API supports the most popular models for [Chat](/generative-apis/how-to/quer
 |-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|
 | Mistral | `voxtral-small-24b-2507` | 32k | 8192 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/mistralai/Voxtral-Small-24B-2507) |
 
+### Audio transcription models
+
+| Provider | Model string | Maximum audio duration (Minutes) | Chunk size (Seconds) | Maximum file size (MB) | License | Model card |
+|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|
+| Mistral | `voxtral-small-24b-2507` | 30 | 30 | 25 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/mistralai/Voxtral-Small-24B-2507) |
+| OpenAI | `whisper-large-v3` | - | 30 | 25 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/openai/whisper-large-v3) |
+
 ## Chat models
 
 | Provider | Model string | Context window (Tokens) | Maximum output (Tokens)| License | Model card |
diff --git a/pages/managed-inference/reference-content/model-catalog.mdx b/pages/managed-inference/reference-content/model-catalog.mdx
index 6be7013993..a3e863dd25 100644
--- a/pages/managed-inference/reference-content/model-catalog.mdx
+++ b/pages/managed-inference/reference-content/model-catalog.mdx
@@ -17,6 +17,7 @@ A quick overview of available models in Scaleway's catalog and their core attrib
 | Model name | Provider | Maximum Context length (tokens) | Modalities | Compatible Instances (Max Context in tokens\*) | License |
 |------------|----------|--------------|------------|-----------|---------|
 | [`gpt-oss-120b`](#gpt-oss-120b) | OpenAI | 128k | Text | H100 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
+| [`whisper-large-v3`](#whisper-large-v3) | OpenAI | - | Audio transcription | L4, L40S, H100, H100-SXM-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
 | [`qwen3-235b-a22b-instruct-2507`](#qwen3-235b-a22b-instruct-2507) | Qwen | 40k | Text | H100-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
 | [`gemma-3-27b-it`](#gemma-3-27b-it) | Google | 40k | Text, Vision | H100, H100-2 | [Gemma](https://ai.google.dev/gemma/terms) |
 | [`llama-3.3-70b-instruct`](#llama-33-70b-instruct) | Meta | 128k | Text | H100 (15k), H100-2 | [Llama 3.3 Community](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct) |
@@ -48,6 +49,7 @@ A quick overview of available models in Scaleway's catalog and their core attrib
 | Model name | Structured output supported | Function calling | Supported languages |
 | --- | --- | --- | --- |
 | `gpt-oss-120b` | Yes | Yes | English |
+| `whisper-large-v3` | - | - | English, French, German, Chinese, Japanese, Korean and 81 additional languages |
 | `qwen3-235b-a22b-instruct-2507` | Yes | Yes | English, French, German, Chinese, Japanese, Korean and 113 additional languages and dialects |
 | `gemma-3-27b-it` | Yes | Partial | English, Chinese, Japanese, Korean and 31 additional languages |
 | `llama-3.3-70b-instruct` | Yes | Yes | English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai |
@@ -192,6 +194,26 @@ mistral/voxtral-small-24b-2507:fp8
   - If audio sent is less than 30 seconds, the rest of the chunk will be considered silent.
   - 80ms is equal to 1 input token
 
+## Audio transcription models
+
+### Whisper-large-v3
+Whisper-large-v3 is a model developed by OpenAI to transcribe audio in many languages.
+This model is optimized for audio transcription tasks.
+
+| Attribute | Value |
+|-----------|-------|
+| Supported audio formats | WAV and MP3 |
+| Audio chunk duration | 30 seconds |
+
+#### Model names
+```
+openai/whisper-large-v3:bf16
+```
+
+- Mono and stereo audio formats are supported. For stereo formats, left and right channels are merged before being processed.
+- Audio files are processed in 30-second chunks:
+  - If audio sent is less than 30 seconds, the rest of the chunk will be considered silent.
+
 ## Text models
 
 ### Qwen3-235b-a22b-instruct-2507
diff --git a/pages/organizations-and-projects/additional-content/organization-quotas.mdx b/pages/organizations-and-projects/additional-content/organization-quotas.mdx
index 6f5fe5461f..04cdade983 100644
--- a/pages/organizations-and-projects/additional-content/organization-quotas.mdx
+++ b/pages/organizations-and-projects/additional-content/organization-quotas.mdx
@@ -210,6 +210,11 @@ Generative APIs are rate limited based on:
 | gpt-oss-120b | 200k | 400k |
 | bge-multilingual-gemma2 | 200k | 400k |
 
+| Audio seconds per minute | [Payment method validated](/billing/how-to/add-payment-method/#how-to-add-a-credit-card) | Payment method and [identity validated](/account/how-to/verify-identity/) |
+|-------------|:----------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------:|
+| voxtral-small-24b-2507 | 1800 | 3600 |
+| whisper-large-v3 | 1800 | 3600 |
+
 | Requests per minute | [Payment method validated](/billing/how-to/add-payment-method/#how-to-add-a-credit-card) | Payment method and [identity validated](/account/how-to/verify-identity/) |
 |-------------|:----------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------:|
@@ -228,6 +233,7 @@ Generative APIs are rate limited based on:
 | qwen3-coder-30b-a3b-instruct | 300 | 600 |
 | gpt-oss-120b | 300 | 600 |
 | bge-multilingual-gemma2 | 300 | 600 |
+| whisper-large-v3 | 300 | 600 |
 
 | Concurrent requests | [Payment method validated](/billing/how-to/add-payment-method/#how-to-add-a-credit-card) | Payment method and [identity validated](/account/how-to/verify-identity/) |
 |-------------|:----------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------:|