Palabra AI Python SDK

Python SDK for Palabra AI's real-time speech-to-speech translation API.
Break down language barriers and enable seamless communication across 25+ languages.

Overview

The Palabra AI Python SDK provides a high-level API for integrating real-time speech-to-speech translation into your Python applications.

What can Palabra.ai do?

  • Real-time speech-to-speech translation with near-zero latency
  • Auto voice cloning - speak any language in YOUR voice
  • Two-way simultaneous translation for live discussions
  • Developer API/SDK for building your own apps
  • Works everywhere - Zoom, streams, events, any platform
  • Zero data storage - your conversations stay private

This SDK focuses on making real-time translation simple and accessible:

  • Uses WebRTC and WebSockets under the hood
  • Abstracts away the connection and streaming complexity
  • Simple configuration with source/target languages
  • Supports multiple input/output adapters (microphones, speakers, files, buffers)

How it works:

  1. Configure input/output adapters
  2. The SDK handles the entire pipeline
  3. Automatic transcription, translation, and synthesis
  4. Real-time audio stream ready for playback

All with just a few lines of code!

Installation

From PyPI

pip install palabra-ai

macOS SSL Certificate Setup

If you encounter SSL certificate errors on macOS like:

SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate

Option 1: Install Python certificates (recommended)

/Applications/Python\ $(python3 -c "import sys; print(f'{sys.version_info.major}.{sys.version_info.minor}')")/Install\ Certificates.command

Option 2: Use system certificates

pip install pip-system-certs

This will configure Python to use your system's certificate store.
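
To confirm that certificate verification works after either option, a quick check using only the standard library is enough (nothing Palabra-specific; pypi.org is just a convenient HTTPS endpoint):

import ssl
import urllib.request

# Show which CA bundle Python's default SSL context is using
print(ssl.get_default_verify_paths())

# Raises ssl.SSLCertVerificationError if verification is still broken
urllib.request.urlopen("https://pypi.org", timeout=10).close()
print("Certificate verification OK")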

Quick Start

Real-time microphone translation

from palabra_ai import (PalabraAI, Config, SourceLang, TargetLang,
                        EN, ES, DeviceManager)

palabra = PalabraAI()
dm = DeviceManager()
mic, speaker = dm.select_devices_interactive()
cfg = Config(SourceLang(EN, mic), [TargetLang(ES, speaker)])
palabra.run(cfg)

Set your API credentials as environment variables:

export PALABRA_API_KEY=your_api_key
export PALABRA_API_SECRET=your_api_secret
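
If you prefer to configure credentials from Python (for example in a notebook), setting the same variables via os.environ is equivalent to the shell exports above. A minimal sketch, assuming the SDK reads these environment variables when it authenticates:

import os

# Same variables as the shell exports above; set them before using PalabraAI
os.environ["PALABRA_API_KEY"] = "your_api_key"
os.environ["PALABRA_API_SECRET"] = "your_api_secret"

from palabra_ai import PalabraAI

palabra = PalabraAI()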

Examples

File-to-file translation

from palabra_ai import (PalabraAI, Config, SourceLang, TargetLang,
                        FileReader, FileWriter, EN, ES)

palabra = PalabraAI()
reader = FileReader("./speech/es.mp3")
writer = FileWriter("./es2en_out.wav")
cfg = Config(SourceLang(ES, reader), [TargetLang(EN, writer)])
palabra.run(cfg)

Multiple target languages

from palabra_ai import (PalabraAI, Config, SourceLang, TargetLang,
                        FileReader, FileWriter, EN, ES, FR, DE)

palabra = PalabraAI()
config = Config(
    source=SourceLang(EN, FileReader("presentation.mp3")),
    targets=[
        TargetLang(ES, FileWriter("spanish.wav")),
        TargetLang(FR, FileWriter("french.wav")),
        TargetLang(DE, FileWriter("german.wav"))
    ]
)
palabra.run(config)

Customizable output

Add a transcription of the source and translated speech.
Configure output to provide:

  • Audio only
  • Transcriptions only
  • Both audio and transcriptions

from palabra_ai import (
    PalabraAI,
    Config,
    SourceLang,
    TargetLang,
    FileReader,
    EN,
    ES,
)
from palabra_ai.base.message import TranscriptionMessage


async def print_translation_async(msg: TranscriptionMessage):
    print(repr(msg))


def print_translation(msg: TranscriptionMessage):
    print(str(msg))


palabra = PalabraAI()
cfg = Config(
    source=SourceLang(
        EN,
        FileReader("speech/en.mp3"),
        print_translation  # Callback for source transcriptions
    ),
    targets=[
        TargetLang(
            ES,
            # You can use transcription only, without an audio writer
            # FileWriter("./test_output.wav"),  # Optional: audio output
            on_transcription=print_translation_async  # Callback for translated transcriptions
        )
    ],
    silent=True,  # Set to True to disable verbose logging to console
)
palabra.run(cfg)

Transcription output options:

1. Audio only (default):

TargetLang(ES, FileWriter("output.wav"))

2. Transcription only:

TargetLang(ES, on_transcription=your_callback_function)

3. Audio and transcription:

TargetLang(ES, FileWriter("output.wav"), on_transcription=your_callback_function)

The transcription callbacks receive TranscriptionMessage objects containing the transcribed text and metadata.
Callbacks can be either synchronous or asynchronous functions.
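
For example, a minimal sketch of a callback that appends every transcription to a text file, relying only on str(msg) as in the example above (the transcript path is arbitrary):

from palabra_ai.base.message import TranscriptionMessage

def save_transcription(msg: TranscriptionMessage):
    # Append the human-readable form of each transcription to a local file
    with open("transcript.txt", "a", encoding="utf-8") as f:
        f.write(str(msg) + "\n")

# Used the same way as the callbacks above, e.g.:
# TargetLang(ES, on_transcription=save_transcription)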

Integrate with FFmpeg (streaming)

import io
from palabra_ai import (PalabraAI, Config, SourceLang, TargetLang,
                        BufferReader, BufferWriter, AR, EN, RunAsPipe)

ffmpeg_cmd = [
    'ffmpeg',
    '-i', 'speech/ar.mp3',
    '-f', 's16le',      # 16-bit PCM
    '-acodec', 'pcm_s16le',
    '-ar', '48000',     # 48kHz
    '-ac', '1',         # mono
    '-'                 # output to stdout
]

pipe_buffer = RunAsPipe(ffmpeg_cmd)
es_buffer = io.BytesIO()

palabra = PalabraAI()
reader = BufferReader(pipe_buffer)
writer = BufferWriter(es_buffer)
cfg = Config(SourceLang(AR, reader), [TargetLang(EN, writer)])
palabra.run(cfg)

print(f"Translated audio written to buffer with size: {es_buffer.getbuffer().nbytes} bytes")
with open("./ar2en_out.wav", "wb") as f:
    f.write(es_buffer.getbuffer())

Using buffers

import io
from palabra_ai import (PalabraAI, Config, SourceLang, TargetLang,
                        BufferReader, BufferWriter, AR, EN)
from palabra_ai.internal.audio import convert_any_to_pcm16

en_buffer, es_buffer = io.BytesIO(), io.BytesIO()
with open("speech/ar.mp3", "rb") as f:
    en_buffer.write(convert_any_to_pcm16(f.read()))
palabra = PalabraAI()
reader = BufferReader(en_buffer)
writer = BufferWriter(es_buffer)
cfg = Config(SourceLang(AR, reader), [TargetLang(EN, writer)])
palabra.run(cfg)
print(f"Translated audio written to buffer with size: {es_buffer.getbuffer().nbytes} bytes")
with open("./ar2en_out.wav", "wb") as f:
    f.write(es_buffer.getbuffer())

Using default audio devices

from palabra_ai import PalabraAI, Config, SourceLang, TargetLang, DeviceManager, EN, ES

dm = DeviceManager()
reader, writer = dm.get_default_readers_writers()

if reader and writer:
    palabra = PalabraAI()
    config = Config(
        source=SourceLang(EN, reader),
        targets=[TargetLang(ES, writer)]
    )
    palabra.run(config)

Async API

import asyncio
from palabra_ai import PalabraAI, Config, SourceLang, TargetLang, FileReader, FileWriter, EN, ES

async def translate():
    palabra = PalabraAI()
    config = Config(
        source=SourceLang(EN, FileReader("input.mp3")),
        targets=[TargetLang(ES, FileWriter("output.wav"))]
    )
    await palabra.run(config)

asyncio.run(translate())
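
Because run is awaited here, several translations can be driven concurrently with asyncio.gather. A sketch under the assumption that concurrent runs are supported (the file names are placeholders; one PalabraAI instance per job is the conservative choice):

import asyncio
from palabra_ai import PalabraAI, Config, SourceLang, TargetLang, FileReader, FileWriter, EN, ES, FR

async def translate_many():
    jobs = [
        Config(source=SourceLang(EN, FileReader("talk1.mp3")),
               targets=[TargetLang(ES, FileWriter("talk1_es.wav"))]),
        Config(source=SourceLang(EN, FileReader("talk2.mp3")),
               targets=[TargetLang(FR, FileWriter("talk2_fr.wav"))]),
    ]
    # One client per job to stay on the safe side; gather runs them concurrently
    await asyncio.gather(*(PalabraAI().run(cfg) for cfg in jobs))

asyncio.run(translate_many())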

I/O Adapters & Mixing

Available adapters

The Palabra AI SDK provides flexible I/O adapters that can be freely combined:

  • FileReader/FileWriter: Read from and write to audio files
  • DeviceReader/DeviceWriter: Use microphones and speakers
  • BufferReader/BufferWriter: Work with in-memory buffers
  • RunAsPipe: Run a command and expose its stdout as a pipe (e.g., FFmpeg output)

Mixing examples

Combine any input adapter with any output adapter:

Microphone to file - record translations

config = Config(
    source=SourceLang(EN, mic),
    targets=[TargetLang(ES, FileWriter("recording_es.wav"))]
)

File to speaker - play translations

config = Config(
    source=SourceLang(EN, FileReader("presentation.mp3")),
    targets=[TargetLang(ES, speaker)]
)

Microphone to multiple outputs

config = Config(
    source=SourceLang(EN, mic),
    targets=[
        TargetLang(ES, speaker),  # Play Spanish through speaker
        TargetLang(ES, FileWriter("spanish.wav")),  # Save Spanish to file
        TargetLang(FR, FileWriter("french.wav"))    # Save French to file
    ]
)

Buffer to buffer - for integration

input_buffer = io.BytesIO(audio_data)
output_buffer = io.BytesIO()

config = Config(
    source=SourceLang(EN, BufferReader(input_buffer)),
    targets=[TargetLang(ES, BufferWriter(output_buffer))]
)

FFmpeg pipe to speaker

pipe = RunAsPipe(ffmpeg_cmd)  # the FFmpeg command list from the streaming example above
config = Config(
    source=SourceLang(EN, BufferReader(pipe)),
    targets=[TargetLang(ES, speaker)]
)

Features

Real-time translation

Translate audio streams in real-time with minimal latency
Perfect for live conversations, conferences, and meetings

Voice cloning

Preserve the original speaker's voice characteristics in translations
Enable voice cloning in the configuration

Device management

Easy device selection with interactive prompts or programmatic access:

dm = DeviceManager()

# Interactive selection
mic, speaker = dm.select_devices_interactive()

# Get devices by name
mic = dm.get_mic_by_name("Blue Yeti")
speaker = dm.get_speaker_by_name("MacBook Pro Speakers")

# List all devices
input_devices = dm.get_input_devices()
output_devices = dm.get_output_devices()
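
A small sketch combining the calls above: prefer devices by name and fall back to the interactive picker if they are not found (the device names are examples, and it is an assumption that the lookup returns None for a missing device):

from palabra_ai import PalabraAI, Config, SourceLang, TargetLang, DeviceManager, EN, ES

dm = DeviceManager()

# Try known devices first; assumes get_*_by_name returns None when not found
mic = dm.get_mic_by_name("Blue Yeti")
speaker = dm.get_speaker_by_name("MacBook Pro Speakers")
if not mic or not speaker:
    mic, speaker = dm.select_devices_interactive()

cfg = Config(SourceLang(EN, mic), [TargetLang(ES, speaker)])
PalabraAI().run(cfg)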

Supported languages

Speech recognition languages

Arabic (AR), Chinese (ZH), Czech (CS), Danish (DA), Dutch (NL), English (EN), Finnish (FI), French (FR), German (DE), Greek (EL), Hebrew (HE), Hungarian (HU), Italian (IT), Japanese (JA), Korean (KO), Polish (PL), Portuguese (PT), Russian (RU), Spanish (ES), Turkish (TR), Ukrainian (UK)

Translation languages

Arabic (AR), Bulgarian (BG), Chinese Mandarin (ZH), Czech (CS), Danish (DA), Dutch (NL), English UK (EN_GB), English US (EN_US), Finnish (FI), French (FR), German (DE), Greek (EL), Hebrew (HE), Hungarian (HU), Indonesian (ID), Italian (IT), Japanese (JA), Korean (KO), Polish (PL), Portuguese (PT), Portuguese Brazilian (PT_BR), Romanian (RO), Russian (RU), Slovak (SK), Spanish (ES), Spanish Mexican (ES_MX), Swedish (SV), Turkish (TR), Ukrainian (UK), Vietnamese (VI)

Available language constants

from palabra_ai import (
    # English variants - 1.5+ billion speakers (including L2)
    EN, EN_AU, EN_CA, EN_GB, EN_US,

    # Chinese - 1.3+ billion speakers
    ZH,

    # Hindi - 600+ million speakers
    HI,

    # Spanish variants - 500+ million speakers
    ES, ES_MX,

    # Arabic variants - 400+ million speakers
    AR, AR_AE, AR_SA,

    # French variants - 280+ million speakers
    FR, FR_CA,

    # Portuguese variants - 260+ million speakers
    PT, PT_BR,

    # Russian - 260+ million speakers
    RU,

    # Japanese & Korean - 200+ million speakers combined
    JA, KO,

    # Southeast Asian languages - 400+ million speakers
    ID, VI, TA, MS, FIL,

    # Germanic languages - 150+ million speakers
    DE, NL, SV, NO, DA,

    # Other European languages - 300+ million speakers
    TR, IT, PL, UK, RO, EL, HU, CS, BG, SK, FI, HR,

    # Other languages - 40+ million speakers
    AZ, HE
)

Development status

Current status

  • Core SDK functionality
  • GitHub Actions CI/CD
  • Docker packaging
  • Python 3.11, 3.12, 3.13 support
  • PyPI publication (coming soon)
  • Documentation site (coming soon)
  • Code coverage reporting (setup required)

Current dev roadmap

  • TODO: global timeout support for long-running tasks
  • TODO: support for multiple source languages in a single run
  • TODO: fine-grained cancellation in cancel_all_tasks()
  • TODO: error handling improvements

Build status

  • Tests: running on Python 3.11, 3.12, 3.13
  • Release: automated releases with Docker images
  • Coverage: tests implemented, reporting setup needed

Requirements

  • Python 3.11+
  • Palabra AI API credentials (get them at palabra.ai)

Support

License

This project is licensed under the MIT License - see the LICENSE file for details.


© Palabra.ai, 2025 | Breaking down language barriers with AI
