mvijay24/claude__voice-in-windows

🎤 Whisper Paste - Hinglish Voice Transcription

A lightweight Windows system tray application that converts speech to text using OpenAI's Whisper API. Perfect for code-switching between Hindi and English!

✨ Features

  • 🔵 System Tray Application - Runs silently in the background
  • 🔑 Easy API Key Setup - Set your API key directly from the tray menu
  • 🎯 Two Output Modes:
    • Hinglish (Roman): Preserves Hindi words in Roman script
    • English: Translates everything to English
  • ⌨️ Global Hotkey - Ctrl+Space to start/stop recording
  • 📋 Auto-Paste - Transcribed text automatically pastes at the cursor
  • 🔴 Visual Feedback - Icon changes color when recording
  • ⏱️ Long Recordings - Up to 5 minutes per session
  • 💾 Settings Persistence - Remembers your API key and preferences

🚀 Quick Start

Prerequisites

Installation

  1. Clone the repository

    git clone https://github.com/mvijay24/whisper-paste.git
    cd whisper-paste
  2. Install dependencies

    setup.bat
  3. Run the application

    start_silent.vbs

    Alternatively, double-click start_silent.vbs in File Explorer for a completely silent startup.

  4. Set your API key

    • Right-click the tray icon
    • Select "🔑 Set API Key..."
    • Enter your OpenAI API key
    • Click Save

📖 Usage

  1. Look for the mic icon in your system tray (near the clock)
  2. Right-click the icon to access settings:
    • Set/Update API Key
    • Choose output mode (Hinglish or English)
  3. Press Ctrl+Space to start recording
  4. Speak in Hindi, English, or Hinglish
  5. Press Ctrl+Space again to stop
  6. Text automatically pastes at your cursor position!
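
The start/stop behaviour of the hotkey can be pictured as a small toggle with a time limit. The sketch below is not taken from whisper_tray.pyw; the class name `RecordingToggle`, its methods, and the injectable `clock` are assumptions made for illustration, and only the 5-minute cap comes from the feature list above.

```python
import time

MAX_RECORDING_SECONDS = 5 * 60  # the 5-minute session limit stated in this README


class RecordingToggle:
    """Hypothetical helper: decides whether a hotkey press starts or stops recording."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the logic can be exercised without waiting
        self._started_at = None  # None means "not recording"

    @property
    def recording(self):
        return self._started_at is not None

    def on_hotkey(self):
        """Flip the state and report what happened: 'started' or 'stopped'."""
        if self.recording:
            self._started_at = None
            return "stopped"
        self._started_at = self._clock()
        return "started"

    def limit_reached(self):
        """True once an active recording has run for the maximum duration."""
        if not self.recording:
            return False
        return self._clock() - self._started_at >= MAX_RECORDING_SECONDS
```

In a real tray app, the hotkey callback would call `on_hotkey()` and a timer would poll `limit_reached()` to stop long recordings automatically.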

Menu Options

  • 🔑 Set API Key... - Add or update your OpenAI API key
  • API Status - Shows connection status (✓ Connected or ⚠️ No API Key)
  • 📝 Output Mode - Choose between Hinglish (Roman) or English
  • 🐛 Enable and Display Debug Panel - Shows real-time execution logs
  • 📊 Session Log Summary - Shows detailed report after each recording
  • Exit - Properly closes the application

Examples

Hinglish Mode:

  • You say: "Bhai ye file jaldi bhej de"
  • Output: bhai ye file jaldi bhej de

English Mode:

  • You say: "Bhai ye file jaldi bhej de"
  • Output: brother send this file quickly
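
One plausible way to get these two behaviours from OpenAI's audio API is to send English mode to the translations endpoint and Hinglish mode to the transcriptions endpoint with a Roman-script prompt. This README does not say how the app actually does it, so treat the sketch below as an assumption: the function `build_request`, the `"english"` mode value, and the prompt text are all hypothetical, and a prompt only nudges the model, it does not guarantee Roman-script output.

```python
WHISPER_MODEL = "whisper-1"
API_BASE = "https://api.openai.com/v1/audio"

# Hypothetical prompt written in Roman-script Hindi to bias the transcript style.
HINGLISH_PROMPT = "Bhai, ye file jaldi bhej de. Main abhi meeting mein hoon."


def build_request(output_mode):
    """Return (url, form_fields) for the given output mode (illustrative only)."""
    if output_mode == "english":
        # The translations endpoint always produces English text.
        return f"{API_BASE}/translations", {"model": WHISPER_MODEL}
    if output_mode == "hinglish":
        # The transcriptions endpoint keeps the spoken language; the prompt hints at script.
        fields = {"model": WHISPER_MODEL, "prompt": HINGLISH_PROMPT}
        return f"{API_BASE}/transcriptions", fields
    raise ValueError(f"unknown output mode: {output_mode!r}")
```

The returned URL and fields would then be sent as a multipart form upload together with the recorded audio file and an `Authorization: Bearer <API key>` header.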

🛠️ Building Executable

To create a standalone .exe file:

build.bat

The executable will be created in the dist folder.

📁 Project Structure

whisper-paste/
├── whisper_tray.pyw      # Main application (no console window)
├── start_silent.vbs      # Silent launcher
├── start.bat             # Standard launcher
├── restart.bat           # Kill old instances & restart
├── setup.bat             # Install dependencies
├── build.bat             # Build executable
├── icon.ico              # Application icon
├── settings.json         # Saved settings (auto-created)
└── README.md             # This file

⚙️ Configuration

Settings are automatically saved in settings.json:

{
  "output_mode": "hinglish",
  "api_key": "sk-..."
}
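
For readers curious how a file like this can be read and written safely, here is a minimal sketch. It is not the app's own code: `load_settings`, `save_settings`, and the `DEFAULTS` dictionary are hypothetical names, and the only facts taken from this README are the two keys shown above.

```python
import json
from pathlib import Path

# Assumed defaults; only the key names come from the README's settings.json example.
DEFAULTS = {"output_mode": "hinglish", "api_key": ""}


def load_settings(path):
    """Read settings, falling back to defaults if the file is missing or unreadable."""
    settings = dict(DEFAULTS)
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return settings
    if isinstance(data, dict):
        # Ignore unknown keys so a stale file cannot inject unexpected options.
        settings.update({k: v for k, v in data.items() if k in DEFAULTS})
    return settings


def save_settings(path, settings):
    """Write settings as indented JSON."""
    Path(path).write_text(json.dumps(settings, indent=2), encoding="utf-8")
```

Falling back to defaults means a deleted or corrupted settings.json simply behaves like a first run.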

🔧 Troubleshooting

Can't see the tray icon?

  • Click the "Show hidden icons" arrow in the system tray
  • Use restart.bat to kill old instances

API Key issues?

  • Ensure you have a valid OpenAI API key
  • Check your API usage limits at OpenAI Dashboard

No audio recorded?

  • Check microphone permissions in Windows Settings
  • Ensure the default microphone is set correctly

Text not pasting or cursor errors?

  • Close clipboard managers like BeefText, Ditto, or ClipboardFusion - they interfere with paste functionality
  • Disable any text expander software temporarily
  • If you see "[WinError 1402] Invalid cursor handle", it's likely due to clipboard manager interference
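
If you are adapting the code and hit intermittent clipboard errors, one common mitigation is to retry the failing step a few times before giving up. The helper below is a generic sketch, not something the app is known to do; `retry` and its parameters are hypothetical, and it only assumes the failure surfaces as an `OSError` (which is how Python reports WinError codes).

```python
import time


def retry(action, attempts=3, delay=0.1, sleep=time.sleep):
    """Call action(); on OSError wait and try again, re-raising after the last attempt."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except OSError:
            if attempt == attempts:
                raise  # out of retries: let the caller see the original error
            sleep(delay)
```

A paste routine could be wrapped as `retry(do_paste)`, where `do_paste` is whatever function copies the text and sends the paste keystroke. Closing the interfering clipboard manager remains the more reliable fix.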

💰 Cost

  • Uses OpenAI's Whisper API
  • Approximately $0.006 per minute of audio
  • See OpenAI Pricing
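
To get a feel for what a session costs, the per-minute rate above can be turned into a quick estimate. The function name `estimate_cost` is made up for this sketch, and the rate is the approximate figure quoted here, so check current pricing before relying on it.

```python
PRICE_PER_MINUTE_USD = 0.006  # approximate rate quoted in this README


def estimate_cost(seconds):
    """Approximate cost in USD for a recording of the given length in seconds."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    return seconds / 60 * PRICE_PER_MINUTE_USD
```

At this rate a maximum-length 5-minute recording comes to roughly $0.03.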

💡 Why "Toast"?

The small popup notification that shows transcribed text is called a "toast" because:

  • It "pops up" like bread from a toaster
  • It appears briefly and then disappears
  • Common UI term from Android/Windows for temporary notifications
  • Shows at the corner of the screen without interrupting workflow

🤝 Contributing

Feel free to open issues or submit pull requests!

📜 License

MIT License - feel free to use this in your projects!

🙏 Acknowledgments

  • OpenAI for the amazing Whisper API
  • The Python community for excellent libraries
  • Special thanks to the Hinglish-speaking community!

Made with ❤️ for the Hinglish-speaking developers!
