Core is working; improvements are shipping daily.
OpenCluely is a revolutionary AI-powered desktop application that provides invisible, real-time assistance during technical interview rounds.
- Floating Overlay Bar: Compact command center with camera, mic, and skill selector
- Draggable Answer Window: Move and resize AI response window anywhere
- Close Button: Clean × button to close answer window when needed
- Auto-Hide Mic: Microphone button appears only when Azure Speech is configured
- Interactive Chat: Full conversation window with markdown support
- Glass Morphism: Beautiful blur effects and transparency
- Adaptive Layout: UI adjusts based on available services
- Smart Resizing: Windows resize automatically to fit content
- Professional Look: Mimics system applications for perfect stealth
- Stealth overlay with draggable command bar and click‑through toggle
- Screenshot capture with direct Gemini analysis (no OCR step)
- AI response window with markdown and code highlighting
- Global shortcuts (capture, visibility, interaction, chat, settings)
- Session memory and chat UI
- Language picker and DSA skill prompt
- Optional Azure Speech integration with auto‑hide mic
- Multi‑monitor and area capture APIs
- Window binding and positioning system
- Settings management with app icon/stealth modes
- Hidden during screen share (auto‑hide all windows while screen is being shared)
- Multi‑model support (OpenAI/Anthropic/Local backends alongside Gemini)
- Auto‑typer for code snippets (paste or simulate typing into editors/IDEs)
- Export conversation history (save sessions as markdown/PDF)
- Performance optimizations (faster startup, reduced memory usage)
- Enhanced stealth modes (process name randomization, deeper OS integration)
The setup script automatically handles configuration. You only need:
# Required: Google Gemini API Key (setup script will ask for this)
GEMINI_API_KEY=your_gemini_api_key_here
# Optional: Azure Speech Recognition (add later if you want voice features)
AZURE_SPEECH_KEY=your_azure_speech_key
AZURE_SPEECH_REGION=your_region
Note: Speech recognition is completely optional. If Azure credentials are not provided, the microphone button will be automatically hidden from all interfaces.
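The auto-hide behavior described in the note can be pictured with a small sketch. This is only an illustration under stated assumptions, not OpenCluely's actual code: `parseEnv` and `detectFeatures` are hypothetical helper names, and only the three variable names come from this README.

```javascript
// Hypothetical sketch: derive UI feature flags from .env contents.
// Only the variable names (GEMINI_API_KEY, AZURE_SPEECH_KEY,
// AZURE_SPEECH_REGION) come from the README; the rest is assumed.

// Parse simple KEY=value lines, skipping blank lines and # comments.
function parseEnv(text) {
  const vars = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue; // not a KEY=value line
    vars[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return vars;
}

// Decide which features the UI should expose.
function detectFeatures(vars) {
  const has = (name) => typeof vars[name] === "string" && vars[name] !== "";
  return {
    ai: has("GEMINI_API_KEY"),
    // The mic button is shown only when both Azure values are present.
    speech: has("AZURE_SPEECH_KEY") && has("AZURE_SPEECH_REGION"),
  };
}

const features = detectFeatures(parseEnv("GEMINI_API_KEY=abc123\n"));
console.log(features); // speech is false here, so the mic button would stay hidden
```

Requiring both the key and the region before enabling speech mirrors the note above: a partially filled Azure configuration is treated the same as none.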
1. Clone the repository
   git clone https://github.com/TechyCSR/OpenCluely.git
   cd OpenCluely
2. Get your Gemini API key (Required)
   - Visit Google AI Studio
   - Click "Create API Key"
   - Copy the key (you'll need it in step 3)
3. Run the setup script (One command does everything!)
   ./setup.sh
That's it! The setup script will:
- Install all dependencies automatically
- Create and configure your .env file
- Build the app (if needed)
- Launch OpenCluely ready to use (if that doesn't work, run npm install and then npm start)
- Windows: Use Git Bash (comes with Git for Windows), WSL, or any bash environment
- macOS/Linux: Use your regular terminal
- All platforms: No manual npm commands needed - the setup script handles everything
./setup.sh --build # Build distributable for your OS
./setup.sh --ci # Use npm ci instead of npm install
./setup.sh --no-run # Setup only, don't launch the app
./setup.sh --install-system-deps # Install sox for microphone (optional)
Voice recognition is completely optional. The setup script will create a .env file with just the required Gemini key. To add voice features:
1. Get Azure Speech credentials:
   - Visit Azure Portal
   - Create a Speech Service
   - Copy your key and region
2. Add to your .env file:
   # Already configured by setup script
   GEMINI_API_KEY=your_gemini_api_key_here
   # Add these for voice features (optional)
   AZURE_SPEECH_KEY=your_azure_speech_key
   AZURE_SPEECH_REGION=your_region
3. Restart the app - microphone buttons will now appear automatically
Action | Shortcut | Description |
---|---|---|
Screenshot Capture | ⌘⇧S | Capture screen and analyze via Gemini (image understanding) |
Toggle Speech | Alt+R | Start/stop voice recognition (if configured) |
Toggle Visibility | ⌘⇧V | Show/hide all windows |
Toggle Interaction | ⌘⇧I or Alt+A | Enable/disable window interaction |
Switch to Chat | ⌘⇧C | Open interactive chat window |
Settings | ⌘, | Open settings panel |
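For anyone scripting against these shortcuts, the symbol notation in the table can be translated into Electron-style accelerator strings. The sketch below is an assumption-laden illustration: `toAccelerator` and `parseShortcutCell` are hypothetical names, and mapping ⌘ to `CommandOrControl` is a guess at cross-platform behavior that this README does not confirm.

```javascript
// Hypothetical sketch: convert the table's shortcut notation into
// Electron-style accelerator strings. The symbol mapping is an assumption.
const SYMBOLS = {
  "⌘": "CommandOrControl",
  "⇧": "Shift",
  "⌥": "Alt",
  "⌃": "Control",
};

function toAccelerator(shortcut) {
  // Entries such as "Alt+R" are already in accelerator form.
  if (shortcut.includes("+")) return shortcut;
  const parts = [];
  for (const ch of shortcut) {
    // Known modifier symbols are expanded; other characters are the key itself.
    parts.push(SYMBOLS[ch] ?? ch.toUpperCase());
  }
  return parts.join("+");
}

// A table cell may list alternatives separated by " or ".
function parseShortcutCell(cell) {
  return cell.split(" or ").map((s) => toAccelerator(s.trim()));
}

console.log(toAccelerator("⌘⇧S"));
console.log(parseShortcutCell("⌘⇧I or Alt+A"));
```

In an Electron main process, strings in this form could be passed to `globalShortcut.register(accelerator, callback)`; whether OpenCluely registers its shortcuts this way is not stated here.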
- Start OpenCluely → App appears as system process (Terminal/Activity Monitor)
- Position Windows → Drag overlay and answer windows to preferred locations
- Capture Questions → Use screenshot (⌘⇧S) or voice commands
- Get AI Answers → Instant responses in draggable answer window
- Interactive Chat → Type or speak for detailed conversations
- Stay Stealth → All operations invisible to screen recording
- Draggable Interface: Click and drag any window to reposition
- Auto-resize: Windows automatically adjust to content
- Close Button: Click × to close answer window
- Always on Top: Windows stay above all applications
- Context Awareness: Remembers entire conversation
- Code Detection: Automatically formats code blocks
- Language Specific: Tailored responses for selected programming language
- Session Memory: Maintains context across multiple questions
- Image Understanding: DSA prompt is applied only for new image-based queries; chat messages don’t include the full prompt
- Multi-monitor & Area Capture: Programmatic APIs allow targeting a display and optional rectangular crop for focused analysis
- Real-time Transcription: Speak questions naturally
- Listening Animation: Visual feedback during recording
- Interim Results: See transcription as you speak
- Auto-processing: Instant AI responses to voice input
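The multi-monitor and area capture item above mentions an optional rectangular crop. This README does not document the capture API itself, so the following is only a sketch of one piece such an API would plausibly need: clamping a requested crop to the bounds of the chosen display. The function name `clampCrop` and the `{ x, y, width, height }` rectangle shape are assumptions.

```javascript
// Hypothetical sketch: clamp a requested crop rectangle to a display's size.
// Coordinates are relative to the display's top-left corner. The rectangle
// shape { x, y, width, height } is an assumption, not OpenCluely's API.
function clampCrop(display, crop) {
  // No crop requested: capture the whole display.
  if (!crop) {
    return { x: 0, y: 0, width: display.width, height: display.height };
  }
  const clamp = (value, max) => Math.min(Math.max(value, 0), max);
  const left = clamp(crop.x, display.width);
  const top = clamp(crop.y, display.height);
  const right = clamp(crop.x + crop.width, display.width);
  const bottom = clamp(crop.y + crop.height, display.height);
  const width = right - left;
  const height = bottom - top;
  // A crop that lies entirely off-screen leaves nothing to capture.
  if (width <= 0 || height <= 0) return null;
  return { x: left, y: top, width, height };
}

const display = { width: 1920, height: 1080 };
console.log(clampCrop(display, { x: 1800, y: -20, width: 400, height: 200 }));
```

Returning `null` for an empty intersection lets a caller fall back to a full-screen capture or report the problem, rather than sending an empty image for analysis.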
🧩 Troubleshooting
- setup.sh not found or won't run
  - Make sure you're in the OpenCluely directory: cd OpenCluely
  - Make the script executable: chmod +x setup.sh
  - On Windows, use Git Bash (comes with Git for Windows)
- Setup script stops with exit code 130
  - This means you pressed Ctrl+C. Just run ./setup.sh again
- Node or npm not found
  - Install Node.js 18+ from nodejs.org
  - Restart your terminal and try again
- Electron won't start or shows blank window (Linux)
  - Try: npm run dev
  - Ensure X11/XWayland is available if running in headless environments
- macOS screen capture doesn't work
  - Grant "Screen Recording" permission in System Settings → Privacy & Security → Screen Recording
  - Quit and relaunch the app after granting permission
- Windows SmartScreen blocks the app
  - Click "More info" → "Run anyway" or use npm start during development
- Microphone/voice not working
  - Voice is optional - ignore related warnings if you don't need it
  - To enable: install sox (Linux/macOS) and add Azure keys to .env
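When diagnosing the "Node or npm not found" case, it can help to confirm that the installed Node.js actually meets the 18+ requirement. A minimal preflight sketch follows; `meetsMinimumNode` is a hypothetical helper, not part of OpenCluely or its setup script.

```javascript
// Hypothetical preflight check: does a Node.js version string satisfy
// the README's "Node.js 18+" requirement?
function meetsMinimumNode(version, minimumMajor = 18) {
  // Accepts both "v18.19.0" (process.version style) and "18.19.0".
  const major = Number.parseInt(version.replace(/^v/, "").split(".")[0], 10);
  return Number.isInteger(major) && major >= minimumMajor;
}

if (!meetsMinimumNode(process.version)) {
  console.error(`Node.js 18+ is required, found ${process.version}`);
}
```

Saved to a file and run with `node`, this prints an error only when the running Node.js is older than 18.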
⚖️ Legal & Ethics
OpenCluely is provided for educational and research purposes. Users are responsible for:
- Complying with interview guidelines
- Respecting company policies
- Understanding legal implications
- Using ethically and responsibly
- No data collection or telemetry
- All processing happens locally
- API communications are encrypted
- Session data stays on your device
This project is licensed under the MIT License - see the LICENSE file for details.
- Google Gemini: Powering AI intelligence
- Azure Speech: Optional voice recognition
- Electron: Cross-platform desktop framework
- Community: Amazing contributors and feedback
- Vysper: UI and code structure inspiration (see Vysper by varun-singhh)
⭐ Star this repo if OpenCluely helped you ace your interviews or you vibed with it!
Made with ❤️ by TechyCSR