OSIA
Your AI. Your Machine.
Your voice. Full control.
Everything you need.
Nothing you don't.
Built from the ground up for privacy, performance, and real-world utility.
Multi-Model AI
DeepSeek, Gemini, Groq, Anthropic, OpenAI, Ollama — configure your preferred models with automatic key rotation and intelligent fallbacks.
Vision & Computer Use
6-tier deterministic hierarchy: deep links → DOM → accessibility APIs → keyboard nav → zoom-vision → simple vision. Mouse is the last resort.
Desktop Automation
Open apps, click elements, type text, navigate menus. OSIA controls your PC like a human — through APIs first, vision last.
Voice Interface
Custom wake word with ML verifier, local STT via faster-whisper, multi-provider TTS. Sub-100ms UI response, minimal voice latency.
MCP Connectors
Extend OSIA with Model Context Protocol servers. Add tools dynamically — each connector is indistinguishable from native capabilities.
Mobile Companion
Flutter app for Android & iOS. Full mirror of PC settings, chat history, voice input, cron jobs. Your phone becomes a full control panel.
Remote Access
Cloudflare Tunnel provides secure remote access to your OSIA instance from anywhere. Token-authenticated, zero-config networking.
Web Control
Playwright-powered browser automation. Precise DOM-level clicking, page navigation, form filling — all through natural language commands.
48+ Native Tools
File management, system control, web scraping, API calls, scheduling, and more. Each tool is deterministic and verifiable.
Deterministic First.
OSIA uses a 6-tier hierarchy for computer control. Mouse and vision are the absolute last resort. Every action goes through APIs, DOM, and accessibility layers first — ensuring precision, speed, and reliability.
Deep Links & URL Schemes
steam://, spotify:, direct APIs
DOM / Playwright
Precise web element interaction via CDP
Accessibility APIs
UIA (Windows), AT-SPI (Linux), AX (macOS)
Keyboard Navigation
Tab/Enter + UIA focus reading
Zoom Vision (2-pass)
Refined screenshot analysis
Simple Vision
Last resort — full screenshot analysis
OSIA never claims success without proof. It reads app manifests, checks process states, enumerates windows, and inspects DOM to verify every action — no hallucinated results. Screenshots are stripped from history after each turn to prevent context ballooning.
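The tier order and the verify-before-reporting rule can be sketched in a few lines of Python. This is a minimal illustration, not OSIA's actual code: the `Tier` type, the `run_action` function, and the demo handlers are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Tier:
    """One level of the control hierarchy (hypothetical type)."""
    name: str
    attempt: Callable[[str], bool]  # returns True if the tier performed the action


def run_action(action: str, tiers: list[Tier],
               verify: Callable[[str], bool]) -> Optional[str]:
    """Try tiers in order; report the first one whose result is verified."""
    for tier in tiers:
        try:
            attempted = tier.attempt(action)
        except Exception:
            attempted = False  # a failing tier falls through to the next one
        if attempted and verify(action):
            return tier.name
    return None  # nothing verified, so no success is claimed


# Demo with stand-in handlers: only the accessibility tier "works" here.
state = {"app_running": False}


def unavailable(action: str) -> bool:
    return False


def via_accessibility(action: str) -> bool:
    state["app_running"] = True
    return True


tiers = [
    Tier("deep_link", unavailable),
    Tier("dom", unavailable),
    Tier("accessibility", via_accessibility),
    Tier("keyboard", unavailable),
    Tier("zoom_vision", unavailable),
    Tier("simple_vision", unavailable),
]

used = run_action("open calculator", tiers,
                  verify=lambda action: state["app_running"])
print(used)
```

In this sketch the vision tiers are only reached when every earlier tier declines or fails verification, which mirrors the "mouse and vision last" ordering described above.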
MCP Connectors
Extend OSIA with Model Context Protocol servers. Each connector adds new tools that are indistinguishable from native capabilities. Dynamic injection, zero code changes.
Dynamic Tool Injection
Tools are namespaced (mcp__server__tool) and injected at runtime into the agent's tool map.
Async Architecture
Dedicated asyncio event loop in a daemon thread. Sync wrappers for the agent dispatch system.
Seamless Integration
MCP tools appear in the system prompt, are callable like native tools, and return string results.
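The three points above can be illustrated together. The sketch below assumes a plain dictionary as the tool map and uses a stand-in coroutine instead of a real MCP client; `make_sync`, `inject_tools`, and `search_issues` are hypothetical names, and only the `mcp__server__tool` naming scheme comes from the description above.

```python
import asyncio
import threading
from typing import Any, Callable, Coroutine

# Dedicated event loop running in a daemon thread.
_loop = asyncio.new_event_loop()
threading.Thread(target=_loop.run_forever, daemon=True).start()

AsyncTool = Callable[..., Coroutine[Any, Any, str]]


def make_sync(async_tool: AsyncTool) -> Callable[..., str]:
    """Wrap an async tool so synchronous dispatch code can call it."""
    def wrapper(**kwargs: Any) -> str:
        future = asyncio.run_coroutine_threadsafe(async_tool(**kwargs), _loop)
        return future.result(timeout=30)  # block the caller until the tool returns
    return wrapper


def inject_tools(tool_map: dict[str, Callable[..., str]],
                 server: str, tools: dict[str, AsyncTool]) -> None:
    """Register each connector tool under a namespaced key at runtime."""
    for name, tool in tools.items():
        tool_map[f"mcp__{server}__{name}"] = make_sync(tool)


# Stand-in for a tool exposed by a connector (not a real MCP call).
async def search_issues(query: str) -> str:
    await asyncio.sleep(0)
    return f"issues matching {query!r}"


tool_map: dict[str, Callable[..., str]] = {"read_file": lambda path: "..."}
inject_tools(tool_map, "github", {"search_issues": search_issues})

result = tool_map["mcp__github__search_issues"](query="bug")
print(result)
```

Because the wrapper returns a plain string, the caller cannot tell a connector tool from a native one, which is the property the section describes.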
{
  "connectors": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_TOKEN": "..." }
    },
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/data"]
    },
    "postgres": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres"],
      "env": { "DATABASE_URL": "..." }
    }
  }
}
Your data. Your rules.
Privacy isn't a feature — it's the foundation. OSIA is built from the ground up to keep everything on your machine.
Zero Cloud Dependency
All AI processing can run locally. No data leaves your machine unless you explicitly configure a cloud provider.
Token Authentication
20-character secure token for WebSocket connections. Automatic silent upgrade from legacy 16-char tokens.
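As a rough illustration of how such tokens might be handled, the sketch below generates a 20-character token, replaces a 16-character legacy one, and compares tokens in constant time. The alphabet, the function names, and the upgrade trigger are assumptions; the page does not describe how OSIA implements the upgrade.

```python
import hmac
import secrets
import string

TOKEN_LENGTH = 20
LEGACY_LENGTH = 16
ALPHABET = string.ascii_letters + string.digits  # assumed character set


def new_token() -> str:
    """Generate a random token from a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(TOKEN_LENGTH))


def upgrade_if_legacy(stored: str) -> str:
    """Return a fresh 20-character token if the stored one has the legacy length."""
    return new_token() if len(stored) == LEGACY_LENGTH else stored


def is_authorized(presented: str, stored: str) -> bool:
    """Compare tokens in constant time to avoid timing side channels."""
    return hmac.compare_digest(presented.encode(), stored.encode())


legacy = "a" * LEGACY_LENGTH
current = upgrade_if_legacy(legacy)
print(len(current), is_authorized(current, current))
```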
Encrypted Tunnel
Cloudflare Tunnel provides secure remote access without exposing your IP or opening ports on your router.
Local-First Storage
Configuration, chat history, API keys, and wake word models are all stored locally. Secrets are gitignored and never committed.
Key Rotation
Universal key chain with automatic rotation on technical errors (429, 5xx, auth failures). Never on content errors.
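The rotation rule can be sketched as follows. This is an illustrative example only: `ProviderError`, `call_with_rotation`, and the mapping of "auth failures" to HTTP 401 and 403 are assumptions, not OSIA's actual classes or status handling.

```python
from typing import Callable, Optional


class ProviderError(Exception):
    """Hypothetical error carrying the HTTP status a provider returned."""

    def __init__(self, status: int):
        super().__init__(f"provider returned HTTP {status}")
        self.status = status


def is_technical(status: int) -> bool:
    # Rate limit (429), auth failure (assumed 401/403), or server error (5xx).
    return status in (401, 403, 429) or 500 <= status <= 599


def call_with_rotation(keys: list[str], send: Callable[[str], str]) -> str:
    """Try each key in turn, rotating only on technical errors."""
    last_error: Optional[ProviderError] = None
    for key in keys:
        try:
            return send(key)
        except ProviderError as err:
            if not is_technical(err.status):
                raise  # content error: another key would not help
            last_error = err
    raise RuntimeError("all keys exhausted") from last_error


# Demo: the first key is rate limited, the second one succeeds.
def fake_send(key: str) -> str:
    if key == "key-a":
        raise ProviderError(429)
    return f"ok via {key}"


answer = call_with_rotation(["key-a", "key-b"], fake_send)
print(answer)
```

A content error such as a rejected prompt is re-raised immediately in this sketch, since switching keys would only repeat the same failure.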
No Telemetry
OSIA collects zero usage data. No analytics, no tracking pixels, no phone-home behavior. Your usage is yours.
Confidential by Design
OSIA follows three core pillars: Confidential (100% local), Secure (Cloudflare Tunnel + token auth), and Accessible (app → connection → API). Your voice data, conversations, and system interactions never leave your device without explicit consent.
PC and phone become one.
Everything configurable on PC is mirrored on mobile. Your phone isn't a remote — it's a full control panel.
Windows
Primary development platform. Full feature set including Task Scheduler integration, UIA accessibility, and native app control.
Python 3.12 / PyQt6 / FastAPI
Linux
Original platform (Fedora). Wayland + X11 support with AT-SPI accessibility and multi-screenshot backends.
Python 3.12 / PyQt6 / FastAPI
macOS
Full port planned after Windows stabilization. AXUIElement accessibility, native menu integration.
Python 3.12 / PyQt6 / FastAPI
Android
Full companion app with chat, voice, settings mirror, cron jobs, and conversation management.
Flutter / Dart
iOS
Feature-parity companion app. Voice input, chat history, full settings control.
Flutter / Dart
Ready to take control?
OSIA is open source and free. Install it on your machine and experience what a truly private AI assistant can do.
