Open Source AI Assistant

OSIA

Your AI. Your Machine.

Your voice. Full control.

Capabilities

Everything you need.
Nothing you don't.

Built from the ground up for privacy, performance, and real-world utility.

Multi-Model AI

DeepSeek, Gemini, Groq, Anthropic, OpenAI, Ollama — configure your preferred models with automatic key rotation and intelligent fallbacks.

Vision & Computer Use

6-tier deterministic hierarchy: deep links → DOM → accessibility APIs → keyboard nav → zoom-vision → simple vision. Mouse is the last resort.

Desktop Automation

Open apps, click elements, type text, navigate menus. OSIA controls your PC like a human — through APIs first, vision last.

Voice Interface

Custom wake word with ML verifier, local STT via faster-whisper, multi-provider TTS. Sub-100ms UI response, minimal voice latency.

MCP Connectors

Extend OSIA with Model Context Protocol servers. Add tools dynamically — each connector is indistinguishable from native capabilities.

Mobile Companion

Flutter app for Android & iOS. Full mirror of PC settings, chat history, voice input, cron jobs. Your phone becomes the remote.

Remote Access

Cloudflare Tunnel provides secure remote access to your OSIA instance from anywhere. Token-authenticated, zero-config networking.

Web Control

Playwright-powered browser automation. Precise DOM-level clicking, page navigation, form filling — all through natural language commands.

48+ Native Tools

File management, system control, web scraping, API calls, scheduling, and more. Each tool is deterministic and verifiable.

Architecture

Deterministic First.

OSIA uses a 6-tier hierarchy for computer control. Mouse and vision are the absolute last resort. Every action goes through APIs, DOM, and accessibility layers first — ensuring precision, speed, and reliability.

Tier 1

Deep Links & URL Schemes

steam://, spotify:, direct APIs

Tier 2

DOM / Playwright

Precise web element interaction via CDP

Tier 3

Accessibility APIs

UIA (Windows), AT-SPI (Linux), AX (macOS)

Tier 4

Keyboard Navigation

Tab/Enter + UIA focus reading

Tier 5

Zoom Vision (2-pass)

Refined screenshot analysis

Tier 6

Simple Vision

Last resort — full screenshot analysis

Deterministic Verification

OSIA never claims success without proof. It reads app manifests, checks process states, enumerates windows, and inspects DOM to verify every action — no hallucinated results. Screenshots are stripped from history after each turn to prevent context ballooning.

Extensibility

MCP Connectors

Extend OSIA with Model Context Protocol servers. Each connector adds new tools that are indistinguishable from native capabilities. Dynamic injection, zero code changes.

Dynamic Tool Injection

Tools are namespaced (mcp__server__tool) and injected at runtime into the agent's tool map.

Async Architecture

Dedicated asyncio event loop in a daemon thread. Sync wrappers for the agent dispatch system.

Seamless Integration

MCP tools appear in the system prompt, are callable like native tools, and return string results.

mcp_config.json
{
  "connectors": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_TOKEN": "..." }
    },
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/data"]
    },
    "postgres": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres"],
      "env": { "DATABASE_URL": "..." }
    }
  }
}
3 connectors active
Security & Privacy

Your data. Your rules.

Privacy isn't a feature — it's the foundation. OSIA is built from the ground up to keep everything on your machine.

Zero Cloud Dependency

All AI processing can run locally. No data leaves your machine unless you explicitly configure a cloud provider.

Token Authentication

20-character secure token for WebSocket connections. Automatic silent upgrade from legacy 16-char tokens.

Encrypted Tunnel

Cloudflare Tunnel provides secure remote access without exposing your IP or opening ports on your router.

Local-First Storage

Configuration, chat history, API keys, and wake word models all stored locally. Gitignored secrets, never committed.

Key Rotation

Universal key chain with automatic rotation on technical errors (429, 5xx, auth failures). Never on content errors.

No Telemetry

OSIA collects zero usage data. No analytics, no tracking pixels, no phone-home behavior. Your usage is yours.

Confidential by Design

OSIA follows three core pillars: Confidential (100% local), Secure (Cloudflare Tunnel + token auth), and Accessible (app → connection → API). Your voice data, conversations, and system interactions never leave your device without explicit consent.

Platforms

PC and phone become one.

Everything configurable on PC is mirrored on mobile. Your phone isn't a remote — it's a full control panel.

Available

Windows

Primary development platform. Full feature set including Task Scheduler integration, UIA accessibility, and native app control.

Python 3.12 / PyQt6 / FastAPI
Available

Linux

Original platform (Fedora). Wayland + X11 support with AT-SPI accessibility and multi-screenshot backends.

Python 3.12 / PyQt6 / FastAPI
Planned

macOS

Full port planned after Windows stabilization. AXUIElement accessibility, native menu integration.

Python 3.12 / PyQt6 / FastAPI
Available

Android

Full companion app with chat, voice, settings mirror, cron jobs, and conversation management.

Flutter / Dart
Available

iOS

Feature-parity companion app. Voice input, chat history, full settings control.

Flutter / Dart

Ready to take control?

OSIA is open source and free. Install it on your machine and experience what a truly private AI assistant can do.

WindowsLinuxmacOS (soon)AndroidiOS