System Architecture
Privane is the execution infrastructure for sovereign AI. Rather than relying on cloud-centric models that ingest raw context and corporate credentials, Privane introduces a decoupled, hybrid architecture.
It splits AI agent execution into three discrete components:
- Privane OSS (Open Source Client): Executes cognitive reasoning loops, compiles prompts, and runs local, secure sandboxed tools directly on native hardware.
- Privane Cloud (Managed Gateway): Operates headless Chromium clusters, manages SaaS OAuth refresh configurations and anti-bot mechanisms, and routes API requests.
- Privane Hub (Hybrid Registry): Catalogs and hosts templates, custom MCP configurations, workflows, and tools.
Unified System Topology
The canonical diagram below outlines how the different open-source and proprietary components of the Privane developer operating system interact to deliver safe, low-latency, and zero-leakage agent orchestration:

Core System Layers
1. Privane OSS Core (GitHub & npm) OPEN CORE
The foundational client execution stack is open-source and runs entirely offline:
- @privane/engine: Accelerates GGUF model weight loaders using standard hardware APIs: WebGPU in the browser, Metal (MPS) on macOS/iOS, and Vulkan on Linux/Android.
- privane-cli: Exposes local OpenAI-compatible endpoints (/v1/chat/completions) that match the OpenAI API specification.
- Local Tools: Runs security-restricted, sandboxed file operations (LocalFileSystemTool) and keyword-filtered SQLite query commands (LocalSqliteTool) fully on-device.
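As a rough illustration of what talking to the local endpoint might look like, the sketch below builds a request for /v1/chat/completions and reads the reply out of an OpenAI-style response. The base URL http://localhost:8080 and the model name local-gguf-model are placeholders chosen for this example, not documented Privane defaults, and buildChatRequest and extractReply are hypothetical helper names.

```typescript
// Minimal sketch of an OpenAI-style chat request against a local endpoint.
// ASSUMPTIONS: base URL, model name, and helper names are illustrative only.

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Build the URL and fetch-style init object for a chat completion call.
function buildChatRequest(
  baseUrl: string,
  model: string,
  messages: ChatMessage[],
): ChatRequest {
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: `${baseUrl.replace(/\/+$/, "")}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Pull the assistant text out of an OpenAI-style response body.
function extractReply(response: { choices: { message: ChatMessage }[] }): string {
  const first = response.choices[0];
  if (first === undefined) {
    throw new Error("response contained no choices");
  }
  return first.message.content;
}

const request = buildChatRequest("http://localhost:8080", "local-gguf-model", [
  { role: "user", content: "Summarize the open pull requests." },
]);
// With privane-cli running, the request could be sent with:
//   const res = await fetch(request.url, request.init);
//   console.log(extractReply(await res.json()));
```

Because the request shape follows the OpenAI specification, an existing OpenAI client pointed at the local base URL should work in the same way.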
2. Privane Cloud Gateway (api.privane.dev) PROPRIETARY
The cloud gateway handles high-availability operational infrastructure:
- OAuth token storage & refresh infrastructure: Eliminates credential management pain. No keys are logged; the session layer is stateless and transient.
- Hosted Headless Chromium Clusters: Bypasses local installation bottlenecks (integrated with Browserbase and PinchTab).
- High-Signal Accessibility Tree DOM Pruner: Cloud browser nodes prune the raw page DOM into compressed accessibility tree schemas in real time, reducing context token usage by over 95%.
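The pruning idea can be sketched on a simplified DOM model: drop non-content nodes, flatten wrapper elements, and keep only nodes that carry a role and an accessible name. The DomNode and A11yNode shapes, the tag-to-role table, and the prune function below are assumptions made for illustration; they do not describe Privane's actual pruner or schema.

```typescript
// Minimal sketch of DOM-to-accessibility-tree pruning.
// ASSUMPTIONS: node shapes, role table, and function names are hypothetical.

interface DomNode {
  tag: string;
  attrs?: Record<string, string>;
  text?: string;
  children?: DomNode[];
}

interface A11yNode {
  role: string;
  name: string;
  children?: A11yNode[];
}

// Tags that never contribute to what an agent needs to see.
const SKIPPED_TAGS = new Set(["script", "style", "noscript", "template"]);

// A deliberately small tag-to-role table for the example.
const ROLE_BY_TAG: Record<string, string> = {
  a: "link",
  button: "button",
  input: "textbox",
  h1: "heading",
  h2: "heading",
  nav: "navigation",
  form: "form",
};

// Collect visible text from a subtree, collapsing whitespace.
function textOf(node: DomNode): string {
  if (SKIPPED_TAGS.has(node.tag)) return "";
  const own = node.text ?? "";
  const nested = (node.children ?? []).map(textOf).join(" ");
  return `${own} ${nested}`.replace(/\s+/g, " ").trim();
}

// Convert a DOM subtree into zero or more accessibility nodes.
function prune(node: DomNode): A11yNode[] {
  if (SKIPPED_TAGS.has(node.tag)) return [];
  const children = (node.children ?? []).flatMap(prune);
  const role = node.attrs?.role ?? ROLE_BY_TAG[node.tag];
  // Wrapper elements (div, span, ...) disappear; their children are hoisted.
  if (role === undefined) return children;
  const name = node.attrs?.["aria-label"] ?? textOf(node);
  return [children.length > 0 ? { role, name, children } : { role, name }];
}

const page: DomNode = {
  tag: "div",
  children: [
    { tag: "script", text: "var tracking = true;" },
    {
      tag: "div",
      attrs: { class: "wrapper" },
      children: [{ tag: "button", children: [{ tag: "span", text: "Sign in" }] }],
    },
    { tag: "a", attrs: { href: "/docs", "aria-label": "Docs" }, text: "Read the documentation" },
  ],
};

console.log(JSON.stringify(prune(page)));
// prints [{"role":"button","name":"Sign in"},{"role":"link","name":"Docs"}]
```

The token savings come from discarding scripts, styling, and wrapper markup; how much is saved in practice depends on the page.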
3. Privane Hub Registry HYBRID
Serves as the ecosystem directory connecting developers and agent executors:
- Connectors: Catalogs standardized schemas conforming to the ToolDefinition payload specification.
- Templates & workflows: Shared agent configurations (such as standard code reviewers and standup builders) usable with a single line of SDK code.
Sovereign Cloud Gateway Infrastructure (api.privane.dev)
To guarantee high availability, high security, and clear task separation, Privane Cloud is built as an independent, enterprise-grade execution infrastructure. It enforces a strict dual-instance EC2 setup, separating real-time stateless routing from heavy browser processes.
1. Dual-Instance EC2 Architecture
Instance 1: Gateway API (api.privane.dev)
- Responsibilities: Handles user authentication, API key validation, transient OAuth session mapping, connector schema routing, and execution telemetry.
- Stack: Node.js (Fastify/NestJS) for low-latency request mapping, coupled with PostgreSQL and Redis.
- Stateless Routing: No raw reasoning prompts or local filesystem parameters are processed or logged. The gateway acts strictly as a transient dispatcher.
Instance 2: Browser Workers (browser.privane.dev)
- Responsibilities: Isolates CPU-heavy browser orchestration scripts, Playwright browser controllers, and anti-bot stealth parameters.
- Isolation Moat: Keeps browser failure modes (RAM spikes, page crashes, zombie Chromium processes, memory leaks) on a separate instance from Instance 1, so they cannot degrade core API routes.
- Cluster Integrations: Manages outbound virtual sessions hosted by compute partners Browserbase and PinchTab.
2. High-Availability Database Layer (PostgreSQL)
A relational schema structures the metadata layer, storing only system-level metrics and encrypted connector access tokens:
- Users & API Keys: Stores hashed API credentials and permission scopes.
- OAuth Credentials: Encrypted OAuth tokens enabling stateless, automated refreshes for SaaS integrations (Slack, GitHub, Gmail, Jira).
- Execution Telemetry: Logs execution IDs, run times, and connector metrics for transparent developer invoicing.
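To make "hashed API credentials" concrete, here is one way a gateway could store and check API keys without keeping the plaintext. The choice of SHA-256, the hex encoding, and the helper names hashApiKey and verifyApiKey are assumptions for this sketch; the source does not say which hashing scheme Privane uses.

```typescript
// Minimal sketch of hashed API key storage and verification.
// ASSUMPTIONS: SHA-256, hex encoding, and helper names are illustrative only.
import { createHash, timingSafeEqual } from "node:crypto";

// Hash a plaintext API key; only this digest would be written to the database.
function hashApiKey(apiKey: string): string {
  return createHash("sha256").update(apiKey, "utf8").digest("hex");
}

// Compare a presented key against a stored digest in constant time.
function verifyApiKey(presentedKey: string, storedHashHex: string): boolean {
  const presentedHash = Buffer.from(hashApiKey(presentedKey), "hex");
  const storedHash = Buffer.from(storedHashHex, "hex");
  // timingSafeEqual throws on length mismatch, so reject malformed rows first.
  if (presentedHash.length !== storedHash.length) return false;
  return timingSafeEqual(presentedHash, storedHash);
}

// Example: the stored row holds the digest, never the key itself.
const storedRow = { keyHash: hashApiKey("example-api-key"), scopes: ["connectors:read"] };
console.log(verifyApiKey("example-api-key", storedRow.keyHash)); // true
console.log(verifyApiKey("wrong-key", storedRow.keyHash)); // false
```

A plain fast hash is only reasonable here on the assumption that API keys are long random strings; low-entropy secrets such as passwords would call for a dedicated password-hashing function instead.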
3. Distributed Queueing Layer (Redis + BullMQ)
Web automation actions and browser navigations are never executed inline within real-time HTTP threads.
- BullMQ Queue Pipeline: Real-time browser commands (browser.goto, browser.click) are instantly serialized as structured JSON tasks and pushed into a BullMQ Redis queue on Instance 1.
- Asynchronous Processing: Worker nodes on Instance 2 pull tasks asynchronously from the queue, allowing Privane to absorb massive execution traffic bursts smoothly while providing developers with real-time job status tracking.
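The enqueue-then-process flow above can be sketched with a small in-memory stand-in. This is not BullMQ and does not use Redis: InMemoryQueue, the job status names, and the BrowserCommand shape are hypothetical, chosen only to show commands being serialized to JSON, queued, and picked up later by a worker while their status stays queryable.

```typescript
// In-memory stand-in for the queue pattern (NOT BullMQ, no Redis involved).
// ASSUMPTIONS: command shape, status names, and class names are hypothetical.

type BrowserCommand =
  | { action: "browser.goto"; url: string }
  | { action: "browser.click"; selector: string };

type JobStatus = "waiting" | "active" | "completed" | "failed";

interface Job {
  id: string;
  payload: string; // the command, serialized as JSON
  status: JobStatus;
  result?: string;
  error?: string;
}

class InMemoryQueue {
  private jobs = new Map<string, Job>();
  private pending: string[] = [];
  private nextId = 1;

  // Gateway side: serialize the command and return immediately with a job id.
  enqueue(command: BrowserCommand): string {
    const id = String(this.nextId++);
    this.jobs.set(id, { id, payload: JSON.stringify(command), status: "waiting" });
    this.pending.push(id);
    return id;
  }

  // Status lookup that a client could poll for job tracking.
  status(id: string): JobStatus | undefined {
    return this.jobs.get(id)?.status;
  }

  // Worker side: take the oldest job, run it, and record the outcome.
  async processNext(handler: (cmd: BrowserCommand) => Promise<string>): Promise<boolean> {
    const id = this.pending.shift();
    if (id === undefined) return false;
    const job = this.jobs.get(id)!;
    job.status = "active";
    try {
      job.result = await handler(JSON.parse(job.payload) as BrowserCommand);
      job.status = "completed";
    } catch (err) {
      job.error = err instanceof Error ? err.message : String(err);
      job.status = "failed";
    }
    return true;
  }
}

// Worker loop: keep pulling until the queue is empty.
async function drain(
  queue: InMemoryQueue,
  handler: (cmd: BrowserCommand) => Promise<string>,
): Promise<void> {
  while (await queue.processNext(handler)) {
    // each iteration handles one job
  }
}

const queue = new InMemoryQueue();
const jobId = queue.enqueue({ action: "browser.goto", url: "https://example.com" });
console.log(queue.status(jobId)); // "waiting"

// A fake handler standing in for real browser automation.
drain(queue, async (cmd) => `done: ${cmd.action}`).then(() => {
  console.log(queue.status(jobId)); // "completed"
});
```

In a BullMQ deployment the same roles would be played by Queue.add on the gateway and a Worker processor on the browser instance, with Redis holding the job state between them.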