System Architecture

Privane is the execution infrastructure for sovereign AI. Rather than relying on cloud-centric models that ingest raw context and corporate credentials, Privane introduces a decoupled, hybrid architecture.

It splits AI agent execution into three discrete components:

  • Privane OSS (Open Source Client): Executes cognitive reasoning loops, compiles prompts, and runs local, secure sandboxed tools directly on native hardware.
  • Privane Cloud (Managed Gateway): Operates headless Chromium clusters, manages SaaS OAuth token refresh and anti-bot mechanisms, and routes API requests.
  • Privane Hub (Hybrid Registry): Catalogs and hosts templates, custom MCP configurations, workflows, and tools.

Unified System Topology

The canonical diagram below outlines how the different open-source and proprietary components of the Privane developer operating system interact to deliver safe, low-latency, and zero-leakage agent orchestration:

[Figure: Privane System Architecture]


Core System Layers

1. Privane OSS Core (GitHub & npm) [OPEN CORE]

The foundational client execution stack is open-source and runs entirely offline:

  • @privane/engine: Loads GGUF model weights and accelerates inference through standard hardware APIs: WebGPU in the browser, Metal (MPS) on macOS/iOS, and Vulkan on Linux/Android.
  • privane-cli: Exposes local OpenAI-compatible endpoints (/v1/chat/completions).
  • Local Tools: Runs security-restricted, sandboxed file operations (LocalFileSystemTool) and keyword-filtered SQLite query commands (LocalSqliteTool) fully on-device.
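
Because privane-cli exposes an OpenAI-compatible endpoint, a client request can be sketched with the standard chat completions payload. The base URL `http://localhost:8080` and the model name `local-model` below are placeholders assumed for illustration, not documented defaults.

```typescript
// Minimal sketch of an OpenAI-style chat completion request to a local endpoint.
// The base URL and model name are assumptions; use the values your local server reports.

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Builds the request without sending it, so the payload can be inspected first.
function buildChatRequest(baseUrl: string, model: string, messages: ChatMessage[]): ChatRequest {
  return {
    // Strip trailing slashes so the path is not doubled.
    url: `${baseUrl.replace(/\/+$/, "")}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, messages }),
    },
  };
}

const chatRequest = buildChatRequest("http://localhost:8080/", "local-model", [
  { role: "user", content: "Summarize the open files in this workspace." },
]);
```

Passing `chatRequest.url` and `chatRequest.init` to `fetch` sends the call. An OpenAI-compatible server returns the reply text under `choices[0].message.content`.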

2. Privane Cloud Gateway (api.privane.dev) [PROPRIETARY]

Operating at api.privane.dev, the cloud handles high-availability operational infrastructure:

  • OAuth token storage & refresh infrastructure: Takes credential management off the developer's hands. No keys are logged; the session layer is stateless and transient.
  • Hosted Headless Chromium Clusters: Removes the need to install and run browsers locally (integrated with Browserbase and PinchTab).
  • High-Signal Accessibility Tree DOM Pruner: Cloud browser nodes prune the raw page DOM into compact accessibility-tree schemas in real time, reducing context token usage by over 95%.
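
The idea behind DOM pruning can be illustrated with a toy example. The node shapes and the rule used here (keep only nodes that carry an accessibility role, hoist their descendants past structural wrappers) are assumptions for illustration, not Privane's actual pruning algorithm.

```typescript
// Illustrative only: a toy pruner that keeps nodes with an accessibility role
// and drops purely structural wrappers such as layout <div> and <span> elements.

interface DomNode {
  tag: string;
  role?: string; // e.g. "button", "link", "textbox"
  name?: string; // accessible name (label, alt text, visible text)
  children: DomNode[];
}

interface AxNode {
  role: string;
  name: string;
  children: AxNode[];
}

// Returns the pruned subtrees for a node. A node without a role is dropped,
// and its surviving descendants are hoisted into the nearest kept ancestor.
function prune(node: DomNode): AxNode[] {
  const kept: AxNode[] = [];
  for (const child of node.children) {
    kept.push(...prune(child));
  }
  if (!node.role) return kept;
  return [{ role: node.role, name: node.name ?? "", children: kept }];
}

const samplePage: DomNode = {
  tag: "div",
  children: [
    { tag: "div", children: [{ tag: "a", role: "link", name: "Pricing", children: [] }] },
    { tag: "span", children: [] },
    { tag: "button", role: "button", name: "Sign in", children: [] },
  ],
};

// Two nodes survive: the "Pricing" link and the "Sign in" button.
const prunedTree = prune(samplePage);
```

Serializing `prunedTree` instead of `samplePage` yields a shorter payload, which is the mechanism by which pruning lowers the token count sent to the model.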

3. Privane Hub Registry [HYBRID]

Serves as the ecosystem directory connecting developers and agent executors:

  • Connectors: Catalogs standardized schemas conforming to the ToolDefinition payload specification.
  • Templates & workflows: Shared agent configurations (such as standard code reviewers and standup builders) usable with a single line of SDK code.
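
To make the connector schema idea concrete, here is a sketch of what a ToolDefinition payload and a basic consistency check might look like. The field names (`name`, `description`, `parameters`) are hypothetical, modeled on common function-calling tool schemas; the actual ToolDefinition specification is not reproduced in this document.

```typescript
// Hypothetical shape: the real ToolDefinition fields may differ.
interface ToolDefinition {
  name: string;
  description: string;
  parameters: {
    type: "object";
    properties: Record<string, { type: string; description?: string }>;
    required: string[];
  };
}

// Returns a list of problems; an empty list means the definition is self-consistent.
function validateToolDefinition(def: ToolDefinition): string[] {
  const problems: string[] = [];
  // Assumed naming rule: lowercase letters, digits, underscores, and dots.
  if (!/^[a-z][a-z0-9_.]*$/.test(def.name)) {
    problems.push(`invalid name: ${def.name}`);
  }
  // Every required parameter must be declared under properties.
  for (const key of def.parameters.required) {
    if (!(key in def.parameters.properties)) {
      problems.push(`required parameter not declared: ${key}`);
    }
  }
  return problems;
}

const slackPost: ToolDefinition = {
  name: "slack.post_message",
  description: "Post a message to a Slack channel.",
  parameters: {
    type: "object",
    properties: { channel: { type: "string" }, text: { type: "string" } },
    required: ["channel", "text"],
  },
};

const definitionProblems = validateToolDefinition(slackPost);
```

A registry could run a check of this kind before a connector is listed, so that agents only ever receive schemas whose required parameters are actually declared.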

Sovereign Cloud Gateway Infrastructure (api.privane.dev)

To guarantee high availability, high security, and clear task separation, Privane Cloud is built as an independent, enterprise-grade execution infrastructure. It enforces a strict dual-instance EC2 setup, separating real-time stateless routing from heavy browser processes.

1. Dual-Instance EC2 Architecture

Instance 1: Gateway API (api.privane.dev)

  • Responsibilities: Handles user authentication, API key validation, transient OAuth session mapping, connector schema routing, and execution telemetry.
  • Stack: Node.js (Fastify/NestJS) for low-latency request mapping, coupled with PostgreSQL and Redis.
  • Stateless Routing: No raw reasoning prompts or local filesystem parameters are processed or logged. The gateway acts strictly as a transient dispatcher.

Instance 2: Browser Workers (browser.privane.dev)

  • Responsibilities: Isolates CPU-heavy browser orchestration scripts, Playwright browser controllers, and anti-bot stealth parameters.
  • Isolation Moat: Keeps unstable browser processes (RAM spikes, page crashes, zombie Chromium processes, memory leaks) on a separate instance from Instance 1, so they cannot degrade core API routes.
  • Cluster Integrations: Manages outbound virtual sessions hosted by compute partners Browserbase and PinchTab.

2. High-Availability Database Layer (PostgreSQL)

A relational schema backs the metadata layer. It stores only account credentials in hashed form, system-level metrics, and encrypted connector access tokens:

  • Users & API Keys: Stores hashed API credentials and permission scopes.
  • OAuth Credentials: Stores encrypted OAuth tokens that enable stateless, automated refreshes for SaaS integrations (Slack, GitHub, Gmail, Jira).
  • Execution Telemetry: Logs execution IDs, run times, and connector metrics for transparent developer invoicing.
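
As an illustration of how telemetry limited to identifiers and timings can still support invoicing, the sketch below aggregates run time per connector. The record fields (`executionId`, `connector`, `durationMs`) are assumed names, not the actual table columns.

```typescript
// Hypothetical telemetry row: identifiers and timings only, no prompt or file content.
interface ExecutionRecord {
  executionId: string;
  connector: string;
  durationMs: number;
}

// Sums run time per connector, the kind of aggregate an invoice line could be built from.
function totalDurationByConnector(records: ExecutionRecord[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const record of records) {
    totals[record.connector] = (totals[record.connector] ?? 0) + record.durationMs;
  }
  return totals;
}

const connectorUsage = totalDurationByConnector([
  { executionId: "exec-1", connector: "github", durationMs: 420 },
  { executionId: "exec-2", connector: "slack", durationMs: 130 },
  { executionId: "exec-3", connector: "github", durationMs: 80 },
]);
// connectorUsage is { github: 500, slack: 130 }
```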

3. Distributed Queueing Layer (Redis + BullMQ)

Web automation actions and browser navigations are never executed inline in the HTTP request path.

  • BullMQ Queue Pipeline: Browser commands (browser.goto, browser.click) are serialized immediately as structured JSON tasks and pushed into a Redis-backed BullMQ queue on Instance 1.
  • Asynchronous Processing: Worker nodes on Instance 2 pull tasks asynchronously from the queue, allowing Privane to absorb massive execution traffic bursts smoothly while providing developers with real-time job status tracking.
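
The producer/consumer split described above can be sketched with an in-memory array standing in for the Redis-backed queue. In a BullMQ deployment the enqueue step would correspond to a `Queue.add` call on the gateway and the consumer to a `Worker` on the browser instance. The task fields and the `TaskQueue` class are assumptions for illustration, and the handler is synchronous here only to keep the sketch short.

```typescript
// In-memory stand-in for the Redis-backed queue; task fields are illustrative assumptions.
type BrowserCommand =
  | { action: "browser.goto"; url: string }
  | { action: "browser.click"; selector: string };

interface QueuedTask {
  id: number;
  payload: string; // JSON-serialized BrowserCommand
  status: "waiting" | "completed" | "failed";
}

class TaskQueue {
  private tasks: QueuedTask[] = [];
  private nextId = 1;

  // Producer side (gateway): serialize the command and return a job id right away.
  enqueue(command: BrowserCommand): number {
    const id = this.nextId++;
    this.tasks.push({ id, payload: JSON.stringify(command), status: "waiting" });
    return id;
  }

  // Consumer side (browser worker): run the oldest waiting task through the handler.
  processNext(handler: (command: BrowserCommand) => void): boolean {
    const task = this.tasks.find((t) => t.status === "waiting");
    if (!task) return false;
    try {
      handler(JSON.parse(task.payload) as BrowserCommand);
      task.status = "completed";
    } catch {
      task.status = "failed";
    }
    return true;
  }

  // Job status lookup, as a client polling for progress would use.
  statusOf(id: number): QueuedTask["status"] | undefined {
    return this.tasks.find((t) => t.id === id)?.status;
  }
}

const browserQueue = new TaskQueue();
const gotoJobId = browserQueue.enqueue({ action: "browser.goto", url: "https://example.com" });
const clickJobId = browserQueue.enqueue({ action: "browser.click", selector: "#login" });

const handledActions: string[] = [];
browserQueue.processNext((command) => {
  handledActions.push(command.action);
});
```

After one `processNext` call, the goto job reports `completed` while the click job is still `waiting`, which mirrors how the HTTP layer can return a job id immediately and let clients track status while workers drain the queue.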