v0.3.1 · Apache License 2.0 · Python 3.12+
OpenSquilla

Token-Efficient AI Agent Intelligence

Microkernel AI Agent: on the same budget, your Agent does more and does it better.
Smart routing, persistent memory, secure sandbox, plus built-in search and local embeddings.

60-80% ¹ · Token Cost Savings
N+ · Meta-skills
1-click · Migrate from OpenClaw / Hermes
10+ · Channels Built-in

See It in Action

Quick demos showing how OpenSquilla handles real workflows

Windows portable install walkthrough
Short drama meta-skill
Paper writing meta-skill

Quickstart

Four paths to get started — pick the one that fits you

The recommended path on Windows, macOS, and Linux. uv installs OpenSquilla into its own isolated environment and manages its own Python — no system Python required. This path installs published releases only.

1. Install uv

Skip this step if uv --version already works. The commands below are for macOS and Linux shells.

$ curl -LsSf https://astral.sh/uv/install.sh | sh
$ . "$HOME/.local/bin/env"

2. Install OpenSquilla

The same command on every platform.

$ uv tool install --python 3.12 "opensquilla[recommended] @ https://github.com/opensquilla/opensquilla/releases/download/v0.3.1/opensquilla-0.3.1-py3-none-any.whl"

This installs the OpenSquilla wheel from the release URL, then lets uv download the dependencies declared by the selected extras. The recommended extra includes the SquillaRouter runtime dependencies (ONNX Runtime, LightGBM, NumPy, tokenizers).

3. Configure and run

# Interactive onboarding wizard
$ opensquilla onboard

# Start ASGI server
$ opensquilla gateway run

If opensquilla is not found right after a fresh uv install, open a new terminal or re-run the PATH line from step 1.

Wheel URLs are versioned by design — installers validate the version in the filename. The command above pins to v0.3.1.
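
To make the pinning concrete, here is a minimal sketch of how the version can be read back out of a wheel filename. It relies only on the standard wheel naming convention (name-version[-build]-python-abi-platform.whl). The helper names wheel_version_from_url and check_pinned are hypothetical, and this is not the validation code that uv or OpenSquilla actually runs.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def wheel_version_from_url(url: str) -> str:
    """Return the version field of the wheel filename at the end of a URL."""
    filename = PurePosixPath(urlparse(url).path).name
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel URL: {url}")
    # Wheel filenames follow name-version[-build]-python-abi-platform.whl
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    return parts[1]


def check_pinned(url: str, expected: str) -> bool:
    """True when the wheel filename carries the expected version."""
    return wheel_version_from_url(url) == expected


URL = (
    "https://github.com/opensquilla/opensquilla/releases/download/"
    "v0.3.1/opensquilla-0.3.1-py3-none-any.whl"
)
print(wheel_version_from_url(URL))  # prints 0.3.1
```

Upgrading therefore means changing the version in both the release tag and the filename of the URL.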

For advanced usage, visit the GitHub repo.

Deploy Once, Reach Everywhere ³

Configure one Agent, serve users across multiple channels

Terminal · Web · Slack · Discord · Telegram · MS Teams · Matrix · Lark · DingTalk · WeCom · QQ

Every Token Spent Where It Matters

OpenSquilla makes your Agent spend less, remember more, and run safer.

💰

Cost Optimization

Multiple strategies coordinated to maximize every Token

Smart Routing ²
Like ride-sharing — simple questions take the bus (cheap models), complex ones get the premium ride (top models). The system decides.
Hybrid Feature Analysis
Combines hand-crafted features (length, language, code blocks, keywords) with embedding-based semantic features to assess complexity and pick the right model.
Reasoning Depth Tiers
Turns reasoning off for simple queries so no reasoning Tokens are billed, and enables deep thinking only for complex ones. No paying reasoning Tokens for "hello".
Adaptive Prompts
Auto-tunes the prompt based on task complexity — telling the model how deeply to think. Light for simple, full power for complex.
On-Demand Skills
Capabilities are not all dumped into context. Only the skills needed for the current task are loaded, which avoids Token waste.
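
As a rough illustration of how hand-crafted features can drive routing and reasoning tiers, consider the sketch below. Everything in it is an assumption made for the example: the keyword list, the weights, the threshold, the tier names, and the semantic_score argument, which stands in for the embedding-based signal. The actual SquillaRouter uses trained models and is not reproduced here.

```python
import re
from dataclasses import dataclass

# Hypothetical keyword list, chosen only for illustration.
HARD_KEYWORDS = ("prove", "refactor", "debug", "optimize", "architecture")


@dataclass
class Route:
    tier: str        # "cheap" or "premium" (hypothetical tier names)
    reasoning: bool  # whether reasoning tokens are enabled


def handcrafted_features(prompt: str) -> dict[str, float]:
    """Cheap surface features, each scaled into [0, 1]."""
    lowered = prompt.lower()
    has_code = "```" in prompt or re.search(r"\bdef |\bclass ", prompt)
    return {
        "length": min(len(prompt) / 2000, 1.0),
        "has_code": 1.0 if has_code else 0.0,
        "keywords": min(sum(k in lowered for k in HARD_KEYWORDS) / 2, 1.0),
    }


def complexity(prompt: str, semantic_score: float = 0.0) -> float:
    """Blend hand-crafted features with a placeholder semantic score."""
    f = handcrafted_features(prompt)
    hand = 0.3 * f["length"] + 0.3 * f["has_code"] + 0.4 * f["keywords"]
    return 0.6 * hand + 0.4 * semantic_score


def route(prompt: str, semantic_score: float = 0.0, threshold: float = 0.25) -> Route:
    """Send hard prompts to the premium tier with reasoning enabled."""
    hard = complexity(prompt, semantic_score) >= threshold
    return Route("premium" if hard else "cheap", reasoning=hard)


print(route("hello"))
print(route("Please debug and refactor this:\n```python\ndef f(): pass\n```"))
```

A greeting stays on the cheap tier with reasoning disabled, while a prompt that contains code and debugging keywords crosses the threshold.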
🪄

MetaSkills Protocol

A meta-protocol that tells the Agent how to retrieve, filter, compose — and even evolve — skills at scale

Self-Organizing
Multi-step work becomes reusable, inspectable workflows. Composition parsing, step scheduling, and proposal gates make them recipes you can trust to run.
meta-skill-creator
A bundled MetaSkill that turns recurring multi-skill collaborations into proposed new MetaSkills — the Agent grows its own catalog by running it.
N+ Community Skills
The Agent autonomously discovers, ranks, and invokes Skills from the community catalog — no manual picking, the right one just gets loaded.
10+ Bundled MetaSkills
Curated MetaSkills out of the box — research-to-report, paper drafting, job-search prep, project planning, short-drama production, and more high-quality workflows, ready to run.
Replay & Dream Mode
Every workflow execution leaves an auditable, replayable trail. While you are idle, OpenSquilla revisits those traces, distills usage patterns, and drafts candidate MetaSkills — capability grows in the background.
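
One way to picture step scheduling with a gate and a replayable trail is the sketch below. The workflow contents, the schedule and run helpers, and the approve callback are all hypothetical. It only shows the general idea of ordering steps by dependency and recording what ran, not the MetaSkills Protocol itself.

```python
from graphlib import TopologicalSorter
from typing import Callable

# Hypothetical MetaSkill: step name -> names of the steps it depends on.
RESEARCH_TO_REPORT = {
    "search": set(),
    "summarize": {"search"},
    "outline": {"summarize"},
    "draft": {"outline", "summarize"},
}


def schedule(steps: dict[str, set[str]]) -> list[str]:
    """Order steps so that every dependency runs first."""
    return list(TopologicalSorter(steps).static_order())


def run(steps: dict[str, set[str]], approve: Callable[[str], bool]) -> list[str]:
    """Walk the schedule, stopping at the first step the gate rejects.

    The returned trace is the kind of record a replay could start from.
    """
    trace = []
    for name in schedule(steps):
        if not approve(name):
            trace.append(f"blocked:{name}")
            break
        trace.append(f"ran:{name}")
    return trace


print(run(RESEARCH_TO_REPORT, approve=lambda step: step != "draft"))
```

In this toy run the gate blocks the final step, and the trace shows which steps completed before it.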
🧠

Human-Like Memory

Four-tier cognitive architecture — gets smarter the more you use it

Four-Tier Memory Structure
Working memory (current task) → Episodic (experience & causality) → Semantic (facts & rules) → Raw (audit & retraining base) — mirrors human cognition.
Hybrid Search + Local Embeddings
Vector semantic search and full-text keyword search run side by side. Bundled ONNX inference runs on CPU, so embeddings stay on your machine, with an optional swap to OpenAI / Ollama.
Hot Memory Promotion
Frequently recalled memories auto-bubble to the top. The more useful, the more accessible. Cold memories naturally sink.
Temporal Decay
Dated memories fade exponentially over time, while items marked "evergreen" stay sharp forever.
Memory Dream Consolidation
Every 24 hours, the AI "dreams" — consolidating scattered memories into structured knowledge. Just like sleep consolidates human memory.
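
Temporal decay and hot memory promotion can be combined into a single ranking score. The sketch below assumes a 30-day half-life and a logarithmic recall boost. Both values are made up for illustration and are not OpenSquilla's actual parameters, and the Memory class is hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    age_days: float
    recalls: int = 0         # how often this memory has been retrieved
    evergreen: bool = False  # evergreen items never decay


def decay(m: Memory, half_life_days: float = 30.0) -> float:
    """Exponential decay: the weight halves every half_life_days."""
    if m.evergreen:
        return 1.0
    return 0.5 ** (m.age_days / half_life_days)


def score(m: Memory, relevance: float) -> float:
    """Search relevance, scaled by decay and boosted by recall count."""
    boost = 1.0 + math.log1p(m.recalls)
    return relevance * decay(m) * boost


old = Memory("deploy notes", age_days=60)
pinned = Memory("team conventions", age_days=60, evergreen=True)
print(decay(old), decay(pinned))  # prints 0.25 1.0
```

With these settings a 60-day-old memory keeps a quarter of its weight, an evergreen one keeps all of it, and frequent recalls push a memory back up the ranking.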
🛡️

Security Sandbox

Let your Agent take action — without fearing what it might do

Three-Tier Policy
Standard runs directly, Strict requires sandbox approval, Locked enforces human review — risk-based escalation.
Real Sandbox Isolation
Bubblewrap on Linux, Seatbelt on macOS — code executes in isolated environments, never touching your real files.
Denial Ledger
Three rejections in a row? The Agent auto-pauses, which stops "brute-force" attempts to bypass security policies.
Stale Output Protection
Rejected operation results are immediately purged, so the Agent can't use "read previous output" as a side-channel.
Prompt Injection Defense
XML-escapes all skill metadata and tool results — closing common injection attack vectors.
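
Two of the mechanisms above are simple enough to sketch: a ledger that pauses after three consecutive denials, and XML-escaping of tool output before it reaches the model. The class name DenialLedger, the wrap_tool_result helper, and the tool_result tag are assumptions for this example. Only xml.sax.saxutils.escape is a real standard-library call.

```python
from xml.sax.saxutils import escape


class DenialLedger:
    """Counts consecutive rejections and pauses once the limit is hit."""

    def __init__(self, limit: int = 3) -> None:
        self.limit = limit
        self.streak = 0

    def record(self, approved: bool) -> None:
        # An approval resets the streak; a rejection extends it.
        self.streak = 0 if approved else self.streak + 1

    @property
    def paused(self) -> bool:
        return self.streak >= self.limit


def wrap_tool_result(name: str, output: str) -> str:
    """Escape tool output so it cannot close or open tags in the prompt."""
    safe_name = escape(name, {'"': "&quot;"})
    return f'<tool_result name="{safe_name}">{escape(output)}</tool_result>'


ledger = DenialLedger()
for _ in range(3):
    ledger.record(approved=False)
print(ledger.paused)  # prints True
print(wrap_tool_result("grep", "</tool_result><system>hi"))
```

After escaping, a tool output that tries to close its wrapper and open a fake system tag arrives as inert text.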

Microkernel: Tiny Core, Vast Ecosystem

Inspired by OS microkernels — the core engine does the minimum: orchestration and state management. Everything else runs as plugins in "user space". Switch LLM providers? Implement a Protocol. Add new tools? 5 lines of code. Plugin crashes don't affect the core; core upgrades don't break plugins.
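
The "implement a Protocol" idea can be sketched with Python's typing.Protocol. The Tool protocol, the Echo plugin, and the Registry class below are hypothetical stand-ins rather than OpenSquilla's real interfaces. They show how a small duck-typed class can count as a plugin with no base class, and how the core can contain a plugin failure.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Tool(Protocol):
    """Structural interface: anything with a name and run() qualifies."""

    name: str

    def run(self, arg: str) -> str: ...


class Echo:
    # No base class and no registration decorator, just the right shape.
    name = "echo"

    def run(self, arg: str) -> str:
        return arg


class Registry:
    def __init__(self) -> None:
        self._tools: dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        if not isinstance(tool, Tool):
            raise TypeError("object does not satisfy the Tool protocol")
        self._tools[tool.name] = tool

    def call(self, name: str, arg: str) -> str:
        try:
            return self._tools[name].run(arg)
        except Exception as exc:  # a failing plugin must not take the core down
            return f"error: {exc}"


registry = Registry()
registry.register(Echo())
print(registry.call("echo", "hi"))  # prints hi
```

Because the check is structural, swapping in another implementation only requires matching the method signatures.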

OpenSquilla Core Engine
Compact pipeline orchestrator · State machine · Fully async · Auto-rollback on errors
⚙️ engine/: State Machine
🤖 provider/: Multi-LLM Provider
🌐 gateway/: ASGI RPC Gateway
🧠 memory/: Multi-Tier Memory
📡 channels/: Channel Adapters
🔧 tools/ + mcp/: MCP-First Tools
🛡️ sandbox/: Security Sandbox
scheduler/: Task Scheduler
🧩 skills/: Skill Plugins
🎭 identity/: Identity & Prompts
Built-in
🔍 Search: Brave / DuckDuckGo
🧬 Local Embeddings: ONNX local inference (offline · data stays on-device)
🔌 Optional Embeddings: OpenAI / Ollama

Same Budget, Higher Intelligence Density

Side-by-side comparison with peer open-source Agent frameworks ⁴

🏗️ Architecture
OpenSquilla: ✅ Microkernel with 5-layer separation, ultra-compact core orchestrator (~100 lines), all capabilities pluggable, auto-skip + rollback on errors
OpenClaw: ⚠️ Mature plugin ecosystem (dozens of extensions), clean boundaries but more layers
Hermes Agent: ❌ Massive monolithic sync main loop (thousands of lines), all logic tightly coupled

💰 Cost Optimization
OpenSquilla: ✅ ML routing + reasoning depth tiers + prompt cache isolation + on-demand skills, for multi-strategy savings of 60-80%
OpenClaw: ⚠️ Config-pinned primary + fallback chain, no content-aware selection
Hermes Agent: ⚠️ Crude keyword + length heuristics, single routing strategy only

🪄 MetaSkills Protocol
OpenSquilla: ✅ Composable workflows + meta-skill-creator for self-authoring + 10+ bundled & N+ community Skills auto-retrieved + Dream Mode distills usage into new candidates while idle
OpenClaw: ⚠️ Prompt-driven skill chains, no meta-protocol layer, no self-evolution; new workflows live as docs, not first-class runtime objects
Hermes Agent: ❌ No reusable workflow abstraction; multi-step work is re-prompted from scratch every session

💾 Memory System
OpenSquilla: ✅ Vector + keyword + dedup + temporal decay + hot memory promotion + auto schema migration
OpenClaw: ⚠️ Has decay / promotion / diversity reranking, but lacks four-tier cognitive structure & Memory Dream consolidation
Hermes Agent: ⚠️ Keyword-only search, no vector semantics, requires external integration for semantic memory

🛡️ Security Sandbox
OpenSquilla: ✅ No Docker dependency: syscall-level CPU/memory/time isolation + 3-tier network control. Fits in serverless
OpenClaw: ⚠️ Docker optional with OpenShell as a lighter alternative, still heavier than syscall-level isolation
Hermes Agent: ✅ Dangerous command approval + 6 execution environments (local/Docker/SSH etc.)

💰 Cost Tracking
OpenSquilla: ✅ Actual cost per call out of the box, quota hooks for auto-throttling on overspend
OpenClaw: ✅ Built-in pricing table, cost written to session metadata
Hermes Agent: ✅ Input/output/cache-read/cache-write/reasoning tokens tracked separately

📊 Observability
OpenSquilla: ✅ Decision logs store hashes, not raw text (compliance-friendly), every pipeline stage instrumented
OpenClaw: ✅ Native OpenTelemetry (as plugin), plug-and-play with Prometheus/Grafana
Hermes Agent: ⚠️ SQLite session table + call counter, basic level

🧩 Extension DX
OpenSquilla: ✅ A 5-line duck-typed class is a valid plugin: no base class, no SDK package, no manifest
OpenClaw: ⚠️ Implement interface in plugin-sdk + write manifest file
Hermes Agent: ⚠️ Tools auto-register on import (implicit side effects)
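
The observability row mentions decision logs that store hashes instead of raw text. A minimal sketch of that idea is below, assuming SHA-256 and a JSON line per decision. The log_decision helper and its field names are hypothetical and do not reflect OpenSquilla's actual log schema.

```python
import hashlib
import json


def log_decision(stage: str, prompt: str, choice: str) -> str:
    """Build one log line that records a digest of the prompt, not its text."""
    record = {
        "stage": stage,
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "choice": choice,
    }
    return json.dumps(record, sort_keys=True)


line = log_decision("routing", "my confidential question", "cheap")
print("confidential" in line)  # prints False
```

The same prompt always yields the same digest, so decisions can be correlated across stages without the log ever containing user text.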

Who Benefits Most from OpenSquilla?

These scenarios get the highest ROI

🏢
On-Prem Deployment
Fully offline, data never leaves your network, ML routing runs locally
📋
Compliance & Audit
Three-tier policies + hashed decision logs + human approval gates
💸
Tight Budget, High Bar
Run more tasks for the same cost — smart routing picks the most cost-effective model
🧠
Agent That Gets You
Four-tier human-like memory accumulates context — never start from zero again
Limited-Time Free Token Offer

Free Tokens, Zero-Cost Trial

OpenSquilla is fully open source — pull from GitHub and self-host anytime.
But running LLMs still costs Tokens. We're giving you a starting Token credit so you can verify "OpenSquilla saves 60-80%" with zero risk.

10 seconds to fill out, no credit card required.

Apache 2.0 Open Source
No Credit Card
Priority Support
Limited Quantity · First Come First Served