YuYi gives every AI agent you use a shared, persistent memory layer — deployed on your own infrastructure, owned by you, readable by all.
# Write a memory from any agent
$ curl -X POST https://memory.local/v1/memories \
    -H "Authorization: Bearer mk-..." \
    -d '{
      "content": "All API responses follow {code, message, data}",
      "memoryType": "project_rule",
      "scope": { "projectId": "p_demo" }
    }'

{
  "code": "CREATED",
  "data": {
    "id": "mem_a1b2c3d4",
    "memoryType": "project_rule",
    "importance": 0.9,
    "status": "active"
  }
}
# Recall context for a task — from any other agent
$ curl -X POST https://memory.local/v1/recall \
-d '{ "task": "Implement user registration", "maxTokens": 2000 }'
ChatGPT Memory, Claude Memory — each is siloed. The knowledge you build in one tool never reaches another.
Switch models, switch clients, or start a new session — and everything you established is gone. Agents start from zero, every time.
You can't audit what's stored, delete specific memories, or migrate your data. Your knowledge lives at someone else's address.
This isn't a bug in any single product. It's a structural gap — memory has been treated as a product feature when it should be an independent infrastructure layer, like a database or a version control system.
Memory belongs to you. Self-deploy, self-manage, export everything, delete for real. No external accounts required.
Stage one: write only when you say so. Auto-write is opt-in, always transparent, always reversible.
Stable Memory stores distilled facts. History Recall stores process context. Conflating them destroys recall quality.
No dependency on any vendor's native memory API. Plug in any LLM or embedding model. Agent stops working? Core keeps running.
Recall returns a compacted, token-budgeted context block ready to inject. Search returns a list. They are different operations.
Ship the minimum verifiable loop. Incremental complexity on top of a proven foundation — not features woven into an unproven base.
Every adapter maps intent to a standard API call. All business logic lives server-side. Adding a new agent requires only a new adapter — the core never changes.
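The adapter pattern above can be sketched in a few lines of Python (class and field names here are illustrative, not the actual YuYi codebase): an adapter only translates an agent-side event into a standard API call, and all business logic stays server-side.

```python
from dataclasses import dataclass

@dataclass
class ApiCall:
    method: str
    path: str
    body: dict

class BaseAdapter:
    def remember(self, content: str, project_id: str) -> ApiCall:
        # Every agent hits the same standard endpoint; only the trigger differs.
        return ApiCall("POST", "/v1/memories",
                       {"content": content, "scope": {"projectId": project_id}})

class ClaudeMcpAdapter(BaseAdapter):
    """Maps an MCP tool invocation onto the standard write call."""
    def on_tool_call(self, args: dict) -> ApiCall:
        return self.remember(args["content"], args["projectId"])

call = ClaudeMcpAdapter().on_tool_call(
    {"content": "APIs return {code, message, data}", "projectId": "p_demo"})
print(call.path)  # /v1/memories
```

Adding a new agent means adding one subclass like `ClaudeMcpAdapter`; the server-side core never changes.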
Layer 3 answers: is this worth storing? which layer? when recalled, what fits? It is judgment and scheduling, not storage.
Vector search and lexical search fail independently. No embedding provider means pure lexical. zhparser unavailable means plainto_tsquery. The pipeline never crashes from a missing dependency.
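A minimal sketch of that degradation chain (the search functions are stand-ins, not the real implementation): each retrieval path fails independently, and a missing dependency downgrades the pipeline instead of crashing it.

```python
def vector_search(query_vector):
    return [("mem_vec", 0.91)]   # stand-in for a vector-index lookup

def lexical_search(query, parser):
    return [("mem_lex", 0.40)]   # stand-in for Postgres full-text search

def hybrid_search(query, embedder=None, zhparser_available=False):
    results = []
    if embedder is not None:     # vector path only if a provider is configured
        try:
            results += vector_search(embedder(query))
        except Exception:
            pass                 # vector path failed; lexical path still runs
    # Lexical path: prefer zhparser, otherwise plainto_tsquery-style parsing.
    parser = "zhparser" if zhparser_available else "plainto_tsquery"
    results += lexical_search(query, parser)
    return results
```

With no embedding provider configured, the pipeline quietly becomes pure lexical search rather than raising an error.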
A three-stage decision chain runs on every write request. Hard rules run first and cannot be overridden by the LLM. The LLM is advisory. The rule fallback is always available.
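The three stages can be sketched as follows (the specific rules are invented for illustration): hard rules are final, the LLM is advisory, and a deterministic rule fallback always answers.

```python
def decide_write(req, llm_judge=None):
    # Stage 1: hard rules; the LLM can never override these.
    if "password" in req["content"].lower():
        return ("reject", "hard_rule")
    # Stage 2: advisory LLM judgment, when a provider is configured.
    if llm_judge is not None:
        try:
            return ("accept" if llm_judge(req) else "reject", "llm")
        except Exception:
            pass  # provider down: fall through to the rule fallback
    # Stage 3: rule fallback, always available.
    ok = req.get("memoryType") == "project_rule" or len(req["content"]) > 40
    return ("accept" if ok else "reject", "rule_fallback")
```

The key property is that stage 1 runs before any LLM call, so a misbehaving model can never force a write that a hard rule forbids.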
The Recall Orchestrator runs eight steps to produce a RecallContextBlock: a compacted, deduplicated, token-budgeted block ready to inject directly into any agent's context window.
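The final packing step can be sketched like this (a hypothetical helper, not the real orchestrator): deduplicate, greedily fit the most important memories under the token budget, and report whether anything was cut.

```python
def pack_block(memories, max_tokens):
    estimate = lambda m: len(m["summary"]) // 4 + 8   # crude token estimate
    seen, picked, used = set(), [], 0
    for m in sorted(memories, key=lambda m: -m["importance"]):
        if m["summary"] in seen:
            continue                                   # drop duplicate content
        cost = estimate(m)
        if used + cost > max_tokens:
            return {"stableMemories": picked, "tokenEstimate": used,
                    "truncated": True}
        seen.add(m["summary"]); picked.append(m); used += cost
    return {"stableMemories": picked, "tokenEstimate": used, "truncated": False}
```

The `truncated` flag mirrors the field in the RecallContextBlock, so callers can tell when the budget forced something out.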
All endpoints return { code, message, data, httpStatus, timestamp }. Errors use structured MEM-* codes, never raw HTTP status text.
// Request body
{
  "content": "All backend APIs return {code, message, data}",
  "title": "Response format convention",
  "memoryType": "project_rule",
  "scope": { "projectId": "p_demo" },
  "semanticKey": "p_demo:rule:api_response_format",
  "importance": 0.9
}

// Response 201
{
  "code": "CREATED",
  "data": { "id": "mem_a1b2c3d4", "status": "active", "version": 1 }
}
// Request body: write a history record
{
  "content": "Completed user auth module. MailCodeService handles TTL.",
  "recordKind": "task_summary",
  "scope": { "projectId": "p_demo" },
  "ttlPolicy": "365d"
}
// Request
{
  "task": "Implement user registration with email verification",
  "scope": { "projectId": "p_demo" },
  "maxTokens": 2000,
  "prefer": "stable_first",
  "includeHistory": true
}

// Response — RecallContextBlock
{
  "blockType": "task_recall",
  "stableMemories": [
    { "id": "mem_...", "summary": "...", "importance": 0.92 }
  ],
  "historyRecords": [ /* ... */ ],
  "conflicts": [],
  "freshnessNotes": [],
  "tokenEstimate": 1420,
  "truncated": false
}
GET /v1/memories/recent?projectId=p_demo&limit=10
GET /v1/memories?projectId=p_demo&status=active&memoryType=project_rule&hasConflict=false&page=1&size=20
{ "status": "archived" }
// Valid transitions: active→archived, active→invalid, active→deleted
// archived→active (restore), invalid→active (restore)
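The transition table above amounts to a small state machine; a guard like this (an illustrative sketch, not the server code) rejects anything outside the valid set:

```python
VALID_TRANSITIONS = {
    ("active", "archived"), ("active", "invalid"), ("active", "deleted"),
    ("archived", "active"),  # restore
    ("invalid", "active"),   # restore
}

def transition(current, target):
    if (current, target) not in VALID_TRANSITIONS:
        raise ValueError(f"illegal transition: {current} -> {target}")
    return target
```

Note that "deleted" has no outgoing transitions, so deletion is terminal.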
GET /v1/memories/search?q=api+response+format&limit=20
{ "username": "admin", "password": "..." }
// Returns HttpOnly Session Cookie for browser clients
{
  "name": "Claude MCP",
  "clientType": "mcp",
  "accessLevel": "read_only",
  "expiresInDays": 30
}
// Plaintext token returned only once on creation
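One common way to honor "plaintext returned only once" (an assumption about the design, not something the docs confirm) is to persist only a hash of the token and compare hashes on every subsequent request:

```python
import hashlib
import secrets

def create_api_key():
    plaintext = "mk-" + secrets.token_urlsafe(24)
    digest = hashlib.sha256(plaintext.encode()).hexdigest()
    return plaintext, digest   # persist only the digest; show plaintext once

def verify_api_key(presented, stored_digest):
    candidate = hashlib.sha256(presented.encode()).hexdigest()
    # Constant-time comparison avoids leaking digest prefixes via timing.
    return secrets.compare_digest(candidate, stored_digest)
```

Because only the digest is stored, a database leak does not expose usable tokens.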
MEM-AUTH*     Authentication & authorization
MEM-MEM*      Memory operation errors
MEM-RECALL*   Recall pipeline errors
MEM-JUDGE*    Judge / LLM provider errors
MEM-SEARCH*   Embedding & search errors
MEM-SYS*      System & validation errors

git clone https://github.com/plumememory/memory-deploy
cd memory-deploy
cp .env.example .env
# Set required variables:
MEMORY_BOOTSTRAP_ADMIN_USERNAME=admin
MEMORY_BOOTSTRAP_ADMIN_PASSWORD=change-me-now
POSTGRES_PASSWORD=your-db-password
# Development
docker compose -f docker-compose.dev.yml up -d
# Production (with TLS via Caddy/Nginx)
docker compose -f docker-compose.prod.yml up -d
bash verify-mvp.sh
# ✓ Write stable memory
# ✓ Write history record
# ✓ Task recall
# ✓ Cross-agent read
# ✓ Delete & temporary mode
MEMORY_EMBEDDING_ENABLED=true
MEMORY_EMBEDDING_API_KEY=sk-...
MEMORY_EMBEDDING_MODEL=text-embedding-3-small
MEMORY_JUDGE_LLM_ENABLED=true
MEMORY_LLM_API_KEY=sk-...
MEMORY_LLM_MODEL=gpt-4o-mini
MEMORY_LLM_BASE_URL=https://api.openai.com/v1