Open Source · Self-Hosted · v0.1 MVP

Shared Memory.
Unified Soul.

YuYi gives every AI agent you use a shared, persistent memory layer — deployed on your own infrastructure, owned by you, readable by all.

Java 21 + Spring Boot · PostgreSQL + pgvector · MCP / CLI / REST
memory-server
# Write a memory from any agent
$ curl -X POST https://memory.local/v1/memories \
    -H "Authorization: Bearer mk-..." \
    -H "Content-Type: application/json" \
    -d '{
  "content": "All API responses follow {code, message, data}",
  "memoryType": "project_rule",
  "scope": { "projectId": "p_demo" }
}'

{
  "code": "CREATED",
  "data": {
    "id": "mem_a1b2c3d4",
    "memoryType": "project_rule",
    "importance": 0.9,
    "status": "active"
  }
}

# Recall context for a task — from any other agent
$ curl -X POST https://memory.local/v1/recall \
    -H "Authorization: Bearer mk-..." \
    -H "Content-Type: application/json" \
    -d '{ "task": "Implement user registration", "maxTokens": 2000 }'

Every AI product is a walled garden for memory

01

Closed ecosystems

ChatGPT Memory, Claude Memory — each is siloed. The knowledge you build in one tool never reaches another.

02

Context amnesia

Switch models, switch clients, or start a new session — and everything you established is gone. Agents start from zero, every time.

03

No user control

You can't audit what's stored, delete specific memories, or migrate your data. Your knowledge lives at someone else's address.

This isn't a bug in any single product. It's a structural gap — memory has been treated as a product feature when it should be an independent infrastructure layer, like a database or a version control system.

Six principles that govern every decision

P1

User data sovereignty

Memory belongs to you. Self-deploy, self-manage, export everything, delete for real. No external accounts required.

P2

Explicit control over automation

Stage one: write only when you say so. Auto-write is opt-in, always transparent, always reversible.

P3

Memory and history are different things

Stable Memory stores distilled facts. History Recall stores process context. Conflating them destroys recall quality.

P4

Model and host agnostic

No dependency on any vendor's native memory API. Plug in any LLM or embedding model. Agent stops working? Core keeps running.

P5

Recall is not search

Recall returns a compacted, token-budgeted context block ready to inject. Search returns a list. They are different operations.

P6

Closed loop first

Ship the minimum verifiable loop. Incremental complexity on top of a proven foundation — not features woven into an unproven base.

Conflict resolution order:
Data Sovereignty > Explicit Control > Separation > Agnosticism > Recall Quality > Simplicity

Six layers, three pipelines, one stable API surface

Layer 1: Client / Agent Layer (Codex · Antigravity · Claude Desktop · IDE Agent · CLI)
Layer 2: Access Adapter Layer (Skill · Tool · MCP Server · CLI · SDK)
Layer 3: Memory Orchestrator (Write Pipeline · Recall Pipeline · Control Pipeline)
Layer 4: Memory Storage Plane (Stable Memory Store · History Recall Store · Control Metadata)
Layer 5: Search & Ranking (Lexical · Vector (pgvector) · RRF Fusion · Tag & Scope Filter)
Layer 6: Storage / Infra (PostgreSQL · Redis · Object Store · TLS · Reverse Proxy)
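Layer 5 lists RRF Fusion as the step that merges lexical and vector results. A minimal sketch of Reciprocal Rank Fusion follows; the function name `rrf_fuse` and the constant `k=60` (a commonly used default) are assumptions for illustration, not confirmed details of memory-server.

```python
from collections import defaultdict

def rrf_fuse(rankings, k=60):
    """Fuse several ranked ID lists with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) per item, with rank starting at 1.
    k=60 is an assumed default, not a documented YuYi setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical result lists from the two retrievers.
lexical = ["mem_a", "mem_b", "mem_c"]
vector = ["mem_b", "mem_d", "mem_a"]
fused = rrf_fuse([lexical, vector])
print(fused)
```

Because RRF only looks at ranks, it needs no score normalization between the lexical and vector retrievers, and a single input list passes through unchanged.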

Adapters are thin

Every adapter maps intent to a standard API call. All business logic lives server-side. Adding a new agent requires only a new adapter — the core never changes.

Orchestrator is the brain

Layer 3 answers: is this worth storing? which layer? when recalled, what fits? It is judgment and scheduling, not storage.

Search degrades gracefully

Vector search and lexical search fail independently. With no embedding provider, recall falls back to pure lexical search. If zhparser is unavailable, lexical search falls back to plainto_tsquery. The pipeline never crashes from a missing dependency.
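The fallback behavior described above can be sketched as a small planning function. The name `plan_search` and the shape of the returned dictionary are hypothetical; only the two fallbacks (no embeddings means lexical only, no zhparser means plainto_tsquery) come from the text.

```python
def plan_search(embedding_available: bool, zhparser_available: bool) -> dict:
    """Decide which retrieval paths to run, given what is installed.

    Lexical search is always present; vector search is added only when
    an embedding provider is configured.
    """
    modes = ["lexical"]
    if embedding_available:
        modes.append("vector")
    return {
        "modes": modes,
        # Fall back to plainto_tsquery when zhparser is missing.
        "lexical_strategy": "zhparser" if zhparser_available else "plainto_tsquery",
        # Fusion only makes sense with more than one result list.
        "fusion": "rrf" if len(modes) > 1 else None,
    }

print(plan_search(embedding_available=False, zhparser_available=False))
```

Every combination of flags yields a valid plan, which is the property the "never crashes" claim depends on.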

Two kinds of memory. One unified system.

Stable Memory
Long-term, high-fidelity facts
preference · project_rule · decision · fact · workflow · reference · summary
  • Low write frequency, high read frequency
  • Versioned — every change is tracked
  • State machine: active → archived / invalid → deleted
  • semantic_key deduplication prevents clutter
  • importance ≥ 0.9 cannot be truncated by token budget
vs
History Recall
Process context, allowed to expire
task_summary · decision_trace · session_excerpt · recent_progress · incident_context · meeting_note
  • Append-write primary — no overwrite pressure
  • TTL policy support (30d / 90d / 365d / permanent)
  • Lower structure constraints — richer freeform text
  • Recall weight lower — supplements stable results
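The TTL policies listed for History Recall map naturally to an expiry timestamp. The sketch below assumes expiry is measured from the record's creation time; the helper names `expires_at` and `is_expired` are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Policy names come from the list above; None means the record never expires.
TTL_DAYS = {"30d": 30, "90d": 90, "365d": 365, "permanent": None}

def expires_at(created_at: datetime, ttl_policy: str) -> Optional[datetime]:
    """Return the expiry time for a record, or None for permanent records."""
    if ttl_policy not in TTL_DAYS:
        raise ValueError(f"unknown ttlPolicy: {ttl_policy}")
    days = TTL_DAYS[ttl_policy]
    return None if days is None else created_at + timedelta(days=days)

def is_expired(created_at: datetime, ttl_policy: str, now: datetime) -> bool:
    """True once `now` has reached the record's expiry time."""
    deadline = expires_at(created_at, ttl_policy)
    return deadline is not None and now >= deadline

created = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(expires_at(created, "30d"))
```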

Memory Judge decides where things go

A three-stage decision chain runs on every write request. Hard rules run first and cannot be overridden by the LLM. The LLM is advisory. The rule fallback is always available.

L1
Hard Guards
Sensitive content, empty inputs, length violations — deterministic, cannot be overridden
L2
LLM Judge
Structured JSON output; confidence threshold enforced; auto-fallback on failure
L3
Rule Fallback
Admin-configurable rule table; keyword classification, importance defaults, semantic_key inference
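A minimal sketch of the three-stage chain, to show how the stages relate. The length limit, confidence threshold, keyword rules, and the `llm_judge` callable are all hypothetical placeholders; the real guards and rule table are admin-configured and not specified here.

```python
from typing import Callable, Optional

MAX_LENGTH = 4000            # hypothetical length limit
CONFIDENCE_THRESHOLD = 0.7   # hypothetical confidence threshold

def hard_guards(content: str) -> Optional[str]:
    """Stage 1: deterministic checks. Returns a rejection reason or None."""
    if not content.strip():
        return "empty input"
    if len(content) > MAX_LENGTH:
        return "content too long"
    if "password=" in content.lower():   # naive stand-in for sensitive-content detection
        return "sensitive content"
    return None

def rule_fallback(content: str) -> dict:
    """Stage 3: keyword classification with importance defaults (illustrative rules)."""
    lowered = content.lower()
    if "must" in lowered or "always" in lowered:
        return {"memoryType": "project_rule", "importance": 0.8, "source": "rules"}
    return {"memoryType": "fact", "importance": 0.5, "source": "rules"}

def judge(content: str, llm_judge: Optional[Callable[[str], dict]] = None) -> dict:
    reason = hard_guards(content)
    if reason is not None:
        return {"accepted": False, "reason": reason}   # the LLM is never consulted
    if llm_judge is not None:
        try:
            verdict = llm_judge(content)               # Stage 2: advisory only
            if verdict.get("confidence", 0.0) >= CONFIDENCE_THRESHOLD:
                return {"accepted": True, **verdict, "source": "llm"}
        except Exception:
            pass                                       # any LLM failure falls through to rules
    return {"accepted": True, **rule_fallback(content)}

print(judge("APIs must return JSON"))
```

The ordering matters: because the hard guards return before the LLM is called, no model output can override them, and the rule fallback keeps writes working with no LLM configured at all.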

Recall is not search. It's context engineering.

The Recall Orchestrator runs eight steps to produce a RecallContextBlock — a compacted, deduplicated, token-budgeted block ready to inject directly into any agent's context window.

1. Candidate retrieval: hybrid lexical + vector search across both stores
2. Scoring: importance × recency × hybridScore
3. Sorting: descending by composite score
4. semantic_key dedup: same key → only the highest-scoring entry is kept
5. Conflict detection: memories flagged hasConflict are surfaced
6. Token budget: stable/history split by the prefer parameter
7. Freshness notes: warning thresholds at 30, 90, and 365 days
8. Block assembly: RecallContextBlock returned
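Steps 2 to 4 can be sketched in a few lines. The composite score follows the formula above; the recency function (exponential decay with a 90-day half-life) and the `Candidate` fields are assumptions, since the text does not define how recency is computed.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Candidate:
    id: str
    semantic_key: Optional[str]
    importance: float
    age_days: float
    hybrid_score: float

def recency(age_days: float, half_life_days: float = 90.0) -> float:
    """Assumed decay: the weight halves every `half_life_days`."""
    return 0.5 ** (age_days / half_life_days)

def composite(c: Candidate) -> float:
    # Step 2: importance × recency × hybridScore
    return c.importance * recency(c.age_days) * c.hybrid_score

def rank_and_dedup(candidates: List[Candidate]) -> List[Candidate]:
    ranked = sorted(candidates, key=composite, reverse=True)   # Step 3
    seen, result = set(), []
    for c in ranked:                                           # Step 4
        if c.semantic_key is not None:
            if c.semantic_key in seen:
                continue   # a higher-scoring entry with this key was already kept
            seen.add(c.semantic_key)
        result.append(c)
    return result

candidates = [
    Candidate("old_rule", "p_demo:rule:api", 0.9, 90, 0.8),
    Candidate("new_rule", "p_demo:rule:api", 0.9, 0, 0.8),
    Candidate("note", None, 0.5, 0, 0.9),
]
print([c.id for c in rank_and_dedup(candidates)])
```

Deduplicating after sorting means the first entry seen for a key is always its highest-scoring one, so no second pass is needed.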
Token budget allocation by prefer mode:
Mode           Stable   History
stable_first   70%      30%
history_first  30%      70%
balanced       50%      50%
stable_only    100%     0%
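The allocation table translates directly into a budget split. The sketch below also illustrates one possible reading of the rule that memories with importance ≥ 0.9 cannot be truncated: they are reserved first, and the rest fill whatever budget remains. Function names and the tuple layout are hypothetical.

```python
# Stable share in percent, per prefer mode; history gets the remainder.
STABLE_PERCENT = {
    "stable_first": 70,
    "history_first": 30,
    "balanced": 50,
    "stable_only": 100,
}

def split_budget(max_tokens: int, prefer: str = "stable_first"):
    """Return (stable_tokens, history_tokens) using integer arithmetic."""
    stable = max_tokens * STABLE_PERCENT[prefer] // 100
    return stable, max_tokens - stable

def fit_stable(memories, budget: int):
    """memories: list of (id, importance, tokens), already sorted by score.

    Memories with importance >= 0.9 are always kept, even if that
    overshoots the budget; others are kept only while they still fit.
    """
    pinned = {m[0] for m in memories if m[1] >= 0.9}
    used = sum(m[2] for m in memories if m[0] in pinned)
    kept = set(pinned)
    for mem_id, _importance, tokens in memories:
        if mem_id not in kept and used + tokens <= budget:
            kept.add(mem_id)
            used += tokens
    return [m[0] for m in memories if m[0] in kept], used

stable_budget, history_budget = split_budget(2000, "stable_first")
memories = [("a", 0.95, 900), ("b", 0.5, 400), ("c", 0.6, 300), ("d", 0.92, 200)]
print(stable_budget, history_budget, fit_stable(memories, stable_budget))
```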

A stable, versioned REST surface

All endpoints return { code, message, data, httpStatus, timestamp }. Errors use structured MEM-* codes, never raw HTTP status text.

POST /v1/memories · Write a stable memory
// Request body
{
  "content": "All backend APIs return {code, message, data}",
  "title":   "Response format convention",
  "memoryType": "project_rule",
  "scope": { "projectId": "p_demo" },
  "semanticKey": "p_demo:rule:api_response_format",
  "importance": 0.9
}

// Response 201
{
  "code": "CREATED",
  "data": { "id": "mem_a1b2c3d4", "status": "active", "version": 1 }
}
POST /v1/history-records · Write a history record
{
  "content": "Completed user auth module. MailCodeService handles TTL.",
  "recordKind": "task_summary",
  "scope": { "projectId": "p_demo" },
  "ttlPolicy": "365d"
}
POST /v1/recall · Task context recall
// Request
{
  "task": "Implement user registration with email verification",
  "scope": { "projectId": "p_demo" },
  "maxTokens": 2000,
  "prefer": "stable_first",
  "includeHistory": true
}

// Response — RecallContextBlock
{
  "blockType": "task_recall",
  "stableMemories": [
    { "id": "mem_...", "summary": "...", "importance": 0.92 }
  ],
  "historyRecords": [ /* ... */ ],
  "conflicts": [],
  "freshnessNotes": [],
  "tokenEstimate": 1420,
  "truncated": false
}
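Once a RecallContextBlock comes back, an adapter still has to turn it into text for the agent's prompt. The sketch below uses only the fields shown in the response above (`stableMemories`, `summary`, `importance`, `truncated`); the output wording and the function name `render_block` are assumptions, and history records are left out because their shape is not shown.

```python
def render_block(block: dict) -> str:
    """Render a RecallContextBlock as plain text for prompt injection."""
    lines = ["Relevant project memory:"]
    for mem in block.get("stableMemories", []):
        lines.append(f"- [{mem['importance']:.2f}] {mem['summary']}")
    if block.get("truncated"):
        lines.append("(Some memories were omitted to fit the token budget.)")
    return "\n".join(lines)

# Hypothetical block with one stable memory.
block = {
    "blockType": "task_recall",
    "stableMemories": [
        {"id": "mem_1", "summary": "APIs return {code, message, data}", "importance": 0.92}
    ],
    "truncated": False,
}
print(render_block(block))
```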
GET /v1/memories/recent · Latest stable memories
GET /v1/memories/recent?projectId=p_demo&limit=10
GET /v1/memories · List memories with filters
GET /v1/memories?projectId=p_demo&status=active&memoryType=project_rule&hasConflict=false&page=1&size=20
PATCH /v1/memories/{id}/status · Change memory state
{ "status": "archived" }
// Valid transitions: active→archived, active→invalid, active→deleted
// archived→active (restore), invalid→active (restore)
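The valid transitions listed above form a small state machine. A sketch of the check, with the table copied from those comments and treating `deleted` as terminal (an assumption, since no transition out of it is listed):

```python
# Allowed target states per current state, as listed in the comments above.
ALLOWED_TRANSITIONS = {
    "active": {"archived", "invalid", "deleted"},
    "archived": {"active"},   # restore
    "invalid": {"active"},    # restore
    "deleted": set(),         # assumed terminal
}

def can_transition(current: str, target: str) -> bool:
    """True if a memory in `current` state may move to `target`."""
    return target in ALLOWED_TRANSITIONS.get(current, set())

print(can_transition("active", "archived"))
print(can_transition("archived", "deleted"))
```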
DELETE /v1/memories/{id} · Delete a memory
GET /v1/memories/search · Full-text search
GET /v1/memories/search?q=api+response+format&limit=20
POST /v1/auth/login · Admin login (Session Cookie)
{ "username": "admin", "password": "..." }
// Returns HttpOnly Session Cookie for browser clients
POST /v1/admin/tokens · Create API Token (CLI/MCP)
{
  "name": "Claude MCP",
  "clientType": "mcp",
  "accessLevel": "read_only",
  "expiresInDays": 30
}
// Plaintext token returned only once on creation
POST /v1/admin/tokens/{id}/revoke · Revoke token immediately

Error code structure

MEM-AUTH* · Authentication & authorization
MEM-MEM* · Memory operation errors
MEM-RECALL* · Recall pipeline errors
MEM-JUDGE* · Judge / LLM provider errors
MEM-SEARCH* · Embedding & search errors
MEM-SYS* · System & validation errors
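A client can route errors by prefix without knowing every individual code. In this sketch the prefixes and labels come from the table above, while the full code strings used in the example (such as `MEM-AUTH-001`) are hypothetical, since the suffix format is not specified.

```python
ERROR_CATEGORIES = {
    "MEM-AUTH": "Authentication & authorization",
    "MEM-MEM": "Memory operation errors",
    "MEM-RECALL": "Recall pipeline errors",
    "MEM-JUDGE": "Judge / LLM provider errors",
    "MEM-SEARCH": "Embedding & search errors",
    "MEM-SYS": "System & validation errors",
}

def categorize(code: str) -> str:
    """Map a structured error code to its category by prefix."""
    for prefix, label in ERROR_CATEGORIES.items():
        if code.startswith(prefix):
            return label
    return "Unknown"

print(categorize("MEM-AUTH-001"))   # hypothetical code string
```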

Running in under five minutes

1

Clone the deploy repo

git clone https://github.com/plumememory/memory-deploy
cd memory-deploy
2

Configure environment

cp .env.example .env
# Then edit .env and set the required variables:
MEMORY_BOOTSTRAP_ADMIN_USERNAME=admin
MEMORY_BOOTSTRAP_ADMIN_PASSWORD=change-me-now
POSTGRES_PASSWORD=your-db-password
3

Start the stack

# Development
docker compose -f docker-compose.dev.yml up -d

# Production (with TLS via Caddy/Nginx)
docker compose -f docker-compose.prod.yml up -d
4

Verify the MVP

bash verify-mvp.sh
# ✓ Write stable memory
# ✓ Write history record
# ✓ Task recall
# ✓ Cross-agent read
# ✓ Delete & temporary mode

What's included

  • memory-server — Java Spring Boot core
  • memory-web — React admin console
  • PostgreSQL 16 + pgvector + zhparser
  • Caddy / Nginx reverse proxy examples

Enable vector search

MEMORY_EMBEDDING_ENABLED=true
MEMORY_EMBEDDING_API_KEY=sk-...
MEMORY_EMBEDDING_MODEL=text-embedding-3-small

Enable LLM Judge

MEMORY_JUDGE_LLM_ENABLED=true
MEMORY_LLM_API_KEY=sk-...
MEMORY_LLM_MODEL=gpt-4o-mini
MEMORY_LLM_BASE_URL=https://api.openai.com/v1

Four-repo structure

Java · memory-server · Core business logic & API
TS · client-shared · CLI + MCP Server adapter
React · memory-web · Admin & debug console
🐳 · memory-deploy · Compose + proxy configs