Memory Architecture
Overview of the Siestai memory system — how agents remember, learn, and build context across conversations
The Memory Architecture is the intelligence layer that gives Siestai agents persistent knowledge. It enables agents to remember facts, preferences, decisions, and learnings across all interaction surfaces.
High-Level Architecture
graph TD
A[Interaction Surfaces] --> B[Memory Extraction]
B --> C[Embedding Service]
C --> D[Dedup Check]
D -->|Unique| E[Store in PostgreSQL]
D -->|Duplicate| F[Update Importance]
E --> G[pgvector Index]
G --> H[Context Assembly]
H --> I[Agent Prompt]
J[Cron Jobs] --> K[Daily Transitions]
J --> L[Stale Pruning]
J --> M[Weekly Consolidation]
Core Components
| Component | Location | Purpose |
|---|---|---|
| MemoryService | apps/api/src/memory/memory.service.ts | CRUD, dedup, search, consolidation |
| EmbeddingService | apps/api/src/memory/embedding.service.ts | text-embedding-3-small (1536 dims) |
| ContextAssemblyService | apps/api/src/memory/context-assembly.service.ts | Layered context building (16k budget) |
| MemoryExtractionService | apps/api/src/arena/memory-extraction.service.ts | LLM-powered extraction (claude-haiku-4-5) |
| DailyFileService | apps/api/src/memory/daily-file.service.ts | Daily activity logs with status transitions |
| MdFilesService | apps/api/src/memory/md-files.service.ts | Agent/team configuration files |
| MemoryCronService | apps/api/src/memory/memory-cron.service.ts | Scheduled maintenance tasks |
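The write path in the diagram (embed, dedup check, then store or update importance) can be sketched roughly as follows. This is a minimal in-memory illustration, not the actual MemoryService code: the `StoredMemory` shape, the `upsertMemory` helper, the starting importance of 0.5, and the 0.1 importance bump are all hypothetical, and the real system keeps vectors in PostgreSQL with pgvector rather than in an array. Only the 0.92 cosine threshold is taken from this page, and treating it as inclusive (`>=`) is an assumption.

```typescript
// Hypothetical shape of a stored memory; the real schema lives in PostgreSQL.
interface StoredMemory {
  content: string;
  embedding: number[];
  importance: number;
}

// Dedup threshold stated on this page (cosine similarity).
const DEDUP_THRESHOLD = 0.92;

// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Store a new memory, or bump the importance of its closest near-duplicate.
// The +0.1 bump (capped at 1) and the 0.5 default are illustrative assumptions.
function upsertMemory(
  store: StoredMemory[],
  content: string,
  embedding: number[],
): "stored" | "updated" {
  let best: StoredMemory | undefined;
  let bestScore = -Infinity;
  for (const memory of store) {
    const score = cosineSimilarity(memory.embedding, embedding);
    if (score > bestScore) {
      best = memory;
      bestScore = score;
    }
  }
  if (best !== undefined && bestScore >= DEDUP_THRESHOLD) {
    best.importance = Math.min(1, best.importance + 0.1);
    return "updated";
  }
  store.push({ content, embedding, importance: 0.5 });
  return "stored";
}
```

A linear scan like this is only workable for a handful of memories; an approximate nearest-neighbor index does the same lookup at scale.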
Key Design Decisions
- In-process Mastra: the AI runtime runs inside NestJS, not as a separate service
- pgvector for search: HNSW indexes on vector(1536) columns for fast approximate nearest-neighbor search
- Dedup at 0.92 cosine similarity: prevents memory bloat while allowing nuanced variations
- Layered context assembly: 7-layer system with per-layer token budgets totaling 16k
- Three-state daily files: active → warm → archived, with time-based transitions
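To illustrate how per-layer token budgets could be enforced under a fixed total, here is a minimal sketch. The `ContextLayer` shape, the `assembleContext` helper, the skip-if-too-large policy, and the estimate of 4 characters per token are all assumptions made for illustration. This page states only that there are 7 layers with budgets totaling 16k, which is read here as 16,000 tokens.

```typescript
// Hypothetical layer description; layer names and budgets are not from the source.
interface ContextLayer {
  name: string;
  budgetTokens: number;
  entries: string[]; // candidate snippets, already ordered by priority
}

// Assumes "16k" means 16,000 tokens.
const TOTAL_BUDGET_TOKENS = 16000;

// Crude estimate (about 4 characters per token); a real tokenizer would differ.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Fill each layer in order, skipping entries that would overflow that layer's budget.
function assembleContext(layers: ContextLayer[]): string {
  const allocated = layers.reduce((sum, layer) => sum + layer.budgetTokens, 0);
  if (allocated > TOTAL_BUDGET_TOKENS) {
    throw new Error(`Layer budgets (${allocated}) exceed the total budget`);
  }
  const sections: string[] = [];
  for (const layer of layers) {
    let used = 0;
    const kept: string[] = [];
    for (const entry of layer.entries) {
      const cost = estimateTokens(entry);
      if (used + cost > layer.budgetTokens) continue; // does not fit, try the next one
      kept.push(entry);
      used += cost;
    }
    if (kept.length > 0) {
      sections.push(`## ${layer.name}\n${kept.join("\n")}`);
    }
  }
  return sections.join("\n\n");
}
```

Giving each layer its own cap means a single noisy layer cannot crowd out the others, at the cost of leaving some budget unused when a layer has little to contribute.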
Related Pages
Schema
- schema/types — Database schema and table details
- schema/md-files — Agent and team configuration files
Pipeline
- pipeline/lifecycle — Memory status transitions
- pipeline/extraction — LLM-powered memory extraction pipeline
- pipeline/context-assembly — How context is assembled per conversation
- pipeline/scoring — Relevance scoring formula
Operations
- operations/surfaces — Interaction surfaces (Arena, Chat, Team)
- operations/cron-jobs — Automated maintenance schedules
- operations/configuration — Constants and thresholds