Memory Architecture

Overview of the Siestai memory system — how agents remember, learn, and build context across conversations

The Memory Architecture is the intelligence layer that gives Siestai agents persistent knowledge. It enables agents to remember facts, preferences, decisions, and learnings across all interaction surfaces (Arena, Chat, Team).

High-Level Architecture

graph TD
    A[Interaction Surfaces] --> B[Memory Extraction]
    B --> C[Embedding Service]
    C --> D[Dedup Check]
    D -->|Unique| E[Store in PostgreSQL]
    D -->|Duplicate| F[Update Importance]
    E --> G[pgvector Index]
    G --> H[Context Assembly]
    H --> I[Agent Prompt]

    J[Cron Jobs] --> K[Daily Transitions]
    J --> L[Stale Pruning]
    J --> M[Weekly Consolidation]
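
The dedup branch of the diagram can be sketched in a few lines. This is a minimal in-memory illustration, not the actual `MemoryService` code: the `MemoryRecord` shape, the `remember` function, and the "+1" importance bump are assumptions made for the example, and a plain array stands in for PostgreSQL with pgvector. Only the 0.92 threshold comes from this page.

```typescript
// Hypothetical sketch of the "Dedup Check" branch. Names are illustrative;
// the real system stores embeddings in PostgreSQL and searches them via pgvector.

interface MemoryRecord {
  content: string;
  embedding: number[];
  importance: number;
}

// Cosine similarity threshold listed under Key Design Decisions.
const DEDUP_THRESHOLD = 0.92;

// Cosine similarity of two equal-length vectors; returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions do not match");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Stores a new memory, or bumps the importance of the closest existing one
// when it is similar enough to count as a duplicate.
function remember(
  store: MemoryRecord[],
  content: string,
  embedding: number[],
): "stored" | "updated" {
  let best: MemoryRecord | undefined;
  let bestScore = -Infinity;
  for (const record of store) {
    const score = cosineSimilarity(record.embedding, embedding);
    if (score > bestScore) {
      best = record;
      bestScore = score;
    }
  }
  if (best !== undefined && bestScore >= DEDUP_THRESHOLD) {
    best.importance += 1; // assumed update rule, for illustration only
    return "updated";
  }
  store.push({ content, embedding, importance: 1 });
  return "stored";
}
```

In this sketch a linear scan finds the nearest neighbor; in the architecture described here that lookup would be served by the pgvector index instead.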

Core Components

Component	Location	Purpose
`MemoryService`	`apps/api/src/memory/memory.service.ts`	CRUD, dedup, search, consolidation
`EmbeddingService`	`apps/api/src/memory/embedding.service.ts`	Embedding generation with text-embedding-3-small (1536 dimensions)
`ContextAssemblyService`	`apps/api/src/memory/context-assembly.service.ts`	Layered context building (16k-token budget)
`MemoryExtractionService`	`apps/api/src/arena/memory-extraction.service.ts`	LLM-powered extraction (claude-haiku-4-5)
`DailyFileService`	`apps/api/src/memory/daily-file.service.ts`	Daily activity logs with status transitions
`MdFilesService`	`apps/api/src/memory/md-files.service.ts`	Agent/team configuration files
`MemoryCronService`	`apps/api/src/memory/memory-cron.service.ts`	Scheduled maintenance tasks

Key Design Decisions

In-process Mastra: the AI runtime runs inside the NestJS process, not as a separate service
pgvector for search: HNSW indexes on vector(1536) columns for fast approximate nearest-neighbor search
Dedup at 0.92 cosine similarity: prevents memory bloat while still allowing nuanced variations to be stored
Layered context assembly: a 7-layer system with per-layer token budgets totaling 16k tokens
Three-state daily files: active → warm → archived, with time-based transitions
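
To make the layered budgeting idea concrete, here is a minimal sketch of per-layer token budgets. It is an illustration under stated assumptions rather than the `ContextAssemblyService` implementation: the `ContextLayer` shape, the `assembleContext` function, the section header format, and the rough "4 characters per token" estimate are all hypothetical. This page states only that there are 7 layers whose budgets total 16k tokens.

```typescript
// Hypothetical sketch of layered context assembly with per-layer token budgets.
// Layer names, budgets, and the token estimate are assumptions for illustration.

interface ContextLayer {
  name: string;
  tokenBudget: number;
  items: string[]; // candidate snippets, most relevant first
}

// Overall budget stated on this page (16k tokens).
const TOTAL_BUDGET = 16000;

// Rough token estimate (assumption: about 4 characters per token).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Fills each layer in order, keeping items until the next one would exceed
// that layer's budget. Layers with no kept items are omitted from the output.
function assembleContext(layers: ContextLayer[]): string {
  const allocated = layers.reduce((sum, layer) => sum + layer.tokenBudget, 0);
  if (allocated > TOTAL_BUDGET) {
    throw new Error(`Layer budgets (${allocated}) exceed total (${TOTAL_BUDGET})`);
  }

  const sections: string[] = [];
  for (const layer of layers) {
    let used = 0;
    const kept: string[] = [];
    for (const item of layer.items) {
      const cost = estimateTokens(item);
      if (used + cost > layer.tokenBudget) break;
      kept.push(item);
      used += cost;
    }
    if (kept.length > 0) {
      sections.push(`## ${layer.name}\n${kept.join("\n")}`);
    }
  }
  return sections.join("\n\n");
}
```

Giving every layer its own cap means a large layer (for example, a long list of retrieved memories) cannot crowd out a small but essential one; a production version would likely use a real tokenizer instead of the character-based estimate.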

Schema

schema/types — Database schema and table details
schema/md-files — Agent and team configuration files

Pipeline

pipeline/lifecycle — Memory status transitions
pipeline/extraction — LLM-powered memory extraction pipeline
pipeline/context-assembly — How context is assembled per conversation
pipeline/scoring — Relevance scoring formula

Operations

operations/surfaces — Interaction surfaces (Arena, Chat, Team)
operations/cron-jobs — Automated maintenance schedules
operations/configuration — Constants and thresholds