The Agent Memory Landscape
AI agents need memory. The context window isn't enough: its contents vanish between sessions, filling it is expensive, and that cost grows linearly with conversation history. The question isn't whether your agent needs a memory layer, but which one to use.
Four products lead the space: Mem0, Zep/Graphiti, Letta (MemGPT), and Mnemora. Each takes a fundamentally different architectural approach. This post compares them honestly — including where each one falls short.
Comparison at a Glance
| Criteria | Mem0 | Zep / Graphiti | Letta (MemGPT) | Mnemora |
|---|---|---|---|---|
| Memory Types | Key-value + vector | Temporal knowledge graph | Tiered blocks (core + archival) | 4 types (working, semantic, episodic, procedural) |
| LLM Required for CRUD | Yes (every operation) | No | Yes (memory self-editing) | No |
| Self-Hosted Option | Yes (OSS SDK) | Graphiti only (Zep is closed) | Yes | No (managed only) |
| Serverless | Managed platform only | No | No | Yes (AWS Lambda + Aurora Serverless) |
| Multi-Tenant | Platform-level | No | No | Yes (API-key scoped isolation) |
| Checkpoint Support | No | No | No | Yes (LangGraph compatible) |
| Pricing Model | Per-request + storage | Seat-based | Self-hosted (infra cost) | Tiered plans ($0-$99/mo) |
| Vector Search | Yes | Yes (within graph) | Yes (archival memory) | Yes (pgvector, 1024-dim) |
Mem0: The Popular Choice
GitHub stars: 43K+ | Architecture: Managed platform + OSS SDK
Mem0 has the strongest brand awareness in the agent memory space, and for good reason. Their managed platform offers a clean API, and the open-source SDK lets you self-host with your own vector database.
Strengths:
- Large community and ecosystem. Extensive documentation and examples.
- The managed platform handles infrastructure entirely. You get a hosted API with no ops.
- Strong integrations with LangChain, LlamaIndex, and other popular frameworks.
- The OSS SDK is genuinely usable for self-hosting.
Weaknesses:
- Every CRUD operation calls an LLM. Storing a simple key-value pair triggers an LLM call to extract and categorize the memory. This adds 500ms+ latency and token cost to every write.
- No built-in working memory or state management. You get vector search, but not session state.
- No checkpoint support for agent frameworks like LangGraph.
- Multi-tenancy requires the managed platform — the OSS SDK is single-tenant.
- Cost can escalate quickly. Each memory operation burns LLM tokens on top of storage and API fees.
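The per-write overhead is easy to underestimate. A back-of-envelope model (all numbers here are illustrative assumptions, not Mem0's actual pricing or token counts) shows how LLM-backed writes add up:

```python
# Rough cost of LLM-backed memory writes.
# All figures below are illustrative assumptions, not vendor pricing.

def monthly_write_cost(writes_per_day: int,
                       tokens_per_write: int = 800,       # extraction prompt + output (assumed)
                       usd_per_1k_tokens: float = 0.0005  # assumed blended token price
                       ) -> float:
    """LLM token cost per month for memory writes alone."""
    monthly_tokens = writes_per_day * 30 * tokens_per_write
    return monthly_tokens / 1000 * usd_per_1k_tokens

# An agent fleet writing memory 10,000 times a day:
cost = monthly_write_cost(10_000)
print(f"${cost:.2f}/month in tokens, before storage and API fees")  # $120.00/month ...
```

Under these assumed numbers, the LLM calls alone cost more per month than many managed database tiers, and that's before the 500ms+ latency on each write.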
Best for: Teams that want a managed, battle-tested memory platform and are comfortable with per-operation LLM costs. Strong choice if you're already using their ecosystem.
Zep / Graphiti: The Knowledge Graph Approach
Architecture: Temporal knowledge graph (bi-temporal data model)
Zep takes a different tack: instead of a vector database, it builds a temporal knowledge graph in which facts carry both valid-time (when the fact was true in the world) and transaction-time (when the system learned it) dimensions. The open-source component, Graphiti, provides the graph engine.
Strengths:
- The bi-temporal model is genuinely innovative. Facts can be valid for specific time ranges, making it natural to handle corrections and temporal queries.
- Sub-200ms retrieval performance is impressive for graph-based queries.
- No LLM required in the read path — graph traversal is deterministic and fast.
- Excellent for conversational agents that need to track evolving facts about users.
Weaknesses:
- The Zep platform is closed-source. Only Graphiti (the graph engine) is OSS.
- Steeper learning curve. The bi-temporal model is powerful but requires understanding graph concepts.
- No serverless option. You need to run and manage the graph database infrastructure.
- Limited to the knowledge graph paradigm. If you need simple key-value state or time-series episode logs, you'll need additional infrastructure.
- No multi-tenant isolation out of the box.
Best for: Applications where temporal reasoning about facts is critical — conversational assistants that need to know "the user moved to London in January" and handle corrections gracefully.
Letta (MemGPT): The Self-Editing Memory
GitHub stars: 42K+ | Architecture: LLM-managed memory blocks
Letta, originally MemGPT, pioneered the concept of agents that manage their own memory. The architecture splits memory into core memory (always in context) and archival memory (vector-searchable), and the LLM itself decides what to remember and forget.
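The tiering can be pictured with a small sketch (a simplification for illustration, not Letta's actual implementation): core memory is a bounded set of blocks that always travels with the prompt, and anything evicted falls through to a searchable archive.

```python
class TieredMemory:
    """Toy model of core (in-context) vs. archival (searchable) memory."""

    def __init__(self, core_limit: int = 3):
        self.core_limit = core_limit
        self.core: dict[str, str] = {}  # always rendered into the prompt
        self.archive: list[str] = []    # searched on demand

    def remember(self, label: str, text: str) -> None:
        # When core is full, evict the oldest block to the archive.
        if label not in self.core and len(self.core) >= self.core_limit:
            oldest = next(iter(self.core))
            self.archive.append(self.core.pop(oldest))
        self.core[label] = text

    def search_archive(self, query: str) -> list[str]:
        # Stand-in for vector search: naive substring match.
        return [t for t in self.archive if query.lower() in t.lower()]

mem = TieredMemory(core_limit=2)
mem.remember("persona", "You are a helpful assistant.")
mem.remember("user", "User's name is Ada.")
mem.remember("project", "Working on a compiler.")  # evicts "persona" to archive
print(mem.search_archive("assistant"))  # ['You are a helpful assistant.']
```

The key difference in the real system is *who* runs this logic: in Letta, the LLM itself issues the edit, evict, and recall operations as tool calls, which is what the "self-editing" label refers to.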
Strengths:
- The self-editing memory concept is elegant. The agent autonomously decides what's important enough to persist.
- Core memory stays in the context window, so retrieval latency is effectively zero for frequently-accessed data.
- Active open-source project with 42K+ stars and strong community.
- Built-in conversation management and multi-step tool use.
- The agent framework is comprehensive — not just memory, but a full agent runtime.
Weaknesses:
- Heavy server requirement. Letta runs as a server process, not as a serverless function. This means always-on infrastructure costs.
- Every memory operation involves an LLM call, since the LLM decides what to store. This adds cost and latency.
- Tightly coupled to the Letta agent framework. Using just the memory layer independently is difficult.
- Scaling to multi-tenant SaaS use cases requires significant custom work.
- No LangGraph checkpoint compatibility.
Best for: Teams building agents on the Letta framework who want the agent to autonomously manage its own memory. Less suitable if you just need a memory database for an existing agent.
Mnemora: Serverless Unified Memory
Architecture: AWS-native (DynamoDB + Aurora pgvector + S3 + Lambda)
Mnemora takes a different angle: instead of building a new database engine, it composes existing AWS services into a unified memory API. Four memory types — working, semantic, episodic, and procedural — are exposed through a single REST API.
Strengths:
- No LLM in the CRUD path. Storing state is a DynamoDB write (sub-10ms). Vector search embeds on write, not on read. This means predictable latency and no token costs for basic operations.
- Truly serverless. Scales to zero when idle (about $1/month), scales up automatically under load. No servers to manage.
- Multi-tenant by design. Every API key maps to an isolated tenant with partition-level isolation in DynamoDB and row-level security in Aurora.
- LangGraph checkpoint compatibility via MnemoraCheckpointSaver, a drop-in replacement for the default MemorySaver.
- Four memory types through one API means you don't need to stitch together separate databases.
Weaknesses:
- No self-hosted option. Mnemora is a managed service running on AWS — you can't run it on your own infrastructure.
- Newer project with a smaller community compared to Mem0 or Letta.
- AWS-only. If your stack is on GCP or Azure, the latency to Mnemora's us-east-1 deployment adds overhead.
- No temporal knowledge graph. If you need bi-temporal fact tracking, Zep/Graphiti is better suited.
- The procedural memory type is less mature than the other three.
Best for: Teams building on AWS who want a single memory API for multiple memory types, especially those using LangGraph and needing multi-tenant isolation for SaaS applications.
When to Choose Each
Choose Mem0 when:
- You want a battle-tested managed platform with the largest community
- Per-operation LLM cost is acceptable for your use case
- You need the open-source self-hosting option as a fallback
Choose Zep/Graphiti when:
- Temporal reasoning about facts is a core requirement
- You need sub-200ms graph-based retrieval
- You're willing to manage graph database infrastructure
Choose Letta when:
- You want agents that autonomously manage their own memory
- You're building on the Letta agent framework end-to-end
- Self-editing memory blocks fit your agent architecture
Choose Mnemora when:
- You need multiple memory types (state + vectors + episodes) in one API
- Serverless scale-to-zero pricing matters for your workload
- You're building multi-tenant SaaS features with agent memory
- You use LangGraph and need a persistent checkpointer
Conclusion
There is no single best agent memory solution. The right choice depends on your architecture, scale requirements, and whether you prioritize autonomous memory management (Letta), temporal knowledge graphs (Zep), community ecosystem (Mem0), or unified serverless simplicity (Mnemora).
If you're evaluating options, start with the question: does your agent need to call an LLM just to read and write memory? If the answer is no, your choice narrows to Zep and Mnemora. From there, it comes down to whether you need a knowledge graph or a unified multi-type memory API.