The Multi-Tenant Challenge
You're building a SaaS product and you want to add AI agent features. Maybe a support agent that answers customer questions, or a data analyst that generates reports, or a workflow assistant that automates tasks.
The problem: when you have 500 customers using AI agents, every customer's data must be completely isolated. Customer A's agent can never see customer B's conversations, knowledge, or state. A single data leak is a security incident, a compliance violation, and a trust-destroying event.
Multi-tenancy in traditional databases is well understood. But agent memory introduces new challenges: vector embeddings in shared indexes, episodic logs spanning multiple storage tiers, and real-time state that needs sub-10ms access. You need isolation at every layer, without sacrificing performance.
API Key Scoping
The foundation of Mnemora's multi-tenant isolation is the API key. Every API key maps to exactly one tenant_id. This mapping is stored in DynamoDB with the key SHA-256 hashed (Mnemora never stores API keys in plaintext).
When a request arrives at the API Gateway, the Lambda authorizer:
- Hashes the bearer token from the Authorization header
- Looks up the hash in the mnemora-users-dev DynamoDB table
- Extracts the tenant_id, tier, and rate limits from the item
- Injects the tenant_id into the Lambda authorizer context
- Downstream handlers read the tenant ID from context — never from the request body
This means the client cannot supply or override their tenant ID. Even if a malicious client sends "tenant_id": "someone-else" in the request body, the handler ignores it and uses the authorizer-derived value.
```python
# Inside every Lambda handler
tenant_id = event["requestContext"]["authorizer"]["tenant_id"]
# NOT from the request body — ever
```
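The hash-then-lookup step the authorizer performs can be sketched as follows. This is a minimal sketch: the helper name, the hex digest encoding, and the table attribute names are assumptions, not Mnemora's actual internals.

```python
import hashlib

def hash_api_key(bearer_token: str) -> str:
    """Derive the DynamoDB lookup key from a bearer token.

    Mnemora stores only the SHA-256 of each API key, so the authorizer
    hashes the incoming token and looks up the digest. The hex encoding
    used here is an assumption.
    """
    return hashlib.sha256(bearer_token.encode("utf-8")).hexdigest()

# The authorizer would then do roughly (attribute names hypothetical):
#   item = table.get_item(Key={"key_hash": hash_api_key(token)})["Item"]
#   tenant_id = item["tenant_id"]
```

Because only digests are stored, a leaked DynamoDB table dump would not reveal usable API keys.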
Isolation at the Database Level
DynamoDB: Partition Key Prefix
Every item in DynamoDB uses a composite partition key: tenant_id#agent_id. This isn't just a convention — it's a physical isolation boundary.
DynamoDB partitions data by the partition key. A query for PK = "github:12345#support-agent" physically cannot return items from PK = "github:67890#support-agent". The database engine doesn't even scan the other tenant's data.
```
# Tenant A's data
PK: github:12345#support-agent  SK: SESSION#default
PK: github:12345#support-agent  SK: EPISODE#2025-02-10T10:30:00Z#ep-001

# Tenant B's data — completely separate partitions
PK: github:67890#support-agent  SK: SESSION#default
PK: github:67890#support-agent  SK: EPISODE#2025-02-10T10:30:00Z#ep-002
```
There is no SCAN operation in Mnemora's codebase. Every DynamoDB access is a GetItem or Query with the full partition key specified, which means cross-tenant data access is structurally impossible.
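The access pattern above can be sketched as a key-construction helper plus a fully keyed query. The helper name is hypothetical; the boto3 call in the comment is illustrative of the pattern, not a quote from Mnemora's codebase.

```python
def partition_key(tenant_id: str, agent_id: str) -> str:
    """Build the composite tenant_id#agent_id partition key.

    Hypothetical helper illustrating the convention described above;
    the real codebase may name this differently.
    """
    return f"{tenant_id}#{agent_id}"

# Every DynamoDB access then supplies the full key, e.g. with boto3:
#   from boto3.dynamodb.conditions import Key
#   table.query(
#       KeyConditionExpression=Key("PK").eq(
#           partition_key(tenant_id, "support-agent")
#       )
#   )
```

Since the partition key always embeds the authorizer-derived tenant ID, there is no query shape that can reach another tenant's partition.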
Aurora: Parameterized Queries + Row-Level Security
Semantic memory lives in Aurora PostgreSQL with pgvector. Every query includes the tenant_id as a parameterized condition:
```sql
SELECT id, content, embedding <=> $1::vector AS distance
FROM semantic_memory
WHERE tenant_id = $2 AND agent_id = $3
ORDER BY embedding <=> $1::vector
LIMIT $4;
```
The $2 parameter is always the authorizer-derived tenant ID. SQL injection attacks against the content or metadata fields cannot escape the tenant filter because the query is parameterized — the tenant ID is never interpolated into the SQL string.
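The effect of parameterization can be demonstrated with sqlite3 standing in for Aurora (no pgvector here; the table layout and contents are illustrative only):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE semantic_memory (tenant_id TEXT, agent_id TEXT, content TEXT)"
)
conn.executemany(
    "INSERT INTO semantic_memory VALUES (?, ?, ?)",
    [
        ("github:12345", "support-agent", "refund policy is 30 days"),
        ("github:67890", "support-agent", "refund policy is 14 days"),
    ],
)

# A hostile search string full of SQL metacharacters is bound as a
# parameter, so it is matched literally and cannot rewrite the filter.
hostile = "x' OR '1'='1"
rows = conn.execute(
    "SELECT content FROM semantic_memory WHERE tenant_id = ? AND content LIKE ?",
    ("github:12345", f"%{hostile}%"),
).fetchall()
assert rows == []  # the injection attempt matches nothing

# A normal query returns only the bound tenant's rows.
rows = conn.execute(
    "SELECT content FROM semantic_memory WHERE tenant_id = ? AND agent_id = ?",
    ("github:12345", "support-agent"),
).fetchall()
# Only tenant A's row comes back: [('refund policy is 30 days',)]
```

The same property holds in PostgreSQL: the driver sends parameters separately from the SQL text, so user input never becomes part of the statement.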
For defense-in-depth, Aurora row-level security (RLS) policies enforce isolation at the database level:
```sql
ALTER TABLE semantic_memory ENABLE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation ON semantic_memory
    USING (tenant_id = current_setting('app.tenant_id'));
```
Even if a handler bug bypasses the WHERE clause, the RLS policy prevents cross-tenant reads.
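For RLS to apply, app.tenant_id must be set on the connection before each query. A hedged sketch of one way to do that with a DB-API style cursor follows; the wrapper class and its names are hypothetical, but set_config() and current_setting() are standard PostgreSQL.

```python
class TenantCursor:
    """Run every query under the tenant's RLS context.

    Hypothetical wrapper assuming a psycopg-style cursor. set_config()
    with is_local=false scopes app.tenant_id to the session, where the
    RLS policy reads it via current_setting('app.tenant_id').
    """

    def __init__(self, cursor, tenant_id: str):
        self._cur = cursor
        self._tenant_id = tenant_id

    def execute(self, sql: str, params=()):
        # Set the RLS variable first, as a bound parameter, never
        # interpolated into the SQL string.
        self._cur.execute(
            "SELECT set_config('app.tenant_id', %s, false)",
            (self._tenant_id,),
        )
        self._cur.execute(sql, params)
        return self._cur
```

With this in place, even a raw `SELECT * FROM semantic_memory` issued through the wrapped cursor returns only the current tenant's rows.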
S3: Prefix Isolation
Episodic memory tiers cold data to S3 with a prefix structure:
```
s3://mnemora-episodes-dev-993952121255/
  github:12345/                    # Tenant A
    support-agent/
      2025-02-10/ep-001.json
  github:67890/                    # Tenant B
    support-agent/
      2025-02-10/ep-002.json
```
Lambda functions construct S3 paths using the authorizer-derived tenant ID. The function's IAM role restricts access to the bucket, and because every object key is built from that tenant ID prefix, one tenant's requests can never reference another tenant's objects.
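Key construction can be sketched as below. The helper is hypothetical: the date-based layout follows the listing above, and the rejection of path-traversal characters is a defensive assumption, not a documented Mnemora behavior.

```python
def episode_key(tenant_id: str, agent_id: str, date: str, episode_id: str) -> str:
    """Build an S3 object key under the tenant's prefix.

    Hypothetical helper. Rejecting '/' and '..' in identifiers is a
    defense-in-depth assumption: it keeps a malformed ID from escaping
    the tenant prefix, even though tenant_id is already
    authorizer-derived rather than client-supplied.
    """
    for part in (tenant_id, agent_id, episode_id):
        if "/" in part or ".." in part:
            raise ValueError(f"invalid path component: {part!r}")
    return f"{tenant_id}/{agent_id}/{date}/{episode_id}.json"
```

A bucket policy or IAM condition on the key prefix could add a further enforcement layer on top of this application-level construction.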
Example: Support Agent Per Customer
Here's how a SaaS platform would create an isolated support agent for each customer:
```python
from mnemora import MnemoraSync

def handle_customer_message(customer_api_key: str, message: str):
    """Each customer uses their own API key, which scopes to their tenant."""
    with MnemoraSync(api_key=customer_api_key) as client:
        # Search this customer's knowledge base only
        relevant_docs = client.search_memory(
            message,
            agent_id="support-agent",
            top_k=5,
        )

        # Build context from customer-specific memories
        context = "\n".join(doc.content for doc in relevant_docs)

        # Your LLM call here, using the customer-specific context
        response = call_llm(message=message, context=context)

        # Log the interaction to this customer's episodic memory
        client.store_episode(
            agent_id="support-agent",
            session_id=f"ticket-{generate_ticket_id()}",
            type="conversation",
            content={"role": "user", "message": message},
        )

        # Store any new knowledge the agent learned
        if should_store_knowledge(response):
            client.store_memory(
                "support-agent",
                extract_knowledge(response),
                metadata={"source": "conversation"},
            )

        return response
```
Each customer's API key routes all operations to their tenant's isolated data partition. Customer A's support agent knowledge base, conversation history, and state are invisible to customer B — guaranteed at the database level.
Billing Per Tenant
Mnemora tracks usage per API key. Every API call increments a counter in the mnemora-users-dev DynamoDB table:
- api_calls_today: resets daily, enforced against tier limits
- vectors_stored: total semantic memory count
- storage_bytes: total data across all memory types
This per-key tracking means you can bill each customer for their actual agent memory usage. The tier system (Free: 500 calls/day, Starter: 5K, Pro: 25K, Scale: 50K) enforces limits per-key at the authorizer level, before the request reaches any handler.
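The authorizer-level check can be sketched as a simple counter comparison. The tier numbers come from above; the function and mapping names are hypothetical.

```python
# Daily call limits per tier, as listed above
TIER_DAILY_LIMITS = {"free": 500, "starter": 5_000, "pro": 25_000, "scale": 50_000}

def check_rate_limit(tier: str, api_calls_today: int) -> bool:
    """Return True if the request may proceed under the key's tier limit.

    Hypothetical sketch of the authorizer-level check; the real system
    would also atomically increment api_calls_today in DynamoDB as part
    of the same request.
    """
    limit = TIER_DAILY_LIMITS.get(tier, 0)
    return api_calls_today < limit
```

Rejecting over-limit requests in the authorizer means a throttled key never invokes a handler, so heavy use by one tenant costs nothing downstream.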
Why Shared-Nothing Isolation Matters
The shared-nothing model — where each tenant's data is logically separated at every layer — provides several guarantees:
Security: A vulnerability in one tenant's agent logic cannot expose another tenant's data. The isolation is enforced at the database level, not the application level.
Compliance: SOC 2, HIPAA, and GDPR audits require demonstrable data isolation. Partition key isolation in DynamoDB and RLS in Aurora provide auditable, enforceable boundaries.
Data portability: Need to export a tenant's data? Query everything with their partition key prefix. Need to delete it? A single purge_agent call removes all data across all memory types — DynamoDB items, Aurora rows, and S3 objects.
```python
# GDPR right-to-deletion: one API call
with MnemoraSync(api_key=customer_api_key) as client:
    result = client.purge_agent("support-agent")
    print(result)
    # PurgeResponse(state_deleted=15, semantic_deleted=234, episodes_deleted=1891)
```
Performance isolation: DynamoDB's partition-based architecture means one tenant's heavy workload doesn't affect another's read latency. Each tenant's data lives in its own partition space with independent throughput.
Getting Started
If you're adding AI agent features to a SaaS product, multi-tenant memory isolation isn't optional — it's a requirement. Mnemora provides this isolation by default, at every layer, without requiring you to build custom partitioning logic.
- Generate API keys per customer at mnemora.dev/dashboard
- Use each customer's key in their agent's SDK instance
- All data is automatically isolated by tenant
Read the architecture deep dive for more on how the isolation layers work, or jump into the 5-minute tutorial to start building.