How We Manage Memory

Conversation history, context windows, and persistent agent memory

How We Manage Memory

Agents need memory to maintain context across conversations. Without it, every message starts fresh. This guide covers how we think about agent memory.

Core principle: Remember what matters, forget what doesn't. Context windows are finite.

Types of Memory

TypeScopeStorageUse Case
ConversationSingle chatIn-memory or DBCurrent chat context
SessionUser sessionDatabaseMulti-conversation context
Long-termPermanentDatabaseUser preferences, facts

Each type requires different strategies.

Conversation Memory

The most common type: remembering what was said in this conversation.

The Core Challenge

LLMs have limited context windows. You can't include every message from a long conversation.

Memory Strategies

StrategyWhen to UseTrade-off
Keep allShort conversationsRuns out of space
Keep recent NMost conversationsLoses old context
Summarize oldLong conversationsLoses detail
SelectiveImportant conversationsComplex to implement

Token Budgeting

Plan your context window usage:

ComponentTypical Budget
System prompt1-2K tokens
Tools2-3K tokens
History40-60K tokens
Current message2-4K tokens
Response reserve4K tokens

When history exceeds budget, you must trim or summarize.

Trimming Approaches

Simple: Remove oldest messages when over budget.

Smarter: Summarize older messages, keep recent ones verbatim.

Selective: Keep messages marked as important, summarize the rest.

When to Summarize

Summarization is useful when:

  • Conversation exceeds 30+ messages
  • Important context is in older messages
  • You need to preserve facts but not exact wording

Summarization adds latency (LLM call) and loses detail. Use only when needed.

Session Memory

Remembering context across multiple conversations with the same user.

What to Remember

RememberDon't Remember
User preferences (language, style)Temporary task details
Key facts (company, role)One-off instructions
Recent topicsEverything
Explicit correctionsGuesses about preferences

Memory-Enhanced Prompts

Include relevant memory in the system prompt:

  • "User prefers Vietnamese"
  • "User works at [Company] doing SEO"
  • "Recent topics: keyword clustering, content optimization"

This gives the agent context without cluttering conversation history.

Extracting Memory

Automatically identify what to remember from conversations:

Look for:

  • Explicit statements: "I prefer..." "I work at..."
  • Corrections: "No, I meant..." "Actually..."
  • Repeated patterns: User always asks for X

Avoid:

  • Guessing preferences from single interactions
  • Storing temporary task context
  • Keeping everything

Long-term Memory

Persistent facts and preferences that last beyond sessions.

Storage Structure

Two tables typically:

  1. Conversations table: Conversation metadata
  2. Messages table: Individual messages with conversation_id
  3. User memory table: Key-value pairs for persistent facts

Memory Keys

Organize long-term memory by type:

KeyExample Value
preferences.language"vi"
preferences.style"professional"
facts["Works at SEO agency", "Focus on Vietnamese market"]
recent_topics["keyword clustering", "content optimization"]

Memory Hygiene

Long-term memory needs maintenance:

  • Expiry: Facts can become stale
  • Limits: Don't store unlimited facts
  • Updates: New information should replace old
  • User control: Users should be able to clear memory

Context Window Management

The Problem

A 10-message conversation might fit. A 100-message conversation won't.

The Solution

  1. Budget: Know how much space you have
  2. Prioritize: Recent messages > old messages
  3. Summarize: Compress old context when needed
  4. Truncate: Remove if summarization isn't worth it

Prioritization

What to keep when space is limited:

PriorityContent
Always keepSystem prompt, tools, current message
High priorityLast 10 messages
Medium priorityUser facts, summarized history
Low priorityDetailed old messages

Common Mistakes

Storing everything

Signs: Database grows huge, context windows overflow, costs spike.

Fix: Be selective. Not every message needs permanent storage.

No memory limits

Signs: Old conversations slow down, token limits hit.

Fix: Implement summarization, set max history length.

Ignoring memory in prompts

Signs: Agent keeps asking same questions, ignores user context.

Fix: Include relevant memory in system prompt.

Memory without cleanup

Signs: Stale facts, outdated preferences, conflicting information.

Fix: Add expiry, allow users to reset, periodic cleanup.

Forgetting mid-conversation

Signs: Agent loses track of what was discussed.

Fix: Check if trimming is too aggressive. Consider summarization.

Evaluation Checklist

Your memory system is working if:

  • Agent remembers context within conversation
  • User preferences persist across sessions
  • Old conversations don't slow things down
  • Memory stays within token budgets
  • Users can clear/reset memory

Your memory system needs work if:

  • Agent forgets mid-conversation
  • Every conversation starts fresh (when it shouldn't)
  • Database size growing unbounded
  • Context window errors appearing
  • Stale information being used

Quick Reference

Memory Types

TypePersistenceStrategy
ConversationSessionTrim or summarize old
SessionUser lifetimeKey-value storage
Long-termPermanentExplicit extraction

Token Budget Template

ComponentTokens
System prompt1,500
Tools2,500
User memory500
Conversation history40,000
Current message3,000
Response reserve4,000
Total~51,500

What to Extract for Long-term Memory

Extract:

  • Explicit preferences
  • Stated facts about user/company
  • Corrections and clarifications
  • Repeated patterns

Don't extract:

  • Temporary task context
  • Single-interaction guesses
  • Everything