
CLI Tools · intermediate

How to Build AI Agent Memory That Persists Across Sessions

Session handoffs, memory files, and the architecture that makes AI remember

by Shawn Tenam

The Problem

Every AI session starts with amnesia. Claude Code opens, reads your CLAUDE.md, and knows your project conventions. But it does not know what you worked on yesterday, what decisions were made, what is blocked, or what the next step is. You re-explain the context every session. This wastes time and tokens. The fix is a memory system - structured documents that carry state between sessions. Not a chat history dump. Not a vague summary. A system that gives each new session exactly the context it needs to continue where the last one left off.

PATTERN

The Three Memory Layers

A production-grade AI memory system has three layers. Each serves a different purpose and persists differently.

Layer 1: Session Handoffs (episodic memory). What happened in the last session. What was built, what decisions were made, what is the current state, what should happen next. Written at the end of every session. Read at the start of the next. Short-lived - yesterday's handoff matters, last week's does not.

Layer 2: Auto-Memory (semantic memory). Stable facts about the project and the user. Preferences, conventions, key architectural decisions, important file paths. Persists indefinitely. Updated when new facts are confirmed across multiple sessions. This is the MEMORY.md file that Claude Code manages in its projects directory.

Layer 3: Knowledge Base (procedural memory). How to do things. Skills, workflows, patterns, templates. The CLAUDE.md, skills files, and wiki entries. Persists as long as the project exists. Evolves slowly through deliberate updates.

Layer 1 changes daily. Layer 2 changes weekly. Layer 3 changes monthly. Each layer loads at a different time and serves a different function in the context window.
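One way to keep the three layers straight is to write the routing rule down as data. The sketch below is illustrative only: the `NoteKind` enum, the `Layer` dataclass, and the `route` function are hypothetical names, and the field values simply restate the descriptions above.

```python
from dataclasses import dataclass
from enum import Enum


class NoteKind(Enum):
    """Hypothetical categories for something worth remembering."""
    SESSION_STATE = "session state"  # what happened, what is blocked, what is next
    STABLE_FACT = "stable fact"      # preferences, conventions, key decisions
    PROCEDURE = "procedure"          # skills, workflows, templates


@dataclass(frozen=True)
class Layer:
    number: int
    memory_type: str
    stored_in: str
    changes: str


# Values restate the three layers described in the article.
LAYERS = {
    NoteKind.SESSION_STATE: Layer(1, "episodic", "session handoff file", "daily"),
    NoteKind.STABLE_FACT: Layer(2, "semantic", "MEMORY.md", "weekly"),
    NoteKind.PROCEDURE: Layer(3, "procedural", "CLAUDE.md, skills files, wiki", "monthly"),
}


def route(kind: NoteKind) -> Layer:
    """Return the layer where a note of this kind belongs."""
    return LAYERS[kind]


if __name__ == "__main__":
    for kind in NoteKind:
        layer = route(kind)
        print(f"{kind.value} goes to Layer {layer.number} ({layer.stored_in})")
```

The point of the table is that every note has exactly one home, which keeps session-specific details out of the long-lived files.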

CODE

Building Session Handoffs

A session handoff is a structured Markdown document written at the end of every Claude Code session. It answers five questions:

1. What was done? - List of completed work with specific files changed
2. What is the current state? - Git branch, uncommitted changes, build status
3. What decisions were made? - Key choices and their reasoning
4. What is blocked? - Dependencies, waiting on external input, open questions
5. What should happen next? - Prioritized list of next steps

The handoff goes to a timestamped file: ~/.claude/handoffs/YYYY-MM-DD_HHMMSS_slug.md. Timestamped names prevent conflicts when multiple sessions run in parallel.

At session start, Claude Code reads all unconsumed handoffs (files not ending in _done.md), processes them, then renames each one by appending _done before the extension, so slug.md becomes slug_done.md. Consumed handoffs are deleted after 7 days.

This is parallel-safe. Two Claude Code sessions in different terminals can both write handoffs without overwriting each other. The next session reads all of them and gets the combined context.
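The handoff lifecycle described above (write, consume, clean up) can be sketched in a few functions. This is a minimal illustration, not the mechanism Claude Code itself uses: `write_handoff`, `consume_handoffs`, and `cleanup_done` are hypothetical helper names, the slug sanitizing is an added assumption, and age is measured from the file's modification time, which is also an assumption.

```python
from __future__ import annotations

import re
import time
from datetime import datetime
from pathlib import Path

# Directory and naming scheme come from the article; everything else is illustrative.
HANDOFF_DIR = Path.home() / ".claude" / "handoffs"
DONE_SUFFIX = "_done.md"
MAX_AGE_DAYS = 7


def write_handoff(slug: str, body: str, root: Path = HANDOFF_DIR,
                  now: datetime | None = None) -> Path:
    """Write a handoff to YYYY-MM-DD_HHMMSS_slug.md and return its path."""
    now = now or datetime.now()
    # Assumption: reduce the slug to lowercase letters, digits, and hyphens.
    safe_slug = re.sub(r"[^a-z0-9]+", "-", slug.lower()).strip("-") or "session"
    root.mkdir(parents=True, exist_ok=True)
    path = root / f"{now:%Y-%m-%d_%H%M%S}_{safe_slug}.md"
    # Mode "x" raises FileExistsError instead of overwriting an existing handoff.
    with path.open("x", encoding="utf-8") as fh:
        fh.write(body)
    return path


def consume_handoffs(root: Path = HANDOFF_DIR) -> list[str]:
    """Return the text of every unconsumed handoff, oldest first, and mark each as done."""
    if not root.is_dir():
        return []
    texts = []
    # Timestamped names sort chronologically when sorted as strings.
    for path in sorted(root.glob("*.md")):
        if path.name.endswith(DONE_SUFFIX):
            continue
        texts.append(path.read_text(encoding="utf-8"))
        path.rename(path.with_name(path.stem + DONE_SUFFIX))
    return texts


def cleanup_done(root: Path = HANDOFF_DIR, max_age_days: int = MAX_AGE_DAYS,
                 now: float | None = None) -> int:
    """Delete consumed handoffs older than max_age_days; return how many were removed."""
    if not root.is_dir():
        return 0
    cutoff = (time.time() if now is None else now) - max_age_days * 86400
    removed = 0
    for path in root.glob("*" + DONE_SUFFIX):
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed += 1
    return removed
```

Two sessions that finish in the same second with the same slug would collide; opening with mode "x" makes that case fail loudly rather than silently overwrite the other session's handoff.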

PATTERN

Auto-Memory That Actually Works

Claude Code has a built-in auto-memory system. It writes to a MEMORY.md file in its project config directory. This file loads into context at session start.

Rules for effective auto-memory:

Save stable patterns confirmed across multiple interactions. Not every one-off fact - patterns you see recurring. "User prefers 2-space indentation" is stable. "User is working on the auth module today" is session-specific and belongs in a handoff, not memory.

Organize by topic, not chronology. Create separate files for different domains (debugging.md, infrastructure.md, voice-rules.md) and link to them from MEMORY.md. This keeps the root memory file lean.

Keep MEMORY.md under 200 lines. Lines after 200 get truncated when loaded into context. Put the most important facts first. Move details to topic files.

Update or remove stale memories. If a convention changes, update the memory. If a decision was reversed, delete the old memory. Stale memories cause more harm than no memories because the AI follows outdated instructions with confidence.

Save explicit user requests immediately. If the user says "always use bun instead of npm," save it now. Do not wait for multiple confirmations.
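Two of these rules are mechanical enough to check with a script: the 200-line budget and the links to topic files. The sketch below is a hypothetical lint helper, not part of Claude Code. It assumes topic files are linked with Markdown-style `[label](file.md)` links relative to the directory that holds MEMORY.md; adjust the pattern if your links look different.

```python
import re
from pathlib import Path

MAX_LINES = 200  # line budget stated in the article

# Assumption: topic files are referenced as Markdown links ending in .md.
LINK_RE = re.compile(r"\]\(([^)\s#]+\.md)\)")


def overflow_lines(text: str, max_lines: int = MAX_LINES) -> int:
    """Return how many lines fall past the budget (0 if the file fits)."""
    return max(0, len(text.splitlines()) - max_lines)


def missing_topic_files(text: str, base_dir: Path) -> list:
    """Return linked topic files that do not exist under base_dir."""
    return [target for target in LINK_RE.findall(text)
            if not (base_dir / target).is_file()]


def lint_memory(memory_path: Path) -> list:
    """Collect human-readable warnings for one MEMORY.md file."""
    text = memory_path.read_text(encoding="utf-8")
    warnings = []
    extra = overflow_lines(text)
    if extra:
        warnings.append(f"{extra} line(s) past the {MAX_LINES}-line budget")
    for target in missing_topic_files(text, memory_path.parent):
        warnings.append(f"linked topic file not found: {target}")
    return warnings
```

Running `lint_memory` at the end of a session gives an early signal that details should move out to a topic file before they fall past the cutoff.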

PRO TIP

The Full System in Practice

Here is how the three layers work together in a real daily workflow.

9 AM: Open Claude Code. It reads CLAUDE.md (Layer 3 - knows the project). It reads MEMORY.md (Layer 2 - knows my preferences). It reads yesterday's handoff (Layer 1 - knows what happened yesterday). In 10 seconds, Claude has the context of a teammate who was here yesterday.

During the session: Claude uses Layer 3 to follow project conventions. It uses Layer 2 to match my preferences. It uses Layer 1 to continue the previous session's work without me re-explaining anything.

End of session: Claude writes a new handoff (Layer 1). If any stable patterns were discovered, it updates MEMORY.md (Layer 2). If a workflow changed, the relevant skill gets updated (Layer 3).

The compound effect: after a week of this, the system knows my project deeply. After a month, it handles 80% of routine work without me providing context. After three months, a new Claude session is more productive in its first minute than a new human developer in their first day.

The key insight: memory is not a single file. It is three layers that serve different time horizons. Build all three and the compounding effect is dramatic.
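The 9 AM load order can be made concrete with a small sketch that assembles the three layers into one block of text. Claude Code loads CLAUDE.md and MEMORY.md on its own, so this only illustrates the ordering. `build_context` is a hypothetical function, the paths are passed in rather than assumed, and the section titles are invented for readability.

```python
from pathlib import Path

MEMORY_LINE_LIMIT = 200   # only the first 200 lines of MEMORY.md are kept
DONE_SUFFIX = "_done.md"  # consumed handoffs are skipped


def build_context(claude_md: Path, memory_md: Path, handoff_dir: Path) -> str:
    """Concatenate Layer 3, Layer 2, then pending Layer 1 handoffs, in that order."""
    sections = []

    # Layer 3: project conventions and workflows.
    if claude_md.is_file():
        sections.append(("Layer 3: knowledge base",
                         claude_md.read_text(encoding="utf-8")))

    # Layer 2: stable facts, truncated to the line budget.
    if memory_md.is_file():
        lines = memory_md.read_text(encoding="utf-8").splitlines()
        sections.append(("Layer 2: auto-memory",
                         "\n".join(lines[:MEMORY_LINE_LIMIT])))

    # Layer 1: every unconsumed handoff, oldest first (names sort by timestamp).
    if handoff_dir.is_dir():
        for path in sorted(handoff_dir.glob("*.md")):
            if not path.name.endswith(DONE_SUFFIX):
                sections.append((f"Layer 1: handoff {path.name}",
                                 path.read_text(encoding="utf-8")))

    return "\n\n".join(f"## {title}\n{body.strip()}" for title, body in sections)
```

Slow-changing material goes first and the most recent session state goes last, so the freshest context sits closest to the start of the new conversation.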
