One of the most frustrating things about coding with AI is starting a new session and feeding it the context you expect it to know.
Right now, coding with AI feels less like the future and more like the plot of 50 First Dates.
If I’m Adam Sandler, then my agent is Drew Barrymore. Yesterday we had a breakthrough: built a great feature, fixed a bug the AI clearly shouldn’t have introduced, bonded over a shared hatred of jQuery. Then I hit the daily limit on Claude and am forced to switch to Codex. Or I open a new session to multitask across different parts of the code.
I have to prime it with the architectural decisions, gotchas, and why we changed the timeout from 30 seconds to 90 seconds. It’s like I have to woo it all over again.
It’s exhausting. I’m lazy. So I do exactly what Adam Sandler did in the movie: make a "tape" it can watch every morning to catch up. We are used to using AI like a calculator (input → output → reset). But to get real value, we need to treat it like a colleague with tenure and experience.
The Evolution: From "Lookup" to "Learning"
Just a few years ago, LLM context windows were small (4K tokens!) and developers had to build “traditional RAG” pipelines to manage tokens. Now that context windows are much larger (1M+ tokens!), LLMs can decide how to perform retrieval themselves to optimize their own context windows. If we enhance that with an external shared memory, that’s how we 10x productivity for lazy builders like me.
Stage 1: Traditional RAG (The "Chatbot")
The Workflow: The system gets the input, spits out an answer, and then immediately forgets it ever happened. No learning, no memory, and zero chance of connecting the dots between separate ideas.
The Limitation: If a user mentions their favorite color in one conversation and their birthday in another, a RAG-based system won't correlate these facts to suggest personalized recommendations. Embedding-based retrieval matches each query against stored chunks one at a time; it has no mechanism for connecting facts that were never stated together.
The Vibe: It’s like a student who can look up answers in a textbook but lacks critical thinking. It retrieves facts in isolation—it can tell you what happened on page 42 and what happened on page 105, but it has no idea how they are related.
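To make the statelessness concrete, here is a toy sketch in Python. The embed function and the two "documents" are stand-ins I invented; a real system would call an embedding model and a vector store, but the shape is the same: retrieve, answer, keep nothing.

```python
# Stage 1 sketch: retrieve the nearest chunk, answer from it, retain no state.
def embed(text: str) -> list[float]:
    # toy embedding: letter-frequency vector (a real system calls an embedding model)
    return [text.lower().count(c) for c in "abcdefghijklmnopqrstuvwxyz"]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return dot / norm if norm else 0.0

DOCS = ["The user's favorite color is green.", "The user's birthday is March 3."]

def answer(query: str) -> str:
    # retrieve the single closest chunk and answer from it alone
    best = max(DOCS, key=lambda d: cosine(embed(d), embed(query)))
    return f"Based on: {best!r}"

print(answer("What color does the user like?"))
print(answer("When is the user's birthday?"))  # two correct answers, zero connection between them
```

Each call stands alone: nothing from the first answer informs the second.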

Stage 2: Agentic RAG (The "Smart Researcher")
The Workflow: The agent doesn't just ask once. It actively manages the process. It refines its own queries, uses reasoning to decide where to look, and treats retrieval as a sophisticated tool rather than a script.
The Engine: It doesn't just ingest text and prompts; it enables continuous learning. Through feedback loops, the agent’s insights can actually update the knowledge base, creating a cycle of improvement rather than a one-off answer. Agents aren’t just matching keywords; they’re finding meaning.
The Vibe: It’s like upgrading from a dumb chatbot to a smart researcher. You don't just hand them a list of questions; you give them a topic. They traverse the data, follow themes, connect the dots, and come back with a solution that actually fits the context.
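Here is a rough sketch of that loop, again in Python. The search, refine, and SYNONYMS pieces are toy stand-ins for what would really be LLM tool calls, but they show the two behaviors that matter: retrying with a refined query, and writing a verified insight back into the knowledge base.

```python
# Stage 2 sketch: the agent drives retrieval and can update the knowledge base.
KNOWLEDGE = {"timeout": "API timeout is 30s.", "jquery": "Legacy pages still load jQuery 1.x."}
SYNONYMS = {"give up": "timeout", "old library": "jquery"}

def search(query: str) -> str | None:
    return next((v for k, v in KNOWLEDGE.items() if k in query.lower()), None)

def refine(query: str) -> str:
    # a real agent would ask the LLM to rewrite the query; here, a toy synonym table
    for phrase, term in SYNONYMS.items():
        if phrase in query.lower():
            return term
    return query

def agent(query: str, max_steps: int = 3) -> str:
    for _ in range(max_steps):
        if (hit := search(query)) is not None:
            return hit
        query = refine(query)  # didn't find it: rephrase and look again
    return "No answer found."

print(agent("How long until requests give up?"))  # refined to "timeout", then answered
# the feedback loop: a confirmed insight updates the knowledge base for next time
KNOWLEDGE["timeout"] = "API timeout is 90s (raised from 30s for slow cold starts)."
print(agent("How long until requests give up?"))
```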
Stage 3: Agent Memory (The "Shared Notebook")
This is the shift we are seeing now with tools like AGENTS.md and CLAUDE.md. It is the persistence layer that transforms distinct interactions into a continuous relationship. Unlike RAG (which retrieves static external data), memory creates new internal data based on experience. It is the ability to retain state, user preferences, and learned behaviors across sessions.
The Workflow: The agent isn't just Reading and Writing; it is Compounding and Sharing.
Save Learnings: If the agent figures out a tricky bug in your repo, it doesn't just fix it—it records how it fixed it so it never struggles with that pattern again.
Share Findings: This memory isn't siloed. In advanced setups, a frontend agent can learn a design rule and "write" it to the memory, making it immediately available to the backend agent.
The Vibe: It’s like upgrading from a researcher to a community of researchers. They don't just research topics; they remember the client's history, they know the firm's unwritten rules, and they teach the juniors (other agents) how to do the job better next time.
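Stripped to its skeleton, the memory layer is almost embarrassingly simple, and that's the point. A minimal sketch, assuming a shared CLAUDE.md file; the helper names are mine, and the example rule echoes the timeout story from earlier:

```python
# Stage 3 sketch: learnings persist across sessions and across agents.
from pathlib import Path

MEMORY = Path("CLAUDE.md")  # any shared memory file; other agents can reach it via symlinks

def save_learning(note: str) -> None:
    """Append a rule so the next session (or another agent) starts out already knowing it."""
    with MEMORY.open("a", encoding="utf-8") as f:
        f.write(f"- {note}\n")

def recall() -> str:
    """What a fresh session reads before touching the code: the 'tape'."""
    return MEMORY.read_text(encoding="utf-8") if MEMORY.exists() else ""

save_learning("Use a 90s API timeout, not 30s: the PDF service has slow cold starts.")
print(recall())
```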
Claude’s Memory According to Anthropic
Claude Code offers four memory locations in a hierarchical structure, each serving a different purpose:
| Memory Type | Location | Purpose | Use Case Examples | Shared With |
|---|---|---|---|---|
| Project memory | ./CLAUDE.md | Team-shared instructions for the project | Project architecture, coding standards, common workflows | Team members via source control |
| Project rules | ./.claude/rules/*.md | Modular, topic-specific project instructions | Language-specific guidelines, testing conventions, API standards | Team members via source control |
| User memory | ~/.claude/CLAUDE.md | Personal preferences for all projects | Code styling preferences, personal tooling shortcuts | Just you (all projects) |
| Project memory (local) | ./CLAUDE.local.md | Personal project-specific preferences | Your sandbox URLs, preferred test data | Just you (current project) |
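For a concrete picture, here's what a minimal project-level CLAUDE.md might look like. The stack and rules below are invented placeholders; the point is the categories, not the contents:

```markdown
# CLAUDE.md (project memory, checked into source control)
<!-- everything below is an invented example -->

## Architecture
- Next.js frontend, FastAPI backend, Postgres on Supabase

## Coding standards
- TypeScript strict mode everywhere; no jQuery, ever

## Gotchas
- API timeout is 90s, not 30s: the PDF service has slow cold starts
```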
How I Use Memory and AGENTS.md
I’m an indie developer and don’t exclusively use Claude Code, so I maintain a single AGENTS.md file that all my agents (Codex and Gemini) share. Side note: the file is actually called replit.md because I started my project in Replit. My setup:
1. Symlink the .md files so every agent reads the same content (a small sketch of this step follows the list). I also created other general .md files, such as calendly_integration_research.md, so any agent can easily access a single source of truth.
2. Every time I make a breakthrough and want the agents to remember the moment, I manually prompt them to update the file.
3. Agents either read these files by default at the start of every session, or I prompt them to read the relevant file before a specific task.
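The symlink step is a one-time setup. Here is a Python sketch of what it does; the filenames match my setup, and you could just as well run ln -s by hand:

```python
# One-time setup: point every agent's instruction file at a single source of truth.
# (Assumes a POSIX filesystem; replit.md is the file my project happened to start with.)
from pathlib import Path

SOURCE = "replit.md"

for alias in ("AGENTS.md", "CLAUDE.md", "GEMINI.md"):
    link = Path(alias)
    if not link.exists():
        link.symlink_to(SOURCE)  # Codex, Claude, and Gemini now read the same memory
```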
How Claude Code Creator Uses Memory
For a while, using memory felt like a hack. Then Boris Cherny (the creator of Claude Code) dropped a tweet that basically confirmed this is the right way to do it.
He shared how his team uses a CLAUDE.md file as a persistent memory layer. It wasn’t just a "nice to have." It was core to how they built the tool itself.
When the creator of the tool tells you how to hack it, you listen. The CLAUDE.md isn't just a config file. It is the agent's brain. It bridges the gap between the "forgetful" API and the reality of building actual products.
Based on Boris's approach, here is the playbook:
1. Create the Brain (Facts). Make a CLAUDE.md file in your project folder. Dump your high-level context here: architecture decisions, styling preferences, and the specific libraries you use. Anyone can update this file. Claude reads it before it starts working.
2. Treat Memory Like Code (Skills). Don't leave your prompt strategies in the chat interface. Save them in Git. Your "memory" should be tracked just like your code. Share it with others.
3. The Failure Loop (History). This is the most important part. When the agent messes up, don't just correct the output in the chat. Go to CLAUDE.md and add a rule so it doesn't happen again.
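To make the failure loop concrete, here is the kind of entry I mean. The rules themselves are invented examples; yours will come from your agent's actual mistakes:

```markdown
## Rules (added after failures)
<!-- invented examples; each line exists because an agent got it wrong once -->
- Don't "fix" a failing test by deleting it; find the regression
- All currency math uses integer cents, never floats
- Never run migrations against the prod database from a dev session
```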
DRY (Don’t Repeat Yourself)
The goal isn't to get the answer right this one time. The goal is to never have to ask the same question twice.
If your agent could remember just one rule about your workflow that would save you 10 minutes a day, what would it be?
Go write it down in that markdown file.

