Semantic Memory Search: Find Anything Your AI Agent Ever Learned
I spent a Saturday morning trying to find a decision my agent had logged three weeks earlier. I knew I’d told it something about the caching approach for a side project. The memory was in there — I could see a dozen files in the directory. But which one? I grepped for ‘cache.’ Nothing. I grepped for ‘Redis.’ Nothing. Eventually I found it buried in a daily log, filed under a date I couldn’t have guessed, written in language that didn’t use the word ‘cache’ at all.
That’s the exact moment OpenClaw’s otherwise elegant memory design runs into a wall. The files are right there. Human-readable, version-controllable, portable. But there’s no search — just files. And grep only works when you remember the exact words you used.
The fix I found is called memsearch, a standalone Python tool that adds a semantic index on top of your existing OpenClaw markdown files. It doesn’t touch your files, doesn’t replace them, doesn’t require you to change anything about how your agent works. It just makes your memory searchable by meaning. I’ll show you how to set it up — and explain why the counterintuitive part (using BOTH semantic and keyword search at the same time) is what actually makes it work. That part comes after the setup.
Why OpenClaw Memory Breaks Down at Scale
OpenClaw stores memory in plain markdown files — no proprietary database, no hidden infrastructure. It’s one of the things that makes OpenClaw genuinely different from other agent platforms. By early February 2026, OpenClaw had crossed 180,000 GitHub stars, making it one of the fastest-growing open-source projects in history. Part of that appeal is this simple, transparent memory model.
The memory system has two layers. Long-term facts — decisions, preferences, recurring patterns — live in a file called MEMORY.md. Daily running context gets appended to date-stamped log files (memory/YYYY-MM-DD.md). Both are plain text. You can open them in any editor, put them in Git, read them on your phone.
What you can’t do is search them. Not really. Grep finds exact keyword matches. It misses the memory where you described the same decision using different words. Loading entire files into your agent’s context window so the model can ‘read’ them all wastes tokens on irrelevant content. After a few weeks of active use, the memory directory becomes a graveyard of valuable context that’s technically present but practically inaccessible.
What Semantic Memory Search Actually Does
Semantic search (search that understands meaning, not just keywords) works by converting your memory chunks into a mathematical representation of their meaning. When you search for ‘what caching solution did we pick?’, the system finds memories that are conceptually similar — even if they use words like ‘Redis’, ‘TTL’, or ‘cache invalidation strategy’ without ever using the word ‘caching.’
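To make ‘mathematical representation’ concrete, here is a minimal sketch of the comparison step. The three-number vectors are made up by hand for illustration (a real embedding model produces hundreds or thousands of dimensions), and this is not memsearch’s code, only the general idea of ranking by cosine similarity.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, hand-picked so related texts point the same way.
memories = {
    "Went with Redis, 15 minute TTL on session data": [0.9, 0.1, 0.2],
    "Booked dentist appointment for Friday": [0.1, 0.9, 0.1],
}

# Stands in for the embedding of "what caching solution did we pick?"
query_vector = [0.8, 0.2, 0.1]

# Rank memories from most to least similar to the query.
ranked = sorted(
    memories,
    key=lambda text: cosine_similarity(query_vector, memories[text]),
    reverse=True,
)
best_match = ranked[0]
print(best_match)
```

The Redis memory wins even though neither it nor the query shares the word ‘caching’, because the ranking depends only on the direction of the vectors.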
The memsearch tool does this by reading your OpenClaw markdown files, breaking them into chunks, and building a searchable index using Milvus as the database backend. You query it from the command line. It returns the specific chunks most relevant to your question.
Critically, the markdown files stay untouched. The index is a derived cache you can rebuild anytime by re-running one command. If something goes wrong, delete the index and start over. Your memories are exactly where OpenClaw left them.
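The ‘breaking them into chunks’ step can be sketched as below. I’m assuming a split at every markdown heading, which is a common choice; memsearch’s actual chunking rules may differ, and the function name and sample log are hypothetical.

```python
import re


def chunk_markdown(text: str) -> list[str]:
    """Split markdown into chunks, starting a new chunk at each heading line."""
    chunks: list[str] = []
    current: list[str] = []
    for line in text.splitlines():
        # A heading ("# ...", "## ...") closes whatever chunk was in progress.
        if re.match(r"#{1,6}\s", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    # Drop chunks that are empty after stripping whitespace.
    return [c for c in chunks if c]


# Hypothetical daily log in the memory/YYYY-MM-DD.md style.
daily_log = """# 2026-01-14
## Side project
Went with Redis for session data, 15 minute TTL.
## Errands
Renewed the domain.
"""

chunks = chunk_markdown(daily_log)
for chunk in chunks:
    print(repr(chunk))
```

Because the function only reads the text and returns new strings, the source file is never modified, which is the property that makes the index safe to delete and rebuild.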
If you’re exploring agentic AI more broadly, memory retrieval is one of the capabilities that separates a genuine agent from a stateless chatbot. An agent that can’t find its own past decisions is flying blind after the first conversation.
How to Add Semantic Search to Your OpenClaw Memory
You need Python 3.10 or newer. No OpenClaw configuration required — memsearch runs independently alongside your agent. The whole setup takes under 10 minutes.
- Install memsearch with pip:
pip install memsearch
- Run the interactive config wizard to set your embedding provider and other settings:
memsearch config init
- Index your OpenClaw memory directory:
memsearch index ~/path/to/your/memory/
- Search by meaning, not keywords:
memsearch search "what caching solution did we pick?"
- Start the file watcher to keep the index current automatically:
memsearch watch ~/path/to/your/memory/
Step 5 is underrated. The file watcher monitors your memory directory and re-indexes any file that changes. Your agent keeps writing memories the same way it always has — memsearch picks up the changes and updates the index behind the scenes. No manual re-runs.
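If you’re curious what a watcher does conceptually, here is a rough sketch using modification times and nothing but the standard library. This is an illustration under my own assumptions, not how memsearch implements its watcher; the function names are hypothetical.

```python
import tempfile
from pathlib import Path


def snapshot(directory: Path) -> dict[str, int]:
    """Map each markdown file path to its last-modified time in nanoseconds."""
    return {str(p): p.stat().st_mtime_ns for p in directory.rglob("*.md")}


def changed_files(before: dict[str, int], after: dict[str, int]) -> list[str]:
    """Paths that are new, or whose modification time differs between snapshots."""
    return sorted(path for path, mtime in after.items() if before.get(path) != mtime)


# Demo in a throwaway directory standing in for the agent's memory folder.
with tempfile.TemporaryDirectory() as tmp:
    memory_dir = Path(tmp)
    (memory_dir / "MEMORY.md").write_text("Prefers Postgres.\n")
    before = snapshot(memory_dir)

    # The agent writes a new daily log between two polls.
    (memory_dir / "2026-01-14.md").write_text("Went with Redis.\n")
    after = snapshot(memory_dir)

    to_reindex = [Path(p).name for p in changed_files(before, after)]

print(to_reindex)
```

A real watcher would take snapshots in a loop (or subscribe to filesystem events) and hand each changed file to the indexer. The important part is that only the changed file is touched.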
If you’d rather not hand your memory contents to a cloud API, run the fully local setup instead. It works with no API keys:
pip install "memsearch[local]"
memsearch config set embedding.provider local
memsearch index ~/path/to/your/memory/
The local provider computes embeddings on your own machine, so your memories never leave it. For most personal agent setups, this is exactly the right call.
Cloud options include OpenAI, Google, and Voyage. The config wizard walks you through whichever you choose. You can switch providers later — just re-run memsearch config init and then rebuild the index.
Why Pure Vector Search Isn’t Enough (This Is the Part People Get Wrong)
Here’s the counterintuitive part I promised earlier. The reflex assumption is: semantic search means vector search. You convert text to numbers, find the closest match, done. That’s the mental model most people carry.
The problem is that pure semantic search fails on exact matches. Ask for ‘GPT-4o response from the March 12 session’ and a system that understands meaning but not keywords might drift toward conceptually similar memories while missing the specific one you want. Conversely, pure keyword search (grep, BM25) finds exact matches but misses anything you phrased differently.
memsearch uses both simultaneously. Dense semantic search handles meaning-based queries. BM25 full-text search handles exact terms. The results from both get combined and re-ranked using a technique called Reciprocal Rank Fusion — which, stripped of jargon, means ‘a result that shows up high in both lists ranks higher than one that only shows up in one.’ The best of both approaches, blended.
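Reciprocal Rank Fusion is short enough to sketch in full. Each result earns 1 / (k + rank) from every list it appears in, and the sums decide the final order. The constant k = 60 is the value commonly used in the literature; whether memsearch uses the same constant is an assumption on my part, and the chunk IDs below are invented.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists; each appearance adds 1 / (k + rank) to a score."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest combined score first.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical result lists, best match first.
dense_results = ["chunk-redis", "chunk-ttl", "chunk-dentist"]   # semantic search
bm25_results = ["chunk-api-key", "chunk-redis", "chunk-ttl"]    # keyword search

fused = reciprocal_rank_fusion([dense_results, bm25_results])
print(fused)
```

Notice that chunk-api-key was the top keyword hit, yet it lands below both chunks that appeared in the two lists. RRF rewards agreement between retrievers over a single strong vote, and it only needs ranks, so the two scoring scales never have to be normalized against each other.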
This matters practically. Memory retrieval involves two types of queries: ‘what did we decide about performance?’ (meaning-based) and ‘find the entry from last Thursday about the API key’ (exact-match). A pure semantic system handles the first well. A pure keyword system handles the second. Hybrid handles both — which is why real agent memory retrieval systems, including OpenClaw’s own internal implementation using SQLite with FTS5 and sqlite-vec, are built this way.
Where This Approach Falls Apart
It’s not bulletproof. Here’s what breaks or surprises people:
- The index can lag if the file watcher crashes. The watcher process isn’t a system service: if you close the terminal or it errors out, new memory files won’t be indexed until you restart it or manually re-run memsearch index. Run the watcher as a background service, or schedule memsearch index with cron, if you want reliable sync.
- Very short memory chunks hurt retrieval quality. If your agent logs one-line memories (‘Decided: Redis’), the semantic search has almost nothing to work with. Richer, more descriptive memory entries get better retrieval results. This is a prompt engineering problem, not a memsearch problem, but it shows up here.
- Cloud embedding costs scale with memory size. SHA-256 hashing means you only pay to index new or changed content — unchanged files are never re-embedded. But if you’re storing memories aggressively and using a paid embedding API (like OpenAI), watch your usage. Local embeddings eliminate this entirely.
- Milvus requires a running instance. memsearch uses Milvus as its database backend. You’ll need it running locally or accessible on your network. The setup is straightforward, but it’s not zero-overhead — account for it if you’re running this on a resource-constrained machine.
- Access control is your responsibility. memsearch doesn’t know which memories should be visible to which users or agents. If you’re running a multi-agent setup where different agents have different memory permissions, you need to handle that at the directory and indexing level — not through the search layer.
That last point deserves emphasis if you’re running multiple agents. Prompt-level access control for memory is unreliable — it depends on the model choosing to comply, which is the wrong trust boundary for anything sensitive. Keep memory directories separate per agent and index them separately.
How to Know Your Memory Search Is Working
- Run a test query using phrasing you know is NOT in your memory files. If results come back that are conceptually related, semantic search is working.
- Check the index covers your full memory directory: re-run memsearch index ~/path/to/memory/ and confirm the output shows only new/changed files being embedded, not the full set. This confirms SHA-256 deduplication is active.
- Test an exact-match query (a specific name, date, or phrase you know appears verbatim). Confirm it surfaces at or near the top of results.
- Start the file watcher, then have your agent write a new memory. Within seconds, query for something in that memory. If it appears, the live sync is working.
- If you’re using local embeddings, confirm no network requests are going out during indexing. Any network activity suggests you’re still pointed at a cloud provider.
Your Monday Morning Memory Setup
If you’ve got an OpenClaw agent with more than a week of memory files and you’ve never been able to search them properly, here’s how to fix it this week:
- Install memsearch: pip install memsearch (or pip install "memsearch[local]" if you want zero API costs). This takes under 2 minutes.
- Run memsearch config init and choose your embedding provider. If you’re unsure, pick ‘local’: it’s free, private, and plenty fast for personal agent memory.
- Run memsearch index ~/path/to/your/openclaw/memory/ and let it build the initial index. Expect the first run to take a few minutes depending on memory size. Every run after this will be fast, because only new or changed files get re-embedded.
- Test it immediately. Run memsearch search "[a decision you remember making but can't find]" and check whether the right memory chunk comes back. If it does, you’re done. If results look wrong, check your embedding provider config.
- If you’re happy with results, start the file watcher as a background process: memsearch watch ~/path/to/your/openclaw/memory/ &. Budget roughly $0–5/month for cloud embedding costs at personal-use scale, or $0 if you’re running local.
- Optional: set the watcher to start on login. On macOS, add it to your shell profile. On Linux, create a systemd user service. On Windows, add it to Task Scheduler. This takes 5 minutes and means you never have to think about it again.
For the best AI agents to be genuinely useful over time, memory has to be findable — not just stored. This is the setup that closes that gap without touching your agent config or changing how OpenClaw works.
What This Means for Your Agent Memory Strategy
- OpenClaw’s markdown memory is elegant and portable — but keyword-only search (grep) stops working as soon as memories grow beyond a few weeks. Semantic memory search is the fix.
- memsearch adds a searchable index on top of your existing files without modifying them. Your markdown files stay the source of truth; the index is just a derived cache you can rebuild anytime.
- Hybrid search — combining meaning-based retrieval with keyword matching — outperforms either approach alone. This is why systems like OpenClaw’s own memory layer use both, and why memsearch does too.
- SHA-256 content hashing means re-indexing only processes new or changed files. You can run it as often as you like at effectively zero cost after the initial index is built.
- For personal agent setups where privacy matters, the fully local embedding option requires no API keys and no cloud access. It’s the right default for most individual users.
- My thinking on memory access control is still evolving — especially as multi-agent setups get more common. The short answer: handle permissions at the directory level, not the search layer. More on that in a future piece.
Frequently Asked Questions
Do I need to change how my OpenClaw agent writes memories?
No. memsearch reads your existing markdown files and builds an index alongside them. Your agent keeps writing memories exactly as it does today. The only change is that you now have a way to search them by meaning.
What happens if I delete or edit a memory file?
If you’re running the file watcher, it will detect the change and update the index automatically. If you’re not using the watcher, re-run memsearch index after making changes. The SHA-256 hashing system ensures only affected chunks are re-processed — the rest of the index stays intact.
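The hashing idea is simple enough to sketch. Hash each chunk’s text, remember the hashes you’ve already embedded, and skip anything whose hash you’ve seen. The function names and the in-memory set are my own stand-ins; memsearch’s internal bookkeeping may be organized differently.

```python
import hashlib


def content_hash(chunk: str) -> str:
    """SHA-256 hex digest of a chunk's text; identical text gives an identical hash."""
    return hashlib.sha256(chunk.encode("utf-8")).hexdigest()


def chunks_to_embed(chunks: list[str], indexed_hashes: set[str]) -> list[str]:
    """Keep only chunks whose hash is not already in the index."""
    return [c for c in chunks if content_hash(c) not in indexed_hashes]


# First indexing run: everything is new, so every hash gets recorded.
first_run = ["Went with Redis, 15 minute TTL.", "Renewed the domain."]
indexed = {content_hash(c) for c in first_run}

# Later, one chunk was edited and the other is untouched.
second_run = ["Went with Redis, 30 minute TTL.", "Renewed the domain."]
pending = chunks_to_embed(second_run, indexed)
print(pending)
```

Only the edited chunk comes back as pending, so only that chunk would be sent to the embedding provider.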
Is this safe to use with sensitive personal memories?
It depends on your embedding provider. If you use a cloud provider like OpenAI, your memory content is sent to their API during indexing. If privacy matters, use the local embedding option (pip install "memsearch[local]") — your content never leaves your machine. The local option works well for personal agent setups.
Can I use this with other agent frameworks besides OpenClaw?
Yes. memsearch is a standalone tool — it indexes any directory of markdown files. If your agent stores memory as markdown, this works. The tool was built with OpenClaw’s memory structure in mind, but it’s not locked to it.
How much does it cost to run?
If you use local embeddings, the ongoing cost is zero — just compute on your own machine. If you use a cloud embedding API like OpenAI’s, costs scale with how much new content gets indexed. For personal use with one agent, expect well under $5/month at typical memory growth rates. SHA-256 deduplication means you’re never paying to re-index content that hasn’t changed.
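For a rough sense of scale, the arithmetic looks like this. Both inputs are placeholders I picked for illustration, not quoted prices or measured volumes: check your provider’s current per-token rate and your own agent’s output before trusting the number.

```python
def monthly_embedding_cost(
    new_tokens_per_day: int,
    usd_per_million_tokens: float,
    days: int = 30,
) -> float:
    """Estimated monthly spend when only new or changed content is embedded."""
    total_tokens = new_tokens_per_day * days
    return total_tokens / 1_000_000 * usd_per_million_tokens


# Assumed values: 20,000 new tokens of memory per day, $0.02 per million tokens.
estimate = monthly_embedding_cost(
    new_tokens_per_day=20_000,
    usd_per_million_tokens=0.02,
)
print(f"${estimate:.3f} per month")
```

Under those assumptions the bill is about a cent a month. Even at a rate a hundred times higher it would stay well under the $5 mark, which is why deduplication matters more than the choice of provider at personal scale.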