When we started building Flumes, we didn’t set out to reinvent memory infrastructure. We just wanted to give our AI agents a way to remember things.
That turned out to be a much harder problem than we expected.
This post is a look behind the scenes: what we tried, what broke, and what we ultimately learned in designing a purpose-built memory stack for AI agents.
At first, we did what many teams do: we stitched memory together from a patchwork of existing infrastructure tools.
It kind of worked. But it was a mess.
More than anything, we felt like we were building a memory system on top of tools that weren’t built for memory at all.
Each of the existing infra tools did one thing well, but none of them gave us what memory actually needs.
We weren’t looking for a database. We needed a memory engine.
That led us to build Flumes as a unified memory layer, not a replacement for your DB, but a layer that handles memory as its own system.
We designed it around four core principles.
Flumes is not a wrapper. It’s a runtime for memory.
Early designs were overly index-centric: we relied too heavily on vector search, which made memory feel fuzzy and opaque.
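The fuzziness of vector-only recall can be shown with a toy example. The scoring function below is a stand-in for embedding similarity, and the records are invented for illustration; nothing here is Flumes code:

```python
def overlap_score(query: str, text: str) -> float:
    """Stand-in for embedding similarity: shared-word ratio."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / max(len(q), 1)

memories = [
    {"text": "user prefers window seats", "kind": "preference"},
    {"text": "user asked about seat prices", "kind": "event"},
]

query = "seat preference for the user"

# Similarity-only: both records match to some degree, and ranking alone
# decides what comes back -- here the incidental "event" actually outranks
# the "preference" the query was about. That is the opacity problem.
fuzzy = sorted(memories, key=lambda m: overlap_score(query, m["text"]), reverse=True)

# Structured predicate + similarity: a hard filter narrows intent first,
# so the result is deterministic and explainable.
precise = [m for m in memories
           if m["kind"] == "preference" and overlap_score(query, m["text"]) > 0]
```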
We also underestimated how much explicit structure memory needs.
Once we rethought memory as an intentional, structured, and tiered system, the design started to click.
We’re continuing to refine the Flumes memory engine: better summarization pipelines, tighter latency bounds, smarter cold storage policies. But the core principle is clear:
Memory deserves its own system.
If you’re tired of stitching together memory from tools that were never designed for it, come see what we’re building.
[Join the early access] and start building agents with real memory: structured, queryable, and built to scale.