Memweave: Revolutionizing Agent Memory with Markdown and SQLite

Imagine dedicating an entire afternoon to crafting an advanced AI coding assistant. This assistant meticulously learns your project’s unique conventions, remembers your team’s preference for Valkey over Redis, and internalizes your established testing methodologies. The session concludes, and you return the next morning to a conversation that has completely forgotten everything, putting you back at square one. This is the inherent challenge with current Large Language Model (LLM) agents: their stateless design means each interaction begins anew, leaving the burden of memory management squarely on the developer.

The most common workaround involves embedding the entire conversation history within the model’s context window. While functional for short exchanges, this approach quickly becomes unsustainable. Context windows have finite limits and incur significant costs. As an agent’s operational history lengthens, it accumulates thousands of tokens, much of which becomes irrelevant to current queries. This results in paying to re-feed your agent last week’s debugging notes when a crucial architectural decision from months prior is what’s truly needed.

The next logical step for developers is often a vector database. Solutions like Chroma or provisioning a Pinecone index allow for embedding extensive data and querying it via semantic similarity. However, this method introduces its own set of complications. These tools, while powerful, were fundamentally designed for large-scale document retrieval, not the nuanced, personal, or project-scoped memory requirements of an AI agent. Using them for this purpose is akin to deploying a full-scale PostgreSQL cluster to store a single configuration file.

A simpler, more elegant solution has emerged with the development of memweave, an open-source project designed to address these limitations. Memweave’s core philosophy is remarkably straightforward: memories are stored as plain Markdown (.md) files on disk. The project then indexes these files into a local SQLite database, enabling efficient searching through a hybrid approach that combines BM25 keyword matching with semantic vector search. Crucially, the SQLite database functions as a derived cache; if it’s deleted, memweave can rebuild it entirely from the Markdown files, which remain the definitive source of truth.

The memweave Approach: Markdown + SQLite

The practical implementation of memweave is designed for ease of use and accessibility. Installation is as simple as running pip install memweave. To grant an agent persistent memory, developers can integrate memweave with just a few lines of Python code.

Consider this example:

import asyncio
from pathlib import Path
from memweave import MemWeave, MemoryConfig

async def main():
    async with MemWeave(MemoryConfig(workspace_dir=".")) as mem:
        # Write a memory - just a plain Markdown file
        memory_file = Path("memory/stack.md")
        memory_file.parent.mkdir(exist_ok=True)
        memory_file.write_text("We use Valkey instead of Redis. Target latency SLA: 5ms p99.")
        await mem.add(memory_file)

        # Search across all memories.
        # min_score=0.0 ensures results surface in a small corpus;
        # in production the default 0.35 threshold filters low-confidence matches.
        results = await mem.search("caching layer decision", min_score=0.0)
        for r in results:
            print(f"[{r.score:.2f}] {r.snippet}  -> {r.path}:{r.start_line}")

asyncio.run(main())

The output of this script would be:

[0.34] We use Valkey instead of Redis. Target latency SLA: 5ms p99.  -> memory/stack.md:1

Each search result in memweave provides not only a relevance score but also the exact file path and line number from which the information originated, offering complete source provenance without the need for further post-processing. This direct traceability allows users to easily inspect memories using familiar tools:

cat memory/stack.md
grep -r "Valkey" memory/
git diff memory/

The git diff memory/ command is particularly transformative, highlighting how memweave treats agent memory as an auditable part of the project’s history. Each piece of information stored by the agent becomes a line in a file, and each interaction or update can be viewed as a commit. This brings the operational model of agent memory in line with established software development practices, making it a first-class artifact rather than an opaque side effect.

Why Files and SQLite Over Vector Databases

The decision to leverage Markdown files and SQLite rather than a dedicated vector database stems from a fundamental difference in design objectives. Vector databases are optimized for massive-scale document retrieval, serving millions of documents across multi-tenant services and production search infrastructure. They excel in these scenarios. However, agent memory operates under different constraints: typically hundreds of thousands of files within a personal or project scope, where the knowledge itself is as critical as the code. These specific requirements led to a different set of trade-offs.

| Feature       | Memweave (Markdown + SQLite)                           | Vector Databases                                  |
|---------------|--------------------------------------------------------|---------------------------------------------------|
| Storage       | Plain Markdown files (source of truth)                 | Opaque document store                             |
| Indexing      | Local SQLite with FTS5 (BM25) + sqlite-vec (semantic)  | Dedicated vector index (e.g., Annoy, FAISS, HNSW) |
| Scalability   | Hundreds of thousands of files, project/personal scope | Millions/billions of documents, enterprise-grade  |
| Verifiability | Directly inspectable and diffable files                | Black box, API-driven management                  |
| Data loss     | Rebuildable from files                                 | Potential loss if backups/snapshots fail          |
| Cost          | Minimal (local resources, embedding API calls)         | Significant (server costs, managed services)      |
| Complexity    | Low (single file, no server)                           | High (deployment, management, scaling)            |

The most significant differentiator is version control. If an agent stores an incorrect assumption – for instance, mistaking PostgreSQL for CockroachDB when the latter was adopted last quarter – correcting it in a vector database involves a complex process of finding the specific embedding, deleting it, and re-inserting the updated information via an API. With memweave, this correction is as simple as opening the relevant Markdown file, fixing the line, and committing the change.

# git diff memory/stack.md

- Database: PostgreSQL (primary), Redis (cache)
+ Database: CockroachDB (primary, migrated Q1 2026), Valkey (cache)
+ Reason: geo-distribution requirement from the platform team

This diff becomes a permanent record in the project’s history, visible to teammates and future agents. It clearly shows what changed, when, and why, establishing a transparent operational model where agent memory is a first-class, inspectable artifact.

Architecture: Separating Storage from Search

Memweave’s architecture is built upon the foundational principle of separating storage from search. The Markdown files serve as the immutable source of truth. The SQLite database is a dynamically generated index, always capable of being rebuilt from these files, ensuring that data loss is effectively mitigated.


The indexing process for a Markdown file involves a deterministic pipeline:

  1. Chunking: The Markdown file is split into overlapping text chunks.
  2. Hashing: Each chunk is fingerprinted using a SHA-256 hash based on its content.
  3. Embedding Cache Lookup: A bulk SQL query checks the embedding_cache table for existing hashes.
    • Cache Hit: If a hash is found, the stored vector is reused, bypassing API calls.
    • Cache Miss: If the hash is not found, the chunk is sent to the embedding API (batched for efficiency).
  4. Embedding Storage: The generated vector is stored in the embedding_cache table.
  5. Database Insertion: The chunk, its hash, and its vector are inserted into the FTS5 (full-text search) and sqlite-vec (vector similarity search) tables within the SQLite database.
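Step 1, overlapping chunking, can be sketched in a few lines. The character-based window and the size/overlap values below are illustrative, not memweave's actual parameters:

```python
def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into windows of `size` chars; consecutive chunks share `overlap` chars."""
    step = size - overlap
    chunks = []
    for start in range(0, max(len(text) - overlap, 1), step):
        chunks.append(text[start:start + size])
    return chunks

text = "".join(chr(97 + i % 26) for i in range(500))  # 500 chars of dummy text
chunks = chunk_text(text)
print(len(chunks))                        # 3
print(chunks[0][150:] == chunks[1][:50])  # True: 50 chars of shared context
```

The overlap matters for retrieval: a sentence that straddles a chunk boundary still appears whole in at least one chunk.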

The SHA-256 hashing is a key efficiency mechanism. If a file is re-indexed and 90% of its chunks remain unchanged, only the modified chunks will trigger an embedding API call; the rest are retrieved instantly from the cache.
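A minimal sketch of that content-addressed cache, assuming a single-table schema and using a stub in place of the real embedding API (both are illustrative, not memweave's internals):

```python
import hashlib
import sqlite3

def chunk_hash(text: str) -> str:
    # Step 2: fingerprint a chunk by its content.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def stub_embed(text: str) -> bytes:
    # Stand-in for a real embedding API call.
    return hashlib.sha256(b"vec:" + text.encode("utf-8")).digest()

def embed_with_cache(conn: sqlite3.Connection, chunks: list[str]) -> dict[str, bytes]:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS embedding_cache (hash TEXT PRIMARY KEY, vector BLOB)"
    )
    by_hash = {chunk_hash(c): c for c in chunks}
    # Step 3: one bulk lookup for all fingerprints.
    marks = ",".join("?" * len(by_hash))
    rows = conn.execute(
        f"SELECT hash, vector FROM embedding_cache WHERE hash IN ({marks})",
        list(by_hash),
    ).fetchall()
    vectors = {h: v for h, v in rows}      # cache hits: no API call
    for h in set(by_hash) - set(vectors):
        vec = stub_embed(by_hash[h])       # cache miss: embed and store (step 4)
        conn.execute("INSERT INTO embedding_cache VALUES (?, ?)", (h, vec))
        vectors[h] = vec
    return vectors

conn = sqlite3.connect(":memory:")
first = embed_with_cache(conn, ["chunk a", "chunk b"])
# Re-indexing with one new chunk embeds only the new one; the rest are cache hits.
second = embed_with_cache(conn, ["chunk a", "chunk b", "chunk c"])
print(len(second), first[chunk_hash("chunk a")] == second[chunk_hash("chunk a")])  # 3 True
```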

The search path is equally efficient. When mem.search(query) is called, both the FTS5 (BM25) and sqlite-vec (ANN) search backends process the query in parallel. Their results are then merged before undergoing further post-processing.

query
    -> FTS5 BM25 (keyword)        -> exact term matching
    -> sqlite-vec ANN (semantic)  -> cosine similarity
    -> weighted merge (vector_score * 0.7 + bm25_score * 0.3)
    -> post-processing pipeline (threshold, decay, MMR)
    -> list[SearchResult]

Running both BM25 and vector search concurrently ensures comprehensive retrieval. BM25 excels at finding exact matches for error codes, configuration values, or specific names, while vector search identifies semantically related content even without overlapping keywords. This dual approach covers the spectrum of how agent memory is likely to be queried.

The Infrastructure Layer: Why SQLite?

SQLite’s selection as the underlying infrastructure is a deliberate choice, not a compromise. It is readily available with Python, requires no external server, and boasts robust support for full-text search via FTS5. Furthermore, with the sqlite-vec extension, it gains SIMD-accelerated vector similarity search capabilities. The entire memory store—including chunks, embeddings, cache, and file metadata—resides within a single file on disk. This file can be easily copied, backed up, or inspected using any standard SQLite browser. For the typical scale of agent memory, which involves thousands of files, SQLite offers an optimal blend of performance, simplicity, and portability.
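The FTS5 half of that search needs nothing beyond Python's bundled sqlite3 module (most standard builds ship with FTS5 compiled in); the table layout below is illustrative, not memweave's actual schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in practice, a single file on disk
conn.execute("CREATE VIRTUAL TABLE chunks USING fts5(path, content)")
conn.executemany(
    "INSERT INTO chunks VALUES (?, ?)",
    [
        ("memory/stack.md", "We use Valkey instead of Redis for the caching layer."),
        ("memory/2026-01-15.md", "Discussed CI flakiness in the deploy pipeline."),
    ],
)
# bm25() returns a rank where more negative means more relevant.
rows = conn.execute(
    "SELECT path FROM chunks WHERE chunks MATCH ? ORDER BY bm25(chunks)",
    ("Valkey",),
).fetchall()
print(rows)  # [('memory/stack.md',)]
```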

Organizing Memory: Evergreen, Dated, and Namespaced

Not all knowledge has the same shelf life. A fundamental decision about a technology stack, such as migrating to CockroachDB, remains relevant indefinitely. Conversely, a debugging note from six months ago is likely to be obsolete. Memweave enforces this distinction through a simple file-naming convention, eliminating the need for explicit metadata tagging or complex configuration.

There are two primary types of memory files:

  • Evergreen Files: These are files whose names do not follow the YYYY-MM-DD.md pattern. They represent permanent knowledge that should consistently surface at the top of search results, regardless of how much new information is added.
  • Dated Files: Any file named YYYY-MM-DD.md is considered dated. Memweave extracts the date directly from the filename, ensuring that recent information is prioritized.

This convention naturally organizes a typical workspace:

memory/
├── MEMORY.md                  # evergreen - permanent facts, always surfaces
├── architecture.md            # evergreen - stack decisions, constraints
├── 2026-01-15.md              # dated - session notes from January
├── 2026-03-10.md              # dated - session notes from March
├── 2026-04-11.md              # dated - today's session, full score for now
└── researcher_agent/
    ├── findings.md            # evergreen - agent's standing knowledge
    └── 2026-04-11.md          # dated - agent's session log, will decay

Over time, dated files accumulate and their influence naturally fades, while evergreen files remain anchored at their full relevance score. This ensures that critical information, like architectural decisions, is always readily accessible even when buried under hundreds of session logs.

Agent Namespaces for Multi-Agent Memory

When multiple agents operate within a shared workspace, isolating their knowledge without resorting to separate databases is essential. Memweave handles this through subdirectories, where the immediate subdirectory under memory/ serves as the source label for all files within it.

Each agent can write to its own dedicated subdirectory, indexing against the same SQLite database. Searches are global by default, allowing any agent to access the memories of others. However, the source_filter parameter enables scoping searches exclusively to a specific namespace:

# Two agents share one workspace (and one SQLite index);
# each writes under its own memory/ subdirectory
researcher = MemWeave(MemoryConfig(workspace_dir="./project"))
writer     = MemWeave(MemoryConfig(workspace_dir="./project"))

async with researcher, writer:
    # Researcher indexes its findings under memory/researcher_agent/
    await researcher.index()

    # Writer queries only the researcher's namespace
    results = await writer.search(
        "water ice on the Moon",
        source_filter="researcher_agent",
    )

This hierarchical structure scales efficiently for any number of agents, ensuring knowledge isolation through path conventions and allowing for versionable, inspectable agent knowledge via git log memory/agent_name/.

The memweave Search Pipeline

Every mem.search(query) call executes through a five-stage pipeline, each stage being independent and tunable:

Stage 1: Hybrid Score Merge

This stage combines results from both the BM25 and vector search backends. Their scores are normalized and linearly combined using a weighted average:


merged_score = vector_weight * vector_score + text_weight * bm25_score

The default vector_weight is 0.7, reflecting the common scenario where conceptual and paraphrased queries are more frequent than exact string lookups. This weight can be adjusted via HybridConfig for corpora that lean more towards precise technical retrieval.
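Assuming both backends' scores have already been normalized to [0, 1] (the merge formula above presumes normalized inputs), the combination itself is a one-liner:

```python
def merge_scores(vector_score: float, bm25_score: float,
                 vector_weight: float = 0.7) -> float:
    # Linear combination of the two normalized backend scores;
    # text_weight is the complement of vector_weight.
    text_weight = 1.0 - vector_weight
    return vector_weight * vector_score + text_weight * bm25_score

# A chunk that scores well semantically but has zero keyword overlap
# still surfaces with a usable merged score:
print(round(merge_scores(vector_score=0.9, bm25_score=0.0), 2))  # 0.63
```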

Stage 2: Score Threshold

A min_score threshold (defaulting to 0.35) is applied to filter out low-confidence results before more computationally intensive post-processing. This acts as a noise gate, preventing irrelevant items from impacting subsequent calculations. The threshold can be overridden per call for specific needs.

Stage 3: Temporal Decay (Opt-in)

This feature addresses the aging of knowledge. Scores are multiplied by an exponential factor based on the age of the source file, ensuring recent information ranks higher than older, potentially stale entries. The formula multiplier = exp(-ln(2) / half_life_days * age_days) ensures that at half_life_days, a result’s score is halved. Evergreen files bypass this decay entirely. The half_life_days parameter can be tuned to match project velocity, from 7 days for fast-moving projects to 90 days for research repositories.
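The decay multiplier from that formula, in standalone form (the function name and the evergreen flag are illustrative; the behavior follows the description above):

```python
import math

def decay_multiplier(age_days: float, half_life_days: float = 90.0,
                     evergreen: bool = False) -> float:
    # Evergreen files bypass decay entirely and keep full score.
    if evergreen:
        return 1.0
    # exp(-ln(2)/half_life * age): halves the score every half_life_days.
    return math.exp(-math.log(2) / half_life_days * age_days)

print(decay_multiplier(0))                      # 1.0 - today's file, full score
print(round(decay_multiplier(90), 3))           # 0.5 - exactly one half-life old
print(round(decay_multiplier(180), 3))          # 0.25 - two half-lives old
print(decay_multiplier(400, evergreen=True))    # 1.0 - evergreen bypasses decay
```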

Stage 4: MMR Re-ranking (Opt-in)

Maximal Marginal Relevance (MMR) reorders results to balance relevance with diversity, preventing near-duplicate entries from dominating the top results. It maximizes MMR(c) = lambda * relevance(c) - (1 - lambda) * max(sim(c, c_selected)) at each selection step, using Jaccard token overlap for similarity. The lambda_param (defaulting to 0.7) controls the balance between relevance and diversity.
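The MMR step can be sketched with Jaccard token overlap as the similarity measure, per the description above (the candidate representation as (score, text) pairs is illustrative, not memweave's implementation):

```python
def jaccard(a: str, b: str) -> float:
    # Token-overlap similarity between two snippets.
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def mmr_rerank(candidates: list[tuple[float, str]], k: int = 3,
               lambda_param: float = 0.7) -> list[str]:
    selected: list[str] = []
    pool = list(candidates)
    while pool and len(selected) < k:
        def mmr(c: tuple[float, str]) -> float:
            score, text = c
            # Penalize similarity to anything already selected.
            redundancy = max((jaccard(text, s) for s in selected), default=0.0)
            return lambda_param * score - (1 - lambda_param) * redundancy
        best = max(pool, key=mmr)
        pool.remove(best)
        selected.append(best[1])
    return selected

results = [
    (0.90, "we use valkey for caching"),
    (0.88, "we use valkey for caching layer"),  # near-duplicate of the top hit
    (0.60, "target latency sla is 5ms p99"),
]
# The near-duplicate is skipped in favor of the diverse third result:
print(mmr_rerank(results, k=2))
```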

Stage 5: Custom Post-processors

Developers can register custom post-processing functions that run after the built-in stages. These can be used for domain-specific boosting, hard-pinning results, or integrating external signals, extending the pipeline without replacing it.
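The registration API is not shown here, so the sketch below models a post-processor as a plain function applied after the built-in stages; the `SearchResult` shape and the pinning rule are hypothetical examples of domain-specific boosting:

```python
from dataclasses import dataclass

@dataclass
class SearchResult:
    score: float
    snippet: str
    path: str

def pin_architecture(results: list[SearchResult]) -> list[SearchResult]:
    """Domain-specific boost: always float architecture.md hits to the top."""
    for r in results:
        if r.path.endswith("architecture.md"):
            r.score += 1.0  # hard boost above the normal 0..1 score range
    return sorted(results, key=lambda r: r.score, reverse=True)

results = [
    SearchResult(0.8, "session note", "memory/2026-04-11.md"),
    SearchResult(0.5, "stack decision", "memory/architecture.md"),
]
print(pin_architecture(results)[0].path)  # memory/architecture.md
```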

Real-World Example: Book Club Decision Log

To illustrate memweave’s impact, consider a book club scenario where two agents answer the same question: "What genre did the club vote on most recently?"


The workspace contains 9 memory files over 18 months. One evergreen file holds permanent club information, seven dated files log past discussions, and one file created today holds the current state.

Agent A (No Temporal Decay): Without temporal decay, Agent A’s search results are dominated by older files that contain more explicit "voting" language, even if they are not the most recent. Its top results include non-fiction (5 months ago) and fantasy (18 months ago). Consequently, Agent A confidently reports the club most recently voted on non-fiction, an outdated answer.

Agent B (With Temporal Decay, half-life = 90 days): When temporal decay is enabled, the file from today (April 11, 2026) immediately surfaces to rank one. Its score is unaffected by age. Older files’ scores are penalized, causing the November 2025 non-fiction vote to drop out of the top results. Agent B, grounded in the most recent data, correctly identifies science fiction as the most recently voted genre.

This demonstration highlights a critical aspect of agent memory: the "stale memory problem" is often silent. Without mechanisms like temporal decay, agents can confidently provide outdated information without any indication of error. As an agent’s memory grows, older, semantically similar entries can increasingly outcompete recent ones, leading to subtle but significant inaccuracies. Temporal decay is the essential mechanism for maintaining retrieval accuracy as history accumulates.

Conclusion

Memweave fundamentally redefines agent memory by treating it as an accessible, versionable artifact. By leveraging plain Markdown files as the source of truth and a local SQLite database for efficient hybrid search, it eliminates the need for complex infrastructure. Features like temporal decay and MMR re-ranking ensure that retrieved information remains relevant and diverse, addressing the silent but pervasive problem of stale memories. The book club example clearly illustrates the practical difference: the agent equipped with temporal decay provides an accurate, up-to-date answer, while its counterpart, lacking this feature, confidently offers stale information.

Memweave offers a path towards more reliable and transparent AI agents, where their knowledge is as auditable and manageable as the code they interact with.

To get started with memweave, install it via pip:

pip install memweave

The project encourages contributions and feedback, inviting developers to open issues or engage in discussions on its GitHub repository.
