Building AI Agent Memory with Convex: Full Architecture

How we built Cortex, a cognitive memory system for AI agents on Convex — with sensory, episodic, and semantic stores, memory decay, spreading activation, and zero infrastructure cost.

AI Agent Memory is the missing piece in most AI agent architectures. OpenClaw gives AI agents file-based memory out of the box — markdown files, daily logs, even a memory_search tool for semantic lookup. It works. Our agent Timmy could remember yesterday's decisions by reading MEMORY.md, check what happened last week in daily notes, and search for relevant context across files.

But as Timmy took on more responsibilities — writing blog posts, managing social media, automating browser tasks, running cron jobs — we started bumping into the limits of flat-file memory. Search was linear, with no relevance ranking. Old memories never faded, so the files grew bloated with outdated decisions. And there was no way to automatically surface related context or distinguish a critical architectural decision from a throwaway observation.

We wanted something more cognitive. Not just storage — a system that models how memory actually works.

This is the story of how we built Cortex, a cognitive memory system for AI agents, powered by Convex.

AI Agent Memory: A 60-Second Crash Course in Cognitive Memory

Before we dive into architecture, let's talk about how human memory actually works. Cognitive scientists have identified several distinct memory systems, and understanding them makes the engineering decisions click:

🧠 Sensory Memory — Your short-term buffer. Like remembering what someone just said 30 seconds ago. High volume, very temporary. Most of it evaporates in minutes.

📔 Episodic Memory — Your diary. Specific events tied to times and places: "On Tuesday, we decided to use Typefully for social media." Autobiographical, contextual, timestamped.

📚 Semantic Memory — Your knowledge base. Facts you just know without remembering when you learned them: "Convex supports vector search" or "the deploy command is npx convex deploy." Distilled from experience over time.

⚙️ Procedural Memory — Muscle memory. How to do things: "To publish a blog post: create draft → add image → publish → verify URL → social media." You don't think about it, you just do it.

📋 Prospective Memory — Your to-do list. Things you need to do in the future: "Check SEO performance next Monday" or "Follow up with that lead after the demo."

The insight: these aren't just categories for organizing notes. Each type has different persistence, decay rates, and retrieval patterns. Sensory memories should fade in hours. Semantic knowledge should persist indefinitely. Emotional memories should resist forgetting. And related memories should activate each other — remembering "Convex" should naturally pull up "vector search" and "document model."

That's what Cortex does. Let's build it.

Why Convex? (It Wasn't Our First Choice)

We were already using Convex for our blog CMS at contextstudios.ai. But choosing it for cognitive memory came from three realizations:

1. Document Model Fits Memory Objects Naturally

A memory isn't a row in a table. It's a rich object with a title, content, emotional valence, strength, tags, provenance, and relationships. Convex's document model maps perfectly:

// convex/schema.ts — The Cortex memory schema
cortexMemories: defineTable({
  // What kind of memory is this?
  store: v.union(
    v.literal("sensory"),    // 24h buffer
    v.literal("episodic"),   // specific events
    v.literal("semantic"),   // factual knowledge
    v.literal("procedural"), // how-to workflows
    v.literal("prospective") // future intentions
  ),
  category: v.union(
    v.literal("decision"), v.literal("lesson"),
    v.literal("person"),   v.literal("rule"),
    v.literal("event"),    v.literal("fact"),
    v.literal("goal"),     v.literal("workflow")
  ),
  title: v.string(),
  content: v.string(),
  embedding: v.array(v.float64()), // 1536-dim vector

  // Cognitive metadata
  strength: v.float64(),      // 0–1, decays over time
  confidence: v.float64(),    // 0–1, how certain
  valence: v.float64(),       // -1 to 1 (emotional charge)
  arousal: v.float64(),       // 0–1 (emotional intensity)
  accessCount: v.number(),    // retrieval frequency
  lastAccessedAt: v.number(), // for recency scoring

  // Provenance
  source: v.union(
    v.literal("conversation"),
    v.literal("cron"),
    v.literal("observation"),
    v.literal("inference"),
    v.literal("external")
  ),
  tags: v.array(v.string()),
})

2. Built-in Vector Search (No Extra Infrastructure)

This was the killer feature. Native vector indexing means we store OpenAI embeddings alongside memory documents and search them without spinning up Pinecone or Qdrant:

// One index. Vector + metadata filtering. Zero extra infra.
.vectorIndex("by_embedding", {
  vectorField: "embedding",
  dimensions: 1536,
  filterFields: ["store", "category", "tags"],
})

3. Serverless Functions for Cognitive Processes

Memory isn't just storage — it's a process. Convex gives us mutations, queries, actions, and cron jobs — exactly the right primitives for decay, consolidation, and association:

// convex/crons.ts — Background cognitive processes
crons.interval("cortex consolidation", { hours: 12 }, internal.cortex.consolidate);
crons.interval("cortex decay", { hours: 24 }, internal.cortex.decay);
crons.interval("cortex cleanup", { hours: 24 }, internal.cortex.cleanupExpired);

No Lambda functions. No worker queues. Just TypeScript functions that run on schedule.

The AI Agent Memory Architecture: Cortex

Five Memory Stores

┌───────────────────────────────────────────────────────┐
│                    CORTEX MEMORY                       │
├────────────┬──────────────┬──────────────┬────────────┤
│  SENSORY   │  EPISODIC    │  SEMANTIC    │ PROCEDURAL │
│  (24h buf) │  (events)    │  (knowledge) │ (how-to)   │
│            │              │              │            │
│ Raw input  │ "We decided  │ "Convex uses │ "Deploy    │
│ from conv- │  to use      │  document    │  with npx  │
│ ersations  │  Typefully"  │  model with  │  convex    │
│            │              │  schemas"    │  deploy"   │
├────────────┴──────────────┴──────────────┴────────────┤
│                    PROSPECTIVE                         │
│            Goals, plans, future intentions             │
│        "Ship Cortex blog post by end of week"          │
└───────────────────────────────────────────────────────┘

Auto-Promotion: Sensory → Episodic → Semantic

Every 12 hours, a consolidation cron clusters related sensory memories (vector similarity > 0.75), synthesizes them into episodic memories, and marks the originals as consolidated. Over time, frequently accessed episodic memories with high confidence get promoted to semantic memory. The system literally learns what's important by tracking retrieval patterns.
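The clustering step can be sketched as a pure function: cosine similarity with a greedy single-link grouping. The 0.75 threshold comes from Cortex; the greedy strategy and the `SensoryMemory` shape are our illustrative simplifications.

```typescript
type SensoryMemory = { id: string; embedding: number[] };

// Cosine similarity between two embedding vectors
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Greedy single-link clustering: each memory joins the first cluster
// whose seed is similar enough, otherwise it starts a new cluster.
function clusterBySimilarity(
  memories: SensoryMemory[],
  threshold = 0.75
): SensoryMemory[][] {
  const clusters: SensoryMemory[][] = [];
  for (const mem of memories) {
    const home = clusters.find(
      (c) => cosine(c[0].embedding, mem.embedding) > threshold
    );
    if (home) home.push(mem);
    else clusters.push([mem]);
  }
  return clusters;
}
```

Each resulting cluster is what the cron would hand to the LLM for synthesis into a single episodic memory.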

Memory Decay: Forgetting Is a Feature

Every memory has a strength field starting at 1.0 that decays daily:

export const decay = internalMutation({
  handler: async (ctx) => {
    const now = Date.now();
    const memories = await ctx.db
      .query("cortexMemories")
      .filter(q => q.eq(q.field("archivedAt"), undefined))
      .collect();

    for (const mem of memories) {
      if (mem.store === "prospective") continue; // Goals don't decay

      const daysSinceAccess = (now - mem.lastAccessedAt) / (1000 * 60 * 60 * 24);
      // Emotional memories decay slower — matches cognitive research
      const isHighEmotion = Math.abs(mem.valence) > 0.7 && mem.arousal > 0.7;
      const decayRate = isHighEmotion ? 0.01 : 0.02;
      const newStrength = Math.max(0, mem.strength - decayRate * daysSinceAccess);

      if (newStrength < 0.1) {
        // Archive, don't delete — still searchable if needed
        await ctx.db.patch(mem._id, { strength: newStrength, archivedAt: now });
      } else if (newStrength !== mem.strength) {
        await ctx.db.patch(mem._id, { strength: newStrength });
      }
    }
  },
});

Two principles at work: emotional memories persist longer (you remember your wedding day better than last Tuesday's lunch), and access reinforces memory (frequently retrieved memories stay strong — spaced repetition for AI agents).
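The reinforcement half isn't shown above, but conceptually it's just the inverse of decay: each retrieval nudges strength back up. The 0.1 boost below is an illustrative value, not Cortex's exact constant.

```typescript
// Illustrative reinforcement rule: retrieval strengthens a memory,
// capped at 1.0 so frequently used memories don't grow unbounded.
function reinforce(strength: number, boost = 0.1): number {
  return Math.min(1, strength + boost);
}

// A memory retrieved daily stays near full strength even as decay
// subtracts ~0.02/day; an untouched one drifts toward the archive threshold.
```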

Spreading Activation: Memories That Connect

New memories automatically find related ones via vector search and create association links:

export const createAutoAssociations = internalAction({
  handler: async (ctx, args) => {
    const similar = await ctx.vectorSearch("cortexMemories", "by_embedding", {
      vector: args.embedding,
      limit: 6,
    });
    for (const result of similar) {
      if (result._id === args.memoryId) continue;
      await ctx.runMutation(internal.cortex.insertAssociations, {
        associations: [{
          from: args.memoryId,
          to: result._id,
          type: "related",    // also: caused, contradicts, supersedes, part_of
          weight: result._score,
          createdAt: Date.now(),
        }],
      });
    }
  },
});

Recall uses a composite scoring function combining four signals:

const compositeScore =
  mem.strength * 0.3 +      // How strong is this memory?
  recencyScore * 0.2 +      // How recently was it accessed?
  accessScore * 0.1 +       // How often retrieved?
  mem.vectorScore * 0.4;    // How relevant to the query?

A query like "Why did we choose Convex?" finds the strongest, most recently relevant, most frequently useful memory that's also semantically close. It feels like actual recall, not just search.
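For reference, recencyScore and accessScore aren't defined in the snippet above. Here's one way they might be derived; the 7-day falloff and 10-retrieval saturation point are our assumptions, not Cortex's exact constants.

```typescript
type ScoredMemory = {
  strength: number;      // 0–1, after decay
  accessCount: number;   // lifetime retrievals
  lastAccessedAt: number;
  vectorScore: number;   // similarity score from vector search
};

function compositeScore(mem: ScoredMemory, now = Date.now()): number {
  const daysSinceAccess = (now - mem.lastAccessedAt) / (1000 * 60 * 60 * 24);
  const recencyScore = Math.exp(-daysSinceAccess / 7); // ~7-day falloff
  const accessScore = Math.min(1, mem.accessCount / 10); // saturates at 10
  return (
    mem.strength * 0.3 +
    recencyScore * 0.2 +
    accessScore * 0.1 +
    mem.vectorScore * 0.4
  );
}
```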

Integration with OpenClaw

Cortex is exposed through 8 MCP (Model Context Protocol) tools:

Tool                    Purpose
cortex_remember         Store a new memory
cortex_recall           Search and retrieve memories
cortex_what_do_i_know   Broad topic awareness check
cortex_why_did_we       Decision archaeology
cortex_forget           Explicit memory removal
cortex_stats            Memory system statistics
cortex_checkpoint       Save working context
cortex_wake             Morning briefing

A typical session now looks like:

Session start → cortex_wake()
  Returns: "Last session you were working on the video pipeline v3.
            Open decisions: voice selection for German narration."

During conversation → cortex_remember(
  store: "semantic",
  category: "decision",
  title: "Laura voice selected for German narration",
  content: "Voice ID FGY2WhTYpPnrIDTdsKH5, model eleven_multilingual_v2...",
  tags: ["video-pipeline", "tts", "voice"]
)

Ad-hoc recall → cortex_recall("video pipeline voice settings")
  Returns: The memory above + related pipeline memories

The Dual-Write Pattern

Every memory goes to Cortex (Convex) and to local markdown files. Cortex provides structured recall, vector search, and decay. Markdown files provide a human-readable audit trail. The redundancy has saved us more than once.
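The markdown half of the dual-write can be as simple as appending a formatted entry after a successful Convex write. The file layout here is our assumption, not Cortex's exact format.

```typescript
type MemoryEntry = {
  title: string;
  content: string;
  tags: string[];
  createdAt: number;
};

// Render one memory as a human-readable markdown entry
function toMarkdown(mem: MemoryEntry): string {
  const date = new Date(mem.createdAt).toISOString().slice(0, 10);
  return [
    `## ${mem.title} (${date})`,
    ``,
    mem.content,
    ``,
    `Tags: ${mem.tags.join(", ")}`,
    ``,
  ].join("\n");
}

// After the Convex mutation succeeds, append the same entry locally:
// await fs.promises.appendFile("MEMORY.md", toMarkdown(mem));
```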

Build Your Own AI Agent Memory: Minimum Viable Cortex

Want to ship something like this in a weekend? Here's the minimum viable version — just sensory + semantic memory with vector search.

Step 1: Set Up the Convex Schema

// convex/schema.ts — Minimum viable Cortex
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

export default defineSchema({
  memories: defineTable({
    store: v.union(v.literal("sensory"), v.literal("semantic")),
    title: v.string(),
    content: v.string(),
    embedding: v.array(v.float64()),
    strength: v.float64(),       // starts at 1.0
    createdAt: v.number(),
    lastAccessedAt: v.number(),
    tags: v.array(v.string()),
  })
    .vectorIndex("by_embedding", {
      vectorField: "embedding",
      dimensions: 1536,
      filterFields: ["store", "tags"],
    })
    .index("by_store", ["store"]),
});

Step 2: Write the Remember Function

// convex/memory.ts — Store a new memory
import { action } from "./_generated/server";
import { v } from "convex/values";
import { internal } from "./_generated/api";

// Action: generates embedding, then stores the memory
export const remember = action({
  args: {
    store: v.union(v.literal("sensory"), v.literal("semantic")),
    title: v.string(),
    content: v.string(),
    tags: v.array(v.string()),
  },
  handler: async (ctx, args) => {
    // Generate embedding via OpenAI
    const response = await fetch("https://api.openai.com/v1/embeddings", {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model: "text-embedding-3-small",
        input: `${args.title}: ${args.content}`,
      }),
    });
    const data = await response.json();
    const embedding = data.data[0].embedding;

    // Store the memory
    const now = Date.now();
    await ctx.runMutation(internal.memory.insert, {
      ...args,
      embedding,
      strength: 1.0,
      createdAt: now,
      lastAccessedAt: now,
    });
  },
});

Step 3: Write the Recall Function

// convex/memory.ts — Search memories by semantic similarity
export const recall = action({
  args: {
    query: v.string(),
    store: v.optional(v.union(v.literal("sensory"), v.literal("semantic"))),
    limit: v.optional(v.number()),
  },
  handler: async (ctx, args) => {
    // Embed the query
    const response = await fetch("https://api.openai.com/v1/embeddings", {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model: "text-embedding-3-small",
        input: args.query,
      }),
    });
    const data = await response.json();
    const queryEmbedding = data.data[0].embedding;

    // Vector search with an optional store filter
    // (Convex filters on vector indexes use a callback, not a plain object)
    const results = await ctx.vectorSearch("memories", "by_embedding", {
      vector: queryEmbedding,
      limit: args.limit ?? 5,
      filter: args.store ? (q) => q.eq("store", args.store!) : undefined,
    });

    // Update access timestamps
    for (const r of results) {
      await ctx.runMutation(internal.memory.touch, { id: r._id });
    }

    return results;
  },
});
Step 4: Schedule Memory Decay

// convex/crons.ts — Forget what's no longer relevant
import { cronJobs } from "convex/server";
import { internal } from "./_generated/api";

const crons = cronJobs();
crons.daily("memory decay", { hourUTC: 4, minuteUTC: 0 }, internal.memory.decay);
export default crons;
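The action above calls internal.memory.insert and internal.memory.touch, and the cron calls internal.memory.decay — none of which we've defined yet. Here's a minimal sketch of those internal mutations for convex/memory.ts; the 0.02/day decay rate mirrors the full Cortex version, but this sketch skips the emotional modulation and archiving.

```typescript
// convex/memory.ts — Internal mutations referenced above (illustrative sketch)
import { internalMutation } from "./_generated/server";
import { v } from "convex/values";

// Persist a fully prepared memory document
export const insert = internalMutation({
  args: {
    store: v.union(v.literal("sensory"), v.literal("semantic")),
    title: v.string(),
    content: v.string(),
    tags: v.array(v.string()),
    embedding: v.array(v.float64()),
    strength: v.float64(),
    createdAt: v.number(),
    lastAccessedAt: v.number(),
  },
  handler: async (ctx, args) => {
    await ctx.db.insert("memories", args);
  },
});

// Refresh the access timestamp so recall reinforces memory
export const touch = internalMutation({
  args: { id: v.id("memories") },
  handler: async (ctx, { id }) => {
    await ctx.db.patch(id, { lastAccessedAt: Date.now() });
  },
});

// Daily decay: weaken memories that haven't been accessed recently
export const decay = internalMutation({
  handler: async (ctx) => {
    const now = Date.now();
    const memories = await ctx.db.query("memories").collect();
    for (const mem of memories) {
      const days = (now - mem.lastAccessedAt) / (1000 * 60 * 60 * 24);
      const newStrength = Math.max(0, mem.strength - 0.02 * days);
      if (newStrength !== mem.strength) {
        await ctx.db.patch(mem._id, { strength: newStrength });
      }
    }
  },
});
```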

Step 5: Wire It to Your Agent

If you're using MCP, expose remember and recall as tools. If not, call them directly from your agent's code. That's it — you have a working memory system with semantic search, automatic decay, and two memory tiers.

From here, you can incrementally add:

  • Episodic memory (add timestamps, event context)
  • Associations (a second table linking related memories)
  • Consolidation (a cron that promotes strong sensory memories to semantic)
  • Emotional metadata (valence/arousal for decay modulation)
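For the associations step, the second table might look like this — a sketch whose field names follow the insertAssociations payload shown earlier, added to the same defineSchema call as the memories table:

```typescript
// convex/schema.ts — A possible associations table (illustrative)
cortexAssociations: defineTable({
  from: v.id("memories"),
  to: v.id("memories"),
  type: v.union(
    v.literal("related"), v.literal("caused"),
    v.literal("contradicts"), v.literal("supersedes"),
    v.literal("part_of")
  ),
  weight: v.float64(), // similarity score from vector search
  createdAt: v.number(),
})
  // Fetch all outgoing links for a memory in one indexed query
  .index("by_from", ["from"]),
```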

AI Agent Memory: Practical Lessons Learned

  1. Start with sensory, not semantic. Let the consolidation pipeline decide what's important. Don't overthink categorization upfront.

  2. Decay is essential. Without it, your memory store becomes a junk drawer of obsolete decisions and irrelevant observations.

  3. Emotional metadata matters. Tagging memories with valence and arousal genuinely improves recall quality. High-impact decisions should persist longer.

  4. Vector search + metadata filtering > pure vector search. Combined filtering on vector indexes is incredibly powerful — Convex makes this trivial.

  5. Associations create serendipity. The most valuable recall is sometimes a related memory the agent wouldn't have found through direct search.

AI Agent Memory Results in Production

After shipping Cortex, sessions start with the agent already knowing context. Decisions are instantly retrievable. Lessons actually stick. Running on Convex's free tier, we store hundreds of memories with zero infrastructure cost. Vector search returns results in under 100ms.

The real impact is qualitative: working with an AI that remembers feels like collaborating with a colleague who's been on the team for months, versus explaining everything to a new contractor every morning.

What's Next for AI Agent Memory

Cortex is live in production. If you're building AI agents and struggling with context persistence, consider this approach. The combination of multi-store architecture, time-based decay, vector similarity, and spreading activation creates a memory system that feels surprisingly natural.

Check out Convex's AI agent documentation and the MCP protocol for tool integration. Your AI doesn't have to start fresh every session.


AI Agent Memory: Frequently Asked Questions

What's the difference between Cortex and a vector database like Pinecone? Cortex adds memory decay, automatic consolidation, emotional metadata, and spreading activation on top of vector search. Plus, running on Convex means everything — storage, search, crons, real-time queries — is in one platform.

How much does it cost? Currently $0 on Convex's free tier. The only external cost is OpenAI embeddings (fractions of a cent per memory).

Can it work with models other than Claude? Yes. Cortex is model-agnostic, exposed via MCP tools. The embeddings use OpenAI's text-embedding-3-small, but you could swap in any 1536-dimension model.

How do you prevent storing incorrect information? Each memory has a confidence score affecting recall ranking, cortex_forget allows explicit removal, and natural decay means even incorrect memories fade if not accessed.

What if Convex goes down? The dual-write pattern means every memory exists in both Convex and local markdown files. If Convex is temporarily unavailable, the agent falls back to file-based search.
