❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️
❄️

Memory in Cosmos DB β€” Flows and Code

Memory in Cosmos DB β€” Flows and Code

This post shows how I implement memory using Azure Cosmos DB for NoSQL with integrated vector search (DiskANN). We’ll cover write/read flows, access control, and practical C# snippets.

Data model (example)

  • PersonalMemory
    • id, userId (PK), kind, value, tags[], vector?, createdAt, updatedAt, ttl
  • TeamMemory
    • id, teamId (PK), title, value, tags[], acl[groupIds], vector?, createdAt
  • CompanyMemory
    • id, tenantId (PK), title, value, tags[], vector?, createdAt

I prefer partitioning by the primary scope (userId/teamId/tenantId). Add an acl string[] when you need group enforcement at retrieval.

Write flow (extract β†’ embed β†’ upsert)

content

facts

text

vector + doc

Create/Replace

Chat

Extract

PII Filter

Embedding Model

Upsert

Cosmos DB

Container with DiskANN vector index

using Microsoft.Azure.Cosmos;
using System.Collections.ObjectModel;

var client = new CosmosClient("https://<account>.documents.azure.com:443/", "<key>");
Database db = await client.CreateDatabaseIfNotExistsAsync("memdb");

var embeddings = new List<Embedding>
{
  new Embedding
  {
    Path = "/vector",
    DataType = VectorDataType.Float32,
    DistanceFunction = DistanceFunction.Cosine,
    Dimensions = 1536
  }
};

var props = new ContainerProperties(id: "personalMemories", partitionKeyPath: "/userId")
{
  VectorEmbeddingPolicy = new(new Collection<Embedding>(embeddings)),
  IndexingPolicy = new IndexingPolicy
  {
    VectorIndexes = { new VectorIndexPath { Path = "/vector", Type = VectorIndexType.DiskANN } }
  },
  DefaultTimeToLive = -1 // enable TTL; set per item
};
props.IndexingPolicy.IncludedPaths.Add(new IncludedPath { Path = "/*" });
props.IndexingPolicy.ExcludedPaths.Add(new ExcludedPath { Path = "/vector/*" });

Container personal = await db.CreateContainerIfNotExistsAsync(props, throughput: 1000);

Upsert a personal memory (with TTL)

var memory = new
{
  id = Guid.NewGuid().ToString("N"),
  userId = userId,
  kind = "preference",
  value = "Prefers concise answers",
  tags = new[] { "style", "preference" },
  vector = embedding, // float[] length must match vector dimensions
  createdAt = DateTimeOffset.UtcNow,
  updatedAt = DateTimeOffset.UtcNow,
  ttl = 60 * 60 * 24 * 180 // expire after ~6 months (seconds)
};

await personal.UpsertItemAsync(memory, new PartitionKey(userId));

Read flow (scope β†’ filter β†’ vector)

App

Resolve Scope: user/team/company

Build Query\nfilters + TOP N

VectorDistance order

Cosmos DB

Results

Optional rerank/recency

Query user memories by similarity

float[] q = queryEmbedding;
var sql = @"SELECT TOP 5 c.id, c.kind, c.value,
                 VectorDistance(c.vector, @q) AS SimilarityScore
          FROM c
          WHERE c.userId = @userId
          ORDER BY VectorDistance(c.vector, @q)";

var qd = new QueryDefinition(sql)
  .WithParameter("@q", q)
  .WithParameter("@userId", userId);

using var it = personal.GetItemQueryIterator<dynamic>(qd, requestOptions: new QueryRequestOptions
{
  PartitionKey = new PartitionKey(userId)
});
while (it.HasMoreResults)
{
  foreach (var item in await it.ReadNextAsync())
  {
    Console.WriteLine($"{item.id} | {item.kind} | score {item.SimilarityScore}");
  }
}

Team memories with ACL filtering

Store acl as a comma‑separated list or an array. With arrays, use ARRAY_CONTAINS for exact matches:

var groups = string.Join(",", allowedGroupIds);
var sql = @"SELECT TOP 5 c.id, c.title, c.value,
                 VectorDistance(c.vector, @q) AS SimilarityScore
          FROM c
          WHERE c.teamId = @teamId
            AND ARRAY_CONTAINS(c.acl, @groupId)
          ORDER BY VectorDistance(c.vector, @q)";

var qd = new QueryDefinition(sql)
  .WithParameter("@q", q)
  .WithParameter("@teamId", teamId)
  .WithParameter("@groupId", currentGroupId);

Tips

  • Always use TOP N in vector queries to control RU and latency.
  • Filter by partition key (userId/teamId/tenantId) first, then order by VectorDistance.
  • For small collections or filter‑heavy queries, QuantizedFlat can be more economical.
  • Keep sensitive values minimal; store references instead of raw secrets.