
Using Cloudflare Durable Objects

2025-04-03

In the previous post I explored using Cloudflare KV for managing blog content. While I didn’t think the eventually consistent model would be a problem, it turned out that list and get were not consistent, which led to errors for a full minute after new content was published. This did have a familiar feeling - I mention in that post that I had seen framework adapters for Cloudflare Pages use KV for storing content, and I remember similar buggy behaviour following each deploy.

Let’s explore using Durable Objects instead…


Durable Objects vs D1

I tried implementing this in both D1 and Durable Objects. D1 has the following advantages:

  • Point in time rollback
  • Web UI for viewing content
  • Remote access to the DB via HTTP (for things like Drizzle Kit/Studio)

Each of these is nice to have, but none is impossible to work around. We lose all of that using the (beta) SQLite in Durable Objects. But we gain:

  • No manual steps required to deploy a new DO - just add a couple of lines to your wrangler config
  • Sharding for multi-tenant architecture is built-in
  • JS is guaranteed to execute alongside the database

Each of these is a huge win in my book. I’ll touch on each of them below.
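On the first point: deploying a new DO really is just configuration. Here is a sketch of the relevant entries (shown as wrangler.toml; the binding and class names match the DurableDatabase class below - adjust to your setup):

```toml
# Bind the Durable Object class so env.DurableDatabase is available
[[durable_objects.bindings]]
name = "DurableDatabase"
class_name = "DurableDatabase"

# Opt the class into the SQLite-backed storage
[[migrations]]
tag = "v1"
new_sqlite_classes = ["DurableDatabase"]
```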

Leakproof Abstraction

A Durable Object looks something like this:

/// <reference types="@cloudflare/workers-types" />
import { DurableObject } from "cloudflare:workers";
import { drizzle, DrizzleSqliteDODatabase } from "drizzle-orm/durable-sqlite";
import { desc, eq } from "drizzle-orm";
import { posts } from "./schema";
import { migrate } from "./migrations";

// Throw if a query does not return exactly one row
const takeUniqueOrThrow = <T>(values: T[]): T => {
  if (values.length !== 1) throw new Error("Expected exactly one row");
  return values[0];
};

export default class DurableDatabase extends DurableObject {
  storage: DurableObjectStorage;
  db: DrizzleSqliteDODatabase;

  static getDefault(env: Env) {
    const id = env.DurableDatabase.idFromName("default");
    const stub = env.DurableDatabase.get(id);
    return stub;
  }

  constructor(ctx: DurableObjectState, env: Env) {
    super(ctx, env);
    this.storage = ctx.storage;
    this.db = drizzle(this.storage, { logger: false });
    ctx.blockConcurrencyWhile(async () => {
      await migrate(ctx.storage.sql);
    });
  }

  async insert(post: typeof posts.$inferInsert) {
    await this.db.insert(posts).values(post);
  }

  async update(post: typeof posts.$inferInsert) {
    await this.db
      .update(posts)
      .set(post)
      .where(eq(posts.slug, post.slug))
      .execute();
  }

  async list(showAll: boolean = false) {
    if (showAll) {
      return this.db.select().from(posts).orderBy(desc(posts.date)).all();
    } else {
      return this.db
        .select()
        .from(posts)
        .where(eq(posts.status, "published"))
        .orderBy(desc(posts.date))
        .all();
    }
  }

  async get(slug: string) {
    const post = this.db
      .select()
      .from(posts)
      .where(eq(posts.slug, slug))
      .then(takeUniqueOrThrow);
    return post;
  }
}

All instance methods of a class are guaranteed to run alongside the SQLite database:

When a Durable Object uses SQLite, SQLite is invoked as a library. This means the database code runs not just on the same machine as the DO, not just in the same process, but in the very same thread. Latency is effectively zero, because there is no communication barrier between the application and SQLite. A query can complete in microseconds. - link.

There is just no way to write code that makes a high-latency query to the database. This gives a leakproof abstraction. All database access code lives inside the class. Whether you use Drizzle ORM like I did, or write raw SQL, nothing outside of the DO needs to know.
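To illustrate what that boundary looks like from the outside, here's a sketch of a worker route consuming the stub. The `BlogDb` interface and `handleIndex` helper are hypothetical stand-ins (the real caller would use `DurableDatabase.getDefault(env)`), included only so the snippet is self-contained:

```typescript
// Hypothetical interface mirroring the DO's public methods; in the real
// worker this would be the RPC stub from DurableDatabase.getDefault(env).
interface BlogDb {
  list(showAll?: boolean): Promise<{ slug: string; title: string }[]>;
}

// A route handler only ever sees method calls - no SQL leaks out of the DO
async function handleIndex(db: BlogDb): Promise<string[]> {
  const posts = await db.list();
  return posts.map((p) => p.slug);
}
```

In the real worker, `handleIndex(DurableDatabase.getDefault(env))` would render the index from those slugs.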

In progress - more to come soon

Managing content in Cloudflare KV

2025-03-27

The easiest way of managing content for a developer blog is probably just Markdown files living in the repo. Frameworks like Astro come with support for this built-in, but it’s trivial to do in any framework using Vite for the build process, with the built-in glob import:

const posts = import.meta.glob('./posts/*.md');

Which generates something that you can iterate over:

{
  './posts/post1.md': () => import('./posts/post1.md'),
  './posts/post2.md': () => import('./posts/post2.md'),
  // ...
}
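Turning that map into index data is just key manipulation - a sketch (the path pattern matches the glob above; `slugsFromGlobKeys` is a name of my own invention):

```typescript
// Derive post slugs from the keys produced by import.meta.glob
function slugsFromGlobKeys(keys: string[]): string[] {
  return keys
    .map((key) => key.replace(/^\.\/posts\//, "").replace(/\.md$/, ""))
    .sort();
}
```

For the map above, `slugsFromGlobKeys(Object.keys(posts))` yields `["post1", "post2"]`.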

I’m exploring an approach on Cloudflare, and trying to avoid any build step (except for wrangler’s built-in esbuild). It’s easy to include Markdown files using rules and then import them directly in your worker, but it’s not easy to list all the files for the index.

So instead of storing Markdown in the repo, I’m experimenting with using Cloudflare KV. I’m pretty sure many of the framework adapters for Workers used KV to store content before Cloudflare Pages and then Workers Assets came along, so it seems like a pretty standard option for that kind of thing.


Of course, since I can no longer just edit Markdown files locally, I’ll need to build a simple admin site to edit them. This is where something like Django admin shines, but LLMs make it easy to generate that sort of thing, and they will only get better. I’ll work on building this manually, so that I can use it as a reference for AI tools in the future.

KV API

Here are our types:

type Post = {
  slug: string;
  content: string;
  metadata: PostMetadata;
};

type PostMetadata = {
  title: string;
  status: "draft" | "unlisted" | "published";
  date: string;
};

For creating and updating, we create two different functions so that we don’t accidentally overwrite a previous post if we reuse a slug:

  async addPost(post: Post): Promise<Post> {
    if (await this.kv.get(post.slug)) {
      throw new Error("Post slug already exists");
    }
    await this.kv.put(post.slug, post.content, { metadata: post.metadata });
    return post;
  }

  async updatePost(post: Post): Promise<Post> {
    if (await this.kv.get(post.slug)) {
      await this.kv.put(post.slug, post.content, { metadata: post.metadata });
      return post;
    } else {
      throw new Error("Post not found");
    }
  }

Then we just need a way of getting and listing posts:

  async getPost(slug: string): Promise<Post> {
    const kvResult = await this.kv.getWithMetadata(slug);
    // getWithMetadata returns { value: null, metadata: null } for a
    // missing key, so check the fields rather than the result itself
    const metadata = kvResult.metadata as PostMetadata | null;
    if (!metadata) {
      throw new Error("Post metadata not found");
    }

    if (!kvResult.value) {
      throw new Error("Post content not found");
    }

    const post = {
      slug: slug,
      metadata: metadata,
      content: kvResult.value,
    };
    return post;
  }

  async listPosts(): Promise<{ slug: string; metadata: PostMetadata }[]> {
    const { keys } = await this.kv.list();
    const posts = keys.flatMap((key) => {
      const metadata = key.metadata as PostMetadata | null;
      if (!metadata) {
        console.error(`No metadata found for ${key.name}`);
        return [];
      }
      return [
        {
          slug: key.name,
          metadata: metadata,
        },
      ];
    });
    // Sort by date, newest first
    posts.sort((a, b) => b.metadata.date.localeCompare(a.metadata.date));
    return posts;
  }
}

That should be all we need until we want to introduce tagging and searching.

Eventual Consistency

One gotcha with KV is the eventually consistent model. Will that cause problems? Let’s look at some concrete cases:

put("key1", "new content")

Followed by several calls to get might return:

get("key1")
    -> "old content"
get("key1")
    -> "old content"
get("key1")
    -> "new content"

So it might take a few seconds (or up to 60) for everyone to see the new content. Not a big deal. But what about this:

list()
    -> ["key1"]

put("key2", "more new content")

list()
    -> ["key1", "key2"]

get("key2")
    -> Error("Post content not found")
get("key2")
    -> Error("Post content not found")
get("key2")
    -> "more new content"

Perhaps unexpectedly, list and get return inconsistent results. It’s easy to imagine this causing a bug where you click on a link to view the post, but it errors out. But is that theoretical and rare, or pretty common in practice?

Well, I tried it, and it happens every time: calls to the list API return updated data long before calls to the get API. This means that when a post is published to the blog, it appears on the home page list almost immediately, but for a full minute the link returns a 404 page. A possible solution would be to enforce an order and a delay between the different statuses:

  status: "draft" | "unlisted" | "published";

If we change from “draft” to “unlisted”, the page becomes available via direct link, but is not listed on the index. We then bump it to “published” a minute later. This could be automated via Workflows.
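That status-to-visibility mapping is simple enough to sketch as a pure function (the helper and its shape are my own, not from the actual codebase):

```typescript
type Status = "draft" | "unlisted" | "published";

// Which surfaces can see a post in each status
function visibility(status: Status): { directLink: boolean; listed: boolean } {
  return {
    directLink: status !== "draft", // unlisted and published resolve by slug
    listed: status === "published", // only published appears on the index
  };
}
```

Publishing then becomes a two-step flow: write as “unlisted”, wait for get to converge (up to 60 seconds), then flip to “published”.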

There are also some problems with the admin UI. Eventual consistency might be fine for a blog, but it’s not suitable for the editing experience. If the save button triggers a page reload, you will be shown an older version of the content, and have to keep refreshing for up to a full minute before you see your changes. I have a feeling that Durable Objects will be the answer here - since the admin site doesn’t need fast global access, it can write to a DO, which can then store the content in KV for the main site to read.

More to come on these solutions in Part 2.

D1 does not have read replication (yet)

2025-02-24

D1 is Cloudflare’s main relational database offering. But a year after GA, it still does not have read replication. They are promising it…


D1’s read replication will automatically deploy read replicas as needed to get data closer to your users: and without you having to spin up, manage scaling, or run into consistency (replication lag) issues. 2024-04-01

Automatic read replication: our new storage subsystem is built with replication in mind, and we’re working on ensuring our replication layer is both fast & reliable before we roll it out to developers…when we enable global read replication, you won’t have to pay extra for it, nor will replication multiply your storage consumption…We think built-in, automatic replication is important… 2023-05-19

Unfortunately, these promises have been misinterpreted in some cases. From the Prisma docs:

Cloudflare’s principles of geographic distribution and bringing compute and data closer to application users, D1 supports automatic read-replication. It dynamically manages the number of database instances and locations of read-only replicas based on how many queries a database is getting, and from where. For write-operations, queries travel to a single primary instance in order to propagate the changes to all read-replicas and ensure data consistency. - Prisma docs

I love the Syntax podcast, but in today’s episode it sounds like they have misread this too.

With Cloudflare’s developer week next week (hopefully, the 2024 schedule is still up), here’s hoping it is finally released.

Update 10th April 2025 - it is finally here and it is really, really cool.