Tailing the CosmosDB Change Feed

This will hopefully be a short post about how to listen to changes from your CosmosDB SQL API using the Node SDK. For some reasons, it reminds me of the oplog tailing mechanism in MongoDB that Meteor utilized in the past to achieve real-time, hence this article's title.

I do not have much time right now, but let us talk about a bit of my use case for this. I am currently working on an IoT project that uses several Azure technologies as its infrastructure. As you might have guessed, it pipes data from the IoT sensors into Azure Event Hubs. The data then gets ingested in Azure Databricks; ultimately ending up in an intermediate data warehouse (for dashboards and further analysis) and a NoSQL application data store (coming up next).

This NoSQL application data store is the CosmosDB instance that we are going to tail. The objective was to push changes from this data store to a mobile application in real-time or at least real-time (vs long polling or querying in set intervals). To make the long story short, I ended up tailing the CosmosDB's Change Feed in a GraphQL service application to make it easier for the client application to implement a PubSub system. More about this in an older post that I published early this year.

Before we dig into the code, let me just say that we are not going to go through how to initialize your CosmosDB collection or how to write a facade / repository class in NodeJS. We will go straight to the Change Feed part but first, make sure you have the correct Node SDK in your project:

npm install --save @azure/cosmos

Once you have set up your CosmosDB boilerplate code in Node, you can access the change feed iterator using the code below:

const changeFeed = container.items.readChangeFeed(partitionKey, options);

If you need to brush up on iterators in JavaScript, you can check this post that I have written last year).

And that's it! You now have a change feed iterable where you can listen to changes from CosmosDB, but let us see what options we have in doing so. The Node SDK currently gives us four ways of listening to the change feed and they are:

1. Start from now

const options = {};

2. Start from a continuation

const { headers } = await container.items.create({ ... });
const lsn = headers["lsn"];
const options = { continuation: lsn };
// I have not used this specific method yet, this example is from here.

3. Start from a specific point in time

const options = { startTime: new Date() }; // any Date object will do

4. Start from the beginning

const options = { startFromBeginning: true };

As all of these are self-explanatory, we will go ahead and use our changeFeed iterator. To retrieve the next value, we can use await changeFeed.executeNext() or we can loop through the next values like this:

while (changeFeed.hasMoreResults) {
  const { result } = await changeFeed.executeNext();
  // do what you want with the result / changes
}

Reading the source code of the Node SDK revealead that it is also exposing a real generator (the function signature being public async *getAsyncIterator(): AsyncIterable<ChangeFeedResponse<Array<T & Resource>>>). This would have allowed a more elegant for of construct, but unfortunately I bumped into a few issues regarding Symbols when I tried it. If you have used it in the past, please feel free to share in the comments!

That will be all for now, and I hope you learned something in this post.