I’ve started to use Azure Cosmos DB a bit more over the last couple of weeks and I’m really enjoying it. The first real-world scenario that I hit was needing to implement optimistic concurrency. Along the way, I discovered two caching optimizations you can make for better performance when accessing individual documents.
Caching SelfLinks
If you are using the .NET SDK, each document exposes a unique SelfLink property. This is represented by the _self property in the JSON.
SelfLinks are guaranteed to be unique and, most importantly, immutable.
Because the SelfLink is immutable, we can cache it and then use it to access the associated document.
It is more efficient to access the document directly via the SelfLink than to query the collection and filter by Id.
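As a rough sketch of the difference, here are the two access paths side by side. This assumes the Microsoft.Azure.DocumentDB .NET SDK and a single-partition collection; the class name, method names, and the "mydb" / "mycollection" identifiers are placeholders of mine, not anything from a real project.

```csharp
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.Documents.Linq;

public static class SelfLinkExamples
{
    // Slower path: query the collection and filter by Id.
    // "mydb" and "mycollection" are placeholder names (assumption).
    public static async Task<Document> FindByIdAsync(DocumentClient client, string id)
    {
        var collectionUri = UriFactory.CreateDocumentCollectionUri("mydb", "mycollection");
        var query = client
            .CreateDocumentQuery<Document>(collectionUri, new FeedOptions { MaxItemCount = 1 })
            .Where(d => d.Id == id)
            .AsDocumentQuery();

        // Results arrive in pages; stop at the first match.
        while (query.HasMoreResults)
        {
            var page = await query.ExecuteNextAsync<Document>();
            var match = page.FirstOrDefault();
            if (match != null) return match;
        }
        return null;
    }

    // Faster path: read the document directly using a SelfLink
    // that was cached from an earlier read of the same document.
    public static async Task<Document> ReadBySelfLinkAsync(DocumentClient client, string selfLink)
    {
        ResourceResponse<Document> response = await client.ReadDocumentAsync(selfLink);
        return response.Resource;
    }
}
```

On a partitioned collection you would likely also need to pass a partition key through FeedOptions or RequestOptions, which this sketch leaves out.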
ETags
Each document within Azure Cosmos DB also has an ETag property. This is the _etag property in the JSON document, or the ETag property on your Document when you are using the .NET SDK.
The ETag, or entity tag, is part of HTTP, the protocol for the World Wide Web. It is one of several mechanisms that HTTP provides for web cache validation, which allows a client to make conditional requests.
You may be familiar with ETags in relation to caching. A typical scenario is a user makes an HTTP request to the server for a specific resource. The server will return the response along with an ETag in the response header. The client then caches the response along with the associated ETag.
ETag: "686897696a7c876b7e"
If the client then makes another request for the same resource, it will pass an If-None-Match header with the ETag it received.
If-None-Match: "686897696a7c876b7e"
If the resource has not changed and the ETag represents the current version, then the server will return a 304 Not Modified status. If the resource has been modified, it will return the appropriate 200 status code along with the content and a new ETag header.
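To make the exchange concrete, here is a small in-memory simulation of the server side of a conditional GET. It does no networking; the class and its members are hypothetical names for illustration only, and hashing the content is just one of several ways a server might produce an ETag.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class ConditionalGetDemo
{
    // Hypothetical in-memory "server" state for a single resource.
    public static string Content = "hello";

    // An ETag only has to change whenever the content changes;
    // a truncated content hash is one simple way to get that.
    public static string ComputeETag(string content)
    {
        using (var sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(content));
            string hex = BitConverter.ToString(hash, 0, 8).Replace("-", "").ToLowerInvariant();
            return "\"" + hex + "\"";
        }
    }

    // Returns the status code, the current ETag, and the body (null on 304).
    public static (int Status, string ETag, string Body) Get(string ifNoneMatch = null)
    {
        string current = ComputeETag(Content);
        if (ifNoneMatch == current)
            return (304, current, null);   // client's copy is still current
        return (200, current, Content);    // send the content and its ETag
    }
}
```

Calling `ConditionalGetDemo.Get()` with no tag returns status 200 plus an ETag. Passing that ETag back in returns 304 with no body, and once `Content` is changed the same call returns 200 with a new ETag.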
AccessCondition
Azure Cosmos DB uses ETags for caching exactly as you would expect. We can store the ETag when we retrieve our document and then use that ETag when we need to fetch the same document again. We do this by creating an AccessCondition and specifying IfNoneMatch as the AccessConditionType when we call ReadDocumentAsync.
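A minimal sketch of building that condition, assuming the Microsoft.Azure.DocumentDB .NET SDK. The ConditionalRead class and IfChangedSince method are hypothetical helper names, not part of the SDK.

```csharp
using Microsoft.Azure.Documents.Client;

public static class ConditionalRead
{
    // Builds request options that ask the server to return the document
    // only if its current ETag differs from the one we already hold.
    public static RequestOptions IfChangedSince(string etag)
    {
        return new RequestOptions
        {
            AccessCondition = new AccessCondition
            {
                Type = AccessConditionType.IfNoneMatch,
                Condition = etag
            }
        };
    }
}
```

You would then pass the result as the second argument, for example `await client.ReadDocumentAsync(doc.SelfLink, ConditionalRead.IfChangedSince(doc.ETag))`, and check whether the response's StatusCode is HttpStatusCode.NotModified.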
Cache Client
Putting it all together can look something like this. I’m using MemoryCache to store our fetched documents. Since these documents contain the SelfLink, we can make any other request to that document directly. And with the ETag on the document, when we read the document directly we can send an If-None-Match condition so that the server returns a 304 Not Modified if the document has not changed.
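Here is a minimal sketch of what such an extension method might look like. It is not necessarily the exact code behind this post. It assumes the Microsoft.Azure.DocumentDB SDK, IMemoryCache from Microsoft.Extensions.Caching.Memory, and a single-partition collection; the class name, method name, and cache-key scheme are my own placeholders.

```csharp
using System;
using System.Net;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using Microsoft.Extensions.Caching.Memory;

public static class CachedDocumentClientExtensions
{
    // Hypothetical helper. documentUri identifies the document on a cache miss,
    // e.g. UriFactory.CreateDocumentUri(databaseId, collectionId, documentId).
    public static async Task<Document> ReadDocumentCachedAsync(
        this DocumentClient client, IMemoryCache cache, Uri documentUri)
    {
        string key = documentUri.ToString();

        if (!cache.TryGetValue(key, out Document cacheEntry))
        {
            // Cache miss: do a normal read and remember the document,
            // which carries both its SelfLink and its ETag.
            var fresh = await client.ReadDocumentAsync(documentUri);
            cache.Set(key, fresh.Resource);
            return fresh.Resource;
        }

        // Cache hit: ask the server for the document only if it has changed.
        var ac = new AccessCondition
        {
            Type = AccessConditionType.IfNoneMatch,
            Condition = cacheEntry.ETag
        };

        try
        {
            var response = await client.ReadDocumentAsync(
                cacheEntry.SelfLink, new RequestOptions { AccessCondition = ac });

            // 304 Not Modified: the cached copy is still current.
            if (response.StatusCode == HttpStatusCode.NotModified)
                return cacheEntry;

            // The document changed: refresh the cache with the new version.
            cache.Set(key, response.Resource);
            return response.Resource;
        }
        catch (DocumentClientException ex) when (ex.StatusCode == HttpStatusCode.NotFound)
        {
            // The document was deleted after we cached it: drop the stale entry.
            cache.Remove(key);
            throw;
        }
    }
}
```

This sketch sets no expiration on cache entries, so in practice you would probably want to pass expiration options to `cache.Set`.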
Since this is just a simple extension method on the DocumentClient, here are a couple of tests that verify that the document is from the cache when the server returns a 304.
What about caching on .NET instead of .NET Core?
Check out System.Runtime.Caching
Keep in mind that if you have a Document cached and that document is then deleted, this line:
var response = await client.ReadDocumentAsync(cacheEntry.SelfLink, new RequestOptions { AccessCondition = ac });
will throw an exception:
catch (DocumentClientException ex) when (ex.Error.Code == "NotFound")
Thanks! Great observation.
Thanks for posting this article. The company I work for has recently moved to Azure, so we are all learning the ropes of Cosmos DB. I referenced this article in my write-ups for a demo on certain caching techniques.