The Complexity of Caching

Caching ain’t easy! There are many factors that add to the complexity of caching. My general recommendation is to avoid caching if you can.

However, caching can bring the performance and scalability you might need. If you’re starting to use a cache in your system, here are some things to think about. Adding a cache isn’t trivial: it requires some thought about caching strategies, how to invalidate, and how to fall back to your database. Caching can improve performance and scalability, but it can also bring your entire system down if the cache fails.

YouTube

Check out my YouTube channel where I post all kinds of content that accompanies my posts including this video showing everything that is in this post.

Strategy

The first thing to think about is the caching strategy. The two most common methods that I’ve noticed in code-bases are the write-through and cache aside methods.

The write-through method is when your application writes to its primary database, and then immediately updates the cached value. Meaning if you add a new record to your database, you immediately add the equivalent value to the cache. If you were to update a record, you would immediately update the cached value.
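
Here’s a minimal sketch of write-through in Python. The two dictionaries standing in for the primary database and the cache, and the save_customer function, are assumptions made up for illustration; a real system would call its database and cache clients instead.

```python
# Hypothetical in-memory stand-ins for a primary database and a cache.
database = {}
cache = {}


def save_customer(customer_id, customer):
    """Write-through: persist to the database, then update the cache right away."""
    database[customer_id] = dict(customer)  # 1. write to the primary database
    cache[customer_id] = dict(customer)     # 2. immediately update the cached value


save_customer("42", {"name": "Alice"})        # adding a record also adds it to the cache
save_customer("42", {"name": "Alice Smith"})  # updating a record also updates the cache
```

After both calls, the cache and the database hold the same value for customer "42", so a read never has to wait for the cache to catch up.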

The second method most often used is the Cache Aside (Lazy Loading) method. This can be used in conjunction with Write-Through or can be used by itself.

When the application needs something from cache, it first tries to retrieve it. If it does not exist (cache miss), it will then hit the primary database. Then it will write the value to the cache. Essentially you’re lazy loading the cache when data is requested for values that are not in the cache.
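
The three steps above can be sketched roughly like this in Python. The dictionaries and the get_customer function are hypothetical stand-ins for a real database and cache client.

```python
database = {"42": {"name": "Alice"}}  # hypothetical primary database
cache = {}                            # hypothetical cache, empty to start


def get_customer(customer_id):
    """Cache aside: try the cache first, go to the database on a miss."""
    cached = cache.get(customer_id)       # 1. try to retrieve from the cache
    if cached is not None:
        return cached                     # cache hit
    customer = database.get(customer_id)  # 2. cache miss: hit the primary database
    if customer is not None:
        cache[customer_id] = customer     # 3. write the value to the cache
    return customer
```

The first call to get_customer("42") is a miss and populates the cache; later calls for the same id are served from the cache without touching the database.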

Invalidating the Cache

There are only two hard things in Computer Science: cache invalidation and naming things. -Phil Karlton

https://www.karlton.org/2017/12/naming-things-hard/

If you’re not using the write-through method, then that means your cache is stale when data gets updated in your primary database.

There are a couple of methods I’ve used to invalidate the cache (remove the value from cache) and let the cache aside (lazy loading) method do its job.

Cache Expiry (TTL)

Most caches have the ability to expire a cached value after a period of time (time to live). Once the cache has expired an item, the next call for that item is a cache miss and has to go through the three steps of the lazy loading method to re-populate the cache.
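
To show the idea, here’s a small in-memory sketch of expiry in Python. The TtlCache class and its clock parameter are assumptions for illustration; real caches typically handle expiry for you when you set a TTL on a key.

```python
import time


class TtlCache:
    """A tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock   # injectable so expiry can be exercised without sleeping
        self._entries = {}    # key -> (expires_at, value)

    def set(self, key, value):
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None       # never cached: cache miss
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]
            return None       # expired: behaves like a cache miss
        return value
```

When get returns None for an expired key, the caller goes back to the database and re-populates the cache, exactly as it would on any other miss.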

Async Messaging

The second method is using asynchronous messaging to notify another process that data has changed and to invalidate the cache.

This requires you to already be using messaging (events) and to have a well-defined API where data is mutated in your system. If you have any external system modifying data within your database, you will not be able to emit an event every time data is changed.
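
As a rough sketch of the idea in Python: the queue below stands in for a message broker, and the event shape, cache keys, and function names are all hypothetical. In a real system the consumer would run in a separate process and receive events from your broker.

```python
import queue

cache = {"customer:42": {"name": "Alice"}}  # hypothetical cache contents
events = queue.Queue()                      # stand-in for a message broker


def update_customer(database, customer_id, customer):
    """The one place customers are mutated; it also publishes a change event."""
    database[customer_id] = customer
    events.put({"type": "CustomerChanged", "customer_id": customer_id})


def process_pending_events():
    """Consumer side: invalidate the cached value for every change event."""
    while not events.empty():
        event = events.get()
        if event["type"] == "CustomerChanged":
            # Remove the stale value; lazy loading re-populates it on the next read.
            cache.pop("customer:" + event["customer_id"], None)
```

Note that the cache stays stale between the write and the moment the event is processed, which is the trade-off of invalidating asynchronously.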

If you’re using something like Entity Framework, you could override SaveChangesAsync and look at the ChangeTracker to determine which entities have changed, then publish events for them.

Failures

One benefit of the Cache Aside (Lazy Loading) method is that if, for whatever reason, you cannot reach the cache, you can fall back to using your database. This would work exactly like a cache miss. You would need to handle the appropriate exceptions and timeouts from the cache client to determine that the cache is unavailable, and then go directly to the database and return the value.
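
Here’s a minimal sketch of that fallback in Python. CacheUnavailableError and FailingCache are made-up names for illustration; a real cache client raises its own exception types for timeouts and connection failures, and those are what you would catch.

```python
class CacheUnavailableError(Exception):
    """Hypothetical error a cache client might raise on a timeout or lost connection."""


class FailingCache:
    """Stand-in for a cache client whose server cannot be reached."""

    def get(self, key):
        raise CacheUnavailableError("cache timed out")

    def set(self, key, value):
        raise CacheUnavailableError("cache timed out")


def get_customer(cache, database, customer_id):
    try:
        cached = cache.get(customer_id)
        if cached is not None:
            return cached
    except CacheUnavailableError:
        # Treat an unreachable cache exactly like a cache miss.
        return database.get(customer_id)

    customer = database.get(customer_id)
    if customer is not None:
        try:
            cache.set(customer_id, customer)
        except CacheUnavailableError:
            pass  # the cache went away after the read; still return the value
    return customer
```

With the failing cache, get_customer(FailingCache(), {"42": {"name": "Alice"}}, "42") still returns the customer, served straight from the database.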

The one thing to be very aware of is that if your cache is unavailable, all requests are now going to be fulfilled by the database. This could have a significant performance impact on your primary database: every request that is normally handled by your cache now adds extra load to your database.

Complexity of Caching

The complexity of caching isn’t trivial.

Avoid caching if you can.

First, look at the queries to your primary database before going down the path of adding a cache. There are many more complexities that you introduce when adding a cache. Avoid it if you can.

Follow @CodeOpinion on Twitter

Enjoy this post? Subscribe!

Subscribe to our weekly Newsletter and stay tuned.

Links

Talking C# Performance with Steve Gordon

We’re talking C# Performance! I sat down and chatted with Steve Gordon to talk about writing performant C# code. When should you be concerned about performance? Does your application code need to be performant? At what cost to readability? How do you measure and test that your changes are useful? Steve and I cover all of this in this video.

When to think about optimizing?

As Steve mentions, it’s easy to get really deep into writing performant code. It’s fun! However, it’s probably a good idea to have a discussion with stakeholders to figure out how much performance matters.

This answer can vary wildly.

If you’re writing framework or library code, I think performance matters. As an application developer, I want framework or library code to get out of my way. Meaning it has as little overhead as possible.

When writing application code, this goes back to figuring out where performance matters.

Once you do want to get deep into performance, here are some resources and blog posts that I think would be useful.

BenchmarkDotNet

BenchmarkDotNet helps you to transform methods into benchmarks, track their performance, and share reproducible measurement experiments.

dotMemory

dotMemory allows you to analyze memory usage in a variety of .NET and .NET Core applications: desktop applications, Windows services, ASP.NET web applications, IIS, IIS Express, arbitrary .NET processes, and more.

CQRS Myths: 3 Most Common Misconceptions

Although Command Query Responsibility Segregation (CQRS) seems to be a term a lot of developers are aware of, I do think the majority have the wrong definition. Like many terms in the software development industry, things over time get confused, and then those confusing ideas propagate. These are the 3 CQRS Myths I see or hear the most often.

CQRS Myths

I do understand why people can be confused by CQRS: if you do a search, you’re bound to find blog posts that explain CQRS very differently. Some posts get it right; however, a lot of others conflate CQRS with other concepts.

So which blog posts are right? Who has the correct definition? How do you even know what I’m writing is correct?

CQRS Definition

Greg Young, who coined the term, used to blog on CodeBetter.com and his own personal site, goodenoughsoftware.net.

Unfortunately, neither of those sites exists anymore. However, thanks to archive.org, we can go back to a blog post that Greg wrote in 2010.

Example

Instead of having a single object that has both write & read operations, you would split that into two separate objects. For example:

CustomerService has methods that return state (read) and mutate state (write). In order to apply CQRS, we simply need to split these up into separate objects that will do one or the other, but not both.
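
As a rough illustration of that split in Python (this is my own sketch, not Greg’s code; the class names, method names, and the shared dictionary are hypothetical):

```python
class CustomerCommands:
    """Write side: methods mutate state and return nothing."""

    def __init__(self, customers):
        self._customers = customers

    def add_customer(self, customer_id, name):
        self._customers[customer_id] = {"name": name}

    def rename_customer(self, customer_id, new_name):
        self._customers[customer_id]["name"] = new_name


class CustomerQueries:
    """Read side: methods return state and never change it."""

    def __init__(self, customers):
        self._customers = customers

    def get_customer(self, customer_id):
        customer = self._customers.get(customer_id)
        # Return a copy so callers cannot mutate state through the read side.
        return dict(customer) if customer is not None else None

    def list_names(self):
        return sorted(c["name"] for c in self._customers.values())


customers = {}  # one shared store backs both objects
commands = CustomerCommands(customers)
queries = CustomerQueries(customers)
```

Both objects work against the same store and everything is a plain synchronous method call; the only thing that changed is that reads and writes live on separate objects.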

That’s it. CQRS applied.

So why the confusion?

I assume most of the confusion is because most examples showing CQRS are also showing a bunch of other things, which is where these myths come in.

[Diagram: CQRS Myths]

There are so many blog posts that show a diagram like the one above: they apply CQRS, but they are also doing other things.

Myth #1: Multiple Databases

CQRS does not require you to have multiple databases. You can use the same database for writes (commands) as you can for reads (queries).

It’s not a physical separation but more of a logical one. If you’re using a relational database, you could create SQL Views that only your query side accesses.
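
Here’s a small sketch of that logical separation using Python’s built-in sqlite3 module. The table, view, and function names are assumptions for illustration: commands write to the customers table, while queries only read from the customer_list view, all in the same database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # one database for both sides
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, name TEXT, email TEXT, is_deleted INTEGER DEFAULT 0)"
)
# The query side only ever reads from this view.
conn.execute(
    "CREATE VIEW customer_list AS "
    "SELECT id, name FROM customers WHERE is_deleted = 0"
)


def add_customer(name, email):
    """Command: writes to the underlying table."""
    conn.execute("INSERT INTO customers (name, email) VALUES (?, ?)", (name, email))
    conn.commit()


def list_customers():
    """Query: reads through the view, never the table directly."""
    return conn.execute("SELECT id, name FROM customer_list ORDER BY id").fetchall()
```

The view gives the query side its own shape of the data (here, only id and name for non-deleted customers) without introducing a second database.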

Myth #2: Event Sourcing

Event Sourcing is the concept of storing immutable events derived from actions within your system that represent state change. Greg Young has been at the forefront of Event Sourcing, and since he coined the term CQRS, I can see how Event Sourcing often gets lumped into it.
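
To make that definition a bit more concrete, here’s a tiny Python sketch. The event types and the bank balance example are made up for illustration. State is never stored directly; it is derived by replaying the immutable events.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen makes each event immutable once created
class Deposited:
    amount: int


@dataclass(frozen=True)
class Withdrawn:
    amount: int


def current_balance(events):
    """Derive the current state by replaying every event in order."""
    balance = 0
    for event in events:
        if isinstance(event, Deposited):
            balance += event.amount
        elif isinstance(event, Withdrawn):
            balance -= event.amount
    return balance


event_log = []  # append-only: events are added, never updated or deleted
event_log.append(Deposited(100))
event_log.append(Withdrawn(30))
```

Nothing in this sketch separates reads from writes, which is the point: Event Sourcing and CQRS are independent ideas.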

Most often, when people are referring to both CQRS and Event Sourcing, they will use the label “CQRS/ES”. I think that’s a good idea, as I hope it prevents more confusion.

You do not need to be Event Sourcing in order to be doing CQRS.

Myth #3: Asynchronous Messaging

You do not need to be using a message bus to apply CQRS. Commands do not need to be asynchronous. Everything can be done in a synchronous request/response manner.

I think this myth stems from people using a message bus along with Event Sourcing so they can create projections for a separate read model.

More

Greg wrote on his blog in 2012 about common misconceptions that I don’t think have left us yet. Hopefully this post, by shining a light on his original posts, helps clear things up.
