I’ve always been a fan of a good metaphor. It’s one of the best ways you can get people to understand a complex problem. If you can find something your audience can relate to, you have a significantly higher likelihood of them comprehending what you’re saying.
This is especially true when you’re talking about advanced technical mumbo jumbo like distributed computing. If you’re new to the cloud, you or some of your coworkers might have heard the phrase distributed computing but might not know exactly what it means.
Have no fear. Today we will learn what distributed computing is with the help of a metaphor. A duck-ing good one.
At its simplest, distributed computing is a fancy way of saying “multiple computers working together to act as a single system.” When the workload gets heavy, the computers distribute the work amongst themselves to make things fast and efficient.
With cloud-native applications, you commonly see this in messaging systems. If you add some work to a queue, multiple computers/servers look at the same queue and process it together.
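Here is a minimal sketch of that idea in Python, using threads in a single process as stand-ins for separate servers. The names `run_workers` and `worker_count` and the `job-N` messages are made up for illustration, and a real system would use a message broker rather than an in-memory queue.

```python
import queue
import threading


def run_workers(messages, worker_count):
    """Process every message using several workers that share one queue."""
    work_queue = queue.Queue()
    for message in messages:
        work_queue.put(message)

    processed = []            # (worker_id, message) pairs
    lock = threading.Lock()   # protects the shared results list

    def worker(worker_id):
        while True:
            try:
                message = work_queue.get_nowait()
            except queue.Empty:
                return  # queue is drained, this worker is done
            with lock:
                processed.append((worker_id, message))

    threads = [
        threading.Thread(target=worker, args=(worker_id,))
        for worker_id in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return processed


results = run_workers([f"job-{n}" for n in range(10)], worker_count=3)
```

Which worker ends up with which job can change from run to run. The only thing this sketch promises is that every job gets picked up by some worker.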
Ok, now that we understand what we’re trying to explain, let’s bring the ducks in.
Imagine you are at the park to feed the ducks some bread (tip: do not actually feed ducks bread; it is not the best for their diets). You have your loaf of bread and walk over to a group of ducks.
You throw bread to them one slice at a time and they grab and eat it. You throw the bread out faster and different ducks are able to grab slices of bread while the others are eating.
After a couple of minutes, your entire loaf of bread is gone. Those ducks really tear through bread quickly. If there had only been one duck, it would have taken much longer to get through the loaf. You would have had to wait on the lone duck to finish the slice of bread before tossing her the next one.
This is distributed computing. When a message or unit of work comes in, the servers see it, pick it up, and process the work. The more servers you have, the quicker you get through the work under load.
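To put rough numbers on it, here is a back-of-the-napkin sketch. It assumes an idealized pond: every duck eats at the same speed, there is always a free duck for the next slice, and nobody fights over bread. The function name and the figures (20 slices, 5 seconds per slice) are invented for illustration.

```python
import math


def time_to_finish(slices, ducks, seconds_per_slice):
    """Idealized time to eat a loaf when ducks eat in parallel rounds."""
    rounds = math.ceil(slices / ducks)  # each round, every duck eats one slice
    return rounds * seconds_per_slice


one_duck = time_to_finish(slices=20, ducks=1, seconds_per_slice=5)    # 100 seconds
five_ducks = time_to_finish(slices=20, ducks=5, seconds_per_slice=5)  # 20 seconds
```

Real systems rarely scale this cleanly, because the servers also spend time coordinating with each other.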
On the surface, this sounds great. Throw more servers out there, link them up, and you have a super fast and fault-tolerant system, right?
Weeeeeellll, yes and no.
A large number of servers under heavy load will start to compete with one another for work. Mathias Verraes puts it best:
“There are only two hard problems in distributed systems:
2. Exactly-once delivery
1. Guaranteed order of messages
2. Exactly-once delivery”
Mathias Verraes (@mathiasverraes), August 14, 2015
Why are these difficult problems to solve? Let’s turn back to the ducks.
Exactly-once delivery guarantees that each message will be processed one time and only one time. Sounds like something that should be easy.
The ducks are gathered in a circle, all waiting for you to throw out a slice of bread. You throw a slice and two ducks grab it at the same time. They fight for a minute, but eventually the bread breaks and both ducks end up with part of the same slice. It really was meant for one duck, but you can’t control them. The timing was just right for both of them to grab it at once.
This is why exactly-once delivery is hard in distributed computing. You can’t cheaply make every server check with all the others to see who got what before it starts working. Sometimes, when the timing is just right, two machines will pick up the same message and process it twice.
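Here is a small sketch of what a double pickup can do. It also shows one common way to soften the blow, which is remembering which message IDs have already been handled so a repeat is ignored. That technique is my addition here, not something the duck story covers, and the message IDs, amounts, and function names are all hypothetical.

```python
def apply_naive(deliveries):
    """Add up every delivery, even if the same message shows up twice."""
    total = 0
    for _message_id, amount in deliveries:
        total += amount
    return total


def apply_idempotent(deliveries):
    """Add up each message once by skipping IDs that were already handled."""
    total = 0
    seen_ids = set()
    for message_id, amount in deliveries:
        if message_id in seen_ids:
            continue  # duplicate delivery, already processed
        seen_ids.add(message_id)
        total += amount
    return total


# Message "m1" was grabbed twice, like the slice two ducks fought over.
deliveries = [("m1", 10), ("m2", 5), ("m1", 10)]

naive_total = apply_naive(deliveries)      # 25, "m1" was counted twice
safe_total = apply_idempotent(deliveries)  # 15, "m1" was counted once
```

In a real distributed system the set of seen IDs would have to live somewhere all the servers can reach, which is its own bit of work.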
Guaranteed order is exactly what it sounds like. If I send in messages with the letters A, B, C, and D respectively, I would want A to be worked first, then B, then C, and finally D.
In our duck scenario, we’re back to throwing bread slices one by one. The ducks are loving it, so we throw bread out faster. But there are a couple of ducks that are bigger than the others. They grab the bread and swallow it in one bite while the other ducks might take a few seconds to eat.
A small duck grabs a piece of bread and starts eating it. Then the big duck grabs the next slice and swallows it whole. The small duck finishes her piece of bread a few seconds later. The bread you threw was grabbed in the correct order, but it was finished in a different order.
To a computer, this matters. If each message in a series relies on the previous one being completed first, then you might run into some problems.
You can’t guarantee how fast a machine is going to process a message. Even if all of the servers in your system had the exact same specifications, something environmental or network related could make one of them finish later than the others. This is what makes guaranteed order difficult in distributed systems.
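Here is a tiny simulation of the A, B, C, D example. It assumes a free server picks up each message the moment it arrives, and the arrival times and processing durations are invented numbers. Nothing here sleeps or uses threads. It just works out when each message would finish.

```python
def finish_order(messages):
    """Return labels in the order they finish.

    Each message is a (label, start_time, duration) tuple.
    """
    finished = sorted(messages, key=lambda message: message[1] + message[2])
    return [label for label, _start, _duration in finished]


# Picked up in order A, B, C, D, one second apart.
messages = [
    ("A", 0, 5),  # small duck, slow eater: finishes at 5
    ("B", 1, 1),  # big duck, one gulp: finishes at 2
    ("C", 2, 4),  # finishes at 6
    ("D", 3, 1),  # finishes at 4
]

picked_up = [label for label, _start, _duration in messages]  # A, B, C, D
completed = finish_order(messages)                            # B, D, A, C
```

The messages were grabbed in the right order, but anything that depends on A being done before B is in for a surprise.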
Distributed computing is an incredible way to achieve performance at scale. If you use a cloud-native technology like serverless, the complexities of managing hundreds or thousands of machines are abstracted away from you. You get the benefits of scale and performance without the hassle of worrying about keeping your machines up to date and in sync.
If you aren’t already, encourage your team to consider implementing a distributed system for your software. Your customers will thank you. Hopefully being able to describe what distributed computing is in a fun and easy way will help you build a shared understanding and make your team more willing to tackle it.
Good luck and remember to always have fun with it!