The serverless community talks about latency a lot.
We talk about how service-to-service connection times improve or regress over time. We talk about optimization strategies for Lambda. We debate the best DynamoDB data modeling practices for performance.
We talk about it for good reason - it directly relates to cost! The longer our Lambda functions are running, the more we get charged on our monthly bill. It’s a core cloud concept that we’re all familiar with at this point.
But what about location? The geographic distance between where your users are and where your code is running makes an impact on performance. We don’t talk about that as much as we should.
Sure, we talk about adding CloudFront distributions in front of your site or running some compute with Lambda@Edge, but there’s so much more to it than that - especially when it comes to 3rd party packages! If your Lambda function is running in eu-west-1 but the 3rd party API you’re using runs out of California, your traffic has to cover a significant distance before you get a response!
The laws of physics apply with data transfer. As much as I’d like to believe the internet is magic and things happen in an instant, that is unfortunately not the case.
Did you know you can (kind of) calculate latency based on physical distance?
Optimistically, data can travel at the speed of light. Realistically, it’s not going to do that. It has a travel medium, like a fiber-optic cable or a copper wire, which slows the travel time down. Also, with very few exceptions, data is not going to travel point to point. The API request you make from your machine will not jump directly to the server running your code. It will take multiple hops on its way to the destination.
In fact, to get from my home to an API that lives in AWS region us-east-1, there were 18 hops!
Because of this, we can’t calculate latency strictly at the speed of light. We have to slow it down to account for the transmission medium - light in fiber travels at roughly two-thirds of its speed in a vacuum - and for the indirect path the data takes. In general, we can estimate that data will travel at about 200,000,000 meters per second. Which is still fast, but not negligible!
To put a relatable number on this time, New York City is approximately 16,000 km away from Sydney, Australia. A round trip between these two cities would take about 160ms of pure travel time. Once you add in latency for network hopping and queueing, you could reasonably expect that request to take about 300ms round-trip! And that’s just travel time - you still have to process the request and return a response.
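That back-of-the-napkin math fits in a couple of lines of Node.js. The 200,000 km/s figure is the estimate from above:

```javascript
// Best-case round-trip propagation delay from geographic distance.
// Assumes data travels at ~200,000 km/s through fiber/copper
// (roughly two-thirds the speed of light in a vacuum).
const PROPAGATION_SPEED_KM_PER_MS = 200; // 200,000 km/s == 200 km/ms

function minRoundTripMs(distanceKm) {
  // Multiply by 2 because a request has to travel there AND back.
  return (2 * distanceKm) / PROPAGATION_SPEED_KM_PER_MS;
}

console.log(minRoundTripMs(16000)); // NYC -> Sydney: 160 (ms)
console.log(minRoundTripMs(2000));  // Dallas -> us-east-1: 20 (ms)
```

Remember this is a floor, not an estimate of real latency - hops, queueing, and processing time all stack on top of it.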
There are an infinite number of variables that affect network latency. Calculating the fastest possible time due to the proximity of the compute resource is a good way to show you the best-case scenario. You know that due to the laws of physics, it can never be faster than that.
I experimented to see what would happen if you have a Lambda function that calls out to a third-party API in different geographic locations. My API lives in us-east-1, which is based primarily out of Northern Virginia.
I live in Dallas, Texas, which is about 2,000 km away from the data centers in us-east-1. Given our formula from above, I’d expect about 20ms of round-trip travel time, and closer to 40ms once network hops and queueing are added in. But what happens when the function in us-east-1 connects somewhere geographically distant?
To run the experiment, I used the Node.js Momento SDK to get and set color-hex combinations in a cache. The Momento SDK allows you to provide an auth token targeted at a specific region, so I ran a set of 1,000 API calls in the following regions.
Given the distance of each region, I’d expect the latencies to be considerably different when hitting the same endpoint from my machine in Dallas.
This was using the same API endpoint with the only difference being the region the Momento SDK was connected to for data operations. It shows strong evidence that distance has a tangible impact on latency.
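The measurement side of the experiment boils down to a harness like the sketch below. Here `callApi` is a stand-in for whatever round trip you’re timing (in my case, a get/set pair through the Momento SDK), and the percentile math uses a simple nearest-rank approximation:

```javascript
// Run `iterations` sequential calls and record each duration in milliseconds.
// `callApi` is a hypothetical stand-in for the API call under test.
async function measureLatencies(callApi, iterations = 1000) {
  const samples = [];
  for (let i = 0; i < iterations; i++) {
    const start = process.hrtime.bigint();
    await callApi();
    samples.push(Number(process.hrtime.bigint() - start) / 1e6); // ns -> ms
  }
  return samples;
}

// Nearest-rank percentile: p is in the range (0, 100].
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const index = Math.max(0, Math.ceil((p / 100) * sorted.length) - 1);
  return sorted[index];
}
```

With the samples in hand, comparing `percentile(samples, 50)` and `percentile(samples, 99)` across regions makes the distance penalty easy to see.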
In-region is obviously going to be the fastest since the data doesn’t have to travel nearly as far. A curious observation is that the latency is higher going to asia-northeast1 than it is to asia-south1, despite asia-northeast1 being geographically closer. It could be due to the number of network hops or the quality of the cables the data is being transferred across. Whatever it is, it’s a reminder that there are dozens of factors that go into network latency.
Bringing your app closer to your customers is only a piece of the puzzle. When you use something like a CDN you bring the content of your user interface geographically closer to your end users. Since I live in Dallas, TX, content behind Amazon CloudFront would be delivered to me from the Dallas / Fort Worth edge location. This proximity means webpages could be served to me with < 1ms of latency due to geographic distance.
But what about the data? APIs and back-end processing don’t live in these edge locations. You can add Lambda@Edge, which offers geographically distributed “middle layer” functions, but it’s not intended to run your core business logic. So what can you do?
For starters, you could deploy your application in multiple regions. Whether you’re in AWS, GCP, Azure, or any other cloud vendor, you have a set of geographically diverse regions to run your code. Use IP-based routing in your DNS layer to route requests to the closest region to your caller.
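With Amazon Route 53, that routing could look something like the change batch below. Everything here is a placeholder - the domain, the per-region endpoints, and the TTLs are illustrative, not prescriptive - and it uses latency-based records rather than pure IP-based routing, which is one of the policies Route 53 offers for this:

```json
{
  "Comment": "Hypothetical latency-based routing to two regional deployments",
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "api.example.com",
        "Type": "CNAME",
        "SetIdentifier": "us-east-1",
        "Region": "us-east-1",
        "TTL": 60,
        "ResourceRecords": [{ "Value": "api-us-east-1.example.com" }]
      }
    },
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "api.example.com",
        "Type": "CNAME",
        "SetIdentifier": "eu-west-1",
        "Region": "eu-west-1",
        "TTL": 60,
        "ResourceRecords": [{ "Value": "api-eu-west-1.example.com" }]
      }
    }
  ]
}
```

Applied with `aws route53 change-resource-record-sets`, records like these let DNS answer each caller with the region that has the lowest measured latency to them.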
But again, that’s only part of the problem. In our example above, the 3rd party package ran in completely separate regions around the world.
It’s your responsibility as a builder to know where your code is running.
When assessing a 3rd party package, figure out whether you can tune its client to run closer to the rest of your code. This might not be obvious during implementation, but it’s something to look out for as you do performance tuning.
Use trace data to dissect your execution times. Traces show you where all the time is being spent while your code is running so you can quickly identify the slower components and act on it.
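If you don’t have a full tracing setup like AWS X-Ray or OpenTelemetry wired up yet, even a hand-rolled span timer will surface where the time goes. This is a minimal sketch - `traced` and the `spans` array are my own hypothetical names, not part of any SDK:

```javascript
// Time a named unit of work and record it, whether it succeeds or throws.
async function traced(name, spans, fn) {
  const start = process.hrtime.bigint();
  try {
    return await fn();
  } finally {
    spans.push({ name, ms: Number(process.hrtime.bigint() - start) / 1e6 });
  }
}

// Usage: wrap each external call, then inspect the spans afterward.
// const spans = [];
// await traced('momento.get', spans, () => cacheClient.get(cacheName, key));
// console.table(spans); // the slow, far-away dependencies jump out
```

Sorting the recorded spans by duration is a quick way to spot which dependency is eating your execution time - and your Lambda bill.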
Cold start times aren’t the only contributors to latency in serverless applications! The geographic distance between your end users and code makes a big impact on performance.
Consider where all the components of your application are running. If the back-end code is running out of AWS region us-east-1 but you’re using a 3rd party API that only runs out of Mumbai, India, you might reconsider using that package. Not only will it increase the network time of your calls but it will raise the cost of your application as well. The longer you wait on 3rd party processes in a Lambda function, the more you pay because of increased execution times.
Use trace data to identify if your 3rd party packages are slow. When doing your research, read the docs or do a Google search to find where the data centers are located.
If you discover you can tune the client to use a centrally located data center, do it! If you can’t, weigh the pros and cons of the latency hit vs the value it provides.
Deploy your application to multiple regions around the world. Where are your customers? Do you support an instance of your application in close proximity to them? How much would it take to deploy another instance close by? Do you have the operational power to support it?
Use an appropriate routing policy to direct traffic to the appropriate region of your application. You can choose options like geolocation, latency, or IP-based routing to optimize performance for your end users.
Note - based on the method you choose, you will have different data replication requirements. If you dynamically route user traffic to different instances of your application, you must replicate data to all instances (usually with a strongly-consistent guarantee or a tight replication SLA). But if you pin user profiles to specific regions, your replication requirements relax considerably.
Remember that it’s not always about the code! The laws of physics play an understated role in computer science, and they need to be considered when designing serverless applications. Doing so could save you thousands of dollars every month!