The Risks of Moving Too Quickly with Serverless Development

Back in December, I got an opportunity to visit a race track and drive a Lamborghini Gallardo.

The driver’s education class beforehand taught me what the markers were on the track, what to do when we needed to make a pass, and warned of the dangers of putting the pedal to the metal. They reminded me that this was, in fact, a race car with over 750 horsepower. It put the 265 horsepower of my Jeep Wrangler to shame.

The instructor told me if I floored it, I’d likely spin out of control, hit a wall, and burst into flames (ok, that might be a slight embellishment). The gist of his message was not to accelerate too quickly. Just because it can reach a top speed of 202 mph doesn’t mean I should try to get it there in 5 seconds.

After I burned off all the nervous energy, I got to thinking. That sounds a lot like serverless development.

You can build apps faster on serverless technologies than by any other means I’ve seen. But should you? Much like driving fast cars, it’s thrilling to go fast - but it’s dangerous too.

Serverless is about enablement, not reckless development. That said, we often find ourselves overtaken by whimsy with a desire to go as fast as possible.

What Are The Risks?

The faster you go the more things you miss, like observability, performance optimizations, and production-grade fault tolerance. But there are other critical risks that can’t be overstated.

Unclear Architecture - Have you ever deployed a stack straight from a GitHub repo and started using it? While the functionality and the “time to feature” might be nice, remember you have to maintain it! If you don’t know the components of your architecture, you can’t reasonably expect to know how to fix an issue when it pops up. Cloud vendors like AWS make it so easy to connect services that it takes intentional effort to slow down, think, and document the components and how data flows between them.
Poor Change Management - Do you like writing unit tests? Me neither. I’m guilty of publishing many reference architectures that don’t include tests or a CI pipeline. There are so many CLIs that make deploying from a developer machine to the cloud as easy as running sam deploy or terraform apply. Production-ready deployment processes include pull requests, unit tests, integration tests, documentation generation, and canaries. When you pull down a stack from GitHub and run a single build and deploy script, you’re bypassing many of the checks that make sure you aren’t breaking your customers.
Wide Open Permissions - When I first started building serverless applications, I would run into IAM issues fairly regularly. I wasn’t familiar with how permissions were broken down and certainly didn’t know when I needed one versus another. This resulted in many AmazonDynamoDBFullAccess and AdministratorAccess policies added to the execution roles of my Lambda functions. Neither policy is recommended if you’re practicing the principle of least privilege. Moving fast to get past permissions issues is sure to lead to disaster.
Stale Or Missing Documentation - You know what developers like less than unit tests? Documentation. It feels like such a slowdown in a process that otherwise operates at lightspeed. Documentation that describes a serverless application often goes stale the moment you hit that Save button. Bringing on new developers becomes increasingly difficult as they wade through a mix of stale and up-to-date documentation. It leads to a sense of doubt in all your docs. How do you know what’s right?
No Indicators of Success - As much as we’d all like it, observability tools don’t automatically track your business metrics. You can add APM vendors like Baselime, Lumigo, and Datadog to your account, but unless you intentionally add meaningful metrics to track your KPIs, you’re left in the dark. Metrics tend to fall by the wayside when speed is the primary objective. No business metrics means you have no visibility into whether your application is doing what it’s supposed to do. Without these indicators, your serverless app might be up and running, but you have no idea if it’s actually crashing and burning.
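
To make "intentionally add meaningful metrics" a bit more concrete, here is a minimal sketch of one way to record a business metric from inside a Lambda function: writing a log line in CloudWatch's Embedded Metric Format, which CloudWatch can turn into a metric. The namespace MyApp/Business, the CartAbandoned metric, and the Service dimension are all made-up names for illustration, and the helper functions are my own, not part of any AWS library.

```python
import json
import time


def build_kpi_metric(name, value, dimensions, unit="Count",
                     namespace="MyApp/Business", timestamp_ms=None):
    """Build one log record in CloudWatch Embedded Metric Format (EMF).

    `dimensions` maps dimension names to string values, for example
    {"Service": "checkout"}. The namespace default is hypothetical.
    """
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)  # EMF expects epoch milliseconds

    record = {
        "_aws": {
            "Timestamp": timestamp_ms,
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    # One dimension set made of every dimension key we were given
                    "Dimensions": [list(dimensions.keys())],
                    "Metrics": [{"Name": name, "Unit": unit}],
                }
            ],
        },
        # The metric value lives at the top level, keyed by the metric name
        name: value,
    }
    # Dimension values also live at the top level of the record
    record.update(dimensions)
    return record


def emit_kpi_metric(name, value, dimensions, **kwargs):
    """Print the EMF record as a single JSON line (stdout goes to CloudWatch Logs in Lambda)."""
    record = build_kpi_metric(name, value, dimensions, **kwargs)
    print(json.dumps(record))
    return record


if __name__ == "__main__":
    # Hypothetical usage: count one abandoned cart in the checkout service
    emit_kpi_metric("CartAbandoned", 1, {"Service": "checkout"})
```

The point is less about the format and more about the habit: the metric is emitted from the business logic itself, so it exists from the first deployment instead of being bolted on later.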

Avoid Risks With Careful Planning

We all want to go fast. I’m not sitting here telling you not to go fast. But I am cautioning you to not be reckless. Software, no matter how easy it is to get started, requires careful planning. Understand how your data flows, know where your bottlenecks are, and have a full understanding of how your application works. Here are a few tips I’ve stumbled upon in my journey from idea to production.

Start With KPIs - Before you write a single line of code, you need to know the business value it is going to add. If your code doesn’t directly add meaningful value to the business problem, then why are you writing it? When you understand which business objective is being satisfied with your code, you can instrument metrics directly into your code that help the product team know how the app is doing. Want to know how many people abandon their cart before purchasing? How does that number change if you send those customers a 10% off coupon? You can’t understand customer behavior with infrastructure metrics alone. You need to start with the KPIs that identify your success. Build those directly into your business logic.
Keep References As References - A reference architecture is intended to be just that: a reference. It is often a proof of concept that was not designed for production use. When designing the architecture for your product, you must keep the big picture in mind. Each component is a piece of a puzzle, and you need to know what every side looks like for the pieces to fit together properly. As you build your application, automatically generating your infrastructure diagrams is a great way to compare the plan and the actual implementation. Use this to monitor how far you’ve steered away from the plan and course correct if necessary.
Bootstrap Best Practices - With few exceptions, it’s best to incorporate your established best practices in some sort of init process. Setting up a way to minimize barriers to entry and maximize compliance with standards is a surefire way to incorporate critical tasks like linting, unit and integration testing, and preventing merges to main without an approved PR. It also provides a level of normalization on rapidly changing services, which makes them easier to maintain as you scale. Use processes like sam pipeline init --bootstrap to automatically scaffold these best practices. This command is extendable with your own templates, allowing you to tailor your experience exactly how you like.
Incorporate Zero Tolerance Security Practices - I do my best to avoid talking in absolutes. But one area where I wholeheartedly break that rule is security. The worst possible thing that could happen with your software is a breach that compromises sensitive data. One slip-up could give someone the ability to query your DynamoDB table for records they shouldn’t have access to. Add checks to your CI pipeline to mitigate these risks. Don’t allow wildcards in policies unless absolutely necessary. Build a strong set of governance rules in AWS Config to make sure backdoors don’t accidentally open straight to your data.
Generate Your Documentation - When possible, documentation generated in your CI pipeline is the way to go. Now, there is a fine balance here between generic, contextless documents and meaningful, intuitive documentation. Developers tend to shy away from documentation that feels like nothing more than prettified code. That is why I tend to lean toward using an OpenAPI Spec when building APIs and an AsyncAPI Spec when building event-driven architectures. These specifications are not only industry standard, but also allow you to add contextual information in Markdown at every level. This enables you to provide examples and relatable material along with your technical specifications. Best of all, there is a plethora of tools that build beautiful documentation and slide right into your CI pipeline.
Hand Off Processing to Middleware - Chances are if you are into serverless, you’re also into having as much as possible managed for you. I know I am. Why reinvent the wheel every time you need to do something like structured logging or caching data? There are many middleware solutions available that plug right into the AWS SDK designed to help you accelerate development. You don’t need to rush to get things done because they are already done for you. The toughest part about it is awareness. Before you go out and start building another event validator, do your research on available middleware. Check out Middy or Momento to see some of the available tools at your disposal. I often say “take as much responsibility off of humans as you can” because we all make mistakes. A tried-and-true middleware solution will make sure you aren’t making silly mistakes.
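
As one illustration of the kind of CI check mentioned in the security tip, here is a rough sketch of a script that scans IAM policy documents for wildcards. The find_wildcards function and the pass/fail rules are my own assumptions for the example, not an AWS tool: it flags any Allow statement with a * in an action or a bare * as the resource, and it deliberately ignores things like NotAction and wildcards inside ARNs.

```python
import json
import sys


def _as_list(value):
    """IAM accepts a single string or a list for Action/Resource; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_wildcards(policy):
    """Return a list of findings for Allow statements that use wildcards."""
    findings = []
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a lone statement may appear as an object
        statements = [statements]

    for index, statement in enumerate(statements):
        if statement.get("Effect") != "Allow":
            continue  # Deny statements with wildcards are not the concern here
        label = statement.get("Sid", f"statement {index}")

        for action in _as_list(statement.get("Action")):
            if "*" in action:  # catches "*", "dynamodb:*", and "dynamodb:Get*"
                findings.append(f"{label}: wildcard action '{action}'")

        for resource in _as_list(statement.get("Resource")):
            if resource == "*":  # only the bare wildcard, not wildcards inside an ARN
                findings.append(f"{label}: wildcard resource '*'")

    return findings


def main(paths):
    """Check each policy file given on the command line; return 1 if any finding exists."""
    exit_code = 0
    for path in paths:
        with open(path) as handle:
            for finding in find_wildcards(json.load(handle)):
                print(f"{path}: {finding}")
                exit_code = 1
    return exit_code


if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
```

Run against a policy that grants dynamodb:* on *, this reports two findings and exits non-zero, which is enough to fail a pipeline step. A policy limited to dynamodb:GetItem and dynamodb:Query on a single table ARN passes cleanly. Any exceptions you decide are "absolutely necessary" would need an allow list, which this sketch leaves out.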

Summary

You might be thinking to yourself “everything in this list seems like common sense,” and you’re not wrong. That said, it’s easy to get caught up with serverless and push out the bare minimum to get a feature out the door. What traditionally might have taken 2 weeks to implement might only take 2 hours with serverless. That’s exciting! It’s easy to want to show your team what you built as soon as it’s functional.

Keep reminding yourself the initial build of software is only a tiny portion of its life. After you send it off to production, you have to support it! If it’s an API, you might support it forever. With this in mind, don’t make it 100x harder on yourself to back into best practices. Start with them.

Sheen Brisals recently wrote a blog post where he talks about FIRST principles for a successful serverless adoption. In it, he describes the “hello, serverless” syndrome, which many of us have fallen prey to. We write our first Lambda function and deploy it to the cloud in minutes. All of a sudden, we think we’re ready to build enterprise-scale serverless software. Unfortunately, there’s much more to it than that.

Slow down. It’s not just about the code. It’s about the business problem.

Remember that you aren’t going to be the only one reading and maintaining the code you’re writing. You have a team of developers with you that rely on best practices and standards to write familiar, maintainable code. You have a team of product owners that rely on business metrics to know if they are building the right thing or if they need to pivot. You have integrators that need up-to-date documentation.

This article might feel like a reminder of things you already know. But do a self-check. Are you doing everything here? How sure are you that you have visibility into your app? How quickly can you spin up a new microservice? Did you write your own structured logging mechanisms?

We’re all in this together and we all want to see each other succeed. So let’s do it.

Happy coding!