Comparing AWS Step Functions and Temporal: A Developer’s Perspective

I build a lot of automations. My imagination often runs wild with things I’d like to create and try out, and usually I go build them. But then I have a choice to make: do I maintain them forever, or do I abandon them and hope I took some lessons away?

While the majority of my little experiments early on in my career suffered the latter fate, I stuck with the maintenance route for many projects over the past few years. But my kids started getting older. And I bought a farm. And I started to help run the Believe in Serverless community. Time, or lack thereof, got the better of me, and I could no longer maintain the projects I know and love. So I built automations to do it for me.

Tasks I used to do by hand every few days could now be done with the help of generative AI and an orchestrated workflow without me having to so much as lift a finger. Results are emailed to me weekly with a summary of everything that was done and an indicator if I have any action items to perform. Pretty slick.

All this is thanks to multi-step orchestrated workflows. “Do this, then this, and if that, then do these two things,” in other words. Historically, I’ve stuck with AWS Step Functions to build these workflows given my affinity towards AWS. But there’s been a lot of chatter in the Believe in Serverless community about Temporal as an orchestration engine. The chatter is mostly “has anyone tried this out” with most of the answers being “no”, so Andres Moreno and I decided to try it out on a livestream a few weeks ago.

It took us a minute to figure out how to get it going, but once we did, we liked what we saw. It has many similarities to Step Functions, but also some significant differences. Let’s compare the two and see which one might be better for your use case.

Comparing Step Functions and Temporal

Determining which service is “best” is entirely subjective: what success looks like, and what you want out of a service, varies from person to person. Because of this, I’ll objectively cover several core areas of each service and let you come to your own conclusions on which one is “best”.

Table comparing AWS Step Functions and Temporal

Building Blocks: Tasks vs Activities

A workflow is composed of multiple building blocks. You chain the blocks together in a meaningful way to solve your business problems.

Step Functions: These blocks are known as tasks. Tasks are limited to AWS SDK operations, logical branches, and 3rd party HTTP API calls. Not all AWS operations are supported, but it feels like most of them are. If you want to use a 3rd party service that does not have an HTTP API, you need to wrap the call in a Lambda function and invoke that function from your workflow.

Temporal: Building blocks are called activities. Activities are reusable function calls written in code. These are built and maintained by you, but they can do anything! The tradeoff here is that while these are all highly composable and free to do anything, you’re 100% responsible for them.

Designing Workflows: Visual vs Code-Based

Step Functions: There are two options for building workflows. You can manually write each step and transition in JSON or YAML using Amazon States Language (ASL), or you can build workflows visually using the Workflow Studio. Workflow Studio can be used in the browser via the Step Functions console or directly inside of VSCode through the Application Composer designer in the AWS toolkit plugin. All possible tasks and logic trees are available in the studio graphically, allowing you to see everything you can do quickly and easily.
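To make the JSON option concrete, here is a minimal sketch of what an ASL definition can look like, built as a Python dictionary and serialized to JSON. The state names, the Lambda function name, and the SNS topic ARN are made up for illustration; only the field names (`StartAt`, `States`, `Type`, `Choices`, `Next`, `End`) come from ASL. The small helper is not part of ASL either, just a quick sanity check that every transition points at a state that exists.

```python
import json

# Hypothetical workflow: summarize the week, flag action items, send an email.
# The function name and topic ARN below are placeholders, not real resources.
definition = {
    "Comment": "Weekly maintenance summary (illustrative only)",
    "StartAt": "SummarizeWork",
    "States": {
        "SummarizeWork": {
            "Type": "Task",
            "Resource": "arn:aws:states:::lambda:invoke",
            "Parameters": {"FunctionName": "summarize-work", "Payload.$": "$"},
            "OutputPath": "$.Payload",
            "Next": "HasActionItems",
        },
        "HasActionItems": {
            "Type": "Choice",
            "Choices": [
                {
                    "Variable": "$.hasActionItems",
                    "BooleanEquals": True,
                    "Next": "FlagActionItems",
                }
            ],
            "Default": "SendEmail",
        },
        "FlagActionItems": {
            "Type": "Pass",
            "Result": "ACTION REQUIRED",
            "ResultPath": "$.subjectPrefix",
            "Next": "SendEmail",
        },
        "SendEmail": {
            "Type": "Task",
            "Resource": "arn:aws:states:::sns:publish",
            "Parameters": {
                "TopicArn": "arn:aws:sns:us-east-1:123456789012:weekly-summary",
                "Message.$": "$.summary",
            },
            "End": True,
        },
    },
}


def referenced_states(definition):
    """Collect every state name that StartAt, Next, Default, or a Choice rule points at."""
    targets = {definition["StartAt"]}
    for state in definition["States"].values():
        if "Next" in state:
            targets.add(state["Next"])
        if "Default" in state:
            targets.add(state["Default"])
        for rule in state.get("Choices", []):
            targets.add(rule["Next"])
    return targets


# Any name in this set is a transition to a state that was never defined.
missing = referenced_states(definition) - set(definition["States"])

# This JSON string is what you would paste into the console or a template.
asl_json = json.dumps(definition, indent=2)
```

Even for four states, that is a fair amount of JSON to write by hand, which is a big part of why the visual designer is so appealing.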

Design experience of Step Functions vs Temporal

Temporal: This is completely code-based. There is no visual designer or flashy builder. It’s just code. Your available programming languages are Go, Java, PHP, Python, .NET, and TypeScript. All you need to do is import your activities and chain them together. Simple as that!
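The "import your activities and chain them together" idea can be sketched in plain Python. This deliberately leaves out the Temporal SDK (its decorators and activity-execution calls are not shown), so treat it as an illustration of the shape of the code rather than a working Temporal workflow. All function and field names here are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Report:
    repo: str
    stale_issues: int
    needs_action: bool


# "Activities": small, reusable functions that each do one thing.
def count_stale_issues(issue_ages_days: list[int], max_age_days: int = 30) -> int:
    """Count issues that have been open longer than max_age_days."""
    return sum(1 for age in issue_ages_days if age > max_age_days)


def build_report(repo: str, stale_issues: int) -> Report:
    """Turn the raw count into a report with an action-needed flag."""
    return Report(repo=repo, stale_issues=stale_issues, needs_action=stale_issues > 0)


def format_email(report: Report) -> str:
    """Render the one-line summary that would go into the weekly email."""
    status = "action needed" if report.needs_action else "all clear"
    return f"{report.repo}: {report.stale_issues} stale issue(s), {status}"


# "Workflow": ordinary code that chains the activities together.
def weekly_maintenance(repo: str, issue_ages_days: list[int]) -> str:
    stale = count_stale_issues(issue_ages_days)
    report = build_report(repo, stale)
    return format_email(report)


summary = weekly_maintenance("my-blog", [3, 45, 90])
```

Because each activity is just a function, it can be imported into another project or exercised on its own in a unit test without running the whole workflow.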

Both Step Functions and Temporal offer configuration on each of their states/activities like built-in retries, timeouts, and compensation logic on failures.
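As a rough illustration of what a built-in retry policy does for you, here is a plain-Python sketch of retrying a step with exponential backoff. The parameter names mirror the ASL `Retry` fields (`IntervalSeconds`, `MaxAttempts`, `BackoffRate`), but the functions themselves are hypothetical and belong to neither service. The `sleep` argument is injectable so the example runs instantly.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def backoff_delays(interval_seconds: float, max_attempts: int, backoff_rate: float) -> List[float]:
    """Delay before each retry: interval, interval * rate, interval * rate^2, ..."""
    return [interval_seconds * backoff_rate ** n for n in range(max_attempts)]


def run_with_retries(
    step: Callable[[], T],
    *,
    interval_seconds: float = 1.0,
    max_attempts: int = 3,
    backoff_rate: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run step once, then retry up to max_attempts times with growing delays."""
    for delay in backoff_delays(interval_seconds, max_attempts, backoff_rate):
        try:
            return step()
        except Exception:
            sleep(delay)  # wait before the next try
    # Last try: let any exception propagate so failure handling can take over.
    return step()


# Usage: a step that fails twice with a transient error, then succeeds.
attempts = {"count": 0}


def flaky_step() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("transient failure")
    return "done"


waited: List[float] = []  # records delays instead of actually sleeping
result = run_with_retries(flaky_step, interval_seconds=2, sleep=waited.append)
```

With both services you declare the policy as configuration on the state or activity instead of writing a loop like this yourself.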

Starting a workflow: Triggers and Timers

Step Functions: Workflows can be started in one of three ways: via a trigger, through the SDK, or manually through the console. A trigger could be an event from another AWS service like S3, a published event in EventBridge, or a scheduled timer (with frequency down to every minute unless you know some tricks).

Temporal: Since it is not part of a larger ecosystem, you don’t have the native triggers from other services like you do with Step Functions. However, you can start workflows via the SDK, through an HTTP trigger, and on a timer. Timers with Temporal can be configured to run every second, offering more granularity than Step Functions. Not only that, but the timers can be dynamic, allowing you to configure them any way you like, not just on a regular interval. You can do point-in-time, sleep for N seconds, and intervals (just like Step Functions).

Both Step Functions and Temporal allow you to run workflows both synchronously and asynchronously.

Execution Environment: Managed vs Self-Managed

Step Functions: Like all of AWS’s serverless services, Step Functions workflows run on managed infrastructure provided by AWS. This means you don’t know which VMs or servers are executing your code. You don’t have to manage capacity or load balancing; AWS simply does it for you. Your primary concern is building workflows that solve your business problem.

Temporal: Workflows run on workers that live on compute that you manage. This could mean a server you own and operate, a Kubernetes cluster hosted in EKS or your data center, or even ECS Fargate. This means that you are responsible for capacity management, scaling, and load balancing. Workers are stateless, which means they are easy to scale horizontally, but keep an eye on the resources of the compute running them to make sure you don’t over-provision an instance.

Debugging Workflows: Visual and Table Views

As much as we’d like to believe our code has zero bugs, that is never the case. Troubleshooting workflows in progress or completed workflows is almost a daily occurrence in many production applications. This means the DX of navigating your workflows from state to state and from an overall perspective is critically important to the maintainability of your application.

Step Functions: Offers both a visual representation of your executed workflow and a table view. You can clearly see which states the execution traversed and what the data looked like as inputs and outputs of every state. It also gives you a timeline view highlighting how much time was spent in each state.

Step Functions debugging experience

Temporal: You get all of the same features with Temporal except the visual representation showing which states were executed. The workflow dashboard shows execution details, input and outputs of each activity, and a timeline view of when each activity was run. As previously discussed, since Temporal is run on workers, you also see which of your workers picked up and processed the workflow. This is extremely useful for finding environmental errors if one of your workers gets into a bad state.

Temporal debugging experience

Cost: Apples to Crab Apples

Pricing for these two services is close and is based on roughly the same metrics, but not quite. So it’s like comparing apples to… crab apples 😅

Step Functions: There are two types of workflows in Step Functions: standard and express. When it comes to cost, standard workflows are priced at $0.025 per 1,000 state transitions. Express workflows are priced similarly to Lambda, billing $1.00 per 1M requests and $0.00001667 per GB-Second.

As far as serverless services from AWS go, this tends to be one of the pricier services, but you get what you pay for. You’re not only getting the orchestration, but you’re getting that best-in-class observability of your workflows as well.

Temporal: For the managed version of Temporal, Temporal Cloud, consumers are charged for three metrics - number of actions, storage, and support. Actions are $25 per 1M, storage costs $0.042 per GB-Hour for active data and $0.00042 per GB-Hour of retained data, while support costs $200/month for basic and $2000/month for premium.

If you choose the self-managed version of Temporal, then it’s free! Kind of. Instead of being charged for actions, storage, and support, you take on the cost of the infrastructure hosting the workers and data.
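To put the numbers above side by side, here is a back-of-the-envelope sketch using only the prices quoted in this post (they may well have changed since). It assumes standard workflows on the Step Functions side and basic support on the Temporal Cloud side, and it ignores Temporal storage and any free tier. It also treats one state transition as roughly one action, which is exactly the apples-to-crab-apples part, so read the output as a ballpark and nothing more.

```python
# Prices as quoted in this post; treat them as assumptions, not current list prices.
STANDARD_PRICE_PER_1K_TRANSITIONS = 0.025  # Step Functions standard workflows
TEMPORAL_PRICE_PER_1M_ACTIONS = 25.0       # Temporal Cloud actions
TEMPORAL_BASIC_SUPPORT_PER_MONTH = 200.0   # Temporal Cloud basic support


def step_functions_standard_cost(transitions: int) -> float:
    """Monthly cost of standard workflows for a number of state transitions."""
    return round(transitions / 1_000 * STANDARD_PRICE_PER_1K_TRANSITIONS, 2)


def temporal_cloud_cost(actions: int, support: float = TEMPORAL_BASIC_SUPPORT_PER_MONTH) -> float:
    """Monthly cost of Temporal Cloud actions plus support (storage not included)."""
    return round(actions / 1_000_000 * TEMPORAL_PRICE_PER_1M_ACTIONS + support, 2)


# Compare a hobby-scale month with a busier one.
for volume in (10_000, 1_000_000):
    print(
        f"{volume:>9,} units: "
        f"Step Functions ${step_functions_standard_cost(volume):.2f}, "
        f"Temporal Cloud ${temporal_cloud_cost(volume):.2f}"
    )
```

The per-unit rates work out to the same $25 per million, so at low volume the difference is almost entirely the flat support charge.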

My subjective opinion

I use Step Functions in every one of my projects. My projects are hobby-scale and rarely incur any monthly cost. Since cost is a primary driver for me, I default to Step Functions because at that scale it’s effectively free. That said, is the experience perfect in every way? No. Does Temporal offer some things I wish Step Functions had? Absolutely.

I really like the idea of writing all your actions as code. Being a tenured programmer, I find writing code intuitive and fast. On top of that, building reusable activities that I can import into any new project would be a huge time saver. Plus, unit testing a code-based activity is dirt simple. These are all capabilities you don’t get with Step Functions. There is no code-based activity library or workflow builder - it’s simply building blocks and JSON.

I’ve said it many times before: don’t confuse unfamiliarity with complexity. Both of these services have a learning curve and they both have their pros and cons. I can’t fairly say one is better than the other. My personal opinion is that I like Step Functions more because I prefer the visualizations. Not only is it easier (for me) to maintain, but it is also an easy way to share the business logic with non-technical stakeholders.

To be honest, I like both services. But the reason I will not be using Temporal for my side projects is the $200/month support charge for the managed service. If that ever goes away, I might consider investing more time in it. Overall though, I recommend both of them as orchestration engines. You can be the judge for which one suits your needs best.

Happy coding!

Allen Helton

About Allen

Allen is an AWS Serverless Hero passionate about educating others about the cloud, serverless, and APIs. He is the host of the Ready, Set, Cloud podcast and creator of this website.
