
So much of agent traffic is repetitive. By adding in caching layers, you can both improve your user experience and decrease your bills in one fell swoop.
read moreIn 2025, I did a series of videos that went through five design principles for production-ready AI agents. In my episode about guardrails, I made the statement, “If you can get away with not calling a model, do that.” In the episode about agent tools, I made an argument saying to build idempotent tools and store responses temporarily in case of retries or duplicate requests. I didn’t know it at the time, but I essentially was making the same argument just with different words.

So much of agent traffic is repetitive. By adding in caching layers, you can both improve your user experience and decrease your bills in one fell swoop.
read moreThank you for subscribing! Check your inbox to confirm.
View past issues. | Read the latest posts.