Three years of cost retrospectives across mixed AWS fleets keep landing on the same finding. Teams that pick one compute commitment model and apply it across the whole fleet (all-Savings-Plan, all-On-Demand, all-Spot for the brave ones) overspend the optimum by 18 to 35 percent. The single-model decision is comfortable because it is one decision; the portfolio decision is uncomfortable because ...
The average mid-size production EC2 fleet runs at 12 to 23 percent utilization. The remaining 77 to 88 percent is idle compute that ran continuously, billed continuously, and produced nothing. On a 200-instance fleet of m5.xlarge equivalents, that idle slice is worth $150,000 to $250,000 a year. The numbers come from AWS Trusted Advisor and the Flexera 2025 State of the Cloud report and they ha...
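The idle-cost figure is simple arithmetic, and it is worth being able to rerun it against your own fleet. Below is a minimal sketch, not the method behind the numbers above. The function name `idle_cost_per_year` and the blended rate of $0.12 per instance-hour are assumptions made for illustration; the excerpt does not state a rate. Substitute the effective hourly rate from your own bill.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def idle_cost_per_year(instances: int, hourly_rate_usd: float, utilization: float) -> float:
    """Return the annual spend attributable to idle capacity.

    utilization is the fraction of paid capacity doing useful work (0.0 to 1.0).
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0.0 and 1.0")
    total_spend = instances * hourly_rate_usd * HOURS_PER_YEAR
    return total_spend * (1.0 - utilization)


# ASSUMPTION: hypothetical blended rate per instance-hour, not taken from the source.
ASSUMED_HOURLY_RATE_USD = 0.12

# The 12 to 23 percent utilization range quoted above, on a 200-instance fleet.
idle_at_23_pct = idle_cost_per_year(200, ASSUMED_HOURLY_RATE_USD, 0.23)
idle_at_12_pct = idle_cost_per_year(200, ASSUMED_HOURLY_RATE_USD, 0.12)

print(f"idle spend per year: ${idle_at_23_pct:,.0f} to ${idle_at_12_pct:,.0f}")
```

At that assumed rate the idle slice comes to roughly $162,000 to $185,000 a year, inside the range quoted above. A higher effective rate pushes the result toward or past the top of that range.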
The 3am page is rarely about something that needs a human. The on-call gets paged at 03:14 because a pod has crashlooped four times in five minutes. They open Slack, look at the logs, see "OOMKilled" in the exit reason, run kubectl set resources to raise the memory limit, wa...
The most profitable project of my life hasn’t been built yet.
My most viral post hasn’t been written yet.
I haven’t met the most wonderful person yet.
I haven’t come up with my strongest idea yet.
I believe that the best is yet to come, and that the best of the past will eventually stop being the best. That’s why I keep working on new projects, even if my current ones are at...
If you’re planning a system migration, you already know it carries risk.
The question is not whether something could go wrong, but whether those risks are being actively managed.
In practice, most migration issues don’t come from unexpected failures. They come from predictable gaps: things that were assumed to be simple, overlooked during planning, or only discovered after the syst...