The Cluster That Learned to Plan Ahead

It started with a postmortem I never wanted to write. A merchant flash sale launched 20 minutes ahead of schedule. Traffic to the payment authorization services doubled in under a minute. The Kubernetes Horizontal Pod Autoscaler (HPA) did exactly what it was configured to do: it detected the CPU spike and requested simultaneous scale-out for more than 150 checkout-path services. Most new pods fit on existing nodes, but dozens of services exhausted their node pool headroom and triggered provisioner requests in a burst. The node provisioner stalled under the queue pressure. New capacity came up four minutes later. ...

March 27, 2026 · 6 min · Aashish Sheshadri

Why 1,500 HPAs Is Not an Autoscaling Strategy

Three ways that conventional per-service autoscaling breaks down at 100,000 pods, ~1,550 applications, and five shared domains, and why no amount of HPA tuning makes them go away.

Failure Mode 1: The Provisioner Stampede

HPA is designed to be autonomous. Each deployment has its own HPA object, its own target utilization, and its own scale-out logic. This is great for isolation: changes to one service's autoscaling config don't affect others. It breaks down under coordination pressure. ...
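To make the per-service autonomy concrete, here is a minimal sketch of one such HPA object (the service name and thresholds are hypothetical, not taken from the posts). At the scale described above, roughly 1,500 objects like this each make scaling decisions independently:

```yaml
# One of many independent HPA objects — each scales its own Deployment
# in isolation, with no knowledge of what the other ~1,500 are doing.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: checkout-auth        # hypothetical checkout-path service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: checkout-auth
  minReplicas: 10
  maxReplicas: 200
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # per-service target, tuned in isolation
```

Nothing in this object coordinates with its neighbors: when a shared traffic spike hits every checkout-path service at once, each HPA independently requests pods, and the burst lands on the node provisioner as one uncoordinated stampede.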

April 1, 2026 · 9 min · Aashish Sheshadri