Vladyslav Ratslav

Cloud Architect · DevOps · MLOps · SRE Consultant

Proactive Kubernetes Scaling With Cron and KEDA: A Practical Way to Save Cloud Costs

Published by Vladyslav Ratslav · Cloud Architect · January 2026

Also published on LinkedIn.

There are many ways to save money on cloud infrastructure, and Kubernetes gives you a huge toolbox to work with. But before diving into specific tricks, there's one principle that consistently proves itself in real-world systems:

The most reliable way to keep a system both stable and cost-efficient is to let it auto-scale as much as it needs - and to let it scale fast.

Autoscaling prevents overprovisioning, avoids idle capacity, and keeps performance predictable. But sometimes, autoscaling alone isn't enough.

When You Know the Storm Is Coming

Every engineer has seen this pattern: a daily batch of API calls, a heavy scheduler job, or a predictable traffic spike. You don't need metrics to tell you what's about to happen - you already know.

In these cases, waiting for HPA metrics to trigger scaling is unnecessary. You can simply scale up proactively, right before the storm hits.

This avoids cold starts, prevents latency spikes, and ensures the system is warm and ready when the load arrives.

Preparing the Environment: Test Deployment

Before exploring scaling techniques, it's useful to have a simple deployment for validation.

# nginx-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80

This deployment will be used later to verify both Cron-based and KEDA-based scaling.
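Assuming the manifest is saved as nginx-deployment.yaml, apply it and confirm a single replica is running:

```shell
kubectl apply -f nginx-deployment.yaml
kubectl get deployment nginx-deployment
kubectl get pods -l app=nginx
```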

The Simple Approach: CronJobs That Scale Up and Down

The most straightforward solution is to create two CronJobs:

  • One to scale up before the heavy workload
  • One to scale down after it finishes

Under the hood, each job simply runs a kubectl scale command against the target deployment:

kubectl scale --replicas=6 deployment/nginx-deployment
kubectl scale --replicas=1 deployment/nginx-deployment

The important part is that the CronJob's pod must run under a ServiceAccount with RBAC permission to scale the target deployment. Otherwise, the API server will reject the request with a Forbidden error.
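A minimal RBAC setup for that ServiceAccount might look like this (the names scaler-sa and deployment-scaler are illustrative; adjust the namespace to your environment):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: scaler-sa
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: deployment-scaler
  namespace: default
rules:
- apiGroups: ["apps"]
  resources: ["deployments", "deployments/scale"]
  verbs: ["get", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: deployment-scaler-binding
  namespace: default
subjects:
- kind: ServiceAccount
  name: scaler-sa
  namespace: default
roleRef:
  kind: Role
  name: deployment-scaler
  apiGroup: rbac.authorization.k8s.io
```

Reference the ServiceAccount name in the CronJob's serviceAccountName field.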

Example CronJob (scale up)

apiVersion: batch/v1
kind: CronJob
metadata:
  name: example-cronjob
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: "<JOB_SERVICE_ACCOUNT>"
          containers:
          - name: example-task
            image: bitnami/kubectl
            command:
              - "/bin/sh"
              - "-c"
              - kubectl scale --replicas=$REPLICA_COUNT deployment/$DEPLOYMENT
            env:
            - name: REPLICA_COUNT
              value: "6"
            - name: DEPLOYMENT
              value: "nginx-deployment"
          restartPolicy: OnFailure

The scale-down CronJob is nearly identical - give it its own name and a later schedule, and set REPLICA_COUNT=1.

But there's a catch.

The Catch: Autoscaling May Override Manual Scaling

If HPA or VPA is enabled, your manual replica count may clash with autoscaling logic. Kubernetes will happily override your manual settings if metrics dictate otherwise.
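For example, an HPA like the following (a hypothetical config, assuming metrics-server is installed) will continuously reconcile replicas toward its CPU target, reverting any manual kubectl scale within its min/max range:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```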

To avoid this conflict, you need a solution that integrates cleanly with autoscaling.

The Better Solution: KEDA Cron-Based Scaling

A solution I've tested and implemented across multiple SaaS platforms is KEDA Cron Scalers. It works flawlessly and integrates naturally with Kubernetes autoscaling.

First, install KEDA (always follow the official documentation - it's clear and reliable).

Once installed, you can define a ScaledObject that handles time-based scaling:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: cron-scaledobject
  namespace: default
spec:
  scaleTargetRef:
    name: nginx-deployment
  minReplicaCount: 1
  cooldownPeriod: 300
  triggers:
  - type: cron
    metadata:
      timezone: CET
      start: 00 13 * * *        # At 1:00 PM
      end: 00 15 * * *          # At 3:00 PM
      desiredReplicas: "20"
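Conceptually, the cron trigger behaves like a window function: inside the start-end window the target gets desiredReplicas, and outside it the workload falls back to minReplicaCount (or whatever other triggers demand). A simplified stdlib-only sketch of that decision, ignoring timezone and cron-expression parsing, which KEDA handles for you:

```python
from datetime import datetime, time

def desired_replicas(now: datetime, start: time, end: time,
                     window_replicas: int, min_replicas: int) -> int:
    """Simplified model of a KEDA cron trigger: return the scaled-up
    replica count inside the [start, end) window, otherwise the floor."""
    if start <= now.time() < end:
        return window_replicas
    return min_replicas

# The 13:00-15:00 window from the ScaledObject above:
print(desired_replicas(datetime(2026, 1, 5, 14, 0), time(13), time(15), 20, 1))  # inside window: 20
print(desired_replicas(datetime(2026, 1, 5, 16, 0), time(13), time(15), 20, 1))  # outside window: 1
```

The real scaler feeds this decision through the HPA's external-metrics pipeline, which is exactly why it composes cleanly with other autoscaling triggers.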

Validate the result - behind the scenes, KEDA creates a regular HPA (named keda-hpa-<scaledobject-name>) to drive the scaling:

kubectl get scaledobject
kubectl get hpa
kubectl get pods

Why This Works So Well

KEDA handles the scaling logic in a way that doesn't fight with HPA. It respects autoscaling boundaries, integrates cleanly with Kubernetes primitives, and gives you predictable, time-based scaling without hacks or race conditions.

This approach:

  • Prevents pod overload during predictable spikes
  • Ensures new pods start on time
  • Avoids node-level cold starts
  • Reduces cloud costs by scaling down automatically
  • Keeps autoscaling behavior consistent and conflict-free

Final Thoughts

Optimizing cloud infrastructure cost and performance is a delicate process. There's no single magic trick - different workloads require different strategies.

But if you know when your load is coming, proactive scaling with KEDA Cron is one of the cleanest, most reliable, and most cost-effective techniques you can implement.