CronJobs in Kubernetes - In-Depth Guide

What is a CronJob?

A CronJob in Kubernetes is a higher-level API object used to run Jobs on a scheduled basis, similar to how the cron utility works in Unix/Linux. It allows you to automate recurring tasks such as database backups, sending reports, clearing caches, or syncing data at defined intervals.

📌 A CronJob creates a new Job resource according to the schedule you define. That Job, in turn, creates one or more Pods to run the actual workload.

Syntax Overview

Here is a minimal CronJob manifest:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            command:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure
```

Anatomy of a CronJob

Let’s break down each component in detail:

1. `schedule`

Format: Standard five-field cron format ("minute hour day-of-month month day-of-week")
Example: "0 0 * * *" runs once per day at midnight.
You can use special characters:
* — every possible value
*/5 — every 5 units (e.g., every 5 minutes)
1-5 — range (e.g., Mon–Fri)
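
To make the field layout concrete, here is a minimal sketch of a CronJob whose schedule uses a range in the day-of-week field. The name `schedule-demo` and the container details are placeholders chosen for illustration, and the commented alternatives are simply other expressions that could be swapped in.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: schedule-demo          # hypothetical name
spec:
  # Fields: minute hour day-of-month month day-of-week
  schedule: "30 2 * * 1-5"     # 02:30 on Monday through Friday
  # Other expressions that could be used instead:
  #   "*/15 * * * *"   every 15 minutes
  #   "0 0 1 * *"      midnight on the first day of each month
  #   "@hourly"        macro equivalent to "0 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: demo         # placeholder container
            image: busybox:1.36
            command: ["/bin/sh", "-c", "date"]
          restartPolicy: OnFailure
```

A manifest like this can be checked without creating anything by running `kubectl apply --dry-run=client -f <file>`.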

2. `jobTemplate`

A CronJob doesn’t run containers directly. It creates Jobs, and Jobs create Pods.
This section defines the template for those Jobs, just like a regular Job YAML spec.

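3. `restartPolicy`

Set in the Pod template (`.spec.jobTemplate.spec.template.spec.restartPolicy`). Must be OnFailure or Never; Always is not allowed for Jobs.
It tells Kubernetes what to do when a container in the Pod fails: OnFailure restarts the container in the same Pod, while Never leaves the Pod failed and lets the Job controller create a replacement Pod.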

Key CronJob Fields

1. `startingDeadlineSeconds`

Maximum time (in seconds) after the scheduled time within which the Job may still be started.
Useful when your cluster is under heavy load and a Job start is delayed.
Example: If this is set to 200 and the schedule is every 5 minutes, a run that the controller cannot start until 210 seconds after its scheduled time is skipped; later runs are still scheduled.
If the field is unset, there is no deadline.
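
The 200-second example can be written out as a manifest. This is a minimal sketch: the name `deadline-demo` and the container are hypothetical, and only `startingDeadlineSeconds` is the point of the example.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: deadline-demo              # hypothetical name
spec:
  schedule: "*/5 * * * *"          # every 5 minutes
  # A run that cannot be started within 200 seconds of its
  # scheduled time is skipped instead of being started late.
  startingDeadlineSeconds: 200
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: task             # placeholder container
            image: busybox:1.36
            command: ["/bin/sh", "-c", "echo running scheduled task"]
          restartPolicy: OnFailure
```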

2. `concurrencyPolicy`

Controls what happens if the previous Job hasn’t finished when the next one is scheduled.

Allow (default): Runs Jobs concurrently.
Forbid: Skips the new Job if the previous one hasn’t finished.
Replace: Deletes the currently running Job and replaces it with the new one.
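
As a sketch of how the policy might be set for a task that must never overlap with itself, the manifest below uses Forbid. The name `report-sync`, the one-minute schedule, and the `sleep 90` command are assumptions picked so that a run outlasts its interval.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: report-sync                # hypothetical name
spec:
  schedule: "* * * * *"            # every minute
  # The 90-second run below is still active when the next minute
  # arrives, so with Forbid that next run is skipped.
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: sync             # placeholder container
            image: busybox:1.36
            command: ["/bin/sh", "-c", "sleep 90"]
          restartPolicy: OnFailure
```

Switching the value to Replace would instead delete the running Job and start a fresh one each minute.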

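3. `suspend`

Boolean field that pauses a CronJob without deleting it. Jobs that have already started are not affected.
Use case: Temporarily stop scheduling (e.g., for maintenance).
Runs that fall due while the CronJob is suspended count as missed; if no startingDeadlineSeconds is set, missed runs may be started immediately once suspend is set back to false.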
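
A minimal sketch of a paused CronJob follows. The name `nightly-cleanup` and its container are hypothetical; flipping `suspend` back to `false` (or removing the field) resumes scheduling.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-cleanup            # hypothetical name
spec:
  schedule: "0 3 * * *"            # 03:00 every day
  suspend: true                    # no new Jobs are created while this is true
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: cleanup          # placeholder container
            image: busybox:1.36
            command: ["/bin/sh", "-c", "echo cleaning up"]
          restartPolicy: OnFailure
```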

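CronJob Execution Flow

At each scheduled time, the CronJob controller checks whether a Job needs to be created.
If the conditions are met (not suspended, not past the starting deadline, allowed by the concurrency policy), a Job is created from the jobTemplate.
That Job runs its Pods as usual.
Success or failure is recorded in the Job's status. If ttlSecondsAfterFinished is set, the finished Job is deleted automatically after that many seconds.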

Real-World Example: Database Backup

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-job
spec:
  schedule: "0 */6 * * *"  # every 6 hours
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: my-backup-image
            args:
            - "/backup.sh"
          restartPolicy: OnFailure
```

CronJob Resource Management

1. `successfulJobsHistoryLimit`

Number of successful Jobs to retain in history (default: 3).
Helps avoid clutter while allowing for some audit trail.

2. `failedJobsHistoryLimit`

Number of failed Jobs to retain (default: 1).
Helps with debugging recurring failures.

```yaml
spec:
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
```

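CronJobs and Timezones

By default, the kube-controller-manager interprets a CronJob's schedule in its own local time zone, which is usually UTC.
Since Kubernetes v1.27, where the field became stable, `.spec.timeZone` lets you set an IANA time zone name (such as Europe/Berlin) per CronJob.
Putting TZ= or CRON_TZ= inside the schedule string is not supported.
If your cluster predates the timeZone field:
Adjust schedule times manually.
Or run logic inside the container to sleep until the desired local time.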
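
For clusters on Kubernetes v1.27 or later, where `.spec.timeZone` is a stable field, the time zone can be declared on the CronJob itself. A minimal sketch, assuming a hypothetical `morning-report` job that should fire at 09:00 Berlin time on weekdays:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: morning-report             # hypothetical name
spec:
  schedule: "0 9 * * 1-5"          # 09:00, Monday through Friday
  timeZone: "Europe/Berlin"        # IANA time zone name; requires v1.27+
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: report           # placeholder container
            image: busybox:1.36
            command: ["/bin/sh", "-c", "date"]
          restartPolicy: OnFailure
```

With the field set, the schedule follows local time in that zone, including daylight saving changes, so the cron expression does not need to be shifted by hand.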

Cleanup and Lifecycle

By default, a CronJob keeps the 3 most recent successful Jobs and the most recent failed Job, and deletes older ones. Finished Jobs within those limits stay until they are rotated out or a TTL expires.
Best practices:
Set `.spec.jobTemplate.spec.ttlSecondsAfterFinished` to auto-delete finished Jobs after a fixed time.
Set successfulJobsHistoryLimit and failedJobsHistoryLimit to control retained Jobs.
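
The two cleanup mechanisms can be combined. The sketch below is one possible arrangement, with a hypothetical `hourly-task` name and an arbitrary one-hour TTL; note that the TTL sits on the Job spec inside `jobTemplate`, while the history limits sit on the CronJob spec.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: hourly-task                  # hypothetical name
spec:
  schedule: "0 * * * *"              # at the top of every hour
  successfulJobsHistoryLimit: 3      # keep at most 3 successful Jobs
  failedJobsHistoryLimit: 1          # keep at most 1 failed Job
  jobTemplate:
    spec:
      # Delete each finished Job (and its Pods) one hour after it ends.
      ttlSecondsAfterFinished: 3600
      template:
        spec:
          containers:
          - name: task               # placeholder container
            image: busybox:1.36
            command: ["/bin/sh", "-c", "echo done"]
          restartPolicy: OnFailure
```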

Common Pitfalls

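1. Time Drift / Missed Schedules

If runs are missed because of controller downtime, the CronJob does not replay every missed run. When the controller recovers, it counts the schedules missed since the last scheduled time (or within the startingDeadlineSeconds window, if set). According to the Kubernetes documentation, if more than 100 were missed, no Job is started and an error is logged; setting startingDeadlineSeconds narrows that counting window, so a long outage is less likely to block the CronJob.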

2. Overlapping Jobs

If concurrencyPolicy is not set, it defaults to Allow, so a slow Job can overlap with the next run and cause resource contention. Use Forbid or Replace if runs must not overlap.

3. Pod Failure and Retry

CronJobs rely on the Job's retry logic (backoffLimit, default 6) and the Pod's restartPolicy.
Set both explicitly so that a failing run retries as many times as you intend and no more.
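
As a sketch of explicit failure handling, the manifest below caps retries at 3 and uses `restartPolicy: Never`, so each retry runs in a fresh Pod whose logs stay available for inspection. The name `retry-demo` and the container are hypothetical, and the retry count is an arbitrary choice.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: retry-demo                 # hypothetical name
spec:
  schedule: "*/10 * * * *"         # every 10 minutes
  jobTemplate:
    spec:
      backoffLimit: 3              # mark the Job failed after 3 retries
      template:
        spec:
          containers:
          - name: task             # placeholder container
            image: busybox:1.36
            command: ["/bin/sh", "-c", "echo attempting task"]
          # Never: a failed Pod is left in place and the Job controller
          # creates a new Pod for the next attempt.
          restartPolicy: Never
```

With `restartPolicy: OnFailure` instead, the failed container would be restarted inside the same Pod.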

When to Use CronJob vs Other Controllers

| Use Case | Recommended Resource |
| --- | --- |
| One-time task | Job |
| Periodic scheduled task | CronJob |
| Long-running services | Deployment / StatefulSet |
| Real-time triggered tasks | Event-based controller (e.g., Argo, Knative) |

Best Practices Summary

Use restartPolicy: OnFailure (or Never when each retry should run in a fresh Pod).
Set backoffLimit to control retries.
Limit job history to keep the number of retained Job and Pod objects small.
Use suspend: true to temporarily disable scheduling.
Monitor execution via Job/Pod logs.
Validate your cron expression with online tools.