CronJobs in Kubernetes - In-Depth Guide
What is a CronJob?
A CronJob in Kubernetes is a higher-level API object used to run Jobs on a scheduled basis, similar to how the cron utility works in Unix/Linux. It allows you to automate recurring tasks such as database backups, sending reports, clearing caches, or syncing data at defined intervals.
📌 A CronJob creates a new Job resource according to the schedule you define. That Job, in turn, creates one or more Pods to run the actual workload.
Syntax Overview
Here is a minimal example of a CronJob YAML:
```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            command:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure
```
Anatomy of a CronJob
Let’s break down each component in detail:
1. schedule
- Format: Standard cron format (`"minute hour day-of-month month day-of-week"`)
- Example: `"0 0 * * *"` means once per day at midnight.
- You can use special characters:
  - `*`: every possible value
  - `*/5`: every 5 units (e.g., every 5 minutes)
  - `1-5`: range (e.g., Mon–Fri)
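To make the field syntax concrete, here are a few schedule values written as a fragment of a CronJob spec. The specific times are illustrative assumptions, and a spec can hold only one `schedule` key, so the alternatives are commented out.

```yaml
# Fragment of a CronJob spec; the times are illustrative.
spec:
  schedule: "0 0 * * *"        # minute 0, hour 0: once per day at midnight
  # schedule: "*/5 * * * *"    # every 5 minutes
  # schedule: "0 9 * * 1-5"    # 09:00, Monday through Friday
```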
2. jobTemplate
- A CronJob doesn’t run containers directly. It creates Jobs, and Jobs create Pods.
- This section defines the template for those Jobs, just like a regular Job YAML spec.
3. restartPolicy
- Must be `OnFailure` or `Never`. `Always` is not allowed.
- It is set on the Pod template and tells Kubernetes what to do when a container in the Pod exits with a failure.
Key CronJob Fields
1. startingDeadlineSeconds
- Maximum time (in seconds) after the scheduled time within which the Job may still be started if it misses its schedule.
- Useful when your cluster is under heavy load and a Job start is delayed.
- Example: If this is set to `200` and the schedule is every 5 minutes, a run that the controller can only start 210 seconds after its scheduled time is skipped.
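As a sketch of where the field goes, the fragment below sets the 200-second deadline from the example on a CronJob that runs every 5 minutes. The rest of the spec is omitted.

```yaml
# Fragment of a CronJob spec; jobTemplate is omitted for brevity.
spec:
  schedule: "*/5 * * * *"
  startingDeadlineSeconds: 200   # a run that cannot start within 200 s of its scheduled time is skipped
```

Very small values are best avoided: the Kubernetes documentation notes that the controller only checks periodically (about every 10 seconds), so a deadline shorter than that may prevent the Job from being scheduled at all.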
2. concurrencyPolicy
Controls what happens if the previous Job hasn’t finished when the next one is scheduled.
- Allow (default): Runs Jobs concurrently.
- Forbid: Skips the new Job if the previous one hasn’t finished.
- Replace: Deletes the currently running Job and replaces it with the new one.
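A minimal fragment showing the field in place, assuming a task that must never overlap with itself:

```yaml
# Fragment of a CronJob spec; jobTemplate is omitted for brevity.
spec:
  schedule: "*/5 * * * *"
  concurrencyPolicy: Forbid   # one of: Allow (default), Forbid, Replace
```

`Forbid` suits tasks that take locks or write to shared state, while `Replace` fits tasks where only the most recent run matters.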
3. suspend
- Boolean field that disables a CronJob without deleting it.
- Use case: Temporarily stop scheduling (e.g., for maintenance).
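For illustration, suspending a CronJob only requires one field in the spec:

```yaml
# Fragment of a CronJob spec.
spec:
  suspend: true   # no new Jobs are created while this is true; Jobs already running are not affected
```

Runs skipped while the CronJob is suspended count as missed, so it may be worth pairing this with `startingDeadlineSeconds` to avoid a late run firing right after scheduling resumes.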
CronJob Execution Flow
- At the scheduled time, the CronJob controller checks whether a Job needs to be created.
- If conditions are met (not suspended, not too late), a Job is created from the `jobTemplate`.
- That Job runs its Pods as usual.
- Success/failure is recorded. If `ttlSecondsAfterFinished` is set on the Job, it is deleted automatically after that delay.
Real-World Example: Database Backup
```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-job
spec:
  schedule: "0 */6 * * *" # every 6 hours
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: my-backup-image
            args:
            - "/backup.sh"
          restartPolicy: OnFailure
```
CronJob Resource Management
1. successfulJobsHistoryLimit
- Number of successful Jobs to retain in history (default: 3).
- Helps avoid clutter while allowing for some audit trail.
2. failedJobsHistoryLimit
- Number of failed Jobs to retain (default: 1).
- Helps with debugging recurring failures.
```yaml
spec:
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
```
CronJobs and Timezones
- By default, a CronJob schedule is interpreted in the local time zone of the kube-controller-manager, which is usually UTC.
- Since Kubernetes v1.27, the `.spec.timeZone` field (an IANA time zone name such as `Etc/UTC`) sets the time zone per CronJob.
- On older clusters without that field:
  - Adjust schedule times manually.
  - Or run logic inside the container to sleep until the desired local time.
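On clusters running Kubernetes v1.27 or newer, the `.spec.timeZone` field accepts an IANA time zone name, which removes the need for manual offsets. A minimal sketch, where the schedule and the zone are assumptions chosen for illustration:

```yaml
# Fragment of a CronJob spec; requires a cluster that supports .spec.timeZone.
spec:
  schedule: "0 9 * * *"            # 09:00 in the zone named below
  timeZone: "America/New_York"     # illustrative IANA time zone name
```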
Cleanup and Lifecycle
- By default, a CronJob keeps the last 3 successful Jobs and the last failed Job; older finished Jobs are removed according to the history limits.
- Best practices:
  - Use `ttlSecondsAfterFinished` in the Job spec (`.spec.jobTemplate.spec.ttlSecondsAfterFinished`) to auto-delete finished Jobs.
  - Set `successfulJobsHistoryLimit` and `failedJobsHistoryLimit` to control retained Jobs.
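The two cleanup mechanisms sit at different levels of the manifest, which is easy to get wrong. The sketch below shows both; the one-hour TTL is an assumed value, not a recommendation.

```yaml
# Fragment of a CronJob spec; the Pod template is omitted for brevity.
spec:
  successfulJobsHistoryLimit: 3       # CronJob level: keep the last 3 successful Jobs
  failedJobsHistoryLimit: 1           # CronJob level: keep the last failed Job
  jobTemplate:
    spec:
      ttlSecondsAfterFinished: 3600   # Job level (assumed value): delete each Job one hour after it finishes
```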
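Common Pitfalls
1. Time Drift / Missed Schedules
- If runs are missed because of node or controller downtime, the controller does not replay each one. A missed run can still start late only while it is within `startingDeadlineSeconds` (no deadline applies when the field is unset), and after more than 100 missed schedules the controller stops starting Jobs and logs an error.
2. Overlapping Jobs
- `concurrencyPolicy` defaults to `Allow`, so if it is not set, multiple overlapping Jobs can cause resource contention.
3. Pod Failure and Retry
- CronJobs depend on the Job retry logic (`backoffLimit`) and the Pod `restartPolicy`.
- Make sure failure handling is correctly configured.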
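To show how the two retry settings relate, here is a sketch in which `backoffLimit` sits on the Job spec and `restartPolicy` on the Pod spec. The container name, the failing command, and the limit of 2 are assumptions for illustration.

```yaml
# Fragment of a CronJob spec focused on failure handling.
spec:
  jobTemplate:
    spec:
      backoffLimit: 2              # assumed value: the Job is marked failed after 2 retries (default is 6)
      template:
        spec:
          restartPolicy: Never     # each retry runs in a new Pod; OnFailure restarts the container in place
          containers:
          - name: task             # hypothetical container
            image: busybox
            command: ["/bin/sh", "-c", "exit 1"]   # always fails, to exercise the retry path
```

With `Never`, the failed Pods remain available for log inspection, which tends to make debugging easier than with in-place restarts.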
When to Use CronJob vs Other Controllers
| Use Case | Recommended Resource |
|---|---|
| One-time task | Job |
| Periodic scheduled task | CronJob |
| Long-running services | Deployment / StatefulSet |
| Real-time triggered tasks | Event-based controller (e.g., Argo Events, Knative Eventing) |
Best Practices Summary
- Use `restartPolicy: OnFailure`.
- Set `backoffLimit` to control retries.
- Limit job history for better performance.
- Use `suspend: true` to temporarily disable scheduling.
- Monitor execution via Job/Pod logs.
- Validate your cron expression with online tools.
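Putting the practices above together, a complete manifest might look like the following. This is a sketch under stated assumptions: the name `nightly-report`, the schedule, and every numeric value are hypothetical and should be tuned to the workload.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report                # hypothetical name
spec:
  schedule: "0 2 * * *"               # assumed: once per day at 02:00
  concurrencyPolicy: Forbid           # skip a run if the previous one is still active
  startingDeadlineSeconds: 300        # assumed: skip runs that are more than 5 minutes late
  suspend: false                      # set to true to pause scheduling
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      backoffLimit: 2                 # assumed: at most 2 retries per Job
      ttlSecondsAfterFinished: 86400  # assumed: delete finished Jobs after one day
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: report              # hypothetical container
            image: busybox
            command:
            - /bin/sh
            - -c
            - date; echo Generating report
```

After saving this to a file, it can be applied with `kubectl apply -f <file>`, and the resulting runs inspected with `kubectl get cronjob` and `kubectl get jobs`.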