Cronjob monitoring with Prometheus Node exporter
I had been looking for a way to monitor my cron jobs for a long time. Last week I stumbled across the Prometheus Node exporter Textfile Collector and found out that it is intended for exactly this: exporting statistics from jobs.
The Textfile Collector parses every file with the *.prom extension in a specified directory. So you only need to write your metrics to such a file in the Prometheus text format, and Node exporter will export them on the next Prometheus scrape.
Node exporter configuration
The command-line flag --collector.textfile.directory must be set to the directory where you want to put your files. The Community Ansible role uses /var/lib/node_exporter, so I used that directory too. On OpenBSD I use /var/node_exporter.
Because there are so many ways to install or run node exporter, I cannot describe exactly where you need to add the command line flag. I installed it manually, so I needed to edit /etc/systemd/system/node_exporter.service.
ExecStart=/usr/local/bin/node_exporter --web.listen-address="0.0.0.0:9100" --collector.processes --collector.cpu.info --collector.textfile.directory=/var/lib/node_exporter
For more information, check the GitHub README.
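After restarting Node exporter, it can be useful to confirm that the collector is active. The following is only a sketch: filter_textfile_metrics is a hypothetical helper name, and the cronjob_ prefix assumes the metric naming used later in this post. The node_textfile_scrape_error series is exported by the Textfile Collector itself and should be 0 when all files were read without problems.

```shell
#!/bin/bash
set -eu

# Hypothetical helper: keep only the series that matter for this setup
# from a metrics dump on stdin. "cronjob_" is the prefix used in this post.
filter_textfile_metrics() {
    # "|| true" keeps "set -e" from aborting when nothing matches
    grep -E '^(node_textfile_|cronjob_)' || true
}

# Usage (assumes Node exporter listens on localhost:9100, as in the unit above):
#   curl -s http://localhost:9100/metrics | filter_textfile_metrics
```

If the output is empty, the flag is most likely missing or points to a different directory.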
Shell script example
Below is a short shell script example of how such a file can be written. The documentation recommends writing to a temporary file first and then renaming it with mv, so that Node exporter never reads a half-written file.
#!/bin/bash
# exit if a command returns a nonzero exit status or an undefined variable is used!
set -eu
# work like rsync, ...
cat << EOF > /var/lib/node_exporter/backup.prom.$$
# HELP cronjob_last_run_24h Cronjob last run, in unixtime.
# TYPE cronjob_last_run_24h gauge
cronjob_last_run_24h{name="backup"} $(date +%s)
EOF
mv /var/lib/node_exporter/backup.prom.$$ /var/lib/node_exporter/backup.prom
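If several cron jobs report this way, the write-then-rename step could be pulled into a small function. This is a minimal sketch, not part of the original script: write_last_run is a hypothetical name, and the default directory is the one used in this post.

```shell
#!/bin/bash
set -eu

# Hypothetical helper: atomically publish a "last run" timestamp for one job.
#   $1 = job name (used as the label value and as the file name)
#   $2 = textfile directory (defaults to the directory used in this post)
write_last_run() {
    local name="$1"
    local dir="${2:-/var/lib/node_exporter}"
    # the temporary name does not end in .prom, so the collector ignores it
    local tmp="${dir}/${name}.prom.$$"

    cat << EOF > "${tmp}"
# HELP cronjob_last_run_24h Cronjob last run, in unixtime.
# TYPE cronjob_last_run_24h gauge
cronjob_last_run_24h{name="${name}"} $(date +%s)
EOF
    # rename within the same directory so a scrape never sees a partial file
    mv "${tmp}" "${dir}/${name}.prom"
}
```

Calling write_last_run backup as the last line of a job script keeps the behaviour of the example above: with set -e, the line is only reached if everything before it succeeded.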
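As you can see above, I use it to check that my cron jobs finish without an error. Because of set -e, the script aborts at the first failing command, so the timestamp is only updated when everything before it succeeded. I have a Prometheus rule that checks that no cronjob_last_run_24h Unix time is older than 24 hours. If I have a cron job that runs more often than once every 24 hours, I will add another metric and check for it; for example, cronjob_last_run_1h for 1-hour or cronjob_last_run_1m for 1-minute intervals.
Below you can see the Prometheus rule. The threshold of 90000 seconds is 25 hours, which gives a daily job one hour of slack: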
groups:
  - name: node
    rules:
      - alert: Cronjob Last Run 24 Hours
        expr: (time() - cronjob_last_run_24h) >= 90000
        for: 1m
        annotations:
          instance: '{{ $labels.instance }}'
          description: 'Cronjob should have run!'
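The same comparison can be tried locally against a .prom file, for example to test a threshold before putting it into a rule. This is only a sketch under assumptions: is_stale is a hypothetical helper, and it expects the file to contain a single sample line in the form written by the script above.

```shell
#!/bin/bash
set -eu

# Hypothetical helper: return 0 if the sample in a .prom file is at least
# $2 seconds old, mirroring (time() - cronjob_last_run_24h) >= 90000.
#   $1 = path to the .prom file
#   $2 = threshold in seconds (defaults to 90000, i.e. 25 hours)
#   $3 = "now" as Unix time (defaults to the current time; handy for testing)
is_stale() {
    local file="$1"
    local threshold="${2:-90000}"
    local now="${3:-$(date +%s)}"
    local last

    # value of the first non-comment line: metric{labels} VALUE
    last=$(awk '!/^#/ && NF >= 2 { print $NF; exit }' "${file}")
    # no sample found: report an error instead of guessing
    [ -n "${last}" ] || return 2
    [ $(( now - last )) -ge "${threshold}" ]
}
```

For example, if is_stale /var/lib/node_exporter/backup.prom; then echo "backup is overdue"; fi prints the message once the timestamp is 25 hours old or older.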
I'm super happy with this setup because I finally know that my cron jobs are working as intended, without having to check whether I received an error mail.
30 December 2025 - Philipp Keschl