Cronjob monitoring with Prometheus Node exporter

For a long time I have been looking for a way to monitor my cron jobs. Last week, I stumbled across the Textfile Collector of the Prometheus Node exporter and found out that it is intended for exactly this: exporting statistics from batch jobs.

The Textfile Collector parses every file with the *.prom extension in a specified directory. So you only need to write your metrics in the Prometheus text format to such a file, and Node exporter will export them on the next Prometheus scrape.

Node exporter configuration

The command line flag --collector.textfile.directory must point to the directory where you want to put your files. The Community Ansible role uses /var/lib/node_exporter, so I used that directory too. On OpenBSD I use /var/node_exporter. Because there are so many ways to install and run Node exporter, I cannot describe exactly where you need to add the command line flag. I installed it manually, so I had to edit /etc/systemd/system/node_exporter.service.

ExecStart=/usr/local/bin/node_exporter --web.listen-address="0.0.0.0:9100" --collector.processes --collector.cpu.info --collector.textfile.directory=/var/lib/node_exporter

For more information, check the GitHub README.
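Once Node exporter is running with the new flag, a quick look at its output can confirm that the collector is active. The snippet below is only a small sketch: the function name textfile_status is made up for this example, and the URL in the usage comment assumes the listen address from the ExecStart line above. It filters a /metrics dump down to the two bookkeeping metrics the Textfile Collector exports about itself.

```shell
#!/bin/bash

# Hypothetical helper: keep only the textfile collector's own bookkeeping
# metrics from a /metrics dump read on stdin:
#   node_textfile_scrape_error   1 if reading the directory or a file failed
#   node_textfile_mtime_seconds  modification time of each parsed .prom file
textfile_status() {
  grep -E '^node_textfile_(scrape_error|mtime_seconds)'
}

# Example (assumes Node exporter listens on localhost:9100):
#   curl -s http://localhost:9100/metrics | textfile_status
```

A node_textfile_scrape_error value of 0 means the directory and its files could be read without problems.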

Shell script example

Below is a short shell script that shows what such a job can look like. The documentation recommends writing to a temporary file and renaming it with mv, so that Node exporter never reads a half-written file.

#!/bin/bash

# exit if a command returns a nonzero exit status or an undefined variable is used!
set -eu

# work like rsync, ...

cat << EOF > /var/lib/node_exporter/backup.prom.$$
# HELP cronjob_last_run_24h Cronjob last run, in unixtime.
# TYPE cronjob_last_run_24h gauge
cronjob_last_run_24h{name="backup"} $(date +%s)
EOF

mv /var/lib/node_exporter/backup.prom.$$ /var/lib/node_exporter/backup.prom

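As you can see above, I use it to check that my cron jobs finish without an error. Because of set -eu, the script aborts at the first failing command and never reaches the step that writes the metric, so the timestamp only advances after a successful run. I have a Prometheus rule that checks that no cronjob_last_run_24h timestamp is older than 24 hours; the rule below actually allows 90000 seconds (25 hours), which leaves an hour of slack. If a cron job runs more often than once every 24 hours, I add another metric and a matching check for it; for example, cronjob_last_run_1h for 1-hour or cronjob_last_run_1m for 1-minute intervals.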
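Since the metric name encodes the expected interval, the write-and-rename step can be wrapped in a small helper. The following is one possible sketch, not something Node exporter provides: the function name write_last_run_metric and its arguments are hypothetical, and the default directory simply follows the path used in this post.

```shell
#!/bin/bash

# Hypothetical helper (not part of Node exporter): write a "last run" metric
# for one cron job into the textfile collector directory.
#   $1 = job name, used as the "name" label and as the file name
#   $2 = interval suffix of the metric name, e.g. 24h, 1h or 1m
#   $3 = textfile directory (assumed default: /var/lib/node_exporter)
write_last_run_metric() {
  local name="$1" interval="$2" dir="${3:-/var/lib/node_exporter}"
  local metric="cronjob_last_run_${interval}"
  local tmp="${dir}/${name}.prom.$$"

  # Write to a temporary file first; its name does not end in .prom,
  # so the collector ignores it.
  cat << EOF > "$tmp"
# HELP ${metric} Cronjob last run, in unixtime.
# TYPE ${metric} gauge
${metric}{name="${name}"} $(date +%s)
EOF

  # Rename into place so Node exporter never reads a half-written file.
  mv "$tmp" "${dir}/${name}.prom"
}
```

At the end of an hourly job, this could be called as write_last_run_metric sync 1h, where "sync" is just an example job name.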

Below you can see the Prometheus rule:

groups:
  - name: node
    rules:
      - alert: Cronjob Last Run 24 Hours
        expr: (time() - cronjob_last_run_24h) >= 90000
        for: 1m
        annotations:
          instance: '{{ $labels.instance }}'
          description: 'Cronjob should have run!'
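The expression subtracts the exported timestamp from the current time and fires once the age reaches 90000 seconds. To get a feeling for the threshold, the same arithmetic can be reproduced in shell against a .prom file. This is only a rough illustration: the function names metric_age and is_stale are made up for this sketch, and it assumes the file contains exactly one sample line, as written by the script above.

```shell
#!/bin/bash

# Hypothetical helper: print the age in seconds of the sample in a .prom file.
# Assumes a single sample line of the form: metric{labels} <unix timestamp>
#   $1 = path to the .prom file
#   $2 = current Unix time (optional, defaults to now)
metric_age() {
  local file="$1" now="${2:-$(date +%s)}" last_run
  # Skip "# HELP" / "# TYPE" comment lines and take the last field of the sample.
  last_run=$(awk '!/^#/ && NF { print $NF }' "$file")
  echo $(( now - last_run ))
}

# Mirrors the rule's expression: (time() - cronjob_last_run_24h) >= 90000
is_stale() {
  [ "$(metric_age "$1" "${2:-}")" -ge 90000 ]
}
```

For example, is_stale /var/lib/node_exporter/backup.prom returns success only if the last successful run is at least 25 hours old.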

I'm super happy with this setup because I finally know that my cron jobs are working as intended, without having to check whether I received an error mail.

30 December 2025 - Philipp Keschl