BackupPC Monitoring with Heartbeats

BackupPC is a widely adopted open-source backup system, valued for its flexibility, pool-based deduplication, and agentless design: the server pulls data from clients, Linux and Windows alike, over standard transports such as rsync and SMB, with no BackupPC-specific software installed on them. It usually runs quietly in the background, diligently backing up your critical data. And that's precisely where the problem lies.

The "set it and forget it" mentality, while tempting, is a dangerous trap when it comes to backups. A silent failure in BackupPC can go unnoticed for days, weeks, or even months, leaving you with a gaping hole in your disaster recovery plan. You only discover the problem when you desperately need to restore data, and by then, it's often too late.

Manual checks are tedious and prone to human error. Relying on complex log file parsing can be reactive and challenging to maintain consistently. What you need is a proactive, simple, and reliable way to confirm that your BackupPC jobs are actually running successfully. This is where heartbeat monitoring shines.

The Core Problem: Silent Failures

BackupPC operates predominantly in the background, driven by its own daemon and internal scheduler. When things go wrong, the signs can be subtle:

  • Disk Full: The BackupPC server runs out of storage, causing dumps to fail without necessarily crashing the service itself.
  • Network Issues: A client becomes unreachable, or the BackupPC server loses network connectivity to its storage or clients.
  • Client Offline/Misconfiguration: A client machine is powered off, renamed, or its rsyncd configuration changes.
  • Permissions Problems: BackupPC no longer has the necessary permissions to read client data or write to its own storage.
  • BackupPC Service Stalled: The BackupPC daemon, or its BackupPC_nightly maintenance run, hangs or fails to complete.

In all these scenarios, a scheduled backup may start but never complete successfully. From the outside everything can look normal: the daemon is running and BackupPC_dump was launched, yet launching a dump is not the same as finishing one. You might receive no explicit alert, leaving you vulnerable.

Traditional monitoring often focuses on the presence of a process or the size of a log file. But for backups, you need confirmation of completion and success. This is a critical distinction.
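The difference between "it started" and "it finished successfully" comes down to the exit status. The sketch below is a minimal illustration, not anything BackupPC ships: report_outcome is a hypothetical helper that runs whatever command it is given and reports the two events separately.

```shell
#!/bin/sh
# Hypothetical helper: separate "the job was launched" from "the job succeeded".
# "$@" is the command to run; nothing here is specific to BackupPC.
report_outcome() {
    # A process-presence check can only ever observe this much.
    echo "started: $*"

    if "$@"; then
        # Reached only when the command exits with status 0.
        echo "completed: exit 0"
    else
        status=$?
        echo "failed: exit $status"
        return "$status"
    fi
}

# Usage: report_outcome /path/to/some-backup-command --its-args
```

A monitor that only sees the "started" line cannot tell the two outcomes apart, which is the gap the rest of this article sets out to close.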

What is a Heartbeat and Why is it Perfect for BackupPC?

A heartbeat, in the context of monitoring, is a simple signal sent by a scheduled job to a monitoring service. This signal is typically an HTTP GET request to a unique URL provided by the monitoring service. The service expects to receive this "heartbeat" within a predefined interval. If the heartbeat doesn't arrive on time, it triggers an alert (e.g., Slack, Discord, email).
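On the sending side, a heartbeat can be as small as the sketch below. The send_heartbeat name and the example.com URL are placeholders chosen for illustration; the curl options are standard ones, picked so that a slow or failing monitoring endpoint cannot hang the job that calls it.

```shell
#!/bin/sh
# Placeholder URL: substitute the unique URL issued by your monitoring service.
HEARTBEAT_URL="https://monitor.example.com/ping/your-unique-id"

# Send one heartbeat as an HTTP GET request.
#   -f         treat HTTP error responses (4xx/5xx) as failures
#   -sS        suppress the progress meter but still print errors
#   -m 10      give up after 10 seconds in total
#   --retry 3  retry transient failures up to three times
#   -o /dev/null  discard the response body
send_heartbeat() {
    curl -fsS -m 10 --retry 3 -o /dev/null "$1"
}

# Usage: send_heartbeat "$HEARTBEAT_URL"
```

The function returns curl's exit status, so a caller can log a failed ping without treating it as a failed backup.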

Why is this ideal for BackupPC?

  • Confirmation of Success: Unlike simply checking if BackupPC processes are running, a heartbeat confirms that a specific backup job or the entire nightly routine has completed successfully. You integrate the heartbeat at the very end of the success path.
  • Simplicity: Sending a heartbeat is a single curl command. It's lightweight and has minimal overhead.
  • Proactive Alerts: Instead of reacting to a problem found in logs, you get an alert when the expected success signal doesn't arrive. This shifts your monitoring paradigm from "something failed" to "something didn't succeed as expected."
  • Works with Existing Infrastructure: You don't need complex agents or custom daemons. It leverages standard shell commands and HTTP.
  • Granular or Global: You can monitor individual client backups or the overall health of your BackupPC server's nightly routine.
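Two of the points above, placing the heartbeat at the end of the success path and choosing between per-host and server-wide monitors, can be sketched in a few lines of shell. Everything named here is an assumption for illustration: the base URL, the host-NAME and backuppc-server monitor IDs, and both function names. A real service issues its own URLs.

```shell
#!/bin/sh
# Hypothetical base URL and ID scheme.
HEARTBEAT_BASE="https://monitor.example.com/ping"

# Map a host name to a monitor URL. An empty name selects a single
# server-wide monitor; any other name selects a per-host monitor.
heartbeat_url_for() {
    case "$1" in
        "") echo "$HEARTBEAT_BASE/backuppc-server" ;;
        *)  echo "$HEARTBEAT_BASE/host-$1" ;;
    esac
}

# Run a command and ping only if it exits 0, so the heartbeat sits at
# the very end of the success path. A failed command sends nothing,
# and the missing ping is what raises the alert.
run_with_heartbeat() {
    hb_url=$1
    shift
    "$@" && curl -fsS -m 10 --retry 3 -o /dev/null "$hb_url"
}

# Usage: run_with_heartbeat "$(heartbeat_url_for webserver)" /path/to/job
```

Note that there is no failure branch: the design relies on silence, not on an explicit error report, to signal a problem.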

Integrating Heartbeats into BackupPC's Workflow

BackupPC offers powerful configuration options to inject custom commands at various stages of a backup. The most relevant for heartbeat monitoring is DumpPostUserCmd, written as $Conf{DumpPostUserCmd} in the configuration files.

Example 1: Monitoring Individual Host Backups with DumpPostUserCmd

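The DumpPostUserCmd configuration variable lets you specify a command that BackupPC runs after each dump of a specific host. BackupPC runs it whether or not the dump succeeded, and it makes the outcome available through the $xferOK substitution variable (1 if the dump succeeded, 0 if it did not). A command that checks $xferOK and pings only on success is the perfect place to send a heartbeat for that host.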
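As a rough preview, a hook for this setting might look like the sketch below. BackupPC's documentation describes $xferOK and $host as values it substitutes into user commands, and notes that those commands are executed directly rather than through a shell, which is why the logic lives in a small script. The script path, the function name, and the URL scheme (one monitor ID per host name) are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch of a DumpPostUserCmd hook. Assumed invocation (hypothetical path):
#   /usr/local/bin/backuppc-heartbeat.sh $xferOK $host
# BackupPC substitutes $xferOK (1 = dump succeeded, 0 = it did not)
# and $host before running the command.

# Placeholder base URL: use the URLs issued by your monitoring service.
HEARTBEAT_BASE="https://monitor.example.com/ping"

backuppc_heartbeat() {
    xfer_ok=$1
    host=$2

    if [ "$xfer_ok" != "1" ]; then
        # Stay silent on failure: the missing heartbeat triggers the alert.
        echo "dump for $host did not succeed; no heartbeat sent" >&2
        return 0
    fi

    # Successful dump: send the heartbeat for this host.
    curl -fsS -m 10 --retry 3 -o /dev/null "$HEARTBEAT_BASE/$host"
}

# When installed as a script, end the file with:
#   backuppc_heartbeat "$@"
```

Checking $xferOK inside the script keeps the guard in place even if the hook is later reused from another setting.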

First, in Heartfly, create a monitor for each critical BackupPC client. Say you have a client named webserver.example.com: create a monitor for it, set its expected interval (for example, 25 hours for a daily backup, which leaves an hour of slack for runs that start or finish late), and Heartfly will give you a unique heartbeat URL, something like https://cron2.91-99-176-101.nip.io/ping/your-unique-id-webserver.
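To see why the interval is set slightly longer than the backup period, it helps to model the check a monitoring service performs. The sketch below is a simplified assumption about how such a check could work, not a description of Heartfly's implementation; is_overdue and INTERVAL are illustrative names.

```shell
#!/bin/sh
# 25 hours expressed in seconds: a 24-hour backup period plus 1 hour of slack.
INTERVAL=$(( 25 * 60 * 60 ))   # 90000

# Succeed (exit 0) if more than interval_seconds have passed since the
# last heartbeat. Arguments: last_ping_epoch now_epoch interval_seconds
is_overdue() {
    last_ping=$1
    now=$2
    interval=$3
    [ $(( now - last_ping )) -gt "$interval" ]
}

# Usage: is_overdue "$last_ping_epoch" "$(date +%s)" "$INTERVAL" && echo "ALERT"
```

With an interval of exactly 24 hours, a backup that finishes a few minutes later than the previous day's would trip the alert even though nothing is wrong.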

Now, on your BackupPC server, you'll edit the configuration for `