How to perform a full system backup


The backup utility serves two purposes: to provide a basic means of taking a snapshot of an entire GroundWork Monitor system as a regular backup during maintenance windows, and to generate and use such a backup around a system upgrade, as a fallback measure. The discussion here centers on the second use case.

Starting with the GWMEE 7.0.2 release, the backup tool is named gw-backup-br387.2-linux-64. This version supports backing up and restoring only PostgreSQL-based GroundWork Monitor releases (6.6.0 and later). The tool is distributed as an attachment to this page; there are currently no attachments on this page.

When using this tool during a system upgrade, run it on the GroundWork Monitor server, even if that system references a remote database.

Special-case use for sites with large RRD-file collections
The instructions shown immediately below will suffice for running the tool on most systems. However, if you have a large site with a lot of storage space allocated in these directories:
  • /usr/local/groundwork/rrd/
  • /usr/local/groundwork/cacti/htdocs/rra/
  • /usr/local/groundwork/nagios/var/archives/

then a special procedure may be more efficient during a system upgrade. See the Upgrading a large site section below for details.

Special-case handling for a separate archive database
If you have the unusual case where your archive_gwcollagedb database does not reside in the same PostgreSQL instance (local or remote) as the rest of the GroundWork databases accessed by a single GroundWork Monitor server, special steps will need to be taken to capture and/or restore all the database data. Contact GroundWork Support in this situation, and have them ask for instructions from GroundWork Engineering.




1.0 Backing up to a backupdb-type tarball

When using this tool during a system upgrade, use the --action backupdb form of backup, not the --action backup form. This captures the full set of GroundWork files, including both GroundWork RRD files and Cacti RRD files. Embedded inside the backupdb tarball is also a full dump of all the PostgreSQL databases.

2.0 Restoring from a backupdb-type tarball

Unrolling the backupdb tarball is not enough
The backup tool takes a variety of actions beyond just unrolling the backupdb tarball, to ensure the completeness and integrity of the restored system. Do not try to bypass the tool by unrolling the tarball manually.

To restore the system from a backup taken as described above, you will need to first re-install a fresh copy of the GroundWork version that it represents. (This is why you renamed the backup file above to include the GroundWork version number, so there can be no mistake now about what version to install.) During this re-install, you must use exactly the same postgres database-user password as was in force when the backup was taken. This system refresh will both provide the necessary files for the restore operation to function, and (in a failed-upgrade scenario, where you're rolling back to the previously-installed release) eliminate any mixture of files from the old and new releases.

3.0 Upgrading a large site

The procedure outlined above will work on any system, large or small, whether you have local or remote databases, provided you have enough space for the backup file. However, that procedure may be somewhat inefficient during a system upgrade if you have a large amount of space allocated for RRD files and certain log files. In such cases, the time taken to back up and restore these particular file collections may be excessive. An alternative procedure is available for such cases. This procedure manually moves aside the large file collections, then puts them back afterward. Keeping these file collections out of the backup file makes the processing faster, at the cost of you having to manage these parts of the system on your own.
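Whether there is "enough space for the backup file" can be checked before starting. The following is a minimal sketch and not part of the GroundWork tooling: the function name has_free_space and both of its arguments are hypothetical, and the required size must come from your own measurement of the backup.

```shell
#!/bin/sh
# has_free_space DIR REQUIRED_MB
# Succeeds if the filesystem holding DIR has at least REQUIRED_MB megabytes
# available. Hypothetical helper; not part of the backup tool.
# Returns 0 if there is enough room, 1 if not, 2 if df output is unusable.
has_free_space() {
    dir=$1
    required_mb=$2
    # df -P prints one POSIX-format line per filesystem; -k reports sizes
    # in 1024-byte blocks. Field 4 of the data line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kb" ] || return 2
    [ "$avail_kb" -ge $((required_mb * 1024)) ]
}
```

Call it with the directory where the tarball will be written and the measured size in megabytes, and stop if it fails. It is wise to leave some headroom beyond the measured size.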

Be very careful when following this procedure
This procedure speeds up the processing by keeping large collections of critical data out of the backup tarball. This means you won't have any extra copy of this data lying around to recover from if you make a mistake. So tread carefully, checking your work at every stage to make sure both that you have saved the data aside properly and that you do not inadvertently destroy it by making mistakes in these commands. If any of that concerns you, and you prefer the safety and convenience of more automated procedures, then make sure you have enough space for a backupdb tarball, accept the extra time it will take to save and possibly restore these file and database collections, and follow the simpler main procedure above.
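One way to check your work is to record a manifest of each file tree before moving it and compare it with a manifest of the tree in its new location. This is a sketch only, not part of the documented procedure: the function name tree_manifest is hypothetical, and it assumes GNU find (for -printf).

```shell
#!/bin/sh
# tree_manifest DIR
# Prints "SIZE PATH" for every regular file under DIR, with paths relative
# to DIR and sorted, so that manifests taken before and after a move can be
# compared directly. Hypothetical helper; assumes GNU find.
tree_manifest() {
    ( cd "$1" && find . -type f -printf '%s %p\n' | LC_ALL=C sort )
}
```

Save the output to a file before moving a tree, run the function again on the moved tree, and compare the two with diff. Any difference means files were lost or changed along the way. Stop the system first, as the procedure requires, so that files do not change between the two manifests.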

The procedure in this case is as follows.

Back up a large system

Measure the space required by the backup, as described in the main procedure above. Also take the following measurements:

du -sm /usr/local/groundwork/rrd
du -sm /usr/local/groundwork/cacti/htdocs/rra
du -sm /usr/local/groundwork/nagios/var/archives

If these locations represent a large amount of space (say, greater than 1 GB; this is a matter of judgment), it may be worthwhile to follow the instructions here. This will result in a tarball that does not include the file trees at these locations. This tarball will also not include a dumpfile of all your PostgreSQL application databases, so saving and restoring them will need to be handled separately if you are using a remote database. (For a local-database-only setup, the raw database files will be part of the tarball, and that will suffice.)
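The three measurements can be added up and compared against the rough 1 GB guide in one step. This is a minimal sketch: the function name total_mb and the variable threshold_mb are hypothetical, and the threshold is only the judgment-call figure mentioned above.

```shell
#!/bin/sh
# total_mb DIR...
# Prints the combined size, in megabytes, of the directories given.
# Directories that do not exist are skipped. Hypothetical helper.
total_mb() {
    sum=0
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        # du -sm prints "SIZE<TAB>PATH"; keep only the size.
        size=$(du -sm "$dir" | awk '{ print $1 }')
        sum=$((sum + size))
    done
    echo "$sum"
}

# Rough guide only (1 GB); adjust to your own judgment.
threshold_mb=1024
total=$(total_mb /usr/local/groundwork/rrd \
                 /usr/local/groundwork/cacti/htdocs/rra \
                 /usr/local/groundwork/nagios/var/archives)
if [ "$total" -gt "$threshold_mb" ]; then
    echo "Large collections total ${total} MB: the large-site procedure may be worthwhile."
else
    echo "Large collections total ${total} MB: the main procedure is probably sufficient."
fi
```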

Examine and correct the filesystem structure before continuing
Because the file-tree movement described below attempts to move data within the same filesystem but outside the /usr/local/groundwork/ file tree, this process will only work if /usr/local/groundwork is not the root of a mounted filesystem. If it is, fix that up first, by shutting down the system, then moving around mount points and/or symlinks, before proceeding with the procedure below.
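The filesystem-structure condition can be tested from the shell. The sketch below compares the device and inode numbers of a directory with those of its parent. The function name is_mount_root is hypothetical, and the code assumes GNU stat (for -c). It will not detect a bind mount made from within the same filesystem, so inspect the mount table as well if your site uses bind mounts.

```shell
#!/bin/sh
# is_mount_root DIR
# Returns 0 if DIR is the root of a mounted filesystem, 1 if it is not,
# and 2 if DIR does not exist. Hypothetical helper; assumes GNU stat.
is_mount_root() {
    dir=$1
    [ -d "$dir" ] || return 2
    # "DEVICE INODE" for the directory itself and for its parent.
    self=$(stat -c '%d %i' "$dir/.") || return 2
    parent=$(stat -c '%d %i' "$dir/..") || return 2
    # A different device number means DIR is a mount point; an identical
    # device and inode means DIR is "/" itself.
    [ "${self%% *}" != "${parent%% *}" ] || [ "$self" = "$parent" ]
}

gw_root=/usr/local/groundwork
if [ ! -d "$gw_root" ]; then
    echo "$gw_root not found on this host"
elif is_mount_root "$gw_root"; then
    echo "$gw_root is the root of a mounted filesystem: fix this before proceeding"
else
    echo "$gw_root is not a mount root: file trees can be moved aside within the same filesystem"
fi
```

On systems with util-linux installed, the mountpoint command with its -q option performs a similar check.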

Instead of running the backup tool in --action backupdb mode, do the following:

Perform the system upgrade

Follow the standard release instructions to upgrade your system to the later release.

Upon a failed upgrade, restore the entire system

If the upgrade failed, take these steps to roll back completely to the previous release. (The archived Nagios log files can probably just be ignored, as they will have little value going forward.)

Upon a successful upgrade, restore the RRD files

If the upgrade worked, take these steps to put back your saved RRD files. (Again, the archived Nagios log files can probably just be ignored, as they will have little value going forward.)

At this point, your old data should be back in place, and your system should be fully operational.

Beware of incomplete backups
Once you have restored the RRD files this way, your --action backup tarball will be of limited value for any future restore operation. It does not contain the RRD files, and the saved copies have been moved back into their operational locations, so a future restore could not recover them unless you first save them again (and only if they still exist when you need them, i.e., presuming your disk didn't crash). So at this point, you should take whatever steps you deem appropriate to back up your entire system, as you would in preparation for a disaster-recovery operation.