This page covers the new configuration tools Downtimes and Recurring Downtimes for managing scheduled downtime. Scheduling downtime is useful during system maintenance: it lets GroundWork Monitor stop sending messages during the downtime period, which also helps produce more accurate data in SLA reports. We start with how to enable these tools and then cover setting up regular and recurring downtimes.
Enabling the Downtimes Tool
Use the attached archive to put the additional script into place, and then enable the cron job.
- Download the attached file to your GroundWork server:
TB6.7.0-6.downtime_job.tar.gz
- Extract the archive (run as the nagios user):
tar xf TB6.7.0-6.downtime_job.tar.gz -C /
- The crontab entry to run this script is commented out in the release. Uncomment it to start the job (run as the nagios user):
crontab -l | sed -e '/downtime/s/^\#//' | crontab -
- If you want the job to run more frequently than once an hour, you will need to change the schedule in the crontab entry. As shipped, the entry runs at 1 minute past the hour, every hour. The authors suggest running it once an hour or once a day. If you run it less frequently than once a day, you are apt to miss some requests for recurring downtime.
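To preview what the sed expression in the uncomment step does before its result is piped back into crontab, the following Python sketch applies the same rule to a sample crontab: on every line containing "downtime", remove one leading "#". The sample crontab lines are hypothetical and only illustrate the shape of the input; the entry shipped in the archive may look different.

```python
import re

def uncomment_downtime_lines(crontab_text):
    """Mimic sed -e '/downtime/s/^#//': strip one leading '#' from lines mentioning 'downtime'."""
    result = []
    for line in crontab_text.splitlines():
        if "downtime" in line:
            # Without re.MULTILINE, ^ matches only at the start of the line.
            line = re.sub(r"^#", "", line)
        result.append(line)
    return "\n".join(result)

# Hypothetical crontab content, for illustration only.
sample = "\n".join([
    "# some other job",
    "0 2 * * * /usr/local/bin/other_job",
    "#1 * * * * /usr/local/groundwork/nagios/bin/downtime_job.pl",
])

print(uncomment_downtime_lines(sample))
```

Note that the rule is purely textual: any commented line that happens to contain the word "downtime" would be uncommented as well, so it is worth checking the output of crontab -l afterwards.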
How It Works
Regular Downtime
When you set up regular downtime scheduling in the next section, your selections are turned into external commands for the running Nagios. Suppose you choose the following:
- Downtime of 60 minutes
- For a hostgroup "Linux Servers"
- Starting at 03:00 on December 22, 2012
As soon as you make the selection and press the final Add button, you can immediately look at the status.log file and view the addition. At this point Nagios has been advised and the downtime will be respected.
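As a rough illustration of what such an external command looks like, the sketch below builds a SCHEDULE_HOSTGROUP_HOST_DOWNTIME string for the selections above. The field order follows the commands shown in the debug output later on this page (fixed = 1, trigger id = 0, duration = 0). The function name, the author "admin", the comment text, and the use of UTC for the start time are assumptions made for the example; the application may well generate different commands for regular downtimes.

```python
from datetime import datetime, timedelta, timezone

def hostgroup_host_downtime(hostgroup, start, duration_minutes, author, comment):
    """Build a Nagios external command string that puts a host group's hosts in downtime.

    Field order: command;hostgroup;start;end;fixed;trigger_id;duration;author;comment
    """
    start_epoch = int(start.timestamp())
    end_epoch = int((start + timedelta(minutes=duration_minutes)).timestamp())
    # fixed=1, trigger_id=0, duration=0 mirror the values in this page's debug output.
    fields = [
        "SCHEDULE_HOSTGROUP_HOST_DOWNTIME",
        hostgroup,
        str(start_epoch),
        str(end_epoch),
        "1", "0", "0",
        author,
        comment,
    ]
    return ";".join(fields)

# 03:00 on December 22, 2012, interpreted as UTC for this example only.
start = datetime(2012, 12, 22, 3, 0, tzinfo=timezone.utc)
print(hostgroup_host_downtime("Linux Servers", start, 60, "admin", "scheduled maintenance"))
```

On a real server the start time would be interpreted in the server's local time zone, so the epoch values would differ. Nagios also expects each line in its external command file to begin with a bracketed submission timestamp, which this sketch leaves out.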
Recurring Downtime
The crontab entry and the newly added script work on recurring downtime selections. Making choices in the Recurring Downtimes panel will produce entries in the following file. The entries persist in that file until they are deleted using the application.
/usr/local/groundwork/nagios/var/downtime_schedule.cfg
That file is read by the regularly scheduled script:
/usr/local/groundwork/nagios/bin/downtime_job.pl
The entries are examined for jobs that might be significant in the coming 24 hours. Any entries found are converted into external commands and sent to Nagios. As with downtime initiated in any other way, Nagios removes the status.log entry once the designated time has passed. If you remove downtime requests through the application, the corresponding entries in the configuration file and in status.log are removed accordingly.
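The selection step can be pictured as a search for the next start time that satisfies an entry's filters within a look-ahead window. The sketch below is not the logic of downtime_job.pl itself; it is a minimal approximation that assumes local, naive timestamps and takes the window as a parameter, since the sample debug output on this page reports a window of 10080 minutes. The function and parameter names are made up for the example.

```python
from datetime import datetime, timedelta

# Index positions match datetime.weekday(), where Monday is 0.
DAY_NAMES = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]

def next_occurrence(now, time_hhmm, days_of_week, days_of_month, lookahead_minutes):
    """Return the first start time after `now` that matches both day filters, or None."""
    hour, minute = (int(part) for part in time_hhmm.split(":"))
    horizon = now + timedelta(minutes=lookahead_minutes)
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    while candidate <= horizon:
        if (candidate > now
                and DAY_NAMES[candidate.weekday()] in days_of_week
                and candidate.day in days_of_month):
            return candidate
        candidate += timedelta(days=1)  # try the same time on the next day
    return None

now = datetime(2012, 12, 19, 12, 0)  # a Wednesday
weekdays = {"mon", "tue", "thu", "fri", "sat", "sun"}
month_days = set(range(1, 32)) - {19}
print(next_occurrence(now, "23:20", weekdays, month_days, 10080))
```

With a window shorter than the gap to the next valid day, the function returns None, which is why a job that runs too rarely, or with too small a window, can miss a recurring downtime.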
The script prints debug comments to the command line. To see them, become the nagios user and run the script at a prompt. Here is an example of the configuration file downtime_schedule.cfg:
define schedule {
    sid 794192837294efd1cef4ad6b13969b67
    user rstools
    comment test linux
    time 23:20
    duration 60
    days_of_week mon,tue,thu,fri,sat,sun
    days_of_month 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,20,21,22,23,24,25,26,27,28,29,30,31
    schedule_type hostgroup
    hostgroup_name Linux Servers
}
define schedule {
    sid b0ea8267f9557b760c85bcdb745ef81b
    user rstools
    comment web maint
    time 01:00
    duration 600
    days_of_week sun
    days_of_month 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
    schedule_type hostgroup
    hostgroup_name Apps-MemberWeb
}
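For readers who want to inspect this file with their own tooling, here is a minimal parsing sketch. It assumes one directive per line, with the directive name separated from its value by whitespace, and it is not the parser used by downtime_job.pl. The function name is hypothetical, and the days_of_month list in the sample is shortened for brevity.

```python
def parse_schedules(text):
    """Parse 'define schedule { ... }' blocks into a list of dicts."""
    schedules = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("define schedule"):
            current = {}                      # start a new block
        elif line == "}":
            if current is not None:
                schedules.append(current)     # close the block
            current = None
        elif current is not None:
            parts = line.split(None, 1)       # directive name, then the rest as value
            current[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return schedules

# Sample based on the second entry above; days_of_month shortened for brevity.
SAMPLE = """\
define schedule {
    sid b0ea8267f9557b760c85bcdb745ef81b
    user rstools
    comment web maint
    time 01:00
    duration 600
    days_of_week sun
    days_of_month 1,2,3
    schedule_type hostgroup
    hostgroup_name Apps-MemberWeb
}
"""

for entry in parse_schedules(SAMPLE):
    print(entry["comment"], entry["time"], entry["days_of_week"].split(","))
```

Values are kept as strings so that names containing spaces, such as a host group called "Linux Servers", survive intact; callers can split the comma-separated day lists as needed.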
Here we run the downtime script and see the debug comments it prints:
run as nagios user
-bash-4.1$ /usr/local/groundwork/nagios/bin/downtime_job.pl
Reading in configuration
Reading in status log to get list of services
Reading /usr/local/groundwork/nagios/var/objects.cache...Done.
Reading in list of already scheduled downtime
Reading /usr/local/groundwork/nagios/var/nagiosstatus.sav ...Adding 172.28.113.155:1355985900
Adding 172.28.113.155:trap_unknown:1355985900
Done.
Checking for downtime due in next 10080 minutes
test linux:
Current candidate: 23:20 on 19/12/2012
Current candidate: 23:20 on 20/12/2012
Checking days of week: days (0,1,2,4,5,6) are valid
Scheduling for day 4 (today is 3, looking at scheds for 4 and later)
Current candidate: 23:20 on 20/12/2012
Scheduling hostgroup Linux Servers
Checking hostgroup representative localhost:1356074400
Sending command 'SCHEDULE_HOSTGROUP_HOST_DOWNTIME;Linux Servers;1356074400;1356078000;1;0;0;rstools;AUTO: test linux '
SCHEDULE_HOSTGROUP_HOST_DOWNTIME;Linux Servers;1356074400;1356078000;1;0;0;rstools;AUTO: test linux
Sending command 'SCHEDULE_HOSTGROUP_SERVICE_DOWNTIME;Linux Servers;1356074400;1356078000;1;0;0;rstools;AUTO: test linux '
SCHEDULE_HOSTGROUP_SERVICE_DOWNTIME;Linux Servers;1356074400;1356078000;1;0;0;rstools;AUTO: test linux
web maint:
Current candidate: 01:00 on 19/12/2012
Current candidate: 01:00 on 20/12/2012
Checking days of week: days (0) are valid
Advancing a week to day 0
Current candidate: 01:00 on 23/12/2012
Scheduling hostgroup Apps-MemberWeb
Checking hostgroup representative WB_Apps-MemberWeb:1356253200
Sending command 'SCHEDULE_HOSTGROUP_HOST_DOWNTIME;Apps-MemberWeb;1356253200;1356289200;1;0;0;rstools;AUTO: web maint '
SCHEDULE_HOSTGROUP_HOST_DOWNTIME;Apps-MemberWeb;1356253200;1356289200;1;0;0;rstools;AUTO: web maint
Sending command 'SCHEDULE_HOSTGROUP_SERVICE_DOWNTIME;Apps-MemberWeb;1356253200;1356289200;1;0;0;rstools;AUTO: web maint '
SCHEDULE_HOSTGROUP_SERVICE_DOWNTIME;Apps-MemberWeb;1356253200;1356289200;1;0;0;rstools;AUTO: web maint
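The commands in this output can be checked against the configuration file by splitting them on semicolons and comparing the two epoch timestamps. The sketch below does that for a command string copied from the output; the function name and the returned field names are chosen for the example, and the start time is rendered in UTC because the server's time zone is not stated on this page.

```python
from datetime import datetime, timezone

def describe_downtime_command(command):
    """Split a downtime command into its fields and derive the length in minutes."""
    (name, target, start, end,
     fixed, trigger_id, duration, author, comment) = command.split(";", 8)
    start, end = int(start), int(end)
    return {
        "command": name,
        "target": target,
        "start_utc": datetime.fromtimestamp(start, tz=timezone.utc).isoformat(),
        "minutes": (end - start) // 60,
        "author": author,
        "comment": comment.strip(),
    }

line = ("SCHEDULE_HOSTGROUP_HOST_DOWNTIME;Linux Servers;"
        "1356074400;1356078000;1;0;0;rstools;AUTO: test linux")
info = describe_downtime_command(line)
print(info["target"], info["start_utc"], info["minutes"])
```

For the "test linux" entry the derived length is 60 minutes, and for "web maint" it is 600 minutes, matching the duration values in downtime_schedule.cfg.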
Setting Up Regular Downtimes
This section reviews the Configuration Downtimes feature in GroundWork Monitor where you can schedule downtimes for a single specified date, time, and duration. We'll cover how to list, add, and delete regular downtimes by system hosts, host groups, and service groups.
Listing Downtimes
This command displays all scheduled downtimes and gives you the option to delete selected downtimes.
Using List Downtimes
- Go to Configuration>Downtimes>List Downtimes. A list of currently scheduled downtimes will be displayed.
- To remove all scheduled downtimes, click Delete all downtime(s).
- To remove specific scheduled downtimes, check the corresponding box at the end of each row and click Delete selected.
- You may refresh the current list of downtimes by selecting Refresh, as deleting downtimes can take some time to register.
Figure: List Downtimes
Adding Downtimes
This command enables you to add downtimes for Hosts, Host Groups, and Service Groups.
Using Add Host Downtime
Here you can indicate hosts and host services for scheduled downtime.
- Go to Configuration>Downtimes>Add host downtime. A list of current system hosts will be displayed.
- Using the check boxes and drop-down arrow for each host and service, select at least one host or service to place in downtime. Checking the box in the upper right corner selects all services; clicking the drop-down arrow in the upper right corner exposes all of the hosts' services.
- Next, select Add downtime. In the dialog box that is displayed, enter the downtime start time, end time, duration, and comment.
- Click Add to add the scheduled host downtime. You can view this downtime using the List downtimes option.
Figure: Add Downtime by Host
Using Add Host Groups Downtime
Here you can indicate hostgroups, hosts, and services for scheduled downtime.
- Go to Configuration>Downtimes>Add hostgroup downtime. A list of current system hostgroups will be displayed, along with their corresponding hosts and services.
- Use the check boxes and drop-down arrow to select what to put into downtime. If you check the box for a hostgroup, all of that hostgroup's hosts and host services will be selected for downtime. You may also select individual hosts or services separately.
- After you have made your selection(s), select Add downtime and define the time for the downtime in the next screen.
Figure: Add Downtime by Host Group
Using Add Service Groups Downtime
Here you can indicate servicegroups, hosts, and services for scheduled downtime.
- Go to Configuration>Downtimes>Add servicegroup downtime. A list of current system servicegroups will be displayed, along with their corresponding hosts and services.
- Enter at least one servicegroup, host, and service to place in downtime.
- After you have made your entries, select Add downtime and define the time for the downtime in the next screen.
Figure: Add Downtime by Service Group
Setting Up Recurring Downtimes
This section reviews the Configuration Recurring Downtimes feature in GroundWork Monitor. The previous option, Downtimes, lets you schedule downtimes for a single specified date, time, and duration. With this feature you can schedule downtimes for Hosts, Hostgroups, and Servicegroups with a recurring time, duration, days of the week, and days of the month. For example, you can set up a downtime at 8 PM, for 1 hour, on the second Friday of every month.
Adding a Schedule
- Go to Configuration>Recurring Downtimes.
- Select to add a host, hostgroup, or servicegroup downtime by selecting the corresponding tab and then selecting Add schedule.
- Next, define the recurring downtime as shown in the image below. Enter the name of the host, hostgroup, or servicegroup and any specific service(s). Wildcards can be used to specify multiple matches. Then enter a start time, a duration, and a comment describing the purpose of the downtime. Also enter the valid days for the downtime, keeping in mind that the boxes that remain checked are the ones that become valid.
Days of Week and Days of Month this schedule is valid:
- If you specify both Days of Week and Days of Month, then both must match. In our example we specify Days of Week = Friday and Days of Month = 8, 9, 10, 11, 12, 13, and 14, which are the only dates on which the second Friday of a month can fall.
- If any Days of Week are selected, the remaining days of week become valid. In our example below the gray boxes were selected and the checked boxes become the valid days.
- If any Days of Month are selected, the remaining days are valid. Again, the checked boxes indicate what days are valid.
- Select Create. You will see your recurring downtime listed. You can use the edit, delete, and copy icons for each scheduled downtime.
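The "both must match" rule for Days of Week and Days of Month can be verified with a short enumeration. The following sketch lists every date in a year that is a Friday and also falls on days 8 through 14 of its month. The year 2013 and the function name are arbitrary choices for the example, and the code only illustrates the matching rule, not how the application stores or evaluates a schedule.

```python
import calendar
from datetime import date

def matching_days(year, weekdays, month_days):
    """Return the dates in `year` whose weekday AND day of month are both valid."""
    hits = []
    for month in range(1, 13):
        _, last_day = calendar.monthrange(year, month)
        for day in range(1, last_day + 1):
            current = date(year, month, day)
            if current.weekday() in weekdays and day in month_days:
                hits.append(current)
    return hits

FRIDAY = 4  # date.weekday() counts from Monday = 0
second_fridays = matching_days(2013, {FRIDAY}, set(range(8, 15)))
for d in second_fridays:
    print(d.isoformat())
```

Because any seven consecutive dates contain exactly one Friday, the combination yields one date per month, which is the second Friday.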
Figure: Defining Recurring Downtimes
Figure: Recurring Downtimes Example