VMware

This page references the GroundWork Cloud Hub and the VMware virtualization environment.

1.0 Managing a VMware Connection

This section reviews how to add and configure the Cloud Hub connector for VMware. Each connector requires a unique set of parameters (e.g. URI, credentials). Have your GroundWork server and virtual environment connector parameters at hand before you begin.

1.1 Adding a new connection
  1. Log in to GroundWork Monitor as an Administrator.
  2. Select GroundWork Administration > GroundWork Cloud Hub. The Cloud Hub Configuration Wizard screen is displayed, where you can add and configure Cloud Hub connections for various virtual environments. For each established configuration you can start or stop the connection, modify its parameters, or remove the connection.
  3. To start a new connection, click the +Add icon next to the environment you want to add. Create a new connector in this way for each VMware region that is to be monitored.

    Figure: Cloud Hub Configuration Wizard
1.2 Configuring GroundWork server values
  1. Next, enter the GroundWork server values to access the region. You will need to point the Cloud Hub VMware connector to a GroundWork server, indicate if it supports SSL, and give it an API key to transmit data.

    Figure: GroundWork server values for VMware (Example)


1.3 Configuring virtualization server values
  1. Next, continue with the second half of the configuration wizard by entering the values for the virtualization server. The data that the GroundWork server receives comes from the VMware server; it is pulled from the API periodically, at the check interval that is set. You can also select which views to include.

    Figure: Values for a VMware connection (Example)


  2. Select SAVE, which saves the current connection values and writes the entries to an XML file in the /usr/local/groundwork/config/cloudhub directory on the GroundWork server. When you save, the Cloud Hub connector is assigned an agent ID, which in turn becomes a record locator in Foundation when you begin monitoring.
  3. Then, to validate the configuration, select TEST CONNECTION, which checks whether the virtual instance is accessible with the given credentials. If successful, you should see Connection successful! at the top of the screen.
  4. After the credentials have been validated, select NEXT to display the associated connection metrics screen, where you can determine the metrics to be monitored for VMware. (The HOME option takes you back to the first page of the configuration wizard.)
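
To confirm that SAVE wrote a connection file, you can list the XML files in the Cloud Hub configuration directory. The following is a minimal sketch, assuming the directory is /usr/local/groundwork/config/cloudhub and that saved connections carry an .xml extension; the function name is hypothetical and the individual file names are not specified here.

```python
from pathlib import Path

# Directory assumed from the SAVE step; adjust if your installation differs.
CLOUDHUB_CONFIG_DIR = Path("/usr/local/groundwork/config/cloudhub")


def list_connector_configs(config_dir: Path = CLOUDHUB_CONFIG_DIR) -> list:
    """Return the names of the XML files in config_dir, sorted alphabetically."""
    if not config_dir.is_dir():
        # Directory missing (for example, not run on the GroundWork server).
        return []
    return sorted(p.name for p in config_dir.glob("*.xml"))


if __name__ == "__main__":
    for name in list_connector_configs():
        print(name)
```

Run it on the GroundWork server itself; on any other machine the directory will not exist and the list is empty.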
1.4 Determining metrics to be monitored

Each management system provides metrics for specific checks that can be defined for the instance or the container. The property name and the thresholds are defined in a monitoring profile in an XML format. 

The VMware API (application programming interface) defines a set of metrics (measurements regarding performance, resource utilization, bandwidth) that apply to hypervisors (physical machines), hosts (virtual machines), networks and datastores (disk partitions). The metrics gathered by Cloud Hub are of two kinds: native and synthetic. The strings that define the native metrics are exactly those supported by the VMware API, with certain restrictions, namely that the list must be from those metrics that result in values, and not lists of objects. The majority of the metrics are numeric in nature - amounts of "MHz" (megahertz, in VMware parlance), amounts of memory (bytes, megabytes), amounts of disk space (bytes, megabytes, gigabytes). Again, they are taken in their native form, neither normalized nor adjusted.

The native metrics are not normalized. For example, a host (VM/virtual machine) may have a metric for CPU utilization of "273". The VMware documentation indicates that this value is in MHz (megahertz). However, in ferreting out system issues, it is often more useful to know what proportion of the total resource in question is in use. In other words, "273 of what?"

The synthetic metrics are pairs of native metrics, cast into percentage-of-total form. The numerator (number "on top") is a performance metric, and the denominator (divisor "on the bottom") is the "sum of, or size of a resource". Synthetic metrics can be extremely helpful in deciphering performance and accessibility issues in real-time. The percentages are bounded in the [0..100] range, and they include the "%" character at the end.
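
The percentage-of-total calculation can be sketched as follows. This is an illustration only, not Cloud Hub's actual code: the function name, the one-decimal formatting, and the 1092 MHz total are assumptions chosen for the example.

```python
def synthetic_percent(used: float, total: float) -> str:
    """Cast a pair of native metrics into percentage-of-total form."""
    if total <= 0:
        # No meaningful denominator; report 0% rather than dividing by zero.
        return "0.0%"
    percent = 100.0 * used / total
    # Keep the result inside the [0..100] range.
    percent = max(0.0, min(100.0, percent))
    # Synthetic values carry a trailing "%" character.
    return f"{percent:.1f}%"


# 273 MHz in use on a VM whose (hypothetical) CPU total is 1092 MHz.
print(synthetic_percent(273, 1092))  # 25.0%
```

This answers the "273 of what?" question directly: the same native reading becomes a value that can be compared against a 0-100 threshold.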

  1. The metrics screen lets you choose whether each metric is monitored and graphed, and lets you set the Warning and Critical threshold values at which alerts are triggered. We recommend using the synthetic metrics (computed percentages), since they let you define threshold values in a 0-100% range.
  2. When you are satisfied with the profile selections, choose SAVE to write out the profile. Select HOME to return to the main Cloud Hub panel.
    The view selections made on the previous screen determine which metric options are shown. The image below shows all views.


    Figure: Cloud Hub Configuration wizard for VMware - Hypervisor thresholds, Virtual Machine thresholds, and Storage thresholds


  3. Select START for the specific connector to begin the discovery and data collection process.

    Figure: Cloud Hub Configuration
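
The effect of the Warning and Critical thresholds can be pictured with a small sketch. The comparison rule is an assumption, not confirmed Cloud Hub behavior: a threshold is treated as crossed when the value is greater than or equal to it, and a negative threshold (such as -1) is treated as "no check". The function name is hypothetical.

```python
def classify(value: float, warning: float, critical: float) -> str:
    """Map a metric value to OK, WARNING, or CRITICAL.

    Assumptions: a threshold is crossed when value >= threshold, and a
    negative threshold disables that check.
    """
    if critical >= 0 and value >= critical:
        return "CRITICAL"
    if warning >= 0 and value >= warning:
        return "WARNING"
    return "OK"


# A CPU usage percentage checked against thresholds of 75 and 95.
for reading in (50, 80, 97):
    print(reading, classify(reading, warning=75, critical=95))
```

With percentage-based metrics the two thresholds are easy to reason about, which is the main argument for preferring the synthetic metrics.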

2.0 Unified Monitoring

So how does all this get represented in the unified monitoring context? The data for the selected monitored services is passed to the GroundWork REST API and inserted directly into the Status and Event Console tables in the GroundWork Foundation database, so it shows up in the UI almost immediately.

2.1 Status view

After you start the connection, within a couple of minutes the Status viewer application displays the automatically created host groups corresponding to the views chosen in setup. The monitoring can be adjusted by returning to the Cloud Hub configuration screen and modifying the metrics collected (check/un-check) or the threshold values.

In our example, the syn.vm.mem.sharedToConfigMemSize service shows a Status Information of WARNING, reflecting the current threshold set in the profile. In this view you can also see the graphs coming in under Service Availability and Performance Measurement, and the events being logged at the bottom of the screen.

Figure: Status view

2.2 Event Console

Here in Event Console, we have selected the system applications filter OS, which lists events for the VMware application type. From here you can select specific events and apply various actions.

Figure: Event Console, by Application Type (VEMA)

2.3 Dashboards

This view displays the Enterprise View dashboard and indicates the host bdc.demo.com status as Host Recently Recovered.

Figure: VMware Connections - Dashboards, Enterprise View

2.4 NoMa

Below we show the NoMa log for notifications in which you can see alerts for the service syn.vm.mem.sharedToConfigMemSize.

Figure: NoMa notification log

3.0 Monitoring Profile for the VMware Virtual Environment

The master monitoring profiles for virtual environments are stored on the GroundWork server. Each time a user opens the configuration screens for Cloud Hub, the monitoring profile is loaded from the GroundWork server into Cloud Hub. This allows you to manage and maintain the monitoring profiles for Cloud Hub in a central location.

3.1 Location of profiles

The location for Cloud Hub monitoring profiles is:

/usr/local/groundwork/core/vema/profiles/

Viewing the profiles directory:

[root@gwdemo ~]# cd /usr/local/groundwork/core/vema/profiles
[root@gwdemo profiles]# ls
amazon_monitoring_profile.xml      openstack_monitoring_profile.xml
docker_monitoring_profile.xml      rhev_monitoring_profile.xml
netapp_monitoring_profile.xml      vmware_monitoring_profile.xml
opendaylight_monitoring_profile.xml
[root@gwdemo profiles]#

The name of the VMware monitoring profile is:

vmware_monitoring_profile.xml

If you wish, you may carefully edit vmware_monitoring_profile.xml to include additional numeric metrics.

If you edit the file, PLEASE test immediately. Any metric that is even slightly misspelled or otherwise rejected stops ALL the metrics from reporting, silently and without raising flags. In general, we do not recommend adding further numeric metrics; at the time of this writing, all useful metrics have been included in the released XML file.
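
Before testing an edited profile in Cloud Hub, a quick local check can catch purely syntactic mistakes. The sketch below is not a GroundWork tool; the function name and the rules it applies (required attributes, true/false flags, numeric thresholds, no duplicate names within a section) are assumptions inferred from the released file. It cannot tell whether a metric name is accepted by the VMware API, so it does not replace testing the connector.

```python
import xml.etree.ElementTree as ET

REQUIRED_ATTRS = ("name", "monitored", "graphed",
                  "warningThreshold", "criticalThreshold")


def check_profile(xml_text):
    """Return a list of problems found in a profile; an empty list means none."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [f"not well-formed XML: {err}"]

    problems = []
    for section in root:  # e.g. <hypervisor>, <vm>
        seen = set()
        for metric in section.findall("metric"):
            name = metric.get("name", "<unnamed>")
            label = f"{section.tag}/{name}"
            for attr in REQUIRED_ATTRS:
                if metric.get(attr) is None:
                    problems.append(f"{label}: missing {attr}")
            for attr in ("monitored", "graphed"):
                if metric.get(attr) not in (None, "true", "false"):
                    problems.append(f"{label}: {attr} must be true or false")
            for attr in ("warningThreshold", "criticalThreshold"):
                value = metric.get(attr)
                try:
                    if value is not None:
                        float(value)
                except ValueError:
                    problems.append(f"{label}: {attr} is not numeric")
            if name in seen:
                problems.append(f"{label}: duplicate metric name")
            seen.add(name)
    return problems


# A one-metric excerpt in the same shape as the released profile.
GOOD = """<vema-monitoring>
  <hypervisor>
    <metric name="syn.host.cpu.used" description="Hypervisor CPU Usage Percentage" monitored="true" graphed="true" warningThreshold="75" criticalThreshold="95" />
  </hypervisor>
</vema-monitoring>"""

print(check_profile(GOOD))                          # no problems
print(check_profile(GOOD.replace('"75"', '"7S"')))  # mistyped threshold
```

To check the real file, pass its contents to check_profile, for example by reading vmware_monitoring_profile.xml as bytes.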
3.2 VMware monitoring profile: vmware_monitoring_profile.xml
<?xml version="1.0" encoding="UTF-8"?>
<vema-monitoring>
    <profileType>vmware</profileType>
    <hypervisor>
	<metric name="summary.quickStats.overallCpuUsage"        description="Overall Hypervisor CPU Usage" monitored="false"  graphed="false" warningThreshold="1500" criticalThreshold="2500" />
	<metric name="summary.quickStats.overallMemoryUsage"     description="Overall Hypervisor Memory Usage in MB" monitored="false"  graphed="false" warningThreshold="3500" criticalThreshold="4500" />
	<metric name="summary.quickStats.uptime"                 description="Hypervisor Running up time" monitored="false" graphed="false" warningThreshold="3500" criticalThreshold="4500" />
	<metric name="syn.host.cpu.used"                         description="Hypervisor CPU Usage Percentage" monitored="true"  graphed="true"  warningThreshold="75" criticalThreshold="95" />
	<metric name="syn.host.mem.used"                         description="Hypervisor Memory Usage Percentage" monitored="true"  graphed="true"  warningThreshold="90" criticalThreshold="95" />

        <metric name="summary.capacity"  sourceType="storage"    description="Total Capacity of Storage Device (bytes)" monitored="true"  graphed="false" warningThreshold="-1" criticalThreshold="-1" />
        <metric name="summary.freeSpace" sourceType="storage"    description="Free Space on Storage Device (bytes)" monitored="true"  graphed="true" warningThreshold="-1" criticalThreshold="-1" />
        <metric name="summary.uncommitted" sourceType="storage"  description="Uncommitted Bytes on Storage Device" monitored="true"  graphed="false" warningThreshold="-1" criticalThreshold="-1" />
        <metric name="syn.storage.percent.used" sourceType="storage" description="Percent Usage of a Storage Device" monitored="true"  graphed="true" warningThreshold="-1" criticalThreshold="-1" />

    </hypervisor>
	<vm>
        <metric name="summary.quickStats.balloonedMemory"        description="VM Ballooned memory in MB" monitored="false"  graphed="false" warningThreshold="2000" criticalThreshold="3000" />
        <metric name="summary.quickStats.compressedMemory"       description="VM Compressed memory in MB" monitored="false"  graphed="false" warningThreshold="2000" criticalThreshold="3000" />
        <metric name="summary.quickStats.consumedOverheadMemory" description="VM Consumed Memory consumption in MB" monitored="false"  graphed="false" warningThreshold="2000" criticalThreshold="3000" />
        <metric name="summary.quickStats.guestMemoryUsage"       description="VM Guest Memory consumption in MB" monitored="false"  graphed="false" warningThreshold="2000" criticalThreshold="3000" />
        <metric name="summary.quickStats.hostMemoryUsage"        description="VM Host Memory Usage in MB" monitored="false"  graphed="false" warningThreshold="2000" criticalThreshold="3000" />
        <metric name="summary.quickStats.overallCpuDemand"       description="VM Overall CPU Demand" monitored="false"  graphed="false" warningThreshold="2000" criticalThreshold="3000" />
        <metric name="summary.quickStats.overallCpuUsage"        description="VM Overall CPU Usage" monitored="false"  graphed="false" warningThreshold="50" criticalThreshold="80" />
        <metric name="summary.quickStats.privateMemory"          description="VM Private Memory Used in MB" monitored="false"  graphed="false" warningThreshold="2000" criticalThreshold="3000" />
        <metric name="summary.quickStats.sharedMemory"           description="VM Shared Memory Used in MB" monitored="false"  graphed="false" warningThreshold="2000" criticalThreshold="3000" />
        <metric name="summary.quickStats.ssdSwappedMemory"       description="VM SSD Swapped Memory in MB" monitored="false"  graphed="false" warningThreshold="2000" criticalThreshold="3000" />
        <metric name="summary.quickStats.swappedMemory"          description="VM Swapped Memory in MB" monitored="false"  graphed="false" warningThreshold="500" criticalThreshold="1000" />
        <metric name="summary.quickStats.uptimeSeconds"          description="VM Up time in seconds" monitored="false" graphed="false" warningThreshold="2000" criticalThreshold="3000" />

        <metric name="summary.runtime.bootTime"                  description="VM Boot time in seconds" monitored="false" graphed="false" warningThreshold="2000" criticalThreshold="3000" />
        <metric name="summary.runtime.connectionState"           description="VM Connection State" monitored="false" graphed="false" warningThreshold="2000" criticalThreshold="3000" />
        <metric name="summary.runtime.memoryOverhead"            description="VM Memory Overhead" monitored="false" graphed="false" warningThreshold="2000" criticalThreshold="3000" />
        <metric name="summary.runtime.powerState"                description="VM Power State" monitored="false" graphed="false" warningThreshold="2000" criticalThreshold="3000" />

        <metric name="summary.storage.committed"                 description="VM Storage Percent Committed" monitored="true"  graphed="true" warningThreshold="60" criticalThreshold="80" />
        <metric name="summary.storage.uncommitted"               description="VM Storage Percent UnCommitted" monitored="true"  graphed="true" warningThreshold="2000" criticalThreshold="3000" />

        <metric name="syn.vm.mem.balloonToConfigMemSize.used"    description="VM Ballooned Memory Used Percentage" monitored="true"  graphed="true" warningThreshold="50" criticalThreshold="75" />
        <metric name="syn.vm.mem.compressedToConfigMemSize.used" description="VM Compressed Memory Used Percentage" monitored="true"  graphed="true" warningThreshold="50" criticalThreshold="75" />
        <metric name="syn.vm.mem.sharedToConfigMemSize.used"     description="VM Shared Memory Used Percentage" monitored="true"  graphed="true" warningThreshold="50" criticalThreshold="75" />
        <metric name="syn.vm.mem.swappedToConfigMemSize.used"    description="VM Swapped Memory Used Percentage" monitored="true"  graphed="true" warningThreshold="75" criticalThreshold="90" />
        <metric name="syn.vm.mem.guestToConfigMemSize.used"      description="VM Guest Memory Used Percentage" monitored="true"  graphed="true" warningThreshold="75" criticalThreshold="85" />
        <metric name="syn.vm.cpu.cpuToMax.used"                  description="VM Cpu Usage Percentage" monitored="true"  graphed="true" warningThreshold="75" criticalThreshold="95" />

    </vm>
    <excludes>
        <exclude>perfcounter.101</exclude>
        <exclude>perfcounter.1</exclude>
        <exclude>perfcounter.15</exclude>
        <exclude>perfcounter.77</exclude>
    </excludes>
</vema-monitoring>
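
To see at a glance which metrics a profile enables, the file can be read with a few lines of Python. This is a minimal sketch using only the standard library; the function name is hypothetical, and the embedded sample is a short excerpt in the same shape as the listing above rather than the full file.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<vema-monitoring>
    <profileType>vmware</profileType>
    <hypervisor>
        <metric name="summary.quickStats.uptime" description="Hypervisor Running up time" monitored="false" graphed="false" warningThreshold="3500" criticalThreshold="4500" />
        <metric name="syn.host.cpu.used" description="Hypervisor CPU Usage Percentage" monitored="true" graphed="true" warningThreshold="75" criticalThreshold="95" />
    </hypervisor>
    <vm>
        <metric name="syn.vm.cpu.cpuToMax.used" description="VM Cpu Usage Percentage" monitored="true" graphed="true" warningThreshold="75" criticalThreshold="95" />
    </vm>
</vema-monitoring>"""


def monitored_metrics(xml_text):
    """Return {section: [(name, warning, critical), ...]} for monitored metrics."""
    root = ET.fromstring(xml_text)
    result = {}
    for section in ("hypervisor", "vm"):
        node = root.find(section)
        if node is None:
            continue
        result[section] = [
            (m.get("name"), m.get("warningThreshold"), m.get("criticalThreshold"))
            for m in node.findall("metric")
            if m.get("monitored") == "true"
        ]
    return result


for section, metrics in monitored_metrics(SAMPLE).items():
    for name, warning, critical in metrics:
        print(f"{section}: {name} (warning {warning}, critical {critical})")
```

Threshold values are returned as the strings found in the file, so nothing is reinterpreted along the way.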

4.0 Removing Connectors from Monitoring

If you decide you do not want to monitor a particular region, navigate to GroundWork Administration > GroundWork Cloud Hub, select STOP for the connector, and then select DELETE. All of the created host groups and the discovered and monitored instances for that region will be deleted from the Foundation database within a few minutes, and monitoring access to the region endpoint will cease.

Additionally, see How to remove Cloud Hub hosts in the document How to delete or remove hosts.