It doesn’t seem like everybody really understands how much, and how much for the better, vRops symptoms and alerts have changed since version 5.x.
vRops 6.x has a whole new way of reporting the status of an object, which is called a symptom. On the alert side, symptoms are used to give the alert the correct properties, so that the alert carries the weight it ought to have. This way we can weed out false positives, which are never wanted in monitoring. That said, vRops is only a monitoring tool. It doesn’t know how you would like to monitor your environment unless you tell it.
In this blog post I’m going to show you how to set up disk usage symptoms and use these symptoms to create an alert which does differentiated disk space monitoring. Why would you want to create your own? Isn’t the default that comes with vRops good enough? Well, that depends! By default vRops does disk usage monitoring only in percentage. That might be good enough for some, but not all. Let me give you an example.
Disk 10 MB @90% Full = 1 MB free space
Disk 10 GB @90% Full = 1 GB free space
Disk 10 TB @90% Full = 1 TB free space
As seen above, if you get alerted when the disk is 90% full, there can be an extreme difference in how much space is left, and therefore also in how much time you have to fix the issue. Most servers with 1 TB of free space can keep running for a long time, whereas most servers with only 1 MB would probably need to be fixed right away. So how can we make an alert that makes more sense, you ask. Keep reading!
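To put numbers on the example above, here is a minimal Python sketch of the arithmetic. The function name `free_space` is made up for illustration and has nothing to do with vRops itself.

```python
def free_space(capacity: float, percent_used: float) -> float:
    """Return the free space left, in the same unit as capacity."""
    return capacity * (100.0 - percent_used) / 100.0


# The same 90% threshold leaves very different amounts of free space.
for capacity, unit in [(10, "MB"), (10, "GB"), (10, "TB")]:
    left = free_space(capacity, 90)
    print(f"Disk {capacity} {unit} @90% full = {left:g} {unit} free")
```

The percentage is identical in all three cases. Only the absolute number tells you how urgent the situation is.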
First we need to create some symptoms. To get more precise alerting on free space, I’m going to create symptoms based on the disk size and symptoms based on how much free space has to be left before the alert is triggered. There will also need to be a percentage based symptom, but I will get back to that later.
The following symptoms will be created from these metrics:
Medium Disk size = Guest File System stats|Guest File System Capacity (GB) | Equal to or greater than 100 GB
Small Disk size = Guest File System stats|Guest File System Capacity (GB) | Less than 100 GB
Less than 1 GB free space = Guest File System stats|Guest File System Free (GB) | Less than 1 GB
Less than 5 GB free space = Guest File System stats|Guest File System Free (GB) | Less than 5 GB
To create the above symptoms, go to “Content”->“Symptoms Definitions” and click on the green plus sign. The symptom definition window appears. In “Base Object Type”, type or select “Virtual Machine”, then navigate to “Guest File System Stats” and drag and drop the needed metric over to the empty canvas. Lastly, fill out the fields with the values listed above.
Note that I have added another metric as well, which is “Total Guest File System Usage (%)”. This is going to be the third parameter used to establish better disk usage monitoring. Don’t forget to click “Save” when done.
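If it helps to see the symptoms as plain conditions, here is a rough Python sketch of what the four size and free space symptoms check. The `FileSystem` class and the `symptoms` function are hypothetical and only for illustration; vRops evaluates the symptoms itself once they are defined. The percentage based symptom is left out here.

```python
from dataclasses import dataclass


@dataclass
class FileSystem:
    capacity_gb: float  # Guest File System Capacity (GB)
    free_gb: float      # Guest File System Free (GB)


def symptoms(fs: FileSystem) -> dict:
    """Evaluate the four size and free space symptoms for one file system."""
    return {
        "Medium Disk size": fs.capacity_gb >= 100,
        "Small Disk size": fs.capacity_gb < 100,
        "Less than 1 GB free space": fs.free_gb < 1,
        "Less than 5 GB free space": fs.free_gb < 5,
    }


# A 500 GB disk with 3 GB left counts as medium sized with less than 5 GB free.
print(symptoms(FileSystem(capacity_gb=500, free_gb=3)))
```

Note that the two size symptoms are mutually exclusive, so a disk is always either small or medium.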
Create an alert
Click on “Alert Definitions” and then the green plus sign, to create a new alert. First give it a name and a description.
I call my alert: “vDisk low free space”
Then click on “Base Object Type” and type or select “Virtual Machine”. Now click on “Alert Impact”. Below you can see how I set it up.
Let me explain the fields.
“Impact” – This is which major badge should be affected by this alert.
“Criticality” – How critical this alert is. This is the criticality the alert will have when triggered.
“Alert Type and Subtype” – This is where we categorize the alert.
The “Cycle” is how long we wait before triggering or canceling the alert. One cycle is five minutes. By raising the number of cycles, you can avoid a lot of short-lived alerts being triggered.
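The effect of the cycle setting can be sketched in a few lines of Python. This is a simplified model and not how vRops is implemented: the function name `alert_states` is made up, it assumes the symptom has to be true for a number of cycles in a row, and it cancels the alert as soon as the symptom clears instead of modelling a separate cancel cycle.

```python
def alert_states(symptom_history, wait_cycles):
    """Yield True for each cycle in which the alert would be active.

    The alert only becomes active once the symptom has been true
    for wait_cycles cycles in a row.
    """
    streak = 0
    for active in symptom_history:
        streak = streak + 1 if active else 0
        yield streak >= wait_cycles


# One entry per five minute cycle: a short blip, then a lasting problem.
history = [True, False, True, True, True]

print(list(alert_states(history, wait_cycles=1)))
print(list(alert_states(history, wait_cycles=3)))
```

With one cycle the blip in the first entry already raises an alert. With three cycles only the lasting problem at the end does.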
Now it’s time to add the symptoms. Search for the symptoms and drag and drop them into the two symptom sets on the canvas. See below how I have done this. Note that the field next to “Match” is changed to “Any”, which means that just one of the two groups of symptoms has to be true for this alert to be triggered.
Now just click “Save”. If you want to add a recommendation, do that before you save by clicking on the menu item “Add Recommendations”.
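As a sketch of the alert logic, here is how two symptom sets with “Any” matching could be written in Python. The grouping is an assumption on my part for illustration: one set for small disks (less than 1 GB free) and one for medium disks (less than 5 GB free), each combined with the percentage symptom. The 90% value is only a placeholder, and all function names are hypothetical.

```python
# Placeholder: use whatever threshold you set on the percentage symptom.
USAGE_THRESHOLD_PERCENT = 90


def small_disk_set(capacity_gb, free_gb, usage_percent):
    """Assumed set 1: small disk, less than 1 GB free, high usage."""
    return (capacity_gb < 100 and free_gb < 1
            and usage_percent >= USAGE_THRESHOLD_PERCENT)


def medium_disk_set(capacity_gb, free_gb, usage_percent):
    """Assumed set 2: medium disk, less than 5 GB free, high usage."""
    return (capacity_gb >= 100 and free_gb < 5
            and usage_percent >= USAGE_THRESHOLD_PERCENT)


def alert_triggered(capacity_gb, free_gb, usage_percent):
    """Match "Any": one true symptom set is enough to trigger the alert."""
    return any([
        small_disk_set(capacity_gb, free_gb, usage_percent),
        medium_disk_set(capacity_gb, free_gb, usage_percent),
    ])


print(alert_triggered(10, 0.5, 95))        # small disk, almost full
print(alert_triggered(500, 3, 99.4))       # medium disk, 3 GB left
print(alert_triggered(10240, 1024, 90))    # 10 TB disk with 1 TB left
```

Under these assumptions the first two disks trigger the alert, while the 10 TB disk at 90% does not, since it still has plenty of free space.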
Now that the alert and symptoms have been created, we need to start using them. For that to happen, we need to verify that they are active in the policy which is in effect. In this demo environment there is only one policy, named “vSphere Solution’s Default Policy”. Edit the policy and go to “6. Alert / Symptom Definitions”. Search for the alert and the symptoms and verify that “State” is set to “Inherited”, or change the state so that it is active (Local or Inherited). Then save the policy.
vRops Symptoms & Alerts – Disk usage – Wrap up
The last thing is to verify that it works. Go to “Alerts” and find the alert we created (it can take up to five minutes before it is triggered) and click on it. Now verify that it’s all as expected.
Lastly, to verify the symptoms, click on the VM’s name, then click on “Troubleshooting” and verify that the symptoms can be seen under the “Symptoms” tab.
That was it, quite easy, right? Now you have learned how to create symptoms and, based on those symptoms, create an alert with multiple symptom groups which can be triggered individually. That was all, take care. If there are any questions or comments, the comment section is just below.