Background
Two events coincided today: Realizing that I’ve been running without redundancy on a Linux software RAID coincided with me having some time to spare for a change. So I decided to write a simple vR Ops agent to monitor a software RAID, and, since I had some time on my hands, I wanted in Python instead of my go-to language Java. This is my very first Python program, so bear with me.
Learning Python proved to be very easy. It’s a very intuitive language. I’ve got some issues with loosely typed languages, but for a quick hack I’m willing to put that aside. What turned out to be a bit more difficult was to find any good examples of agents for vR Ops written in Python. The blog vRandom had a getting started guide that allowed me to understand how to install and get the thing running, but how to code the actual agent was harder to find. But after some trial and error I got the hang of it and I decided to share my experience here.
A quick look at the code
Initializing
The client library is called “nagini”, so before we can do anything, we need to import it:
import nagini
Next, we need to obtain a connection to vR Ops. This is done like this:
vrops = nagini.Nagini(host=vropshost, user_pass=(vropsuser, vropspassword))
It’s worth noticing that this doesn’t actually make the connection. It just sets up the connection parameters. So if you do something wrong, like mistyping the password, this code will gladly execute but you’ll get an error later on when you’re trying to use the connection.
Sending the data
Here is where things get quite interesting and the Python client libraries show their strength. While all the REST methods are exposed to the programmer, the libraries also add some nice convenience methods.
Normally, when you post metrics for a resource, you need to do it in three steps:
- Look up the resource to obtain its ID.
- If the resource doesn’t exist, create it.
- Send the metrics using the resource ID you just obtained by looking it up or creating it.
The vR Ops Python client can do this in a single call! All we have to do is to put together somewhat elaborate data structure and send it off. Let’s have a look!
Wow, that’s a lot of code! But don’t despair! Let’s break it down. In this example, I’m sending but stats and properties. If you recall, stats are numeric values, such as memory utilization and remaining disk space, whereas properties are strings and typically more static in nature. Let’s examine the stats structure.
The REST API states that stats should be sent as a single member called “stat-content”. That member contains an array of sub-structures, each representing a sample. Each sample has a “statKey” holding the name of the metric, a “timestamp” member and a “data” member holding the actual data value. As you can see above, I’m sending three values, the “totalMembers”, “healthyMenbers” and “percentHealthy”.
Now that we have our fancy structure, we can send it off using the find_create_resource_push_data method. It does exactly what the name implies: It tries to find the resource, if it doesn’t it creates it and then it pushes the data. To make it woek, we need to send the name of the resource, the name of the resource kind as well as adapter kind and the stats and/or properties. And that’s it! We’ve now built an adapter with Python using a single call to the vR Ops API!
What about the timestamp?
But wait, there’s one more thing we need to know. The vR Ops API expects us to send a timestamp as a Java-style epoc milliseconds long integer. How do we obtain one of those in Python.
Easy! We just time.time() method (you have to import “time” first). This gives us the epoc seconds (as opposed to milliseconds) as a float, so we have to multiply it by 1000 and convert it to a long. It will look something like this:
timestamp = long(time.time() * 1000)
Installing and running
Prerequisites
First of all, you need to install python (obviously). Most likely it’s already installed, but if it isn’t, just go to the package manager on your system and install it. Next, you need the client libraries for vR Ops. They are located on your vR Ops server and can be downloaded and installed like this:
mkdir tmp cd tmp wget --no-check-certificate https://<your vrops server>/suite-api/docs/bindings/python/vcops-python.zip unzip vcops-python.zip python setup.py install
Installing the agent
Download the agent from here. Unzip it in the directory where you want to run it, e.g. /opt/vraidmon
mkdir /opt/vraidmon unzip <path to my zip>
Next, we need to edit the configuration file vraidmon.cfg. You need to provide hostname and login details for your vR Ops. The name of the raid status file should remain /proc/mdstat unless you have some very exotic version of Linux. Here isĀ an example:
vropshost=myvrops.mydomain.com vropsuser=admin vropspassword=topsecret! mdstat=/proc/mdstat
Now you can run the agent using this command:
python vraidmon.py vraidmon.cfg
You’ll probably get some warning messages about certificates not being trusted, but we can safely ignore those for now.
Making it automatic
Running the agent manually just sends a single sample. We need some way of making this run periodically and the simplest way of doing that is through a cron job. I simply added this to my crontab (ignore any line breaks and paste it in as a single line):
*/5 * * * * /usr/bin/python /opt/vraidmon/vraidmon.py /opt/vraidmon/vraidmon.cfg 2> /dev/null
Adding alerts
In order for this adapter to be useful, I added a simple alert that fires when the percentage of healthy members drops below 100%. This is what a simulated failure looks like:
Downloadable material
Python source code and sample config file.