Did you know you can tag a host in Splunk ?? I didn’t !! Do you know how much time tags would have saved me from having to craft a most excellent Splunk search to capture just the right hosts ?? Me neither — but I’m guessing it’s a lot ..
So instead of my searches looking like this:
# get all staging RMI nodes -- hard
index=* ( host=rmi1.s.* OR host=rmi2.s.* OR host=rmi3.s.* ) source=*tomcat* earliest=-1h
They can now look like this:
# get all staging RMI nodes -- easy
index=* tag=rmi tag=stage source=*tomcat* earliest=-1h
I know, I know — I could achieve the same level of excellence using targeted indexes (index=rmi_stage) and/or various regex filters .. Some of that, unfortunately, is out of my control ..
OK .. So how can you manage this without having to use the GUI ?? Easy !! You just need to drop a config file in the proper location (for me it’s: /opt/splunk/etc/system/local/tags.conf) on the search head, and away you go .. The syntax is pretty basic:
# tagging host login.example.net with PROD, TOMCAT, and LOGIN
[host=login.example.net]
prod = enabled
tomcat = enabled
login = enabled
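Since the stanza format is that regular, generating it is easy. Here is a minimal Python sketch that renders stanzas like the one above from a host-to-tags mapping. The function name `render_tags_conf` and the sample hosts are made up for illustration, and this is not the Ansible automation mentioned in this post, just the same idea in a few lines:

```python
def render_tags_conf(host_tags):
    """Render tags.conf stanzas from a {hostname: [tag, ...]} mapping."""
    stanzas = []
    for host in sorted(host_tags):
        # One stanza per host: a [host=<name>] header, then one line per tag.
        lines = [f"[host={host}]"]
        # dict.fromkeys drops duplicate tags while keeping first-seen order.
        for tag in dict.fromkeys(host_tags[host]):
            lines.append(f"{tag} = enabled")
        stanzas.append("\n".join(lines))
    # Blank line between stanzas, single trailing newline at the end.
    return "\n\n".join(stanzas) + "\n"


# Hypothetical sample data, not pulled from a real inventory.
SAMPLE = {"login.example.net": ["prod", "tomcat", "login"]}
print(render_tags_conf(SAMPLE), end="")
```

The output can be written to wherever your search head reads tags.conf from. Hosts are sorted so the generated file stays stable between runs, which keeps diffs quiet if the file lives in version control.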
Below’s a nice little example of how I automated this using Ansible (big surprise there 🙂) and the EC2 dynamic inventory script ..
Hal is my Hubot chatbot .. He’s awesome !! He gets me beer !!
He also does things like restart app servers, deploy code, and show me pictures of grumpy cats .. He’s so cool, I’ve started making non-humans talk to him .. “Greg, what do you mean ??” .. Well, let me show you ..
- I have a Nagios server
- It monitors (allthethings)
- When the “logged in users” alert is triggered, Nagios sends a message to my chat service using hipsaint
- “logged in users” is a monitor I have that alerts me when more than 3 users are logged into a server
- I see the alert and the server in question
- I SSH into the server
- I type who
- I then determine if I need to care
- If not, move on with my life
- If so, dig deeper
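The "logged in users" monitor from the list above can be sketched roughly like this in Python. This is a guess at the logic, not the actual check: it assumes the monitor counts distinct usernames (not sessions) in `who` output and uses the standard Nagios plugin exit codes. The names `logged_in_users` and `check_users` are hypothetical.

```python
import subprocess

OK, CRITICAL = 0, 2  # standard Nagios plugin exit codes


def logged_in_users(who_output):
    """Return the sorted, de-duplicated usernames found in `who` output."""
    # The first whitespace-separated field of each `who` line is the username.
    return sorted({line.split()[0] for line in who_output.splitlines() if line.strip()})


def check_users(who_output, max_users=3):
    """Return (exit_code, message); CRITICAL when more than max_users are logged in."""
    users = logged_in_users(who_output)
    state = CRITICAL if len(users) > max_users else OK
    label = "CRITICAL" if state == CRITICAL else "OK"
    return state, f"USERS {label} - {len(users)} logged in: {', '.join(users) or 'none'}"


def main():
    """Run `who` on this machine and return the Nagios exit code."""
    output = subprocess.run(["who"], capture_output=True, text=True, check=False).stdout
    code, message = check_users(output)
    print(message)
    return code
```

To use it as a plugin, finish the script with `raise SystemExit(main())`. Keeping the parsing in its own function means it can be tested against canned `who` output without touching a real server.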
The thing is, I have more than 6,500 active monitors .. That means Nagios can and will send many, many messages to my chat service, depending on the day .. So how can I make my life easier ??
Here’s an easy one: ask Hal who’s on a server ..
My stack is HipChat -> Hubot -> Jenkins -> Ansible .. That means I can do damn near anything I want, all from my chat client ..
Remember what I said earlier — about making non-humans talk to Hal ?? What I did was create a Nagios event handler that sends a message to my chat service using HipChat CLI .. Therefore, I AM NOT asking Hal who’s on a server, it’s NAGIOS WHO IS doing it ..
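A rough Python sketch of what such an event handler could look like follows. It assumes the Nagios command definition passes `$SERVICESTATE$`, `$SERVICESTATETYPE$`, and `$HOSTNAME$` as arguments, in that order. The phrase "hal who is on ..." and the function names are hypothetical, and the actual chat call is left as a `send` callable so no HipChat CLI flags are guessed at here.

```python
def build_chat_message(service_state, state_type, hostname, bot_name="hal"):
    """Return the chat line to post, or None when the bot should not be asked."""
    # Only react to a confirmed (HARD) CRITICAL state, not soft retries or recoveries.
    if service_state != "CRITICAL" or state_type != "HARD":
        return None
    # Hypothetical command phrasing; use whatever your Hubot script listens for.
    return f"{bot_name} who is on {hostname}"


def handle_event(argv, send):
    """Handle one Nagios event.

    argv: [service_state, state_type, hostname], as passed by the command definition.
    send: callable taking one string; wrap your chat CLI or API call in it.
    """
    message = build_chat_message(*argv[:3])
    if message is not None:
        send(message)
    return 0


# Dry run: print instead of posting to chat.
handle_event(["CRITICAL", "HARD", "rmi1.s.example.net"], print)
```

Passing `send` in keeps the decision logic separate from the delivery mechanism, so the handler can be dry-run with `print` before it is pointed at a real chat room.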
It doesn’t stop there !! You can create scripted Splunk alerts as well .. Before you know it, you will be making (allthethings) talk to Hal ..