Category: Nagios

Qik-n-EZ: Nagios AWS EC2 Hosts via Ansible

Sooo .. You are monitoring a fleet of AWS EC2 hosts via Nagios, and have yet to find an easy way to manage their host definitions .. Good news (if you happen to be using Ansible dynamic inventories) !! I created an Ansible template that loops thru all your EC2 instances and creates their Nagios host definitions for you ..
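
The template itself is not reproduced in this excerpt, so here is a rough Python sketch of the same loop, just to show the idea .. The inventory dict, the host names, the addresses, and the "generic-host" parent template are all made up for illustration; a real EC2 dynamic inventory exposes many more fields ..

```python
# Sketch only: renders one Nagios host definition per inventory entry.
# The inventory shape and the "generic-host" template name are assumptions.
HOST_TEMPLATE = """define host {{
    use        {template}
    host_name  {name}
    alias      {name}
    address    {address}
}}
"""


def render_hosts(inventory, template="generic-host"):
    """Return Nagios host definitions for a {name: address} mapping, sorted by name."""
    blocks = []
    for name in sorted(inventory):
        blocks.append(
            HOST_TEMPLATE.format(template=template, name=name, address=inventory[name])
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical hosts standing in for what the dynamic inventory would return.
    ec2_hosts = {"web-01": "10.0.1.10", "db-01": "10.0.2.20"}
    print(render_hosts(ec2_hosts))
```

Sorting the names keeps the generated file stable between runs, which makes diffs easier to read ..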

In addition, you can easily define Nagios service dependencies, helping you zero in on the root problem more quickly ..
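
For a feel of what a service dependency looks like once rendered, here is a small Python sketch .. The directive names are standard Nagios servicedependency directives, but the host, the service names, and the failure criteria I picked are assumptions, not what the original template uses ..

```python
# Sketch only: renders a same-host Nagios service dependency.
# Criteria below are an assumed choice: keep running checks of the child
# service (n = none), but suppress its notifications while the parent is
# in WARNING, UNKNOWN, or CRITICAL (w,u,c).
DEPENDENCY_TEMPLATE = """define servicedependency {{
    host_name                      {host}
    service_description            {parent}
    dependent_host_name            {host}
    dependent_service_description  {child}
    execution_failure_criteria     n
    notification_failure_criteria  w,u,c
}}
"""


def render_dependency(host, parent, child):
    """Return a servicedependency block making `child` depend on `parent`."""
    return DEPENDENCY_TEMPLATE.format(host=host, parent=parent, child=child)


if __name__ == "__main__":
    # Hypothetical example: no point alerting on logged in users if SSH is down.
    print(render_dependency("web-01", "SSH", "Logged In Users"))
```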


Qik-n-EZ: Nagios Status and Acknowledgement Links via HipSaint

This post is kinda dopey, but it might help one person out there in intertubes land .. That said, I have been using HipSaint for ……………….. 3 years ?? It’s great !! It posts Nagios alert information into HipChat:

I then:

  1. Log into Nagios
  2. Locate the host or service that is alerting
  3. Click the link
  4. View the details
  5. Acknowledge the alert, if needed

Y’all know I’m lazy though, right ?? I wish the HipChat message would just give me the links I want .. Well, now it can:

As you will see, I am simply appending the status and acknowledgement links to the Nagios “service output” .. I also use Ansible to populate variables such as:

  • HipChat token
  • HipChat room
  • Nagios hostname
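
The actual change is not shown in this excerpt, so here is a hedged Python sketch of building the two links and tacking them onto the service output .. It assumes the stock Nagios CGI layout (extinfo.cgi with type=2 for service status, cmd.cgi with cmd_typ=34 for acknowledging a service problem); the base URL, host, and service names are hypothetical, and your install may use a different path ..

```python
from urllib.parse import urlencode


def nagios_links(base_url, host, service):
    """Return (status_url, ack_url) for a service, assuming stock Nagios CGIs."""
    cgi = base_url.rstrip("/") + "/cgi-bin"
    # type=2 asks extinfo.cgi for the service detail page.
    status = cgi + "/extinfo.cgi?" + urlencode(
        {"type": 2, "host": host, "service": service}
    )
    # cmd_typ=34 is the "acknowledge service problem" command form.
    ack = cgi + "/cmd.cgi?" + urlencode(
        {"cmd_typ": 34, "host": host, "service": service}
    )
    return status, ack


def append_links(service_output, base_url, host, service):
    """Append status and acknowledgement links to the service output text."""
    status, ack = nagios_links(base_url, host, service)
    return f"{service_output} [status: {status}] [ack: {ack}]"


if __name__ == "__main__":
    # Hypothetical values; in practice Ansible would fill in the Nagios hostname.
    print(
        append_links(
            "USERS CRITICAL - 5 users logged in",
            "https://nagios.example.com/nagios",
            "web-01",
            "Logged In Users",
        )
    )
```

urlencode takes care of the spaces in service names, which otherwise tend to break the link in chat ..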

Now I can be as lazy as I want to be:

Make Everyone Talk to Hal !!

Hal is my Hubot chatbot .. He’s awesome !! He gets me beer !!

hal beer me

He also does things like restart app servers, deploy code, and show me pictures of grumpy cats .. He’s so cool, I’ve started making non-humans talk to him .. “Greg, what do you mean ??” .. Well, let me show you ..

DAY-TO-DAY PROCESS:

  1. I have a Nagios server
  2. It monitors (allthethings)
  3. When the “logged in users” alert is triggered, Nagios sends a message to my chat service using hipsaint
    1. “logged in users” is a monitor I have that alerts me when more than 3 users are logged into a server
  4. I see the alert and the server in question
  5. I SSH into the server
  6. I type who
  7. I then determine if I need to care
    1. If not, move on with my life
    2. If so, dig deeper
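
The “logged in users” monitor from step 3 could look something like the Python sketch below .. This is not my actual check, just an illustration: it counts lines of `who` output (so sessions, not unique users) and uses the standard Nagios plugin exit codes; the message wording and the function names are made up ..

```python
import subprocess

# Standard Nagios plugin exit codes.
OK, CRITICAL, UNKNOWN = 0, 2, 3


def count_sessions(who_output):
    """Count non-empty lines of `who` output (one line per login session)."""
    return sum(1 for line in who_output.splitlines() if line.strip())


def evaluate(who_output, max_users=3):
    """Return (exit_code, message); CRITICAL when more than max_users are on."""
    count = count_sessions(who_output)
    if count > max_users:
        return CRITICAL, f"USERS CRITICAL - {count} users logged in"
    return OK, f"USERS OK - {count} users logged in"


def main():
    """Run `who`, print the plugin message, and return the exit code."""
    try:
        result = subprocess.run(
            ["who"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError) as exc:
        print(f"USERS UNKNOWN - could not run who: {exc}")
        return UNKNOWN
    status, message = evaluate(result.stdout)
    print(message)
    return status

# When installed as a plugin, the entry point would be: sys.exit(main())
```

Keeping the parsing in evaluate() separate from the subprocess call means the threshold logic can be tried out without logging a bunch of people into a server ..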

The thing is, I have more than 2,200 active monitors .. That means Nagios can and will send many, many messages to my chat service — depending on the day .. So how can I make my life easier ??

Here’s an easy one: ask Hal who’s on a server ..

hal whos on

My stack is HipChat -> Hubot -> Jenkins -> Ansible .. That means I can damn near do anything I want, all from my chat client ..

Remember what I said earlier — about making non-humans talk to Hal ?? What I did was create a Nagios event handler that sends a message to my chat service using HipChat CLI .. Therefore, I AM NOT asking Hal who’s on a server, it’s NAGIOS WHO IS doing it ..
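
Here is a hedged Python sketch of what such an event handler could do .. It assumes Nagios passes the host name, service state, and state type as arguments (for example via the $HOSTNAME$, $SERVICESTATE$ and $SERVICESTATETYPE$ macros), and that Hal takes the host name after “whos on”; both are my assumptions for illustration .. It only prints the message, since the exact HipChat CLI invocation is not shown here ..

```python
def handler_message(host, state, state_type, bot="hal"):
    """Return the chat command to send, or None when no action is needed.

    Only a CRITICAL state that Nagios has confirmed (HARD) triggers the
    command, so soft retries do not spam the room.
    """
    if state == "CRITICAL" and state_type == "HARD":
        return f"{bot} whos on {host}"
    return None


def main(argv):
    """Expect argv = [script, host, state, state_type]; print the command."""
    if len(argv) != 4:
        print("usage: handler.py HOST STATE STATE_TYPE")
        return 1
    message = handler_message(argv[1], argv[2], argv[3])
    if message is not None:
        # A real handler would hand this string to the chat CLI instead.
        print(message)
    return 0

# When used as an event handler, the entry point would be: sys.exit(main(sys.argv))
```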

nagios hal whos on

It doesn’t stop there !! You can create scripted Splunk alerts as well .. Before you know it, you will be making (allthethings) talk to Hal ..

Qik-n-EZ: Multiline Notes for Nagios Service Definitions

OMG !! It’s still hip to say that — right ??

Anyway .. While I consider myself to be reasonably intelligent, I still find myself doing dopey things from time to time .. For example: saying OMG .. Another dopey thing I have been doing for a very, very long time is writing run-on sentence notes for my Nagios service definitions .. I am ashamed to admit how many times I have tried to “fix” this, only to have it error out during the pre-flight check .. Yes, I am aware of Google, but I’ve never found anything definitive ..

And then it hit me like a bolt of lightning .. Bash style line breaks ..

So obvious, yet so elusive ..
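
A small Python sketch of the idea, assuming “Bash style line breaks” means ending each line with a backslash so the next line continues the same directive .. The note text and the alignment are made up, and how Nagios treats the leading whitespace on continued lines is worth confirming with the pre-flight check before trusting this ..

```python
def render_notes(lines, indent="    "):
    """Join note lines into one `notes` directive using trailing backslashes."""
    if not lines:
        raise ValueError("at least one note line is required")
    head = f"{indent}notes  {lines[0]}"
    # Continuation lines are padded so they line up under the first note.
    continuation = indent + " " * len("notes  ")
    parts = [head] + [continuation + line for line in lines[1:]]
    # Every line except the last ends with a space and a backslash.
    return " \\\n".join(parts)


if __name__ == "__main__":
    # Hypothetical notes for a service definition.
    print(render_notes([
        "Check who is logged in first.",
        "If it is the deploy user, ignore.",
        "Otherwise ping the on-call.",
    ]))
```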