October 2, 2019

Qik-n-EZ: Using a Shell’s STDOUT in a Chef ruby_block

Chef gets a bad rap for being “hard” — especially when compared to Ansible .. This is especially true when developers of Chef cookbooks don’t understand the two-pass model .. A common question amongst new Chef cookbook developers goes something like this: “I want to run a shell command and capture its output (i.e. STDOUT) .. I then want to loop through that shell command’s output to run other Chef resources, and I want those other Chef resources to be aware of the shell command’s output .. How can I do that ??” .. Sometimes it’s easier to use bullets:

- run a shell command
  - the output of the shell command == A, B, C
- loop through the shell command’s output one at a time
  - A
  - B
  - C
- run other Chef resources per loop item
  - other Chef resources are aware of the loop item value

Often, the new Chef cookbook developer will try to run a raw Ruby loop in the recipe, which executes during the compile phase, before any resource has actually run .. This is a big no-no .. Instead, you should run the shell command, using shell_out, in a Chef ruby_block resource, which executes during the converge (execution) phase .. This will give you access to the shell command’s output where you can manipulate the output using good ol’ fashioned Ruby, and then “notify” other Chef resources as desired ..

One thing to be aware of: make sure you lazily evaluate any resource property whose value can’t be known until the execution phase of the chef-client run .. Alan Thatcher has a nice writeup here about the hows and the whys ..
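
If the two passes still feel fuzzy, here is a toy sketch in plain Ruby (no Chef required) of why a string built at compile time goes stale while a deferred one does not .. Everything in it is an assumption for illustration: FakeResource and the run_state hash are made-up stand-ins, not Chef classes, and a Ruby lambda plays the role of lazy ..

```ruby
# A toy model of Chef's two passes, in plain Ruby.
# FakeResource and run_state are hypothetical stand-ins, not Chef APIs.
run_state = {}

class FakeResource
  def initialize(name, command)
    @name = name
    @command = command # either a String or a lambda (our stand-in for `lazy`)
  end

  # "converge": a deferred command is only resolved here, at run time
  def run
    @command.respond_to?(:call) ? @command.call : @command
  end
end

# --- compile phase: resources get built, nothing has run yet ---
eager    = FakeResource.new('eager', "echo 'hello #{run_state[:output]}'")
deferred = FakeResource.new('deferred', -> { "echo 'hello #{run_state[:output]}'" })

# --- converge phase: the value finally becomes known ---
run_state[:output] = 'log'

puts eager.run    # the string was frozen at compile time, so the value is missing
puts deferred.run # the lambda is evaluated now, so it sees the value
```

The eager resource interpolated its string before the value existed, so it prints an empty greeting, while the deferred one picks up the value set during the second pass ..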

Enough talk — show me the code !!

	ruby_block 'shell out fun' do
	  block do
	    # this is me running my command and assigning the output to a var
	    ls = shell_out('ls /var/')

	    # this is me assigning the STDOUT to a var
	    raw_output = ls.stdout

	    # i used the Ruby p method to print out the raw chars
	    # then i knew how to manipulate the string
	    # in this case i am going to split the output by newlines and store it in an array
	    clean_output = raw_output.split(/\n/)

	    # here i am looping thru the array
	    clean_output.each do |line|
	      # setting the temp var you want to store this in and use later
	      node.run_state[:output] = line

	      # here i go running the resource i want to invoke, knowing i have set
	      # the temp variable i plan to use later
	      resources(:execute => 'say_hello').run_action(:run)
	    end
	  end
	end

	execute 'say_hello' do
	  # here i am going to lazy fetch the command value
	  # so it will grab the latest value of the temp var we are using
	  command lazy { "echo 'hello #{node.run_state[:output]}'" }

	  # don't forget to make this do NOTHING until explicitly invoked
	  action :nothing
	end


So, is Chef “hard” ?? I would argue no — but it sure is nuanced, and for good reason ..

July 31, 2019

Qik-n-EZ: Secret Ohai Directory in Chef Cookbooks

Do you code Chef Cookbooks ?? Do you think Ohai plugins are so awesome that you’ve created some of your own ?? Do you really, really dislike having to depend on a community cookbook to deploy your custom Ohai plugins ?? Well, then I have some good news !!

There’s a secret top-level cookbook directory where you can put your custom Ohai plugins that will be synced and reloaded during a chef-client run .. Want to guess the name of this directory ?? My guess is you will only need one ..

./<cookbook name>/ohai/<ohai plugin>

Why is it a secret ?? Well, my guess is they simply forgot to document it, because it’s been an available feature since the 13.x days: https://github.com/chef-boneyard/chef-rfc/blob/master/rfc059-ohai-cookbook-segment.md

UPDATE

The ohai directory is not so secret anymore — see here ..

November 29, 2018

Free Idea: Ansible + Jenkins x AWS CodeBuild = Infinite Scale (sorta)

As you know, I like me some Ansible, AWS, and Jenkins .. Did you know it’s not uncommon to use Ansible + Jenkins as an “automation platform” to manage your cloud infrastructure ?? I do this a lot — it’s easy, reusable, and works !! Think about the workflow:

- Ansible code is committed to the Git repo
- Jenkins job is triggered
  - either “manually”, via a trigger, or periodically
- Jenkins pulls code to a local workspace
  - don’t forget to make it shallow
- Jenkins executes “shell” build step based on defined parameters
- Ansible playbook executes against defined inventory
- (allthethings) are automated

Given the above workflow, your Jenkins job might look something like this:

[Image: Jenkins job]

That said, our Jenkins node has been running hot for the past 3 months (CPU >=90%, increased RAM usage, etc.), and our knee-jerk reaction has been to scale vertically .. Yes, we know how to set up a Jenkins slave and have done this in the past, but there has to be an easier way to consume “transient infrastructure” (provision, configure, execute, destroy) .. We looked into integrating Jenkins with Lambda, but it no longer looked like normal Ansible “code”, so we punted on that .. There has to be a way to run a Jenkins job on some transient infrastructure without having to: 1) redo your workflow, or 2) care too much ..

Let’s Google — “jenkins aws plugin” .. Hello AWS CodeBuild !!

Here’s how AWS CodeBuild describes itself: “AWS CodeBuild is a fully managed build service that compiles source code, runs tests, and produces software packages that are ready to deploy. With CodeBuild, you don’t need to provision, manage, and scale your own build servers.” My big takeaway: “you don’t need to provision, manage, and scale your own build servers” .. Do a simple regex on that last statement (s/build/automation/) and you can see where I am going with this ..

Check out the plugin, it states: “Instead of sending your build jobs to Jenkins build nodes, you use the plugin to send your build jobs to AWS CodeBuild.” What’s a build job ?? Commands .. What’s a command ?? CLI .. Just look at the examples they give us .. So what’s stopping me from baking a Docker image that has Ansible installed and using that as my custom build environment ?? Nothing .. All that’s left (<<famous last words) is to create a legitimate buildspec.yml and away you go:

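	version: 0.2

	phases:
	  build:
	    commands:
	      # YAML folds the indented continuation line into the same command,
	      # so no trailing backslash is needed
	      - ansible-playbook -i $INVENTORY $PLAYBOOK
	        --limit $LIMIT --extra-vars $EXTRA --tags $TAGS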


Here’s a nice article that will show you how to setup and use the Jenkins AWS CodeBuild plugin ..

March 21, 2018

Purging Dead AWS Route53 Records via Ansible

We aren’t a big shop, but we use AWS Autoscale Groups .. That means nodes spin up and down all day long .. Got some traffic ?? Add nodes .. Traffic is low ?? Drop nodes .. Someone accidentally terminates a node ?? Add nodes .. Someone sneezes funny ?? Drop nodes ..

This goes on all day long to get as close to “right-sizing” our infrastructure as we can ..

We also name our nodes using a simple formula for various monitoring and orchestration purposes .. For example:

asg-<application>-<environment>-<availability-zone>-<instance-id>.example.com

asg-login-prod-us-east-1a-i-abc123xyz.example.com

Yes, I understand an argument could be made this is an anti-pattern .. “Greg, you don’t need to name the stinking node .. It is cattle, nobody cares !!” .. I get it, I really do, but sometimes it’s nice just to have a name ..

So then, with all this spinning up and down of nodes and the related creation of Route53 records, you can end up with a lot of dead entries .. For whatever reason, you may wish to purge these dead entries, and OBVIOUSLY you do NOT want to do this manually .. So what to do ??

Well, I might have a solution for you .. Go ahead and check out this GitHub repo: https://github.com/gkspranger/aws-route53-purge-dead-records .. It’s a simple playbook and role that will purge AWS Route53 records for a given hosted zone and a known naming pattern, while making sure to NOT delete records of nodes that are currently running ..

WARNING: It can cause damage, so please be sure to review and understand what is going on with the playbook and role ..
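
Going by the description above, the core decision is a filter: keep only records that follow the naming pattern and whose instance is no longer running .. Here is a rough sketch of that idea in plain Ruby .. To be clear, this is NOT the code from the repo (that is an Ansible playbook and role), and the dead_records helper, the regex, the record names, and the instance IDs are all made up for illustration ..

```ruby
# Hypothetical sketch of the "which records are dead?" decision.
# Not the repo's code; the helper, pattern, and sample data are illustrative only.
require 'set'

# naming formula: asg-<application>-<environment>-<availability-zone>-<instance-id>.example.com
ASG_RECORD = /\Aasg-.+-(?<instance_id>i-[0-9a-z]+)\.example\.com\z/

def dead_records(record_names, running_instance_ids)
  running = running_instance_ids.to_set
  record_names.select do |name|
    match = ASG_RECORD.match(name)
    # only touch records that follow the naming pattern AND whose node is gone
    match && !running.include?(match[:instance_id])
  end
end

records = [
  'asg-login-prod-us-east-1a-i-abc123xyz.example.com',
  'asg-login-prod-us-east-1b-i-def456uvw.example.com',
  'www.example.com' # does not follow the pattern, so it is never a candidate
]
running = ['i-abc123xyz']

puts dead_records(records, running)
```

Records that don’t match the pattern are never candidates, which is the same safety net the known naming pattern gives you in the playbook ..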

December 25, 2017

Dear Nagios

Dear Nagios,

I’m sorry, but it’s over ..

No, no, no .. It’s not you, it’s me .. I’ve changed — let me explain ..

You’ve been great these past 7 years .. Better than great, you basically defined what monitoring means to me .. Yes, we’ve both said things we regret, but we always ended up back together ..

This time, it’s different ..

I found someone ..

Their name is TIG (Telegraf, InfluxDB, Grafana) ..

STOP THAT !! THAT’S NOT NICE .. Yes, they’re younger than you, but it’s not about that .. NO !! NO !! NO !! I’m not going through a midlife crisis .. Like I said earlier, I’ve changed .. Simply put, I now care more about metrics than I do monitoring ..

Yeesss, I know you offer performance data .. You know I know that .. We’ve dated for almost a decade — let’s at least be honest with one another ..

Uh huh .. Uh huh .. Uh huh .. No, not this time, not ever .. Uh huh .. Uh huh .. Uh huh .. Believe me, I wish there was a way, I just don’t see it happening ..

OK !! THAT’S ENOUGH !! This is getting abusive now — I’m going to leave ..

Thank you for being with me ..

Thank you for teaching me ..

But it’s time to go ..

October 5, 2017

Splunk Host Tags

Did you know you can tag a host in Splunk ?? I didn’t !! Do you know how much time tags would have saved me from having to craft a most excellent Splunk search to capture just the right hosts ?? Me neither — but I’m guessing it’s a lot ..

So instead of my searches looking like this:

# get all staging RMI nodes -- hard
index=* ( host=rmi1.s.* OR host=rmi2.s.* OR host=rmi3.s.* ) source=*tomcat* earliest=-1h

They can now look like this:

# get all staging RMI nodes -- easy
index=* tag=rmi tag=stage source=*tomcat* earliest=-1h

I know, I know — I could achieve the same level of excellence using targeted indexes (index=rmi_stage) and/or various regex filters .. Some of that, unfortunately, is out of my control ..

OK .. So how can you manage this without having to use the GUI ?? Easy !! You just need to drop a config file in the proper location (for me it’s: /opt/splunk/etc/system/local/tags.conf) on the search head, and away you go .. The syntax is pretty basic:

# tagging host login.example.net with PROD, TOMCAT, and LOGIN
[host=login.example.net]
prod = enabled
tomcat = enabled
login = enabled

Below’s a nice little example of how I automated this using Ansible (big surprise there 🙂) and the EC2 dynamic inventory script ..

	# AWS EC2 hosts
	# using ANSIBLE to assign tags

	{% macro eze(tag) -%}
	{# this is an easy, consistent way to enable a tag #}
	{{ tag }} = enabled
	{%- endmacro %}

	{# loop thru all your EC2 hosts in alpha order #}
	{% for i in ansible_play_batch | sort %}

	{# set the node name AND node vars #}
	{% set node_name=i %}
	{% set node_vars=hostvars[ i ] %}

	# tags for {{ node_name }}
	[host={{ node_name }}]

	{# set custom enviro tag — dev, stage, prod, etc #}
	{{ eze(node_vars.enviro) }}

	{# set std AWS metadata tag — region, AZ, etc #}
	{{ eze(node_vars.ansible_ec2_placement_region) }}
	{{ eze(node_vars.ansible_ec2_instance_id) }}
	{{ eze(node_vars.ansible_ec2_placement_availability_zone) }}

	{# set custom tag if condition is met — app type, etc #}
	{% if node_vars.app_type is defined %}
	{{ eze(node_vars.app_type) }}
	{% endif %}

	…

	{% endfor %}


August 22, 2017

He Shoots .. He Fails !!

For the past 6 months I have been working on a lean startup .. It was (<<infer all you want here) a chatbot interface for a financial services CRM — and did not end well 😦 That said, I did some of my best Ansible/AWS work and my server-side JavaScript (Node.js) and understanding of the Hubot internals improved exponentially ..

So I want to share !!

I can’t get into too many details, but the overall concept was that every customer would be running a micro instance with our custom Hubot code installed .. This instance would pull code updates, if any, every 5 minutes and infrastructure updates, if any, every 15 minutes .. In addition, a customer could participate in pilot programs — AKA branch work ..

I really liked how I was able to avoid the need for a “command node” and just run Ansible locally and on a schedule .. Also, I was able to automate pretty much everything, from VPC creation all the way to autoscaling groups ..

Anyway, here’s the link: https://github.com/gkspranger/failed-chatbot .. Maybe it will help one of you out there in Internets land ..

July 10, 2017

Qik-n-EZ: Nagios AWS EC2 Hosts via Ansible

Sooo .. You are monitoring a fleet of AWS EC2 hosts via Nagios, and have yet to find an easy way to manage their host definitions .. Good news (if you happen to be using Ansible dynamic inventories) !! I created an Ansible template that loops thru all your EC2s and creates them for you ..

In addition, you can easily define Nagios service dependencies, helping you zero in on the root problem more quickly ..

	{# loop thru all relevant nodes #}
	{% for i in ansible_play_batch | sort %}

	{# set the node name AND node vars #}
	{% set node_name=i %}
	{% set node_vars=hostvars[ i ] %}

	{# define the nagios hostgroup vars associated with this node .. default to linux #}
	{% if node_vars.nagios_hostgroups is defined %}
	{% set node_hostgroups="linux," ~ node_vars.nagios_hostgroups %}
	{% else %}
	{% set node_hostgroups="linux" %}
	{% endif %}

	{# define the nagios hostgroup arr so we can do some easy checking #}
	{% set node_hostgroups_arr=node_hostgroups.split(",") %}

	#############################
	## START {{ node_name }}
	#############################

	define host {

	use linux-server
	host_name {{ node_name }}
	address {{ node_vars.ansible_ec2_local_ipv4 | default('127.0.0.1') }}

	hostgroups {{ node_hostgroups }}

	# it's nice to have some ec2 data as nagios host vars
	_ec2_instance_id {{ node_vars.ansible_ec2_instance_id }}
	_ec2_instance_type {{ node_vars.ansible_ec2_instance_type }}
	_ec2_placement_availability_zone {{ node_vars.ansible_ec2_placement_availability_zone }}
	_ec2_placement_region {{ node_vars.ansible_ec2_placement_region }}
	_ec2_security_groups {{ node_vars.ansible_ec2_security_groups }}

	}

	#############################
	## NRPE dependencies
	## if NRPE aint available, these will fail
	#############################

	# SWAP
	define servicedependency {
	host_name {{ node_name }}
	service_description NRPE Port
	dependent_host_name {{ node_name }}
	dependent_service_description Swap
	execution_failure_criteria n
	notification_failure_criteria w,u,c
	}

	# CPU
	define servicedependency {
	host_name {{ node_name }}
	service_description NRPE Port
	dependent_host_name {{ node_name }}
	dependent_service_description CPU
	execution_failure_criteria n
	notification_failure_criteria w,u,c
	}

	# you can go on and on here ..
	# for a very long time ..

	#############################
	## END {{ node_name }}
	#############################

	{% endfor %}


June 15, 2017

Qik-n-EZ: Nagios Plugins to Check AWS EC2 Images and Snapshots

Afraid of having too many AWS EC2 images and/or snapshots, thus running up your bill ?? Fear not !! I have you covered:

Nagios Plugin to Check AWS EC2 Images

	#!/bin/bash
	#
	# checks for the number of AWS AMIs and allows alerts when threshold is met
	# example usage:
	# ./check_aws_amis.sh -w <integer> -c <integer>
	###
	### USES ANSIBLE to put AWS KEYs as VARS
	### USES ANSIBLE to define NAGIOS REGION
	###
	### REQUIRES AWS CLI : https://aws.amazon.com/cli/
	### REQUIRES JQ : https://stedolan.github.io/jq/

	export AWS_ACCESS_KEY_ID={{ aws_access_key_id }}
	export AWS_SECRET_ACCESS_KEY={{ aws_secret_access_key }}

	warn=NULL
	critical=NULL

	help () {
	cat << EOF
	Check number of AMIs in AWS.

	Usage:
	check_aws_amis.sh -w <warning number> -c <critical number>

	Options:
	-h,
	Print detailed help screen
	-w INTEGER
	Exit with WARNING status if greater than or equal to INTEGER
	-c INTEGER
	Exit with CRITICAL status if greater than or equal to INTEGER
	EOF
	exit 3
	}

	while getopts "w:c:h" opt; do
	case $opt in
	w)
	warn="$OPTARG"
	;;
	c)
	critical="$OPTARG"
	;;
	h)
	help
	;;
	esac
	done

	if [[ "$warn" == "NULL" ]] || [[ "$critical" == "NULL" ]]; then
	help
	fi

	amis=$(aws ec2 describe-images --owners self --region {{ ansible_ec2_placement_region }} | jq -r '.Images | length')

	if [ $amis -ge $critical ]; then
	dostatus="CRITICAL"
	doexit=2
	elif [ $amis -ge $warn ]; then
	dostatus="WARNING"
	doexit=1
	elif [ $amis -lt $warn ]; then
	dostatus="OK"
	doexit=0
	else
	dostatus="UNKNOWN"
	doexit=3
	fi

	echo "AWS AMIs ${dostatus} - ${amis} AWS AMIs | amis=${amis};0;0;0;0"
	exit $doexit


Nagios Plugin to Check AWS EC2 Snapshots

	#!/bin/bash
	#
	# checks for the number of AWS snapshots and allows alerts when threshold is met
	# example usage:
	# ./check_aws_snapshots.sh -w <integer> -c <integer>
	###
	### USES ANSIBLE to put AWS KEYs as VARS
	### USES ANSIBLE to define NAGIOS REGION
	###
	### REQUIRES AWS CLI : https://aws.amazon.com/cli/
	### REQUIRES JQ : https://stedolan.github.io/jq/

	export AWS_ACCESS_KEY_ID={{ aws_access_key_id }}
	export AWS_SECRET_ACCESS_KEY={{ aws_secret_access_key }}

	warn=NULL
	critical=NULL

	help () {
	cat << EOF
	Check number of snapshots in AWS.

	Usage:
	check_aws_snapshots.sh -w <warning number> -c <critical number>

	Options:
	-h,
	Print detailed help screen
	-w INTEGER
	Exit with WARNING status if greater than or equal to INTEGER
	-c INTEGER
	Exit with CRITICAL status if greater than or equal to INTEGER
	EOF
	exit 3
	}

	while getopts "w:c:h" opt; do
	case $opt in
	w)
	warn="$OPTARG"
	;;
	c)
	critical="$OPTARG"
	;;
	h)
	help
	;;
	esac
	done

	if [[ "$warn" == "NULL" ]] || [[ "$critical" == "NULL" ]]; then
	help
	fi

	snapshots=$(aws ec2 describe-snapshots --owner-ids self --region {{ ansible_ec2_placement_region }} | jq -r '.Snapshots | length')

	if [ $snapshots -ge $critical ]; then
	dostatus="CRITICAL"
	doexit=2
	elif [ $snapshots -ge $warn ]; then
	dostatus="WARNING"
	doexit=1
	elif [ $snapshots -lt $warn ]; then
	dostatus="OK"
	doexit=0
	else
	dostatus="UNKNOWN"
	doexit=3
	fi

	echo "AWS Snapshots ${dostatus} - ${snapshots} AWS Snapshots | snapshots=${snapshots};0;0;0;0"
	exit $doexit


May 17, 2017

Hubot with Handlebars :-3

So you’ve met Hal — he’s my bud .. That said, I’ve never been a fan of how I “reply” to Hubot commands .. For example:

	// Commands:
	// hubot hello - replies with "world"!

	const logger = require("winston");

	module.exports = function(robot) {
	robot.respond(/hello$/i, { id: "hello" }, function(message) {
	message.send("world!");
	logger.log("info", "i like to log stuff!", { yourObject: "some random data" });
	});
	};


“Greg, what’s your problem ?? Hal brings you beer !!” .. True, true .. But I want more !! I want to be more like my developer friends and apply an MVC design pattern to my Hubot development .. Specifically, I want my “views” to be beautiful and maintainable, while having the ability to use complex data “models” ..

Hello Handlebars !! Long story short, I can easily build semantic templates, compile output, and send as a Hubot reply .. For example:

	// Commands:
	// hubot handlebar me - example of how to use handlebars

	const logger = require("winston");
	const handlebars = require("handlebars");
	const fs = require("fs");

	module.exports = function(robot) {
	// this is the controller
	robot.respond(/handlebar me/i, function(message) {

	// this is me getting the view
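	fs.readFile('./views/mytemplate.txt', 'utf-8', function(error, source) {
	// bail out early if the view could not be read
	if (error) {
	logger.log("error", "could not read the template", { error: error });
	return;
	}
	var template = handlebars.compile(source);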

	// this is me passing the model into the view
	var output = template({
	line1: "this is line 1",
	arr1: [
	"this",
	"is",
	"an",
	"array"
	],
	obj1: {
	obj: "object within an object"
	}
	});

	// this is me replying to the hubot command
	message.send(output);
	});
	});
	};


Here’s the template:

	{{!--

	# these are comments .. they will not be displayed
	# i often put what the object looks like in here so i don't forget

	{
	line1: …,
	arr1: […],
	obj1: {
	obj: …
	}
	}

	--}}
	Line 1: {{line1}}

	{{#each arr1}}
	Item {{@index}}: {{this}}
	{{/each}}

	My object inside an object: {{obj1.obj}}


Here’s the Slack output: