Ansible: Handling Different Operating Systems with Variables

A fundamental part of working with configuration management tools is the ability to handle different operating systems within the same logic stream, such as a module.

I’ve found a few challenges addressing this topic within Ansible, and hence I think it’s worth noting certain things that you can’t do, and I think is the most logical way (so far).

Things That do not Work

As with most things I learn in life, I figured these out via trial and error, via an approach something along the lines of “I should be able to do this, shouldn’t I?”

Variables and facts cannot be directly referenced in a handler

I tried this:

# handlers/main.yml
---
- name: (Ubuntu) restart ssh
  service: name=ssh state=restarted

- name: (CentOS) restart ssh
  service: name=sshd state=restarted

# tasks/main.yml
---
- name: test SSH restart
  lineinfile: dest=/etc/ssh/sshd_config line=#testing
  notify: ({{ ansible_distribution }}) restart ssh

This gives you this:

ERROR: change handler (({{ ansible_distribution }}) restart ssh) is not defined

Hmm.

Logic does not work in handler

You also can’t do something like this:

# handlers/main.yml
---
- name: restart ssh
  service: name=ssh state=restarted
  when: ansible_distribution == 'Ubuntu'

- name: restart ssh
  service: name=sshd state=restarted
  when: ansible_distribution == 'CentOS'

In this instance, only the first handler is looked at, and if the logic in when: does not evaluate to true, the handler will be skipped. Completely.

Defining variables within a task or handler

You cannot use a vars directive within a task or handler: you simply get this:

ERROR: vars is not a legal parameter in an Ansible task or handler

Note that facts can be set using the set_fact module but this is better used when there is a need to define host-specific variables for use later, in more of a host scope versus a role scope.

The Right Way

I mentioned that variables within a role should not be used, but that’s only half the story. Specifically, variables should not be used for parameters – if you are going to define those variables later within a playbook or via group_vars/ or host_vars/, it is better to make them defaults.

But on the other hand, they are better suited as variables if they are not intended for assignment outside of the scope of the role. This is perfect for what we are planning on using them for here.

Variables can be defined in a role pretty easy – see roles. Note that in here there is a vars/ directory, from which vars/main.yml is loaded with high priority.

This is not the file I am looking for though, as variable files are not very flexible: no additional files or logic can go into it, just values.

So how can variables be included logically then?

include_vars and with_first_found

include_vars is a module that allows you to include variables from within a task. Perfect for the kind of logic flow we are looking for, and allows you to specify a file based off either a jinja2 template handler or the special {{ item }} token.

That’s only part of it though. Logic is needed to handle different operating systems, and can get pretty messy if all that is being used is a bunch of when: clauses. Enter with_first_found:

# tasks/main.yml
---
- name: set distro-specific variables
  include_vars: '{{ item }}'
  with_first_found:
    - '{{ ansible_os_family }}.yml'
    - default.yml

Using this, files can now be placed like so:

# vars/Debian.yml
---
ssh_svc: ssh

# vars/RedHat.yml
---
ssh_svc: sshd

# vars/default.yml
---
ssh_svc: ssh

Now, if you have a handler such as:

# handlers/main.yml
- name: restart ssh
  service: name={{ ssh_svc }} state=restarted

Either ssh or sshd would be used, depending on distro.

Note that with_first_found can be used to set up logic for other things, not just include_vars, it can be especially useful within your task workflow, to avoid including logic that may artificially take up log space or time in a playbook. These includes could be used in the place of when: clauses, for example.

More detail on logic and templating below. Most with_ clauses can be found in the Loops page, but there are a couple of more within the other pages. I wish Ansible was better at documenting this!

http://docs.ansible.com/playbooks_loops.html
http://docs.ansible.com/playbooks_lookups.html
https://docs.ansible.com/playbooks_conditionals.html#selecting-files-and-templates-based-on-variables

Advertisement

Configuration Management Series Part 1.5 of 3 – Ansible Follow-Up

Before we get started on working with our next configuration management system, Chef, I thought I’d post a follow-up to last week’s article covering Ansible. In this article, I will be touching up on some of the conclusions I drew last week, and my opinion on them now, and also some of the things I have learned about working with playbooks.

Programmability

First off, some things on programmability:

Advantages of sequential execution

I really do enjoy the top-down nature of Ansible. You don’t really have to set up any sort of dependency chain for classes or resources, which can really work to your advantage.

Consider the scenario where you have a file that needs to be written to or a command that needs to be run only when a specific package is installed.

In Puppet, this needs to be written:

file { '/etc/example/example.conf':
  content => template('example_mod/example.conf.erb'),
  require => Package['example']
}

package { 'example':
  ensure => installed
}

This is not so bad, and in fact, the declarative nature of Puppet makes it pretty easy to manage this stuff.

But what if you try to require a different resource, like a class? Or multiple packages?

Your child resource could end up with dependencies like this, which would then need to be repeated:

file { '/etc/example/example.conf':
  content => template('example_mod/example.conf.erb'),
  require => [ Package['example1'], Package['example2'], Package['example3'] ]
}

You may also try to chain off a class or service, however when I have attempted these kinds of things, I have just ran into dependency loops.

Then you get so far down the rabbit hole you need a grapher. Consider the following article regarding dependency issues. When I need a whole article to try to figure out how to properly structure dependencies, I honestly tune out.

In Ansible, you don’t really need all of this. You just make sure your package tasks are set up before the tasks that depend on them:

- name: Install packages
  yum: name='{{ item }}' state=present
  with_items:
    - example1
    - example2
    - example3

-name: Put the file where it needs to go now
  template: src=example.conf.j2 dest=/etc/example/example.conf mode=0644

In this instance, if the first task fails, the other ones will not execute.

Another easy check that you can stick at the start of a playbook is an OS check or what not, such as:

- name: check for supported OS
  fail: msg=Unsupported distribution (Ubuntu 14.04 supported only)
  when: ansible_distribution_release != 'trusty'

This will fail if the playbook is not running against anything but Ubuntu 14.04, allowing you to have the rest of the playbook run without worrying about it being run against an unsupported OS.

Frustrations with bad error messages

A few things I have noticed so far with Ansible is the Python-ish error messages you get out of it, sometimes, unfortunately, out of context. I have gotten error messages with stack traces in them (ie: when processing Jinja2 stuff and also when a module fails thru some sort of untrapped Python method). I actually don’t mind this, because at least I know where to start looking.

It’s error messages like this that drive me up the wall.

msg: Error in creating instance: 'str' object has no attribute 'get' 
FATAL: all hosts have already failed -- aborting

This happened to me while trying to mess with the nova_compute module. Where exactly am I supposed to start looking for something wrong here? I have even attempted to use pdb to try and chase it down, but I had to give up because I had other stuff to do. I may try again, but I really am on my own here, barring outside help. This error message does nothing to tell me where my problem is or how to correct it.

It almost would have been better if the error was left un-handled, so I could see the Python stack trace.

Playbooks

And also, some thoughts and bits of knowledge on playbooks.

The playbook as a site configuration

If you looked at the previous article, I had a bit of trouble trying to figure out what a playbook really was in relation to how Puppet is laid out. At first glance, it seemed to me that a playbook was close to what a module was, but that’s not really the case at all.

Playbooks are modular, sure, but they are more like individual site configurations more than anything else. Think about them as individual Puppet or Chef site installs.

According to the Ansible playbook best practices guide there is a very specific layout. This is reproduced below:

production                # inventory file for production servers
stage                     # inventory file for stage environment

group_vars/
   group1                 # here we assign variables to particular groups
   group2                 # ""
host_vars/
   hostname1              # if systems need specific variables, put them here
   hostname2              # ""

library/                  # if any custom modules, put them here (optional)
filter_plugins/           # if any custom filter plugins, put them here (optional)

site.yml                  # master playbook
webservers.yml            # playbook for webserver tier
dbservers.yml             # playbook for dbserver tier

roles/
    common/               # this hierarchy represents a "role"
        tasks/            #
            main.yml      #  <-- tasks file can include smaller files if warranted
        handlers/         #
            main.yml      #  <-- handlers file
        templates/        #  <-- files for use with the template resource
            ntp.conf.j2   #  <------- templates end in .j2
        files/            #
            bar.txt       #  <-- files for use with the copy resource
            foo.sh        #  <-- script files for use with the script resource
        vars/             #
            main.yml      #  <-- variables associated with this role
        defaults/         #
            main.yml      #  <-- default lower priority variables for this role
        meta/             #
            main.yml      #  <-- role dependencies

    webtier/              # same kind of structure as "common" was above, done for the webtier role
    monitoring/           # ""
    fooapp/               # ""

Note how everything is accounted for here. For someone like me who comes from Puppet, there are some easy analogs here that have helped me:

  • site.yml and the other playbooks are pretty similar to Puppet site manifest files. From them entire configurations can be laid out without a need for anything else, or different roles can be referenced.
  • group_vars and host_vars behave similar to Hiera for storing variable overrides.
  • Finally, the roles structure is the closest thing that Ansible has to Puppet modules.

Honourable mentions to the library and filter_plugins directories. Ansible modules are actually closer to the core than Puppet’s, and normally do not need to be written, but you may find yourself doing so if your roles require functionality that cannot be reproduced in any existing module and you want that recycled across modules. Also, jinja2 filter plugins can be written (we give a small example below).

Roles

As mentioned, roles are probably the closest thing that Ansible has to a puppet module. These are recyclable “playbooks” that are self-contained and are then included within a proper playbook via the following means:

- hosts: webservers
  sudo: yes
  roles:
    - common
    - webtier
    - monitoring

This would ensure that webservers got any tasks related to the common, webtier, and monitoring roles, running under sudo. Imagine that this means that some basic common tasks are done (ie: maybe push out a set of SSH keys and make sure that DNS servers are set correctly), set up a webserver, and install a monitoring agent.

Similar to Puppet or Chef, there is a file structure to roles that allows you to split out various parts of the roles to different directories:

  • tasks for role tasks, the main workflow of a role and Ansible itself.
  • handlers for server actions and what not (ie: restart webserver)
  • templates to store your jinja2 templates for template module tasks
  • files for your static content
  • defaults to define default variables
  • vars to define local variables (NOTE: I have decided to stay away from these and I recommend you do too. Use defaults!)

main.yml is the universal entry point for all of these. from here, you can include all you wish. Example:

- name: check for supported OS
  fail: msg=Unsupported distribution (Ubuntu 14.04 supported only)
  when: ansible_distribution_release != 'trusty'

- include: install_pkgs.yml

This would be a tasks/main.yml file and demonstrates how you can mix includes and actual tasks in the role.

Group and host variables

This is where I would recommend you store variables if you need them. The obvious place would be group first, then host if you need specific host variables.

For example, this would apply foo=bar to all webservers for all plays:

# group_vars/webservers.yml
---
foo: bar

You can also include them in the playbook directly, ie: like so:

- hosts: webservers
  sudo: yes
  roles:
    - common
    - webtier
    - monitoring
  vars:
    foo: bar

Secure data

ansible-vault is a very basic encryption system that encrypts an entire file (does not need to be YAML) off the command line. It hooks into ansible and ansible-playbook via the --ask-vault-pass switch that will take any encrypted files it encounters and decrypt them.

It is recommended that anything with sensitive data be encrypted. The one main difference here is that as opposed to Hiera-eyaml, for example, the entire file is encrypted, so you may wish to separate encrypted and unencrypted data through separate files, if it can be helped. This ensures non-sensitive data can still be published to source control and be reviewed.

Unfortunately, it does not look like there is a certificate or private/public key system in use here, so remembering the password is very important, as is picking a secure one. Passwords can also be stored in a file (make sure it’s set to mode 0600), and passed using the --vault-password-file option under all Ansible utilities, assumedly.

Filter Plugins for jinja2

And finally, I wrote that Jinja2 plugin I was hitting the wall with. 😉 It’s very simple and can be found below. This would go into the filter_plugins directory.

# Add string split filter functionality
def split(value, sep, max):
    return value.split(sep, max)

class FilterModule(object):
    def filters(self):
        return {
            'split': split
        }

This now gives you the split filter that can be invoked like so:

{{ (example|split('.'))[0] }}

Assuming example.com, this would give example.

“Final” Thoughts

To be honest, I think that Ansible is more of an orchestration tool rather than a configuration management system. The agentless push nature of it, combined with its top-down workflow, roll-your-own inventories, and loosely enforced site conventions give it a chaotic feel that, even after giving it a week, still does not feel like configuration management to me.

This is not to say it’s a bad tool in any right, in fact, quite the opposite. Ansible is a great addition to my toolbox, and I am looking forward to applying it in everyday engineering. The above points that may be “weaknesses” when considering configuration management can equally be strengths when you need to get something done quickly and with as little of a footprint as possible.

Further to that, Ansible can be combined with something like Chef or Puppet, with each tool playing on its strengths. Ansible can be used to bootstrap either tool, with Chef or Puppet doing things like managing SSH keys (such as one for Ansible), enforcing configuration, etc. Ansible can be used in place of something like knife or puppet kick to force updates as well.

Some tools need work: Vault, for example, could do better to be more than just a wrapper for encryption, and set up some sort of keypair structure to reduce the amount of typing you have to do. And error messaging could do with a bit of work so that engineers are not scratching their heads trying to figure out what variable caused Python to barf when errors are trapped.

All in all though, I really like Ansible. There are some other projects that I want to try it with, namely working with the nova_compute and vsphere_guest modules to help automate orchestration even further.

I know that I mentioned that Chef was going to be this Sunday. Unfortunately (or maybe fortunately) I spent a bit more time with Ansible than I thought I was going to, and the research has not been done to properly get a Chef article up yet – ie: I have not yet started. 😛 in any case, that will be happening next – so stay tuned for the it next weekend!

Configuration Management Series Part 1 of 3: Ansible

This is the first of a three-part series that I am doing regarding reviewing 3 major configuration management tools: Ansible, Chef, and Puppet.

You may have seen that I have written here about Puppet before – indeed, it is the configuration management tool that I have the most experience with and probably the one that I will be sticking with personally (famous last words :p). However, I think that it’s always in an engineer’s best interest to be able to understand, support, and thrive with technologies other than the ones that he or she may personally favour as part of their own toolkit – that is what teamwork is built on, after all. So with that, take the opinions in these next few articles with a grain of Salt (which, incidentally, is the name of another tool that I will not be reviewing at this point in time, heh). I hope that you can value my opinion while still understanding that is it subjective, and your experience will vary. Honestly, I think that all 3 of these tools have value, and in the hands of competent engineers, you would be hard-pressed to find shortcomings in any of them.

One other note: I have removed the ratings from this article and the future articles will not have them either. I thought it was a good idea at first, but as time went on, and my experience with the tools increase, they seem more and more pretentious and quite misinformed.

Now that I have that disclaimer out of the way, let’s get started shall we?

As mentioned in the first part of this series, we are covering Ansible.

About Ansible

A long time ago in a technology scene not so far away (but definitely quite different than the landscape today, I used to work at NetNation (you may now know them as Hostway Canada). On our management station we had a very simple shell script that performed a very valuable task: batch administration of our shared hosting server farm over SSH.

It has now been nearly 10 years since I left Hostway, yet some concepts never die, they just get better. Ridiculously better. Magnitudes of evolution better.

Maybe about 5 or so years ago now I actually wrote a better version of this script in Python – doing some research I fond a very nifty SSH library called Paramiko (http://www.paramiko.org/) and added a few things to make the experience better: namely, the ability to upload files and run shell scripts on the remote machine.

Admittedly, I haven’t really looked into Ansible recently, but imagine my surprise to find out that someone has basically taken that concept and made it leaps and bounds more awesome.

Ansible is basically this concept on – even though I loathe to use this analogy – steroids. Its main command transport is SSH, and does not require an agent on managed nodes to run. It allows for complex configuration management thru a YAML-based pseudo-DSL, one-off command execution, file upload, and also remote execution of complete shell scripts. A very powerful swiss-army knife for both one-off and ongoing configuration management.

Installation

Installation of Ansible is probably the most minimal out of the three that we will be covering. The three major methods are either: 1) git repo (https://github.com/ansible/ansible), 2) package, or 3) pip.

With either method you may notice something that you may not be used to if you have used a more traditional configuration management tool such as Chef or Puppet: no management server.

This is almost true. Ansible nodes normally do not use a check-in method to get their configuration data, rather it is pushed out to various nodes, so there is no need for a server with this capability. Nonetheless, there is a central configuration structure that can be found in a directory such as /etc/ansible and needs to be edited accordingly. More on that in brief later.

You can find the installation guide at http://docs.ansible.com/intro_installation.html.

Windows Support

This was going to detail the support of both major operating systems, but I felt the need to mention Linux support was redundant, as all 3 were obviously built to manage Linux first.

Ansible was written with Linux and other open source server systems in mind. With that said, they have had made a very recent push to supporting Windows, and it is looking pretty promising. Going with their agentless model again, they have decided to use WinRM instead. Some of the Windows specific features look amazing as well and definitely on par with something like Puppet.

Check out the Windows section at http://docs.ansible.com/intro_windows.html. The list of Windows-specific modules can be found at http://docs.ansible.com/list_of_windows_modules.html.

Manageability

Since Ansible does not rely on agents, there is no real host discovery in the sense that hosts would be checking in with your management sever. From what I’ve seen, there is not even any real language to refer to the “master” server.

Inventory is managed in /etc/ansible/hosts in a pretty simple configuration file that is hard to mess up. Here is my sample hosts file:

# Our lab webserver
[webservers]
172.16.0.12

# Our lab database server
[dbservers]
172.16.0.13

Hosts can be referred to by IP address, hostname, and can be further abstracted with aliases. They can also be grouped together using primitive expressions if your hosts all follow a certain convention. Various plugins exist to gather inventory dynamically too – see http://docs.ansible.com/intro_dynamic_inventory.html.

The only thing that I really see missing here is some native sort of host registration and grouping. This may not be something that, by design, would be Ansible’s domain though, considering it’s push-based architecture. Also, depending on the environment you are using it in, this may be something that you can handle with dynamic inventory.

Best practice for storing host parameters (variables) actually follows very closely with how Puppet handles things with Hiera. See here: http://docs.ansible.com/intro_inventory.html#splitting-out-host-and-group-specific-data and you will see that data is generally stored in YAML files in a very similar fashion. There is also Ansible Vault: http://docs.ansible.com/playbooks_vault.html which allows for the storage of encrypted data.

There is also an enterprise console, of course. Check out Ansible tower here: http://www.ansible.com/tower

Execution Model

Several options exist for you to use with Ansible’s powerful command-line structure. The ansible command line tool can be used to execute one-off commands, module runs, and even push shell scripts and files over to remote nodes. And since it’s all intended to be run over SSH via standard user accounts, existing access control schemes can be extended to configuration management as well.

Even though I don’t believe it’s a strength of the tool, there is a pull option available as well via ansible-pullhttp://docs.ansible.com/playbooks_intro.html#ansible-pull

Programmability

My opinion here might be a bit polarizing or just completely out there, but I have not had the best experience writing for Ansible so far, although mind you I have only been doing it for an evening, and obviously still have a lot to learn.

Modules are actually bit more of an advanced topic, with Playbooks taking the place of general modules like you would be used to in Puppet, or cookbooks in Chef. These are collections of YAML files that dictate orchestration and management of configuration files. Playbook documentation can be found at http://docs.ansible.com/playbooks.html.

Templating for Playbooks is done using jinja2 – a powerful template language written for Python. You can find the template reference at http://jinja.pocoo.org/docs/dev/templates/. Ansible also extends this a bit and its worth reading http://docs.ansible.com/playbooks_variables.html#using-variables-about-jinja2 to find out how.

Unfortunately, this is where I got a little hung up on a learning curve. My impressions were:

  • The YAML structure for Playbooks was open to a bit of interpretation, and the ambiguity of it was throwing me off at points. I found that I was trying to figure out the best way to nest, if I should be at all, how much code to recycle, etc. Ironically, there seems to be several ways to do it, which kind of flies in the face of one of the core philosophies of what Ansible is written in, Python (see https://wiki.python.org/moin/TOOWTDI).
  • I hit a straight up wall when I found out that jinja2 does not have a split filter – you need to write one of your own. An easy task in Python, but annoying nonetheless.

Lastly, when it comes to config management, I prefer to not have to write any of my own modules if I can help it. Unfortunately, briefly looking at Ansible Galaxy left me unfulfilled. Standard modules that I was used to having in Puppet, such as mainline designed modules like Apache and MySQL, were not up to par or were just flat out missing. Other highly-rated modules that purported to do one thing did others (ie: one Apache module was trying to manage SSH keys as well). Several user accounts (even one named “ihateansible”) exist with zero contributions. I hate to say it but my at least first impression is that there are some quality control issues here.

Conclusion

  • Strengths:
    • Agentless push-style management
    • Powerful inventory management
  • Weaknesses
    • Core programmability features will take some getting used to if you are coming from Chef/Puppet.
    • Ansible Galaxy possibly has some quality control issues.

Ansible is great, no doubt about that. It has taken a tried and reliable concept and turned it into something so much more than I could have imagined.

Installation is dead simple, and hence it excels in environments that need to stand up some sort of configuration management fast and with as little impact to existing infrastructure as possible. Its simple yet powerful approach to host management compliments this process even more.

Windows support is now available and looks very promising, and personally I am impressed on how fast they have been able to stand this side of it up.

Programmability could be worked on a bit, but again this could be my inexperience talking. At the very least, I would prefer to see more mainline created and managed Playbooks, which would then raise the overall quality of the modules seen on Galaxy.

Stay tuned for part 2: Chef, either later this week or on next Sunday. There will probably be another article or two regarding Ansible as well before we wrap up!