As luck would have it, I have been working a lot more with Chef in the last couple of weeks, so a supplemental article is in order.
There have been a number of things that I have found out in this period, and there is a good amount that I probably will not cover, but I am going to try and cover some of the key challenges that I have encountered upon my journey. This should give you, the reader, some insight into why I have chosen to continue with Chef, and maybe help you make some good decisions on data management and code design that will save you time on your own ramp-up.
Provisioning with Knife Plugins
First off, the reason why I have decided to keep going with Chef in the first place. Incidentally, it would seem that Chef has the best support for provisioning out of the three tools that I have looked at, especially as far as vSphere is concerned.
As it currently stands, I am in a situation where I need a tool that is going to handle the lifecycle of an instance, end-to-end. Ansible’s vsphere-guest module, unfortunately, was not satisfactory for this job at all, and in fact, working with this in particular piece of code was a lesson in frustration. The code is highly inconsistent across operations – specifically, very few features that are available in instance creation are available in template clone operations, and the available functionality in the former was not suitably portable. Puppet seemingly lacks such a feature altogether, outside of Puppet Enterprise. Both of these are show stoppers for me.
Enter knife-vsphere. An amazing tool, and 100% free. Incidentially, cloning from template is the only creation operation supported, which makes sense, considering that a brand new instance would more than likely be useless for provisioning on. Linux guest customization is fully supported, and programmable support for Windows customization is as well (see examples in the README). Finally, it can even bootstrap Chef onto the newly created node. The tool will even destroy instances and remove their entries from the Chef server if the instance label has a matching node.
Having these features in Chef, it was pretty much a no-brainer to see what wins out here.
Data Bags and Vault, Oh My!
I think my assessment that data bags were close to Hiera was premature. In the time that I wrote that, I have learned a few key takeaways.
First off, data bags are global. And even though I used global lookups directly in my first article about the subject, Hiera objects are designed to be used in a scoped auto-lookup mechanism, as per their documentation. Incidentally, data bags are the ones that are intended to be accessed in a fashion that is more in line with what I described in that article. Live and learn eh?
Conventions can be created that can help with this. For example, one can set up a bag/item pair like
foo/bar if the cookbook’s name is
foo. Also, cookbooks can still be parameterized elsewhere, and in fact Chef features like roles and environments are better suited for this, by design. (Note: Please read the last section of this article before investing too much into roles and environments, as roles in particular are frowned upon in parts of the Chef community).
Considering the latter part of the last paragraph, one would wonder: what exactly are data bags ultimately good for then? What indeed. A little bit of research for me revealed this good commentary on the subject by awesome community Chef 2015 winner Noah Kantrowitz. Incidentally, Noah recommends creating resources for this, versus using roles or environments.
As it currently stands, my general rule of thumb for using data bags is encryption. Does it need to be encrypted? Then it goes into a bag. Not a regular encrypted bag though – a regular encrypted bag requires a pre-shared key that needs to be distributed to the nodes somehow. There is no structure around controlling access to the data. How does this problem get solved then? Enter chef-vault.
Vault allows encryption via the public keys of the nodes that need the data, effectively ensuring only these nodes have access. This can explicitly be controlled by node or through a search string that allows searching on a
key:data pair (ie:
os:linux). This addresses both concerns mentioned in the previous paragraph. The only major issue with this setup is that the data needs to be encrypted across the key that will ultimately be supplied to the node, creating a bit of a chicken-before-the-egg problem. Luckily, it looks like Chef is catching up to this, and recently vault options for knife bootstrap were added to get past this. Now, when nodes are created, vault items can be updated by the host doing the provisioning, allowing a node access to vault items even during the initial chef-client run. This is not supported on validator setup, as the ability to do this anonymously (well, with the validator key) could possibly mean a compromise of the data.
Down the Rabbit Hole
Lastly, I thought I would share some next steps for me. The really fascinating (and frustrating) part of Chef for me is how the community has adopted its own style for using it. There are several years of social coding practices, none of which have been in particularly well documented by mainline.
First off, I am currently working on re-structuring my work to be supported by Berkshelf. This is now a mainline part of the Chef DK, and is used to manage cookbooks and their dependencies.
There are also a number of best practices for writing cookbooks, usually referred to as cookbook patterns in the community, which definitely reflects the developer-centric nature of Chef. Based on some very light and non-scientific observation, one of the more popular documents for this seems to be The Environment Cookbook Pattern by Jamie Winsor, the author of Berkshelf.
2 other articles that have helped me so far are found below as well:
You can probably expect to hear more out of me when it comes to Chef, stay tuned!