all posts tagged planetfedora

by on August 18, 2015

Thanking Oh-My-Vagrant contributors for version 1.0.0

The Oh-My-Vagrant project became public about one year ago and at the time it was more of a fancy template than a robust project, but 188 commits (and counting) later, it has gotten surprisingly useful and mature.

james@computer:~/code/oh-my-vagrant$ git rev-list HEAD --count
james@computer:~/code/oh-my-vagrant$ git log $(git log --pretty=format:%H|tail -1)
commit 4faa6c89cce01c62130ef5a6d5fa0fff833da371
Author: James Shubin <>
Date:   Thu Aug 28 01:08:03 2014 -0400

    Initial commit of vagrant-puppet-docker-template...
    This is an attempt to prototype a default environment for
    vagrant+puppet+docker hacking. More improvements are needed for it to be
    useful, but it's probably already useful as a reference for now.

It would be easy to take most of the credit for taking the project this far, as I’ve been responsible for about 87% of the commits, but as is common, the numbers don’t tell the whole story. It is also a bug (but hopefully just an artifact) that I’ve had such a large percentage of commits. It’s quite common for a new project to start this way, but for Free Software to succeed long-term, it’s essential that the users become the contributors. Let’s try to change that going forward.

james@computer:~/code/oh-my-vagrant$ git shortlog -s | sort -rn
   165    James Shubin
     5    Vasyl Kaigorodov
     4    Randy Barlow
     2    Scott Collier
     2    Milan Zink
     2    Christoph Görn
     2    aweiteka
     1    scollier
     1    Russell Tweed
     1    ncoghlan
     1    John Browning
     1    Flavio Fernandes
     1    Carsten Clasohm
james@computer:~/code/oh-my-vagrant$ echo '165/188*100' | bc -l

The true story behind these 188 commits is the living history of the past year. Countless hours testing the code, using the project, suggesting features, getting hit by bugs, debugging issues, patching those bugs, and so on… If you could see an accurate graph of the number of hours put into the project, you’d see a much longer list of individuals, and I would have nowhere close to 87% of that breakdown.
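If you'd like to reproduce the commit percentages yourself, here is a rough Python sketch. It only parses the `git shortlog -s` listing shown earlier; the `parse_shortlog` helper is my own illustration and isn't part of git or Oh-My-Vagrant.

```python
# Sketch: compute each author's share of commits from `git shortlog -s` output.
# SHORTLOG is copied from the listing in this post; parse_shortlog is a
# hypothetical helper written for this example.
SHORTLOG = """\
   165    James Shubin
     5    Vasyl Kaigorodov
     4    Randy Barlow
     2    Scott Collier
     2    Milan Zink
     2    Christoph Görn
     2    aweiteka
     1    scollier
     1    Russell Tweed
     1    ncoghlan
     1    John Browning
     1    Flavio Fernandes
     1    Carsten Clasohm
"""


def parse_shortlog(text):
    """Return a list of (author, commits) tuples, one per non-empty line."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        count, author = line.split(None, 1)  # the count comes first, then the name
        rows.append((author.strip(), int(count)))
    return rows


rows = parse_shortlog(SHORTLOG)
total = sum(commits for _, commits in rows)
shares = {author: 100.0 * commits / total for author, commits in rows}

print("total commits:", total)
for author, commits in rows:
    print("%-20s %3d  %5.1f%%" % (author, commits, shares[author]))
```

Note that this counts names exactly as git reports them, so an author who committed under two different names shows up twice.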

Contributions are important!

Contributions are important, and patches especially help. Patches from your users are what make something a community project as opposed to two separate camps of consumers and producers. It’s about time we singled out some of those contributors!

Vasyl Kaigorodov

Vasyl is a great hacker who first fixed the broken networking in OMV. Before his work merged, it was not possible to run two different OMV environments at the same time. Now networking makes a lot more sense. Unfortunately the GitHub contributors graph doesn’t acknowledge his work because he doesn’t have a GitHub account. Shame on them!

Randy Barlow (bowlofeggs)

Randy came up with the idea for “mainstream mode”, and while his initial proof of concept didn’t quite work, the idea was good. His time budget didn’t afford the project this new feature, but he has sent in some other patches, including some tweaks used by the Pulp Vagrantfile. He’s got a patch or two pending on his TODO list which we’re looking forward to, as he finishes the work to port Pulp to OMV.

Scott Collier

Scott is a great model user. He gets very enthusiastic, he’s great at testing things out and complaining if they don’t behave as he’d like, and if you’re lucky, you can browbeat him into writing a quick patch or two. He actually has three commits in the project so far, which would show up correctly above if he had set his git user variables correctly 😉 Thanks for spending the time to deal with OMV when there was a lot more cruft, and fewer features. I look forward to your next patch!

Milan Zink

Milan is a ruby expert who fixed the ruby xdg bugs we had in an earlier version of the project. Because of his work, new users don’t even realize that there was ever an issue!

Christoph Görn

Christoph has been an invaluable promoter and dedicated user of the project. His work pushing OMV to the limit has generated real world requirements and feature requests, which have made the project useful for real users! It’s been hard to say no when he opens an issue ticket, but I’ve been able to force him to write a patch or two as well.

Russell Tweed

Russell is a new OMV user who jumped right into the code and sent in a patch for adding an arbitrary number of disks to OMV machines. He’s a first-time contributor, and I thank him for his patch and for withstanding the number of reviews it had to go through. It’s finally merged, even though we might have let one bug (now fixed) slip in too. I particularly like his patch, because I actually wrote the initial patch to add extra disks support to vagrant-libvirt, and I’m pleased to see it get used one level up!

John Browning

John actually found an edge case in the subscription manager code and after an interesting discussion, patched the issue. More users means more edge cases will fall out! Thanks John!

Flavio Fernandes

Even though Flavio is an OSX user, we’re thankful that he wrote and tested the virtualbox patch for OMV. OMV still needs an installer for OSX + mainstream mode, but once that’s done, we know the rest will work great!

Carsten Clasohm

Carsten actually wrote a lovely patch for a subtle OMV issue that is very hard to reproduce. I was able to merge his patch on the first review, and in fact it looked nicer than how I would have written it!

Nick Coghlan

Nick is actually a python hacker, so getting a ruby contribution proved a bit tricky! Fortunately, he is also a master of words, and helped clean up the documentation a bit. We’d love to get a few more doc patches if you have the time and some love!

Aaron Weitekamp

Even though aweiteka (as we call him) has only added five lines of source (2 of which were comments), he was an early user and tester, and we thank him for his contributions! Hopefully we’ll see him in our commit logs in the future!

Máirín Duffy

Máirín is a talented artist who does great work using free tools. I asked her if she’d be kind enough to make us a logo, and I’ll hopefully be able to show it to you soon!

Everyone else

To everyone else who isn’t in the commit log yet, thank you for using and testing OMV, finding bugs, opening issues and even for your social media love in getting the word out! I hope to get a patch from you soon!

The power of the unknown user

They’re sometimes hard to measure, but a recently introduced bug was reported to me independently by two different (and previously unknown) users very soon after the bug was introduced! I’m sorry for letting the bug in, but I am glad that people picked up on it so quickly! I’d love to have your help in improving our automated test infrastructure!

The AUTHORS file

Every good project needs a “hall of fame” for its contributors. That’s why, starting today, there is an AUTHORS file, and if you’re a contributor, we urge you to send a one-line patch with your name, so it can be immortalized in the project forever. We could try to generate this file with git log, but that would remove the prestige behind getting your first and second patches in. If you’re not in the AUTHORS file, and you should be, send me your patch already!

Version 1.0.0

I think it’s time. The project deserves a 1.0.0 release, and I’ve now made it so. Please share and enjoy!

I hope you enjoy this project, and I look forward to receiving your patch.

Happy Hacking!


PS: Thanks to Brian Bouterse for encouraging me to focus on community, and for inspiring me to write this post!

by on July 23, 2015

Git archive with submodules and tar magic

Git submodules are actually a very beautiful thing. You might prefer the word powerful or elegant, but that’s not the point. The downside is that they are sometimes misused, so as always, use with care. I’ve used them in projects like puppet-gluster, oh-my-vagrant, and others. If you’re not familiar with them, do a bit of reading and come back later, I’ll wait.

I recently did some work packaging Oh-My-Vagrant as RPM’s. My primary goal was to make sure the entire process was automatic, as I have no patience for manually building RPM’s. Any good packager knows that the prerequisite for building an SRPM is a source tarball, and I wanted to build those automatically too.

Simply running a tar -cf on my source directory wouldn’t work, because I only want to include files that are stored in git. Thankfully, git comes with a tool called git archive, which does exactly that! No scary tar commands required:

Nobody likes tar

Here’s how you might run it:

$ git archive --prefix=some-project/ -o output.tar.bz2 HEAD

Let’s decompose:

The --prefix argument prepends a string prefix onto every file in the archive. Therefore, if you’d like the root directory to be named some-project, pass that string with a trailing slash, and you’ll have everything nested inside a directory!

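The -o flag predictably picks the output file, and git infers the format from the file name where it can. Using .tar.bz2 is quite common, but be aware that a stock git only recognizes tar, zip, tar.gz and tgz. Unless you configure a tar.<format>.command filter, any other suffix quietly gets you a plain uncompressed tar.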

Lastly, the HEAD portion at the end specifies which git tree to pull the files from. I usually specify a git tag here, but you can specify a commit id if you prefer.

Obligatory, “make this article more interesting” meme image.

This is all well and good, but unfortunately, when I open my newly created archive, it is notably missing my git submodules! It would probably make sense for there to be an upstream option so that a --recursive flag would do this magic for you, but unfortunately it doesn’t exist yet.

There are a few scripts floating around that can do this, but I wanted something small, and without any real dependencies, that I can embed in my project Makefile, so that it’s all self-contained.

Here’s what that looks like:

    @echo Running git archive...
    # use HEAD if tag doesn't exist yet, so that development is easier...
    git archive --prefix=oh-my-vagrant-$(VERSION)/ -o $(SOURCE) $(VERSION) 2> /dev/null || (echo 'Warning: $(VERSION) does not exist.' && git archive --prefix=oh-my-vagrant-$(VERSION)/ -o $(SOURCE) HEAD)
    # TODO: if git archive had a --submodules flag this would be easier!
    @echo Running git archive submodules...
    # i thought i would need --ignore-zeros, but it doesn't seem necessary!
    p=`pwd` && (echo .; git submodule foreach) | while read entering path; do \
        path=$${path#\'}; path=$${path%\'}; \
        [ "$$path" = "" ] && continue; \
        (cd $$path && git archive --prefix=oh-my-vagrant-$(VERSION)/$$path/ HEAD > $$p/rpmbuild/tmp.tar && tar --concatenate --file=$$p/$(SOURCE) $$p/rpmbuild/tmp.tar && rm $$p/rpmbuild/tmp.tar); \
    done

This is a bit tricky to read, so I’ll try to break it down. Remember, double dollar signs are used in Make syntax for embedded bash code since a single dollar sign is a special Make identifier. The $(VERSION) variable corresponds to the version of the project I’m building, which matches a git tag that I’ve previously created. $(SOURCE) corresponds to an output file name, ending in the .tar.bz2 suffix.

    p=`pwd` && (echo .; git submodule foreach) | while read entering path; do \

In this first line, we store the current working directory for use later, and then loop through the output of the git submodule foreach command. That output normally looks something like this:

james@computer:~/code/oh-my-vagrant$ git submodule foreach 
Entering 'vagrant/gems/xdg'
Entering 'vagrant/kubernetes/templates/default'
Entering 'vagrant/p4h'
Entering 'vagrant/puppet/modules/module-data'
Entering 'vagrant/puppet/modules/puppet'
Entering 'vagrant/puppet/modules/stdlib'
Entering 'vagrant/puppet/modules/yum'

As you can see, the read command above eats up the Entering string, and pulls the quoted path into the second variable, path. The next part of the code:

        path=$${path#\'}; path=$${path%\'}; \
        [ "$$path" = "" ] && continue; \

uses bash idioms to remove the two single quotes that wrap our string, and then skips over any empty value of the path variable in our loop (the leading echo . yields a line with no path at all). Lastly, for each submodule found, we first switch into that directory:

        (cd $$path &&

Run a normal git archive command and create a plain uncompressed tar archive in a temporary directory:

git archive --prefix=oh-my-vagrant-$(VERSION)/$$path/ HEAD > $$p/rpmbuild/tmp.tar &&

Then use the magic of tar to overlay this new tar file, on top of the source file that we’re now building up with each iteration of this loop, and then remove the temporary file.

tar --concatenate --file=$$p/$(SOURCE) $$p/rpmbuild/tmp.tar && rm $$p/rpmbuild/tmp.tar); \

Finally, we end the loop:

    done

Boom, magic! Short, concise, and without any dependencies but bash and git.
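If shell one-liners inside a Makefile aren't your thing, the same two steps (parsing the foreach output, then overlaying one tar on top of another) can be sketched in Python with the standard tarfile module. This is only an illustration and not what the project does: `make_tar` stands in for `git archive --prefix=...`, the file contents are invented, and `1.0.0` is a placeholder for $(VERSION).

```python
import io
import tarfile

# Two lines in the style of the `git submodule foreach` output shown above.
FOREACH_OUTPUT = """\
Entering 'vagrant/gems/xdg'
Entering 'vagrant/p4h'
"""


def submodule_paths(foreach_output):
    """Pull the quoted path out of each "Entering '...'" line."""
    paths = []
    for line in foreach_output.splitlines():
        entering, _, path = line.partition(" ")
        path = path.strip().strip("'")  # drop the wrapping single quotes
        if entering != "Entering" or not path:
            continue  # skip lines without a path, such as the "." line
        paths.append(path)
    return paths


def make_tar(files, prefix):
    """Build an uncompressed tar in memory (stand-in for git archive)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(prefix + name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def overlay(base, extra):
    """Append every member of `extra` to `base`, like tar --concatenate."""
    out = io.BytesIO(base)
    with tarfile.open(fileobj=out, mode="a") as dst, \
            tarfile.open(fileobj=io.BytesIO(extra), mode="r") as src:
        for member in src:
            fileobj = src.extractfile(member) if member.isfile() else None
            dst.addfile(member, fileobj)
    return out.getvalue()


top = "oh-my-vagrant-1.0.0/"  # placeholder version
base = make_tar({"README.md": b"top level\n"}, top)
for path in submodule_paths(FOREACH_OUTPUT):
    sub = make_tar({"file.txt": b"from a submodule\n"}, top + path + "/")
    base = overlay(base, sub)

with tarfile.open(fileobj=io.BytesIO(base)) as tar:
    names = tar.getnames()
print("\n".join(names))
```

Like tar --concatenate, appending this way only works on an uncompressed archive, so any compression has to happen after the last submodule has been added.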

Nobody should have to figure that out by themselves, and I wish it was built in to git, but until then, here’s how it’s done! Many thanks to #git on IRC for pointing me in the right direction.

This is the commit where I landed this patch for oh-my-vagrant, if you’re curious to see this in the wild. Now that this is done, I can definitely say that it was worth the time:

Is it worth the time? In this case, it was.

With this feature merged, along with my automatic COPR builds, a simple ‘make rpm’ causes all of this automation to happen, and delivers a fresh build from git in a few minutes.

I hope you enjoyed this technique, and I hope you have some coding skills to get this feature upstream in git.

Happy Hacking,


by on October 18, 2014

Hacking out an Openshift app

I had an itch to scratch, and I wanted to get a bit more familiar with Openshift. I had used it in the past, but it was time to have another go. The app and the code are now available. Feel free to check out:

This is a simple app that takes the URL of a markdown file on GitHub, and outputs a pandoc converted PDF. I wanted to use pandoc specifically, because it produces PDF’s that were beautifully created with LaTeX. To embed a link in your upstream documentation that points to a PDF, just append the file’s URL to this app’s url, under a /pdf/ path. For example:

will send you to a PDF of the puppet-gluster documentation. This will make it easier to accept questions as FAQ patches, without needing to have the git embedded binary PDF be constantly updated.
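If you want to generate such links from a script, the rule is plain string concatenation. Here's a tiny Python sketch; both URLs are placeholders I made up (not the real app or repository addresses), and whether the app expects the scheme to be included in the appended URL is an assumption on my part.

```python
# Sketch: build a link to the PDF version of a markdown file.
# pdf_link is a hypothetical helper; the host names below are placeholders.

def pdf_link(app_url, markdown_url):
    """Append the markdown file's URL to the app's URL, under /pdf/."""
    return app_url.rstrip("/") + "/pdf/" + markdown_url


link = pdf_link(
    "https://pdfdoc.example.com/",
    "https://github.com/example/project/blob/master/DOCUMENTATION.md",
)
print(link)
```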

If you want to hear more about what I did, read on…

The setup:

Start by getting a free Openshift account. You’ll also want to install the client tools. Nothing is worse than having to interact with your app via a web interface. Hackers use terminals. Luckily, the Openshift team knows this, and they’ve created a great command line tool called rhc to make it all possible.

I started by following their instructions:

$ sudo yum install rubygem-rhc
$ sudo gem update rhc

Unfortunately, this left me with a problem:

$ rhc
/usr/share/rubygems/rubygems/dependency.rb:298:in `to_specs': Could not find 'rhc' (>= 0) among 37 total gem(s) (Gem::LoadError)
    from /usr/share/rubygems/rubygems/dependency.rb:309:in `to_spec'
    from /usr/share/rubygems/rubygems/core_ext/kernel_gem.rb:47:in `gem'
    from /usr/local/bin/rhc:22:in `'

I solved this by running:

$ gem install rhc

This makes my user-installed rhc take precedence over the system one. Then run:

$ rhc setup

and the rhc client will take you through some setup steps such as uploading your public ssh key to the Openshift infrastructure. The beauty of this tool is that it will work with the Red Hat hosted infrastructure, or you can use it with your own infrastructure if you want to host your own Openshift servers. This alone means you’ll never get locked in to a third-party provider’s terms or pricing.

Create a new app:

To get a fresh python 3.3 app going, you can run:

$ rhc create-app <appname> python-3.3

From this point on, it’s fairly straightforward, and you can now hack your way through the app in python. To push a new version of your app into production, it’s just a git commit away:

$ git add -p && git commit -m 'Awesome new commit...' && git push && rhc tail

Creating a new app from existing code:

If you want to push a new app from an existing code base, it’s as easy as:

$ rhc create-app awesomesauce python-3.3 --from-code
Application Options
Domain:      purpleidea
Cartridges:  python-3.3
Source Code:
Gear Size:   default
Scaling:     no

Creating application 'awesomesauce' ... done

Waiting for your DNS name to be available ... done

Cloning into 'awesomesauce'...
The authenticity of host ' (' can't be established.
RSA key fingerprint is 00:11:22:33:44:55:66:77:88:99:aa:bb:cc:dd:ee:ff.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added ',' (RSA) to the list of known hosts.

Your application 'awesomesauce' is now available.

  SSH to:
  Git remote: ssh://
  Cloned to:  /home/james/code/awesomesauce

Run 'rhc show-app awesomesauce' for more details about your app.

In my case, my app also needs some binaries installed. I haven’t yet automated this process, but I think it can be done by creating a custom cartridge. Help to do this would be appreciated!

Updating your app:

In the case of an app that I already deployed with this method, updating it from the upstream source is quite easy. You just pull down any relevant commits, and then push them up to your app’s git repo:

$ git pull upstream master 
 * branch            master     -> FETCH_HEAD
Updating 5ac5577..bdf9601
Fast-forward | 2 --
 1 file changed, 2 deletions(-)
$ git push origin master 
Counting objects: 7, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 312 bytes | 0 bytes/s, done.
Total 3 (delta 2), reused 0 (delta 0)
remote: Stopping Python 3.3 cartridge
remote: Waiting for stop to finish
remote: Waiting for stop to finish
remote: Building git ref 'master', commit bdf9601
remote: Activating virtenv
remote: Checking for pip dependency listed in requirements.txt file..
remote: You must give at least one requirement to install (see "pip help install")
remote: Running script..
remote: running develop
remote: running egg_info
remote: creating pdfdoc.egg-info
remote: writing pdfdoc.egg-info/PKG-INFO
remote: writing dependency_links to pdfdoc.egg-info/dependency_links.txt
remote: writing top-level names to pdfdoc.egg-info/top_level.txt
remote: writing manifest file 'pdfdoc.egg-info/SOURCES.txt'
remote: reading manifest file 'pdfdoc.egg-info/SOURCES.txt'
remote: writing manifest file 'pdfdoc.egg-info/SOURCES.txt'
remote: running build_ext
remote: Creating /var/lib/openshift/00112233445566778899aabb/app-root/runtime/dependencies/python/virtenv/venv/lib/python3.3/site-packages/pdfdoc.egg-link (link to .)
remote: pdfdoc 0.0.1 is already the active version in easy-install.pth
remote: Installed /var/lib/openshift/00112233445566778899aabb/app-root/runtime/repo
remote: Processing dependencies for pdfdoc==0.0.1
remote: Finished processing dependencies for pdfdoc==0.0.1
remote: Preparing build for deployment
remote: Deployment id is 9c2ee03c
remote: Activating deployment
remote: Starting Python 3.3 cartridge (Apache+mod_wsgi)
remote: Application directory "/" selected as DocumentRoot
remote: Application "" selected as default WSGI entry point
remote: -------------------------
remote: Git Post-Receive Result: success
remote: Activation status: success
remote: Deployment completed with status: success
To ssh://
   5ac5577..bdf9601  master -> master

Final thoughts:

I hope this helped you get going with Openshift. Feel free to send me patches!

Happy hacking!


by on October 10, 2014

Continuous integration for Puppet modules

I just patched puppet-gluster and puppet-ipa to bring their infrastructure up to date with the current state of affairs…

What’s new?

  • Better README’s
  • Rake syntax checking (fewer oopsies)
  • CI (testing) with travis on git push (automatic testing for everyone)
  • Use of .pmtignore to ignore files from puppet module packages (finally)
  • Pushing modules to the forge with blacksmith (sweet!)

This last point deserves another mention. Puppetlabs created the “forge” to try to provide some sort of added value to their stewardship. Personally, I like to look for code on github instead, but nevertheless, some do use the forge. The problem is that to upload new releases, you need to click your mouse like a windows user! Someone has finally solved that problem! If you use blacksmith, a new build is just a rake push away!

Have a look at this example commit if you’re interested in seeing the plumbing.

Better documentation and FAQ answering:

I’ve answered a lot of questions by email, but this only helps out individuals. From now on, I’d appreciate if you asked your question in the form of a patch to my FAQ. (puppet-gluster, puppet-ipa)

I’ll review and merge your patch, including a follow-up patch with the answer! This way you’ll get more familiar with git and sending small patches, everyone will benefit from the response, and I’ll be able to point you to the docs (and even a specific commit) to avoid responding to already answered questions. You’ll also have the commit information of someone else who already had this problem. Cool, right?

Happy hacking,


by on September 3, 2014

Introducing: Oh My Vagrant!

If you’re a reader of my code or of this blog, it’s no secret that I hack on a lot of puppet and vagrant. Recently I’ve fooled around with a bit of docker, too. I realized that the vagrant environments I built for puppet-gluster and puppet-ipa needed to be generalized, and they needed new features too. Therefore…

Introducing: Oh My Vagrant!

Oh My Vagrant is an attempt to provide an easy to use development environment so that you can be up and hacking quickly, and focusing on the real devops problems. The README explains my choice of project name.


I use a Fedora 20 laptop with vagrant-libvirt. Efforts are underway to create an RPM of vagrant-libvirt, but in the meantime you’ll have to read: Vagrant on Fedora with libvirt (reprise). This should work with other distributions too, but I don’t test them very often. Please step up and help test :)

The bits:

First clone the oh-my-vagrant repository and look inside:

git clone --recursive
cd oh-my-vagrant/vagrant/

The included Vagrantfile is the current heart of this project. You’re welcome to use it as a template and edit it directly, or you can use the facilities it provides. I’d recommend starting with the latter, which I’ll walk you through now.

Getting started:

Start by running vagrant status (vs) and taking a look at the vagrant.yaml file that appears.

james@computer:/oh-my-vagrant/vagrant$ ls
Dockerfile  puppet/  Vagrantfile
james@computer:/oh-my-vagrant/vagrant$ vs
Current machine states:

template1                 not created (libvirt)

The Libvirt domain is not created. Run `vagrant up` to create it.
james@computer:/oh-my-vagrant/vagrant$ cat vagrant.yaml 
:image: centos-7.0
:sync: rsync
:puppet: false
:docker: false
:cachier: false
:vms: []
:namespace: template
:count: 1
:username: ''
:password: ''
:poolid: []
:repos: []

Here you’ll see the list of resultant machines that vagrant thinks is defined (currently just template1), and a bunch of different settings in YAML format. The values of these settings help define the vagrant environment that you’ll be hacking in.

Changing settings:

The settings exist so that your vagrant environment is dynamic and can be changed quickly. You can change the settings by editing the vagrant.yaml file. They will be used by vagrant when it runs. You can also change them at runtime with --vagrant-foo flags. Running a vagrant status will show you how vagrant currently sees the environment. Let’s change the number of machines that are defined. Note the location of the --vagrant-count flag and how it doesn’t work when positioned incorrectly.

james@computer:/oh-my-vagrant/vagrant$ vagrant status --vagrant-count=4
An invalid option was specified. The help for this command
is available below.

Usage: vagrant status [name]
    -h, --help                       Print this help
james@computer:/oh-my-vagrant/vagrant$ vagrant --vagrant-count=4 status
Current machine states:

template1                 not created (libvirt)
template2                 not created (libvirt)
template3                 not created (libvirt)
template4                 not created (libvirt)

This environment represents multiple VMs. The VMs are all listed
above with their current state. For more information about a specific
VM, run `vagrant status NAME`.
james@computer:/oh-my-vagrant/vagrant$ cat vagrant.yaml 
:image: centos-7.0
:sync: rsync
:puppet: false
:docker: false
:cachier: false
:vms: []
:namespace: template
:count: 4
:username: ''
:password: ''
:poolid: []
:repos: []

As you can see in the above example, changing the count variable to 4 causes vagrant to see a possible four machines in the vagrant environment. You can change as many of these parameters at a time as you like by using the --vagrant- flags, or you can edit the vagrant.yaml file. The latter is much easier and more expressive, in particular for expressing complex data types. The former is much more powerful when building one-liners, such as:

vagrant --vagrant-count=8 --vagrant-namespace=gluster up gluster{1..8}

which should bring up eight hosts in parallel, named gluster1 to gluster8.
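To make the naming rule concrete, here is a small Python sketch of how the flags and the saved settings might combine into machine names. The real logic lives in the Ruby Vagrantfile; `apply_flags` and `machine_names` are hypothetical names of mine, and the flag parsing is simplified to the `--vagrant-KEY=VALUE` form used above.

```python
# Sketch only: mimics how --vagrant-* flags override saved settings and how
# the numbered machine names are derived. Not the project's actual code.

def apply_flags(settings, argv):
    """Return a copy of settings with any --vagrant-KEY=VALUE flags applied."""
    merged = dict(settings)
    for arg in argv:
        if not arg.startswith("--vagrant-") or "=" not in arg:
            continue  # not one of our flags (e.g. the `up` subcommand)
        key, value = arg[len("--vagrant-"):].split("=", 1)
        merged[key] = int(value) if value.isdigit() else value
    return merged


def machine_names(settings):
    """Numbered machines: the namespace followed by 1..count."""
    return ["%s%d" % (settings["namespace"], i)
            for i in range(1, settings["count"] + 1)]


saved = {"namespace": "template", "count": 1}  # as read from vagrant.yaml
flags = ["--vagrant-count=8", "--vagrant-namespace=gluster", "up"]
settings = apply_flags(saved, flags)

print(machine_names(saved))     # the defaults
print(machine_names(settings))  # after applying the flags
```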

Other VM’s:

Since one often wants to be more expressive in machine naming and heterogeneity of machine type, you can specify a list of machines to define in the vagrant.yaml file vms array. If you’d rather define these machines in the Vagrantfile itself, you can also set them up in the vms array defined there. It is empty by default, but it is easy to uncomment one of the many examples. These will be used as the defaults if nothing else overrides the selection in the vagrant.yaml file. I’ve uncommented a few to show you this functionality:

james@computer:/oh-my-vagrant/vagrant$ grep example[124] Vagrantfile 
    {:name => 'example1', :docker => true, :puppet => true, },    # example1
    {:name => 'example2', :docker => ['centos', 'fedora'], },    # example2
    {:name => 'example4', :image => 'centos-6', :puppet => true, },    # example4
james@computer:/oh-my-vagrant/vagrant$ rm vagrant.yaml # note that I remove the old settings
james@computer:/oh-my-vagrant/vagrant$ vs
Current machine states:

template1                 not created (libvirt)
example1                  not created (libvirt)
example2                  not created (libvirt)
example4                  not created (libvirt)

This environment represents multiple VMs. The VMs are all listed
above with their current state. For more information about a specific
VM, run `vagrant status NAME`.
james@computer:/oh-my-vagrant/vagrant$ cat vagrant.yaml 
:image: centos-7.0
:sync: rsync
:puppet: false
:docker: false
:cachier: false
:vms:
- :name: example1
  :docker: true
  :puppet: true
- :name: example2
  :docker:
  - centos
  - fedora
- :name: example4
  :image: centos-6
  :puppet: true
:namespace: template
:count: 1
:username: ''
:password: ''
:poolid: []
:repos: []
james@computer:/oh-my-vagrant/vagrant$ vim vagrant.yaml # edit vagrant.yaml file...
james@computer:/oh-my-vagrant/vagrant$ cat vagrant.yaml 
:image: centos-7.0
:sync: rsync
:puppet: false
:docker: false
:cachier: false
:vms:
- :name: example1
  :docker: true
  :puppet: true
- :name: example4
  :image: centos-7.0
  :puppet: true
:namespace: template
:count: 1
:username: ''
:password: ''
:poolid: []
:repos: []
james@computer:/oh-my-vagrant/vagrant$ vs
Current machine states:

template1                 not created (libvirt)
example1                  not created (libvirt)
example4                  not created (libvirt)

This environment represents multiple VMs. The VMs are all listed
above with their current state. For more information about a specific
VM, run `vagrant status NAME`.

The above output might seem a little long, but if you try these steps out in your terminal, you should get the hang of it fairly quickly. If you poke around in the Vagrantfile, you should see the format of the vms array. Each element in the array should be a dictionary, where the keys correspond to the flags you wish to set. Look at the examples if you need help with the formatting.

Other settings:

As you saw, other settings are available. There are a few notable ones that are worth mentioning. This will also help explain some of the other features that this Vagrantfile provides.

  • domain: This sets the domain part of each vm’s FQDN. The default is, which should work for most environments, but you’re welcome to change this as you see fit.
  • network: This sets the network that is used for the vm’s. You should pick a network/cidr that doesn’t conflict with any other networks on your machine. This is particularly useful when you have multiple vagrant environments hosted off of the same laptop.
  • image: This is the default base image to use for each machine. It can be overridden per-machine in the vm’s list of dictionaries.
  • sync: This is the sync type used for vagrant. rsync is the default and works in all environments. If you’d prefer to fight with the nfs mounts, or try out 9p, both those options are available too.
  • puppet: This option enables or disables integration with puppet. It is possible to override this per machine. This functionality will be expanded in a future version of Oh My Vagrant.
  • docker: This option enables and lists the docker images to set up per vm. It is possible to override this per machine. This functionality will be expanded in a future version of Oh My Vagrant.
  • namespace: This sets the namespace that your Vagrantfile operates in. This value is used as a prefix for the numbered vm’s, as the libvirt network name, and as the primary puppet module to execute.
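The per-machine overrides mentioned above boil down to a dictionary merge. As a rough Python sketch (the Vagrantfile itself is Ruby, and `effective` is a name I made up), using the example machines from earlier:

```python
# Sketch: per-machine settings fall back to the global defaults.
# GLOBAL and VMS mirror the vagrant.yaml and Vagrantfile examples above.
GLOBAL = {"image": "centos-7.0", "puppet": False, "docker": False}

VMS = [
    {"name": "example1", "docker": True, "puppet": True},
    {"name": "example2", "docker": ["centos", "fedora"]},
    {"name": "example4", "image": "centos-6", "puppet": True},
]


def effective(vm, global_settings):
    """Start from the global settings, then let the machine's keys win."""
    merged = dict(global_settings)
    merged.update(vm)
    return merged


for vm in VMS:
    print(effective(vm, GLOBAL))
```

Only the keys a machine sets are overridden, so example2 keeps the default image and puppet setting while getting its own list of docker images.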

More on the docker option:

For now, if you specify a list of docker images, they will be automatically pulled into your vm environment. It is recommended that you pre-cache them in an existing base image to save bandwidth. Custom base vagrant images can easily be built with vagrant-builder, but this process is currently undocumented.

I’ll try to write up a post on this process if there are enough requests. To keep you busy in the meantime, I’ve published a CentOS 7 vagrant base image that includes docker images for CentOS and Fedora. It is being graciously hosted by the GlusterFS community.

What other magic does this all do?

There is a certain amount of magic glue that happens behind the scenes. Here’s a list of some of it:

  • Idempotent /etc/hosts based DNS
  • Easy docker base image installation
  • IP address calculations and assignment with ipaddr
  • Clever cleanup on ‘vagrant destroy’
  • Vagrant docker base image detection
  • Integration with Puppet

If you don’t understand what all of those mean, and you don’t want to go source diving, don’t worry about it! I will explain them in greater detail when it’s important, and hopefully for now everything “just works” and stays out of your way.
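For the curious, here is roughly what the first and third items amount to, sketched in Python rather than Ruby. Everything specific here is an assumption for illustration: the network, the starting offset of 100, the example.com domain and both function names are mine, and the Vagrantfile uses Ruby's ipaddr rather than Python's ipaddress module.

```python
import ipaddress


def assign_addresses(network, names, offset=100):
    """Give each machine the next address in the network, starting at offset."""
    net = ipaddress.ip_network(network)
    addresses = {}
    for i, name in enumerate(names):
        addr = net.network_address + offset + i
        if addr not in net:
            raise ValueError("network %s is too small" % network)
        addresses[name] = str(addr)
    return addresses


def ensure_hosts(lines, ip, fqdn):
    """Idempotent: add the entry only if it isn't already present."""
    entry = "%s %s" % (ip, fqdn)
    if entry in lines:
        return list(lines)
    return list(lines) + [entry]


addresses = assign_addresses("192.168.123.0/24", ["template1", "template2"])
hosts = []
for _ in range(2):  # running it twice must not duplicate any entries
    for name, ip in sorted(addresses.items()):
        hosts = ensure_hosts(hosts, ip, name + ".example.com")
print("\n".join(hosts))
```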

Future work:

There’s still a lot more that I have planned, and some parts of the Vagrantfile need clean up, but I figured I’d try and release this early so that you can get hacking right away. If it’s useful to you, please leave a comment and let me know.

Happy hacking,



by on August 27, 2014

Rough data density calculations

Seagate has just publicly announced 8TB HDD’s in a 3.5″ form factor. I decided to do some rough calculations to understand the density a bit better…

Note: I have decided to ignore the distinction between Terabytes (TB) and Tebibytes (TiB), since I always work in base 2, but I hate the -bi naming conventions. Seagate is most likely announcing an 8TB HDD, which is actually smaller than a true 8TiB drive. If you don’t know the difference it’s worth learning.

Rack Unit Density:

Supermicro sells a high density, double-sided 4U server, which can hold 90 x 3.5″ drives. This means you can easily store:

90 * 8TB = 720TB in 4U,


720TB/4U = 180TB per U.

To store a petabyte of data, since:

1PB = 1024TB,

we need:

1024TB/180TB/U = 5.69 U.

Rounding up we realize that we can easily store one petabyte of raw data in 6U.

Since an average rack is usually 42U (tall racks can be 48U) that means we can store between seven and eight PB per rack:

42U/rack / 6U/PB = 7PB/rack

48U/rack / 6U/PB = 8PB/rack

If you can provide the power and cooling, you can quickly see that small data centers can easily get into exabyte scale if needed. One raw exabyte would only require:

1EB = 1024PB

1024PB/7PB/rack = 146.3 racks =~ 150 racks.
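If you’d like to replay this arithmetic, here is a minimal shell sketch using integer math only. The ceil_div helper and the variable names are hypothetical and only for illustration; the numbers (90 drives, 8TB, 4U, 42U) come from the text above.

```shell
# ceil_div <a> <b>: integer division rounded up, for positive integers
ceil_div() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

tb_per_u=$(( 90 * 8 / 4 ))                     # 90 drives * 8TB in 4U
u_per_pb=$(ceil_div 1024 "$tb_per_u")          # rack units needed for one PB
pb_per_rack=$(( 42 / u_per_pb ))               # whole PB in a 42U rack
racks_per_eb=$(ceil_div 1024 "$pb_per_rack")   # racks needed for one EB

echo "${tb_per_u}TB/U, ${u_per_pb}U/PB, ${pb_per_rack}PB/rack, ${racks_per_eb} racks/EB"
```

Because it rounds up at every step, this reports 147 racks per exabyte, which is still well within the rough estimate of 150.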

Raid and Redundancy:

Since you’ll most likely have lots of failures, I would recommend having some number of RAID sets per server, and perhaps a distributed file system like GlusterFS to replicate the data across different servers. Suppose you broke each 90 drive server into five separate RAID 6 bricks for GlusterFS:

90/5 = 18 drives per brick.

In RAID 6, you lose two drives to parity, so that means:

18 drives – 2 drives = 16 drives per brick of usable storage.

16 drives * 5 bricks * 8 TB = 640 TB after RAID 6 in 4U.

640TB/4U = 160TB/U

1024TB / 160TB/U = 6.4U per PB, so 42U / 6.4U per PB =~ 6.5PB/rack.

Since I rounded a lot, the result is similar. With a replica count of 2 in a standard GlusterFS configuration, you average a total of about 3-4PB of usable storage per rack. Need a petabyte scale filesystem? One rack should do it!
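Here is a rough sketch of the RAID 6 and replica arithmetic in shell. It makes one assumption that differs from the per-U numbers above: it only counts whole 4U servers, so a 42U rack holds ten of them. The function name usable_tb_per_rack is hypothetical.

```shell
# usable_tb_per_rack <replica_count>: TB left after RAID 6 and replication
usable_tb_per_rack() {
    local drives=90 bricks=5 parity=2 tb=8    # numbers from the post
    local server_u=4 rack_u=42
    # (18 drives per brick - 2 parity) * 5 bricks = 80 data drives per server
    local data_drives=$(( (drives / bricks - parity) * bricks ))
    # assumption: only whole servers count, so 42U / 4U = 10 servers
    local servers=$(( rack_u / server_u ))
    echo $(( data_drives * tb * servers / $1 ))
}

usable_tb_per_rack 1   # after RAID 6 only
usable_tb_per_rack 2   # with a GlusterFS replica count of 2
```

With a replica count of 2 this comes out to 3200TB, or a little over 3PB per rack, which agrees with the rough figure above.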

Other considerations:

  • Remember that you need to take into account space for power, cooling and networking.
  • Keep in mind that SMR might be used to increase density even further (unless it’s already being used on these drives).
  • Remember that these calculations were done to understand the order of magnitude, and not to get a precise measurement on the size of a planned cluster.
  • Petabyte scale is starting to feel small…


Storage is getting very inexpensive. After the above analysis, I feel safe in concluding that:

  1. Puppet-Gluster could easily automate a petabyte scale filesystem.
  2. I have an embarrassingly small amount of personal storage.

Hope this was fun,

Happy hacking,



Disclaimer: I have not tried the 8TB Seagate HDD’s, or the Supermicro 90 x 3.5″ servers, but if you are building a petabyte scale cluster with GlusterFS/Puppet-Gluster, I’d like to hear about it!


by on June 4, 2014

Hiera data in modules and OS independent puppet

Earlier this year, R.I. Pienaar released his brilliant data in modules hack. A few months ago, I got the chance to start implementing it in Puppet-Gluster, and today I have found the time to blog about it.

What is it?

R.I.’s hack lets you store hiera data inside a puppet module. This can have many uses including letting you throw out the nested mess that is commonly params.pp, and replace it with something file based that is elegant and hierarchical. For my use case, I’m using it to build OS independent puppet modules, without storing this data as code. The secondary win is that porting your module to a new GNU/Linux distribution or version could be as simple as adding a YAML file.

How does it work?

(For the specifics on the hack in general, please read R.I. Pienaar’s blog post. After you’re comfortable with that, please continue…)

In the hiera.yaml data/ hierarchy, I define an OS / version structure that should probably cover all use cases. It looks like this:

- tree/%{::osfamily}/%{::operatingsystem}/%{::operatingsystemrelease}
- tree/%{::osfamily}/%{::operatingsystem}
- tree/%{::osfamily}
- common

At the bottom, you can specify common data, which can be overridden by OS family specific data (think RedHat “like” vs. Debian “like”), which can be overridden with operating system specific data (think CentOS vs. Fedora), which can finally be overridden with operating system version specific data (think RHEL6 vs. RHEL7).

Grouping the commonalities near the bottom of the tree avoids duplication, and makes it possible to support new OS versions with fewer changes. It would be especially cool if someone could write a script to refactor commonalities downwards, and to refactor new uniqueness upwards.

This is an excerpt of the Fedora specific YAML file:

gluster::params::package_glusterfs_server: 'glusterfs-server'
gluster::params::program_mkfs_xfs: '/usr/sbin/mkfs.xfs'
gluster::params::program_mkfs_ext4: '/usr/sbin/mkfs.ext4'
gluster::params::program_findmnt: '/usr/bin/findmnt'
gluster::params::service_glusterd: 'glusterd'
gluster::params::misc_gluster_reload: '/usr/bin/systemctl reload glusterd'

Since we use full paths in Puppet-Gluster, and since they are uniquely different in Fedora (no more: /bin) it’s nice to specify them all here. The added advantage is that you can easily drop in different versions of these utilities if you want to test a patched release without having to edit your system utilities. In addition, you’ll see that the OS specific RPM package name and service names are in here too. On a Debian system, they are usually different.
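To make the lookup order concrete, here is a small shell sketch that mimics how the hierarchy is walked: most specific file first, falling back towards common. It is not how hiera is implemented. The lookup function, the demo directory and the contents of the two YAML files are hypothetical, and the parsing is a naive one-line match rather than real YAML handling.

```shell
# lookup <key> <datadir> <osfamily> <operatingsystem> <release>
# Walk the hierarchy from most to least specific; the first file with the key wins.
lookup() {
    local key=$1 datadir=$2 osfamily=$3 os=$4 release=$5
    local level file value
    for level in \
        "tree/${osfamily}/${os}/${release}" \
        "tree/${osfamily}/${os}" \
        "tree/${osfamily}" \
        "common"
    do
        file="${datadir}/${level}.yaml"
        [ -f "$file" ] || continue
        # naive "key: 'value'" match; real hiera parses the YAML properly
        value=$(sed -n "s|^${key}: *||p" "$file" | head -n 1 | tr -d "'")
        if [ -n "$value" ]; then
            echo "$value"
            return 0
        fi
    done
    return 1
}

# hypothetical demo data, loosely based on the values shown in this post
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/tree/RedHat"
echo "gluster::params::program_mkfs_xfs: '/sbin/mkfs.xfs'" > "$demo/common.yaml"
echo "gluster::params::program_mkfs_xfs: '/usr/sbin/mkfs.xfs'" > "$demo/tree/RedHat/Fedora.yaml"

lookup gluster::params::program_mkfs_xfs "$demo" RedHat Fedora 20     # Fedora override wins
lookup gluster::params::program_mkfs_xfs "$demo" RedHat CentOS 6.5    # falls back to common
```

The Fedora lookup finds the value in tree/RedHat/Fedora.yaml, while the CentOS lookup has no more specific file and so ends up with whatever common.yaml provides.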


This depends on Puppet >= 3.x and having the puppet-module-data module included. I do so for integration with vagrant like so.

Should I still use params.pp?

I think that the answer is yes. I use a params.pp file with a single class specifying all the defaults:

class gluster::params(
    # packages...
    $package_glusterfs_server = 'glusterfs-server',

    $program_mkfs_xfs = '/sbin/mkfs.xfs',
    $program_mkfs_ext4 = '/sbin/mkfs.ext4',

    # services...
    $service_glusterd = 'glusterd',

    # misc...
    $misc_gluster_reload = '/sbin/service glusterd reload',

    # comment...
    $comment = ''
) {
    if "${comment}" == '' {
        warning('Unable to load yaml data/ directory!')
    }

    # ...
}

In my data/common.yaml I include a bogus comment canary so that I can trigger a warning if the data in modules module isn’t working. This shouldn’t be a fail as long as you want to allow backwards compatibility, otherwise it should be! The defaults I use correspond to the primary OS I hack and use this module with, which in this case is CentOS 6.x.

To use this data in your module, include the params.pp file, and start using it. Example:

include gluster::params
package { "${::gluster::params::package_glusterfs_server}":
    ensure => present,
}

Unfortunately, the readability isn’t nearly as nice as it is without this. However, it’s a necessary evil, due to limitations of the puppet language.

Common patterns:

There are a few common code patterns, which you might need for this technique. The first few, I’ve already mentioned above. These are the tree layout in hiera.yaml, the comment canary, and the params.pp defaults. There’s one more that you might find helpful…

The split package pattern:

Certain packages are split into multiple pieces on some operating systems, and grouped together on others. This means there isn’t always a one-to-one mapping between the data and the package type. For simple cases you can use a hiera array:

# this hiera value could be an array of strings...
package { $::some_module::params::package::some_package_list:
    ensure => present,
    alias => 'some_package',
}
service { 'foo':
    require => Package['some_package'],
}

For this to work you must always define at least one element in the array. For more complex cases you might need to test for the secondary package in the split:

if "${::some_module::params::package::some_package}" != '' {
    package { "${::some_module::params::package::some_package}":
        ensure => present,
        alias => 'some_package', # or use the $name and skip this
    }
}

service { 'foo':
    require => "${::some_module::params::package::some_package}" ? {
        '' => undef,
        default => Package['some_package'],
    },
}

This pattern is used in Puppet-Gluster in more than one place. It turns out that it’s also useful when optional python packages get pulled into the system python. (example)

Hopefully you found this useful. Please help increase the multi-os aspect of Puppet-Gluster by submitting patches to the YAML files, and by testing it on your favourite GNU/Linux distro!

Happy hacking!



by on May 13, 2014

Vagrant on Fedora with libvirt (reprise)

Vagrant has become the de facto tool for devops. Faster iterations, clean environments, and less overhead. This isn’t an article about why you should use Vagrant. This is an article about how to get up and running with Vagrant on Fedora with libvirt easily!


This article is an update of my original Vagrant on Fedora with libvirt article. There is still lots of good information in that article, but this one should be easier to follow and uses updated versions of Vagrant and vagrant-libvirt.

Why vagrant-libvirt?

Vagrant ships by default with support for virtualbox. This makes sense as a default since it is available on Windows, Mac, and GNU/Linux. Real hackers use GNU/Linux, and in my opinion the best tool for GNU/Linux is vagrant-libvirt. Proprietary, closed source platforms aren’t hackable and therefore aren’t cool!

Another advantage to using the vagrant-libvirt plugin is that it plays nicely with the existing ecosystem of libvirt tools. You can use virsh, virt-manager, and guestfish alongside Vagrant, and if your development work needs to go into production, you can be confident in knowing that it was already tested on the same awesome KVM virtualization platform that your servers run.


Let’s get going. What do you need?

  • A Fedora 20 machine

I recommend hardware that supports VT extensions. Most does these days. This should also work with other GNU/Linux distro’s, but I haven’t tested them.


I’m going to go through this in a logical hacking order. This means you could group all the yum install commands into a single execution at the beginning, but you would learn much less by doing so.

First install some of your favourite hacking dependencies. I did this on a minimal, headless F20 installation. You might want to add some of these too:

# yum install -y wget tree vim screen mtr nmap telnet tar git

Update the system to make sure it’s fresh:

# yum update -y

Download Vagrant version 1.5.4. No, don’t use the latest version, it probably won’t work! Vagrant has new releases practically as often as there are sunsets, and they typically cause lots of breakages.

$ wget

and install it:

# yum install -y vagrant_1.5.4_x86_64.rpm

RVM installation:

In order to get vagrant-libvirt working, you’ll need some ruby dependencies. It turns out that RVM seems to be the best way to get exactly what you need. Use the sketchy RVM installer:

# \curl -sSL | bash -s stable

If you don’t know why that’s sketchy, then you probably shouldn’t be hacking! I did that as root, but it probably works when you run it as a normal user. At this point rvm should be installed. The last important thing you’ll need to do is to add yourself to the rvm group. This is only needed if you installed rvm as root:

# usermod -aG rvm <username>

You’ll probably need to log out and log back in for this to take effect. Run:

$ groups

to make sure you can see rvm in the list. If you ran rvm as root, you’ll want to source the file:

$ source /etc/profile.d/

or simply use a new terminal. If you ran it as a normal user, I think RVM adds something to your ~/.bashrc. You might want to reload it:

$ source ~/.bashrc

At this point RVM should be working. Let’s see which ruby’s it can install:

$ rvm list known

Ruby version ruby-2.0.0-p353 seems closest to what is available on my Fedora 20 machine, so I’ll use that:

$ rvm install ruby-2.0.0-p353

If the exact patch number isn’t available, choose what’s closest. Installing ruby requires a bunch of dependencies. The rvm install command will ask yum for a bunch of dependencies, but if you’d rather install them yourself, you can run:

# yum install -y patch libyaml-devel libffi-devel glibc-headers autoconf gcc-c++ glibc-devel readline-devel zlib-devel openssl-devel bzip2 automake libtool bison

GEM installation:

Now we need the GEM dependencies for the vagrant-libvirt plugin. These GEM’s happen to have their own build dependencies, but thankfully I’ve already figured those out for you:

# yum install -y libvirt-devel libxslt-devel libxml2-devel

Now, install the nokogiri gem that vagrant-libvirt needs:

$ gem install nokogiri -v '1.5.11'

and finally we can install the actual vagrant-libvirt plugin:

$ vagrant plugin install --plugin-version 0.0.16 vagrant-libvirt

You don’t have to specify the --plugin-version 0.0.16 part, but doing so will make sure that you get a version that I have tested to be compatible with Vagrant 1.5.4, should a newer vagrant-libvirt release not be compatible with the Vagrant version you’re using. If you’re feeling brave, please test newer versions, report bugs, and write patches!

Making Vagrant more useful:

Vagrant should basically work at this point, but it’s missing some awesome. I’m proud to say that I wrote this awesome. I recommend my bash function and alias additions. If you’d like to include them, you can run:

$ wget
$ echo '. ~/' >> ~/.bashrc
$ . ~/.bashrc    # reload

to pull in my most used Vagrant aliases and functions. I’ve written about them before. If you’re interested, please read:

KVM/QEMU installation:

As I mentioned earlier, I’m assuming you have a minimal Fedora 20 installation, so you might not have all the libvirt pieces installed! Here’s how to install any potentially missing pieces:

# yum install -y libvirt{,-daemon-kvm}

This should pull in a whole bunch of dependencies too. You will need to start and (optionally) enable the libvirtd service:

# systemctl start libvirtd.service
# systemctl enable libvirtd.service

You’ll notice that I’m using the systemd commands instead of the deprecated service command. My biggest (only?) gripe with systemd is that the command line tools aren’t as friendly as they could be! The systemctl equivalent requires more typing, and makes it harder to start or stop the same service in quick succession, because it buries the action in the middle of the command instead of leaving it at the end!

The libvirtd service should finally be running. On my machine, it comes with a default network which got in the way of my vagrant-libvirt networking. If you want to get rid of it, you can run:

# virsh net-destroy default
# virsh net-undefine default

and it shouldn’t bother you anymore. One last hiccup. If it’s your first time installing KVM, you might run into bz#950436. To work around this issue, I had to run:

# rmmod kvm_intel
# rmmod kvm
# modprobe kvm
# modprobe kvm_intel

Without this “module re-loading” you might see this error:

Call to virDomainCreateWithFlags failed: internal error: Process exited while reading console log output: char device redirected to /dev/pts/2 (label charserial0)
Could not access KVM kernel module: Permission denied
failed to initialize KVM: Permission denied
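The “Permission denied” message is what QEMU reports when it can’t open the KVM device node. As a quick sanity check before and after reloading the modules, a sketch like the following might help. The kvm_accessible function name is hypothetical, and I’m assuming the device lives at the usual /dev/kvm path.

```shell
# kvm_accessible [device]: succeed if the device node exists and the
# current user can read and write it (defaults to /dev/kvm, an assumption)
kvm_accessible() {
    local dev="${1:-/dev/kvm}"
    [ -c "$dev" ] && [ -r "$dev" ] && [ -w "$dev" ]
}

if kvm_accessible; then
    echo "/dev/kvm is accessible"
else
    echo "/dev/kvm is missing or not accessible; try the module reload above"
fi
```

Run it as the user who will be running Vagrant. As root the read and write checks always pass, so the result wouldn’t tell you much.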

Additional installations:

To make your machine somewhat more palatable, you might want to consider installing bash-completion:

# yum install -y bash-completion

You’ll also probably want to add the PolicyKit (polkit) .pkla file that I recommend in my earlier article. Typically that means adding something like:

[Allow james libvirt management permissions]

as root to somewhere like:


Your machine should now be set up perfectly! The last thing you’ll need to do is to make sure that you get a Vagrantfile that does things properly! Here are some recommendations.

Shared folders:

Shared folders are a mechanism that Vagrant uses to pass data into (and sometimes out of) the virtual machines that it is managing. Typically you can use NFS, rsync, and some provider specific folder sharing like 9p. Using rsync is the simplest to set up, and works exceptionally well. Make sure you include the following line in your Vagrantfile:

config.vm.synced_folder './', '/vagrant', type: 'rsync'

If you want to see an example of this in action, you can have a look at my puppet-gluster Vagrantfile. If you are using the puppet apply provisioner, you will have to set it to use rsync as well:

puppet.synced_folder_type = 'rsync'

KVM performance:

Due to a regression in vagrant-libvirt, the default driver used for virtual machines is qemu. If you want to use the accelerated KVM domain type, you’ll have to set it:

libvirt.driver = 'kvm'

This typically gives me a 5x performance increase over plain qemu. This fix is available in the latest vagrant-libvirt version. The default has been set to KVM in the latest git master.
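To show where each of these settings lives, here is a sketch that writes a minimal Vagrantfile from the shell. The write_vagrantfile function is hypothetical, the base box is deliberately left as a placeholder for you to fill in, and the puppet block only matters if you use the puppet apply provisioner.

```shell
# write_vagrantfile <dir>: write a minimal Vagrantfile combining the settings above
write_vagrantfile() {
    cat > "$1/Vagrantfile" <<'EOF'
Vagrant.configure('2') do |config|
  # config.vm.box = '...'  # placeholder: set this to your own base box

  # sync the project directory into the guest with rsync
  config.vm.synced_folder './', '/vagrant', type: 'rsync'

  config.vm.provider :libvirt do |libvirt|
    libvirt.driver = 'kvm'  # accelerated KVM instead of plain qemu
  end

  # only needed if you use the puppet apply provisioner
  config.vm.provision :puppet do |puppet|
    puppet.synced_folder_type = 'rsync'
  end
end
EOF
}

# write into a scratch directory so that no existing Vagrantfile is clobbered
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
write_vagrantfile "$dir"
cat "$dir/Vagrantfile"
```

Treat the result as a starting point to compare against your own Vagrantfile, not as something to copy over it.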

Dear internets!

I think this was fairly straightforward. You could probably even put all of these commands in a shell script and just run it to get it all going. What we really need is proper RPM packaging. If you can help out, that would be excellent!

If we had a version of vagrant-libvirt alongside a matching Vagrant version in Fedora, then developers and hackers could target that, and we could easily exchange dev environments, hackers could distribute product demos as full vagrant-libvirt clusters, and I could stop having to write these types of articles 😉
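Until that packaging exists, the commands from this article that don’t depend on a download could be collected into a script. This is only a sketch: the run and as_root helpers, the DRY_RUN variable, and the use of sudo for the root steps are my assumptions. It prints the commands by default instead of running them. The Vagrant RPM and RVM installation steps are left out and still need to be done by hand first.

```shell
# Dry-run by default: print each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# assumption: root steps go through sudo
as_root() {
    run sudo "$@"
}

setup_vagrant_libvirt() {
    # build dependencies for the gems
    as_root yum install -y libvirt-devel libxslt-devel libxml2-devel
    # versions pinned as in the article
    run gem install nokogiri -v 1.5.11
    run vagrant plugin install --plugin-version 0.0.16 vagrant-libvirt
    # KVM/QEMU pieces and the libvirtd service
    as_root yum install -y libvirt libvirt-daemon-kvm
    as_root systemctl start libvirtd.service
    as_root systemctl enable libvirtd.service
}

setup_vagrant_libvirt
```

Set DRY_RUN=0 once you’ve read the printed commands and are happy for them to actually run.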

I hope this was helpful to you. Please let me know in the comments.

Happy hacking,



by on May 6, 2014

Keeping git submodules in sync with your branches

This is a quick trick for making working with git submodules more magic.

One day you might find that using git submodules is needed for your project. It’s probably not necessary for everyday hacking, but if you’re gluing things together, it can be quite useful. Puppet-Gluster uses this technique to easily include all the dependencies needed for a Puppet-Gluster+Vagrant automatic deployment.

If you’re a good hacker, you develop things in separate feature branches. Example:

cd code/projectdir/
git checkout -b feat/my-cool-feature
# hack hack hack
git add -p
# add stuff
git commit -m 'my cool new feature'
git push
# yay!

The problem arises if you git pull inside of a git submodule to update it to a particular commit. When you switch branches, the git submodule’s branch doesn’t move along with you! Personally, I think this is a bug, but perhaps it’s not. In any case, here’s the fix:

Add:

exec git submodule update

to your:

<projectdir>/.git/hooks/post-checkout

and then run:

chmod u+x <projectdir>/.git/hooks/post-checkout

and you’re good to go! Here’s an example:

james@computer:~/code/puppet/puppet-gluster$ git checkout feat/yamldata
M vagrant/gluster/puppet/modules/puppet
Switched to branch 'feat/yamldata'
Submodule path 'vagrant/gluster/puppet/modules/puppet': checked out 'f139d0b7cfe6d55c0848d0d338e19fe640a961f2'
james@computer:~/code/puppet/puppet-gluster (feat/yamldata)$ git checkout master
M vagrant/gluster/puppet/modules/puppet
Switched to branch 'master'
Your branch is up-to-date with 'glusterforge/master'.
Submodule path 'vagrant/gluster/puppet/modules/puppet': checked out '07ec49d1f67a498b31b4f164678a76c464e129c4'
james@computer:~/code/puppet/puppet-gluster$ cat .git/hooks/post-checkout
exec git submodule update
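If you have several projects to set up, the manual steps could be wrapped in a small shell function. This is a sketch: install_hook is a hypothetical name, the shebang line is my addition (the hook shown above doesn’t have one), and it assumes that .git is a plain directory inside the project.

```shell
# install_hook <projectdir>: create the post-checkout hook described above
install_hook() {
    local hook="$1/.git/hooks/post-checkout"
    if [ -e "$hook" ]; then
        echo "refusing to overwrite existing $hook" >&2
        return 1
    fi
    mkdir -p "$(dirname "$hook")"
    # assumption: an explicit shebang, which the original hook does not include
    printf '%s\n' '#!/bin/sh' 'exec git submodule update' > "$hook"
    chmod u+x "$hook"
}

# usage: pass the top of your working copy, for example:
# install_hook ~/code/puppet/puppet-gluster
```

It refuses to overwrite an existing hook, since you might already have a post-checkout script doing something else.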

Hope that helps you out too! If someone knows of a use-case when you don’t want this functionality, please let me know! Many thanks to #git for helping me solve this issue!

Happy hacking,



by on March 27, 2014

Puppet-Gluster now available as RPM

I’ve been afraid of RPM and package maintaining [1] for years, but thanks to Kaleb Keithley, I have finally made some RPM’s that weren’t generated from a high level tool. Now that I have the boilerplate done, it’s a relatively painless process!

In case you don’t know kkeithley, he is a wizard [2] who happens to also be especially cool and hardworking. If you meet him, be sure to buy him a $BEVERAGE. </plug>

A photo of kkeithley after he (temporarily) transformed himself into a wizard penguin.

The full source of my changes is available in git.

If you want to make the RPM’s yourself, simply clone the puppet-gluster source, and run: make rpm. If you'd rather download pre-built RPM's, SRPM's, or source tarballs, they are all being graciously hosted on, thanks to John Mark Walker and the community.

These RPM's will install their contents into /usr/share/puppet/modules/. They should work on Fedora or CentOS, but they do require a puppet package to be installed. I hope to offer them in the future as part of a repository for easier consumption.

There are also RPM's available for puppet-common, puppet-keepalived, puppet-puppet, puppet-shorewall, puppet-yum, and even puppetlabs-stdlib. These are the dependencies required to install the puppet-gluster module.

Please let me know if you find any issues with any of the packages, or if you have any recommendations for improvement! I'm new to packaging, so I probably made some mistakes.

Happy Hacking,


[1] package maintainer, aka: "paintainer" - according to semiosis, who is right!

[2] wizard as in an awesome, talented, hacker.