all posts tagged git


by on July 23, 2015

Git archive with submodules and tar magic

Git submodules are actually a very beautiful thing. You might prefer the word powerful or elegant, but that’s not the point. The downside is that they are sometimes misused, so as always, use with care. I’ve used them in projects like puppet-gluster, oh-my-vagrant, and others. If you’re not familiar with them, do a bit of reading and come back later, I’ll wait.

I recently did some work packaging Oh-My-Vagrant as RPM’s. My primary goal was to make sure the entire process was automatic, as I have no patience for manually building RPM’s. Any good packager knows that the pre-requisite for building a SRPM is a source tarball, and I wanted to build those automatically too.

Simply running a tar -cf on my source directory wouldn’t work, because I only want to include files that are stored in git. Thankfully, git comes with a tool called git archive, which does exactly that! No scary tar commands required:

Nobody likes tar

Here’s how you might run it:

$ git archive --prefix=some-project/ -o output.tar.bz2 HEAD

Let’s decompose:

The --prefix argument prepends a string prefix onto every file in the archive. Therefore, if you’d like the root directory to be named some-project, then you prepend that string with a trailing slash, and you’ll have everything nested inside a directory!

The -o flag predictably picks the output file and format. Using .tar.bz2 is quite common.

Lastly, the HEAD portion at the end specifies which git tree to pull the files from. I usually specify a git tag here, but you can specify a commit id if you prefer.

Obligatory, "make this article more interesting" meme image.

Obligatory, “make this article more interesting” meme image.

This is all well and good, but unfortunately, when I open my newly created archive, it is notably missing my git submodules! It would probably make sense for there to be an upstream option so that a --recursive flag would do this magic for you, but unfortunately it doesn’t exist yet.

There are a few scripts floating around that can do this, but I wanted something small, and without any real dependencies, that I can embed in my project Makefile, so that it’s all self-contained.

Here’s what that looks like:

sometarget:
    @echo Running git archive...
    # use HEAD if tag doesn't exist yet, so that development is easier...
    git archive --prefix=oh-my-vagrant-$(VERSION)/ -o $(SOURCE) $(VERSION) 2> /dev/null || (echo 'Warning: $(VERSION) does not exist.' && git archive --prefix=oh-my-vagrant-$(VERSION)/ -o $(SOURCE) HEAD)
    # TODO: if git archive had a --submodules flag this would easier!
    @echo Running git archive submodules...
    # i thought i would need --ignore-zeros, but it doesn't seem necessary!
    p=`pwd` && (echo .; git submodule foreach) | while read entering path; do 
        temp="$${path%'}"; 
        temp="$${temp#'}"; 
        path=$$temp; 
        [ "$$path" = "" ] && continue; 
        (cd $$path && git archive --prefix=oh-my-vagrant-$(VERSION)/$$path/ HEAD > $$p/rpmbuild/tmp.tar && tar --concatenate --file=$$p/$(SOURCE) $$p/rpmbuild/tmp.tar && rm $$p/rpmbuild/tmp.tar); 
    done

This is a bit tricky to read, so I’ll try to break it down. Remember, double dollar signs are used in Make syntax for embedded bash code since a single dollar sign is a special Make identifier. The $(VERSION) variable corresponds to the version of the project I’m building, which matches a git tag that I’ve previously created. $(SOURCE) corresponds to an output file name, ending in the .tar.bz2 suffix.

    p=`pwd` && (echo .; git submodule foreach) | while read entering path; do 

In this first line, we store the current working directory for use later, and then loop through the output of the git submodule foreach command. That output normally looks something like this:

james@computer:~/code/oh-my-vagrant$ git submodule foreach 
Entering 'vagrant/gems/xdg'
Entering 'vagrant/kubernetes/templates/default'
Entering 'vagrant/p4h'
Entering 'vagrant/puppet/modules/module-data'
Entering 'vagrant/puppet/modules/puppet'
Entering 'vagrant/puppet/modules/stdlib'
Entering 'vagrant/puppet/modules/yum'

As you can see, this shows that the above read command, eats up the Entering string, and pulls the quoted path into the second path variable. The next part of the code:

        temp="$${path%'}"; 
        temp="$${temp#'}"; 
        path=$$temp; 
        [ "$$path" = "" ] && continue; 

uses bash idioms to remove the two single quotes that wrap our string, and then skip over any empty versions of the path variable in our loop. Lastly, for each submodule found, we first switch into that directory:

        (cd $$path &&

Run a normal git archive command and create a plain uncompressed tar archive in a temporary directory:

git archive --prefix=oh-my-vagrant-$(VERSION)/$$path/ HEAD > $$p/rpmbuild/tmp.tar &&

Then use the magic of tar to overlay this new tar file, on top of the source file that we’re now building up with each iteration of this loop, and then remove the temporary file.

tar --concatenate --file=$$p/$(SOURCE) $$p/rpmbuild/tmp.tar && rm $$p/rpmbuild/tmp.tar); 

Finally, we end the loop:

    done

Boom, magic! Short, concise, and without any dependencies but bash and git.

Nobody should have to figure that out by themselves, and I wish it was built in to git, but until then, here’s how it’s done! Many thanks to #git on IRC for pointing me in the right direction.

This is the commit where I landed this patch for oh-my-vagrant, if you’re curious to see this in the wild. Now that this is done, I can definitely say that it was worth the time:

Is it worth the time? In this case, it was.

With this feature merged, along with my automatic COPR builds, a simple ‘make rpm‘, causes all of this automation to happen, and delivers a fresh build from git in a few minutes.

I hope you enjoyed this technique, and I hope you have some coding skills to get this feature upstream in git.

Happy Hacking,

James


by on October 18, 2014

Hacking out an Openshift app

I had an itch to scratch, and I wanted to get a bit more familiar with Openshift. I had used it in the past, but it was time to have another go. The app and the code are now available. Feel free to check out:

https://pdfdoc-purpleidea.rhcloud.com/

This is a simple app that takes the URL of a markdown file on GitHub, and outputs a pandoc converted PDF. I wanted to use pandoc specifically, because it produces PDF’s that were beautifully created with LaTeX. To embed a link in your upstream documentation that points to a PDF, just append the file’s URL to this app’s url, under a /pdf/ path. For example:

https://pdfdoc-purpleidea.rhcloud.com/pdf/https://github.com/purpleidea/puppet-gluster/blob/master/DOCUMENTATION.md

will send you to a PDF of the puppet-gluster documentation. This will make it easier to accept questions as FAQ patches, without needing to have the git embedded binary PDF be constantly updated.

If you want to hear more about what I did, read on…

The setup:

Start by getting a free Openshift account. You’ll also want to install the client tools. Nothing is worse than having to interact with your app via a web interface. Hackers use terminals. Lucky, the Openshift team knows this, and they’ve created a great command line tool called rhc to make it all possible.

I started by following their instructions:

$ sudo yum install rubygem-rhc
$ sudo gem update rhc

Unfortunately, this left with a problem:

$ rhc
/usr/share/rubygems/rubygems/dependency.rb:298:in `to_specs': Could not find 'rhc' (>= 0) among 37 total gem(s) (Gem::LoadError)
    from /usr/share/rubygems/rubygems/dependency.rb:309:in `to_spec'
    from /usr/share/rubygems/rubygems/core_ext/kernel_gem.rb:47:in `gem'
    from /usr/local/bin/rhc:22:in `'

I solved this by running:

$ gem install rhc

Which makes my user rhc to take precedence over the system one. Then run:

$ rhc setup

and the rhc client will take you through some setup steps such as uploading your public ssh key to the Openshift infrastructure. The beauty of this tool is that it will work with the Red Hat hosted infrastructure, or you can use it with your own infrastructure if you want to host your own Openshift servers. This alone means you’ll never get locked in to a third-party providers terms or pricing.

Create a new app:

To get a fresh python 3.3 app going, you can run:

$ rhc create-app <appname> python-3.3

From this point on, it’s fairly straight forward, and you can now hack your way through the app in python. To push a new version of your app into production, it’s just a git commit away:

$ git add -p && git commit -m 'Awesome new commit...' && git push && rhc tail

Creating a new app from existing code:

If you want to push a new app from an existing code base, it’s as easy as:

$ rhc create-app awesomesauce python-3.3 --from-code https://github.com/purpleidea/pdfdoc
Application Options
-------------------
Domain:      purpleidea
Cartridges:  python-3.3
Source Code: https://github.com/purpleidea/pdfdoc
Gear Size:   default
Scaling:     no

Creating application 'awesomesauce' ... done


Waiting for your DNS name to be available ... done

Cloning into 'awesomesauce'...
The authenticity of host 'awesomesauce-purpleidea.rhcloud.com (203.0.113.13)' can't be established.
RSA key fingerprint is 00:11:22:33:44:55:66:77:88:99:aa:bb:cc:dd:ee:ff.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'awesomesauce-purpleidea.rhcloud.com,203.0.113.13' (RSA) to the list of known hosts.

Your application 'awesomesauce' is now available.

  URL:        http://awesomesauce-purpleidea.rhcloud.com/
  SSH to:     00112233445566778899aabb@awesomesauce-purpleidea.rhcloud.com
  Git remote: ssh://00112233445566778899aabb@awesomesauce-purpleidea.rhcloud.com/~/git/awesomesauce.git/
  Cloned to:  /home/james/code/awesomesauce

Run 'rhc show-app awesomesauce' for more details about your app.

In my case, my app also needs some binaries installed. I haven’t yet automated this process, but I think it can be done be creating a custom cartridge. Help to do this would be appreciated!

Updating your app:

In the case of an app that I already deployed with this method, updating it from the upstream source is quite easy. You just pull down and relevant commits, and then push them up to your app’s git repo:

$ git pull upstream master 
From https://github.com/purpleidea/pdfdoc
 * branch            master     -> FETCH_HEAD
Updating 5ac5577..bdf9601
Fast-forward
 wsgi.py | 2 --
 1 file changed, 2 deletions(-)
$ git push origin master 
Counting objects: 7, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 312 bytes | 0 bytes/s, done.
Total 3 (delta 2), reused 0 (delta 0)
remote: Stopping Python 3.3 cartridge
remote: Waiting for stop to finish
remote: Waiting for stop to finish
remote: Building git ref 'master', commit bdf9601
remote: Activating virtenv
remote: Checking for pip dependency listed in requirements.txt file..
remote: You must give at least one requirement to install (see "pip help install")
remote: Running setup.py script..
remote: running develop
remote: running egg_info
remote: creating pdfdoc.egg-info
remote: writing pdfdoc.egg-info/PKG-INFO
remote: writing dependency_links to pdfdoc.egg-info/dependency_links.txt
remote: writing top-level names to pdfdoc.egg-info/top_level.txt
remote: writing manifest file 'pdfdoc.egg-info/SOURCES.txt'
remote: reading manifest file 'pdfdoc.egg-info/SOURCES.txt'
remote: writing manifest file 'pdfdoc.egg-info/SOURCES.txt'
remote: running build_ext
remote: Creating /var/lib/openshift/00112233445566778899aabb/app-root/runtime/dependencies/python/virtenv/venv/lib/python3.3/site-packages/pdfdoc.egg-link (link to .)
remote: pdfdoc 0.0.1 is already the active version in easy-install.pth
remote: 
remote: Installed /var/lib/openshift/00112233445566778899aabb/app-root/runtime/repo
remote: Processing dependencies for pdfdoc==0.0.1
remote: Finished processing dependencies for pdfdoc==0.0.1
remote: Preparing build for deployment
remote: Deployment id is 9c2ee03c
remote: Activating deployment
remote: Starting Python 3.3 cartridge (Apache+mod_wsgi)
remote: Application directory "/" selected as DocumentRoot
remote: Application "wsgi.py" selected as default WSGI entry point
remote: -------------------------
remote: Git Post-Receive Result: success
remote: Activation status: success
remote: Deployment completed with status: success
To ssh://00112233445566778899aabb@awesomesauce-purpleidea.rhcloud.com/~/git/awesomesauce.git/
   5ac5577..bdf9601  master -> master
$

Final thoughts:

I hope this helped you getting going with Openshift. Feel free to send me patches!

Happy hacking!

James


by on October 10, 2014

Continuous integration for Puppet modules

I just patched puppet-gluster and puppet-ipa to bring their infrastructure up to date with the current state of affairs…

What’s new?

  • Better README’s
  • Rake syntax checking (fewer oopsies)
  • CI (testing) with travis on git push (automatic testing for everyone)
  • Use of .pmtignore to ignore files from puppet module packages (finally)
  • Pushing modules to the forge with blacksmith (sweet!)

This last point deserves another mention. Puppetlabs created the “forge” to try to provide some sort of added value to their stewardship. Personally, I like to look for code on github instead, but nevertheless, some do use the forge. The problem is that to upload new releases, you need to click your mouse like a windows user! Someone has finally solved that problem! If you use blacksmith, a new build is just a rake push away!

Have a look at this example commit if you’re interested in seeing the plumbing.

Better documentation and FAQ answering:

I’ve answered a lot of questions by email, but this only helps out individuals. From now on, I’d appreciate if you asked your question in the form of a patch to my FAQ. (puppet-gluster, puppet-ipa)

I’ll review and merge your patch, including a follow-up patch with the answer! This way you’ll get more familiar with git and sending small patches, everyone will benefit from the response, and I’ll be able to point you to the docs (and even a specific commit) to avoid responding to already answered questions. You’ll also have the commit information of something else who already had this problem. Cool, right?

Happy hacking,

James


by on May 6, 2014

Keeping git submodules in sync with your branches

This is a quick trick for making working with git submodules more magic.

One day you might find that using git submodules is needed for your project. It’s probably not necessary for everyday hacking, but if you’re glue-ing things together, it can be quite useful. Puppet-Gluster uses this technique to easily include all the dependencies needed for a Puppet-Gluster+Vagrant automatic deployment.

If you’re a good hacker, you develop things in separate feature branches. Example:

cd code/projectdir/
git checkout -b feat/my-cool-feature
# hack hack hack
git add -p
# add stuff
git commit -m 'my cool new feature'
git push
# yay!

The problem arises if you git pull inside of a git submodule to update it to a particular commit. When you switch branches, the git submodule‘s branch doesn’t move along with you! Personally, I think this is a bug, but perhaps it’s not. In any case, here’s the fix:

add:

#!/bin/bash
exec git submodule update

to your:

<projectdir>/.git/hooks/post-checkout

and then run:

chmod u+x <projectdir>/.git/hooks/post-checkout

and you’re good to go! Here’s an example:

james@computer:~/code/puppet/puppet-gluster$ git checkout feat/yamldata
M vagrant/gluster/puppet/modules/puppet
Switched to branch 'feat/yamldata'
Submodule path 'vagrant/gluster/puppet/modules/puppet': checked out 'f139d0b7cfe6d55c0848d0d338e19fe640a961f2'
james@computer:~/code/puppet/puppet-gluster (feat/yamldata)$ git checkout master
M vagrant/gluster/puppet/modules/puppet
Switched to branch 'master'
Your branch is up-to-date with 'glusterforge/master'.
Submodule path 'vagrant/gluster/puppet/modules/puppet': checked out '07ec49d1f67a498b31b4f164678a76c464e129c4'
james@computer:~/code/puppet/puppet-gluster$ cat .git/hooks/post-checkout
#!/bin/bash
exec git submodule update
james@computer:~/code/puppet/puppet-gluster$

Hope that helps you out too! If someone knows of a use-case when you don’t want this functionality, please let me know! Many thanks to #git for helping me solve this issue!

Happy hacking,

James

 

by on October 10, 2013

Show current git branch in PS1 when branch is not master

Short post, long command…

I’ve decided to start showing the current git branch in my PS1. However, since I don’t want to know when I’m on master, I had to write a new PS1 that I haven’t yet seen anywhere. Add the following to your .bashrc:

PS1='${debian_chroot:+($debian_chroot)}\u@\h:\w\$ '
if [ -e /usr/share/git-core/contrib/completion/git-prompt.sh ]; then
    . /usr/share/git-core/contrib/completion/git-prompt.sh
    PS1='${debian_chroot:+($debian_chroot)}\u@\h:\w$([ "$(__git_ps1 %s)" != "" -a "$(__git_ps1 %s)" != "master" ] && (echo -e " (\[33[32m\]"$(__git_ps1 "%s")"\[33[0m\])") || echo "")\$ '
fi

This keeps my PS1 short for when I’m hacking on personal repositories that only have a single branch. Keep in mind that you might have to change the path to git-prompt.sh depending on what OS you’re using.

Example:

james@computer:~/code/puppet$ cd puppet-gluster
james@computer:~/code/puppet/puppet-gluster$ git checkout -b cool-new-feature
Switched to a new branch 'cool-new-feature'
james@computer:~/code/puppet/puppet-gluster (cool-new-feature)$ # tada !

The branch name is coloured to match the default colours that git uses to colour branches.

Happy hacking,

James

 

by on October 8, 2013

When in Rome, Don’t Use GitHub

Please excuse the garbled proverb, I don’t mean to say that people in Rome shouldn’t use GitHub. Instead, I’m talking about using the right tool for the job and the community. Sometimes the “right” tool is completely wrong, because the audience is wrong for the tool.

Collaboration works well when you use the right tools. For developers, that’s often git and GitHub. Wanna work on some code? Throw it up on GitHub. Check out the code, make your changes, and commit. Most developers working in open source today are likely to know git well enough to work with you. (Note that I’m using “GitHub” as shorthand for any git repository/service, really.)

At the edges of developer communities, though, there’s a temptation to use developer tools for non-development work. For example, creating marketing materials and documentation.

For some documentation projects, git is probably the right tool. Especially if you’re doing books and trying to document along with a release cycle. But for creating marketing materials or “living” documents that change a lot? Probably not such a great idea.

Use the tools that are lightweight and easy for anyone to work with, preferably something browser-based and that allows revision control. (e.g., a wiki.) While there are flaws with those tools, they’re “good enough” and – most importantly – they pass the “native” test for people who are contributing to non-code projects. Do they have a browser? Check. Can they figure out how to use the wiki in 5 minutes? Check. Do they need to create an account with a third party service or download a tool they wouldn’t otherwise need? Nope. All good.

Resist the urge to impose your favorite tools on an unsuitable audience. It’s always tempting to go with what you know best or optimize for your own use case, but resist the urge when it’s going to be detrimental to getting maximum participation and building community.

by on June 17, 2013

puppet lsi hardware raid module

In response to some discussion in the gluster community, I am releasing my puppet-lsi module. It’s quite simple, but it is very useful for rebuilding machines. It could do a lot more, but I wanted to depend on the proprietary LSI tools as little as possible. Running megacli with puppet would be a very doable hack, but I’m not sure enough devops out there who would use that feature.

Usage is straightforward if you like the sensible defaults:

include lsi::msm

The general idea is that you’ve probably already setup all your “virtual drive” RAID configurations initially, and now you’re deploying your setup using cobbler and puppet-gluster. This puppet-lsi module should install all the client side LSI tools, and make sure monitoring for the hardware RAID is working. Megacli and all the (evil?) vivaldi framework stuff will be up and running after puppet has run.

I haven’t tested this on a wide array of hardware, and there might even be newer LSI gear on the market. Please don’t test it on production servers. If you want help with this, you might have to sponsor some hardware, or send me somewhere where I can hack on some, because I don’t have a gluster test rig at the moment.

I am curious to hear what kind of RAID you’re using with gluster. Hardware? Software? Details rock. SGPIO with mdadm, and you’re my hero. I want to hear about that!

https://github.com/purpleidea/puppet-lsi/

I hope this was useful to you, and in the meantime,

Happy hacking,

James

PS: The most useful feature of this module, is that it sets up monitoring of your RAID, and lets you access the management daemon through the now installed LSI services.

by on January 28, 2013

Blog Post 1 – Source, Build and Install

Welcome back!

Hope you installed and tried GlusterFS using the guide from gluster website. In this post we will cover the following topics:

  • Get source from git.
  • Make and install gluster from source.

 

Get source from git:

  • Install git on your work machine using your favorite package manager. You may have to configure your git installation.
  • Visit review.gluster.org and register yourself there. During the process you will have to provide ssh keys to authenticate your push and pull commands to the server. Just follow the instructions and you will be fine.
  • Then we get the source!
  • cd into your preferred working directory for your projects and type
git clone ssh://<UserName>@git.gluster.org/glusterfs.git glusterfs
  • Congratulations! You now have the complete source code for your project.

 

Make and install gluster from source:

NOTE: Gluster has different default installation paths for installation from repo packages and source. If you had installed gluster previously from package manager , please uninstall before you proceed.

  • Cd into the glusterfs directory where git has cloned the complete source from git server.
  • Run autogen.sh present in the directory.
  • Run “./configure –enable-debug” , you may get some output like this

GlusterFS configure summary

===========================

FUSE client : yes
Infiniband verbs : no
epoll IO multiplex : yes
argp-standalone : no
fusermount : no
readline : yes
georeplication : yes
Linux-AIO : yes
Enable Debug : no
systemtap : no
Block Device backend : no

We are good to go as long as you have yes against FUSE Client and readline for our current exploration. We will come back and look later at what other components do.

  • Lets build! You can call make with CFLAG option O0 to get a debug build.
make CFLAG='-g -O0'
  • If previous command completed successfully, that means you now have binaries under the same directory. Now lets install gluster in the right location.

make CFLAG=’-g -O0′ install

Now you have gluster installed on your machine. In the next post we will look at creating volumes in gluster and try to make guesses about architecture of gluster while we observe its behavior.

Assignment: Before we do that, I want you to make a list of all the files that were installed to different paths of the system when we installed gluster.

by on July 26, 2012

puppet gluster module now in git

The thoughtful bodepd has been kind enough to help me get my puppet-gluster module off the ground and publicized a bit too. My first few commits have been all clean up to get my initial hacking up to snuff with the puppet style guidelines. Sadly, I love indenting my code with tabs, and this is against the puppet rules :(

I’ll be accepting patches by email, but I’d prefer discussion first, especially since I’ve got a few obvious things brewing in my mental queue that should hit master shortly.

Are you a gluster expert who’s weak at puppet? I’m keen to implement many of the common raid, file system and gluster performance optimization’s directly into the module, so that the out of box experience for new users is a fast, streamlined, experience.

Are you a puppet expert who knows a bit of gluster? I’m not sure what the best way to handle large config changes, such as expanding volumes, or replacing bricks is. I can imagine a large state diagram that would be very hard to wholly implement in puppet. So for now, I’m missing a few edge cases, but hopefully this module will be able to solve more of them over time.

I’ve included an examples/ directory in the repository, to give you an idea of how this works for now. Stay tuned for more commits!

git clone https://github.com/purpleidea/puppet-gluster.git

Happy hacking,
James