on September 1, 2015

Introducing georepsetup – Gluster Geo-replication Setup Tool

How many of you succeeded in setting up Gluster Geo-replication on the first try? As part of the Geo-replication setup, SSH keys need to be deployed from all Master nodes to all Slave nodes, so the number of steps involved is not easy to manage. We get more queries on the gluster-devel and gluster-users lists about Geo-rep setup than about actually using Geo-replication, and many users stopped trying Geo-replication after facing issues during setup.
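For context, the manual procedure looks roughly like this. This is only a sketch with placeholder volume and host names; the exact syntax can vary between Gluster releases.

```shell
# Rough sketch of the manual Geo-replication setup steps.
# MASTERVOL, SLAVEHOST and SLAVEVOL are placeholders.

# Password-less SSH from the initiating Master node to the Slave node
# is a prerequisite:
ssh-keygen
ssh-copy-id root@SLAVEHOST

# Collect the SSH public keys of all Master nodes into a common pem file:
gluster system:: execute gsec_create

# Create the session; push-pem distributes the collected keys to the
# Slave nodes:
gluster volume geo-replication MASTERVOL SLAVEHOST::SLAVEVOL create push-pem
```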

With the release of Gluster 3.7, Geo-replication received lots of improvements. I will write about the new features and improvements in my next post. Yesterday I wrote a CLI tool in Python to simplify the steps involved in Geo-replication setup. Now setting up Geo-replication is as easy as running one command. Yay!

For example,

sudo georepsetup <MASTERVOL> <SLAVEHOST> <SLAVEVOL>

It prompts for the root password of the Slave node specified in the command. That's it!
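Once the setup completes, the session can be started and monitored with the standard Gluster CLI (placeholder names again):

```shell
# Start the Geo-replication session and check that the workers come up.
gluster volume geo-replication MASTERVOL SLAVEHOST::SLAVEVOL start
gluster volume geo-replication MASTERVOL SLAVEHOST::SLAVEVOL status
```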

This command also produces a good summary, as shown below, making it easy to trace errors and handle them.

Summary

Install

Install this tool on the Master node from which you wish to initiate the Geo-replication setup:

git clone https://github.com/aravindavk/georepsetup.git
cd georepsetup
sudo python setup.py install

This tool is not yet packaged as an RPM/Deb. Pull requests are welcome :)

Setting up non-root Geo-replication still involves some manual steps; I will try to improve this in the future.

Documentation is available here.

Comments and suggestions welcome.

on April 1, 2015

GlusterFS Geo-replication Tutorials – Understanding Session Creation

Geo-replication is one of the awesome features of GlusterFS. With this feature we can replicate data from one Gluster Volume to another, geographically distant, Gluster Volume.

This post is the first in a series on understanding GlusterFS Geo-replication. Comments and suggestions welcome.

Link: https://speakerdeck.com/aravindavk/understanding-glusterfs-geo-replication-session-creation

I created these visualizations using my Wacom tablet (Wacom Bamboo Pen & Touch CTH-460) and the MyPaint software on Linux.

on November 4, 2013

Gluster Community Day, LISA 2013, Monday

I’m here at LISA 2013 at the Gluster Community Day. I’ve been asked by Joe Brockmeier to give a little recap about what’s been going on. So here it is!

Wesley Duffee-Braun started off with a nice overview talk about GlusterFS. The great thing about his talk was that he gave a live demo, running on virtual machines, on his laptop. If you’re a new GlusterFS user, this is good exposure to help you get started. He also has the nicest slides I’ve ever seen. Someone needs to send me the .odp template!

Eco Willson gave the next talk about geo-replication, and discussed a few other GlusterFS topics too. I particularly enjoyed when he talked about the upcoming changes. I understood that this would include the ability to have multiple geo-replication daemons across each distributed part of each volume, and HA between the N brick replicas. This way, your workload is split across the volume, and if a replica goes down, one of the other copies can take over and continue the sync.

During lunch, I got to meet Jeff Darcy and talk a bit about storage and feature ideas on my wish-list. He knew my twitter/IRC handle, so he instantly gets 200 bonus points. You should probably check out his blog if you haven’t already.

After lunch I gave my talk about puppet-gluster, and successfully gave two live demos. I’m really due for a blog post about some of the new features that I’ve added. I’ll try to put together a screencast in the future. If you’re really keen on trying out some of the new features, I’m happy to share a screen session on your hosts and walk you through it. When run fully automatically, it takes between five and ten minutes to deploy a featureful GlusterFS!

I won’t be able to hack on this project for free, forever. If you’re able to donate test hardware, VM time, or sponsor a feature, it would be appreciated. I’m particularly interested in building a GlusterFS microserver cluster. If you’re interested in this too, let me know.

Wesley came back for round two to demo the GlusterForge (hi John Mark!) and Glusterinfo, which I hadn’t ever used. Nobody knew why certain projects are still “incubating”, when they’re most likely cooked through by now.

At this point, we had some left over time for questions and discussion. Jeff, Eco, and I formed a loosely organized “panel” to answer questions. Joe took a photo, and now owes me a copy, since I’ve never really been on a “panel” before. Jeff knows his GlusterFS internals very well. I should have come prepared with some hard questions!

To whoever was looking for Ben England’s RHS performance talk, it’s available here.

Overall, it was a very nice little event, and I hope the attendees enjoyed it. If I forgot anything, please feel free to add it into the comments. I’d also love to hear about what you enjoyed or didn’t enjoy about my talk. Let me know!

Happy Hacking,

James

on September 22, 2013

BitTorrent Sync for repository mirroring

Theron Conrey writes about using:

BitTorrent Sync as Geo-Replication for Storage

We got a chance to talk about this idea at LinuxCon. I’m not entirely convinced there aren’t some problematic edge cases with this solution, but I think it will be hard to tell as long as the BitTorrent Sync library is proprietary. I did come up with a special case of Theron’s idea that I believe could work well.

The special case uses the optimization that the synchronization (or file transfer) is unidirectional. This avoids any coherency complications that would arise if both sides were to write to the same file. Combined with the BitTorrent protocol, this behaves like normal torrent usage, except that with BitTorrent Sync we’re looking at a folder full of files.

What kind of synchronization would benefit from this model? Repository mirroring! This is exactly a folder full of files, going in only one direction. Instead of yum or deb mirrors each running rsync, they could use BitTorrent Sync, and because of the large amount of upload bandwidth usually available on these mirrors, “seeding” wouldn’t be a problem, and the worldwide pool would synchronize faster.

Can we apply this to user mirroring, net installers, and machine updating? Absolutely. I believe someone has already looked into the updates scenario, but it didn’t progress for some reason. The more convincing case is still the server geo-replication of course.

Obviously, using GlusterFS with puppet-gluster to host the mirrors could be a good fit. You might not even need any Gluster replication when you have built-in geo-replication via the other mirrors.

If someone works up the open source BitTorrent parts, I’m happy to hack together the puppet parts to turn this into a turn-key solution for mirror hosts.

Hope you liked this idea.

Happy hacking,

James