Chitika Inc., an online advertising network based in Westborough, MA, sought to provide its data scientists with faster and simpler access to its massive store of ad impression data. The company managed to boost availability and broaden access to its data by swapping out HDFS for GlusterFS as the filesystem backend for its Hadoop deployment.
“There are a number of benefits to the utilization of Gluster, arguably the biggest of which is the system’s POSIX compliance. This allows us to very easily mount Gluster anywhere as if it was a disk local to the system, meaning that we can expose 60TB of data to anything in our data center, across any amount of servers, users, and applications.”
Logging Data Part 2: Taming the Storage Beast — Chitika blog
I talked to Chitika Senior Systems Administrator Nick Wood about his company’s GlusterFS deployment, and we discussed the challenges, opportunities, and next steps for GlusterFS in their environment.
Chitika’s GlusterFS storage deployment consists of four GlusterFS 3.5 hosts running Debian Wheezy. Each host is packed with disks, sporting 43TB of storage across a set of six RAID arrays. Each of these twenty-four arrays hosts a single GlusterFS brick, which together form a triple-replicated GlusterFS volume providing roughly 59TB of total storage, 40TB of which is currently consumed.
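As a rough check on those figures, the brick count and usable capacity can be worked out from the numbers above. This is only a back-of-the-envelope sketch: the function name is made up for illustration, and it assumes usable space is simply raw space divided by the replica count.

```python
# Figures taken from the deployment description above.
HOSTS = 4
ARRAYS_PER_HOST = 6      # one GlusterFS brick per RAID array
RAW_TB_PER_HOST = 43
REPLICA_COUNT = 3        # triple replication


def usable_capacity_tb(hosts, raw_tb_per_host, replica_count):
    """Raw capacity divided by the number of copies kept of each file."""
    return hosts * raw_tb_per_host / replica_count


bricks = HOSTS * ARRAYS_PER_HOST
replica_sets = bricks // REPLICA_COUNT
usable_tb = usable_capacity_tb(HOSTS, RAW_TB_PER_HOST, REPLICA_COUNT)

print(f"{bricks} bricks in {replica_sets} replica sets")
print(f"about {usable_tb:.1f} TB usable")
```

The result lands a little under the roughly 59TB quoted. The gap may come from rounding in the per-host figure or from unit conventions; the source does not say.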
Chitika’s client machines, which also run Debian Wheezy, primarily access this volume via the GlusterFS FUSE client, although one of their clients makes use of GlusterFS’s NFS support.
Bridging Chitika’s GlusterFS cluster and the company’s cluster of 36 Hadoop nodes is a customized version of the glusterfs-hadoop plugin. On the hardware side, the company taps InfiniBand gear to link up its GlusterFS and Hadoop clusters, using an IP over InfiniBand connection.
Wood explained that he’s keen to see the RDMA (Remote Direct Memory Access) support in GlusterFS stabilize enough for Chitika to shift from TCP to RDMA as their GlusterFS transport type, thereby allowing the company to take full advantage of its InfiniBand hardware.
Since its June 2013 deployment, Chitika’s GlusterFS storage solution has undergone two software upgrades while in production, both of which ran smoothly.
However, the team’s experience with its deployment hasn’t been trouble-free. At one point, problems with the operating system hard drive in one of the GlusterFS hosts led to inconsistency between some of the replicated data in their volume, a state also known as split-brain.
GlusterFS includes a self-heal daemon for repairing inconsistencies between replicated files, but there are scenarios that require manual intervention to determine which copies to retain and which to discard, which Wood and the Chitika team experienced first-hand.
“The self-heal didn’t really work as we expected. It correctly identified some corrupt files, correctly healed others, and completely ignored most,” Wood explained. “Luckily, we store compressed files and could easily tell what was corrupted. However, it took a 50TB filesystem crawl and semi-automated identification/restoration of good copies from the raw bricks to recover.”
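Wood’s point about compressed files is that a damaged copy fails to decompress, so corruption can be detected mechanically. A minimal sketch of that idea follows, assuming gzip-compressed files with a .gz suffix (the source does not name the compression format). The function names are hypothetical and this is not Chitika’s actual tooling.

```python
import gzip
import zlib
from pathlib import Path


def is_intact_gzip(path, chunk_size=1024 * 1024):
    """Return True if the file decompresses cleanly from start to end."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read in chunks so large files are never held in memory whole.
            while handle.read(chunk_size):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        # Bad header, bad checksum, damaged stream, or truncated file.
        return False


def find_corrupt(root):
    """Walk a directory tree and list .gz files that fail to decompress."""
    return sorted(
        str(p) for p in Path(root).rglob("*.gz") if not is_intact_gzip(p)
    )
```

Run against each raw brick, a check like this would show which replica of a file still decompresses and is therefore a candidate for restoration.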
The Chitika team has also run into slow performance with common system utilities that carry out file stat operations, leading the team to develop alternative utilities that avoid the stat system call or that operate in parallel.
Despite these bumps in the road, the team at Chitika is enthusiastic about its GlusterFS deployment, and is mulling plans to double its GlusterFS host count to eight, to accommodate the addition of more compute nodes.
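The two approaches mentioned here, skipping stat where possible and issuing stat calls in parallel, can be sketched with the Python standard library. This is an illustrative sketch only, with hypothetical function names, and is not the utilities the Chitika team wrote.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def list_names(directory):
    """List entry names via scandir, with no explicit stat call per entry."""
    with os.scandir(directory) as entries:
        return sorted(entry.name for entry in entries)


def parallel_sizes(directory, workers=16):
    """Stat every regular file in a directory concurrently; map name to size."""
    with os.scandir(directory) as entries:
        # On most platforms the file type comes from the directory listing,
        # so this filter usually needs no extra system call.
        paths = [e.path for e in entries if e.is_file(follow_symlinks=False)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Threads let the per-file round trips overlap instead of queueing.
        sizes = pool.map(lambda p: os.stat(p).st_size, paths)
        return {os.path.basename(p): size for p, size in zip(paths, sizes)}
```

On a network filesystem each stat is a round trip, so overlapping them tends to matter more than raw CPU speed. The worker count of 16 is an arbitrary starting point.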
Louis Zuckerman, CTO of Picture Marketing, is working on not one, but two interesting projects for Gluster. Zuckerman is working on a Java filesystem backed by GlusterFS and Java Native Interface (JNI) bindings for GlusterFS’s native library (libgfapi).
Zuckerman says he’s using GlusterFS to store media for Picture Marketing. “Brand ambassadors use our mobile apps to take pictures and videos at events and upload them to our online platform. After processing the uploads our system stores the media in a GlusterFS cluster. From there it is served to event attendees through custom web sites made specifically for the events.”
According to Zuckerman, GlusterFS “is ideal for our use case.”
“Over the last two years we’ve enjoyed excellent reliability and superb performance from our cluster in EC2,” says Zuckerman. “Thanks to GlusterFS’ scale-out architecture we can grow our processing and web app clusters to accommodate increased demand for our online services. This is critical for our business since our system has been used by over half the top 100 brands in the US, at major sports venues, retail stores, and all kinds of events where brand ambassadors interact with customers.”
Scratching The Itch
While GlusterFS provided the features and stability that Picture Marketing needs, Zuckerman had to roll up his sleeves a bit to ensure he could run it on his system of choice.
Zuckerman began working with GlusterFS in late 2010 on EC2, and took on packaging Gluster for 32-bit systems. “At that time Gluster only provided 64-bit packages, and the downstream packages provided by Debian (and thus Ubuntu) were stuck at a version a year older due to bugs. I fixed the bugs in Debian and became co-maintainer of the Debian project’s GlusterFS packages (helping out lead maintainer Patrick Matthaei whenever I can). I’ve also been providing my own packages specially tailored for Ubuntu since that time.”
That work led to Zuckerman being tapped as the official Debian and Ubuntu packager for GlusterFS, and to a seat on Gluster’s community advisory board. Not that he wants to keep all the fun and glory to himself. “I’d like to see more people get involved with the packaging process. I’m grateful for those who take the time to report bugs in the packages, and try to help anyone interested in rolling their own based on my or Debian’s sources.”
After tackling the packaging problem, Zuckerman started working on a few projects of interest around Java and GlusterFS.
Building a Filesystem Service Provider for Java 7
Currently, Zuckerman says that the projects are for fun. “Java is one of the languages I know fairly well and I thought that implementing an NIO.2 filesystem provider would be a fun challenge. (It sure is!) The project is actually a pair of related software packages: a Java JNI wrapper around the libgfapi C library (libgfapi-jni), and an implementation of the NIO.2 filesystem service provider API (glusterfs-java-filesystem) that uses the JNI library.”
He notes that Hiram Chirino was “instrumental” in getting libgfapi-jni off the ground, and says he “probably would not have been able to make a JNI wrapper for the libgfapi C library without his support and the JNI code generator, HawtJNI,” which was written by Chirino.
He also says he’d like to find a few co-contributors for the projects. “The Java projects are still in infancy and I have lots of plans for new features. Unfortunately I don’t have as much free time to put into coding as I would like so things are progressing slowly.”
Overall, Zuckerman says that he’s had a good experience working with the Gluster community. “I have enjoyed a good rapport with the GlusterFS developers, and other community members, since I first began using GlusterFS back in late 2010,” says Zuckerman.
“I’ve asked lots of questions over the years and the developers are extremely knowledgeable, helpful, and kind in their support of users. That was a big motivation for me to get involved, and stay involved, with the project. I like the software and get along well with the people who make it.”
Have questions about Zuckerman’s projects? You can find him on Freenode as semiosis and on Twitter as @pragmaticism. Questions about Gluster development in general? Check out the #gluster channel on irc.gnu.org, or join the mailing lists to get help from the Gluster community.
Cutting Edge, a visual effects company that’s worked on films such as The Great Gatsby and I, Frankenstein, had outgrown its NAS storage system and was in search of a way to boost its storage capacity and performance in the face of several large upcoming projects. The Australia-based firm turned to GlusterFS as an alternative to making a massive investment in an enterprise SAN.
I spoke to Dan Mons, R&D SysAdmin at Cutting Edge and architect of the company’s GlusterFS deployment, about how he tapped Gluster to meet Cutting Edge’s growing storage needs.
“We’ve had three feature films roll through our Gluster storage since it went in, and to be 100% honest we couldn’t have done them without Gluster,” Mons said. “The flexibility it offers us for storage is amazing.”
The GlusterFS storage solution that Mons assembled consists of 24 total GlusterFS 3.4.1 nodes, each running CentOS 6.4 and outfitted with 34TB of RAID6 storage. These nodes are assembled into four six-node clusters, which provide the company’s Brisbane and Sydney offices each with its own production and backup cluster pair.
Each cluster hosts a distributed-replicated GlusterFS volume, which keeps data accessible in the event of node failure. Nightly rsync operations between the production and backup clusters at each location provide an additional layer of data protection.
Users in Cutting Edge’s Sydney and Brisbane offices have access to 107TB of production storage, and read-only access to another 107TB on each location’s backup cluster.
Mons explained that given data volume, time and bandwidth issues, it isn’t feasible to fully synchronize the data generated at the two offices, but that the company’s artists have access to scripts to sync particular folders between the locations when it’s necessary to collaborate with co-workers in another office.
With a client pool that runs the gamut from Linux-powered render machines and individual workstations to machines running OS X, Windows, and a handful of specialty OSes, ensuring access to their data across multiple platforms and protocols has been one of the trickier parts of the Cutting Edge deployment.
The Linux machines that make up the majority of the company’s client mix access the cluster via the GlusterFS FUSE client, which provides access to all six nodes in the production cluster directly, for maximum bandwidth distribution. Older Linux machines and machines running specialty OSes tap the cluster via Gluster’s NFS support, with DNS round robin for distributing the load.
Mons explained that while the OS X-based machines in his company’s environment are able to access the GlusterFS cluster normally via NFS or CIFS mounts using command line tools, he’s run into various issues with the OS X Finder application and with Carbon or Cocoa-based OS X applications.
To work around these issues, the team at Cutting Edge set up a separate Linux server that mounts the GlusterFS volume with the FUSE client, and then re-exports that as AFP via Netatalk3. This method works, but at the cost of performance and of compatibility with some of the firm’s pipeline processes. Ideally, Mons would like to see a FUSE client become available for OS X.
The company’s Windows-based machines access the cluster via Samba, installed on each node in the cluster, with DNS round robin for distributing the load and Active Directory for authentication. Mons said that his team encountered file locking issues with certain applications, most of which they were able to resolve, although they’ve continued to experience issues with Photoshop and Microsoft Office on Windows.
Since its March 2013 deployment, the Cutting Edge storage solution has undergone updates from GlusterFS 3.3.1 to 3.4.0, and most recently, to 3.4.1, all of which have gone smoothly. Mons noted that the latest GlusterFS updates have brought noticeable speed and NFS stability improvements, benefiting legacy and turnkey systems for which the FUSE client is not an option.
Looking ahead, Cutting Edge plans to add new node pairs to their production and backup clusters in early 2014, as their production clusters are nearing 90% capacity, with more project data on the way.
Mons told me that he’s begun testing Samba with Gluster’s recent libgfapi enhancements, which appear to boost file browsing performance in his environment. Along similar lines, Mons is looking forward to seeing support for storing directory and file information in extended attributes make its way into GlusterFS, which promises to speed up directory listing and disk usage operations.
Rock the Vote needed a way to manage the fast growth of the data handled by its Web-based voter registration application. The organization turned to GlusterFS replicated volumes to allow for filesystem size upgrades on its virtualized hosting infrastructure without incurring downtime.
Over its twenty-one year history, Rock the Vote has registered more than five million young people to vote, and has become a trusted source of information about registering to vote and casting a ballot.
Since 2009, Rock the Vote has run a Web-based voter registration application, powered by an open source Rails application stack called Rocky.
I talked to Lance Albertson, Associate Director of Operations at the Oregon State University Open Source Lab and primary technical systems operation lead for the service, about how they’re using Gluster to provide for the service’s growing storage requirements.
“During a non-election season,” Albertson explained, “the filesystem use and growth is minimal, however during a presidential election season, the growth of the filesystem can be exponential. So with Gluster we’re trying to solve the sudden growth problem we have.”
Rock the Vote’s voter registration application is served from a virtual machine instance running Gentoo Hardened, with a pair of physical servers running CentOS 6 with Gluster 3.3.0 to host voter registration form data. The storage nodes host a replicated GlusterFS volume, which the registration front end accesses via Gluster’s NFS mount support.
The Gluster-backed iteration of the voter registration application started out in September with a 100GB volume, which the team stepped up incrementally to 350GB as usage grew in the period leading up to the election.
Before implementing Gluster for their storage needs, Rock the Vote’s application hosting team was using local storage within their virtual machines to store the voter form data, which made it difficult to expand storage without bringing their VMs down to do so.
The hosting team shifted storage to an HA NFS cluster, but found the implementation fragile and prone to breakage when adding/removing NFS volumes and shares.
“Gluster allowed us more flexibility in how we manage that storage without downtime,” Albertson continued. “Gluster made it easy to add a volume and grow it as we needed.”
Looking ahead to future election seasons, and forthcoming GlusterFS releases, Albertson told me that the Gluster attribute he’s most interested in is limited-downtime upgrades between version 3.3.0 and future Gluster releases. Albertson is also looking forward to the addition of multi-master support in Gluster’s geo-replication capability, an enhancement planned for the upcoming 3.4 version.
Here’s a nice post about creating a linked list topology for a distributed-replicated setup. The idea is that spending a bit of extra time prepping a linked list of bricks makes it easier to add a single server to a replicated volume. The default topology would leave the author needing to add a pair of servers at a time:
The drawback to this setup is when servers are added, they must be added in pairs. You cannot have an odd number of servers in this topology.
Read the post to learn more about how (and why) he implemented a linked list topology.
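To make the contrast concrete, here is one way the two brick orderings could be generated. This is a sketch of the general idea under stated assumptions, not the linked post’s exact layout: the server names, brick paths, and function names are all hypothetical, and it relies on the fact that in a replica 2 volume each consecutive pair of bricks in the creation order forms a replica set.

```python
def paired_bricks(servers, path="/export/brick"):
    """Default layout: one brick per server, servers grouped in fixed pairs."""
    if len(servers) % 2:
        raise ValueError("paired layout needs an even number of servers")
    return [f"{server}:{path}" for server in servers]


def linked_list_bricks(servers, path_a="/export/brick-a", path_b="/export/brick-b"):
    """Chained layout: each server's brick-a is mirrored on the next server's
    brick-b, and the last server links back to the first."""
    if len(servers) < 2:
        raise ValueError("need at least two servers")
    order = []
    for i, server in enumerate(servers):
        neighbour = servers[(i + 1) % len(servers)]
        order += [f"{server}:{path_a}", f"{neighbour}:{path_b}"]
    return order


# Three servers work in the chained layout, each holding two bricks.
for brick in linked_list_bricks(["server1", "server2", "server3"]):
    print(brick)
```

Because every server holds two bricks and the chain wraps around, an odd number of servers is fine in the chained layout, whereas the paired layout rejects it.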