The Gluster Blog

Automated Hadoop Deployment on GlusterFS with Apache Ambari

Gluster
2013-10-16

The glusterfs-hadoop team is pleased to announce that the Apache Ambari project now supports the automated deployment and configuration of Hadoop on top of GlusterFS.

What is Apache Ambari?

Apache Ambari is a browser-based Hadoop management web UI that is used to provision, manage and monitor Hadoop clusters. Once Apache Ambari is installed on a management server, a user can select a particular Hadoop stack to deploy on a group of servers. In addition, the user can specify which services within the stack they want deployed, as well as the appropriate configuration for each service. Up until now, Apache Ambari has only supported the automated deployment and configuration of Hortonworks Data Platform (HDP) stacks on top of the Hadoop Distributed FileSystem (HDFS).

Deploying HDP 1.3.2 on GlusterFS within Apache Ambari

Over the last several months, a number of engineers from Hortonworks and Red Hat have collaborated within the Apache Ambari Incubator project to modify the core HDP 1.3.2 stack so that users can choose either HDFS or GlusterFS. If a user selects GlusterFS, the Hadoop distribution is configured to use the Hadoop FileSystem plug-in for GlusterFS.

Prior to the GlusterFS support in Ambari, one had to separately download Apache Hadoop and configure it to use the glusterfs-hadoop Hadoop FileSystem plugin in order to get Hadoop to run on GlusterFS. All of these steps are now automated.
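To give a rough idea of what the manual configuration step involved, here is a minimal Python sketch that renders a Hadoop-style `core-site.xml` pointing the default file system at GlusterFS. The property names and values in `ASSUMED_PROPERTIES` are assumptions for illustration only; they vary between plugin versions, so check the glusterfs-hadoop project wiki for the settings that match your release. The helper `render_core_site` is a hypothetical name, not part of Ambari or the plugin.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: these property names and values are illustrative only.
# The exact keys depend on the glusterfs-hadoop plugin version in use.
ASSUMED_PROPERTIES = {
    "fs.default.name": "glusterfs:///",
    "fs.glusterfs.impl": "org.apache.hadoop.fs.glusterfs.GlusterFileSystem",
}


def render_core_site(properties):
    """Render a dict of properties as a Hadoop <configuration> XML string."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print the generated document; a real deployment would write this
    # to core-site.xml on every node, which Ambari now does for you.
    print(render_core_site(ASSUMED_PROPERTIES))
```

With the Ambari integration, this kind of per-node file editing is handled by the stack definition rather than by hand.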

Figure 1 – Users can select a stack that includes the GlusterFS Hadoop FileSystem

Figure 2 – Users can choose whether they want HDFS or GlusterFS as the Hadoop FileSystem

To take advantage of this new feature, please follow the instructions on the glusterfs-hadoop project wiki.

So what’s next?

The Apache Ambari project is currently re-architecting the stack definition so that Ambari stacks can be arbitrarily defined and extended. This should go a long way toward enabling broader support for Hadoop Compatible FileSystems and improving Hadoop interoperability.

Lastly, at the time of writing, Apache Ambari only works on RHEL, CentOS, OEL and SLES. We have therefore also been putting some time into getting Apache Ambari working on Fedora so that the Fedora community has access to it. This should also make integration with the existing glusterfs-hadoop and related projects a lot simpler.

 
