Posted on October 16, 2013

Automated Hadoop Deployment on GlusterFS with Apache Ambari

The glusterfs-hadoop team is pleased to announce that the Apache Ambari project now supports the automated deployment and configuration of Hadoop on top of GlusterFS.

What is Apache Ambari?

Apache Ambari is a browser-based Hadoop management web UI used to provision, manage and monitor Hadoop clusters. Once Apache Ambari is installed on a management server, a user can select a particular Hadoop stack to deploy on a group of servers. The user can also specify which services within the stack to deploy, along with the appropriate configuration for each service. Until now, Apache Ambari has only supported the automated deployment and configuration of Hortonworks Data Platform (HDP) stacks on top of the Hadoop Distributed FileSystem (HDFS).

Deploying HDP 1.3.2 on GlusterFS within Apache Ambari

Over the last several months a number of engineers from Hortonworks and Red Hat have collaborated within the Apache Ambari Incubator project to modify the core HDP 1.3.2 stack to provide users the choice of either HDFS or GlusterFS. Should one select GlusterFS, the Hadoop distribution is then configured to use the Hadoop FileSystem plug-in for GlusterFS.

Prior to the GlusterFS support in Ambari, one had to separately download Apache Hadoop and configure it to use the glusterfs-hadoop Hadoop FileSystem plugin in order to get Hadoop to run on GlusterFS. All of these steps are now automated.
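To give a rough idea of the manual configuration that Ambari now takes care of, here is a minimal Python sketch that renders the kind of core-site.xml entries a hand-built setup would need. The property names, the plugin class name and the mount point are assumptions made for illustration, not values taken from this post; please check them against the glusterfs-hadoop project wiki for the plugin version you install. The helper build_core_site is a hypothetical name, not part of Ambari or the plugin.

```python
import xml.etree.ElementTree as ET


def build_core_site(properties):
    """Render a dict of Hadoop properties as core-site.xml text."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Assumed property names and values (not from the original post);
# verify them against the glusterfs-hadoop wiki before using them.
GLUSTERFS_PROPERTIES = {
    # Maps the glusterfs:// scheme to the plugin's FileSystem class.
    "fs.glusterfs.impl": "org.apache.hadoop.fs.glusterfs.GlusterFileSystem",
    # Makes GlusterFS the default file system instead of HDFS (Hadoop 1.x key).
    "fs.default.name": "glusterfs:///",
    # Hypothetical local mount point of the Gluster volume.
    "fs.glusterfs.mount": "/mnt/glusterfs",
}

if __name__ == "__main__":
    print(build_core_site(GLUSTERFS_PROPERTIES))
```

Running the script prints a single configuration element whose property entries could be merged into an existing core-site.xml. With Ambari, the equivalent settings are applied for you when GlusterFS is selected as the file system.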

Figure 1 – Users can select a stack that includes the GlusterFS Hadoop FileSystem


Figure 2 – Users can choose whether they want HDFS or GlusterFS as the Hadoop FileSystem


To take advantage of this new feature, please follow the instructions on the glusterfs-hadoop project wiki.

So what’s next?

The Apache Ambari project is currently re-architecting the stack definition so that Ambari stacks can be arbitrarily defined and extended. This should go a long way toward enabling broader support for Hadoop Compatible FileSystems and improving Hadoop interoperability.

Lastly, at the time of writing, Apache Ambari only works on RHEL, CentOS, OEL and SLES. We've therefore also been putting some time into getting Apache Ambari working on Fedora so that the Fedora community has access to it. This should also make integration with the existing glusterfs-hadoop and related projects a lot simpler.

 


2 Comments

  1. Robin Goldstone says:

    Will this be supported on HDP 2.0 any time soon? Thanks.

  2. Erin Boyd says:

    Yes. Within the next few months. Though Ambari isn’t supported for 2.0, the plugin is. Please see these instructions for testing out HDP 2.0 with GlusterFS https://forge.gluster.org/hadoop/pages/Configuration
