Running a Single CDH 5 Deployment on One or More GlusterFS Volumes
Given that the CDH 5 distribution includes components besides YARN and MapReduce,
I used the Apache Bigtop System Testing Framework to explicitly validate that Apache Sqoop, Apache Flume, Apache Pig, Apache Hive, Apache Oozie, Apache Mahout, Apache ZooKeeper, Apache Solr, and Apache HBase also ran successfully. Work is still in progress to enable the use of Impala.
If you would like to participate in accelerating the work on Impala, please reach out to us on the Gluster mailing list.
Implementation details for this solution and the specific setup required for each component are available on the glusterfs-hadoop project wiki. If you have additional questions, feel free to reach out to me on FreeNode (IRC handle jayunit100), on Twitter (@jayunit100), or via the Gluster mailing list.