Shortly before VMware’s VSAN was released, I designed my new lab around GlusterFS running across 2 to 4 nodes on my Dell C6100. Since this server did not have a proper RAID card and had 4 nodes total, I needed to design something semi-redundant in case a host failed.
You have a few options for scaling this, the simplest being 2 nodes with GlusterFS replicating the data between them. This only requires 1 VM on each host with VMDKs or RDMs for storage, which is then shared back to the host via NFS as described later.
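For the simple 2-node case, the setup might look something like the sketch below: prepare a brick on each storage VM, then create a 2-way replicated volume. Device names and brick paths are illustrative; the brick paths follow the GFS-1/GFS-2 naming used later in this post.

```shell
# On each storage VM: format the attached VMDK/RDM and mount it as a brick.
# /dev/sdb is an assumption -- use whatever device the VMDK/RDM appears as.
mkfs.xfs /dev/sdb
mkdir -p /GFS-1/Disk1
mount /dev/sdb /GFS-1/Disk1

# From one node: join the peers and create a simple 2-way replicated volume.
gluster peer probe 172.16.0.22
gluster volume create DS-01 replica 2 \
    172.16.0.21:/GFS-1/Disk1 172.16.0.22:/GFS-2/Disk1
gluster volume start DS-01
```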
If you wish to scale beyond 2 nodes while keeping only 2 copies of the data instead of replicating it across all 4 nodes, you’ll just need to set up the volume as Distributed-Replicate; this keeps 2 copies of each file spread across the 4 or more hosts. What I mistakenly found out earlier was that if you use the same folder path across all the nodes, it replicates the data to all 4 of them instead of just 2. You can see a sample working layout below:
Volume Name: DS-01
Type: Distributed-Replicate
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 172.16.0.21:/GFS-1/Disk1
Brick2: 172.16.0.22:/GFS-2/Disk1
Brick3: 172.16.0.23:/GFS-3/Disk1
Brick4: 172.16.0.24:/GFS-4/Disk1
Options Reconfigured:
nfs.rpc-auth-allow: 192.168.1.1
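A layout like the one above could be created with something along these lines. With `replica 2`, bricks are consumed in order as replica pairs, so Brick1/Brick2 hold one copy of a given file and Brick3/Brick4 the other half of the namespace; note each node uses its own brick path (GFS-1 through GFS-4).

```shell
# Bricks are paired in the order listed: (Brick1,Brick2) form one replica
# set, (Brick3,Brick4) the other; files are distributed between the pairs.
gluster volume create DS-01 replica 2 transport tcp \
    172.16.0.21:/GFS-1/Disk1 172.16.0.22:/GFS-2/Disk1 \
    172.16.0.23:/GFS-3/Disk1 172.16.0.24:/GFS-4/Disk1
gluster volume start DS-01

# Restrict which clients may mount the built-in NFS export,
# matching the "Options Reconfigured" line above.
gluster volume set DS-01 nfs.rpc-auth-allow 192.168.1.1
```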
After trying several different methods of building a so-called FT (fault-tolerant) NFS server with tools like UCARP and Heartbeat, and failing, I thought about using a vSwitch with no uplink and assigning the same IP addresses on every host to its storage VMs. Since the data is replicated and the servers are aware of where the data is, it theoretically should be available wherever it’s needed. This also tricks vSphere into thinking the IP address is actually available across the network and that the storage is really “shared.”
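On each host’s storage-facing VM, the share-back might be sketched like this: the GlusterFS client VM mounts the volume over the FUSE client and re-exports it via NFS to the host’s VMkernel port on the uplink-less vSwitch. Mount points and export options here are assumptions, not the exact config I ran.

```shell
# On the GlusterFS client VM (the NFS server, 192.168.1.3):
# mount the volume from the local GlusterFS server VM (192.168.1.2).
mkdir -p /mnt/DS-01
mount -t glusterfs 192.168.1.2:/DS-01 /mnt/DS-01

# /etc/exports entry allowing the host's VMkernel port to mount it;
# fsid is required when re-exporting a FUSE mount over kernel NFS:
#   /mnt/DS-01 192.168.1.4(rw,no_root_squash,sync,fsid=1)
exportfs -ra
```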
Networking on a host ended up looking similar to this:
vSwitch0 — vmnic0
vSwitch1 — No Uplink
GlusterFS Server: 192.168.1.2
GlusterFS Client(NFS Server): 192.168.1.3
VMKernel port: 192.168.1.4
vSwitch2 — vmnic1
GlusterFS Replication: 172.16.0.x
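The uplink-less portion of that layout can be scripted from the ESXi shell roughly as follows; the portgroup name is illustrative, and the netmask is an assumption.

```shell
# Create the vSwitch with no uplink attached (simply never add one).
esxcli network vswitch standard add --vswitch-name=vSwitch1

# Portgroup for the host-local storage network.
esxcli network vswitch standard portgroup add \
    --portgroup-name=Storage --vswitch-name=vSwitch1

# VMkernel port the host will use to reach the local NFS server.
esxcli network ip interface add --interface-name=vmk1 \
    --portgroup-name=Storage
esxcli network ip interface ipv4 set --interface-name=vmk1 \
    --ipv4=192.168.1.4 --netmask=255.255.255.0 --type=static
```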
Then you can go ahead and setup a vSphere cluster and add the datastores with the same IP address across all hosts.
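Mounting the datastore on each host is then a one-liner; the share path below assumes the NFS server exports the volume as /DS-01.

```shell
# Same command on every host -- 192.168.1.3 resolves to that host's
# own local NFS server VM, but vSphere sees one "shared" datastore.
esxcli storage nfs add --host=192.168.1.3 --share=/DS-01 --volume-name=DS-01
```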
I will admit I did not have enough time to properly test things like performance before moving to VSAN, but what I did test worked. I was able to vMotion VMs across the hosts in this setup and validate HA failover on a hypervisor failure. There are obviously some design problems here: if one of the storage VMs has issues, storage breaks on that host. I had only designed this to handle a host failing, which I figured would be the failure I’d face most often.
Thoughts, concerns, ideas?