June 29, 2012

Healing Split Brain

As people who attended my recent Red Hat Summit talk are aware, one of the big issues with GlusterFS replication is “split brain”, which occurs when conflicting updates are made to replicas of a file. Self-healing in the wrong direction risks data loss, so we won’t self-heal if we detect split brain, and we’re very conservative: some conditions that might actually be resolvable still get identified as split brain. Unfortunately, this can lead to situations in which we won’t self-heal at all and the file remains inaccessible until the administrator manually resolves the split brain. Even more unfortunately, manual resolution means poring through logs and manually mapping from subvolume names to physical storage locations. Most unfortunately of all, the last step of manual resolution involves doing exactly what we generally tell people not to do and will soon forbid: modifying the back-end data directly on the servers.

Clearly, the best approach to split brain is to prevent it, for example by enabling the quorum enforcement feature that I implemented a while ago. However, the conditions that can cause split brain are not nearly as rare as we would like them to be, and a little help with manual resolution can go a long way. That’s where my new script comes in. At the very least, it will do some of the drudge work of parsing configurations, fetching extended attributes, and so on for the files you tell it to heal. If it still can’t heal a file, it will at least tell you why, in something approximating human language and without requiring you to search through every log file. That’s not all, though. It also uses algorithms that are a little different from the ones in the regular self-heal, so it can recognize and correct some additional conditions:

  • In its “aggressive” mode, it will resolve some “wise fool” and “two fool” conditions that standard self-heal will give up on, if the pending-operation counters give us good reason to believe that some “accusations” should be withdrawn or reversed. (See my article on replication internals for explanations of these strange terms.) This can break some accusation loops that cause us to declare split brain.
  • Regardless of aggressive vs. normal mode, it will detect when file contents are identical and clear the pending-operation counts so that the file becomes accessible again. This is kind of a last-ditch attempt to get the data unblocked, after all of our other methods have failed.
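The identical-contents check in the second bullet can be sketched roughly as follows. This is only an illustration, not the script's actual code: the function names `file_digest` and `replicas_identical` are made up for this example, and it assumes each replica's copy of the file is reachable as an ordinary path (for instance through a per-brick mount). Clearing the pending-operation counts afterwards is deliberately not shown.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def replicas_identical(paths):
    """True only if at least one path is given and all have equal contents."""
    digests = {file_digest(p) for p in paths}
    # An empty list yields an empty set, which is reported as "not identical".
    return len(digests) == 1
```

Comparing digests rather than raw bytes keeps the comparison cheap when there are more than two replicas, at the cost of reading every copy in full.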

Obviously, more aggressive self-heal means higher potential for data loss if we make the wrong decision. That’s why I wrote it to look only at files you specify, instead of doing a full scan. That’s why I went a little further than usual in writing tests for it. Think of it the same way as you would think of a wipe and restore from tape, when regular self-heal has definitely failed and regaining access to the file is critical even if the version you end up with is slightly out of date. It’s certainly not supported in any way by Red Hat, and my colleagues would be within their rights to disavow or even condemn it.

The script is designed to be run offline and on a server, though it can run online and on a client (so long as that client has the gluster CLI installed). You’ll need everything in the GitHub directory I linked to above, and then you’d run the script with arguments like this: myvol server1:/export/sdd path/to/broken/file another/broken/file

The second argument can be the path (e.g. from “gluster volume info” or the trusted.glusterfs.pathinfo synthetic extended attribute) of any brick where the affected files reside, and you can specify as many files as you like. The script will then mount all of the bricks containing replicas, use those mounts to fetch the pending-operation counts on all replicas, and try to figure out what kind of repair to do. If you’re having problems with split brain, it’s one more thing you can try before you go poking around on the back-end storage or give up entirely. Given the inherent complexity of the problems it’s trying to solve, though, I can’t guarantee that it will fix your particular split-brain problem. Good luck.
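For readers curious what “fetching the pending-operation counts” looks like, here is a minimal sketch. It rests on an assumption that is not stated in this post: that each replica's changelog extended attribute (named along the lines of trusted.afr.myvol-client-0) holds 12 bytes, read as three big-endian 32-bit counters for data, metadata, and entry operations. The helper names `decode_pending` and `has_accusation` are hypothetical, and this is not the script's own code.

```python
import struct


def decode_pending(raw):
    """Decode a changelog value into (data, metadata, entry) counters.

    Assumes the value is exactly 12 bytes: three big-endian uint32s.
    """
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    return struct.unpack(">III", raw)


def has_accusation(raw):
    """True if any of the three pending counters is non-zero."""
    return any(decode_pending(raw))
```

On Linux with Python 3.3 or later, the raw bytes could be read with something like `os.getxattr(path_on_brick, "trusted.afr.myvol-client-0")`, which generally needs root because the attribute is in the trusted namespace. The attribute name here is an example, not something to copy verbatim.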

1 Comment

  1. ch says:

    What version(s) of glusterfs should these scripts work against? I am running 3.2.6 and I get RuntimeError “text outside volume definition” when trying to use the script as described.
