all posts tagged Rants & Raves


October 22, 2013

‘Software Defined Storage’ and ‘Open’: What the heck are they?

A few weeks ago, "Software Defined Storage" and "Open" were all the news in the "cloud" industry as EMC announced they had an "Open" "Software Defined Storage" solution. I heard the news and rolled my eyes. Yeah, right... but I was busy with real-life things and didn't even have time to read the announcement, much less any of the buzz around it.

Until today...

I've read all the buzzwords and hype, and still I'm not entirely sure what they're offering. One keyword that's missing from everything is "POSIX", an acronym for "Portable Operating System Interface": an IEEE standard that defines an open, operating-system-independent API for filesystem interaction. It does appear that they're offering yet another object-based API, plus NFS, CIFS, and iSCSI, as well as S3 and Swift compatibility, but it's unclear from their documentation whether those interfaces are north-facing or just south-facing.

What is "Software Defined Storage"?

Enterprise Management Associates, Inc., whose paper reads as if they were a paid shill for EMC, would like SDS defined in such a way that their definition aligns almost perfectly with EMC's product announcement (http://www.emc.com/collateral/analyst-reports/ema-emc-vipr.pdf). Even Wikipedia seems to have followed the commercial shill line, leaving true software defined storage at the bottom of a bullet list.

Software defined storage is any storage where the logic that defines that storage is abstracted into a software layer. Using that same software layer, therefore, it should be possible to define the physical storage in multiple ways. SDS is a tool or set of tools that allows you to use software to design a storage system that best fits your use case.
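GlusterFS itself is a handy illustration of that definition: the same physical disks can be defined as replicated or purely distributed storage entirely in software. A minimal sketch, assuming two servers named server1 and server2 with bricks at /export/brick1 (the volume name is made up):

# gluster volume create myvol replica 2 server1:/export/brick1 server2:/export/brick1
# gluster volume start myvol

Drop the "replica 2" and the very same bricks become a distributed volume instead. The physical storage is whatever the software layer says it is; that's software defined storage.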

What is "Open"?

When I first got into the computer industry, there were software producers and consumers, and there was a huge paywall between the two. As a consumer of software, there was nothing open about it. Some features were documented, but most were not. File structures, interfaces, memory models, even the user interfaces were proprietary, and they would be changed between releases to ensure that any competitor's product that had managed to decipher them and integrate with them would break. Big software vendors really had no interest in a little beauty-supply distributor operating in just three states. Their focus was on the Fortune 500, and if you wanted to report a bug, you were welcome to, but it wasn't likely to have any resources applied to it unless you were someone like Boeing.

Today, with open source, anyone can become involved in the production cycle, even without programming skills. I hang out on IRC and, in my spare time, help people understand GlusterFS. I look at industry trends and communicate directly with the developers who produce the code. Together we brainstorm over new ideas or bugs. We, the consumers and the producers, are all part of a community. That's not just some buzzword; it's actual interaction. Some of the developers have become my friends. This is "open".

"Open" is a breakdown of the barrier between the producers and the consumers. "Open" is where the person (or company) with the problem to solve is involved in defining the solution to that problem. That's true for open source software and also open standards.

One company, be it EMC, Microsoft, VMware, or Amazon, telling you what an "open" solution is, is not open. That's proprietary by its very nature.

EMC's product has a pretty GUI, and it may indeed be software defined storage. It is not, however, "open".

June 26, 2013

Fedora 19 with legacy GlusterFS 3.3

I went into this expecting problems. Fedora 19 ships with GlusterFS 3.4.0 beta and I'm using GlusterFS 3.3 in production. I expected that I would have to downgrade my Fedora packages so I could use my volumes. I expected problems.

What I didn't expect was that my wishes for the last three releases would be realized in 3.4. RPC compatibility!

I started futzing around with downgrading and ran into problems... It doesn't really matter what those problems were, because the whole exercise turned out to be a complete waste of time.

I reinstalled the latest beta packages and mounted my volumes.

I was able to join the peer group (my Fedora box is a member for CLI convenience only).
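For the curious, the whole exercise boiled down to something like this (a sketch; the host and volume names here are made up):

# mount -t glusterfs server1:/myvolume /mnt/myvolume
# gluster peer status

That's a 3.4.0 beta client and CLI talking to 3.3 servers.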

Everything just worked.

Happy Dance

May 6, 2013

PHP playing fast and loose with your data integrity

A potential GlusterFS user stated that the filesystem incorrectly reported a successful write even though all the servers were powered off. Since this sounded rather impossible, I asked for details and duplicated the problem. This is the PHP code:

<?php
// Open (or create) the file for writing without truncating it.
$fp = fopen("myfootest", "c");
if ($fp === false) die("Open failed.");
if (fwrite($fp, "Test1\n") === false) die("Write failed.");
echo "First write completed. Turn off servers and press enter.";
fgets(STDIN);
if (fwrite($fp, "Test2\n") === false) die("Write failed.");
// fclose() should return false if the underlying close() fails.
if (fclose($fp) !== false) {
    echo "File closed without error.\n";
} else {
    echo "Error closing file\n";
}
?>

After opening and writing to the file once, we wait for input and pull the plug on the servers (I actually just kill -9 the glusterfsd and glusterd processes). Then we write to the file again and close it. We expect that either the fwrite or, at the very least, the fclose will fail. Unfortunately: "File closed without error."

How can this be? Can we really be accepting this write even though it can't be written?

An strace, however, reveals what's actually happening. Here's what I see when I trace this code:

open("/mnt/myfootest", O_WRONLY|O_CREAT, 0666) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=6, ...}) = 0
lseek(3, 0, SEEK_CUR)                   = 0
write(3, "Test1\n", 6)                  = 6
write(1, "First write completed. Turn off "..., 56) = 56
read(0, "\n", 8192)                     = 1
write(3, "Test2\n", 6)                  = 6
close(3)                                = -1 ENOTCONN (Transport endpoint is not connected)
write(1, "File closed without error.\n", 27) = 27

The writes don't fail because they're being cached. The fclose, however, does get an error back from close() but doesn't pass that error along to the script. I believe that is this bug: https://bugs.php.net/bug.php?id=60110
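Until that bug is fixed, about the only defense I can see from the PHP side is to verify the data after the fact. A rough sketch (a sanity check, not a fix; it assumes the same test file as above):

<?php
// fclose() lied to us, so re-read the file and confirm both
// writes actually made it to the servers.
$expected = "Test1\nTest2\n";
if (@file_get_contents("myfootest") !== $expected) {
    die("Write was lost even though fclose() reported success.\n");
}
?>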

August 15, 2012

GlusterFS bit by ext4 structure change

On Sunday, March 18th, Fan Yong committed a patch against ext4 to "return 32/64-bit dir name hash according to usage type". Prior to that, ext2/3/4 would return a 32-bit hash value from telldir()/seekdir(), since NFSv2 wasn't designed to accommodate anything larger. This broke the distribute translator, as the dirent structure suddenly contained 64-bit d_off values. When DHT (the Distributed Hash Table translator) applied dht_itransform() to those values, it would overflow. Since the directory entry did not have a cached offset, it would try to create one again and end up in an endless loop.

That patch was for kernel v3.3-rc2. To make things more fun, Jarod Wilson merged that patch into 2.6.32-268.el6 (found via "rpm -q --changelog kernel | less"). My personal feeling is that structure changes shouldn't be backported into Enterprise kernels. This has caused a lot of frustrated users on the IRC channel. Most have just reformatted with XFS, which is a valid solution and falls in line with the officially recommended configuration. For some, however, that's just not possible.
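If you're not sure whether your kernel carries the backport, that changelog is a quick (if rough) check; the grep pattern here is just my guess at matching the commit summary:

# rpm -q --changelog kernel | grep -i "dir name hash"

A match suggests your ext4 can hand out 64-bit d_off values, putting GlusterFS bricks on ext4 at risk.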

Distributions known to be affected by this change are:

  • Fedora >= 17
  • Red Hat Enterprise Linux (RHEL) 6.3
  • CentOS 6.3
  • Debian Sid
  • Debian Wheezy

The workaround is to either reformat your bricks as XFS, or downgrade your kernel: to 2.6.32-267 for RHEL/CentOS, or to 3.2.9 for everybody else.
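If you take the reformat route, the officially recommended brick layout is XFS with a 512-byte inode size, which leaves room for GlusterFS's extended attributes. A minimal sketch, assuming a brick device of /dev/sdb1 mounted at /export/brick1 (both names hypothetical):

# mkfs.xfs -i size=512 /dev/sdb1
# mount /dev/sdb1 /export/brick1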

The patches that are related to this issue can be tracked at http://review.gluster.com/

UPDATE 2012-08-17 04:02 GMT

Spoke briefly with Vijay 'hagarth' Bellur, one of the lead developers, who said, "there are some problems getting NFS and ext3/4 to work with this patch .. hence it is sitting in the queue."

It is still being actively worked on, though, and is a high priority.

September 4, 2011

Not the kind of support I give…

I had a problem where /lib/dri/r600_dri.so was left behind on an x86_64 system during an upgrade from F14 to F15 using the Anaconda DVD upgrade. Not noticing this was a 32-bit library, when I couldn't figure out why the error message referenced a line number that didn't correlate with the source code, I played around with rawhide packages to see if that changed anything. When it didn't, I asked a question on #fedora. Now, I know that both #fedora and #centos are run by people with no manners and god complexes, but I did it anyway.

fenrus02, after telling me off for mixing distro versions despite the fact that my question had already explained that the version wasn't even being read, told me that his post-upgrade page would fix this problem and insisted, despite evidence to the contrary, that I was wrong.

So let's analyze his method and see where he expected this problem would have become evident (he won't say how he expected this to work).

0. Change runlevel to 3 ( http://fedorasolved.org/post-install-solutions/runlevel ) (Use the grub method) and login as root.  You can use "cnetworkmanager" or "nmcli" if you need to get networking started without the GUI.

Irrelevant. This has nothing to do with finding 32-bit libraries that have no rpmdb package data.
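For contrast, here's what actually identifying the stray library looks like; rpm will happily tell you when a file is unowned:

# rpm -qf /lib/dri/r600_dri.so
file /lib/dri/r600_dri.so is not owned by any package

From there, installing mesa-dri-drivers.i686 (or simply deleting the stale file) would presumably have taken care of it.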

1. Update your system:

# rm /var/lib/rpm/__db.00?;

Delete lock files. Won't find files in /lib/dri that don't have Packages

# yum clean all;

Delete yum cache. Won't find files in /lib/dri that don't have Packages

# yum-complete-transaction;

If there were any pending yum installs, perhaps mesa-dri-drivers.i686, then this could work. Since there weren't, no.

# yum update --skip-broken;

Since mesa-dri-drivers.i686 wasn't installed, this update will not affect that file.

# rpm -a --setugids; rpm -a --setperms;

Resetting file ownership and permissions will, again, not update library files that aren't packaged. In fact, since they're not in the database, even their ugids and permissions won't get reset.

# yum install @core @base;

mesa-dri-drivers.i686 is in neither @core nor @base.

2. (Optional) Remove old packages from cache directories

# DIST=$(rpm --eval '%{dist}'); find /var/cache/yum/ -type f -name \*.rpm |grep -v $DIST |xargs rm -f;

Delete old yum cache files from other distributions. Won't fix the library issue.

If you have yum-plugin-local installed, you will want to free up the space it used:

# DIST=$(rpm --eval '%{dist}'); find /var/lib/yum/plugins/local/ -type f -name \*.rpm |grep -v $DIST |xargs rm -f;

Deleting these files is also not going to affect /lib/dri/*.

3. (Optional) Install basic components you would have from a new install:  You likely want to include the desktop of your choice as well, such as @gnome-desktop or @kde-desktop or @xfce-desktop or @lxde-desktop

# yum install @base-x @base @core @fonts @input-methods @admin-tools @dial-up @hardware-support @printing fpaste memtest86+ @gnome-desktop;

mesa-dri-drivers.i686 is not in any of those groups.

4. Correct labels and reboot:  (This command takes about 11 minutes to run on my hardware.  Yours may be quicker or slower.  Give it time to complete.)

# fixfiles -R -a restore; reboot;

The above command only does the files yum/rpm installed. If you would rather relabel all files on the system, use this instead: (Note: It may take longer on reboot)

# fixfiles onboot; reboot;

Resetting the SELinux context is also not going to update the i686 version of those libraries.

5. Newer versions of yum include this command.  If your version does not yet support it, skip this step:  (yum-3.2.28-1 is known to work)

# yum distribution-synchronization --disablepresto;

distro-sync does not perform operations on "local packages", which may be what these files are considered, since they have no package. The yum Python scripts don't do anything that would look at an existing directory and install a package to manage it. This will have no effect.

6. Login again with runlevel 3 and as root.  Install yum-utils and print out a list of all the packages that need review:  (This will print out packages that have dependency problems as well as packages that are no longer found in your configured repos and any duplicate packages you might have.)

# rpm -Va --nofiles --nodigest;

The package doesn't exist in the rpm database, so this will have no effect.

# yum install yum-utils;

Obviously installs yum-utils, no effect.

# package-cleanup --problems;

package-cleanup works through the rpm database package table, so it will not even look at the dri libraries.

# package-cleanup --orphans;

ditto

# package-cleanup --dupes;

ditto

(optionally locate packages you may not need any longer)
# package-cleanup --leaves;

ditto

7. In most cases, you should remove the versions listed above and install the current distribution version of the package instead if you still use it.

8. Create a list of all the files that have verify concerns:

# . /etc/sysconfig/prelink && /usr/sbin/prelink -av $PRELINK_OPTS >> /var/log/prelink/prelink.log 2>&1;

This is where I'm guessing he thought it would show up. If something had been linked against a library that wasn't there, I could see getting an error about a missing library. But since the file was present, there are no errors associated with the mesa-dri-drivers. No effect.

# /sbin/ldconfig;
# rpm -Va > /tmp/rpm-Va.txt 2>&1;

I don't care how many times you walk the rpm database, if the package isn't there, it's not going to check the files that aren't associated with it. No effect.

9. Using the above, create a list of non-configuration files that need review:

# egrep -v '^.{9} c /' /tmp/rpm-Va.txt > /tmp/URGENT-REVIEW.txt;

Won't be here for the reason stated above.

10. Using the above, create a list of configuration files that need review:

# egrep '^.{9} c /' /tmp/rpm-Va.txt > /tmp/REVIEW-CONFIGS.txt;

Won't be here for the reason stated above.

11. Review the lists above. Consult "man rpm" under the VERIFY section for the meaning of the first column.  You can usually ignore lines if they have a "prelink: /usr/bin/somefile: at least one of file's dependencies has changed since prelinking" type message next to it.

12. Locate your changed config files and manually merge the changes (If you have yum-plugin-merge-conf installed, you can use it here to assist as well):

# yum install rpmconf; rpmconf -a; 

rpmconf walks the rpm database's Packages table. The file still isn't in there, so this will have no effect.

# find /etc /var -name '*.rpm?*' > /tmp/REVIEW-OBSOLETE-CONFIGS.txt;

It's not in /etc or /var, so that's irrelevant right off the bat.

Summary

I was told several times that this was the end-all-be-all answer to the problem I asked about. I argued, as it's clear from reading through the list of commands that it would have no effect. When I was chastised and told that I could "decide to live with misery for the next few months while you sort it out yourself of course", I was offended.

DiscordianUK chimed in later saying that F17 packages were supported in fedora-qa, which, of course, had nothing to do with my question. And I was accused of acting "entitled" by EvilBob. I don't feel it's inappropriate to feel entitled to common courtesy.

If you feel that someone has done something foolish, but that foolish thing isn't what he's asking about, you can be rude and hostile, you can ignore the question, or you can actually read what was asked and respond to it. I don't feel rude and hostile is worth anybody's time.

January 19, 2010

Location Finder 0.9.2beta1

Over 300 downloads so far, and not one bit of feedback. If you download the beta and it fixes the problem where you search for a location (that you know is there) and nothing comes up, let me know! I need your feedback!

Thanks.