
Recommendations for replacing a GFS cluster?

5 votes
2 answers
6376 views
I have a couple of CentOS GFS clusters (GFS as in Global File System) using a shared disk in a Fibre Channel SAN. They are mature now, and the time has come to start planning for their replacement.

They have an odd number of nodes (3 or 5), with fencing of faulty nodes handled by APC (PDU) power switches. All nodes are active and read and write simultaneously on the same shared filesystem. The filesystem is small, currently less than a TB, and will never grow larger than would fit on a commodity hard drive. I have two exclusive IP address resources that relocate when a node goes down (one on the 3-node cluster). Everything works very well, but performance is not very good when there is a lot of activity.

So, what could I do differently in my next-generation cluster?

**What I need is service uptime and data availability.** Possibly scalability as well, but probably not; I don't expect the load to grow very much. I also need to be able to read and write the files like regular files on a regular filesystem. There is no need for quotas or ACLs, just regular Unix permissions, ownership, mtime, size in bytes, and the ability to use ln to make a lock file in a way that fails on all but one node, should they try it at the same time.

I don't want to increase the number of physical servers (which means I want to use the storage on the actual servers themselves).

It's not mandatory, but I think it would be good if I weren't dependent on the shared disk. I've been through two incidents of enterprise-class SAN storage being unavailable in the last 5 years, so however improbable that is, I'd like to be one step ahead.

Since uptime is very important, one physical server with one running kernel is too little. Virtual machines are dependent on the SAN in our environment.

My thoughts so far:

* All nodes could be plain NFSv3 clients (would ln work the way I expect? And what would be the NFS server then?)
* [Ceph](http://ceph.com/) with CephFS (when will the FS be production ready?)
* [XtreemFS](http://www.xtreemfs.org/index.php) (why is there so little written about it compared to Ceph?)

As you can see, I'm interested in distributed storage, but I need advice from experienced gurus. Recommendations or advice about Ceph or XtreemFS would be especially welcome. This is not HPC with insane bandwidth demands; I just need the availability, reliability, and hopefully flexibility of my old solution, ideally in a "better" configuration than the current one.

**EDIT** (see Nils' comment): The main reason I'm thinking about replacing this solution is that I want to see whether it is possible to eliminate the single point of failure that the SAN storage cabinet is. Or should I instead use LVM mirroring to keep the data on two different storage systems in the same SAN fabric? Two FC HBAs and dual switches should be enough, I think.
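For reference, this is the ln-based locking pattern I rely on, as a minimal sketch: it depends only on link() being atomic on the shared filesystem (which holds for GFS/GFS2 and for NFSv3 servers, unlike O_EXCL creation over old NFSv2 clients). The paths and names below are made up for illustration.

```bash
#!/bin/sh
# Hard-link locking sketch; LOCKDIR lives on the shared filesystem.
# Only one node can win the link() race, the others get EEXIST.
LOCKDIR=/shared/locks                      # assumed path
LOCKFILE=$LOCKDIR/myservice.lock           # assumed name
TMPFILE=$LOCKDIR/myservice.$(hostname).$$

echo "$(hostname) pid $$" > "$TMPFILE"

# ln without -s creates a hard link and fails if the target already
# exists, even when several nodes attempt it at the same instant.
if ln "$TMPFILE" "$LOCKFILE" 2>/dev/null; then
    echo "lock acquired on $(hostname)"
    # ... critical section ...
    rm -f "$LOCKFILE"
else
    echo "another node holds the lock"
fi
rm -f "$TMPFILE"
```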
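And here is a rough sketch of the LVM mirroring alternative from the EDIT, assuming two multipathed LUNs, one from each storage cabinet in the same FC fabric (the device names, volume sizes, and cluster name are placeholders, not our actual configuration):

```bash
# One LUN from each cabinet, seen through multipath.
pvcreate /dev/mapper/cabinetA_lun0 /dev/mapper/cabinetB_lun0
vgcreate vg_gfs /dev/mapper/cabinetA_lun0 /dev/mapper/cabinetB_lun0

# RAID1-type LV keeps a full copy of every extent on each PV, so either
# cabinet can fail without taking the logical volume down.
lvcreate --type raid1 -m 1 -L 900G -n lv_gfs vg_gfs \
    /dev/mapper/cabinetA_lun0 /dev/mapper/cabinetB_lun0

# Keep GFS2 on top, one journal per node (5-node cluster assumed here).
mkfs.gfs2 -p lock_dlm -t mycluster:gfs_data -j 5 /dev/vg_gfs/lv_gfs
```

This removes the cabinet as a single point of failure but keeps the SAN fabric and the cluster stack itself, which is the trade-off I'm asking about.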
Asked by MattBianco (3806 rep)
Jan 14, 2014, 03:59 PM
Last activity: Mar 20, 2019, 12:29 PM