To Cluster with Lustre … or not?
Recently I was tasked with investigating the feasibility of using Lustre (a clustered file system typically used in supercomputing environments) for a solution at my employer. Essentially, this system requires a few high-end components to achieve considerable throughput. I’ll attempt to outline the pros and cons of using Lustre in a production environment.
Lustre is a cluster-aware file system originally designed by Cluster File Systems, Inc., which was recently acquired by Sun Microsystems. Lustre was designed to be a highly scalable, high-performance file system/cluster solution. The system consists of a few key components at its core.
Picking a clustering file system such as Lustre should be driven by need. These systems are inherently more complex, and that complexity can make them more prone to failure. Using Lustre makes sense if you’re looking for a scalable storage solution that can expand across thousands of storage nodes and high performance is a requirement. Most business problems do not need a solution of this magnitude. I guess now would be a good time to cover the terminology used with Lustre.
Lustre has some key terms we’ll need to know while reading this short paper.
- MGS – Management Server (there is one management server per site; it holds the configuration details for all Lustre clusters at that site)
- MDT – Metadata Target (the storage, served by a metadata server [or pair of servers], that holds the metadata describing where files are stored)
- OST – Object Storage Target (this is where the data is actually stored and striped across)
- Lustre Clients (these clients are typically *nix variants)
Now that we have the terminology out of the way I’ll describe how it works (just a high-level overview).
We’ve reviewed the components of the Lustre configuration above. A Lustre MGS stores all configuration data needed for a site. The MDT stores the metadata that records where files are located (pointers to OSTs), and the OSTs provide the physical storage for the objects (files) themselves.
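To make the division of labor concrete, here is a toy in-memory model of the roles described above. It is only a sketch under simplifying assumptions: `ToyCluster`, its methods, and the single shared `object_id` per file are hypothetical names invented for illustration, not Lustre's API, and the MGS is left out because it only holds configuration.

```python
class ToyCluster:
    """Toy model: an MDT dict maps paths to objects, OST dicts hold the bytes."""

    def __init__(self, ost_count, stripe_size):
        self.stripe_size = stripe_size
        # Each OST is modeled as a dict: object_id -> bytearray of stored data.
        self.osts = [dict() for _ in range(ost_count)]
        # The MDT is modeled as a dict: path -> (object_id, file_size).
        self.mdt = {}
        self.next_id = 0

    def write(self, path, data):
        object_id = self.next_id
        self.next_id += 1
        for ost in self.osts:
            ost[object_id] = bytearray()
        # Hand out fixed-size chunks to the OSTs in round-robin order.
        for start in range(0, len(data), self.stripe_size):
            ost_index = (start // self.stripe_size) % len(self.osts)
            self.osts[ost_index][object_id] += data[start:start + self.stripe_size]
        # Only the pointer and size go to the MDT, never the file data.
        self.mdt[path] = (object_id, len(data))

    def read(self, path):
        # Step 1: ask the MDT where the file lives.
        object_id, size = self.mdt[path]
        # Step 2: pull the chunks back from the OSTs in the same order.
        out = bytearray()
        cursors = [0] * len(self.osts)
        stripe = 0
        while len(out) < size:
            ost_index = stripe % len(self.osts)
            begin = cursors[ost_index]
            chunk = self.osts[ost_index][object_id][begin:begin + self.stripe_size]
            out += chunk
            cursors[ost_index] += len(chunk)
            stripe += 1
        return bytes(out)


cluster = ToyCluster(ost_count=3, stripe_size=4)
cluster.write("/demo/file", b"abcdefghijklmnop")
print(cluster.read("/demo/file"))
```

The point of the model is that the metadata lookup and the data transfer are separate steps, which is what lets the data path spread across many storage servers.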
The key benefits are scalability and performance. Performance is achieved by striping data across the available OSTs, so a single file can be read from or written to many storage targets at once. This is what makes Lustre shine. Consequently, you’ll need equipment that can support that level of speed.
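The striping idea can be illustrated with a little arithmetic. The sketch below assumes a simple round-robin (RAID-0 style) layout with a fixed stripe size; the function name `locate` and its parameters are hypothetical and chosen for illustration, and a real deployment's layout options may differ.

```python
MIB = 1024 * 1024


def locate(offset, stripe_size, stripe_count):
    """Map a byte offset in a file to (ost_index, offset_within_that_ost_object).

    Assumes stripes are handed out to OSTs in simple round-robin order.
    """
    if offset < 0 or stripe_size <= 0 or stripe_count <= 0:
        raise ValueError("offset must be >= 0; stripe_size and stripe_count must be > 0")
    stripe_number = offset // stripe_size          # which stripe of the file
    ost_index = stripe_number % stripe_count       # which OST holds that stripe
    round_number = stripe_number // stripe_count   # full passes over all OSTs so far
    object_offset = round_number * stripe_size + offset % stripe_size
    return ost_index, object_offset


# With 1 MiB stripes over 4 OSTs, consecutive megabytes land on different OSTs.
for megabyte in range(6):
    print(megabyte, locate(megabyte * MIB, MIB, 4))
```

Because neighboring stripes sit on different OSTs, a large sequential read can keep several storage targets busy at the same time.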
Lustre uses its own network drivers to facilitate network communication between nodes. Currently Lustre supports TCP/IP, Elan, InfiniBand, Myrinet and others.
The network is where most decisions are made. I suppose Lustre on a 1G network would perform (provided your switching backplane is great), but it also depends on how many clients you have accessing this array of machines. It’s recommended to use a higher-speed communication medium such as 10G or InfiniBand.
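A rough back-of-the-envelope calculation shows why the link speed matters. The numbers below are hypothetical (four storage servers, twenty clients), and the figures are theoretical ceilings that ignore protocol overhead, disk speed, and switch contention.

```python
def link_mb_per_s(gbit_per_s):
    """Theoretical ceiling of a link in MB/s (decimal units, no protocol overhead)."""
    return gbit_per_s * 1000 / 8


def per_client_mb_per_s(server_links, link_gbit, clients):
    """Even share of the aggregate server-side bandwidth when all clients are busy."""
    return server_links * link_mb_per_s(link_gbit) / clients


# Hypothetical site: 4 storage servers with one link each, 20 active clients.
print(per_client_mb_per_s(server_links=4, link_gbit=1, clients=20))   # 1G links
print(per_client_mb_per_s(server_links=4, link_gbit=10, clients=20))  # 10G links
```

Under these assumptions each client gets at most 25 MB/s on 1G links versus 250 MB/s on 10G links, before any real-world overhead is subtracted.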
The bottom line
The bottom line is very simple. If you need a system that is highly scalable, high-performance and very reliable, pick Lustre. Remember that to gain any considerable speed you will need a considerable investment in the network arena. Lustre is not widely deployed in most hosting enterprises, but it could serve as a good storage back end for a cluster of web servers (since Lustre supports reading and writing the same file at the same time from different machines).
* Image provided by Cluster File Systems, Inc.