What is BIG DATA? 1PB … 16TB … 32GB? Honestly, it means something different to everyone, but no matter what it means to you, the main goal is to make managing it simple and easy. Isilon could be just the solution you are looking for if your current data management platform or strategy is becoming a challenge.
What is Isilon
So let me start by saying that Isilon is a scale-out NAS solution, meaning that as you need to expand your storage you simply buy more nodes and stack them together (think LEGOs). It is very much like the HP P4000 LeftHand solution, except that the P4000 is block-level storage, while Isilon is primarily file-level storage with some block storage features.
Here is a Visio type diagram of the Isilon Architecture:
Why Isilon is awesome
The beauty of Isilon is its OneFS… and it means just what it says … there is only one file system that spans all nodes in the cluster, and it manages everything: the RAID, the volumes, and the file system. This is how Isilon scales so easily with very little management. As you add nodes to the cluster, no configuration is needed for the increase in capacity to be realized. Your data is also automatically load balanced across a back-end network (InfiniBand) so that all data is distributed evenly. OneFS also looks for hot data and spreads it out among nodes so that no one node is working harder than the others. The only configuration decision you need to make with Isilon is how much protection you need. To do that, you simply pick how many nodes in your cluster can be lost and how many drives per node you are willing to lose. After that, the OneFS magic kicks in and does the rest.
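To make the "add a node, get more capacity" idea concrete, here is a toy Python model. It is only a sketch and not how OneFS is actually implemented: the class name `ToyCluster`, the 36TB node size, and the "spread everything evenly" rebalance are all assumptions made up for illustration, and protection overhead is ignored entirely.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Toy model of a scale-out pool; NOT the real OneFS layout logic."""
    node_capacity_tb: float                      # assumption: identical nodes
    used_tb: list = field(default_factory=list)  # data held by each node

    def add_node(self) -> None:
        # A new node joins empty, then existing data is spread evenly.
        self.used_tb.append(0.0)
        self._rebalance()

    def write(self, size_tb: float) -> None:
        if not self.used_tb or size_tb > self.free_tb():
            raise ValueError("not enough free capacity")
        # Simplification: land the data on one node, then rebalance.
        self.used_tb[0] += size_tb
        self._rebalance()

    def _rebalance(self) -> None:
        share = sum(self.used_tb) / len(self.used_tb)
        self.used_tb = [share] * len(self.used_tb)

    def raw_tb(self) -> float:
        return self.node_capacity_tb * len(self.used_tb)

    def free_tb(self) -> float:
        return self.raw_tb() - sum(self.used_tb)


cluster = ToyCluster(node_capacity_tb=36.0)  # hypothetical node size
for _ in range(3):
    cluster.add_node()
cluster.write(30.0)
cluster.add_node()  # capacity grows and data spreads with no extra steps
print(cluster.raw_tb(), cluster.used_tb)
```

Notice that nothing in the example tells the cluster what to do with the fourth node. That is the behavior described above, just boiled down to a few lines.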
So you’re probably thinking by now that this is all great, but why Isilon? Why not a typical EMC VNX array like I’m used to? After all, the Celerra architecture (and now the VNX) is tried and true. My two main reasons, though, are BIG DATA and ease of management. An EMC VNX array is not capable of presenting a volume bigger than 16TB. While that is a lot of space, it may just not be big enough for your big data needs. The other disadvantage of your typical array is that when you do add another node (P4000) or another shelf of disks (EMC VNX, NetApp, etc.) you will have to log in to the SAN and tell it what to do with those new disks. They will not automatically grow capacity by themselves, whereas Isilon will literally do just that.
Oh, did I mention that it only takes about 60 seconds to add a new node to a cluster? Seriously, that is it; take a look for yourself right here. You power on the box and either use the CLI to add it to the cluster (which is super simple…. select “join existing cluster” and then pick the cluster to join), or log in to the GUI and use it to add the node. Either way, it will literally take you longer to rack and cable the box than it will to add it and have people using it. You’re not going to get that with your typical SAN array.
So how big does Isilon get? Right now Isilon supports volumes up to 15 petabytes (reference here), and there are rumors that the limit may be going up too! I guess you could call that big data.
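For a sense of scale, here is the quick arithmetic comparing the two limits quoted in this post, the 16TB volume cap and the 15PB single file system. The use of binary units (1PB = 1024TB) is my assumption; the numbers shift slightly if you count in decimal.

```python
import math

TB_PER_PB = 1024  # assumption: binary units

max_volume_tb = 16      # per-volume limit quoted for the VNX
max_filesystem_pb = 15  # single file system limit quoted for Isilon

# How many 16TB volumes would it take to hold the same amount of data?
volumes_needed = math.ceil(max_filesystem_pb * TB_PER_PB / max_volume_tb)
print(volumes_needed)
```

That is a lot of separate volumes to carve out, mount, and keep track of, versus one file system.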
Where I see Isilon being used
Because Isilon focuses on NAS services including NFS and CIFS, I see Isilon being used at any company that has lots of people updating data… like a design firm that has tons of graphics files, or maybe a mechanical manufacturer that has CAD drawings. On the other side of the spectrum, hospitals that do MRIs and other imaging need massive amounts of storage too. I remember working on one such machine that mounted a Linux NFS server for pushing those images to. It worked OK, but if that Linux server went down (and it did, that’s why I was there) they were unable to pull up images older than a week. If that customer had been using Isilon they would have had two major benefits: 1) they could have expanded their storage VERY easily as they needed it, and 2) they would not have had to call me if one box (one node) went down, because Isilon has built-in redundancy between nodes.
Remember that Isilon can do replication too. So if you have lots of drawings or CAD files that need to be shared between your Ohio office and your California office, you can do that as well.
Isilon can also be used as VMware vSphere storage by utilizing its NFS capabilities. Check here for the best practices guide.
For more information, check out Jason McCarthy’s blog. Jason is a vSpecialist working for EMC whose focus is on Big Data and Isilon. He has also published two articles so far on setting up SmartConnect, which I did not cover here, but basically think of SmartConnect as the Isilon version of PowerPath. It tells your hosts about all the possible connection points to the Isilon cluster, helps load balance, and provides more fault tolerance for applications.
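If it helps to picture the load balancing and fault tolerance idea, here is a simple round-robin over node addresses in Python. This is a toy model only, not SmartConnect's actual implementation or API: the class `ToyBalancer`, the IP addresses, and the "skip nodes marked down" behavior are all assumptions for illustration.

```python
from itertools import cycle


class ToyBalancer:
    """Round-robin over node addresses, skipping nodes marked down."""

    def __init__(self, node_ips):
        self.node_ips = list(node_ips)
        self.down = set()
        self._ring = cycle(self.node_ips)

    def mark_down(self, ip):
        self.down.add(ip)

    def mark_up(self, ip):
        self.down.discard(ip)

    def next_ip(self):
        # Try each node at most once per call.
        for _ in range(len(self.node_ips)):
            ip = next(self._ring)
            if ip not in self.down:
                return ip
        raise RuntimeError("no healthy nodes available")


balancer = ToyBalancer(["10.0.0.11", "10.0.0.12", "10.0.0.13"])  # made-up IPs
balancer.mark_down("10.0.0.12")  # pretend one node has failed
assignments = [balancer.next_ip() for _ in range(4)]
print(assignments)
```

Clients keep getting handed a working node even with one node out, which is the general benefit described above.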