My Exagrid Road Map

I did a post a while ago about things I would like to see Veeam add to their road map, and I don’t think that any of them were added, but it’s always fun to speculate. I obviously have no insight into the product road map at Exagrid, nor do I have any influence over what they implement. Therefore this post is just an “if I was important I would …” post.

Exagrid has pretty much mastered the deduplicated backup storage area; they provide backup application aware dedupe that is second to none and have also added in replication for easy offsite backups. In my road map they would expand upon their features and maybe continue to improve their dedupe and compression ratios if possible. However at some point you have to figure that you have reached the limit. So where do they go from there?

What I would suggest Exagrid add to its product offering is two things, neither of which I think would be too hard for them to do, but both of which would add great value to their product. These features are:

  1. Ability to create an NFS share for storing ISO images for VMware
  2. Ability to create an NFS share for use as a Datastore on which VM Templates could live

Right now Exagrid has the ability to create a “Utility” share and present it via NFS, which would actually allow you to do this now. However you would not get any compression or deduplication of the files stored on that share. A Utility share is a simple Linux based share for providing access to simple storage via NFS or CIFS. So the value add would be two new share types… one for each of the items listed above. Then your ISO library could look at all the ISO images and dedupe across them, and if you are like most of the customers I have worked with, you probably store a bunch of Microsoft related ISOs… and we all know there are probably some duplicate files in those things 🙂

The value of having a deduplicated VM Template repository is pretty obvious as well. If you have several Windows Server templates that are archived and only used when a new server is deployed, then you could save a decent amount of SAN space if you could put them on the Exagrid. I figure 10-15GB of every 2008 R2 template is identical to every other one, so if you had 10 templates the Exagrid would only need to keep one copy of that data instead of ten… at the low end of 10GB that is 90GB of savings.
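To show where a figure like that comes from, here is a quick back-of-the-napkin sketch. The function name is made up for this post, and it assumes the shared data dedupes perfectly across templates, which a real appliance may not quite manage.

```python
def template_dedupe_savings_gb(template_count, shared_gb_per_template):
    """Space saved if the shared data is stored once instead of once per template."""
    if template_count < 1:
        return 0.0
    # One copy of the shared blocks still has to be kept, so only the
    # remaining (template_count - 1) copies count as savings.
    return (template_count - 1) * shared_gb_per_template


# Assumption: 10 templates, with 10GB to 15GB of identical data in each.
low = template_dedupe_savings_gb(10, 10)
high = template_dedupe_savings_gb(10, 15)
print(f"Estimated savings: {low:.0f}GB to {high:.0f}GB")
```

The savings grow with every extra template, because each new one only adds its unique blocks.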

Now you might be thinking… why use a box purpose built for dedupe backup storage for something like this? My answer to that is that the Exagrid shines while the average IT admin is sleeping. If all goes well you could log in to your Exagrid every morning at 8am and run ‘top’… my guess is that you would see a load average of something like 0.05 or 0.02 in the top line. Why? Well the Exagrids that I’ve worked with have quad core CPUs with Hyper-Threading and 8GB of RAM. That is a lot of horsepower to just sit and idle most of the day, so why not put it to use for something that you will most likely only use while you are there between 8am-5pm?

I guess only time will tell if this is something they have already thought of or if they like my idea and put it in place. If you have an Exagrid and like this idea I would encourage you to give your rep a call and tell them. As one of our sales guys always says … the squeaky wheel gets greased first …

As always thanks for reading!

For all the Exagrid Articles I have posted click here

Exagrid Notes / Support

My Experience with Exagrid Support

Let me start out by saying that the support and customer service from Exagrid is top notch. I get to work with a lot of different vendors at my job, and I must say that when I have to call one I usually plan for a crappy day. However I don’t have that feeling when I have to call Exagrid, because they know their shit and don’t try to transfer you 100 times before it’s fixed. Actually I should say when they call me, because normally they know something is wrong and are on top of it before I am.

Anyhow, now that I have a few weeks of backups out there with Veeam and an Exagrid box I have started to run into issues as described in Part 4 of my Exagrid post. After talking with Tom, from Exagrid, I have to say that I have not only learned a lot about how their product works, but also why my backups started to fail. Tom did a great job of explaining how files are presented back to a backup server and what back-end work the Exagrid is doing… because of this it wasn’t hard to put my finger on why the backup job failed. In my book when a vendor goes the extra mile to help me understand why there might be issues… and not just fix them and be 100% reactive… then they are pretty damn awesome.

Synthetic full backups and Exagrid

One of the best features that Veeam offers is the ability to pull only the blocks of data that have changed from a VMware (and now Hyper-V) datastore, then take those blocks and combine them with a full backup (VBK file) to create a new full backup. They call this a synthetic full backup.

So what I was having problems with at a customer was 2 things:

  1. Synthetic full backups were taking 6-10x longer than they did on a non-deduplicated backup repository.
  2. Synthetic full backup jobs were failing with errors such as “failed to write to …” with the destination being on the Exagrid file share.

Today, with the help of Tom at Exagrid, I learned why these things are happening. So if you are already a customer and you are running into errors with synthetic fulls keep reading, or if you’re a potential customer keep this info in mind when planning out your Veeam jobs.

First let’s draw out how the Exagrid is designed to work:


Basically what the Exagrid tries to do is store the last backup in a fully hydrated form. This allows you to do file restores, instant recovery and all other stuff from the latest backup without waiting for the appliance to rehydrate the files. (This is the normal operation provided the unit is sized properly and space permits the files to remain hydrated)

However… if the unit is undersized, or you just have weeks and weeks of retention on it, and a full week of backups no longer fits in the landing pad then things get a little hairy. First synthetic full backups start taking a really long time… then if things grow a little bit larger… they start to fail. This all comes back to the size of the box. While the landing pad is an elastic area and will dynamically expand if needed, once the “cold storage” area gets full the landing pad cannot expand.

Why does it need to expand you ask?

Well if a file is requested by the Veeam server, the Exagrid finds the file and checks to see if it’s in the landing pad (i.e. hydrated)… if it’s not, it puts all the pieces back together and re-hydrates the file. This puts the file into the landing pad, which takes up space. Then the Veeam server can read the file (note that entire files do not always need to be re-hydrated… the Exagrid is smart enough to do just what is needed), but since deduplication is done after IO is stopped on the share, the re-hydrated file will stay in the landing pad until a predetermined amount of time passes once IO stops. So if you have a 1TB Exagrid and your full Veeam backup is about 484GB on disk, the landing pad is now 1TB – 484GB… this leaves approximately 516GB of space in the landing pad. So let’s say that each daily incremental is 50GB, and you do an incremental Monday-Friday (synthetic full on Saturday… which is the default)… those five incrementals take another 250GB, so now you are left with about 266GB of free space in the landing pad… this of course is assuming worst case and that all files for the previous week must be re-hydrated.

(Side note… the 1st problem I listed should have an obvious explanation now. It takes a lot of CPU cycles to produce a re-hydrated file… so if you have to re-hydrate 500+GB you can expect to wait a while)

So now it’s Saturday and Veeam starts a synthetic full backup… the Exagrid has re-hydrated all the files Veeam has asked for and we have 266GB free in the landing pad for the 484GB synthetic full file. By now I think you see the problem… we need more space than the 1TB that is advertised by the Exagrid.

At this point two things could happen:

1.) If free space is available: Free space on the Exagrid is allocated to the landing pad dynamically and brings the landing pad up to whatever size is needed by the backup files… your job completes fine. Then later the previous week’s full backup as well as the synthetics are deduplicated and put back into “cold storage”.


2.) If free space is not available: The Exagrid cannot expand the landing pad to accommodate the new synthetic full backup and therefore tells Veeam that writes to the shared path have failed. Then Veeam gets pissed and your job fails to run.
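The numbers from this example can be worked through in a few lines. This is only a rough model of what Tom described, not how the appliance actually accounts for space. It assumes 1TB means 1000GB and the worst case where every file from the previous week gets re-hydrated. The function names are made up for illustration.

```python
def landing_pad_free_gb(capacity_gb, full_gb, incremental_gb, incrementals):
    """Free landing pad space once last week's files are all re-hydrated (worst case)."""
    return capacity_gb - full_gb - incremental_gb * incrementals


def synthetic_full_fits(free_gb, synthetic_full_gb, spare_repository_gb=0):
    """The landing pad can only grow by whatever the repository has to spare."""
    return synthetic_full_gb <= free_gb + spare_repository_gb


free = landing_pad_free_gb(1000, 484, 50, 5)
shortfall = 484 - free
print(f"Free landing pad: {free}GB, extra space needed: {shortfall}GB")

# Case 1: the repository can give up enough space, so the job completes.
print(synthetic_full_fits(free, 484, spare_repository_gb=300))
# Case 2: the repository is full, so writes fail and the job fails.
print(synthetic_full_fits(free, 484, spare_repository_gb=0))
```

The 300GB of spare repository space in case 1 is just a made-up value to show the landing pad expanding. Anything at or above the shortfall would do.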


 

How can I fix this ?

Well you have two options. The first is simple… buy another Exagrid (let them know I sent you … maybe I’ll get commission LOL). The second way is to stop using synthetic full backups inside of Veeam. Instead tell Veeam to “periodically” do full backups and not build them synthetically. This will put more of a load on your SAN without a doubt, because Veeam will now transfer ALL BLOCKS OF DATA each time a full backup runs. So if your SAN is already overworked, or if your backup windows are too short to transfer all the blocks, then you’re probably better off just buying an additional unit.

 

Before posting this article I let the fellas over at Exagrid proofread it since I’m still fairly new to the technology, and they had these points:

1. Consider using the term “repository space” in place of “cold storage”.  That is what you’ll see in our GUI/screen shots.

2. When Veeam deletes no-longer needed save points, the ExaGrid gets to work removing that un-needed data from the repository as quickly as possible – no need to wait for another backup, etc.  We know its been deleted, so we just purge it from the repository decently and in an optimal order.

3. The sizing calculator we use for Veeam customers is exactly the same as Veeam uses – 1TB of provisioned VM space requires 1TB of disk storage which would be an EX1000.  So “potential customers” should not run into this same situation.

Thanks for the tips guys!

Exagrid Software Upgrade

If you follow my blog you know that I’ve been working with the Exagrid backup appliances a lot lately and that I was lucky enough to actually get a couple units to do some further testing with.

Well when I received the two boxes I powered them on and ran through the initial setup, but soon found that Veeam was not an option in the “Share Type” drop down box. I scooted over to the “About” page and found that the software that they were running was the revision before the version that supported Veeam.

Disclaimer: You will need to contact Exagrid to get the updated software. While the process is very simple, Exagrid support will actually remote in and do the upgrade for you if you are uncomfortable with it. Therefore I take no responsibility if you nuke your box, or if you cause data loss for some weird reason. If you are in doubt call support… they are great!

Anyhow, after talking with support they sent me the download link for the 4.1 software. After a 540ish MB download I was ready to apply the patch.

How to upgrade the Exagrid software:

After logging into your Exagrid go to the Manage menu and select Software Upgrade.

The next screen will let you upload the software update file that you downloaded. You will need to get the download link from Exagrid support as they will have the latest version. Besides that it will also show you the version that is already installed on your system. In this screenshot I already have the latest version on the box. But if you have an older version you just click browse, select the file that support gave you, and then click upload.

After the upload has finished the new version will be listed in the Available Upgrade Packages section. Then all you need to do is click Apply. The upgrade process can take as long as 45 minutes. After that amount of time you can relaunch a browser window and you should be able to reconnect to your system, which is now on the latest version.

Exagrid with Veeam Backup Part 4

So far we have talked about what the Exagrid is, what it does, and how to set up Veeam Backups so they play nice together. But what does it really get you? … That’s what this post is going to cover. Also I will go through some of the things that make Veeam run slower when using an Exagrid.

My Mileage

These articles have been written while trying out the Exagrid with Veeam for a customer. So all of the data below is real world, not lab data.

Veeam has been set up to have three jobs. One is for Citrix servers, which we only back up once a week since they are pretty static. The second job is for their file and print server, which has 400+GB of files on it. And lastly there is a job for all the other misc VMs on their cluster; this job has 7 Windows VMs on it totaling 1.09TB of provisioned space. So overall I’m protecting 2TB-ish of provisioned space, actual data is around 1.3TB I would guess.

With Veeam on the recommended settings (compression off, and in Local Storage mode) a full backup is around 360GB for my “Misc” VM job, 395GB for my Citrix server job, and 484GB for my file server job. Since the Exagrid only has a 1TB landing pad and since we only back up the Citrix servers once a week, we are able to run the file server and “Misc” server jobs and get all of that data into the landing pad, even on a full backup night. Then we run the Citrix server job during weird hours on a Sunday.
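Here is a small sketch of the "does it fit" check behind that schedule. It assumes the 1TB landing pad is 1000GB and ignores anything else that might be sitting in it. The helper name is made up for this post.

```python
LANDING_PAD_GB = 1000  # assumption: 1TB treated as 1000GB

# Full backup sizes from this customer, in GB.
full_backup_gb = {"Misc": 360, "Citrix": 395, "File server": 484}


def fits_in_landing_pad(job_names, sizes=full_backup_gb, capacity_gb=LANDING_PAD_GB):
    """Return (total size, True/False) for running these fulls on the same night."""
    total = sum(sizes[name] for name in job_names)
    return total, total <= capacity_gb


print(fits_in_landing_pad(["Misc", "File server"]))
print(fits_in_landing_pad(["Misc", "File server", "Citrix"]))
```

The two nightly jobs fit with room to spare, while all three fulls together would not, which is why the Citrix job gets its own slot.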

After about 8 days their Exagrid had about 15% of its main storage area free.

Now, after more than 2 weeks we still have about 6% free… so that’s 6 days of backups only taking up about 9% of 1300GB (roughly 117GB).

On the Exagrid we are consuming 1.305TB of space… but Veeam thinks we are using 3.469TB of space. This is where we can really start to see the savings that the Exagrid delivers for us. If we were keeping months of backup jobs the savings would be even higher.
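Those two numbers can be turned into a reduction ratio with a couple of lines. This is just arithmetic on the figures above. The function name is my own.

```python
def reduction_stats(logical_tb, physical_tb):
    """Return (ratio, percent saved) for what Veeam wrote vs what is stored on disk."""
    ratio = logical_tb / physical_tb
    percent_saved = (1 - physical_tb / logical_tb) * 100
    return ratio, percent_saved


ratio, percent_saved = reduction_stats(logical_tb=3.469, physical_tb=1.305)
print(f"Reduction ratio: {ratio:.2f}:1, space saved: {percent_saved:.1f}%")
```

Keep in mind this is only after a couple of weeks of retention, so the ratio should climb as more restore points pile up.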

Issues I’ve Seen

One thing that does concern me is the Saturday synthetic full backup builds. This process takes almost 8 hours! The same process on a non-compressed/non-deduped filesystem is MUCH shorter… a comparable cluster and data size only takes 2.5 hours on an SMSproSafe backup server. Also, because the landing pad only accepts about 1TB of data on this unit, my jobs fail every now and then. I assume this is because I have two backup jobs trying to create their synthetic fulls on Saturday (one early in the evening and one at 2am … technically Sunday), and because it takes time to clear out the landing pad, it fills up before it can be cleared… causing the job to fail. To fix that issue I simply moved the synthetic full on one of the jobs to Friday night and the other one to Sunday afternoon… this gives both jobs plenty of time to complete without running into each other. The other solution would be to add another Exagrid node, and if the customer adds any more virtual machines that is probably what we will need to do anyhow.
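The scheduling conflict is easy to see if you lay the two windows out. In this sketch the exact start times (7pm, 8pm, 2pm) and the 8 hour duration for both jobs are my assumptions for illustration. Only the 2am Sunday start and the "almost 8 hours" figure come from the jobs above. It also ignores the extra time the landing pad needs to clear out.

```python
from datetime import datetime, timedelta

DURATION = timedelta(hours=8)  # assumption: both synthetic fulls take about 8 hours

# An arbitrary reference week: 2024-01-05 is a Friday.
FRIDAY = datetime(2024, 1, 5)
SATURDAY = FRIDAY + timedelta(days=1)
SUNDAY = FRIDAY + timedelta(days=2)


def windows_overlap(start_a, start_b, duration=DURATION):
    """True if two jobs of the same duration are ever running at the same time."""
    return start_a < start_b + duration and start_b < start_a + duration


# Original schedule: Saturday early evening and 2am Sunday.
before = windows_overlap(SATURDAY.replace(hour=19), SUNDAY.replace(hour=2))
# New schedule: Friday night and Sunday afternoon.
after = windows_overlap(FRIDAY.replace(hour=20), SUNDAY.replace(hour=14))
print(f"Original schedule overlaps: {before}, new schedule overlaps: {after}")
```

With the new schedule there is more than a full day between the two builds, so the landing pad has plenty of time to drain.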

Back to the synthetic full creation times. Customer A, which is the one that takes almost 8 hours, has two DL360 G6 servers and an HP P4000 LeftHand SAN; they also run Veeam as a VM with 4 vCPUs and push data to the Exagrid. Customer B has the exact same setup, but instead of a virtual Veeam server and an Exagrid they have an SMSproSafe backup server (which for the purpose of this comparison has a single 5500 series quad core processor and six 1TB 7200 rpm SATA drives).

Customer A (with the Exagrid) has 1.24TB of VM space in the backup job in question. Here are the statistics for the job. As you can see it takes just about 8 hours for this job to run. Make a note of the processing rate… 51MB/s.

Customer B (with the SMSproSafe) has 1.27TB of VM space in the backup job, and it takes about 2.5 hours. But look at the processing rate… 153MB/s, which is 3x faster.
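As a sanity check, you can estimate how long a job should take from the VM space and the processing rate. This is a rough sketch that assumes 1TB = 1,000,000MB and that the rate applies to the whole job size, so don't expect it to match the reported run times exactly (those also include job overhead).

```python
def estimated_hours(size_tb, rate_mb_per_s):
    """Hours needed to process size_tb at a steady rate (1TB taken as 1,000,000MB)."""
    seconds = size_tb * 1_000_000 / rate_mb_per_s
    return seconds / 3600


hours_a = estimated_hours(1.24, 51)   # Customer A, Exagrid
hours_b = estimated_hours(1.27, 153)  # Customer B, SMSproSafe
speedup = 153 / 51

print(f"Customer A: about {hours_a:.1f} hours")
print(f"Customer B: about {hours_b:.1f} hours")
print(f"Processing rate difference: {speedup:.1f}x")
```

The estimates come out a bit under the times I actually saw, but the gap between the two customers lines up with the difference in processing rate.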

I can only attribute the extended amount of time to one thing (well maybe two, but one is probably the main thing). My suspected reason is that the Exagrid has all of the files deduplicated and compressed… whereas the SMSproSafe box is a local, uncompressed Windows file system. Read/write performance on the SMSproSafe is limited only by the number of spindles and the SATA interface speed, whereas the uncompressing and rehydration of all of the VIB files at the time of VBK (synthetic full) creation will undoubtedly put a high load on the CPU of the Exagrid.

I should note that even though it does take Veeam a lot longer to do synthetic fulls when pushing to an Exagrid, I have to ask myself “is the juice worth the squeeze”… my answer would be yes it is. Synthetic fulls are probably always going to be created on a weekend day anyhow, and the amount of data the Exagrid is able to fit into such a small space is amazing. The only other concern I have is with Instant Recovery… how will these slower read and write rates affect performance during an Instant Recovery?

I will investigate that as soon as I get some VMs pushed to the two test Exagrids that they let me borrow. I will post my findings in a few days.

Exagrid with Veeam Backup Part 3

As promised here is the third part of the Exagrid/Veeam Backup series. This post will show you step by step the best practices for using Veeam with your Exagrid backup appliance.

If you have the time and like reading the best practice documents check out this one. Otherwise I have the short and sweet… get you up and running version below.

If you are just backing up virtual machine data to a standard drive or network share the default Veeam settings are normally just fine. But because the Exagrid deduplicates data as well as compresses data, some settings need to be modified to allow for optimal space savings. One thing that I’ve always heard is that you cannot compress compressed data, meaning that while you could burn CPU cycles to try to do it you will not see any additional savings. This is the case with Veeam and the Exagrid as well. Veeam’s dedupe settings however can be left at their defaults. I would imagine (and this is in no way official, but it’s my guess) that Veeam dedupes data at a certain block size (probably whatever the VMFS file system block size is), then when the Exagrid gets the data it dedupes at a much smaller block size… probably a 64 or 128k block size. IF this is the case that would explain why you get additional savings by using an Exagrid over using just Veeam and a standard disk… because once the Exagrid has all the data it can look at the entire job at a more granular level and compress and dedupe the hell out of it! Veeam can’t do this because it has data constantly streaming into it from the SAN or ESX host, so it’s much harder to get a “big picture” of all the data.

On to the step by step and screenshots:

I won’t explain the entire Veeam Backup job process as you should already know most of it, but after selecting your Job Name and the Virtual Machines you want to protect you are asked to pick a backup location. When using Veeam with an Exagrid you need to enter the UNC path to the Exagrid share you want to write data to. In this case I want to put my data on the “VEEAM3” share I created on the Exagrid.

Before clicking next we need to click the Advanced button to configure some additional settings.

On the first tab in Advanced we need to make sure that our backup mode is Incremental. If we tried to use a reverse incremental we would cause the Exagrid A LOT of extra work. Next click over to the Storage tab.

On this tab make sure to set the Compression Level to NONE, and make sure to leave the “Storage Optimize for” section set to Local (even though we are going to a LAN share). This will affect the amount of deduplication that we do on the Veeam server; we will actually do less dedupe on Veeam, but the Exagrid will more than make up for it.

You can now click OK out of the Advanced settings dialog boxes. And we have completed all of the Exagrid specific settings, continue to create your backup job just as you normally do.
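If you want a quick cheat sheet of the settings covered in the steps above, something like this works as a checklist. To be clear, the dictionary keys are just labels I made up for this post, not Veeam API or PowerShell names, and the example job values are only for illustration.

```python
# Hypothetical checklist of the Exagrid-specific job settings described above.
RECOMMENDED = {
    "backup_mode": "Incremental",
    "compression_level": "None",
    "storage_optimization": "Local",
}


def settings_to_fix(job_settings):
    """Return {setting: (current, recommended)} for every value that differs."""
    return {
        key: (job_settings.get(key), wanted)
        for key, wanted in RECOMMENDED.items()
        if job_settings.get(key) != wanted
    }


# Illustrative job that still needs two changes before it is pointed at an Exagrid.
example_job = {
    "backup_mode": "Reverse incremental",
    "compression_level": "Optimal",
    "storage_optimization": "Local",
}
for name, (current, wanted) in settings_to_fix(example_job).items():
    print(f"{name}: change {current!r} to {wanted!r}")
```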

One thing I do want to note is that you should not point more than 1 Veeam Backup job to a share… or if you do, make sure to run them at about the same time. This is because the Exagrid monitors activity to each of its shares… when there has been no I/O for some time to a particular share it will start the dedupe/compress process on that share. So if you have multiple jobs running at different times to the same share you may lock the Exagrid out from doing its job.
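Here is a toy model of why that matters. It is not how the Exagrid actually tracks activity. It just assumes dedupe can only kick in once a share has been quiet for some threshold, and the 2 hour threshold and the job times (in hours on a clock) are made-up values.

```python
def usable_idle_gaps(io_windows, idle_threshold_h):
    """Return the (start, end) quiet periods on a share that last at least the threshold."""
    gaps = []
    busy_until = None
    for start, end in sorted(io_windows):
        if busy_until is not None and start - busy_until >= idle_threshold_h:
            gaps.append((busy_until, start))
        busy_until = end if busy_until is None else max(busy_until, end)
    return gaps


IDLE_THRESHOLD_H = 2  # assumption, the real value is set by the appliance

# Four jobs spread around the clock on one share, then the next day's first job.
staggered = [(0, 5), (6, 11), (12, 17), (18, 23), (24, 29)]
# The same jobs all started at about the same time, then the next day's first job.
grouped = [(0, 5), (0, 5), (1, 6), (1, 6), (24, 29)]

print("Staggered:", usable_idle_gaps(staggered, IDLE_THRESHOLD_H))
print("Grouped:", usable_idle_gaps(grouped, IDLE_THRESHOLD_H))
```

In the staggered case the share never goes quiet long enough, while grouping the jobs leaves one long window for the dedupe/compress process to run.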

Check back soon for the update on what I was actually able to save by using the Exagrid, that will be the next post I do.

Exagrid with Veeam Backup Part 2

So if Veeam = Good, then Veeam + Exagrid = Better! I explained why in the first part of this post series here. Now let’s talk about how.

So let’s say you bought an Exagrid. Now we have to set it up and create a Veeam backup job to push data to it… that’s what this post will cover. I won’t cover the initial unboxing or overall config of the Exagrid because they have manuals for that, and their support has always been very good.

All of the Exagrid administration is done through a web interface, it can be accessed by browsing out to the IP address or DNS name of your Exagrid. Once there you will need to login to see the admin interface.

Here is the homepage of Exagrid interface

As you can see in the homepage it gives you a quick easy way to see what you have on the Exagrid. It shows you how much of the landing pad is free for the next backup and how much of the “cold storage” (or deduped/compressed area) is still available to hold those unique blocks of data. Finally at the bottom it also shows you which shares you have and how much data is in them as well as the dedupe/compression ratio and actual space consumed.

Creating a Share

This is probably a 2 minute task at the absolute most once you have done it once. To start we need to select which Exagrid server we want to put the share on, since we only have one just click the server on the left, then click “Manage” from the top menu, and then select “Shares”.

It will display a list of all of your current shares, and on the right there is a button to create a “New” share. Click that, and the options form for a new share is displayed.

Take notice of the “Share Type” this must be set properly for the type of data we are sending it in order to get good backups. Make sure to select Veeam(R) from the list of types.

After selecting the share type, put in the share name; this can be anything you want. Then the last thing you need to do is put in the IP addresses or FQDNs of the backup servers. These are your Veeam servers in this case. Then hit “Create” at the bottom right. That’s it! That is really all you need to do to use Exagrid with Veeam (at least on the Exagrid side).

I will post the Veeam job settings tomorrow, so be sure to stop back.

 

Exagrid with Veeam Backup Part 1

This month Exagrid released an update for their backup storage appliances that makes them fully compatible with Veeam. For those who have no idea what Exagrid is, check out www.exagrid.com, but basically an Exagrid is a hardware appliance with several terabytes of raw storage that is presented in different ways via CIFS.

The idea is to have a “landing pad” that is a certain size (which is determined by the amount of data you need to back up, so if you need to back up 1TB of data you can buy the 1TB appliance), then after your backup job is completed the data that is in the landing pad is processed and is put into a “cold storage” like area. To Veeam the files still look like they are there and are the original full size, but what has happened is the Exagrid has deduplicated and compressed the data in the landing pad and pushed it back to the main storage area. This leaves the landing pad free for the next backup.

So why consider an Exagrid? Well Veeam alone will do a great job of compressing and deduplicating data inside of a job… but what it cannot do is deduplicate and compress across jobs. So your Monday backup is not deduplicated against your Sunday backup or your Tuesday backups. Also if you are doing weekly full backups and are retaining 30 days of backups you still need a pretty decent amount of storage:

Backup Storage Requirement = (Full backup size * 4) + (Average Incremental * 26)

So if a full backup is 500GB you will need at a minimum 2TB of disk space to retain 4 weekly full backups, and if your average daily incremental is 50GB then you also need 1300GB for incrementals. So in total, for 30 days of backup retention on a standard file system you would need 3.3TB of disk space.
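If you want to plug in your own numbers, the formula above is easy to turn into a little calculator. The function name and defaults are mine. The defaults match the 4 weekly fulls and 26 incrementals used in this example.

```python
def backup_storage_gb(full_gb, avg_incremental_gb, fulls_retained=4, incrementals_retained=26):
    """Raw disk needed on a standard (non-deduplicated) file system, in GB."""
    return full_gb * fulls_retained + avg_incremental_gb * incrementals_retained


total_gb = backup_storage_gb(full_gb=500, avg_incremental_gb=50)
print(f"30 days of retention needs about {total_gb / 1000:.1f}TB")
```

Change the retention counts if you keep more or fewer restore points than the 30 day example here.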

Here is where the Exagrid magic comes in… the Exagrid will see the blocks from all 4 weekly full backups and deduplicate them… so you might only have to store 600GB of data total for all 4 weekly fulls. Plus it will deduplicate all incrementals and full backups together… so it’s looking at every block of data in a CIFS share… not just what is in a single session. So LOTs of space savings to be had!

So the bottom line is use what you have more effectively instead of continually buying disk space.

Stay tuned as I will be explaining how to set up the Exagrid for Veeam Backups as well as the best practices for configuring your Veeam Backup jobs to work with the Exagrid appliances.