So far we have talked about what the ExaGrid is, what it does, and how to set up Veeam backups so the two play nicely together. But what does it really get you? … That’s what this post is going to cover. I will also go through some of the things that make Veeam run slower when using an ExaGrid.
These articles have been written while trying out the ExaGrid with Veeam for a customer, so all of the data below is real-world data, not lab data.
Veeam has been set up with three jobs. The first is for the Citrix servers, which we only back up once a week since they are pretty static. The second job is for the file and print server, which has 400+GB of files on it. The last is a job for all the other misc VMs on the cluster; this job has 7 Windows VMs in it totaling 1.09TB of provisioned space. So overall I’m protecting 2TB-ish of provisioned space; actual data is around 1.3TB, I would guess.
With Veeam on the recommended settings (compression off, and in Local Storage mode), a full backup is around 360GB for my “Misc” VM job, 395GB for my Citrix server job, and 484GB for my file server job. Since the ExaGrid only has a 1TB landing pad, and since we only back up the Citrix servers once a week, we are able to run the file server and “Misc” server jobs and get all of that data into the landing pad, even on a full backup night. Then we run the Citrix server job during weird hours on a Sunday.
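To make the landing pad math concrete, here is a minimal Python sketch using the full-backup sizes quoted above. The function name is made up for illustration, and treating the 1TB landing pad as exactly 1000GB is my assumption; the usable figure on a real unit may differ.

```python
# Full-backup sizes quoted in the post, in GB.
FULL_BACKUP_GB = {"misc": 360, "citrix": 395, "file_server": 484}

# Assumption: the "1TB landing pad" is treated as 1000 GB of usable space.
LANDING_PAD_GB = 1000


def fits_in_landing_pad(jobs, sizes=FULL_BACKUP_GB, capacity=LANDING_PAD_GB):
    """Return (total_gb, fits) for a set of jobs landing on the same night."""
    total = sum(sizes[name] for name in jobs)
    return total, total <= capacity


# File server + Misc on the same full-backup night.
nightly = fits_in_landing_pad(["misc", "file_server"])
# All three jobs at once, which is why Citrix runs on its own on Sunday.
all_three = fits_in_landing_pad(["misc", "citrix", "file_server"])

print(nightly)    # (844, True)
print(all_three)  # (1239, False)
```

The two nightly jobs together leave some headroom, while adding the Citrix full on the same night would overflow the pad.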
After about 8 days their ExaGrid had about 15% of its main storage area free.
Now, after more than 2 weeks, we still have about 6% free… so that’s 6 days of backups only taking up about 9% of 1300GB (roughly 117GB).
On the ExaGrid we are consuming 1.305TB of space… but Veeam thinks we are using 3.469TB of space. This is where we can really start to see the savings the ExaGrid delivers. If we were keeping months of backup jobs, the savings would be even higher.
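For anyone who wants to turn those two numbers into a reduction ratio, here is a quick Python sketch. The variable names are mine, and the calculation simply divides what Veeam reports by what the ExaGrid actually consumes.

```python
# Figures from the post, in TB.
veeam_reported_tb = 3.469    # what Veeam thinks it has written
exagrid_consumed_tb = 1.305  # what the ExaGrid actually uses on disk

# Reduction ratio: logical data divided by physical data.
ratio = veeam_reported_tb / exagrid_consumed_tb

# Percentage of space saved compared with storing everything as-is.
saved_pct = 100 * (1 - exagrid_consumed_tb / veeam_reported_tb)

print(f"Reduction ratio: {ratio:.2f}:1")  # Reduction ratio: 2.66:1
print(f"Space saved: {saved_pct:.1f}%")   # Space saved: 62.4%
```

That is after only a couple of weeks of retention; with more restore points on disk the ratio should climb.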
Issues I’ve Seen
One thing that does concern me is the Saturday synthetic full backup build. This process takes almost 8 hours! The same process on a non-compressed/non-deduped filesystem is MUCH shorter; a comparable cluster and data size takes only 2.5 hours on an SMSproSafe backup server. Also, because the landing pad only accepts about 1TB of data on this unit, my jobs fail every now and then. I assume this is because I have two backup jobs trying to create their synthetic fulls on Saturday (one early in the evening and one at 2 am… technically Sunday), and because it takes time to clear out the landing pad, it fills up before it can be cleared… causing the job to fail. To fix that issue I simply moved the synthetic full on one of the jobs to Friday night and the other one to Sunday afternoon… this gives both jobs plenty of time to complete without running into each other. The other solution would be to add another ExaGrid node, and if the customer adds any more virtual machines that is probably what we will need to do anyhow.
Back to the synthetic full creation times. Customer A, the one that takes almost 8 hours, has two DL360 G6 servers and an HP P4000 LeftHand SAN; they run Veeam as a VM with 4 vCPUs and push data to the ExaGrid. Customer B has the exact same setup, but instead of a virtual Veeam server and an ExaGrid they have an SMSproSafe backup server (for the purpose of this comparison, a single 5500 series quad-core processor with six 1TB SATA 7200 rpm drives).
Customer A (with the ExaGrid) has 1.24TB of VM space in the backup job in question. Here are the statistics for the job. As you can see, it takes just about 8 hours for this job to run. Make a note of the processing rate… 51MB/s.
Customer B (with the SMSproSafe) has 1.27TB of VM space in the backup job, and it takes about 2.5 hours. But look at the processing rate… 153MB/s, almost 3x faster.
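As a rough sanity check on those two processing rates, here is a small Python sketch. It assumes the job duration is simply the VM space divided by the reported processing rate, with 1TB taken as 1024 × 1024 MB; Veeam may calculate its rate differently, so treat the estimated hours as ballpark only. The helper name is hypothetical.

```python
# Figures from the post: (VM space in TB, Veeam processing rate in MB/s).
customers = {
    "A (ExaGrid)": (1.24, 51),
    "B (SMSproSafe)": (1.27, 153),
}


def estimated_hours(size_tb, rate_mb_s):
    """Estimate job duration, assuming duration = size / rate (an assumption)."""
    size_mb = size_tb * 1024 * 1024  # assumption: binary units
    return size_mb / rate_mb_s / 3600


hours_a = estimated_hours(*customers["A (ExaGrid)"])
hours_b = estimated_hours(*customers["B (SMSproSafe)"])

# Ratio of the two reported processing rates.
speedup = customers["B (SMSproSafe)"][1] / customers["A (ExaGrid)"][1]

print(f"Customer A estimate: {hours_a:.1f} h")
print(f"Customer B estimate: {hours_b:.1f} h")
print(f"Rate ratio: {speedup:.1f}x")  # Rate ratio: 3.0x
```

The estimates land in the same neighborhood as the observed run times, which suggests the processing rate really is the thing to watch here.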
I can attribute the extra time to only one thing (well, maybe two, but one is probably the main cause). My suspicion is that it comes down to the ExaGrid storing all of the files deduplicated and compressed… whereas the SMSproSafe box is a local, uncompressed Windows file system. Read/write performance on the SMSproSafe is limited only by the number of spindles and the SATA interface speed, whereas decompressing and rehydrating all of the VIB files at the time of VBK (synthetic full) creation will undoubtedly put a high load on the CPU of the ExaGrid.
Even though Veeam takes a lot longer to build synthetic fulls when pushing to an ExaGrid, I have to ask myself “is the juice worth the squeeze?”… and my answer is yes, it is. Synthetic fulls will probably always be created on a weekend day anyhow, and the amount of data the ExaGrid is able to fit into such a small space is amazing. The only other concern I have is with Instant Recovery… how will these slower read and write rates affect performance during an Instant Recovery?
I will investigate that as soon as I get some VMs pushed to the two test ExaGrids that they let me borrow, and I will post my findings in a few days.