One of the guys I’ve known for a long time is an engineer / IT guy / jack of all trades for a fairly small SMB (in terms of IT needs). They only have a handful of VMs and about 500GB of production data. So when I get asked to give advice, one of the biggest problems I have is that I’m thinking WAY too big. For example, could I really tell him to buy a Data Domain or a StoreOnce (physical appliance) that holds several TB of backups physically, and logically scales to hundreds of TBs? Probably not… even if it were in budget it would still be a huge waste!
So when asked how we could do some offsite backups while keeping a budget in mind, I remembered that HP was allowing production use on their 1TB free HP StoreOnce VSA! Having worked for an EMC/Cisco/VMware reseller for a long time, I hadn’t had a chance to install StoreOnce in any capacity, so this would be my first encounter with real data on StoreOnce. (I had deployed it in my lab a couple of times but I had never used replication or any of the features for more than a week at a time, plus lab data gives no real measure of dedupe capability.)
Before StoreOnce all backup data was stored on a P4000 2 node SAN along with all of the production data. Production data was taking up about 500GB and Veeam backup data was taking up about 800GB for 3 weeks of retention. Aside from the P4000, he would also copy the latest backups to an external USB drive so that he had some sort of DR plan.
At one point Rick over at Veeam let me play around with a Veeam Cloud Provider license and we tried that out with this SMB’s data by replicating to my colo. It works pretty well, no issue with the technology, but there really just wasn’t enough bandwidth to push the nightly change data to my colo without also running into production hours. (Nightly change from Veeam to disk is about 12GB, and the company’s upload rate is about 2Mbps).
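To put numbers on that bandwidth problem, here is a quick back-of-the-envelope sketch. It assumes decimal units (1GB = 1000MB), a fully saturated uplink and no protocol overhead, so the real number would be somewhat worse. The function name is just something I made up for illustration.

```python
def transfer_hours(data_gb: float, link_mbps: float) -> float:
    """Hours needed to push data_gb (decimal GB) over a link of link_mbps."""
    megabits = data_gb * 1000 * 8   # GB -> MB -> megabits
    seconds = megabits / link_mbps  # assumes the link is fully saturated
    return seconds / 3600

nightly_change_gb = 12  # nightly change from Veeam to disk
upload_mbps = 2         # the company's upload rate

hours = transfer_hours(nightly_change_gb, upload_mbps)
print(f"Nightly upload needs roughly {hours:.1f} hours")
```

Anything well over 12 hours is never going to fit between close of business and the next morning, which is exactly what we ran into.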
So let me start by saying this is a work in progress, but honestly I’m really excited about StoreOnce so I didn’t want to wait a month or more before writing this post… So it will get updated as I have more data.
The first thing I did was deploy a StoreOnce VSA on their VMware cluster, then licensed it… which was a bit of a pain in the ass… So one recommendation I would make is to add a spot for licensing to the GUI. Don’t get me wrong, I’m not afraid of the command line, but for the normal SMB customer who would be the target for a self-install VSA… yeah, a GUI would be better. After that I added a 1TB VMDK to the VM and powered it on. Initial startup and install takes about 10 minutes, but it’s all hands off; basically you just need to sit and wait for the login prompt.
Veeam V8 has support for StoreOnce as a dedupe appliance, although right now it doesn’t really take advantage of Catalyst (I’ve heard it will in V9). I then created a clone of all the backup jobs, repointed them at the StoreOnce Backup Repository and left them to do their thing. Lastly, before I logged out for the night, I also started a copy of all the backup retention to the StoreOnce VSA… about 3 weeks of backups… 800GB of raw disk space.
The next morning I checked the StoreOnce interface to see how much dedupe had been achieved and I was impressed to say the least!
That’s pretty impressive considering Veeam already did its dedupe and compression on the data before it landed on the StoreOnce!
After a couple more days of backups we are still at a 4.3:1 dedupe rate only having added about 3GB of unique data.
So here is what I have seen so far in terms of StoreOnce’s ability to compress and dedupe on top of Veeam:
Day 2: Veeam sent 12,746MB of data to StoreOnce; StoreOnce “data on disk” size increased by 1GB (a 12:1 savings)
Day 3: Veeam sent 12,792MB of data to StoreOnce; StoreOnce “data on disk” size increased by 2GB (a 6:1 savings)
Day 4: Veeam sent 12,160MB of data to StoreOnce; StoreOnce “data on disk” size increased by 1GB (a 12:1 savings)
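For anyone who wants to check my math, the savings ratios above are just "data sent" divided by "growth on disk". A minimal sketch, assuming 1GB = 1000MB and keeping in mind that the on-disk growth figures are whole GB, so the ratios are coarse:

```python
# (day, MB sent by Veeam, GB growth of "data on disk" on StoreOnce)
daily = [
    ("Day 2", 12746, 1),
    ("Day 3", 12792, 2),
    ("Day 4", 12160, 1),
]

def savings_ratio(sent_mb: float, growth_gb: float) -> float:
    """Ratio of data sent to data actually added on disk."""
    return (sent_mb / 1000) / growth_gb

for day, sent_mb, growth_gb in daily:
    print(f"{day}: {savings_ratio(sent_mb, growth_gb):.1f}:1")

# Combined ratio across all three days
total_sent_mb = sum(sent for _, sent, _ in daily)
total_growth_gb = sum(growth for _, _, growth in daily)
overall = savings_ratio(total_sent_mb, total_growth_gb)
print(f"Overall: {overall:.1f}:1")
```

Looking at the three days together smooths out the rounding in the per-day numbers a bit.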
So I mentioned that offsite backups were the goal here… something that didn’t require user intervention was really the big thing. So to make this happen I created a VPN from the company’s Fortigate to a VPN endpoint on my colo gear and then deployed a StoreOnce VSA there, the same way I deployed one on site.
Configuring replication was pretty easy after the VPN was up.
The source appliance was on the 192.168.3.x subnet and the DR appliance was on the 192.168.13.x subnet. StoreOnce has a really easy wizard for replication. I simply went into the replication area, clicked the share I wanted to replicate and started the wizard. I had to enter the IP/hostname of the DR appliance and then create a share to replicate to… which was all handled by the wizard.
Because of the 2Mbps WAN connection, and my being too lazy to drive an hour away to seed the data, I simply set a 1Mbps cap and configured StoreOnce to only upload with 2 “slots” (each slot wants 512Kbps minimum). Inserting traffic graph just to add color 🙂 … I guess you could say that the throttle works as advertised too.
I estimate that it will probably take about 3 weeks to get in sync. I guess I should have done a seed, but honestly I’m more interested to see how it handles a slow connection.
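Here is a rough sketch of where the slot count and the sync estimate come from. The on-disk size is my own assumption (about 800GB of Veeam files at roughly 4.3:1), not a number read off the StoreOnce GUI, and the math ignores protocol overhead and the new data that lands every night while the initial sync runs.

```python
SLOT_MIN_KBPS = 512  # each replication slot wants 512Kbps minimum
slots = 2
cap_kbps = slots * SLOT_MIN_KBPS  # lines up with the ~1Mbps cap

def days_to_sync(on_disk_gb: float, rate_kbps: float) -> float:
    """Days to push on_disk_gb (decimal GB) at a steady rate_kbps."""
    bits = on_disk_gb * 8e9
    seconds = bits / (rate_kbps * 1000)
    return seconds / 86400

# ASSUMPTION: deduped size to replicate, estimated as 800GB / 4.3
assumed_on_disk_gb = 800 / 4.3

sync_days = days_to_sync(assumed_on_disk_gb, cap_kbps)
print(f"Cap: {cap_kbps}Kbps, initial sync: about {sync_days:.0f} days")
```

That lands in the two to three week range before counting the nightly changes that queue up behind it, so "about 3 weeks" seems like a fair guess.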
On a side note
While setting up this pair for my friend I also thought, “shouldn’t I also be doing offsite backup of my data”? Lately I have been slacking… if the colo I use were to “go away” my blog would be in trouble. But setting up a StoreOnce VSA pair and replicating back to my home lab didn’t take long at all. The Veeam backup of my blog is about 10GB after it lands on StoreOnce. The first night (on my crap 3Mbps download connection) it took StoreOnce about 8-9 hours. On night two it only took about 2 hours, but as you can see it wasn’t maxing out my connection. BTW night two had 616MB of data sent to StoreOnce but I honestly don’t even see a bump in the “on disk” storage LOL… That’s awesome!
More to come as it continues to chug away… but in the meantime I would love to hear your StoreOnce stories if you are using it.
It’s been about 3 weeks since I implemented StoreOnce VSA so I thought I would share how it has been doing so far.
Below is a spreadsheet I’ve been keeping relating Veeam backup file sizes to disk growth on the StoreOnce appliance. I’m keeping track of these simply to show how much disk and bandwidth savings can be expected compared to storing and replicating Veeam files on their own.
As you can see we are up to a 9.5:1 dedupe rate and have only consumed about 25% of the free StoreOnce VSA’s capacity. At this point we have over 5 weeks of backups on disk, and based on the rate of growth I would say a full year of backups is pretty conceivable. However, I will most likely roll to a G-F-S hierarchy once I hit 60 dailies.
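To sanity check the "full year" claim, here is a quick runway estimate. It is only a sketch with some loud assumptions: 1TB is treated as 1000GB, daily growth stays in the 1 to 2GB range I saw on Days 2 through 4, and nothing ever expires out of retention.

```python
capacity_gb = 1000                # free VSA capacity, treating 1TB as 1000GB
used_fraction = 0.25              # about 25% consumed so far
observed_growth_gb = [1, 2, 1]    # on-disk growth seen on Days 2-4

free_gb = capacity_gb * (1 - used_fraction)

# ASSUMPTION: future growth looks like those three early days
avg_growth = sum(observed_growth_gb) / len(observed_growth_gb)
worst_growth = max(observed_growth_gb)

days_at_avg = free_gb / avg_growth
days_at_worst = free_gb / worst_growth

print(f"Free space: {free_gb:.0f}GB")
print(f"Runway at average growth: {days_at_avg:.0f} days")
print(f"Runway at worst observed growth: {days_at_worst:.0f} days")
```

Even if every day looked like the worst one I have logged, there would still be a bit more than a year of headroom under these assumptions.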
Backup Job Length
While the StoreOnce VSAs are doing their initial backups I have also been letting the old backup jobs run, which go straight to disk. On incremental backup days the StoreOnce jobs average about 1 minute longer per job. Full backup days are a little harder to judge because I’m using synthetic full backups to normal disk and active full backups each week on the StoreOnce jobs.
Deduplication and compression are both CPU and memory intensive tasks, so most of the time adding either or both to the mix increases processing time. Basically the thought is that storage is expensive and CPU and memory are “cheaper”. With that said, most of the time when doing restores from dedupe appliances we will see longer restore times than if we were pulling straight from raw disk.
With that said these test results were pretty surprising to say the least…
- Same virtual machine
- Data Size: 60GB
- Test 1 was with backup files located on the Veeam server’s “D” drive… which is an RDM from a 2-node HP P4000 iSCSI array.
- Test 2 was with backup files located inside of StoreOnce, which is backed by a VMDK on the same HP P4000 iSCSI array.
The time to restore from the Raw RDM inside of Veeam was 11 minutes and 12 seconds (for just the VMDK).
The time to restore from the StoreOnce VSA backups was 10 minutes and 23 seconds (for just the VMDK).
So in this case a restore of a real VM actually took 49 seconds LESS!
Obviously, to be 100% sure I would need to test some more VMs and repeat over multiple days… But honestly I wasn’t looking for exact numbers… Just knowing that it is pretty much the same is good enough for me!
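For the curious, here is the restore comparison worked out. The throughput figures are a rough sketch that assumes the full 60GB was actually moved during each restore, which may not be exactly true, so treat them as ballpark only.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes/seconds timing into total seconds."""
    return minutes * 60 + seconds

raw_rdm_s = to_seconds(11, 12)    # restore from the raw RDM
storeonce_s = to_seconds(10, 23)  # restore from the StoreOnce VSA

saved_s = raw_rdm_s - storeonce_s
saved_pct = saved_s / raw_rdm_s * 100

# ASSUMPTION: the whole 60GB (decimal) was transferred in both tests
data_mb = 60 * 1000
raw_mb_per_s = data_mb / raw_rdm_s
storeonce_mb_per_s = data_mb / storeonce_s

print(f"StoreOnce was {saved_s} seconds faster ({saved_pct:.1f}%)")
print(f"Raw RDM: {raw_mb_per_s:.0f}MB/s, StoreOnce: {storeonce_mb_per_s:.0f}MB/s")
```

A single-digit percentage difference on one test run is well within noise, which is why I’m calling it "pretty much the same" rather than a win.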