This is the story of a customer who recently implemented Veeam and Data Domain; all names have been changed to protect the innocent, etc etc 🙂
“BOB’S BAIT AND TACKLE” was running VMware 4.1 in a mostly HP environment: their SAN was an HP EVA4400, and the VMware servers were HP blades. For backup they were using HP Data Protector software along with an HP tape library, which was no longer able to meet their backup window requirements.
With a VMware upgrade on the horizon and SAN space dwindling, “BOB’S BAIT AND TACKLE” started looking for a replacement solution for their backup and storage environment.
The existing infrastructure at “BOB’S BAIT AND TACKLE” was limited to approximately 10TB of SAN storage (they store a lot of secret customer info or fish pictures I guess), and backups were limited to the speed of their tape autoloader and backup server. Because of this, full backups of their Exchange environment required the entire weekend to run (time that could be spent fishing), and other servers also took many hours to back up.
Their plans to migrate to a new ERP solution also had to be pushed back because of a lack of free storage on their HP EVA4400 SAN.
The IT team proposed to replace their storage environment with a new EMC VNX5300 Fibre Channel SAN, and to replace the tape library and HP Data Protector with a pair of Data Domain DD620s and Veeam Backup software.
The VNX5300 would include over 45TB of raw capacity, which would give “BOB’S BAIT AND TACKLE” years of growth (and plenty of room to store fishing pictures). It also included advanced features such as Enterprise Flash Drives (aka EFDs or SSDs) and the EMC FAST Suite, which allows data to move automatically between tiers of disk for the best performance and cost per gigabyte.
The Data Domain DD620 boxes were chosen because “BOB’S BAIT AND TACKLE” wanted to reduce or eliminate the use of tape storage if possible. To facilitate the decommissioning of tape while maintaining an offsite backup, “BOB’S BAIT AND TACKLE” selected a second site across town, connected by dedicated fiber, to house the sister Data Domain appliance.
Veeam was chosen as the backup software for “BOB’S BAIT AND TACKLE” because almost all of their servers are now virtualized, and the handful of physical machines that remain are scheduled to be virtualized in the future (they went on a fishing trip and didn’t have time to finish I guess). Veeam also integrates tightly with VMware to provide quicker backups of virtual machines by leveraging VADP (the vStorage APIs for Data Protection), and it reduces the load placed on the virtual infrastructure as a whole compared to legacy backup applications that introduce agents into the guest operating system.
With the upgrades complete, “BOB’S BAIT AND TACKLE” is now able to back up not just one of the Exchange DAG members but all of the servers that help provide Exchange services, including front end servers and MTA servers, in less than 8 hours. Incremental backups have been reduced to approximately 2 hours.
Instead of a single backup server and tape library, Veeam and Data Domain provide multiple paths for backup data to flow through. On the Veeam side we have implemented three Veeam proxy servers, and on the Data Domain side we have two CIFS shares, each attached to its own gigabit Ethernet port. This allows many backup jobs to run in parallel without any one job slowing the others down the way a single path would.
All of these technologies, combined with an optimized configuration by the IT team, have led to a shorter backup window and a much better RPO and RTO. The solution also has the ability to add as much as 4TB more to the Data Domain boxes to allow for an extended backup retention time in the future.
Real World Data
While most marketing documents show you averages and numbers that are sometimes questionable, this post will use real numbers taken directly from “BOB’S BAIT AND TACKLE”’s environment.
First let’s look at the raw requirements. The amount of VMware data that Veeam is protecting is approximately 3.7TB. A full backup, after Veeam compresses and dedupes, is 1.7TB. The graphs below show the first 30 days of activity for the Data Domain. In that time all Veeam backup VBK and VIB files account for 4.65TB, which is the space we would have used if we were writing to a device that does not compress or deduplicate. However, the actual amount of space used on the Data Domain device is 2.86TB. That is a savings of 1.79TB in just 30 days, which works out to roughly a 1.6x reduction, or just under 40%!
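For anyone who wants to check the arithmetic, here is a minimal Python sketch that works only from the two 30-day figures quoted above. The function and variable names are made up for illustration, and because the inputs are rounded the results are approximate.

```python
# Figures quoted in the post for the first 30 days (all in TB).
logical_tb = 4.65   # VBK + VIB data Veeam sent to the appliance
physical_tb = 2.86  # space actually consumed on the Data Domain


def reduction_stats(logical, physical):
    """Return (space saved, reduction factor, percent saved)."""
    saved = logical - physical
    factor = logical / physical
    percent = saved / logical * 100
    return saved, factor, percent


saved_tb, factor, percent = reduction_stats(logical_tb, physical_tb)
print(f"saved {saved_tb:.2f} TB, {factor:.2f}x reduction, {percent:.1f}% saved")
```

The same helper can be pointed at any other pair of "sent" and "stored" numbers, for example the values read off the graphs below.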
A picture is worth 1000 words
Figure 1: Space Usage
This graph shows three things: compression factor (noted by the black line), the amount of data that Veeam is sending the DD appliance (noted by the blue area), and the amount of data that is actually written to disk after the DD appliance deduplicates and compresses it (noted by the red area).
As you can see, for the first couple of days there is almost no benefit from deduplication or compression on the Data Domain side. The red and blue areas start to separate around day 7, when a second full backup is taken. Normally we would expect the space used to grow slowly rather than in a step pattern; however, because they started backing up virtual machines as they were migrated to the new SAN, we see large jumps at the beginning of the graph. After the first week of migrations we started backing up those machines, and after week two’s migrations we added those as well, hence the step pattern.
Figure 2: Daily Amount Written (7 days)
This graph shows the total amount of data that is ingested by the Data Domain (denoted by the total height of the bar) as well as the amount written to disk (denoted by the red bar height). The height of the blue area in the bar is the amount of data that was ingested but considered to be a duplicate of data already on the appliance.
The interesting thing to note here is the gap between the amount of data that Veeam sends to the DD appliance on a full backup day (Saturdays by default) and the amount of data that is actually written to disk. In this example Veeam sent almost 675GB and it only consumed about 27GB on the Data Domain. This is where you see the savings of a dedupe appliance.
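To put a number on that full backup day, here is a small Python sketch using only the two approximate values read off the graph (675GB sent, 27GB written). The variable names are illustrative, and the inputs are eyeballed from a chart, so treat the output as a ballpark.

```python
# Approximate values for one full backup day, as read from Figure 2 (GB).
ingested_gb = 675  # data Veeam sent to the Data Domain
written_gb = 27    # new data the Data Domain actually stored

# How many times smaller the stored data is than the data sent.
reduction_factor = ingested_gb / written_gb

# Share of the incoming data that was already on the appliance.
duplicate_share = (ingested_gb - written_gb) / ingested_gb * 100

print(f"{reduction_factor:.0f}x reduction, {duplicate_share:.0f}% duplicate data")
```

In other words, on that Saturday only about one byte in twenty-five needed to land on disk.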
Figure 3: Daily Amount Written (30 days)
This graph shows the same data as Figure 2, but as a 30 day view instead of a 7 day view. You can see that at the beginning of the backups we were ingesting large amounts of data (total bar height) and also writing a large amount to disk (red area of the bar). This is because the Data Domain was seeing a lot of new data that it had not seen before; over time the red bars get smaller and smaller while the blue bars stay the same or get larger. As a side note, the last two samples on the right show a much larger red area than normal. This is because new machines containing unique data were added to backup jobs; it is a one-time occurrence, and by the next backup we will see much lower rates again.
Data Domain storage and Veeam Backup software make a near perfect combination for protecting VMware virtual machines. “BOB’S BAIT AND TACKLE” is now able to get good backups without having large backup windows; they are also able to replicate those backups offsite with minimal bandwidth usage, and they will eliminate the need for tapes once the remaining servers have been virtualized.
Overall “BOB’S BAIT AND TACKLE” should expect to see about an 8 week retention period with the Data Domain configuration today, but they could expect to see as much as a 60 week retention period if the existing appliances are upgraded to their maximum capacity of 12 drives.
Without the Data Domain, Veeam would require us to provide 7TB of storage to store 8 weeks of backups for this customer. If we were to upgrade the DD620 to the 12 drive configuration we could store 60 weeks, and doing that with normal storage would require 48TB of disk space instead of 8TB of Data Domain storage.
Let’s compare prices between a DD620 with the maximum drive configuration and a VNX5300 File Only SAN with enough drive space to hold the 48TB of Veeam data mentioned above. Please note that all pricing here is list price.
Data Domain with max drive configuration: $40,433 per site
VNX 5300 File only SAN with ~48TB usable: $74,227 per site
So the Data Domain is about $34,000 cheaper per site just for the hardware investment. The Data Domain also requires 2U of rack space whereas the VNX requires 11U, so if you have to co-locate your DR box at a datacenter there will be more cost there. And if you are powering it yourself, spinning 45 drives compared to 12 will definitely cost you more.
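The comparison above can be laid out as a quick Python sketch. It uses only the list prices, rack units, and 48TB figure quoted in this post; the two-site count comes from the primary and DR locations described earlier, and the cost-per-TB helper is just an illustrative way to slice the numbers, not a vendor metric.

```python
# List prices per site quoted above (USD) and rack space per box.
DD620_PRICE = 40_433
VNX5300_PRICE = 74_227
DD620_RACK_UNITS = 2
VNX5300_RACK_UNITS = 11

SITES = 2            # primary site plus the DR site across town
BACKUP_DATA_TB = 48  # Veeam data each option needs to hold for 60 weeks


def cost_per_tb(price, backup_tb):
    """Hardware list price per TB of Veeam backup data held."""
    return price / backup_tb


savings_per_site = VNX5300_PRICE - DD620_PRICE
total_savings = savings_per_site * SITES
rack_units_saved = VNX5300_RACK_UNITS - DD620_RACK_UNITS

print(f"savings per site: ${savings_per_site:,}")
print(f"savings across {SITES} sites: ${total_savings:,}")
print(f"rack units saved per site: {rack_units_saved}U")
print(f"DD620:   ${cost_per_tb(DD620_PRICE, BACKUP_DATA_TB):,.0f} per TB of backups")
print(f"VNX5300: ${cost_per_tb(VNX5300_PRICE, BACKUP_DATA_TB):,.0f} per TB of backups")
```

Power and co-location fees are left out because the post does not give numbers for them, but they would only widen the gap.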