No one can deny the advantages of a dedupe appliance: the space and bandwidth savings they provide are astounding no matter what the brand. They are fairly affordable in all market segments, but sometimes the budget just isn’t there… so what to do? Well, one option, now that Windows Server 2012 has deduplication built in, is to use a physical server loaded with Windows Server 2012.
Setting up Windows and Configuring Dedupe
To get started, get your hands on a server (a physical server with lots of disks would probably work best). The only real requirement is two separate drives: one will hold the operating system and the other stuff that normally sits on the “C:” drive, and the second will become our deduplicated backup storage area. Once you have a server and your RAID groups set up, load Windows Server 2012 onto it (for this how-to I am using Server 2012 R2). Then configure the server with the basic settings you would normally set, such as hostname, IP address, domain settings, RDP, etc.
Next we need to add the deduplication feature, enable it, and format our backup storage. In “Server Manager” click on “Add roles and features”. Click Next until you get to the “Server Roles” page and scroll down until you see “File and Storage Services”. Expand “File and iSCSI Services”; under this section you will find Data Deduplication. Check the box next to it to install it. When the Add Roles and Features Wizard prompts you, click the Add Features button.
Click next through the rest of the wizard pages and then click Install. Once the installation is complete click Close.
Next we need to configure our backup storage disk to use deduplication as well as get it formatted and online.
Click on File and Storage Services in Server Manager and then click Disks. You should have both your “C:” drive listed as well as your Backup Storage disk that isn’t formatted.
Right click on the disk you want to use as your backup storage and click “Bring Online”.
Then right click on the drive and click New Volume. A wizard will start. Click Next on the first page, select the disk you want to format on the second page, and click Next again. A box will pop up telling you the drive will be formatted; click OK to proceed.
The next page will allow you to set a size of the new volume. Unless you have a good reason not to, set the size to the maximum available.
Next assign a drive letter.
The default settings of NTFS and “Default” Allocation Unit Size work fine, so just give your volume a name and click Next.
Next we get to the actual deduplication settings. First enable dedupe by selecting “General purpose file server” from the drop-down. Then select the number of days you want to keep data “undeduplicated”. This setting is important, as it determines how long a Veeam backup session sits on disk before it is deduplicated along with the rest of the sessions. If you have enough storage to hold at least 2X the size of a complete full backup of everything + 1 week of changes, then you can set this to 6 or 7 days. This will ensure that at least one full backup is not deduplicated, so your SureBackup jobs and normal restores will happen as quickly as possible. If, however, you are tight on space, you can set this as low as 1 day. This will yield the most space savings, but at the cost of slightly slower SureBackup jobs and normal restores.
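The sizing rule above is simple arithmetic, so here is a minimal Python sketch of it. It assumes the rule means twice one full backup plus one week of changes; the function name and the sample sizes are made up for illustration and are not part of Windows or Veeam.

```python
def pick_undeduplicated_days(capacity_gb, full_backup_gb, weekly_change_gb):
    """Suggest how many days to keep data undeduplicated.

    Assumed reading of the rule of thumb: if the volume can hold at least
    two full backups plus one week of changes, keep about a week of data
    undeduplicated; otherwise fall back to the 1-day minimum.
    """
    needed_gb = 2 * full_backup_gb + weekly_change_gb
    return 7 if capacity_gb >= needed_gb else 1


# Hypothetical sizes, purely for illustration.
roomy = pick_undeduplicated_days(capacity_gb=10_000, full_backup_gb=4_000, weekly_change_gb=500)
tight = pick_undeduplicated_days(capacity_gb=6_000, full_backup_gb=4_000, weekly_change_gb=500)
print(roomy, tight)
```

With 4 TB fulls and 500 GB of weekly change the rule needs 8.5 TB, so the 10 TB volume gets the week-long setting and the 6 TB volume gets 1 day.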
The last thing you will want to set up before clicking Next is the deduplication schedule. Click the Set Deduplication Schedule button. This step basically allows you to control when the dedupe process has priority over everything else. For me, I pick a start time of 5AM and allow it to run for 10 hours. This will allow Windows to make dedupe a priority from 5AM to 3PM every day, which will work perfectly since my backups run from about 7PM to midnight. You can adjust as needed.
After clicking Next you can review what will happen and then click Create to start the process. After the drive is ready to go you can click Close.
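If you want to sanity-check your own schedule, the following Python sketch tests whether a dedupe window collides with a backup window. It works on whole clock hours only, and the helper names are hypothetical; the times are the ones from my example.

```python
def window(start_hour, duration_hours):
    """Return the set of clock hours (0-23) a window occupies, wrapping past midnight."""
    return {(start_hour + h) % 24 for h in range(duration_hours)}


def windows_overlap(a, b):
    """True if the two windows share at least one clock hour."""
    return bool(a & b)


dedupe = window(start_hour=5, duration_hours=10)   # 5AM for 10 hours
backup = window(start_hour=19, duration_hours=5)   # about 7PM to midnight

dedupe_end = (5 + 10) % 24  # hour of day at which the dedupe window closes
print(dedupe_end, windows_overlap(dedupe, backup))
```

The dedupe window closes at hour 15 (3PM) and shares no hours with the backup window. Moving the dedupe start to 6PM would make the check report an overlap.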
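The dedupe part of the wizard can also be scripted with the Deduplication PowerShell cmdlets that ship with Windows Server. The Python sketch below only builds the command lines and prints them; it does not run anything. The drive letter E: and the 7-day value are assumptions for illustration, the function name is hypothetical, and the sketch does not cover creating or formatting the volume. Check the cmdlets against Microsoft's documentation for your Windows version before using them.

```python
def dedupe_setup_commands(volume="E:", min_file_age_days=7):
    """Build PowerShell commands mirroring the dedupe steps (built only, not executed)."""
    return [
        # Install the Data Deduplication role service.
        "Install-WindowsFeature -Name FS-Data-Deduplication",
        # Turn deduplication on for the backup volume (assumed drive letter).
        f"Enable-DedupVolume -Volume {volume}",
        # Days a file must age before it is deduplicated.
        f"Set-DedupVolume -Volume {volume} -MinimumFileAgeDays {min_file_age_days}",
    ]


for command in dedupe_setup_commands():
    print(command)
```

Paste the printed lines into an elevated PowerShell session on the server if you prefer that to clicking through Server Manager.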
If you click on the disk in Server Manager you can now see the deduplication ratio and deduplication savings at the bottom. Right now it will obviously not look very cool since we have no data.
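To make those two numbers concrete, here is a small Python sketch of how a savings figure and a percentage relate. It assumes savings means logical size minus the space actually used on disk, and that the rate is savings as a share of the logical size; Server Manager may define its figures slightly differently, and the sizes are invented.

```python
def dedupe_summary(logical_gb, physical_gb):
    """Return (space saved in GB, savings as a percentage of the logical size)."""
    saved_gb = logical_gb - physical_gb
    rate_percent = 100 * saved_gb / logical_gb if logical_gb else 0.0
    return saved_gb, rate_percent


# Hypothetical volume: 1000 GB of backup files stored in 400 GB on disk.
saved, rate = dedupe_summary(logical_gb=1000, physical_gb=400)
print(f"{saved} GB saved, {rate:.0f}% of the logical data")
```

An empty volume returns zero for both values, which matches what you will see right after setup.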
Configuring Veeam
I won’t take the time to walk you through a Veeam Backup installation since there are other articles on my blog that do that. Once you have Veeam up and running, though, we need to configure the backup repository to point to our dedupe drive. To do that, open Veeam and head over to the Backup Infrastructure section. Click on Backup Repositories on the left, then right click in the white space on the right and select “Add Backup Repository”.
Give your repository a friendly name and description, then click Next.
Click Next on the Type page, as “Microsoft Windows server” is what we want.
On the next page click “Populate”, and then select the dedupe drive you created in the first section. Then click Next.
On the repository page click on the Advanced button and select the boxes next to the two dedupe friendly settings. Then click OK and Next to proceed.
Go ahead and leave all of the vPower settings alone unless you have a reason to change them, then click Next and Finish to complete the wizard.
Lastly, while you are still on the Backup Repositories page, we will delete the “Default Backup Repository” just so you don’t accidentally select it for a job. However, before we can delete it we need to reconfigure where the configuration backups will land. To do this, click on the blue drop-down menu in the top left corner and then select Configuration Backup.
Change the backup repository to the new Dedupe Store and click OK.
Now you can right click on the “Default Backup Repository” and select Remove.
You can now configure your backup jobs and use the new dedupe store to hold your backups. To see how much space Windows deduplication is saving you, refer back to the disk section of Server Manager once the number of days you configured to wait before deduplicating files has passed. In my example I will need to wait a week before I see any benefits.
Stay tuned for the second post I’m working on, where I will test using DFS-R to replicate the dedupe backup data to another Windows 2012 server. Hopefully it will replicate only deduped data and not rehydrate the backups before sending them to its DFS-R partner, but we shall see.
Will this work with having a vmdk attached on a vm?
Yes, the screenshots from this article are actually from a VM. I am running it as a VM (but the VMDKs are targeted to local SAS drives in one of my VMware servers). In my lab I try not to run physical servers for anything but hypervisors, to save on power 🙂
Thank you Justin! This article came up at the exact time when one of my colleagues was saying that Data Domain is not a good solution but Windows 2012 is better!
So pointing the having the backup drive that you need de duplicating can be a vmdk drive that is provisioned from an equallogic array for example?
Sorry my spelling was awful on that last post!! What I’m saying is can the deduplication drive be a vmdk that is provisioned from a SAN such as the equallogic for example?
Zak,
While Server 2012 will work great, keep in mind that the Data Domain is going to provide better dedupe and compression and, most importantly, it will provide replication of deduped and compressed data. I’m still investigating, but I have read in several places that 2012 will rehydrate the data before replicating.
Yes, all of the dedupe is done inside of Windows, so it can be any type of disk that is non-removable. Windows doesn’t support dedupe on a removable disk (i.e. a USB drive).
Hello,
Thanks Justin for this blog, very interesting. At the beginning of this article you mentioned a physical server; is that a must? Or, for testing purposes, can I use a VM with pRDM or vRDM? Once I am fully aware of all the backup configurations, I will migrate the production server to Windows 2012, as it’s 2008 now.
My current backup infrastructure: Veeam on a physical server with 2 quad cores and 16 GB memory, and iSCSI LUNs as repositories. Symantec Exec on the same server for tape backup; I have not yet moved to v7.
You most certainly can use a virtual machine. In fact, that is what I was using for my screenshots. I actually didn’t even use an RDM; I just used a VMDK.
Thank you very much, will plan to try it.
I just don’t see the point of building a Windows 2012 server as a dedupe appliance if you’re going to use Veeam. Veeam does a great deduplication job (and replication) on its own. By using Windows 2012 deduplication you’re limiting Veeam’s own deduplication functionality and having to enable “Deduplicating Storage Compatibility” on Veeam.
I honestly do not see the point of using ANY deduplication appliance with Veeam, it is really not needed. Data Domain is great, especially with DD Boost, however, pair it with Veeam and you’re not able to use DD Boost, on top of that you just paid for replication and deduplication on the Data Domain, which Veeam can do out of the box.
Data Domain works great with Networker or Netbackup using DD Boost via OST, with Veeam, not so much… In much the same way, building a Windows 2012 deduplication appliance to be used with Veeam is unnecessary.
Well, Veeam does session dedupe, not global dedupe, which is what we are after by using a Data Domain or Windows dedupe appliance. I’m told that Veeam will also get DDBoost integration soon as well… nothing official of course, but it’s a good sign.
The problem I have with Veeam replication is that jobs still have issues running at the same time (backup and replication). Also, when Veeam is replicating over a slow connection it may have to restart the transfer, whereas Data Domain is able to pick up exactly where it left off, saving time and bandwidth. Whenever I do projects we use Veeam for backup only (not for replication), and we point those backups to a storage appliance, be it DD or VNXe/VNX CIFS (whatever). Then for replication we always use Zerto or RecoverPoint, as you can achieve lower RPOs and there is no need to mess with VMware snapshots. (Zerto is my personal choice, but sometimes customers have physical servers too.)
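The session versus global dedupe distinction in this reply can be sketched in a few lines of Python. This is a toy model with tiny fixed-size chunks and made-up data; it is not how Veeam, Data Domain, or Windows actually chunk or store data, and the function names are hypothetical.

```python
import hashlib

CHUNK = 4  # toy chunk size in bytes; real systems use far larger chunks


def unique_chunks(data):
    """Hash each fixed-size chunk and return the set of distinct hashes."""
    pieces = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]
    return {hashlib.sha256(p).hexdigest() for p in pieces}


def stored_per_session(sessions):
    """Each session deduplicated on its own: shared chunks are stored once per session."""
    return sum(len(unique_chunks(s)) for s in sessions)


def stored_globally(sessions):
    """One chunk store shared by all sessions: each distinct chunk is stored once."""
    store = set()
    for s in sessions:
        store |= unique_chunks(s)
    return len(store)


monday = b"AAAABBBBCCCCAAAA"   # hypothetical backup sessions
tuesday = b"AAAABBBBDDDDAAAA"
per_session = stored_per_session([monday, tuesday])
global_store = stored_globally([monday, tuesday])
print(per_session, global_store)
```

Per-session dedupe keeps 6 chunks here, while the shared store keeps 4, because the chunks common to both days are stored only once.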
Hi Justin
Did you ever get around to part 2, or finding out if DFS-R will rehydrate 2012 de-duped data when replicated?
Thanks.
Hi Justin, do you see any issues with backup or restore of Veeam 8 backing up to a Windows 2012 R2 Deduplicated volume then running a Veeam backup to tape? We are seeing some pretty average dedupe and compression with large video files and was interested in setting up Dedupe on the Veeam backup server to get further disk reductions to disk and tape.
Cheers
I noticed you spoke of writing a part 2 article where you would report on your use of DFS-R to replicate Windows Deduplication volumes. You expressed the hope that DFS-R would not rehydrate the files before sending them.
I assume you discovered that DFS-R does rehydrate the files, although it then uses its own transmission optimization techniques to reduce the amount of data transferred during replication. Unfortunately, this reduction is nowhere near what would be provided by a true dedupe-aware replication.
We really like Windows deduplication also, but for our customers to be able to use it in the real world, we need replication. After searching all over for a solution, we sat down and wrote one. It allows you to replicate a Windows deduplication volume to one or more remote Windows deduplication volumes, making the remote an exact clone of the local volume by transferring only the modified and new deduplication chunks, and the reparse points.
It can also replicate to a windows deduplication volume on an external esata or usb drive.
The product is called Replacador and it is in beta now (Dec 2014). We’d love to know how it works with Veeam, if you want to try it out.
After lots of testing, I wrote an article about how to make Windows Deduplication faster. I tripled my throughput with these techniques.
http://windowsdeduplication.com/how-to-make-windows-deduplication-go-faster/
Per the best practices webinar hosted by Veeam, the dedupe advanced settings (align data blocks and decompress data backup blocks) should not be enabled. These settings are for backup appliances only, and Win2012 is not a backup appliance. Checking both of these settings will increase the storage requirements for backups.
https://www.youtube.com/watch?v=lu0TG_Lw_WI
Thanks for the tip Nate!