I’ve done posts before about how much I hate Microsoft Clusters, mainly because of their shared disk requirements. While VMware does support a virtualized Microsoft cluster with shared disks, the bottom line is that its just a pain in the ass, so when I heard about SQL AAG’s (AlwaysOn Availability Groups) at VMworld the other week I was pretty excited. Sure, they have been out for a while (they came out with SQL 2012) but apparently I live under a rock.
So at a high level thing of an SQL AAG just like you would an Exchange DAG. It allows you to use multiple instances of Windows and a specific application (in this case SQL) to create a highly available service just like you would with an old school Microsoft cluster. But there are actually a lot more benefits.
In the example below from a MS document, it shows how you can use this functionality not only to make SQL highly available at your primary datacenter, but you can also use it to synchronously or asynchronously replicate that data to another SQL server at a DR site.
I’m a big fan of using application specific DR technologies, don’t get me wrong Zerto is awesome and Veeam does a good job too (As do most other products out there), but with this technology your going to get a SQL instance that is already running at DR the moment that things go down at your main site… it’s hard to beat that!
Ok so the whats the downside right ? Well it’s all about the money of course! Basically you’re going to get hit with two things… you need more disk space and you might need more licenses. Let me explain.
SQL server allows you to have a passive copy of SQL running, so if you intend to only have a two node cluster and the second copy of the data is 100% passive unless the first fails then you need no extra licenses outside of whatever is required for one server. (Oh BTW you need SQL Enterprise or the AlwaysOn features) If however you want more than one passive copy, or if you want to read from a passive copy (for doing backups, or to just load balance the reads or something) then you have to fully licenses those copies as well.
For more information on Microsoft SQL licensing see here.
On the disk side things are a little more flexible. Obviously you could use physical servers if your environment requires it, you can also make a bunch of virtual machines, but one option you can’t do with Microsoft clusters is put some in the cloud and some on premise. That is probably one of the coolest things about the AlwaysOn feature is that if you have a hybrid cloud it will allow you to take advantage of both sites for a highly available solution.
How to make it happen
The first thing you need to do is install two or more copies of Windows, I used Server 2012 Datacenter for my testing. After Windows is installed make sure to join the machines to your domain and set up the networking properly.
The second step is to install and configure Failover Cluster services. To do that you need to add the Failover Cluster feature to all of your nodes, once it is on your nodes login to one of them and start the Failover Manager.
We need to run the Create Cluster Wizard. Once we start the wizard add all of your nodes to the c luster.
Next allow the wizard to run the validation tests and make sure everything is ok. Then enter the cluster name and IP address.
Then click next through the Create Cluster wizard. One note, I unchecked the “Add all eligible storage to the cluster” option as I don’t want any storage managed by the cluster. You should now have a cluster with your nodes in it, and it should look something like this one.
Once you finish the wizard you can minimize the Failover Manager.
Next install SQL Server 2012 (or in my case 2014, just because I didn’t have 2012 on my laptop). Do a standard installation just like you would normally do, do not do a cluster installation…. just a normal stand alone install. Do this on each node that you want in the AlwaysOn availability group. Note: you will want to install the SQL server service to use a domain login.
After SQL is installed we need to enable AlwaysOn, to do that open up the SQL Configuration manager and click on “SQL Server Services”. In the main part of the page you should see “SQL Server (MSSQLSERVER)” right click on it and select Properties. In the properties dialog box select the “AlwaysOn High Availability” tab. You then need to check the “Enable AlwaysOn Availability Groups” box and then click OK to exit the properties.
After enabling AlwaysOn you will need to restart SQL Server, to do that you can go into services or you can simply right click on the same place you did a minute ago and restart it from within the SQL Configuration Manager.
Now inside of SQL Management Studio you should be able to expand out the AlwaysOn section, however there wont be anything in it yet.
Before we can create an availability group we need something to protect so create a database and make sure the logging mode is set to Full. Also for the nodes to get in sync with each other you will need a shared folder somewhere on your network, if you don’t already have one create one now.
After you have created your database go back down to “AlwaysOn High Availability” and right click, then select “New Availability Group”.
Give the group a name.
Next select the databases that you want to be in this group. You can select one or many databases.
Next you will need to specify which servers are going to be replicas. To do this click the Add Replica button and type the name of the other server(s) you want to replicate to.
Once you have all of your nodes added, make sure to select the proper options as to what you want them to be doing, in most cases you will want them to automatically failover and be synchronous, but remember if you select “Readable Secondary” then technically you need to fully licenses that copy of SQL as it will no longer be considered passive.
Before clicking next select the “Listener” tab. Here you need to create a listener with a name, port, and IP address. Then click next.
Next we will use that shared directory I was telling you about. It will be used to store a backup of the database so that you can sync up the other nodes of the group. Select “Full” and then enter the path to your shared folder.
Next the wizard will validate your setup, and then you will click next to start the process to setup the group. After all has completed you can click close.
Now under AlwaysOn High Availability (after you refresh SQL Management Studio) you should see a group.
Also note that it has listed which nodes are primary and secondary.
At this point you now can point clients at the listener and they can talk to the databases in your group!
But what about the failover?
Well I figured that showing you screenshots would be kinda dumb, so I built a two node group and put together a video where I manually failover between nodes, and then force a failover by rebooting a node.
Check out the video!
Video information:
I used VMware Workstation to create two Windows 2012 Server nodes and a Windows 2008 R2 domain controller. I then installed Failover Clustering on the two 2012 nodes as well as SQL Server 2014 CTP1.
So how can you get started ? Well I would start by doing more research. If you are going to have both nodes on a single site you would probably be ok, but if you are going to look at doing a second site, or a split site two node cluster or something like that you will probably want to look into how to create a file share witness so that you avoid some split brain stuff, and make sure that services stay at your main site in the event of a network outage.
Anyhow, Thanks for reading!
Justin, nice post. One advantage of Zerto is that it can save you £££’s on SQL licensing as you don’t need a second license in DR site in that 3 node setup described below. The other advantage of course is that it can protect a multi-VM application stack with consistency via write order fideily for stacks like Sharepoint – something that AAGs themselves don’t help with.
Yes Sir! Agree completely
Could you use a hybrid of AOAG and Zerto?
For example;
Site A
– SQL-PRI – Primary
– SQL-2ND – Secondary replica for read only reporting + Operational Data + Data Marts, etc.
Site B
– SQL-PRI – Primary
Zerto would handle DR for SQL-PRI
SQL Server would replicate the data between SQL-PRI and SQL-2ND
If Site A was destroyed SQL-PRI would be fully functional at Site B but no SQL-2ND would be available for offload of reporting, etc. until Site A recovery occurs.
That could potentially benefit software products that are not database HA/DR friendly.
?