Zerto Cloud Service Providers are the group of people who leverage Zerto Virtual Replication to provide disaster recovery as a service to customers who do not wish to own a DR site. Essentially, they (the service providers) have all of the needed infrastructure in place to allow customers to fail over their virtual machines into the service provider’s environment. So what does it take to be a Zerto cloud service provider?
This article is the third in a series and will describe how I’ve done multi-tenant networking in my cloud provider lab, as well as what some of the other options are.
Networking in a multi-tenant cloud is the most interesting part of the series, for me anyhow.
Essentially the goal of networking in a multi-tenant cloud is to provide a secure, isolated network for each customer while still allowing them to get to their data.
To make these goals happen, there are several categories of networking that you will need to configure.
The first type is internal VLAN networking, meaning the networking that lives inside of your four walls and separates one customer’s data from another’s. The second is external networking, meaning the networking that brings customer traffic from their site to yours. The third type is the networking that glues the previous two together, such as firewalls, VPNs, and routers.
Whether you are the smallest ZCSP or the biggest, you will always have these three networking components in some form. Lastly, before we get started, you should note that Zerto doesn’t really care which WAN type you pick, what firewalls you use, and so on. It works with them all equally well. The only concern Zerto has is that it can route data from one site to another.
The easiest way to start is probably to show you one way that Zerto Cloud Provider networking can be configured.
What the following diagram is trying to show is that you can have customers with a variety of different WAN connectivity types terminating into your facility. Once inside of your four walls the traffic is processed the same way for everyone. It is first put into a VLAN, then sent to the ZCC. From there the ZCC gets the data to the proper Zerto component on the management network.
[stextbox id=”info”]Some ZCSPs use dedicated replication VLANs; others allow replication to flow through the “Failover Production” portgroup / VLAN. To Zerto it doesn’t really matter.[/stextbox]
In my environment, I terminate remote sites directly into the “Failover Production” VLAN. Here is what it looks like:
Also in this diagram, you can see that the ZCSP’s management network is completely separate from any of the customer VLANs. The shared Zerto components, like the ZVM, ZCM, and VRAs, are all installed in the management network.
This is important to understand. For infrastructure to be multi-tenant, it has to isolate tenants from each other effectively and also isolate ALL backend management components so that tenants cannot access them. If they are not properly isolated, you do not have a secure environment.
So a ZCSP should have all of their Zerto components as well as all of their VMware vSphere components in one or more management networks that are isolated. Personally, I keep my VMware and Zerto management components all in the same management VLAN.
Replication Traffic Flow
Zerto has created a method to get replication traffic from the client data center to the management network at the ZCSP. If you look at the diagram again, you can see the component that straddles the two networks. This is the Zerto Cloud Connector, or ZCC for short.
Here is an animated version of the diagram to help you visualize what is going on with replication to a ZCSP.
Replication traffic generated by the client site VRAs is sent across the WAN to the ZCC that sits at the ZCSP datacenter. The ZCC verifies that the data needs to be transferred to the management network, then forwards it to the proper VRA at the ZCSP site. Management traffic also passes through the ZCC on its way to and from the client network.
In a sense, the ZCC is acting as a proxy server for replication traffic as well as Zerto management traffic. When you are configuring your ZORGs (we will talk about these more in a later article), you will set up your ZCC with two addresses: an IP and default gateway on the customer side, pointing to the VPN router that can talk to the client site, and an IP in your management network. The ZCC then configures the needed static routing and default gateway to make this all work.
[stextbox id=”info”]During reverse replication, or in situations where the production VMs are running at the ZCSP site, the data flow is the same but in the opposite direction.[/stextbox]
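Since the ZCC sits on two networks at once, it can help to sanity-check the addressing on paper before you deploy one. The following is a minimal sketch, not anything provided by Zerto: the `ZccPlan` class, its field names, and all of the addresses are hypothetical and only exist to illustrate the idea.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class ZccPlan:
    """Hypothetical addressing plan for one customer's ZCC (not a Zerto API)."""
    customer_net: str      # customer-facing VLAN subnet at the ZCSP
    customer_ip: str       # ZCC interface on the customer VLAN
    customer_gateway: str  # VPN router that can reach the client site
    mgmt_net: str          # ZCSP management subnet
    mgmt_ip: str           # ZCC interface on the management network


def check_zcc_plan(plan: ZccPlan) -> list[str]:
    """Return a list of problems; an empty list means the plan looks sane."""
    problems = []
    cust = ipaddress.ip_network(plan.customer_net)
    mgmt = ipaddress.ip_network(plan.mgmt_net)
    # The customer VLAN and the management network must stay distinct.
    if cust.overlaps(mgmt):
        problems.append("customer and management subnets overlap")
    if ipaddress.ip_address(plan.customer_ip) not in cust:
        problems.append("customer-side IP is outside the customer subnet")
    if ipaddress.ip_address(plan.customer_gateway) not in cust:
        problems.append("default gateway is outside the customer subnet")
    if ipaddress.ip_address(plan.mgmt_ip) not in mgmt:
        problems.append("management-side IP is outside the management subnet")
    return problems
```

Calling `check_zcc_plan(ZccPlan("10.10.20.0/24", "10.10.20.5", "10.10.20.1", "192.168.50.0/24", "192.168.50.25"))` returns an empty list, because both interfaces land in their own subnets and the two subnets do not overlap.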
My ZCSP networking and other options
Inside my infrastructure, things look almost identical to the diagrams I have used in this article. I have two VLANs defined for each ZORG: one production VLAN and one test VLAN. Inside of VMware I use a distributed virtual switch and create a port group for each of the VLANs. I also create all of the VLANs on my switching gear.
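If you want to plan this out before touching vCenter or your switches, the two-VLANs-per-ZORG layout can be sketched in a few lines of Python. Treat this as an illustration only: the starting VLAN ID, the reserved set, and the port group naming pattern are assumptions I made up for the example, not something Zerto or VMware requires.

```python
from itertools import count


def allocate_zorg_vlans(zorgs, first_vlan=100, reserved=frozenset()):
    """Assign a production and a test VLAN ID to each ZORG name.

    first_vlan, reserved, and the port group names are illustrative
    assumptions; adjust them to match your own environment.
    """
    # Walk upward from first_vlan, skipping IDs already used elsewhere.
    ids = (v for v in count(first_vlan) if v not in reserved)
    plan = {}
    for name in zorgs:
        prod, test = next(ids), next(ids)
        if test > 4094:  # 802.1Q VLAN IDs end at 4094
            raise ValueError("ran out of 802.1Q VLAN IDs")
        plan[name] = {
            f"{name}-Failover-Production": prod,
            f"{name}-Failover-Test": test,
        }
    return plan
```

For example, `allocate_zorg_vlans(["acme", "globex"], reserved={101})` gives acme VLANs 100 and 102, and globex VLANs 103 and 104. The same dictionary can then drive both the port group creation and the switch configuration so the two never drift apart.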
If you read the first two articles in this series you know I’m not using NSX, so these are just regular 802.1Q VLANs.
Each customer also has a pfSense firewall doing routing and VPN work (again, because I’m not using NSX).
Internal network architecture and product selection will vary depending on budget and how large you want to scale your environment. Remember there are about 4000 usable VLANs (802.1Q VLAN IDs run from 1 to 4094). That may seem like a lot, but if you allocate each client 5 VLANs, you are now limited to about 800 customers. Plus you will probably use a few VLANs for management too. So to be safe, let’s say that 600 is the practical limit (since some customers will probably want more than 5 VLANs).
[stextbox id=”alert”]Keep in mind that this maximum is per core switch. So even if you have 5 PODs, as we talked about in the last article, the limit applies collectively to all of the PODs that share a core switch.[/stextbox]
Each customer will need at a minimum 2 VLANs: one for their Zerto fail-over test network, and another for their production fail-over network and replication traffic. This minimum assumes several things:
- The customer wants a test VLAN
- The customer only needs one production VLAN
- Replication traffic will flow through the production VLAN
If any of the above isn’t the case, then your VLAN count per customer will vary. (That’s why I said to figure 5 per customer, just to be safe.)
The takeaway here is that if you plan to scale beyond 600-800 customers, then you may need to look at VXLAN. The only other option would be to build an entirely different POD with its own core switching. Because of the second, independent core switch, you have another 4000 VLANs to work with.
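The VLAN math is simple enough to put into a small function so you can play with your own numbers. This is just a sketch of the arithmetic from this article; the `mgmt_vlans=10` value in the last line is an assumed figure for illustration.

```python
def max_customers(usable_vlans=4000, mgmt_vlans=0, vlans_per_customer=5):
    """Customers that fit in one core switch's VLAN space."""
    return (usable_vlans - mgmt_vlans) // vlans_per_customer


print(max_customers())                      # 800, the 5-per-customer planning figure
print(max_customers(vlans_per_customer=2))  # 2000, if everyone stays at the 2-VLAN minimum
print(max_customers(mgmt_vlans=10))         # 798, after holding back 10 VLANs for management
```

The 600 figure I use as a practical limit is lower than any of these on purpose; it leaves headroom for the customers who end up needing more than 5 VLANs.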
VXLAN, on the other hand, will give you about 16 million segments.
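The difference comes down to the size of the ID field. An 802.1Q VLAN ID is 12 bits, while a VXLAN Network Identifier (VNI) is 24 bits. A quick worked calculation:

```python
VLAN_ID_BITS = 12    # 802.1Q VLAN ID field
VXLAN_VNI_BITS = 24  # VXLAN Network Identifier field

usable_vlans = 2**VLAN_ID_BITS - 2   # IDs 0 and 4095 are reserved
vxlan_segments = 2**VXLAN_VNI_BITS

print(usable_vlans)    # 4094
print(vxlan_segments)  # 16777216
```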
Bottom line: if scalability is a concern, VXLAN is probably the answer, but it comes at a premium price from VMware. It’s been a while since I reviewed the VMware service provider license agreement, but if you check, you will probably see that there is still an upcharge for leveraging NSX.
In my ZCSP lab, I am using VPN connections to remote sites. They are provided by the pfSense firewall VMs that I talked about earlier. While they are not an enterprise class solution, they are proving to be pretty reliable and work very well for my use case. The two that I have in place have been operating flawlessly for over a year now, probably because once they are configured they “just work” and have had no config changes, just security updates.
In the real world, external networking is going to vary with almost every customer. The biggest thing to remember here is that Zerto does not encrypt traffic between VRAs, and Zerto does not support NAT between any Zerto components. So, protecting replication traffic is the job of the network layer.
There are two things to remember when determining if a particular WAN connection will work with Zerto:
- Zerto does not support NAT between any Zerto components; or between Zerto and vCenter
- Zerto does not encrypt network traffic
This means that you will need to provide a single layer two network shared between sites, or you need to have layer three routing between sites. Make sure that NAT is not anywhere in the middle.
[stextbox id=”alert”]Remember Zerto does not support NAT between any Zerto components[/stextbox]
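In the routed (layer three) case, one common reason people end up reaching for NAT is that the client site and the ZCSP side use overlapping address ranges. Since NAT is off the table, it is worth checking for overlaps before you build the VPN. Here is a minimal sketch using Python’s standard `ipaddress` module; the site names and subnets in the usage example are hypothetical.

```python
import ipaddress
from itertools import combinations


def find_overlaps(subnets):
    """Return pairs of named subnets whose address ranges overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    return [
        (a, b)
        for (a, na), (b, nb) in combinations(nets.items(), 2)
        if na.overlaps(nb)
    ]
```

For example, passing `{"client-site": "192.168.1.0/24", "zcsp-customer-vlan": "192.168.1.128/25"}` returns the pair `("client-site", "zcsp-customer-vlan")`, which tells you one side needs to be renumbered before routing between the sites will work. This check does not apply if you are stretching a single layer two network between sites, since both ends share a subnet by design.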
WAN security is provided via a VPN or MPLS or some other type of connection that is natively secured.
Zerto is very flexible regarding networking. The only real requirements are no NAT and a secure connection from the client site to the ZCSP. Outside of that Zerto has no preference as to what vendor, or what type of connection you are using.
Another thing to take away is that Zerto doesn’t list a “best practice” or specific requirements on how you set up your cloud networking. All of the decisions are up to you and what is best for your customers, as long as you can deploy ZCCs and get replication traffic from the client site through the ZCC and into the management network.
Lastly, Zerto networking is more complicated to set up than some of the other “offsite backup/DRaaS” offerings. But what good is VM recovery if it isn’t accessible? When you configure a client for Zerto replication, they are also configuring everything they need to take advantage of those VMs once they fail over. This isn’t something that you get out of the box with a solution that is tunneling recovery data over a single SSL connection.
ZCSP Post Series
This is post 3 of many in a series. I’d love for you to follow along and provide feedback and input as I go. If you are not already following my blog, I encourage you to sign up on the right under the sponsor ads. Don’t worry; you will only get mail when new articles get published.
For a list of other articles in this series, please visit the series homepage here.
Need more info on how to be or get started with a ZCSP? Let me know.