Design Tip: Switching in a VMware Environment

Switching is the backbone of every network; without it, you really don’t even have a network. In a VMware environment that uses iSCSI it can be twice as crucial, because storage traffic also depends on the Ethernet switches.

Normally for front-end connectivity (from the VMware servers to the core, or maybe even as the core itself) I see Cisco 3750 switches stacked together. This is a really nice solution because it allows you to team multiple vmnics for a port group and distribute them across different switches in the stack for more redundancy.

However, if you are using iSCSI you may be better off using two 3560s or 2960s. Why? Well, if you are doing multipathing properly then you create two subnets anyway (one subnet per switch), so there is really no need for the traffic to cross over between switches. More important, though, is what can happen, and what did happen to me last Sunday morning.
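As a quick sanity check on the "one subnet per switch" layout, here is a minimal Python sketch that confirms the iSCSI vmkernel addresses really do sit on separate subnets. The vmk names and IP addresses are made up for illustration; substitute your own.

```python
import ipaddress

def subnets_are_disjoint(interfaces):
    """Return True if no two interface addresses share or overlap a subnet."""
    networks = [ipaddress.ip_interface(i).network for i in interfaces]
    for idx, first in enumerate(networks):
        for second in networks[idx + 1:]:
            if first.overlaps(second):
                return False
    return True

# Hypothetical iSCSI vmkernel addresses, one per physical switch.
iscsi_vmks = {
    "vmk1 (switch A)": "10.10.1.11/24",
    "vmk2 (switch B)": "10.10.2.11/24",
}

print(subnets_are_disjoint(iscsi_vmks.values()))
```

If both vmk ports turn out to be in the same subnet, the traffic has a reason to cross between switches, and the two-independent-switch design no longer holds.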

I received a call from my co-worker, who was doing an IOS upgrade on a stack of 3750 switches. He said that after the switches rebooted he could no longer reach any of the virtual machines. I tried to remote in, but had no luck because the VPN software relies on Active Directory for extended authentication. Luckily for me, this customer isn’t far from where I live.

After getting onsite I connected to the ESXi hosts and found that all of the VMs were powered off. As we know, on vSphere 4 HA will by default power down VMs if the hosts become isolated. When I talked to my CCIE-level co-worker, he explained that he had rebooted all of the switches in the stack at the same time, because if you do them one at a time they won’t rejoin the stack when they come back up, due to the mismatched code versions.

With vSphere 5 we wouldn’t have had an issue if our storage had been Fibre Channel, SAS, or iSCSI (provided the iSCSI traffic was not using the 3750 stack). So the tip from me is this: if you are going to use iSCSI for storage, make sure that you use separate switches from the ones that pass VM and management traffic. If possible, use two separate switches so that one can be upgraded and rebooted independently of the other and storage always stays online. Lastly, this also keeps the hosts from being declared isolated during a 3750 stack upgrade, because they can use a heartbeat datastore (vSphere 5 only), or you could enable management traffic on an iSCSI vmk port and set a secondary isolation address (the SAN, perhaps).
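To make the reasoning concrete, here is a rough Python sketch that models which switches carry each kind of traffic for a host and reports what the host loses when a set of switches reboots together. The host name, switch names, and both layouts are hypothetical; it is only a thinking aid, not something that talks to vCenter or the switches.

```python
def lost_services(host_paths, reboot_unit):
    """Return {host: [services with no surviving path]} when every
    switch in reboot_unit is down at the same time."""
    down = set(reboot_unit)
    result = {}
    for host, services in host_paths.items():
        # A service is lost when all of its switches are in the rebooting set.
        lost = sorted(name for name, switches in services.items()
                      if set(switches) <= down)
        if lost:
            result[host] = lost
    return result

# Hypothetical layout like the incident: everything rides on the 3750 stack.
shared_design = {
    "esx1": {"management": ["stack-1", "stack-2"],
             "storage": ["stack-1", "stack-2"]},
}

# Hypothetical layout following the tip: iSCSI on two independent switches.
separate_design = {
    "esx1": {"management": ["stack-1", "stack-2"],
             "storage": ["iscsi-a", "iscsi-b"]},
}

stack = ["stack-1", "stack-2"]  # the whole stack reboots as one unit

print(lost_services(shared_design, stack))          # {'esx1': ['management', 'storage']}
print(lost_services(separate_design, stack))        # {'esx1': ['management']}
print(lost_services(separate_design, ["iscsi-a"]))  # {}
```

In the shared layout a full stack reboot takes out management and storage together. With separate iSCSI switches, storage survives the stack upgrade, and either iSCSI switch can be rebooted on its own without the host losing anything.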
