Two years ago I posted an article describing how the VNXe3100 (and 3300 and 3150) had a major drawback in the way it handled IO during a controller reboot or failure. If you didn't get a chance to experience this for yourself, head over to my older article and take a quick look: http://www.jpaul.me/2012/07/vnxe-sp-failover-compared-to-other-solutions/.
Now that you know what the issue was, let's talk about the new VNXe3200. I received a demo box from the folks at EMC through Chad Sakac's blog a couple of weeks ago and have finally gotten a chance to play around with it and redo the same test that I ran two years ago against the 3200's predecessor. The results were awesome, but before I go into the details let me just say that I no longer dread proposing a VNXe. I now know firsthand that this box is up to the task and should have no problem living up to the reputation of its big brothers (the VNX series).
Why is the 3200 different?
So the VNXe 3200 borrows the heart of the VNX: its MCx code. It then uses that code to provide native iSCSI (not emulated like all previous VNXes) as well as Fibre Channel connectivity. Simply put, this thing has big-boy block protocols. Because of this, block services are running on both SPs at the same time, so there is no service "reboot" time like the older systems needed when restarting their iSCSI servers.
I'm not going to go into all the other new awesomeness, as there will be lots of other posts about this box on the way, so let's get into what happened during the failover tests.
From the VNXe 3200 I have presented two LUNs to my VMware ESXi servers. The virtual machine I will be using to test with (SQLAO1) is on a LUN named "FAST_Pool_02" (and yes, the VNXe3200 also has Fully Automated Storage Tiering, just like the VNX series). Here is a screenshot showing that SPA is the owner of this LUN — again, very different from the previous VNXes. On those models, LUNs were owned by an iSCSI server, not a storage processor.
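You can also confirm the ownership picture from the ESXi side. A quick sketch, assuming you're on the ESXi shell — the `naa.` device identifier below is a placeholder, not my actual LUN ID; grab yours from the output of `esxcli storage nmp device list` with no arguments:

```shell
# List the Native Multipathing (NMP) details for one device.
# The "Working Paths" line shows which SP's ports are currently active,
# which maps back to the owning SP shown in Unisphere.
# naa.600601601234567890abcdef is a hypothetical device ID -- substitute your own.
esxcli storage nmp device list -d naa.600601601234567890abcdef
```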
So to do the test, I decided I would copy 16.2GB of Linux ISO files from my VNXe3100 CIFS share to this VM.
The first time that I transferred the 16.2GB of ISO files I was getting between 80 and 90 MB/s, as reported by Windows.
The next thing to do was to kill SPA. Since the garage was too far to walk to, I used the SP Reboot feature in the Service System menu instead.
By the time I rebooted the SP I already had the transfer running a second time; in fact it was probably halfway through. When the reboot hit, Windows kept right on copying data, buffering it inside of Windows for a short time. I attribute this buffering to VMware and its native multipathing policies, because within a few seconds the buffer started flushing out just as fast as it had built up. On the VMware Disk Performance graph for the SQLAO1 VM, you can see that traffic spikes down when the paths switch over, but then immediately spikes back up as it starts to use the other paths through SPB.
As you can see, the transfer is limited by my 1 Gbps network: right after the path switchover, VMware dumps all of the data that has been buffering for the last 10 or so seconds to the SAN and is able to spike up to 150 MB/s because of multipathing (for reference, I just have 2 x 1 Gbps links and the Round Robin IOPS limit set to 1 on the ESXi host).
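If you want to reproduce the "IOPS Limit = 1" setup on your own host, here's a hedged sketch of the esxcli commands involved (again, the `naa.` ID is a placeholder for your actual device):

```shell
# Put the device on the Round Robin path selection policy
# (naa.600601601234567890abcdef is a hypothetical device ID)
esxcli storage nmp device set -d naa.600601601234567890abcdef --psp=VMW_PSP_RR

# Tell Round Robin to rotate to the next path after every single I/O;
# the default is 1000 I/Os per path before switching.
esxcli storage nmp psp roundrobin deviceconfig set -d naa.600601601234567890abcdef --type=iops --iops=1
```

With IOPS=1, I/O is sprayed across both uplinks nearly simultaneously, which is why the flush can briefly exceed what a single 1 Gbps link could carry.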
The total time from when I clicked reboot until the paths were back online and active was about 9 minutes. During that time I grabbed a screenshot of the path list in VMware, showing that the b0 and b1 ports were in use.
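Besides the vSphere Client screenshot, the same path states can be watched live from the ESXi shell. A minimal sketch, with a placeholder device ID:

```shell
# List every path to the device and its state. While SPA is rebooting,
# its paths show as dead; the SPB paths (b0/b1 ports) stay active.
# (naa.600601601234567890abcdef is a hypothetical device ID)
esxcli storage core path list -d naa.600601601234567890abcdef
```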
After the reboot was completed I checked the paths again and it had automatically failed back to SPA as the owning controller.
I've only had the box operational a week, maybe two at the most, and already I am certainly impressed. EMC has fixed a lot of the major issues that I had with the older VNXe series. In fact, I'm not entirely convinced that they should even call this box a VNXe… but that is another post, coming soon.
I see this box being a great addition to my consultant toolkit. With the VNX5100 going away at some point, there was a small hole left for customers who needed enterprise SAN functionality in a cost-effective platform. With the addition of FAST VP and FAST Cache to the VNXe 3200, as well as Fibre Channel and native iSCSI, it's not hard to figure out why they picked the code name KittyHawk… because this thing does fly compared to its older brothers.
Stay tuned, more VNXe3200 articles coming as I have time to get them posted.
Disclaimer: EMC has provided a VNXe3200 demo unit for me to perform these tests with. They have, however, encouraged everyone in the demo program to post the good, the bad, and the ugly all the same. I am not being compensated for this article or any of the other EMC articles posted on my site. In fact, it actually costs me money, because I haven't figured out how to get the power company to sponsor my lab yet 🙂