For today’s post I am happy to have guest blogger Bruce Henderson (Twitter). Bruce and I worked together on one of our first customer VSAN deployments (this post has been in draft status for a bit) and captured a few “gotchas” that we stumbled across along the way. More importantly, this post coincides with a great post from last week by Tom Howarth (Blog/Twitter) on challenges with VSAN licensing that caught my eye; that post is located here.
Without further ado, a VSAN deployment in Bruce’s own words….
Over the course of the last month we have been working with a customer to deploy our first VSAN implementation. Since I was the system integrator who had the pleasure of deploying the final VSAN implementation designed by my buddy Jason, he asked me to collect my thoughts and experience and share them.
In a previous post about VSAN network design using the vSphere Distributed Switch (VDS) (Notes from the Field: VSAN Design–Networking), Jason shared how VMware was nice enough to include the VDS license with VSAN. So even if one did not have the VMware top tier license, Enterprise Plus, one could still take advantage of VDS and Network I/O Control (NIOC). What I came to find out is that the VSAN/VDS licensing information has not yet reached all of VMware, nor the licensing mechanism in vSphere/ESXi. This VSAN/VDS licensing debacle, along with a few other VSAN deployment gotchas, took a deployment from exciting new technology to frustration very quickly.
The plan for deployment went like this:
- Install ESXi 5.5 on the first host
- Bootstrap vCenter Server on the first host using the single host VSAN Datastore (http://www.virtuallyghetto.com/2013/09/how-to-bootstrap-vcenter-server-onto.html; http://www.virtuallyghetto.com/2013/09/how-to-bootstrap-vcenter-server-onto_9.html)
- Add first host to vCenter & to a vSphere cluster with VSAN enabled & configured in manual mode
- Build the converged 10GbE network with VDS
- Install ESXi, add to vCenter, & configure VDS for hosts 2 & 3 before adding them to the VSAN cluster
- Add the new hosts storage resources to VSAN
The deployment stuck to the plan for the most part, other than two small issues and one issue that VMware support failed to resolve. During this whole process I would recommend keeping a second SSH CLI connection open to the ESXi host currently being configured, running the tail command on vmkernel.log: tail -f /var/log/vmkernel.log. That will give you all of the information you need if you encounter new issues.
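On a busy host vmkernel.log scrolls quickly, so it can help to narrow the tail to VSAN-related lines. The sketch below is only a starting point: the function name is made up for illustration, and the keyword list (vsan, lsom) is an assumption based on the errors seen in this deployment, not an official list of log tags.

```shell
#!/bin/sh
# Keep only the lines on stdin that mention VSAN-related components.
# The keywords below are an assumption; widen the pattern as needed.
vsan_log_filter() {
    grep -i -E 'vsan|lsom'
}

# Intended use on the ESXi host (not executed by this snippet):
#   tail -f /var/log/vmkernel.log | vsan_log_filter
```

Keep the unfiltered tail handy as well, since an unfamiliar error may not match either keyword.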
When you get a server from a vendor and have specified no RAID configuration, you would assume that the drives in the unit would not have been configured and partitioned. Well, I guess in this instance that was too much to ask for. On all of the ESXi hosts used in this three node VSAN cluster, RAID 5 was configured for the 10K spinning disks and they had been partitioned. Even after the RAID set was broken, the disks still would not show up as available to add to VSAN. If there is any partition on the drive, VSAN will assume that it has important data on it and will not allow you to claim it (good policy =)). Not a huge issue, but knowing this will allow you to prep the drives first.
- Via the SSH cli grab a list of your disk devices
o esxcli storage core device list
- Then it’s just a matter of querying each drive you are going to use for VSAN for partitions and removing them, again via the SSH cli
o partedUtil get /dev/disks/naa.500xxxxxx
15566 255 63 250069680
1 2048 6143 0 0
2 6144 250069646 0 0
- Here we see that this particular disk has two partitions indicated by the 1 & 2 below the drive geometry information
- Deleting those partitions is simple, again via the SSH cli (QUICK DISCLAIMER: THIS WILL DELETE ALL DATA FROM THE DISK! PLEASE MAKE SURE YOU ARE WORKING ON THE APPROPRIATE DISK BEFORE EXECUTING THIS COMMAND =) )
o partedUtil delete /dev/disks/naa.500xxxxxx 1
o partedUtil delete /dev/disks/naa.500xxxxxx 2
- Querying the disks again will result in just drive geometry info
Once you have removed all of the partitions, the drives will then be available to add to the VSAN cluster.
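With several disks per host, typing the delete commands by hand invites mistakes. The sketch below reads partedUtil get output and only prints the matching partedUtil delete commands so they can be reviewed first; it never runs them. The function names are hypothetical, and it assumes the output layout shown above (one geometry line followed by one line per partition).

```shell
#!/bin/sh
# Read `partedUtil get` output on stdin and print one partition number per line.
# Line 1 is the drive geometry, so it is skipped; blank lines are ignored.
list_partition_numbers() {
    line_no=0
    while read -r number rest || [ -n "$number" ]; do
        line_no=$((line_no + 1))
        if [ "$line_no" -gt 1 ] && [ -n "$number" ]; then
            echo "$number"
        fi
    done
}

# Print (do NOT execute) the delete command for every partition on a disk.
# $1 is the disk path; the `partedUtil get` output for that disk is read from stdin.
print_delete_commands() {
    disk="$1"
    list_partition_numbers | while read -r num; do
        echo "partedUtil delete $disk $num"
    done
}

# Intended use on the ESXi host (not executed by this snippet):
#   partedUtil get /dev/disks/naa.500xxxxxx | print_delete_commands /dev/disks/naa.500xxxxxx
```

Printing instead of executing is deliberate: the same warning applies here, so check the disk name on every line before pasting anything back into the shell.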
This one is simple and I’m not sure if it will be encountered by all or not, but when I went to add subsequent ESXi hosts to the vSphere cluster with VSAN enabled I would get “A general system error occurred: Unable to create LSOM file system for VSAN disk mpx.vmhba1:C0:T1:L0.”
After much researching I found a post that mentioned the need for the NTP service to be running with the correct time. After some guffawing that something so simple couldn’t be the answer, I configured the NTP service and corrected the time, then the error disappeared.
Thanks wylkdao! (https://communities.vmware.com/thread/478019)
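Since the fix was simply getting the clocks right, a quick sanity check before adding a host to the cluster is to compare its clock against a reference. This is a minimal sketch, not part of the original procedure: the function names, the host name esx02, and the 5 second tolerance are all hypothetical, and it assumes date +%s is available on both ends.

```shell
#!/bin/sh
# Print the absolute difference, in seconds, between two epoch timestamps.
clock_skew_seconds() {
    diff=$(( $1 - $2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))
    fi
    echo "$diff"
}

# Succeed when the skew between $1 and $2 is at most $3 seconds.
clock_in_sync() {
    skew=$(clock_skew_seconds "$1" "$2")
    [ "$skew" -le "$3" ]
}

# Intended use from an admin workstation (not executed by this snippet;
# host name and tolerance are placeholders):
#   clock_in_sync "$(ssh root@esx02 date +%s)" "$(date +%s)" 5 \
#       || echo "esx02 clock is off: fix NTP before adding it to the VSAN cluster"
```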
The largest issue that I ran into, and the one that other people seem to be running into, is a licensing issue involving VSAN, VDS, and the ESXi Essentials Plus, Standard, and Enterprise license editions. During most of my deployments, especially those that leverage the Essentials Plus licensing, I save the licensing of the ESXi hosts and vCenter until the very last, just in case there is a need to leverage a licensed feature that is available in the 60 day Enterprise Plus trial licensing, usually Storage vMotion. In this case that standard operating procedure of licensing at the end caused a major issue.
vCenter had been configured with a vSphere VSAN cluster, all three ESXi hosts had been configured using the vSphere VDS and added to the VSAN cluster, and all appropriate storage had been consumed. The customer was very happy; VSAN was working and working well! OK, time to license. I added the license to vCenter, added the VSAN license, no problems. Then, when I went to license the ESXi hosts, uber fail: “License downgrade. Cannot assign the license key, it does not support all of the features that are in use.”
WHAT? Wait… WHAT? There was only one thing to do at this point: I called Jason! “Dude, I thought you said that VDS came with VSAN.” “Relax Bruce, VDS is included with VSAN. Open a VMware support ticket, they’ll be able to get you squared away quickly.” Thanks, Jason, for bringing me back to reality. I was able to smile to the customer and suggest that there had to be a simple fix and VMware support would have it.
Opened the support ticket. When we got them on the speakerphone, VMware support said, “I’m sorry sir, VDS is NOT included with VSAN.” More smiling to the customer, since the networking solution and hardware purchased were specific to VDS. Another call to Jason: “Relax Bruce, VDS is included with VSAN.” After much research we confirmed that VDS indeed is included with VSAN, and that you must license the ESXi hosts and VSAN before adding the VDS to ESXi hosts that are not licensed for Enterprise Plus.
The fix was to place the ESXi host into maintenance mode, migrate the VDS vmkernel ports to standard vSwitches, license the ESXi host, and then migrate all of the vmkernel ports back to the VDS.
Wash, rinse, repeat, for a happy, healthy VSAN cluster.
The best tip I can give you throughout this whole post is to completely configure ESXi host base settings, i.e. DNS, hostname, NTP client, and add your licenses at the very beginning.
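To make "license first" easy to verify, a host can be checked for evaluation mode before it is attached to the VDS. The sketch below is an assumption-heavy illustration: the function name is made up, and it assumes that the output of vim-cmd vimsvc/license --show on an unlicensed host contains the all-zero evaluation serial. Confirm that against your own host's output before relying on it.

```shell
#!/bin/sh
# Succeed when the license text on stdin contains the all-zero serial
# that (by assumption) marks a host still running in evaluation mode.
uses_evaluation_license() {
    grep -q '00000-00000-00000-00000-00000'
}

# Intended use on the ESXi host (not executed by this snippet):
#   vim-cmd vimsvc/license --show | uses_evaluation_license \
#       && echo "Host is still in evaluation mode: license it before adding it to the VDS"
```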
I have heard several similar stories, and it would seem that VMware support still believes that VDS is not included with VSAN. Hopefully VMware support will be corrected soon.