The Unofficial Official VCP6-DCV Study Guide

vmworld2015

We are just a few short days away from one of my favorite weeks of the year, VMworld! Just like a few years back, Josh Coen (blog / twitter) and I have teamed up with our good friends at Veeam Software (website / twitter) to release the latest version of “The Unofficial Official VCP6-DCV Study Guide”. Given the short turnaround time Josh and I had to complete the study guide, we have our fingers crossed that hard copies will be available next week at the Veeam booth (watch Twitter for updates). For those who can’t wait, click on the cover below to download an electronic copy.

Hope to see you at VMworld and happy Studying!

-Jason

Cover

VCP-6 Objective 7.3–Troubleshoot vSphere Upgrades

For this objective I used the following resources:

Objective 7.3 – Troubleshoot vSphere Upgrades

Knowledge

Identify vCenter Server and vCenter Server Appliance Upgrade Issues

For this section I am going to take the easy way out. Refer to Section 12 of the vSphere Troubleshooting documentation. This section covers the following topics:

  • Collecting Logs for Troubleshooting a vCenter Server Installation or Upgrade
  • Collect Logs to Troubleshoot ESXi Hosts
  • Errors and Warnings Returned by the Installation and Upgrade Precheck Script
  • Restore vCenter Server Services if Upgrade Fails
  • VMware Component Manager Error During Startup After vCenter Server Appliance Upgrade
  • Microsoft SQL Database Set to Unsupported Compatibility Mode Causes vCenter Server Installation or Upgrade to Fail

Create a Log Bundle

Locate/Analyze VMware Log Bundles

There are multiple ways to get at this information, but I will assume the exam is going to be geared more towards using the vSphere Web Client for this task.

Using vSphere Web Client

Have a look over at VMware KB Article 2032892, Collecting Diagnostic Information for ESX/ESXi Hosts and vCenter Server Using the vSphere Web Client.

      • Log into the vSphere Web Client with administrative privileges
      • Under Inventory Lists, select vCenter Servers
      • Click the vCenter Server that contains the ESX/ESXi hosts you wish to export logs from
      • Select the Monitor tab in the right-hand navigation screen and choose System Logs
      • Click Export System Logs
      • Select the ESX/ESXi hosts you wish to export logs from
      • Optionally, select Include vCenter Server and vSphere Web Client Logs
      • Click Next
      • Select the type of Log Data to be exported
      • Optionally, select Gather Performance Data
      • When ready, click Generate Log Bundle
      • Once the log bundle is generated, click Download Log Bundle
      • Select a location and click Save

 

Log_Bundle

For additional diagnostic and log collection (either virtual appliance, ESXi hosts, or Windows vCenter) have a look at the following VMware KB articles:

Identify Alternative Methods to Upgrade ESXi Hosts in Event of Failure

For this section I am not really sure what VMware is after with the “Event of Failure” piece. I am going to tackle this from the perspective of outlining the supported methods of upgrading an ESXi host. My guess is that this will give you the baseline knowledge you need for the exam.

  • vSphere Update Manager – For me this is my favorite of the options. You should already have VUM installed in your environment so the only work that really needs to be done is importing the ESXi 6.0 ISO into the repository and creating an Upgrade baseline. Super easy.
  • Upgrade via ESXi Installer (ISO on USB/CD/DVD) – In a small enough environment you might just be able to create a boot image from the ESXi 6.0 ISO, place it on a CD/DVD/USB device, and boot the ESXi host from it. This is labeled an “interactive” upgrade, as you will have to provide some inputs to complete it.
  • Perform Scripted Upgrade – I myself have neither used nor seen a lot of scripted upgrades in the field. It is supported and could be a faster way than VUM to upgrade multiple hosts.
  • vSphere Auto Deploy – Using Auto Deploy you can reprovision the host and reboot it with a new image profile. This profile would include the ESXi upgrade to 6.x. You will need to leverage vSphere Image Builder to build the package.
  • esxcli – You can use the esxcli command-line utility to upgrade hosts to ESXi 6.x

Configure vCenter Logging Options

  • Log into the vSphere Web Client with administrative privileges
  • Under Resources, select vCenter Servers
  • Click the vCenter Server to update the level of logging
  • Select the Settings tab in the right-hand navigation screen and choose General
  • From the General tab click Edit
  • The Edit vCenter Server Settings dialog will be displayed. Select Logging Settings
  • Select the level of logging from the Logging Options dropdown.
  • Click OK when finished

Logging_Options

The available options are:

Option | Description
None (Disable Logging) | Turns off logging
Error (Errors Only) | Displays only error log entries
Warning (Errors and Warnings) | Displays warning and error log entries
Info (Normal Logging – Default) | Displays information, error, and warning log entries
Verbose (Verbose) | Displays information, error, warning, and verbose log entries
Trivia (Extended Verbose) | Displays information, error, warning, verbose, and trivia log entries
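
The levels in the table above are cumulative: each one shows its own entries plus everything from the less verbose levels. A minimal sketch of that relationship follows; the list and function names are made up for illustration and are not part of any VMware API.

```python
# Levels ordered from least to most verbose, as in the table above.
LOG_LEVELS = ["none", "error", "warning", "info", "verbose", "trivia"]


def entries_shown(level: str) -> list:
    """Return the entry types displayed at the given logging level."""
    index = LOG_LEVELS.index(level.lower())
    # "none" shows nothing; every other level also includes the less verbose types.
    return LOG_LEVELS[1:index + 1]


print(entries_shown("info"))  # ['error', 'warning', 'info']
```

This is why bumping the level to Verbose or Trivia while troubleshooting grows the logs so quickly: nothing from the lower levels is dropped.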

 

VCP-6 Objective 7.4–Troubleshoot and Monitor vSphere Performance

For this objective I used the following resources:

Objective 7.4 – Troubleshoot and Monitor vSphere Performance

Knowledge

Describe How Tasks and Events are Viewed in vCenter Server

View All Tasks

  • Log into the vSphere Web Client
  • Select the vCenter Server inventory object you want to view
    • To display the tasks for an object, select the object
    • To display the tasks in the vCenter Server, select the root folder
  • Select the Monitor tab
  • Select Tasks

The screenshot below shows the Tasks tab of a virtual machine in my lab:

Tasks

View Events

  • Log into the vSphere Web Client
  • Select the vCenter Server inventory object you want to view
    • To display the events for an object, select the object
    • To display the events in the vCenter Server, select the root folder
  • Select the Monitor tab
  • Select Events

The screenshot below shows the Events tab for my lab cluster:

Events

Identify Critical Performance Metrics

As you will see listed in the sections below, the critical points to monitor are CPU, memory, networking, and storage.

Explain Common Memory Metrics

Metric | Description
SWR/s and SWW/s | Measured in megabytes per second, these counters represent the rate at which the ESXi host is swapping memory in from disk (SWR/s) and swapping memory out to disk (SWW/s)
SWCUR | The amount of swap space currently used by the virtual machine
SWTGT | The amount of swap space that the host expects the virtual machine to use
MCTL | Indicates whether the balloon driver is installed in the virtual machine
MCTLSZ | Amount of physical memory that the balloon driver has reclaimed
MCTLTGT | Amount of physical memory that the host attempts to reclaim via the balloon driver

 

Explain Common CPU Metrics

Metric | Description
%USED | Percentage of physical CPU time used by a group of worlds
%RDY | Percentage of time a group was ready to run but was not provided CPU resources
%CSTP | Percentage of time the vCPUs of a virtual machine spent in the co-stopped state, waiting to be co-started
%SYS | Percentage of time spent in the ESXi VMkernel on behalf of the world/resource pool

 

Explain Common Network Metrics

Metric | Description
MbTX/s | Amount of data transmitted in Mbps
MbRX/s | Amount of data received in Mbps
%DRPTX | Percentage of outbound packets dropped
%DRPRX | Percentage of inbound packets dropped

 

Explain Common Storage Metrics

Metric | Description
DAVG | Average amount of time it takes a device to service a single I/O request (read or write)
KAVG | Average amount of time it takes the VMkernel to service a disk operation
GAVG | Total latency seen from the virtual machine when performing an I/O request
ABRT/s | Number of commands aborted per second

 

Identify Host Power Management Policy

ESXi can take advantage of several power management features that the host hardware provides to adjust the trade-off between performance and power use. ESXi supports five different power management policies ranging from low performance/low power to high performance/high power. The table below provides a breakdown of the five policies:

Power Management Policy | Description
Not Supported | The host does not support any power management features or power management is not enabled in the system BIOS
High Performance | The VMkernel detects certain power management features, but will not use them unless the system BIOS requests them for power capping or thermal events
Balanced (Default) | The VMkernel uses the available power management features conservatively to reduce host energy consumption with minimal compromise to performance
Low Power | The VMkernel aggressively uses available power management features to reduce host energy consumption at the risk of lower performance
Custom | The VMkernel bases its power management policy on the values of several advanced configuration parameters. You can set these parameters in the vSphere Web Client Advanced Settings dialog box

To select a power management policy follow the below procedure:

    • Log into the vSphere Web Client with administrative privileges
    • From the Home screen select Hosts and Clusters
    • Expand your Datacenter and Cluster. Select the desired Host
    • In the right-hand navigation, select the Manage tab and select Settings
    • Scroll down and select Power Management, and click Edit
    • Select a power management policy for the host and click OK

Identify CPU/Memory Contention Issues

Monitor Performance Through ESXTOP

These two topics could easily fill pages of information. For quick and easy knowledge refer to the sections above outlining the more significant performance metrics to monitor. Read Section 7 of the vSphere Monitoring and Performance documentation as well as Duncan Epping’s esxtop blog and the VMware Communities document “Interpreting esxtop Statistics”. Also have a look at this YouTube video, VMworld 2010 – TA6720 Troubleshooting using ESXTOP for Advanced Users. While this video is a few years old, the concepts are still sound in using ESXTOP.
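
To make the esxtop counters from the earlier tables a little more concrete, here is a rough sketch that flags possible CPU or memory contention from one sample of values. The threshold numbers are rule-of-thumb assumptions for illustration only (they do not come from the VMware documentation), and the dictionary and function names are hypothetical. The one general point to hold onto is that any ballooning or swapping at all suggests the host is under memory pressure.

```python
# Rule-of-thumb thresholds. These are assumptions for illustration; tune them
# for your own environment before relying on them.
THRESHOLDS = {
    "%RDY": 10.0,   # percent of time ready but not scheduled
    "%CSTP": 3.0,   # percent of time co-stopped
    "MCTLSZ": 0.0,  # MB reclaimed by the balloon driver
    "SWCUR": 0.0,   # MB of swap currently in use
    "SWR/s": 0.0,   # MB/s swapped in from disk
    "SWW/s": 0.0,   # MB/s swapped out to disk
}


def contention_flags(sample: dict) -> list:
    """Return the counters in `sample` whose value exceeds its threshold."""
    return sorted(
        name for name, limit in THRESHOLDS.items()
        if sample.get(name, 0.0) > limit
    )


# Hypothetical readings for a single virtual machine.
vm_sample = {"%RDY": 14.2, "%CSTP": 0.5, "MCTLSZ": 512.0, "SWR/s": 0.0}
print(contention_flags(vm_sample))  # ['%RDY', 'MCTLSZ']
```

A high %RDY alongside a non-zero MCTLSZ, as in this made-up sample, would point at a host that is overcommitted on both CPU and memory.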

Troubleshoot Enhanced vMotion Compatibility (EVC) Issues

A quick primer on what Enhanced vMotion Compatibility is. EVC mode ensures that all ESX/ESXi hosts in a cluster present the same CPU level/feature set to virtual machines, even if the actual CPUs on the hosts differ (they need to be from the same CPU manufacturer; you cannot mix AMD with Intel or vice versa). With EVC mode enabled and configured it is then possible to leverage vMotion to migrate virtual machines across hosts.

For troubleshooting EVC mode, it mostly comes down to whether the CPU in the ESXi host is supported. The listing below is pulled from VMware KB Article 1005764 – EVC and CPU Compatibility FAQ.

ESXi 6.0 Supports these EVC Modes

  • AMD Opteron Generation 1 (Rev. E)
  • AMD Opteron Generation 2 (Rev. F)
  • AMD Opteron Generation 3 (Greyhound)
  • AMD Opteron Generation 3 (no 3Dnow!)(Greyhound)
  • AMD Opteron Generation 4 (Bulldozer)
  • AMD Opteron “Piledriver” Generation
  • Intel “Merom” Generation (Intel Xeon Core 2)
  • Intel “Penryn” Generation (Intel Xeon 45nm Core2)
  • Intel “Nehalem” Generation (Intel Xeon Core i7)
  • Intel “Westmere” Generation (Intel Xeon 32nm Core i7)
  • Intel “Sandy Bridge” Generation
  • Intel “Ivy Bridge” Generation
  • Intel “Haswell” Generation
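
Since every host in the cluster has to support the chosen EVC mode, the newest mode you can pick is the oldest of the per-host maximums. Below is a small sketch of that logic for the Intel list above. It assumes you already know the highest mode each host supports; the function name and the sample input are hypothetical.

```python
# Intel EVC modes from the list above, ordered oldest to newest.
INTEL_EVC_MODES = [
    "Merom", "Penryn", "Nehalem", "Westmere",
    "Sandy Bridge", "Ivy Bridge", "Haswell",
]


def highest_common_evc_mode(host_max_modes: list) -> str:
    """Return the newest EVC mode that every host in the list can support."""
    if not host_max_modes:
        raise ValueError("at least one host is required")
    # The cluster is limited by its oldest CPU generation.
    return min(host_max_modes, key=INTEL_EVC_MODES.index)


# Hypothetical cluster of three Intel hosts.
print(highest_common_evc_mode(["Haswell", "Westmere", "Ivy Bridge"]))  # Westmere
```

An unknown mode name (for example an AMD generation passed in by mistake) raises a ValueError, which mirrors the rule that AMD and Intel hosts cannot share an EVC cluster.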

Compare and Contrast Overview and Advanced Charts

  • Overview Charts – Display multiple data sets in one panel to easily evaluate different resource statistics, display thumbnail charts for child objects, and display charts for a parent and a child object
  • Advanced Charts – Display more information than overview charts, are configurable, and can be printed or exported to a spreadsheet

VCP 6–Objective 7.2 Troubleshoot vSphere Storage & Network Issues

For this objective I used the following resources:

Objective 7.2 Troubleshoot vSphere Storage & Network

Knowledge

Verify Network Configuration

Refer to each objective under Section Two. Focus on the core concepts and configuration of both vNetwork Standard Switches and vNetwork Distributed Switches:

  • Port/dvPort Groups
  • Load Balancing and Failover Policies
  • VLAN Settings
  • Security Policies
  • Traffic Shaping Policies

For additional information read the VMware Information Guide “VMware Virtual Networking Concepts”. This document is based on VI3 but still does a good job with the core functions of a vStandard Switch.

Verify a Given Virtual Machine is Configured with the Correct Network Resources

Instead of duplicating work, refer to VMware KB 1003893, “Troubleshooting Virtual Machine Network Connection Issues”. There is more than enough information listed there.

Troubleshoot Physical Network Adapter Configuration Issues

This is pretty straightforward, as there is not a lot of configuration done at the physical network layer. Be sure that the physical NICs assigned to a virtual switch (vSwitch or dvSwitch) are configured the same (speed, VLANs, etc.) on the physical switch. If using IP Hash as your load balancing method, make sure link aggregation has been enabled on the physical switch side. Refer to VMware KB 1001938 and VMware KB 1004048 for further details as well as examples. If using beacon probing for network failover detection, it is standard practice to use at least three uplink adapters. See VMware KB 1005577 for further details.

Troubleshoot Virtual Switch and Port Group Configuration Issues

One key aspect to remember when setting up Port Groups or dvPort Groups is that spelling counts (as does upper/lower case)! If a Port Group is spelled Test on one host and test on a second host, vMotion will fail. The same holds true for Security Policies: if a vSwitch on one host is set to Accept Promiscuous Mode and the vSwitch on the other host is set to Reject, vMotion will again fail. Also, refer to the objectives under Section Two to be sure your switches are configured correctly.
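
The Test versus test problem is easy to miss by eye, so here is a minimal sketch of how you might spot it once you have a list of Port Group names per host. How you gather that inventory is up to you; the host names, the input dictionary, and the function name are all hypothetical.

```python
from collections import defaultdict


def find_case_mismatches(host_port_groups: dict) -> dict:
    """Find Port Group names that differ only by case across hosts."""
    variants = defaultdict(set)
    for names in host_port_groups.values():
        for name in names:
            variants[name.lower()].add(name)
    # Keep only the names that show up with more than one spelling.
    return {
        key: sorted(spellings)
        for key, spellings in variants.items()
        if len(spellings) > 1
    }


# Hypothetical inventory: host name -> Port Group names on that host.
hosts = {
    "esxi01": ["Test", "Management"],
    "esxi02": ["test", "Management"],
}
print(find_case_mismatches(hosts))  # {'test': ['Test', 'test']}
```

An empty result means the names line up, at least as far as case goes; a true typo such as Tset would need a separate check.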

Troubleshoot Common Network Issues

Using the above notes as well as the linked VMware KB articles, one should be able to isolate the issue to one of four areas:

  • Virtual Machine
  • ESX/ESXi Host Networking (uplinks)
  • vSwitch or dvSwitch Configuration
  • Physical Switch Configuration

Troubleshoot VMFS Metadata Consistency

Use the vSphere On-disk Metadata Analyzer (VOMA) to identify and fix incidents of metadata corruption that affect file systems or underlying logical volumes. VOMA is executed from the CLI of an ESXi host and can be used to check and fix metadata inconsistency issues for a VMFS datastore or a virtual flash resource. The following example was pulled from the vSphere Troubleshooting documentation:

  • Obtain the name and partition number of the device that backs the VMFS datastore that you need to check
    • #esxcli storage vmfs extent list
  • Run VOMA to check for VMFS errors. Provide the absolute path to the device partition that backs the VMFS datastore, and provide a partition number with the device name:
    • # voma -m vmfs -f check -d /vmfs/devices/disks/naa.600508e000000000b367477b3be3d703:3
  • The output lists possible errors
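
The part of the VOMA command that tends to trip people up is the device argument: it is the absolute device path with a colon and the partition number appended. As a rough sketch, the helper below assembles the same command shown in the steps above from a device name and partition number. The helper itself is hypothetical and only builds the argument list; it does not run anything.

```python
def voma_check_command(device: str, partition: int) -> list:
    """Build the argument list for a VOMA check of a VMFS partition."""
    if partition < 1:
        raise ValueError("partition numbers start at 1")
    # VOMA expects the absolute device path with ":<partition>" appended.
    target = f"/vmfs/devices/disks/{device}:{partition}"
    return ["voma", "-m", "vmfs", "-f", "check", "-d", target]


args = voma_check_command("naa.600508e000000000b367477b3be3d703", 3)
print(" ".join(args))
```

Both values come straight from the extent list output in the first step: the device name and the partition number of the extent that backs the datastore.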

For the full run down of VOMA command options review the table on page 66 of the vSphere Troubleshooting documentation.

Verify Storage Configuration

Refer to the vSphere Storage documentation and the SAN System Design and Deployment Guide (not specific to vSphere 6, but worth a read) by VMware. These cover a lot of the areas needed for working with an FC/iSCSI SAN environment with vSphere. A good understanding of the hardware you are using on the back end (storage arrays, FC switches, networking, etc.) and its “vSphere Best Practices” documents will also assist in the proper configuration.

Identify Storage I/O Constraints

With the mention of “storage constraints” I am assuming they are hinting at I/O throughput or I/O latency issues. I find the quickest and easiest way of measuring/checking this is via esxtop/resxtop. VMware KB 1008205 and Duncan Epping’s esxtop blog post cover this in more detail.

Metrics to be aware of:

Disk Metric | Threshold | Description
DAVG | 25 | This is the average response time in milliseconds per command being sent to the device
GAVG | 25 | This is the response time as it is perceived by the guest operating system. This number is calculated with the formula: DAVG + KAVG = GAVG
KAVG | 2 | This is the amount of time the command spends in the VMkernel

 

The following diagram (provided by VMware) provides a visual representation of the chart above:

Horizon_6_Storage_ESXi
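
To tie the formula and the thresholds together, here is a minimal sketch that derives GAVG from DAVG and KAVG and flags anything over the threshold values from the table above. The function name and the sample readings are made up for illustration.

```python
# Thresholds in milliseconds, taken from the table above.
THRESHOLDS_MS = {"DAVG": 25.0, "KAVG": 2.0, "GAVG": 25.0}


def latency_report(davg_ms: float, kavg_ms: float) -> dict:
    """Derive GAVG as DAVG + KAVG and flag values above their threshold."""
    values = {"DAVG": davg_ms, "KAVG": kavg_ms, "GAVG": davg_ms + kavg_ms}
    return {
        name: {"ms": value, "high": value > THRESHOLDS_MS[name]}
        for name, value in values.items()
    }


# Hypothetical esxtop readings for one device.
report = latency_report(davg_ms=24.0, kavg_ms=3.0)
for name, result in report.items():
    print(name, result["ms"], "HIGH" if result["high"] else "ok")
```

In this made-up sample the device itself is just under its threshold, but the time spent in the VMkernel pushes the latency seen by the guest over the line. A high KAVG generally suggests queuing on the host, while a high DAVG points toward the array or the fabric.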

Monitor/Troubleshoot Storage Distributed Resource Scheduler (SDRS)

Refer to Section 6, Troubleshooting Resource Management in vSphere Troubleshooting 6.0 documentation (pages 47 thru 55).

Troubleshoot Common Storage Issues

Refer to Section 7, Troubleshooting Storage in vSphere Troubleshooting 6.0 documentation (pages 55 thru 72). The section covers several storage related issues that you may run into.