VCP-6 – Objective 7.1 – Troubleshoot vCenter Server, ESXi Hosts, and Virtual Machines

For this objective I used the following resources:

Objective 7.1 – Troubleshoot vCenter Server, ESXi Hosts, and Virtual Machines

Knowledge

Identify General ESXi Host Troubleshooting Guidelines

The vSphere Troubleshooting guide is the one-stop shop for this section.

Identify General vCenter Troubleshooting Guidelines

The vSphere Troubleshooting guide is the one-stop shop for this section.

Troubleshoot Common Installation Issues

Refer to Objective 1.3 and make sure your hosts meet the hardware requirements and are listed on the VMware HCL. If you are using Auto Deploy, refer to pages 20 through 26 of the vSphere Troubleshooting guide and to VMware KB 2000988 (Troubleshooting vSphere Auto Deploy).

Monitor ESXi System Health

When ESXi was released back in the VI 3.5 days, it introduced a new way to manage your hosts: the Common Information Model (CIM). CIM provides a standard framework for managing computing resources, and the vSphere Client presents this information. For further information, read the VMware white paper “The Architecture of VMware ESXi” as well as this VMware Support Insider blog post.

Locate and Analyze vCenter and ESXi Logs

ESXi Log Files and Locations

Log – Description
/var/log/auth.log – ESXi Shell authentication success and failure
/var/log/dhclient.log – DHCP client service, including discovery, address lease requests and renewals
/var/log/esxupdate.log – ESXi patch and update installation logs
/var/log/lacp.log – Link Aggregation Control Protocol logs
/var/log/hostd.log – Host management service logs, including virtual machine and host tasks and events, communication with the vSphere Client and the vCenter Server vpxa agent, and SDK connections
/var/log/hostd-probe.log – Host management service responsiveness checker
/var/log/rhttproxy.log – HTTP connections proxied on behalf of other ESXi host webservices
/var/log/shell.log – ESXi Shell usage logs, including enable/disable and every command entered
/var/log/sysboot.log – Early VMkernel startup and module loading
/var/log/boot.gz – A compressed file that contains boot log information
/var/log/syslog.log – Management service initialization, watchdogs, scheduled tasks and DCUI use
/var/log/usb.log – USB device arbitration events, such as discovery and pass-through to virtual machines
/var/log/vobd.log – VMkernel Observation events
/var/log/vmkernel.log – Core VMkernel logs, including device discovery, storage and networking device and driver events, and virtual machine startup
/var/log/vmkwarning.log – A summary of Warning and Alert log messages excerpted from the VMkernel logs
/var/log/vmksummary.log – A summary of ESXi host startup and shutdown, and an hourly heartbeat with uptime, number of virtual machines running, and service resource consumption
/var/log/Xorg.log – Video acceleration
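
Once you have copied one of these logs off a host, a quick first pass is to pull out only the lines that look like trouble. The following is a minimal Python sketch of that idea, not a VMware tool. The keyword list, the function names, and the sample lines are all assumptions made for illustration, and real log lines will look different.

```python
from typing import Iterable, List, Sequence

# Illustrative assumption: these keywords are a starting point, not an official list.
DEFAULT_KEYWORDS = ("warning", "error", "failed")


def find_suspect_lines(lines: Iterable[str],
                       keywords: Sequence[str] = DEFAULT_KEYWORDS) -> List[str]:
    """Return every line that contains at least one keyword, ignoring case."""
    lowered = tuple(k.lower() for k in keywords)
    return [line.rstrip("\n") for line in lines
            if any(k in line.lower() for k in lowered)]


def scan_log(path: str, keywords: Sequence[str] = DEFAULT_KEYWORDS) -> List[str]:
    """Scan a log file that was copied off the host, e.g. a copy of vmkernel.log."""
    with open(path, "r", errors="replace") as handle:
        return find_suspect_lines(handle, keywords)


# Made-up sample lines for demonstration only, not real ESXi output.
SAMPLE = [
    "2015-01-01T00:00:00Z example: device discovered",
    "2015-01-01T00:00:01Z example: WARNING: path is down",
    "2015-01-01T00:00:02Z example: operation Failed",
]

if __name__ == "__main__":
    for hit in find_suspect_lines(SAMPLE):
        print(hit)
```

For a real file you would call scan_log() with the path of your local copy. Keep in mind that /var/log/vmkwarning.log already gives you a summary of the Warning and Alert messages from the VMkernel logs, so a filter like this is most useful on the other files.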

 

vCenter Log Files and Locations

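When vCenter Server runs on Windows, the log files are located in C:\ProgramData\VMware\VMware VirtualCenter\Logs (this is the 5.x location; vCenter Server 6.0 on Windows uses C:\ProgramData\VMware\vCenterServer\logs).

When vCenter Server runs as a virtual appliance, the log files are located in /var/log/vmware/vpx (this is the 5.x location; the 6.0 appliance keeps its logs under /var/log/vmware, with vpxd.log in /var/log/vmware/vpxd).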

Log – Description
vpxd.log – The main vCenter Server log, consisting of all vSphere Client and WebServices connections, internal tasks and events, and communication with the vCenter Server Agent (vpxa) on managed ESXi/ESX hosts
vpxd-profiler.log – Profiled metrics for operations performed in vCenter Server. Used by the VPX Operational Dashboard (VOD) accessible at https://VCHostnameOrIPAddress/vod/index.html
vpxd-alert.log – Non-fatal information logged about the vpxd process
cim-diag.log and vws.log – Common Information Model monitoring information, including communication between vCenter Server and the managed hosts’ CIM interface
drmdump – Actions proposed and taken by VMware Distributed Resource Scheduler (DRS), grouped by the DRS-enabled cluster managed by vCenter Server. These logs are compressed
ls.log – Health reports for the Licensing Services extension, connectivity logs to vCenter Server
vimtool.log – Dump of strings used during the installation of vCenter Server, with hashed information for DNS, username and output for JDBC creation
stats.log – Information about the historical performance data collection from the ESXi/ESX hosts
sms.log – Health reports for the Storage Monitoring Service extension, connectivity logs to vCenter Server, the vCenter Server database and the xDB for vCenter Inventory Service
eam.log – Health reports for the ESX Agent Monitor extension, connectivity logs to vCenter Server
catalina.date.log – Connectivity information and status of the VMware Webmanagement Services
jointool.log – Health status of the VMwareVCMSDS service and individual ADAM database objects, internal tasks and events, and replication logs between linked-mode vCenter Servers

 

Export Diagnostic Information

Covered in Objective 7.3 – Troubleshoot vSphere Upgrades. For reference, read VMware KB Article 653 – Collecting Diagnostic Information for VMware ESX/ESXi.

Identify Common Command Line Interface (CLI) Commands

Here is a list of commands that I use on a daily basis:

  • esxtop – Used for real-time performance monitoring and troubleshooting
  • vmkping – Works like the ping command but allows for sending traffic out a specific VMkernel interface
  • esxcli network namespace – Used for monitoring or configuring ESXi networking
  • esxcli storage namespace – Used for monitoring or configuring ESXi storage
  • vmkfstools – Allows for the management of VMFS volumes and virtual disks from the command line
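
To make the vmkping and esxcli entries a little more concrete, here is a small Python sketch that only assembles the command lines you would type in the ESXi Shell. It does not run anything on a host. The helper names, the vmk1 interface, and the 192.0.2.10 address are assumptions chosen for illustration; the -I (outgoing VMkernel interface) and -c (packet count) options of vmkping and the two esxcli subcommands shown are ones I am confident exist.

```python
import shlex
from typing import List, Optional


def vmkping_args(target: str, interface: Optional[str] = None,
                 count: int = 3) -> List[str]:
    """Build a vmkping command; -I picks the outgoing VMkernel interface."""
    args = ["vmkping"]
    if interface:
        args += ["-I", interface]
    args += ["-c", str(count), target]
    return args


def esxcli_args(namespace: str, *rest: str) -> List[str]:
    """Build an esxcli command in the network or storage namespace."""
    if namespace not in ("network", "storage"):
        raise ValueError("this sketch only covers the network and storage namespaces")
    return ["esxcli", namespace, *rest]


if __name__ == "__main__":
    # vmk1 and 192.0.2.10 are placeholders; substitute your own values.
    print(shlex.join(vmkping_args("192.0.2.10", interface="vmk1")))
    print(shlex.join(esxcli_args("network", "nic", "list")))
    print(shlex.join(esxcli_args("storage", "core", "device", "list")))
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere; paste the output into an SSH session on the host once you have checked it.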

Troubleshoot Common Virtual Machine Issues

Identify/Troubleshoot Various Virtual Machine States (e.g., Orphaned, Unknown, etc.)

For these two sections refer to Section 2 of the vSphere Troubleshooting documentation. This section covers the following topics:

  • Troubleshooting Fault Tolerant Virtual Machines
  • Troubleshooting USB Passthrough Devices
  • Recover Orphaned Virtual Machines
  • Virtual Machine Does Not Power On After Cloning or Deploying From Template

Troubleshoot Virtual Machine Resource Contention Issues

Identify Virtual Machine Constraints

For these two sections review the following VMware KB articles:

Identify Fault Tolerant Network Latency Issues

Fault Tolerance requirements are covered in Objective 7.5 – Troubleshoot HA and DRS Configurations and Fault Tolerance. For the latency portion remember the following:

  • Use a dedicated 10-Gbit logging network for Fault Tolerance traffic
  • Use the vmkping command to verify low sub-millisecond network latency
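
If you want to check the vmkping result with a script instead of by eye, the sketch below parses the summary line and tests it against a 1 ms threshold. It rests on a few assumptions: that the output ends with a "round-trip min/avg/max = a/b/c ms" line, that judging the maximum (not the average) is the right conservative choice, and that the sample text, which is made up, resembles what your host prints. The function names are my own.

```python
import re
from typing import Optional, Tuple

# Assumed summary format: "round-trip min/avg/max = 0.210/0.250/0.310 ms"
SUMMARY = re.compile(r"min/avg/max\s*=\s*([\d.]+)/([\d.]+)/([\d.]+)\s*ms")


def parse_round_trip(output: str) -> Optional[Tuple[float, ...]]:
    """Return (min, avg, max) in milliseconds, or None if no summary is found."""
    match = SUMMARY.search(output)
    if match is None:
        return None
    return tuple(float(value) for value in match.groups())


def is_sub_millisecond(output: str) -> bool:
    """True when even the slowest reply came back in under 1 ms."""
    times = parse_round_trip(output)
    return times is not None and times[2] < 1.0


# Made-up sample text for demonstration only.
SAMPLE = (
    "3 packets transmitted, 3 packets received, 0% packet loss\n"
    "round-trip min/avg/max = 0.210/0.250/0.310 ms"
)

if __name__ == "__main__":
    print(parse_round_trip(SAMPLE), is_sub_millisecond(SAMPLE))
```

Run vmkping against the Fault Tolerance logging address of the other host, through the VMkernel interface that carries FT logging, and feed the captured text to is_sub_millisecond().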

Troubleshoot VMware Tools Installation Issues

Have a look at VMware KB Article 1003908 – Troubleshooting a Failed VMware Tools Installation in a Guest Operating System.

Identify the Root Cause of a Storage Issue Based on Troubleshooting Information

The vSphere Troubleshooting document covers several storage issues that you may run into. See pages 45 through 51.

Identify Common Virtual Machine Boot Disk Errors

Have a look at VMware KB Article 1003999 – Identifying Critical Guest OS Failures Within Virtual Machines.
