VMware vSphere/ESXi: structured troubleshooting and operational questions

1. How to fix slow VM performance? Where to start troubleshooting?

  • CPU: Check CPU Ready% (>5% indicates contention). Use esxtop → ‘c’ → look at %RDY.
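  • Memory: Check ballooning/swapping (esxtop → ‘m’). High balloon or swap activity = memory overcommit.
  • Storage: Use esxtop → ‘d’ → check DAVG/cmd (device latency) and KAVG/cmd (kernel latency). Sustained DAVG above ~20 ms points to the array or fabric; KAVG should stay close to zero, and sustained values above ~2 ms suggest queuing on the host. Also check datastore peak IOPS/throughput in vCenter Performance charts.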
  • Network: Check packet drops, high %DRPTX/%DRPRX in esxtop → ‘n’. Verify NIC teaming and switch config.

Start: vCenter → VM → Performance tab → isolate resource bottleneck.
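
The vCenter Performance tab reports CPU Ready as a summation in milliseconds rather than the percentage that esxtop shows, so the 5% rule needs a conversion. A minimal sketch, assuming the real-time chart's 20-second sample interval and an even split of ready time across vCPUs (the function names are illustrative, not part of any VMware tool):

```python
def cpu_ready_percent(ready_ms: float, interval_s: float = 20.0, vcpus: int = 1) -> float:
    """Convert a CPU Ready summation (ms) into an average per-vCPU percentage.

    interval_s is the chart sample interval; 20 s matches real-time charts.
    Dividing by vcpus assumes ready time is spread evenly across vCPUs.
    """
    if interval_s <= 0 or vcpus <= 0:
        raise ValueError("interval_s and vcpus must be positive")
    return ready_ms * 100.0 / (interval_s * 1000.0 * vcpus)


def has_cpu_contention(ready_pct: float, threshold_pct: float = 5.0) -> bool:
    """Apply the >5% rule of thumb from the checklist above."""
    return ready_pct > threshold_pct


if __name__ == "__main__":
    pct = cpu_ready_percent(ready_ms=4000, vcpus=2)
    print(f"CPU Ready: {pct:.1f}% per vCPU, contention: {has_cpu_contention(pct)}")
```

Historical charts use longer sample intervals, so pass the matching interval_s when reading anything other than the real-time view.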


2. If ESXi host not responding, what’s next?

  • Ping the host management IP.
  • If unresponsive, iLO/iDRAC (out-of-band) access → check console, reboot if needed.
  • DO NOT restart services blindly—first determine if it’s a PSOD or just management agent hang.

3. Management service restart command (ESXi Shell)?

# Restart hostd (main management agent)
/etc/init.d/hostd restart

# Restart vpxa (vCenter agent)
/etc/init.d/vpxa restart

# Or restart all (use cautiously)
services.sh restart

Note: In ESXi 7+, services.sh is preferred.


4. ESXi stuck in Maintenance Mode?

  • Check for powered-on VMs (including hidden ones like vCLS).
  • Check vMotion tasks—cancel any stuck migrations.
  • Check storage paths—dead paths can block evacuation.
  • Force exit (last resort).

5. Add new Datastore to Cluster?

  1. vCenter → Hosts and Clusters → Select cluster/host.
  2. Right-click → Storage → New Datastore.
  3. Choose VMFS (for block) or NFS (for NAS).
  4. Select storage device/NAS path → format → finish.
  5. Datastore auto-appears to all hosts in cluster if shared storage.

6. Migrate VM disk between Datastores?

  • Storage vMotion (requires vCenter):
    • Right-click VM → Migrate → Change storage only.
  • Cold Migration (VM powered off): Move files via Datastore Browser or vmkfstools.

7. Check storage issues from the VMware side?

  • vCenter: Monitor → Performance → Datastore metrics (latency, IOPS).
  • ESXi CLI.
  • Check PSOD logs if the host crashes during I/O.

8. Fix ESXi PSOD (Purple Screen of Death)?

  • Reboot host (via iLO if needed).
  • Collect logs (/var/log/vmkernel.log, core dumps).
  • Check hardware: RAM, storage controller, firmware compatibility.
  • Update ESXi, drivers, and firmware per HCL.

9. Recurring VM slow performance?

  • Enable vSphere Performance Diagnostics.
  • Check for resource pool limits, CPU/memory reservations.
  • Anti-affinity rules to avoid noisy neighbors.
  • Monitor storage array health (backend latency).
  • Consider the VMXNET3 adapter and disable unnecessary services inside the guest OS.

10. VM stuck during vMotion?

  • Cancel task in vCenter.
  • Check network connectivity between vMotion vmknics.
  • Ensure MTU consistency (jumbo frames must match).
  • Verify firewall allows vMotion (TCP 8000).
  • Check storage locks—retry after 5-10 mins.

11. After PSOD, VM not restarting on another host?

  • Ensure vSphere HA is enabled on cluster.
  • Check VM restart priority (should not be “Disabled”).
  • Verify datastore accessibility from other hosts.
  • Manually register VM on another host if needed.

13. Prechecks for vSphere Upgrade?

  • Compatibility: Check VMware HCL.
  • Backup: vCenter, host configs, VMs.
  • Check interop: vCenter ↔ ESXi version matrix.
  • Remove deprecated features (e.g., VSS if moving to VDS).
  • Ensure >15% free space on boot device.
  • Update firmware on servers/storage.

14. Upgrade ESXi Host?

  • Option 1: vCenter → Host → Updates → Remediate.
  • Option 2: ESXi CLI.
  • Option 3: Boot from ISO (disruptive).

15. Upgrade vCenter Server (VCSA)?

  • Use VAMI (https://vcenter:5480) → Update → Check Updates → Stage → Install.
  • Or use vCenter GUI → Menu → Lifecycle Manager → Updates.

16. Install vCenter Appliance?

  1. Mount ISO → Run installer (GUI or CLI).
  2. Deploy OVA to ESXi host.
  3. Configure network, SSO domain, storage.
  4. Complete setup via browser.

17. Revert ESXi after bad upgrade?

  • ESXi uses bootbank: Reboot → press Shift+R during boot → choose previous image.
  • Or revert from the ESXi command line.

18. Migrate Standard Switch (VSS) to Distributed Switch (VDS)?

  1. Create VDS in vCenter.
  2. Add ESXi hosts to VDS.
  3. Migrate vmknic and VM port groups one by one.
  4. Remove VSS after all uplinks are migrated.

Use Networking → Virtual Switches → Migrate VMs to another network.


19. VM backup failed? How to fix?

  • Check VM snapshot limit (32 max).
  • Ensure VM tools are running.
  • Disable memory snapshots if not needed.
  • Check storage space on the datastore & backup target.
  • Exclude RDMs, independent disks from the backup job.

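20. Why does a VM appear as orphaned?

  • The VM is still listed in the vCenter inventory, but the host it was registered on no longer recognizes it; its files often still exist on the datastore.
  • Caused by: host disconnect, manual file deletion, HA misfire.
  • Fix: Remove the orphaned entry from inventory, then right-click datastore → Register VM.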

21. VM Port Group?

  • Logical construct for vSwitch-based VM network connectivity.
  • Defines VLAN, security, shaping policies.
  • VM’s vNIC connects to port group.

22. VMkernel Port? Purpose?

  • Used by ESXi host (not VMs) for:
    • Management
    • vMotion
    • iSCSI/NFS
    • vSAN
    • Fault Tolerance
  • Each service requires dedicated or shared vmknic with IP.

23. Ways to upgrade ESXi Host?

  1. vSphere Lifecycle Manager (vLCM) – image-based (recommended).
  2. ESXCLI – online depot or offline bundle.
  3. ISO boot – full reinstall (disruptive).
  4. Auto Deploy – stateless hosts.

24. Upgrade VCHA (vCenter HA)?

  • Break VCHA → upgrade vCenter → reconfigure VCHA.
  • Not supported to upgrade while VCHA is active.
  • Backup → disable HA → upgrade → re-enable.

25. Fix VM disk latency?

  • Check storage array performance (backend queue depth, cache).
  • Use a paravirtualized SCSI (PVSCSI) controller.
  • Enable VAAI (hardware acceleration).
  • Avoid overprovisioning datastore.
  • Monitor with esxtop → DAVG/cmd.
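
When reading esxtop disk counters, it helps to separate device latency from kernel latency, since guest-observed latency (GAVG) is the sum of DAVG and KAVG. A minimal sketch of that triage, assuming a 20 ms DAVG limit and a 2 ms KAVG limit as rules of thumb (both are tunable parameters here, not hard limits, and the function is illustrative only):

```python
def classify_latency(davg_ms: float, kavg_ms: float,
                     davg_limit_ms: float = 20.0,
                     kavg_limit_ms: float = 2.0) -> dict:
    """Summarize esxtop latency counters for one storage device.

    GAVG (guest-observed latency) is DAVG + KAVG. The limits are assumed
    rules of thumb and should be adjusted to the environment.
    """
    findings = []
    if davg_ms > davg_limit_ms:
        findings.append("device latency high: check array, fabric, and paths")
    if kavg_ms > kavg_limit_ms:
        findings.append("kernel latency high: check host-side queuing")
    return {"gavg_ms": davg_ms + kavg_ms, "findings": findings or ["ok"]}


if __name__ == "__main__":
    print(classify_latency(davg_ms=25.0, kavg_ms=0.5))
```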

26. Change VM disk from Thick to Thin?

  • Storage vMotion → choose Thin Provision during migration.
  • Or use the CLI (VM powered off).

27. Two VMs with the same IP?

  • IP conflict → network instability, packet loss.
  • Use ARP inspection, DHCP reservations, or guest OS tools to detect.
  • Not a VMware issue—fix at guest or network layer.

28. RAID 10 vs RAID 01? Which better?

  • RAID 10 (1+0): Mirror then stripe → survives multiple disk failures (as long as not the same mirror pair).
  • RAID 01 (0+1): Stripe then mirror → fails if one disk per side fails.
  • RAID 10 is preferred for performance + fault tolerance.
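
The difference in fault tolerance can be made concrete by asking: after one disk has already failed, how likely is it that the next failure destroys the array? A minimal sketch, assuming RAID 01 is built as exactly two mirrored stripe sets and that the second failure hits a random surviving disk (the function name is illustrative):

```python
from fractions import Fraction


def fatal_second_failure_probability(layout: str, disks: int) -> Fraction:
    """Chance that a second random disk failure loses the array.

    raid10: only the failed disk's mirror partner is fatal.
    raid01: the first failure takes down one stripe set, so any disk in
            the other stripe set (disks / 2 of them) is fatal.
    """
    if disks < 4 or disks % 2:
        raise ValueError("disks must be an even number, at least 4")
    survivors = disks - 1
    if layout == "raid10":
        fatal = 1
    elif layout == "raid01":
        fatal = disks // 2
    else:
        raise ValueError("layout must be 'raid10' or 'raid01'")
    return Fraction(fatal, survivors)


if __name__ == "__main__":
    for layout in ("raid10", "raid01"):
        print(layout, fatal_second_failure_probability(layout, 8))
```

With 8 disks this gives 1/7 for RAID 10 and 4/7 for RAID 01, and the gap widens as disks are added.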

29. RAID 5 vs RAID 6?

  • RAID 5: Block-level striping + 1 parity → survives 1 disk failure. Write penalty.
  • RAID 6: 2 parity → survives 2 disk failures. Higher write penalty, better for large drives.
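
The write penalty can be turned into a rough sizing estimate. A minimal sketch, assuming the commonly quoted rule-of-thumb penalties of 4 back-end I/Os per write for RAID 5 and 6 for RAID 6, and ignoring controller cache (names and figures are illustrative):

```python
# Assumed rule-of-thumb back-end I/Os per front-end write.
WRITE_PENALTY = {"raid5": 4, "raid6": 6}


def functional_iops(raw_iops: float, read_fraction: float, level: str) -> float:
    """Estimate front-end IOPS a RAID group delivers for a read/write mix.

    raw_iops is the sum of the member disks' IOPS; reads cost one
    back-end I/O, writes cost WRITE_PENALTY[level].
    """
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    write_fraction = 1.0 - read_fraction
    return raw_iops / (read_fraction + write_fraction * WRITE_PENALTY[level])


if __name__ == "__main__":
    raw = 8 * 150  # eight disks at roughly 150 IOPS each
    for level in ("raid5", "raid6"):
        print(level, round(functional_iops(raw, 0.7, level)))
```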

30. ESXi Partitions & Uses?

  • Boot Bank (Active/Passive): Dual OS images for rollback.
  • Scratch: Logs, temp files (can be on local disk or remote NFS).
  • VMFS Volume: Stores VMs.
  • No traditional /home, /var—most is in RAM or scratch.

31. Logs for PSOD?

  • /var/log/vmkernel.log
  • Core dump: /var/core/
  • vm-support -x (captures full diagnostic bundle)
  • PSOD screen photo (contains CPU, module info)

32. VM Files & Uses

  • .vmx – config
  • .vmdk – virtual disk descriptor
  • -flat.vmdk – actual disk data
  • .nvram – BIOS settings
  • .vswp – swap file
  • .log – VM logs
  • .vmsd – snapshot metadata
  • .vmsn – snapshot state

33. Map one Physical NIC to multiple vSwitches?

  • No. One physical NIC (vmnic) can belong to only one vSwitch (VSS or VDS).
  • But one vSwitch can have multiple vmnics (for redundancy/throughput).

34. Services impacted if vCenter down?

  • Not impacted: running VMs, vSphere HA restarts (the HA agents run on the hosts), local console and direct per-host management.
  • Impacted:
    • DRS (load balancing and initial placement need vCenter)
    • vMotion
    • Storage vMotion
    • Cloning, templates
    • Permissions, alarms, scheduled tasks
    • Lifecycle Manager

Hosts continue running—vCenter is management plane only.


35. ESXi reboots continuously after upgrade?

  • Corrupted bootbank → use Shift+R to revert.
  • Incompatible hardware/firmware → check HCL.
  • Failed VIBs → boot to Tech Support Mode → remove bad VIB.
  • Boot device failure → check SSD/HDD health.

36. Clone VM without vCenter?

  • With the VM powered off, copy its files from the ESXi Shell.
  • Register a new VM via vim-cmd.

37. DPM & Proactive HA?

  • DPM (Distributed Power Management): Powers off hosts during low load to save energy.
  • Proactive HA: Integrates with hardware (e.g., iLO) to evacuate VMs before host failure (e.g., PSU, fan warning).

38. Slot Calculation (for HA)?

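  • Slot size = the largest CPU reservation and the largest memory reservation (plus memory overhead) of any powered-on VM in the cluster.
  • Total slots = sum, over all hosts, of the number of slots each host's CPU and memory can hold (the smaller of the two counts).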
  • Ensures HA can guarantee failover capacity.
  • Can cause inefficiency → use “Percentage of cluster resources” instead.
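
The slot arithmetic can be sketched as follows. This is a simplified model, not the HA agent's actual algorithm: it assumes slot size comes from the largest CPU and memory reservations (plus memory overhead) among powered-on VMs, a 32 MHz CPU default when no VM has a reservation, and that the largest hosts are the ones set aside for failover. All class and function names are illustrative.

```python
from dataclasses import dataclass

DEFAULT_CPU_SLOT_MHZ = 32  # assumed default when no VM has a CPU reservation


@dataclass
class VM:
    cpu_reservation_mhz: int
    mem_reservation_mb: int
    mem_overhead_mb: int


@dataclass
class Host:
    cpu_mhz: int
    mem_mb: int


def slot_size(vms):
    """Return (cpu_mhz, mem_mb) for one slot from the largest reservations."""
    cpu = max((vm.cpu_reservation_mhz for vm in vms), default=0) or DEFAULT_CPU_SLOT_MHZ
    mem = max((vm.mem_reservation_mb + vm.mem_overhead_mb for vm in vms), default=0)
    if mem <= 0:
        raise ValueError("memory slot size must be positive")
    return cpu, mem


def usable_slots(hosts, vms, host_failures_to_tolerate=1):
    """Slots left after reserving the largest hosts for failover."""
    cpu_slot, mem_slot = slot_size(vms)
    per_host = sorted(min(h.cpu_mhz // cpu_slot, h.mem_mb // mem_slot) for h in hosts)
    keep = max(len(per_host) - host_failures_to_tolerate, 0)
    return sum(per_host[:keep])


if __name__ == "__main__":
    vms = [VM(2000, 4096, 100), VM(0, 2048, 80), VM(500, 8192, 150)]
    hosts = [Host(20000, 131072) for _ in range(4)]
    print(slot_size(vms), usable_slots(hosts, vms))
```

In this example a single VM with a 2000 MHz reservation caps every host at 10 slots, which is the inefficiency that the percentage-based policy avoids.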

39. Calculate IOPS?

  • HDD (per spindle): IOPS ≈ 1 / (average seek time + average rotational latency), with both times in seconds.
  • SSD: No seek/rotation → IOPS = throughput / I/O size.
  • Rule of thumb:
    • SATA HDD: 75-100 IOPS
    • SAS HDD: 150-200 IOPS
    • SSD: 10K–100K+ IOPS

Use esxtop → ‘d’ → CMD/s for actual IOPS.
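
The formulas above can be checked against the rule-of-thumb figures. A minimal sketch, assuming average rotational latency is half a revolution and using illustrative seek times (8.5 ms for a 7,200 rpm disk, 3.5 ms for a 15,000 rpm disk), which are typical values rather than vendor specifications:

```python
def hdd_iops(avg_seek_ms: float, rpm: float) -> float:
    """Per-spindle IOPS from seek time and rotational speed.

    Average rotational latency is taken as half a revolution.
    """
    rotational_latency_ms = 0.5 * 60_000.0 / rpm
    return 1000.0 / (avg_seek_ms + rotational_latency_ms)


def ssd_iops(throughput_mib_s: float, io_size_kib: float) -> float:
    """IOPS implied by a sustained throughput at a given I/O size."""
    return throughput_mib_s * 1024.0 / io_size_kib


if __name__ == "__main__":
    print(f"7.2K HDD: {hdd_iops(8.5, 7200):.0f} IOPS")
    print(f"15K HDD:  {hdd_iops(3.5, 15000):.0f} IOPS")
    print(f"SSD at 400 MiB/s, 4 KiB I/O: {ssd_iops(400, 4):.0f} IOPS")
```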


40. New Features in ESXi 8.0?

  • APM (Advanced Power Management) support
  • vSphere Lifecycle Manager enhancements (image management)
  • Secure Boot + TPM 2.0 enforcement
  • Deprecation of VSS (DVS only for new features)
  • Improved vGPUs, DirectPath I/O
  • vSphere+ cloud services integration

41. vSAN & Disk Replacement?

  • vSAN: Software-defined storage using local disks across a cluster.
  • Replace disk:
    1. Put the disk in maintenance mode (Ensure Accessibility or Full Data Migration).
    2. Physically replace.
    3. vSAN auto-rebuilds (if policy allows).
    4. Exit maintenance mode.

42. Upgrade vRealize Suite / vRA vCenter?

  • Use vRealize Suite Lifecycle Manager (vRSLCM).
  • Follow the product-specific upgrade path.
  • Backup vRA, IaaS, and database first.

Not standard vCenter—separate product.


43. IPs required for 4-node vSAN cluster?

  • Per host:
    • 1 Management
    • 1 vMotion (optional but recommended)
    • 1 vSAN (dedicated)
    • 1 VM Network (varies)
  • Total minimum: 4 (mgmt) + 4 (vSAN) + 4 (vMotion) = 12 IPs.
  • Plus vCenter IP, vSAN VMkernel multicast/VNI, witness (if stretched).
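
The count above generalizes to other cluster sizes. A minimal sketch, assuming one VMkernel IP per host for each service and one address each for vCenter and an optional witness (the function and parameter names are illustrative):

```python
def vsan_cluster_ips(hosts: int,
                     vmk_services=("management", "vsan", "vmotion"),
                     vcenter: bool = True,
                     witness: bool = False) -> dict:
    """Count the IP addresses a vSAN cluster needs.

    Assumes one VMkernel IP per host per service; VM network addresses
    belong to the guests and are not counted here.
    """
    if hosts <= 0:
        raise ValueError("hosts must be positive")
    per_service = {service: hosts for service in vmk_services}
    host_vmk_total = sum(per_service.values())
    total = host_vmk_total + int(vcenter) + int(witness)
    return {"per_service": per_service,
            "host_vmk_total": host_vmk_total,
            "total": total}


if __name__ == "__main__":
    print(vsan_cluster_ips(4))
```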

44. VMFS Datastore Full? Fix?

  • Immediate: Delete snapshots, old logs, unused VMs.
  • Extend: Add extent (not recommended) or grow LUN (if array supports).
  • Migrate: Move VMs to another datastore via Storage vMotion.
  • Enable SIOC to manage congestion.

45. Standard Switch vs Distributed Switch?

Feature | VSS | VDS
Management | Per-host | Central (vCenter)
LACP | ❌ | ✅
NetFlow, Port Mirroring | ❌ | ✅
CLI config | ✅ | ❌ (mostly)
vMotion network config | Manual per host | Auto-sync

46. vMotion not working after moving host to new cluster?

  • vMotion vmknic not configured in new cluster.
  • Network policy mismatch (VLAN, MTU).
  • vMotion enabled? → Host → Configure → Networking → VMkernel adapters → Edit → Enable vMotion.

47. P2V & V2V Migration?

  • P2V: Use VMware vCenter Converter Standalone (discontinued but works) or 3rd-party (Starwind, Disk2vhd + import).
  • V2V:
    • Between vCenters: Cross-vCenter vMotion (6.0+)
    • Export OVF → Import
    • Storage vMotion (same vCenter)

48. Upgrade ESXi Hardware Firmware (HP/Dell)?

  • HP: Use SPP (Service Pack for ProLiant) via iLO or OneView.
  • Dell: Use Dell EMC Repository Manager or iDRAC → Firmware Update.
  • Always check VMware HCL after firmware upgrade.

49. Create ESXi Upgrade Baseline?

  • In vCenter → Hosts and Clusters → Updates → Create Baseline.
  • Choose Upgrade type.
  • Attach image (ISO/ZIP) or use online depot.
  • Associate with cluster → Remediate.

vLCM (in 7.0+) uses Images, not baselines.


50. HA vs Proactive HA?

  • HA: Reacts to host failure (power loss, PSOD) → restarts VMs elsewhere.
  • Proactive HA: Reacts to hardware degradation warnings (via vendor provider) → preemptively evacuates VMs before failure.

Proactive HA requires vendor integration (HP, Dell, etc.).
