VMware

Primarily ESXi-related


The service snmpd failed to start

- Posted in VMware by

The error "The service snmpd failed to start" is displayed when the service is started before it has been configured.

Configure snmpd via ESXi shell as follows:

Reset settings, if any exist: esxcli system snmp set -r

Community: esxcli system snmp set -c aners

Port (udp): esxcli system snmp set -p 161

Location: esxcli system snmp set -L "Room 641A"

Contact (email): esxcli system snmp set -C mail@yourmail.tld

Enable service: esxcli system snmp set -e yes

Then go to https://esxihost/ui/#/host/manage/services and start the service. Consider changing the start policy as well.
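The individual settings above can also be applied in one go. The following is a minimal sketch, not a tested procedure: the run and configure_snmp helpers and the DRY_RUN variable are names made up for this example, the values are the examples from this post, and the final "esxcli system snmp get" is only added to print the resulting configuration.

```shell
#!/bin/sh
# Example values from this post; replace them with your own.
COMMUNITY="aners"
PORT="161"
LOCATION="Room 641A"
CONTACT="mail@yourmail.tld"

# run (hypothetical helper): print the command when DRY_RUN is 1 (the
# default here), execute it otherwise. Printed commands lose their quoting.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# configure_snmp (hypothetical helper): the steps from this post, in order.
# The chain stops at the first command that fails.
configure_snmp() {
    run esxcli system snmp set -r &&
    run esxcli system snmp set -c "$COMMUNITY" &&
    run esxcli system snmp set -p "$PORT" &&
    run esxcli system snmp set -L "$LOCATION" &&
    run esxcli system snmp set -C "$CONTACT" &&
    run esxcli system snmp set -e yes &&
    run esxcli system snmp get
}
```

Calling configure_snmp only prints the commands; DRY_RUN=0 configure_snmp would execute them on the host. The service still has to be started afterwards as described above.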


Module 'CPUID' power on failed.

- Posted in VMware by

Even though the number of configured vCPUs in the VM had not been changed and was not excessive, the VM would not cross-migrate while powered on (hot); the attempt failed with an EVC error.

After powering off the VM and migrating to another cluster, which was on the same EVC level, powering on the VM resulted in the error "Module 'CPUID' power on failed."

Inspecting the CPU Identification Mask settings (Edit Settings -> CPU > CPUID Mask -> Advanced) and resetting to default did not resolve the issue.

I assumed the VMX had some clues, and found multiple lines of cpuid-related flags:

[aners@derp:/vmfs/volumes/vsan:5..2] grep -i cpu *.vmx

...
sched.cpu.units = "mhz"
cpuid.80000001.edx = "---- ---- ---0 ---- ---- ---- ---- ----"
cpuid.80000001.eax.amd = "---- ---- ---- ---- ---- ---- ---- ----"
cpuid.80000001.ebx.amd = "---- ---- ---- ---- ---- ---- ---- ----"
cpuid.80000001.ecx.amd = "---- ---- ---- ---- ---- ---- ---- ----"
cpuid.80000001.edx.amd = "---- ---- ---0 ---- ---- ---- ---- ----"
sched.cpu.latencySensitivity = "normal"
...

Removing all 'cpuid...' lines from the VMX resolved the issue; the VM was able to power on again.
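Removing the lines by hand works fine, but it can also be scripted. A minimal sketch, assuming the VM is powered off and that every mask line starts with "cpuid." at the beginning of the line, as in the excerpt above; strip_cpuid is a name made up for this example.

```shell
# strip_cpuid FILE (hypothetical helper): copy FILE to FILE.bak, then
# rewrite FILE without the lines that start with "cpuid." (case-insensitive).
strip_cpuid() {
    vmx="$1"
    cp "$vmx" "$vmx.bak" || return 1
    grep -vi '^cpuid\.' "$vmx.bak" > "$vmx"
}
```

Usage would be strip_cpuid followed by the path to the .vmx file; the untouched original stays next to it as .vmx.bak in case the VM still refuses to power on.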


vSphere Remove snapshot task 0%, stuck?

- Posted in VMware by

When removing large snapshots, the task status should progress steadily towards 100%; sometimes, however, it drops to 0% in the web UI and the user is left clueless.

Refreshing the web UI doesn't bring the current progress back. Luckily, the shell on the host can be used to retrieve the progress:

Chaining a few commands gets the progress of the "Snapshot.remove" task:

1) Get a list of all VMs and filter by the name of your VM: vim-cmd vmsvc/getallvms|grep -i garg|awk '{print $1}'

[root@virt58:/vmfs/volumes/vsan:5...2/4...9] vim-cmd vmsvc/getallvms|grep -i garg|awk '{print $1}'
72

The Vmid is returned, in this example 72

2) Verify the Vmid is in fact the VM you're interested in by fetching its name: vim-cmd vmsvc/get.summary 72|grep name

[root@virt58:/vmfs/volumes/vsan:5...2/4...9] vim-cmd vmsvc/get.summary 72|grep name
name = "Gargoil",

3) Having verified the Vmid, get the running tasks: vim-cmd vmsvc/get.tasklist 72

(ManagedObjectReference) [
   'vim.Task:haTask-72-vim.vm.Snapshot.remove-138283664'
]

4) Copy the vim.Task identifier and get task_info, filter "state" and "progress" (note -E, so that | is treated as alternation): vim-cmd vimsvc/task_info haTask-72-vim.vm.Snapshot.remove-138283664|grep -E "state|progress"

   state = "running",
   progress = 86,

What the web UI failed to display is that the "Snapshot.remove" task is running and 86% complete. I guess this is why the CLI is usually my favourite go-to.

For more verbose output, remove the pipe to grep.
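Steps 3 and 4 can be chained so the task identifier doesn't have to be copied by hand. This is a rough sketch that assumes the output looks like the examples in this post; task_id, task_progress and snapshot_progress are names made up for the example.

```shell
# task_id (hypothetical helper): read "vim-cmd vmsvc/get.tasklist" output
# on stdin and print the identifier of the first Snapshot.remove task.
task_id() {
    sed -n "s/.*'vim\.Task:\(haTask-[^']*Snapshot\.remove[^']*\)'.*/\1/p" | head -n 1
}

# task_progress (hypothetical helper): read "vim-cmd vimsvc/task_info"
# output on stdin and print the number that follows "progress =".
task_progress() {
    sed -n 's/^[[:space:]]*progress = \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# snapshot_progress VMID: combine both helpers for a given Vmid.
snapshot_progress() {
    vmid="$1"
    id=$(vim-cmd vmsvc/get.tasklist "$vmid" | task_id)
    if [ -z "$id" ]; then
        echo "no Snapshot.remove task found for VM $vmid" >&2
        return 1
    fi
    pct=$(vim-cmd vimsvc/task_info "$id" | task_progress)
    echo "$id: ${pct:-unknown}%"
}
```

With the Vmid from step 1, snapshot_progress 72 would print the task identifier followed by its progress, or "unknown" if the task_info output carries no numeric progress.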


Invalid configuration for device '0'.

- Posted in VMware by

One of my Veeam Backup Copy jobs failed for every VM in the job, reporting IO errors:

10/08/2023 08.29.02 :: Processing vSRX-18.3 Error: File does not exist. File: [vSRX-18.3.1D2023-08-09T020227_4DE4.vib]. Failed to open storage for read access. Storage: [vSRX-18.3.1D2023-08-09T020227_4DE4.vib]. Failed to restore file from local backup. VFS link: [summary.xml]. Target file: [MemFs://frontend::CDataTransferCommandSet::RestoreText_{13faea43-f648-4fee-8abb-630907bd1df7}]. CHMOD mask: [0]. Agent failed to process method {DataTransfer.RestoreText}. ...

The volume holding the VIBs, an external USB-drive forwarded to the Veeam guest within ESXi, was gone. Seemingly a failed drive.

VCSA failed to remove the USB Host device, with the error:

Invalid configuration for device '0'

Since the missing device (the USB controller) could not be removed even with the VM powered off, I had no choice but to delete it manually:

First, remove the VM from the inventory - obviously without deleting it from VMFS.

Then locate the device in the VM's .vmx file and remove the line from the configuration:

usb_xhci.autoConnect.device0 = "path:0/1/1/5 host:esxi7.rotteslottet.lan hostId:71 05 57 47 bf 16 10 d6-2c 0e 3c 7c 3f 11 9a f0 autoclean:1 deviceType:remote-host"

After trashing the failed drive, replacing it with a new one, and re-powering the USB-controller, I simply registered the VM via VCSA and started over with the local backup.

I wish it were possible to forcefully remove a virtual device via VCSA.


Use ovftool to deploy the image directly to the ESXi-host instead:

ovftool -dm=thick -ds=<DATASTORENAME> -n=<VMNAME> --net:"VM Network"="<VMNETWORKNAME>" "junos-media-vsrx-x86-64-vmdisk-18.2R1.9.scsi.ova" vi://root@esxi-host.tld

Replace the <...> placeholders, the OVA filename and the host address with your own settings.


If a snapshot seems stuck, use the console to verify a task is actually running:

1) Run vim-cmd vmsvc/getallvms and note the relevant VM-ID

2) Run vim-cmd vmsvc/get.tasklist <VM-ID> and note the Task-id

3) Run vim-cmd vimsvc/task_info <Task-id> to get task status

4) Browse to the VM's location on the datastore and run watch -d 'ls -lut | grep -E "delta|flat|sesparse"' to monitor the process


Unmap VMFS using esxcli

- Posted in VMware by

First, fetch a list of filesystems and note the VMFS volumes:

esxcli storage filesystem list

For VMFS volumes where unmapping is supported, run:

esxcli storage vmfs unmap --volume-label=<label> | --volume-uuid=<uid>  [--reclaim-unit=<blocks>]

ESXTOP xterm, for unsupported terminals

- Posted in VMware by

When the terminal/tty is not supported, set TERM to xterm before running esxtop to get usable output:

TERM=xterm esxtop

Get Virtual Machine uptime, with vim-cmd

- Posted in VMware by

Run vim-cmd vmsvc/getallvms to get a list of VM IDs (pipe to grep -i to filter).

With the ID from the first column (Vmid), use the following command to fetch the uptime (replace 12345 with your VM's ID):

vim-cmd vmsvc/get.summary 12345 |grep "uptimeSeconds"
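The value is reported in raw seconds, which gets hard to read for long-running VMs. A small sketch for converting it, assuming the line looks like uptimeSeconds = <number>; uptime_seconds and fmt_uptime are names made up for this example.

```shell
# uptime_seconds (hypothetical helper): read get.summary output on stdin
# and print the number that follows "uptimeSeconds =".
uptime_seconds() {
    sed -n 's/.*uptimeSeconds = \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# fmt_uptime SECONDS (hypothetical helper): print SECONDS as days, hours,
# minutes and seconds.
fmt_uptime() {
    s="$1"
    echo "$((s / 86400))d $((s % 86400 / 3600))h $((s % 3600 / 60))m $((s % 60))s"
}
```

On the host this could be combined as fmt_uptime "$(vim-cmd vmsvc/get.summary 12345 | uptime_seconds)". As a quick check, fmt_uptime 93784 prints 1d 2h 3m 4s.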

ESXi 6.5, switch to legacy USB-stack

- Posted in VMware by

Disable the vmkusb module in ESXi 6.5 and switch to the legacy stack:

esxcli system module set -m=vmkusb -e=FALSE

Re-enable vmkusb:

esxcli system module set -m=vmkusb -e=TRUE

Either change requires a reboot of ESXi.
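To check which state the module is in before and after the reboot, the module list can be filtered. A minimal sketch, assuming "esxcli system module list" prints three columns (Name, Is Loaded, Is Enabled); module_state is a name made up for this example.

```shell
# module_state NAME (hypothetical helper): read "esxcli system module list"
# output on stdin and print the loaded/enabled columns for module NAME.
# Assumes the column order: Name, Is Loaded, Is Enabled.
module_state() {
    awk -v m="$1" '$1 == m { print "loaded=" $2 " enabled=" $3 }'
}
```

Usage would be esxcli system module list | module_state vmkusb. Right after disabling the module and before rebooting, it would presumably still show as loaded but no longer enabled.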