ESXi VM PCAP collection

This is a small guide on how to collect PCAPs for network traffic from/to a virtual machine running on VMware ESXi. Scripts have been provided by VMware GSS and can be found here and used at your own risk.

(I get it, who would want to use Python scripts downloaded from a random website and run them on your ESXi, feel free to look into the code after download, or else request them from VMware)

The scripts can help to collect on multiple PortIDs, divide the PCAPs into smaller files, and rotate the logs for each x minutes and it can help to monitor the capture that is ongoing.

If you don’t want to use the scripts, then there is also a more manual way to run the collection here.

Finding the VM port IDs

Connect to the ESXi host with SSH and list the running VMs.

[root@dc1esxedge1-02:~] esxcli network vm list
World ID  Name              Num Ports  Networks
--------  ----------------  ---------  --------
68018705  nsxedgenode1             3  dvportgroup-914135, dvportgroup-914135 

Find the VM that you want to collect from, let’s take WorldID 68018705 and check the switch port IDs for it

[root@esxedge1-02:~] esxcli network vm port list -w 68018092
   Port ID: 67108942
   vSwitch: DSVNSX
   Portgroup: dvportgroup-914135
   DVPort ID: 109
   MAC Address: 00:50:56:ab:82:59
   IP Address: 0.0.0.0
   Team Uplink: vmnic2
   Uplink Port ID: 2214592556
   Active Filters:

   Port ID: 67108943
   vSwitch: DSVNSX
   Portgroup: dvportgroup-914135
   DVPort ID: 108
   MAC Address: 00:50:56:ab:06:b0
   IP Address: 0.0.0.0
   Team Uplink: vmnic2
   Uplink Port ID: 2214592556
   Active Filters:

   Port ID: 67108944
   vSwitch: DSVNSX
   Portgroup: dvportgroup-914134
   DVPort ID: 98
   MAC Address: 00:50:56:ab:93:ab
   IP Address: 0.0.0.0
   Team Uplink: vmnic1
   Uplink Port ID: 2214592558
   Active Filters:

In this case, we note down the following port IDs and save them for later

67108942
67108943
67108945
67108946

Start the collection with scripts

Upload the Python script to the /tmp/ on the ESXi host. You also need a folder on the local datastore of the ESXi host where the pcap logs can be stored.

Prepare and run the first script where we provide the PortIDs and where to store the PCAPs

[root@esxedge1-02:~] python /tmp/prepare_pktcap.py -p 67108942 -p 67108943 -p 67108945 -p67108946 -u -d /vmfs/volu
mes/prd.dc1esxedge1-02/capture/42434546 -o /tmp/rotating_cap.sh -G 15m -r 3600
The current vmci heap is 99% free and so far there have been 0 allocation failures
[root@esxedge1-02:~]

Next, we will need to start the rotating log script, this will start the PCAPs collection threads and rotate the logs each 15 min as we defined in the prepare_pktcap script. The script should be left running in the SSH connection. if you stop it, the collection will also stop.

[root@esxedge1-02:~] /tmp/rotating_cap.py
Dump: 294272, broken : 0, drop: 0, file err: 0.
Dump: 175296, broken : 0, drop: 0, file err: 0.
Dump: 371584, broken : 0, drop: 0, file err: 0.
Dump: 317248, broken : 0, drop: 0, file err: 0.
Dump: 371648, broken : 0, drop: 0, file err: 0.
Dump: 317312, broken : 0, drop: 0, file err: 0.

Monitor the collection

The last script is for doing the monitoring of the sessions. You will need to open a new SSH session to the host to run this script.

[root@esxedge1-02:~] /tmp/pktcap_sessions.py -l
The vmci heap is 58% free
session    portID       devName
579        67108942
580        67108942
581        67108945
582        67108946
583        67108943
584        2214592556   vmnic2
585        67108946
586        2214592556   vmnic2
587        67108943
588        67108945
589        2214592558   vmnic1
590        2214592558   vmnic1

From the output, we can see that it collects the PortIDs that we have defined, but it also collects PCAP for the VMNics. This is valuable to us if we need to compare traffic on what is going in on the host and to the VM and visa versa.

Looking at the output that is stored on the datastore

[root@esxedge1-02:/vmfs/volumes/5fd89aca-1f6f344a-f65f-043f72c0064a/capture/42434546] ls
dc1esxedge1-02_2023-11-29T07_42_11_p67108942_d0_sna.pcap
          ....
dc1esxedge1-02_2023-11-29T07_42_11_p67108945_d0_sna.pcap.log      dc1esxedge1-02_2023-11-29T07_42_11_pvmnic2_d0_sna.pcap_rot.log
dc1esxedge1-02_2023-11-29T07_42_11_p67108945_d0_sna.pcap_rot.log  dc1esxedge1-02_2023-11-29T07_42_11_pvmnic2_d1_sna.pcap
dc1esxedge1-02_2023-11-29T07_42_11_p67108945_d1_sna.pcap          dc1esxedge1-02_2023-11-29T07_42_11_pvmnic2_d1_sna.pcap.log
dc1esxedge1-02_2023-11-29T07_42_11_p67108945_d1_sna.pcap.log      dc1esxedge1-02_2023-11-29T07_42_11_pvmnic2_d1_sna.pcap_rot.log
dc1esxedge1-02_2023-11-29T07_42_11_p67108945_d1_sna.pcap_rot.log  
killfile

We can see the PCAPs for each of the PortIDs and VMnics, and it will rotate every 15 minutes.

Stopping the collection

You might have noticed the “killfile” from above. Remote this file and the collection will stop.

[root@esxedge1-02:~] rm -rf /vmfs/volumes/5fd89aca-1f6f344a-f65f-043f72c0064a/capture/42434546/killfile

Conclusion

The scripts are handy because they rotate the logs and have a way to monitor and kill the collections. This way we don’t have to manually kill processes on the ESXi host.

When the collection is done, you can copy out the logs for further analysis, I found the Filezilla SFTP client to be the fastest way of copying out the data.

If out want to merge the PCAPs afterward, on MacOS, you can do it with

mergecap -w merged.pcap *.pcap

Troubleshooting

If you find that you can’t start the rotating logs script it might be because you have tried to start it before and it somehow stall. You can find the process IDs for it and kill it manually.

[root@esxedge1-02:~] ps -Tcjstv | grep -i rotating_cap

[root@esxedge1-02:~] kill 

Download

ESXi network routes

I honestly don’t know why this is still a problem. Support for routed vmotion traffic was added back at vSphere6. Here we are vSphere7 and still have to set your gateway/routes for the vmotion stack through esxcli.

Either way, here is how it works

vMotion stack

Each tcp/ip stack can only have one gateway, that makes sense. And if you want to keep your management and vMotion traffic separated you need two tcp/ip stacks.

It’s nicely done through vSphere vCenter GUI and there is a KB for it. And you even have the option to override the default gateway and specify the right one for your vMotion stack.

[root@dc1esxcompx-xx:~] esxcli network ip route ipv4 list -N vmotion
Network     Netmask        Gateway  Interface  Source
----------  -------------  -------  ---------  ------
10.1.115.0  255.255.255.0  0.0.0.0  vmk1       MANUAL

But when looking at the routing table from esxcli it is not set. If you know, feel free to give me a kick and enlighten me.

ESXCLI add a static route

So for me to set an actually default route I have to do it as shown below

[root@dc1esxcompx-xx:~] esxcli network ip route ipv4 add -g 10.1.115.1 -n 0.0.0.0/0 -N vmotion

[root@dc1esxcompx-xx:~] esxcli network ip route ipv4 list -N vmotion
Network     Netmask        Gateway     Interface  Source
----------  -------------  ----------  ---------  ------
default     0.0.0.0        10.1.115.1  vmk1       MANUAL
10.1.115.0  255.255.255.0  0.0.0.0     vmk1       MANUAL

PowerCLI

Need to do it on a cluster with multiple hosts? No problem LucD from VMware community got you covered. I only did a little customization and it works for my needs.

connect-viserver -Server 

$stackName = 'vmotion'
$ipGateway = '10.1.115.1'
$ipDevice = 'vmk3'
$cluster = "computexx"
$vmhosts = get-cluster $cluster | get-vmhost

foreach($vmhost in $vmhosts)
{
$esx = Get-VMHost -Name $vmhost
$netSys = Get-View -Id $esx.ExtensionData.ConfigManager.NetworkSystem
$stack = $esx.ExtensionData.Config.Network.NetStackInstance | where{$_.Key -eq 'vmotion'}
$config = New-Object VMware.Vim.HostNetworkConfig
$spec = New-Object VMware.Vim.HostNetworkConfigNetStackSpec
$spec.Operation = [VMware.Vim.ConfigSpecOperation]::edit
$spec.NetStackInstance = $stack
$spec.NetStackInstance.ipRouteConfig.defaultGateway = $ipGateway
$spec.NetStackInstance.ipRouteConfig.gatewayDevice = $ipDevice
$config.NetStackSpec += $spec
$netsys.UpdateNetworkConfig($config,[VMware.Vim.HostConfigChangeMode]::modify)
}

Conclusion

Manipulating the vmotion stack route table with either esxcli or PowerCLI is working great.

Need to know more? there are plenty of good bloggers and KBs out here.

Unlock ESXi root account

I often see people, including myself, lock themselves out of the ESXi web-based host client. It only locks you out from ssh and the web console. Password lockout is NOT active on the console/DCUI. Below is how you reset the counter and regain access.

Procedure to unlock the ESXi root

First, you need to gain ILO/IMM/IPMI or physical access to the server.

  1. At the console, press ALT+F1 to get to the ESXi shell. If a login shows up continue with step 3, otherwise continue with step 2. Change back to the login screen with ALT+F2.
  2. Login to the DCUI (to enable the ESXi Shell if not already done)
    1. Login with root and the correct password.
    2. Go to Troubleshooting Options
    3. Select Enable ESXi Shell
    4. Press CTRL+ALT+F1
  3. At the ESXi shell login with root and the password
  4. Run the following command to unlock the root account:
pam_tally2 --user root --reset

ESXCLI host upgrade procedure

Most of the time you would want to use VMware Update Manager when doing upgrade. Its part of vCenter and is necessary tool when having to maintain your environment. But for smaller deployments, with standalone hosts and no vCenter the following upgrade methods are desired and can help the upgrade time. Instead of having to upgrade with IPMI and an ISO.

Online mode:

This method is for getting the update online, no need to download ISO/offline bundles, etc. This will work for most of the upgrade use cases.

1: Connect to your ESXi host via the host client and enable SSH. Afterward ssh to the ESXi host and enable ESXi firewall rule to allow the host to access the internet.

esxcli network firewall ruleset set -e true -r httpClient

2: With the beneath command you will get a list of available ESXi packaged that are on the VMware repos. Enter this command to list all available profiles. We filter only those which are relevant to our case – upgrade to ESXi 6.7

esxcli software sources profile list -d https://hostupdate.vmware.com/software/VUM/PRODUCTION/main/vmw-depot-index.xml | grep -i ESXi-6.7

3. Chose the desired profile and use the following command for choosing and upgrading the ESXi version. Before upgrade its a good idea to enter maintenance mode.

esxcli system maintenanceMode set --enable true
esxcli software profile update -p ESXi-6.7.0-20190402001-standard -d https://hostupdate.vm
ware.com/software/VUM/PRODUCTION/main/vmw-depot-index.xml

4. After it’s done, you will need to restart the host, after its rebooted you will run on the new ESXi version.

Custom, with Offline bundle:

This method is for when you desire to install a custom update, or that your hosts down have access to the internet.

1: Download the offline bundle from the VMware webpage, in this upgrade I will use an HPE custom version. But if you run a generic version, that will also work.

2: After downloading the “VMware-ESXi-6.7.0-8169922-depot.zip” file, place it (upload it) to a datastore which is visible by your ESXi host. Best would be a local datastore if this host has some. If not, it can also be a shared datastore too.

3: Find the profile name from the depot offline bundle

 esxcli software sources profile list -d /vmfs/volumes/prd.r60lun01/ISO/VMware-ESXi-6.7.0-Up
019-depot.zip

Put your host into maintenance mode, enable SSH if you haven’t done yet.

3: Execute this command to upgrade your ESXi 6.x to 6.7

esxcli software profile update -p ESXi-6.7.0-13006603-standard -d /vmfs/volumes/your_datastore/VMware-ESXi-6.7.0-13006603-depot.zip

esxcli software profile update -p HPE-ESXi-6.7.0-Update2-Gen9plus-670.U2.10.4.1.8 -d /vmfs/volumes/prd.r60lun01/ISO/VMware-ESXi-6.7.0-Update2-13006603-HPE-Gen9plus-670.U2.10.4.1.8-Apr2019-depot.zip

After checking that your upgrade was successful, reboot your host. You should see a message saying that the upgrade completed successfully.

Troubleshooting

I have tried to get an error with:

Failed updating the bootloader: Execution of command /usr/lib/vmware/bootloader-installer/install-bootloader failed: non-zero code returned…. return code: 1”

Error when upgrading, due to “insufficient space”.

This problem is due to the SWAP is but on the installation of the ESXi, not a good thing. So let’s change it.

Go to the UI of the ESXi Hosts https://IP/ui, login and proceed to the following:

Manage > System > Swap > Edit Settings

Chose the dropdown and select a datastore. Apply and the swap space is not freed from the ESXi install device so that you can try to upgrade again.

Conclusion:

After the upgrade, it’s a good idea to disable the ESXi firewall rule for “HTTP outside access”. Stop and disable SSH again, but it’s optional 🙂

esxcli network firewall ruleset set -e false -r httpClient

Now you should have an upgraded host.

ESXi – physical memory population

Had en interesting problem where a ESXi host only showed it had 30GB of memory, but the motherboard was populated with 6*8GB modules. In earlier versions of ESXi 5.5< it was possible to use dmidecode to show how the physical hardware was populated. But since 6.0> that have been removed.

The new command to find those kind of information are now “smbiosDump”

smbiosDump | grep -A 6 'Memory Device'

You can also just run smbiosDump without any paramenters and you get a hole lot of information to crawl through.