ESXi VM PCAP collection

This is a short guide on how to collect PCAPs of network traffic to and from a virtual machine running on VMware ESXi. The scripts were provided by VMware GSS and can be found here. Use them at your own risk.

(I get it: who wants to run Python scripts downloaded from a random website on their ESXi host? Feel free to read through the code after downloading it, or request the scripts from VMware instead.)

The scripts can capture on multiple PortIDs, split the PCAPs into smaller files, rotate the files every x minutes, and monitor an ongoing capture.

If you don’t want to use the scripts, then there is also a more manual way to run the collection here.

Finding the VM port IDs

Connect to the ESXi host with SSH and list the running VMs.

[root@dc1esxedge1-02:~] esxcli network vm list
World ID  Name              Num Ports  Networks
--------  ----------------  ---------  --------
68018705  nsxedgenode1             3  dvportgroup-914135, dvportgroup-914135 

Find the VM that you want to collect from. Let’s take World ID 68018705 and check its switch port IDs.

[root@esxedge1-02:~] esxcli network vm port list -w 68018705
   Port ID: 67108942
   vSwitch: DSVNSX
   Portgroup: dvportgroup-914135
   DVPort ID: 109
   MAC Address: 00:50:56:ab:82:59
   IP Address: 0.0.0.0
   Team Uplink: vmnic2
   Uplink Port ID: 2214592556
   Active Filters:

   Port ID: 67108943
   vSwitch: DSVNSX
   Portgroup: dvportgroup-914135
   DVPort ID: 108
   MAC Address: 00:50:56:ab:06:b0
   IP Address: 0.0.0.0
   Team Uplink: vmnic2
   Uplink Port ID: 2214592556
   Active Filters:

   Port ID: 67108944
   vSwitch: DSVNSX
   Portgroup: dvportgroup-914134
   DVPort ID: 98
   MAC Address: 00:50:56:ab:93:ab
   IP Address: 0.0.0.0
   Team Uplink: vmnic1
   Uplink Port ID: 2214592558
   Active Filters:

In this case, we note down the following port IDs and save them for later:

67108942
67108943
67108945
67108946
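Copying port IDs by hand is easy to get wrong, so here is a minimal Python sketch that pulls them out of the `esxcli network vm port list` output. It assumes the output has the same "Port ID:" line layout shown above; the function name `parse_port_ids` is my own and not part of the GSS scripts.

```python
import re
from typing import List


def parse_port_ids(esxcli_output: str) -> List[int]:
    """Return the 'Port ID' values, in order of appearance."""
    # Anchor on the start of the line so that "DVPort ID:" and
    # "Uplink Port ID:" lines are not picked up by mistake.
    pattern = re.compile(r"^[ \t]*Port ID:[ \t]*(\d+)[ \t]*$", re.MULTILINE)
    return [int(value) for value in pattern.findall(esxcli_output)]


# Shortened sample in the same layout as the output above.
sample = """\
   Port ID: 67108942
   vSwitch: DSVNSX
   DVPort ID: 109
   Uplink Port ID: 2214592556

   Port ID: 67108943
   vSwitch: DSVNSX
   DVPort ID: 108
   Uplink Port ID: 2214592556
"""

print(parse_port_ids(sample))
```

On a real host you would feed it the text captured from the command instead of the inline sample.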

Start the collection with scripts

Upload the Python scripts to /tmp/ on the ESXi host. You also need a folder on the local datastore of the ESXi host where the PCAP files can be stored.

Prepare and run the first script, passing the PortIDs and the directory where the PCAPs should be stored:

[root@esxedge1-02:~] python /tmp/prepare_pktcap.py -p 67108942 -p 67108943 -p 67108945 -p 67108946 -u -d /vmfs/volumes/prd.dc1esxedge1-02/capture/42434546 -o /tmp/rotating_cap.sh -G 15m -r 3600
The current vmci heap is 99% free and so far there have been 0 allocation failures
[root@esxedge1-02:~]
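With many ports, the command line gets long enough to wrap in the terminal. The sketch below assembles the same argument list from a list of port IDs. The flags (-p, -u, -d, -o, -G, -r) are copied verbatim from the command above; this guide does not document what -u and -r mean, so the helper simply passes them through. The helper name and its parameter names are hypothetical.

```python
import shlex
from typing import Iterable, List


def build_prepare_args(port_ids: Iterable[int], capture_dir: str,
                       out_script: str = "/tmp/rotating_cap.sh",
                       rotate: str = "15m", r_value: int = 3600) -> List[str]:
    """Build the prepare_pktcap.py argument list used in this guide."""
    args = ["python", "/tmp/prepare_pktcap.py"]
    for port_id in port_ids:
        args += ["-p", str(port_id)]  # one -p per port, as in the example
    # Remaining flags are reproduced as-is from the command in this guide.
    args += ["-u", "-d", capture_dir, "-o", out_script,
             "-G", rotate, "-r", str(r_value)]
    return args


args = build_prepare_args(
    [67108942, 67108943],
    "/vmfs/volumes/prd.dc1esxedge1-02/capture/42434546",
)
# Print a copy-pasteable, shell-quoted command line.
print(" ".join(shlex.quote(a) for a in args))
```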

Next, start the rotating capture script. It starts the PCAP collection threads and rotates the files every 15 minutes, as defined with the prepare_pktcap script. Leave the script running in the SSH session; if you stop it, the collection stops as well.

[root@esxedge1-02:~] /tmp/rotating_cap.py
Dump: 294272, broken : 0, drop: 0, file err: 0.
Dump: 175296, broken : 0, drop: 0, file err: 0.
Dump: 371584, broken : 0, drop: 0, file err: 0.
Dump: 317248, broken : 0, drop: 0, file err: 0.
Dump: 371648, broken : 0, drop: 0, file err: 0.
Dump: 317312, broken : 0, drop: 0, file err: 0.
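The status lines are worth watching for non-zero "broken", "drop", or "file err" counters. Here is a small sketch that parses one such line and flags a problem. It assumes the line format stays exactly as printed above; the function names are mine.

```python
import re
from typing import Dict, Optional

STATUS_RE = re.compile(
    r"Dump:\s*(\d+),\s*broken\s*:\s*(\d+),\s*drop:\s*(\d+),\s*file err:\s*(\d+)"
)


def parse_status(line: str) -> Optional[Dict[str, int]]:
    """Parse one 'Dump: ...' status line; return None for anything else."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    dump, broken, drop, file_err = (int(g) for g in match.groups())
    return {"dump": dump, "broken": broken, "drop": drop, "file_err": file_err}


def has_problem(status: Dict[str, int]) -> bool:
    """True when any of the error counters is non-zero."""
    return any(status[key] > 0 for key in ("broken", "drop", "file_err"))


status = parse_status("Dump: 294272, broken : 0, drop: 0, file err: 0.")
print(status, has_problem(status))
```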

Monitor the collection

The last script monitors the capture sessions. Open a new SSH session to the host to run it.

[root@esxedge1-02:~] /tmp/pktcap_sessions.py -l
The vmci heap is 58% free
session    portID       devName
579        67108942
580        67108942
581        67108945
582        67108946
583        67108943
584        2214592556   vmnic2
585        67108946
586        2214592556   vmnic2
587        67108943
588        67108945
589        2214592558   vmnic1
590        2214592558   vmnic1

From the output, we can see that it captures on the PortIDs that we defined, but it also captures on the vmnics. This is valuable when we need to compare the traffic entering the host with the traffic reaching the VM, and vice versa.
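To check quickly that every port has its capture sessions, the session listing can be grouped by port ID. This is a sketch that assumes the three-column layout shown above (session, portID, optional devName); `sessions_by_port` is a name I made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def sessions_by_port(listing: str) -> Dict[int, List[int]]:
    """Map each portID to the list of session IDs capturing on it."""
    grouped = defaultdict(list)
    for line in listing.splitlines():
        parts = line.split()
        # Data rows start with two numeric columns; the header and the
        # heap status line do not, so they are skipped.
        if len(parts) >= 2 and parts[0].isdigit() and parts[1].isdigit():
            grouped[int(parts[1])].append(int(parts[0]))
    return dict(grouped)


listing = """\
The vmci heap is 58% free
session    portID       devName
579        67108942
580        67108942
584        2214592556   vmnic2
586        2214592556   vmnic2
"""

for port, sessions in sessions_by_port(listing).items():
    print(port, sessions)
```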

Looking at the output that is stored on the datastore:

[root@esxedge1-02:/vmfs/volumes/5fd89aca-1f6f344a-f65f-043f72c0064a/capture/42434546] ls
dc1esxedge1-02_2023-11-29T07_42_11_p67108942_d0_sna.pcap
          ....
dc1esxedge1-02_2023-11-29T07_42_11_p67108945_d0_sna.pcap.log      dc1esxedge1-02_2023-11-29T07_42_11_pvmnic2_d0_sna.pcap_rot.log
dc1esxedge1-02_2023-11-29T07_42_11_p67108945_d0_sna.pcap_rot.log  dc1esxedge1-02_2023-11-29T07_42_11_pvmnic2_d1_sna.pcap
dc1esxedge1-02_2023-11-29T07_42_11_p67108945_d1_sna.pcap          dc1esxedge1-02_2023-11-29T07_42_11_pvmnic2_d1_sna.pcap.log
dc1esxedge1-02_2023-11-29T07_42_11_p67108945_d1_sna.pcap.log      dc1esxedge1-02_2023-11-29T07_42_11_pvmnic2_d1_sna.pcap_rot.log
dc1esxedge1-02_2023-11-29T07_42_11_p67108945_d1_sna.pcap_rot.log  
killfile

We can see the PCAPs for each of the PortIDs and vmnics, and the files rotate every 15 minutes.
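The directory mixes .pcap files with .log files and the killfile, so it helps to sort out which captures belong to which port. The sketch below groups the .pcap names by the part after "_p". The pattern is inferred only from the file names in the listing above; I don't know what the "_d0"/"_d1" and "_sna" parts stand for, so they are matched literally and not interpreted.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# <host>_<date>T<hh>_<mm>_<ss>_p<port>_d<n>_sna.pcap (inferred from the listing)
PCAP_NAME = re.compile(
    r"^(?P<host>.+)_(?P<stamp>\d{4}-\d{2}-\d{2}T\d{2}_\d{2}_\d{2})"
    r"_p(?P<port>[^_]+)_d(?P<d>\d+)_sna\.pcap$"
)


def group_pcaps(names: Iterable[str]) -> Dict[str, List[str]]:
    """Group capture file names by port; non-.pcap files are ignored."""
    grouped = defaultdict(list)
    for name in names:
        match = PCAP_NAME.match(name)
        if match:
            grouped[match.group("port")].append(name)
    return {port: sorted(files) for port, files in grouped.items()}


names = [
    "dc1esxedge1-02_2023-11-29T07_42_11_p67108945_d1_sna.pcap",
    "dc1esxedge1-02_2023-11-29T07_42_11_p67108945_d1_sna.pcap.log",
    "dc1esxedge1-02_2023-11-29T07_42_11_pvmnic2_d1_sna.pcap",
    "killfile",
]
print(group_pcaps(names))
```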

Stopping the collection

You might have noticed the “killfile” in the listing above. Remove this file and the collection will stop.

[root@esxedge1-02:~] rm -rf /vmfs/volumes/5fd89aca-1f6f344a-f65f-043f72c0064a/capture/42434546/killfile
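If you script the stop, removing just that one file is a bit safer than a recursive force delete. A minimal sketch, assuming the file is always called "killfile" and lives in the capture directory as shown above; the demo runs against a throwaway temp directory, not a real datastore.

```python
import tempfile
from pathlib import Path


def stop_capture(capture_dir: str) -> bool:
    """Remove <capture_dir>/killfile; return True if it existed."""
    killfile = Path(capture_dir) / "killfile"
    try:
        killfile.unlink()
    except FileNotFoundError:
        return False  # nothing to remove, capture is probably not running
    return True


# Demo: create a fake killfile, then remove it twice.
with tempfile.TemporaryDirectory() as demo_dir:
    (Path(demo_dir) / "killfile").touch()
    first = stop_capture(demo_dir)
    second = stop_capture(demo_dir)
    print(first, second)
```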

Conclusion

The scripts are handy because they rotate the logs and have a way to monitor and kill the collections. This way we don’t have to manually kill processes on the ESXi host.

When the collection is done, you can copy the files off the host for further analysis. I found the FileZilla SFTP client to be the fastest way of copying out the data.

If you want to merge the PCAPs afterward, you can do it on macOS with:

mergecap -w merged.pcap *.pcap
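Because the copied folder also contains .pcap.log files, and a second run would pick up merged.pcap itself, it can be handy to build the mergecap argument list explicitly. This sketch only assembles the arguments (using the same `-w` option as above) and leaves running the command to you; the function name is hypothetical.

```python
from typing import Iterable, List


def build_mergecap_args(names: Iterable[str],
                        output: str = "merged.pcap") -> List[str]:
    """Return a mergecap argument list for all .pcap names except the output."""
    inputs = sorted(
        name for name in names
        if name.endswith(".pcap") and name != output
    )
    if not inputs:
        raise ValueError("no .pcap files to merge")
    return ["mergecap", "-w", output, *inputs]


print(build_mergecap_args(["b.pcap", "a.pcap", "a.pcap.log", "merged.pcap"]))
```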

Troubleshooting

If you can’t start the rotating capture script, it might be because an earlier run stalled and is still around. You can find its process IDs and kill them manually.

[root@esxedge1-02:~] ps -Tcjstv | grep -i rotating_cap

[root@esxedge1-02:~] kill <PID>

Juniper upgrade process

Junos is, in my opinion, an awesome OS for your network. I enjoy the CLI, where the commands are alike across all of Juniper’s products, the many features, and the fact that it’s not Cisco.

BUT it also has its drawbacks. Honestly, I have seen some weird bugs, and keeping track of all the PRs from Juniper is a full-time job. Last but not least, the software upgrades are kind of a pain, especially on Junos devices older than 18.x.

EX3400 – format/install

For this case, I had a new EX3400 with older firmware, 15.1X53-D58.3. I needed to upgrade to the latest SR in the newest train, but from the CLI of the device an upgrade can only jump three releases at a time:

15.1 > 18.1 > 18.4 > 19.3 > 20.2 > 21.1
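The hops from 18.1 onward follow a simple rule: move at most three releases forward each time. Here is a sketch of that rule in Python. The release list is an assumption for illustration (four releases per year for 18.x through 21.x), and the first hop from 15.1 to 18.1 is not modeled because it does not follow the same spacing.

```python
from typing import List


def plan_hops(releases: List[str], start: str, target: str,
              max_jump: int = 3) -> List[str]:
    """Return the upgrade path from start to target, max_jump releases per hop."""
    current, goal = releases.index(start), releases.index(target)
    if current > goal:
        raise ValueError("target is older than start")
    path = [start]
    while current < goal:
        current = min(current + max_jump, goal)  # never overshoot the target
        path.append(releases[current])
    return path


# Assumed release list: 18.1 ... 18.4, 19.1 ... 19.4, 20.1 ... 21.4
RELEASES = [f"{year}.{quarter}" for year in (18, 19, 20, 21)
            for quarter in (1, 2, 3, 4)]

print(" > ".join(plan_hops(RELEASES, "18.1", "21.1")))
```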

But you can also do a format/install where you interrupt the boot process and then load a new firmware image on the device from a TFTP server. This is all done outside of Junos. This way you can jump to whatever version you want.

Jumping many versions might make your config invalid, so be aware.

Juniper has a LOT of KB articles for this process, and they all vary. So here is the process in my own words.

Process of format install

First, we need to get the right image from the Juniper support site. It needs to be the install image (the file name starts with junos-install-media), and the extension is .tgz.
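Grabbing the wrong package type is an easy mistake, so a quick file name check before you start can save a reboot cycle. The pattern below is inferred from the single file name used in this post (junos-install-media-net-ex-arm-32-21.4R1.12.tgz); other platforms or release naming schemes may not match, and the function name is my own.

```python
import re
from typing import Optional, Tuple

# junos-install-media-net-<platform>-<version>.tgz, e.g. version 21.4R1.12
IMAGE_RE = re.compile(
    r"^junos-install-media-net-(?P<platform>.+)-"
    r"(?P<version>\d+\.\d+[A-Z]\d+(?:\.\d+)?)\.tgz$"
)


def parse_install_image(filename: str) -> Optional[Tuple[str, str]]:
    """Return (platform, version) for an install-media image name, else None."""
    match = IMAGE_RE.match(filename)
    if match is None:
        return None
    return match.group("platform"), match.group("version")


print(parse_install_image("junos-install-media-net-ex-arm-32-21.4R1.12.tgz"))
```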

  • Download the image into your TFTP server.

In my case, the TFTP server is a Linux box. If you prefer Windows, then TFTPd3264 is the way to go. For macOS, look here.

root@tftp:/srv/tftp# wget -O junos-install-media-net-ex-arm-32-21.4R1.12.tgz  'https://cdn.juniper.net/software/junos/21.4R1.12/junos-install-media-net-ex-arm-32-21.4R1.12.tgz?SM_USER=jv......5ce43fbdad2'
Resolving cdn.juniper.net (cdn.juniper.net)... 23.78.40.231
Connecting to cdn.juniper.net (cdn.juniper.net)|23.78.40.231|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 393745989 (376M) [application/octet-stream]
Saving to: ‘junos-install-media-net-ex-arm-32-21.4R1.12.tgz’

junos-install-media-net-ex-arm-32-21.4R1.12.tgz      100%[==================================================================>] 375.50M  3.48MB/s    in 2m 4s

2022-01-26 20:47:46 (3.03 MB/s) - ‘junos-install-media-net-ex-arm-32-21.4R1.12.tgz’ saved [393745989/393745989]

root@tftp:/srv/tftp# ls
junos-install-media-net-ex-arm-32-21.4R1.12.tgz
  • Now let’s reboot the switch and interrupt the “first” boot loader. Keep hitting Ctrl+C after the reboot; when you see the “=>” prompt, you are in the right place. Here we set the IP address on the me0 interface and boot into the next boot loader.
Board: EX3400-24T
Base MAC: C00380FAAD2E
arm_clk=1000MHz, axi_clk=500MHz, apb_clk=125MHz, arm_periph_clk=500MHz
Net:   Registering eth
Broadcom BCM IPROC Ethernet driver 0.1
Using GMAC0 (0x18022000)
et0: ethHw_chipAttach: Chip ID: 0xdc14; phyaddr: 0x1
et0: gmac_serdes_init read sdctl(0xf4141c)
et0: gmac_serdes_init() serdes_status0: 0xf100ff00; serdes_status1: 0xf00
et0: gmac_serdes_init() PLL ready brought up exit
serdes_reset_core pbyaddr(0x1) id2(0xf)
bcmiproc_eth-0
Last Reset Reason: 0
Hit ^C to stop autoboot:  0
=>setenv ipaddr 10.1.100.253
=>setenv gatewayip 10.1.100.1
=>setenv netmask 255.255.255.0
=>setenv serverip 10.1.101.130
=>save
=>boot
Saving Environment to SPI Flash...
SF: Detected MX25L6405D with page size 256 Bytes, erase size 64 KiB, total 8 MiB, mapped at 0001faa0
Erasing SPI flash...Writing to SPI flash...done
Erasing SPI flash...Writing to SPI flash...done
SF: Detected MX25L6405D with page size 256 Bytes, erase size 64 KiB, total 8 MiB
device 0 offset 0x3c0000, size 0x10000
SF: 65536 bytes @ 0x3c0000 Read: OK
  • Wait a few seconds for the next boot loader to appear and press Ctrl+C again. Now you will see a menu. In this menu, choose 5 and then 5 again, and you should see “loader>”.
Hit ^C to stop autoboot:  0 
Options Menu

1.  Recover [J]unos volume
2.  Recovery mode - [C]LI

3.  Check [F]ile system
4.  Enable [V]erbose boot
5.  [B]oot prompt
6.  [M]ain menu
Choice: 
Type 'menu' to go back to the menu
Type 'boot-junos' to boot into Junos
Type 'reboot' to reboot

5 5
  • We now run the install with the format option, pointing it at the TFTP location of the image we downloaded in the first step.
Type '?' for a list of commands, 'help' for more detailed help.
loader> install --format tftp://10.1.101.130/junos-install-media-net-ex-arm-32-21.4R1.12.tgz
/kernel text=0x105b888 data=0x640fc+0x1fbf04 syms=[0x4+0x914a0+0x4+0x9b821]
/ex3400.dtb size=0x1f76
/crypto.ko text=0x419e0 data=0xe58+0x2a0 syms=[0x4+0x4740+0x4+0x2ba5]
/iflib.ko text=0x11f10 data=0x910+0x58 syms=[0x4+0x2b10+0x4+0x2194]
/miibus.ko text=0x19f38 data=0x10c4+0x78 syms=[0x4+0x51f0+0x4+0x3491]
/if_gmac.ko text=0xbc3c data=0x688+0xc syms=[0x4+0x1cc0+0x4+0x15ad]
/contents.iso size=0x279b000
Using DTB from loaded file '/ex3400.dtb'.
Kernel entry at 0xc1000180...
Kernel args: (null)
---<<BOOT>>---
GDB: no debug ports present
K cache
Release APs
WARNING: WITNESS option enabled, expect reduced performance.
mwill now attempt to reach the remote host.
<====== LOADS OF OUTPUT TO CONSOLE ======>
<====== LOADS OF OUTPUT TO CONSOLE ======>
Downloading /junos-install-media-net-ex-arm-32-21.4R1.12.tgz from 10.1.101.130 ...
rmed on 1024 samples passed.t-up health tests perfo
  300.6MB  03:52random: unblocking device.
  393.7MB  05:04
Installing Junos OS release ...

After 15-20 minutes, the install will have finished and the switch is ready for you to log in and start loading your config.

FreeBSD/arm (Amnesiac) (ttyu0)
login: 
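One detail from the first boot loader step is easy to miss: the TFTP server (10.1.101.130) is not in the same subnet as the switch (10.1.100.253/24), which is why gatewayip has to be set. Here is a small sketch with Python's standard ipaddress module that checks this before you type the setenv lines. The function names are hypothetical; the addresses are the ones used in this post.

```python
import ipaddress


def needs_gateway(ipaddr: str, netmask: str, serverip: str) -> bool:
    """True when the TFTP server is outside the switch's own subnet."""
    local_net = ipaddress.ip_network(f"{ipaddr}/{netmask}", strict=False)
    return ipaddress.ip_address(serverip) not in local_net


def gateway_reachable(ipaddr: str, netmask: str, gatewayip: str) -> bool:
    """True when the gateway sits inside the switch's own subnet."""
    local_net = ipaddress.ip_network(f"{ipaddr}/{netmask}", strict=False)
    return ipaddress.ip_address(gatewayip) in local_net


print(needs_gateway("10.1.100.253", "255.255.255.0", "10.1.101.130"))
print(gateway_reachable("10.1.100.253", "255.255.255.0", "10.1.100.1"))
```

If the first check is true and the second is false, the loader will not be able to reach the TFTP server no matter how long you wait.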

Conclusion

This is a very helpful process that comes in handy when new switches arrive with old firmware that needs to be upgraded. Skipping the smaller version jumps is a time saver.

This format install process can also be done with a USB key. This process is also quite simple but requires you to have physical access to the switch.

In my case, I have console access over SSH and can manage the switch out-of-band, so TFTP is the easy way.