More and more customers seem to be requesting ERSPAN for Network Traffic Analyzer appliances. The companies behind them promise to analyze your VM network traffic and give security tips based on what’s happening in your environment.
NSX supports multiple ways of doing SPAN (Switched Port Analyzer), also known as port mirroring. I have listed the two useful variants first. The last two are very limited and not very useful.
- Logical SPAN – The source and destination must be in the same overlay segment. The source will be VMs, and each SPAN entry will be limited to 5 VMs.
- Remote L3 SPAN – The source can be an overlay segment and the destination must be L3 reachable from all eligible host VMKernel ports.
There are two more, accessible from a policy view, but I don’t find them useful.
- Local SPAN – The source and destination must be on the same host.
- Remote SPAN – The source and destination must be L2 reachable.
For this article, we will look at how to implement ERSPAN. With this setup, the hosts running the VMs are the ones that create the port mirror, encapsulate it in GRE, and send it off to its destination. This means we need to create an overlay segment that a VMkernel port from every host can connect to.
What is ERSPAN?
Packets are encapsulated in GRE, and additional headers are added to the original packet. From the inside out, the layers are:
- Original Ethernet Frame: The packet being mirrored (source traffic).
- ERSPAN Header: Metadata, including session ID, VLAN ID, timestamp, and sequence number.
- GRE (Generic Routing Encapsulation) Header: Wraps the ERSPAN traffic to enable transport over an IP network.
- Outer IP Header: Specifies the source and destination IP addresses of the ERSPAN session, enabling routing across the network.
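To make the layering concrete, here is a minimal Python sketch that unpacks the GRE and ERSPAN headers of a mirrored packet. It assumes ERSPAN Type II (GRE protocol type 0x88BE followed by an 8-byte ERSPAN header) and that the outer IP header has already been stripped. The function name `parse_erspan_type2` is made up for this example, and whether your NSX session uses Type II depends on the encapsulation type you configure.

```python
import struct

GRE_PROTO_ERSPAN_II = 0x88BE  # GRE protocol type used by ERSPAN Type II


def parse_erspan_type2(gre_packet: bytes) -> dict:
    """Parse GRE + ERSPAN Type II headers; input starts at the GRE header."""
    flags, proto = struct.unpack_from("!HH", gre_packet, 0)
    if proto != GRE_PROTO_ERSPAN_II:
        raise ValueError(f"not ERSPAN Type II (GRE protocol 0x{proto:04x})")

    offset = 4
    if flags & 0x8000:  # C bit: checksum + reserved field present
        offset += 4
    if flags & 0x2000:  # K bit: key field present
        offset += 4
    sequence = None
    if flags & 0x1000:  # S bit: sequence number present
        (sequence,) = struct.unpack_from("!I", gre_packet, offset)
        offset += 4

    # ERSPAN Type II header: two 32-bit words
    word1, word2 = struct.unpack_from("!II", gre_packet, offset)
    offset += 8

    return {
        "sequence": sequence,
        "version": word1 >> 28,              # 1 means Type II
        "vlan": (word1 >> 16) & 0x0FFF,      # VLAN of the mirrored frame
        "session_id": word1 & 0x03FF,        # ERSPAN session ID
        "index": word2 & 0xFFFFF,            # port/index field
        "inner_frame": gre_packet[offset:],  # the original Ethernet frame
    }
```

Feeding it the bytes that follow the outer IP header returns the session ID and VLAN, plus the untouched original frame in `inner_frame`.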
Environment:
Here is a quick drawing of what we are doing. ERSPAN mirrors the customer network traffic and sources it from the hosts’ VMkernel ports into the mirror segment. NSX is told to ship the ERSPAN traffic to the collector IP.
First, we need to create a VMkernel port for the mirror network. In my case, I created an overlay segment dedicated to the mirror traffic. On the two hosts in my setup, I created a VMkernel port with an IP in the mirror segment subnet.
My collector VM then has a NIC and an IP in the mirror segment, and this is where we are going to send the ERSPAN traffic.
As the source I have the customer network segment, and the destination is the IP of the collector VM.
On the collector VM, a dedicated NIC is used for the ERSPAN traffic. Running Wireshark on that interface with the option “ip proto 0x2f” (IP protocol 47, GRE) captured only the mirrored traffic, and Wireshark decapsulated the GRE and showed the original packets.
In the example above I’m following a TCP stream where a client is accessing an IIS server on 10.44.44.10. Here we can see all the packets exchanged between the client and the server.
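For readers who want to see what the “ip proto 0x2f” filter actually matches, the following Python sketch performs the same check on a raw IPv4 packet: it reads the protocol field of the IP header and only hands back the payload when that field is 47 (GRE). The function name `gre_payload` is hypothetical, and this only illustrates the match; it is not how Wireshark implements its filters.

```python
IPPROTO_GRE = 0x2F  # 47, the value matched by "ip proto 0x2f"


def gre_payload(ip_packet: bytes):
    """Return the GRE payload of an IPv4 packet, or None if it is not GRE."""
    if len(ip_packet) < 20:
        return None  # too short to hold an IPv4 header
    version_ihl = ip_packet[0]
    if version_ihl >> 4 != 4:
        return None  # not IPv4
    header_len = (version_ihl & 0x0F) * 4  # IHL is counted in 32-bit words
    if header_len < 20 or len(ip_packet) < header_len:
        return None  # malformed header length
    if ip_packet[9] != IPPROTO_GRE:  # byte 9 is the protocol field
        return None
    return ip_packet[header_len:]  # starts at the GRE header
```

Anything that is not GRE, such as ordinary TCP traffic to the collector’s other services, is skipped, which is why the capture only shows the mirrored packets.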
Performance
Let’s just say it’s not free in terms of resource usage. Duplicating and encapsulating the traffic takes up some CPU resources on the host. DPUs can offload this if you want, but that also comes with a cost.
I haven’t been able to find out the real impact of ERSPAN in terms of resources. If you know a way to see how many CPU cycles etc. are used to create ERSPAN, please let me in on this secret.
Performance test
Here, two VMs on the customer network segment are running iperf. The hosts are connected with a 10 Gbit interface, and the iperf setup below saturates this interface.
iperf3.exe -c 10.44.44.10 -P2 -t 1000
In Wireshark we can see that we receive all the packets, but I would expect more bandwidth. I only see about 9 Mbit of traffic in the Windows Task Manager. My expectation would be that Wireshark receives the complete packets, so the ERSPAN traffic should match the iperf traffic.
Again, if you know why the ERSPAN traffic does not match the monitored segment’s bandwidth usage, let me know.
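As a rough sanity check on that expectation, the sketch below estimates how much mirror bandwidth a given amount of source traffic should produce. It rests on several assumptions that are not confirmed for this setup: ERSPAN Type II encapsulation, an outer IPv4 header without options, full-size 1514-byte frames, and every frame mirrored exactly once with no truncation or drops. The function name and constants are illustrative only.

```python
OUTER_ETHERNET = 14  # outer Ethernet header, bytes
OUTER_IPV4 = 20      # outer IPv4 header without options
GRE_WITH_SEQ = 8     # 4-byte base GRE header + 4-byte sequence number
ERSPAN_TYPE2 = 8     # ERSPAN Type II header (assumed encapsulation)

# Bytes added in front of every mirrored frame under the assumptions above
OVERHEAD = OUTER_ETHERNET + OUTER_IPV4 + GRE_WITH_SEQ + ERSPAN_TYPE2


def expected_mirror_mbps(source_mbps: float, frame_bytes: int = 1514) -> float:
    """Estimate mirror bandwidth when each frame of frame_bytes is mirrored once."""
    return source_mbps * (frame_bytes + OVERHEAD) / frame_bytes
```

Under these assumptions the mirror stream should be a few percent larger than the source traffic, not smaller, so a much lower number on the collector points to something being dropped, truncated, or measured differently.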
Conclusion
NSX can indeed deliver mirror port data. The VMware documentation states that it’s a feature for troubleshooting and not for permanent use. I talked to a VMware employee at Explore 2024 who said that this might be an obsolete statement in the documentation, but that they have had cases where the network got completely exhausted because port mirroring was enabled on networks that were also carrying the ERSPAN traffic, which of course creates a traffic loop.
I have been running this setup for a couple of months now, and it seems to be stable. A way to help ensure that ERSPAN traffic does not flood your network would be to implement QoS on the overlay segment carrying the ERSPAN traffic. This way, the traffic will be dropped.