ERSPAN with NSX

More and more customers seem to be requesting ERSPAN for network traffic analyser appliances. The companies behind them promise to analyze your VM network traffic and give security tips on what’s happening in your environment.

NSX supports multiple ways of doing SPAN (Switched Port Analyser), or port mirroring. I have listed the two useful variants first. The last two are very limited and not very useful.

  1. Logical SPAN – The source and destination must be in the same overlay segment. The sources are VMs, and each SPAN entry is limited to 5 VMs.
  2. Remote L3 SPAN – The source can be an overlay segment, and the destination must be L3 reachable from the VMkernel ports of all eligible hosts.

There are two more, accessible from the policy view, but I don’t find them useful.

  1. Local SPAN – The source and destination must be on the same host.
  2. Remote SPAN – The source and destination must be L2 reachable.

For this article, we will look at how we can implement ERSPAN. With this setup, the hosts running the source VMs create the port mirror, encapsulate it in GRE, and send it off to its destination. This means we need to create an overlay segment that we can connect a VMkernel port from every host to.

What is ERSPAN?

Packets are encapsulated in GRE, and additional headers are added to the original packet. Details below:

  1. Original Ethernet frame: The packet being mirrored (source traffic).
  2. ERSPAN header: Metadata, including session ID, VLAN ID, timestamp, and sequence number.
  3. GRE (Generic Routing Encapsulation) header: Wraps the ERSPAN traffic to enable transport over an IP network.
  4. Outer IP header: Specifies the source and destination IP addresses of the ERSPAN session, enabling routing across the network.
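
To make the layering concrete, here is a minimal Scapy sketch (Scapy is not an NSX tool, and the MAC/IP addresses below are made-up placeholders) that stacks the headers the way an ERSPAN Type II packet is built: outer IP, GRE with protocol type 0x88BE, an 8-byte ERSPAN header, and then the original mirrored frame.

import struct
from scapy.all import Ether, IP, GRE, Raw, raw

# The original frame being mirrored (dummy values for illustration).
mirrored = Ether(src="00:50:56:aa:aa:aa", dst="00:50:56:bb:bb:bb") / IP(src="10.44.44.20", dst="10.44.44.10")

# ERSPAN Type II header: 4-bit version (1), 12-bit VLAN, COS/En/T bits,
# 10-bit session ID, then a reserved field and index (set to 0 here).
ver_vlan = (1 << 12) | 0                                   # version 1, VLAN 0
cos_en_t_session = (0 << 13) | (0 << 11) | (0 << 10) | 1   # session ID 1
erspan_header = struct.pack("!HHI", ver_vlan, cos_en_t_session, 0)

# Outer encapsulation: vmk mirror IP -> collector IP, GRE proto 0x88BE = ERSPAN II.
outer = (
    IP(src="192.168.99.11", dst="192.168.99.50")
    / GRE(proto=0x88BE)
    / Raw(erspan_header)
    / mirrored
)
print(len(raw(outer)), "bytes on the wire (before the outer Ethernet header)")
outer.show()

Running outer.show() prints each layer in turn, which matches the list above.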

Environment:

A quick drawing of what we are doing: ERSPAN mirrors the customer network traffic, and the hosts source it from their VMkernel ports into the mirror segment. NSX is told to ship the ERSPAN traffic off to the collector IP.

[Diagram: two ESXi hosts in dc1, each with a VDS, vmnic uplinks, and VMkernel ports for mgmt, vMotion, and mirror. The app2 VM sits on the “customer network” segment, and the collector VM sits on the “mirror network” segment.]

First, we need to create a VMkernel port for the mirror network. In my case, I created an overlay segment dedicated to the mirror traffic. On the two hosts in my setup, I made a VMkernel port with an IP in the mirror segment subnet.
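
As a rough sketch of the segment part, this is what creating a dedicated mirror segment could look like against the NSX Policy API. The manager address, credentials, segment name, and transport zone ID below are placeholders I made up, and the same thing can of course be done in the UI; the VMkernel ports are then created on that segment from vCenter as usual.

import requests

NSX = "https://nsx-manager.example.local"        # placeholder manager address
AUTH = ("admin", "VMware1!VMware1!")             # use a proper credential store
TZ_PATH = "/infra/sites/default/enforcement-points/default/transport-zones/<overlay-tz-id>"

segment = {
    "display_name": "mirror-segment",
    "transport_zone_path": TZ_PATH,
    # A gateway is only needed if the segment should be routed; an isolated
    # segment also works, since the hosts and collector share the subnet.
    "subnets": [{"gateway_address": "192.168.99.1/24"}],
}

resp = requests.patch(
    f"{NSX}/policy/api/v1/infra/segments/mirror-segment",
    json=segment,
    auth=AUTH,
    verify=False,   # lab only - use proper certificates in production
)
resp.raise_for_status()
print("Segment created/updated:", resp.status_code)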

My collector VM then has a NIC and an IP in the mirror segment, and this is where we are going to send the ERSPAN traffic.

As the source I have the customer network segment, and as the destination I have the IP of the collector VM.

On the collector VM, a dedicated NIC is used for the ERSPAN traffic. Using Wireshark on that interface with the capture filter “ip proto 0x2f” (GRE is IP protocol 47), I could decapsulate the GRE and see the original packets.
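
If you prefer to do the same thing in a script, a small Scapy sketch along these lines (the pcap file name is just an example) can strip the GRE and the 8-byte ERSPAN Type II header from a capture taken on the collector NIC and print the original mirrored frames:

from scapy.all import rdpcap, Ether, IP, GRE

for pkt in rdpcap("mirror.pcap"):           # capture from the collector NIC
    if GRE not in pkt:
        continue
    gre = pkt[GRE]
    if gre.proto != 0x88BE:                 # 0x88BE = ERSPAN Type II
        continue
    inner = Ether(bytes(gre.payload)[8:])   # skip the 8-byte ERSPAN header
    if IP in inner:
        print(inner[IP].src, "->", inner[IP].dst, inner.summary())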

In the example above, I’m following a TCP stream where a client is accessing an IIS server on 10.44.44.10. Here we can see all the packets exchanged between the client and the server.

Performance

Let’s just say it’s not free in terms of resources. Duplicating and encapsulating the traffic takes up some CPU on the host. DPUs can offload this if you want, but that also comes at a cost.

I haven’t been able to find out the real resource impact of ERSPAN. If you know a way to see how many CPU cycles etc. are used to create the ERSPAN traffic, please let me in on this secret 🙂

Performance test

Here, two VMs on the customer network segment run iperf (with the iperf3 server listening on 10.44.44.10). The hosts are connected with a 10 Gbit interface, and with the iperf setup below we saturate that interface.

iperf3.exe -c 10.44.44.10 -P2 -t 1000

In Wireshark we can see that we received all the packets, but I would expect more bandwidth: I only see about 9 Mbit of traffic in the Windows Task Manager. My expectation would be that Wireshark receives the complete packets, so the ERSPAN traffic should match the iperf traffic usage.
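
One way to sanity-check this (a sketch, again with Scapy and an example file name) is to take a timed capture on the collector NIC during the iperf run and compute the ERSPAN throughput from the capture itself, instead of relying on Task Manager:

from scapy.all import rdpcap, GRE

pkts = [p for p in rdpcap("erspan_test.pcap") if GRE in p]   # example file name
total_bits = sum(len(p) for p in pkts) * 8
duration = float(pkts[-1].time - pkts[0].time)               # seconds
print(f"{len(pkts)} GRE packets, {total_bits / duration / 1e6:.1f} Mbit/s of ERSPAN traffic")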

Again, if you know why the ERSPAN traffic does not match the monitored segment’s bandwidth usage, let me know 🙂

Conclusion

NSX can indeed deliver port mirror data. The VMware documentation states that it is a feature for troubleshooting and not for permanent use. I talked to a VMware employee at Explore 2024 who said this statement in the documentation might be obsolete, but also that they have had cases where a network got completely exhausted because port mirroring was enabled on networks that were also carrying the ERSPAN traffic, which of course creates a traffic loop.

I have been running this setup for a couple of months now, and it seems to be stable. A way to help ensure that the ERSPAN traffic does not flood your network would be to implement QoS on the overlay segment carrying the ERSPAN. This way the mirror traffic will be dropped once it exceeds the limit, rather than starving everything else; a sketch of what that could look like follows below.
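
For illustration only: the rough shape of a rate-limiting QoS profile created via the NSX Policy API. I have not verified the exact field names against a specific NSX release, so treat the body below as an assumption and check the API guide for your version (the manager address, credentials, and numbers are placeholders). The profile then has to be applied to the mirror segment as a segment QoS profile binding, in the UI or via the API.

import requests

NSX = "https://nsx-manager.example.local"     # placeholder manager address
AUTH = ("admin", "VMware1!VMware1!")          # placeholder credentials

# Assumed schema for a Policy API QoS profile - verify the field names
# ("shaper_configurations", "EgressRateLimiter", units) for your NSX version.
qos_profile = {
    "display_name": "erspan-rate-limit",
    "shaper_configurations": [
        {
            "resource_type": "EgressRateLimiter",
            "enabled": True,
            "average_bandwidth": 1000,        # example value
            "peak_bandwidth": 1000,
            "burst_size": 1024000,
        }
    ],
}

resp = requests.patch(
    f"{NSX}/policy/api/v1/infra/qos-profiles/erspan-rate-limit",
    json=qos_profile,
    auth=AUTH,
    verify=False,   # lab only
)
resp.raise_for_status()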

Jesper Ramsgaard