VMware NSX BGP Route maps

BGP route maps in VMware NSX allow administrators to manipulate BGP routing decisions based on specified conditions. They make it possible to influence route selection, adjust attributes, and filter routes to meet the needs of the network architecture.

Key Components of BGP Route maps:

Before we delve into practical examples, let’s review the fundamental components of BGP Route maps in VMware NSX:

  1. Match Conditions: These define the criteria against which routes are evaluated. Match conditions include route prefixes, AS paths, route origins, etc.
  2. Actions: Actions specify what should happen to routes that match the specified conditions. Common actions include setting attributes like local preference, AS path prepend, and route filtering.

A route map can set multiple attributes. In this section, we will look at two very common ways to control incoming and outgoing traffic: AS prepend and local preference.

AS Prepend

AS prepend is a technique that influences inbound traffic flow by manipulating the AS path attribute of outgoing BGP advertisements. Adding your own AS number multiple times to the AS path makes certain routes less preferable, and therefore more "expensive", for incoming traffic.

Local Preference

Local preference is an attribute used to influence outgoing traffic flow by indicating the preferred exit point for traffic leaving your network.

The setup

We already have our T0 set up with BGP peers to our upstream routers.

We also have a VM behind a T1 that is connected to the T0. From here we can run traceroutes to see which path the traffic will take.

A tracert from the VM shows us that we are currently egressing via the .185 peer.

How to

Create the prefix lists

If we haven't already, we need to define some prefix lists. This can be done with 0.0.0.0/0 or with a more specific prefix if we want to control exactly which IPs the route map will hit. Head into the NSX Manager, Networking, Tier-0 Gateways, and edit the T0. Under its Routing tab, you will see the prefix lists.

In this example, there is both a specific prefix list and an "any" list. We will use the 0.0.0.0/0 list here.

Create the route maps

Next, we need to prepare the route maps. They can also be found under the Tier-0 Routing section. Here we will define two route maps: one for AS prepend and one for local preference. (A rough API-based sketch follows below.)
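
For reference, route maps can also be created programmatically through the NSX Policy API instead of the UI. The sketch below is not part of the original walkthrough; the endpoint, payload fields, and the Tier-0, prefix-list, and route-map IDs are assumptions based on the Policy API, so verify them against your NSX version before using it.

# Rough sketch: create an AS-prepend route map on a Tier-0 via the NSX Policy API.
# $t0Id, the prefix-list ID "any", and the route-map ID "rm-as-prepend" are placeholder assumptions.
$nsxManager = "nsxt.home.lab"
$t0Id       = "my-t0"
$cred       = Get-Credential
$pair       = "{0}:{1}" -f $cred.UserName, $cred.GetNetworkCredential().Password
$headers    = @{ Authorization = "Basic " + [Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes($pair)) }

$routeMap = @{
    entries = @(
        @{
            action              = "PERMIT"
            prefix_list_matches = @("/infra/tier-0s/$t0Id/prefix-lists/any")   # match the 0.0.0.0/0 prefix list
            set                 = @{ as_path_prepend = "65010 65010 65010" }   # prepend own ASN three times (use local_preference instead for the other map)
        }
    )
}

Invoke-RestMethod -Method Patch -Uri "https://$nsxManager/policy/api/v1/infra/tier-0s/$t0Id/route-maps/rm-as-prepend" `
    -Headers $headers -ContentType "application/json" -Body ($routeMap | ConvertTo-Json -Depth 10)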

Apply the route maps

Now we have put together the route maps with their criteria and prefixes. Next, we need to apply the route maps to the BGP peers. Under the BGP section of the T0, press "BGP Neighbors".

Now we will add the local preference route map to the .177 peer. This will steer outgoing traffic for the specified prefixes towards this peer.

Caution: I have seen that if you set local preference on one BGP peer and the other peers don't have AS prepend configured, traffic will drop. I'm not exactly sure why, but I assume it could be due to asymmetric routing: egress on the peer with local preference and ingress on the other peer. When AS prepend is also set on the other peers, it works again.

Traffic from our test VM will now flow through the .177 peer instead of the .185 peer it used before. Incoming traffic also goes to the peer with the best route, that is, the peer that does not have any prepends on its AS path.

Conclusion

With BGP route maps in VMware NSX, administrators can control traffic flow precisely, optimizing network performance, resource utilization, or migrations. Experiment with different match conditions and actions to tailor route maps to the unique requirements of your NSX environment.

Cloud Director – LB Content Switching

Content switching is used to direct incoming traffic to different pools/servers based on the content of the requests. By looking at the application layer (L7 in the OSI model) we can inspect the client request content, such as URL, header, cookies, queries, etc.

ALB has the concept of policies on virtual services. Each policy rule can have a match and an action. So we can match on an HTTP request host header and then do a content switch action afterward.

This is, unfortunately, a feature that is not yet supported by Cloud Director (10.5.1). But it can be done in the ALB manager, behind the back of VCD. The rules do show up in the VCD UI, but the content of the policy does not.

My guess is therefore that when the feature comes to the VCD UI in the future, it will be able to adopt the existing content-switching policies.

If you also think this feature should be implemented in VCD, please do a feature request on this site.

Workaround

In Cloud Director you set up the pools you want to do content switching on. From the screenshot below you can see that port80 and port443 are the ones being used today. Creating the extra pools here makes VCD aware of them, so we only need to change a bit on the existing objects in the ALB manager.

In the ALB manager, find the virtual service, edit it, and navigate to the policy section. Here we can see the different policy types. Find "HTTP Request" and add new rules.

Each rule has the match and action parts mentioned before. When choosing the content switch action, you get to select a specific pool.

Save it all and you are ready to test the content switching. From the screenshot below, you can see how it will look inside the VCD UI.

NSX V2T migration

The reason I researched content switching was primarily that some tenants are still on NSX-V because they use haproxy application rules to do content switching. The NSX Migration for Cloud Director tool does support basic load balancing migration, but not when application rules are applied.

Now the tenant can schedule a service window, remove the application rules, and have the migration tool migrate all NAT, firewall, routing, and basic load balancing.

After the migration is done, the content switching rules can be created manually, based on what the haproxy application rules specified.

Conclusion

Even if it's not supported by VCD yet, we can still do it. Of course the tenants can't do it themselves; they will need to log a support case with you until the feature is introduced. And when it is, let's hope it will adopt the rules created behind its back.

Cloud Director PowerShell deploy

Having to deploy multiple Cloud Director environments can be tedious, especially when the OVF deployment in vCenter keeps failing with some obscure error. Using PowerShell to deploy is smooth sailing.

Here you will get the small piece of PowerShell to do so. It's very basic but works every time.

Connect-VIServer -server vcsa.home.lab

$OVA = "C:\teknik\VMware_Cloud_Director-10.4.1.9057-20912720_OVF10.ova"

# vCenter placement (cluster and datastore) to deploy to
$VMCluster = "mgmt01"
$VMDatastore = "vSan"

# VCD Configuration
$VMName = "vcd1-03"

$VMNetwork1 = "vcd_vCloud_External_Perimeter"
$VMNetwork2 = "vcd_vCloud_Internal_Perimeter"

$VMNetworkIP1 = "10.66.66.31"
$VMNetworkIP2 = "10.55.55.31"
$VMNetmask = "255.255.255.0"
$VMGateway = "10.66.66.1"

$VMDNS = "10.44.44.10"
$VMNTP = "ntp.home.lab"
$VMSearchPath = "home.lab"

$eth0Routes = ""
$eth1Routes = "10.55.55.1 10.44.44.0/24, 10.55.55.1 10.1.94.0/24"

$DeploymentSize = "standby-medium"
$RootPassword = ''

$cluster = Get-Cluster $VMCluster
$vmhost = $cluster | Get-VMHost | Get-Random
$datastore = $vmhost | Get-Datastore $VMDatastore

# Setup ovfconfig
$OvfConfig = Get-OvfConfiguration $OVA

$OvfConfig.DeploymentOption.Value = $DeploymentSize

$OvfConfig.NetworkMapping.eth0_Network.Value = $VMNetwork1
$OvfConfig.NetworkMapping.eth1_Network.Value = $VMNetwork2

$OvfConfig.vami.VMware_vCloud_Director.ip0.Value =  $VMNetworkIP1
$OvfConfig.vami.VMware_vCloud_Director.ip1.Value =  $VMNetworkIP2
$OvfConfig.vami.VMware_vCloud_Director.netmask0.Value = $VMNetmask
$OvfConfig.vami.VMware_vCloud_Director.netmask1.Value = $VMNetmask
$OvfConfig.vami.VMware_vCloud_Director.gateway.Value = $VMGateway
$OvfConfig.vami.VMware_vCloud_Director.DNS.Value = $VMDns
$OvfConfig.vami.VMware_vCloud_Director.domain.Value = $VMName
$OvfConfig.vami.VMware_vCloud_Director.searchpath.Value = $VMSearchPath

$OvfConfig.vcloudapp.VMware_vCloud_Director.enable_ssh.Value = $true
$OvfConfig.vcloudapp.VMware_vCloud_Director.expire_root_password.Value = $false
$OvfConfig.vcloudapp.VMware_vCloud_Director.ntp_server.Value = $VMNTP
$OvfConfig.vcloudapp.VMware_vCloud_Director.varoot_password.Value = $RootPassword

$OvfConfig.vcloudnet.VMware_vCloud_Director.routes0.Value = $eth0Routes
$OvfConfig.vcloudnet.VMware_vCloud_Director.routes1.Value = $eth1Routes

# Deploy
Import-VApp -Source $OVA -OvfConfiguration $OvfConfig -Name $VMName -Location $cluster -VMHost $vmhost -Datastore $datastore -DiskStorageFormat thin
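
Once Import-VApp completes, a small optional follow-up from the same PowerCLI session (using the $VMName variable above) could be to power on the appliance and wait for VMware Tools:

# Power on the freshly deployed VCD appliance and wait for VMware Tools to respond
$vm = Get-VM -Name $VMName
Start-VM -VM $vm
Wait-Tools -VM $vm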

ESXi VM PCAP collection

This is a small guide on how to collect PCAPs for network traffic from/to a virtual machine running on VMware ESXi. Scripts have been provided by VMware GSS and can be found here and used at your own risk.

(I get it, who would want to use Python scripts downloaded from a random website and run them on your ESXi, feel free to look into the code after download, or else request them from VMware)

The scripts can help collect on multiple PortIDs, divide the PCAPs into smaller files, rotate the logs every x minutes, and monitor the ongoing capture.

If you don’t want to use the scripts, then there is also a more manual way to run the collection here.

Finding the VM port IDs

Connect to the ESXi host with SSH and list the running VMs.

[root@dc1esxedge1-02:~] esxcli network vm list
World ID  Name              Num Ports  Networks
--------  ----------------  ---------  --------
68018705  nsxedgenode1             3  dvportgroup-914135, dvportgroup-914135 

Find the VM that you want to collect from, let’s take WorldID 68018705 and check the switch port IDs for it

[root@esxedge1-02:~] esxcli network vm port list -w 68018092
   Port ID: 67108942
   vSwitch: DSVNSX
   Portgroup: dvportgroup-914135
   DVPort ID: 109
   MAC Address: 00:50:56:ab:82:59
   IP Address: 0.0.0.0
   Team Uplink: vmnic2
   Uplink Port ID: 2214592556
   Active Filters:

   Port ID: 67108943
   vSwitch: DSVNSX
   Portgroup: dvportgroup-914135
   DVPort ID: 108
   MAC Address: 00:50:56:ab:06:b0
   IP Address: 0.0.0.0
   Team Uplink: vmnic2
   Uplink Port ID: 2214592556
   Active Filters:

   Port ID: 67108944
   vSwitch: DSVNSX
   Portgroup: dvportgroup-914134
   DVPort ID: 98
   MAC Address: 00:50:56:ab:93:ab
   IP Address: 0.0.0.0
   Team Uplink: vmnic1
   Uplink Port ID: 2214592558
   Active Filters:

In this case, we note down the following port IDs and save them for later

67108942
67108943
67108945
67108946

Start the collection with scripts

Upload the Python scripts to /tmp/ on the ESXi host. You also need a folder on a local datastore of the ESXi host where the PCAP files can be stored.

Prepare and run the first script where we provide the PortIDs and where to store the PCAPs

[root@esxedge1-02:~] python /tmp/prepare_pktcap.py -p 67108942 -p 67108943 -p 67108945 -p67108946 -u -d /vmfs/volumes/prd.dc1esxedge1-02/capture/42434546 -o /tmp/rotating_cap.sh -G 15m -r 3600
The current vmci heap is 99% free and so far there have been 0 allocation failures
[root@esxedge1-02:~]

Next, we need to start the rotating log script. This will start the PCAP collection threads and rotate the logs every 15 minutes, as we defined in the prepare_pktcap script. The script should be left running in the SSH connection; if you stop it, the collection will also stop.

[root@esxedge1-02:~] /tmp/rotating_cap.py
Dump: 294272, broken : 0, drop: 0, file err: 0.
Dump: 175296, broken : 0, drop: 0, file err: 0.
Dump: 371584, broken : 0, drop: 0, file err: 0.
Dump: 317248, broken : 0, drop: 0, file err: 0.
Dump: 371648, broken : 0, drop: 0, file err: 0.
Dump: 317312, broken : 0, drop: 0, file err: 0.

Monitor the collection

The last script monitors the capture sessions. You will need to open a new SSH session to the host to run it.

[root@esxedge1-02:~] /tmp/pktcap_sessions.py -l
The vmci heap is 58% free
session    portID       devName
579        67108942
580        67108942
581        67108945
582        67108946
583        67108943
584        2214592556   vmnic2
585        67108946
586        2214592556   vmnic2
587        67108943
588        67108945
589        2214592558   vmnic1
590        2214592558   vmnic1

From the output, we can see that it collects on the PortIDs that we have defined, but it also collects PCAPs for the VMNICs. This is valuable if we need to compare what traffic comes in on the host versus what reaches the VM, and vice versa.

Looking at the output that is stored on the datastore

[root@esxedge1-02:/vmfs/volumes/5fd89aca-1f6f344a-f65f-043f72c0064a/capture/42434546] ls
dc1esxedge1-02_2023-11-29T07_42_11_p67108942_d0_sna.pcap
          ....
dc1esxedge1-02_2023-11-29T07_42_11_p67108945_d0_sna.pcap.log      dc1esxedge1-02_2023-11-29T07_42_11_pvmnic2_d0_sna.pcap_rot.log
dc1esxedge1-02_2023-11-29T07_42_11_p67108945_d0_sna.pcap_rot.log  dc1esxedge1-02_2023-11-29T07_42_11_pvmnic2_d1_sna.pcap
dc1esxedge1-02_2023-11-29T07_42_11_p67108945_d1_sna.pcap          dc1esxedge1-02_2023-11-29T07_42_11_pvmnic2_d1_sna.pcap.log
dc1esxedge1-02_2023-11-29T07_42_11_p67108945_d1_sna.pcap.log      dc1esxedge1-02_2023-11-29T07_42_11_pvmnic2_d1_sna.pcap_rot.log
dc1esxedge1-02_2023-11-29T07_42_11_p67108945_d1_sna.pcap_rot.log  
killfile

We can see the PCAPs for each of the PortIDs and VMnics, and it will rotate every 15 minutes.

Stopping the collection

You might have noticed the "killfile" from above. Remove this file and the collection will stop.

[root@esxedge1-02:~] rm -rf /vmfs/volumes/5fd89aca-1f6f344a-f65f-043f72c0064a/capture/42434546/killfile

Conclusion

The scripts are handy because they rotate the logs and have a way to monitor and kill the collections. This way we don’t have to manually kill processes on the ESXi host.

When the collection is done, you can copy out the logs for further analysis. I found the FileZilla SFTP client to be the fastest way of copying out the data.
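
If you already have a PowerCLI session open, Copy-DatastoreItem is another way to pull the capture files off the datastore. A small sketch, assuming the datastore and folder names from the example above:

# Map the datastore as a PSDrive and copy the PCAP files to a local folder
$ds = Get-Datastore -Name "prd.dc1esxedge1-02"
New-PSDrive -Name cap -PSProvider VimDatastore -Root "\" -Location $ds | Out-Null
Copy-DatastoreItem -Item cap:\capture\42434546\*.pcap -Destination C:\temp\pcaps\
Remove-PSDrive -Name cap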

If you want to merge the PCAPs afterward, on macOS you can do it with

mergecap -w merged.pcap *.pcap

Troubleshooting

If you find that you can't start the rotating logs script, it might be because you have tried to start it before and it somehow stalled. You can find the process IDs for it and kill them manually.

[root@esxedge1-02:~] ps -Tcjstv | grep -i rotating_cap

[root@esxedge1-02:~] kill <PID>

Download

Just Enough vSphere rights for replication

Just enough rights for a user to see and manage VMs in order to set up migration replication with either Azure Migrate or VMware Availability.

We are seeing more and more VMs being moved between providers. Just a few years back this was not something that anybody wanted to get into, but the market is in a transition where offboarding is as important as onboarding. Good for the customer.

To ensure that the rights are just enough so that Azure Migrate or VMware Availability Onprem can’t see all VMs in your datacenter you need to limit the rights that the appliance is given.

To help out, I have created a small script that sets up the permissions.

It will create two roles, one global and one for the tenant resource pool. Then it will set up the permissions so they are just enough for the replication to work.

If you do not want the VCDA plugin in vCenter, you can remove the lines that add the "Extension" privileges.

$roleGlobal = "vcda_repl_global"
$roleTenant = "vcda_repl_tenant"
$viserver = "vcsa1.home.lab"
$tenantRespool = "tenant1.comp1.01 (6284cdf1-cb7f-43bb-8e0f-09f439e09555)"
$vsphereUser = "tenant1_vcda"

Connect-VIServer -server $viserver

$roleIds = @()
$roleIds += "System.Anonymous"
$roleIds += "System.View"
$roleIds += "System.Read"
### Cryptographic Operations
$roleIds += "Cryptographer.ManageKeys"
$roleIds += "Cryptographer.RegisterHost"
### Datastore Privileges
$roleIds += "Datastore.Browse"
$roleIds += "Datastore.Config"
$roleIds += "Datastore.FileManagement"
### Extension Privileges - Not needed if you dont want plugin to vcenter 
$roleIds += "Extension.Register"
$roleIds += "Extension.Unregister"
$roleIds += "Extension.Update"
### Host Configuration Privileges
$roleIds += "Host.Config.Connection"
### Profile-driven Storage Privileges
$roleIds += "StorageProfile.View"
### Storage Views Privileges
$roleIds += "StorageViews"
### Host.Hbr.HbrManagement
$roleIds += "Host.Hbr.HbrManagement"

$roleIdsTenant = @()
### Resource Privileges
$roleIdsTenant += "Resource.AssignVMToPool"
### Virtual Machine Configuration Privileges
$roleIdsTenant += "VirtualMachine.Config.AddExistingDisk"
$roleIdsTenant += "VirtualMachine.Config.Settings"
$roleIdsTenant += "VirtualMachine.Config.RemoveDisk"
### Virtual Machine Inventory Privileges
$roleIdsTenant += "VirtualMachine.Inventory.Register"
$roleIdsTenant += "VirtualMachine.Inventory.Unregister"
### Virtual Machine Interaction
$roleIdsTenant += "VirtualMachine.Interact.PowerOn"
$roleIdsTenant += "VirtualMachine.Interact.PowerOff"
### Virtual Machine State Privileges
$roleIdsTenant += "VirtualMachine.State.CreateSnapshot"
$roleIdsTenant += "VirtualMachine.State.RemoveSnapshot"
### Virtual Machine Replication Privileges
$roleIdsTenant += "VirtualMachine.Hbr.ConfigureReplication"
$roleIdsTenant += "VirtualMachine.Hbr.ReplicaManagement"
$roleIdsTenant += "VirtualMachine.Hbr.MonitorReplication"


New-VIRole -name $roleGlobal -Privilege (Get-VIPrivilege -Server $viserver -id $roleIds) -Server $viserver
Set-VIRole -Role $roleGlobal -AddPrivilege (Get-VIPrivilege -Server $viserver -id $roleIds) -Server $viserver

New-VIRole -name $roleTenant -Privilege (Get-VIPrivilege -Server $viserver -id $roleIdsTenant) -Server $viserver
Set-VIRole -Role $roleTenant -AddPrivilege (Get-VIPrivilege -Server $viserver -id $roleIdsTenant) -Server $viserver

$globalPrivileges = Get-VIPrivilege -Role $roleGlobal

$rootFolder = Get-Folder -NoRecursion
$permission1 = New-VIPermission -Entity $rootFolder -Principal (Get-VIAccount -Domain vsphere.local -User $vsphereUser ) -Role $roleGlobal -Propagate:$false

$tenantRespool = Get-ResourcePool -Name $tenantRespool
$permission1 = New-VIPermission -Entity  $tenantRespool -Principal (Get-VIAccount -Domain vsphere.local -User $vsphereUser ) -Role $roleTenant -Propagate:$true
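
After running the script, a quick check that the roles and permissions ended up as intended could look like this (same PowerCLI session and variables as above):

# List how many privileges ended up in each role
Get-VIRole -Name $roleGlobal, $roleTenant |
    Select-Object Name, @{Name = 'PrivilegeCount'; Expression = { (Get-VIPrivilege -Role $_).Count }}

# Show the permissions assigned to the replication user
Get-VIPermission | Where-Object { $_.Principal -like "*$vsphereUser*" } |
    Select-Object Entity, Principal, Role, Propagate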

VCD – Force delete network

In our v2t conversion, the NSX for Cloud Director migration tool has had some issues when doing cleanup. One of them is that it can't delete the old NSX-V backed network even though nothing in VCD is using it anymore. The error message can be seen below.

2023-05-22 10:54:28,551 [connectionpool]:[_make_request]:452 [DEBUG] [tenant.01] | https://vcd.ramsgaard.me:443 "DELETE /cloudapi/1.0.0/orgVdcNetworks/urn:vcloud:network:ce108a33-fa5c-4cae-8c16-60edd536ad20 HTTP/1.1" 400 None
2023-05-22 10:54:28,556 [vcdOperations]:[deleteOrgVDCNetworks]:1090 [DEBUG] [tenant.01] | Failed to delete Organization VDC Network lan.[ 1ca6fd03-de82-4835-b12e-58c5c043b2bc ] Network lan cannot be deleted, because it is in use by the following vApp Networks: lan.
2023-05-22 10:54:28,556 [vcdNSXMigratorCleanup]:[run]:230 [ERROR] [tenant.01] | Failed to delete Org VDC networks ['lan'] - as it is in use
Traceback (most recent call last):
  File "src\vcdNSXMigratorCleanup.py", line 218, in run
  File "<string>", line 1, in <module>
  File "src\core\vcd\vcdValidations.py", line 53, in inner
  File "src\core\vcd\vcdOperations.py", line 1094, in deleteOrgVDCNetworks
Exception: Failed to delete Org VDC networks ['lan'] - as it is in use

I found someone else having this problem, and they discovered a forceful way to delete the network. I have used this but wrapped it in PowerShell instead. In my case, I could get the network URN from the migration tool log. You can also easily see the URN in the GUI URL when in the context of the network.

### Variables
$vcdUrl = "https://vcd.ramsgaard.me"
$apiusername = "@system"
$password = ''
$networkUrn = "urn:vcloud:network:ce108a33-fa5c-4cae-8c16-60edd536ad20"

### Auth against API and enable TLS1.2 for PowerShell
$base64AuthInfo = [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes(("{0}:{1}" -f $apiusername,$password)))
[System.Net.ServicePointManager]::SecurityProtocol = [System.Net.SecurityProtocolType]::Tls12
$auth =Invoke-WebRequest -Uri "$vcdUrl/api/sessions" -Headers @{Accept = "application/*;version=36.0";Authorization="Basic $base64AuthInfo"} -Method Post

### Get VirtualWire
$virtualWire = Invoke-RestMethod -Uri "$vcdUrl/cloudapi/1.0.0/orgVdcNetworks/$($networkUrn)" -Headers @{Accept = "application/json;version=36.0";Authorization="Bearer $($auth.Headers.'X-VMWARE-VCLOUD-ACCESS-TOKEN')"} -Method GET
$virtualWire

### Delete VirtualWire
$deleteStatus = Invoke-RestMethod -Uri "$vcdUrl/cloudapi/1.0.0/orgVdcNetworks/$($networkUrn)?force=true" -Headers @{Accept = "application/json;version=36.0";Authorization="Bearer $($auth.Headers.'X-VMWARE-VCLOUD-ACCESS-TOKEN')"} -Method DELETE
$deleteStatus
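
To confirm the deletion went through, you can repeat the GET from above; after a successful force delete it should fail with a not-found error. A small optional check:

### Optional check - the GET should now fail because the network no longer exists
try {
    Invoke-RestMethod -Uri "$vcdUrl/cloudapi/1.0.0/orgVdcNetworks/$($networkUrn)" -Headers @{Accept = "application/json;version=36.0";Authorization="Bearer $($auth.Headers.'X-VMWARE-VCLOUD-ACCESS-TOKEN')"} -Method GET | Out-Null
    Write-Host "Network still exists"
}
catch {
    Write-Host "Network not found - deletion succeeded"
}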

Above PowerShell is used at your own risk 🙂

Basic VMware PhotonOS config

Still a bit new to PhotonOS. But I'm using it more and more for deploying things, like CSE for VCD.

I always struggle with the basics, even though I know a bit about Linux and Unix. So here are some quick tips on:

  • First login
  • Network
  • SSH
  • Passwords
  • NTP

First login

When the VM is deployed you can log in to it with root and the password "changeme". It will then require you to change the password to something else.

Network

You can view network interfaces with networkctl and show what is already configured with ip a.

If you want to set a static IP you have to create a new config file.

cat > /etc/systemd/network/10-static-eth0.network << "EOF"
 
[Match]
Name=eth0
 
[Network]
Address=172.16.4.225/24
Gateway=172.16.4.1
DNS=172.16.4.10
Domains=home.lab
EOF

After the file has been created, set the correct permissions and restart the network and resolver services. If you skip the chmod you will probably see a failure when restarting the network, because the system cannot read the new file.

chmod 644 /etc/systemd/network/10-static-eth0.network
systemctl restart systemd-networkd
systemctl restart systemd-resolved

SSH config

When you want to connect to PhotonOS with a password but also have either pageant or your native SSH agent running, the client will first try to authenticate with a public/private key. If your key is not added to the server, you will get a "Too many failed authentications" error. To get around this you can use a parameter.

ssh -o PreferredAuthentications=password -o PubkeyAuthentication=no <hostname>

This forces authentication with a password.

Forgot root password

If you forgot the root password and need to reset it you can do the following.

  1. Reboot and press "e" to enter the GRUB boot loader. Add the line rw init=/bin/bash as shown on the screenshot. Press F10 to boot.

  2. You should now be in a shell on the system. From here you can run passwd.

  3. Unmount / and reboot the system:

umount /
reboot -f

Disable password history

If you try to reset the password but get the message "Password has been already used. Choose another.", password history is getting in the way.

You can disable this by editing the /etc/pam.d/system-password.

By changing 'remember' from 5 to 0 we can disable the password history count and reset the root password.

Paste into PhotonOS PuTTy session

When using vi there is a "bug" that prevents normal pasting with right-click. To work around this you need to type :set mouse= in vi command mode. After that, you can paste with right-click again.

Shift+Ins can also be used to paste in PuTTY.

NTP settings

Some apps running on PhotonOS are very time-sensitive; VCDA is one of them. Use watch -n 0.1 date on each of the appliances that need to communicate and verify that the time is not skewed.

If you need to set up NTP post-deployment you can do as follows. Edit the timesyncd.conf file with a text editor such as vi:

vi /etc/systemd/timesyncd.conf

In the [Time] section edit the NTP entry with the correct NTP server address:

[Time]
#FallbackNTP=time1.google.com time2.google.com time3.google.com time4.google.com
NTP=ntpAddress

After putting in the NTP server you want to use, restart the network and time sync services.

systemctl restart systemd-networkd
systemctl restart systemd-timesyncd

Verify that the time on the appliance is now synchronizing with the NTP server.

For further troubleshooting, check whether the time sync service is running.

systemctl status systemd-timesyncd

NSX 4.0.1 > 4.1.0 upgrade problems

The upgrade precheck returned warnings for all the edge nodes, stating the problem below.

Edge node 4006d386-a394-43a4-6b04b242f8b3 vmId is not found on NSX Manager. Please refer to https://kb.vmware.com/s/article/90072

The KB article states that the NSX Managers are missing the VM_ID for the edge nodes and gives an example of how to manually find the edge VM MoRef and post it to the NSX API.

Using PowerShell to update the VM_ID

Instead of the manual procedure from the KB, I made a small script.

### Login details
$nsxUsername = "admin"
$nsxPassword = "Yi....kes12!"
$nsxmanager = "nsxt.home.lab"

### Connect to vcenter so that we can fetch moref
Connect-VIServer vcsa1.home.lab

### NSX Manager auth header
$Type = "application/json;charset=UTF-8"
$Header = @{"Authorization" = "Basic " + [System.Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes($nsxUsername + ":" + $nsxPassword)) }
$nsxUri = "https://$($nsxmanager)"

### Edge Vm moref update
$edgenodes =  (Invoke-RestMethod -Uri "$nsxUri/api/v1/transport-nodes" -Headers $Header -Method GET -ContentType $Type).results | Where-Object {$_.node_deployment_info.deployment_type -eq "VIRTUAL_MACHINE"}

### Loop through the edge nodes
foreach($edgenode in $edgenodes){
# Reset the error state from the previous iteration
$ErrResp = $null
write-host "Updating edge node - $($edgenode.display_name)"
$specEdge =  (Invoke-RestMethod -Uri "$nsxUri/api/v1/transport-nodes/$($edgenode.id)" -Headers $Header -Method GET -ContentType $Type).node_deployment_info.deployment_config

$vmid = ((get-vm $edgenode.display_name).Id).Split("-")[-1]
write-host "Found edge node moref in vcenter - vm-$vmid"

write-host "Removing form factor and adding vm_id to object"
$specEdge | Add-Member -NotePropertyName vm_id -NotePropertyValue "vm-$vmid"
$specEdge = $specEdge | Select-Object -Property * -ExcludeProperty form_factor

try {
    write-host "Updating against NSX API"
    Invoke-RestMethod -Uri "$nsxUri/api/v1/transport-nodes/$($edgenode.id)?action=addOrUpdatePlacementReferences" -Headers $Header -Method POST -ContentType "application/json" -Body $($specEdge|ConvertTo-Json -Depth 10)
}
catch {
   $streamReader = [System.IO.StreamReader]::new($_.Exception.Response.GetResponseStream())
   $ErrResp = $streamReader.ReadToEnd() | ConvertFrom-Json
   $streamReader.Close()
  }
if($ErrResp){
   write-host "$($ErrResp.error_message) - $($edgenode.display_name) not updated with success" -ForegroundColor Red }
   else{
   write-host "$($edgenode.display_name) updated with success"
   }
}

Unfortunately, even after updating vm_id the precheck still failed with the same error. The NSX API accepted the POST with code 200, but nothing happened behind the scenes.

VMware support:

We opened a VMware SR; they got the logs and the information that we had already tried to update the vm_id, and they asked us to do a couple of other things.

Refreshing the edge node config data

VMware support asked us to try an API call that refreshes the edge node configuration data (see the NSX API documentation). With the info from the previous script, we can do a POST against the NSX API.

Invoke-RestMethod -Uri "$nsxUri/api/v1/transport-nodes/$($edgenode.id)?action=refresh_node_configuration&resource_type=EdgeNode&read_only=true" -Headers $Header -Method POST

Reboot of edge nodes

Put an edge node into NSX maintenance mode and afterward reboot it from vCenter with a "Restart Guest OS". The reboot went fine and the edge node was put back into production.

Reboot of NSX Managers

We rebooted the NSX Managers one at a time, of course waiting for the rebooted node to come back online with no errors before continuing with the next one.

Result

Unfortunately, none of the above helped; the precheck still gave the same error.

Further troubleshooting:

Looking at the API guide I stumbled over an edge node redeployment call. I have redeployed edge nodes manually before, and I have to say, it's a pain! It's not hard, but it takes a lot of time. This call, however, redeploys in a way that doesn't affect the data plane.

  • Edge is being put into NSX maintenance mode
  • Edge is then deleted and a new one is deployed, with the same naming as the old one.
  • After the edge node is deployed and registered in the manager it exits maintenance mode and goes into production again

Executing the API call

First, we get the config for the specific edge. Afterward, we post to the redeploy API with the body that we got from the config get request.

The try-catch helps you get a better error description if something goes wrong.

$redeployBody = (Invoke-RestMethod -Uri $nsxUri/api/v1/transport-nodes/$($edgenode.id) -Headers $Header -Method GET) 

try {
    Invoke-RestMethod -Uri "$nsxUri/api/v1/transport-nodes/$($edgenode.id)?action=redeploy" -Headers $Header -Method POST -ContentType "application/json" -Body ($redeployBody | ConvertTo-Json -Depth 10)
    }
catch {
    $streamReader = [System.IO.StreamReader]::new($_.Exception.Response.GetResponseStream())
    $ErrResp = $streamReader.ReadToEnd() | ConvertFrom-Json
    $streamReader.Close()
    $ErrResp
}

After redeployment, more problems….

After redeploying one edge node, the vm_id error was gone for that node. Great! But now the upgrade coordinator gave a new precheck error… I can't remember it exactly, but it said something like "The Host Upgrade Unit Groups are not suitable for a T0."

Google results pointed in the direction of a VMware KB about resetting the upgrade plan, but that was not successful. (I later found out that on the next page of the upgrade wizard there is a "Reset plan" button.)

I continued to redeploy all the remaining edge nodes, which cleared all the errors from the upgrade precheck. The upgrade could then begin 😀

And another problem…

After the edge nodes that held the T0 gateway were upgraded, one of them was not negotiating BGP with the physical fabric. I tried a lot of things:

  • Redeployment of the edge node with the redeploy API action didn’t help.
  • Migrating the edge node to the same hosts as the working edge node was on, didn’t help.
  • Ping from working edge node to non-working on the VLAN uplink IPs worked.
  • Ping from non-working to physical fabric didn’t work.

I then tried to remove and re-add the interfaces of the T0 on the non-working edge node. That made everything work again. Quite a random bug?

Conclusion

NSX upgrades compared to NSX-V upgrades seem to be quite troublesome, or let's say there is room for improvement. The good thing is that when the edge nodes are upgraded, all tenants are too, and the upgrade happens with zero downtime. The T0/T1 robustness is amazing: no drops, no IPsec tunnels going down.

VCD CPI/CCM – Load balancer

When a TKG cluster is deployed by Container Service Extension(CSE) it means that it lives within VMware Cloud Director(VCD).

Inside this TKG cluster, you will find a Cloud Controller Manager (CCM) pod under kube-system called "vmware-cloud-director-ccm". The CCM pod is part of the Cloud Provider Interface (CPI), which gives you capabilities such as adding a Persistent Volume (PV) or doing load balancing with NSX Advanced Load Balancer (ALB). Basically, the CCM contacts the VCD CPI API and from there orchestrates what you requested in your Kubernetes YAML files.

At the time of writing, L4 load balancing (LB) through ALB is the only option available. This is because the CPI is not yet able to create L7 LBs with ALB. It's on the roadmap though.

One-arm vs. two-arm…

I found that there are two concepts worth knowing about: one-arm and two-arm load balancers.

The two LB methods are described here. Since it's an old article, AVI/ALB was not in the VMware portfolio back then, and NSX-T has since moved away from the built-in LB service (which lived as haproxy within the Tier-1) to AVI/ALB service engines.

The default load balancing setting with the VCD CPI is two-arm, meaning that it will tell VCD to create a DNAT rule towards an internal 192.168.8.x subnet used for the ALB VIPs.

WAN > T1 DNAT(185.139.232.x:80)> 192.168.8.x(LB internal subnet) > ALB SE > LB Pool members

Since L7 LB features in VCD are not yet available, and would also become very costly, most customers will probably choose to run an Nginx or Apache ingress controller inside their own Kubernetes cluster.

Since VCD 10.4 the two-arm config has been working, and it is more desirable since you can use multiple ports on a single public IP, whereas the one-arm config would allocate one public IP per Kubernetes service (correct me if I'm wrong).

If you are running your own ingress controller then some find the one-arm approach more desirable since ALB will then hold the public IP address.

WAN > T1 Static Route to ALB (185.139.232.x) > ALB SE > LB Pool members

How to change to one arm LB?

Hugo Phan has done a good write-up on his blog.

Basically, it's downloading the existing config and changing the config map of the VCD CPI, removing the part where the 192.168.8.x subnet is defined. After this, you delete the existing CPI and re-apply it from the YAML file that you edited.

Snip from Hugo Phan blog – vmwire.com

How can I use/test this with my VCD TKG cluster?

If you don't have a demo app that you prefer, then I can recommend either yelb or retrogames. Here I will do it with yelb. William Lam has done a good write-up and also hosts deployment files for yelb.

Step 1 – Deploy the application

kubectl create ns yelb
kubectl apply -f https://raw.githubusercontent.com/lamw/vmware-k8s-app-demo/master/yelb-lb.yaml

Step 2 – Check that all pods are running

jeram@QL4QJP2F4N ~ % kubectl -n yelb get pods
NAME                             READY   STATUS    RESTARTS   AGE
redis-server-74556bbcb7-f8c8f    1/1     Running   0          6s
yelb-appserver-d584bb889-6f2gr   1/1     Running   0          6s
yelb-db-694586cd78-27hl5         1/1     Running   0          6s
yelb-ui-8f54fd88c-cdvqq          1/1     Running   0          6s
jeram@QL4QJP2F4N ~ % 

The deployment file asks k8s for a service of type LoadBalancer; the CCM picks this up and asks the VCD CPI to have ALB create the L4 load balancing.

Task view from VCD

Step 3 – Get the IP and go check out the yelb site

jeram@QL4QJP2F4N ~ % kubectl -n yelb get svc/yelb-ui
NAME      TYPE           CLUSTER-IP     EXTERNAL-IP       PORT(S)        AGE
yelb-ui   LoadBalancer   100.64.53.23   185.177.x.x   80:32047/TCP   5m52s
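
A quick way to verify that the ALB virtual service answers is a plain HTTP request against the EXTERNAL-IP from the output above (the address below is a placeholder; replace it with your own):

# Simple reachability test against the load balanced service (placeholder IP)
Invoke-WebRequest -Uri "http://<EXTERNAL-IP>" -UseBasicParsing | Select-Object StatusCode, StatusDescription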

Step 4 – Scale the UI

Let's see how many instances we have from the initial deployment.

jeram@QL4QJP2F4N ~ % kubectl get rs --namespace yelb
NAME                       DESIRED   CURRENT   READY   AGE
redis-server-74556bbcb7    1         1         1       8m11s
yelb-appserver-d584bb889   1         1         1       8m11s
yelb-db-694586cd78         1         1         1       8m11s
yelb-ui-8f54fd88c          1         1         1       8m11s
jeram@QL4QJP2F4N ~ % 

We can then scale the UI to 3 and check again to see if that happens.

jeram@QL4QJP2F4N ~ % kubectl scale deployment yelb-ui --replicas=3 --namespace yelb
deployment.apps/yelb-ui scaled

jeram@QL4QJP2F4N ~ % kubectl get rs --namespace yelb
NAME                       DESIRED   CURRENT   READY   AGE
redis-server-74556bbcb7    1         1         1       9m45s
yelb-appserver-d584bb889   1         1         1       9m45s
yelb-db-694586cd78         1         1         1       9m45s
yelb-ui-8f54fd88c          3         3         3       9m45s
jeram@QL4QJP2F4N ~ % 

The UI is now scaled to 3 replicas. Seen from the Load Balancer view in VCD, it will only show the worker nodes, since k8s does its own load balancing across the pod instances.

Step 5 – Cleanup

jeram@QL4QJP2F4N ~ % kubectl -n yelb delete pod,svc --all && kubectl delete namespace yelb
pod "redis-server-74556bbcb7-f8c8f" deleted
pod "yelb-appserver-d584bb889-6f2gr" deleted
pod "yelb-db-694586cd78-27hl5" deleted
pod "yelb-ui-8f54fd88c-6llf7" deleted
pod "yelb-ui-8f54fd88c-9r6wf" deleted
pod "yelb-ui-8f54fd88c-cdvqq" deleted
service "redis-server" deleted
service "yelb-appserver" deleted
service "yelb-db" deleted
service "yelb-ui" deleted
namespace "yelb" deleted
jeram@QL4QJP2F4N ~ % 

Again, CCM will instruct VCD CPI to clean up. NICE!

Conclusion

We now have a good idea of how load balancing works with VCD TKG-deployed K8s clusters. Of course we are looking forward to the L7 features, but it's a good start and VMware is working hard to make k8s deployments and day-2 operations easier.

NSX-T – Topology view troubleshooting

Ever seen the picture below when trying to see the cool topology view in NSX-T?

If so, there is an API call that can help you resync the topology view so you can see it again.

# Login details
$nsxUsername = "admin"
$nsxPassword = ""
$nsxmanager = "nsxt.home.lab"

# NSX Manager auth header
$Type = "application/json;charset=UTF-8"
$Header = @{"Authorization" = "Basic " + [System.Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes($nsxUsername + ":" + $nsxPassword)) }
$nsxUri = "https://$($nsxmanager)"

# Request topology resync
Invoke-RestMethod -Uri "$nsxUri/policy/api/v1/ui/network-topology/resync" -Headers $Header -Method POST -ContentType $Type

You don't get any feedback from the API call, but after a few minutes you will be able to see the topology again.

NSX-V Edge – Disable ALG

In NSX-V 6.4.11 and 6.4.12 a bug was introduced in the Application Layer Gateway that would cause packet drops.

The release notes of NSX-V 6.4.13 state:

Fixed Issue 2886595: ALG-based services are not working in NSX Data Center for vSphere 6.4.11 and 6.4.12.:

Edge Firewall drops packets for ALG-based services FTP/SFTP/SNMP when using NSX Data Center for vSphere 6.4.11 and 6.4.12.

A temporary workaround is to disable ALG on the affected edge until the NSX-V installation can be upgraded to 6.4.13 or 6.4.14.

Procedure

  • Connect to the NSX Manager as admin and enter enable mode by typing: enable
    Enter engineering mode by typing: st en
  • Enter the NSX Manager root password: IAmOnThePhoneWithTechSupport
    Get the password for the Edge by typing:
    /home/secureall/secureall/sem/WEB-INF/classes/GetSpockEdgePassword.sh edge-ID
[root@nsxvmanager ~]# /home/secureall/secureall/sem/WEB-INF/classes/GetSpockEdgePassword.sh edge-1269
Edge root password:
        edge-1269       -> u8ORKdfFIM$hZ
[root@nsxvmanager ~]#
  • Access the Edge VM console, log in as the admin user and enter enable mode by typing: enable
  • Enable engineering mode by typing: debug engineeringmode enable
  • Enter the root shell on the Edge by typing st en and entering the password obtained from the NSX Manager.
  • Run the following commands to make the workaround persist across reboots and to disable the ALG in the kernel immediately, without a reboot:
    echo "net.netfilter.nf_conntrack_helper = 1" >> /etc/sysctl.conf
    sysctl net.netfilter.nf_conntrack_helper=1

Conclusion

Since the procedure above is done directly on the edge it will not survive an edge redeploy. This is because a redeploy will take its configuration from the NSX manager and not look at what is done directly on the edge itself.

Of course, the correct solution would be to upgrade the NSX Manager to the latest version and afterward upgrade the edge version.

NSX-T Edge node password management

The default NSX-T password expiration time is set to 90 days. But in a lab environment, this is not required. So here is a bit on how to disable the timer but also how to recover from an expired or forgotten password.

Reset expired password

If the password for any of the users (audit, root, or admin) has expired, you will see it when you try to log in with SSH. It will prompt you for the current password followed by the new one twice. Since this is only for a home lab and I would like to go back to the previous password, I set a new, quick-to-remember temporary password: Fimmer_old_password1. The SSH session then disconnects and you start a new connection with the new password.

nsx-edge> set user admin password My-New_VMware1!_Password old-password Fimmer_old_password1

After the reset and re-reset you now have 90 days of password again. Or you could disable the password expiration…

If you find yourself with a forgotten admin password

You will most likely still be able to log in with the root account; even if the password has expired, using the console of the Edge VM will always work. From there you can use the normal Linux password reset command to reset the admin account password.

passwd admin

And if you have tried the wrong password too many times, you can unlock the account with pam_tally2.

pam_tally2 --user admin --reset

Another note: when you are logged in as root, you can still use nsxcli; just wrap your nsxcli commands with su admin -c '<command>'

su admin '-c clear user audit password-expiration'

If you find yourself completely locked out of NSX-T

VMware has some good documentation on this. Basically it is

  • Connect to the console of the appliance and reboot the system. When the GRUB boot menu appears, press the left SHIFT or ESC key quickly. Press e to edit the menu. Press e to edit the selected option.
  • Search for the line starting with linux and add systemd.wants=PasswordRecovery.service to the end of the line. Press Ctrl-X to boot.

Set password to never expire

SSH to the edge node with the admin account. Using the nsxcli we can adjust the expiration up to a maximum of 9999 days. The commands below set the password expiration to 9999 days and clear the expiration if it has already happened. VMware has it in their documentation here.

nsx-edge> set user admin password-expiration 9999
nsx-edge> set user root password-expiration 9999
nsx-edge> set user audit password-expiration 9999
nsx-edge> clear user admin password-expiration
nsx-edge> clear user root password-expiration
nsx-edge> clear user audit password-expiration

SSH-keys

Something that is always better than passwords is SSH keys. You can add multiple SSH keys to the same user in NSX-T. The cool thing is that each key has a label, so multiple users can have access with their own SSH key; this way you avoid some of the hassle of having to use passwords with your SSH connections.

nsx-edge> set user admin ssh-keys label jr type ssh-rsa value AAAAB3NzaC1yc2EAAAADAQABAAACAQC/VPq30qzyJHr8v6qh1vF1CVY8R9U09iCkqnIs9H6d9hBOeDu/e52rPj2BOQUfHwBmGRPVqZUyuOO20hDgT/BzP0QxISv9l2OpFariz8AmHu9m4kUwAdrBDvplw8fFeafppUwQF/aFsIF+t1PtFluz0Bp3N/sp3NQGWfkez7myctGc9X3eMc6oUAYrPPJeDZz1x5JoGdwdH/w6wjr3uK03kRx6TX1kNqxSypIQQ/8lYg1TG7yAuF5DhX4fJrPjpiLau1H6z0vChVpqY1q8oMntzHHtYtByFMrNtWFfAvG94BT27h/Lkmz5JM5d41TbL0YdZT8zCTrXzUG87wdEaRiB5ZeKy9LENgfxKO66scSU2gjiXwpyJTrHKZYz9g5EERH/41w+qMT90HAM3ArSIvk7pROoKhZy0IeOwfWbmMlQvKQFjS7OtKnFEeVUYRqnLvi3XeUiFbLxmW3ID8IqQy3iDNuESiVNRcp/PoN7lxL9cfGJdXBuJ3PBcaQZx/vQpePRqW9eBSmhhS1beIUlLV0UOFdRGTMMMjOlp7m7jaw5EnvztbInfPOdMPoUuSL9iGut7M0SVMgEzo0MiJDHNdLQYK0EKO8qrWMz76UHhpdnhOQNdi3/wtVVzxVUR/D9zBa1q2oL8ml7jKVubVbBd6Vm0lEEquDEN3I9Dan/Ev0j9w==

Conclusion

Having an expired password will cause you all sorts of trouble.

If you don’t have a PAM solution that can help you to automatically change the password, then setting the expiration to 9999 days will for sure help your manageability.

Putting your SSH key onto the nodes and managers will help you in the long run, and is in my opinion also a more secure solution than having passwords.

ESXi network routes

I honestly don't know why this is still a problem. Support for routed vMotion traffic was added back in vSphere 6. Here we are at vSphere 7 and you still have to set the gateway/routes for the vMotion stack through esxcli.

Either way, here is how it works

vMotion stack

Each TCP/IP stack can only have one gateway; that makes sense. And if you want to keep your management and vMotion traffic separated, you need two TCP/IP stacks.

It's nicely done through the vCenter GUI and there is a KB for it. You even have the option to override the default gateway and specify the right one for your vMotion stack.

[root@dc1esxcompx-xx:~] esxcli network ip route ipv4 list -N vmotion
Network     Netmask        Gateway  Interface  Source
----------  -------------  -------  ---------  ------
10.1.115.0  255.255.255.0  0.0.0.0  vmk1       MANUAL

But when looking at the routing table from esxcli, the gateway is not set. If you know why, feel free to give me a kick and enlighten me.

ESXCLI add a static route

So for me to set an actual default route, I have to do it as shown below.

[root@dc1esxcompx-xx:~] esxcli network ip route ipv4 add -g 10.1.115.1 -n 0.0.0.0/0 -N vmotion

[root@dc1esxcompx-xx:~] esxcli network ip route ipv4 list -N vmotion
Network     Netmask        Gateway     Interface  Source
----------  -------------  ----------  ---------  ------
default     0.0.0.0        10.1.115.1  vmk1       MANUAL
10.1.115.0  255.255.255.0  0.0.0.0     vmk1       MANUAL

PowerCLI

Need to do it on a cluster with multiple hosts? No problem, LucD from the VMware community has you covered. I only did a little customization and it works for my needs.

connect-viserver -Server 

$stackName = 'vmotion'
$ipGateway = '10.1.115.1'
$ipDevice = 'vmk3'
$cluster = "computexx"
$vmhosts = get-cluster $cluster | get-vmhost

foreach($vmhost in $vmhosts)
{
$esx = Get-VMHost -Name $vmhost
$netSys = Get-View -Id $esx.ExtensionData.ConfigManager.NetworkSystem
$stack = $esx.ExtensionData.Config.Network.NetStackInstance | where{$_.Key -eq $stackName}
$config = New-Object VMware.Vim.HostNetworkConfig
$spec = New-Object VMware.Vim.HostNetworkConfigNetStackSpec
$spec.Operation = [VMware.Vim.ConfigSpecOperation]::edit
$spec.NetStackInstance = $stack
$spec.NetStackInstance.ipRouteConfig.defaultGateway = $ipGateway
$spec.NetStackInstance.ipRouteConfig.gatewayDevice = $ipDevice
$config.NetStackSpec += $spec
$netsys.UpdateNetworkConfig($config,[VMware.Vim.HostConfigChangeMode]::modify)
}
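
To verify the result without opening SSH to each host, the same esxcli route listing can be run remotely through Get-EsxCli. A small sketch, assuming the $vmhosts variable from the loop above (the V2 argument name "netstack" mirrors the -N option used earlier):

# Check the vmotion stack routing table on every host in the cluster
foreach($vmhost in $vmhosts)
{
$esxcli = Get-EsxCli -VMHost $vmhost -V2
$esxcli.network.ip.route.ipv4.list.Invoke(@{netstack = 'vmotion'}) | Select-Object Network, Netmask, Gateway, Interface
}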

Conclusion

Manipulating the vMotion stack route table with either esxcli or PowerCLI works great.

Need to know more? There are plenty of good bloggers and KBs out there.

ssacli – CLI configuring SmartArray

When you install an HPE server with the VMware custom image for HPE servers you automatically get all the HPE tools for configuring the hardware. Neat.

Here is a small guide on how to clean and set up the array. The ssacli tool is located in /opt/smartstorageadmin/ssacli/bin on the ESXi server – start there and follow the next commands.

Show the existing config:

[root@esx2:/opt/smartstorageadmin/ssacli/bin] ./ssacli ctrl slot=0 ld all show
Smart Array P440ar in Slot 0 (Embedded)
   Array A
      logicaldrive 1 (279.37 GB, RAID 1, Failed)
   Array B
      logicaldrive 2 (3.27 TB, RAID 1+0, OK)

Delete the existing logical drives:

[root@esx2:/opt/smartstorageadmin/ssacli/bin] ./ssacli ctrl slot=0 ld 1 delete
Warning: Deleting an array can cause other array letters to become renamed.
         E.g. Deleting array A from arrays A,B,C will result in two remaining
         arrays A,B ... not B,C 
Warning: Deleting the specified device(s) will result in data being lost.
         Continue? (y/n)  y
[root@esx2:/opt/smartstorageadmin/ssacli/bin] ./ssacli ctrl slot=0 ld 2 delete
Warning: Deleting the specified device(s) will result in data being lost.
         Continue? (y/n)  y
[root@esx2:/opt/smartstorageadmin/ssacli/bin] 

Show all physical drives in the server:

[root@esx2:/opt/smartstorageadmin/ssacli/bin] ./ssacli ctrl slot=0 pd all show
Smart Array P440ar in Slot 0 (Embedded)
   Array A
      physicaldrive 1I:1:9 (port 1I:box 1:bay 9, SAS HDD, 900 GB, OK)
      physicaldrive 1I:1:10 (port 1I:box 1:bay 10, SAS HDD, 900 GB, OK)
      physicaldrive 1I:1:11 (port 1I:box 1:bay 11, SAS HDD, 900 GB, OK)
      physicaldrive 1I:1:12 (port 1I:box 1:bay 12, SAS HDD, 900 GB, OK)
      physicaldrive 1I:1:13 (port 1I:box 1:bay 13, SAS HDD, 900 GB, OK)
      physicaldrive 1I:1:14 (port 1I:box 1:bay 14, SAS HDD, 900 GB, OK)
      physicaldrive 1I:1:15 (port 1I:box 1:bay 15, SAS HDD, 900 GB, OK)
      physicaldrive 1I:1:16 (port 1I:box 1:bay 16, SAS HDD, 900 GB, OK)

Create a new volume with available physical drives:

[root@esx2:/opt/smartstorageadmin/ssacli/bin] ./ssacli ctrl slot=0 create type=ld drives=1I:1:9,1I:1:10,1I:1:11,1I:1:12,1I:1:13,1I:1:14,1I:1:15,1I:1:16 raid=6

Warning: Controller cache is disabled. Enabling logical drive cache will not take effect until this has been resolved.
[root@esx2:/opt/smartstorageadmin/ssacli/bin] 

Conclusion

Nice and easy – give the HPE ssacli manual a read for more commands (starting from page 57) or use ssacli help.

[root@esx2:/opt/smartstorageadmin/ssacli/bin] ./ssacli help

CLI Syntax
   A typical SSACLI command line consists of three parts: a target device, 
   a command, and a parameter with values if necessary. Using angle brackets to
   denote a required variable and plain brackets to denote an optional 
   variable, the structure of a typical SSACLI command line is as follows:

      <target> <command> [parameter=value]

   <target> is of format:
      [controller all|slot=#|serialnumber=#]
      [array all|<id>]
...........

VMware CSE – Stuck cluster deployment

After upgrading to CSE 3.1.3 with VCD 10.3.1 I encountered a problem when creating clusters from the Ubuntu 20.04 native cluster template.

Basically, the mstr node would be deployed and started, VMware Tools would become ready, and the first script injection would happen. Then all of a sudden the VM would reboot and the cluster creation would fail because it could no longer see the process. This would sometimes leave a cluster in the "Creation in progress" status that somehow could not be managed anymore.

22-06-02 10:42:34 | cluster_service_2_x:2811 - _wait_for_tools_ready_callback | DEBUG :: waiting for guest tools, status: vm='vim.VirtualMachine:vm-835608', status=guestToolsNotRunning
22-06-02 10:42:39 | cluster_service_2_x:2811 - _wait_for_tools_ready_callback | DEBUG :: waiting for guest tools, status: vm='vim.VirtualMachine:vm-835608', status=guestToolsRunning
22-06-02 10:42:41 | cluster_service_2_x:2817 - _wait_for_guest_execution_callback | DEBUG :: waiting for process 1706 on vm 'vim.VirtualMachine:vm-835608' to finish (1)
22-06-02 10:42:46 | cluster_service_2_x:2817 - _wait_for_guest_execution_callback | DEBUG :: process [0, <Response [200]>, <Response [200]>] on vm 'vim.VirtualMachine:vm-835608' finished, exit code: 0
22-06-02 10:42:46 | cluster_service_2_x:2869 - _execute_script_in_nodes | DEBUG :: about to execute script on mstr-7e34 (vm='vim.VirtualMachine:vm-835608'), wait=True
22-06-02 10:42:48 | cluster_service_2_x:2817 - _wait_for_guest_execution_callback | DEBUG :: waiting for process 1729 on vm 'vim.VirtualMachine:vm-835608' to finish (1)
22-06-02 10:42:58 | cluster_service_2_x:2896 - _execute_script_in_nodes | ERROR :: Error executing script in node mstr-7e34: process not found (pid=1729) (vm='vim.VirtualMachine:vm-835608')
Traceback (most recent call last):
  File "/opt/vmware/cse/python/lib/python3.7/site-packages/container_service_extension/rde/backend/cluster_service_2_x.py", line 2879, in _execute_script_in_nodes
    callback=_wait_for_guest_execution_callback)

I created an SR with Cloud Director GSS for both the failed deployments and the stuck clusters that now couldn't be deleted. Multiple screen sharing sessions later, there was no result.

Then I found the GitHub repository for Container Service Extension; the issue page had a very tempting title: "Failed deployments using TKGm on VCD". Many seem to have the same problem. There was no fix for the deployments, but it seems one person had the fix for deleting the stuck clusters.

The workaround

You need to find the ID of the user that owns the cluster. You can see who the owner is in the More > Kubernetes Clusters menu in VCD.

When you have the owner, you can go into Administration > Users > <User>. The URL will then contain the ID of the user.

vcd.ramsgaard.me/tenant/tenant1/administration/access-control/users/v9993018-ebf5-4ded-8134-27ddcc4ccbf0/general

With the userId you can fill out the body for the next API call.

$vdchost = "vcd.ramsgaard.me"
$apiusername = "svc-cse@system"
$password = 'Ye.........iks12!'

$base64AuthInfo = [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes(("{0}:{1}" -f $apiusername,$password)))
[System.Net.ServicePointManager]::SecurityProtocol = [System.Net.SecurityProtocolType]::Tls12
$auth =Invoke-WebRequest -Uri "https://$vdchost/api/sessions" -Headers @{Accept = "application/*;version=32.0";Authorization="Basic $base64AuthInfo"} -Method Post

$accessBody = '{
    "grantType": "MembershipAccessControlGrant",
    "accessLevelId": "urn:vcloud:accessLevel:FullControl",
    "memberId": "urn:vcloud:user:e96cf9e8-535f-45d8-8a87-b9dac659f85f"
  }' | ConvertFrom-Json

$status = Invoke-RestMethod -Uri "https://$vdchost/cloudapi/1.0.0/entities/urn:vcloud:type:cse:nativeCluster:2.1.0/accessControls" -Headers @{Accept = "application/json;version=36.1";Authorization="Bearer $($auth.Headers.'X-VMWARE-VCLOUD-ACCESS-TOKEN')"} -ContentType "application/json" -Method post -Body ($accessBody | ConvertTo-Json)

When the API call is done you should now be able to delete the stuck cluster.

If you should be so unfortunate that the cluster is stuck in a "not resolved" state and the deletion through the VCD GUI still fails, you need to use the vcd cse CLI.

### Login to VCD system or tenant organisation
vcd login vcd.ramsgaard.me system jr
### Show clusters
vcd cse cluster list
### Force delete the cluster
vcd cse cluster delete tanzu1 --force

Conclusion:

The problem occurred in the first place due to a bug in VCD 10.3.1: the MQTT bus had a bug and therefore the cluster creation failed. 10.3.2 or 10.3.3 fixed it. (Of course, the VMware Tanzu Kubernetes Grid templates should be used in the future.)

It took some time to find the workaround. I hope future versions of CSE will be more fault tolerant so these situations do not appear.

Until then there is a way to get out of the stuck cluster situation.

Disk mapping Windows <-> VMware – Part 2

A couple of years ago I did a post on how to map your Windows disks to the actual disks in VMware. This post is an extension of it, but with updated commands.

Why do I need to know the mapping? It matters when you stumble upon a VM with many disks attached. If the disks vary in size you can normally look at the sizes and match them with the disks in VMware, but when all disks have the same size that approach becomes difficult.

Windows serial number:

In Windows, we can retrieve the serial number of the disk we need to expand and then map the serial number to the VMware disk. In newer Windows Server versions it's fairly easy to find, but when dealing with versions older than 2012 you are missing PowerShell cmdlets like Get-Disk. Someone on StackOverflow has a way that works on Windows Server 2008 through 2022.

$DriveLetter = "C:"
Get-CimInstance -ClassName Win32_DiskDrive |
Get-CimAssociatedInstance -Association Win32_DiskDriveToDiskPartition |
Get-CimAssociatedInstance -Association Win32_LogicalDiskToPartition |
Where-Object DeviceId -eq $DriveLetter |
Get-CimAssociatedInstance -Association Win32_LogicalDiskToPartition |
Get-CimAssociatedInstance -Association Win32_DiskDriveToDiskPartition |
Select-Object -Property SerialNumber

VMware disk:

From VMware's side, it's straightforward to find the disk by its serial number. Below is a scripted way of finding the disk and then adding the extra capacity.

Connect-VIServer ""

$VMname = ""
$disksn = "6000c295ec128b3d14472bdbf8e65aee"
$vmDisk = (Get-VM $VMname | Get-HardDisk) | Where-Object {$_.ExtensionData.Backing.uuid.Replace("-","") -eq $disksn } 

$ExpandSizeGb = 50
$vmDisk | Set-HardDisk -CapacityGB ($vmDisk.CapacityGB + $ExpandSizeGb) -Confirm:$false 
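
If you just want to eyeball the mapping first, you can list every disk of the VM together with the serial number as Windows reports it (same PowerCLI session and $VMname variable as above):

# List all virtual disks of the VM with their serial numbers for comparison with the Windows output
Get-VM $VMname | Get-HardDisk |
    Select-Object Name, Filename, CapacityGB, @{Name = 'SerialNumber'; Expression = { $_.ExtensionData.Backing.Uuid.Replace("-", "") }}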

Conclusion:

Instead of having to guess which disk in Windows maps to which VMware disk, you now have a more automated way. The disk serial number retrieval commands are compatible with Windows Server 2008 up to 2022.

Cloud Director 10.3 – Update certificates

Since my last article on how to update Cloud Director SSL certificates, there has been a major change. No more binary Java truststore – jaaay.

Cloud Director has changed over to what I think is a better and more normal way of storing the private and public keys, which is the PEM format. According to the release notes, the change actually happened in 10.2, but the certificate path changed again in 10.3. If you are in doubt about where the certificate path is, look inside global.properties.

/opt/vmware/vcloud-director/etc/global.properties

VMware’s own documentation states that we can now just swap the .pem files, use the cell-management-tool to import them, and restart the cell.

What we will do and what is needed

  • Get a new public signed certificate
    • Either in PEM format as .key and .pem (certificate including intermediates)
    • Or in PFX so it can be exported
  • Backup existing certificates
  • Replace existing certificates with your new certificate
  • Run VCD tool to import and define the private key encryption password
  • Restart cell(s)

Process

If you have a PFX, you can use this article to extract the key and certificate. If you already have the two files, .key and .pem, you can proceed.

We will follow VMware documentation and create a backup of the existing files.

cp /opt/vmware/vcloud-director/etc/user.http.pem /opt/vmware/vcloud-director/etc/user.http.pem.original
cp /opt/vmware/vcloud-director/etc/user.http.key /opt/vmware/vcloud-director/etc/user.http.key.original
cp /opt/vmware/vcloud-director/etc/user.consoleproxy.pem /opt/vmware/vcloud-director/etc/user.consoleproxy.pem.original
cp /opt/vmware/vcloud-director/etc/user.consoleproxy.key /opt/vmware/vcloud-director/etc/user.consoleproxy.key.original

Now we can either SCP our key and certificate onto the server, or edit the files on the server and paste in the content from the files you have. Whatever you find easiest.

Forgot your root password for the Cloud Director appliance? Of course not. But anyway, here is a link to reset it....

After the “user.http.pem/key” and “user.consoleproxy.pem/key” files have been updated with the new certificate data, we can tell Cloud Director to update its config with the commands below. This is done to update the encryption password for the private key.

If you don’t care about security you can also update without --key-password; then, of course, your private key will need to be stored unencrypted in the .key files.

/opt/vmware/vcloud-director/bin/cell-management-tool certificates -j --cert /opt/vmware/vcloud-director/etc/user.consoleproxy.pem --key /opt/vmware/vcloud-director/etc/user.consoleproxy.key --key-password PASSWD
/opt/vmware/vcloud-director/bin/cell-management-tool certificates -p --cert /opt/vmware/vcloud-director/etc/user.http.pem --key /opt/vmware/vcloud-director/etc/user.http.key --key-password PASSWD

If everything works out it will tell you the certificates have been updated and you need to restart VCD for it to take effect.

SSL configuration has been updated. You will need to restart the cell for changes to take effect.

Now safely shut down your cell(s) with the command below. This quiesces the cell, so VCD only shuts down once all running tasks are done.

/opt/vmware/vcloud-director/bin/cell-management-tool cell -i $(service vmware-vcd pid cell) -s

Start the cell again with the command below.

systemctl start vmware-vcd
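Once the cell is back up, you can check from a workstation that the new certificate is actually being served. A hedged PowerShell sketch; the hostname is a placeholder, and the validation callback deliberately accepts any certificate so the handshake completes regardless of trust:

# Placeholder for your public VCD endpoint
$vcdHost = "vcd.example.com"
$tcp = New-Object System.Net.Sockets.TcpClient($vcdHost, 443)
$cb  = [System.Net.Security.RemoteCertificateValidationCallback]{ $true }
$ssl = New-Object System.Net.Security.SslStream($tcp.GetStream(), $false, $cb)
$ssl.AuthenticateAsClient($vcdHost)
# Show subject, expiry and thumbprint of the certificate the cell now presents
New-Object System.Security.Cryptography.X509Certificates.X509Certificate2($ssl.RemoteCertificate) | Select-Object Subject, NotAfter, Thumbprint
$ssl.Dispose(); $tcp.Close()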

Conclusion

VMware has made it much easier to change a certificate in Cloud Director. The new way of storing certificates is a very welcome change.

I did see a few different locations for the .key and .pem files, depending on the version and on whether the cells were built on raw Linux or deployed as an appliance, but you can always check the global.properties config file located in the same folder as the certificates.

Storage DRS recommendations – with PowerCLI

Many things can happen when you let Storage DRS run fully automated. If you have it enabled from the beginning it will probably only do good things, but enabling it on a large, space-imbalanced storage cluster might be a bit too risky.

There are many things Storage DRS is not aware of, such as the underlying storage running out of space on a pool/aggregate, or the operations being too IO-heavy to run within business hours.

Call me a wimp, but in this case it seems better to stay in control and apply the recommendations little by little. Having to use the GUI is a pain, though: you need to go into Storage Cluster > Monitor > Storage DRS > Recommendations, and from there override the selections and uncheck boxes so you can run smaller batches of Storage vMotions.

I will just use VMware PowerCLI cmdlets…

Well, unfortunately not all of the vSphere API is exposed through PowerCLI cmdlets, but after a bit of googling it turned out to be quite easy to call the SDK API directly from within PowerShell.

One post that came to my attention contained most of the code needed.

Solution:

I’m not that deep into what the ServiceInstance or StorageResourceManager objects are, but I take them to be views of the vSphere API instantiated from PowerShell, each exposing the operations where you can find the functionality you are looking for.

 # DSC you want to work with
$dscName = 'DatastoreCluster'

# Get DSC info
$dsc = Get-View -ViewType StoragePod -Filter @{'Name'=$dscName}

# Get Service Intance
$si = Get-View ServiceInstance

# Get the StorageResourceManager
$storMgr = Get-View -Id $si.Content.StorageResourceManager

# Refresh SDRS Recommendation on DSC
$storMgr.RefreshStorageDrsRecommendation($dsc.MoRef)

# Update dsc object with fresh recommendation data
$dsc.UpdateViewData()

# Filter on reason for storage balance. Select only 40 VMs.
$balance = $dsc.PodStorageDrsEntry.Recommendation | Where-Object {$_.Reason -eq "balanceDatastoreSpaceUsage"}  | Select-Object -First 40

# Do a run of each VM and start the storage vMotion process
foreach($vm in $balance){
   $message = "Moving VM: {0} to datastore: {1}" -f $(get-vm -id $("VirtualMachine-"+$($vm.Action[0].Target.Value))).name, $(get-datastore -id $vm.Action[0].Destination).name
   write-host $message -ForegroundColor Green
   $storMgr.ApplyStorageDrsRecommendationToPod($dsc.MoRef,$vm.Key)
} 
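If you want an overview of what kinds of recommendations are pending before you run the loop above, a small hedged helper can group them by reason; the reason codes shown in the comment are examples:

# Count pending recommendations per reason (e.g. balanceDatastoreSpaceUsage, balanceDatastoreIOLoad)
$dsc.PodStorageDrsEntry.Recommendation | Group-Object -Property Reason | Select-Object Name, Count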

Conclusion:

I was expecting to use a couple of PowerCLI cmdlets for my granular balancing of the storage cluster. Unfortunately, those do not exist.

But thanks to the great community I found out how to use the vSphere API through PowerShell, and in the end I got the functionality I was looking for.

Maybe there is an easier way to do the same, if so, let me know. Until next time I have a bit of vSphere SDK googling to do.

Veeam – Network Extension Appliance performance

This post is a brief write-up on what to expect from a network perspective when using the Veeam Network Extension Appliance (NEA).

Since you found this post, I don’t think an introduction is needed. But anyway, here is a quick description of the network so you can visualize how the tests are performed.

  • The green line indicates the L2 VPN made from both NEAs to the Cloud Gateway
  • The on-prem environment with 10gbit internet uplink
  • Service provider with multiple 10gbit internet uplinks
  • 4ms between on-prem and service provider

Tests:

So when a replica VM has been failed over and the NEA L2 tunnel is running, what should you expect? Veeam does not give you any info on the performance of the NEA, and Veeam support is not able to hand out a performance chart either. So here are the results from ping and iperf tests.

Test 1 – ping latency over the L2 tunnel:

Ping to 185.177.120.140 shows the latency over the internet; ping to 192.168.12.151 shows the latency over the L2 VPN.

So from a latency perspective it looks fine, only adding about 1 ms on top of the internet latency.

Test 2 – iperf over tunnel

iperf test from a VM on the service provider side to an on-prem server.

About 110 Mbit/s, which is not great considering the internet links are capable of 10 Gbit/s.

The same iperf test from the service provider side to an on-prem server, this time with -P 8 for eight parallel streams.

Eight streams do not give any additional bandwidth.
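For reference, the tests above boil down to iperf runs along the lines below. This is a sketch: the target address is the L2-side address from the ping test, and the flags are the same whether you run classic iperf or iperf3.

# On the on-prem server (listening end)
iperf3 -s
# From the VM on the service provider side, single stream
iperf3 -c 192.168.12.151 -t 30
# Same test with eight parallel streams
iperf3 -c 192.168.12.151 -P 8 -t 30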

Test 3 – Multiple VLANs with multiple NEAs

It’s always interesting to find out where the bottleneck is. Since iperf directly over the internet gives a completely different result, the limit must be within the NEA. When I tried multiple VLAN bridges to the cloud resources in Cloud Director I got the same result per NEA, meaning it is something in the NEA or its components that creates the bottleneck.

The good news is of course that you see the same result per NEA, even when running the iperf test against the same target at the other end. So NEAs scale linearly.

A look from the Veeam Cloud Connect gateway, which brokers the L2 VPN connections.
View from the iperf server, showing the connections from the servers at the other end of the L2 VPN.

Conclusion

The NEA is a very helpful solution, especially for large migrations where L2 connectivity between datacenters is required while migrating. Bandwidth through this solution is not great, but I would say it is OK; the L2 connection should only be used for a short period while actually migrating.

In numbers, it seems the NEA adds about 1 ms to the latency seen over the internet between the two environments, and bandwidth lands between 110 and 140 Mbit per second.

Manual mount VMFS datastore

Have a datastore that shows “Not Consumed”? From time to time I stumble across them, and from what I have found there is really only one way around it: manually mounting the datastore from the shell of the ESXi host.

Not sure what the root cause for it is, but if you know, then please let me know 🙂

Workaround

What we need to do is list the UUIDs of the unmounted VMFS volumes on the block devices and afterwards mount the datastore using its UUID.

### List all available datastores that are not mounted
esxcfg-volume -l
### Mount a specific datastore with the UUID found with -l
esxcfg-volume -M <UUID>
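If you would rather stay in PowerCLI than SSH to each host, the same operation is exposed through the esxcli storage vmfs snapshot namespace. A hedged sketch: the host name and volume label are placeholders, and the argument names come from CreateArgs() and may differ slightly between ESXi versions.

# Get an esxcli (v2) handle for the host that should mount the volume
$esxcli = Get-EsxCli -VMHost "esx01.example.com" -V2

# List unresolved/unmounted VMFS volumes (the equivalent of 'esxcfg-volume -l')
$esxcli.storage.vmfs.snapshot.list.Invoke()

# Mount one of them persistently by its volume label
$mountArgs = $esxcli.storage.vmfs.snapshot.mount.CreateArgs()
$mountArgs.volumelabel = "MyDatastore"
$esxcli.storage.vmfs.snapshot.mount.Invoke($mountArgs)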

Conclusion

After mounting with esxcfg-volume -M, the volume should stay mounted permanently. Hope it works for you too.