Veeam – Network Extention Appliance performance

This post will give a brief write-up on what to expect from a network perspective when using the Veeam Network Extention.

Since you found this post I don’t think an introduction is needed. But anyway. A quick write-up of the network so you can visualize how the test is performed.

  • Greenline indicated the L2 VPN made from both NEA to CloudGateway
  • The on-prem environment with 10gbit internet uplink
  • Service provider with multiple 10gbit internet uplinks
  • 4ms between on-prem and service provider

Tests:

So when a replica VM has been failover and the NEA L2 is running. What to expect? Veeam does not give you any info on the performance of the NEA. Veeam support is not either able to give out a performance chart. So here a the results from ping and iperf test.

Test 1 – ping of latency over the L2 tunnel:

Ping to 185.177.120.140 showing the latency over the internet. Ping to 192.168.12.151 showing the latency over the L2 VPN.

So from a latency perspective, it seems good. Only adding 1ms to the internet latency. Which is pretty good.

Test 2 – iperf over tunnel

iperf test from a VM in service provider side to an on-prem server.

About 110Mbit, not very good compared to the internet being able of doing 10Gbit.

iperf test from VM in service provider side to an on-prem server. This time with -P 8 for 8 parallel threads.

8 threads are not giving any further bandwidth.

Test 3 – Multiple VLAN with multiple NEA

It’s always interesting to find where the bottleneck could be. Since iperf over the internet is giving a completely different result. then it must be within NEA. When I tried to do multiple VLAN bridges to the cloud resources in Cloud Director I get the same results pr NEA. Meaning it could be something in NEA or its components making the bottleneck.

The good news is off cause that you will see the same result pr NEA even when doing iperf test to the same target in the other end. So NEA will scale linearly.

A look from the Veeam Cloud Connect gateway being the broker of the L2 VPN connections.
View from the iperf server – showing the connections from the servers in the other end of L2 VPN.

Conclusion

NEA is a very helpful solution, especially when it comes to large migrations where L2 between datacenters is required meanwhile migrating. Bandwidth using this solution is not great, but I would say is ok. L2 connection should only be used shortly when doing actually migrations.

In numbers, it seems NEA will add +1ms to the latency seen over the internet between the two environments. Bandwith is between 110 to 140mbit pr sec.

Manual mount VMFS datastore

Have a datastore that shows Not Consumed? From time to time I stumble across them and from what I have found there is really only one way to get around it, manually mount the datastores from the shell of the ESXi host.

Not sure what the root cause for it is, but if you know, then please let me know 🙂

Workaround

What we need to do is have the partitions UUID’s on the block device listed and afterwards mount the datastore with that UUID.

### Listing all available datastores that is not mounted
esxcfg-volume –l
### Mount a specefic datastore with the UUID found with -l 
esxcfg-volume –M <UUID>

Conclusion

After mounting with esxcfg-volume it should be mounted permanatly. Hope it works for you to.

Change VM MoRef in VBR database

This information can be found in many other places on the big internet, but since I can never find it myself, I will make a post more about the procedure.

When you switch ESXi host, vCenter, or remove and add from inventory your VM will get a new ID. In the world of VMware, it’s called MoRef ID.

When this happens Veeam will lose its coupling to the VM and backup will fail with:
– Virtual Machine <> is unavailable and will be skipped from processing.
– Nothing to process. All machines were excluded from task list.

How to verify there is a MoRef mismatch:

From a VMware perspective it’s easy:

connect-viserver <vcenter> -Credential $cred
Get-VM | select name, id

This will give you something like:

PS C:\Windows\system32> Get-VM | select name, id
Name Id
---- --
VirtualMachine-vm-71326

From Veeam perspective it’s a bit harder since you will need to query the MS SQL database that Veeam uses. So download the SQL Studio Manager from Microsoft.

Open the SQL Studio Manager as administrator on the server to gain access to the Veeam database. You can use the following query to find the MoRef that is in the Veeam database:

SELECT [dbo].[BObjects].id, [dbo].[BObjects].object_id, [dbo].[BObjects].host_id, [dbo].[BObjectsSensitiveInfo].object_name, [dbo].[BObjectsSensitiveInfo].path
FROM [dbo].[BObjects]
INNER JOIN [dbo].[BObjectsSensitiveInfo] ON [dbo].[BObjectsSensitiveInfo].bObject_id=[dbo].[BObjects].id  
WHERE object_name = '<vmname>'

Verify:

So we can now see that the VM in VMware has MoRef “vm-71326”. But Veeam database has “vm-992”. From here on you know what’s wrong and you need to open a Veeam support case to get the supported procedure.

If you don’t care about supported procedures you can update the database with VMware VM new MoRef ID and your VBR job should be running again. The SQL query would look like this:

UPDATE [dbo].[BOobjects]
SET [object_id] = 'new-id'
WHERE [object_id] = 'old-id'

Conclusion

It’s not that had to change the MoRef in the VBR database. But remember, if you care about having a supported installation. Then you need to create a Veeam support case and have them help you. Something could have changed in the VBR database schema since this post.

CMD – Massdelete files

Not sure what happened, but thousands of small files piled up in the path below.

C:\ProgramData\Microsoft\Crypto\RSA\S-1-5-8

My google-fu wasn’t sufficient enough to find out what files in that folder actually did. An article from Kofax said that just deleting files could be troublesome. But also found another article that stated that he deleted all files older than 30 days and hasn’t had problems yet. So I dare to do the same. The command was found on a blog, so all credit goes to this guy.

forfiles /D -30 /C "cmd /C attrib -s @file & echo @file & del @file"

forfiles is a very nice command that iterates through the files in a folder according to its parameters. /D -30 iterates through all files more than 10 days old. attrib -s takes off the System attribute, which is needed for DEL to work. The echo is there so you can see that it is doing its job.

Extend disk with LVM

Here is a quick walkthrough showing you how to expand an LVM volume or partition in Linux by first resizing logical volume followed by resizing the file system to take advantage of the additional space.

Note: In this example, we are working in Ubuntu, some commands may differ in different Linux distributions.

DISCLAIMER: Make sure that you have proper backups in place before starting out with the resize procedure on your VMs! Create any backups necessary to ensure that if something goes wrong you can always go back to a previous working state. Losing any information or wiping out your disks is all your fault if this happens. Using this procedure is all at your own responsibility.

If you are unsure about LVM and its components I will suggest you read a bit upon it. There are tons of articles, for example on Digital Ocean.

Process

This process can be easy to do with LVM as it can be done on the fly with no downtime needed, you can perform it on a mounted volume without interruption. In order to increase the size of a logical volume, the volume group that it is in must-have free space available. It goes as follows.

  • Add more space from the hardware level. Either raid controller, or hypervisor.
  • Resize your partition to contain the extra space
  • Resize the PV in LVM
  • Expand the LV in LVM
  • Resize filesystem

To view the free space of your volume group, run pgdisplay command as shown below and look at the “Free PE / Size” field, in this case non-free.

root@www:~# pvdisplay  
--- Physical volume ---
  PV Name               /dev/sdb1
  VG Name               server1-vg
  PV Size               99.00 GiB / not usable 3.00 MiB
  Allocatable           yes (but full)
  PE Size               4.00 MiB
  Total PE              25343
  Free PE               0
  Allocated PE          25343
  PV UUID               cgPMYF-PkeW-1iaS-FxUZ-Ky9r-Zoa8-ktpEk5

Since we have added more space from the hardware level need to grow the partition. This is done by deleting it and create a new one that is starting from the same sectors but where the new partition uses more sectors than before. In the output beneath we can see how to extend the partition.

root@www:~# fdisk -l /dev/sdb
Disk /dev/sdb: 100 GiB, 107374182400 bytes, 209715200 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xb97da1a0

Device     Boot Start       End   Sectors Size Id Type
/dev/sdb1        2048 207620095 207618048  99G 83 Linux
root@www:~# fdisk /dev/sdb

Welcome to fdisk (util-linux 2.27.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Command (m for help): d
Selected partition 1
Partition 1 has been deleted.

Command (m for help): n
Partition type
   p   primary (0 primary, 0 extended, 4 free)
   e   extended (container for logical partitions)
Select (default p): p
Partition number (1-4, default 1): 1
First sector (2048-209715199, default 2048):
Last sector, +sectors or +size{K,M,G,T,P} (2048-209715199, default 209715199): +99.9G

Created a new partition 1 of type 'Linux' and of size 99.9 GiB.

Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Re-reading the partition table failed.: Device or resource busy

The kernel still uses the old table. The new table will be used at the next reboot or after you run partprobe(8) or kpartx(8).

root@www:~# partprobe

The disk partition is now extended. We need to inform LVM to grow its PV

root@www:~# pvresize /dev/sdb1
  Physical volume "/dev/sdb1" changed
  1 physical volume(s) resized / 0 physical volume(s) not resized

Now we can verify that there is available space to grow LV.

root@www:~# pvdisplay
  --- Physical volume ---
  PV Name               /dev/sdb1
  VG Name               server1-vg
  PV Size               99.88 GiB / not usable 3.00 MiB
  Allocatable           yes
  PE Size               4.00 MiB
  Total PE              25568
  Free PE               225
  Allocated PE          25343
  PV UUID               cgPMYF-PkeW-1iaS-FxUZ-Ky9r-Zoa8-ktpEk5

Lastly, we grow the LV and extend the filesystem.

root@www:~# lvextend -l +100%FREE /dev/mapper/server1--vg-root
  Size of logical volume server1-vg/root changed from 196.51 GiB (50307 extents) to 197.39 GiB (50532 extents).
  Logical volume root successfully resized.

root@www:~# resize2fs /dev/mapper/server1--vg-root
resize2fs 1.42.13 (17-May-2015)
Filesystem at /dev/mapper/server1--vg-root is mounted on /; on-line resizing required
old_desc_blocks = 13, new_desc_blocks = 13
The filesystem on /dev/mapper/server1--vg-root is now 51744768 (4k) blocks long.

Conclusion

We have now successfully expanded a file system and corresponding LVM logical volume without any downtime. This was done by first expanding the partition of the disk, then the logical volume and finally performing an online resize of the file system.

No servers were harmed doing this procedure 🙂

Cisco ASA HA member swap

The old Intel C2000 bug is still among us. Not all devices have been affected, yet. For me, this is the first. So here is my 2 cent on how to remediate the cluster with an RMA unit.

https://www.cisco.com/c/en/us/support/docs/field-notices/642/fn64228.html

Process:

  1. Label all cables from the faulty unit, this is for you to not worry that any cables are mislocated when you are swapping the unit. I also disabled the links for the switch so that I with a calm can plug them in again without fearing that something will go wrong. Better safe than sorry.
  2. Make sure the firmware level on the new device is the same as the existing member. Firmware up/downgrade is needed.
  3. Backup the config of the HA member still running. Just in case something bad happens.
  4. Give an IP on the HA link interface, from this point it should find the existing member and start replication the config.

Firmware upgrade:

I have booted the unit and linked it only with a console cable. I can now see the firmware version and it needs an upgrade.

ciscoasa> en
Password:
ciscoasa#
ciscoasa# sh version

Cisco Adaptive Security Appliance Software Version 9.8(2)20
Firepower Extensible Operating System Version 2.2(2.63)
Device Manager Version 7.5(1)

Compiled on Fri 02-Feb-18 06:10 PST by builders
System image file is "disk0:/asa982-20-lfbff-k8.SPA"
Config file at boot was "startup-config"

ciscoasa up 30 secs

Hardware:   ASA5516, 8192 MB RAM, CPU Atom C2000 series 2416 MHz, 1 CPU (8 cores                                            )
Internal ATA Compact Flash, 8000MB
BIOS Flash M25P64 @ 0xfed01000, 16384KB

I have prepared a USB disk format in FAT and downloaded the matching firmware from cisco.com. When you plug in the USB key to the ASA you should now see a disk1 where you can copy from. If you want to see disk1 content issue command “show disk1”.

We now copy over the files and make the system boot the new firmware.

ciscoasa# copy disk1:/asa9-13-1-lfbff-k8.SPA disk0:/asa9-13-1-lfbff-k8.SPA

Source filename [asa9-13-1-lfbff-k8.SPA]?

Destination filename [asa9-13-1-lfbff-k8.SPA]?

Copy in progress...CCCCC
Verifying file disk0:/asa9-13-1-lfbff-k8.SPA...
Computed Hash   SHA2: 80500c1790c76e90dde61488c3f977b8
                      69711278b6e550eeb8ea8830e19c4a23
                      8cf03fe64d1d9927d4a78e77b6090234
                      98485fbf9bc058eb3820b32e7a56f91f

Embedded Hash   SHA2: 80500c1790c76e90dde61488c3f977b8
                      69711278b6e550eeb8ea8830e19c4a23
                      8cf03fe64d1d9927d4a78e77b6090234
                      98485fbf9bc058eb3820b32e7a56f91f


Digital signature successfully validated

Writing file disk0:/asa9-13-1-lfbff-k8.SPA...

107543456 bytes copied in 26.40 secs (4136286 bytes/sec)
ciscoasa# config t
ciscoasa(config)# boot system disk0:/asa9-13-1-lfbff-k8.SPA
ciscoasa(config)# wr mem
ciscoasa(config)# reload

After reload, the system is now up and I can confirm that it has booted on the new firmware.

ciscoasa> show version

Cisco Adaptive Security Appliance Software Version 9.13(1)
SSP Operating System Version 2.7(1.107)
Device Manager Version 7.5(1)

Compiled on Mon 23-Sep-19 09:28 PDT by builders
System image file is "disk0:/asa9-13-1-lfbff-k8.SPA"
Config file at boot was "startup-config"

ciscoasa up 29 secs

Hardware:   ASA5516, 8192 MB RAM, CPU Atom C2000 series 2416 MHz, 1 CPU (8 cores)
Internal ATA Compact Flash, 8000MB
BIOS Flash M25P64 @ 0xfed01000, 16384KB

Joining the HA cluster

We now verified that the two ASA firewalls are on the correct firmware level. Now connect all the cables to the firewall, on the switch side all data links are administratively down, the HA link between the two ASA is a dedicated link. And those are the links we are now going to configure.

You can grab the lines from the existing member that are actively running. If you don’t have the failover key, you can also reset this on on the primary/existing member.

failover lan unit secondary
failover lan interface HA_FAILOVERLINK GigabitEthernet1/7
failover key ***
failover link HA_STATELINK GigabitEthernet1/8
failover interface ip HA_FAILOVERLINK 172.16.254.1 255.255.255.0 standby 172.16.254.2
failover interface ip HA_STATELINK 172.16.255.1 255.255.255.0 standby 172.16.255.2

The new ASA is now ready to contact the primary member of the cluster and start the replication. In my case, the interfaces for HA were administratively down. So we are now going to enable the link and enable failover.

interface GigabitEthernet 1/7
no shut
interface GigabitEthernet 1/8
no shut
failover

If something in the config is not ok, missing files or other is listed and you have to remediate this before you again can try to enable HA with the “failover” command. In the output beneath you can see what happens when the failover command is enabled with success.

Detected an Active mate
Beginning configuration replication from mate.
End configuration replication from mate.

You can now check failover status and see if the standby member is in ready mode. if not try giving the standby a reload.

failover reload-standby

If the standby member is now in a ready state you are now ready to do a live failover. Remember to enable the ports again on the switch side.

Conclusion

The process is not so bad as I thought. And there where no downtime involved. For me, I was missing ASDM and AnyConnect packages on the new standby node. I downloaded it from the existing primary node and then copied it to a USB disk. When the USB disk is plugged into the standby ASA I can then copy the files over the ASA flash.

copy /noconfirm disk1:/anyconnect-win-4.8.01090-webdeploy-k9.pkg disk0:/anyconnect-win-4.8.01090-webdeploy-k9.pkg
copy /noconfirm disk1:/anyconnect-macos-4.8.01090-webdeploy-k9.pkg disk0:/anyconnect-macos-4.8.01090-webdeploy-k9.pkg
copy /noconfirm disk1:/VPN_client_profile.xml disk0:/VPN_client_profile.xml
copy /noconfirm disk1:/anyconnect-linux64-4.8.01090-webdeploy-k9.pkg disk0:/anyconnect-linux64-4.8.01090-webdeploy-k9.pkg
copy /noconfirm disk1:/Management_client_profile.xml disk0:/Management_client_profile.xml
copy /noconfirm disk1:/asdm-7131.bin disk0:/asdm-7131.bin

From there on I could do a live failover and see the little “Active” light change on the physical ASA firewalls. With all traffic flowing uninterrupted. Mission accomplished.

macOS TFTP server

I have never really found a good TFTP for macOS. Is it funny that macOS is much used by network people but there isn’t a decent TFTP server?

Well, there is. macOS has it built-in, no GUI though. But that’s also fine, as long as you know to use it. It’s disabled by default, but you can start and stop it with the following commands.

### Start TFTP
sudo launchctl load -F /System/Library/LaunchDaemons/tftp.plist

### Stop TFTP
sudo launchctl unload -F /System/Library/LaunchDaemons/tftp.plist

### Check if its running (no process means it not running)
netstat -atp UDP | grep tftp

The TFTP daemon uses the /private/tftpboot folder so we are going to copy the file there. Then set the correct permissions on the file.

### Copy file to tftp folder
cp FILENAME /private/tftpboot
### Set permissions for the folder and files within
chmod -R 766 /private/tftpboot

There is a gotcha with the TFTP daemon, which is you cant copy a file to the TFTP daemon if that file does not already exist there.  To work around it you can just create a file and set the permission for it. Then your devices will just send data into the pre-created file.

### Create the file
touch /private/tftpboot/FILENAME
### Set permissions
chmod -R 766 /private/tftpboot

Shrink VMDK disk

I have always thought that VMDK could only grow, so that has also been my default response to colleagues when they expanded a disk too much. Sure a storage vMotion could reclaim unused space in a thin disk, but the “down arrow” for storage capacity would never work. But then someone mentioned that he had done shrinking of disks a couple of times, I decided to investigate.

The official VMware kb isn’t too much help – somewhere discussing it on StackOverflow. But then I found an older post back from 2016 that seems to have found the approach so that’s what we are going to test out.

Disclaimer:

This is not supported in any way, use at your own responsibility. If you want a supported solution, then VMware converter in a v2v manner is kind of the only way. If you still want to try out the method, then be sure to have a valid backup! And by backup, it’s not a VMware snapshot.

Not supported:

From the VMware documentation, it seems shrinking disk is not allowed under the following circumstances:

  • The virtual machine is hosted on an ESX/ESXi server.ESX/ESXi Server can shrink the size of a virtual disk only when a virtual machine is exported. The space occupied by the virtual disk on the ESX/ESXi server, however, does not change.
  • The virtual machine has a Mac guest operating system.
  • You preallocated all the disk space to the virtual disk when you created it.
  • The virtual machine contains a snapshot.
  • The virtual machine is a linked clone or the parent of a linked clone.
  • The virtual disk is an independent disk in nonpersistent mode.
  • The file system is a journaling file system, such as an ext4, xfs, or jfs file system.

The test scenario:

I have a windows 2019 VM, here is the process I want to try out

  1. Expand VMDK disk in vCenter
  2. Extent disk in VM guest using diskpart
  3. Shrink disk in VM guest using diskpart
  4. calculate new sector size
  5. edit VM *.vmdk with the newly calculated sector size
  6. Storage migrate to other datastore
  7. Check if VM is still ok.

Walkthrough:

We start off with the VM. Its Windows 2019, original size is 40GB.
Disk is now extended with 5gb.
With a view from the esxi we can see the disk is also showing 45GB.
inside “win2019.vmdk” we can see the “extent description”. This is the number we have to change after the guest os filesystem has been shrunk.
Here we see the disk has been extended to 45GB and then shrunk down with 10GB.

Calculating the “extent description”:

So there is now 10GB free space we can shrink the VMDK with.

A virtual disk described as monolithic and flat consists of two files. One file contains the descriptor. The other file is the extent used to store virtual machine data.

Considering our existing extent
RW 94371840 VMFS “win2019-flat.vmdk”
This means that the file win2019-flat.vmdk is 94371840 sectors × 512 bytes/sector = 48318382080 bytes = 48318MB in size.

Let’s calculate the new value from GB to sectors.

36GB x 1024(mb) x 1024(kb) x 1024(byte) / 512byte pr sector = 75.497.472

before proceeding, we need to power off the VM. The .vmdk file is loaded into memory, so even if we can edit it now and start storage vMotion our changed value will just change back.

Letting vMotion do its magic
And after the boot of VM the disk is now shrunk. And we still have a working guest os.

Conclusion

It worked, we were able to add more space to the VM, extent, and shrink the guest os filesystem. We then calculated the number of sectors for the .vmdk file and storage vMotion did its magic and made the VMDK smaller in physical size.

I have also tried this in a couple of cases, also real life senairoes where people have added 4TB to much…. Then its sometimes easier to shrink than having to move files around.

VCD – Find free external IPs

Finding free public IPs in Cloud Director backed by NSX-V is not as easy as it should be. Some people will tell you to ping the scope and see what’s responding. But pinging is not reliable was of finding free IPs. Not every device is responding to ICMP messages.

Somewhere along the line, I found a guy on the VMware forum posting a script for finding available IPs in Cloud Director using the PowerCLI module for querying VCD and getting back IPs that are not allocated by an Edge. I have been using the script quite a bit since. His blog is not available today, but the code is still on the forum.

Now it’s also available here on the site with a bit more explanation on how to connect and use the function. I have been using it with NSX-V as backend, haven’t tried it with NSX-T at the network backend yet.

 ### Install PowerCLI module
Install-Module -Name VMware-vCD-Module

### Import PowerCLI Cloud module
Import-Module -Name VMware.VimAutomation.Cloud

### Connect to Cloud Director instance with your credentials
Connect-CIServer -server <VCD_URL>

Function Get-FreeExtIPAddress([String]$extnetName){
    function  Convertto-IPINT64  () { 
    param ($ip) 
    
    $octets = $ip.split(".") 
    return [int64]([int64]$octets[0]*16777216 +[int64]$octets[1]*65536 +[int64]$octets[2]*256 +[int64]$octets[3]) 
    } 
    
    function  Convertto-INT64IP() { 
    param ([int64]$int) 
    
    return (([math]::truncate($int/16777216)).tostring()+"."+([math]::truncate(($int%16777216)/65536)).tostring()+"."+([math]::truncate(($int%65536)/256)).tostring()+"."+([math]::truncate($int%256)).tostring() )
    } 
    $extnet = Get-ExternalNetwork -name $extnetName
    $ExtNetView = $Extnet | Get-CIView
    $allocatedGatewayIPs = $extnetView.Configuration.IpScopes.IpScope[0].SubAllocations.SubAllocation.IpRanges.IpRange | ForEach-Object {
        $startaddr = Convertto-IPINT64 -ip $_.StartAddress
        $endaddr = Convertto-IPINT64 -ip $_.EndAddress
        for ($i = $startaddr; $i -le $endaddr; $i++) 
        { 
            Convertto-INT64IP -int $i 
        }
    }
    [int]$ThirdStartingIP = [System.Convert]::ToInt32($extnet.StaticIPPool[0].FirstAddress.IPAddressToString.Split(".")[2],10)
    [int]$ThirdEndingIP = [System.Convert]::ToInt32($extnet.StaticIPPool[0].LastAddress.IPAddressToString.Split(".")[2],10)
    [int]$FourthStartingIP = [System.Convert]::ToInt32($extnet.StaticIPPool[0].FirstAddress.IPAddressToString.Split(".")[3],10)
    [int]$FourthEndingIP = [System.Convert]::ToInt32($extnet.StaticIPPool[0].LastAddress.IPAddressToString.Split(".")[3],10)
    $octet = $extnet.StaticIPPool[0].FirstAddress.IPAddressToString.split(".")
    $3Octet = ($octet[0]+"."+$octet[1]+"."+$octet[2])
    $2Octet = ($octet[0]+"."+$octet[1])
    $ips = @()
    if ($ThirdStartingIP -eq $ThirdEndingIP) {
        $ips = $FourthStartingIP..$FourthEndingIP | % {$3Octet+'.'+$_}
    } else {
        do {
            for ($i=$FourthStartingIP; $i -le 255; $i++) {
                $ips += ($2Octet + "." + $ThirdStartingIP + "." + $i)
            }
            $ThirdStartingIP=$ThirdStartingIP + 1
        } while ($ThirdEndingIP -ne $ThirdStartingIP)
        for ($i=0;$i -le $FourthEndingIP; $i++) {
            $ips += ($2Octet + "." + $ThirdStartingIP + "." + $i)
        }
    }
        $allocatedIPs = $ExtNetView.Configuration.IpScopes.IpScope[0].AllocatedIpAddresses.IpAddress
    for ($i=0;$i -le $ips.count; $i++) {
        for ($j=0; $j -lt $allocatedGatewayIPs.count; $j++) {
            if ($ips[$i] -eq $allocatedGatewayIPs[$j]) {
                $ips = $ips | Where-Object { $_ -ne $ips[$i] }
                $i--
            }
        }
        for($z=0;$z -lt $allocatedIPs.count;$z++) {
            if ($ips[$i] -eq $allocatedIPs[$z]) {
                $ips = $ips | Where-Object { $_ -ne $ips[$i] }
                $i--
            }
        }
    }
    return $Ips
}

### Find the names of external networks
Get-ExternalNetwork

### Find free IPs using function from above
Get-FreeExtIPAddress -extnetName <vcd-net>

You can help yourself by copy and pasting the code snip into either PowerShell ISE or VisualCode. And since you need to install a cmdlet you need to run it with elevated rights. If you get a red message with importing the module it’s probably because of execution rights, you then need to run to command beneath. This is for allowing remote signed cmdlets to be executed.

Set-ExecutionPolicy RemoteSigned

Getting the names of the external networks with “get-externalnetwork”
Using the function to find available IPs in the selected external network

Change Veeam job repository

Having to move jobs to another repository is something that can be time-consuming. If you have to do it with the existing job you will need to disable the job, move the data to the new location, point the job to the new location and then enable the job again.

I found the other approach to creating new jobs easier. You can do it in the GUI, but when having 200+ jobs it could take some time. Instead, I did a small script to list all jobs located on the old repo.

The script also looks a the retention point and writes into the old job that is can be deleted after x days. If you have very long retention then this way is not feasible and you will probably have to move data and point exing job to the new location.

Hope somebody else can use it 🙂

Add-PSSnapin -Name VeeamPSSnapIn
Connect-VBRServer -Server <VBRSERVER>

$Jobs = Get-VBRJob | where {$_.IsScheduleEnabled -eq $true} | where {$_.FindTargetRepository().name -eq "dc2sveeamrepo01-scaleout"} | where {$_.IsRunning -eq $false}

# Select first 15 jobs and list them afterwards.
$remainingJobs = $Jobs | select -first 40
$remainingJobs | select name

foreach ($job in $remainingJobs)
{
$oldjob = "$($job.Name)delete after $($job.BackupStorageOptions.RetainCycles) days"
$job.Info.CommonInfo.name = "$oldjob"
$Job.update()
Disable-VBRJob -Job $oldjob

$jobName = "$($job.Name.Split("_")[0])"

Copy-VBRJob -job $oldjob -name $jobName  -Repository "dc2sveeamrepo02-scaleout" 

Enable-VBRJob -job $jobName

sleep 5

start-vbrjob -Job $jobName -RunAsync 

write-host "$jobName : have been cloned and is now started...." -ForegroundColor Green
}
 

NSX API – DLR L2 bridging

Here is a script for mass DLR L2 bridge creation. I had to bridge a couple of hundred VLAN to VXLAN, and while it was maybe faster to create it by hand I would not have learned anything.

The script is reading from a CSV file where I have all my info. Then loops through the entries and create a distributed port group and then initiates an L2 bridge. The VXLAN had been created post to this operation.

$csv = Import-Csv "D:\temp\VLAN.csv" -Delimiter ";"
Import-Module PowerNSX
get-module -name vmware* -ListAvailable | Import-Module

$cred = get-credential
connect-viserver -server -Credential $cred

foreach ($net in $csv) {
    $vdportgroup = ("zitmit-$($net.acl)").ToLower()

    $exists = Get-VDSwitch -Name "DSMpls01-EX" | Get-VDPortgroup -Name $vdportgroup -ErrorAction SilentlyContinue
    if (!$exists) {
        Get-VDSwitch -Name "DSMpls01-EX" | New-VDPortgroup -Name $vdportgroup -VLanId $net.mitvlan -NumPorts 2
        $created = Get-VDSwitch -Name "DSMpls01-EX" | Get-VDPortgroup -Name "zitmit-acl-10344"
        if (!created) {
            Write-Host -ForegroundColor Green "Portgroup created: $vdportgroup"

            $vdportgroupId = ($created.Id).Replace("DistributedVirtualPortgroup-","")
            $vdportgrpupName = $created.Name

            create-nsxl2bridge -aclname $($net.acl) -dvportGroup $($created.key)
        }
    }
    else {
        Write-Host -ForegroundColor Yellow "Portgroup have allready been created: $vdportgroup"
        #Get-VDSwitch -Name "DSMpls01-EX" | New-VDPortgroup -Name $vdportgroup -VLanId $net.mitvlan -NumPorts 2
    }
}

Function create-nsxl2bridge {
    param(
        [string]$aclname,
        [string]$dvportGroup
    )

    # Login info
    $nsxUsername = 
    $nsxPassword = 

    # Allow all SSL protocols
    $AllProtocols = [System.Net.SecurityProtocolType]'Ssl3,Tls,Tls11,Tls12' 
    [System.Net.ServicePointManager]::SecurityProtocol = $AllProtocols

    # Connect to NSX manager
    $connection = Connect-NsxServer  10.1.70.5 -Username $nsxUsername -Password $nsxPassword -WarningAction SilentlyContinue
    $virtualwire = Get-NsxLogicalSwitch | Where-Object { $_.name -match "$aclname" -and $_.name -notmatch "lan" }

    if ($virtualwire.count -gt 1) {
        $message = "Something could wrong - $aclname"
        write-host $message -ForegroundColor yellow
        $message | Out-File C:\log\create-nsxl2bridge.txt -Append
        $virtualwire = $virtualwire[0]
    }
    elseif (!$virtualwire) {
        $message = "virtualwire was not found: $($virtualwire.objectId) - acl: $aclname"
        write-host $message -ForegroundColor yellow
        $message | Out-File C:\log\create-nsxl2bridge.txt -Append
        return
    }

    # Edge info
    $edgeId = "edge-1120"
    $Type = "Accept: application/xml"
    $Header = @{"Authorization" = "Basic " + [System.Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes($nsxUsername + ":" + $nsxPassword)) }
    $nsxUri = "https://10.1.0.4/api/4.0/edges/$edgeId/bridging/config"

    # Getting edge config
    $currentL2Config = $null
    $currentL2Config = Invoke-RestMethod -Uri $nsxUri -Headers $Header -Method GET -ContentType $Type

    # Check if already there
    foreach ($z in $currentL2Config.SelectNodes("//name"))
    {
        if ($z.'#text' -match $aclname ) {
            write-host "Already exists: $aclname" -ForegroundColor yellow
            return
        }
    }

    # Add extra xml node to currentconfig
    $handler1 = $null
    $handler1 = $currentL2Config.CreateNode('element', "bridge", '')
    $attr = $currentL2Config.CreateNode('element', "bridgeId", '')
    $attr.InnerText = "$null";
    $handler1.AppendChild($attr)
    $attr = $currentL2Config.CreateNode('element', "name", '')
    $attr.InnerText = "$aclname";
    $handler1.AppendChild($attr)
    $attr = $currentL2Config.CreateNode('element', "virtualWire", '')
    $attr.InnerText = "$($virtualwire.objectId)";
    $handler1.AppendChild($attr)
    $attr = $currentL2Config.CreateNode('element', "dvportGroup", '')
    $attr.InnerText = "$dvportGroup";
    $handler1.AppendChild($attr)
    
    # Remove nodes from existing XML
    $currentL2Config.SelectNodes("//virtualWireName") | ForEach-Object { $_.ParentNode.RemoveChild($_) }
    $currentL2Config.SelectNodes("//isSharedNetwork") | ForEach-Object { $_.ParentNode.RemoveChild($_) }
    $currentL2Config.SelectNodes("//dvportGroupName") | ForEach-Object { $_.ParentNode.RemoveChild($_) }

    # Add the newly created node to existing XML
    $currentL2Config.bridges.AppendChild($handler1)

    # PUT edge config
    $respons = Invoke-RestMethod -Uri $nsxUri -Headers $Header -Method PUT -ContentType 'application/xml' -Body $currentL2Config
    write-host "L2 Created: $($virtualwire.objectId) - acl: $aclname" -ForegroundColor Green
}

VCD – remove org and its items

Having a nice VRO job to create the VCD tenants with its VDCS, Edges and networks are nice. When having to clean up after testing its a pain to click through the GUI to first remote networks, then edges, then disable VDC, delete it, and final delete the org. A bit of PowerShell fu can help with the task, this is a quick and dirty script set of commands, but it works as intended.

get-module VMware.VimAutomation.Cloud | Import-Module

$ciCred = Get-Credential
Connect-CIServer -Server vcd.domain.tld -Credential $ciCred

$org = get-org deletemeorg
$orgvdc = $org | get-orgvdc
### Remove Org networks
$orgvdc | Get-OrgvdcNetwork | Remove-OrgVdcNetwork 
### Remove edges
($orgvdc | get-edgegateway | Get-CIView).delete() 
### Remove VDC
$orgvdc | Set-OrgVdc -Enabled $false | remove-orgvdc
### Remove Org
$org | set-org -Enabled $false | remove-org

Unlock ESXi root account

I often see people, including myself, lock themselves out of the ESXi web-based host client. It only locks you out from ssh and the web console. Password lockout is NOT active on the console/DCUI. Below is how you reset the counter and regain access.

Procedure to unlock the ESXi root

First, you need to gain ILO/IMM/IPMI or physical access to the server.

  1. At the console, press ALT+F1 to get to the ESXi shell. If a login shows up continue with step 3, otherwise continue with step 2. Change back to the login screen with ALT+F2.
  2. Login to the DCUI (to enable the ESXi Shell if not already done)
    1. Login with root and the correct password.
    2. Go to Troubleshooting Options
    3. Select Enable ESXi Shell
    4. Press CTRL+ALT+F1
  3. At the ESXi shell login with root and the password
  4. Run the following command to unlock the root account:
pam_tally2 --user root --reset

Veeam repository with local account

For the matter of security, I consider it a good idea to isolate the Veeam repository server from Active Directory. So that a compromised domain admin account or other can not gain access to the repository.

But when wanting to do add the repository to the VBR its failing and saying “Access Denied”.

Alright, a bit of googling and found a short and precise article from another guy having solved this problem.

What was the solution?

Open regedit on the repository server and navigate to following

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System

Here you add a DWORD with the name of “LocalAccountTokenFilterPolicy” and value of “1”. This fixes the problem and without rebooting.

### The PowerShell way
if((Test-Path 'HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System')){New-ItemProperty -Path 'HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System' -Name 'LocalAccountTokenFilterPolicy' -Value '1' -PropertyType DWORD} 

Now you can add the repository server to the VBR. I always forgot where to find the info for the reg hack, so now it’s here so future Jesper can find it 🙂

Using dd – disk cloning

dd (disk duplication) is a utility that can read raw data of a disk, even if the Mac doesn’t understand the filesystem.

I have used it before in p2v a physical to a virtual server. For details take a look at this article.

  • if= specifies input path (file, or device)
  • of= specifies output path (file, or device)
  • bs=n sets both input and output block size (optional, default=512 byte blocks)
  • conv=noerror,sync tells dd to be fault-tolerant and ignore read errors (optional)

If the operation stops with an I/O error, trying to salvage all readable data with conv=noerror,sync.
This option can often recover a dead hard drive or an unreadable file, but it does not repair the error.

Make a clone of a disk:

In this case, I was fooling around with a Microsoft StorSimple appliance and wanted to have a backup of it to go back to if I messed it up too badly. This is not the first time I have done it, and sure not the last time, and always forgot the commands, so to future Jesper, here goes.

Open Terminal and try this:

  1. Attach and identify the source disk:
diskutil list

2. if mounted, unmount the source disk:

diskutil unmountDisk /dev/disk2

Copy the source disk to the desktop:

sudo dd if=/dev/disk2 of=~/Desktop/diskimage.img

If you want to save space, make it gzip it with:

sudo dd if=/dev/disk2 | gzip -c > ~/Desktop/diskimage.img.gz

Failed disk backup:

If the copy fails due to disk errors, try using the following command at step 3:

dd if=/dev/disk2 of=~/Desktop/diskimage.img conv=noerror,sync

dd will take much longer using this option, errors are written as null bytes. fsck / chkdsk the disk afterward.

Notes:

  • Using /dev/rdisk# (raw disk) in place of /dev/disk# is much faster
  • You can check progress, while the command is running, by pressing Ctrl-T
  • If you can attach both disks, copy directly: sudo dd if=/dev/disk2 of=/dev/disk3 bs=64k

VHDX Native boot

I stumbled upon the concept of VHDX native boot. Its a rather old feature but very overlooked. When I had a windows laptop I would have loved this feature. Being able to multiboot so you could format your PC with still have the possibility to boot the old installation.

Its fairly simple, you OS is contained within a VHDX on your disk. The boot loader on the disk then has an entry of that VHDX file. Simple but yet powerful. You could have another VHDX for all your data and then a VHDX for OS. Then when booting into each of your environments you would have your data with you.

How to:

If you have a native os install today you can still use this feature. So it’s easy to convert into VHDX native boot. First, we need to do a bit of diskpart, then use dism to install the OS into the VHDX. But this can all be done while your old OS is running and you don’t have to prep USB keys etc.

Diskpart

We will create a new VHDX file with parameters of size and type.

diskpart
create vdisk file=C:\WindowsImages\w10ent64en-gb.vhdx maximum=51200 type=expandable
attach vdisk
create part primary
format quick label="System"
assign letter=W
exit

DISM:

Now we have a VHDX file that Is attached, formated, and has a drive letter. On to install of your OS. You will need your install media, for Windows 10 you normally use index 1, if you are installing Windows server index 1 is probably the server core install if you want GUI then install with index 2.

Afterward, bcdboot creates the boot entry into the boot loader and lastly, you can change the description on the entry in the boot loader, so that you can remember what has been installed.

dism /apply-image /imagefile:E:\sources\install.wim /index:1 /applydir:W:\
bcdboot W:\Windows
bcdedit /set {default} description "Windows 10 Ent EN-GB VHDX"

Conclusion:

You can now reboot and have the choice to boot your newly created VHDX with your fresh installed OS. So your company installed Windows laptop without admin rights, you can now boot into your private install so you have a company and private side of your work PC.

StorSimple 8100 – reuse

I bought a Microsoft StorSimple 8100 unit. The only catch, it did not contain its SSDs and the password for it was unknown. Fair enough.

Its quite an interesting unit, hardware-wise. Its a 2U with 2x750W PSU, 12×3,5 SAS bays, and 2x compute nodes. Each compute node is in fact a Xeratex CS-6000-AB containing:

  • 1* E5-2648L 1,8Ghz 8 core 70Watt CPU
  • 4* 8GB DDR3 memory
  • 1* LSI/Avago/Broadcom 2308 SAS HBA (1*SFF8088 and 1*internal link)
  • 1* Mellanonx ConnectX3 10/40/56Gbit dual QSFP
  • 1* 128GB SSD for OS.

The two compute nodes share a Xeratex HB1235 enclosure with the 12 3,5″ drive bays. This enclosure is used for many other storage vendors as HPE 3PAR or Dell Compellent SANs.

IPMI/BMC enable

Not having a DisplayPort to connect a screen so you can see what is going on is making this a very proprietary piece of hardware. But when having access to the IPMI then all of sudden it becomes easy to reuse the hardware for something different than the StorSimple software.

This is how to enable the IPMI/BMC hardware.

  1. Reseat one of the controllers or power cycle the appliance with a console cable connected
  2. Press Esc to enter the boot options
  3. Select “Setup Utility” from the list
  4. It will prompt for a password (E1aD8wAbMxB3XcpjwVKD)
  5. Go to Advanced tab
  6. Go to  IPMI BMC Configuration
  7. Go to BMC Configuration
  8. Scroll down till you get to the bottom and you will see the network configuration
  9. Select LAN Channel number 1 and static IP source
  10. Enter the IP, subnet, and gateway
  11. Press F10 to save and exit
  12. Log into the BMC with web browser and access the console from there Log in Username: admin  Password: admin
The IPMI/BMC interface of Xeratex CS-6000 node

Now you can open a java based KVM tool to get the display from the node and do what you want. Awesome!

Java….

But there is a small catch, you can’t just open and run the IPMI. The firmware is old and uses encryption algorithms that are not allowed anymore. So you need to change the security properties of your java install and run the IPMI in an Internet Explorer running compatibility mode.

This is quite an easy fix. What I did was to open notepad as administrator, and edit the following file:

C:\Program Files\Java\jre1.8.0_131\lib\security\java.security

find and comment out the line that starts with “jdk.jar.disabledAlgorithms” by prefixing a #. Note that this will allow jar files signed with any algorithms to run, which can is to be considered insecure! But for us a necessary measure for getting access to the IPMI.

StorSimple software

Each compute node is using VHDX native-boot. So the SSD has a boot loader, and then each VHDX is in that boot loader. That means that they can deploy a newer version or factory reset by switching over to another VHDX disk. I was actually not aware of something like VHDX native boot, but its a very nice feature. For sure going to use that on my windows based laptop in the future. So much easier than having to do the native OS install.

The StorSimple software is based on Windows Server 2012R2. You are normally only able to use use the console connection for direct management, but it actually also has an IPMI/BMC feature on each compute node you can look deeper into the system.

Since I did not have the device password the StorSimple software could do nothing. So I got my fingers on PCUnlocker, a password reset tool. Booted through IPMI, where I could attach the VHDX file and have it reset the passwords of the administrator. This account was also disabled, but PCUnlocker did also take care of that part.

Now boot back into the StorSimple software I could now choose an administrator account, type in my new password and now I had access to a cmd. It was using server core install, so no GUI but that’s ok because now I had access to all the other HCS PowerShell cmdlets.

Unfortunately the former owner had also tried to mess around with it, so the factory default VHDX images and the compute node signatures did not match and therefore the “reset-hcsfactorydefault” could not validate the factory default images. Bummer.

Many of the HCS cmdlets where PowerShell cmdlets referring to a DDL, so no way to see what was going on. But the “test-hcsfactoryimage” and reset/initialize scripts where full-blown PowerShell. So from there, I could see what was checked for the VHDX image to validate. I actually did a bypass on the validation, and did the reset command, but after each node had generated a new VHDX from the factory VHDX files I booted but was stuck in the boot state of HCS software.

I found an eagerness to find a way to fix it, but then again the time spent would not payout. You need an Azure subscription to actually manage StorSimple since there is no local GUI, only the serial console. So I decided to install Windows Server 2019 in it instead. 🙂

Conclusion

It’s a nice piece of hardware, StorSimple should have been nice to use if it was not depended on Azure. I now have a 2-node possibility to run an HCI cluster running Storage Spaces and with a failover cluster, presented to each node with CSV volumes. I could run HyperV and have a 2U box with full redundancy. I still feel the eager to fix the StorSimple software but not for now 🙂

FreeBSD – install phpipam

Quick guide on how to install phpipam on FreeBSD. I will assume that you know how to install FreeBSD 🙂

Remember to have a freshly updated server and use sudo instead of directly root access.
### Patch OS
freebsd-update fetch install

### Install all required packages
pkg install nginx php74-sockets php74-openssl php74-gmp php74-gettext php74-mbstring php74-gd php74-curl php74-pear php74-pdo_mysql php74-session php74-filter php74-json php74-iconv php74-ctype mysql57-server git sudo screen

Configure mysql

We won’t do any tuning to mysql, just create a user and database and lets go.

### Enable mysql on boot
sysrc mysql_enable=YES

### Run mysql_secure installation, choose to edit root password and press other to everything else.
mysql_secure_installation

### Login to mysql and create database, user and grant access to user
$ mysql -u root -p
CREATE DATABASE phpipam;
GRANT ALL ON phpipam.* TO phpipam@localhost IDENTIFIED BY 'trwITH!lU';
FLUSH PRIVILEGES;
QUIT;

Configure phpipam

Get phpipam and put in www dir. Use git to get code, this will also make it easier for version updates later on.

### Create folder
mkdir -p /usr/local/www/phpipam

### Get phpipam into folder
git clone https://github.com/phpipam/phpipam.git /usr/local/www/phpipam

### use version instead of dev
cd /usr/local/www/phpipam && git checkout -b 1.4 origin/1.4

### Create config.php
cp /usr/local/www/phpipam/config.dist.php /usr/local/www/phpipam/config.php 

### Edit config.php so it matches mysql settings you created
$db['host'] = 'localhost';
$db['user'] = 'phpipam';
$db['pass'] = 'trwITH!lU';
$db['name'] = 'phpipam';
$db['port'] = 3306;
Updating phpipam
### Create backup of config.php
cp /usr/local/www/phpipam/config /tmp/config.php

### Create backup of database
cd /usr/local/www/phpipam
mysqldump -uroot -p phpipam > db/bkp/phpipam_$(date -v-1d +%d-%B-%Y).db

### Pull from GitHub
cd /var/www/phpipam
git pull
git checkout -b 1.x origin/1.x
git submodule update --init --recursive

Finish up by opening the web interface and follow upgrade procedure.

Configure nginx

Make nginx start on boot and backup the original config. We will then add our own.

### Enable nginx and mysql and boot
sysrc nginx_enable=YES

### backup original config
mv /usr/local/etc/nginx/nginx.conf /usr/local/etc/nginx/nginx.conf.org

After we now have the backup, lets add the content beneath to nginx.conf.

user  www;
worker_processes 2;

error_log /var/log/nginx/error.log;

events {
    worker_connections 1024;
}

http {
    include mime.types;
    default_type application/octet-stream;

    #access_log  logs/access.log  main;
    sendfile on;
    #tcp_nopush     on;

    keepalive_timeout 65;

    gzip on;

    # disable max upload size
    client_max_body_size 0;
    # add timeouts for very large uploads
    client_header_timeout 30m;
    client_body_timeout 30m;

    server {
        listen 80;
        server_name ipam.ramsgaard.me;
        root /usr/local/www/phpipam;
        index index.php;
        location ~ \.php$ {
            fastcgi_split_path_info ^(.+\.php)(/.+)$;
            fastcgi_index index.php;
            fastcgi_pass unix:/var/run/php-fpm.sock;
            include fastcgi_params;
            fastcgi_param PATH_INFO $fastcgi_path_info;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        }
    }


    # HTTPS server
    #
    #server {
    #    listen       443 ssl;
    #    server_name  localhost;
    #    ssl_certificate      cert.pem;
    #    ssl_certificate_key  cert.key;
    #    ssl_session_cache    shared:SSL:1m;
    #    ssl_session_timeout  5m;
    #    ssl_ciphers  HIGH:!aNULL:!MD5;
    #    ssl_prefer_server_ciphers  on;
    #    location / {
    #        root   html;
    #        index  index.html index.htm;
    #    }
    #}
}

Configure PHP/FPM

Lets make a production ini file and afterwards setup php-fpm config file.

cp /usr/local/etc/php.ini{-production,}

Open the file /usr/local/etc/php-fpm.d/www.conf and uncomment the following lines.

listen.owner = www
listen.group = www
listen.mode = 0660

### Replace the TCP socket with unix socket.
;listen = 127.0.0.1:9000
listen = /var/run/php-fpm.sock;

### Enable and start php-fpm 
sysrc php_fpm_enable=YES
service php-fpm start

Conclusion

We have now installed all the required components, you should now reboot the server and check if all the services is coming up automatically. If so you can proceed and access the web interface of your new phpipam installation. Then follow the guide on how to get setup.

NetApp – ServiceProcessor stuck updating

After update to OnTap 9.7 the service processors where stuck in “updating”. They never came up again, not even after rebooting it.

Procedure:

  • Disable auto update
  • Reboot one SP, wait for it to show online.
  • Run the update parameter manuel
  • If its online and updated then enable auto update again.
### Disable autoupdate
system service-processor image modify -node <nodename> -autoupdate false

### Reboot the service processor
system service-processor reboot-sp -node <nodename>

### Initiate update
system service-processor image update -node <nodename>

### Verify version and SP status
system service-processor show

### Enable autoupdate
system service-processor image modify -node <nodename> -autoupdate true
Here we see that the ctrl02 is online again, but with wrong firmware.
After Manuel update we now have the correct firmware and they are online. Ready to enable autoupdate

https://kb.netapp.com/app/answers/answer_view/a_id/1028746/~/service-processor-firmware-update-fails-

Cloud Director 10.1 released

Been using vCloud since version 5.1. After a brief love affair with something called “Azure Pack” we put all our focus into vCloud.

8.20 was the first sign of heartbeats coming from VCD. We got confirmation that vCloud was for sure the platform that we were and had been looking for. Now we see the 10.1 released and from my point of view it’s a big one, may things change in GUI as in infrastructure. This release is also the final farewell to the old flex GUI.

First off we have to address the naming, I always liked the vCloud term, for me a strong brand. So a bit sad to see that go and now we have to get used to the Cloud Director instead. Thankfully we can still use the acronym VCD for VMware Cloud Director. #LongLiveVCD.

In the next few points, I will address some of the major things within this release.

APIs

We use a lot of the functionality of the APIs of VCD. Since we see that the development of VCD is changing into higher gear, so is the deprecation of the older API versions. For a small service provider, it’s always hard to revisit automation already working with existing APIs. When going on board 10.1 we have to go through a couple of workflows to update the to use the new 34.0 API. But on the other side, it’s also a good chance to refactor and optimize.

  • VMware Cloud Director API version 29 and below are not supported.
  • VMware Cloud Director API version 30.0 is deprecated and will become unsupported after VMware Cloud Director 10.1
  • VMware Cloud Director API version 31.0 is deprecated.

NSX-T feature improvements

More of the core NSX-T features is now available through VCD.

  • IPSec VPN
  • Dedicated External Network
  • BGP and Route Advertisement

We have been looking from the side for NSX-T development to reach an acceptable level for some time. NSX-V is still doing a good job. As someone who right now is standing up a new 16 node VMware cluster as a new provider VDC, I would have wished for it to be 6 months later so that all NSX-T functionality was ready and we could hopefully solo use NSX-T.

But we have to look into maybe having two 8 node clusters for NSX-V and on for NSX-T so we can already now start to transition to NSX-T…

But the good thing about being a VMware customer is that you are not left in the dust. There have been already been created migration tools for NSX-V > NSX-T, NSX-T Data Center Migration Coordinator, but it had no integration to VCD. which bring me to the next point!

NSX-V to NSX-T VCD Migration Tool

This is a way of helping us transition from NSX-V to NSX-T as we are seeing NSX-V lacking to the end of support in January 2021.

Before we could still do a new provider VDC that was backed by NSX-T controller and then start to move workloads over to the new cluster and at the time had to use NSX-T functionality, but all in a manual process.

There is now an automated way to do it, which is VCD aware. The approach will require a new cluster since NSX-V and NSX-T can’t coexist in the same cluster. From the Whats New in 10.1 it stats that the workflow will help with following

  • Automates migration of vCD metadata and workloads from NSX-V to NSX-T
  • Migrate per Org VDC migration to reduce maintenance window to single tenant
  • Minimize network downtime with bridged networks during migration
  • Live migrate with vMotion to ensure non-disruption to user workloads
  • Keep source VDC configuration and environment as-is to allow rollback
Before live migration
After live migration

Tomas did a good discussion on this subject

SSL and Certificate Management

This seems like something to read up on carefully. In short, VCD does not trust endpoint certificates unless they have been imported to the trust store.

There is a tool helping with the import, trust-infra-certs, that automatically connect to the endpoint, grabbing and importing the certificate. If this is not done successfully you will not be able to talk to those endpoints after upgrading to VCD 10.1.

App Launchpad

A new feature to help introduce a marketplace with the help of the content from Bitnami. From there we can now offer customers to easily find, deploy and manage new workloads. Not just as VMs but also as containers.

Daniel did an excellent write up on this subject.

Conclusion

There is still a lot more in this release to talk about, CSE2.6, OSE1.5, Terraform 2.7 provider, etc. read more from the official release notes.

Might have had to write a disclaimer for the length of this post and the lack of interesting pictures, will try to improve for next time.

I love to see VCD take flight. We are looking forward being part of the future journey where things like Bitnami and App Launchpad together with more NSX-T functionality and a whole lot of other features helps us Cloud Providers to help other business to there digital transformation .

Big shout out to VMware and the VCD team!