VCD CPI/CCM – Load balancer

When a TKG cluster is deployed by Container Service Extension(CSE) it means that it lives within VMware Cloud Director(VCD).

Inside this TKG cluster, you will find a Cloud Controller Manager(CCM) pod under kube-system called “vmware-cloud-director-ccm”. CCM pod is part of Cloud Provider Interface(CPI) that gives you some capabilities on how for example to add a Persistent Volume(PV) or do load balancing with NSX Advanced Loadbalancer(ALB). Basically, the CCM will contact the VCD CPI API and from there orchestrate what you requested in your Kubernetes YAML files.

At the time of writing, L4 load balancing(LB) features through ALB are the only option available. This is because the CPI is not yet completely featured to create L7 LB with ALB. It’s on the roadmap though.

One-arm vs. two-arm…

I found that there are two conspects that are with knowing of, one-arm and two-arm load balanceres.

The two LB methods are described here. Since it’s an old article, AVI/ALB was not in the VMware portfolio back then. And also NSX-T has migrated away from the LB service where it lived as haproxy within tier1 and over to AVI/ALB service engines.

The default setting of load balancing with VCD CPI is two arms, meaning that it will tell VCD to create a DNAT rule towards a 192.168.8.x internal subnet used to create ALB VIPs.

WAN > T1 DNAT(185.139.232.x:80)> 192.168.8.x(LB internal subnet) > ALB SE > LB Pool members

Since L7 LB features in VCD are not yet available AND also it will become very costly. Most customers will probably choose to have an Nginx or Apache ingress controller inside their own Kubernetes cluster.

Since VCD 10.4 the two-arm config has been working and therefore it’s more desirable since you can use multiple ports on a single public IP. Where one-arm config would allocate one public IP pr Kubernetes service(correct me if I’m wrong).

If you are running your own ingress controller then some find the one-arm approach more desirable since ALB will then hold the public IP address.

WAN > T1 Static Route to ALB (185.139.232.x) > ALB SE > LB Pool members

How to change to one arm LB?

Hugo Phan has done a good write-up on this blog.

Basically, it’s downloading the existing config and changing the config map of VCD CPI removing the part where it 192.168.8.x subnet is defined. After this, you delete the existing CPI and then add it from the yaml file that you edited.

Snip from Hugo Phan blog –

How can I use/test this with my VCD TKG cluster?

If you don’t have a demo app that you prefer, then I can recommend either yelp or retrogames. Here I will do it with yelp. William Lam has done a good write-up and also hosts deployment files for yelp.

Step 1 – Deploy the application

kubectl create ns yelb
kubectl apply -f

Step 2 – Check that all pods are running

jeram@QL4QJP2F4N ~ % kubectl -n yelb get pods
NAME                             READY   STATUS    RESTARTS   AGE
redis-server-74556bbcb7-f8c8f    1/1     Running   0          6s
yelb-appserver-d584bb889-6f2gr   1/1     Running   0          6s
yelb-db-694586cd78-27hl5         1/1     Running   0          6s
yelb-ui-8f54fd88c-cdvqq          1/1     Running   0          6s
jeram@QL4QJP2F4N ~ % 

The deployment file is asking k8s for a service of a load balancer, CCM picks this up and asks VCD CPI to have ALB creating the L4 load balancing.

Task view from VCD

Step 3 – Get the IP and go check out the yelb site

jeram@QL4QJP2F4N ~ % kubectl -n yelb get svc/yelb-ui
NAME      TYPE           CLUSTER-IP     EXTERNAL-IP       PORT(S)        AGE
yelb-ui   LoadBalancer   185.177.x.x   80:32047/TCP   5m52s

Step 4 – Scale the UI

Let’s see how many instances have from the initial deployment.

jeram@QL4QJP2F4N ~ % kubectl get rs --namespace yelb
NAME                       DESIRED   CURRENT   READY   AGE
redis-server-74556bbcb7    1         1         1       8m11s
yelb-appserver-d584bb889   1         1         1       8m11s
yelb-db-694586cd78         1         1         1       8m11s
yelb-ui-8f54fd88c          1         1         1       8m11s
jeram@QL4QJP2F4N ~ % 

We can then scale the UI to 3 and check again to see if that happens.

jeram@QL4QJP2F4N ~ % kubectl scale deployment yelb-ui --replicas=3 --namespace yelb
deployment.apps/yelb-ui scaled

jeram@QL4QJP2F4N ~ % kubectl get rs --namespace yelb
NAME                       DESIRED   CURRENT   READY   AGE
redis-server-74556bbcb7    1         1         1       9m45s
yelb-appserver-d584bb889   1         1         1       9m45s
yelb-db-694586cd78         1         1         1       9m45s
yelb-ui-8f54fd88c          3         3         3       9m45s
jeram@QL4QJP2F4N ~ % 

UI is now scaled to replicates of 3. Seen from the Load Balancer view I VCD it will only show the worker nodes. Since k8s is doing its own loadbalancing arose the pod instances.

Step 5 – Cleanup

jeram@QL4QJP2F4N ~ % kubectl -n yelb delete pod,svc --all && kubectl delete namespace yelb
pod "redis-server-74556bbcb7-f8c8f" deleted
pod "yelb-appserver-d584bb889-6f2gr" deleted
pod "yelb-db-694586cd78-27hl5" deleted
pod "yelb-ui-8f54fd88c-6llf7" deleted
pod "yelb-ui-8f54fd88c-9r6wf" deleted
pod "yelb-ui-8f54fd88c-cdvqq" deleted
service "redis-server" deleted
service "yelb-appserver" deleted
service "yelb-db" deleted
service "yelb-ui" deleted
namespace "yelb" deleted
jeram@QL4QJP2F4N ~ % 

Again, CCM will instruct VCD CPI to clean up. NICE!


We now have a good idea of how load balancing works with VCD TKG deployd K8s clusters. Off cause we are looking forward to the L7 features. But it’s a good start and VMware is working hard to help in making k8s deployment and day2 operations easier.