Cloud Director – LB Content Switching

Content switching is used to direct incoming traffic to different pools/servers based on the content of the requests. By looking at the application layer (L7 in the OSI model), we can inspect the content of the client request, such as the URL, headers, cookies, and query parameters.

ALB has the concept of policies on virtual services. Each policy rule has a match and an action, so we can match on the host header of an HTTP request and then perform a content switch action.
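To make the match/action idea concrete, here is a minimal Python sketch of what a content switch on the host header boils down to. It is not how ALB implements it; the `Rule` class, the hostnames, and the pool names are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Rule:
    """One policy rule: a match on the Host header plus a content-switch action."""
    name: str
    match_host: str  # Host header value to match (hypothetical hostname)
    pool: str        # pool to switch to when the match hits (hypothetical name)


def select_pool(rules: List[Rule], headers: Dict[str, str], default_pool: str) -> str:
    """Return the pool of the first rule whose host matches, else the default pool."""
    host = ""
    for key, value in headers.items():
        if key.lower() == "host":
            # Compare case-insensitively and ignore an optional ":port" suffix.
            host = value.split(":")[0].strip().lower()
            break
    for rule in rules:
        if rule.match_host.lower() == host:
            return rule.pool
    return default_pool


rules = [
    Rule("app1", "app1.example.com", "pool-app1"),
    Rule("app2", "app2.example.com", "pool-app2"),
]

print(select_pool(rules, {"Host": "app1.example.com"}, "port443"))
print(select_pool(rules, {"Host": "unknown.example.com"}, "port443"))
```

Requests that match no rule fall through to the default pool of the virtual service, which is why the existing pools keep working after the rules are added.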

This is, unfortunately, a feature that is not yet supported by Cloud Director (10.5.1). But it can be done in the ALB manager, behind the back of VCD. The rules show up in the VCD UI, but not the content of the policy.

My guess is therefore that when the feature comes to the VCD UI in the future, it will be able to adopt the existing content-switching policies.

If you also think this feature should be implemented in VCD, please submit a feature request on this site.

Workaround

In Cloud Director you set up the pools you want to do content switching on. From the screenshot below you can see port80 and port443 are the ones being used today. Creating the extra pools here makes VCD aware of them, so we only need to change a little on existing objects in the ALB manager.

In the ALB manager, find the virtual service, edit it, and navigate to the policy section. Here we can see the different policy types. Find “HTTP request” and add two new rules.

Each rule will have the match and action as mentioned before. When the action is a content switch, you get to choose a specific pool.

Save it all and you are ready to test the content switching. From the screenshot below, you can see how it will look inside the VCD UI.

NSX V2T migration

The reason for me to research how to do content switching was primarily that some tenants are still using NSX-V because they use haproxy application rules to do content switching. The NSX Migration for Cloud Director tool does support basic load balancing migrations, but not when there are application rules applied.

Now the tenant can schedule the service window, remove the application rules, and have the migration tool migrate all NAT, firewall, routes, and basic load balancing.

After the migration is done, the content switching rules can be created manually, based on what the haproxy application rules specified.
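To prepare for the manual part, it can help to list which host matches and pools the old application rules contained. Below is a rough Python sketch that assumes the rules follow the common haproxy pattern of an `acl ... hdr(host) -i ...` line paired with a `use_backend ... if ...` line. The ACL names, hostnames, and pool names are invented, and anything that does not fit the pattern is returned separately so it can be handled by hand.

```python
import re
from typing import Dict, List, Tuple

# Only the simple host-header pattern is recognised; everything else is skipped.
ACL_RE = re.compile(r"^acl\s+(\S+)\s+hdr\(host\)\s+-i\s+(\S+)$")
USE_RE = re.compile(r"^use_backend\s+(\S+)\s+if\s+(\S+)$")


def translate(app_rules: str) -> Tuple[List[Dict[str, str]], List[str]]:
    """Turn acl/use_backend pairs into match/action entries plus a list of skipped lines."""
    acls: Dict[str, str] = {}
    rules: List[Dict[str, str]] = []
    skipped: List[str] = []
    for raw in app_rules.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        acl = ACL_RE.match(line)
        if acl:
            acls[acl.group(1)] = acl.group(2)  # ACL name -> hostname
            continue
        use = USE_RE.match(line)
        if use and use.group(2) in acls:
            rules.append({"match_host": acls[use.group(2)], "pool": use.group(1)})
            continue
        skipped.append(line)
    return rules, skipped


# Hypothetical application rules, not taken from a real tenant.
example = """
acl is_app1 hdr(host) -i app1.example.com
acl is_app2 hdr(host) -i app2.example.com
use_backend pool-app1 if is_app1
use_backend pool-app2 if is_app2
redirect scheme https if !{ ssl_fc }
"""

rules, skipped = translate(example)
for rule in rules:
    print(rule)
print("needs manual review:", skipped)
```

Each entry in `rules` corresponds to one HTTP request rule to create in the ALB manager, and the `skipped` list shows which lines still need a human decision.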

Conclusion

Even if it’s not supported by VCD yet, we can still do it. Of course, the tenants can’t do it themselves but will need to log a support case with you until the feature is introduced. And when it is introduced, let’s hope it will adopt the rules created behind its back.

VCD CPI/CCM – Load balancer

When a TKG cluster is deployed by Container Service Extension (CSE), it lives within VMware Cloud Director (VCD).

Inside this TKG cluster, you will find a Cloud Controller Manager (CCM) pod under kube-system called “vmware-cloud-director-ccm”. The CCM pod is part of the Cloud Provider Interface (CPI), which gives you capabilities such as adding a Persistent Volume (PV) or doing load balancing with NSX Advanced Load Balancer (ALB). Basically, the CCM will contact the VCD CPI API and from there orchestrate what you requested in your Kubernetes YAML files.

At the time of writing, L4 load balancing (LB) through ALB is the only option available, because the CPI does not yet have the features needed to create L7 LB with ALB. It’s on the roadmap though.

One-arm vs. two-arm…

I found that there are two concepts that are worth knowing: one-arm and two-arm load balancers.

The two LB methods are described here. It’s an old article, so AVI/ALB was not in the VMware portfolio back then. NSX-T has also since moved away from the LB service that lived as haproxy within the tier1, over to AVI/ALB service engines.

The default setting of load balancing with VCD CPI is two-arm, meaning that it will tell VCD to create a DNAT rule towards a 192.168.8.x internal subnet used to create ALB VIPs.

WAN > T1 DNAT (185.139.232.x:80) > 192.168.8.x (LB internal subnet) > ALB SE > LB Pool members

Since L7 LB features in VCD are not yet available, and would also become very costly, most customers will probably choose to run an Nginx or Apache ingress controller inside their own Kubernetes cluster.

Since VCD 10.4 the two-arm config has been working, and it’s therefore more desirable since you can use multiple ports on a single public IP, whereas the one-arm config would allocate one public IP per Kubernetes service (correct me if I’m wrong).

If you are running your own ingress controller then some find the one-arm approach more desirable since ALB will then hold the public IP address.

WAN > T1 Static Route to ALB (185.139.232.x) > ALB SE > LB Pool members

How to change to one arm LB?

Hugo Phan has done a good write-up on this blog.

Basically, it comes down to downloading the existing config and changing the config map of VCD CPI, removing the part where the 192.168.8.x subnet is defined. After this, you delete the existing CPI and then add it again from the YAML file that you edited.

Snip from Hugo Phan blog – vmwire.com

How can I use/test this with my VCD TKG cluster?

If you don’t have a demo app that you prefer, then I can recommend either yelb or retrogames. Here I will do it with yelb. William Lam has done a good write-up and also hosts deployment files for yelb.

Step 1 – Deploy the application

kubectl create ns yelb
kubectl apply -f https://raw.githubusercontent.com/lamw/vmware-k8s-app-demo/master/yelb-lb.yaml

Step 2 – Check that all pods are running

jeram@QL4QJP2F4N ~ % kubectl -n yelb get pods
NAME                             READY   STATUS    RESTARTS   AGE
redis-server-74556bbcb7-f8c8f    1/1     Running   0          6s
yelb-appserver-d584bb889-6f2gr   1/1     Running   0          6s
yelb-db-694586cd78-27hl5         1/1     Running   0          6s
yelb-ui-8f54fd88c-cdvqq          1/1     Running   0          6s
jeram@QL4QJP2F4N ~ % 

The deployment file asks k8s for a service of type LoadBalancer. CCM picks this up and asks VCD CPI to have ALB create the L4 load balancing.
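For reference, this is roughly the kind of Service object that triggers it, sketched here as a Python dict and printed as JSON (which kubectl also accepts). The labels and selector are my assumption of what a yelb-ui service looks like and are not copied from the deployment file; the part that matters is `type: LoadBalancer`.

```python
import json

# Assumed shape of the yelb-ui Service; the label/selector values are illustrative.
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "yelb-ui", "namespace": "yelb"},
    "spec": {
        # This type is what makes the CCM ask for an external load balancer.
        "type": "LoadBalancer",
        "ports": [{"port": 80, "protocol": "TCP", "targetPort": 80}],
        "selector": {"app": "yelb-ui"},
    },
}

manifest_json = json.dumps(service, indent=2)
print(manifest_json)
```

Saved to a file, the output could be applied with `kubectl apply -f`, just like the YAML version.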

Task view from VCD

Step 3 – Get the IP and go check out the yelb site

jeram@QL4QJP2F4N ~ % kubectl -n yelb get svc/yelb-ui
NAME      TYPE           CLUSTER-IP     EXTERNAL-IP       PORT(S)        AGE
yelb-ui   LoadBalancer   100.64.53.23   185.177.x.x       80:32047/TCP   5m52s

Step 4 – Scale the UI

Let’s see how many instances we have from the initial deployment.

jeram@QL4QJP2F4N ~ % kubectl get rs --namespace yelb
NAME                       DESIRED   CURRENT   READY   AGE
redis-server-74556bbcb7    1         1         1       8m11s
yelb-appserver-d584bb889   1         1         1       8m11s
yelb-db-694586cd78         1         1         1       8m11s
yelb-ui-8f54fd88c          1         1         1       8m11s
jeram@QL4QJP2F4N ~ % 

We can then scale the UI to 3 and check again to see if that happens.

jeram@QL4QJP2F4N ~ % kubectl scale deployment yelb-ui --replicas=3 --namespace yelb
deployment.apps/yelb-ui scaled

jeram@QL4QJP2F4N ~ % kubectl get rs --namespace yelb
NAME                       DESIRED   CURRENT   READY   AGE
redis-server-74556bbcb7    1         1         1       9m45s
yelb-appserver-d584bb889   1         1         1       9m45s
yelb-db-694586cd78         1         1         1       9m45s
yelb-ui-8f54fd88c          3         3         3       9m45s
jeram@QL4QJP2F4N ~ % 

The UI is now scaled to 3 replicas. The Load Balancer view in VCD will still only show the worker nodes, since k8s does its own load balancing across the pod instances.
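A small Python sketch of why the pool does not change when pods are scaled, assuming the pool members are the worker nodes on the service's NodePort (the usual behaviour for a service of type LoadBalancer). The worker node IPs below are hypothetical; the port string is the one from the PORT(S) column in step 3.

```python
from typing import List, Tuple


def parse_port_column(ports: str) -> Tuple[int, int, str]:
    """Split a kubectl PORT(S) value like '80:32047/TCP' into its three parts."""
    mapping, protocol = ports.split("/")
    service_port, node_port = mapping.split(":")
    return int(service_port), int(node_port), protocol


def pool_members(worker_ips: List[str], node_port: int) -> List[str]:
    """One pool member per worker node, regardless of how many pods run behind it."""
    return [f"{ip}:{node_port}" for ip in worker_ips]


service_port, node_port, protocol = parse_port_column("80:32047/TCP")
workers = ["192.168.10.11", "192.168.10.12"]  # hypothetical worker node IPs
members = pool_members(workers, node_port)

print(f"VIP port {service_port}/{protocol} -> {members}")
```

The number of yelb-ui replicas never enters the calculation, so scaling from 1 to 3 leaves the member list untouched; only adding or removing worker nodes would change it.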

Step 5 – Cleanup

jeram@QL4QJP2F4N ~ % kubectl -n yelb delete pod,svc --all && kubectl delete namespace yelb
pod "redis-server-74556bbcb7-f8c8f" deleted
pod "yelb-appserver-d584bb889-6f2gr" deleted
pod "yelb-db-694586cd78-27hl5" deleted
pod "yelb-ui-8f54fd88c-6llf7" deleted
pod "yelb-ui-8f54fd88c-9r6wf" deleted
pod "yelb-ui-8f54fd88c-cdvqq" deleted
service "redis-server" deleted
service "yelb-appserver" deleted
service "yelb-db" deleted
service "yelb-ui" deleted
namespace "yelb" deleted
jeram@QL4QJP2F4N ~ % 

Again, CCM will instruct VCD CPI to clean up. NICE!

Conclusion

We now have a good idea of how load balancing works with K8s clusters deployed by VCD TKG. Of course, we are looking forward to the L7 features. But it’s a good start, and VMware is working hard to help make k8s deployment and day-2 operations easier.