Day 28: How to Load Balance?

In Day 26's post, we discussed the architecture we are going to build. The first component that needs to be up and running for the entire network to work is the load balancer. Right now, my Zodiac FX switches are connected only to the load balancer, and the load balancer in turn is connected to the network of distributed controllers. Without the load balancer, the switches cannot communicate with their respective controllers.

Let us first understand which properties of the load balancer we need to concentrate on. The load balancer primarily forwards OpenFlow packets from the switch to the controller and vice versa. But as we have studied earlier, OpenFlow is an application layer protocol carried over TCP. So what the load balancer actually has to redirect is the TCP traffic, which means we shall be building an L4 load balancer that acts at the TCP level. The load balancer needs to act differently in different scenarios. Let's look at them:

Initial Normal Functionality - The switches are on one side and the controllers on the other. Each switch has a specific controller IP address that it will contact. In my configuration, s1 always talks to c1, s2 to c2 and s3 to c3. Apart from this, the load balancer also creates a copy of each received OpenFlow message and sends it to the master controller, since the master needs to know what to do when inter-domain communication has to happen.
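The normal case can be pictured as a small lookup table. The sketch below is only an illustration in Python: the IP addresses, the `AFFINITY` table and the `destinations_for` helper are made-up names, and I am assuming the master controller has an address of its own.

```python
from typing import List

# Hypothetical addresses; replace with the real switch and controller IPs.
SWITCH_IPS = {"s1": "10.0.0.11", "s2": "10.0.0.12", "s3": "10.0.0.13"}
CONTROLLER_IPS = {"c1": "10.0.1.1", "c2": "10.0.1.2", "c3": "10.0.1.3"}
MASTER_IP = "10.0.1.100"  # assumption: the master controller has its own address

# Fixed affinity: s1 -> c1, s2 -> c2, s3 -> c3.
AFFINITY = {
    SWITCH_IPS["s1"]: CONTROLLER_IPS["c1"],
    SWITCH_IPS["s2"]: CONTROLLER_IPS["c2"],
    SWITCH_IPS["s3"]: CONTROLLER_IPS["c3"],
}


def destinations_for(switch_ip: str) -> List[str]:
    """Return every address that should receive a message from switch_ip.

    The switch's own controller comes first, followed by the master
    controller, which gets a copy of each message.
    """
    primary = AFFINITY[switch_ip]  # raises KeyError for an unknown switch
    return [primary, MASTER_IP]


print(destinations_for(SWITCH_IPS["s1"]))
```

Every message from a known switch therefore has two destinations: the controller it is paired with and the master.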

When one of the controllers fails - The same physical configuration is retained. Suppose c1 is down due to an ongoing DoS attack. In this situation, s1 cannot contact any other controller on its own, as it is configured with only c1's IP address. s1 would come to know that c1 is down only if it sent a request directly to c1 and received no response in return. But s1 cannot communicate with c1 directly, since all packets flowing in the network go through the load balancer. So when the load balancer receives a packet from s1, it changes the destination to either c2 or c3, whichever has the lower load. Redirection happens in such a way that the response comes from c2 or c3, but s1 thinks it is still communicating with c1. Even in this scenario, OpenFlow packets have to be sent to the master controller for synchronization purposes.

Back to normal - When controller c1 has recovered from the DoS attack, it is available again and the load balancer can go back to functioning as in case 1.
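The three cases boil down to one decision: use the switch's own controller if it is up, otherwise pick the least loaded of the remaining ones. Here is a rough Python sketch of that decision. The `pick_controller` function, the `healthy` set and the `load` numbers are hypothetical; how health and load would actually be measured is not decided yet.

```python
from typing import Dict, Set

# Preferred pairing from my configuration.
PREFERRED = {"s1": "c1", "s2": "c2", "s3": "c3"}


def pick_controller(switch: str, healthy: Set[str], load: Dict[str, int]) -> str:
    """Choose the controller that should receive traffic from `switch`.

    healthy: names of controllers currently reachable (assumed to be
             maintained elsewhere, e.g. by periodic health checks).
    load:    some load figure per controller; lower means less busy.
    """
    preferred = PREFERRED[switch]
    if preferred in healthy:
        # Case 1 (normal) and case 3 (recovered): keep the fixed pairing.
        return preferred
    if not healthy:
        raise RuntimeError("no healthy controller available")
    # Case 2 (failure): fall back to the least loaded healthy controller.
    # Sorting first makes ties resolve by name, so the choice is stable.
    return min(sorted(healthy), key=lambda c: load.get(c, 0))


load = {"c1": 0, "c2": 7, "c3": 3}
print(pick_controller("s1", {"c1", "c2", "c3"}, load))  # normal
print(pick_controller("s1", {"c2", "c3"}, load))        # c1 is down
print(pick_controller("s1", {"c1", "c2", "c3"}, load))  # c1 is back
```

Because the preferred controller is checked first, going back to normal needs no special handling: as soon as c1 shows up in `healthy` again, s1 is sent to c1.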

These are the three stages in which our load balancer needs to work. Implementing all of this in one go seems confusing and too much to handle, so we shall go one step at a time. For now, I shall be concentrating on building this:

An L4 load balancer that redirects packets based on IP affinity. Each of my switches is associated with an IP address and so are my controllers. Whenever a packet comes from a certain IP address, I shall forward it to the corresponding controller. I shall not be taking care of the fail-over mechanism right away. Once we have achieved this much, we shall slowly add more features to our load balancer.
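To get a feel for what such a load balancer does at the TCP level, here is a toy forwarder written with Python's asyncio. It is not what we will deploy (that will be HAProxy), just a sketch under assumptions: the addresses and the `AFFINITY` table are invented, and I am assuming the controllers listen on port 6653, the IANA-registered OpenFlow port (older setups often use 6633).

```python
import asyncio

# Hypothetical switch IP -> (controller IP, controller port).
AFFINITY = {
    "10.0.0.11": ("10.0.1.1", 6653),  # s1 -> c1
    "10.0.0.12": ("10.0.1.2", 6653),  # s2 -> c2
    "10.0.0.13": ("10.0.1.3", 6653),  # s3 -> c3
}
LISTEN_PORT = 6653  # assumed port the switches connect to


def backend_for(switch_ip):
    """Return the (host, port) of the controller for this switch, or None."""
    return AFFINITY.get(switch_ip)


async def pipe(reader, writer):
    """Copy bytes from reader to writer until the sending side closes."""
    try:
        while True:
            data = await reader.read(4096)
            if not data:
                break
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()


async def handle_switch(sw_reader, sw_writer):
    """Front-end side: accept a switch, then relay to its controller."""
    peer_ip = sw_writer.get_extra_info("peername")[0]
    backend = backend_for(peer_ip)
    if backend is None:
        sw_writer.close()  # unknown source IP: drop the connection
        return
    try:
        ctl_reader, ctl_writer = await asyncio.open_connection(*backend)
    except OSError:
        sw_writer.close()  # controller unreachable; no fail-over yet
        return
    # Relay both directions until either side closes.
    await asyncio.gather(
        pipe(sw_reader, ctl_writer),
        pipe(ctl_reader, sw_writer),
        return_exceptions=True,
    )


async def main():
    server = await asyncio.start_server(handle_switch, "0.0.0.0", LISTEN_PORT)
    async with server:
        await server.serve_forever()

# To start the forwarder: asyncio.run(main())
```

The forwarder never looks inside the OpenFlow messages. It only decides, from the source IP of the TCP connection, which controller the bytes should go to, and that is exactly what acting at L4 means.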

To build this, I have been looking at Nginx and HAProxy. Given that Nginx Plus, which has better features, is paid, I shall be sticking to HAProxy for now.

HAProxy stands for High Availability Proxy. As the name suggests, it is a tool for building proxies that sit inside the network and redirect traffic to a pool of servers, thus making the server network highly available to the clients. It is mostly used for HTTP redirection; at least, most of the examples I found online deal only with the http mode. But there is another, less often used mode, the tcp mode, and this is what we shall be using.

In HAProxy, we need to take care of 2 major things - the front-end and the back-end. The front-end interacts with the clients (our switches) at all times and the back-end interacts with the pool of servers (our controllers). There are predefined configuration keywords for each specific task: which address and port to bind to and listen on, which servers make up the pool and on which ports they receive the redirected traffic, which load balancing algorithm to use, and so on. We shall be using the IP hashing load balancing algorithm. To be prepared for tomorrow's lesson, you could go through these links :P

https://www.digitalocean.com/community/tutorials/an-introduction-to-haproxy-and-load-balancing-concepts
https://stackoverflow.com/questions/39016291/haproxy-loadbalancing-tcp-traffic
http://blog.afkham.org/2014/10/tcp-load-balancing-with-haproxy.html
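If the idea of IP hashing mentioned above is new to you, the following Python snippet shows the principle. It is a simplified stand-in and not HAProxy's actual hash function; the `pick_by_source_hash` name and the sample addresses are made up for illustration.

```python
import ipaddress

POOL = ["c1", "c2", "c3"]  # back-end pool of controllers


def pick_by_source_hash(source_ip, pool):
    """Map a client IP to one server in the pool.

    The IP address is turned into an integer and reduced modulo the pool
    size, so the same source IP always lands on the same server as long
    as the pool itself does not change.
    """
    key = int(ipaddress.ip_address(source_ip))
    return pool[key % len(pool)]


for ip in ("10.0.0.11", "10.0.0.12", "10.0.0.13"):
    print(ip, "->", pick_by_source_hash(ip, POOL))
```

Two things are worth noticing. The mapping falls out of the hash rather than being chosen by us, so a particular pairing such as s1 to c1 is not guaranteed in general. And if a server is removed from the pool, `len(pool)` changes and existing clients can get remapped.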

In tomorrow's post, we shall see how to install HAProxy and try out a few configurations. If possible, we shall complete the primary task of redirecting s1 to c1, s2 to c2 and s3 to c3 tomorrow.

Refer to the previous and next articles here.

Author: Shravanya
Co-author: Swati
