Load Balancer

Implement a load balancer for web servers. It provide the following functionality:

Add a new server to the cluster => add(server_id).

Remove a bad server from the cluster => remove(server_id).

Pick a server in the cluster randomly with equal probability => pick().

Motivations:

  • Finite memory of server

  • Load Balancers can provide a service that has no downtime and better maintenance.

    • For example, say you needed to update your website. A load balancer allows you to do this without shutting down your entire system.

  • Handle and sustain more total connections (as the demand for the product increases).

  • Maximize resources, and minimize response time.

Further Possible Designs (scheduling algorithms):

  • Round Robin

  • Fewest number of connections

  • Fewest current load on the basis of...

    • Current CPU occupation

    • Current memory occupation

  • Randomized algorithm:

    • Uniform selection, or fit a beta distribution that best models the constraints of your system.

    • Weighed Sampling on the basis of certain performance metrics of the server like memory capacity or CPU performance.

Other Issues:

  • Requests made by same user should be cached to the same server Id.

Complexity: O(1) time for all operations, O(N) space

Implementation 1

import random

# randomized load balancer
class LoadBalancer1:
    def __init__(self):
        self.servers = []
        self.server_hash = {}

    def add(self, id):
        self.server_hash[id] = len(self.servers)
        self.servers.append(id)

    def remove(self, id):
        if id not in self.server_hash:
            raise ValueError("Server ID does not exist")
        index = self.server_hash[id]

        # swap for O(1) removal (order doesnt matter)
        self.servers[-1], self.servers[index] = \
            self.servers[index], self.servers[-1]

        # reflect the changes onto the dictionary
        self.server_hash[len(self.servers)] = index

        # remove from end of list and dictionary
        self.servers.pop()
        del self.server_hash[id]

    def select(self):
        return random.choice(self.servers)


# sanity checking
def test1(removals):
    random.seed(42)

    N_SERVERS = 10
    obj = LoadBalancer1()
    print("adding")
    for i in range(0, N_SERVERS):
        obj.add(i)
    print("selecting")
    for i in range(0, N_SERVERS):
        print(obj.select())
    print("removing: ", removals)
    for i in removals:
        obj.remove(i)
    print("selecting")
    for i in range(0, N_SERVERS):
        print(obj.select())

test1([1, 3, 4, 6, 9])
test1(reversed([1, 3, 4, 6, 9]))

Complexity: O(1) time for add, removal, but O(n) for select -> which potentially might be unscalable depending on the constraints of the system design.

Implementation 2

# weighted load balancer
class LoadBalancer2:
    def __init__(self):
        self.cum_sum = 0
        self.servers = []
        self.weights = []

        # sole purpose is to hash into the list
        self.server_hash = {}

    def add(self, id, weight):
        assert(weight <= 1)
        self.server_hash[id] = len(self.servers)
        self.weights.append(weight)
        self.servers.append(id)
        self.cum_sum += weight

    def remove(self, id):
        if id not in self.server_hash:
            raise ValueError("Server ID does not exist")
        index = self.server_hash[id]
        self.cum_sum -= self.weights[index]

        # swap for O(1) removal (order doesnt matter)
        self.servers[-1], self.servers[index] = \
            self.servers[index], self.servers[-1]
        self.weights[-1], self.weights[index] = \
            self.weights[index], self.weights[-1]

        # reflect the changes onto the dictionary
        self.server_hash[len(self.servers)] = index

        # remove from end of list and dictionary
        self.servers.pop()
        self.weights.pop()
        del self.server_hash[id]

    def select(self):
        pmf = [weight/self.cum_sum for weight in self.weights]
        cdf = [sum(pmf[:i+1]) for i, x in enumerate(pmf)]
        rnd = random.uniform(0, 1)

        # start from 0
        if 0 <= rnd <= cdf[0]:
            return self.servers[0]

        # now initiate the rest
        for i in range(1, len(cdf)):
            if cdf[i-1] <= rnd <= cdf[i]:
                return self.servers[i]
        print("Shouldn't get here")


# sanity checking
def test2(removals):
    random.seed(42)

    N_SERVERS = 10
    obj = LoadBalancer2()
    print("adding")
    for i in range(0, N_SERVERS):
        obj.add(i, random.uniform(0, 1))
    print("selecting")
    for i in range(0, N_SERVERS):
        print(obj.select())
    print("removing: ", removals)
    for i in removals:
        obj.remove(i)
    print("selecting")
    for i in range(0, N_SERVERS):
        print(obj.select())


test2([1, 3, 4, 6, 9])
test2(reversed([1, 3, 4, 6, 9]))

Last updated