Load Balancers in Azure

Before we look into Load Balancers in Azure, let us take a look at the OSI model and TCP/IP model to understand the different layers:

As the diagram above shows, the OSI model has seven layers (a common mnemonic: Please Do Not Throw Sausage Pizza Away) and the TCP/IP model has four.

OSI layers 5, 6, and 7 are combined into a single Application layer in TCP/IP, and OSI layers 1 and 2 are combined into a single Network Access layer. The Network Access layer does not take responsibility for sequencing and acknowledgement; TCP/IP leaves those functions to the transport layer.

Load balancing is one of the most important functions in Azure: it distributes network traffic across servers to deliver the best possible application performance and availability. Load balancers come in layer 4, layer 7, and combined L4/L7 varieties.

The difference between layer 4 and layer 7 load balancing comes down to the layers of the Open Systems Interconnection (OSI) Reference Model at which they operate. A layer 4 load balancer works at the transport layer, using the TCP and UDP protocols to manage traffic based on a simple load balancing algorithm and basic information such as server connections and response times. A layer 7 load balancer works at the application layer and makes its routing decisions based on more detailed information such as the characteristics of the HTTP/HTTPS headers, message content, URL type, and cookie data. An L4/L7 load balancer manages traffic based on a set of network services across OSI layers 4 through 7 that provide data storage, manipulation, and communication services.

Layer 4 Load Balancing vs. Layer 7 Load Balancing

Layer 4 Load Balancing

Layer 4 load balancing, operating at the transport level, manages traffic based on network information such as application ports and protocols without visibility into the actual content of messages. This is an effective approach for simple packet-level load balancing. The fact that messages are neither inspected nor decrypted allows them to be forwarded quickly, efficiently, and securely. On the other hand, because layer 4 load balancing is unable to make decisions based on content, it’s not possible to route traffic based on media type, localization rules, or other criteria beyond simple algorithms such as round-robin routing.
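To make this concrete, a layer 4 balancer's decision loop can be as simple as rotating through a pool of address:port pairs without ever looking inside the packets. The following is a minimal round-robin sketch with hypothetical backend addresses, not a representation of any real load balancer's implementation:

```python
from itertools import cycle

# Hypothetical backend pool; a layer 4 balancer sees only addresses and
# ports, never the content of the messages it forwards.
backends = ["10.0.0.4:80", "10.0.0.5:80", "10.0.0.6:80"]
rotation = cycle(backends)

def pick_backend() -> str:
    """Round-robin: each new connection goes to the next server in turn."""
    return next(rotation)

# Three incoming connections land on three different servers.
print([pick_backend() for _ in range(3)])
```

Because no inspection or decryption happens, this decision costs almost nothing per connection, which is exactly the speed advantage described above.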

Layer 7 Load Balancing

Layer 7 load balancing operates at the application level, using protocols such as HTTP and SMTP to make decisions based on the actual content of each message. Instead of merely forwarding traffic unread, a layer 7 load balancer terminates network traffic, performs decryption as needed, inspects messages, makes content-based routing decisions, initiates a new TCP connection to the appropriate upstream server, and writes the request to the server.

While the extra processing involved (including decryption) incurs a performance penalty for layer 7 load balancing, much of it can be offset with SSL offload functionality. By enabling application-aware networking, layer 7 load balancing allows more intelligent load balancing decisions and content optimizations. By viewing or actively injecting cookies, the load balancer can identify unique client sessions and provide server persistence, or “sticky sessions,” sending all of a client’s requests to the same server for greater efficiency. Content-level visibility also makes content caching possible, holding frequently accessed items in memory for fast retrieval.
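The cookie-based stickiness described above can be sketched in a few lines. This is an illustrative toy, with a made-up `lb-affinity` cookie name, not the behavior of any specific Azure service:

```python
servers = ["app-1", "app-2", "app-3"]
next_server = 0  # simple rotation for clients arriving without a cookie

def route(request_cookies):
    """Honor an injected affinity cookie if present; otherwise pick a
    server and return a cookie to inject into the response."""
    global next_server
    sticky = request_cookies.get("lb-affinity")
    if sticky in servers:
        return sticky, {}
    server = servers[next_server % len(servers)]
    next_server += 1
    return server, {"lb-affinity": server}

# First request has no cookie; the balancer picks a server and sets one.
server, set_cookies = route({})
# The follow-up request carries the cookie and sticks to the same server.
repeat, _ = route(set_cookies)
print(server, repeat)  # same server both times
```

Note that this only works because the balancer can read the request at layer 7; a layer 4 balancer never sees the cookie.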

Load Balancing options in Azure

Azure provides various load balancing services that you can use to distribute your workloads across multiple computing resources. The main services are:

● Azure Load Balancer

● Azure Traffic Manager

● Azure Application Gateway

● Azure Front Door

Azure Load Balancer

Azure Load Balancer is a transport-layer (layer 4) load balancer from Microsoft. Its low-latency load balancing features help you build high availability and network performance into your applications. It can balance traffic between Azure Virtual Machines (VMs) and multitiered hybrid apps within your virtual networks.

Azure Load Balancer supports:

  • TCP and UDP
  • Layer 4
  • Apps that are both global and regional

Azure Load Balancer operates at layer 4 of the Open Systems Interconnection (OSI) model. It is the client’s single point of contact. Inbound flows that arrive at the load balancer’s front end are distributed to backend pool instances.

These flows are based on load-balancing rules and health probes that have been set up. Azure Virtual Machines or instances from a virtual machine scale set can be used as backend pool instances.

Load balancing rules: Load balancing rules specify how traffic should be routed once it arrives at the load balancer. These rules can be used to send traffic to a backend pool. Client IPs can be directed to the same backend virtual machines if session persistence is enabled.

Health probes: You can set up a health probe to check the health of the instances in the backend pool. When a probe detects a failed virtual machine, the load balancer stops routing traffic to that particular instance until it recovers.
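The effect of health probes on routing can be summarized as a filter over the backend pool. A minimal sketch with hypothetical VM names and probe results:

```python
# Hypothetical health states gathered by periodic probes.
probe_results = {"vm-1": True, "vm-2": False, "vm-3": True}

def healthy_backends(results):
    """The balancer only considers instances whose last probe succeeded."""
    return [vm for vm, ok in results.items() if ok]

print(healthy_backends(probe_results))  # vm-2 is skipped until it recovers
```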

Azure Traffic Manager

Azure Traffic Manager is a DNS-based traffic load balancer. Using DNS-based traffic routing methods, it can distribute traffic to services across global Azure regions as efficiently as possible. It can prioritize user access, assist with data sovereignty compliance, and shift traffic to accommodate app upgrades and maintenance.

Azure Traffic Manager supports:

  • Any protocol (Traffic Manager routes at the DNS level and never sees the connection itself, so it is protocol-agnostic)
  • DNS-based routing
  • Apps that are available worldwide

Priority traffic-routing method

Often an organization wants to provide reliability for their services. To do so, they deploy one or more backup services in case their primary goes down. The ‘Priority’ traffic-routing method allows Azure customers to easily implement this failover pattern.

The Traffic Manager profile contains a prioritized list of service endpoints. By default, Traffic Manager sends all traffic to the primary (highest-priority) endpoint. If the primary endpoint isn’t available, Traffic Manager routes the traffic to the second endpoint. In a situation where the primary and secondary endpoints aren’t available, the traffic goes to the third, and so on. Availability of the endpoint is based on the configured status (enabled or disabled) and the ongoing endpoint monitoring.
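The failover walk described above amounts to scanning the endpoint list in priority order and returning the first available entry. A sketch with hypothetical endpoints, where availability stands in for the combined enabled/disabled status and monitoring result:

```python
# Endpoints ordered by priority (1 = highest); "available" reflects the
# configured status plus the ongoing endpoint monitoring.
endpoints = [
    {"name": "primary",   "priority": 1, "available": False},
    {"name": "secondary", "priority": 2, "available": True},
    {"name": "tertiary",  "priority": 3, "available": True},
]

def resolve(endpoints):
    """Return the highest-priority endpoint that is currently available."""
    for ep in sorted(endpoints, key=lambda e: e["priority"]):
        if ep["available"]:
            return ep["name"]
    raise RuntimeError("no available endpoint")

print(resolve(endpoints))  # primary is down, so secondary is returned
```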

Weighted traffic-routing method

The ‘Weighted’ traffic-routing method allows you to distribute traffic evenly or to use a pre-defined weighting.

In the Weighted traffic-routing method, you assign a weight to each endpoint in the Traffic Manager profile configuration. The weight is an integer from 1 to 1000. This parameter is optional; if omitted, Traffic Manager uses a default weight of 1. The higher the weight, the larger the share of traffic the endpoint receives.

For each DNS query received, Traffic Manager randomly chooses an available endpoint. The probability of choosing an endpoint is based on the weights assigned to all available endpoints. Using the same weight across all endpoints results in an even traffic distribution. Using higher or lower weights on specific endpoints causes those endpoints to be returned more or less frequently in the DNS responses.
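This weighted random selection is easy to model. The endpoint names and weights below are invented; over many simulated queries, the per-endpoint counts approach the weight ratio:

```python
import random

# Hypothetical weights; higher weight => returned more often in DNS answers.
weights = {"west-eu": 600, "east-us": 300, "canary": 100}

def resolve():
    """Pick one endpoint at random, with probability proportional to weight."""
    names = list(weights)
    return random.choices(names, weights=[weights[n] for n in names], k=1)[0]

# Over many queries the split approaches the 6:3:1 weight ratio.
counts = {n: 0 for n in weights}
for _ in range(10_000):
    counts[resolve()] += 1
print(counts)
```

A low-weight entry like the hypothetical "canary" above is a common way to send a small slice of real traffic to a new deployment.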

A point to remember is that DNS responses get cached by clients. They’re also cached by the recursive DNS servers that the clients use to resolve DNS names. This caching can have an effect on weighted traffic distributions. When the number of clients and recursive DNS servers is large, traffic distribution works as expected. However, when the number of clients or recursive DNS servers is small, caching can significantly skew the traffic distribution.

Performance traffic-routing method

Deploying endpoints in two or more locations across the globe can improve the responsiveness of your applications. With the ‘Performance’ traffic-routing method, you can route traffic to the location that is ‘closest’ to the user.

The ‘closest’ endpoint isn’t necessarily closest as measured by geographic distance. Instead, the ‘Performance’ traffic-routing method determines the closest endpoint by measuring network latency. Traffic Manager maintains an Internet Latency Table to track the round-trip time between IP address ranges and each Azure datacenter.

Traffic Manager looks up the source IP address of the incoming DNS request in the Internet Latency Table. Traffic Manager then chooses an available endpoint in the Azure datacenter that has the lowest latency for that IP address range. Then Traffic Manager returns that endpoint in the DNS response.
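Conceptually, the lookup is: find the latency-table row covering the resolver's IP range, then take the region with the lowest round-trip time. The table below is a tiny invented slice, purely to illustrate the shape of the decision (the real Internet Latency Table is internal to Traffic Manager):

```python
import ipaddress

# Hypothetical latency table: source prefix -> measured RTT per region (ms).
latency_table = {
    "203.0.113.0/24":  {"westeurope": 18,  "eastus": 95, "southeastasia": 160},
    "198.51.100.0/24": {"westeurope": 140, "eastus": 25, "southeastasia": 190},
}

def closest_region(resolver_ip):
    """Find the prefix covering the resolver's IP, then the lowest-RTT region."""
    addr = ipaddress.ip_address(resolver_ip)
    for prefix, rtts in latency_table.items():
        if addr in ipaddress.ip_network(prefix):
            return min(rtts, key=rtts.get)
    return "westeurope"  # arbitrary fallback when no prefix matches

print(closest_region("203.0.113.7"))  # lowest RTT for that source range
```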

As explained in How Traffic Manager Works, Traffic Manager doesn’t receive DNS queries directly from clients. Instead, DNS queries come from the recursive DNS service that the clients are configured to use. As such, the IP address used to determine the ‘closest’ endpoint isn’t the client’s IP address, but it’s the IP address of the recursive DNS service. This IP address is a good proxy for the client.

Traffic Manager regularly updates the Internet Latency Table to account for changes in the global Internet and new Azure regions. However, application performance varies based on real-time variations in load across the Internet. Performance traffic-routing doesn’t monitor load on a given service endpoint. If an endpoint becomes unavailable, Traffic Manager won’t include it in the DNS query responses.

Geographic traffic-routing method

Traffic Manager profiles can be configured to use the Geographic routing method so that users are directed to specific endpoints (Azure, External, or Nested) based on the geographic location their DNS query originates from. This routing method helps you comply with data sovereignty mandates, localize content and user experience, and measure traffic from different regions. When a profile is configured for geographic routing, each endpoint associated with that profile needs to have a set of geographic regions assigned to it. A geographic region can be at the following levels of granularity:

  • World – any region
  • Regional Grouping – for example, Africa, Middle East, Australia/Pacific, etc.
  • Country/Region – for example, Ireland, Peru, Hong Kong SAR, etc.
  • State/Province – for example, USA-California, Australia-Queensland, Canada-Alberta, etc. (note: this granularity level is supported only for states/provinces in Australia, Canada, and the USA).

When a region or a set of regions is assigned to an endpoint, any requests from those regions get routed only to that endpoint. Traffic Manager uses the source IP address of the DNS query to determine the region a user is querying from; this is commonly the IP address of the local DNS resolver making the query on the user’s behalf.

Traffic Manager reads the source IP address of the DNS query and decides which geographic region it’s originating from. It then looks to see if there’s an endpoint that has this geographic region mapped to it. This lookup starts at the lowest granularity level (State/Province where it’s supported, else at the Country/Region level) and goes all the way up to the highest level, which is World. The first match found using this traversal is chosen as the endpoint to return in the query response. When matching with a Nested type endpoint, an endpoint within that child profile is returned, based on its routing method.
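The lowest-granularity-first traversal can be sketched as a nested search: walk the region hierarchy from most specific to least specific and return the first endpoint mapped to a match. Endpoint names and assignments below are hypothetical:

```python
# Hypothetical endpoint-to-region assignments, at different granularities.
assignments = {
    "ep-california":   {"USA-California"},
    "ep-northamerica": {"North America / Central America / Caribbean"},
    "ep-world":        {"World"},
}

def match_endpoint(query_regions):
    """query_regions is ordered most specific first, e.g.
    [state/province, country/region, regional grouping, "World"].
    Return the first endpoint mapped to any of them, or None."""
    for region in query_regions:
        for endpoint, regions in assignments.items():
            if region in regions:
                return endpoint
    return None

# A Californian resolver matches at the state level before any fallback.
print(match_endpoint(["USA-California", "USA",
                      "North America / Central America / Caribbean", "World"]))
```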

Multivalue traffic-routing

The Multivalue traffic-routing method allows you to get multiple healthy endpoints in a single DNS query response. This enables the caller to do client-side retries with other endpoints in the event of a returned endpoint being unresponsive.
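A Multivalue response is essentially "filter to healthy, then return up to N of them". A sketch with invented endpoint names and a made-up cap of two records per response:

```python
import random

endpoint_health = {"ep-a": True, "ep-b": True, "ep-c": False, "ep-d": True}
MAX_RECORDS = 2  # hypothetical cap on answers per DNS response

def multivalue_answer():
    """Return up to MAX_RECORDS healthy endpoints so the client can retry."""
    healthy = [ep for ep, ok in endpoint_health.items() if ok]
    return random.sample(healthy, k=min(MAX_RECORDS, len(healthy)))

print(multivalue_answer())  # two of ep-a, ep-b, ep-d; never the unhealthy ep-c
```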

Subnet traffic-routing

Select the Subnet traffic-routing method to map sets of end-user IP address ranges to specific endpoints within a Traffic Manager profile. When a request is received, the endpoint returned is the one mapped to that request’s source IP address.
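Subnet routing reduces to a CIDR-membership check over the configured ranges, with some fallback for unmatched sources. The ranges and endpoint names below are hypothetical:

```python
import ipaddress

# Hypothetical subnet-to-endpoint map; unmatched sources use the fallback.
subnet_map = {
    "10.1.0.0/16":    "internal-endpoint",
    "203.0.113.0/24": "partner-endpoint",
}
FALLBACK = "default-endpoint"

def route_by_subnet(source_ip):
    """Return the endpoint mapped to the first range containing source_ip."""
    addr = ipaddress.ip_address(source_ip)
    for cidr, endpoint in subnet_map.items():
        if addr in ipaddress.ip_network(cidr):
            return endpoint
    return FALLBACK

print(route_by_subnet("10.1.2.3"))   # internal-endpoint
print(route_by_subnet("8.8.8.8"))    # default-endpoint
```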

Azure Application Gateway

Azure Application Gateway is an application delivery controller offered as a service. Using its layer 7 load balancing capabilities, it can make web front ends scalable and highly available and securely deliver regional applications.

Azure Application Gateway supports:

  • HTTP, HTTPS, and HTTP/2
  • Layer 7
  • Apps for a certain region
  • Firewall for web applications
  • Offloading SSL/TLS

Azure Front Door

Azure Front Door supports the delivery of highly secure, globally distributed applications. Using the Microsoft global edge network, it delivers real-time performance for global web applications. By accelerating content, it can consolidate many microservice apps into a single, more secure app delivery architecture.

Azure Front Door supports:

  • HTTP, HTTPS, and HTTP/2
  • Layer 7
  • Apps that are available worldwide
  • Firewall for web applications
  • Offloading SSL/TLS
