By Luis Bueno – System Test & Guilherme Pfeiff – Application Engineer
In this article we present the load-balancing mechanisms that aim to optimize ISP traffic over Link Aggregation, especially in MPLS scenarios.
Link aggregation
Links between two devices can be aggregated to operate as a single virtual link. This set of aggregated interfaces is called a Link Aggregation Group (LAG).
One of the motivations for using LAGs is to provide redundancy. Another factor is the increase in link capacity. Two or more 10G interfaces can be aggregated, for example, to have a single link with a capacity of 20Gbps or more.
Cost is another consideration: 40G and 100G modules are still expensive, which favors the use of LAGs. Adding a few 10G interfaces may be cheaper than investing in new equipment to use 40G or 100G links.
Traffic balancing between LAG interfaces is based on a hash. This hash is calculated by taking into consideration criteria such as IP (source / destination) addresses, MAC (source / destination) addresses, TCP / UDP ports (source / destination), MPLS labels, and more, as configured by the user.
The purpose of hashing on packet header information is that packets belonging to the same stream always produce the same hash and are therefore forwarded over the same LAG link. This prevents packets of one data stream from travelling over different LAG links and reaching their destination out of order.
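The hash-based selection described above can be sketched as follows. This is only an illustration: the function name `select_link`, the choice of CRC32, and the set of fields hashed are assumptions, not the algorithm used by any particular switch.

```python
import zlib

def select_link(src_ip, dst_ip, src_port, dst_port, num_links):
    """Pick a LAG member index from header fields (illustrative only)."""
    # Concatenate the chosen header fields into one hash key.
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    # Reduce the hash to an index in the range 0..num_links-1.
    return zlib.crc32(key) % num_links

# Every packet of the same stream carries the same fields,
# so it always maps to the same LAG member.
first_packet = select_link("10.0.0.1", "10.0.0.2", 49152, 443, num_links=2)
later_packet = select_link("10.0.0.1", "10.0.0.2", 49152, 443, num_links=2)
print(first_packet == later_packet)  # True
```

Because the result depends only on the header fields, the link load plays no part in the decision.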
Problems in static balancing modes
Traditional balancing methods are static, i.e. streams with certain hash values are always forwarded to a specific LAG link, even if that link is already saturated.
In the image below, two 250Mbps streams arrive at the switch, which will forward them through the LAG. The hash of both streams is calculated and the balancing algorithm determines that stream 1 should be forwarded to link 1 and stream 2 to link 2 of the LAG, leaving each link at 25% utilization.
If, in this scenario, an additional 800Mbps stream arrives, the static balancing algorithm will send it through one of the links along with the stream already there, saturating that link. In the image below, stream 3 was sent over link 2, saturating it and causing packet drops while link 1 remains underused. The static algorithm does not take link utilization into account when choosing the output interface.
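The arithmetic of this scenario can be reproduced with a small sketch. The 1000 Mbps link capacity is an assumption inferred from 250Mbps corresponding to 25%, and the hash values are made up so that streams 2 and 3 collide on link 2, as in the example.

```python
LINK_CAPACITY_MBPS = 1000  # assumed: 250 Mbps at 25% implies 1 Gbps links

def static_assign(streams, num_links):
    """Sum the offered load per link when the link is chosen by hash alone.

    streams: list of (name, rate_mbps, hash_value) tuples.
    """
    load = [0] * num_links
    for _name, rate_mbps, hash_value in streams:
        load[hash_value % num_links] += rate_mbps
    return load

# Hypothetical hash values: streams 2 and 3 land on the same member.
streams = [("stream 1", 250, 0), ("stream 2", 250, 1), ("stream 3", 800, 1)]
load = static_assign(streams, num_links=2)

for number, mbps in enumerate(load, start=1):
    status = "saturated" if mbps > LINK_CAPACITY_MBPS else "ok"
    print(f"link {number}: {mbps} Mbps offered ({status})")
# link 1: 250 Mbps offered (ok)
# link 2: 1050 Mbps offered (saturated)
```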
Dynamic Balancing – Dynamic Load Balance
With Dynamic Load Balance, the switch analyzes the size of the flows and the utilization of the LAG interfaces to determine which link each flow should use. In the image below, on detecting the new 800Mbps stream, the dynamic algorithm moved stream 2 to link 1 and forwarded stream 3 through link 2.
Dynamic Load Balance continuously monitors link utilization and moves flows between LAG links as needed. This keeps utilization balanced and avoids saturating some links while others are underused.
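One way to picture this behavior is a loop that moves the smallest flow off an overloaded link whenever it fits on the least-loaded one. This is a simplified sketch under assumed 1000 Mbps links, not the algorithm implemented in DmOS or in any switching hardware; the function name `rebalance` is hypothetical.

```python
LINK_CAPACITY_MBPS = 1000  # assumed link speed

def rebalance(rates, assignment, num_links, capacity=LINK_CAPACITY_MBPS):
    """Move small flows off overloaded links while a move still fits.

    rates: {flow: rate_mbps}; assignment: {flow: link_index}.
    Returns (new_assignment, load_per_link).
    """
    assignment = dict(assignment)  # do not modify the caller's mapping
    while True:
        load = [0] * num_links
        for flow, link in assignment.items():
            load[link] += rates[flow]
        busiest = max(range(num_links), key=load.__getitem__)
        idlest = min(range(num_links), key=load.__getitem__)
        if load[busiest] <= capacity:
            return assignment, load  # no link is overloaded
        # Smallest flow currently on the overloaded link.
        on_busiest = [f for f, link in assignment.items() if link == busiest]
        flow = min(on_busiest, key=rates.__getitem__)
        if load[idlest] + rates[flow] > capacity:
            return assignment, load  # the move would overload the target
        assignment[flow] = idlest

rates = {"stream 1": 250, "stream 2": 250, "stream 3": 800}
static = {"stream 1": 0, "stream 2": 1, "stream 3": 1}  # result of static hashing
dynamic, load = rebalance(rates, static, num_links=2)
print(dynamic)  # {'stream 1': 0, 'stream 2': 0, 'stream 3': 1}
print(load)     # [500, 800]
```

Starting from the saturated static placement, the sketch ends with stream 2 on link 1 and stream 3 on link 2, the same outcome as in the example above.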
Why has MPLS become the dread of LAG users?
Although Dynamic Load Balance (DLB) optimizes LAG balancing, it still depends on the hash calculation. As shown below, different streams in a VPN can all produce the same hash, in which case no traffic balancing takes place.
In the scenario below, there is a VPN between devices A and B.
Two streams pass through this VPN, but because both have the same source and destination endpoints, they are encapsulated with the same labels and therefore carry identical outer headers. As shown in the image below, the hash is calculated from these labels and ends up being the same value for both streams.
As a result, both streams are forwarded through the same interface of the LAG and no effective traffic balancing occurs. That is, even if there are many different data streams in this VPN, its traffic will always occupy a single LAG interface.
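The effect can be illustrated by hashing only the label stack. The label values and the CRC32-based function `member_for_labels` below are hypothetical; the point is simply that identical labels give identical results, whatever the inner traffic is.

```python
import zlib

def member_for_labels(labels, num_links):
    """Choose a LAG member from the MPLS label stack only (illustrative)."""
    key = ",".join(str(label) for label in labels).encode()
    return zlib.crc32(key) % num_links

TRANSPORT_LABEL = 16001  # hypothetical label values
VPN_LABEL = 24005

# Two different customer streams inside the same VPN: the inner headers
# differ, but the labels that the hash sees are the same.
stream_1_labels = [TRANSPORT_LABEL, VPN_LABEL]
stream_2_labels = [TRANSPORT_LABEL, VPN_LABEL]

link_1 = member_for_labels(stream_1_labels, num_links=2)
link_2 = member_for_labels(stream_2_labels, num_links=2)
print(link_1 == link_2)  # True: both streams use the same LAG member
```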
Flow Aware Transport (FAT) was created to prevent L2VPN traffic from being polarized onto a single link. With this feature enabled at both ends of the VPN, an additional label, generated from the flow, is added to each packet. This third label is taken into account when the hash is calculated, which should result in a different value for each stream. In this way, balancing takes place properly.
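A minimal sketch of the idea, assuming the ingress device derives the flow label from a CRC32 of the inner addresses and ports: the function names, label values, and hash choice are all hypothetical. The only fixed constraints used here are that an MPLS label is 20 bits wide and that values 0 to 15 are reserved.

```python
import zlib

RESERVED_LABELS = 16   # label values 0-15 are reserved in MPLS
LABEL_SPACE = 2 ** 20  # an MPLS label is 20 bits wide

def flow_label(src_ip, dst_ip, src_port, dst_port):
    """Derive a per-flow label from the inner headers (illustrative only)."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    # Map the hash into the non-reserved part of the label space.
    return RESERVED_LABELS + zlib.crc32(key) % (LABEL_SPACE - RESERVED_LABELS)

def push_flow_label(labels, inner_flow):
    """Return the label stack with the flow label added as the last label."""
    return labels + [flow_label(*inner_flow)]

def member_for_labels(labels, num_links):
    """Choose a LAG member from the label stack, as a transit node would."""
    key = ",".join(str(label) for label in labels).encode()
    return zlib.crc32(key) % num_links

vpn_labels = [16001, 24005]  # hypothetical transport and VPN labels
stream_1 = ("10.0.0.1", "10.0.0.2", 49152, 443)
stream_2 = ("10.0.0.3", "10.0.0.4", 49153, 80)

for name, inner in (("stream 1", stream_1), ("stream 2", stream_2)):
    stack = push_flow_label(vpn_labels, inner)
    print(name, stack, "-> LAG member", member_for_labels(stack, num_links=2))
```

The first two labels stay identical for every stream in the VPN; only the third differs, which is what gives the hash something to distinguish streams by. Two streams can still land on the same member by chance, so the benefit shows up statistically across many streams.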
The features described in this article are implemented in the Datacom DM4270 switches, a family of high port density 10Gbps switches with 100Gbps uplinks. Like other Datacom next-generation switches, the DM4270 is based on DmOS.
DATACOM maintains R&D investments in its network operating system to deliver to the market a solution that follows the latest software trends, i.e. modular-architecture software focused on scalability and performance.
Datacom also has a complete training facility at its headquarters, where face-to-face training is offered. In the training you will handle equipment and configure various topologies and application scenarios in a complete lab environment, with the help of our professionals, who will share good practices that will greatly assist in the operation of your network.
To learn more about applications on ISPs, visit the site and watch the releases and tutorials on our YouTube channel. The Support team is also available at suporte.prevendas@datacom.com.br. To request a proposal, please contact Datacom's commercial team: sales@datacom.com.br and (+55) 51 3933 3000.