Thursday, April 23, 2009

Understanding Redistribution (Part I)

Abstract: Describe the purpose of redistribution and the issues involved.
Prerequisites: Good understanding of IGP routing protocols (OSPF, EIGRP, RIPv2).

Let’s start straight away with a group of definitions.

Redistribution is the process of passing routing information from one routing domain to another. The ultimate goal of redistribution is to provide full IP connectivity between different routing domains. Another goal (not always required, though) is to provide redundant connectivity, i.e. backup paths between routing domains.

A routing domain is a set of routers running the same routing protocol. Redistribution is performed by border routers, i.e. routers belonging to more than one routing domain. In contrast, internal routers belong to just one routing domain. Redistribution may be one-way (from one domain to another but not vice versa) or two-way (bi-directional).

Internal routes are the IGP prefixes native to a routing protocol; i.e. they are originated by the IGP’s natural method, and their respective subnets belong to the IGP routing domain. External routes are the prefixes injected into an IGP routing domain via a border router - they have no corresponding IP subnets in the routing domain. They appear to be “attached” somehow to the border router that originated them, and detailed information about their reachability is “compressed” and lost during redistribution.

A transit routing domain is a domain used as a path to transport packets between two other routing domains. A domain becomes transit when two border routers perform bi-directional redistribution with two other routing domains. A stub routing domain is configured not to transit packets between two other domains (effectively by blocking transit redistribution).

Let’s look at a picture to clarify the concepts.

Fig 1.1 [image: Redistribution_1]

The routing domains in the picture are described in the following table:

Table 1.1

Domain    | Routers
----------|------------
OSPF      | R2, R3, R4
EIGRP 123 | R1, R2, R3
RIPv2     | R4, R5, BB2
EIGRP 356 | R3, R5, R6

Examples of internal routers are R1, R6 and BB2. The border routers in the picture are listed in the following table, where each cell names the routers that sit between the two domains:

Table 1.2

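Protocol  | OSPF   | EIGRP 356 | EIGRP 123 | RIPv2
----------|--------|-----------|-----------|------
OSPF      | N/A    | R3        | R2, R3    | R4
EIGRP 356 | R3     | N/A       | R3        | R5
EIGRP 123 | R2, R3 | R3        | N/A       | NONE
RIPv2     | R4     | R5        | NONE      | N/A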

We will use the figure and the tables for reference from here on. Note that each domain in Fig 1.1 may be configured to be either transit or stub. For example, if we configure bi-directional redistribution on R4 between RIPv2 and OSPF, and also on R2 between EIGRP 123 and OSPF, the OSPF domain becomes a transit domain between the two other domains. However, if we configure R2 and R3 to send only default routes into EIGRP 123, while redistributing EIGRP 123 into OSPF, we make EIGRP 123 a stub domain.

What is redistribution needed for?

As we mentioned, the goal is to provide full connectivity between different routing domains. Usually, redistribution is needed when you merge two networks or migrate your network from one routing protocol to another. As such, redistribution is usually deemed to be a temporary solution. However, in reality, we often find that there is nothing more permanent than a temporary solution. And with respect to the CCIE lab exam, you are simply forced to face redistribution, since the lab scenarios almost always involve a number of IGPs running on the topology. Note a very important “functional” property of redistribution: effectively, redistribution is used to “broadcast” the complete routing information among all the routing domains in a given topology.

What are the problems with redistribution?

a) Suboptimal routing

As has been mentioned already, external routes carry no detailed information about their reachability. What is more, their original routing metric (e.g. cost) has to be converted to the native metric of the receiving protocol (e.g. to hop count). This is where the concept of a seed metric appears. The seed metric is the initial metric assigned to external routes when they are redistributed into a protocol (e.g. under the RIP routing process: redistribute ospf 1 metric 1). In effect, external prefixes appear to be “attached” to the advertising border router, with the “native” seed metric assigned. Due to such “simplifications” and the loss of detailed information, suboptimal routing may occur.

Example:

For our example, take the EIGRP 123 routing domain. If RIPv2 routes enter the EIGRP 123 domain by transiting the OSPF and EIGRP 356 domains, packets from R1 to BB2 may take the path R1->R2->R4->BB2 (if R2 sets a better seed metric). In a worse case, the traffic may even take the path R1->R2->R3->R5->BB2 (if R2 considers the RIPv2 external routes that transited EIGRP 356 and were redistributed into OSPF better than the RIPv2 routes injected into OSPF by R4). Sometimes, due to asymmetric redistribution, packets may take one path in the forward direction and another in the reverse direction (e.g. packets from R1 to BB2 flow R1->R2->R4->BB2 and packets from BB2 to R1 flow BB2->R5->R3->R1).
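The effect of the seed metric can be illustrated with a small sketch. This is only a toy model, not any protocol's real metric computation: the function names and all the numbers (seed metrics, internal costs, hidden costs beyond the border routers) are made up for illustration.

```python
# Toy model: an internal router sees only (seed metric + internal cost to the
# border router). The real cost beyond the border router is hidden by
# redistribution, so the chosen exit may differ from the truly best one.
# All names and numbers here are hypothetical.

def best_exit(candidates):
    """Pick the border router the way an internal router would."""
    return min(candidates, key=lambda b: candidates[b]["seed"] + candidates[b]["internal"])

def true_best_exit(candidates):
    """Pick the border router with the lowest real end-to-end cost."""
    return min(candidates, key=lambda b: candidates[b]["internal"] + candidates[b]["hidden"])

candidates = {
    "R2": {"seed": 1, "internal": 1, "hidden": 30},  # attractive seed, long real path
    "R3": {"seed": 5, "internal": 1, "hidden": 10},  # worse seed, short real path
}

chosen = best_exit(candidates)        # R2: 1 + 1 = 2 beats R3: 5 + 1 = 6
optimal = true_best_exit(candidates)  # R3: 1 + 10 = 11 beats R2: 1 + 30 = 31
print(chosen, optimal)
```

The internal router picks R2 because it can only compare what the border routers advertise, which is exactly the information loss described above.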

b) Routing Loops

The other, more dangerous problem is the possibility that routing loops may appear due to redistribution. Every routing protocol is able to converge to loop-free routing only if it has full information on the existing topology. OSPF needs a complete link-state view of the intra-area topologies and star-like connectivity of non-backbone areas to Area 0. EIGRP needs to carry out diffusing computations on all the routers in order to provide loop-free routing. RIP converges slowly by executing the Bellman-Ford algorithm (gradient-driven optimization) on the whole topology. Since redistribution squashes and hides the original information, no IGP can guarantee a loop-free topology. Loops usually occur when IGP native information (internal routes) re-enters the routing domain as external prefixes due to the use of two-way redistribution. The last important thing to note: external routes are always redistributed in a “distance-vector” fashion - i.e. advertised as a vector of prefixes and associated distances, even with link-state protocols.

Example:

Imagine that R4, R5 and R3 are configured for bi-directional redistribution between OSPF and RIPv2, RIPv2 and EIGRP 356, and EIGRP 356 and OSPF, respectively. In effect, RIPv2 routes may transit EIGRP 356 and OSPF, and appear on R4 as OSPF routes. Because OSPF has a better (numerically lower) default AD than RIP, they will be preferred at R4 over the native RIPv2 routes, and will leak back into the RIPv2 domain. Further, BB2 may prefer those “looped back” routes (if, say, R4 is closer to BB2 than R5) and try to reach R5’s connected interfaces via R4->R3. But thanks to two-way redistribution, R3 will think R5 is better reached via R4 - a loop is formed.

Is there a way to overcome those issues?

The answer is - “yes, by using a carefully designed redistribution policy”. Since routing protocols cannot find and isolate inter-domain loops, we either need to invent a new “super-routing” protocol running on top of all IGPs (it is actually called BGP, and is used to exchange routing information between autonomous systems), or configure redistribution so that no routing loops can occur and (hopefully) routing becomes “somewhat” optimal. We are going to describe a set of heuristics (rules of thumb) that can help us design loop-free redistribution schemes. We start with the concept of administrative distance.

Administrative distance (AD) is a special preference value that allows selection of one protocol’s prefixes over another’s; the lower value wins. This feature is definitely needed on border routers (running multiple IGPs), which may receive the same prefixes via different IGPs. Cisco has assigned default AD values to its IGPs (90 for EIGRP internal routes, 170 for EIGRP external routes, 110 for OSPF, 120 for RIPv2), but we’ll see how these should be changed in accordance with policy. For now, we should note that two protocols - OSPF and EIGRP - offer the capability to assign different administrative distance values to internal and external prefixes, thanks to their ability to distinguish internal and external routes. This is a very powerful feature, which we are going to use extensively during our redistribution policy design.

Here comes our first rule of thumb. Rule 1: A router should always prefer internal prefix information over any external information. Clearly, this is because external information is condensed and incomplete. For our example, if R4 receives a native prefix via RIPv2 and the same prefix via OSPF as an external route, it should prefer the RIPv2 information over OSPF, even though OSPF has a better AD than RIPv2 by default. This is easy to implement, thanks to the fact that we can change the OSPF external AD independently of the OSPF internal AD. The same rule holds true for any internal router (not just for border routers): always prefer internal information over external for the same prefix. For example, if R2 Loopback0 is advertised natively into both EIGRP 123 and OSPF, and then somehow redistributed into OSPF via R3, R4 should be configured so that the OSPF external AD is higher than the internal AD, so that the internal prefix is always preferred. This rule also helps eliminate suboptimal routing, by making sure no “dubious” paths are selected to reach a prefix. Effectively, it is implemented by making every protocol’s external AD greater than any protocol’s internal AD (e.g. OSPF external AD > RIP internal AD, EIGRP external AD > RIP internal AD, etc.). However, RIPv2 has no notion of external routes.
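As a rough illustration of Rule 1, the sketch below models how a border router such as R4 would choose between two sources of the same prefix purely by administrative distance. The default values are the Cisco defaults quoted above; the tuned OSPF external value of 171 and the route records are assumptions made up for this example, and real route selection involves more than this single comparison.

```python
# Toy model of AD-based selection for one prefix on a border router.
# Lower AD wins. The defaults are the Cisco values quoted in the text;
# the tuned value 171 is a hypothetical choice that satisfies Rule 1.

DEFAULT_AD = {
    ("eigrp", "internal"): 90,
    ("eigrp", "external"): 170,
    ("ospf", "internal"): 110,
    ("ospf", "external"): 110,
    ("rip", "internal"): 120,
}

def select(routes, ad_table):
    """Return the candidate route with the lowest administrative distance."""
    return min(routes, key=lambda r: ad_table[(r["proto"], r["kind"])])

# R4 hears the same RIPv2-native prefix twice: natively via RIPv2 and as an
# OSPF external route that looped through the other domains.
routes = [
    {"proto": "rip", "kind": "internal", "via": "BB2"},
    {"proto": "ospf", "kind": "external", "via": "R3"},
]

# Rule 1: push the OSPF external AD above every internal AD.
TUNED_AD = {**DEFAULT_AD, ("ospf", "external"): 171}

print(select(routes, DEFAULT_AD)["proto"])  # OSPF external wins: 110 < 120
print(select(routes, TUNED_AD)["proto"])    # RIP internal wins: 120 < 171
```

With the defaults, the looped-back OSPF external route displaces the native RIPv2 route; raising only the external AD restores the native path without touching OSPF internal routes.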

So how could we implement this rule with RIPv2? First, we should ensure that the RIP AD is always greater than any other protocol’s external AD - on border routers, where this is needed. Next, we need to configure the border routers so that RIPv2 internal routes have an AD lower than any other protocol’s external AD. To do this, we can take an access-list, enumerate all native RIPv2 prefixes, and selectively assign a lower AD to those prefixes. Again, note that this procedure is needed on border routers only, and that you can re-use the access-list. Next, we need to make sure that inside the RIPv2 domain external routes are always considered worse than internal ones. We can effectively implement this by assigning a relatively high seed metric to redistributed (external) routes - say 8 hops. Since RIP topologies of large diameter are rare, it’s safe to assume with our policy that any prefix with metric (hop count) > 8 is an external one. (We may even use this property to distinguish RIPv2 internal prefixes in route-maps, thanks to the match metric feature.)

The next rule of thumb is known as Rule 2: Split-Horizon - Never redistribute a prefix injected from domain A into B back to domain A. This rule is targeted at eliminating “short” loops, by preventing routing information leaked out of a routing domain from re-entering the same domain via some other point. For our example, if R2 and R3 are doing two-way redistribution, we may want to prohibit EIGRP routes from transiting the OSPF domain and entering the EIGRP domain again. This kind of situation occurs when two routing domains have more than one point of mutual redistribution. While the rule could be implemented by playing with AD values or by matching only internal routes in route-maps, it’s easier and more generic to use tagged routes and filter based on tag information. For example, we may tag EIGRP 123 routes injected into OSPF with the tag value “123” and then block routes with this tag when redistributing from OSPF into EIGRP 123. Additionally, we tag OSPF routes with tag 110 when sending them into the EIGRP 123 domain, and block routes with the same tag from entering back into the OSPF domain. While this rule may seem to be effective at detecting only “short” loops, it can be used to develop a simple, yet loop-free redistribution strategy.
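The tag-based version of Rule 2 can be sketched as a tiny model. The tag values 123 and 110 come from the text; the `Route` type, the `redistribute` helper and the two prefixes are hypothetical and only stand in for what route-maps would do on R2 and R3.

```python
from typing import NamedTuple, Optional

class Route(NamedTuple):
    prefix: str
    tag: Optional[int] = None  # None means the route is native and untagged

def redistribute(routes, deny_tag, set_tag):
    """Drop routes carrying deny_tag, then stamp the survivors with set_tag."""
    return [Route(r.prefix, set_tag) for r in routes if r.tag != deny_tag]

# Hypothetical native prefixes in each domain.
eigrp_123_native = [Route("10.1.1.0/24")]
ospf_native = [Route("10.4.4.0/24")]

# R2: EIGRP 123 -> OSPF, tagging with 123 (and refusing routes tagged 110).
ospf_table_at_r3 = ospf_native + redistribute(eigrp_123_native, deny_tag=110, set_tag=123)

# R3: OSPF -> EIGRP 123, refusing routes tagged 123 and tagging the rest 110.
into_eigrp_at_r3 = redistribute(ospf_table_at_r3, deny_tag=123, set_tag=110)

print(into_eigrp_at_r3)
```

Only the OSPF-native prefix reaches EIGRP 123 through R3; the EIGRP prefix that R2 leaked into OSPF is stopped by its own tag, which is the split-horizon behavior the rule asks for.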

First, recall how OSPF behaves with respect to inter-area route exchange. In essence, all areas are linked to a backbone and form a star - loop-free - topology. OSPF then safely passes summary LSAs down into the areas using distance-vector behavior, and never advertises those LSAs back into the backbone. This way, the core knows all the routing information and redistributes it down to the leaves. And thanks to the loop-free “skeleton”, we are guaranteed never to face routing loops, even with distance-vector advertisements. Now we can reuse this idea among heterogeneous routing domains. Take one routing domain and make it the center of the new star - in essence, make it the only transit domain in the topology. The other domains will effectively become “stub” domains, using our previous definitions - i.e. they exchange routes only with the core routing domain. Proceed with configuring two-way redistribution on the border routers (to enable route prefix exchange). If a given domain has more than one point of attachment to the star core (the backbone), implement Rule 2 on its border routers. Next, implement Rule 1 on the border routers, to avoid suboptimal routing issues. That does the trick! For our example, we may configure mutual redistribution on R2 (EIGRP 123 and OSPF), R3 (EIGRP 123 and OSPF) and R4 (RIPv2 and OSPF). We will then need to implement tag-based filtering on R2 and R3, as well as tune ADs in accordance with Rule 1. Detailed configuration examples will follow in further publications.

Okay, now what if we don’t have a “central” routing domain attached to all other domains in the topology? Let’s say R3 is not running OSPF in our example, and we have all routing domains connected in a “ring” fashion. In short, the same idea may still be used, by replacing the pure “star-like” topology with a “tree”. A tree is loop-free too, so there is a guarantee that no loops will form. We are going to discuss this, and other more complicated scenarios, in the next publications.
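The star and tree designs above share one property: if each routing domain is treated as a node and each pair of domains with mutual redistribution as a link, the resulting graph has no cycle. A minimal sketch of that check follows, using a union-find structure. The function name and the way the plan is encoded are assumptions for illustration; several border routers between the same pair of domains (such as R2 and R3) count as one link here, since that case is handled by Rule 2 rather than by the topology.

```python
def is_loop_free(domain_links):
    """Return True if the undirected graph of routing domains has no cycle."""
    parent = {}

    def find(x):
        # Find the representative of x's group, halving paths along the way.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Collapse parallel links (multiple border routers) into a single edge.
    edges = {frozenset(link) for link in domain_links}
    for edge in edges:
        a, b = sorted(edge)
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return False  # a and b were already connected: this link closes a loop
        parent[root_a] = root_b
    return True

# Star design from the text: OSPF is the only transit domain.
star = [
    ("OSPF", "EIGRP 123"),  # via R2
    ("OSPF", "EIGRP 123"),  # via R3, same pair of domains
    ("OSPF", "RIPv2"),      # via R4
    ("OSPF", "EIGRP 356"),  # via R3
]

# Adding mutual redistribution on R5 closes a ring through OSPF.
ring = star + [("EIGRP 356", "RIPv2")]

print(is_loop_free(star), is_loop_free(ring))
```

A plan that passes this check still needs Rule 1 and Rule 2 on the border routers; the check only confirms that the domain-level skeleton is a star or a tree.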
