This post introduces the IS-IS LSP Database Overload Bit and shows how it can be used to avoid traffic black-holing during certain transient network conditions. It includes a sample scenario where such an issue might occur, along with the tests and verification used throughout. It also takes the opportunity to briefly demonstrate how IS-IS handles older LSPs on point-to-point links.
Overload Bit Introduction
If a router is unable to store received Link-State PDUs (LSPs), or to run SPF, due to a lack of available resources (memory or CPU), it must inform the other routers in its area that it is experiencing issues. It does so by setting the overload (OL) bit in its LSPs, alerting other routers that it shouldn’t be used for transit traffic because its database may be inconsistent. This is logical behaviour: link-state protocols work by propagating the information each participating router needs to build a complete connectivity map of the network, which is then used to calculate the shortest path to each destination. If a router cannot reliably store this information and share the same view of the topology as the other routers in its area, it can no longer make reliable routing decisions and shouldn’t be used for transit traffic until its operation is back to normal. Packets destined to networks directly connected to the overloaded router can still be forwarded to it as normal.
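As a rough illustration of the behaviour described above, here is a minimal Python sketch of an SPF run that honours the OL bit. The function name `spf`, the four-node graph and its link costs are all made up for illustration; this is not how any particular router implements it.

```python
import heapq

def spf(graph, source, overloaded=frozenset()):
    """Dijkstra over `graph` ({node: {neighbor: cost}}).

    A node in `overloaded` can still be reached as a destination, but its
    links are not explored, so no shortest path transits through it.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        if node in overloaded and node != source:
            continue  # reachable, but not usable for transit
        for neighbor, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist

# Hypothetical topology: A reaches D via B (cost 15) or via C (cost 20).
graph = {
    "A": {"B": 5, "C": 10},
    "B": {"A": 5, "D": 10},
    "C": {"A": 10, "D": 10},
    "D": {"B": 10, "C": 10},
}
normal = spf(graph, "A")
with_b_overloaded = spf(graph, "A", overloaded={"B"})
```

With B marked as overloaded, B itself is still reachable at cost 5, but the best path to D moves from the route through B to the route through C.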
The OL bit was introduced into IS-IS when system resource constraints were a much greater concern in data networks than they are nowadays. In modern networks the OL bit serves a different but useful purpose, since modern routers are unlikely to become so overloaded that they are unable to store LSPs or run SPF. If that’s so, what’s the use of the OL bit in modern data networks?
Traffic Black-hole Scenario
In an attempt to answer that question, I’d like to introduce the following topology and a fault scenario that results in a traffic black-hole due to transient network conditions.
In this scenario R6 is a peering router. It receives an external route, 220.127.116.11/32, from R5 via eBGP and advertises it to its iBGP peers R2, R3 and R4 after changing the next hop to itself to ensure full reachability.
To review a few basic concepts: the next hop of a BGP route must be reachable for the route to be considered valid, and it is normally learned via the IGP (IS-IS in this case). Therefore, if R2 wants to send traffic towards 18.104.22.168, it must do a route lookup for 22.214.171.124, R6’s loopback address. There are two physically available paths to reach R6; the path via R3 is chosen because it has a lower IGP metric than the path via R4.
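The recursive lookup described above can be sketched in a few lines of Python. This is only a simplified model: the prefixes are placeholder documentation addresses rather than the ones used in the lab, and `longest_match` and `resolve` are hypothetical helper names.

```python
import ipaddress

def longest_match(table, address):
    """Return the value stored for the most specific prefix covering `address`."""
    addr = ipaddress.ip_address(address)
    candidates = [p for p in table if addr in ipaddress.ip_network(p)]
    if not candidates:
        return None
    best = max(candidates, key=lambda p: ipaddress.ip_network(p).prefixlen)
    return table[best]

def resolve(bgp_routes, igp_routes, destination):
    """Resolve a destination to an (interface, neighbour) pair via its BGP next hop."""
    bgp_next_hop = longest_match(bgp_routes, destination)
    if bgp_next_hop is None:
        return None  # no BGP route at all
    # The BGP next hop is itself looked up in the IGP table.
    # None here means the next hop is unreachable, so the BGP route is invalid.
    return longest_match(igp_routes, bgp_next_hop)

# Placeholder addresses: an external prefix whose next hop is the peering
# router's loopback, which the IGP reaches via the lower-metric neighbour.
bgp_routes = {"203.0.113.5/32": "192.0.2.6"}
igp_routes = {"192.0.2.6/32": ("GigabitEthernet1.23", "R3")}

resolved = resolve(bgp_routes, igp_routes, "203.0.113.5")
unresolved = resolve(bgp_routes, {}, "203.0.113.5")
```

Here `resolved` is the interface and neighbour chosen by the IGP, while `unresolved` is `None` because the next hop has no IGP route behind it.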
Now imagine a scenario where R3 suffers a system reload for whatever reason. The problem comes down to the difference in convergence time between IS-IS and BGP. Whilst IS-IS converges very quickly, BGP does so far more slowly. BGP runs on top of TCP, and TCP needs valid internal routes before the BGP sessions can even come up, so there’s no way BGP can converge before IS-IS. As a result, once R3’s IS-IS adjacencies are back, R2 forwards traffic towards R3 again as it is on the best path to reach 126.96.36.199 (188.8.131.52 next hop). Since R3 does not yet have a valid route to 184.108.40.206 it drops the traffic, resulting in a transient black-hole condition. It’s transient because once BGP has fully converged the black-hole no longer exists.
R2#show ip route 220.127.116.11
Routing entry for 18.104.22.168/32
  Known via "bgp 100", distance 200, metric 0
  Tag 200, type internal
  Last update from 22.214.171.124 00:25:36 ago
  Routing Descriptor Blocks:
  * 126.96.36.199, from 188.8.131.52, 00:25:36 ago
      Route metric is 0, traffic share count is 1
      AS Hops 1
      Route tag 200
      MPLS label: none

R2#show ip route 184.108.40.206
Routing entry for 220.127.116.11/32
  Known via "isis", distance 115, metric 15, type level-2
  Redistributing via isis 1
  Last update from 18.104.22.168 on GigabitEthernet1.23, 00:25:49 ago
  Routing Descriptor Blocks:
  * 22.214.171.124, from 126.96.36.199, 00:25:49 ago, via GigabitEthernet1.23
      Route metric is 15, traffic share count is 1

R2#ping 188.8.131.52 source lo0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 184.108.40.206, timeout is 2 seconds:
Packet sent with a source address of 220.127.116.11
U.U.U
Success rate is 0 percent (0/5)

R2#traceroute 18.104.22.168 source lo0
Type escape sequence to abort.
Tracing the route to 22.214.171.124
VRF info: (vrf in name/id, vrf out name/id)
  1 126.96.36.199 0 msec 0 msec 0 msec
  2 188.8.131.52 !H  *  !H
As you can see from the output above, 184.108.40.206 is unreachable. How can the OL bit help us avoid this issue? If R3 sets the OL bit until BGP has fully converged, R2 will send traffic towards R4 instead, as R3 is “experiencing problems” and shouldn’t be used for transit traffic. Once BGP has converged, R3 clears the OL bit in its LSPs, signalling to the other routers that it can be used for transit once again. The fact that directly connected networks are reachable even when the OL bit is set is good news, because it means all BGP sessions to the internal routers can still be established successfully.
The command set-overload-bit under the IS-IS process helps us accomplish that. Used on its own, however, it sets the OL bit immediately, which has some practical uses but is not what we’re trying to accomplish in this scenario. There are some optional parameters, including on-startup, which as the name suggests sets the OL bit upon system startup for a period of time, and wait-for-bgp, which sets the OL bit and keeps it set until BGP has fully converged.
Testing and Verification
In order to test and verify the theory, I’ve set the OL bit startup timer to 600 seconds, which should be more than enough time for BGP to converge after R3 has reloaded.
! Configuration added to R3
router isis 1
 set-overload-bit on-startup 600
Once R3 had reloaded I checked R2’s database and, sure enough, it showed the OL bit set for the LSP originated by R3. As a result, traffic towards 220.127.116.11 (18.104.22.168 next hop) was being routed via R4.
R2#show isis database

Tag 1:
IS-IS Level-2 Link State Database:
LSPID                 LSP Seq Num  LSP Checksum  LSP Holdtime      ATT/P/OL
R2.00-00            * 0x000000C7   0xA0E6        828               0/0/0
R3.00-00              0x000000C2   0xE18E        827               0/0/1
R4.00-00              0x000000C3   0xB3B3        1018              0/0/0
R6.00-00              0x000000C3   0x10EA        824               0/0/0

R2#show ip route 22.214.171.124
Routing entry for 126.96.36.199/32
  Known via "isis", distance 115, metric 20, type level-2
  Redistributing via isis 1
  Last update from 188.8.131.52 on GigabitEthernet1.24, 00:04:10 ago
  Routing Descriptor Blocks:
  * 184.108.40.206, from 220.127.116.11, 00:04:10 ago, via GigabitEthernet1.24
      Route metric is 20, traffic share count is 1

R2#ping 18.104.22.168 source lo0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 22.214.171.124, timeout is 2 seconds:
Packet sent with a source address of 126.96.36.199
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/1/1 ms

R2#traceroute 188.8.131.52 source lo0
Type escape sequence to abort.
Tracing the route to 184.108.40.206
VRF info: (vrf in name/id, vrf out name/id)
  1 220.127.116.11 0 msec 1 msec 0 msec
  2 18.104.22.168 1 msec 0 msec 0 msec
  3 22.214.171.124 1 msec  *  1 msec
The alternative is to use set-overload-bit on-startup wait-for-bgp, which clears the OL bit as soon as BGP converges, providing much more dynamic convergence than waiting for the overload-bit timer to expire.
It’s worth noting, however, that although wait-for-bgp offers a more dynamic approach than setting a static timer, if one BGP session does not come up the OL bit will never be cleared (Carroll & Doyle, 2006).
To verify, I shut down the peering to R6, added set-overload-bit on-startup wait-for-bgp and reloaded R3. I also enabled debug isis update-packets on R2.
As a side note, if you attempt to change settings whilst the overload-bit timer is active you should receive the following message:
R3(config-router)#set-overload-bit on-startup wait-for-bgp
Current overload-bit timer is running. 'no set-overload-bit' first to stop the timer or change wait-for-bgp later
Once R3 had successfully restarted, I was able to witness first-hand the flooding process on R2, and more specifically how IS-IS handles older LSPs over point-to-point links. Here’s the relevant debug output.
...
*May 25 17:48:28.863: ISIS-Upd (1): Rec L2 LSP 0000.0000.0003.00-00, seq 3, ht 1200,
*May 25 17:48:28.863: ISIS-Upd (1): from SNPA 000c.29fc.2fcf (GigabitEthernet1.23)
*May 25 17:48:28.863: ISIS-Upd (1): LSP older than database copy
...
*May 25 17:48:29.814: ISIS-Upd (1): Rec L2 LSP 0000.0000.0003.00-00, seq CC, ht 1199,
*May 25 17:48:29.814: ISIS-Upd (1): from SNPA 000c.29fc.2fcf (GigabitEthernet1.23)
*May 25 17:48:29.814: ISIS-Upd (1): LSP newer than database copy
...
R2 already had an LSP for R3 in its database with a valid remaining lifetime. When R3 came back online and an adjacency was formed, it flooded its LSP as it normally would, but with a lower sequence number (seq 3) than the copy R2 had in its database (seq CB), hence ‘LSP older than database copy.’ R3 later sent a newer LSP (seq CC), hence ‘LSP newer than database copy.’ Here’s a more detailed diagram I was able to put together whilst looking at the relevant packet captures.
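The exchange seen in the debug output can be modelled with a small Python sketch. It is deliberately simplified: it compares sequence numbers only and ignores the remaining lifetime and checksum that a real implementation also considers. Both function names are hypothetical.

```python
def compare_lsp(received_seq, database_seq):
    """Classify a received LSP against the database copy by sequence number only."""
    if received_seq > database_seq:
        return "newer"
    if received_seq < database_seq:
        return "older"
    return "same"

def sequence_after_seeing_own_lsp(own_seq, seen_seq):
    """Sequence number an originator uses once it learns that a copy of its own
    LSP with `seen_seq` still exists in the network.

    If the network holds a higher sequence number than the originator's current
    one (e.g. after a reload), it must jump past it so its LSP is accepted.
    """
    if seen_seq > own_seq:
        return seen_seq + 1
    return own_seq

# Values taken from the debug output: R3 restarts at seq 3 while R2 holds seq CB.
r2_view_of_first_lsp = compare_lsp(0x3, 0xCB)
r3_reissued_seq = sequence_after_seeing_own_lsp(0x3, 0xCB)
r2_view_of_second_lsp = compare_lsp(r3_reissued_seq, 0xCB)
```

Under these assumptions R2 classifies the first LSP as older, R3 reissues its LSP with sequence number CC, and R2 then accepts it as newer, which matches the two debug messages.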
Although CSNPs are not flooded periodically on point-to-point links as they are on broadcast links (where the DIS floods CSNPs every 10 seconds), they are still used on point-to-point links when an adjacency is first established.
Looking at the output of show isis database we can see the OL bit is set for R3’s LSP.
R2#show isis database

Tag 1:
IS-IS Level-2 Link State Database:
LSPID                 LSP Seq Num  LSP Checksum  LSP Holdtime      ATT/P/OL
R2.00-00            * 0x000000D2   0x8AF1        985               0/0/0
R3.00-00              0x000000CC   0xCB99        985               0/0/1
R4.00-00              0x000000C8   0xA9B8        849               0/0/0
R6.00-00              0x000000CD   0xFBF4        984               0/0/0
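If you wanted to check the OL column programmatically rather than by eye, something along these lines might do. It is a sketch that assumes the column layout shown in the output above, with ATT/P/OL as the last field of each row; `overloaded_lsps` is a made-up helper name, not part of any Cisco tooling.

```python
def overloaded_lsps(show_output):
    """Return the LSP IDs whose OL flag is set in 'show isis database' text."""
    result = []
    for line in show_output.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # blank lines, 'Tag 1:' and the database title
        flags = fields[-1].split("/")
        # Skips the header row, whose last field is the literal 'ATT/P/OL'.
        if len(flags) != 3 or not all(f.isdigit() for f in flags):
            continue
        if flags[2] == "1":
            result.append(fields[0].rstrip("*"))
    return result

sample = """\
Tag 1:
IS-IS Level-2 Link State Database:
LSPID LSP Seq Num LSP Checksum LSP Holdtime ATT/P/OL
R2.00-00 * 0x000000D2 0x8AF1 985 0/0/0
R3.00-00 0x000000CC 0xCB99 985 0/0/1
R4.00-00 0x000000C8 0xA9B8 849 0/0/0
R6.00-00 0x000000CD 0xFBF4 984 0/0/0
"""
overloaded = overloaded_lsps(sample)
```

For the sample above, only R3's LSP is reported as overloaded.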
Once all BGP sessions are established, a new LSP should be flooded with the OL bit cleared.
...
*May 25 17:53:15.587: ISIS-Upd (1): Rec L2 LSP 0000.0000.0003.00-00, seq CD, ht 1199,
*May 25 17:53:15.587: ISIS-Upd (1): from SNPA 000c.29fc.2fcf (GigabitEthernet1.23)
*May 25 17:53:15.587: ISIS-Upd (1): LSP newer than database copy
*May 25 17:53:15.587: ISIS-Upd (1): Important fields changed
*May 25 17:53:15.587: ISIS-Upd (1): TID 0 full SPF required
...
The ‘Important fields changed’ message indicates that the LSP content has changed; in other words, we received an LSP that has the OL bit cleared. This can be confirmed by looking at the output of show isis database, as shown below.
R2#show isis database

Tag 1:
IS-IS Level-2 Link State Database:
LSPID                 LSP Seq Num  LSP Checksum  LSP Holdtime      ATT/P/OL
R2.00-00            * 0x000000D2   0x8AF1        906               0/0/0
R3.00-00              0x000000CD   0xC5A2        1191              0/0/0
R4.00-00              0x000000C8   0xA9B8        769               0/0/0
R6.00-00              0x000000CD   0xFBF4        905               0/0/0
Having demonstrated the black-hole scenario, it’s worth remembering that modern core networks have MPLS enabled, with LDP and/or RSVP-TE operating as the label distribution mechanism. If we add MPLS to our previous scenario, R3 wouldn’t drop the traffic even though it might not have BGP up and running, or a valid route to 126.96.36.199 for that matter. In that case, the LFIB is added to the equation during the forwarding process. Here’s a diagram:
As you can see in the diagram, R3 no longer performs a routing lookup on the IPv4 packet to determine the next hop. That lookup has already been done on R2, which determined that to reach 188.8.131.52 (184.108.40.206 next hop) it must use label 3002 (learned via LDP), which is the label R3 is expecting. When the labeled packet arrives on R3, it already knows what to do with label 3002. In this case the outgoing label is implicit null (label 3), so R3 pops the label and forwards the packet as a plain IPv4 packet to R6, which knows about 220.127.116.11 and will have no problem knowing what to do with it.
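To make the label operation concrete, here is a minimal Python sketch of the LFIB lookup on R3. The table is a plain dictionary and `forward_labeled` is a hypothetical name; only label 3002, the implicit null value 3 and the R6 next hop come from the scenario above.

```python
IMPLICIT_NULL = 3  # reserved label value meaning "pop before forwarding"

def forward_labeled(lfib, in_label, payload):
    """Look up the incoming label and return (next_hop, out_label, payload).

    No IP routing lookup happens here: the decision is made on the label alone.
    An out_label of None means the label was popped and the payload is
    forwarded unlabeled.
    """
    out_label, next_hop = lfib[in_label]
    if out_label == IMPLICIT_NULL:
        return next_hop, None, payload  # pop
    return next_hop, out_label, payload  # swap

# R3's entry for the label it advertised to R2: pop and send towards R6.
r3_lfib = {3002: (IMPLICIT_NULL, "R6")}
result = forward_labeled(r3_lfib, 3002, "IPv4 packet")
```

The point of the sketch is that R3 needs nothing but the LFIB entry to forward the packet, so the missing BGP route on R3 never comes into play.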
Carroll, J. & Doyle, J. (2006). Routing TCP/IP, Vol. 1. Cisco Press.
McPherson, D. (2002). RFC 3277: IS-IS Transient Blackhole Avoidance. IETF.
IP Routing: ISIS Configuration Guide, Integrated IS-IS Routing Protocol Overview
Cisco IOS IP Routing: ISIS Command Reference, set-overload-bit