ccie blog

BGP Recursive Routing Failure

I came across a really good problem: a BGP next-hop recursion failure. I bet I get asked about this when I do my CCIE lab exam. Let me show you a basic BGP network setup and then introduce a very annoying problem to solve. Below is the network, with the current configurations underneath.

BGP_Recursive_Next_Hop

R1#
router bgp 1
 no synchronization
 bgp router-id 1.1.1.1
 bgp log-neighbor-changes
 neighbor 150.1.2.2 remote-as 2
 neighbor 150.1.2.2 ebgp-multihop 2
 neighbor 150.1.2.2 update-source Loopback0
 no auto-summary
!
interface Loopback0
 ip address 150.1.1.1 255.255.255.255
!
ip route 150.1.2.0 255.255.255.0 10.0.12.2
R2#
router bgp 2
 no synchronization
 bgp router-id 2.2.2.2
 bgp log-neighbor-changes
 neighbor 150.1.1.1 remote-as 1
 neighbor 150.1.1.1 ebgp-multihop 2
 neighbor 150.1.1.1 update-source Loopback0
 no auto-summary
!
interface Loopback0
 ip address 150.1.2.2 255.255.255.255
!
ip route 150.1.1.0 255.255.255.0 10.0.12.1

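So far this is a very basic configuration to get BGP working: R1 and R2 peer over eBGP between their loopbacks, and each router reaches the other’s loopback through a /24 static route.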

On R2, the admin decides he wants to advertise his loopback address into the network and issues the commands below.

R2(config)#router bgp 2
R2(config-router)#network 150.1.2.2 mask 255.255.255.255

Let’s check out the problem he’s just caused. R1 now loses all of its BGP routes from R2 and then regains them, about once per minute. See below, where I run the sh ip route bgp command a few times in a row to illustrate the problem.

R1#sh ip route  bgp
     150.1.0.0/16 is variably subnetted, 3 subnets, 2 masks
B       150.1.2.2/32 [20/0] via 150.1.2.2, 00:00:03
R1#sh ip route  bgp

R1#sh ip route  bgp

So sometimes the 150.1.2.2 prefix is there and sometimes it isn’t (FYI, if R2 were advertising more BGP routes to R1, the same problem would occur for all of those routes too). The problem is a good one. The new prefix R2 is advertising via BGP (150.1.2.2/32) is more specific than the static route that R1 uses to reach R2’s loopback (which is a /24). Route lookup picks the longest match first, so the /32 BGP route wins over the /24 static route regardless of administrative distance. The routing table now says that to reach 150.1.2.2, go via 150.1.2.2, which is wrong; it should go via 10.0.12.2, as I configured in my static route statement. To illustrate the problem, I’ve captured the routing table at the point where the BGP route actually made it in.

R1#sh ip route  bgp
     150.1.0.0/16 is variably subnetted, 3 subnets, 2 masks
B       150.1.2.2/32 [20/0] via 150.1.2.2, 00:00:00
!
!
R1#sh ip route 150.1.2.2
Routing entry for 150.1.2.2/32
  Known via "bgp 1", distance 20, metric 0
  Tag 2, type external
  Last update from 150.1.2.2 00:00:02 ago
  Routing Descriptor Blocks:
  * 150.1.2.2, from 150.1.2.2, 00:00:02 ago
      Route metric is 0, traffic share count is 1
      AS Hops 1
      Route tag 2
!
!
R1#sh ip cef 150.1.2.2
150.1.2.2/32, version 43, epoch 0
0 packets, 0 bytes
  via 150.1.2.2, 0 dependencies, recursive
    unresolved

So as you can see, to reach any prefix advertised by R2, R1 must first find a route towards the next hop, 150.1.2.2. The routing table above shows that the best route to 150.1.2.2 is the BGP route learned from 150.1.2.2 with a next hop of 150.1.2.2, so the lookup recurses onto itself and we can’t actually route to it. The CEF table confirms this by marking the entry as “unresolved” instead of giving an outgoing interface.

Ideally we want to use the static route that says to reach 150.1.2.2, go via 10.0.12.2. But because R2 is advertising a /32, which is more specific than our /24 static route, the BGP route overrides it, and R1 no longer has a next hop that resolves to an outgoing interface. To fix this, make the static route a /32 instead. With equal prefix lengths, the static route is preferred because of its lower administrative distance (1, versus 20 for eBGP).
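
As a minimal sketch of that fix on R1, assuming the same next hop (10.0.12.2) as the existing static route, the /24 could be swapped for a host route as shown here. Removing the /24 is an assumption made to match “instead”; leaving it in place alongside the /32 should be harmless.

R1(config)#no ip route 150.1.2.0 255.255.255.0 10.0.12.2
R1(config)#ip route 150.1.2.2 255.255.255.255 10.0.12.2

Afterwards, the same show commands used earlier should confirm that 150.1.2.2 now resolves through the static route rather than through BGP. No output is shown here because none was captured for this sketch.

R1#sh ip route 150.1.2.2
R1#sh ip cef 150.1.2.2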

FYI: the route cycles into and out of the routing table every 60 seconds because of the BGP scanner process. It checks whether a BGP prefix has a next hop that resolves through itself, or through the BGP route it installed. If it does, the scanner marks the route as unreachable and drops it from the routing table. The BGP update then comes back in, and the process repeats.

6 Comments

Marshall, August 22nd, 2015 at 3:14 am

Keep it up! I’m mid-studies through the INE v5 R&S and can sympathize with what a massive amount of material the IEs require. Found your blog I think through Googling clarification on one lab or another. One day at a time!

StephenGarbett, August 22nd, 2015 at 10:02 pm

Thanks.

Raito, October 30th, 2015 at 11:22 am

Interesting problem, thank you for sharing.

Navlesh, November 1st, 2015 at 6:33 am

it’s really very helpful. Thanks ,,,great work.

Ram Kumar G, July 18th, 2016 at 12:36 pm

R1#sh ip cef 150.1.2.2
150.1.2.2/32
nexthop 10.0.12.2 GigabitEthernet0/0
R1#

Ravi, February 10th, 2017 at 11:49 am

Nice explanation of a very common issue.

Thanks….
