BGP Next Hop Self

Hi,
I’m new to BGP and cannot get past an issue. I set up this simple lab:

R1:

R1#sh run | s bgp
router bgp 10
no synchronization
bgp router-id 1.1.1.1
bgp log-neighbor-changes
network 77.0.10.1 mask 255.255.255.255
neighbor 203.0.113.2 remote-as 20
no auto-summary

R2:

R2#sh run | s bgp
router bgp 20
no synchronization
bgp log-neighbor-changes
neighbor 203.0.113.1 remote-as 10
neighbor 203.0.113.6 remote-as 20
neighbor 203.0.113.6 next-hop-self
no auto-summary

R3:

R3#sh run | s bgp
router bgp 20
no synchronization
bgp log-neighbor-changes
network 88.0.20.1 mask 255.255.255.255
neighbor 203.0.113.5 remote-as 20
no auto-summary

I didn’t configure an IGP between R2 and R3 because they’re directly connected.

R1 cannot ping R3’s Loopback 1 interface (88.0.20.1) unless I add the source parameter, because otherwise there is no route back to R1 from R3:

R1# ping 88.0.20.1 source 77.0.10.1

So there is no communication between those routers unless I specify the source IP.

How can it be solved in a more realistic scenario? Am I missing some configuration?

Should a default route be redistributed via an IGP inside each AS, pointing at the border router? I guess advertising 203.0.113.4/30 and 203.0.113.0/30 through eBGP is not a good option.

Thanks.

Hello Miguel

BGP functions somewhat differently than IGPs. When a router learns about a particular destination using an IGP like OSPF or EIGRP, you expect to be able to reach that destination with a ping. This is not the case with BGP.

This is because BGP is used to advertise networks from one Autonomous System (AS) to another. The primary information that is given to a BGP router is the destination AS. In your scenario, the 77.0.10.1/32 network is advertised, and R3 learns about it. R3 knows that to reach that network it must go via AS10. So BGP is telling R3 “You must follow the path of AS20 → AS10 to reach your destination.”

Simply knowing the ASes that you will traverse does not guarantee routing success. You must ensure that the underlying routing within each AS is also functional. In your particular case, what is happening is that when you ping 88.0.20.1 from R1 without specifying the source, the IP address of the exit interface is used as the source IP. Specifically, 203.0.113.1 is used as the source IP. When this ping reaches R3, it must send the reply back to 203.0.113.1. But it doesn’t know where to find it because it’s not in the routing table.

If you want to make this ping work, you must enable an IGP within AS20 and advertise the 203.0.113.0/30 network into the routing protocol.
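
As a rough sketch of that suggestion, assuming OSPF is the IGP (process ID 1 and area 0 are arbitrary choices on my part, not something from your lab), the configuration could look something like this:

```
! R2 (assumed: OSPF process 1, area 0)
! first network statement: link to R1, so R3 learns a route to 203.0.113.1
! second network statement: link to R3
router ospf 1
 network 203.0.113.0 0.0.0.3 area 0
 network 203.0.113.4 0.0.0.3 area 0

! R3 (assumed: OSPF process 1, area 0)
router ospf 1
 network 203.0.113.4 0.0.0.3 area 0
```

On R2 you would normally also make the R1-facing interface passive so that no OSPF hellos are sent toward AS10. The interface names aren't shown in your output, so I've left that part out. In a lab this small, a single static route on R3 for 203.0.113.0/30 pointing at 203.0.113.5 would achieve the same thing.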

However, keep in mind that you will rarely want a core BGP router to communicate directly with some end device in another AS. This lack of connectivity between the interface of a BGP router and an end device (represented by your Lo1 interface of R3) is actually desirable, and it’s what you would see in a production network. Does that make sense?

I hope this has been helpful!

Laz

Hi Rene & Team
As per RFC 4271,
=========
When sending a message to an internal peer, if the route is not
locally originated, the BGP speaker SHOULD NOT modify the
NEXT_HOP attribute unless it has been explicitly configured to
announce its own IP address as the NEXT_HOP.

Topo:
====
–ibgp—10/24-[1.1.1.1]–ibgp—[2.2.2.2]—ibgp----[3.3.3.3]----ibgp–[4.4.4.4]–10/24–ibgp–

Now, assume none of the routers have next-hop-self configured.

Will 2.2.2.2 advertise 10/24 to 3.3.3.3 without changing the next hop, i.e. with 1.1.1.1 as the NH?

Will the split horizon rule still be applied?

In simple words, if the next hop is not changed, every router in the full mesh will know the direct NH to reach the 10/24 network, so why do we need to enforce the split horizon rule?

Does it mean:
Because of full mesh + next-hop-self, you have to enable split horizon.
If full mesh + external_advertise is used, you don’t need split horizon.

Can you please help clarify.

Hello James

I understand your question and it can indeed become confusing. It’s important to realize that the split horizon rule and the next hop attribute are two separate features and don’t affect one another.

In your particular topology, router 1.1.1.1 has an iBGP peering with 2.2.2.2, with 3.3.3.3, and with 4.4.4.4. Even though the topology is in a straight line, these peerings still form. Remember, iBGP peerings can form between iBGP peers that are NOT directly connected.

So 1.1.1.1 advertises the 10.0.0.0/24 network to 2.2.2.2, and to 3.3.3.3, and to 4.4.4.4, and in all these advertisements, 1.1.1.1 is the next hop IP.

Split horizon says that 2.2.2.2 WILL NOT re-advertise the 10.0.0.0/24 network it learned from 1.1.1.1 to any other iBGP peer.
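
This rule is the reason iBGP needs a full mesh in the first place. A sketch of what 2.2.2.2 might carry in your topology, assuming AS 65000 and Loopback0 as the peering source (neither is given in your post, so treat both as placeholders):

```
! 2.2.2.2 (assumed: AS 65000, sessions sourced from Loopback0)
router bgp 65000
 neighbor 1.1.1.1 remote-as 65000
 neighbor 1.1.1.1 update-source Loopback0
 neighbor 3.3.3.3 remote-as 65000
 neighbor 3.3.3.3 update-source Loopback0
 neighbor 4.4.4.4 remote-as 65000
 neighbor 4.4.4.4 update-source Loopback0
```

Each of the other routers would have the mirror-image configuration, and an IGP has to carry the loopback addresses so that the sessions can come up.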

Now what you are saying is, since the next hop is not changed, why don’t we cancel out the split horizon rule and have 2.2.2.2 re-advertise the 10.0.0.0/24 network to 3.3.3.3? Won’t it still have the same correct next hop IP of 1.1.1.1, avoiding routing loops? Yes, this is the case, but it doesn’t solve the possible loop that can still be created, and that’s why we still need the split horizon rule!

If split horizon isn’t applied, for example, 4.4.4.4 may then advertise the 10.0.0.0/24 network to 1.1.1.1. Then 1.1.1.1 will learn about its own network from another iBGP peer. And what if one of the routers does change the next hop to itself? Then we can still have a loop! Even if we didn’t, we would still have all the BGP peers re-advertising all of the networks they learned from everyone else, and that too is inefficient and unacceptable. Only one source should exist for routes injected into iBGP within an AS. So split horizon should and would always be applied regardless of whether or not next hop self is applied.

Ultimately, the next hop and the split horizon rule are there to solve two different problems. Under certain circumstances, it may seem that leaving the next hop unchanged makes split horizon unnecessary, but dropping the rule would still cause inefficiency, and possible loops if the next hop self command were used anywhere.

I hope this has been helpful!

Laz

Hello, everyone.

I don’t quite understand the default next-hop behaviour. Why does the next-hop not change automatically when a route is received from an eBGP peer and then advertised to an iBGP peer?

From all the time I’ve labbed something, I always needed the next-hop-self command which made me think, why isn’t it issued by default?

David

Hello David

There are two issues involved here. The first has to do with design. BGP is fundamentally a path vector protocol, and as such, the next hop IP is an important part of the protocol. Indeed, NEXT_HOP is a well-known mandatory attribute, and by default a router does not modify it when it passes a route learned from an eBGP peer on to an iBGP peer. This is a fundamental architectural design of BGP.

Now the truth is that for virtually all enterprise networks, the next hop self feature is needed because internal routers don’t know how to get to the external eBGP peer. However, BGP was not originally designed only for enterprise networks. It was designed for large ISPs and transit ASes too.

In many such implementations, the internal routers DO know how to get to the eBGP peer outside of the AS. This is especially true in ISP style architectures as well as when using MPLS L3VPNs.

So, although in many scenarios the next-hop-self feature is needed, in ISP-style backbones where the IGP already carries the external links it is often unnecessary.
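
If you want to check which case you are in on a given internal router, something like the following should help. The prefix 192.0.2.0/24 and the next hop 198.51.100.1 are placeholders, so substitute a prefix learned via iBGP and the next hop shown for it:

```
! on an internal (iBGP-only) router; prefix and next hop are placeholders
show ip bgp 192.0.2.0/24
show ip route 198.51.100.1
```

The first command shows the next hop that BGP received for the prefix, and the second shows whether the router has a route to that next hop. If it doesn't, the BGP path can't be selected as a best path, and that is the situation where next-hop-self (or carrying the external link in the IGP) is needed.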

I hope this has been helpful

Laz