BGP Neighbor Adjacency States

Hello Rene.

I appreciate your and Laz’s help here. We can leave it as it is, as it’s probably not that important, it’s just something I was curious about.

I’ve one more question that I want to ask.
What’s up with these empty UPDATE messages? They are sent each time a neighborship is successfuly established.

The first one makes perfect sense since that is a network that I am advertising into BGP but what’s up with the second empty UPDATE message?

This occurs every time a neighbor adjacency is established, regardless of whether any networks are actually being advertised. Both routers send an empty UPDATE message. I’d assume that it’s some sort of KEEPALIVE, but KEEPALIVE messages are literally sent before it during the OpenConfirm state. Any ideas here?

Kind regards,

Hello David

It’s always worth investigating these things as they help us to dig deeper into the inner workings of BGP and the “why” concerning the way the protocol has been designed. Sometimes we encounter to such situations where it is difficult to interpret how they operate, and sometimes it just happens to be the way that a particular vendor implements the protocol on their devices.

Now concerning your other question, this is what is known as an End-of-RIB marker. In RFC 4724, which describes the Graceful Restart Mechanism for BGP, this is further explained like so:

An UPDATE message with no reachable Network Layer Reachability
Information (NLRI) and empty withdrawn NLRI is specified as the End-
of-RIB marker that can be used by a BGP speaker to indicate to its
peer the completion of the initial routing update after the session
is established…

Although the End-of-RIB marker is specified for the purpose of BGP
graceful restart, it is noted that the generation of such a marker
upon completion of the initial update would be useful for routing
convergence in general, and thus the practice is recommended.

Also, note that multiple BGP messages can be grouped together within a single TCP segment rather than being sent separately. In the Wireshark output that you shared, we see that the End-of-RIB marker is actually sent as a separate Update message. Whereas the first update message has a non-zero value for the path attribute length, the send update is indeed an End-of-RIB marker since both values are set to 0. Does that make sense?

I hope this has been helpful!


Hello, everyone!

The most confusing thing to me that I never wrapped my head around are the first 3 BGP neighbor states (well the first 2, the 3rd one is optional).

So when we configure BGP and specify a start event (a neighbor command, for example) the router sets the state for that neighbor to Idle. Here it will find a matching route, initialize all the necessary resources and timers, start listening for TCP port 179 connections and send a SYN message and then move on to the Connect state, is this correct?

In this state, the routers are trying to establish the TCP connection using the 3-way handshake. Once this process is successfull, an OPEN message is sent and the router (whoever is the first to do this) moves into the OpenSent state, and so does the other router.

What confuses is me is all the additional… hassle around it :smiley:. If an error occurs during the adjacency formation which causes it to move back to Idle, the ConnectRetryTimer is set to 60 seconds and doubles on subsequent failures?

Then, if the TCP connection in the Connect state fails and the ConnectRetryTimer depletes, a new TCP connection is attempted, the timer is reset and the adjacency state is moved to Active?

And then when the ConnectRetryTimer depletes itself during the Active state… we move back to the Connect state and reset the ConnectRetryTimer? :smiley:

Isn’t this just redundant? It feels like the two states are just playing ping pong with eachother. If the ConnectRetryTimer depletes itself in Connect, the state is moved to Active and vice-versa. This is what confuses me. The ConnectRetryTimer and all this state moving/timing behind it.

Not to mention that I was never successful in labbing these (as indicated by the posts above) :smiley:

Can someone please shed some light onto this?

Thank you.

Hello David

Yes that is correct. The Idle state is essentially a preparation state, preparing all that’s necessary to start communication. As soon as the SYN message is sent or received, it moves to the Connect state.

The Connect state can be seen as the TCP three-way handshake itself. Once it is successful, it moves on to the OpenSent state. Now, what if there is a failure? In that case, the ConnectRetryTimer is indeed set to 60 seconds and it doubles on subsequent failures. Why? This is a mechanism to prevent constant, rapid attempts to establish a connection, which could consume significant resources. It is a kind of dampening method to avoid flapping.

BGP is not designed to converge quickly like IGPs. Its stability is much more important than its speed of convergence, because flapping BGP routes can have devastating effects on the network, and on the Internet as a whole.

If the TCP connection fails in the Connect state and the ConnectRetryTimer expires, the router attempts a new TCP connection, resets the timer, and transitions to the Active state.

When the ConnectRetryTimer expires in the Active state, the router transitions back to the Connect state and resets the timer. This is not redundant but rather a way to continuously attempt to establish a connection until successful, with a delay between attempts to conserve resources.

The confusion might arise from the fact that the Active state is optional and often skipped in modern implementations. In the past, the Active state was used to initiate a new connection when the initial attempt failed, but nowadays, routers often move directly from Connect to OpenSent state, skipping the Active state altogether.

In a lab environment, you might not see these states due to the speed of modern networks and devices. The transition between states usually happens too fast to be observed.

I hope this has been helpful!