BGP Neighbor Adjacency States

davidilles · July 5, 2023, 3:04pm

Hello Rene.

I appreciate your and Laz’s help here. We can leave it as it is, as it’s probably not that important, it’s just something I was curious about.

I’ve one more question that I want to ask.

What’s up with these empty UPDATE messages? They are sent each time a neighborship is successfully established.

The first one makes perfect sense since that is a network that I am advertising into BGP but what’s up with the second empty UPDATE message?

This occurs every time a neighbor adjacency is established, regardless of whether any networks are actually being advertised. Both routers send an empty UPDATE message. I’d assume that it’s some sort of KEEPALIVE, but KEEPALIVE messages are literally sent before it during the OpenConfirm state. Any ideas here?

Kind regards,
David

lagapidis · July 10, 2023, 10:48am

Hello David

It’s always worth investigating these things, as they help us dig deeper into the inner workings of BGP and the “why” behind the way the protocol has been designed. Sometimes we encounter situations where it is difficult to interpret how they operate, and sometimes it just happens to be the way that a particular vendor implements the protocol on their devices.

Now concerning your other question, this is what is known as an End-of-RIB marker. In RFC 4724, which describes the Graceful Restart Mechanism for BGP, this is further explained like so:

An UPDATE message with no reachable Network Layer Reachability Information (NLRI) and empty withdrawn NLRI is specified as the End-of-RIB marker that can be used by a BGP speaker to indicate to its peer the completion of the initial routing update after the session is established…

Although the End-of-RIB marker is specified for the purpose of BGP graceful restart, it is noted that the generation of such a marker upon completion of the initial update would be useful for routing convergence in general, and thus the practice is recommended.

Also, note that multiple BGP messages can be carried within a single TCP segment rather than each being sent in its own segment. In the Wireshark output that you shared, we see that the End-of-RIB marker is a separate UPDATE message from the one that advertises your network. Whereas the first UPDATE message has a non-zero value for the total path attribute length, the second UPDATE is indeed an End-of-RIB marker, since both the withdrawn routes length and the total path attribute length are set to 0. Does that make sense?
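
To make the check concrete, here is a minimal Python sketch of how one could recognise the IPv4 unicast End-of-RIB marker in raw bytes. It assumes the standard 19-byte BGP header (16-byte marker, 2-byte length, 1-byte type) and the UPDATE layout of withdrawn routes length, withdrawn routes, total path attribute length, path attributes, NLRI. The function names `is_end_of_rib` and `build_update` are made up for this illustration, and the sketch does not cover other address families, where the marker is encoded differently.

```python
import struct

BGP_HEADER_LEN = 19   # 16-byte marker + 2-byte length + 1-byte type
BGP_TYPE_UPDATE = 2


def build_update(withdrawn: bytes = b"", path_attrs: bytes = b"", nlri: bytes = b"") -> bytes:
    """Assemble a raw UPDATE message (illustrative helper, no validation)."""
    body = (
        struct.pack("!H", len(withdrawn)) + withdrawn
        + struct.pack("!H", len(path_attrs)) + path_attrs
        + nlri
    )
    header = b"\xff" * 16 + struct.pack("!HB", BGP_HEADER_LEN + len(body), BGP_TYPE_UPDATE)
    return header + body


def is_end_of_rib(message: bytes) -> bool:
    """Return True if message is an UPDATE with no withdrawn routes,
    no path attributes and no NLRI (the IPv4 unicast End-of-RIB marker)."""
    if len(message) < BGP_HEADER_LEN + 4:
        return False
    length, msg_type = struct.unpack("!HB", message[16:19])
    if msg_type != BGP_TYPE_UPDATE or length != len(message):
        return False
    # Withdrawn Routes Length sits right after the header.
    (withdrawn_len,) = struct.unpack("!H", message[19:21])
    attr_offset = 21 + withdrawn_len
    if attr_offset + 2 > len(message):
        return False
    (path_attr_len,) = struct.unpack("!H", message[attr_offset:attr_offset + 2])
    # Whatever is left after the path attributes is NLRI.
    nlri_len = len(message) - (attr_offset + 2 + path_attr_len)
    return withdrawn_len == 0 and path_attr_len == 0 and nlri_len == 0


print(len(build_update()))            # 23
print(is_end_of_rib(build_update()))  # True
```

An empty UPDATE built this way is 23 bytes long: the 19-byte header plus the two 2-byte length fields, both zero.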

I hope this has been helpful!

Laz

davidilles · December 20, 2023, 3:25pm

Hello, everyone!

The thing I have never managed to wrap my head around is the first three BGP neighbor states (well, the first two; the third one is optional).

Idle
So when we configure BGP and specify a start event (a neighbor command, for example), the router sets the state for that neighbor to Idle. Here it will find a matching route, initialize all the necessary resources and timers, start listening for connections on TCP port 179, send a SYN, and then move on to the Connect state. Is this correct?

Connect
In this state, the routers are trying to establish the TCP connection using the 3-way handshake. Once this process is successful, an OPEN message is sent and the router (whoever is the first to do this) moves into the OpenSent state, and so does the other router.

What confuses me is all the additional… hassle around it. If an error occurs during the adjacency formation which causes it to move back to Idle, the ConnectRetryTimer is set to 60 seconds and doubles on subsequent failures?

Active
Then, if the TCP connection in the Connect state fails and the ConnectRetryTimer depletes, a new TCP connection is attempted, the timer is reset and the adjacency state is moved to Active?

And then when the ConnectRetryTimer depletes itself during the Active state… we move back to the Connect state and reset the ConnectRetryTimer?

Isn’t this just redundant? It feels like the two states are just playing ping pong with each other. If the ConnectRetryTimer depletes itself in Connect, the state is moved to Active and vice versa. This is what confuses me: the ConnectRetryTimer and all the state changes and timing behind it.

Not to mention that I was never successful in labbing these (as indicated by the posts above)

Can someone please shed some light onto this?

Thank you.
David

lagapidis · January 1, 2024, 8:24am

Hello David

Yes, that is correct. The Idle state is essentially a preparation state, setting up everything that’s necessary to start communication. Once the router initiates the TCP connection to the neighbor (that is, once it sends the SYN), it moves to the Connect state.

The Connect state can be seen as the TCP three-way handshake itself. Once it is successful, it moves on to the OpenSent state. Now, what if there is a failure? In that case, the ConnectRetryTimer is indeed set to 60 seconds and it doubles on subsequent failures. Why? This is a mechanism to prevent constant, rapid attempts to establish a connection, which could consume significant resources. It is a kind of dampening method to avoid flapping.

BGP is not designed to converge quickly like IGPs. Its stability is much more important than its speed of convergence, because flapping BGP routes can have devastating effects on the network, and on the Internet as a whole.

If the TCP connection attempt fails in the Connect state, the router restarts the ConnectRetryTimer and transitions to the Active state, where it keeps listening for a connection initiated by the neighbor.

When the ConnectRetryTimer expires in the Active state, the router transitions back to the Connect state and resets the timer. This is not redundant but rather a way to continuously attempt to establish a connection until successful, with a delay between attempts to conserve resources.

The confusion might arise from the fact that the Active state is only entered when a connection attempt fails. When the TCP handshake succeeds on the first try, which is the normal case, the router moves directly from Connect to OpenSent and never passes through the Active state at all.

In a lab environment, you might not see these states due to the speed of modern networks and devices. The transition between states usually happens too fast to be observed.
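
If it helps to see the back and forth between Connect and Active laid out, here is a small Python sketch of a simplified transition table. It is only a model of the commonly taught FSM: the event names are borrowed from RFC 4271, the DelayOpen and passive-start options are left out, and real implementations add further conditions, so treat the table as an assumption rather than a statement of what any particular router does.

```python
from enum import Enum


class State(Enum):
    IDLE = "Idle"
    CONNECT = "Connect"
    ACTIVE = "Active"
    OPEN_SENT = "OpenSent"


class Event(Enum):
    MANUAL_START = "ManualStart"
    TCP_ESTABLISHED = "TcpConnectionConfirmed"
    TCP_FAILED = "TcpConnectionFails"
    CONNECT_RETRY_EXPIRED = "ConnectRetryTimer_Expires"


# Simplified transitions for the first three states (assumed model).
TRANSITIONS = {
    (State.IDLE, Event.MANUAL_START): State.CONNECT,
    (State.CONNECT, Event.TCP_ESTABLISHED): State.OPEN_SENT,
    (State.CONNECT, Event.TCP_FAILED): State.ACTIVE,
    # Timer expiry in Connect: retry the TCP connection, stay in Connect.
    (State.CONNECT, Event.CONNECT_RETRY_EXPIRED): State.CONNECT,
    (State.ACTIVE, Event.TCP_ESTABLISHED): State.OPEN_SENT,
    # Timer expiry in Active: go back and initiate a new connection.
    (State.ACTIVE, Event.CONNECT_RETRY_EXPIRED): State.CONNECT,
}


def next_state(state: State, event: Event) -> State:
    """Look up the next state; any pair not modelled here falls back to Idle."""
    return TRANSITIONS.get((state, event), State.IDLE)


state = State.IDLE
for event in (Event.MANUAL_START, Event.TCP_FAILED,
              Event.CONNECT_RETRY_EXPIRED, Event.TCP_ESTABLISHED):
    state = next_state(state, event)
    print(event.value, "->", state.value)
```

The walk at the end goes Idle, Connect, Active, Connect, OpenSent, which is the ping pong described above ending in a successful handshake.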

I hope this has been helpful!

Laz

camerone7276 · April 30, 2024, 2:59pm

Hello,

When does it move to OpenConfirm? Does it just go from OpenSent to Established? Or would it move to OpenConfirm first and then to Established?

Thanks

lagapidis · May 2, 2024, 6:29am

Hello Cameron

A BGP router will move from the OpenSent to the OpenConfirm state once it receives and validates the OPEN message from its peer.

Once it enters the OpenConfirm state, it begins to send KEEPALIVE messages, while simultaneously waiting to receive KEEPALIVE messages. Once it receives the first KEEPALIVE message, it will then move to the Established state. Does that make sense?
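
As a rough illustration of that sequence, here is a short Python sketch. It assumes the standard message type codes (OPEN = 1, KEEPALIVE = 4) and the 19-byte header, skips all validation of the OPEN contents, and uses made-up helper names (`build_open`, `build_keepalive`, `advance`) plus a documentation AS number and address, so it is a model of the flow rather than a working BGP speaker.

```python
import struct
from typing import Optional, Tuple

BGP_MARKER = b"\xff" * 16
TYPE_OPEN, TYPE_KEEPALIVE = 1, 4


def build_keepalive() -> bytes:
    """A KEEPALIVE is just the 19-byte header with type 4."""
    return BGP_MARKER + struct.pack("!HB", 19, TYPE_KEEPALIVE)


def build_open(asn: int, hold_time: int, router_id: bytes) -> bytes:
    """Minimal OPEN: version 4, AS, hold time, BGP identifier, no options."""
    body = struct.pack("!BHH4sB", 4, asn, hold_time, router_id, 0)
    return BGP_MARKER + struct.pack("!HB", 19 + len(body), TYPE_OPEN) + body


def advance(state: str, received: bytes) -> Tuple[str, Optional[bytes]]:
    """Return (new_state, reply_to_send) for the last part of the handshake."""
    msg_type = received[18]  # type is the last byte of the header
    if state == "OpenSent" and msg_type == TYPE_OPEN:
        # A real speaker validates the OPEN first; that step is omitted here.
        return "OpenConfirm", build_keepalive()
    if state == "OpenConfirm" and msg_type == TYPE_KEEPALIVE:
        return "Established", None
    return "Idle", None  # anything else resets the session in this sketch


# Hypothetical peer values: AS 65001, hold time 180, router ID 192.0.2.1.
state, reply = advance("OpenSent", build_open(65001, 180, bytes([192, 0, 2, 1])))
print(state, len(reply))   # OpenConfirm 19
state, reply = advance(state, build_keepalive())
print(state)               # Established
```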

I hope this has been helpful!

Laz

camerone7276 · May 3, 2024, 12:36am

That makes perfect sense, thank you so much Laz. Have a great night

ashwith171 · September 17, 2024, 9:45pm

Hi Rene,

What is the difference between the Connect and Active states? In both states the router is trying to initiate a TCP connection, right?

Thank You,
Ashwith

lagapidis · September 19, 2024, 4:34am

Hello Ashwith

Yes, you’re correct that in both the Connect and Active states, the router is trying to initiate a TCP connection. However, there is a key difference between the two:

In the Connect state, the router has initiated a TCP connection to its neighbor (it has sent a SYN) and is waiting for the three-way handshake to complete. If the handshake succeeds, the router sends an OPEN message and moves to the OpenSent state. If the ConnectRetryTimer expires first, it restarts the timer and tries again. If the connection attempt fails, in the commonly described FSM it moves to the Active state.

In the Active state, the router is mainly listening for an incoming TCP connection from its neighbor rather than initiating one. BGP has no hello packets; everything runs over TCP port 179. If the neighbor’s connection is accepted, the router sends an OPEN message and moves to the OpenSent state. If the ConnectRetryTimer expires first, the router goes back to the Connect state and initiates a new TCP connection itself.

So, the main difference is who initiates the TCP connection: in Connect the router actively opens it, while in Active it waits for the neighbor to do so until the timer sends it back to Connect.

I hope this has been helpful!

Laz

sathish84be · December 29, 2024, 2:18pm

Hi Team, May I know what kind of resources BGP will initialize?

lagapidis · January 5, 2025, 2:23pm

Hello Sathish

In the description of the IDLE state in the BGP FSM, when it says “initialize some resources,” it refers to the preparatory tasks and system-level allocations that BGP performs to get ready for establishing a connection with a remote neighbor. Specifically, this includes:

Memory Allocation: Allocating memory to store session-specific data, such as BGP messages, route tables, and neighbor state information.
Data Structures Setup: Initializing internal data structures to manage the BGP session. This might include creating entries for the neighbor in tables that track BGP sessions and routes.
Timers Configuration: Setting up or resetting necessary timers.
Event Handling Setup: Preparing mechanisms to handle specific BGP events and transitions, ensuring that the FSM is ready to respond to changes or triggers like successful TCP connections or session resets.
Socket Setup for Listening: Opening a TCP socket and listening for incoming connection attempts from the remote neighbor.
Log Initialization: Starting or resetting logging mechanisms to track session establishment and debug information.

These are all internal mechanisms performed at a system level by BGP and are not generally configurable. But they ensure that BGP is ready to handle the next stages in the FSM, such as transitioning to the Connect state once the TCP connection attempt is initiated.
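
To give a feel for what this per-neighbor state might look like, here is a hedged Python sketch. The class and field names are invented for illustration. The timer defaults are the values RFC 4271 suggests (ConnectRetryTime 120 seconds, HoldTime 90 seconds, KeepaliveTime one third of HoldTime); vendors use their own defaults, so treat the numbers as assumptions.

```python
from dataclasses import dataclass


@dataclass
class BgpSession:
    """Per-neighbor state a speaker might set up while in Idle (illustrative)."""
    neighbor: str
    state: str = "Idle"
    connect_retry_counter: int = 0
    connect_retry_time: int = 120   # seconds, RFC 4271 suggested default
    hold_time: int = 90             # seconds, RFC 4271 suggested default
    keepalive_time: int = 30        # one third of hold_time
    connect_retry_timer: int = 0    # remaining seconds; 0 means stopped
    listening: bool = False         # stands in for the TCP 179 listen socket

    def manual_start(self) -> None:
        """Simplified handling of a start event in the Idle state."""
        if self.state != "Idle":
            return
        self.connect_retry_counter = 0
        self.connect_retry_timer = self.connect_retry_time  # start the timer
        self.listening = True      # accept connections from the neighbor
        self.state = "Connect"     # a real speaker also sends the SYN here


session = BgpSession(neighbor="192.0.2.2")  # documentation address
session.manual_start()
print(session.state, session.connect_retry_timer)  # Connect 120
```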

I hope this has been helpful!

Laz

sathish84be · January 5, 2025, 2:57pm

Thank you for the detailed information