BGP Neighbor Adjacency States

Hello Tejas

Yes, you can configure the hold timer to be zero. In this case, keepalives are not sent, and the BGP session remains up. RFC 4271 confirms this. However, this is not a configuration that you should use in today’s networks. The option exists to accommodate running BGP over dial-up circuits such as ISDN, where connectivity on demand was configured. The keepalives sent by BGP would continually cause ISDN to redial and connect, resulting in additional costs for the circuit.

Today you wouldn’t want to configure a hold timer of 0 because if a BGP neighbor does go down, there are no keepalives to miss and no hold timer to expire, so BGP itself will not detect the failure.
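
To make the effect of a zero hold time concrete, here is a minimal Python sketch. It assumes the keepalive interval is one third of the hold time, which RFC 4271 mentions as a reasonable maximum; real implementations let you configure both values. The function name `keepalive_interval` is made up for this illustration.

```python
from typing import Optional


def keepalive_interval(hold_time: int) -> Optional[float]:
    """Return the keepalive interval in seconds, or None if keepalives are off.

    Assumption: keepalives are sent every hold_time / 3 seconds.
    """
    if hold_time == 0:
        # A hold time of zero means no periodic keepalives are sent and the
        # hold timer never expires, so a dead peer goes unnoticed by BGP.
        return None
    if hold_time < 3:
        raise ValueError("hold time must be 0 or at least 3 seconds")
    return hold_time / 3


print(keepalive_interval(180))  # 60.0
print(keepalive_interval(0))    # None
```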

I hope this has been helpful!

Laz

Hi Rene,

I am confused about the ConnectRetry timer in the Idle, Connect, and Active states. The lesson says the ConnectRetry timer "will reset". What happens if the ConnectRetry timer expires in each of these states, and in which state is the timer reset again?

Regards,
varma

Hello Chandrasekhar

The ConnectRetry Timer plays various roles for different states. However, there seems to be some discrepancy between how Cisco describes it and how it is described in the RFC. Some of the details can be found below, but I believe that the clearest picture can be seen from the RFC, whose link you will find below.

A router will enter the Idle state when BGP is initially activated. At the time of activation, the ConnectRetry timer is undefined and irrelevant. It only begins to have meaning for the Idle state if an error occurs. Specifically, for the Idle state, the ConnectRetry timer will be set to 60 seconds in the event of a failed start event, or in the event that we reenter the Idle state due to an error at some other stage of the process. The timer must reach zero before another connection attempt is initiated. Further failures to leave the Idle state will cause the ConnectRetry timer to double in length from the previous iteration.

The next state is the Connect state, where a TCP connection is initiated. A 3-way TCP handshake is attempted, and if it completes, the ConnectRetry timer is reset, an Open message is sent to the neighbor, and then the OpenSent state begins. If this process is successful, the ConnectRetry timer plays no role. However, if the ConnectRetry timer expires before the 3-way handshake is complete, then, according to Cisco, the state moves to Active, but according to RFC 4271, it actually stays in the Connect state and retries the connection.

In the Active state, if the ConnectRetry timer expires, it initiates a TCP connection to the other BGP peer and changes its state back to Connect.

In the OpenSent, OpenConfirm, and Established states, the ConnectRetry timer doesn’t play a role, but if an error occurs, and BGP returns to the Idle state, the ConnectRetry timer is reset to 0.

I hope this has been helpful!

Laz

Hello Rene/Laz,

My question is regarding the Active and Connect states. I see in your debugs that on R1 BGP goes from Idle to Active, while on R2 it goes from Idle to Connect. Why this discrepancy? I have observed similar results in my Wireshark captures as well. Here is what I am assuming. For the active router, BGP goes from Idle to Connect as the TCP connection was initiated. While on the passive router, the TCP connection request was received and hence it goes from Idle to Connect. This deduction seems pretty flawed to be honest, but after going through numerous sources, this is the only possible reason I could think of. Can you please shed some light on this?

Thanks.

Hello Kunj

Unfortunately, the answer is not very technical. As with many debugs, sometimes events occur quite quickly, and some information may simply be omitted. In this case, it seems that R1 simply went through the states of the FSM quite quickly, so the debug displayed multiple events using a single statement saying that it went from Idle to Active. On R2, the event was not so quick, so it displayed the Idle to Connect state change.

In the lesson, Rene states that “(it doesn’t show the Connect state in the debug)”.

I hope this has been helpful!

Laz

Hello Rene/Laz,
Can you please confirm why and when we need to configure a BGP neighborship with an APIPA address?
Regards
Unni

Hello Unni

APIPA or Link Local addresses for IPv4 should never be used as the IP addresses of BGP peers. This goes contrary to the address usage rules dictated in RFC 3927. However, some vendors and systems do use them. One such situation is with Amazon’s AWS service, an example of which can be found here. This has been known to be used over an IPsec tunnel setup to Amazon VPC. I am not familiar with other uses of APIPA addresses for such purposes.
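
As a quick illustration, the IPv4 link-local (APIPA) block that RFC 3927 reserves is 169.254.0.0/16, so checking whether a peer address falls into it takes only a few lines of Python. This is a minimal sketch; the helper name `is_apipa` and the sample addresses are made up for illustration.

```python
import ipaddress

# 169.254.0.0/16 is the IPv4 link-local (APIPA) block defined in RFC 3927.
LINK_LOCAL_V4 = ipaddress.ip_network("169.254.0.0/16")


def is_apipa(address: str) -> bool:
    """Return True if the IPv4 address falls inside 169.254.0.0/16."""
    return ipaddress.ip_address(address) in LINK_LOCAL_V4


# Hypothetical peer addresses, chosen only for illustration.
print(is_apipa("169.254.10.1"))  # True
print(is_apipa("192.168.12.2"))  # False
```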

I hope this has been helpful!

Laz

When the router is in the OpenConfirm state, what happens if a keepalive is not received? Does it go back to Idle?

Thanks,
Michael

Hello Michael

When all else fails, check the RFC! According to RFC 4271 (pages 67-68), the OpenConfirm state is maintained until a Keepalive is received, in which case it goes to Established, or a Notification message is received, in which case it goes to Idle. It also states that:

If the HoldTimer_Expires event (Event 10) occurs before a KEEPALIVE message is received, the local system: [among other things] changes its state to Idle.

Now the HoldTimer is a value that is contained in a field within the BGP OPEN message. The value is negotiated. Again, according to Page 13 of the RFC:

This 2-octet unsigned integer indicates the number of seconds the sender proposes for the value of the Hold Timer. Upon receipt of an OPEN message, a BGP speaker MUST calculate the value of the Hold Timer by using the smaller of its configured Hold Time and the Hold Time received in the OPEN message. The Hold Time MUST be either zero or at least three seconds. An implementation MAY reject connections on the basis of the HoldTime. The calculated value indicates the maximum number of seconds that may elapse between the receipt of successive KEEPALIVE and/or UPDATE messages from the sender.

At other points in the FSM, for example when the OPEN message is sent and the state changes to OpenSent, the RFC has the HoldTimer set to a large value, and it suggests four minutes.
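
The negotiation rule quoted from page 13 can be sketched in a few lines of Python. This is only an illustration of the quoted text, not how any particular router implements it; the function name `negotiate_hold_time` is hypothetical, and rejecting a bad value with an exception stands in for whatever error handling a real speaker would perform.

```python
def negotiate_hold_time(configured: int, received: int) -> int:
    """Return the hold time (seconds) a speaker would use after an OPEN.

    Rule from the RFC text quoted above: use the smaller of the locally
    configured value and the value received in the OPEN message. Each
    value must be zero or at least three seconds.
    """
    for value in (configured, received):
        if not 0 <= value <= 0xFFFF:
            raise ValueError("hold time must fit in a 2-octet unsigned integer")
        if value in (1, 2):
            raise ValueError("hold time must be 0 or at least 3 seconds")
    return min(configured, received)


print(negotiate_hold_time(180, 90))  # 90
print(negotiate_hold_time(180, 0))   # 0
```

Note that under this rule a peer proposing zero always wins, which disables the hold timer for the session.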

I hope this has been helpful!

Laz


@lagapidis: Thanks for the pointer, that answers my question!

Michael


Hi all,

I think this is the best explanation I’ve read but still, after this, the OCG, RFC, CBT Nuggets, and other youtube videos, I’m still a little confused about the relationship and interactivity between the Connect and Active states.

From some playing around in GNS3 with debugging and packet captures, it seems like there may be some relationship between Connect/Active and whether a router becomes the active or passive side of the neighbor negotiation process. Is that on the right track or no?

Other than that it seems kind of random whether a router enters Connect or Active immediately following the Idle state.

Do you know how specific the ENCOR exams are with regard to this type of thing? Generally speaking, compared to prepping for the CCNA, I find myself digging really deep on every subject so far and I’m not sure if that’s the best use of my study time or if I should make myself stop at a certain point and move on.

Hello Aaron

The processes involved in the BGP finite state machine are actually more complicated than those described in the lesson. If you want to get into the full detail, check out Section 8.2.2 Finite State Machine of RFC 4271 that describes the process in detail.

The finite state machine of BGP has many timers and various events that determine the change from state to state. Concerning the Connect and Active states, according to this RFC, there is only one situation where the connect state will move to the active state:

 If the TCP connection fails (Event 18), the local system checks
      the DelayOpenTimer.  If the DelayOpenTimer is running, the local
      system:
        - restarts the ConnectRetryTimer with the initial value,
        - stops the DelayOpenTimer and resets its value to zero,
        - continues to listen for a connection that may be initiated by
          the remote BGP peer, and
        - changes its state to Active.

So if you’re in the Connect state and the TCP connection fails while the DelayOpenTimer is still running (the DelayOpenTimer is used to delay the sending of an OPEN message on a connection), then the peer goes into the Active state.

Similarly, when in an Active state, the BGP peer will only enter the Connect state in one situation:

 In response to a ConnectRetryTimer_Expires event (Event 9), the local system:
        - restarts the ConnectRetryTimer (with initial value),
        - initiates a TCP connection to the other BGP peer,
        - continues to listen for a TCP connection that may be initiated
          by a remote BGP peer, and
        - changes its state to Connect.

You can find out more info about these events in the RFC.
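
The two transitions quoted above can be written down as a tiny state function. This is a deliberately incomplete Python sketch: it models only the Connect/Active cases discussed here, the names `State`, `Event`, and `next_state` are made up for illustration, and the Connect-to-Idle branch reflects my reading of the same RFC section for the case where the DelayOpenTimer is not running.

```python
from enum import Enum


class State(Enum):
    IDLE = "Idle"
    CONNECT = "Connect"
    ACTIVE = "Active"


class Event(Enum):
    CONNECT_RETRY_TIMER_EXPIRES = 9  # Event 9 in RFC 4271
    TCP_CONNECTION_FAILS = 18        # Event 18 in RFC 4271


def next_state(state: State, event: Event, delay_open_running: bool = False) -> State:
    """Return the next state for the Connect/Active cases quoted above.

    Only a small subset of the FSM is modelled; any other state/event
    pair raises NotImplementedError.
    """
    if state is State.CONNECT and event is Event.TCP_CONNECTION_FAILS:
        # With the DelayOpenTimer running, the peer keeps listening in Active.
        # Assumption from the RFC text: without it, the FSM returns to Idle.
        return State.ACTIVE if delay_open_running else State.IDLE
    if state is State.ACTIVE and event is Event.CONNECT_RETRY_TIMER_EXPIRES:
        # A new TCP connection is initiated and the state becomes Connect.
        return State.CONNECT
    raise NotImplementedError(f"{state.value} / {event.name} is not modelled here")


print(next_state(State.CONNECT, Event.TCP_CONNECTION_FAILS, delay_open_running=True).value)
print(next_state(State.ACTIVE, Event.CONNECT_RETRY_TIMER_EXPIRES).value)
```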

There are very specific events that have to take place to get an Idle peer into either Active or Connect state. You can see those in detail in the RFC as well.

As for the certification exams, it is highly unlikely that you would be asked any questions that go into such detail. Digging deeply into every subject is great when you want to learn for learning’s sake. However, for the certifications, I suggest that you go over the content on the site only to the depth in which the content itself has gone. You can make a list of the deeper questions you may have so you can revisit them later, but if you want to focus on the certifications, go only as deep as the lessons go…

I hope this has been helpful!

Laz

R1#
BGP: 192.168.12.2 active went from Idle to Active
BGP: 192.168.12.2 open active, local address 192.168.12.1
BGP: ses global 192.168.12.2 (0x4B43F3FC:0) act Adding topology IPv4 Unicast:base
BGP: ses global 192.168.12.2 (0x4B43F3FC:0) act Send OPEN
BGP: 192.168.12.2 active went from Active to OpenSent
BGP: 192.168.12.2 active sending OPEN, version 4, my as: 1, holdtime 180 seconds, ID C0A80C01
BGP: 192.168.12.2 active rcv message type 1, length (excl. header) 34
BGP: ses global 192.168.12.2 (0x4B43F3FC:0) act Receive OPEN
BGP: 192.168.12.2 active rcv OPEN, version 4, holdtime 180 seconds
BGP: 192.168.12.2 active rcv OPEN w/ OPTION parameter len: 24
BGP: 192.168.12.2 active rcvd OPEN w/ optional parameter type 2 (Capability) len 6
BGP: 192.168.12.2 active OPEN has CAPABILITY code: 1, length 4
BGP: 192.168.12.2 active OPEN has MP_EXT CAP for afi/safi: 1/1
BGP: 192.168.12.2 active rcvd OPEN w/ optional parameter type 2 (Capability) len 2
BGP: 192.168.12.2 active OPEN has CAPABILITY code: 128, length 0
BGP: 192.168.12.2 active OPEN has ROUTE-REFRESH capability(old) for all address-families
BGP: 192.168.12.2 active rcvd OPEN w/ optional parameter type 2 (Capability) len 2
BGP: 192.168.12.2 active OPEN has CAPABILITY code: 2, length 0
BGP: 192.168.12.2 active OPEN has ROUTE-REFRESH capability(new) for all address-families
BGP: 192.168.12.2 active rcvd OPEN w/ optional parameter type 2 (Capability) len 6
BGP: 192.168.12.2 active OPEN has CAPABILITY code: 65, length 4
BGP: 192.168.12.2 active OPEN has 4-byte ASN CAP for: 2
BGP: nbr global 192.168.12.2 neighbor does not have IPv4 MDT topology activated
BGP: 192.168.12.2 active rcvd OPEN w/ remote AS 2, 4-byte remote AS 2
BGP: 192.168.12.2 active went from OpenSent to OpenConfirm
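
A side note on reading this output: the debug prints the BGP identifier in the OPEN message as hexadecimal (ID C0A80C01). A small Python sketch, with a made-up helper name `router_id_from_hex`, converts it to the familiar dotted-decimal form, which here matches the local address shown in the second debug line.

```python
import ipaddress


def router_id_from_hex(hex_id: str) -> str:
    """Convert a 32-bit BGP identifier printed as hex into dotted decimal."""
    return str(ipaddress.IPv4Address(int(hex_id, 16)))


print(router_id_from_hex("C0A80C01"))  # 192.168.12.1
```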

The Open message must be sent in the OpenSent state, right? So why does the debug show "active sending OPEN"?

As I understand it, the TCP three-way handshake is initiated in the Idle state, and the ConnectRetry timer is set to 60 seconds. If the handshake succeeds, the state moves to OpenSent; otherwise it moves to Active, resets the ConnectRetry timer to 60, and initiates another TCP three-way handshake. If the handshake completes the second time, the state moves to OpenSent, the Open message is sent, and the state immediately moves to OpenConfirm. In this state, if the Open message contains the right information and the peer verifies it, the peer router sends a keepalive message. If the keepalive is not received in the OpenConfirm state, the state goes back down to Active. Kindly correct me here.

And once the state reaches OpenSent, the ConnectRetry timer no longer counts, right?

Hello Narad

The BGP Finite state machine (FSM), which is the defined set of processes that create and maintain peers, is defined in detail in RFC4271, which defines BGP in general. There, in the Finite State Machine section of the document, you will find a detailed description of the processes that are followed.

The FSM process is much more detailed and involved than how it is described in the lesson. For certification, you don’t need to know this level of detail. You will notice if you go through it, that there are some cases where the Open message is sent during either the Connect state or the Active state. Specifically, when in the active state, the process says:

If the local system receives a DelayOpenTimer_Expires event (Event 12), the local system:
        - sets the ConnectRetryTimer to zero,
        - stops and clears the DelayOpenTimer (set to zero),
        - completes the BGP initialization,
        - sends the OPEN message to its remote peer,
        - sets its hold timer to a large value, and
        - changes its state to OpenSent.

So you see, the OPEN message is sent during Active state.

For the details of the three-way handshake, the timers, and the sending of BGP messages, take a look at the FSM section of the RFC for a detailed description. If you have further questions after reviewing it, let us know!

I hope this has been helpful!

Laz

Hello,
I don’t exactly understand what it means when the Connect state fails.
The lesson says: “In case it fails, we continue to the Active state. If the ConnectRetry timer expires then we will remain in this state.”
Does that mean the TCP handshake could not complete, or that the ConnectRetry timer has expired?
And also: “If the ConnectRetry timer expires then we will remain in this state.”
Does that mean it stays in the Connect state or the Active state?
I was trying to catch the Connect state by typing #show ip bgp sum while shutting down an interface or resetting the BGP session, but I couldn’t.
Does the Connect state change that fast? And when exactly does it go to Connect? When the router receives a SYN from the other router?
It would be helpful if you could explain that to me.
Best regards,
Marcin

Hello Marcin

It always helps to go back to the original official description of the BGP finite state machine (FSM). This can be found at RFC4271 in section 8.2.2.

As far as the connect state goes, the RFC details what happens in every case, whether a timer expires or another event occurs. Specifically, if the TCP connection fails, it states the following:

  If the TCP connection fails (Event 18), the local system checks
  the DelayOpenTimer.  If the DelayOpenTimer is running, the local
  system:

    - restarts the ConnectRetryTimer with the initial value,

    - stops the DelayOpenTimer and resets its value to zero,

    - continues to listen for a connection that may be initiated by
      the remote BGP peer, and

    - changes its state to Active.

Now the TCP failure is defined as Event 18. Event 18 is defined as:

  Event 18: TcpConnectionFails

     Definition: Event indicating that the local system has received
                 a TCP connection failure notice.

                 The remote BGP peer's TCP machine could have sent a
                 FIN.  The local peer would respond with a FIN-ACK.
                 Another possibility is that the local peer
                 indicated a timeout in the TCP connection and
                 downed the connection.

So to answer your question, a connect failure occurs when the TCP handshake does not complete, or when the BGP peer sends a FIN, or there is a timeout in the TCP connection. (This timeout is not to be confused with the timers that BGP has).

This is the only case when the Connect state will move to the Active state. All other cases described in the RFC move the FSM to either the Idle or the OpenSent state.

As for the ConnectRetry timer expiring in the Connect state: the FSM actually stays in the Connect state. If you look at the RFC you’ll see that if this timer expires, the TCP connection is dropped, and a new TCP connection is initiated. So there is no change of state here.

A router will remain in the connect state as long as it takes to do a TCP handshake. A successful TCP handshake takes milliseconds, so you will most likely not catch it. However, you may be able to simulate a failure by creating a control plane policing scheme to block TCP port 179 on one of the peers. This will delay the handshake and will give you the opportunity to see the router in the Connect state.

So when does a router actually go into the connect state? Again, based on the RFC, when the TCP connection to the other BGP peer begins.

I hope this has been helpful!

Laz

Hello!

I’ve read that “Errors cause the state to revert back to Idle and the ConnectRetryTimer to be set to 60 seconds initially, doubling on subsequent failures”.

Is there any way to produce something like this in a lab? I’ve tried all sorts of different scenarios, yet I’ve never seen this timer in action, nor it doubling on subsequent failures.

For example, I’ve purposely mismatched the AS numbers specified in the neighbor statements which caused the connection to be torn down, the routers moved back into the Idle state and retried the connection for several attempts in a row. It took them just a few seconds, so when exactly does this ConnectRetryTimer come into play?

I’ve packet-captured the scenario I mentioned above (open it in a new tab to make it larger)

David

Hello David

The BGP finite state machine is the name given to the process by which BGP peerings are established or fail to be established. This is described in full in RFC4271 in the Finite State Machine section. The RFC further states details concerning the ConnectRetry timer as well.

Now it is difficult to recreate a situation in which this timer will expire. This is because the expiry occurs when the TCP connection does not complete in time, not because of a misconfiguration. So if you were to mismatch ASes as you did, the failure of BGP is due to incorrect information exchanged and not an expiry of the timer itself.

What must happen in order for this timer to expire? Well, one particular scenario is this:

  1. A BGP peer enters the idle state and initiates a TCP connection to the BGP peer. It then changes its state to Connect.
  2. In the Connect state, the BGP peer is waiting for the TCP connection to be completed. If there is no response from the peer, then the ConnectRetryTimer may expire.

So it is a failure in the TCP exchange that will cause the timer to expire. How can you simulate this? You can create an access list on the remote peer to block TCP port 179 in an incoming direction, thus never allowing the TCP session to establish, thus not allowing the BGP peering to take place. Make sure to use the ACL in an incoming direction, and not outgoing.

If you try it out, let us know how you get along!

I hope this has been helpful!

Laz

Hello Laz

Thank you for providing me with information on how to simulate this. I’ve configured R1 for BGP to peer with R2 and configured an ACL on R2 which denies any TCP traffic from R1 with the destination port of 179.



However, I’ve observed similar behaviour to my example above. The configured BGP speaker attempted the connection every few seconds or so.

Kind regards,
David

Hi David,

This one has me scratching my head. The default Cisco IOS ConnectRetry timer should be 120 seconds.

I’ll look at the latest RFC (4271):

Idle state:
  Initially, the BGP peer FSM is in the Idle state.  Hereafter, the
  BGP peer FSM will be shortened to BGP FSM.

  In this state, BGP FSM refuses all incoming BGP connections for
  this peer.  No resources are allocated to the peer.  In response
  to a ManualStart event (Event 1) or an AutomaticStart event (Event
  3), the local system:

    - initializes all BGP resources for the peer connection,

    - sets ConnectRetryCounter to zero,

    - starts the ConnectRetryTimer with the initial value,

I tried a setup similar to yours. I first tried filtering with an access-list. I also tried using neighbor X update-source Loopback with loopback interfaces that are not advertised.

Whatever I try, I never run into that 120-second delay. I tried this on IOS 15.x, so I’m curious to see what older versions would do…

Rene