OSPF Packets and Neighbor Discovery

lagapidis · December 7, 2016, 9:41am

Hello Shaun.

Yes, you are accurate in your description!

Laz

wilder7bc · September 20, 2017, 10:00pm

I had a question on this forum post. I was curious at what level you learn this, such as CCNP TSHOOT? Or is this not something you pick up in any regular Cisco training, but rather from troubleshooting experience in the real world? I think it's a great piece of information, though, especially for multi-vendor companies such as mine that might have Brocade and Cisco devices and the possibility of different MTUs.

andrewAndrew P

Aug '16



Kishor,
I assume you mean the “ExStart” or “Exchange” state, so I will write about those. If OSPF is having an authentication problem, you will not see the routers stuck in ExStart or Exchange. In fact, you won’t see anything at all. The output for “show ip ospf neighbor” will just be blank (if a neighbor relationship hadn’t already formed). If a neighbor relationship already formed, and then an authentication problem is introduced, the neighbors will just drop once the dead interval is reached. The reason, in both cases, is that if there is an authentication mismatch, the other router’s Hello message will simply be ignored. Because the Hello messages are ignored, the OSPF state machine never even begins, let alone gets to the ExStart phase.
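The idea that a mismatched Hello is silently dropped can be illustrated with a minimal Python sketch. It loosely follows the kinds of checks RFC 2328 describes for received packets and Hellos (area, authentication, timers). The `Hello` and `Interface` classes and their field names are hypothetical, and comparing keys directly is a simplification (cryptographic authentication really compares a digest).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hello:
    """Simplified received Hello; field names are illustrative only."""
    router_id: str
    area_id: str
    auth_type: int       # 0 = null, 1 = simple password, 2 = cryptographic
    auth_key: str        # simplification: real crypto auth carries a digest
    hello_interval: int
    dead_interval: int


@dataclass(frozen=True)
class Interface:
    """Simplified local interface settings; field names are illustrative only."""
    area_id: str
    auth_type: int
    auth_key: str
    hello_interval: int
    dead_interval: int


def accept_hello(iface: Interface, hello: Hello) -> bool:
    """Return True only if the Hello passes every check; otherwise it is dropped."""
    if hello.area_id != iface.area_id:
        return False
    if (hello.auth_type, hello.auth_key) != (iface.auth_type, iface.auth_key):
        return False     # authentication mismatch: the Hello is ignored
    if hello.hello_interval != iface.hello_interval:
        return False
    if hello.dead_interval != iface.dead_interval:
        return False
    return True


iface = Interface("0.0.0.0", 2, "s3cret", 10, 40)
good = Hello("2.2.2.2", "0.0.0.0", 2, "s3cret", 10, 40)
bad = Hello("2.2.2.2", "0.0.0.0", 2, "wrong", 10, 40)
print(accept_hello(iface, good))  # True
print(accept_hello(iface, bad))   # False
```

Because a rejected Hello never creates or refreshes a neighbor entry, the neighbor table stays empty (or the existing entry ages out), which matches the blank output described above.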

To answer your other question, the most common reason by far for OSPF to be stuck in ExStart is an MTU mismatch between neighbors. Besides MTU mismatches, other possibilities include duplicate router IDs, access lists that block unicast packets, or NAT misconfigurations.
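As a rough illustration of the MTU case, here is a small Python sketch of the rule in RFC 2328 section 10.6, where a DBD packet is rejected if its Interface MTU field is larger than what the receiving interface can accept. The function name and the `mtu_ignore` flag are hypothetical; the flag only loosely models disabling the check (Cisco IOS has an `ip ospf mtu-ignore` interface command for that purpose), and actual vendor behaviour may differ from this sketch.

```python
def accepts_dbd(local_mtu: int, advertised_mtu: int, mtu_ignore: bool = False) -> bool:
    """Return True if a DBD advertising `advertised_mtu` is accepted locally."""
    if mtu_ignore:
        return True                      # check disabled on this interface
    return advertised_mtu <= local_mtu   # RFC 2328 10.6: reject if larger


r1_mtu, r2_mtu = 1500, 1400

print(accepts_dbd(r1_mtu, r2_mtu))                   # True: R1 accepts R2's DBD
print(accepts_dbd(r2_mtu, r1_mtu))                   # False: R2 rejects R1's DBD
print(accepts_dbd(r2_mtu, r1_mtu, mtu_ignore=True))  # True: check skipped
```

If even one side keeps rejecting the other's DBD packets, the exchange cannot complete, so the adjacency never leaves ExStart/Exchange.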

I recommend you check out a Cisco article that goes into great detail on OSPF getting stuck. You will find they have a very detailed 14-step explanation of why an MTU mismatch causes this problem.

http://www.cisco.com/c/en/us/support/docs/ip/open-shortest-path-first-ospf/13684-12.html#exstart2

lagapidis · September 23, 2017, 8:43am

Hello Brian

@andrew can tell us from his point of view, but I’d like to share my experience as well. I find that the basic theory necessary to understand these things is given clearly in CCNA and CCNP, where we learn about the different states of the OSPF convergence process. The ExStart and Exchange states are actually first mentioned in CCNA, but are further examined in CCNP. So really, the theory needed to comprehend what is going on is there in the certifications. The experience allows you to solidify the theory into a more comprehensive and clearer understanding of the concepts. When you face such an issue in the real world and you troubleshoot and figure it out, you fully understand the intricacies of its functionality. Without the experience, these concepts will not be fully understood. But without the theory provided by the certifications, you’ll never get a chance to gain the experience in the first place.

I hope this has been helpful!

Laz

wilder7bc · September 23, 2017, 2:55pm

very good explanation thanks!

hengsovandara1345 · December 19, 2017, 5:04pm

Hi
Are DR/BDR and Master/Slave the same?

ReneMolenaar · December 21, 2017, 12:48pm

Hi Heng,

These are two separate things.

The master/slave roles are used during the DBD exchange.
The DR/BDR are used on multi-access networks to reduce the number of neighbor adjacencies.

Rene

waqar675 · March 13, 2018, 10:20am

Dear andrew,

I tested this ExStart behaviour by using the same router ID on two different nodes, and I didn't see either node go to the ExStart state.
But when I changed the MTU on the interface, it got stuck in the ExStart state.

//BR
Waqar

lagapidis · March 16, 2018, 8:07am

Hello Mohammad

Having the same router ID can indeed lead to an OSPF node getting stuck in Exstart state. According to this Cisco Documentation, this can be the case.

In order to duplicate such a scenario however, it may be necessary to have more than two OSPF routers and to have the OSPF processes reset in all routers. It may take some experimentation to get the circumstances right.

I hope this has been helpful!

Laz

waqar675 · March 18, 2018, 8:42am

Hello laz,

I got your point, but my concern is that when there is an MTU mismatch, the (show ip ospf neighbor) command shows the neighbor in the EXSTART state. With the duplicate router ID scenario, we didn't see any output indicating the EXSTART state; instead, the (show ip ospf neighbor) command returned blank output.
Thanks for the valuable feedback.

//BR
Waqar

lagapidis · March 20, 2018, 7:03am

Hello Mohammad

I just did a simulation where I had the following topology:

The configuration for R1 is:

router ospf 1
 router-id 1.1.1.1
 network 10.10.10.0 0.0.0.255 area 0
 network 10.10.101.0 0.0.0.255 area 0

and for R2

router ospf 1
 router-id 1.1.1.1
 network 10.10.10.0 0.0.0.255 area 0
 network 10.10.102.0 0.0.0.255 area 0

So the router IDs are the same.

The result that I get from the show ip ospf neighbor command is the following:

Router1#show ip ospf neighbor

    Neighbor ID     Pri   State           Dead Time   Address         Interface
    1.1.1.1           1   EXSTART/DR      00:00:34    10.10.10.2      GigabitEthernet0/0

and from R2:

Router#show ip ospf neighbor 

    Neighbor ID     Pri   State           Dead Time   Address         Interface
    1.1.1.1           1   EXSTART/DR      00:00:30    10.10.10.1      GigabitEthernet0/0

So I do indeed get the EXSTART state when the router IDs are the same. The reason for this is that during the ExStart state the routers exchange DBD packets. The router and its neighbour form a master/slave relationship in which the router with the higher router ID becomes the master. If the router IDs are the same, this mechanism of choosing master/slave does not complete and the router gets stuck.
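The tie described above can be sketched in a few lines of Python. This is only an illustration of the "higher router ID becomes master" comparison, not how any particular IOS version implements it; the function name `negotiate_role` is made up for this example.

```python
import ipaddress
from typing import Optional


def negotiate_role(local_id: str, neighbor_id: str) -> Optional[str]:
    """Return this router's role in the DBD exchange, or None if undecidable."""
    local = int(ipaddress.IPv4Address(local_id))
    neighbor = int(ipaddress.IPv4Address(neighbor_id))
    if local > neighbor:
        return "master"
    if local < neighbor:
        return "slave"
    return None  # equal router IDs: neither side can claim a role


print(negotiate_role("2.2.2.2", "1.1.1.1"))  # master
print(negotiate_role("1.1.1.1", "2.2.2.2"))  # slave
print(negotiate_role("1.1.1.1", "1.1.1.1"))  # None
```

With identical router IDs the comparison never produces a winner, so both routers keep sending their initial DBD packets and stay in ExStart.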

I hope this has been helpful!

Laz

waqar675 · March 20, 2018, 9:42am

Hello Laz,

Well explained, let me lab the same topology.
Thanks a lot.

//BR
Waqar

waqar675 · March 20, 2018, 11:00am

Hello Laz,

I used the same topology and configuration as you used above, but in my case I am not seeing the desired results.

I enabled terminal logging and am seeing the logs below:

*Mar 20 10:41:32.276: %OSPF-4-DUP_RTRID_NBR: OSPF detected duplicate router-id 1.1.1.1 from 10.10.10.2 on interface GigabitEthernet0/1
*Mar 20 10:40:31.640: %OSPF-4-DUP_RTRID_NBR: OSPF detected duplicate router-id 1.1.1.1 from 10.10.10.1 on interface GigabitEthernet0/2

and for the (show ip ospf neighbor) command I am getting blank output.

R1#sh ip ospf neighbor

It is possible that I am missing something that is preventing me from getting the desired results.

R1#show running-config partition router ospf 1
Building configuration...
! Last configuration change at 10:39:04 UTC Tue Mar 20 2018
router ospf 1
 router-id 1.1.1.1
 network 10.10.10.0 0.0.0.255 area 0
 network 10.10.101.0 0.0.0.255 area 0

R2:

R2#show running-config partition router ospf 1
Building configuration...
! Last configuration change at 10:39:07 UTC Tue Mar 20 2018
router ospf 1
 router-id 1.1.1.1
 network 10.10.10.0 0.0.0.255 area 0
 network 10.10.102.0 0.0.0.255 area 0
!

arindom.nag · March 22, 2018, 3:37pm

Hi Laz,
I am getting the same issue (mentioned above) in my lab topology: no neighbor relationship forms between R1 and R2.
So why are we not getting the EXSTART state in our lab?
Waiting for your valuable response.

Thanks & Regards,
Arindom

lagapidis · March 23, 2018, 3:33pm

Hello Mohammad and Arindom

This is interesting. It seems that two different behaviours occur when the same configuration is implemented between routers. Researching the issue further, I have found two Cisco documents that indicate two different things. First, we are told that if there is no output from the show ip ospf neighbor command, this could be because of identical router IDs between the two routers:

Router IDs are used in order to identify each router in an OSPF network. Routers with the same Router ID will ignore HELLOs sent by each other, which prevents them from forming adjacency. The first line of show ip ospf command output displays the current Router ID of each router.

The above explanation refers to the results you are getting. This is documented here on page 3. Secondly, we are also told that:

MTU mismatch, although the most common, is not the only reason that OSPF neighbors get stuck in the exstart/exchange state. This can also be due to both routers having the same router ID (mis-configuration).

This is the case in my experimentation and is documented here.

This may be due to a platform issue or an IOS version issue. My test was done using two 1941 routers running IOS Version 15.1(4)M4. Can you both @waqar675 and @arindom.nag share your platforms and IOS versions? We’ll get to the bottom of this!

Thanks!

Laz

arindom.nag · March 24, 2018, 10:48am

Hi Laz,
Nice explanation…
Sorry for the late reply. This morning I tried with the same platform and IOS as yours and got the expected output. Please find the details below:

R1#sh version 
Cisco IOS Software, C1900 Software (C1900-UNIVERSALK9-M), Version 15.1(4)M4, RELEASE SOFTWARE (fc2)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2007 by Cisco Systems, Inc.
Compiled Wed 23-Feb-11 14:19 by pt_team

ROM: System Bootstrap, Version 15.1(4)M4, RELEASE SOFTWARE (fc1)
cisco1941 uptime is 5 minutes, 30 seconds
System returned to ROM by power-on
System image file is "flash0:c1900-universalk9-mz.SPA.151-1.M4.bin"
Last reload type: Normal Reload

R1#sh ip ospf neighbor 


Neighbor ID     Pri   State           Dead Time   Address         Interface
1.1.1.1           1   EXSTART/BDR     00:00:34    192.168.12.2    GigabitEthernet0/0



R2#sh version 
Cisco IOS Software, C1900 Software (C1900-UNIVERSALK9-M), Version 15.1(4)M4, RELEASE SOFTWARE (fc2)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2007 by Cisco Systems, Inc.
Compiled Wed 23-Feb-11 14:19 by pt_team

ROM: System Bootstrap, Version 15.1(4)M4, RELEASE SOFTWARE (fc1)
cisco1941 uptime is 5 minutes, 49 seconds
System returned to ROM by power-on
System image file is "flash0:c1900-universalk9-mz.SPA.151-1.M4.bin"
Last reload type: Normal Reload

R2#sh ip ospf neighbor 


Neighbor ID     Pri   State           Dead Time   Address         Interface
1.1.1.1           1   EXSTART/DR      00:00:38    192.168.12.1    GigabitEthernet0/0

Then I tried a different IOS version, 12.4(25d), to recheck, and got no output, as before (which is what Mohammad and I saw). Please find the details below:

R4#sh version
Cisco IOS Software, 3700 Software (C3745-ADVIPSERVICESK9-M), Version 12.4(25d), RELEASE SOFTWARE (fc1)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2010 by Cisco Systems, Inc.
Compiled Wed 18-Aug-10 08:18 by prod_rel_team

R4#sh running-config | section ospf
router ospf 1
 router-id 1.1.1.1
 log-adjacency-changes
 network 192.168.12.0 0.0.0.255 area 0

R4#debug ip ospf adj
*Mar  1 00:05:38.275: %OSPF-4-DUP_RTRID_NBR: OSPF detected duplicate router-id 1.1.1.1 from 192.168.12.1 on interface FastEthernet0/0
*Mar  1 00:05:11.367: OSPF: end of Wait on interface FastEthernet0/0
*Mar  1 00:05:11.367: OSPF: DR/BDR election on FastEthernet0/0
*Mar  1 00:05:11.367: OSPF: Elect BDR 1.1.1.1
*Mar  1 00:05:11.367: OSPF: Elect DR 1.1.1.1
*Mar  1 00:05:11.367: OSPF: Elect BDR 0.0.0.0
*Mar  1 00:05:11.367: OSPF: Elect DR 1.1.1.1
*Mar  1 00:05:11.367:        DR: 1.1.1.1 (Id)   BDR: none
*Mar  1 00:05:11.867: OSPF: No full nbrs to build Net Lsa for interface FastEthernet0/0

R4#sh ip ospf neighbor



----------------------------------------------------------
R3#sh version
Cisco IOS Software, 3700 Software (C3745-ADVIPSERVICESK9-M), Version 12.4(25d), RELEASE SOFTWARE (fc1)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2010 by Cisco Systems, Inc.
Compiled Wed 18-Aug-10 08:18 by prod_rel_team

R3#sh running-config | section ospf
router ospf 1
 router-id 1.1.1.1
 log-adjacency-changes
 network 192.168.12.0 0.0.0.255 area 0

R3#
*Mar  1 00:08:02.831: %OSPF-4-DUP_RTRID_NBR: OSPF detected duplicate router-id 1.1.1.1 from 192.168.12.2 on interface FastEthernet0/0
R3#
*Mar  1 00:09:02.839: %OSPF-4-DUP_RTRID_NBR: OSPF detected duplicate router-id 1.1.1.1 from 192.168.12.2 on interface FastEthernet0/0
R3#
*Mar  1 00:10:12.819: %OSPF-4-DUP_RTRID_NBR: OSPF detected duplicate router-id 1.1.1.1 from 192.168.12.2 on interface FastEthernet0/0
R3#
*Mar  1 00:11:12.823: %OSPF-4-DUP_RTRID_NBR: OSPF detected duplicate router-id 1.1.1.1 from 192.168.12.2 on interface FastEthernet0/0
R3#
*Mar  1 00:12:22.831: %OSPF-4-DUP_RTRID_NBR: OSPF detected duplicate router-id 1.1.1.1 from 192.168.12.2 on interface FastEthernet0/0


R3#sh ip ospf neighbor

So Laz, the final conclusion to the question Mohammad and I had (why we could not reproduce your output in our lab) is that the behaviour depends on the IOS version, as you suggested yesterday.

Mohammad, please go through the output above. If you try with “Version 15.1(4)M4”, you should get the output that you and I expected.

Thanks & Regards,
Arindom

guruprassad889 · October 14, 2018, 5:00pm

In the lesson, it states that LSAck packets will be transferred upon receiving the DBD packets. However, when DBD packets are sent there is no LSAck.

Say, for example, the master sends a DBD packet; the slave receives it and responds with a DBD packet (which might include LSA headers) whose sequence number field is the same as that of the packet sent by the master.
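The poll/response pattern described above can be sketched in Python as follows. This is a simplified model based on my reading of RFC 2328 section 10.8, where the slave echoes the master's DD sequence number; the `DBD` class, its fields, and the helper names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DBD:
    """Simplified DBD packet; field names are illustrative only."""
    seq: int
    from_master: bool                  # models the MS bit
    more: bool                         # models the M bit
    lsa_headers: Tuple[str, ...] = ()


def slave_reply(poll: DBD, headers: Tuple[str, ...], more: bool) -> DBD:
    """The slave answers a poll by echoing the master's sequence number."""
    return DBD(seq=poll.seq, from_master=False, more=more, lsa_headers=headers)


def is_ack_for(poll: DBD, reply: DBD) -> bool:
    """The master treats a slave DBD with the same sequence number as the ack."""
    return (not reply.from_master) and reply.seq == poll.seq


def next_poll(poll: DBD, headers: Tuple[str, ...], more: bool) -> DBD:
    """Once acknowledged, the master increments the sequence number."""
    return DBD(seq=poll.seq + 1, from_master=True, more=more, lsa_headers=headers)


poll = DBD(seq=1000, from_master=True, more=True, lsa_headers=("LSA-A",))
reply = slave_reply(poll, headers=("LSA-B",), more=False)
print(is_ack_for(poll, reply))          # True
print(next_poll(poll, (), False).seq)   # 1001
```

In this model no separate acknowledgement packet is needed for a DBD, because the echoed sequence number does that job.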

Please let me know if my understanding is wrong

lagapidis · October 22, 2018, 9:10pm

Hello Sriguruprassad

This is a good point you bring up. An LSA acknowledgement does not always have to be in the form of an LSAck packet. Acknowledgements can also be implied with the receipt of a link state update. For example, the RFC for OSPF states the following:

Each newly received LSA must be acknowledged. This is usually done by sending Link State Acknowledgment packets. However, acknowledgments can also be accomplished implicitly by sending Link State Update packets.

So to state it more clearly, each DBD packet received must be acknowledged. This can be done using an LSAck packet or implicitly using a response such as an LS Update. Multiple acknowledgements can also be grouped together and sent in a single LSAck packet. More info on this can be found in the RFC link below:

Looking at the cloudshark output you sent, many of the DBD packets sent have been responded to with the LS Update sent afterwards as an acknowledgement.

I hope this has been helpful!

Laz

markos9552 · January 7, 2019, 9:05am

Hello all

I would like to ask two questions.

1. During the neighbor adjacency process, will the DROTHER router send the DBDs and LSUs to the unicast addresses of the DR and BDR or to the multicast address 224.0.0.6? Which addresses will the DR and BDR use to reply and send their DBDs and LSUs to the DROTHER?
2. Are database description packets only exchanged during the neighbor adjacency process, and not sent after the adjacency is fully established? In that case, are updates only sent directly through LSUs, without the use of DBD packets? Thank you for the reply.

Regards
Markos

lagapidis · January 10, 2019, 9:42am

Hello Markos

Concerning question 1, RFC 2328, which describes OSPFv2, gives no clear-cut answer for the destination IP address of DBDs sent by DROTHER routers. It states which kinds of packets use which kind of address (unicast/multicast), except for the DBD. Take a look at pages 58-59 of the RFC. However, researching further, there seems to be a consensus that DBD packets are sent to the unicast addresses of the DR and BDR. Now according to the RFC:

On physical point-to-point networks, the IP destination is always set to the address AllSPFRouters. On all other network types (including virtual links), the majority of OSPF packets are sent as unicasts, i.e., sent directly to the other end of the adjacency. In this case, the IP destination is just the Neighbor IP address associated with the other end of the adjacency … The only packets not sent as unicasts are on broadcast networks; on these networks Hello packets are sent to the multicast destination AllSPFRouters, the Designated Router and its Backup send both Link State Update Packets and Link State Acknowledgment Packets to the multicast address AllSPFRouters, while all other routers send both their Link State Update and Link State Acknowledgment Packets to the multicast address AllDRouters.

So it seems the DR and BDR send LSUs and LSAcks to the AllSPFRouters multicast address.
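To make the quoted rules easier to follow, here is a minimal Python sketch of the destination selection as I read that passage. The function and its parameter names are hypothetical; 224.0.0.5 and 224.0.0.6 are the well-known AllSPFRouters and AllDRouters addresses. Special cases such as retransmitted LSUs are not modelled.

```python
ALL_SPF_ROUTERS = "224.0.0.5"
ALL_D_ROUTERS = "224.0.0.6"


def destination(network_type: str, packet_type: str,
                is_dr_or_bdr: bool, neighbor_ip: str) -> str:
    """Pick the IP destination for an OSPF packet (simplified sketch)."""
    if network_type == "point-to-point":
        return ALL_SPF_ROUTERS            # always multicast on point-to-point
    if network_type == "broadcast":
        if packet_type == "hello":
            return ALL_SPF_ROUTERS
        if packet_type in ("lsu", "lsack"):
            return ALL_SPF_ROUTERS if is_dr_or_bdr else ALL_D_ROUTERS
    # Everything else, including DBD packets, goes directly to the neighbor.
    return neighbor_ip


print(destination("broadcast", "dbd", False, "10.10.10.1"))  # 10.10.10.1
print(destination("broadcast", "lsu", False, "10.10.10.1"))  # 224.0.0.6
print(destination("broadcast", "lsu", True, "10.10.10.3"))   # 224.0.0.5
```

Under this reading, a DROTHER's DBD packets fall under the general unicast rule, which is consistent with the consensus mentioned above.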

For question 2, DBD packets will be sent to a neighbour depending on the neighbour’s state:

In ExStart, the router sends an empty DBD.
In the Exchange state, the DBDs contain summaries of the link state information. Whether or not one is sent depends on whether the router is master or slave.
In the Loading and Full states, the slave must resend its last DBD in response to duplicate DBD packets received from the master, but this is done only once, as soon as these states have been reached.

Now when receiving DBD packets under various states, routers will either accept them or drop them, depending on the state of the neighbor adjacency. The conditions under which DBDs are sent or received are described in detail in the following sections of the RFC:

10.6. Receiving Database Description Packets
10.8. Sending Database Description Packets

The RFC link once again:

https://www.ietf.org/rfc/rfc2328.txt

I hope this has been helpful!

Laz