IP SLA (Service-Level Agreement) on Cisco IOS

hishamabdelghanygami · March 7, 2022, 8:50pm

I have a question. If I configure IP SLA with ICMP echo and a frequency of 10 seconds,
this means that it sends an echo request every 10 seconds. If the destination stops replying, after how many failed echo requests is the IP SLA considered down? Is there a certain count: 1 failure, 2 failures, 10?

Thanks

lagapidis · March 9, 2022, 10:05am

Hello Hisham

What you are looking for is the threshold parameter of the IP SLA configuration. By default, the threshold is set to 5000 milliseconds, or 5 seconds, regardless of the frequency of the IP SLA. But you can change this accordingly. To find out more about these parameters and how they should be configured, take a look at this NetworkLessons note on IP SLA Parameters.

I hope this has been helpful!

Laz

seb.j.thomas · April 28, 2022, 9:28am

Hey all,

I have a question about best practice for the SLA timeout, specifically relating to missed probes - I find mixed answers to this in the documentation.

Take for example the following situation -

I set up an SLA monitor that sends ICMP echoes to a neighbour every 5 seconds, but I only want to declare the router dead after it misses 3 echoes in a row -

If I configured this with a Track object and set the delay down timer, would I count the timer from the first missed ICMP echo? e.g. set the delay timer to then be 11 seconds rather than 16?

Here’s my logic - the first ICMP echo is missed, so the IP SLA reports a timeout. I then start a timer for 11 seconds, because that gives it time for another 2 probes to be sent, totaling 3. If neither of those gets a response, I declare the router to be dead.

Does that make sense? Should the delay timer be a bit longer than the echo interval to give it a chance? This is where I find the advice a bit lackluster.

Appreciate any help and guidance on this one!

lagapidis · April 30, 2022, 5:39am

Hello Sebastian

The delay down timer will begin counting from the first failed ICMP echo. The IP SLA will consider an SLA failed if either the configured timeout or the configured threshold is exceeded. See the NetworkLessons Note on IP SLA parameters for more information about that.

So to be precise, the down timer will begin counting after the timeout or threshold values have been exceeded. If they are set to 1 second for example, then the delay down timer will start counting 1 second after the initial ICMP echo request was sent.

Based on this, it makes sense for you to set the delay timer to be 11 seconds since you want three echoes to fail before the SLA is considered failed. Let’s take a look at the process second by second, assuming a threshold of 1 second.

0 - echo request sent
1 - echo reply was not received, SLA fails, delay down timer is started
2
3
4
5 - echo request sent, delay down timer is at 4
6 - echo reply was not received, SLA fails, delay down timer is at 5
7
8
9
10 - echo request sent, delay down timer is at 9
11 - echo reply was not received, SLA fails, delay down timer is at 10
12 - delay down timer is at 11, the down delay has expired, SLA now requires a reaction event (i.e. any tracking associated with SLA is triggered).

You could theoretically set it to 10 seconds, but that would make the timer expire at exactly the same time as the third echo fails, so leaving an extra second is helpful.

Note that if your threshold/timeout values are larger, then you will have to appropriately change your delay down timer as well to accommodate the slightly larger times needed to define an SLA as down.
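The second-by-second walkthrough above can be reproduced with a small Python sketch. It is only a simplified model of the timing, not IOS itself: the function names are made up for illustration, and it assumes the target never replies and that the timeout is shorter than the frequency.

```python
def delay_down_seconds(failed_probes: int, frequency_s: int, margin_s: int = 1) -> int:
    """Delay-down value that lets `failed_probes` consecutive probes fail first.

    The timer starts at the first failure, so only the remaining
    (failed_probes - 1) probe intervals need to fit, plus a small margin.
    """
    if failed_probes < 1:
        raise ValueError("failed_probes must be at least 1")
    return (failed_probes - 1) * frequency_s + margin_s


def timeline(failed_probes: int, frequency_s: int, timeout_s: int, margin_s: int = 1):
    """Return sorted (second, event) pairs for a target that never replies."""
    events = []
    for i in range(failed_probes):
        sent = i * frequency_s
        events.append((sent, "echo request sent"))
        events.append((sent + timeout_s, "no reply, SLA fails"))
    delay = delay_down_seconds(failed_probes, frequency_s, margin_s)
    # The timer starts at the first failure (t = timeout_s) and runs for `delay` seconds.
    events.append((timeout_s + delay, "delay down expires, track goes down"))
    return sorted(events)


# Three missed probes, sent every 5 seconds, each timing out after 1 second.
for second, event in timeline(failed_probes=3, frequency_s=5, timeout_s=1):
    print(f"{second:>2} - {event}")
```

With these inputs the computed delay is 11 seconds and the last event lands at second 12, matching the list above. Setting margin_s=0 gives the 10-second variant, where the expiry coincides with the third failure.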

I hope this has been helpful!

Laz

u.forte · February 11, 2023, 5:36pm

Hi, I have a question about this configuration.

I would like an explanation of it.
What do threshold, timeout, and frequency mean?
And what does the delay down command under track 1 do?
Thanks

ip sla 1
 icmp-echo 10.1.12.100 source-interface vlan xyz
 threshold 600
 timeout 600
 frequency 10

track 1 ip sla 1
 delay down 5

lagapidis · February 14, 2023, 7:47am

Hello Ugo

Take a look at this lesson that describes the IP SLA configuration in detail:

Briefly:

threshold - the maximum time under which a response is considered successful. Anything above this would be considered a failure.
timeout - the amount of time the device will wait for a response. This should not be less than the threshold.
frequency - how often the SLA test will be applied. In this case, it’s how often the ping will be sent.

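Now the track 1 ip sla 1 command creates tracking object 1 and ties it to IP SLA operation 1, so that the state of the object follows the result of the SLA. The delay down 5 command is used as a kind of dampening mechanism for links that may be flapping: the object waits 5 seconds after the SLA fails before it is reported as down. Take a look at this post for more info: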
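As a rough illustration of how delay down interacts with frequency, the Python sketch below counts how many probes have failed by the time the delay down timer expires. It is a simplified model with a made-up function name, and it assumes that the target never answers, that the timeout is shorter than the frequency, and that the timer starts when the first probe times out.

```python
def failures_before_track_down(frequency_ms: int, delay_down_ms: int) -> int:
    """Count failed probes seen by the time the delay down timer expires.

    The first failure starts the timer. Later probes fail at multiples of
    the frequency after that moment, and a failure that lands exactly on
    the expiry is counted as well.
    """
    return delay_down_ms // frequency_ms + 1


# Values from the configuration above: frequency 10 s, delay down 5 s.
print(failures_before_track_down(frequency_ms=10_000, delay_down_ms=5_000))  # 1

# Values from the earlier example: frequency 5 s, delay down 11 s.
print(failures_before_track_down(frequency_ms=5_000, delay_down_ms=11_000))  # 3
```

Under these assumptions, a delay down that is shorter than the frequency means a single failed probe is enough to bring the tracked object down once the delay has passed.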

I hope this has been helpful!

Laz

davidilles · February 2, 2024, 12:45pm

Hello, everyone!

I have a few questions to confirm. I am not quite sure whether the timeout vs threshold explanation is 100% correct.

I don’t think that anything that goes above the threshold is considered a failure.

I’ve configured the timeout to be 100 seconds while the threshold was set to 100 ms which is a significantly lower value.

To summarize and provide more clarity, here’s my configuration.

The operation timeout is set to 100 seconds
The operation frequency is set to 100 seconds
The operation threshold is set to 100 milliseconds

R1 is connected to R2 (192.168.12.2). This is an icmp-echo operation which is used for basic connectivity verification. Once I schedule it, the first packet is successful.

Since the frequency is set to 100 seconds, the next packet should be sent once the operation ttl hits 3500 seconds. However, I will shut down the link between R1 and R2 to break the connectivity.

After 100 seconds, another packet is sent.

However, notice that the number of failures does not increment, despite the threshold being already exceeded (20 seconds have passed since this packet was sent).

The operation is considered as failed and the counter increments only after the timeout (100 seconds) expires.

As I understand the threshold, it’s a value that, once exceeded, lets the administrator have the device take a specific action, such as generating an SNMP trap. It’s basically a “this is taking a bit too long, let’s raise some alarms” value.

Laz or Rene, can you confirm this please and verify my understanding? Thank you!

lagapidis · February 10, 2024, 7:08am

Hello David

Yes, you are correct. The threshold specifies after how much time a “reaction event” will take place. The timeout is the amount of time an IP SLA will wait for a response from its echo request packet. If this is exceeded, then a failure is recorded. Take a look at this NetworkLessons note on the topic.
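The difference between the two values can be sketched in Python. This is a simplified model rather than actual IOS logic; the function name and the result labels are chosen for illustration only.

```python
from typing import Optional


def classify_probe(rtt_ms: Optional[int], threshold_ms: int, timeout_ms: int) -> str:
    """Classify one probe result.

    rtt_ms is the measured round-trip time, or None if no reply arrived.
    """
    if rtt_ms is None or rtt_ms > timeout_ms:
        # No reply within the timeout: this is what counts as a failure.
        return "Timeout"
    if rtt_ms > threshold_ms:
        # A reply arrived, but slowly: a candidate for a reaction event.
        return "OverThreshold"
    return "OK"


# David's test values: threshold 100 ms, timeout 100 seconds.
print(classify_probe(20, threshold_ms=100, timeout_ms=100_000))    # OK
print(classify_probe(250, threshold_ms=100, timeout_ms=100_000))   # OverThreshold
print(classify_probe(None, threshold_ms=100, timeout_ms=100_000))  # Timeout
```

In this model a missing reply is only classified once the full timeout has passed, which is consistent with the failure counter incrementing after 100 seconds in the test above.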

I know you set the parameters to extreme values to test the operation, however, there are some guidelines that should be used when setting the threshold, timeout, and frequency values. These are outlined in the links to the related IP SLA command reference found in the note I linked to above.

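The only comment I’m questioning from your post is this:

“Since the frequency is set to 100 seconds, the next packet should be sent once the operation ttl hits 3500 seconds.”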

The next packet should be sent once the 100 seconds of the frequency elapse, not once the operation time to live elapses. The Operation time to live is the duration that the IP SLA will function. If you don’t configure it, it will have a default value of 3600 seconds, or one hour, and this is why you have a value of 3595 in your output.
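The relationship between frequency and operation life can be worked through with a short Python sketch. It assumes the first probe is sent the moment the operation starts and that the life counts down from that same moment; the function names are illustrative.

```python
def probe_send_times(frequency_s: int, life_s: int = 3600) -> list:
    """Seconds after the start at which probes are sent during the life."""
    return list(range(0, life_s, frequency_s))


def remaining_life(elapsed_s: int, life_s: int = 3600) -> int:
    """Seconds of operation life left after elapsed_s seconds."""
    return max(life_s - elapsed_s, 0)


times = probe_send_times(frequency_s=100)
print(len(times))            # 36 probes within the default one-hour life
print(times[1])              # 100: the second probe is driven by the frequency
print(remaining_life(5))     # 3595, as seen 5 seconds after the start
print(remaining_life(100))   # 3500
```

Under these assumptions the second probe goes out while the remaining life happens to read 3500, but that is a side effect of 3600 minus 100; the frequency is what schedules the probe.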

I hope this has been helpful!

Laz

camerone7276 · June 27, 2024, 4:00pm

Hi,

I wanted a little clarification on the ip sla responder command. I understand what it is doing, but is this all you need for responses, with no additional syntax? You don’t need to specify which SLA number you selected or anything? And by default, the device that should be responding when you do icmp-echo [destination ip] won’t respond back, correct?

lagapidis · July 1, 2024, 5:24am

Hello Cameron

The quick answer is yes, that’s all you need. This then raises the question: how does it work without additional information?

Well, when you issue the ip sla responder command, it causes the target device to simply reply to all incoming IP SLA packets by default, providing additional information to the sending device. All of the “intelligence” of the configuration exists within the IP SLA sending device.

Prior to any SLAs being sent, the sender will send a control message to the responder with any information it needs, thus ensuring that the responder has all the information it needs to respond correctly. Does that make sense?

For more info, take a look at this Cisco command reference:

I hope this has been helpful!

Laz

camerone7276 · July 25, 2024, 1:07am

Sorry if this is already posted somewhere, but I have a question about the pending option for IP SLA. When I use ip sla schedule 1 start-time pending life forever, how do I activate the pending operation, or what do I need to do to get the SLA running in this case?

What is this trigger?

lagapidis · July 29, 2024, 5:44am

Hello Cameron

When you create an IP SLA, you can place it either in the pending state or the active state.

An IP SLA operation is in the pending state when it is configured but not yet started or scheduled to run. The operation is defined with all necessary parameters, but the actual monitoring or testing has not commenced.

The SLA is waiting for a trigger to transition to the active state.
Pending state allows for configuration review and adjustments before execution begins.
For most scheduling IP SLA commands, the pending state is the default.

An IP SLA operation is in the active state when it is currently running and performing the specified monitoring or testing tasks. The operation is actively sending out probes, collecting data, and performing measurements. The operation remains active for the duration specified in the configuration or until it is manually stopped.

A “trigger” is an event, a predefined condition, or a manual configuration that causes an IP SLA to transition from pending to active.

There are various commands that can define a trigger such as ip sla monitor reaction-trigger for example. The following Cisco documentation specifies the various commands in which triggers can be defined.

Take a look at the documentation and if you have any further questions, let us know!

I hope this has been helpful!

Laz

camerone7276 · July 31, 2024, 9:18pm

Hi Laz,

Thank you for this detailed explanation. It is very helpful. I did not know this about IP SLA.