QoS LLQ (Low Latency Queueing) on Cisco IOS

Hi Laz,

Thanks for the clarity and for clearing up policing and what happens when the priority queue is full. The only other part of my query was: what if the priority queue is always receiving traffic? What happens to the other queues? If the priority queue is always being used, do the other queues suffer from queue starvation because they never get a chance to send packets?

Thanks as always

Hello Michael

Let’s use this diagram from the lesson to help us out:

image

Notice that the interface is a GigabitEthernet interface, and that the priority queue is set to 50% of the total bandwidth. Now remember, if there’s no congestion, the above queues are non-existent, and all traffic is served immediately.

Now let’s assume we have congestion over an extended period of time, and a lot of priority traffic arriving at the interface. That priority traffic will go into the priority queue to be served immediately. However, the bandwidth of the priority queue has been limited to 50%, which means the priority queue can use at most 500 Mbps. This is where the “implicit policer” I mentioned before comes in: any packets arriving above that 500 Mbps rate for the priority queue will be dropped.

This means that there is another 500 Mbps of bandwidth that is guaranteed for use by the CBWFQ for the non-priority traffic.
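As a rough sketch, a policy along these lines could look like this on IOS. The class name, interface, and the 50% figure are illustrative, not taken from the lesson:

```
class-map match-all VOICE
 match dscp ef
!
policy-map LLQ-SKETCH
 class VOICE
  priority percent 50   ! LLQ: served first, implicitly policed to 50% under congestion
 class class-default
  fair-queue            ! remaining traffic handled by the CBWFQ scheduler
!
interface GigabitEthernet0/1
 service-policy output LLQ-SKETCH
```

During congestion, priority traffic above roughly 500 Mbps on this gigabit interface would be dropped by the implicit policer, leaving the other 500 Mbps guaranteed for the non-priority classes.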

Now this is a little bit difficult for us to get our heads around because of the phrase “everything that ends up in queue 1 will be served before any of the other queues.” If there’s always something in queue 1, queues 2 to 4 will never be served!

We often think about dividing queues into sections of bandwidth, where queue 1 gets 500 Mbps while the other queues combined get another 500 Mbps. But packets must exit the interface serially, meaning one at a time, one after the other. So to achieve the above-mentioned bandwidths, the scheduler shares the use of the interface over time, something similar to time division multiplexing. For each second of time, 50% of the time will be devoted to serving queue 1, and 50% will be devoted to serving the other three queues, based on their percentages. These limitations are enforced to ensure that bandwidth starvation will not occur.

I hope this has been helpful!

Laz

Brilliant as always, thanks Laz


Hey Laz,
I can’t quite wrap my head around this, as the ideas seem to get mixed up and contradict each other.

From my understanding, the LLQ should be served immediately when traffic queues up in it while congestion occurs.

If the queues work by percentage, then the PQ might have 100Mbps capacity (10%), the 2nd queue 500Mbps capacity (50%), and the 3rd queue 400Mbps capacity (40%).

Let’s assume that congestion has occurred and traffic starts to buffer up. The interface is capable of transmitting 1Gbps, and there is an extra 50Mbps of traffic in the queues. Just before packets start being dropped because the buffers are full, the interface can start to transmit the queued data - so what will, or can, happen?

One option would be that the first 20Mb belonging to the PQ will be transmitted. After that, another 30Mbps in the 2nd queue should be served, but as soon as 20Mbps of that traffic has been served, the PQ gets another 100Mbps in it to be served, so the router should stop forwarding the data in the 2nd queue and immediately start serving the PQ again for 100Mbps.

In this scenario the PQ will serve a total of 120Mbps instead of the 100Mbps limit that was configured, as the implicit policer’s bucket should have replenished its tokens while the 2nd queue was active serving those pitiful 20Mbps.

Thus, I can’t understand your statement that each queue takes exactly the percentage of bandwidth it was configured for. In actual practice the PQ was configured for 10% but served 12% in total, and the 2nd queue’s serving had to be stopped in the middle of the operation - at least in my theory, which reflects my understanding of how LLQ operates and how policers and shapers should work.

Another thing that doesn’t sound quite right is that the queues would use a TDM mechanism to utilize the bandwidth - that way the link wouldn’t be fully used for just 80Mbps of traffic, since the PQ was scheduled to be served for only 10% of the time once the congestion occurred, and the PQ had only 20Mb in it.

Hello Nitay

Neither can I! That’s why I’ll do my best to clarify the questions you have. Let’s take a look at the diagram once again:
image

First of all, we should keep in mind that queuing will only take place when there is congestion. I know we mentioned this before, but let me clarify. The percentages per queue are the maximum that is made available, whenever all queues are full. Imagine that all queues have a continuous flow of data well above the allocated bandwidth for each. These percentages are maxed, and they remain so for the duration of that congestion.

Now imagine that all the queues are at max except for Queue 3, which is only using half of its allotted bandwidth. That’s still congestion, but there is some “free bandwidth”, specifically 100Mbps. How does that get filled up? Traffic in queues 2 and 4 above their allotted bandwidths will be forwarded using “best effort”. This will not happen for the LLQ, however, because a priority queue is policed to its configured rate and cannot extend into any unused capacity.

So traffic belonging to Queue 2 or 4 can actually exceed the 20% or 10% of bandwidth if there is free capacity, but that excess will be treated as best effort. In other words, any “free capacity” will always be filled in with excess traffic from other queues.

I understand your confusion here. What I said was that it is similar to TDM. For example, if we consider a situation where all queues are maxed, over a particular time period, say 1 second, 50% of that time will be used to forward the traffic in Queue1, 20% of the time to Queue2, 20% to Queue 3 and 10% of the time to Queue 4. It is unavoidably so because packets exit an interface serially. The main point of my statement was to show that a lower priority queue will not stop completely until higher priority queues are empty, but will simply be given a smaller allotment of bandwidth (divided over time).

I hope this has been helpful!

Laz


Hi Laz,

  1. I couldn’t understand how you are matching the voice and signaling traffic in the class-map. How does the EF code mean voice traffic and CS3 mean signaling traffic, and where did you get the EF code from?
  2. Could you elaborate on how a queue is prioritized on the basis of kbps or bandwidth, and how packets exit based on this priority criteria?
  3. The class selector named EF - where did it come from, since in the IP precedence topic you mentioned only up to CS7?
  4. Lastly, why are we using TOS while we are pinging?

Hello Pradyumna

The TOS field, as well as the later redefined DS field, contains information about the classification of traffic. How this information is interpreted and acted upon depends primarily on vendor design as well as the QoS model being used. The IP Precedence and DSCP Values lesson describes multiple methods and models that are used, and their definitions in specific RFCs. Specifically, Rene states that:

We have a lot of different values that we can use for the TOS byte…IP precedence, CS, AF and EF. So what do we really use on our networks?

The short answer is that it really depends on the networking vendor. IP Precedence value 5 or DSCP EF is normally used for voice traffic while IP precedence value 3 or DSCP CS3 or AF31 is used for call signaling.

See if your networking vendor has a Quality of Service design guide, they usually do and give you some examples what values you should use.

What values are actually used and how a device acts upon them is not only vendor-specific, but it is also fully configurable by you, at least on Cisco devices. So why is EF used for voice traffic and CS3 for signalling? Because this is a common convention used by IP phone and VoIP equipment vendors. The class maps defined in the lesson simply match any packets with those markings, and the service policy that references them acts upon them accordingly.

In this particular case, when there is no congestion on the link, both voice and signalling traffic are free to use any available bandwidth. When there is congestion, packets that match these criteria are forwarded immediately for as long as the total bandwidth that each type consumes is less than the configured priority. Once this priority bandwidth is reached, the forwarding of these packets will fall back to normal CBWFQ scheduler.
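As a hedged illustration of that behaviour, a typical configuration might look like the following. The 30% and 5% figures and the policy-map name are examples, not values from the lesson:

```
class-map match-all VOICE
 match dscp ef
class-map match-all CALL_SIGNALING
 match dscp cs3
!
policy-map WAN-EDGE
 class VOICE
  priority percent 30    ! policed to 30% only while the link is congested
 class CALL_SIGNALING
  priority percent 5     ! same low-latency treatment, separate policer
 class class-default
  fair-queue
```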

tos is the name of the keyword used to change the value found within the ToS field. But remember, the ToS field occupies the same byte as the DS field, and is interpreted accordingly by intermediate devices. So even though the tos keyword is being used, you are actually defining DSCP values with it: the DSCP value sits in the six most significant bits of that byte. The valid values for this keyword are 0 to 255, which means you can define any value of this 8-bit field.
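For example (addresses hypothetical), since the DSCP value occupies the six most significant bits of the ToS byte, the tos value to use is the DSCP value multiplied by 4:

```
R1#ping 192.0.2.1 tos 184 repeat 100
! 184 = 46 x 4, i.e. DSCP 46 (EF), commonly used for voice
R1#ping 192.0.2.1 tos 96 repeat 100
! 96 = 24 x 4, i.e. DSCP 24 (CS3), commonly used for call signaling
```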

I hope this has been helpful!

Laz

Hi Laz,
Thanks a lot, but I still have a doubt: do we need to remember these TOS/DSCP values according to traffic classification?
2) I think you did not get my question related to traffic prioritization. What I want to know is: suppose we prioritized a first queue on the basis of either bandwidth or kbps. That queue can have many packets in it at the same time. How will they be processed? How can a situation arise where traffic in another queue will not be processed, and how will the packets in the prioritized queue share the queue’s bandwidth or kbps, whichever we are using?
3) For the TOS-related question, I want to know whether there is any specific reason we are using TOS in ping. What do we want to find out by using TOS with ping?

Hello Pradyumna

I don’t believe you will have to remember the specific conventions for particular QoS codes such as EF for voice and AF31 for call signalling. Such information is not usually asked for on exams. But you should know that EF and AF31 are definitions for Expedited Forwarding and Assured Forwarding.

I’m still not clear on the specific question but I’ll do my best. Remember that on an interface, only one packet/frame can be sent at a time. Also, remember that queuing will take place only if there is congestion. If there is no congestion, that means that there is free bandwidth that can be used immediately, so packets that arrive on the interface are served immediately. If there is congestion, then over any specific period of time, a certain number or percentage of packets will be served from each queue.

Let’s take 100ms as a duration of time and let’s look at the following setup on a GigabitEthernet port:
image
Remember from this post, that a port can either transmit at its full speed or at 0 Mbps. What we see as throughput is really the average speed over time. If these queues are all full, within those 100ms, a total of 50 ms will be used to send the packets found in output queue1, 20ms for queue2, 20 for queue3, and 10 for queue 4 (based on the percentage). If it is based on bandwidth, then you calculate what percentage of the rated speed of the interface that bandwidth takes up. Now, these time periods are not continuous but are interspersed throughout each time interval. Does that make sense? If I didn’t answer your question sufficiently, feel free to let me know…

By specifying the values found within the DSCP/ToS field of the ping, you can see what kind of behaviour your QoS mechanisms will have on your network. You can send a series of pings and see the response times of each, and see if any are dropped. But this makes sense only if you ping during times of high congestion on your network, otherwise, any values you put in will not have any effect.

I hope this has been helpful!

Laz

Hello Sir,

Thank you for your explanation.
At ISP we match with LAN:

class-map match-any lanRT-MEDIA
 match access-group name defRT-MEDIA
class-map match-any lanC0
class-map match-any lanRT-SIG
 match access-group name defRT-SIG
class-map match-any WANinC0
 match ip dscp cs7
class-map match-any WANoutRT
 match ip dscp cs6
 match ip dscp cs5
class-map match-any WANoutC0
 match access-group name defC0
 match ip dscp cs7
class-map match-any WANinRT
 match ip dscp cs6
 match ip dscp cs5
!
policy-map WANoutQoS
 class WANoutRT
  priority percent 90
 class WANoutC0
  bandwidth remaining percent 10
 class class-default
  bandwidth remaining percent 90
policy-map WANinQoS
 class WANinC0
 class WANinRT
 class class-default
policy-map LANinQoS
 class lanRT-MEDIA
  set ip dscp cs6
 class lanRT-SIG
  set ip dscp cs5
 class lanC0
  set ip dscp cs7
 class class-default
  set ip dscp default

I want to know what’s the difference in results between your configuration and mine, please.
Such as we
Best regards,
Brahim

Hello Brahim

When implementing class maps, there are many things that you can match to and many different ways of implementation. In your particular class maps that you shared, you have matches using

  • access lists
  • DSCP values

And you have policy maps calling those class maps that will match them and assign either bandwidth percentages to them, or set various DSCP values.

In order to understand policy-maps and class-maps better, take a look at the following post, which includes additional links:

I hope this has been helpful!

Laz

The first set is matching specific access lists. In particular, the match-any keyword is being used. When multiple match criteria exist, the match statements are evaluated using a logical OR. A packet must match any one of the match statements within the stated ACLs to be accepted into the class.

Hi Rene/Laz,

My question is related to class-default.

Let’s take an example:

Voice traffic is allotted a priority of 50% of the bandwidth.
Business application traffic is allotted a priority of 10% of the bandwidth.

As I understand it, there will be a class-default for all traffic that was not matched.
But without explicitly giving bandwidth to the default class, how much bandwidth will it take?

Hello Levisle

The default class will simply take up whatever is left. Since 50% is allocated for voice, and 10% for business applications, what remains is 40%. So even though this is not explicitly stated, it is implicitly determined.
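To make this concrete, here is a minimal sketch (class and policy names hypothetical) where class-default is left without an explicit bandwidth statement:

```
policy-map EXAMPLE
 class VOICE
  priority percent 50    ! 50% priority for voice
 class BUSINESS
  bandwidth percent 10   ! 10% guaranteed for business applications
 class class-default     ! present implicitly even if omitted; gets what remains
  fair-queue
```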

However, this does not mean that the traffic will actually be distributed in a 50% 10% 40% manner. Remember that any QoS mechanisms begin to kick in only when there is congestion. Before there is congestion, all traffic goes through, and the percentage of the traffic of each type may vary.

Now you must also keep in mind that by default, there is a max-reserved-bandwidth parameter that is set to 75%. This reserves 75% of the interface bandwidth for user-defined classes, and leaves 25% for things like routing protocol updates and Layer 2 keepalives. Therefore, the minimum bandwidth that the class-default class will use is 25%. This can be adjusted using the max-reserved-bandwidth command.
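On older, pre-HQF IOS releases this is adjusted per interface; a sketch with a hypothetical interface:

```
interface GigabitEthernet0/1
 max-reserved-bandwidth 90   ! allow user-defined classes to reserve up to 90%
```

Note that on newer HQF-based IOS releases this command is no longer used, and the scheduler handles bandwidth allocation differently.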

I hope this has been helpful!

Laz

So in this example you only classified the data, right? Who did the marking? I just want to confirm that there is a step omitted for a reason.
Thanks

R2(config)#class-map VOICE
R2(config-cmap)#match dscp ef

R2(config)#class-map CALL_SIGNALING
R2(config-cmap)#match dscp cs3

Hello Abdulrahman

In the lesson, the configuration on R2 simply configures the classification. In other words, this router will receive packets, examine their DSCP values, and either match those values or not match them. Depending upon whether they are matched or not, they are provided with a certain bandwidth and priority.

The actual marking in the lesson is done by R1. When the pings are sent through, the TOS parameter is set to different values, and so different class maps are being matched. This appears in the output of the show policy-map interface GigabitEthernet 0/2 command shown in the lesson.

In a real-life situation, it would be the IP phones and the voice gateways that would set the DSCP values of the voice packets and the call signalling. Such marking would be configured within these devices, and would then be acted upon by the classification configuration as shown in R2.
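As an illustration of where that trust boundary is typically configured, on a Catalyst access switch the port facing an IP phone can be set to trust the phone’s markings. The syntax varies by platform and the interface and VLAN numbers here are hypothetical:

```
interface GigabitEthernet0/5
 switchport mode access
 switchport voice vlan 100          ! hypothetical voice VLAN
 mls qos trust device cisco-phone   ! trust markings only when a Cisco phone is detected
 mls qos trust dscp                 ! preserve the DSCP values set by the phone
```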

I hope this has been helpful!

Laz

Hi
what router model did you use for practice?

Hello Alberto

This lab was performed in CML using the following image:

Cisco IOS Software, IOSv Software (VIOS-ADVENTERPRISEK9-M), Version 15.9(3)M2, RELEASE SOFTWARE (fc1)

I hope this has been helpful!

Laz

Brilliant practical demo and explanation. I have one question regarding how multiple priority queues function.

“Ref: Brad, Edgeworth; Rios Ramiro Garza; Hucaby David; Gooley Jason. CCNP and CCIE Enterprise Core ENCOR 350-401 Official Cert Guide”

Is the data in the priority LLQs for voice and video (pictured above) serviced on a FIFO basis, with the benefit of such a configuration being more granular policing at times of congestion, so that it’s not done cumulatively across both voice and video? Or does IOS re-order the packets to provide preferential treatment to the favoured packet type, i.e. voice over video? It’s a little confusing, as different sources seem to contradict each other. I think Rene alludes to a level of preferential treatment in the below quote, but I might be misinterpreting what he is saying:

"This can be useful if, for example, you have real-time voice and video traffic and you want to prioritize both. With a single priority queue, you can’t decide if voice or video traffic should be prioritized first. Within the priority queue, life is best effort."

I’ve also read the following in the below article, and the OCG seems to be suggesting the same thing, but it isn’t clear to me.

Queuing does not differ when comparing using a single low-latency queue with multiple low-latency queues in a single policy map. IOS always takes packets from the low-latency queues first, as compared with the non-low-latency queues (those with a bandwidth command), just like before. However, IOS does not reorder packets between the various low-latency queues inside the policy map. In other words, IOS treats all traffic in all the low-latency queues with FIFO logic.

Again thank you for all the support.

Hello Nizar

The mechanism that prioritizes traffic when applying LLQ is the fact that you have a priority queue. That priority queue is served before the CBWFQ queues. However, the packets within that LLQ priority queue are not further reordered or prioritized; they are served on a FIFO basis.
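In IOS terms, multiple low-latency classes can coexist in one policy map. The names and percentages below are hypothetical; both classes feed the same low-latency path, and packets are not reordered between them:

```
policy-map MULTI-LLQ
 class VOICE
  priority percent 20    ! policed separately from VIDEO
 class VIDEO
  priority percent 30    ! also low latency; FIFO relative to VOICE traffic
 class class-default
  fair-queue
```

The benefit of splitting the priority traffic into two classes is the separate policing per class, not any ordering preference between them.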

Can you share with us the documentation that seems to be indicating something different? That way we can help to clarify the issue further.

I hope this has been helpful!

Laz

Hi,
At work we have some Catalyst 2960XR switches. I’m trying to follow the lesson with them, but the priority command seems to be missing from the configuration of the class-maps in the policy-maps, as the attached screen capture shows. Does this mean that these switches don’t support LLQ, or is it configured in some other way on them? The same happens with the bandwidth command.

Regards.
priorityqueue