QoS LLQ (Low Latency Queueing) on Cisco IOS

Hello Nguyen

The “match-all” here indicates that all statements in the specific class map must be matched in order to consider it a match. In contrast, you can have “match-any”, which matches if any one of the statements in the class map is matched. In our case, we only have one statement in each class map, so the match-all or match-any designation will make no difference.

The traffic that is matched is always based on the condition in the class map. Specifically, for VOICE, any packet with a DSCP value of ef will be matched. Similarly for CALL_SIGNALING, any DSCP value of cs3 will be matched. We don’t need an access list here since in the class maps we are referencing DSCP values directly. If we were to use source and destination IP addresses as criteria, then we could create access lists that can be referenced by the class maps.
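For illustration, matching on source addresses instead of DSCP could look like this (the ACL name, subnet, and class-map name here are hypothetical, just to show the structure):

```
R1(config)#ip access-list extended VOICE_HOSTS
R1(config-ext-nacl)#permit ip 192.168.100.0 0.0.0.255 any
R1(config-ext-nacl)#exit
R1(config)#class-map VOICE_BY_ACL
R1(config-cmap)#match access-group name VOICE_HOSTS
```

The class map then matches any packet permitted by the referenced access list, rather than looking at DSCP markings.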

I hope this has been helpful!

Laz

1 Like

Hi Laz,

Thanks for your explanation, but I always thought that when a packet passes through a router, it is classified by source IP, then marked by setting a DSCP value, and the QoS policy then treats it based on that DSCP value.

You wrote that the traffic that is matched is always based on the condition in the class map, with VOICE matching any packet with a DSCP value of ef and CALL_SIGNALING matching cs3. But I don’t see how the router knows that a packet is a VOICE packet or a CALL_SIGNALING packet in order to mark it in the first place.

Please correct me if I’m wrong.

Hello Nguyen

In this scenario, voice traffic is DEFINED as traffic having a DSCP of ef. This is how the router knows the packet is indeed voice. Similarly, call signaling traffic is DEFINED as having a DSCP of cs3. The class maps match based on these criteria. The action that is taken in each case is defined in the policy map. Specifically, voice traffic (DSCP of ef) will have a priority queue of 2000 Kbps, and call signaling traffic (DSCP of cs3) will have a bandwidth guarantee of 1000 Kbps.
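For reference, here is a minimal sketch of how these actions could be tied together (the policy map name LLQ_POLICY and the output interface are my assumptions; the class names and rates are from the lesson):

```
R2(config)#policy-map LLQ_POLICY
R2(config-pmap)#class VOICE
R2(config-pmap-c)#priority 2000
R2(config-pmap-c)#class CALL_SIGNALING
R2(config-pmap-c)#bandwidth 1000
R2(config-pmap-c)#exit
R2(config-pmap)#exit
R2(config)#interface GigabitEthernet0/1
R2(config-if)#service-policy output LLQ_POLICY
```

The priority command creates the low latency queue for VOICE, while the bandwidth command gives CALL_SIGNALING a guarantee via CBWFQ.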

I hope this has been helpful!

Laz

1 Like

Hi Laz,
Agreed, and there is no issue if voice traffic or call signaling traffic is defined by a DSCP value and QoS processes it accordingly.
Something still confuses me. For example, behind R1 I have the subnet 192.168.100.0/24, with a phone at IP 192.168.100.100 and a PC at IP 192.168.100.200.

With your config :

  • How does the router know that traffic from 192.168.100.100 is voice and call signaling, and treat it with higher priority than traffic from source 192.168.100.200, which is normal?

  • And if from the PC I use an app such as 3CX to make a phone call while browsing the web at the same time, how does the router recognize and classify the traffic from 192.168.100.200?

Hi Laz,
Following this matrix table,


your config follows the Cisco recommendation. So when configured as below, does the router implicitly know which traffic is VOICE and match it with the ef DSCP value?

R2(config)#class-map VOICE
R2(config-cmap)#match dscp ef

Hello Nguyen

It is true that there is no guarantee that traffic that is marked as EF is voice and traffic that is marked as CS3 is call signalling. However, if you are managing your own network, then you can configure phones and other VoIP equipment to mark voice and call signaling traffic in this way, so you can discern what it is. It all depends on what QoS administrative policy you as a network engineer decide on using.

Once again, if you configure your network so that the phone will assign the appropriate DSCP markings, then you know that any packet with those markings is voice or call signalling. If there are no markings, then you know that it is not voice or call signalling, and can be treated accordingly.
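As an example of this kind of configuration, on many older Catalyst switches you could tell the access port to trust the phone's DSCP markings (the interface number here is hypothetical, and the exact commands vary by platform; many newer switches trust DSCP by default):

```
Switch(config)#mls qos
Switch(config)#interface GigabitEthernet0/1
Switch(config-if)#mls qos trust dscp
```

With trust configured, the markings the phone applies are preserved as the traffic enters the network, so downstream routers can match on them.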

This is a special case where you must use the capabilities of the operating system you are running to mark voice packets. For Windows devices, for example, you must apply a Windows group policy to mark all the voice data generated by the softphone with the desired DSCP. You can use your favourite search engine to find out how to do this on Windows or other operating systems.

As you can see, Cisco recommends EF for voice traffic and CS3 for call signalling. These are the DSCP values that Rene is using to match traffic. What Rene doesn’t mention in the lesson, which may be the source of the confusion here, is that we are implicitly ASSUMING that voice traffic has EF and call signalling has CS3 DSCP markings. If left to their defaults, this is what you would see, at least for IP phones, and this is why Rene used them.

I hope this has been helpful!

Laz

1 Like

Dear Rene,

I have some thing to clarify,
Assume,
LLQ is configured to have 20% of the total bandwidth,
and other 2 Queues are attached to CBWFQ with 50% and 30%,
and the total bandwidth of the link is 1Gig.

So in the event of congestion, LLQ gets 200 Mbps guaranteed, right?
As per my understanding, the remaining portion of the bandwidth, which is 800 Mbps, is shared among the other 2 queues in a round robin fashion.
Which means one CBWFQ queue gets 400 Mbps and the other one gets 240 Mbps.
Am I correct?

Hello Roshan

The values in percent correspond to the total bandwidth of 1 Gig. This means that 20% for the LLQ is 20% of 1000 Mbps = 200 Mbps. Queues 2 and 3 get 50% and 30% respectively of the 1000 Mbps as well, not of the remaining 800 Mbps.

So queue 1 which is an LLQ queue will have 200Mbps guaranteed. Queue 2 and queue 3 will be served in a weighted round robin fashion resulting in them getting 500 Mbps and 300 Mbps respectively.
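The arithmetic above can be sketched as follows (the percentages and the 1 Gbps link rate come from the example; the queue names are just labels):

```python
# Each queue's guarantee is a percentage of the full 1 Gbps link rate,
# not of what remains after the priority queue is served.
link_mbps = 1000
shares = {"llq": 0.20, "queue2": 0.50, "queue3": 0.30}

# Guaranteed bandwidth per queue, in Mbps
guarantees = {name: link_mbps * pct for name, pct in shares.items()}
print(guarantees)
```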

I hope this has been helpful!

Laz

1 Like

Thanks,

:slight_smile:

1 Like

Hello Team

Could you please help me understand the bandwidth remaining ratio concept? Also, can you please let me know how I can calculate the bandwidth of a class from the bandwidth remaining ratio command?

For example,

class pe_af4_output
  bandwidth remaining ratio 251 account user-defined 24
  queue-limit 282000 bytes
  random-detect dscp-based
  random-detect exponential-weighting-constant 7
  random-detect dscp 32 70500 bytes 112500 bytes 10
  random-detect dscp 34 70500 bytes 112500 bytes 10
  random-detect dscp 36 21000 bytes 57000 bytes 5
  random-detect dscp 38 21000 bytes 57000 bytes 5
 class pe_af3_output
  bandwidth remaining ratio 108 account user-defined 24
  queue-limit 186000 bytes
  random-detect dscp-based
  random-detect exponential-weighting-constant 6
  random-detect dscp 24 46500 bytes 75000 bytes 10
  random-detect dscp 26 46500 bytes 75000 bytes 10
  random-detect dscp 28 13500 bytes 37500 bytes 5
  random-detect dscp 30 13500 bytes 37500 bytes 5

Hello Payal

The bandwidth remaining ratio is used for sub-interfaces and class queues so that each of these can use a different ratio of the remaining (unprioritized) bandwidth. Otherwise, they are all treated the same.

The ratio is a weighted value. It can be any value between 1 and 1000. Here we see that the ratio is being employed between two class queues.

Specifically, the ratio of bandwidth that class pe_af4_output should get compared to pe_af3_output is 251/108 or 2.324 times more (of the unprioritized) bandwidth.
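As a sketch of the calculation (assuming these are the only two class queues sharing the remaining bandwidth, and using a hypothetical 1000 Mbps of unprioritized bandwidth for illustration):

```python
# Bandwidth remaining ratios from the configuration above
ratios = {"pe_af4_output": 251, "pe_af3_output": 108}
remaining_mbps = 1000  # hypothetical unprioritized bandwidth

# Each class gets its ratio divided by the sum of all ratios
total = sum(ratios.values())
shares_mbps = {name: remaining_mbps * r / total for name, r in ratios.items()}
print(shares_mbps)  # pe_af4_output gets 251/359 of the remainder, pe_af3_output 108/359
```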

For more information about this feature, take a look at the following links:



I hope this has been helpful!

Laz

Hi

After reading this lesson and others, would I be right in saying that the different types of QoS tags don’t actually “do” anything when you tag a packet with them? Meaning that they don’t actually police, queue, or drop packets by themselves; they’re just “tags” associated with particular devices and applications, like VoIP being associated with the “ef” tag and call signalling with “cs3”…

Thanks again

Hello Michael

Yes, you’re absolutely right. The QoS markings that are placed in the L2 frame header and in the L3 packet header are simply markings which don’t actually do anything. By marking packets/frames, you are simply classifying traffic.

In order for QoS mechanisms to function, a network device must be configured to do something specific with packets based on those markings. Queuing, policing, shaping, and trusting can all be configured to take these markings into account and to prioritize traffic accordingly.

Also, remember that QoS mechanisms only kick in whenever there is congestion. Congestion will occur whenever there is more traffic than a particular network path can handle, either due to bandwidth limitations of the physical interface, or due to a configured bandwidth limitation.

I hope this has been helpful!

Laz

Hi Laz,

Thanks for clarifying that up, i have one more question in relation to this if you don’t mind.
In regards to the priority queue and whatever bandwidth is assigned to it, how would any of the other queues ever get packets processed if there were always a constant flow of voice packets through the priority queue?
I understand the config in the lesson, but I keep thinking: if there is always traffic coming into the priority queue, then how would any packet in another queue be processed as well?

Thanks again

Hello Michael

If you have a steady stream of marked voice packets and QoS mechanisms have been configured for LLQ, then these voice packets will indeed all go to the priority queue. You must remember however that the full capacity of the interface is at all of the queues’ disposal. And queuing will only “kick in” if you have congestion.

If you don’t have congestion on this interface, this means that the traffic that arrives on the interface is served immediately, no matter what its classification. In other words, as soon as a packet arrives at the interface, there is no delay in it being forwarded.

Queuing (and QoS mechanisms in general) have no meaning unless congestion is present. When congestion is present and you have configured LLQ, there is what is known as an “implicit policer” that limits the bandwidth that can be consumed by the priority queue. This is done to prevent bandwidth starvation of the non-priority queue flows serviced by the CBWFQ scheduler. The policing rate is set to match the bandwidth allocation for that particular priority queue. In the lesson, this is set to 2000 Kbps, so any excess traffic on the priority queue will be dropped by the policer. And keep in mind that the policer will only be in force whenever there is congestion and the queuing mechanism kicks in. If it does not, there is no limitation.

It is up to the network designer to ensure that the expected network traffic patterns and the infrastructure designed to carry them, are sufficiently well designed to allow for QoS mechanisms to handle high priority traffic successfully.

You can find out more about this behaviour of LLQ in the following Cisco documentation:

I hope this has been helpful!

Laz

Hi Laz,

Thanks for the clarity and for clearing up the policing and what happens when the priority queue is full. The only other part of my query was: what if the priority queue is always getting traffic sent to it? Then what happens to the other queues? Do they suffer from queue starvation anyway, because they don’t get a chance to send packets if the priority queue is always being used?

Thanks as always

Hello Michael

Let’s use this diagram from the lesson to help us out:

image

Notice that the interface is a GigabitEthernet interface, and that the priority queue is set to 50% of the total bandwidth. Now remember, if there’s no congestion, the above queues are non-existent, and all traffic is served immediately.

Now let’s assume we have congestion over an extended period of time, and we have a lot of priority traffic arriving at the interface. That priority traffic will go into the priority queue to be served immediately. However, the bandwidth of the priority queue has been limited to 50%. This means that only a bandwidth of up to 500 Mbps can use the priority queue. This is where the “implicit policer” I mentioned before comes in. Any packets above that 500 Mbps in the priority queue will be dropped.

This means that there is another 500 Mbps of bandwidth that is guaranteed for use by the CBWFQ for the non-priority traffic.

Now this is a little bit difficult for us to get our heads around because of the phrase “everything that ends up in queue 1 will be served before any of the other queues.” If there’s always something in queue 1, queues 2 to 4 will never be served!

We often think about dividing queues into sections of bandwidth, where queue 1 gets 500 Mbps, while the other queues combined get another 500 Mbps. But packets must enter the interface serially, meaning one at a time, one after the other. So to achieve the above mentioned bandwidths, the scheduling of the sending of packets from the queues occurs by sharing the use of the interface over time, something similar to time division multiplexing. For each second of time, 50% of the time will be devoted to serving queue 1, and 50% will be devoted to serving the other three queues, based on their percentages. These limitations are policed to ensure that bandwidth starvation will not occur.
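The time-sharing idea can be sketched like this (a toy model; the one-second window and queue labels are illustrative assumptions, not how IOS internally schedules packets):

```python
# Over each scheduling interval, the interface's transmit time is divided
# among the queues according to their configured percentages.
link_mbps = 1000
shares = {"queue1_priority": 0.50, "queue2": 0.20, "queue3": 0.20, "queue4": 0.10}

interval_s = 1.0
# Fraction of each interval spent serving each queue
time_slices = {q: interval_s * pct for q, pct in shares.items()}
# Effective throughput per queue while all queues are saturated
throughput_mbps = {q: link_mbps * pct for q, pct in shares.items()}
```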

I hope this has been helpful!

Laz

Brilliant as always, thanks Laz

1 Like

Hey Laz,
I can’t quite grasp the idea, as things are getting really mixed up and seem to contradict each other.

From my understanding, the LLQ should be served immediately when traffic queues up in it while congestion occurs.

If the queues work by percentage, then the PQ might have 100 Mbps capacity (10%), the 2nd queue 500 Mbps capacity (50%), and the 3rd queue 400 Mbps (40%) capacity.

Let’s assume that there has been congestion and traffic starts to buffer up. The interface is capable of transmitting 1 Gbps and there is an extra 50 Mb in the queues, and just before packets would be dropped because the buffers are full, the interface can start to transmit the queued data - so what will / can happen?

One option would be that the first 20 Mb from the PQ will be transmitted. After that, another 30 Mb in the 2nd queue should be served, but as soon as the 20 Mb of traffic has been served, the PQ gets another 100 Mb in it to be served, so the router should stop forwarding the data in the 2nd queue and immediately start serving the PQ again for 100 Mb.

In this scenario the PQ will serve a total of 120 Mbps instead of the configured 100 Mbps limit, as the implicit policer’s bucket should have replenished its tokens while the 2nd queue was active serving those pitiful 20 Mb.

Thus, I can’t understand your statement that each queue takes exactly the % of bandwidth it is configured for, since in actual practice the PQ was configured for 10% but served 12% in total, and the 2nd queue’s serving had to be stopped in the middle of the operation - at least in my theory, which reflects my understanding of LLQ operation and of how policers and shapers should work.

Another thing that doesn’t sound quite right is that the queues would use a TDM mechanism to utilize the bandwidth - that way the entire link wouldn’t be used at all for 80 Mbps of traffic, since it was scheduled to serve only the PQ for 10% of the time once the congestion occurred, and the PQ had only 20 Mb in it.

Hello Nitay

Neither can I! That’s why I’ll do my best to clarify the questions you have. Let’s take a look at the diagram once again:
image

First of all, we should keep in mind that queuing will only take place when there is congestion. I know we mentioned this before, but let me clarify. The percentages per queue are the maximum that is made available, whenever all queues are full. Imagine that all queues have a continuous flow of data well above the allocated bandwidth for each. These percentages are maxed, and they remain so for the duration of that congestion.

Now imagine that all the queues are at max except for Queue 3. Let’s say that’s only using half of its allotted bandwidth. That’s still congestion, but there is some “free bandwidth”, specifically 100 Mbps free. How does that get filled up? Traffic in queues 2 and 4 above their allotted bandwidths will be forwarded using “best effort”. This will not take place for the LLQ however, because a priority queue is policed to its configured rate, not extending into any unused capacity.

So traffic belonging to Queue 2 or 4 can actually exceed the 20% or 10% of bandwidth if there is free capacity, but that excess will be treated as best effort. In other words, any “free capacity” will always be filled in with excess traffic from other queues.

I understand your confusion here. What I said was that it is similar to TDM. For example, if we consider a situation where all queues are maxed, over a particular time period, say 1 second, 50% of that time will be used to forward the traffic in Queue1, 20% of the time to Queue2, 20% to Queue 3 and 10% of the time to Queue 4. It is unavoidably so because packets exit an interface serially. The main point of my statement was to show that a lower priority queue will not stop completely until higher priority queues are empty, but will simply be given a smaller allotment of bandwidth (divided over time).

I hope this has been helpful!

Laz

1 Like