QoS LLQ (Low Latency Queueing) on Cisco IOS

lagapidis · February 6, 2024, 6:35am

Hello Nicolas

Yes, I think you’ve described it very well. Your description is comprehensive and understandable, and I believe it is correct. Thanks for your clear explanation and for the way you framed the issue.

Laz

ndsodaro · February 16, 2024, 6:38pm

Thanks Laz for the great support.

I do have another question, and I’m interested in hearing your thoughts.

When it comes to classification and marking for congestion management: if I classify traffic by subnet only, using ACLs, and don’t include any DSCP markings, is there a risk that my underlay network traffic will be impacted? Or is it best to mark everything, so that you avoid a situation where your routing protocol traffic gets blackholed (e.g. an OSPF underlay with MPLS and MP-BGP)? The network I am dealing with has no voice traffic, but the intent is to give priority to mission-critical subnets while the transport links are at capacity.

lagapidis · February 19, 2024, 7:59am

Hello Nicolas

It is possible to classify traffic by subnet only with ACLs. However, there are a few implications and caveats that you must keep in mind when you do this:

First, concerning the underlay network, if your congestion management strategy does not account for the specific needs of routing protocol traffic (like OSPF, MPLS, MP-BGP), there is a risk that this critical traffic could be deprioritized or dropped in congested scenarios. Routing protocol traffic is essential for the stability and efficiency of your network. If it’s impacted, it could lead to routing inefficiencies or even outages.

Secondly, by classifying traffic solely based on subnets, you cannot distinguish between different types of traffic within the same subnet. You may not have voice traffic, but there may be other traffic types that you want to protect from congestion. With subnet-only classification, all traffic in a prioritized subnet is treated equally, regardless of its actual importance or requirements.

While classifying traffic by subnet only with ACLs is a valid approach, it’s generally beneficial to incorporate DSCP markings for more nuanced and effective traffic management, especially for ensuring the priority of mission-critical traffic and the integrity of your routing protocol traffic (regardless of the subnet from which it originates). I would say that you can go ahead and use an implementation that prioritizes based on the subnet, but you should also include some QoS mechanisms for other important traffic. Such a combined approach would help mitigate the risk of network instability and ensure efficient utilization of network resources. Does that make sense?
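To make the combined approach concrete, here is a minimal sketch rather than a tested design. It assumes that locally generated routing protocol packets carry DSCP CS6 (IP precedence 6), which is the IOS default for protocols such as OSPF, BGP and LDP. The ACL name, class names, subnet and percentages are all hypothetical placeholders.

```
! Hypothetical ACL for the mission-critical subnet
ip access-list extended CRITICAL-SUBNETS
 permit ip 10.10.0.0 0.0.0.255 any
!
! Routing/control traffic, matched by marking instead of by subnet
class-map match-any ROUTING-CONTROL
 match dscp cs6
 match mpls experimental topmost 6
!
class-map match-all CRITICAL-DATA
 match access-group name CRITICAL-SUBNETS
!
policy-map WAN-EGRESS
 class ROUTING-CONTROL
  bandwidth percent 5
 class CRITICAL-DATA
  bandwidth percent 60
 class class-default
  fair-queue
```

The point of the separate ROUTING-CONTROL class is that routing protocol packets get a guaranteed share of the link no matter which subnet they are sourced from.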

I hope this has been helpful!

Laz

ndsodaro · February 19, 2024, 2:37pm

Hi Laz,

Yes, it does. I definitely agree with you, but with the way our network is designed, our most mission-critical traffic is fully distinguished at the VRF level, and that’s why I think classification with ACLs should be OK for us. That said, I do agree that some type of marking should be applied anyway so that our routing protocols and the like remain unaffected. Do you have any resources or guidelines on “normal critical traffic types” that I can refer to?

This needs to be configured on a WAN link between ASR1000s running version 16.09.03 (egress queuing only).

Tx!

lagapidis · February 22, 2024, 8:14am

Hello Nicolas

For your situation, you need to ensure that QoS is applied to control plane traffic. That is, routing protocol traffic and any other control plane messages that may be used in your topology. This feature is called Control Plane Policing or CoPP. Take a look at this lesson for more info:

First, you should identify what control plane traffic you have on your link, and then think about the kind of priority you want to give that traffic. Based on this, you can devise your strategy for ensuring control plane traffic is not dropped. This can be applied in conjunction with your classification based on subnets using ACLs. Here is some more useful information from Cisco documentation about CoPP.
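As a rough illustration of what a CoPP policy for the protocols mentioned in this thread could look like, here is a sketch under stated assumptions. The ACL and policy names are hypothetical, and the police rates are arbitrary placeholders that would need to be sized from the control plane traffic actually observed on the router.

```
! Hypothetical ACL covering OSPF, BGP and LDP
ip access-list extended COPP-ROUTING
 permit ospf any any
 permit tcp any any eq bgp
 permit tcp any eq bgp any
 permit udp any any eq 646
 permit tcp any any eq 646
!
class-map match-all COPP-ROUTING
 match access-group name COPP-ROUTING
!
policy-map COPP-POLICY
 class COPP-ROUTING
  ! Never drop routing protocol traffic; the rate is only used for counters
  police 1000000 conform-action transmit exceed-action transmit
 class class-default
  ! Placeholder rate for everything else punted to the CPU
  police 500000 conform-action transmit exceed-action drop
!
control-plane
 service-policy input COPP-POLICY
```

Note that CoPP governs traffic punted to the router's own control plane. It complements, and does not replace, the egress queuing policy on the WAN link.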

I hope this has been helpful!

Laz

ndsodaro · June 13, 2024, 1:16am

hey Laz…

I hope all is well. It’s been a while since we last touched base. We’re encountering an issue where our QoS configuration works perfectly in our lab environment but fails in production.

Device Information
Model: Cisco ASR1001-HX
Processor: (1SR) processor (revision 1SR)
Memory: 3853454K/6147K bytes
Processor Board ID: TTM22380033
Software: Cisco IOS-XE

Our extended access control lists (ACLs) aren’t being hit in production, resulting in no traffic being matched to our QoS class. We’ve verified the forwarding path using show cef vrf exact-route for the source and destination, confirming that the interface forwarding the traffic is where the QoS policy is applied.

When we configure the ACL with permit ip any any, traffic is matched correctly. Below is an example of the extended ACLs we are using:

ip access-list extended 2000
 permit ip X.X.X.X 0.0.0.255 X.X.X.X 0.0.0.255

ip access-list extended Test
 permit tcp X.X.0.0 0.0.15.255 any eq 1521

The QoS policy is applied on the correct egress interface (as confirmed by show cef).
The prefixes matched by the ACL are labeled and technically reside in a different VRF but it shouldn’t matter, from what I understand. Do you have any suggestions or insights into why our ACLs might not be matching as expected in the production environment? Cisco has not been able to pinpoint the issue so far.
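
For readers troubleshooting a similar symptom, the checks described above could look roughly like this. The interface, VRF name and addresses are placeholders and not taken from the production network.

```
! Per-class packet counters for the policy on the egress interface
show policy-map interface TenGigabitEthernet0/1/1 output
!
! Which egress interface CEF selects for a specific source/destination pair
show ip cef vrf CRITICAL exact-route 10.10.0.10 10.20.0.10
!
! Whether the destination prefix is forwarded with MPLS labels imposed
show ip cef vrf CRITICAL 10.20.0.0 255.255.255.0 detail
```

The last command matters because a packet that leaves the interface with labels imposed is an MPLS packet at that point, which can change what an egress classifier is able to see.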

lagapidis · June 17, 2024, 5:06am

Hello Nicolas!

All is well, thanks, I hope you’re doing great too!

My first thought about this situation is the issue you mentioned with the VRF. The prefixes matched by the ACL are labeled and technically reside in a different VRF, and this can affect the behavior. The router treats each VRF as a separate routing table. Therefore, if your ACLs are referencing IP addresses that are in a different VRF, they won’t match.

However, you mentioned that you were able to get it to work in the lab. Did you have the VRF configuration in the lab environment as well? In other words, have you reproduced the lab environment exactly in your production network?

If everything is exactly the same in both lab and production, you should then start to check things like platform and IOS versions. The lab equipment (real hardware or an emulator?) may behave slightly differently, and you may need to develop your config on the actual production equipment itself.

I don’t have an immediate solution for you, but I hope these thoughts will help you in your troubleshooting process. Let us know how you get along, and give us some more info of your results so that we can help you further.

I hope this has been helpful!

Laz

ndsodaro · June 17, 2024, 12:11pm

Yes, we thought the same thing initially, even before developing this solution. But considering that CEF resolves to the exact interface that the QoS policy is applied on, and that it already worked in the lab with different VRFs, we went ahead with it. The only difference is that we’re using ASR1000s in production, whereas in the lab it’s the same image on a different device type (c1000v).

Cisco could not find any bugs and are actively working on it.

Regarding ACLs and VRFs, are there any articles, links on the web, or past experiences you could share? Also, we have MACsec in the MPLS underlay; would this impact the way the Layer 3 headers are inspected?

lagapidis · June 24, 2024, 5:23am

Hello Nicolas

Thanks for the update, I’d be interested to see what Cisco comes up with. Can you keep us posted?

Hmm, it is interesting because you’re combining a series of features such as ACLs, VRFs, and MACsec in the MPLS underlay. However, I am not familiar with any documentation that covers all of these aspects together. The fundamental question here, I believe, is the concept of “VRF-awareness”. Is the setup aware of the VRFs, and if so, is there inter-VRF interaction? ACLs themselves are not inherently VRF-aware, but they do act within the confines of a particular VRF. Only if you configure something like VRF leaking would the ACL in one VRF be able to interact with networks in another VRF.

I wish I could have been of more help for you! Keep us posted with your progress.

I hope this has been helpful!

Laz

ndsodaro · July 12, 2024, 1:27pm

Hi Laz,

We got around this by using QoS groups.

We modified the configuration to classify the traffic on the ingress sub-interfaces and assign it to a unique QoS group there. The QoS groups are then referenced in a different set of class-maps intended solely for egress queuing.
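
A minimal sketch of the workaround described above follows. All names, the sub-interface, the subnets, the QoS group number and the bandwidth percentage are hypothetical, since the actual production configuration was not posted. The idea is that the ACL is evaluated on ingress, while the packet is still plain IP inside its VRF, and the egress policy only has to match the internal qos-group tag.

```
! Ingress: classify with the ACL and tag the packet internally
ip access-list extended CRITICAL-SUBNETS
 permit ip 10.10.0.0 0.0.0.255 10.20.0.0 0.0.0.255
!
class-map match-all CRITICAL-IN
 match access-group name CRITICAL-SUBNETS
!
policy-map MARK-IN
 class CRITICAL-IN
  set qos-group 10
!
interface GigabitEthernet0/0/1.100
 service-policy input MARK-IN
!
! Egress: queue on the qos-group instead of on IP header fields
class-map match-all CRITICAL-OUT
 match qos-group 10
!
policy-map WAN-OUT
 class CRITICAL-OUT
  bandwidth percent 50
 class class-default
  fair-queue
!
interface TenGigabitEthernet0/1/1
 service-policy output WAN-OUT
```

A plausible explanation for why this works where the egress ACL did not, although the thread does not confirm it, is that packets leaving the MPLS-facing interface are already labeled when the egress policy classifies them. The qos-group is local to the router and does not depend on what the packet headers look like at egress.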

lagapidis · July 16, 2024, 5:42am

Hello Nicolas

Thanks for the update, it’s much appreciated!

Laz

tkaashan · October 22, 2024, 9:14am

@lagapidis @ReneMolenaar
Hello Laz & Rene,
Hope you are doing great. Your articles are amazing. Thanks.

I still have a few questions about an LLQ scenario.

An egress policy map is applied on a 10 Gbps interface with an ISP CIR of 8 Gbps. The policy map has only one user-defined class: a voice-video priority class with a maximum of 6 Gbps, plus class-default with 2 Gbps. A sample configuration is given below.

policy-map Out-policy
 class Voice-Video
  match ACL <for tcp and udp based Voice/video traffic>
  priority
  police 6000000000
 class class-default
  police 2000000000

interface Te-0/1/1
 bandwidth 8000000000
 service-policy output Out-policy

The above configuration intends to restrict both classes to the configured policer CIR even when there is no congestion.

1. My understanding is that there will not be any queuing for either class if the traffic rate is under the configured policer values, i.e., class voice-video under 6 Gbps and class-default under 2 Gbps. Is this a correct assumption?
2. What type of queuing will MQC apply to traffic that matches class-default when there is congestion, i.e., when a continuous stream of voice-video class traffic exceeds 6 Gbps and class-default traffic exceeds 2 Gbps?
3. Does fair queue with policing provide better congestion management for the class-default traffic?
4. Can a 75% allocation for the voice-video class severely impact class-default traffic or lead to starvation of class-default? Are there any best practices for the LLQ class allocation ratio based on the available bandwidth?
5. I assume the MQC will use the “bandwidth 8000000000” configuration to calculate the total bandwidth available for all classes. Is this true for IOS XE?

lagapidis · October 24, 2024, 4:48am

Hello Tkaashan

I’m glad you find the articles helpful! Let’s dive into your questions:

1. Yes, your assumption is correct. If the traffic rate for each class is below its configured policer value, the policers drop nothing, and as long as the interface itself is not congested there will be no queuing either. Keep the distinction in mind: the policers are always active and begin to drop packets as soon as a class exceeds its configured rate, whereas queuing only kicks in when there is congestion. Otherwise, traffic is simply served immediately.
2. The class-default is treated with FIFO (First-In, First-Out) queuing by default when there is congestion. If the voice-video class traffic exceeds 6 Gbps and class-default traffic exceeds 2 Gbps, the excess traffic will be dropped or marked down, depending on the configured policer actions.
3. Fair queue with policing could provide better congestion management for the class-default traffic. It could help to ensure that all flows get a fair share of the bandwidth. However, it also depends on the type and nature of the traffic in your network.
4. Allocating 75% of the bandwidth to the voice-video class could potentially impact the class-default traffic. It’s essential to ensure that there’s enough bandwidth for all classes to prevent starvation. The best practice is to allocate bandwidth based on the importance and requirements of each class, and that depends upon the expected amounts of traffic of each type on your particular network.
5. Yes, the MQC will use the “bandwidth 8000000000” configuration to calculate the total bandwidth available for all classes. This is true for IOS XE. Keep in mind that the interface bandwidth command takes its value in kilobits per second, so 8 Gbps would be entered as bandwidth 8000000.
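
One common way to make the scheduler explicitly aware of the 8 Gbps CIR is a hierarchical policy with a parent shaper, sketched below. This is an illustration under assumptions and not a drop-in replacement for the posted configuration. The class-map, ACL name and policy names are hypothetical, and the per-class policer on class-default has been left out because the parent shaper already caps the total.

```
class-map match-all VOICE-VIDEO
 ! Hypothetical ACL matching the voice/video flows
 match access-group name VOICE-VIDEO-ACL
!
policy-map OUT-CHILD
 class VOICE-VIDEO
  priority
  ! Always-on policer that caps the priority class at 6 Gbps
  police cir 6000000000
 class class-default
  fair-queue
!
policy-map OUT-PARENT
 class class-default
  ! Shape the aggregate to the 8 Gbps CIR; congestion builds here
  shape average 8000000000
  service-policy OUT-CHILD
!
interface TenGigabitEthernet0/1/1
 service-policy output OUT-PARENT
```

With this layout the child queues see back-pressure at 8 Gbps and not at the 10 Gbps line rate, so the priority and fair-queue behavior applies at the contracted rate.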

I hope this has been helpful!

Laz

tkaashan · October 25, 2024, 12:03pm

Thanks, Laz.

I have a few more questions.
In response to Question 4, you mentioned that allocating 75% of the bandwidth (6 Gbps) to the voice-video class can impact class-default traffic. Is this because of the priority queue configuration for the voice-video class? I mean, does the priority queue that is active on class voice-video during congestion force the class-default queue (FIFO) to give way and suffer delays? If we use bandwidth or police instead of priority for the class voice-video, can we expect lower delays and drops for the class-default traffic during congestion?

When class voice-video continuously exceeds 6 Gbps and class-default traffic is under the allocated 2 Gbps, say at 1.5 Gbps, there should not be any impact on class-default traffic, as there is no congestion and hence no queue. Is my understanding correct, or will class-default traffic still need to wait behind the priority queue to get into the transmit ring?

I am confused about the Class-Default’s FIFO queuing behavior during congestion. Is FIFO queuing applicable only when the Class-Default is not configured with an explicit action like police? In my scenario, police action is configured. What will the order of the operations be during congestion? FIFO queue first, and packets dropped by the police when the queue is full?

lagapidis · October 29, 2024, 5:23am

Hello Tkaashan

Yes, the priority queue configuration for the voice-video class can indeed impact the class-default traffic. This is because during periods of congestion, the priority queue is serviced preferentially, which could potentially starve other queues of bandwidth, causing increased delays and potential packet loss in the class-default queue. Using bandwidth or police instead of priority for the voice-video class can help to alleviate this issue, as these methods do not provide preferential treatment to the voice-video class during times of congestion.

If the voice-video class is continuously exceeding its 6gbps allocation, but the class-default traffic is under its 2gbps allocation, there shouldn’t be any impact on the class-default traffic, assuming there is no congestion. However, if there is congestion, then the class-default traffic would indeed have to wait behind the priority queue to get into the transmit ring.

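FIFO queuing is the default behavior for the class-default queue, and configuring a policer in the class does not change that, because police is not a queuing action. In the output direction the policer is evaluated before the packet is enqueued, so the order of operations is policing first, followed by FIFO queuing. The policer drops packets that exceed the configured rate regardless of how full the queue is. Packets that pass the policer are then queued, and they are tail-dropped only if the queue itself fills up.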

I hope this has been helpful!

Laz