QoS Traffic Policing Explained

lagapidis · April 1, 2020, 2:58pm

Hello Michael

No need to apologize, that’s what we’re here for. Indeed this is a difficult topic to get your head around, I’m glad however that it is clearer. The truth is, if there was a video animation of a bucket, with a simulated arrival of packets, and the animated addition and removal of tokens as a result, it would be of great benefit for understanding all of these procedures. I have attempted to find something like this, but was unable to. Maybe we’ll work on creating such a video!! We’ll see…

Laz

nitayp1 · May 20, 2020, 9:24pm

Hi Laz and Rene,

Is there a way to configure a burst size limit as on Juniper devices?

For example, Juniper states that a command in the single-rate two-color policer configuration can specify the maximum burst-size-limit of the traffic.
This way you could configure a 100Mb link with a 10Mb maximum burst size. How could you do something like that on Cisco?

It sounds like the algorithm has one bucket, and another mini bucket inside of it that supplies tokens for burst streams. For example: if traffic is sent as 10 packets with a 5ms gap between them, then the tokens will be taken from the mini bucket instead of the whole bucket, and if the time gap between packets is more than 5ms, then the tokens will be taken from the main bucket. This way you could limit the maximum burst size, as long as you have a specific definition for “what kind of traffic delimiter is a burst?”

It doesn’t seem right to define “what is a burst” by a delimiter.

I found this document on Juniper’s website about how to calculate the maximum burst size, and I got pretty confused and didn’t understand what they meant:

It looks like they calculate the burst time for a specific policer, assuming that the policed rate is the interface’s BW, and then they calculate how long it would actually take to burst that traffic at the real interface’s BW. I just don’t get that.

So, what should I make of that? They mention that the burst time shouldn’t be below 5ms for the policed BW. I don’t like this explanation at all and I’m so confused.

lagapidis · May 22, 2020, 9:12am

Hello Nitay

Although I don’t have experience with Juniper equipment, taking a brief look at the documentation you shared, it looks to me like it is similar to the description in the QoS Traffic Policing lesson:

Specifically, it seems to be referring to the levels of traffic between the exceeding and violating categories. As far as the 5ms goes, I believe that to be a general guideline due to the fact that it has been seen in practice, that anything less would degrade functionality.

I’m sorry for not going into much more depth, as my experience with Juniper lies somewhere between extremely limited and non-existent.

Laz

nitayp1 · May 22, 2020, 5:48pm

Thanks Laz,
Let me try to ask my question again in a different approach:

In one of your QoS lessons about QoS on switches, you have a video with a great example that limits the burst size, and it is configured this way:

policy-map BURST-LIMIT
   class class-default
      police 256000 25600 conform-action transmit exceed-action drop

I can’t understand what that 8K burst size limit is for and how it is different from configuring the two-token-bucket approach, like this example:

policy-map BURST-LIMIT
   class class-default
      police 256000
        conform-action transmit
        exceed-action set-dscp-transmit 0
        violate-action drop

You taught us that in the Bc+Be approach you can burst traffic after inactivity, but in my first example you can burst too, although it seems much more limited than in the second example.

As far as I know, an ISP would use my first example, but if not, why would they use the 2nd example for policing (or the CIR + PIR option)? Is it fair to limit the burst size so much, to 10% of the policed bandwidth?

After all, if you limit the burst size, then the 2nd approach should be useless and the Be bucket would never be used, is that right?

From some calculations I made: if you have a 10G link, and the ISP polices a customer’s link to 100Mb with a 10Mb burst size, then you could configure a shaper that sends 10Mb of data per 100ms, so the Tc is 100ms. But does that count as a burst? You could also configure a shaper that sends 20Mb with a Tc of 200ms, and to reason about this you need a definition for “what traffic pattern is a burst?” Or maybe the 100ms Tc means that you use 1ms of the whole Tc to send a burst of 10Mb (as it takes 1ms to transmit 10Mb on a 10Gig link). But that means the shaper must have a maximum Tc of 100ms, because a 200ms Tc means you could potentially send 20Mb of traffic as a burst in 1% of the Tc, and the ISP would then drop 50% of the burst.

So maybe the term “burst” and the way it is calculated on network devices isn’t known to me, and that’s why I’m having trouble understanding how to engineer the network as well as possible.

The definition of “burst” that I know is traffic that is transmitted over a very short period and is followed by a long “quiet” pause before the next burst. The time gap between the packets within a burst is unknown to me: I’m not sure whether there should be a small gap between the packets in the burst, and if there is, whether it should always be the same.

Thank you again for your time to help me understand this topic in a real environment

ReneMolenaar · May 26, 2020, 9:10am

Hi @nitayp1,

Let’s start with the definition of burst:

I like the Wikipedia explanation of burst transmissions:

In telecommunication, a burst transmission or data burst is the broadcast of a relatively high-bandwidth transmission over a short period.

Which is similar to yours. The way we allow bursts and whether there is a (time) gap or not, depends on whether you do policing or shaping, and how it is configured.

Let’s talk about policing first.

nitayp1:

that limits the burst size, and it is configured this way:
policy-map BURST-LIMIT
   class class-default
      police 256000 25600 conform-action transmit exceed-action drop
I can’t understand what that 8K burst size limit is for

With a single rate two-color policer like this, you only have one bucket. That’s it. When the bucket is full, tokens are discarded.

The number 25600 here is your burst bytes, this is the “burst allowance”. When a packet arrives, the policer checks whether the bucket has enough tokens (bytes) to allow the incoming packet. If so, you remove the number of tokens from the bucket (packet length) and transmit the packet.

With a burst allowance of 25600 bytes and packets that are 1500 bytes each, you can transmit 25600 / 1500 = 17 packets that conform. The 18th packet will be exceeding and dropped because there are not enough tokens in the bucket. The “burst” in this sense, is the number of packets you can transmit until your bucket is empty.
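To make the arithmetic above concrete, here is a minimal Python sketch of a single token bucket. It is only an illustration, not how IOS implements the policer: the function name is made up, and it assumes the bucket starts full and that the packets arrive back to back, so no tokens are added between them.

```python
def police_two_color(packet_sizes, burst_bytes):
    """Classify back-to-back packets against one token bucket.

    Assumptions (not from any vendor implementation): the bucket starts
    full and no tokens are added between the packets.
    """
    tokens = burst_bytes  # one token represents one byte
    results = []
    for size in packet_sizes:
        if size <= tokens:
            tokens -= size              # spend the tokens and transmit
            results.append("conform")
        else:
            results.append("exceed")    # not enough tokens: exceed-action (drop)
    return results


# 18 packets of 1500 bytes against a 25600-byte burst allowance
results = police_two_color([1500] * 18, burst_bytes=25600)
print(results.count("conform"), "packets conform, packet 18 is:", results[-1])
```

After 17 packets, 25500 tokens are spent and only 100 remain, which is why the 18th packet exceeds.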

nitayp1:

how is it different from configuring the two-token-bucket approach, like this example:
policy-map BURST-LIMIT
   class class-default
      police 256000
        conform-action transmit
        exceed-action set-dscp-transmit 0
        violate-action drop
You taught us that in the Bc+Be approach you can burst traffic after inactivity, but in my first example you can burst too, although it seems much more limited than in the second example.

With the single rate two-color policer we only have one bucket. When the bucket is full, we throw away the tokens. This example is a single rate three-color policer so we have two buckets.

We fill the Bc bucket first and the spillage goes into the Be bucket. Whatever doesn’t fit into the Be bucket, is thrown away. This solution might be fairer to customers because, with two buckets, you can transmit for a longer duration until your traffic is policed.

The exceed action here transmits the packets but sets the DSCP to 0. It’s likely that another policer further into the network will still drop these packets.

We still use the Be bucket. You spend tokens from the Bc bucket first, when it’s empty, we use the Be bucket. The same logic with the burst allowance applies here.
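The two-bucket logic can be sketched in Python as well. This is a simplified model under stated assumptions, not Cisco's actual code: the class and attribute names are invented, both buckets are assumed to start full, and tokens are earned at packet arrival from the time since the previous packet.

```python
class SingleRateThreeColor:
    """Toy model of a single-rate three-color policer (Bc and Be buckets)."""

    def __init__(self, cir_bps, bc_bytes, be_bytes):
        self.cir_bps = cir_bps
        self.bc_size = bc_bytes
        self.be_size = be_bytes
        self.bc = bc_bytes   # assumption: both buckets start full
        self.be = be_bytes
        self.last = 0.0      # arrival time of the previous packet (seconds)

    def arrive(self, now, size):
        # Tokens (bytes) earned since the previous packet arrived
        earned = (now - self.last) * self.cir_bps / 8
        self.last = now
        # Fill Bc first; the spillage goes into Be; the rest is thrown away
        spill = max(0, self.bc + earned - self.bc_size)
        self.bc = min(self.bc_size, self.bc + earned)
        self.be = min(self.be_size, self.be + spill)
        # Spend from Bc first, then from Be
        if size <= self.bc:
            self.bc -= size
            return "conform"
        if size <= self.be:
            self.be -= size
            return "exceed"
        return "violate"


policer = SingleRateThreeColor(cir_bps=256000, bc_bytes=3000, be_bytes=3000)
print([policer.arrive(0.0, 1500) for _ in range(5)])
```

With these made-up bucket sizes, two packets are paid for from Bc, two from Be, and the fifth violates, so the Be bucket is what allows the longer burst.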

This is up to the ISP and it depends on what is in the contract or what is “fair” to the customer.

If you pay for a 100Mb link and the ISP uses a single-rate two-color policer or a single-rate three-color policer, then 100Mb is your “hard” limit.

With a dual-rate, three-color policer (CIR+PIR) we fill two buckets at the same time. When you spend your tokens, you can go above the CIR rate. This allows customers to use the network up to the PIR rate when there is not much going on, with a guaranteed CIR rate. This was used often on frame-relay networks.
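A rough Python sketch of the CIR+PIR idea, modelled loosely on the color-blind two-rate three-color marker described in RFC 2698. Treat it as an assumption-laden illustration: the names are invented, both buckets are assumed to start full, and a real platform may differ in the details.

```python
class TwoRateThreeColor:
    """Toy model of a dual-rate three-color policer (CIR and PIR buckets)."""

    def __init__(self, cir_bps, bc_bytes, pir_bps, be_bytes):
        self.cir_bps, self.bc_size = cir_bps, bc_bytes
        self.pir_bps, self.be_size = pir_bps, be_bytes
        self.c_tokens = bc_bytes   # assumption: both buckets start full
        self.p_tokens = be_bytes
        self.last = 0.0            # arrival time of the previous packet (seconds)

    def arrive(self, now, size):
        gap = now - self.last
        self.last = now
        # Both buckets are filled at the same time, each at its own rate
        self.c_tokens = min(self.bc_size, self.c_tokens + gap * self.cir_bps / 8)
        self.p_tokens = min(self.be_size, self.p_tokens + gap * self.pir_bps / 8)
        if size > self.p_tokens:
            return "violate"           # above PIR
        if size > self.c_tokens:
            self.p_tokens -= size      # above CIR but within PIR
            return "exceed"
        self.c_tokens -= size          # within CIR: spend from both buckets
        self.p_tokens -= size
        return "conform"


policer = TwoRateThreeColor(cir_bps=1_000_000, bc_bytes=3000,
                            pir_bps=2_000_000, be_bytes=6000)
print([policer.arrive(0.0, 1500) for _ in range(5)])
```

The rates and bucket sizes here are example values only. The point is that traffic between CIR and PIR is still forwarded, as exceeding, instead of being limited at the CIR.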

If you use a shaper with a Tc of 100ms and Bc of 10Mb, you’ll end up with 100Mb. Same thing with a Tc of 200ms and Bc of 20Mb. The second option might not be a good idea if you have any delay sensitive traffic like VoIP though. In both cases, we burst. We allow an X amount of traffic before we limit it.

With the shaper, there is a pause because of the Tc. If you use a Tc of 100ms, you have 100ms to transmit 10Mb. With a 10G link, that takes a very short time so most of the duration of the Tc the link won’t do anything.
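The shaper numbers can be checked with a few lines of Python. The helper names are hypothetical, Bc is taken in bits and Tc in milliseconds, and the 10G link speed comes from the question above.

```python
def shaped_rate_bps(bc_bits, tc_ms):
    """Average rate that results from sending Bc bits every Tc milliseconds."""
    return bc_bits * 1000 / tc_ms


def burst_time_ms(bc_bits, link_bps):
    """How long the physical link is busy sending Bc bits at line rate."""
    return bc_bits * 1000 / link_bps


LINK_BPS = 10_000_000_000  # 10G link

for bc_bits, tc_ms in [(10_000_000, 100), (20_000_000, 200)]:
    rate = shaped_rate_bps(bc_bits, tc_ms)
    busy = burst_time_ms(bc_bits, LINK_BPS)
    print(f"Bc={bc_bits} bits, Tc={tc_ms} ms: average {rate:.0f} bps, "
          f"link busy {busy:.0f} ms, idle {tc_ms - busy:.0f} ms")
```

Both settings average 100Mbps, but the 200ms Tc sends twice as much in one go and then stays quiet for twice as long.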

With the policer, there’s no pause. We check if there are enough tokens in the bucket and if so, we transmit the packet. Tokens are renewed based on this rule:

(Packet arrival time - Previous packet arrival time) * Police Rate / 8

The number of tokens that we put in the bucket depends on the time between two arriving packets.
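As a small sketch, the rule above translates to Python like this. Times are in seconds, the rate is in bits per second, tokens are bytes, and the function names are made up for the example. Capping at the bucket size reflects the earlier point that tokens are discarded when the bucket is full.

```python
def tokens_to_add(arrival_time, previous_arrival_time, police_rate_bps):
    """Bytes of tokens earned between two packet arrivals."""
    return (arrival_time - previous_arrival_time) * police_rate_bps / 8


def refill(tokens, bucket_size, arrival_time, previous_arrival_time, police_rate_bps):
    """New token count after a packet arrives, capped at the bucket size."""
    earned = tokens_to_add(arrival_time, previous_arrival_time, police_rate_bps)
    return min(bucket_size, tokens + earned)


# 256000 bps policer with a 25600-byte bucket
print(tokens_to_add(0.125, 0.0, 256000))        # tokens earned after a 125 ms gap
print(refill(20000, 25600, 0.5, 0.0, 256000))   # a long gap cannot overfill the bucket
```

A 125 ms gap at 256000 bps earns 4000 bytes of tokens. A 500 ms gap would earn 16000, but the bucket never holds more than its configured burst size.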

And yes, you have to be careful that your shaping configuration matches the policer on the other side, otherwise it is possible that they still drop your traffic. This is a good experiment to try in a lab btw

I hope this helps!

Rene

nitayp1 · May 26, 2020, 9:48am

Thank you very much for the explanation Rene,

So I think I finally grasp the idea behind the burst limitation, as follows:
If we have a 256Kbps policer with a maximum burst of 25.6Kbps, then the policing algorithm works a little differently: instead of a maximum of X tokens for 256Kbps, you now get a maximum of 0.1X tokens. But, and it is a big “but”, the time it takes to replenish the tokens is tied to the “maximum bandwidth limit” and not to the “maximum burst size”. This means the policer replenishes tokens much faster, which lets you burst up to 10 times the “maximum burst size” within 1 second, which equals the “maximum bandwidth limit” of 256Kbps in this example.

Thank you very much for your time. That was the hidden last piece I was looking for to understand this topic!

bousso.diagne16 · August 5, 2020, 12:19pm

Hello,
Thanks for the course.
I have some questions about things I did not fully understand.
1- Is it possible to do policing even if we are not a provider? If yes, can you give an example?

2- What happens at the beginning of the transmission? How is the bucket filled? How many tokens does it hold?

3- After a packet is policed and found conforming, is the number of tokens taken out of the bucket equal to the length of the packet? I mean, if a packet is 100 bytes and the bucket contains 500 tokens (bytes), 100 tokens will be taken out, right? And before replenishing, the bucket has 400 tokens, right?

4- Is there a minimum number of tokens in the bucket?

5- For dual-rate two colors, is the non-conforming packet compared with Be or with Be+Bc?

Is policing done before or after classification? For marking, I understand that the packet can be re-marked after policing. So am I right to say that the router treats packets in this order: classification, policing, marking?

Thanks in advance.
Bousso

lagapidis · August 7, 2020, 5:53am

Hello Bousso

Policing can be done anywhere and on any port that is on a device that supports it. One example is applying policing on a WAN link between your various branches in order to avoid oversubscribing these links. It can also be applied on your own edge router in order to avoid sending too much traffic to the ISP (you may have a cost agreement with them that bills you based on usage and you want to control that). In general, policing, as well as shaping, are applied at locations of the network where some control over the traffic usage is necessary. This takes place most often at connections with the ISP or on WANs, and not generally within other areas of the enterprise network. But you will often see it on your own enterprise equipment, and not just on the ISP side.

In the beginning, the buckets are full and only start being emptied as traffic arrives on the interface. How they are emptied is further described in this post: QoS Traffic Shaping Explained - #64 by lagapidis

Yes, for policing this is correct. For shaping, you would use bits instead of bytes.

The minimum number of tokens in a bucket is zero, when enough traffic arrives quickly enough to empty the bucket.

A non conforming packet can belong to either the exceeding or violating category.

Exceeding packets are those that do not fit in the Bc bucket, but whose number of bytes is less than or equal to the number of tokens in the Be bucket.
Violating packets are those whose number of bytes is greater than the number of tokens in the Bc bucket and also greater than the number of tokens in the Be bucket.

An excellent resource on QoS order of operations can be found at the following Cisco documentation:

I hope this has been helpful!

Laz

robbo7987 · February 21, 2021, 5:42am

Hi, is the Be bucket full to begin with at the start of the 1 second, or does it only grow if the Bc bucket is full and the spillage then goes over into the Be bucket?

Or is it similar to shaping, where if you don’t use the allotted Bc during the Tc, then whatever was not used is passed over into the Be?

Thanks

lagapidis · February 25, 2021, 7:02am

Hello Michael

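As mentioned earlier in this thread, the buckets start out full, so the Be bucket is full at the beginning. After that, it is refilled only with overflow tokens from the Bc bucket: when the Bc bucket is full, the spillage goes into the Be bucket. Unlike shaping, there is no Tc interval at the end of which unused Bc is carried over. A shaper buffers excess traffic and tries to send it later, while a policer simply drops anything that violates the defined limits.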

I hope this has been helpful!

Laz

samirkhair · March 16, 2021, 9:11pm

Hi,

I just want to clarify when the bucket is replenished because it seems there’s a confusing overlap between shaping and policing.

I know people are talking about Tc in some of the above posts but that’s actually a shaping feature. Is that correct?

And as per the notes, the bucket is replenished only when it receives a packet, by using the (packet received time - previous packet received time) * Police Rate / 8 formula to determine the number of tokens to fill the bucket with, after which it checks if there are enough tokens in the bucket for the received packet to conform. So, there is no periodic replenishing of the bucket, as there is in traffic shaping. Is that all correct?

Another question I have relates to the bucket size itself. I have seen Bc = CIR/32 mentioned above (is CIR actually the Police Rate, or can the terms be used interchangeably?). But if the Police rate is 128,000bps, 128,000/32 is 4,000 bits, which divided by 8 is 500 bytes, and that value is 1000 bytes less than the 1500 MTU. So, where did the 32 come from? Is it a Cisco default value? Or does it vary?

Thanks,

Sam

lagapidis · March 19, 2021, 8:32am

Hello Samir

Yes, that sounds correct. Every time a packet arrives, the time between the previous packet and the current packet is used to determine how many tokens are to be added, as per the formula.

There is no periodic replenishing, but replenishing is still a factor of time, just like shaping. Shaping replenishes at specific intervals, policing replenishes when packets arrive.

CIR is the official term used to refer to the agreed upon bitrate at which the connection will be limited. The term police rate or policer rate can also be used for the same thing, but you typically wouldn’t use it in your discussions with your ISP.

The value of Bc doesn’t actually change the configured CIR. However, it changes the time interval that is used as the baseline for the calculation. Although replenishing doesn’t take place at regular intervals, there is still a use for the Tc in policing, specifically when you use Bc.

Take a look at this thread at the Cisco Community that explains it well:

Approaching it this way, you can see that the actual size of the frame itself doesn’t play a role, because it can be “included” in the Bc in several time periods.
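One way to picture the relationship between Bc, CIR and Tc in a policer is to compute how long the policed rate needs to earn Bc worth of tokens. This is only a sketch: the function name is hypothetical, and it assumes Bc is expressed in bytes and CIR in bits per second.

```python
def tc_ms(bc_bytes, cir_bps):
    """Time in milliseconds for the CIR to earn bc_bytes worth of tokens."""
    return bc_bytes * 8 * 1000 / cir_bps


CIR = 128_000  # bps, the rate from the question above

for bc in (500, 1500, 4000):
    print(f"Bc={bc} bytes -> Tc={tc_ms(bc, CIR)} ms")
```

Changing Bc changes this interval but not the long-term rate. A larger Bc simply allows a bigger burst that takes longer to earn back.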

I hope this has been helpful!

Laz

samirkhair · March 19, 2021, 6:28pm

Hi Laz,

Thanks for the explanation, it helped clear stuff up.

Thanks,

Sam

mohamadzakwan88 · November 2, 2021, 2:02pm

Hello All,

I tried the config below and I was shocked that IOS accepted the command. I thought ‘be’ should be a value such that the Tc would be the same for both CIR and PIR.

I calculated the Tc value for each CIR and PIR.
cir 10000000 bc 50000 == Tc: 40ms (50000 bytes * 8 / 10000000 bps)
pir 20000000 be 50000 == Tc: 20ms (50000 bytes * 8 / 20000000 bps)

This means tokens are replenished at different rates for CIR and PIR.
More packets can be forwarded with the exceed action, because the PIR bucket is replenished at twice the rate of the CIR bucket.
Is this correct?

policy-map POLICY-MAP
 class class-default
  priority
  police cir 10000000 bc 50000 pir 20000000 be 50000 conform-action transmit  exceed-action set-dscp-transmit af32 violate-action drop

lagapidis · November 4, 2021, 5:46am

Hello Mohamad

Yes, you are correct that the CIR bucket and the PIR bucket will be replenished at different rates. But rather than calculating the Tc for each case, it makes more sense to examine a particular time interval between packets and see how many tokens are added to each bucket as a result. This matches better with the actual process.

The time between packets is determined, and from that, we determine how many tokens are added to each bucket. Comparing the packet size against the tokens then available tells us if we are conforming, exceeding, or violating in each case.
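A short Python sketch of that approach, using the rates and bucket sizes from the config in the question. It assumes bc and be are in bytes and that both buckets are empty before the gap. The function name is invented for the example.

```python
def tokens_added(gap_ms, rate_bps, bucket_bytes, tokens=0):
    """Tokens (bytes) in a bucket after a gap between packets, capped at its size."""
    earned = gap_ms * rate_bps / 8 / 1000
    return min(bucket_bytes, tokens + earned)


CIR, BC = 10_000_000, 50_000   # police cir 10000000 bc 50000
PIR, BE = 20_000_000, 50_000   # pir 20000000 be 50000

for gap_ms in (1, 10, 20, 40):
    cir_tokens = tokens_added(gap_ms, CIR, BC)
    pir_tokens = tokens_added(gap_ms, PIR, BE)
    print(f"gap {gap_ms:>2} ms: CIR bucket {cir_tokens:.0f} bytes, "
          f"PIR bucket {pir_tokens:.0f} bytes")
```

For the same gap, the PIR bucket earns twice as many tokens as the CIR bucket until it hits its 50000-byte ceiling.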

So yes, such a command would be accepted in the Cisco IOS.

I hope this has been helpful!

Laz

rohitjamwal · February 24, 2022, 7:03am

Hi Rene,

How can we classify and mark L2 traffic?
I have a scenario where 2 data centers are connected over L2 (BGP-EVPN) and we want to rate limit the Video, Voice, Mission-critical and default traffic between them. How can we do that? Can you please provide me with the configuration?

What should the cir/pir be, and what parameters should we consider in this case?

lagapidis · February 26, 2022, 7:32am

Hello Rohit

It is always preferable, whenever possible to apply QoS at Layer 3. Layer 3 QoS mechanisms are much more granular and you have more flexibility over marking, classification, and QoS mechanism application.

However, modern switches, even though they operate on Layer 2, are able to use both CoS and DSCP values to apply QoS. The following lesson shows how to apply classification and marking on a switch such that QoS is applied on an L2 interface.

In the particular example, there are three types of traffic that are marked and classified accordingly.

Once marked and classified, you can then apply queuing to act upon those markings as shown in the following lesson.

I hope this has been helpful!

Laz

davidguillo87 · February 24, 2023, 3:05pm

Hi Laz,

In the Shaping lesson, 1 token = 1 bit.
“Imagine we have a bucket….this bucket we will fill with tokens, and each token represents 1 bit”
Could you confirm this concept is different for Shaping and Policing?

Thanks !

lagapidis · February 28, 2023, 5:23am

Hello David

Yes indeed, you are correct. As stated in the Shaping lesson:

…each token represents 1 bit.

And in the policing lesson:

When it comes to policing, each token represents a single byte.

I hope this has been helpful!

Laz

ahmedamrici · May 27, 2023, 7:59pm

Hello Laz,

Kindly, let’s say I am policing my customer at 500Mbps. How do the single-rate two-color and single-rate three-color policers work? When a packet for my customer arrives and it is larger than the total number of tokens in the bucket, will the packet be dropped?