WRED (Weighted Random Early Detection)

This topic is to discuss the following lesson:

Hello Rene,

Thank you very much for the lesson.

But I’m still a little bit confused. Please correct me if I’m wrong: based on the command below, if it’s set, the AF drop probability will be considered:
random-detect dscp-based

Now, if we have AF21 and AF33, the classes are different, but the probability of dropping a packet from AF33 is higher than from AF21, correct?
What about packets marked AF21 and AF31? And what if we have AF21, EF, CS3, and CS4?

Also, what is the meaning of the fair-queue command? What is its impact when used in a policy map?

Thank you,
Samer Abbas

To add to this: I’m looking for the drop probability of these markings if they are in the same policy map with random-detect dscp-based.


Hello Samer

Class 4 has the highest priority, so if you have AF33, it will have a lower drop probability than AF21, for example. But within the same class, the higher the second digit, the higher the drop probability, so AF13 is more likely to be dropped than AF11. So yes, you are correct.

AF31 is in a higher priority class than AF21, so AF21 has the higher drop probability.

EF traffic is also considered part of the DSCP-based WRED procedure and is given an even higher priority than the AF markings. As for CS3 and CS4, they have a higher drop probability than the AF series of values.

Ultimately, when using the random-detect dscp-based command, you are telling the device to use the six-bit DSCP value as the criterion for random drops. As Cisco documentation states, the “dscp-based argument enables WRED to use the DSCP value of a packet when calculating drop probability.” This means that the whole value (6 bits) is used, which means that CS, AF, and EF values are all considered in this calculation.
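The ordering described above falls straight out of the DSCP bit layout. Here is a minimal sketch (illustrative Python, nothing Cisco-specific) that decodes an AFxy marking into its 6-bit DSCP value, where x is the class and y the drop precedence:

```python
# Illustrative sketch: an AFxy codepoint encodes class x in the high bits
# and drop precedence y in the low bits, giving DSCP = 8x + 2y.
def af_dscp(name: str) -> int:
    """Decode an 'AFxy' marking into its DSCP decimal value."""
    af_class, drop_prec = int(name[2]), int(name[3])
    return 8 * af_class + 2 * drop_prec

# Same class, higher second digit -> higher drop probability:
print(af_dscp("AF11"), af_dscp("AF13"))  # 10 14
# Different classes, same drop precedence digit:
print(af_dscp("AF21"), af_dscp("AF31"))  # 18 26
```

EF (DSCP 46) and the CSx values (DSCP 8x, so CS3 = 24 and CS4 = 32) occupy other codepoints in the same 6-bit space, which is why DSCP-based WRED can treat them all in one calculation.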

You are also able to optionally specify the minimum and maximum packet thresholds for the DSCP value using the random-detect dscp command.

More info can be found here:


The fair-queue command is used to implement Distributed WRED (DWRED). It’s a feature only available in the 7000 series routers. This specific command specifies the number of queues to be reserved for the specific traffic class. You can find out more about DWRED here:

I hope this has been helpful!


Hi There,

Can you explain the average queue size calculation in WRED? Is the queue size calculated based on the bandwidth available to the specific queue, or on the total available bandwidth of the link?


Hello Ranganna

When WRED calculates the average queue size, it does so based on the actual size of the real queue. Specifically, the average is recalculated periodically, every few milliseconds, using the following formula:

new_average = o + (c - o) / 2^n

  • o is the old average calculated the previous time
  • n is the weight factor (exponential weighting constant) you configure
  • c is the current (instantaneous) queue size
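As a sketch of how this averaging smooths out bursts (variable names are illustrative; n is the configured exponential weighting constant):

```python
# Sketch of the WRED average queue depth calculation:
#   new_average = old_average + (current - old_average) / 2^n
def wred_average(old_avg: float, current: int, n: int) -> float:
    return old_avg + (current - old_avg) / (2 ** n)

# A sudden burst moves the average only gradually; larger n smooths more:
avg = 0.0
for depth in [40, 40, 40]:      # instantaneous queue depth samples
    avg = wred_average(avg, depth, n=3)
print(round(avg, 2))  # 13.2
```

Even after three samples at an instantaneous depth of 40, the average has only climbed to about 13, which is exactly why WRED reacts to sustained congestion rather than to momentary bursts.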

The maximum size of the physical queue will depend on what kind of interface we’re talking about and what platform it is functioning on.

I hope this has been helpful!



Perhaps a note would be useful mentioning that the instantaneous queue depth is used for tail drop (Exponential Weighting Constant chapter).

Also, in the formula for the average queue depth, (instantaneous_old_average) should be changed to (instantaneous - old_average).

Many thanks,

Thanks Stefanita. I fixed this and added something about the instantaneous queue depth.




Is the MPD the number of packets that will be dropped (1 in 4 in the lesson, for 25%) before we reach the maximum threshold, after which everything is dropped?
So does 25% MPD mean that 1 out of 4 packets will be dropped once the minimum threshold has been breached, and does it continue like that until the maximum threshold is reached, at which point all packets are dropped? In other words, as the number of packets grows, the average queue depth rises, but the drop rate stays at 1 in 4 until the maximum threshold is reached, and then everything is dropped?

And also…is the number of packets configured in the thresholds the number of packets in the queue when there is congestion?

Thanks again

Hello Michael

Let me answer your last question first:

The thresholds can be defined either as number of packets, bytes, or even milliseconds, all three of which can define the “fullness” of a queue. But keep in mind that queues will only exist when there is congestion. If there is no congestion, there is no queue, and there are actually no active QoS mechanisms. If there’s no congestion, every single packet that arrives at the interface is served immediately. So, the thresholds define the average number of packets (bytes/milliseconds) in the queue, a value which will be greater than zero only when there is congestion.

To be more precise, the MPD is the maximum percentage of packets that will be dropped as the average queue depth approaches the maximum threshold.

Remember that the discard probability is a function of the current average depth of the queue:

  • For values between 0 and the minimum threshold, discard probability is zero
  • For values between minimum and maximum threshold, the discard probability ranges linearly between 0% and the MPD value.
  • For values above the maximum threshold, discard probability is 100%

So a discard probability of MPD is approached as the average queue depth approaches the maximum threshold.
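The three cases above can be sketched as a small function (the threshold and MPD values below are illustrative, not Cisco defaults):

```python
# Sketch of the WRED discard probability curve described above.
def drop_probability(avg_depth: float, min_th: float,
                     max_th: float, mpd_pct: float) -> float:
    if avg_depth < min_th:
        return 0.0           # below the minimum threshold: never drop
    if avg_depth >= max_th:
        return 100.0         # above the maximum threshold: drop everything
    # between the thresholds: ramp linearly from 0% up to the MPD
    return mpd_pct * (avg_depth - min_th) / (max_th - min_th)

print(drop_probability(10, 20, 40, 25))  # 0.0
print(drop_probability(30, 20, 40, 25))  # 12.5
print(drop_probability(45, 20, 40, 25))  # 100.0
```

With a minimum threshold of 20, a maximum of 40, and an MPD of 25%, an average depth of 30 (halfway up the ramp) gives a 12.5% discard probability.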

One of the things that confused me initially was my interpretation of graphs like this:

We are so used to interpreting the X axis as time that, for me, it was strange that the discard probability stays at 100% to infinity. But remember that over time the average queue depth continually changes, up and down, so you continually move back and forth along the curve (the green line) in the graph.

So you can see how the discard probability changes over time, as the average queue depth changes. The thresholds and the MPD are simply parameters that are used to define the graph, and ultimately the behaviour of the mechanism.

I hope this has been helpful!


Hi Laz,

I think I understand what you mean now. So 25% in the lesson, for example, is the maximum proportion of packets (1 in every 4) that will be dropped before the average queue depth reaches the maximum threshold, after which all packets are dropped.
So, for example, the green line could be in the middle at around 12-13% discard probability, and that doesn’t mean 1 in every 4 packets will be discarded; it could be 1 in every 8 because it’s halfway and hasn’t reached the maximum MPD yet?
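A quick arithmetic check of that reading, using the lesson’s 25% MPD (the numbers are illustrative):

```python
# Halfway between the thresholds, the linear ramp gives half the MPD.
mpd = 25.0                     # maximum drop probability in percent
halfway = mpd * 0.5            # 12.5% at the midpoint of the ramp
print(halfway, 100 / halfway)  # 12.5 8.0 -> roughly 1 in 8 packets
```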

Thanks again

Hi Michael

Yes, that’s it exactly, you got it. The number of packets that will be dropped changes depending on how the average queue depth changes over time.


I’m glad this has been helpful!




In the lesson it’s mentioned that the ICMP packets won’t be shaped. Why? Is this because class-default does not trigger on ICMP packets? (I know this is the behaviour of ASAs, but I’m not sure about IOS.)
It’s also mentioned in the lesson that WRED only works with TCP, but ICMP packets are being dropped too. Could you maybe shed some light on this?


Hello Michael

In the lesson, Rene states that:

Since my pings use ICMP it won’t slow down R1’s traffic rate…

The statement “slow down R1’s traffic rate” is not referring to shaping, but to the TCP slow start process. Since ICMP is its own IP protocol, not TCP, and is not involved in any TCP sessions, sending these ICMP packets will not trigger any slow start mechanism.

The random drops that WRED executes are applied to all traffic; WRED does not distinguish between TCP and UDP traffic. However, the goal of WRED is to discard random packets (many of which belong to TCP sessions) in order to slow down the TCP sessions in such a way as to avoid TCP global synchronization.

Now it is up to the network designer to examine the type of traffic that exists on the network, and to determine if WRED is indeed useful for the network or not. If most traffic is not TCP, then WRED will not be of any benefit. If there is substantial TCP traffic however, it is definitely useful.

I hope this has been helpful!



I have one question regarding using ping with a datagram size. How would Rene know that if he specifies a datagram size of 160 there won’t be any tail drops, but if it’s 170 there will be? Of course, based on his setup.


Hello Helen

The quick answer is that he doesn’t. During the making of this lesson, he had to experiment to find what ping sizes would bring about the results he wanted, so that the output would be meaningful and show the mechanisms in action.

Unlike the Windows or Linux operating systems, the ping function on Cisco devices will send the next echo request as soon as the echo reply to the previous one is received. This means that the actual data rate the ping produces depends on the round-trip time, and will thus be variable. After trial and error, he found that using a byte size of 160 didn’t cause tail drops, while a higher value did.
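As a rough sketch of why the offered rate depends on the round-trip time (the numbers below are illustrative, not taken from the lesson):

```python
# Because IOS ping sends the next echo request as soon as the reply to the
# previous one arrives, the offered rate is roughly size / round-trip time.
def ping_rate_bps(size_bytes: int, rtt_ms: float) -> float:
    return size_bytes * 8 * 1000 / rtt_ms

# e.g. 160-byte pings with a 10 ms round-trip time:
print(ping_rate_bps(160, 10))  # 128000.0
```

A longer round-trip time lowers the rate, and a larger datagram raises it, which is why a small change in size can push the queue past its thresholds.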

I hope this has been helpful!