MTU Troubleshooting on Cisco IOS

hi guys,

how can i picture the whole process with respect to mtu?
e.g. if a router creates an ip packet, will it look at the configured ip mtu before it creates the packet and size the packet according to that value? and if the packet is bigger than the configured value, will it be fragmented?

and when the packet is received at the destination, the receiving host will check the size of the packet and again compare it with the configured ip mtu: if smaller or same size -> no problem; if bigger and df bit set -> dropped; if bigger and df bit not set -> fragmentation.
but is the ip mtu checked first, and only if the packet passes that check is the l2 mtu checked?

thanks for any help!

Hello Florian.

It is important to realise that the MTU is a Layer 2 trait: it determines the largest payload that a frame on the link can carry. The MTU is specifically involved in the process of encapsulation from Layer 3 to Layer 2. In other words, it is the largest size in bytes of a packet that the link will accept inside a single frame. Look at it as the largest box the post office will accept for your payload. If your payload is too big, you’ll have to split it into two packages to send by mail.

So let’s say your computer is creating an IP packet that has a size of 1750 bytes. When encapsulation occurs to place it on the network, since the NIC has a default MTU of 1500, the packet is fragmented into two pieces of at most 1500 bytes each, and each piece is sent in its own frame.
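The arithmetic behind that example can be sketched in a few lines of Python. This is only an illustration, assuming a 20-byte IPv4 header without options; the function name `fragment_sizes` is made up for this sketch.

```python
IP_HEADER = 20  # bytes; assumes an IPv4 header without options

def fragment_sizes(packet_len, mtu):
    """Return the total length (header + data) of each IPv4 fragment."""
    if packet_len <= mtu:
        return [packet_len]  # fits as-is, no fragmentation needed
    payload = packet_len - IP_HEADER
    # Every fragment except the last must carry a multiple of 8 data bytes.
    max_data = (mtu - IP_HEADER) // 8 * 8
    sizes = []
    while payload > 0:
        data = min(payload, max_data)
        sizes.append(data + IP_HEADER)  # each fragment gets its own IP header
        payload -= data
    return sizes

print(fragment_sizes(1750, 1500))  # [1500, 270]
```

Note that the two fragments add up to 1770 bytes rather than 1750, because the second fragment needs its own IP header.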

Moving the example to a router: suppose a router has an MTU of 1500 on one interface and 1450 on another, and a frame carrying a packet of 1490 bytes arrives on the first interface. The frame is de-encapsulated, the IP addresses are examined, and the route is determined. Let’s say the egress interface is the one with an MTU of 1450. When the router re-encapsulates the packet into a frame, it looks at the MTU and sees that the packet cannot be placed into one frame, since the packet is larger than the MTU allowed on that interface. It fragments the packet and sends it as two frames.

If, however, the DF bit is set, the router will not fragment the packet into multiple frames, but will drop it.
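The decision the router makes at the egress interface can be summarised as a small Python sketch. The function name `egress_action` and the string results are hypothetical and only serve the illustration; a real router does considerably more than this.

```python
def egress_action(packet_len, egress_mtu, df_set):
    """Decide what happens to a packet at the egress interface."""
    if packet_len <= egress_mtu:
        return "forward"   # fits into one frame as-is
    if df_set:
        return "drop"      # too big, and fragmentation is not allowed
    return "fragment"      # too big, but the router may split it

# The 1490-byte packet from the example, leaving via the 1450 MTU interface:
print(egress_action(1490, 1450, df_set=False))  # fragment
print(egress_action(1490, 1450, df_set=True))   # drop
```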

I hope this has been helpful!

Laz

hi laz,

thanks a lot for your time.
your description is with respect to the layer 2 mtu, right? but in the article there is also an IP mtu mentioned.
what is that about then?

thanks
florian

Hello again Florian.

Yes, you are correct, I was describing the Layer 2 MTU. The IP MTU is an additional parameter that you can configure on an interface such that all outgoing IP packets (payload + header) will never be larger than this value. Anything larger will be fragmented before it is sent out of the interface.

_Note that this is a parameter that can only be implemented on a layer 3 interface, that is, on an interface that has an IP address. Any layer 2 interfaces will not have this capability._

In Rene’s example, for the purposes of the lesson, he set the Layer 2 MTU to something smaller than the IP MTU on the same interface. This means that no IP packets larger than the Layer 2 MTU (which was set at 1460) would successfully be transmitted from this interface. You would never put such a configuration on a router’s interface in a production network. The IP MTU must be less than or equal to the Layer 2 MTU in order for traffic to be forwarded.

In the real world, the IP MTU parameter would be adjusted if you know that you have links downstream with smaller Layer 2 MTUs, so that no fragmentation takes place when encapsulating from Layer 3 to Layer 2 and no frames are dropped because of MTU restrictions.

By adjusting both the IP MTU and the Layer 2 MTU at the appropriate interfaces on your network, you can ensure that no frames will be dropped because of MTU restrictions.

I hope this has been helpful!

Laz

Hi Laz,

thanks again for your reply. Really appreciate your time!

So this means that the IP MTU parameter would only be useful for packets originated from the device where the IP MTU command is set? If a packet is created the IP MTU is checked and according to that value the packet will be fragmented or not?

As for packets arriving at a router the media MTU would be checked!?

Is that correct?

Thanks
Florian

Hello Florian.

The IP MTU parameter is helpful when you know that the packets, as they move downstream from your router, will encounter Layer 3 MTUs smaller than the default 1500 bytes.

(An example of such a situation is the traversal of a QinQ VLAN that requires an additional 4 bytes in the header of the Layer 2 frame which results in a maximum MTU of 1496.)

Like I mentioned before, it is unlikely that you would configure a Layer 2 MTU parameter smaller than a Layer 3 MTU parameter on the same port. This would result in any IP packets larger than the Layer 2 MTU attempting to exit that port being dropped if the DF flag is 1 or being fragmented if the DF flag is 0.

As for packets arriving at the router on that port, yes, the Layer 2 MTU would be checked.

I hope this has been helpful!

Laz

Hi Laz,

thanks for your help!

Regards
Florian

Hello everyone,
On several occasions I have seen problems with secure protocols such as HTTPS, FTPS and SSH when PMTUD is disabled or ICMP is filtered, and large packets are sent. It is usually solved by reducing the MSS, but I do not know why. Could anyone tell me what the reason is?

Hello Diego.

The behaviour you describe is due to the way in which secure protocols deal with packet fragmentation. HTTPS, for example, sets the DF (Don’t Fragment) flag in the IP header to 1. SSH does the same. This tells routers along the path to drop the packet if it is larger than the MTU of the outgoing link. If this occurs, the router sends back an ICMP Fragmentation Needed message, which indicates that the data has to be sent again, this time in packets small enough that no fragmentation is needed. This functionality is the Path MTU Discovery mechanism, or PMTUD. You can now see why, if PMTUD is disabled or ICMP is filtered, secure protocols sending large packets can fail to function.
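To make the failure mode concrete, here is a deliberately simplified Python model of PMTUD. It assumes every packet is sent with DF set and that each ICMP Fragmentation Needed message reports the MTU of the link that rejected the packet; the function name and the hop MTU values are invented for the sketch.

```python
def send_with_pmtud(packet_len, hop_mtus, icmp_filtered):
    """Return the packet size that finally gets through, or None if black-holed."""
    size = packet_len
    while True:
        # First link along the path whose MTU is too small for the packet.
        bottleneck = next((m for m in hop_mtus if m < size), None)
        if bottleneck is None:
            return size      # every link can carry the packet
        if icmp_filtered:
            return None      # packet dropped, sender never learns why
        size = bottleneck    # sender retries with the reported MTU

print(send_with_pmtud(1500, [1500, 1400, 1500], icmp_filtered=False))  # 1400
print(send_with_pmtud(1500, [1500, 1400, 1500], icmp_filtered=True))   # None
```

With ICMP filtered the sender keeps retransmitting full-size packets that are silently dropped, which is the black hole that shows up as a hanging connection.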

If PMTUD and ICMP cannot be used for whatever reason, then the other solution is to reduce the Maximum Segment Size (MSS) so that a full segment plus its TCP and IP headers (40 bytes without options) still fits within the smallest MTU on the path. This forces a host to create TCP segments that never need fragmenting along the way. Unlike PMTUD, however, this has to be configured manually. Note that the MSS applies at Layer 4: it limits the amount of TCP data carried in a single segment. If the MSS plus headers is no larger than the MTU, there will always be a one-to-one ratio of segments to packets, and those packets will never exceed the MSS plus the header overhead. This removes the possibility of larger packets running into smaller MTU limitations.
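As a rough sketch of how such an MSS value can be derived, assuming IPv4 and TCP headers of 20 bytes each with no options (the path MTUs in the example are hypothetical):

```python
IP_HEADER = 20   # bytes, IPv4 without options
TCP_HEADER = 20  # bytes, TCP without options

def mss_for_path(path_mtus):
    """Largest TCP payload that fits the smallest MTU on the path."""
    return min(path_mtus) - IP_HEADER - TCP_HEADER

print(mss_for_path([1500]))              # 1460
print(mss_for_path([1500, 1492, 1500]))  # 1452
```

If TCP options or tunnel headers are in play, the overhead is larger and the MSS has to be reduced further.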

I hope this has been helpful!

Laz

Hello Laz

Thank you very much for your reply.
Almost all hosts have PMTUD enabled by default, or at least all Windows-based, so they send the packets with the df set to 1. Therefore, this should happen for both https and http, but only occurs with secure traffic. Even applying a policy in the CPE to force the df to 0, I have been able to verify that it also happens.
At the protocol level, the only difference I find between them is the TLS overhead, but that is at layer 6!!, so there should be no difference.
Searching on Google, I always find the same answer, “you should adjust the MSS”, which I have been able to verify works, but I do not know the reason. For example: http://forums.juniper.net/t5/SRX-Services-Gateway/SRX-110-HTTPS-sites-not-loading/m-p/290266

Thanks for your help

Hello Diego

This is a very interesting topic and it brings up a lot of related issues as well. After doing some research I’ve found the following:

Using HTTP and HTTPS as an example, in general, servers and clients using HTTPS set the DF flag on packets while HTTP clients and servers do not. This is not based on an RFC or a standard, but it is the general behaviour seen in practice. Several tests run with Wireshark show this to be true. However, it really depends on the implementation of the web server/client. This of course can be changed on both ends. So, IN GENERAL, HTTP does not set the DF while HTTPS does.

Extending this reasoning, secure protocols IN GENERAL have DF set while insecure protocols do not. When SSL is used for example, DF is set.

Having said that, this is not always the case. There are cases where PMTUD does function correctly with HTTPS and other secure protocols. Your personal experience may have been such that the network setup didn’t support PMTUD with HTTPS for whatever reason.

The bottom line however, is that if PMTUD does not function because of ICMP filtering for example, a workaround is to make the MSS smaller than the presumed smallest MTU in the path in question, thus forcing all frames to be smaller and preventing the dropping of frames.

I hope this has been helpful!

Laz

Hi Laz, I’m happy to talk to you again.
Since Windows 2000, all Windows versions have PMTUD active, so they send IP packets with DF set to 1, for any protocol and regardless of the role they play (client or server). I am trying to find out if the same thing happens in Linux.
I worked at an ISP, and by default all our customers’ CPEs were configured with a policy on the LAN interface to set the DF bit of the customer’s traffic to 0, since the network core filtered ICMP traffic. According to your explanation, with this configuration the problem should not occur, and yet it happened!!! :rage:, so we had to adjust the TCP MSS.

Hello Diego.

So let me go through your scenario as I think out loud:

So if your core network filters ICMP, PMTUD cannot be used successfully. This means that your Windows machines are attempting to use PMTUD but are getting no responses since ICMP is filtered. So they’re sending their data with the DF bit at 1.

When your CPE device receives the packet, it clears the DF bit to 0, thus allowing for fragmentation. HTTP data going through your core network is successfully transmitted, however HTTPS data is not. (Unless of course you adjust MSS) Correct?

So this means that for some reason, your CPE device is unsuccessful in clearing the DF bit for secure protocols like HTTPS.

With the information that we have, it seems to me that the clearing of the DF bit by the CPE may not actually be successful for secure protocols and that is why it is necessary to resort to tweaking the MSS. However, in order to find out why, get back to us with the following:

Questions and suggestions:

  1. What type of CPE device are you using? Is it Cisco?
  2. What is the config on the CPE that is clearing the DF bit to 0?
  3. Can you get a wireshark output of packets after they egress the CPE and check if the DF bit is actually set for both HTTP and HTTPS?

Laz

Hi,
So much is being said here. Thanks all. I want to understand… if I ping -l 1600 X.X.X.X, or with any packet larger than 1472, to or through my switch, it gets dropped.
It seems everywhere else on the network I can ping with large packets, but the moment I go through this switch I can’t. What could be blocking this?
I thought that when the switch is configured with mtu 1500, this means that all frames smaller than 1500 will be allowed through and all larger frames will be fragmented before being sent out, so no frames are dropped at all?

Switch#sh system mtu
System MTU size is 1500 bytes
System Jumbo MTU size is 9198 bytes
System Alternate MTU size is 1500 bytes
Routing MTU size is 1500 bytes

Hello Sion

When you ping using the -l option in Windows, you are specifying the size of the ICMP payload, in your case 1600 bytes. The headers add another 28 bytes (IP header 20 bytes + ICMP header 8 bytes = 28 bytes), so you are actually sending a packet of 1628 bytes. If you specify 1472 as the size, then the actual size of the packet is 1472 + 28 bytes of headers = 1500 bytes.
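The same arithmetic as a short Python sketch, assuming an IPv4 header without options; the function names are made up for the illustration:

```python
IP_HEADER = 20   # bytes, IPv4 without options
ICMP_HEADER = 8  # bytes, ICMP echo header

def packet_size_from_payload(payload):
    """Size of the IP packet produced by `ping -l <payload>` on Windows."""
    return payload + IP_HEADER + ICMP_HEADER

def max_payload_for_mtu(mtu):
    """Largest `-l` value that still fits in one unfragmented packet."""
    return mtu - IP_HEADER - ICMP_HEADER

print(packet_size_from_payload(1600))  # 1628
print(max_payload_for_mtu(1500))       # 1472
```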

So this behaviour is correct since the settings on the switch are at 1500 bytes for the MTU.

I hope this has been helpful!

Laz

Hi Lazaros,

Ok. So then, how do i allow bigger packets to go through?
i can ping elsewhere through the network with “ping x.x.x.x size 5000” with no issue, but not through that switch.
I have seen networks where I’d usually test the speed of the link by sending large packets.

Switch#ping 50.1.1.1
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 50.1.1.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/1/1 ms

Switch#ping
Protocol [ip]:
Target IP address: 30.1.1.1
Repeat count [5]: 1
Datagram size [100]:
Timeout in seconds [2]:
Extended commands [n]: y
Source address or interface:
Type of service [0]:
Set DF bit in IP header? [no]: y
Validate reply data? [no]:
Data pattern [0xABCD]:
Loose, Strict, Record, Timestamp, Verbose[none]: v
Loose, Strict, Record, Timestamp, Verbose[V]:
Sweep range of sizes [n]: y
Sweep min size [36]: 1472
Sweep max size [18024]: 1600
Sweep interval [1]:
Type escape sequence to abort.
Sending 129, [1472..1600]-byte ICMP Echos to 50.1.1.1, timeout is 2 seconds:
Packet sent with the DF bit set
Reply to request 0 (1 ms) (size 1472)
Reply to request 1 (8 ms) (size 1473)
Reply to request 2 (1 ms) (size 1474)
Reply to request 3 (9 ms) (size 1475)
Reply to request 4 (1 ms) (size 1476)
Reply to request 5 (8 ms) (size 1477)
Reply to request 6 (1 ms) (size 1478)
Reply to request 7 (1 ms) (size 1479)
Reply to request 8 (1 ms) (size 1480)
Reply to request 9 (1 ms) (size 1481)
Reply to request 10 (8 ms) (size 1482)
Reply to request 11 (1 ms) (size 1483)
Reply to request 12 (1 ms) (size 1484)
Reply to request 13 (1 ms) (size 1485)
Reply to request 14 (1 ms) (size 1486)
Reply to request 15 (1 ms) (size 1487)
Reply to request 16 (1 ms) (size 1488)
Reply to request 17 (9 ms) (size 1489)
Reply to request 18 (1 ms) (size 1490)
Reply to request 19 (8 ms) (size 1491)
Reply to request 20 (1 ms) (size 1492)
Reply to request 21 (9 ms) (size 1493)
Reply to request 22 (1 ms) (size 1494)
Reply to request 23 (8 ms) (size 1495)
Reply to request 24 (1 ms) (size 1496)
Reply to request 25 (8 ms) (size 1497)
Reply to request 26 (1 ms) (size 1498)
Reply to request 27 (9 ms) (size 1499)
Reply to request 28 (1 ms) (size 1500)
Request 29 timed out (size 1501)
Request 30 timed out (size 1502)
Request 31 timed out (size 1503)
Request 32 timed out (size 1504)
Request 33 timed out (size 1505)
Request 34 timed out (size 1506)
Request 35 timed out (size 1507)
Request 36 timed out (size 1508)
Request 37 timed out (size 1509)
Request 38 timed out (size 1510)
Request 39 timed out (size 1511)
Request 40 timed out (size 1512)
Request 41 timed out (size 1513)
Request 42 timed out (size 1514)
Request 43 timed out (size 1515)
Request 44 timed out (size 1516)
Request 45 timed out (size 1517)
Request 46 timed out (size 1518)
Success rate is 61 percent (29/47), round-trip min/avg/max = 1/3/9 ms
Switch#
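A sweep like the one above can be read off by eye, but for longer captures a few lines of Python can pull out the largest size that got a reply. This is a sketch that assumes the output has been saved as text in exactly the format shown; `largest_reply_size` is a hypothetical helper name.

```python
import re

def largest_reply_size(ping_output):
    """Return the largest datagram size that received a reply, or None."""
    pattern = r"^Reply to request \d+ \(\d+ ms\) \(size (\d+)\)"
    sizes = [int(s) for s in re.findall(pattern, ping_output, re.MULTILINE)]
    return max(sizes) if sizes else None

sample = """\
Reply to request 27 (9 ms) (size 1499)
Reply to request 28 (1 ms) (size 1500)
Request 29 timed out (size 1501)
"""
print(largest_reply_size(sample))  # 1500
```

In the sweep above the DF bit is set and the last reply is for size 1500, which matches the 1500-byte MTU shown earlier for the switch.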

Hello Sion

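You wrote before that the MTU configuration for your switch is:

System MTU size is 1500 bytes
System Jumbo MTU size is 9198 bytes
System Alternate MTU size is 1500 bytes
Routing MTU size is 1500 bytes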

Assuming there is no routing taking place on this switch, the system MTU size is what is restricting larger frames. You must increase this appropriately to allow for larger frames.

I hope this has been helpful!

Laz

Hi Laz,

There’s no routing configured on the switch. Apart from the ip gateway config.
The highest MTU size that can be configured on the switch is 1998.

Switch(config)#system mtu ?
  <1500-1998>  MTU size in bytes
  jumbo        Set Jumbo MTU value for GigabitEthernet or TenGigabitEthernet interfaces
  routing      Set the Routing MTU for the system
Switch(config)#system mtu

In short, what I’m trying to do is ping with large packets, even 5000 bytes, as this goes through on my other network devices. I can increase the system MTU size to its maximum (1998 bytes), but then how about 5000 bytes?

How can I resolve this, so that I can do a ping -l 5000 successfully?

Hi All,
Thanks for the effort in trying to assist me.
I found the reason why I couldn’t ping with large packets. It was because of a setting on my ASA firewall that prevented fragmented packets from going through.

The link below has more info.
http://www.cisco.com/c/en/us/td/docs/security/asa/asa82/configuration/guide/config/conns_protect.html