Path MTU Discovery Problems

NIR_067

It's not your router's fault!....unless
     ......You have been playing with your filters!

Description of Problem and Possible Causes

What is Path MTU Discovery and how can it affect users in VPN or PPPoE environments?

Applications' behavior:

Over some IP paths, a TCP/IP node may send small amounts of data (typically less than 1500 bytes) with no difficulty, but attempts to transmit larger amounts of data hang and then time out. The behavior is generally experienced in only one direction: large transfers may succeed one way but fail the other way. This problem is likely caused by a PMTUD failure, different LAN media types, or defective links. The symptoms seen in common applications are described below:

WEB / HTTP

The browser connects to the server, but no data is received. In another variation, only part of a web page loads and then the browser stalls.

FTP

The control connection to the server is established. For example, on Unix you can log in and change (cd) to other directories. You can also use the dir (ls) command if the directory contains only a small number of files. You can even put files. If you try to get a file, however, the transfer just hangs, unless the file is small enough to be unaffected.

Telnet

You can connect to the host and log in with a user name and password. However, the session hangs just after the successful login. No screen refresh is ever received, because the packets the telnet host tries to send back are too large.

SMTP (Send)

Sending mail with attachments will fail.

POP3

You may be able to log in to the server and read short messages. If you try to list the messages in a large mailbox or read a long message, the session hangs.

To correctly handle oversized packets a router should do one of the following:

  1. Packet fragmentation:

    In PMTU Discovery, router fragmentation is not possible because the workstation sends its TCP traffic in IP packets with the DF (Don't Fragment) bit set in the IP header. (This is typically not an issue with UDP.) The router is essentially being told not to fragment, so it discards the packet and sends an ICMP message back to the workstation reporting the next-hop MTU of the link.

    If a host were not using PMTU Discovery, the router would simply fragment the packet according to the next-hop MTU, and the fragments would be reassembled at the destination. This is not a very efficient method, and fragmentation can place high loads on router and server resources, but the application problems described above would not occur.

  2. PMTU Discovery:
    (Much of what is described below is summarized and quoted from RFC 1191 by J. Mogul).

    Path MTU Discovery is a technique for dynamically discovering the maximum transmission unit (MTU) of an arbitrary internet path. This process changes the way routers generate one type of ICMP message. Sounds like a good idea, right? Well, in most cases, yes. However, there are many instances where Path MTU Discovery does not work properly. This guide tries to explain the problems while offering some suggestions on how to fix them on your network.

    When one IP host has a large amount of data to send to another host, the data is transmitted as a series of IP datagrams. It is usually preferable that these datagrams be of the largest size that does not require fragmentation anywhere along the path from the source to the destination. This datagram size is referred to as the Path MTU (PMTU), and it is equal to the minimum of the MTUs of each hop in the path. A shortcoming of the older Internet protocol suite was the lack of a standard mechanism for a host to discover the PMTU of an arbitrary path.

    The older practice was to use the lesser of 576 and the first-hop MTU as the PMTU for any destination that is not connected to the same network or subnet as the source. In many cases, this resulted in the use of smaller datagrams than necessary, because many paths have a PMTU greater than 576. A host sending datagrams much smaller than the Path MTU allows is wasting Internet resources and probably getting suboptimal throughput due to not fully utilizing the maximum data payload possible.

    PMTUD is implemented by having an IP sender set the "Don't Fragment" (DF) flag in the IP header. If an IP packet with this flag set reaches a router whose next-hop link has too small an MTU to send the packet without fragmentation, that router discards the packet and sends an ICMP "Fragmentation needed but DF set" error to the IP sender. When the IP sender receives this Internet Control Message Protocol (ICMP) message, it should learn to use a smaller IP MTU for packets sent to this destination, and subsequent packets should be able to get through. The example below is such a message: it shows the ICMP message reporting the next-hop MTU to the sending workstation, named "Karpywork".

    Customers often report that the affected applications work with no problems when PPPoE client software is running on the workstation. However, when PPPoE is running on a router, they are not able to use certain applications. Pings from the workstation to the remote site without the DF bit set are also successful. The problem is that the router (R9100, 4541, 3320, 3546 or 3341) negotiates the MTU for the PPPoE link, while a machine on the LAN does not know that the Internet connection runs over PPPoE and only recognizes the MTU of the LAN (1500). The same holds for the server at the remote site, which also sees a first-hop MTU of 1500 bytes on its Ethernet connection.
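To illustrate why pings without the DF bit set still succeed, here is a minimal sketch of what a router does to an oversized packet when it is allowed to fragment. The function name `fragment` and the constant `PPPOE_MTU` are made up for this example, and the value 1492 is an assumption (the usual 1500-byte Ethernet MTU minus 8 bytes of PPPoE/PPP overhead); the MTU your router actually negotiates may be lower.

```python
IP_HEADER = 20  # bytes, assuming an IPv4 header without options


def fragment(total_len, next_hop_mtu, header_len=IP_HEADER):
    """Split one IP packet into fragments that fit next_hop_mtu.

    Returns a list of (total_length, offset_in_8_byte_units, more_fragments).
    """
    payload = total_len - header_len
    # Every fragment except the last must carry a multiple of 8 payload bytes.
    max_payload = (next_hop_mtu - header_len) // 8 * 8
    if max_payload <= 0:
        raise ValueError("MTU too small to carry any payload")
    fragments = []
    offset = 0
    while offset < payload:
        size = min(max_payload, payload - offset)
        more = offset + size < payload
        fragments.append((header_len + size, offset // 8, more))
        offset += size
    return fragments


PPPOE_MTU = 1500 - 8  # assumed: Ethernet MTU minus PPPoE/PPP overhead

for frag in fragment(1500, PPPOE_MTU):
    print(frag)
# (1492, 0, True)
# (28, 184, False)
```

A full-size 1500-byte packet from the LAN becomes one 1492-byte fragment plus a tiny 28-byte fragment, which is why fragmentation works but is inefficient. With DF set, the router cannot do this and must drop the packet instead.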

What the ICMP message looks like:

[Figure: Destination Unreachable message, modified Type 3, Code 4]
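As a rough illustration of the message layout, the sketch below builds and parses the first 8 bytes of a Type 3, Code 4 message as laid out in RFC 1191: type, code, checksum, an unused 16-bit field, and the 16-bit next-hop MTU, followed by the header of the original datagram. The function names are hypothetical, and the 28 zero bytes stand in for the original IP header plus 8 bytes of data.

```python
import struct


def icmp_checksum(data: bytes) -> int:
    """Internet checksum: one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_frag_needed(next_hop_mtu: int, original: bytes) -> bytes:
    """Build an ICMP type 3 / code 4 message in the RFC 1191 layout."""
    header = struct.pack("!BBHHH", 3, 4, 0, 0, next_hop_mtu)
    csum = icmp_checksum(header + original)
    return struct.pack("!BBHHH", 3, 4, csum, 0, next_hop_mtu) + original


def parse_frag_needed(message: bytes):
    """Return the next-hop MTU, or None if this is not type 3 / code 4."""
    icmp_type, code, _csum, _unused, mtu = struct.unpack("!BBHHH", message[:8])
    if (icmp_type, code) != (3, 4):
        return None
    return mtu


msg = build_frag_needed(1492, bytes(28))  # placeholder for the original header
print(parse_frag_needed(msg))  # 1492
```

RFC 1191 notes that routers predating the modification leave the MTU field at zero, so a parsed value of 0 means the sender has to guess a smaller size on its own.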

Various problems can cause the PMTUD algorithm to fail, so that the IP sender will never learn the smaller path MTU but will continue unsuccessfully to retransmit the too-large packet, until the retransmissions time out.
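The difference between working PMTUD and a failure can be sketched as a small simulation. This is a simplified model, not a real IP stack: `send_with_pmtud`, its parameters, and the example MTU values are all invented for illustration, and a "timeout" is modeled simply as running out of retries.

```python
def send_with_pmtud(link_mtus, first_hop_mtu, icmp_delivered=True, max_tries=5):
    """Simulate a sender with DF set probing a path of links.

    link_mtus      -- MTU of each hop, in order from sender to destination
    icmp_delivered -- False models a black hole: the ICMP error never arrives
    Returns (pmtu, attempts), or (None, attempts) when every try times out.
    """
    size = first_hop_mtu
    for attempt in range(1, max_tries + 1):
        # Find the first hop whose MTU is smaller than the packet.
        blocking = next((mtu for mtu in link_mtus if mtu < size), None)
        if blocking is None:
            return size, attempt      # packet reached the destination
        if icmp_delivered:
            size = blocking           # learn the next-hop MTU from the ICMP error
        # Otherwise the packet is silently lost and the same size is retransmitted.
    return None, max_tries


path = [1500, 1492, 1500]  # assumed: LAN, PPPoE link, remote LAN

print(send_with_pmtud(path, 1500))                        # (1492, 2)
print(send_with_pmtud(path, 1500, icmp_delivered=False))  # (None, 5)
```

In the first call the sender shrinks its packets after one ICMP error and the second attempt gets through. In the second call nothing ever tells the sender to shrink, so every retransmission is dropped at the same hop.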

PMTU Failures

Some problems include the following:

Router Problems
  1. The router with the too-small next-hop MTU fails to generate the necessary ICMP error message.
  2. A router in the reverse path between the small-MTU router and the IP sender discards the ICMP error message before it can reach the IP sender (for example, a firewall along the path blocks the ICMP messages the sender needs to use PMTUD properly). This relates to PMTU black holing (RFC 2923), which is described at:

    Diagnoses and Treatment of Black Hole Routers:
    http://support.microsoft.com/
    A search for black hole routers will return documents relevant to this issue.

Server and Workstation Problems

  1. The IP stack of the sending host ignores the received ICMP Destination Unreachable message. This case, which affected Windows 2000 Service Pack 2, is described in the following article.

    http://support.microsoft.com/
    A search for PMTU detection will return documents relevant to this issue.

    Solution: A workaround is to configure the IP workstation to disable PMTUD. This causes the IP sender to send its datagrams with the DF flag clear. When the large packets reach the small-MTU router, that router fragments them into multiple smaller ones. The fragments reach the destination, where they are reassembled into the original large packet.

  2. Filtering ICMP at the workstation also causes PMTUD to fail.

    Solution: Allow Type 3 Code 4 ICMP to be received by your workstation.

  3. The most common way to discover a PMTU issue is to ping with the DF bit set from a command prompt on your workstation:

    ping  -f  -l  1462  X.X.X.X

    X.X.X.X is the IP address of the server, or of any device to which you are testing the PMTU. If you receive "Packet needs to be fragmented but DF set" instead of a ping reply, the PMTU is lower than the size of the ping packet you specified. Ping again with ping  -f  -l  1452  X.X.X.X, and keep stepping the size down by ten until you receive a ping reply.

    For example, if the first ping reply you receive is at ping  -f  -l  1432  X.X.X.X after stepping down by ten, start increasing the size by one, to 1433. If 1433 returns "Packet needs to be fragmented but DF set", then 1432 is the largest size that gets a ping reply WITH the DF bit set. Note that the -l value counts only the ping data; the IP and ICMP headers add another 28 bytes, so a largest passing size of 1432 corresponds to a PMTU of 1460 bytes.

    After determining your PMTU you need to decide how to fix the problem. You could try to contact the person responsible for the black hole across the link. Unfortunately, this is often not possible. The easiest fix for an individual is either to turn off PMTU Discovery on the workstation or to set the MTU to the level you discovered with your ping test. There are applications that make the necessary registry changes for you; one example of such a PMTU configuration program is "DR. TCP". A web search will also turn up various resources for modifying your registry. DR TCP is NOT endorsed by Netopia and is listed only as a possible solution.

    WARNING: Modifying your registry should only be attempted by an experienced user. Please contact your network administrator, consultant or IT professional before attempting this configuration change. Netopia is not recommending this change! We are only pointing out that this modification is one of many possible solutions.
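The ping procedure from item 3 can be sketched in code as follows. This is only an outline of the search logic: `find_max_payload` and `fake_probe` are hypothetical names, and `fake_probe` is a stand-in that pretends the path MTU is 1460 rather than sending real pings. A real probe would have to run ping with the DF bit set and check whether a reply came back.

```python
ICMP_OVERHEAD = 28       # 20-byte IP header + 8-byte ICMP header
MAX_PAYLOAD = 65507      # largest ICMP payload that fits in an IPv4 packet


def find_max_payload(probe, start=1462, step=10, limit=MAX_PAYLOAD):
    """Step down by `step` until a reply arrives, then step up by one
    until a probe fails.

    probe(size) must return True when a DF-set ping carrying `size`
    payload bytes gets a reply. Returns the largest payload that passed,
    or None if no size got through.
    """
    size = start
    while size > 0 and not probe(size):
        size -= step
    if size <= 0:
        return None
    while size < limit and probe(size + 1):
        size += 1
    return size


def fake_probe(size, path_mtu=1460):
    # Stand-in for a real ping; the path MTU here is assumed for illustration.
    return size + ICMP_OVERHEAD <= path_mtu


payload = find_max_payload(fake_probe)
print(payload, payload + ICMP_OVERHEAD)  # 1432 1460
```

The second number printed is the payload plus the 28 bytes of IP and ICMP headers, which is the size of the whole packet on the wire and therefore the value to compare against an interface MTU.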

Firewall Issues

Network administrators often try to block all ICMP traffic across their network. Unfortunately, they do not turn off PMTUD on their servers or workstations, and consequently they break it. This is not a well-thought-out plan, since the hosts rely on ICMP Destination Unreachable messages for PMTUD while the network blocks those same messages. The other issue is that the server actually becomes vulnerable to DoS (Denial of Service) attacks, because stalled open sessions exhaust server resources.


© 2008 Netopia, Inc., a Motorola Company. All rights reserved.