The Internet as a Connectionless, Datagram Network

Most people are used to working with the Internet as a connection-oriented network. Your node sets up a connection to another node using the “three-way handshake” (SYN, SYN-ACK, ACK), messages are streamed back and forth over that connection, and when the traffic is done the connection is torn down. The teardown is not a second three-way handshake: normally each side sends a FIN and the other side acknowledges it, for four segments in all.

Most of the Internet protocols people are familiar with work this way, including HTTP (Web), SMTP (E-mail), and FTP (File Transfer). Only a few protocols in widespread use are not connection-oriented, notably most DNS queries and VoIP media streams. Those protocols are “connectionless”: there is no setup or teardown, and each transmission is a single, isolated packet or “datagram”.

The Transport Layer provides two main protocols, the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP). UDP is a very simple “thin” wrapper that adds a small header (source port, destination port, length, and checksum) to the data carried in an IP packet and does little else.
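To make the “thin wrapper” concrete, the following sketch packs and unpacks the 8-byte UDP header using Python's struct module. The function names and the port numbers are chosen only for illustration, and the checksum is left at zero as a simplification (a real stack computes it over the header, the payload, and parts of the IP header).

```python
import struct

# The UDP header is four 16-bit fields in network byte order:
# source port, destination port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prepend a UDP header to the payload (checksum left at 0 for simplicity)."""
    length = UDP_HEADER.size + len(payload)  # length covers header plus payload
    return UDP_HEADER.pack(src_port, dst_port, length, 0) + payload


def parse_udp_datagram(datagram: bytes) -> dict:
    """Split a UDP datagram back into its header fields and payload."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack(
        datagram[:UDP_HEADER.size]
    )
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,
        "checksum": checksum,
        "payload": datagram[UDP_HEADER.size:length],
    }


datagram = build_udp_datagram(50000, 53, b"hello")
print(parse_udp_datagram(datagram))
```

Everything after those eight bytes is application data, which is why UDP adds so little overhead compared with TCP.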

Datagrams are sent on a “best effort” basis. The sender transmits the packet and hopes for the best. If no response is received, either the packet went astray on the way to the destination or the response went astray on the way back; the sender cannot tell which. In either case, the application usually retries the packet some number of times and, if there is still no response, reports that it was unable to reach the destination.
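The retry behaviour described above can be sketched with Python's socket module. This is a minimal illustration, not a production design: send_with_retries, lossy_echo_server, and demo are hypothetical names, the retry count and timeout are arbitrary, and the “server” is a local test helper that deliberately ignores the first two datagrams to simulate loss.

```python
import socket
import threading


def send_with_retries(message, address, retries=3, timeout=0.5):
    """Send a datagram and wait for a reply, resending on each timeout.

    Returns (reply, attempts_used); raises TimeoutError if every attempt fails.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        for attempt in range(1, retries + 1):
            sock.sendto(message, address)      # fire and hope for the best
            try:
                reply, _ = sock.recvfrom(2048)
                return reply, attempt
            except socket.timeout:
                continue                       # request or reply was lost
    raise TimeoutError(f"no reply after {retries} attempts")


def lossy_echo_server(sock, drop_first):
    """Test helper: ignore the first `drop_first` datagrams, then echo one."""
    dropped = 0
    while True:
        data, peer = sock.recvfrom(2048)
        if dropped < drop_first:
            dropped += 1                       # simulate a lost packet
            continue
        sock.sendto(data, peer)
        return


def demo():
    """Run the client against a loopback server that drops two datagrams."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))              # port 0: let the OS pick a port
    worker = threading.Thread(
        target=lossy_echo_server, args=(server, 2), daemon=True
    )
    worker.start()
    try:
        return send_with_retries(b"ping", server.getsockname())
    finally:
        worker.join(timeout=1)
        server.close()


if __name__ == "__main__":
    reply, attempts = demo()
    print(reply, "received on attempt", attempts)
```

Note that the client cannot distinguish a lost request from a lost reply; it simply sends again. A real protocol would usually also tag each request with an identifier so that a late reply to an earlier attempt is not mistaken for the answer to a later one.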

TCP, on the other hand, is a very complex “thick” layer that imposes a connection-oriented paradigm on IP packets. It allows you to think in terms of streams of data instead of discrete packets. A single email message or web page might require quite a few packets to send, but with TCP you don’t need to worry about this: TCP splits the stream into packets and reassembles them in order at the other end. It also detects errors and retransmits lost packets for you.
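As a rough illustration of the stream model, the sketch below sends 100,000 bytes over a loopback TCP connection with a single call and reads them back on the other side. The names collect_stream_server and demo are invented for this example. The application never sees individual packets: the receiver simply keeps reading until the sender closes the connection.

```python
import socket
import threading


def collect_stream_server(listener, chunks):
    """Accept one connection and read the byte stream until the peer closes."""
    conn, _ = listener.accept()          # connection setup has completed here
    with conn:
        while True:
            chunk = conn.recv(4096)      # returns whatever bytes have arrived
            if not chunk:                # empty read: the peer closed its side
                break
            chunks.append(chunk)


def demo():
    """Send a large message over loopback TCP; return (sent, received)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a port
    listener.listen(1)
    chunks = []
    worker = threading.Thread(
        target=collect_stream_server, args=(listener, chunks)
    )
    worker.start()

    message = b"x" * 100_000             # far larger than one packet
    with socket.create_connection(listener.getsockname()) as client:
        client.sendall(message)          # TCP splits this into segments
    # Leaving the with-block closes the connection, which ends the stream.

    worker.join()
    listener.close()
    return message, b"".join(chunks)


if __name__ == "__main__":
    sent, received = demo()
    print(len(sent), "bytes sent,", len(received), "bytes received")
```

The number of recv calls the server needs is not predictable and does not matter; only the byte sequence is guaranteed, which is exactly the difference from a datagram protocol.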

At the Internet Layer (with IP), the Internet is a connectionless network. IP doesn’t even know about “port numbers”; those belong to the Transport Layer. It would be very difficult for most network programmers to use IP directly. Fortunately, we have the Transport Layer to hide this complexity from us. But both IPv4 and IPv6 are connectionless protocols that work in terms of datagrams, not streams. They are also “unreliable” protocols. That is a technical term: it doesn’t mean that IP was badly designed or faulty. It just means that IP itself does not detect lost packets or retransmit them, as TCP does. A programmer working directly with datagrams must explicitly handle lost packets and do retries as necessary. When using protocols over TCP, that is done for the programmer.
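To see that IP really has no notion of ports, here is a small sketch that builds and parses a basic 20-byte IPv4 header (no options) with Python's struct module. The function names are made up for this example, the addresses come from the ranges reserved for documentation, and the header checksum is left at zero as a simplification.

```python
import socket
import struct

# Basic IPv4 header, no options: 20 bytes in network byte order.
# version/IHL, DSCP/ECN, total length, identification, flags/fragment offset,
# TTL, protocol, header checksum, source address, destination address.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")

PROTO_TCP = 6    # protocol numbers tell IP which transport gets the payload
PROTO_UDP = 17


def build_ipv4_header(src, dst, protocol, payload_len, ttl=64):
    """Pack a minimal IPv4 header (checksum left at 0 for simplicity)."""
    version_ihl = (4 << 4) | 5           # version 4, header is 5 32-bit words
    total_length = IPV4_HEADER.size + payload_len
    return IPV4_HEADER.pack(
        version_ihl, 0, total_length, 0, 0, ttl, protocol, 0,
        socket.inet_aton(src), socket.inet_aton(dst),
    )


def parse_ipv4_header(packet):
    """Unpack the fixed part of an IPv4 header into a dictionary."""
    (version_ihl, _dscp_ecn, total_length, _ident, _frag,
     ttl, protocol, _checksum, src, dst) = IPV4_HEADER.unpack(
        packet[:IPV4_HEADER.size]
    )
    return {
        "version": version_ihl >> 4,
        "header_bytes": (version_ihl & 0x0F) * 4,
        "total_length": total_length,
        "ttl": ttl,
        "protocol": protocol,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


header = build_ipv4_header("192.0.2.1", "198.51.100.7", PROTO_UDP, payload_len=13)
print(parse_ipv4_header(header))
```

The header carries addresses and a protocol number, but nothing identifying a port or a connection. Port numbers appear only inside the payload, in the TCP or UDP header that follows.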