Max Networking Basics

To get the most out of the built-in networking classes that are new in version 4.5 of Max, it helps to understand a few things about how modern network protocols operate. This article is intended to help you get off the ground by answering some basic networking questions.

Q: What are the differences between the net.maxhole, net.udp.*, net.multi.* and net.tcp.* classes?

Which of the above classes to choose depends on your application. All of them are based on either UDP or TCP, the two dominant protocols that computers use to communicate in today's networks. To communicate over the internet, both protocols sit on top of what is called the IP (Internet Protocol) layer, which is why you sometimes see acronyms like TCP/IP.

UDP (User Datagram Protocol) is the simpler of the two protocols. To send data via UDP, all that's required is a destination port and IP address. When using UDP your data is wrapped into a packet with a header that, among other things, identifies the packet's destination. UDP does not provide any feedback on the status of the packet after it has been sent, so if for some reason the packet can't be delivered to the destination - the internet goes down, someone trips over a network cable, the receiving computer isn't listening on the proper port, etc. - the application that sent the packet will never know. Furthermore, UDP provides no guarantee that the packets you send to another host will arrive in the order that they were sent. If there is a single, well-defined route between the computers, the probability of packets arriving out-of-order is close to zero. When data is being transmitted over the internet, however, different packets can take different routes to the same destination, and consequently the probability of UDP packets arriving out-of-order is much higher. The higher your rate of transmission, the higher the probability that packets will arrive out-of-order.
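The fire-and-forget behaviour described above can be seen in a minimal sketch using Python's standard socket module rather than the Max objects. The function names and the loopback address are assumptions made for illustration; nothing here reflects how net.udp.send is implemented.

```python
import socket


def udp_loopback_demo(message: bytes) -> bytes:
    """Send one datagram to a receiver on this machine and return what arrives."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0 asks the OS to pick a free port
    receiver.settimeout(2.0)         # do not wait forever if the packet is lost
    destination = receiver.getsockname()

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # All UDP needs is a destination IP address and port.
        sender.sendto(message, destination)
        data, _source = receiver.recvfrom(1024)
        return data
    finally:
        sender.close()
        receiver.close()


def send_without_listener(message: bytes) -> int:
    """Send a datagram to a port nobody is listening on.

    Returns the number of bytes handed to the OS. The call reports
    success even though no application will ever receive the data.
    """
    probe = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    probe.bind(("127.0.0.1", 0))
    unused_address = probe.getsockname()
    probe.close()  # the port is now free again, with no listener behind it

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        return sender.sendto(message, unused_address)
    finally:
        sender.close()
```

The second function is the interesting one: the send call succeeds and gives no hint that the data went nowhere, which is the same situation a patch using net.udp.send is in.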

Like UDP, TCP (Transmission Control Protocol) sends data in packets. However, extra information is embedded in the packet headers to help the protocol track the packets it has sent. When a computer receives a TCP packet it sends back a small ACK (acknowledgement) message that identifies the data it has received. This allows TCP to track which packets have been received successfully by the remote computer. If the sender does not receive an ACK within a certain time, the packet is re-sent. Because each packet carries a sequence number, the TCP implementation on the receiving computer is able to sort the incoming packets into the proper order even if they were received out-of-order.
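The receiver-side sorting described above can be sketched in a few lines. This is a simplified model, not TCP itself: it assumes each packet is labelled with the byte offset of its payload, and the function name reassemble is hypothetical.

```python
def reassemble(segments):
    """Rebuild a byte stream from (sequence_number, payload) pairs.

    The pairs are given in arrival order, which may differ from the
    order in which they were sent. The sequence number is taken to be
    the byte offset of the payload within the stream.
    """
    expected = 0   # offset of the next byte we can deliver
    pending = {}   # out-of-order payloads waiting for a gap to fill
    delivered = []

    for sequence_number, payload in segments:
        if sequence_number < expected:
            continue  # duplicate of data that was already delivered
        pending[sequence_number] = payload
        # Deliver everything that is now contiguous with what we have.
        while expected in pending:
            chunk = pending.pop(expected)
            delivered.append(chunk)
            expected += len(chunk)

    return b"".join(delivered)
```

Data that arrives ahead of a gap is held back until the missing piece shows up, which is why a single lost packet can stall a TCP stream until it has been re-sent.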

These differences between the protocols are evident in an examination of the Max objects provided with v4.5 for use with the mxj Java object. net.udp.send has an inlet to take data to send to the specified address and port, but no outlets that report back on the status of the packets. net.tcp.send on the other hand has three outlets that report on the status of the data you've input to be sent: one outlet reports what data was sent successfully, another reports what data was not sent successfully, and a third outlet reports the number of packets that are still "in limbo" - that is, no acknowledgement of their successful transmission has been received, but the algorithm hasn't given up on them yet.

To make the back-and-forth communication between TCP hosts possible, the protocol requires that a connection be made between the two computers prior to sending any data. No such connection is required with UDP. Having to make this connection causes a delay before any data can be sent. The delay is roughly equal to the RTT (round trip time) that it takes a packet to reach the destination and come back. For applications sending a lot of data, the cost of this connection is amortized over the time it takes to send all of the data, and therefore is usually negligible. Currently with net.tcp.send and net.tcp.recv, the TCP connection is created and destroyed for each new piece of input data to be sent, similar to HTTP. This results in less efficient transfer of data, but allows one net.tcp.recv object to receive data from multiple clients. In a future version the object may allow a TCP connection to remain intact between data transfers. Some examples of common applications that use TCP are SMTP (email), Telnet (remote login), FTP (file transfer) and HTTP (the web). Some examples of applications that use UDP are DHCP (automatic IP address configuration), NTP (network time protocol) and Traceroute.
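To put rough numbers on the connection cost discussed above, here is a back-of-the-envelope sketch. It assumes the setup delay is exactly one round trip and that data then flows at a constant rate, which ignores many real-world effects; the function name and the figures used with it are hypothetical.

```python
def connection_overhead(rtt_seconds, payload_bytes, bytes_per_second):
    """Fraction of the total time that is spent setting up the connection.

    Simplified model: one round trip to connect, then the payload is
    transferred at a constant rate.
    """
    transfer_seconds = payload_bytes / bytes_per_second
    return rtt_seconds / (rtt_seconds + transfer_seconds)
```

With a 50 ms round trip and a link moving one million bytes per second, a one-megabyte transfer spends under 5% of its time connecting, while a 100-byte message spends over 99% of its time connecting. The second case is the one to keep in mind when sending many small messages through objects that reconnect for every message.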

net.maxhole was built in order to make simple network communication as easy as possible. Simply instantiating net.maxhole objects in the default form allows instant networked communication without having to think about IP addresses, ports, or protocols. net.maxhole uses multicasting, which is just a UDP transmission to a special kind of IP address that every listening computer will receive. So whereas with regular UDP you specify a single receiver and the packet is intelligently routed to just that one machine, with a multicast address the network delivers the packet to every computer listening on that address. The range of a multicast packet is controlled by a special piece of data called the time to live that exists in every IP packet header: every time the packet is received by a router, the time to live is decreased by one, and when the time to live reaches zero, the packet is destroyed. For net.maxhole the time to live is 1, so the packet reaches the other machines in your local network but is not forwarded beyond it. For more adventurous multicasting, the net.multi.send implementation of the multicasting protocol allows you to set the time to live via the "maxhops" attribute. An example of a system that uses multicasting is Apple's Rendezvous service-discovery technology (which other organizations call ZeroConf).
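Outside of Max, the time to live of outgoing multicast packets is usually a per-socket setting. The sketch below shows how that looks with Python's standard socket module; the function name is hypothetical, the parameter name max_hops is only borrowed from the "maxhops" attribute mentioned above, and none of this is taken from the net.multi.send source.

```python
import socket
import struct


def make_multicast_sender(max_hops: int = 1) -> socket.socket:
    """Create a UDP socket whose multicast packets carry the given time to live."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Packing the value as a single byte is accepted by the common platforms.
    ttl = struct.pack("b", max_hops)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock


# Sending would then be an ordinary UDP send to a multicast group address,
# for example (hypothetical group and port, not the ones net.maxhole uses):
#   sender = make_multicast_sender(1)
#   sender.sendto(b"hello", ("239.255.0.1", 7474))
```

A value of 1 keeps packets on the local network; larger values let them cross that many routers, provided the routers are configured to forward multicast traffic at all.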

In summary, if your networking demands are modest the easiest object to use is net.maxhole. It should be appropriate for most small local network situations you can throw at it. If communication latency and throughput are of utmost importance, and you don't want to multicast to all the machines on the network, then UDP is worth investigating. If reliability is important, then using TCP is recommended.

Q: How can I send and receive data through a router's firewall?

When thinking about networking it is useful to think in terms of clients and servers. A server process sits and waits for a connection on a port, and a client process initiates the communication to the server. In terms of the Max networking classes, net.tcp.recv is a server, and net.tcp.send is a client. A connection between a client and a server has four important pieces of data: the server's IP, the server's port, the client's IP and the client's port.
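The four pieces of data that make up a connection can be inspected directly. The following sketch plays both roles on one machine using Python's standard socket module; the function name and the use of the loopback address are assumptions made for illustration.

```python
import socket


def connection_endpoints():
    """Open a TCP connection over loopback and report both of its ends."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 asks the OS to pick a free port
    server.listen(1)               # the server sits and waits
    server.settimeout(2.0)

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    connection = None
    try:
        client.connect(server.getsockname())  # the client initiates
        connection, peer = server.accept()
        return {
            "server": connection.getsockname(),  # (server IP, server port)
            "client": client.getsockname(),      # (client IP, client port)
            "client_as_seen_by_server": peer,
        }
    finally:
        if connection is not None:
            connection.close()
        client.close()
        server.close()
```

The server never chose the client's port; it simply learns it when the connection arrives, and uses it to address its replies.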

TCP and UDP both define a group of well-known ports to identify well-known services: FTP is 21, HTTP is 80, etc. Well-known ports live in the range between 0 and 1023. Clients, however, use ephemeral, or short-lived, ports. These port numbers - which typically range between 49152 and 65535, depending on the OS - are automatically assigned to a client by the UDP and TCP code that lives in the operating system's kernel. UDP and TCP guarantee the uniqueness of the ephemeral ports they assign, which prevents conflicts in communication between different client processes. This scheme is what allows your computer to have more than one active FTP session going, or be able to download web pages from more than one site at the same time.
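You can watch the operating system hand out port numbers by binding sockets to port 0, which asks it to choose. This only approximates what happens when a client connects without picking a port itself, and the exact range of numbers you see depends on the OS. The function name is hypothetical.

```python
import socket


def ephemeral_ports(count: int = 3):
    """Return the port numbers the OS assigns to `count` sockets held open together."""
    sockets = []
    try:
        for _ in range(count):
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sock.bind(("127.0.0.1", 0))  # port 0: let the OS choose
            sockets.append(sock)
        return [sock.getsockname()[1] for sock in sockets]
    finally:
        for sock in sockets:
            sock.close()
```

Because all of the sockets are open at the same time, the numbers returned are distinct from one another, which is the uniqueness guarantee described above.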

A server that wants to simultaneously handle more than one process also needs to use its own ephemeral ports to create connections so that it can keep listening for connection requests on its well-known port. For example, if you make an FTP request to 142.34.56.222:21 from 145.35.52.65:57474, the server may request that the connection be made on port number 50423 instead of 21. That is, the final connection would be between 142.34.56.222:50423 and 145.35.52.65:57474.

Local area network routers add more complication. To the outside world all the computers behind the router have the same IP address, so the router must distribute all incoming packets based on the port. The complication is that each computer controls its own ephemeral ports, and so there is the potential for conflict. For instance, if two computers behind the same firewall both request FTP sessions with outside sources, there is nothing preventing both of the computers from assigning port 50000 to the request.

The solution, which is called NAT (Network Address Translation), ensures that there is no conflict between ports by changing the outgoing port numbers without any of the computers knowing about it. When a router sees a new outgoing request, it replaces the client computer's ephemeral port number with a new port number from a table maintained by the router. The remote server sends its response to the port that the router set up, and then before passing the data back to the client computer the router changes the port back to the number that the client is expecting.
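The translation table described above can be modelled as a pair of dictionaries. This is a deliberately simplified sketch: real routers also track the protocol and the remote endpoint and expire old entries. The class name, method names, starting port and IP addresses are all hypothetical.

```python
class NatRouter:
    """Toy model of a router that rewrites outgoing source ports."""

    def __init__(self, public_ip, first_port=50000):
        self.public_ip = public_ip
        self._next_port = first_port
        self._outgoing = {}  # (private IP, private port) -> public port
        self._incoming = {}  # public port -> (private IP, private port)

    def translate_outgoing(self, private_ip, private_port):
        """Return the (public IP, public port) the outside world will see."""
        key = (private_ip, private_port)
        if key not in self._outgoing:
            public_port = self._next_port
            self._next_port += 1
            self._outgoing[key] = public_port
            self._incoming[public_port] = key
        return (self.public_ip, self._outgoing[key])

    def translate_incoming(self, public_port):
        """Return the (private IP, private port) a reply belongs to, or None."""
        return self._incoming.get(public_port)
```

Two computers behind the router that both pick port 50000 end up with different public ports, so replies can be told apart. A packet arriving on a port with no table entry has nowhere to go, which is one reason unsolicited incoming traffic does not reach machines behind the router by default.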

Most routers have a built-in firewall that stops incoming data before it reaches computers on the local area network. Usually the router provides a method of configuring the firewall so that an administrator can decide which ports should be open and closed, and which computers in the LAN should receive the data sent to the open ports. When initiating a TCP connection from a client within the LAN to a server outside the firewall it should not be necessary to explicitly configure the firewall since the router will adjust the firewall's rules to appropriately pass data. UDP, however, is a connectionless protocol, so to receive or send UDP packets through a router the port must be explicitly opened by an administrator.

by Ben Nevile on September 9, 2004