Distributed System Network Configuration

The runtime configuration of a distributed system depends in part on the topology of the network it runs on. Two scenarios are typical in IP networking. The first is a local area network (LAN), in which all members of the system are on the same IP subnet and the same switched Ethernet network. This topology is common in systems for large-scale distributed computation or simulation. The second involves clusters of computing nodes (e.g. general-purpose computers or motes), where the nodes within a cluster share a LAN but each cluster is on a different network or subnet. This scenario is referred to as a wide area network (WAN) configuration, in which two or more networks are linked together. An example is a fleet of widely separated autonomous vehicles, each carrying a system of local nodes on the same subnet for processing, with connections to the other vehicles over a WAN for information sharing.

In the first scenario, the members of a distributed system can often be configured to discover each other automatically, so that no explicit network configuration is required. This allows for a very flexible architecture in which nodes can join and leave the network dynamically as they are needed. In a Distrix-based system, this architecture would use the reliable UDP protocol and discover other Distrix nodes automatically with multicast discovery packets. A Distrix system configured in this manner would allow each node to connect to and communicate with every other node in the network automatically.
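
To make the idea of multicast discovery concrete, the following is a minimal Python sketch of how a node might announce itself on a LAN and listen for announcements from others. It is not Distrix code and does not reflect the Distrix wire format: the multicast group, the port, the JSON message layout, and all function names are assumptions made for illustration only.

```python
import json
import socket
from typing import Tuple

# Hypothetical values chosen for this sketch; not Distrix defaults.
DISCOVERY_GROUP = "239.255.0.1"  # administratively scoped multicast address
DISCOVERY_PORT = 50000


def encode_announcement(node_id: str, listen_port: int) -> bytes:
    """Serialize a discovery announcement (assumed JSON layout)."""
    return json.dumps({"node": node_id, "port": listen_port}).encode("utf-8")


def decode_announcement(payload: bytes) -> Tuple[str, int]:
    """Parse an announcement back into (node_id, listen_port)."""
    message = json.loads(payload.decode("utf-8"))
    return message["node"], int(message["port"])


def open_listener() -> socket.socket:
    """Create a UDP socket that has joined the discovery multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", DISCOVERY_PORT))
    # ip_mreq structure: group address followed by the local interface
    # address (0.0.0.0 lets the operating system pick the interface).
    membership = socket.inet_aton(DISCOVERY_GROUP) + socket.inet_aton("0.0.0.0")
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership)
    return sock


def announce(node_id: str, listen_port: int) -> None:
    """Send one announcement to every listener in the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # A TTL of 1 keeps the packet on the local subnet; routers will not forward it.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
    try:
        sock.sendto(
            encode_announcement(node_id, listen_port),
            (DISCOVERY_GROUP, DISCOVERY_PORT),
        )
    finally:
        sock.close()
```

In this sketch a node would call announce() periodically and read from the socket returned by open_listener(), passing each datagram to decode_announcement() to learn which peer to connect to. Because the TTL is 1, discovery stays within the local subnet, which matches the single-LAN scenario described above.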

In the second scenario, the members of a distributed system generally need to be explicitly configured to connect to one another. In some situations (such as a Distrix-based system using multicast discovery with routers configured to forward multicast packets appropriately) automatic discovery is still possible, but this is often not the case. When automatic discovery is not possible, each node of the system must be configured to initiate a connection to at least one other node in the system so that information can be exchanged. An advantage of this type of manual connection setup is that it gives you fine-grained control over the network topology, so you can often design a system that is optimized for the task at hand. For example, if one node communicates frequently with another, you would likely want a direct connection between them. However, if two nodes communicate only infrequently, an indirect connection through the network is probably sufficient.
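
The trade-off between direct and indirect connections can be illustrated with a short Python sketch that models an explicitly configured topology and reports how many hops separate two nodes. This is not a Distrix configuration format: the node names, the CONNECTIONS table, and the helper functions are hypothetical, and the sketch assumes that a connection carries traffic in both directions once it has been established.

```python
from collections import deque
from typing import Dict, List, Optional, Set

# Hypothetical topology: each node lists the peers it initiates a connection to.
CONNECTIONS: Dict[str, List[str]] = {
    "base": ["vehicle-a", "vehicle-b"],
    "vehicle-a": ["vehicle-b"],
    "vehicle-b": ["vehicle-c"],
    "vehicle-c": [],
}


def build_links(connections: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Turn the per-node connection lists into an undirected adjacency map."""
    links: Dict[str, Set[str]] = {node: set() for node in connections}
    for node, peers in connections.items():
        for peer in peers:
            links.setdefault(node, set()).add(peer)
            links.setdefault(peer, set()).add(node)
    return links


def hop_count(links: Dict[str, Set[str]], src: str, dst: str) -> Optional[int]:
    """Return the fewest connections between src and dst, or None if unreachable."""
    if src not in links or dst not in links:
        return None
    distance = {src: 0}
    queue = deque([src])
    while queue:  # breadth-first search visits nodes in order of distance
        node = queue.popleft()
        if node == dst:
            return distance[node]
        for neighbour in links[node]:
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return None


def is_connected(links: Dict[str, Set[str]]) -> bool:
    """Check that every node can reach every other node through some path."""
    nodes = list(links)
    return all(hop_count(links, nodes[0], other) is not None for other in nodes)
```

With this table, "base" reaches "vehicle-a" over a direct connection (one hop), while "base" and "vehicle-c" communicate indirectly through "vehicle-b" (two hops). If those two nodes exchanged data frequently, adding "vehicle-c" to the list for "base" would give them a direct connection.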

As you can see, the type of network involved in a distributed system determines not just the network configuration of the system, but also the methods used to achieve that configuration.