Network Objects

Distributed programming involves programs running on a number of networked computers. The reasons for running on several computers may be:

It is possible to have several programs cooperate in this way using low-level network communication primitives, for instance Berkeley sockets. The X Window System graphics server, for example, accepts socket connections from remote programs, receives graphics commands and sends out events (key or mouse input) on these connections. Nonetheless, this is relatively painful to implement.
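To give a feel for the bookkeeping involved, here is a minimal sketch in Python of length-prefixed messages over a socket. The framing scheme and the command and event strings are invented for the illustration; they are not the X protocol. A connected socket pair inside a single process stands in for a real network connection.

```python
import socket
import struct

def send_message(sock, payload):
    """Send one message: a 4-byte length in network byte order, then the bytes."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_exactly(sock, count):
    """Read exactly count bytes; recv() may return fewer than requested."""
    chunks = []
    while count > 0:
        chunk = sock.recv(count)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)

def recv_message(sock):
    """Receive one length-prefixed message."""
    (length,) = struct.unpack("!I", recv_exactly(sock, 4))
    return recv_exactly(sock, length)

# Hypothetical exchange: one command in one direction, one event in the other.
client, server = socket.socketpair()
send_message(client, b"draw line 0 0 10 10")
command = recv_message(server)
send_message(server, b"event key-press a")
event = recv_message(client)
client.close()
server.close()
```

Even this toy exchange has to deal with framing, partial reads and closed connections by hand, before any argument encoding is considered.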

Remote procedure calls

To alleviate the difficulty of developing client-server distributed applications, tools and libraries such as the Sun Remote Procedure Calls (RPC) have been developed. It is then relatively easy to separate a number of procedures from a program and have them run transparently on a remote computer; the procedures in the remote program are accessed almost as easily as any procedure in the local program.

To achieve this, each procedure is assigned a unique identifier, and procedures to write and read the arguments and return value to/from the network must be supplied. A tool named rpcgen reads a file describing the remote procedure calls (their arguments and return values) and automatically generates the procedures that cross the network boundary. Each interface (set of remote procedures) is registered at a central coordinator which allocates a unique identifier for it.
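The combination of a procedure identifier with procedures to write and read arguments can be sketched as follows. This is an illustration in Python, not the output of rpcgen and not the Sun RPC wire format; the procedure number, the add procedure and the message layout are all assumptions made for the example.

```python
import struct

# Hypothetical procedure identifier; real ones come from the interface description.
PROC_ADD = 1

def write_add_args(a, b):
    """Marshal the procedure identifier and two 32-bit integers, network byte order."""
    return struct.pack("!Iii", PROC_ADD, a, b)

def read_add_args(payload):
    """Unmarshal the two integer arguments that follow the procedure identifier."""
    return struct.unpack("!ii", payload)

def write_add_result(value):
    """Marshal the 32-bit integer return value."""
    return struct.pack("!i", value)

def read_add_result(data):
    """Unmarshal the 32-bit integer return value."""
    (value,) = struct.unpack("!i", data)
    return value

def add(a, b):
    """The procedure that the programmer implements on the server side."""
    return a + b

# Server-side dispatch table:
# procedure identifier -> (argument reader, implementation, result writer)
DISPATCH = {PROC_ADD: (read_add_args, add, write_add_result)}

def serve(message):
    """Decode one request, run the procedure, and encode the reply."""
    (proc_id,) = struct.unpack("!I", message[:4])
    read_args, procedure, write_result = DISPATCH[proc_id]
    return write_result(procedure(*read_args(message[4:])))

reply = serve(write_add_args(2, 3))
result = read_add_result(reply)
```

Writing the four read/write procedures by hand for every remote procedure is exactly the repetitive work that a generator takes over.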

The rpcgen tool also creates a main program for the remote procedure server, which registers the server with the portmap daemon, accepts connections from remote programs, and receives and executes remote calls. The programmer then simply has to implement these calls. Finally, rpcgen creates procedure stubs that relay their arguments to the procedure server across the network.

A common problem when passing arguments to remote procedures is deciding whether a copy of the argument must be transmitted or whether a reference (pointer) is sufficient. When a reference is transmitted, the remote procedure must not try to access it as a local reference. Furthermore, in a garbage-collected environment, it is dangerous to freely transmit references. The transmitted reference may become invalid if the garbage collector moves the object to another address. Moreover, the garbage collector must not destroy the object while references are still being held in remote programs.

To make a remote call, a program first gets a connection to the procedure server; the portmap daemon is queried and returns the address of the server, given the interface number and procedure name. The remote procedures are then accessed through the stubs, almost like local procedures, with the difference that the handle to the procedure server must be added to the list of arguments. The stubs relay the arguments to the procedure server, which executes the remote procedure call, and then hand back the return value they receive.
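The calling sequence can be sketched in Python as follows. Everything here is a stand-in: Registry plays the role of the portmap daemon, ProcedureServer the role of the generated server main program, JSON replaces the real argument encoding, and the interface number and the divide procedure are invented. Calls stay inside one process so the sketch runs without a network.

```python
import json

class ProcedureServer:
    """Stand-in for the generated server main program."""
    def __init__(self, procedures):
        self.procedures = procedures

    def handle(self, request_bytes):
        """Decode a request, execute the procedure, encode a reply or an error."""
        request = json.loads(request_bytes.decode("utf-8"))
        try:
            value = self.procedures[request["proc"]](*request["args"])
            reply = {"ok": True, "value": value}
        except Exception as exc:
            reply = {"ok": False, "error": str(exc)}
        return json.dumps(reply).encode("utf-8")

class Registry:
    """Stand-in for the portmap daemon: interface number -> server."""
    def __init__(self):
        self._servers = {}

    def register(self, interface_number, server):
        self._servers[interface_number] = server

    def lookup(self, interface_number):
        return self._servers[interface_number]

class RemoteCallError(Exception):
    """Raised by a stub when the remote call did not succeed."""

def remote_divide(handle, a, b):
    """Client stub: the local procedure's arguments plus the server handle."""
    request = json.dumps({"proc": "divide", "args": [a, b]}).encode("utf-8")
    reply = json.loads(handle.handle(request).decode("utf-8"))
    if not reply["ok"]:
        raise RemoteCallError(reply["error"])
    return reply["value"]

registry = Registry()
registry.register(0x20000001, ProcedureServer({"divide": lambda a, b: a / b}))
handle = registry.lookup(0x20000001)
quotient = remote_divide(handle, 6, 3)
```

The explicit handle argument and the error check after every call are the two places where the remote call remains visibly different from a local one.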

Building distributed programs in this way is somewhat simpler than using network connections directly. However, it is still more complex than local calls. The connection to the remote procedure server must be established. Remote calls are likely to fail from time to time because of network failures, so each call must be checked for errors. Finally, it is relatively difficult to build programs which do more than just wait for incoming requests. Indeed, Sun Remote Procedure Calls were developed in an environment without multi-threading. With threads, you may have one or more threads awaiting asynchronous requests while another thread carries out its own computation.

Network objects

A step up from the Sun Remote Procedure Call model is to make the remote access even more transparent. The file describing remote procedures, which must be written by the programmer, would not be required if more powerful tools could automatically generate the procedures to write and read objects to/from the network, directly from their declarations.

Furthermore, Remote Method Calls may offer a better model than procedures. Indeed, in remote procedure calls, a server handle must be supplied to the call. With objects, however, it is possible to derive two types from a base type: a concrete type which provides data members and implements the methods, and a proxy type which contains the server handle and has methods which simply relay the arguments to the concrete object in a remote server process. The interesting property is that both types of objects are accessed in the same way, through their common ancestor type.
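The arrangement of a base type, a concrete type and a proxy type can be sketched in Python as follows. The Counter types and the make_handle helper are hypothetical; the handle simply calls the concrete object in the same process where a real one would cross the network.

```python
from abc import ABC, abstractmethod

class Counter(ABC):
    """Common ancestor type: methods only, no data members."""
    @abstractmethod
    def increment(self, amount): ...

    @abstractmethod
    def value(self): ...

class ConcreteCounter(Counter):
    """Concrete type: holds the data and implements the methods."""
    def __init__(self):
        self._total = 0

    def increment(self, amount):
        self._total += amount
        return self._total

    def value(self):
        return self._total

class CounterProxy(Counter):
    """Proxy type: holds the server handle and relays each call."""
    def __init__(self, handle):
        self._handle = handle

    def increment(self, amount):
        return self._handle("increment", amount)

    def value(self):
        return self._handle("value")

def make_handle(target):
    """Hypothetical in-process handle standing in for a network connection."""
    def handle(method_name, *args):
        return getattr(target, method_name)(*args)
    return handle

def bump_twice(counter):
    """Client code written against the ancestor type only."""
    counter.increment(1)
    counter.increment(1)
    return counter.value()

server_side = ConcreteCounter()
local_result = bump_twice(ConcreteCounter())
remote_result = bump_twice(CounterProxy(make_handle(server_side)))
```

The function bump_twice never receives a handle and cannot tell which of the two derived types it was given.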

The DEC Systems Research Center came up with a clever design for network objects along these lines, as described in research report 115, "Network Objects". The design builds upon existing Modula-3 functionality: threads, garbage collection, weak references, run-time type information, procedures to read/write objects to streams (disk, network connection...) and Modula-3 code parsing tools.

Objects for which proxies are needed, because they should be accessible remotely, must inherit from the NetObj type. The base type must be a pure object type: only methods, no data members. From there, a concrete type may be derived which contains data members and provides methods to operate on the data members. A tool named stubgen, based on the Modula-3 code parsing library M3tk, parses the base type declaration and automatically generates the derived proxy type.
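The idea of deriving the proxy type mechanically from the base type declaration can be imitated in Python with run-time introspection. This is only an analogy: stubgen parses Modula-3 source and emits code, whereas the hypothetical generate_proxy_type below builds a class at run time. The Shape and Square types and the transport function are invented for the example.

```python
from abc import ABC, abstractmethod

def generate_proxy_type(base_type):
    """Build a proxy subclass whose methods relay (name, args) to a transport."""
    def make_relay(method_name):
        def relay(self, *args):
            return self._transport(method_name, args)
        relay.__name__ = method_name
        return relay

    def init(self, transport):
        self._transport = transport

    namespace = {"__init__": init}
    # One relay method per abstract method declared by the pure base type.
    for name in base_type.__abstractmethods__:
        namespace[name] = make_relay(name)
    # Use the base type's own metaclass so the result is a regular subclass.
    return type(base_type)(base_type.__name__ + "Proxy", (base_type,), namespace)

class Shape(ABC):
    """Pure base type: methods only."""
    @abstractmethod
    def area(self): ...

    @abstractmethod
    def scale(self, factor): ...

class Square(Shape):
    """Concrete type written by the programmer."""
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side

    def scale(self, factor):
        self.side *= factor

square = Square(2)

def transport(method_name, args):
    """Stand-in for the network: invoke the method on the concrete object."""
    return getattr(square, method_name)(*args)

ShapeProxy = generate_proxy_type(Shape)
proxy = ShapeProxy(transport)
proxy.scale(3)
area = proxy.area()
```

Only the base type and the concrete type are written by hand; the proxy type follows from the base type declaration alone.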

The NetObj runtime module then takes care of all network object communication. Whenever arguments are sent to remote processes, ordinary objects are copied (along with the objects they refer to, recursively). Network objects, however, are not copied but replaced by network references (object identifier within the process, process identifier within a computer and computer identifier within the Internet). Network references are created for concrete objects. Proxies only store the network reference of the corresponding concrete object. A table is maintained to store all the concrete objects for which network references were exported. A concrete object should not be garbage collected until all exported network references to it have been dropped by the remote processes.
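Copying ordinary objects while substituting references for network objects can be sketched with the persistent_id and persistent_load hooks of Python's pickle module. This is an analogy, not the NetObj wire format: the NetObj marker class, the Account type, the EXPORTED table, the use of id() as an object identifier, and the ("netref", ...) tuple returned on the receiving side are all assumptions of the sketch.

```python
import io
import pickle

class NetObj:
    """Hypothetical marker base class for objects passed by network reference."""

class Account(NetObj):
    def __init__(self, owner):
        self.owner = owner

# Table of exported concrete objects: object identifier -> object.
# Holding the object here keeps it from being garbage collected.
EXPORTED = {}

class ReferencePickler(pickle.Pickler):
    def __init__(self, file, space_id):
        super().__init__(file)
        self.space_id = space_id  # (computer, process) identifier, assumed

    def persistent_id(self, obj):
        if isinstance(obj, NetObj):
            object_id = id(obj)  # stand-in for a real object identifier
            EXPORTED[object_id] = obj
            return (self.space_id, object_id)
        return None  # ordinary object: pickled by value, recursively

class ReferenceUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        space_id, object_id = pid
        # A real runtime would look up or create a network object here.
        return ("netref", space_id, object_id)

def marshal(args, space_id):
    buffer = io.BytesIO()
    ReferencePickler(buffer, space_id).dump(args)
    return buffer.getvalue()

def unmarshal(data):
    return ReferenceUnpickler(io.BytesIO(data)).load()

account = Account("alice")
history = [10, 20, {"note": "opening deposit"}]
data = marshal((account, history), ("host-a", 4242))
received_ref, received_history = unmarshal(data)
```

The list arrives as an independent copy, while the account arrives only as a reference naming its owner process and its entry in the export table.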

Whenever arguments are received by the NetObj runtime, network references must be replaced by network objects. If the network reference corresponds to a concrete object within this process, it is readily found in the table of exported concrete objects. Otherwise, a proxy must be created for the remote object, if it does not already exist. A table of weak references to the proxies created is thus maintained. The first time a network reference for a given concrete object in a remote process is received, a proxy is created for it. When a network reference to the same object is received later, the proxy is simply found in the table and reused.
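The two tables can be sketched in Python, with weakref.WeakValueDictionary standing in for the table of weak references. The ObjectTable, Proxy and Concrete classes and the (space, identifier) form of a network reference are assumptions of the sketch, not the NetObj runtime's data structures.

```python
import weakref

class Proxy:
    """Hypothetical proxy: stores only the network reference."""
    def __init__(self, netref):
        self.netref = netref

class Concrete:
    """Hypothetical concrete object owned by some process."""

class ObjectTable:
    def __init__(self, space_id):
        self.space_id = space_id
        # Strong references: exported concrete objects must stay alive.
        self.exported = {}
        # Weak references: a proxy disappears once nothing else uses it.
        self.proxies = weakref.WeakValueDictionary()

    def export(self, obj):
        """Create a network reference for a concrete object of this process."""
        netref = (self.space_id, id(obj))
        self.exported[netref] = obj
        return netref

    def resolve(self, netref):
        """Replace a received network reference by a network object."""
        space_id, _ = netref
        if space_id == self.space_id:
            return self.exported[netref]   # our own concrete object
        proxy = self.proxies.get(netref)
        if proxy is None:                  # first time this reference is seen
            proxy = Proxy(netref)
            self.proxies[netref] = proxy
        return proxy

table_a = ObjectTable("A")
table_b = ObjectTable("B")
concrete = Concrete()
netref = table_a.export(concrete)

back_home = table_a.resolve(netref)
first_proxy = table_b.resolve(netref)
second_proxy = table_b.resolve(netref)
```

In process A the reference resolves to the concrete object itself; in process B both lookups yield the same proxy.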

When a proxy becomes unused in a process, it gets garbage collected. The weak reference in the proxy table gets cleared and its cleaning procedure is called. This way, a message to the remote process can be sent to inform the concrete object that its network reference is not used any more in this process.
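A cleanup procedure attached to a weak reference can be sketched in Python with weakref.finalize. The Proxy class, the notify_owner function and the "release" message are invented; the list sent_messages stands in for the network. The immediate reclamation relied upon below is a property of CPython's reference counting; gc.collect() is called for implementations that defer it.

```python
import gc
import weakref

class Proxy:
    """Hypothetical proxy: stores only the network reference."""
    def __init__(self, netref):
        self.netref = netref

sent_messages = []  # stand-in for messages sent to the owner process

def notify_owner(netref):
    """Cleanup procedure: tell the owner that this process dropped the reference."""
    sent_messages.append(("release", netref))

def make_proxy(netref):
    proxy = Proxy(netref)
    # The callback receives the netref, not the proxy: a callback that held
    # the proxy itself would keep it reachable forever.
    weakref.finalize(proxy, notify_owner, netref)
    return proxy

proxy = make_proxy(("A", 17))
messages_while_in_use = list(sent_messages)

del proxy      # the last reference to the proxy goes away
gc.collect()   # make sure collection has happened
```

On receiving such a message, the owner can remove this process from the set of holders of the concrete object and, when the set becomes empty, drop the object from its export table.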

Often, the set of types defined in each process is not the same. In such a case, when a network reference is received, a proxy of the most specialized type known to this process is created. Suppose that in process A, Rectangle inherits from Shape and NetObj, but process B only knows Shape, which inherits from NetObj. When a network reference to a Rectangle is received by B, a proxy of type Shape will be created. However, if that network reference is further sent from B to process C, which knows about Rectangle, a proxy of type Rectangle is nevertheless created.
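The selection of the most specialized known type can be sketched in Python. The sketch assumes that the list of type names, from most to least specialized, is available to the receiver along with the network reference; how the actual runtime obtains that list is not described here. The proxy classes and the per-process tables are hypothetical.

```python
class NetObjProxy:
    def __init__(self, netref, type_chain):
        self.netref = netref
        # The full chain is kept so the reference can be forwarded unchanged.
        self.type_chain = type_chain

class ShapeProxy(NetObjProxy):
    pass

class RectangleProxy(ShapeProxy):
    pass

def make_proxy(netref, type_chain, known_proxy_types):
    """Create a proxy of the first (most specialized) type this process knows."""
    for name in type_chain:
        proxy_type = known_proxy_types.get(name)
        if proxy_type is not None:
            return proxy_type(netref, type_chain)
    raise LookupError("no type in the chain is known to this process")

# Proxy types known to each process (assumed for the example).
KNOWN_IN_B = {"Shape": ShapeProxy, "NetObj": NetObjProxy}
KNOWN_IN_C = {"Rectangle": RectangleProxy, "Shape": ShapeProxy,
              "NetObj": NetObjProxy}

chain_from_a = ["Rectangle", "Shape", "NetObj"]
proxy_in_b = make_proxy(("A", 7), chain_from_a, KNOWN_IN_B)

# B forwards the reference with its original chain, so C is not limited
# by what B happens to know.
proxy_in_c = make_proxy(proxy_in_b.netref, proxy_in_b.type_chain, KNOWN_IN_C)
```

Because B passes along the reference rather than its own view of the object, the narrowing to Shape is local to B.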

Each process may be both a client and a server. Indeed, a process may carry out its own computation and make remote method calls while at the same time asynchronously receiving remote method calls. This is possible because the NetObj runtime starts its own execution thread. Furthermore, it usually maintains a pool of available threads so that a number of remote method calls may be handled simultaneously by several threads. This is important because otherwise, if a single thread were used, a remote method call involving a long computation would block all the other remote method calls.
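The benefit of a thread pool can be sketched in Python with concurrent.futures. The Service class and the dispatch function are invented. To keep the outcome deterministic, the long call is simulated by a method that only finishes once a second, short call has been served.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

fast_served = threading.Event()

class Service:
    """Hypothetical concrete object receiving remote method calls."""
    def slow(self):
        # Simulated long computation: ends only once the fast call was served.
        served = fast_served.wait(timeout=5)
        return "slow done" if served else "slow timed out"

    def fast(self):
        fast_served.set()
        return "fast done"

def dispatch(target, method_name):
    """Stand-in for the runtime executing one incoming remote method call."""
    return getattr(target, method_name)()

service = Service()
with ThreadPoolExecutor(max_workers=4) as pool:
    slow_call = pool.submit(dispatch, service, "slow")
    fast_call = pool.submit(dispatch, service, "fast")
    local_work = sum(range(1000))  # the process continues its own computation
    results = (slow_call.result(), fast_call.result())
```

With a single worker thread, the fast call would wait behind the slow one and the slow call would time out instead of completing.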

The remaining problem is how network objects are exported in the first place. A special process, the netobjd daemon, maintains a table of exported network objects and their names. A process may then export one of its concrete objects to the daemon and give it a name. Another process may then retrieve this object (network reference) from the daemon by name. The current implementation does not include access control.
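The role of the daemon's table can be sketched in Python as a plain mapping from names to network references. The NameService class, its method names and the form of the references are hypothetical and do not reproduce the netobjd interface.

```python
class NameService:
    """Stand-in for the table kept by the netobjd daemon."""
    def __init__(self):
        self._table = {}

    def export(self, name, netref):
        """Bind a name to a network reference; any caller may rebind it."""
        self._table[name] = netref

    def lookup(self, name):
        """Return the network reference bound to name, or None if there is none."""
        return self._table.get(name)

daemon = NameService()

# Process A exports one of its concrete objects under a name.
daemon.export("printer", ("A", 3))

# Process B retrieves the network reference by name.
found = daemon.lookup("printer")
missing = daemon.lookup("scanner")

# Without access control, another process can replace the binding.
daemon.export("printer", ("Z", 99))
replaced = daemon.lookup("printer")
```

Once the first reference has been obtained by name, further network objects can be reached as arguments and return values of remote method calls.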


Copyright 1995 Michel Dagenais, dagenais@vlsi.polymtl.ca, Wed Mar 8 14:41:03 EST 1995