Remote Procedure Calls

October 1st, 1997 by Ed Petron in SysAdmin
A thorough introduction to RPC for programmers of distributed systems.
As any programmer knows, procedure calls are a vital software development technique. They provide the leverage necessary for the implementation of all but the most trivial of programs. Remote procedure calls (RPC) extend the capabilities of conventional procedure calls across a network and are essential in the development of distributed systems. They can be used both for data exchange in distributed file and database systems and for harnessing the power of multiple processors. Linux distributions provide an RPC version derived from the RPC facility developed by the Open Network Computing (ONC) group at Sun Microsystems.
In case the reader is not familiar with the following terms, we will define them here, since they will be important in later discussion:
Caller: a program which calls a subroutine
Callee: a subroutine or procedure which is called by the caller
Client: a program which requests a connection to and service from a network server
Server: a program which accepts connections from and provides services to a client
There is a direct parallel between the caller/callee relationship and the client/server relationship. With ONC RPC (and with every other form of RPC that I know), the caller always executes as a client process, and the callee always executes as a server process.
The Remote Procedure Call Mechanism
In order for an RPC to execute successfully, several steps must take place:
The caller program must prepare any input parameters to be passed to the RPC. Note that the caller and the callee may be running completely different hardware, and that certain data types may be represented differently from one machine architecture to the next. Because of this, the caller cannot simply feed raw data to the remote procedure.
The calling program must somehow pass its data to the remote host which will execute the RPC. In local procedure calls, the target address is simply a machine address on the local processor. With RPC, the target procedure has a machine address combined with a network address.
The RPC receives and operates on any input parameters and passes the result back to the caller.
The calling program receives the RPC result and continues execution.
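The four steps above can be sketched in a single process. This is only an illustration under stated assumptions: the wire format (two 32-bit integers in network byte order for the request, one for the reply) and the names marshal_request, remote_add and call_add are invented for this sketch, and a direct function call stands in for the network.

```c
#include <arpa/inet.h>  /* htonl, ntohl (POSIX) */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire format: request = two 32-bit integers in network
 * byte order, reply = one 32-bit integer in network byte order. */
enum { REQ_SIZE = 8, REP_SIZE = 4 };

/* Step 1: the caller converts its arguments to a machine-independent form. */
void marshal_request(uint32_t a, uint32_t b, unsigned char *req)
{
    assert(req != NULL);
    uint32_t na = htonl(a), nb = htonl(b);
    memcpy(req, &na, 4);
    memcpy(req + 4, &nb, 4);
}

/* Step 3: the callee decodes the arguments, does the work and encodes
 * the result for the trip back. */
void remote_add(const unsigned char *req, unsigned char *rep)
{
    uint32_t na, nb, nsum;
    memcpy(&na, req, 4);
    memcpy(&nb, req + 4, 4);
    nsum = htonl(ntohl(na) + ntohl(nb));
    memcpy(rep, &nsum, 4);
}

/* Steps 2 and 4: pass the request to the "remote" side and decode the
 * reply. A real client would send req over a socket instead. */
uint32_t call_add(uint32_t a, uint32_t b)
{
    unsigned char req[REQ_SIZE], rep[REP_SIZE];
    uint32_t nsum;

    marshal_request(a, b, req);
    remote_add(req, rep);
    memcpy(&nsum, rep, 4);
    return ntohl(nsum);
}
```

To the caller, call_add(2, 3) looks like an ordinary local call. All of the conversion and transport work is hidden behind it, which is the role the client stub plays in a real RPC system.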
External Data Representation
As was pointed out earlier, an RPC can be executed between two hosts that run completely different processor hardware. Data types, such as integer and floating-point numbers, can have different physical representations on different machines. For example, some machines store integers (C ints) with the low order byte first while some machines place the low order byte last. Similar problems occur with floating-point numeric data. The solution to this problem involves the adoption of a standard for data interchange.
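The byte-order problem, and the usual way around it, can be shown in a few lines of C. The sketch below writes a 32-bit integer with the high order byte first, which is the order XDR uses for its 4-byte integers. The function names put_u32_be, get_u32_be and host_is_little_endian are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store v with the high order byte first, whatever the host's own
 * byte order happens to be. */
void put_u32_be(unsigned char *out, uint32_t v)
{
    assert(out != NULL);
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Rebuild the value from its big-endian bytes. */
uint32_t get_u32_be(const unsigned char *in)
{
    assert(in != NULL);
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}

/* Report whether this host keeps the low order byte first in memory. */
int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}
```

Because the shifts operate on the value rather than on its memory layout, the same source produces the same bytes on every host. Only host_is_little_endian gives a different answer from one architecture to the next.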
One such standard is the ONC external data representation (XDR). XDR is essentially a collection of C functions and macros that enable conversion from machine specific data representations to the corresponding standard representations and vice versa. It contains primitives for simple data types such as int, float and string and provides the capability to define and transport more complex ones such as records, arrays of arbitrary element type and pointer bound structures such as linked lists.
Most of the XDR functions require the passing of a pointer to a structure of “XDR” type. One of the elements of this structure is an enumerated field called x_op. Its possible values are XDR_ENCODE, XDR_DECODE, or XDR_FREE. The XDR_ENCODE operation instructs the XDR routine to convert the passed data to XDR format. The XDR_DECODE operation indicates the conversion of XDR represented data back to its local representation. XDR_FREE provides a means to deallocate memory that was dynamically allocated for use by a variable that is no longer needed. For more information on XDR, see the information sources listed in the References section of this article.
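The idea of one routine serving all three operations can be illustrated with a small stand-in for the real library. Everything in the sketch below (struct mini_xdr, the MINI_* constants, mini_u32 and mini_bytes) is hypothetical and is not the ONC XDR API; it only mimics the role of x_op. The byte string layout follows the XDR convention of a 4-byte length followed by the data, zero padded to a multiple of four bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the XDR structure and its x_op field. */
enum mini_op { MINI_ENCODE, MINI_DECODE, MINI_FREE };

struct mini_xdr {
    enum mini_op   op;   /* direction, playing the role of x_op */
    unsigned char *buf;  /* wire buffer */
    size_t         pos;  /* next byte to read or write */
    size_t         len;  /* usable size of buf */
};

/* One filter handles a 32-bit unsigned integer in either direction. */
int mini_u32(struct mini_xdr *x, uint32_t *v)
{
    assert(x != NULL && v != NULL);
    if (x->op == MINI_FREE)
        return 1;                    /* nothing was allocated */
    if (x->len - x->pos < 4)
        return 0;                    /* out of space or out of data */
    unsigned char *p = x->buf + x->pos;
    if (x->op == MINI_ENCODE) {
        p[0] = (unsigned char)(*v >> 24);
        p[1] = (unsigned char)(*v >> 16);
        p[2] = (unsigned char)(*v >> 8);
        p[3] = (unsigned char)*v;
    } else {
        *v = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
             ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    }
    x->pos += 4;
    return 1;
}

/* A counted byte string: DECODE allocates memory, FREE releases it. */
int mini_bytes(struct mini_xdr *x, char **s)
{
    uint32_t n = 0;
    assert(x != NULL && s != NULL);
    if (x->op == MINI_FREE) {
        free(*s);
        *s = NULL;
        return 1;
    }
    if (x->op == MINI_ENCODE)
        n = (uint32_t)strlen(*s);
    if (!mini_u32(x, &n))            /* the length travels first */
        return 0;
    size_t padded = ((size_t)n + 3) & ~(size_t)3;
    if (x->len - x->pos < padded)
        return 0;
    if (x->op == MINI_ENCODE) {
        memcpy(x->buf + x->pos, *s, n);
        memset(x->buf + x->pos + n, 0, padded - n);
    } else {
        *s = malloc((size_t)n + 1);
        if (*s == NULL)
            return 0;
        memcpy(*s, x->buf + x->pos, n);
        (*s)[n] = '\0';
    }
    x->pos += padded;
    return 1;
}
```

The same mini_bytes call is used by the sender with op set to MINI_ENCODE, by the receiver with MINI_DECODE, and later with MINI_FREE to release what decoding allocated. Writing one filter per type keeps the encode and decode paths from drifting apart.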
RPC Data Flow
The flow of data from caller to callee and back again is illustrated in Figure 1. The calling program executes as a client process and the RPC runs on a remote server. All data movement between the client and the network and between the server and the network passes through XDR filter routines. In principle, any type of network transport can be used, but our discussion of implementation specifics centers on ONC RPC which typically uses either Transmission Control Protocol routed by Internet Protocol (the familiar TCP/IP) or User Datagram Protocol also routed by Internet Protocol (the possibly not so familiar UDP/IP). Similarly, any type of data representation could be used, but our discussion focuses on XDR since it is the method used by ONC RPC.

Figure 1. RPC Data Flow
Review of Network Programming Theory
In order to complete our picture of RPC processing, we'll need to review some network programming theory.
In order for two processes running on separate computers to exchange data, an association needs to be formed on each host. An association is defined as the following 5-tuple: {protocol, local-address, local-process, foreign-address, foreign-process}
The protocol is the transport mechanism (typically TCP or UDP) which is used to move the data between hosts. This, of course, is the part that needs to be common to both host computers. For either host computer, the local-address/process pair defines the endpoint on the host computer running that process. The foreign-address/process pair refers to the endpoint at the opposite end of the connection.
Breaking this down further, the term address refers to the network address assigned to the host. This would typically be an Internet Protocol (IP) address. The term process refers not to an actual process identifier (such as a Unix PID) but to some integer identifier required to transport the data to the correct process once it has arrived at the correct host computer. This is generally referred to as a port. The reason port numbers are used is that it is not practical for a process running on a remote host to know the PID of a particular server. Standard port numbers are assigned to well known services such as TELNET (port 23) and FTP (port 21).
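The 5-tuple maps naturally onto a C structure. The type and field names below are illustrative assumptions, as is the restriction to IPv4 addresses held as text. The helper peer_view shows that the host at the other end holds the same association with the local and foreign endpoints exchanged.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum transport { PROTO_TCP, PROTO_UDP };

/* One end of a connection: a host address plus a port number. */
struct endpoint {
    char     address[16];  /* dotted-quad IPv4 text, e.g. "192.0.2.10" */
    uint16_t port;         /* the "process" element of the tuple */
};

/* {protocol, local-address, local-process,
 *  foreign-address, foreign-process} */
struct association {
    enum transport  protocol;
    struct endpoint local;
    struct endpoint foreign;
};

/* The peer sees the same association with the endpoints swapped. */
struct association peer_view(const struct association *a)
{
    struct association p;

    assert(a != NULL);
    p.protocol = a->protocol;
    p.local    = a->foreign;
    p.foreign  = a->local;
    return p;
}
```

For a TELNET session, the client side might hold {PROTO_TCP, {"192.0.2.10", 40000}, {"192.0.2.20", 23}}, where the addresses and the client port are example values, and peer_view returns what the server side holds.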
RPC Call Binding
Now we have the necessary theory to complete our picture of the RPC binding process. An RPC application is formally packaged into a program with one or more procedure calls. In a manner similar to the port assignments described above, the RPC program is assigned an integer identifier known to the programs which will call its procedures. Each procedure is also assigned a number that is also known by its caller. ONC RPC uses a program called portmap to allocate port numbers for RPC programs. Its operation is illustrated in Figure 2. When an RPC program is started, it registers itself with the portmap process running on the same host. The portmap process then assigns the TCP and/or UDP port numbers to be used by that application.

Figure 2. Portmap Operation
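The bookkeeping behind portmap can be pictured as a small table keyed by program number, version and protocol. The sketch below is an in-memory toy, not the portmap protocol itself: pm_register and pm_lookup are hypothetical names, and as a simplifying assumption the server passes in the port it will listen on. The protocol constants are the standard IP protocol numbers for TCP and UDP.

```c
#include <assert.h>
#include <stddef.h>

enum { PM_PROTO_TCP = 6, PM_PROTO_UDP = 17 };  /* IP protocol numbers */
enum { PM_MAX_ENTRIES = 16 };

struct pm_entry {
    unsigned long  prog;   /* RPC program number */
    unsigned long  vers;   /* program version */
    int            proto;  /* PM_PROTO_TCP or PM_PROTO_UDP */
    unsigned short port;   /* port the server listens on */
};

static struct pm_entry pm_table[PM_MAX_ENTRIES];
static size_t pm_count;

/* Caller side: return the registered port, or 0 if there is none. */
unsigned short pm_lookup(unsigned long prog, unsigned long vers, int proto)
{
    for (size_t i = 0; i < pm_count; i++) {
        const struct pm_entry *e = &pm_table[i];
        if (e->prog == prog && e->vers == vers && e->proto == proto)
            return e->port;
    }
    return 0;
}

/* Server side: record a mapping at start-up. Returns 1 on success,
 * 0 if the triple is already registered or the table is full. */
int pm_register(unsigned long prog, unsigned long vers, int proto,
                unsigned short port)
{
    assert(port != 0);  /* 0 is reserved to mean "not registered" */
    if (pm_count == PM_MAX_ENTRIES || pm_lookup(prog, vers, proto) != 0)
        return 0;
    pm_table[pm_count].prog  = prog;
    pm_table[pm_count].vers  = vers;
    pm_table[pm_count].proto = proto;
    pm_table[pm_count].port  = port;
    pm_count++;
    return 1;
}
```

A server would call pm_register once per transport at start-up, and a caller would call pm_lookup with the program number it was built against before opening a connection to the returned port.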
The RPC application then waits for and accepts connections at that port number. Prior to calling the remote procedure, the caller also contacts portmap in order to obtain the corresponding port number being used by the application whose procedures it needs to call. The network connection provides the means for the caller to reach the correct program on the remote host. The correct procedure is reached through the use of a dispatch table in the RPC program. The same registration process that establishes the port number also creates the dispatch table. The dispatch table is indexed by procedure number and contains the addresses of all the XDR filter routines as well as the addresses of the actual procedures.
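A dispatch table of this kind is essentially an array of function pointers. The following sketch assumes, for brevity, that every procedure takes one 32-bit integer and returns one. The structure layout and all names are invented for illustration and do not reproduce the tables that ONC RPC builds. Procedure 0 is a do-nothing null procedure, following the ONC RPC convention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Filters shared by every procedure in this sketch: one 32-bit integer,
 * high order byte first on the wire. */
int decode_i32(const unsigned char *wire, int32_t *v)
{
    uint32_t u = ((uint32_t)wire[0] << 24) | ((uint32_t)wire[1] << 16) |
                 ((uint32_t)wire[2] << 8)  |  (uint32_t)wire[3];
    *v = (int32_t)u;
    return 1;
}

void encode_i32(int32_t v, unsigned char *wire)
{
    uint32_t u = (uint32_t)v;
    wire[0] = (unsigned char)(u >> 24);
    wire[1] = (unsigned char)(u >> 16);
    wire[2] = (unsigned char)(u >> 8);
    wire[3] = (unsigned char)u;
}

/* Each entry holds the argument filter, the procedure and the result filter. */
struct proc_entry {
    int     (*decode_arg)(const unsigned char *wire, int32_t *arg);
    int32_t (*proc)(int32_t arg);
    void    (*encode_res)(int32_t res, unsigned char *wire);
};

static int32_t proc_null(int32_t arg)    { (void)arg; return 0; }
static int32_t proc_sign(int32_t arg)    { return (arg > 0) - (arg < 0); }
static int32_t proc_is_even(int32_t arg) { return arg % 2 == 0; }

/* Indexed by procedure number. */
static const struct proc_entry dispatch_table[] = {
    { decode_i32, proc_null,    encode_i32 },
    { decode_i32, proc_sign,    encode_i32 },
    { decode_i32, proc_is_even, encode_i32 },
};

/* Decode the request, run the numbered procedure and encode the reply.
 * Returns 0 for an unknown procedure number or a decoding failure. */
int dispatch(uint32_t procnum, const unsigned char *request,
             unsigned char *reply)
{
    int32_t arg;

    assert(request != NULL && reply != NULL);
    if (procnum >= sizeof dispatch_table / sizeof dispatch_table[0])
        return 0;
    const struct proc_entry *e = &dispatch_table[procnum];
    if (!e->decode_arg(request, &arg))
        return 0;
    e->encode_res(e->proc(arg), reply);
    return 1;
}
```

Keeping the filters in the table next to the procedure means the server loop never needs to know the argument types; it only needs the procedure number taken from the incoming request.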
RPCGEN: The Protocol Compiler
Listing 1. Source for avg.x
If the discussion of the mechanisms supporting RPC sounds complex, that's because it is. Fortunately, the development of RPC applications can be greatly simplified through the use of rpcgen, the protocol compiler. rpcgen has its own input language which is used to declare programs, their procedures and the data types for the procedures' parameters and return values. This is best illustrated by an example. The source code for an average procedure is shown in Listing 1. If we store this source code in a file called avg.x and invoke rpcgen with the following command:
rpcgen avg.x
we obtain the header file avg.h shown in Listing 2. This file contains all of the function prototypes and data declarations needed for the development of our application. rpcgen also generates three other source files:
avg_clnt.c: the stub program for our client (caller) process
avg_svc.c: the main program for our server (callee) process
avg_xdr.c: the XDR routines used by both the client and the server
These sources are to be used “as is” and must not be edited.
Listing 2. Header File avg.h
To complete the application at the server end, we need code to provide the actual “smarts” required to correctly process the input data. This must be created manually. The code for the sample application presented here is shown in Listing 3. This code takes the XDR decoded array from the client and separates and averages the values. It returns the result which is then XDR encoded for transmission back to the client.
Listing 3. Server Code for Average Application
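Independent of Listing 3, the averaging step itself can be sketched in a few lines. The function name average and its signature are assumptions of this sketch and are not taken from the listing or from the rpcgen output; the function shows only the arithmetic applied to the array once it has been decoded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the arithmetic mean of n decoded values.
 * An empty array yields 0.0 rather than dividing by zero. */
double average(const double *vals, size_t n)
{
    double sum = 0.0;

    assert(vals != NULL || n == 0);
    if (n == 0)
        return 0.0;
    for (size_t i = 0; i < n; i++)
        sum += vals[i];
    return sum / (double)n;
}
```

In the generated server, logic of this kind would sit inside the procedure that rpcgen declares in avg.h, with the XDR filters handling the array on the way in and the result on the way out.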
To complete the application at the client end, the input data must be packed into XDR format, so that it can be sent to the server. The client program is also written manually and is shown in Listing 4. The Makefile shown in Listing 5 can be used to build the application.
Listing 4. Client Code for Average Application
Listing 5. Makefile
Testing and Debugging the Application
The best way to test the RPC application is to run both the client and the server (the caller and callee) on the same machine. Assuming that you are in the directory where both the client and the server reside, start the server by entering the command:
avg_svc &
The rpcinfo utility can be used to verify that the server is running. Typing the command:
$ rpcinfo -p localhost
gives the following output:
program  vers  proto  port
100000   2     tcp    111   portmapper
100000   2     udp    111   portmapper
22855    1     udp    1221
22855    1     tcp    1223
Note that 22855 is the program number of our application from avg.x and 1 is shown as the version number. Since 22855 is not a registered RPC application, the rightmost column is blank. If we add the following line to the /etc/rpc file:
avg 22855
rpcinfo then gives the following output:
program  vers  proto  port
100000   2     tcp    111   portmapper
100000   2     udp    111   portmapper
22855    1     udp    1221  avg
22855    1     tcp    1223  avg
To test the application, use the command:
$ ravg localhost $RANDOM $RANDOM $RANDOM
and the following values are returned:
value = 9.196000e+03
value = 2.871200e+04
value = 3.198900e+04
average = 2.329900e+04
Since the first argument to the command is the DNS name for the host running the server, localhost is used. If you have access to a remote host that allows RPC connections (ask the system administrator before you try), the server can be uploaded and run on the remote host, and the client can be run as before, replacing localhost with the DNS name or IP address of the host. If your remote host doesn't allow RPC connections, you may be able to run your client from there, replacing localhost with the DNS name or IP address of your local system.
A Brief Look at DCE RPC
The ONC implementation of RPC is not the only one available. The Open Software Foundation has developed a suite of tools called the Distributed Computing Environment (DCE) which enables programmers to develop distributed applications. One of these tools is DCE RPC which forms the basis for all of the other services that DCE provides. Its operation is quite similar to ONC RPC in that it uses components that closely parallel those of ONC RPC.
Application interfaces are defined through an Interface Definition Language (IDL) which is similar to the language used by ONC RPC to define XDR filters. Network Data Representation (NDR) is used to provide hardware independent data representation. Instead of using programmer-defined integer program numbers to identify servers as does ONC RPC, DCE RPC uses a character string called a universal unique identifier (UUID) generated by a program called uuidgen. A program called rpcd (the RPC daemon) takes the place of portmap. An IDL compiler can be used to generate C headers and client/server stubs in a manner similar to rpcgen.
Although the entire DCE suite is commercially sold and licensed, the RPC component (which is the basis for all the other services) is available as freeware. See the references section for more information on DCE RPC.
Further Study
The sample application presented here is certainly a naive one, but it serves well in presenting the basic principles of RPCs. A more interesting set of applications can be found in the Network Information System (NIS) package for Linux (see the references section). Also, the Linux kernel sources contain an implementation of Sun's Network File System (NFS), an excellent example of the use of RPC applied to the problem of distributed file access.
In addition to distributed data access, RPC can also be used to harness the unused processing power present on most networks. The book Power Programming with RPC, listed in the references section, presents an image processing application that uses RPC to distribute CPU intensive tasks over multiple processors. With RPC, you have the capability to boost the performance of your applications without spending a dime on additional hardware.
References

Ed Petron is a computer consultant interested in heterogeneous computing. He holds a Bachelor of Music in keyboard performance (piano, harpsichord and organ) from Indiana University and a Bachelor of Science in computer science from Chapman College. His home page, The Technical and Network Computing Home Page at http://www.leba.net/~epetron, is dedicated to Linux, The X Window System, heterogeneous computing and free software. Ed can be reached via e-mail at epetron@wilbur.leba.net.