[Bernstein09] 2.4. Remote Procedure Call



Remote procedure call (RPC) is a programming mechanism that enables a program in one process to invoke a program in another process using an ordinary procedure call, as if the two programs were executing in the same process (or more precisely, in the same address space).

There are several benefits to programming in the RPC style. First, the programmer can still write and reason about a program as if all the program’s procedures were linked together in a single process. Therefore, the programmer can focus on correctly modularizing the program and ignore the underlying communications mechanism. In particular, the programmer can ignore that the program is really distributed, which would add significant complexity to the programming task if it were made visible.

Second, the RPC style avoids certain programming errors because of the simple request-response message protocol that it implies. Using RPC, a program receives a return for every call. Either the caller receives a return message from the called procedure, or the system returns a suitable exception to the caller so it can take appropriate action. By contrast, using asynchronous message passing, a program has explicit statements to send and receive messages. These send and receive operations issued by communicating programs define a communication protocol. This requires the programmer to handle the message sequences and errors directly. For example, each program must be ready to receive a message after the message is sent to it. Programs have to cope with certain error conditions, such as waiting for a message that never arrives, or giving up waiting for a message and coping with that message if it does eventually arrive later. In RPC, these problems are dealt with by the RPC implementation rather than by the application program.

Third, RPC implementations can hide the differences in parameter format between the programming languages in which the client’s and server’s programs are written. RPC implementations also can hide differences among processors such as Intel x86, AMD, PowerPC, and SPARC and the differences among operating systems such as Windows and Linux.

To understand how RPC works, consider the example in Figure 2.9. This program consists of three procedures:

  • PayCreditCard, which pays a credit card bill

  • DebitChecking, which subtracts money from a checking account

  • PayBill, which calls PayCreditCard and DebitChecking to pay a credit card bill from a checking account

Figure 2.9. Credit Card Payment Example. PayBill brackets the transaction and calls two subprograms, PayCreditCard and DebitChecking, which it calls by RPC.
Boolean Procedure PayBill (acct#, card#)
{   int acct#, card#;
    long amount;
    Boolean ok;

    Start;    /* start a transaction */
    amount = PayCreditCard(card#);
    ok = DebitChecking(acct#, amount);
    if (!ok) Abort else Commit;
    return (ok);
}

long Procedure PayCreditCard (card#)
{   int card#;
    long amount;

    /* get the credit card balance owed */
    Exec SQL Select AMOUNT
             Into :amount
             From CREDIT_CARD
             Where (ACCT_NO = :card#);
    /* set the balance owed to zero */
    Exec SQL Update CREDIT_CARD
             Set AMOUNT = 0
             Where (ACCT_NO = :card#);
    return (amount);
}

Boolean Procedure DebitChecking (acct#, amount)
{   int acct#;
    long amount;

    /* debit amount from checking balance if balance is sufficient */
    Exec SQL Update ACCOUNTS
             Set BALANCE = BALANCE - :amount
             Where (ACCT_NO = :acct# and BALANCE ≥ :amount);
    /* SQLCODE = 0 if the previous statement succeeds */
    return (SQLCODE == 0);
}


Let us assume that these three procedures execute in separate processes, possibly on different nodes of a network. Therefore, the invocations of PayCreditCard and DebitChecking by PayBill are remote procedure calls.

PayCreditCard takes a credit card account number as input, returns the amount of money owed on that account, and zeroes out the amount owed. The first SQL statement selects the amount of money from the credit card table, which contains the amount of money owed on each account number. The second statement zeroes out that amount (i.e., the entire balance is paid off). The procedure then returns the amount that was owed on the account.

DebitChecking subtracts a given amount of money from a given account. In the SQL statement, if the balance in that account is greater than or equal to the amount of money to be debited, then it subtracts the amount of money to be debited from the account balance. In this case, the SQL statement succeeds and therefore sets SQLCODE to zero, so DebitChecking returns true. On the other hand, if the balance in that account is less than the amount of money to be debited, then the SQL statement does not update the account balance. Since the SQL statement failed, SQLCODE is not set to zero and DebitChecking returns false.

Each of these programs is useful by itself. The PayCreditCard program can be used to process credit card bills. The DebitChecking program can be used to process debits and credits against a checking account from an ATM. Using these two programs, we can easily write a PayBill program that implements a bill-paying service by paying a customer’s credit card bill out of his or her checking account.

The PayBill program takes a checking account number and credit card number and tries to pay the credit card bill out of the checking account. The program starts a transaction, pays the credit card bill (which returns the amount of money owed), and tries to debit that money from the checking account. If the DebitChecking program returns true, meaning that there was enough money to pay the bill, the program commits. If it returns false, then there wasn’t enough money to pay the bill and the transaction aborts. In both cases the PayCreditCard program updates the credit card table. But if the PayBill program aborts the transaction, the abort automatically undoes that update, thereby leaving the bill for that credit card account unpaid. (If DebitChecking returns false, its SQL update failed and has no effect on the ACCOUNTS table.)
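The commit and abort behavior described above can be tried out with a small runnable sketch. This is not the book's code: it assumes Python's built-in sqlite3 module, runs all three procedures in one process (ordinary local calls stand in for the RPCs), and uses made-up table contents. The check on `rowcount` plays the role of the SQLCODE test in Figure 2.9.

```python
import sqlite3

def pay_credit_card(conn, card_no):
    """Return the balance owed on the card and set it to zero."""
    row = conn.execute(
        "SELECT amount FROM credit_card WHERE acct_no = ?", (card_no,)
    ).fetchone()
    amount = row[0]
    conn.execute("UPDATE credit_card SET amount = 0 WHERE acct_no = ?", (card_no,))
    return amount

def debit_checking(conn, acct_no, amount):
    """Debit the account only if the balance covers the amount."""
    cur = conn.execute(
        "UPDATE accounts SET balance = balance - ? "
        "WHERE acct_no = ? AND balance >= ?",
        (amount, acct_no, amount),
    )
    return cur.rowcount == 1  # stands in for SQLCODE == 0

def pay_bill(conn, acct_no, card_no):
    """Bracket both updates in one transaction; roll back if the debit fails."""
    conn.execute("BEGIN")
    amount = pay_credit_card(conn, card_no)
    ok = debit_checking(conn, acct_no, amount)
    conn.execute("COMMIT" if ok else "ROLLBACK")
    return ok

def make_db():
    """Build an in-memory database with illustrative (assumed) rows."""
    # isolation_level=None lets us issue BEGIN/COMMIT/ROLLBACK explicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE credit_card (acct_no INTEGER PRIMARY KEY, amount INTEGER)")
    conn.execute("CREATE TABLE accounts (acct_no INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO credit_card VALUES (7, 300)")
    conn.execute("INSERT INTO accounts VALUES (1, 500)")
    conn.execute("INSERT INTO accounts VALUES (2, 100)")
    return conn
```

Calling `pay_bill(conn, 2, 7)` on this data returns false and leaves the card balance at 300, because the rollback undoes the update made by `pay_credit_card`. Calling `pay_bill(conn, 1, 7)` commits both updates.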

Transactional RPC

The RPC runtime system has some extra work to do to allow a transaction to invoke an RPC. It has to pass the transaction context from the caller to the callee (which may be hidden, as in Figure 2.9 and earlier examples) and must throw transaction-related exceptions back to the caller. In addition to the transaction ID, the context may include security credentials, the identity of the system that started the transaction, and other system information that is required by the callee to continue operating within the same transaction. An RPC mechanism that does this additional work is called a transactional RPC.

A transactional RPC system may also need to do some work to support two-phase commit. For example, as part of making an RPC call, it may need to call the transaction manager on the caller’s and callee’s systems to notify them that the transaction has now moved to a new system. This information is needed later when the two-phase commit protocol is initiated, so the transaction managers know which systems are participants in the transaction. We’ll discuss these issues at length in Chapter 8.

Sometimes, the RPC mechanism itself is used to transmit the two-phase commit messages. This is an implementation strategy for the vendor of the two-phase commit implementation. It is a sensible one, but has no effect on the functionality available to application developers. This is not what is meant by the term “transactional RPC.”
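As a rough illustration of context propagation, the sketch below attaches a transaction context to every request and turns a transaction failure reported by the callee into an exception in the caller. All names here (`TxContext`, `TransactionAborted`, `client_call`, `make_server`) are hypothetical, and the "network" is just a function call; a real transactional RPC runtime does considerably more.

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class TxContext:
    tx_id: str        # transaction ID
    originator: str   # system that started the transaction
    credentials: str  # placeholder for security credentials

class TransactionAborted(Exception):
    """Raised in the caller when the callee reports a transaction failure."""

def client_call(send, ctx, proc, *args):
    """Attach the context to the request and re-raise transaction errors."""
    reply = send({"ctx": asdict(ctx), "proc": proc, "args": list(args)})
    if reply.get("tx_error"):
        raise TransactionAborted(reply["tx_error"])
    return reply["result"]

def make_server(procedures):
    """Return a send function that runs procedures under the received context."""
    def handle(request):
        ctx = TxContext(**request["ctx"])  # callee continues the same transaction
        try:
            result = procedures[request["proc"]](ctx, *request["args"])
            return {"result": result}
        except TransactionAborted as exc:
            return {"tx_error": str(exc)}
    return handle
```

The application code that calls `client_call` never builds the context fields into its own parameters, which mirrors the way the context is hidden in Figure 2.9.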

Binding Clients and Servers

The programs shown in Figure 2.9 are incomplete in that they don’t show how the caller and callee procedures discover each other’s interfaces and establish connections that enable them to communicate. First of all, to make remote procedure call worthwhile in this situation, the PayBill program would probably be running in a different process, possibly on a different system, than the PayCreditCard or DebitChecking programs. To compile and run the programs on these different systems, PayBill needs to reference the external procedures PayCreditCard and DebitChecking. This is done by writing an interface definition for each program to be called (in this case, PayCreditCard and DebitChecking).

Interface Definitions

An interface definition specifies the name and type of the program and its parameters. It is processed by the interface compiler or stub compiler, which may be part of the programming language compiler if the latter has built-in RPC functionality. The interface compiler produces several outputs, one of which is a header file (consisting of data structures) for the caller to use. In this case the interface compiler would produce header files for PayCreditCard and DebitChecking that could be included with the PayBill program so that it can be compiled. The interface compiler also produces proxy and stub procedures, which are the programs that interface the PayBill caller to the PayCreditCard and DebitChecking servers via the network. The caller’s program is linked with a proxy and the server’s program is linked with a stub. The interface compiler produces both the header files and the proxy and stub procedures. Figure 2.10 illustrates the interface compiler operation.

Figure 2.10. Interface Compiler Operation. The interface compiler produces header files for the caller and callee to use, and proxy and stub procedures that provide an interface between the caller and callee and the underlying network.


Marshaling

Another function of the proxy and stub procedures is to lay out the procedure name and parameters into a stream, which can be sent in a message. This is called marshaling.

Some care is needed to avoid marshaling too much information, such as repeatedly copying and sending the same object class information. In addition, it is sometimes hard to maintain identity when sending items of a type. For example, Java enumerations don’t maintain identity over RPC.

As part of marshaling parameters, the proxy can translate them between the format of the caller and the callee. In the previous examples, all the programs were written using the same language, but that needn’t be the case. The PayCreditCard and DebitChecking programs might have been written some time ago in one language, whereas the PayBill program was added later to introduce the new service and was written in a different language. In this case the client proxy translates the parameters into a standard format that the callee can understand, and the server stub translates that into the appropriate format for the procedures called PayCreditCard and DebitChecking.
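A minimal sketch of marshaling follows. It assumes JSON as the standard format, which is our choice for illustration only; the function names are hypothetical.

```python
import json

def marshal_call(proc_name, *args):
    """Lay out the procedure name and parameters as a byte stream."""
    return json.dumps({"proc": proc_name, "args": list(args)}).encode("utf-8")

def unmarshal_call(packet):
    """Recover the procedure name and parameter list from the stream."""
    message = json.loads(packet.decode("utf-8"))
    return message["proc"], message["args"]
```

Because the packet is language-neutral text, a stub written in a different language from the proxy could decode it into that language's own integer and string types.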

Communication Binding

Besides linking in the proxy and stub, there is the issue of creating a communication binding between these programs so they can communicate over the network. The runtime system has to know where each server process exists (e.g., PayCreditCard and DebitChecking), so it can create bindings to each server process when asked (e.g., by PayBill). Two activities are involved:

  • Each server program must export or publish its interface, to tell all the systems on the network that it supports this interface. It must also tell where on the network it can be found.

  • When the PayBill program wants to connect to the server, it must create a communications connection using that information exported by the server.

These activities are ordinarily supported by a registry service. For a Web Service, its interface is typically contained within a Web Services Description Language (WSDL) file that can be retrieved from a registry. A registry is used to store and retrieve the interface information and is accessible from any computer in the distributed system. For example, when the PayCreditCard program is initialized in a process, its location can be written to the registry service (step 1 in Figure 2.11). This location could be “process 17315” of network node 32.143, URL www.xyz.net (which is defined in the WSDL file). When the PayBill program asks to connect to the PayCreditCard program, it calls the registry service (one of the RPC runtime calls mentioned earlier) to find out where PayCreditCard is located (step 2). The registry service returns the instances of PayCreditCard it knows about (in this case, there is one). If there are any running, PayBill may connect to any one of them (step 3). Some implementations of communication bindings automate server selection to balance the load across multiple identical servers. Having received the network address of the server process (in this case 32.143.17315), the PayBill process can now communicate with the server, so it can issue RPCs to PayCreditCard.

Figure 2.11. Using a Registry Service. When it’s initialized, the server stores its name and address in the registry service. Later, the client gets the server’s address and uses it to create a communication binding.
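The three steps of Figure 2.11 can be sketched with an in-memory registry. The class and function names below are hypothetical, the registry is a plain dictionary rather than a network service, and random choice among instances is only one possible selection policy.

```python
import random

class Registry:
    """Maps an interface name to the addresses of servers that export it."""

    def __init__(self):
        self._instances = {}

    def export(self, interface, address):
        """Step 1: a server publishes its interface and location."""
        self._instances.setdefault(interface, []).append(address)

    def withdraw(self, interface, address):
        """Remove a server, for example after it fails or is moved."""
        self._instances.get(interface, []).remove(address)

    def lookup(self, interface):
        """Step 2: return every known instance of the interface."""
        return list(self._instances.get(interface, []))

def bind(registry, interface, choose=random.choice):
    """Step 3: pick one running instance to connect to."""
    instances = registry.lookup(interface)
    if not instances:
        raise LookupError(f"no running server exports {interface}")
    return choose(instances)
```

Passing a different `choose` function is one way a runtime could automate server selection to balance load across identical servers.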


Mapping interface or server names into network addresses has to be a dynamic function, to support periodic reconfiguration. For example, if a server on one system fails, and the system manager recreates that server on another system, the mapping needs to be updated to reflect the new location of the server. The system manager may also want to move servers around to rebalance the load across servers, for example due to changing input patterns.

The registry that supports the binding activity needs to be accessible from all machines in the distributed system. This functionality ordinarily is supported by a network directory service, usually by replicating its contents on many servers. For this reason, registries are often implemented on top of a network directory. For good performance, the network directory provides a client layer that caches recently accessed information. The client usually has connections to multiple directory services, so it can quickly switch between them if one fails.

Instead of using a replicated repository, a simpler primary-copy approach may be supported. In this approach, a central copy of the repository is maintained, and each system keeps a cached copy that is periodically refreshed. This arrangement gives fast access to the cached mapping during normal operation. When a reconfiguration requires that the central copy be updated, the central copy must notify the other systems to refresh their caches.

Much of this work is done by the RPC runtime system, but some may be exposed to the application. For example, the application may have to issue calls to get the network address and create a communications binding. Most systems hide this. A distinguishing feature among different implementations of RPC is how much of this complexity the application programmer has to cope with.

Dispatching

When an RPC call arrives at the target system, the RPC runtime library needs to invoke the designated server process. If the multithreaded process or server pool doesn’t exist, then the runtime creates it. If the server is a multithreaded process, then the runtime needs to assign the call to a thread. It can create a new thread to process the call, assign the call to an existing thread, or put the call packet on a queue (e.g., if the process is already executing its maximum allowable number of active threads). If a server pool is used, then it assigns the call to a server process, or if all server processes are busy it enqueues the request.
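For the multithreaded case, the dispatching policy can be approximated with Python's standard thread pool, which starts threads on demand up to a limit and queues calls beyond that limit. The `Dispatcher` class is a hypothetical sketch, not a description of any particular RPC runtime.

```python
from concurrent.futures import ThreadPoolExecutor

class Dispatcher:
    """Assigns each incoming call to a thread, queueing when all are busy."""

    def __init__(self, procedures, max_threads=4):
        self._procedures = procedures
        self._max_threads = max_threads
        self._pool = None  # created when the first call arrives

    def dispatch(self, proc_name, *args):
        """Hand the call to the pool and return a future for its result."""
        if self._pool is None:
            self._pool = ThreadPoolExecutor(max_workers=self._max_threads)
        # submit() runs the call on an idle thread, starts a new thread if
        # fewer than max_threads exist, or leaves the call on an internal queue.
        return self._pool.submit(self._procedures[proc_name], *args)

    def shutdown(self):
        """Wait for outstanding calls and release the threads."""
        if self._pool is not None:
            self._pool.shutdown(wait=True)
```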

Application Programmer’s View

Although the RPC style does simplify some aspects of application programming, it may also introduce some new complexities. First, to write these programs, one may have to write interface definitions for the servers. This is a new programming task that isn’t needed in the single-process case.

Second, to support synchronous waiting by the caller, one needs a multithreaded client so that blocking a caller doesn’t stall the client process. Programmers find it challenging to write thread-safe applications for multithreaded servers. Program-level locking problems slow throughput, consume processor cycles, or worse: a single memory corruption can stop many threads. As the number of available processor cores is projected to increase dramatically in the coming years, finding ways to simplify thread-safe programming is a hot research topic in computer science.

Third, the client and server programs need startup code to connect up or bind the programs together before they first communicate. This includes importing and exporting interfaces, defining security characteristics, setting up communication sessions, and so on. Although much of this can be hidden, sometimes a lot of it isn’t. Finally, communication failures generate some new kinds of exceptions, such as a return message that never shows up because of a communications or server failure. Such exceptions don’t arise in the sequential case when the programs are running inside of the same process.

Object-Oriented RPC

In an object-oriented programming model, procedures are defined as methods of classes. There are two types of methods, class methods and object methods. A class method is invoked on the class itself, such as the method new, which creates an object (i.e., instance) of the class. Most methods are object methods, which are invoked on an object of the class, not the class itself. For example, the procedures in Figure 2.9 could be defined as object methods of three classes: PayBill as a method of the Billing class, PayCreditCard as a method of the CreditCard class, and DebitChecking as a method of the CheckingAccount class. (Class definitions are not shown in Figure 2.9.)

To invoke an object method, the caller uses a reference (i.e., a binding) to the object. This could be created by the caller when it invokes the method new. If the class is remote, then this invocation of new is itself an RPC, which returns a reference to a new object of the remote class. The object lives in the remote class, while the reference is local to the caller. The reference is thus a local surrogate for the remote object. The caller can now invoke an object method on the surrogate, which the caller’s runtime system recognizes as an RPC to the real object that resides in the remote class.

As an optimization, the invocation of the method new usually is executed locally in the caller’s process by creating the surrogate and not yet calling the method new on the remote class. When the caller invokes an object method on the newly created object for the first time, the caller’s runtime system sends both the invocation of the method new and the object method in a single message to the remote class. This saves a message round-trip between the caller and the remote class. Since the only thing that the caller can do with the newly created object is to invoke methods on the object, there’s no loss of functionality in grouping the remote invocation of the method new with the first invocation of a method on it.

A remote object may need to live across multiple object method calls, so that the object can retain state information that is accessible to later invocations of the object’s methods. For example, the first invocation of an object could invoke an ExecuteQuery method, which executes an SQL query. Later invocations of the object could invoke a GetNext method, each of which returns the next few rows that are in the result of that query. Other examples of retained state are discussed in Section 2.5.
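The surrogate idea, the deferred invocation of new, and state retained across calls can be illustrated together. In this sketch `RemoteClass` stands in for the remote side and simply counts the messages it receives; `Surrogate` is the local reference. Both names and the tuple-based message layout are hypothetical.

```python
class RemoteClass:
    """Stand-in for the remote side; holds the real objects."""

    def __init__(self, cls):
        self._cls = cls
        self._objects = {}
        self.messages = 0  # number of messages received so far

    def handle(self, operations):
        """Process all operations that arrived in one message."""
        self.messages += 1
        result = None
        for op, obj_id, name, args in operations:
            if op == "new":
                self._objects[obj_id] = self._cls(*args)
            else:
                result = getattr(self._objects[obj_id], name)(*args)
        return result

class Surrogate:
    """Local reference that defers the remote new until the first method call."""

    _next_id = 0

    def __init__(self, remote, *ctor_args):
        Surrogate._next_id += 1
        self._remote = remote
        self._id = Surrogate._next_id
        self._pending = [("new", self._id, None, ctor_args)]  # not sent yet

    def invoke(self, name, *args):
        """Send any pending new together with this method call."""
        batch = self._pending + [("call", self._id, name, args)]
        self._pending = []
        return self._remote.handle(batch)
```

Creating a `Surrogate` sends nothing. The first `invoke` carries both the creation and the method call in one message, and later calls reach the same remote object, so its state persists between them.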

Callbacks

A callback enables the callee of an RPC to invoke the caller. The caller of the RPC includes a so-called context handle as a parameter to the RPC. The callee can use the context handle to call back to the caller. One use of callbacks is to pass along a large parameter from caller to callee a chunk at a time. That is, instead of sending the large parameter in the original RPC to the callee, the caller sends a context handle. The callee can use this context handle to call back to the caller to get a chunk of the parameter. It executes multiple callbacks until it has received the entire large parameter.

The context handle passed in a callback could be an object. In a sense, a callback is an object-oriented RPC in reverse; it is the RPC callee that holds a reference to the caller, rather than having the caller hold a reference to the callee.
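The chunk-at-a-time use of callbacks might look like the following sketch, in which the context handle is an object held by the caller. The names `ChunkSource` and `store_document` are hypothetical, and the callback is an ordinary method call rather than a real RPC.

```python
class ChunkSource:
    """Caller-side object that the callee reaches through the context handle."""

    def __init__(self, data, chunk_size):
        self._data = data
        self._chunk_size = chunk_size

    def get_chunk(self, index):
        """Return chunk number `index`, or empty bytes past the end."""
        start = index * self._chunk_size
        return self._data[start:start + self._chunk_size]

def store_document(context_handle):
    """Callee: pull the large parameter from the caller one chunk at a time."""
    received = bytearray()
    index = 0
    while True:
        chunk = context_handle.get_chunk(index)  # a callback to the caller
        if not chunk:
            break
        received.extend(chunk)
        index += 1
    return len(received), index  # bytes received, callbacks that returned data
```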

An RPC Walkthrough

Now that we have explained the main components of an RPC system, let’s walk through an example to see what happens, beginning to end. In Figure 2.12, the client application calls the server application. The client application could be the PayBill program, for example, and the server application could be PayCreditCard. As we discussed, there are proxy and stub programs and a runtime system along the path.

Figure 2.12. RPC Implementation. The numbers indicate the sequence of actions to process a call from the client to the PayCreditCard server program.

The client application issues a call to the server, say PayCreditCard. This “Call PayCreditCard” statement actually calls the client’s PayCreditCard proxy (1). The proxy is a procedure with the same interface as the server application; it looks exactly like PayCreditCard to the client. Of course the PayCreditCard proxy doesn’t actually do the work. All it does is send a message to the server.

The PayCreditCard proxy marshals the parameters of PayCreditCard into a packet (2). It then calls the communications runtime for the RPC, which sends the packet as a message to the server process (3).

The RPC runtime creates a communications binding between the processes and adds it as a parameter to subsequent send and receive operations. The client’s RPC runtime sends each message to the server’s RPC runtime. The server’s RPC runtime contains a binding of message types to processes and procedures within them and uses it to direct each message to the right procedure.

The server process’s RPC runtime system receives the message (4). It looks at the packet’s header and sees that this is a call to the PayCreditCard program, so it calls the PayCreditCard server stub. The server stub unmarshals the arguments and performs an ordinary local procedure call to the PayCreditCard program (5). The PayCreditCard program takes the call and runs just as if it had been called by a local caller instead of a remote caller (6).

When PayCreditCard completes, the whole mechanism runs in reverse: PayCreditCard does a return operation to the program that called it. From PayCreditCard’s viewpoint, that’s the server stub. When it returns to the server stub, it passes a return value and perhaps some output parameters. The server stub marshals those values into a packet and passes them back to the RPC runtime system, which sends a message back to the caller.

The caller’s system receives the packet and hands it to the correct process. The process’s RPC runtime returns to the correct proxy for this call, which unmarshals the results and passes them back as part of its return statement to the original PayCreditCard call, the client’s call.
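The whole path can be traced in a toy, single-process model. The sketch below is an illustration under several assumptions: JSON is the packet format, the two runtimes talk through a direct method call instead of a network, and the server application returns a made-up balance. The numbers in the comments refer to the steps of the walkthrough.

```python
import json

def pay_credit_card(card_no):
    """Server application (6): stand-in that returns an assumed balance owed."""
    return {1001: 250}.get(card_no, 0)

def server_stub(packet):
    """Unmarshal the arguments, make a local call (5), marshal the result."""
    args = json.loads(packet)["args"]
    return json.dumps({"result": pay_credit_card(*args)})

class ServerRuntime:
    """Receives each message and directs it to the right stub (4)."""

    def __init__(self):
        self.stubs = {"PayCreditCard": server_stub}

    def receive(self, packet):
        proc = json.loads(packet)["proc"]  # inspect the packet header
        return self.stubs[proc](packet)

class ClientRuntime:
    """Sends the packet to the server's runtime (3); here a direct call."""

    def __init__(self, server_runtime):
        self._server = server_runtime

    def send(self, packet):
        return self._server.receive(packet)

def make_proxy(runtime, proc):
    """Build a proxy that is called like the server procedure (1)."""
    def proxy(*args):
        packet = json.dumps({"proc": proc, "args": list(args)})  # marshal (2)
        reply = runtime.send(packet)
        return json.loads(reply)["result"]  # unmarshal the return value
    return proxy

pay_credit_card_proxy = make_proxy(ClientRuntime(ServerRuntime()), "PayCreditCard")
```

To the client, `pay_credit_card_proxy(1001)` looks like a local call to `pay_credit_card`, even though every argument and result passes through a marshaled packet.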

System Characteristics of RPC

An RPC system needs to be engineered for security, fault tolerance, performance, and manageability. Some RPC systems are engineered specifically for interoperability across multiple programming languages, data formats, and operating systems. We discuss these system issues in the following subsections.

Security of RPC

When a client binds to a server, the client first calls the runtime system to find the server’s address and to create a communications binding to the server. A secure gatekeeper is needed to control the creation of these bindings, since not all clients should be able to connect to any server for any purpose. As an extreme example, it shouldn’t be possible for any workstation to declare itself the network-wide electronic mail server, since it would allow the workstation to eavesdrop on everyone’s mail.

In general, when a client connects to a server, it wants to know who it is actually talking to, that is, that the server is who it says it is. Moreover, the server wants to authenticate the client, to be sure the client is who it claims to be. This requires authentication; that is, a secure way to establish the identity of a system, a user, a machine, and so forth. Thus, when binding takes place, the runtime system should authenticate the names of the client and the server (see Figure 2.13). This ensures, for example, that the server can prove that it really is the mail server, and the client can prove that it’s really a client that’s allowed to connect to this server.

Figure 2.13. RPC Security. The communication system authenticates the client and server when it creates a communication binding between them (in 1 and 2). The server checks the client’s authorization on subsequent calls for service (in 3).


Having authenticated the client, the server still needs to exercise access control; that is, to check whether a client is authorized to use the procedure. Access control is entirely up to the server. The server’s transactional middleware or operating system may help by offering operations to maintain a list of authorized clients, called an access control list. But it’s up to the server to check the access control list before doing work on behalf of a client.
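The server-side check can be sketched as follows. The sketch assumes the client name has already been authenticated at binding time, and the `Server` and `AccessDenied` names are hypothetical.

```python
class AccessDenied(Exception):
    """The authenticated client is not authorized for this procedure."""

class Server:
    def __init__(self, acl):
        # Access control list: procedure name -> set of authorized client names.
        self._acl = acl

    def call(self, client, proc, func, *args):
        """Check the access control list before doing any work."""
        if client not in self._acl.get(proc, set()):
            raise AccessDenied(f"{client} may not call {proc}")
        return func(*args)
```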

Fault Tolerance in RPC

A common fault tolerance problem is determining what a program should do if it issues an operation but doesn’t get a reply that tells whether the operation executed correctly. We saw an example of this in Section 1.3, Handling Real-World Operations, in dealing with a missing reply to a request to dispense $100 from an ATM. This problem also arises in RPC when a client issues a call and does not receive a reply. The key question is whether it is safe to retry the operation.

Suppose a client calls a server that processes the call by updating a database, such as the DebitChecking program in Figure 2.9. If the client does not receive a return message, it’s not safe to try the call again, since it’s possible that the original call executed, but the return message got lost. Calling DebitChecking again would debit the account a second time, which is not the desired outcome.

The property that says it is safe to retry is called idempotence. An operation is idempotent if any number of executions of the operation has the same effect as one execution. In general, queries are idempotent: it doesn’t matter how many times you call, you always get back the same answer (if there are no intervening updates) and there are no side effects. Most update operations are not idempotent. For example, DebitChecking is not idempotent because executing it twice has a different effect than executing it just once.

A server is idempotent if all the operations it supports are idempotent. It is useful if a server declares that it is idempotent (e.g., its operations are all queries). The RPC runtime system learns that fact when it creates a binding to the server. In this case, if the client RPC runtime sends a call but does not receive a reply, it can try to call again and hope that the second call gets through. If the server is not idempotent, however, it’s not safe to retry the call. In this case, the client could send a control message that says “Are you there?” or “Have you processed my previous message?” but it can’t actually send the call a second time, since it might end up executing the call twice.

Even if it resends calls (to an idempotent server) or it sends many “Are you there?” messages (to a non-idempotent server), the caller might never receive a reply. Eventually, the RPC runtime will give up waiting and return an exception to the caller. The caller cannot tell whether the call executed or not. It just knows that it didn’t receive a reply from the server. It’s possible that a server will reply later, after the RPC runtime returns an exception. At this point, it’s too late to do anything useful with the reply message, so the RPC runtime simply discards it.

Looking at the issue a bit more abstractly, the goal is to execute an idempotent operation at least once and to execute a non-idempotent operation at most once. Often, the goal is to execute the operation exactly once. Transactions can help. A call executes exactly once if the server is declared non-idempotent and the RPC executes within a transaction that ultimately commits. We will explore exactly-once behavior further in Chapter 4.
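The retry rule can be written down as a small client-side policy. In this sketch, which is ours and not taken from any RPC product, `send` is an assumed function that raises Python's built-in `TimeoutError` when no reply arrives. A call to an idempotent server is resent up to a limit, while a call to a non-idempotent server is sent only once. The “Are you there?” control messages are omitted.

```python
class NoReply(Exception):
    """The runtime gave up waiting; the call may or may not have executed."""

def call_with_retry(send, request, idempotent, max_attempts=3):
    """Resend only when the server has declared itself idempotent."""
    attempts = max_attempts if idempotent else 1
    for _ in range(attempts):
        try:
            return send(request)
        except TimeoutError:
            continue  # no reply to this attempt
    raise NoReply(request)
```

With `idempotent=True` the call may execute more than once on the server, which is harmless by definition. With `idempotent=False` it executes at most once, and the caller must handle `NoReply` without knowing whether it ran.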

System Management

We’ve discussed RPC assuming that both the client and server process are already up and running, but of course somebody has to make all this happen to begin with. These are system management activities: to create client and server processes and communications sessions to support RPC bindings. Sometimes these are dynamic functions that are part of the RPC system. In TP systems, they are usually static functions that are part of initializing the application, done in the transactional middleware.

The system manager also has to track the behavior of the system. This requires software to monitor all the low-level system components and make them visible with abstractions that are intelligible to the system manager. For example, if someone calls the Help Desk saying, “I can’t run transactions from my PC,” the system manager has to check, among other things, whether the PC is communicating with the server, whether the server processes are running, whether the client and server are running compatible versions of the proxy and stub, and so on. Similarly, if there are performance problems, the system manager has to track the message load for each of the systems, determine whether the server has enough threads to run all the incoming calls, and so on.

Interoperability of RPC

In the example, suppose that the client and server applications use different programming languages with different data formats. In that case, the client proxy and the server stub need to translate the parameters between the client’s and server’s format. There are two ways to do this:

  • Put the parameters into a standard, canonical format that every server knows how to interpret.

  • Ensure that the server’s stub can interpret the client’s format, known as receiver-makes-it-right.

Canonical forms include XML Schema, CDR (used in RMI/IIOP), and XDR (used in the Sun RPC). When using a canonical format, the client proxy translates the parameters into the standard format, the server translates them out of standard format, and likewise for the return parameters: the server stub puts them into standard format and the client proxy puts them back into client format.

This is fine if the client and server are running different languages, but what if they’re running the same language? For example, suppose they’re both using Java or C#. The client proxy is going through all the extra work of taking the data out of Java format and putting it into standard format, and then the server is taking it out of standard format and putting it back into Java format. For this reason, the receiver-makes-it-right technique often is used. The client proxy marshals the parameters in the client’s format, not in a standard format, and tags them with the name of the format it’s using. When the receiver gets the parameters, if it sees that they’re in the same format that the server is using, it just passes them unmodified to the server. However, if they’re not in the right format, it does the translation, either via a standard format or directly into the target format. This saves the translation expense in many calls, but requires the server to support format translations for every format it might see as input.

Even when the client and server are running the same language in the same execution environment, some machine-dependent translation may be required. This arises because there are two different ways of laying out bytes in words in computer memory, sometimes called little-endian and big-endian. The difference is whether the bytes are laid out in increasing addresses starting with the least-significant byte (little-endian) or most-significant byte (big-endian) within the word. In other words, the question is whether the low-order byte is in the first or last position of the word. (Intel and compatible processors use little-endian. Motorola, PowerPC, SPARC, and Java wire format use big-endian. ARM and some PowerPC and SPARC processors are switchable.) When moving packets between systems, it may be necessary to translate between little-endian and big-endian format, even if both systems are running the same implementation of the same language. Again this can be hidden by the proxies and stubs using one of the parameter translation mechanisms.
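Byte order gives a compact way to illustrate receiver-makes-it-right. The sketch below uses Python's standard struct module, where the prefix `<` selects little-endian and `>` selects big-endian layout. The format tags `le32` and `be32` and the function names are hypothetical.

```python
import struct

# Hypothetical format tags mapped to struct layouts for a 32-bit integer.
FORMATS = {"le32": "<i", "be32": ">i"}

def marshal(value, native_format):
    """Sender: ship the value in its own format, tagged with the format name."""
    return native_format, struct.pack(FORMATS[native_format], value)

def receive(tagged, native_format):
    """Receiver: pass the bytes through if formats match, else translate."""
    tag, payload = tagged
    if tag == native_format:
        return payload, False  # no translation was needed
    value = struct.unpack(FORMATS[tag], payload)[0]
    return struct.pack(FORMATS[native_format], value), True
```

The integer 1 is laid out as bytes `01 00 00 00` in little-endian form and `00 00 00 01` in big-endian form. When both sides use the same layout, `receive` returns the payload untouched, which is the saving that motivates the technique.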

Performance of RPC

RPC is a heavily used mechanism when a TP system is distributed. Each transaction that’s split between two TP systems, such as between a client PC and a server back-end, needs at least one RPC to send the request and return the reply. It’s very important that this executes quickly. If it isn’t very fast, people will avoid using it, which completely defeats its purpose.

There are basically three parts to the execution, which were illustrated in Figure 2.12. One is the proxy and stub programs that marshal and unmarshal parameters. The second is the RPC runtime and communications software, which passes packets between the stub and the network hardware. And then there’s the network transfer itself, which physically passes the messages through the communications hardware and over the wire to the other system.

In most RPC systems, the time spent performing a call is evenly split among these three activities, all of which are somewhat slow. In a local area network, the overall performance is typically in the range of about 10,000 to 15,000 machine-language instructions per remote procedure call, which is several hundred times slower than a local procedure call. So it’s very important to optimize this. There are lower-functionality research implementations in the 1500 to 2000 instruction range. For web services that rely on text-based data formats, such as XML, performance is typically even slower. Techniques to make the system run faster include avoiding extra acknowledgment messages, using the receiver-makes-it-right technique to make the proxies and stubs faster, optimizing for the case where all the parameters fit in one packet to avoid extra control information and extra packets, optimizing the case where client and server processes are on the same machine to avoid the full cost of a context switch, and speeding up the network protocol.
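As a rough back-of-the-envelope check on these figures, the following Python sketch works out what an even three-way split and the research implementations imply. It uses only the numbers quoted above and treats the even split as exact, which is an approximation; the function names are illustrative.

```python
# Figures quoted in the text: machine-language instructions per RPC on a LAN.
TYPICAL = (10_000, 15_000)
RESEARCH = (1_500, 2_000)
STAGES = (
    "proxy and stub marshaling",
    "RPC runtime and communications software",
    "network transfer",
)


def per_stage(total, stages=STAGES):
    """Cost of each stage if the total is split evenly among the stages."""
    share = total / len(stages)
    return {name: share for name in stages}


def speedup_range(typical=TYPICAL, research=RESEARCH):
    """Smallest and largest ratio of typical cost to research-system cost."""
    return typical[0] / research[1], typical[1] / research[0]


low_split = per_stage(TYPICAL[0])
high_split = per_stage(TYPICAL[1])
slowest_gain, fastest_gain = speedup_range()
```

Under these assumptions each stage accounts for roughly 3,300 to 5,000 instructions, and the research implementations are about 5 to 10 times cheaper per call than typical systems.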

How to Compare RPC Systems

RPC has become a standard feature of distributed computing systems, whether or not those systems run transactions. For example, Microsoft’s Windows operating systems and Linux support RPC as a built-in function. Getting RPC integrated with transactions often requires using some transactional middleware. Many operating systems have some of this integration built in. This appeared first in Tandem’s Guardian operating system and then in Digital’s OpenVMS (both now part of HP).

When shopping for a transactional middleware product, simply knowing that it supports RPC, or even RPC with transactions, is not enough. You really have to go to the next layer of detail to understand the exact programming model and how difficult it is to write programs. Some of these interfaces are low-level and hard to program, whereas others are high-level and relatively easy to program.

One thing to look for when evaluating RPC systems is which languages and data types are supported. For example, some systems support only a generic proxy and stub procedure, which requires the application program to marshal parameters. Most proxies and stubs are unable to translate complex data structures such as an array. Or they may handle it as a parameter, but only for a certain language. Bulk data transfer, for example scrolling through a long table a portion at a time, is difficult using some RPC systems.

Another issue is whether transactional RPC is supported. If so, what types of context are transparently propagated and what types are the application programmer’s responsibility? The types of context might include user context, device context, security context, file or database context, and of course transaction context.

Popular RPC implementations include the Remote Method Invocation (RMI) in Java, the Internet Inter-ORB Protocol (IIOP) from CORBA, and the Microsoft RPC on Windows. RMI, IIOP, and Microsoft RPC closely follow the concepts and implement the mechanisms described in the previous sections.