[Bernstein09] Section 5.4. Transactional Properties

5.4. Transactional Properties
Although it is tempting to execute all the steps of a business process within one transaction, the vast majority of business processes require the execution of more than one transaction. There are many reasons for this, such as the following:
Resource availability: At the time the request to execute the business process is taken as input, only some of the people or systems that are necessary to execute the request may be available. For example, when a customer submits an order, it is immediately stored in the order processing database. But if the request arrives after normal business hours, there may be no one to process it until the next business day. As another example, one step in processing an expense claim may be getting a manager’s approval, but the manager only sets aside time to approve claims twice a week.
Real-world constraints: Processing an automobile insurance claim may require the customer to bring in the car for damage inspection and get two estimates for the cost of the repair. This could take weeks.
System constraints: When executing a money transfer between two banking systems (e.g., to automatically pay a credit card bill from a checking account), the two systems might not run compatible transaction protocols, such as two-phase commit, or be available at the same time. The transfer therefore has to run as multiple independent transactions on each system.
Function encapsulation: Different business functions are managed independently by different departments. For example, in order processing, inventory management is done in manufacturing, scheduling a shipment is done by the field service group, commission reporting is done in the sales system, and credit approval is done by the finance department. Decomposing a workflow request into steps that are processed by these separate systems or by separate reusable services in an SOA is more intellectually and organizationally manageable than designing it to run as one big transaction.
Resource contention: A long-running transaction usually holds resources, such as a lock on data or a communications device. Contention for the resource thereby slows down other transactions trying to use the resource. What starts as a performance problem, due to resource contention, may turn into an availability problem, since whole groups of transactions may be unable to run until the long-running transaction gives up its resources. For example, a money transfer between two banks could take a long time to run, because the banks are connected by slow or intermittent communication. For this reason, the operation normally runs as (at least) two transactions: one on the source system, to debit the money from the source account; and then some time later, a second one on the target system to credit the money to the target account.
So far in this book, we have assumed that each user request can be satisfied by the execution of a single transaction. When queuing is used for better availability and load balancing, we added transactions that read from and write to queues to move the request around. However, even in this case, only one transaction did the application-oriented work that was requested.
This assumption breaks down for multistep business processes. One of the most important runtime requirements of business processes is that they do not have to execute as a single transaction. Once you split the execution of a request into multiple transactions, you no longer necessarily get the benefits of a single transaction: atomicity, isolation, and durability. Let’s look at how these properties might break and what can be done about it.
Isolation
Consider a money transfer operation as an example, debiting $100 from account A and then crediting that $100 to account B at another bank. If these run as separate transactions, then the money transfer request is not isolated from other transactions. For example, somebody could perform an audit of the two banks while the money is in flight, that is, after it is debited from account A and before it is credited to account B. If an auditor reads those accounts, it would look like $100 had disappeared. Thus, if the audit and money transfer are considered to be “transactions,” they are not serializable; no serial execution of the audit and money transfer could result in the audit seeing the partial result of a transfer.
Of course, running the money transfer as one transaction would eliminate the problem. But as explained earlier, there are many reasons why this may not be possible or desirable. Therefore, in contrast to single-transaction requests, multitransaction business processes require special attention to the isolation problem.
The isolation problem of a multitransaction business process usually requires application-specific solutions. For example, the bank audit program must have logic that can deal with in-flight money transfers. An alternative general-purpose solution is to lock data for the duration of the business process. However, for long-running business processes, this creates major resource contention, which is usually unacceptable.
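As a rough illustration of application-specific audit logic, the sketch below assumes the transfer application keeps a record of amounts that have been debited but not yet credited, so that an audit can add them back in. The names `Bank`, `InFlightLedger`, `debit_step`, `credit_step`, `naive_total`, and `audit_total` are hypothetical, and in-memory dictionaries stand in for the two banks' databases.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Bank:
    balances: Dict[str, int]


@dataclass
class InFlightLedger:
    # Hypothetical record of transfers debited at the source but not yet credited.
    amounts: Dict[str, int] = field(default_factory=dict)


def debit_step(bank: Bank, ledger: InFlightLedger, transfer_id: str,
               account: str, amount: int) -> None:
    # First transaction: debit the source account and note the money as in flight.
    bank.balances[account] -= amount
    ledger.amounts[transfer_id] = amount


def credit_step(bank: Bank, ledger: InFlightLedger, transfer_id: str,
                account: str) -> None:
    # Second transaction: credit the target account and clear the in-flight record.
    bank.balances[account] += ledger.amounts.pop(transfer_id)


def naive_total(banks: List[Bank]) -> int:
    # An audit that only sums account balances.
    return sum(sum(b.balances.values()) for b in banks)


def audit_total(banks: List[Bank], ledger: InFlightLedger) -> int:
    # An audit that also counts money currently in flight.
    return naive_total(banks) + sum(ledger.amounts.values())


bank_a = Bank({"A": 500})
bank_b = Bank({"B": 200})
ledger = InFlightLedger()

debit_step(bank_a, ledger, "t1", "A", 100)
mid_naive = naive_total([bank_a, bank_b])            # $100 appears to be missing
mid_audit = audit_total([bank_a, bank_b], ledger)    # in-flight money is counted
credit_step(bank_b, ledger, "t1", "B")
final_audit = audit_total([bank_a, bank_b], ledger)
```

In a real deployment the in-flight record would have to be updated within the same transactions as the debit and the credit, and the audit would need a consistent read of it; the sketch ignores both concerns.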
Atomicity
In the money transfer example earlier, suppose there is a failure after committing the first transaction that debits account A. This could be a failure of the business process’s application code or of the system that is running that code. In either case, as a result of this failure, the first bank’s message to tell the second bank to credit account B may have been lost. If this occurs, then the second transaction to credit account B will never execute. Thus, the money transfer is not all-or-nothing.
Any automated solution to this problem must include maintaining the state of the business process, that is, which steps of the business process did and did not execute. The mechanism will need this state after the recovery from the failure that caused the business process to stop prematurely. Therefore, as we noted earlier, this state should be kept in persistent storage, such as a disk. If the state is maintained in persistent storage, then it will be available after recovery from the failure even if the failure was caused by a system failure in which the content of main memory was lost.
Given that the state of each business process is maintained persistently, a recovery mechanism can address the atomicity problem by periodically polling that state to determine whether to initiate recovery. If the recovery mechanism finds a business process that has remained in the same state for longer than the process’s predefined timeout period, then it can initiate recovery.
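The polling check described above can be sketched roughly as follows. This is a minimal illustration, not a description of any particular product: the record layout `ProcessState`, the function `find_stalled`, and the convention that a finished process is in the state `"done"` are all assumptions, and a list in memory stands in for the persistent state store.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class ProcessState:
    process_id: str
    current_step: str   # hypothetical state names, e.g. "debited" or "done"
    entered_at: float   # time (in seconds) at which the process entered current_step
    timeout: float      # the process's predefined timeout period, in seconds


def find_stalled(states: Iterable[ProcessState], now: float) -> List[str]:
    """Return ids of unfinished processes that stayed in one state past their timeout."""
    return [
        s.process_id
        for s in states
        if s.current_step != "done" and now - s.entered_at > s.timeout
    ]


states = [
    ProcessState("p1", "debited", entered_at=0.0, timeout=60.0),   # stuck for 100 s
    ProcessState("p2", "debited", entered_at=90.0, timeout=60.0),  # only 10 s so far
    ProcessState("p3", "done", entered_at=0.0, timeout=60.0),      # already finished
]
stalled = find_stalled(states, now=100.0)
```

A recovery mechanism would run such a check periodically and initiate recovery for each id it returns.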
One way that a recovery mechanism can repair a stalled business process is to run a compensating transaction for each of the steps of the business process that have already executed. This approach requires that for every step of a business process, the application programmer writes code for a compensating transaction that reverses the effect of the forward execution of the step. So in the money transfer example, the first transaction, which debits $100 from account A, has an associated compensating transaction that puts the money back into account A. If the system is unable to run the second transaction, which credits account B, it can run a compensation for the first transaction that debited account A. A compensating transaction may not be needed for the last step if the successful completion of that step ensures that the entire business process has completed successfully.
Some systems include a general-purpose recovery mechanism to implement this approach, for example as part of the transactional middleware. For each active business process, the transactional middleware keeps track of the sequence of transactions that have run. During its forward execution, each transaction saves all the information that is needed to allow its compensating transaction to be invoked at recovery time. For example, it might save the name of the program that implements the compensating transaction and the parameter values that should be used to invoke that program. If the recovery mechanism detects that the business process is unable to finish, then it runs compensations for all the transactions that committed and thereby brings the system back to its initial state (see Figure 5.2). Thus, it automates the execution of those compensations. This is called a saga: a sequence of transactions that either runs to completion or that runs a compensating transaction for every committed transaction in the sequence.
Figure 5.2. A Saga. This saga has five steps, each of which is a transaction. Each step’s program includes a compensating transaction. Since this execution of the saga cannot proceed past step 3, it runs compensations for the three steps that did execute.
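The saga behavior can be sketched in a few lines, using the money transfer as the example. The function `run_saga` and the step functions are hypothetical names. The sketch assumes that a forward step that raises an exception has aborted and left no effects, and it runs compensations in reverse order of commitment, which is a common convention rather than something the text prescribes.

```python
from typing import Callable, List, Tuple

# Each step pairs a forward action with the compensation that reverses it.
Step = Tuple[Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> bool:
    """Run steps in order; if one fails, compensate every step that committed."""
    committed: List[Callable[[], None]] = []
    for forward, compensate in steps:
        try:
            forward()
        except Exception:
            # Assumption: the failed step aborted cleanly, so only earlier
            # steps need compensating. Undo them in reverse order.
            for undo in reversed(committed):
                undo()
            return False
        committed.append(compensate)
    return True


accounts = {"A": 500, "B": 200}


def debit_a() -> None:
    accounts["A"] -= 100


def refund_a() -> None:
    accounts["A"] += 100


def credit_b() -> None:
    # Simulate the second system being unavailable.
    raise ConnectionError("target bank unreachable")


def undo_credit_b() -> None:
    accounts["B"] -= 100


ok = run_saga([(debit_a, refund_a), (credit_b, undo_credit_b)])
```

Here the debit commits, the credit fails, and the compensation returns the money to account A. The list `committed` lives only in memory, so this sketch does not survive a crash of the saga runner itself; that is the problem the persistent state discussed in this section addresses.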

In a saga, how does the system keep track of these multiple transaction steps to ensure that at any given time it can run the compensations if the saga terminates prematurely? One possibility is to store the saga’s state in queue elements. Each transaction in the saga creates a queue element, which is a request that incorporates or references the history of the steps that have run so far. If at any point the saga can’t run the next step in the request, the system can look at that history and invoke the compensating transaction for each of the steps in the history. Because the queue elements are persistent, they can’t get lost. Even if one of the steps is aborted many times, eventually the system will recognize the fact that the saga has not completed and will run the compensating transactions for the steps in the saga that executed.
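One way to picture a queue element that carries the saga's history is sketched below. Each history entry records the name of the compensating program and the parameter values to invoke it with. The element layout, the functions `enqueue_next` and `compensate_from`, and the `registry` lookup table are assumptions made for illustration; an in-memory `deque` of JSON strings stands in for a persistent queue.

```python
import json
from collections import deque
from typing import Any, Callable, Deque, Dict


def enqueue_next(queue: Deque[str], element: Dict[str, Any], step: str,
                 compensation: str, args: Dict[str, Any]) -> None:
    """Append a new element whose history includes the step that just committed."""
    history = element["history"] + [
        {"step": step, "compensation": compensation, "args": args}
    ]
    queue.append(json.dumps({"saga_id": element["saga_id"], "history": history}))


def compensate_from(serialized: str,
                    registry: Dict[str, Callable[..., None]]) -> None:
    """Invoke the recorded compensation for each step in the history, newest first."""
    element = json.loads(serialized)
    for entry in reversed(element["history"]):
        registry[entry["compensation"]](**entry["args"])


accounts = {"A": 500}


def credit_back(account: str, amount: int) -> None:
    accounts[account] += amount


registry = {"credit_back": credit_back}  # hypothetical name-to-program lookup
queue: Deque[str] = deque()

# Step 1 commits: debit account A and enqueue the request for the next step.
accounts["A"] -= 100
enqueue_next(queue, {"saga_id": "t1", "history": []},
             "debit_A", "credit_back", {"account": "A", "amount": 100})

# The next step cannot run, so the system reads the history and compensates.
compensate_from(queue.popleft(), registry)
```

In a real system the debit and the enqueue would belong to the same transaction, so that the history can never disagree with what actually committed.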
Durability
The use of a multistep business process to implement a request does not affect the durability guarantee. The durability of a business process’s updates is ensured by the durability property of the transactions that execute those updates. If all of a business process’s updates to durable transactional resources execute in the context of transactions, then the result of those updates is durable.
As we saw in this section and the last, it is also important to maintain a durable copy of the intermediate state of a business process. This is not a requirement for transactions. The reason is that a transaction is atomic; that is, all-or-nothing. However, a multistep business process may not be atomic. To make it atomic, we need a durable copy of its intermediate states.