Clustering and Load Balancing in Tomcat 5, Part 2

by Srini Penchikala
04/14/2004
This is the second part of a series on clustering and load balancing in Tomcat 5 server. In part 1, I provided an overview of large-scale J2EE system design, as well as various factors to consider when designing the system for scalability and high availability. I also discussed Tomcat's support for clustering, load-balancing, fault-tolerance, and session-replication capabilities. In this part, we'll cover the architecture of a proposed cluster setup and go over the installation and configuration details of deploying the cluster (by running multiple Tomcat server instances).
Proposed Cluster Setup
Listed below are the main objectives that I wanted to achieve in the proposed Tomcat cluster:
The cluster should be highly scalable.
It should be fault-tolerant.
It should be dynamically configurable, meaning it should be easy to manage the cluster declaratively (by changing a configuration file) rather than programmatically (by changing Java code).
It should provide automatic cluster member discovery.
Fail-over and load-balancing features for session data with in-memory session state replication.
Pluggable/configurable load-balancing policies.
Group membership notification when a member of the cluster joins or leaves a group.
No loss of messages transmitted through multicast.
Clustering should be seamless to the web application and the server, providing both client and server transparency. Client transparency means that the client is not aware of the clustered services or how the cluster is set up; the cluster is identified and accessed as a single entity rather than as individual services. Server transparency means that the application code in a server is not aware that it's in a cluster; the application code cannot communicate with the other members of the cluster.
Four instances of Tomcat server were installed to set up the clustering environment. Tomcat was used for both the load-balancing and clustering requirements. The cluster setup was done using the vertical scaling method (multiple Tomcat server instances running on a single machine). One server group and two clones were configured in the cluster (a server group is a logical representation of the application server). The clones had exactly the same configuration (in terms of web application directory structure and contents) as the server group, to optimize session replication. The following are the main components in the proposed cluster setup:
Load Balancer: A Tomcat instance is configured to distribute the traffic among cluster nodes. This instance is given a code name TC-LB.
Clustering: Three Tomcat server instances were run as part of the cluster. The code names for these instances are TC01, TC02, and TC03.
Session Persistence: In-memory session replication was chosen as the session persistence mechanism. The session data is copied to all three cluster members whenever the session object is modified.
Fail-Over: The balancer application that comes with the Tomcat installation is not designed to handle fail-over. I wrote a utility class called ServerUtil to check the server status before forwarding any requests to it. It has two methods to verify the status of a cluster node. The first method uses McastService to check whether a specified server instance is currently running. The second method verifies a cluster node's availability by creating a URL object based on the web page URL passed in as a parameter. To use this class, make sure the catalina-cluster.jar (located in the %TOMCAT_HOME%/server/lib directory) and commons-logging-api.jar (%TOMCAT_HOME%/bin directory) files are specified in the classpath.
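A minimal sketch of the URL-based check (the second method) might look like the following. The class shape, method name, and use of HttpURLConnection are assumptions for illustration, not the article's actual code, and the McastService-based check is omitted:

```java
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical sketch of ServerUtil's URL-based availability check.
// The McastService-based check described in the article is not shown.
public class ServerUtil {

    // Returns true if the page at pageUrl answers with HTTP 200.
    public static boolean isNodeAvailable(String pageUrl) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) new URL(pageUrl).openConnection();
            conn.setConnectTimeout(2000); // fail fast if the node is down
            conn.setReadTimeout(2000);
            return conn.getResponseCode() == HttpURLConnection.HTTP_OK;
        } catch (Exception e) {
            return false; // any connection error means the node is unavailable
        }
    }
}
```

The load balancer could call a check like this with a node's test.jsp URL before forwarding a request to that node.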
The application architecture diagram in Figure 1 shows the main components of the cluster.

Figure 1. Tomcat cluster architecture diagram
Installation & Configuration of Tomcat Instances
Table 1. Hardware/software specifications of the machine used to set up Tomcat clustering.
Processor HP Pavilion Pentium III with 800 MHz
Memory 512 MB RAM
Hard Disk 40 GB
Operating System Windows 2000 server with Service Pack 4
JDK Version 1.4.0_02 (Note: JDK version 1.4 or a later version is required to enable Tomcat Clustering)
Tomcat Version 5.0.19
Tools Used Ant 1.6.1, Log4J, JMeter, JBuilder
Main elements of cluster framework
Java Classes
BaseLoadBalancingRule This is an abstract class created to encapsulate the common rules logic in the custom Rule classes. The custom load balancing rules used in the sample web application extend this base class.
RandomRedirectRule This class defines the logic to forward the web requests to an available server in a random manner. It uses current system time as seed to generate random numbers.
RoundRobinRule This class defines the load-balancing logic based on a round-robin rule. When a request comes in, it forwards the request to the next member in the list. It uses a static variable to track the next available cluster member and increments this value by one for every new request.
ServerUtil A utility class created to check whether a specific cluster node is available to receive requests. This class uses McastService (org.apache.catalina.cluster.mcast package) to detect if a cluster member has left the group.
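The round-robin selection logic described above can be sketched as follows. The selectServer method and its signature are assumptions for illustration; the real class extends BaseLoadBalancingRule and plugs into the balancer webapp's Rule API, which is not shown here:

```java
import java.util.List;

// Hypothetical sketch of the round-robin selection logic only.
public class RoundRobinRule {

    // Static counter tracking the next cluster member, as described above.
    private static int next = 0;

    // Returns the next server in the list, wrapping around at the end.
    public static synchronized String selectServer(List<String> servers) {
        String server = servers.get(next);
        next = (next + 1) % servers.size();
        return server;
    }
}
```

Because the counter is static and the method synchronized, concurrent requests each advance the rotation by exactly one member.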
The relationship of these Java classes is shown in the class diagram in Figure 2.

Figure 2. Cluster application class diagram
Configuration Files
server.xml This file is used to configure clustering in a Tomcat server instance. The version that comes with the Tomcat installation has the configuration details included in the file, but they are commented out.
web.xml This configuration file is used to specify that a web application session data needs to be replicated.
rules.xml This file is used to define the custom load-balancing rules. This is the file we use to specify which load-balancing rule we want to use to distribute the load among the cluster members.
Scripts
test.jsp A simple test JSP script was used to check the server status. It displays the system time and the name of the Tomcat instance from which it's launched.
testLB.jsp This is the startup page in our sample web application. It forwards the web requests to a load balancer filter using HTML redirect.
sessiondata.jsp This script is used to verify that no session data is lost when a cluster node goes down. It displays session details and also has HTML fields to manipulate HTTP session object.
build.xml An Ant build script was created to automate the tasks of starting and stopping Tomcat instances (the latest version of Ant 1.6.1 was used to run this script). Once a Tomcat instance has started successfully, you can call the test.jsp to verify which Tomcat instance is running by specifying IP address and port number. The JSP page displays the current system time and the name of the Tomcat instance. (You need to change the home directories for Tomcat servers specified in build.properties file to run the script in your own environment).
Several targets in the build script start or stop Tomcat instances:
Call target start.tomcat5x to start a specific Tomcat instance (for example, tomcat50).
Call stop.tomcat5x to stop a specific Tomcat instance.
Call stop.alltomcats to stop all running Tomcat instances.
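A sketch of what such start/stop targets might look like is shown below. This is not the article's actual build script; the property name tomcat50.home is an assumption about what build.properties defines, and the targets assume the Windows startup/shutdown batch files:

```xml
<!-- Hypothetical sketch: start and stop one Tomcat instance on Windows. -->
<target name="start.tomcat50">
    <exec dir="${tomcat50.home}/bin" executable="cmd" spawn="true">
        <arg line="/c startup.bat"/>
    </exec>
</target>

<target name="stop.tomcat50">
    <exec dir="${tomcat50.home}/bin" executable="cmd">
        <arg line="/c shutdown.bat"/>
    </exec>
</target>
```

A stop.alltomcats target would simply depend on the individual stop targets for all four instances.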
Sample Code
The sample code used in this article is provided as tomcatclustering.zip. After installing the Tomcat server instances (you need four of them), extract the contents of the zip file into the Tomcat directories. The sample code provided uses RoundRobinRule as the load-balancing policy. If you want to use the random redirect policy, modify the rules.xml file located in the "tomcat50/webapps/balancer/WEB-INF/config" directory: comment out the rule elements for RoundRobinRule and uncomment the rule elements specifying RandomRedirectRule. Also, if you want to use two instances in the cluster instead of three, comment out the third rule and change the maxServerInstances attribute to 2 (instead of 3).
Note: I removed all other web applications (jsp-examples, etc.) that were included in the Tomcat installation, keeping only the balancer and the sample web applications.
HTTP Request Flow
The web request flow in the sample cluster environment is as follows:
1. Launch the startup page (http://localhost:8080/balancer/testLB.jsp).
2. The JSP redirects the request to the balancer filter (URL: http://localhost:8080/balancer/LoadBalancer).
3. The Tomcat server working as load balancer (TC-LB) intercepts the web request and redirects it to the next available cluster member (TC01, TC02, or TC03), based on the load-balancing algorithm specified in the configuration file.
4. The JSP script sessiondata.jsp (located in the "clusterapp" web application) on the selected cluster member is called.
5. If the session is modified, the session listener methods in ClusterAppSessionListener are called to log the session modification events.
6. sessiondata.jsp displays the session details (such as session ID, last access time, etc.) in the web browser.
7. Randomly stop one or two cluster nodes (by calling the "stop.tomcat5x" target in the Ant script).
8. Repeat steps 1 through 7 to see if the requests fail over to one of the available cluster members. Also, check that the session information is copied to the cluster members without losing any data.
Figure 3 shows the web request flow represented in a Sequence diagram.

Figure 3. Cluster application sequence diagram (click on the screenshot to open a full-size view).
Cluster Configuration
A sample web application called "clusterapp" was created to run in the cluster. All the instances have the same directory structure and contents to optimize session replication.
Since the Tomcat server instances in the cluster use IP multicast to transmit sessions, we need to make sure that IP multicast is enabled on the machine where the cluster is set up. To verify this, you can run the sample Java program MulticastNode provided in the book Tomcat: The Definitive Guide, or refer to the sample tutorial available on the JavaSoft web site on how to write multicast server and client programs.
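As a rough self-contained alternative (this is not the book's MulticastNode program; the class and method names are made up for illustration), the following sketch joins the cluster's multicast group, sends itself a datagram, and tries to read it back:

```java
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.MulticastSocket;

// Hypothetical multicast smoke test; returns false on any failure rather
// than throwing, since multicast may be disabled on some machines.
public class MulticastCheck {

    public static boolean check(String addr, int port) {
        MulticastSocket socket = null;
        try {
            InetAddress group = InetAddress.getByName(addr);
            socket = new MulticastSocket(port);
            socket.joinGroup(group);
            byte[] out = "ping".getBytes();
            socket.send(new DatagramPacket(out, out.length, group, port));
            byte[] in = new byte[16];
            DatagramPacket packet = new DatagramPacket(in, in.length);
            socket.setSoTimeout(3000); // give loopback delivery a few seconds
            socket.receive(packet);
            return "ping".equals(
                    new String(packet.getData(), 0, packet.getLength()));
        } catch (Exception e) {
            return false; // multicast is not usable on this machine
        } finally {
            if (socket != null) {
                socket.close();
            }
        }
    }
}
```

Running it with the cluster's mcastAddr and mcastPort (228.0.0.4:45564 in this setup) should return true on a machine where Tomcat's membership pings will work.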
When a cluster node is started the other members in the cluster show a log message on the server console that a member has been added to the cluster. Similarly, when a cluster node goes down, the remaining members display a log message on the console that a member has disappeared from the cluster. Figure 4 shows the log messages displayed on Tomcat console when a cluster node is removed from the cluster or a new member is added to the cluster.

Figure 4. Log messages when a member is added or removed from cluster
Follow the steps below to enable clustering and session replication in Tomcat server:
All session attributes must implement the java.io.Serializable interface.
Uncomment the Cluster element in server.xml file. The attributes useDirtyFlag and replicationMode in the Cluster element are used to optimize frequency and the session replication mechanism.
Enable ReplicationValve by uncommenting the Valve element in server.xml. The ReplicationValve is used to intercept the HTTP request and replicate the session data among the cluster members if the session has been modified by the web client. The Valve element has a "filter" attribute that can be used to filter out requests that cannot modify the session (such as requests for HTML pages and image files).
Since all three Tomcat instances are running on the same machine, the tcpListenPort attribute is set to be unique for each Tomcat instance. It is important to know that the attributes starting with mcastXXX (mcastAddr, mcastPort, mcastFrequency, and mcastDropTime) are for the cluster membership IP multicast ping and those starting with tcpXXX (tcpThreadCount, tcpListenAddress, tcpListenPort, and tcpSelectorTimeout) are for TCP session replication ("Clustering configuration parameters" table below shows the configuration of different settings in the Tomcat server instances to enable clustering).
The web.xml deployment descriptor (located in the clusterapp\WEB-INF directory) should have the distributable element. In order to replicate session state for a specific web application, the distributable element needs to be defined for that application. This means that if you have more than one web application that needs session replication, you need to add distributable to all of those web applications' web.xml files. The Tomcat clustering chapter in the book Tomcat: The Definitive Guide provides a very good explanation of this topic.
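For reference, the Cluster element shipped (commented out) in Tomcat 5.0's server.xml looks roughly like the sketch below; treat the attribute values as placeholders to be matched against the configuration-parameters table, not as a verified configuration:

```xml
<!-- Sketch of the Cluster element for one node (e.g., TC01); make
     tcpListenPort unique per instance as shown in Table 2. -->
<Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster"
         managerClassName="org.apache.catalina.cluster.session.DeltaManager"
         useDirtyFlag="true">
    <Membership className="org.apache.catalina.cluster.mcast.McastService"
                mcastAddr="228.0.0.4"
                mcastPort="45564"
                mcastFrequency="500"
                mcastDropTime="3000"/>
    <Receiver className="org.apache.catalina.cluster.tcp.ReplicationListener"
              tcpListenAddress="127.0.0.1"
              tcpListenPort="4001"
              tcpSelectorTimeout="100"
              tcpThreadCount="6"/>
    <Sender className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
            replicationMode="pooled"/>
    <Valve className="org.apache.catalina.cluster.tcp.ReplicationValve"
           filter=".*\.gif;.*\.js;.*\.jpg;.*\.html;.*\.txt;"/>
</Cluster>
```

In web.xml, the matching change is a single empty distributable element inside the web-app element.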
Table 2. Clustering configuration parameters
Configuration Parameter Instance 1 Instance 2 Instance 3 Instance 4
Instance Type Load Balancer Cluster Node 1 Cluster Node 2 Cluster Node 3
Code name TC-LB TC01 TC02 TC03
Home Directory c:/web/tomcat50 c:/web/tomcat51 c:/web/tomcat52 c:/web/tomcat53
Server Port 8005 9005 10005 11005
Connector 8080 9080 10080 11080
Coyote/JK2 AJP Connector 8009 9009 10009 11009
Cluster mcastAddr 228.0.0.4 228.0.0.4 228.0.0.4 228.0.0.4
Cluster mcastPort 45564 45564 45564 45564
tcpListenAddress 127.0.0.1 127.0.0.1 127.0.0.1 127.0.0.1
Cluster tcpListenPort 4000 4001 4002 4003
Note: Since all the cluster members are running on the same physical machine they use the same IP address (127.0.0.1).
If you are not using the Ant script to start and stop Tomcat instances, do not set up the CATALINA_HOME environment variable on your computer. If this variable is set, all instances will try to use the same directory (specified in CATALINA_HOME variable) to start Tomcat instances. As a result, only the first instance will successfully start and the other instances will crash with a bind exception message saying that the port is already in use: "java.net.BindException: Address already in use: JVM_Bind:8080".
Load Balancer setup
I wrote two simple, custom load-balancing rules (RoundRobinRule and RandomRedirectRule) extending the rules API to redirect incoming web requests. These rules are based on load-balancing algorithms such as round robin and random redirect. You can write similar custom load-balancing rules based on other factors, such as weight or last access time. The Tomcat load balancer provides a sample parameter-based load-balancing rule. It redirects web requests to different URLs depending on the parameter specified in the HTTP request.
Leave the cluster and valve elements in server.xml (in TC-LB instance) commented out since we are not using this Tomcat instance as a cluster member.
Testing Setup
Session Persistence Testing
In session persistence testing, the main objective is to verify that session data is not lost when a cluster member crashes in the middle of a web request. The JSP script sessiondata.jsp was used to display the session details. This script also provides HTML text fields to add, modify, and remove session attributes. After adding a couple of attributes to the HTTP session, I randomly stopped cluster nodes and checked the session data on the available cluster members.
Load Testing
The objective of load testing is to study the custom load-balancing algorithms and see how effectively the web requests are distributed to the nodes in the cluster, especially when one or more of the nodes go down. The JMeter load testing tool was used to simulate multiple concurrent web users.
Steps to test load balancing in the cluster setup:
1. Start the load balancer and cluster instances of Tomcat server.
2. Launch the startup JSP script (testLB.jsp).
3. Simulate a server crash by manually stopping one or more containers at random intervals.
4. Check the load distribution pattern.
5. Repeat steps 1 through 4 100 times.
All log messages were directed to a text file called tomcat_cluster.log (located in the tomcat50/webapps/balancer directory). The response times for all the web objects shown in the sequence diagram (Figure 3) were logged using Log4J messages. The elapsed times (in milliseconds) were collected and tabulated into a matrix, as shown in Table 3. I followed a similar instrumentation methodology to that described in Designing Performance Testing Metrics into Distributed J2EE Apps in collecting the response times during the tests.
The following tables show the elapsed times in load testing (using RoundRobinRule) and load distribution percentages (using RandomRedirectRule) respectively.
Table 3. Elapsed times for load testing
# | Scenario | testLB.jsp (ms) | RoundRobinRule (ms) | sessiondata.jsp (ms) | Total (ms)
1 | All three server instances running | 54 | 76 | 12 | 142
2 | Two server instances running (TC02 was stopped) | 55 | 531 | 14 | 600
3 | Only one server instance running (TC01 and TC02 were stopped) | 56 | 1900 | 11 | 1967
Note: All elapsed times are average values based on a load of 100 concurrent users.
Table 4. Load distribution when using random LB policy
# Scenario TC01 (%) TC02 (%) TC03 (%)
1 All three server instances running 30 46 24
2 Two server instances running (TC02 was stopped) 56 0 44
Note: Load distribution percentages are based on a load of 100 concurrent users.
Conclusion
In session persistence testing, after session attributes were added, one of the cluster nodes was brought down, and it was verified that the session attributes were not lost due to the server outage. The session details logged in the text file were used to study the details of the session attributes.
In load testing, when one or two server instances were stopped and only one Tomcat instance was running, the response times were longer than when all three instances were available. When the stopped instances were restarted, the load balancer automatically detected that the servers were available again to take requests and redirected subsequent web requests to them, which significantly improved the response times.
The mechanism I used to find if a cluster member is available (using ServerUtil) is not the fastest way to do this. More sophisticated and robust fail-over techniques should be used in real world scenarios.
One of the limitations of the proposed cluster setup is that it only provides a single load balancer. What happens if the Tomcat instance acting as load balancer goes down? There is no way to forward the requests to any of the cluster members and the result is a so-called Single Point of Failure (SPoF). One solution to this problem is to have a second Tomcat instance as the standby load balancer to take over if the primary load balancer crashes. Typical HA options involve having two load balancers to prevent SPoF situations.
In the sample cluster setup, all Tomcat instances (including load balancer) were configured to run on the same computer. A better design is to run a load balancer instance on a separate machine from the cluster members. Also, we should limit two cluster nodes per machine to take advantage of horizontal scaling method and to improve cluster performance.
HTTP session replication is an expensive operation for a J2EE web application server. The implications of session management in a clustered J2EE environment should be considered during the analysis and design phases of a project, rather than waiting until the web application is implemented in the production environment. The application code must be designed keeping the cluster environment in mind. If the clustering implications are not considered at the design phase, the code may need to be completely rewritten to make it work in the cluster setup, which could be a very expensive effort.
If a web application supports any kind of object caching mechanism, then caching of objects in a cluster environment should be considered at the initial stages of application development. This is very important because keeping cached data in all the cluster nodes in sync is critical for providing accurate and up-to-date business data to web users. Another important consideration is clearing the expired session data in a cluster.
Once the J2EE cluster has been successfully set up and is running, its management and maintenance will become very important to provide the benefits of scalability and high availability. With many nodes (members) in a cluster, maintenance revolves around keeping the cluster running and pushing out application changes to all cluster nodes. One way to provide these services is to implement a monitoring service that periodically checks the server availability and notifies if any of the nodes in the cluster become unavailable. This service should check the nodes at regular intervals detecting failed nodes and removing them from the active cluster node list so no requests go to those nodes. It should include, as changes and updates occur, the ability to update and synchronize all servers in the cluster. Since all requests to a web application must pass through the load-balancing system, the system can determine the number of active sessions, the number of active sessions connected in any instance, response times, peak load times, the number of sessions during peak load, the number of sessions during minimum load, and more. All this audit information can be used to fine-tune the entire system for optimal performance. Reports showing all these metrics should be generated on a regular basis to assess the effectiveness of load-balancing policies and cluster nodes.
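A very small sketch of one pass of such a monitoring service might look like the following; the class name, the checkNode probe, and the node list are all hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical monitoring pass: probes each configured node and keeps
// only the reachable ones in the active list used for load balancing.
public class ClusterMonitor {

    private final List<String> allNodes;
    private final List<String> activeNodes = new ArrayList<String>();

    public ClusterMonitor(List<String> allNodes) {
        this.allNodes = allNodes;
    }

    // Run this from a Timer or scheduler at a regular interval.
    public void refresh() {
        activeNodes.clear();
        for (String node : allNodes) {
            if (checkNode(node)) {
                activeNodes.add(node);
            } else {
                System.out.println("Removing unavailable node: " + node);
            }
        }
    }

    // Stub probe; a real implementation would open a connection to the node.
    protected boolean checkNode(String node) {
        return true;
    }

    public List<String> getActiveNodes() {
        return activeNodes;
    }
}
```

The load balancer would then draw its candidate servers from getActiveNodes() instead of the static configuration, so failed nodes stop receiving requests between monitoring passes.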
Currently, all the configuration required to set up the cluster and load balancer is done manually by manipulating the configuration files (server.xml and rules.xml). It would be very helpful if the Jakarta group were to provide a web-based cluster administration GUI tool to perform the configuration changes needed to manage the clustering and load-balancing setup.
Resources
Tomcat 5 Home Page
Clustering Home Page on the Tomcat site
Load Balancer Home Page on the Tomcat site
Tomcat: The Definitive Guide, by Jason Brittain and Ian F. Darwin
Java Performance Tuning, 2nd Edition, by Jack Shirazi
Creating Highly Available and Scalable Applications Using J2EE, The Middleware Company, EJB Essentials Training Class Material
Srini Penchikala is an information systems subject matter expert at Flagstar Bank.