SEDA: An Architecture for Highly Concurrent Server Applications
Matt Welsh, Harvard University
Last updated 9 May 2006
Introduction
My Ph.D. thesis work at UC Berkeley focused on the development of a robust, high-performance platform for Internet services, called SEDA. The goal is to build a system capable of supporting massive concurrency (on the order of tens of thousands of simultaneous client connections) and avoid the pitfalls which arise with traditional thread and event-based approaches.
SEDA is an acronym for staged event-driven architecture, and decomposes a complex, event-driven application into a set of stages connected by queues. This design avoids the high overhead associated with thread-based concurrency models, and decouples event and thread scheduling from application logic. By performing admission control on each event queue, the service can be well-conditioned to load, preventing resources from being overcommitted when demand exceeds service capacity. SEDA employs dynamic control to automatically tune runtime parameters (such as the scheduling parameters of each stage), as well as to manage load, for example, by performing adaptive load shedding. Decomposing services into a set of stages also enables modularity and code reuse, as well as the development of debugging tools for complex event-driven applications.
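The stage-and-queue structure described above can be sketched in a few lines of Java. This is a minimal illustration, not the Sandstorm API: the class name `Stage`, its methods, and the fixed-capacity rejection policy are all assumptions made for the example. Each stage owns a bounded event queue, and admission control here simply means refusing an event when the queue is full, so that the caller learns about overload immediately.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

/** Hypothetical stage: a bounded event queue plus an event handler. */
final class Stage<E> {
    private final BlockingQueue<E> queue;
    private final Consumer<E> handler;
    private final AtomicInteger rejected = new AtomicInteger();

    Stage(int capacity, Consumer<E> handler) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.handler = handler;
    }

    /** Admission control: accept the event only if the queue has room. */
    boolean enqueue(E event) {
        boolean admitted = queue.offer(event); // never blocks; false when full
        if (!admitted) {
            rejected.incrementAndGet();
        }
        return admitted;
    }

    /** Handle up to batchSize queued events; returns how many were handled. */
    int drain(int batchSize) {
        int handled = 0;
        E event;
        while (handled < batchSize && (event = queue.poll()) != null) {
            handler.accept(event);
            handled++;
        }
        return handled;
    }

    /** Number of events refused by admission control so far. */
    int rejectedCount() {
        return rejected.get();
    }
}
```

Stages are chained by having one stage's handler enqueue onto the next stage's queue. A real implementation would call `drain` from a thread pool per stage; here the caller drives it explicitly to keep the sketch small.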
February 19, 2007 - A Note on the status of SEDA
I continue to receive many requests for information about SEDA. I am no longer actively working on this project, so all of these web pages should be regarded as "archival".
It is also worth noting that a number of recent research papers have demonstrated that the SEDA prototype (in Java) performs poorly compared to threaded or event-based systems implemented in C. This would seem to contradict the findings in my work. (For more information I invite you to read recent papers by Vivek Pai's group at Princeton and the Capriccio work from UC Berkeley.)
While I do not discount these later results, it is important to keep a few things in mind when interpreting them. First, the SEDA implementation in Java was developed and tuned on a particular JVM implementation (IBM JDK 1.3), on a particular version of the Linux kernel (2.2), using the /dev/poll event dispatch mechanism. More recent studies have varied the environment substantially.
I have several theories about what could be causing this poor performance, although I have not had an opportunity to perform new measurements. I have noticed that performance of the SEDA networking layer is highly dependent on a number of parameters, such as the poll interval used by the various threads. This likely needs to be tuned or redesigned to support high bandwidth networks and more recent Linux and JVM implementations. Also, SEDA imposes a high context switch overhead in certain cases, depending on the number of threads and stages used, and the processing granularity within each stage.
Tim Brecht's group at Waterloo has undertaken a study of competing Web server architectures and has shown that a SEDA implementation in C++, appropriately tuned, performs comparably to alternatives, so I do not believe these performance issues are fundamental to the architecture.
The most fundamental aspect of the SEDA architecture is the programming model that supports stage-level backpressure and load management. Our goal was never to show that SEDA outperforms other server designs, but rather that acceptable performance can be achieved while providing a disciplined approach to overload management. Clearly more work is needed to show that this goal can be met in a range of application and server environments.
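As a rough illustration of what stage-level load management can look like, the sketch below adjusts a stage's admission rate from observed response times. The class name `AdaptiveAdmission`, the additive-increase/multiplicative-decrease policy, and every constant are assumptions chosen for the example; the text above does not specify the controller, so this should not be read as the SEDA implementation.

```java
/** Hypothetical controller: tunes an admission rate toward a response-time target. */
final class AdaptiveAdmission {
    private final double targetMillis; // response time the stage tries to stay under
    private final double minRate;      // lower bound on admitted events per second
    private final double maxRate;      // upper bound on admitted events per second
    private final double step;         // additive increase applied when under target
    private double rate;               // current admitted events per second

    AdaptiveAdmission(double targetMillis, double initialRate,
                      double minRate, double maxRate, double step) {
        this.targetMillis = targetMillis;
        this.rate = initialRate;
        this.minRate = minRate;
        this.maxRate = maxRate;
        this.step = step;
    }

    /** Feed one observed response time; the rate is halved when over target. */
    void observe(double responseMillis) {
        if (responseMillis > targetMillis) {
            rate = Math.max(minRate, rate * 0.5); // back off quickly under overload
        } else {
            rate = Math.min(maxRate, rate + step); // recover slowly when healthy
        }
    }

    double rate() {
        return rate;
    }
}
```

The asymmetry is the design choice worth noting: cutting the rate sharply when the target is missed and raising it gradually afterwards keeps the stage from oscillating back into overload.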
Please feel free to get in touch if you have new results or questions about the SEDA approach.
Our current prototype of a SEDA-based services platform is called Sandstorm. Sandstorm is implemented entirely in Java and uses the NBIO package to provide nonblocking I/O support. Support for the JDK 1.4 java.nio package is included as well. Despite using Java, we have achieved performance that rivals (and sometimes exceeds) that of C/C++. We have also implemented a SEDA-based asynchronous SSL and TLS protocol library, called aTLS. All of this software is available for download below.
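For readers unfamiliar with nonblocking I/O in java.nio, the following sketch shows the basic readiness-driven pattern: a channel is put in nonblocking mode, registered with a `Selector`, and read only once the selector reports it readable. It uses an in-process `java.nio.channels.Pipe` instead of a network socket so that it runs anywhere, and the class name `NonblockingReadDemo` is made up for the example. It illustrates java.nio only and does not show the NBIO or Sandstorm APIs.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;

final class NonblockingReadDemo {
    /** Writes msg into a pipe and reads it back using a Selector. */
    static String roundTrip(String msg) {
        try {
            Pipe pipe = Pipe.open();
            try (Selector selector = Selector.open();
                 Pipe.SinkChannel sink = pipe.sink();
                 Pipe.SourceChannel source = pipe.source()) {
                source.configureBlocking(false);
                source.register(selector, SelectionKey.OP_READ);

                byte[] bytes = msg.getBytes(StandardCharsets.UTF_8);
                ByteBuffer in = ByteBuffer.allocate(bytes.length);

                // Nothing has been written yet, so this returns at once
                // with no data instead of blocking the calling thread.
                source.read(in);

                // The sink stays in its default blocking mode for simplicity.
                ByteBuffer outBuf = ByteBuffer.wrap(bytes);
                while (outBuf.hasRemaining()) {
                    sink.write(outBuf);
                }

                // Event loop: wait for readiness, then read what is available.
                // The attempt bound keeps the sketch from spinning forever.
                for (int attempts = 0; in.hasRemaining() && attempts < 10; attempts++) {
                    selector.select(1000);
                    for (SelectionKey key : selector.selectedKeys()) {
                        if (key.isReadable()) {
                            source.read(in);
                        }
                    }
                    selector.selectedKeys().clear();
                }
                in.flip();
                return StandardCharsets.UTF_8.decode(in).toString();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a staged design, the loop body would turn each readable channel into an event and enqueue it for a later stage rather than reading inline.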
We have built a number of applications to demonstrate the SEDA framework. Haboob is a high-performance Web server including support for both static and dynamic pages. Other applications include a Gnutella packet router and Arashi, a Web-based email service similar to Yahoo! Mail.
The best place to start for more information is the SOSP'01 paper on SEDA and the corresponding talk slides. My Ph.D. thesis has much more information as well. If you have questions, comments, or are interested in collaborations, please feel free to contact me by e-mail (see my home page).
A number of open source and commercial systems are based on SEDA and NBIO. These include:
- LimeWire runs its server-based Web crawler on NBIO.
- TerraLycos runs its chat servers on NBIO, supporting over 30,000 simultaneous users.
- Rimfaxe Web Server
- Apache Excalibur Event Package
- SwiftMQ, a JMS Enterprise Messaging Server
- MULE Universal Message Objects, a distributed object broker
- OceanStore, a global, secure, peer-to-peer filesystem
Various companies, both large and small, are building projects based on SEDA/NBIO.
Project News
July 12, 2002: Lots of updates. CVS, release, and mailing list hosting is now at http://seda.sourceforge.net. Now you can access the latest SEDA codebase via anonymous CVS, hopefully encouraging more collaborative development of the code.
The seda-users mailing list is back up - please subscribe.
All of the code has been consolidated into a single CVS tree under the package name seda (renamed from mdw). The Haboob Web server and aTLS code are also released and more completely documented. And a nice one-line performance patch to NBIO is included that increases network bandwidth by 30% or so!