ASP.NET: 10 Tips for Writing High-Performance Web Applications

10 Tips for Writing High-Performance Web Applications
Rob Howard
This article discusses:
Common ASP.NET performance myths
Useful performance tips and tricks for ASP.NET
Suggestions for working with a database from ASP.NET
Caching and background processing with ASP.NET
This article uses the following technologies:
ASP.NET, .NET Framework, IIS
 Contents
Performance on the Data Tier
Tip 1—Return Multiple Resultsets
Tip 2—Paged Data Access
Tip 3—Connection Pooling
Tip 4—ASP.NET Cache API
Tip 5—Per-Request Caching
Tip 6—Background Processing
Tip 7—Page Output Caching and Proxy Servers
Tip 8—Run IIS 6.0 (If Only for Kernel Caching)
Tip 9—Use Gzip Compression
Tip 10—Server Control View State
Conclusion
Writing a Web application with ASP.NET is unbelievably easy. So easy, many developers don't take the time to structure their applications for great performance. In this article, I'm going to present 10 tips for writing high-performance Web apps. I'm not limiting my comments to ASP.NET applications because they are just one subset of Web applications. This article won't be the definitive guide for performance-tuning Web applications—an entire book could easily be devoted to that. Instead, think of this as a good place to start.
Before becoming a workaholic, I used to do a lot of rock climbing. Prior to any big climb, I'd review the route in the guidebook and read the recommendations made by people who had visited the site before. But, no matter how good the guidebook, you need actual rock climbing experience before attempting a particularly challenging climb. Similarly, you can only learn how to write high-performance Web applications when you're faced with either fixing performance problems or running a high-throughput site.
My personal experience comes from having been an infrastructure Program Manager on the ASP.NET team at Microsoft, running and managing www.asp.net, and helping architect Community Server, which is the next version of several well-known ASP.NET applications (ASP.NET Forums, .Text, and nGallery combined into one platform). I'm sure that some of the tips that have helped me will help you as well.
You should think about the separation of your application into logical tiers. You might have heard of the term 3-tier (or n-tier) physical architecture. These are usually prescribed architecture patterns that physically divide functionality across processes and/or hardware. As the system needs to scale, more hardware can easily be added. There is, however, a performance hit associated with process and machine hopping, thus it should be avoided. So, whenever possible, run the ASP.NET pages and their associated components together in the same application.
Because of the separation of code and the boundaries between tiers, using Web services or remoting will decrease performance by 20 percent or more.
The data tier is a bit of a different beast since it is usually better to have dedicated hardware for your database. However, the cost of process hopping to the database is still high, thus performance on the data tier is the first place to look when optimizing your code.
Before diving in to fix performance problems in your applications, make sure you profile your applications to see exactly where the problems lie. Key performance counters (such as the one that indicates the percentage of time spent performing garbage collections) are also very useful for finding out where applications are spending the majority of their time. Yet the places where time is spent are often quite unintuitive.
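For example, the garbage collection counter just mentioned lives in the .NET CLR Memory category and can be watched in Perfmon or sampled from code. Here's a minimal sketch; the instance name ("aspnet_wp") is an assumption, so pick the worker process that applies to your setup:

using System.Diagnostics;

// Sample the percentage of time spent in garbage collection for the
// ASP.NET worker process; "aspnet_wp" is an assumption for illustration.
PerformanceCounter gcTime = new PerformanceCounter(
    ".NET CLR Memory", "% Time in GC", "aspnet_wp");
Console.WriteLine("% Time in GC: {0}", gcTime.NextValue());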
There are two types of performance improvements described in this article: large optimizations, such as using the ASP.NET Cache, and tiny optimizations that repeat themselves. These tiny optimizations are sometimes the most interesting. You make a small change to code that gets called thousands and thousands of times. With a big optimization, you might see overall performance take a large jump. With a small one, you might shave a few milliseconds on a given request, but when compounded across the total requests per day, it can result in an enormous improvement.
Performance on the Data Tier
When it comes to performance-tuning an application, there is a single litmus test you can use to prioritize work: does the code access the database? If so, how often? Note that the same test could be applied for code that uses Web services or remoting, too, but I'm not covering those in this article.
If you have a database request required in a particular code path and you see other areas such as string manipulations that you want to optimize first, stop and perform your litmus test. Unless you have an egregious performance problem, your time would be better utilized trying to optimize the time spent in and connected to the database, the amount of data returned, and how often you make round-trips to and from the database.
With that general information established, let's look at ten tips that can help your application perform better. I'll begin with the changes that can make the biggest difference.
Tip 1—Return Multiple Resultsets
Review your database code to see if you have request paths that go to the database more than once. Each of those round-trips decreases the number of requests per second your application can serve. By returning multiple resultsets in a single database request, you can cut the total time spent communicating with the database. You'll be making your system more scalable, too, as you'll cut down on the work the database server is doing managing requests.
While you can return multiple resultsets using dynamic SQL, I prefer to use stored procedures. It's arguable whether business logic should reside in a stored procedure, but I think that if logic in a stored procedure can constrain the data returned (reduce the size of the dataset, time spent on the network, and not having to filter the data in the logic tier), it's a good thing.
Using a SqlCommand instance and its ExecuteReader method to populate strongly typed business classes, you can move the resultset pointer forward by calling NextResult. Figure 1 shows a sample conversation populating several ArrayLists with typed classes. Returning only the data you need from the database will additionally decrease memory allocations on your server.
 Figure 1 Extracting Multiple Resultsets from a DataReader
// read the first resultset
reader = command.ExecuteReader();

// read the data from that resultset
while (reader.Read()) {
    suppliers.Add(PopulateSupplierFromIDataReader( reader ));
}

// read the next resultset
reader.NextResult();

// read the data from that second resultset
while (reader.Read()) {
    products.Add(PopulateProductFromIDataReader( reader ));
}
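The populate helpers referenced in Figure 1 aren't shown in the article. A minimal sketch of what one might look like follows; the Supplier class, its fields, and the column names are assumptions for illustration:

// Hypothetical business class and populate helper (requires System.Data).
public class Supplier
{
    public int SupplierID;
    public string CompanyName;
}

static Supplier PopulateSupplierFromIDataReader(IDataReader reader)
{
    // Map the current row of the reader onto a strongly typed instance.
    Supplier supplier = new Supplier();
    supplier.SupplierID = (int) reader["SupplierID"];
    supplier.CompanyName = (string) reader["CompanyName"];
    return supplier;
}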
Tip 2—Paged Data Access
The ASP.NET DataGrid exposes a wonderful capability: data paging support. When paging is enabled in the DataGrid, a fixed number of records is shown at a time. Additionally, paging UI is also shown at the bottom of the DataGrid for navigating through the records. The paging UI allows you to navigate backwards and forwards through displayed data, displaying a fixed number of records at a time.
There's one slight wrinkle. Paging with the DataGrid requires all of the data to be bound to the grid. For example, your data layer will need to return all of the data and then the DataGrid will filter all the displayed records based on the current page. If 100,000 records are returned when you're paging through the DataGrid, 99,975 records would be discarded on each request (assuming a page size of 25). As the number of records grows, the performance of the application will suffer as more and more data must be sent on each request.
One good approach to writing better paging code is to use stored procedures. Figure 2 shows a sample stored procedure that pages through the Orders table in the Northwind database. In a nutshell, all you're doing here is passing in the page index and the page size. The appropriate resultset is calculated and then returned.
 Figure 2 Paging Through the Orders Table
CREATE PROCEDURE northwind_OrdersPaged
(
    @PageIndex int,
    @PageSize int
)
AS
BEGIN
DECLARE @PageLowerBound int
DECLARE @PageUpperBound int
DECLARE @RowsToReturn int

-- First set the rowcount
SET @RowsToReturn = @PageSize * (@PageIndex + 1)
SET ROWCOUNT @RowsToReturn

-- Set the page bounds
SET @PageLowerBound = @PageSize * @PageIndex
SET @PageUpperBound = @PageLowerBound + @PageSize + 1

-- Create a temp table to store the select results
CREATE TABLE #PageIndex
(
    IndexId int IDENTITY (1, 1) NOT NULL,
    OrderID int
)

-- Insert into the temp table
INSERT INTO #PageIndex (OrderID)
SELECT
    OrderID
FROM
    Orders
ORDER BY
    OrderID DESC

-- Return total count
SELECT COUNT(OrderID) FROM Orders

-- Return paged results
SELECT
    O.*
FROM
    Orders O,
    #PageIndex PageIndex
WHERE
    O.OrderID = PageIndex.OrderID AND
    PageIndex.IndexID > @PageLowerBound AND
    PageIndex.IndexID < @PageUpperBound
ORDER BY
    PageIndex.IndexID
END
In Community Server, we wrote a paging server control to do all the data paging. You'll see that I am using the ideas discussed in Tip 1, returning two resultsets from one stored procedure: the total number of records and the requested data.
The total number of records returned can vary depending on the query being executed. For example, a WHERE clause can be used to constrain the data returned. The total number of records to be returned must be known in order to calculate the total pages to be displayed in the paging UI. For example, if there are 1,000,000 total records and a WHERE clause is used that filters this to 1,000 records, the paging logic needs to be aware of the total number of records to properly render the paging UI.
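Tying Figure 2 back to Tip 1, the calling code might look something like the following minimal sketch. The connection string, the pageIndex and pageSize variables, and the PopulateOrderFromIDataReader helper are assumptions for illustration, not code from Community Server:

// Call the paging stored procedure and read both resultsets
// (requires System.Collections, System.Data, System.Data.SqlClient).
ArrayList orders = new ArrayList();
SqlConnection connection = new SqlConnection(connectionString);
SqlCommand command = new SqlCommand("northwind_OrdersPaged", connection);
command.CommandType = CommandType.StoredProcedure;
command.Parameters.Add("@PageIndex", SqlDbType.Int).Value = pageIndex;
command.Parameters.Add("@PageSize", SqlDbType.Int).Value = pageSize;

connection.Open();
SqlDataReader reader = command.ExecuteReader();

// First resultset: the total record count, used to render the paging UI.
int totalRecords = 0;
if (reader.Read())
    totalRecords = reader.GetInt32(0);

// Second resultset: the requested page of orders.
reader.NextResult();
while (reader.Read())
    orders.Add(PopulateOrderFromIDataReader(reader));  // hypothetical helper

reader.Close();
connection.Close();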
Tip 3—Connection Pooling
Setting up the TCP connection between your Web application and SQL Server™ can be an expensive operation. Developers at Microsoft have been able to take advantage of connection pooling for some time now, allowing them to reuse connections to the database. Rather than setting up a new TCP connection on each request, a new connection is set up only when one is not available in the connection pool. When the connection is closed, it is returned to the pool where it remains connected to the database, as opposed to completely tearing down that TCP connection.
Of course you need to watch out for leaking connections. Always close your connections when you're finished with them. I repeat: no matter what anyone says about garbage collection within the Microsoft® .NET Framework, always call Close or Dispose explicitly on your connection when you are finished with it. Do not trust the common language runtime (CLR) to clean up and close your connection for you at a predetermined time. The CLR will eventually destroy the class and force the connection closed, but you have no guarantee when the garbage collection on the object will actually happen.
To use connection pooling optimally, there are a couple of rules to live by. First, open the connection, do the work, and then close the connection. It's okay to open and close the connection multiple times on each request if you have to (optimally you apply Tip 1) rather than keeping the connection open and passing it around through different methods. Second, use the same connection string (and the same thread identity if you're using integrated authentication). If you don't use the same connection string, for example customizing the connection string based on the logged-in user, you won't get the same optimization value provided by connection pooling. And if you use integrated authentication while impersonating a large set of users, your pooling will also be much less effective. The .NET CLR data performance counters can be very useful when attempting to track down any performance issues that are related to connection pooling.
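In code, the first rule usually looks like the minimal sketch below. The using statement guarantees Dispose is called even if an exception is thrown, which returns the connection to the pool rather than leaking it; the connection string and query are placeholders:

// Open late, close early: acquire the connection just before the work
// and release it immediately afterward (requires System.Data.SqlClient).
using (SqlConnection connection = new SqlConnection(connectionString))
using (SqlCommand command = new SqlCommand("SELECT ...", connection))
{
    connection.Open();
    // do the work here
}   // Dispose runs here, returning the connection to the pool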
Whenever your application is connecting to a resource, such as a database, running in another process, you should optimize by focusing on the time spent connecting to the resource, the time spent sending or retrieving data, and the number of round-trips. Optimizing any kind of process hop in your application is the first place to start to achieve better performance.
The application tier contains the logic that connects to your data layer and transforms data into meaningful class instances and business processes. For example, in Community Server, this is where you populate a Forums or Threads collection, and apply business rules such as permissions; most importantly it is where the Caching logic is performed.
Tip 4—ASP.NET Cache API
One of the very first things you should do before writing a line of application code is architect the application tier to maximize and exploit the ASP.NET Cache feature.
If your components are running within an ASP.NET application, you simply need to include a reference to System.Web.dll in your application project. When you need access to the Cache, use the HttpRuntime.Cache property (the same object is also accessible through Page.Cache and HttpContext.Cache).
There are several rules for caching data. First, if data can be used more than once it's a good candidate for caching. Second, if data is general rather than specific to a given request or user, it's a great candidate for the cache. If the data is user- or request-specific, but is long lived, it can still be cached, but may not be used as frequently. Third, an often overlooked rule is that sometimes you can cache too much. Generally on an x86 machine, you want to run a process with no higher than 800MB of private bytes in order to reduce the chance of an out-of-memory error. Therefore, caching should be bounded. In other words, you may be able to reuse a result of a computation, but if that computation takes 10 parameters, you might attempt to cache on 10 permutations, which will likely get you into trouble. One of the most common support calls for ASP.NET is out-of-memory errors caused by overcaching, especially of large datasets.
Common Performance Myths
One of the most common myths is that C# code is faster than Visual Basic code. There is a grain of truth in this, as it is possible to take several performance-hindering actions in Visual Basic that are not possible to accomplish in C#, such as not explicitly declaring types. But if good programming practices are followed, there is no reason why Visual Basic and C# code cannot execute with nearly identical performance. To put it more succinctly, similar code produces similar results.
Another myth is that codebehind is faster than inline, which is absolutely false. It doesn't matter where your code for your ASP.NET application lives, whether in a codebehind file or inline with the ASP.NET page. Sometimes I prefer to use inline code as changes don't incur the same update costs as codebehind. For example, with codebehind you have to update the entire codebehind DLL, which can be a scary proposition.
Myth number three is that components are faster than pages. This was true in Classic ASP when compiled COM servers were much faster than VBScript. With ASP.NET, however, both pages and components are classes. Whether your code is inline in a page, within a codebehind, or in a separate component makes little performance difference. Organizationally, it is better to group functionality logically this way, but again it makes no difference with regard to performance.
The final myth I want to dispel is that any functionality you want to occur between two apps should be implemented as a Web service. Web services should be used to connect disparate systems or to provide remote access to system functionality or behaviors. They should not be used internally to connect two similar systems. While easy to use, there are much better alternatives. The worst thing you can do is use Web services for communicating between ASP and ASP.NET applications running on the same server, which I've witnessed all too frequently.
Figure 3  ASP.NET Cache
There are several great features of the Cache that you need to know. The first is that the Cache implements a least-recently-used algorithm, allowing ASP.NET to force a Cache purge—automatically removing unused items from the Cache—if memory is running low. Secondly, the Cache supports expiration dependencies that can force invalidation. These include time, key, and file. Time is often used, but with ASP.NET 2.0 a new and more powerful invalidation type is being introduced: database cache invalidation. This refers to the automatic removal of entries in the cache when data in the database changes. For more information on database cache invalidation, see Dino Esposito's Cutting Edge column in the July 2004 issue of MSDN® Magazine. For a look at the architecture of the cache, see Figure 3.
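In code, the typical pattern is to check the Cache first and repopulate on a miss. Here's a minimal sketch; the cache key, the GetSuppliersFromDatabase helper, and the five-minute expiration are assumptions for illustration:

// Check the Cache first; on a miss, load the data and cache it
// (requires System.Web and System.Web.Caching).
ArrayList suppliers = (ArrayList) HttpRuntime.Cache["Suppliers"];
if (suppliers == null)
{
    suppliers = GetSuppliersFromDatabase();  // hypothetical helper

    // Bound the lifetime: expire five minutes from now, no sliding window.
    HttpRuntime.Cache.Insert("Suppliers", suppliers, null,
        DateTime.Now.AddMinutes(5), Cache.NoSlidingExpiration);
}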
Tip 5—Per-Request Caching
Earlier in the article, I mentioned that small improvements to frequently traversed code paths can lead to big, overall performance gains. One of my absolute favorites of these is something I've termed per-request caching.
Whereas the Cache API is designed to cache data for a long period or until some condition is met, per-request caching simply means caching the data for the duration of the request. A particular code path is accessed frequently on each request but the data only needs to be fetched, applied, modified, or updated once. This sounds fairly theoretical, so let's consider a concrete example.
In the Forums application of Community Server, each server control used on a page requires personalization data to determine which skin to use, the style sheet to use, as well as other personalization data. Some of this data can be cached for a long period of time, but some data, such as the skin to use for the controls, is fetched once on each request and reused multiple times during the execution of the request.
To accomplish per-request caching, use the ASP.NET HttpContext. An instance of HttpContext is created with every request and is accessible anywhere during that request from the HttpContext.Current property. The HttpContext class has a special Items collection property; objects and data added to this Items collection are cached only for the duration of the request. Just as you can use the Cache to store frequently accessed data, you can use HttpContext.Items to store data that you'll use only on a per-request basis. The logic behind this is simple: data is added to the HttpContext.Items collection when it doesn't exist, and on subsequent lookups the data found in HttpContext.Items is simply returned.
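A minimal sketch of that pattern follows; the Skin class, the key, and the FetchSkinFromDatabase helper are assumptions for illustration:

// Fetch the skin once per request; later calls within the same request
// get the copy stashed in HttpContext.Items (requires System.Web).
public static Skin GetSkin()
{
    HttpContext context = HttpContext.Current;
    Skin skin = (Skin) context.Items["Skin"];
    if (skin == null)
    {
        skin = FetchSkinFromDatabase();  // hypothetical helper
        context.Items["Skin"] = skin;    // lives only for this request
    }
    return skin;
}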
Tip 6—Background Processing
The path through your code should be as fast as possible, right? There may be times when you find yourself performing expensive tasks on each request or once every n requests. Sending out e-mails or parsing and validating incoming data are just a few examples.
When tearing apart ASP.NET Forums 1.0 and rebuilding what became Community Server, we found that the code path for adding a new post was pretty slow. Each time a post was added, the application first needed to ensure that there were no duplicate posts, then it had to parse the post using a "badword" filter, parse the post for emoticons, tokenize and index the post, add the post to the moderation queue when required, validate attachments, and finally, once posted, send e-mail notifications out to any subscribers. Clearly, that's a lot of work.
It turns out that most of the time was spent in the indexing logic and sending e-mails. Indexing a post was a time-consuming operation, and it turned out that the built-in System.Web.Mail functionality would connect to an SMTP server and send the e-mails serially. As the number of subscribers to a particular post or topic area increased, it would take longer and longer to perform the AddPost function.
Indexing and sending e-mail didn't need to happen on each request. Ideally, we wanted to batch this work together and index 25 posts at a time or send all the e-mails every five minutes. We decided to use the same code I had used to prototype database cache invalidation for what eventually got baked into Visual Studio® 2005.
The Timer class, found in the System.Threading namespace, is a wonderfully useful, but less well-known class in the .NET Framework, at least for Web developers. Once created, the Timer will invoke the specified callback on a thread from the ThreadPool at a configurable interval. This means you can set up code to execute without an incoming request to your ASP.NET application, an ideal situation for background processing. You can do work such as indexing or sending e-mail in this background process too.
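As a minimal sketch (not the Community Server code), a timer could be wired up in Global.asax; the five-minute interval and the work done in the callback are assumptions. Note that the Timer is held in a static field so it isn't garbage collected while the application runs:

using System;
using System.Threading;

public class Global : System.Web.HttpApplication
{
    // Keep a reference so the timer isn't collected while the app runs.
    static Timer backgroundTimer;

    protected void Application_Start(object sender, EventArgs e)
    {
        // Invoke the callback on a ThreadPool thread every five minutes.
        backgroundTimer = new Timer(new TimerCallback(DoBackgroundWork),
                                    null, 300000, 300000);
    }

    static void DoBackgroundWork(object state)
    {
        // Batched work goes here: index queued posts, send pending e-mails.
    }
}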
There are a couple of problems with this technique, though. If your application domain unloads, the timer instance will stop firing its events. In addition, since the CLR has a hard gate on the number of threads per process, you can get into a situation on a heavily loaded server where timers may not have threads to complete on and can be somewhat delayed. ASP.NET tries to minimize the chances of this happening by reserving a certain number of free threads in the process and only using a portion of the total threads for request processing. However, if you have lots of asynchronous work, this can be an issue.
There is not enough room to go into the code here, but you can download a digestible sample at www.rob-howard.net. Just grab the slides and demos from the Blackbelt TechEd 2004 presentation.
Tip 7—Page Output Caching and Proxy Servers
ASP.NET is your presentation layer (or should be); it consists of pages, user controls, server controls (HttpHandlers and HttpModules), and the content that they generate. If you have an ASP.NET page that generates output, whether HTML, XML, images, or any other data, and you run this code on each request and it generates the same output, you have a great candidate for page output caching.
By simply adding this line to the top of your page
<%@ OutputCache Duration="60" VaryByParam="none" %>
you can effectively generate the output for this page once and reuse it multiple times for up to 60 seconds, at which point the page will re-execute and the output will once again be added to the ASP.NET Cache. This behavior can also be accomplished using some lower-level programmatic APIs, too. There are several configurable settings for output caching, such as the VaryByParam attribute just described. VaryByParam just happens to be required, but allows you to specify the HTTP GET or HTTP POST parameters to vary the cache entries. For example, default.aspx?Report=1 or default.aspx?Report=2 could be output-cached by simply setting VaryByParam="Report". Additional parameters can be named by specifying a semicolon-separated list.
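For instance, the report page just described might declare the following (a sketch; the page and its Report query string parameter follow the example above):

<%@ OutputCache Duration="60" VaryByParam="Report" %>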
Many people don't realize that when the Output Cache is used, the ASP.NET page also generates a set of HTTP headers that downstream caching servers, such as those used by the Microsoft Internet Security and Acceleration Server or by Akamai, can take advantage of. When HTTP Cache headers are set, the documents can be cached on these network resources, and client requests can be satisfied without having to go back to the origin server.
Using page output caching, then, does not make your application more efficient, but it can potentially reduce the load on your server as downstream caching technology caches documents. Of course, this can only be anonymous content; once it's downstream, you won't see the requests anymore and can't perform authentication to prevent access to it.
Tip 8—Run IIS 6.0 (If Only for Kernel Caching)
If you're not running IIS 6.0 (Windows Server™ 2003), you're missing out on some great performance enhancements in the Microsoft Web server. In Tip 7, I talked about output caching. In IIS 5.0, a request comes through IIS and then to ASP.NET. When caching is involved, an HttpModule in ASP.NET receives the request, and returns the contents from the Cache.
If you're using IIS 6.0, there is a nice little feature called kernel caching that doesn't require any code changes to ASP.NET. When a request is output-cached by ASP.NET, the IIS kernel cache receives a copy of the cached data. When a request comes from the network driver, a kernel-level driver (no context switch to user mode) receives the request, and if cached, flushes the cached data to the response, and completes execution. This means that when you use kernel-mode caching with IIS and ASP.NET output caching, you'll see unbelievable performance results. At one point during the Visual Studio 2005 development of ASP.NET, I was the program manager responsible for ASP.NET performance. The developers did the magic, but I saw all the reports on a daily basis. The kernel mode caching results were always the most interesting. The common characteristic was network saturation by requests/responses and IIS running at about five percent CPU utilization. It was amazing! There are certainly other reasons for using IIS 6.0, but kernel mode caching is an obvious one.
Tip 9—Use Gzip Compression
While not necessarily a server performance tip (since you might see CPU utilization go up), using gzip compression can decrease the number of bytes sent by your server. This gives the perception of faster pages and also cuts down on bandwidth usage. Depending on the data sent, how well it can be compressed, and whether the client browsers support it (IIS will only send gzip compressed content to clients that support gzip compression, such as Internet Explorer 6.0 and Firefox), your server can serve more requests per second. In fact, just about any time you can decrease the amount of data returned, you will increase requests per second.
The good news is that gzip compression is built into IIS 6.0 and is much better than the gzip compression used in IIS 5.0. Unfortunately, when attempting to turn on gzip compression in IIS 6.0, you may not be able to locate the setting on the properties dialog in IIS. The IIS team built awesome gzip capabilities into the server, but neglected to include an administrative UI for enabling it. To enable gzip compression, you have to spelunk into the innards of the XML configuration settings of IIS 6.0 (which isn't for the faint of heart). By the way, the credit goes to Scott Forsyth of OrcsWeb who helped me figure this out for the www.asp.net servers hosted by OrcsWeb.
Rather than include the procedure in this article, just read the article by Brad Wilson at IIS6 Compression. There's also a Knowledge Base article on enabling compression for ASPX, available at Enable ASPX Compression in IIS. It should be noted, however, that dynamic compression and kernel caching are mutually exclusive on IIS 6.0 due to some implementation details.
Tip 10—Server Control View State
View state is a fancy name for ASP.NET storing some state data in a hidden input field inside the generated page. When the page is posted back to the server, the server can parse, validate, and apply this view state data back to the page's tree of controls. View state is a very powerful capability since it allows state to be persisted with the client and it requires no cookies or server memory to save this state. Many ASP.NET server controls use view state to persist settings made during interactions with elements on the page, for example, saving the current page that is being displayed when paging through data.
There are a number of drawbacks to the use of view state, however. First of all, it increases the total payload of the page both when served and when requested. There is also an additional overhead incurred when serializing or deserializing view state data that is posted back to the server. Lastly, view state increases the memory allocations on the server.
Several server controls, the most well known of which is the DataGrid, tend to make excessive use of view state, even in cases where it is not needed. The default behavior of the ViewState property is enabled, but if you don't need it, you can turn it off at the control or page level. Within a control, you simply set the EnableViewState property to false, or you can set it globally within the page using this setting:
<%@ Page EnableViewState="false" %>
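At the control level, it's a single attribute; for example, on a hypothetical DataGrid declaration:

<asp:DataGrid id="OrdersGrid" runat="server" EnableViewState="false" />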
If you are not doing postbacks in a page or are always regenerating the controls on a page on each request, you should disable view state at the page level.
Conclusion
I've offered you some tips that I've found useful for writing high-performance ASP.NET applications. As I mentioned at the beginning of this article, this is more a preliminary guide than the last word on ASP.NET performance. (More information on improving the performance of ASP.NET apps can be found at Improving ASP.NET Performance.) Only through your own experience can you find the best way to solve your unique performance problems. However, during your journey, these tips should provide you with good guidance. In software development, there are very few absolutes; every application is unique.
Rob Howard is the founder of Telligent Systems, specializing in high-performance Web apps and knowledge management and collaboration systems. Previously, Rob was employed by Microsoft where he helped design the infrastructure features of ASP.NET 1.0, 1.1, and 2.0. You can contact Rob at rhoward@telligentsystems.com.