O'Reilly Network: What Is Web 2.0


Published on O'Reilly (http://www.oreilly.com/)
http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html
What Is Web 2.0
Design Patterns and Business Models for the Next Generation of Software
by Tim O'Reilly
09/30/2005
The bursting of the dot-com bubble in the fall of 2001 marked a turning point for the web. Many people concluded that the web was overhyped, when in fact bubbles and consequent shakeouts appear to be a common feature of all technological revolutions. Shakeouts typically mark the point at which an ascendant technology is ready to take its place at center stage. The pretenders are given the bum's rush, the real success stories show their strength, and there begins to be an understanding of what separates one from the other.
The concept of "Web 2.0" began with a conference brainstorming session between O'Reilly and MediaLive International. Dale Dougherty, web pioneer and O'Reilly VP, noted that far from having "crashed", the web was more important than ever, with exciting new applications and sites popping up with surprising regularity. What's more, the companies that had survived the collapse seemed to have some things in common. Could it be that the dot-com collapse marked some kind of turning point for the web, such that a call to action such as "Web 2.0" might make sense? We agreed that it did, and so the Web 2.0 Conference was born.
In the year and a half since, the term "Web 2.0" has clearly taken hold, with more than 9.5 million citations in Google. But there's still a huge amount of disagreement about just what Web 2.0 means, with some people decrying it as a meaningless marketing buzzword, and others accepting it as the new conventional wisdom.
This article is an attempt to clarify just what we mean by Web 2.0.
In our initial brainstorming, we formulated our sense of Web 2.0 by example:
Web 1.0   Web 2.0
DoubleClick --> Google AdSense
Ofoto --> Flickr
Akamai --> BitTorrent
mp3.com --> Napster
Britannica Online --> Wikipedia
personal websites --> blogging
evite --> upcoming.org and EVDB
domain name speculation --> search engine optimization
page views --> cost per click
screen scraping --> web services
publishing --> participation
content management systems --> wikis
directories (taxonomy) --> tagging ("folksonomy")
stickiness --> syndication
The list went on and on. But what was it that made us identify one application or approach as "Web 1.0" and another as "Web 2.0"? (The question is particularly urgent because the Web 2.0 meme has become so widespread that companies are now pasting it on as a marketing buzzword, with no real understanding of just what it means. The question is particularly difficult because many of those buzzword-addicted startups are definitely not Web 2.0, while some of the applications we identified as Web 2.0, like Napster and BitTorrent, are not even properly web applications!) We began trying to tease out the principles that are demonstrated in one way or another by the success stories of web 1.0 and by the most interesting of the new applications.
1. The Web As Platform
Like many important concepts, Web 2.0 doesn't have a hard boundary, but rather, a gravitational core. You can visualize Web 2.0 as a set of principles and practices that tie together a veritable solar system of sites that demonstrate some or all of those principles, at a varying distance from that core.

Figure 1 shows a "meme map" of Web 2.0 that was developed at a brainstorming session during FOO Camp, a conference at O'Reilly Media. It's very much a work in progress, but shows the many ideas that radiate out from the Web 2.0 core.
For example, at the first Web 2.0 conference, in October 2004, John Battelle and I listed a preliminary set of principles in our opening talk. The first of those principles was "The web as platform." Yet that was also a rallying cry of Web 1.0 darling Netscape, which went down in flames after a heated battle with Microsoft. What's more, two of our initial Web 1.0 exemplars, DoubleClick and Akamai, were both pioneers in treating the web as a platform. People don't often think of it as "web services", but in fact, ad serving was the first widely deployed web service, and the first widely deployed "mashup" (to use another term that has gained currency of late). Every banner ad is served as a seamless cooperation between two websites, delivering an integrated page to a reader on yet another computer. Akamai also treats the network as the platform, and at a deeper level of the stack, building a transparent caching and content delivery network that eases bandwidth congestion.
Nonetheless, these pioneers provided useful contrasts because later entrants have taken their solution to the same problem even further, understanding something deeper about the nature of the new platform. Both DoubleClick and Akamai were Web 2.0 pioneers, yet we can also see how it's possible to realize more of the possibilities by embracing additional Web 2.0 design patterns.
Let's drill down for a moment into each of these three cases, teasing out some of the essential elements of difference.
Netscape vs. Google
If Netscape was the standard bearer for Web 1.0, Google is most certainly the standard bearer for Web 2.0, if only because their respective IPOs were defining events for each era. So let's start with a comparison of these two companies and their positioning.
Netscape framed "the web as platform" in terms of the old software paradigm: their flagship product was the web browser, a desktop application, and their strategy was to use their dominance in the browser market to establish a market for high-priced server products. Control over standards for displaying content and applications in the browser would, in theory, give Netscape the kind of market power enjoyed by Microsoft in the PC market. Much like the "horseless carriage" framed the automobile as an extension of the familiar, Netscape promoted a "webtop" to replace the desktop, and planned to populate that webtop with information updates and applets pushed to the webtop by information providers who would purchase Netscape servers.
In the end, both web browsers and web servers turned out to be commodities, and value moved "up the stack" to services delivered over the web platform.
Google, by contrast, began its life as a native web application, never sold or packaged, but delivered as a service, with customers paying, directly or indirectly, for the use of that service. None of the trappings of the old software industry are present. No scheduled software releases, just continuous improvement. No licensing or sale, just usage. No porting to different platforms so that customers can run the software on their own equipment, just a massively scalable collection of commodity PCs running open source operating systems plus homegrown applications and utilities that no one outside the company ever gets to see.
At bottom, Google requires a competency that Netscape never needed: database management. Google isn't just a collection of software tools, it's a specialized database. Without the data, the tools are useless; without the software, the data is unmanageable. Software licensing and control over APIs--the lever of power in the previous era--is irrelevant because the software never need be distributed but only performed, and also because without the ability to collect and manage the data, the software is of little use. In fact, the value of the software is proportional to the scale and dynamism of the data it helps to manage.
Google's service is not a server--though it is delivered by a massive collection of internet servers--nor a browser--though it is experienced by the user within the browser. Nor does its flagship search service even host the content that it enables users to find. Much like a phone call, which happens not just on the phones at either end of the call, but on the network in between, Google happens in the space between browser and search engine and destination content server, as an enabler or middleman between the user and his or her online experience.
While both Netscape and Google could be described as software companies, it's clear that Netscape belonged to the same software world as Lotus, Microsoft, Oracle, SAP, and other companies that got their start in the 1980s software revolution, while Google's fellows are other internet applications like eBay, Amazon, Napster, and yes, DoubleClick and Akamai.

DoubleClick vs. Overture and AdSense
Like Google, DoubleClick is a true child of the internet era. It harnesses software as a service, has a core competency in data management, and, as noted above, was a pioneer in web services long before web services even had a name. However, DoubleClick was ultimately limited by its business model. It bought into the '90s notion that the web was about publishing, not participation; that advertisers, not consumers, ought to call the shots; that size mattered, and that the internet was increasingly being dominated by the top websites as measured by MediaMetrix and other web ad scoring companies.
As a result, DoubleClick proudly cites on its website "over 2000 successful implementations" of its software. Yahoo! Search Marketing (formerly Overture) and Google AdSense, by contrast, already serve hundreds of thousands of advertisers apiece.
Overture and Google's success came from an understanding of what Chris Anderson refers to as "the long tail," the collective power of the small sites that make up the bulk of the web's content. DoubleClick's offerings require a formal sales contract, limiting their market to the few thousand largest websites. Overture and Google figured out how to enable ad placement on virtually any web page. What's more, they eschewed publisher/ad-agency friendly advertising formats such as banner ads and popups in favor of minimally intrusive, context-sensitive, consumer-friendly text advertising.
The Web 2.0 lesson: leverage customer self-service and algorithmic data management to reach out to the entire web, to the edges and not just the center, to the long tail and not just the head.
A Platform Beats an Application Every Time
In each of its past confrontations with rivals, Microsoft has successfully played the platform card, trumping even the most dominant applications. Windows allowed Microsoft to displace Lotus 1-2-3 with Excel, WordPerfect with Word, and Netscape Navigator with Internet Explorer.
This time, though, the clash isn't between a platform and an application, but between two platforms, each with a radically different business model: On the one side, a single software provider, whose massive installed base and tightly integrated operating system and APIs give control over the programming paradigm; on the other, a system without an owner, tied together by a set of protocols, open standards and agreements for cooperation.
Windows represents the pinnacle of proprietary control via software APIs. Netscape tried to wrest control from Microsoft using the same techniques that Microsoft itself had used against other rivals, and failed. But Apache, which held to the open standards of the web, has prospered. The battle is no longer unequal, a platform versus a single application, but platform versus platform, with the question being which platform, and more profoundly, which architecture, and which business model, is better suited to the opportunity ahead.
Windows was a brilliant solution to the problems of the early PC era. It leveled the playing field for application developers, solving a host of problems that had previously bedeviled the industry. But a single monolithic approach, controlled by a single vendor, is no longer a solution, it's a problem. Communications-oriented systems, as the internet-as-platform most certainly is, require interoperability. Unless a vendor can control both ends of every interaction, the possibilities of user lock-in via software APIs are limited.
Any Web 2.0 vendor that seeks to lock in its application gains by controlling the platform will, by definition, no longer be playing to the strengths of the platform.
This is not to say that there are not opportunities for lock-in and competitive advantage, but we believe they are not to be found via control over software APIs and protocols. There is a new game afoot. The companies that succeed in the Web 2.0 era will be those that understand the rules of that game, rather than trying to go back to the rules of the PC software era.
Not surprisingly, other web 2.0 success stories demonstrate this same behavior. eBay enables occasional transactions of only a few dollars between single individuals, acting as an automated intermediary. Napster (though shut down for legal reasons) built its network not by building a centralized song database, but by architecting a system in such a way that every downloader also became a server, and thus grew the network.
Akamai vs. BitTorrent
Like DoubleClick, Akamai is optimized to do business with the head, not the tail, with the center, not the edges. While it serves the benefit of the individuals at the edge of the web by smoothing their access to the high-demand sites at the center, it collects its revenue from those central sites.
BitTorrent, like other pioneers in the P2P movement, takes a radical approach to internet decentralization. Every client is also a server; files are broken up into fragments that can be served from multiple locations, transparently harnessing the network of downloaders to provide both bandwidth and data to other users. The more popular the file, in fact, the faster it can be served, as there are more users providing bandwidth and fragments of the complete file.
BitTorrent thus demonstrates a key Web 2.0 principle: the service automatically gets better the more people use it. While Akamai must add servers to improve service, every BitTorrent consumer brings his own resources to the party. There's an implicit "architecture of participation", a built-in ethic of cooperation, in which the service acts primarily as an intelligent broker, connecting the edges to each other and harnessing the power of the users themselves.
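To make the principle concrete, here is a minimal, hypothetical sketch (not BitTorrent's actual wire protocol) of the bookkeeping a swarm performs: a file is split into pieces, each peer advertises which pieces it holds, and a downloader can fetch different pieces from different peers, so every additional participant adds serving capacity. The piece size, class names, and "rarest first" heuristic are illustrative assumptions.

```python
# Hypothetical sketch of swarm-style piece bookkeeping (not the real
# BitTorrent protocol): every downloader is also a source of pieces.

from collections import defaultdict

PIECE_SIZE = 256 * 1024  # split files into fixed-size pieces

class Swarm:
    def __init__(self, file_size: int):
        self.num_pieces = -(-file_size // PIECE_SIZE)  # ceiling division
        # piece index -> set of peer ids that can serve it
        self.have = defaultdict(set)

    def announce(self, peer_id: str, pieces: set[int]) -> None:
        """A peer joins (or updates) and advertises the pieces it holds."""
        for p in pieces:
            self.have[p].add(peer_id)

    def sources_for(self, piece: int) -> set[str]:
        """Every peer holding a piece is a potential server for it."""
        return self.have[piece]

    def rarest_first(self, missing: set[int]) -> int:
        """Pick the missing piece with the fewest sources, a common heuristic."""
        return min(missing, key=lambda p: len(self.have[p]))

swarm = Swarm(file_size=10 * 1024 * 1024)
swarm.announce("seed", set(range(swarm.num_pieces)))
swarm.announce("peer-a", {0, 1, 2})        # partial downloaders still serve
print(len(swarm.sources_for(1)))            # 2 -- more peers, more bandwidth
print(swarm.rarest_first({1, 3}))           # 3 (only the seed has it)
```

The more peers announce a given file, the more entries each piece's source set has, which is exactly the "gets better the more people use it" effect described above.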
2. Harnessing Collective Intelligence
The central principle behind the success of the giants born in the Web 1.0 era who have survived to lead the Web 2.0 era appears to be this, that they have embraced the power of the web to harness collective intelligence:
Hyperlinking is the foundation of the web. As users add new content, and new sites, it is bound in to the structure of the web by other users discovering the content and linking to it. Much as synapses form in the brain, with associations becoming stronger through repetition or intensity, the web of connections grows organically as an output of the collective activity of all web users.
Yahoo!, the first great internet success story, was born as a catalog, or directory of links, an aggregation of the best work of thousands, then millions of web users. While Yahoo! has since moved into the business of creating many types of content, its role as a portal to the collective work of the net's users remains the core of its value.
Google's breakthrough in search, which quickly made it the undisputed search market leader, was PageRank, a method of using the link structure of the web rather than just the characteristics of documents to provide better search results (a simplified sketch of the idea appears just after these examples).
eBay's product is the collective activity of all its users; like the web itself, eBay grows organically in response to user activity, and the company's role is as an enabler of a context in which that user activity can happen. What's more, eBay's competitive advantage comes almost entirely from the critical mass of buyers and sellers, which makes any new entrant offering similar services significantly less attractive.
Amazon sells the same products as competitors such as Barnesandnoble.com, and they receive the same product descriptions, cover images, and editorial content from their vendors. But Amazon has made a science of user engagement. They have an order of magnitude more user reviews, invitations to participate in varied ways on virtually every page--and even more importantly, they use user activity to produce better search results. While a Barnesandnoble.com search is likely to lead with the company's own products, or sponsored results, Amazon always leads with "most popular", a real-time computation based not only on sales but other factors that Amazon insiders call the "flow" around products. With an order of magnitude more user participation, it's no surprise that Amazon's sales also outpace competitors.
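Returning to the Google example above, here is a deliberately simplified sketch of the idea behind PageRank: a page's importance depends on the importance of the pages that link to it, computed by iterating until the scores settle. The damping factor and the toy link graph are illustrative assumptions, not Google's production algorithm or data.

```python
# Simplified PageRank by power iteration over a toy link graph.
# Illustrative only: real web-scale ranking involves far more than this.

def pagerank(links: dict[str, list[str]], damping: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        for page, outlinks in links.items():
            if not outlinks:                      # dangling page: spread evenly
                share = damping * rank[page] / len(pages)
                for p in pages:
                    new_rank[p] += share
            else:                                 # pass rank along each outgoing link
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
        rank = new_rank
    return rank

# A page linked to by many others ("hub") ends up with the highest score.
toy_web = {"hub": ["a"], "a": ["hub"], "b": ["hub"], "c": ["hub", "a"]}
for page, score in sorted(pagerank(toy_web).items(), key=lambda kv: -kv[1]):
    print(f"{page}: {score:.3f}")
```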
Now, innovative companies that pick up on this insight and perhaps extend it even further, are making their mark on the web:
Wikipedia, an online encyclopedia based on the unlikely notion that an entry can be added by any web user, and edited by any other, is a radical experiment in trust, applying Eric Raymond's dictum (originally coined in the context of open source software) that "with enough eyeballs, all bugs are shallow," to content creation. Wikipedia is already in the top 100 websites, and many think it will be in the top ten before long. This is a profound change in the dynamics of content creation!
Sites like del.icio.us and Flickr, two companies that have received a great deal of attention of late, have pioneered a concept that some people call "folksonomy" (in contrast to taxonomy), a style of collaborative categorization of sites using freely chosen keywords, often referred to as tags. Tagging allows for the kind of multiple, overlapping associations that the brain itself uses, rather than rigid categories. In the canonical example, a Flickr photo of a puppy might be tagged both "puppy" and "cute"--allowing for retrieval along natural axes generated by user activity (a small sketch of such a tag index appears below).
Collaborative spam filtering products like Cloudmark aggregate the individual decisions of email users about what is and is not spam, outperforming systems that rely on analysis of the messages themselves.
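The tagging model in the Flickr example above maps onto a very small data structure: each freely chosen tag points to the set of items carrying it, and retrieval along several overlapping axes at once is just set intersection. A minimal sketch, with made-up photo ids:

```python
# A minimal folksonomy index: tags are free-form strings chosen by users,
# and one item may sit under many overlapping tags at once.

from collections import defaultdict

class TagIndex:
    def __init__(self):
        self.items_by_tag = defaultdict(set)   # tag -> set of item ids

    def tag(self, item_id: str, *tags: str) -> None:
        for t in tags:
            self.items_by_tag[t.lower()].add(item_id)

    def find(self, *tags: str) -> set[str]:
        """Items carrying every requested tag (intersection of tag sets)."""
        sets = [self.items_by_tag[t.lower()] for t in tags]
        return set.intersection(*sets) if sets else set()

photos = TagIndex()
photos.tag("photo-123", "puppy", "cute")       # the canonical Flickr example
photos.tag("photo-456", "puppy", "beach")
print(photos.find("puppy"))                    # photo-123 and photo-456
print(photos.find("puppy", "cute"))            # only photo-123
```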
It is a truism that the greatest internet success stories don't advertise their products. Their adoption is driven by "viral marketing"--that is, recommendations propagating directly from one user to another. You can almost make the case that if a site or product relies on advertising to get the word out, it isn't Web 2.0.
Even much of the infrastructure of the web--including the Linux, Apache, MySQL, and Perl, PHP, or Python code involved in most web servers--relies on the peer-production methods of open source, in themselves an instance of collective, net-enabled intelligence. There are more than 100,000 open source software projects listed on SourceForge.net. Anyone can add a project, anyone can download and use the code, and new projects migrate from the edges to the center as a result of users putting them to work, an organic software adoption process relying almost entirely on viral marketing.
The lesson: Network effects from user contributions are the key to market dominance in the Web 2.0 era.

Blogging and the Wisdom of Crowds
One of the most highly touted features of the Web 2.0 era is the rise of blogging. Personal home pages have been around since the early days of the web, and the personal diary and daily opinion column around much longer than that, so just what is the fuss all about?
At its most basic, a blog is just a personal home page in diary format. But as Rich Skrenta notes, the chronological organization of a blog "seems like a trivial difference, but it drives an entirely different delivery, advertising and value chain."
One of the things that has made a difference is a technology called RSS. RSS is the most significant advance in the fundamental architecture of the web since early hackers realized that CGI could be used to create database-backed websites. RSS allows someone to link not just to a page, but to subscribe to it, with notification every time that page changes. Skrenta calls this "the incremental web." Others call it the "live web".
Now, of course, "dynamic websites" (i.e., database-backed sites with dynamically generated content) replaced static web pages well over ten years ago. What's dynamic about the live web are not just the pages, but the links. A link to a weblog is expected to point to a perennially changing page, with "permalinks" for any individual entry, and notification for each change. An RSS feed is thus a much stronger link than, say, a bookmark or a link to a single page.
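To make "subscribe to a page, with notification every time that page changes" concrete, here is a minimal polling sketch against a standard RSS 2.0 feed. The feed URL is a placeholder; a real aggregator would also honor caching headers and poll on a schedule rather than once.

```python
# Minimal RSS 2.0 polling: fetch the feed, remember which entries we have
# already seen, and report only the new ones. The feed URL is a placeholder.

import urllib.request
import xml.etree.ElementTree as ET

FEED_URL = "https://example.com/index.rss"   # placeholder feed address
seen_ids: set[str] = set()

def poll(feed_url: str) -> list[tuple[str, str]]:
    """Return (title, link) pairs for items not seen on earlier polls."""
    with urllib.request.urlopen(feed_url) as response:
        root = ET.fromstring(response.read())
    new_items = []
    for item in root.findall("./channel/item"):
        guid = item.findtext("guid") or item.findtext("link") or ""
        if guid and guid not in seen_ids:
            seen_ids.add(guid)
            new_items.append((item.findtext("title", ""),
                              item.findtext("link", "")))
    return new_items

if __name__ == "__main__":
    for title, link in poll(FEED_URL):
        print(f"new entry: {title} -> {link}")
```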
The Architecture of Participation
Some systems are designed to encourage participation. In his paper, The Cornucopia of the Commons, Dan Bricklin noted that there are three ways to build a large database. The first, demonstrated by Yahoo!, is to pay people to do it. The second, inspired by lessons from the open source community, is to get volunteers to perform the same task. The Open Directory Project, an open source Yahoo competitor, is the result. But Napster demonstrated a third way. Because Napster set its defaults to automatically serve any music that was downloaded, every user automatically helped to build the value of the shared database. This same approach has been followed by all other P2P file sharing services.
One of the key lessons of the Web 2.0 era is this: Users add value. But only a small percentage of users will go to the trouble of adding value to your application via explicit means. Therefore, Web 2.0 companies set inclusive defaults for aggregating user data and building value as a side-effect of ordinary use of the application. As noted above, they build systems that get better the more people use them.
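As a toy illustration of an "inclusive default" (hypothetical, not Napster's actual client), the sketch below shows the shape of the design: sharing what you have downloaded is on unless the user turns it off, so ordinary use grows the shared database as a side effect.

```python
# Hypothetical sketch of an "inclusive default": downloaded files are
# shared back to the network unless the user explicitly opts out.

from dataclasses import dataclass, field

@dataclass
class ClientConfig:
    share_downloads: bool = True          # the inclusive default

@dataclass
class FileSharingClient:
    config: ClientConfig = field(default_factory=ClientConfig)
    shared_library: set = field(default_factory=set)

    def download(self, file_name: str) -> None:
        # ...fetch the file from peers (omitted)...
        if self.config.share_downloads:
            # Ordinary use adds value: the file becomes available to others.
            self.shared_library.add(file_name)

client = FileSharingClient()
client.download("some-song.mp3")
print(client.shared_library)              # shared without any extra user effort
```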
Mitch Kapor once noted that "architecture is politics." Participation is intrinsic to Napster, part of its fundamental architecture.
This architectural insight may also be more central to the success of open source software than the more frequently cited appeal to volunteerism. The architecture of the internet, and the World Wide Web, as well as of open source software projects like Linux, Apache, and Perl, is such that users pursuing their own "selfish" interests build collective value as an automatic byproduct. Each of these projects has a small core, well-defined extension mechanisms, and an approach that lets any well-behaved component be added by anyone, growing the outer layers of what Larry Wall, the creator of Perl, refers to as "the onion." In other words, these technologies demonstrate network effects, simply through the way that they have been designed.
These projects can be seen to have a natural architecture of participation. But as Amazon demonstrates, by consistent effort (as well as economic incentives such as the Associates program), it is possible to overlay such an architecture on a system that would not normally seem to possess it.
RSS also means that the web browser is not the only means of viewing a web page. While some RSS aggregators, such as Bloglines, are web-based, others are desktop clients, and still others allow users of portable devices to subscribe to constantly updated content.
RSS is now being used to push not just notices of new blog entries, but also all kinds of data updates, including stock quotes, weather data, and photo availability. This use is actually a return to one of its roots: RSS was born in 1997 out of the confluence of Dave Winer's "Really Simple Syndication" technology, used to push out blog updates, and Netscape's "Rich Site Summary", which allowed users to create custom Netscape home pages with regularly updated data flows. Netscape lost interest, and the technology was carried forward by blogging pioneer Userland, Winer's company. In the current crop of applications, we see, though, the heritage of both parents.
But RSS is only part of what makes a weblog different from an ordinary web page. Tom Coates remarks on the significance of the permalink:
It may seem like a trivial piece of functionality now, but it was effectively the device that turned weblogs from an ease-of-publishing phenomenon into a conversational mess of overlapping communities. For the first time it became relatively easy to gesture directly at a highly specific post on someone else's site and talk about it. Discussion emerged. Chat emerged. And - as a result - friendships emerged or became more entrenched. The permalink was the first - and most successful - attempt to build bridges between weblogs.
In many ways, the combination of RSS and permalinks adds many of the features of NNTP, the Network News Transfer Protocol of the Usenet, onto HTTP, the web protocol. The "blogosphere" can be thought of as a new, peer-to-peer equivalent to Usenet and bulletin-boards, the conversational watering holes of the early internet. Not only can people subscribe to each others' sites, and easily link to individual comments on a page, but also, via a mechanism known as trackbacks, they can see when anyone else links to their pages, and can respond, either with reciprocal links, or by adding comments.
Interestingly, two-way links were the goal of early hypertext systems like Xanadu. Hypertext purists have celebrated trackbacks as a step towards two-way links. But note that trackbacks are not properly two-way--rather, they are really (potentially) symmetrical one-way links that create the effect of two-way links. The difference may seem subtle, but in practice it is enormous. Social networking systems like Friendster, Orkut, and LinkedIn, which require acknowledgment by the recipient in order to establish a connection, lack the same scalability as the web. As noted by Caterina Fake, co-founder of the Flickr photo sharing service, attention is only coincidentally reciprocal. (Flickr thus allows users to set watch lists--any user can subscribe to any other user's photostream via RSS. The object of attention is notified, but does not have to approve the connection.)
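The trackback mechanism discussed above is, in common implementations, just an HTTP POST of a few form-encoded fields to the target post's trackback URL. The sketch below follows that general shape; the endpoint and field values are placeholders, and real blog software differs in how it validates and answers pings.

```python
# Sketch of sending a trackback ping: an HTTP POST of form-encoded fields
# to the other blog's trackback URL. Endpoint and values are placeholders.

import urllib.parse
import urllib.request

def send_trackback(trackback_url: str, post_url: str, title: str,
                   excerpt: str, blog_name: str) -> str:
    fields = {
        "url": post_url,          # the entry of ours that links to the target post
        "title": title,
        "excerpt": excerpt,
        "blog_name": blog_name,
    }
    data = urllib.parse.urlencode(fields).encode("utf-8")
    request = urllib.request.Request(trackback_url, data=data, method="POST")
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")   # typically a small XML status reply

# Hypothetical usage:
# send_trackback("https://example.org/blog/42/trackback",
#                "https://example.com/my-reply",
#                "A reply to your post", "I think...", "My Weblog")
```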
If an essential part of Web 2.0 is harnessing collective intelligence, turning the web into a kind of global brain, the blogosphere is the equivalent of constant mental chatter in the forebrain, the voice we hear in all of our heads. It may not reflect the deep structure of the brain, which is often unconscious, but is instead the equivalent of conscious thought. And as a reflection of conscious thought and attention, the blogosphere has begun to have a powerful effect.
First, because search engines use link structure to help predict useful pages, bloggers, as the most prolific and timely linkers, have a disproportionate role in shaping search engine results. Second, because the blogging community is so highly self-referential, bloggers paying attention to other bloggers magnifies their visibility and power. The "echo chamber" that critics decry is also an amplifier.
If it were merely an amplifier, blogging would be uninteresting. But like Wikipedia, blogging harnesses collective intelligence as a kind of filter. What James Surowiecki calls "the wisdom of crowds" comes into play, and much as PageRank produces better results than analysis of any individual document, the collective attention of the blogosphere selects for value.
While mainstream media may see individual blogs as competitors, what is really unnerving is that the competition is with the blogosphere as a whole. This is not just a competition between sites, but a competition between business models. The world of Web 2.0 is also the world of what Dan Gillmor calls "we, the media," a world in which "the former audience", not a few people in a back room, decides what's important.
3. Data is the Next Intel Inside
Every significant internet application to date has been backed by a specialized database: Google's web crawl, Yahoo!'s directory (and web crawl), Amazon's database of products, eBay's database of products and sellers, MapQuest's map databases, Napster's distributed song database. As Hal Varian remarked in a personal conversation last year, "SQL is the new HTML." Database management is a core competency of Web 2.0 companies, so much so that we have sometimes referred to these applications as "infoware" rather than merely software.
This fact leads to a key question: Who owns the data?
In the internet era, one can already see a number of cases where control over the database has led to market control and outsized financial returns. The monopoly on domain name registry initially granted by government fiat to Network Solutions (later purchased by Verisign) was one of the first great moneymakers of the internet. While we've argued that business advantage via controlling software APIs is much more difficult in the age of the internet, control of key data sources is not, especially if those data sources are expensive to create or amenable to increasing returns via network effects.
Look at the copyright notices at the base of every map served by MapQuest, maps.yahoo.com, maps.msn.com, or maps.google.com, and you'll see the line "Maps copyright NavTeq, TeleAtlas," or with the new satellite imagery services, "Images copyright Digital Globe." These companies made substantial investments in their databases (NavTeq alone reportedly invested $750 million to build their database of street addresses and directions. Digital Globe spent $500 million to launch their own satellite to improve on government-supplied imagery.) NavTeq has gone so far as to imitate Intel's familiar Intel Inside logo: Cars with navigation systems bear the imprint, "NavTeq Onboard." Data is indeed the Intel Inside of these applications, a sole source component in systems whose software infrastructure is largely open source or otherwise commodified.
The now hotly contested web mapping arena demonstrates how a failure to understand the importance of owning an application's core data will eventually undercut its competitive position. MapQuest pioneered the web mapping category in 1995, yet when Yahoo!, and then Microsoft, and most recently Google, decided to enter the market, they were easily able to offer a competing application simply by licensing the same data.
Contrast, however, the position of Amazon.com. Like competitors such as Barnesandnoble.com, its original database came from ISBN registry provider R.R. Bowker. But unlike MapQuest, Amazon relentlessly enhanced the data, adding publisher-supplied data such as cover images, table of contents, index, and sample material. Even more importantly, they harnessed their users to annotate the data, such that after ten years, Amazon, not Bowker, is the primary source for bibliographic data on books, a reference source for scholars and librarians as well as consumers. Amazon also introduced their own proprietary identifier, the ASIN, which corresponds to the ISBN where one is present, and creates an equivalent namespace for products without one. Effectively, Amazon "embraced and extended" their data suppliers.
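As an illustration of "embrace and extend" at the data layer, the sketch below shows the general shape of such an identifier scheme: reuse the public ISBN when a product has one, and mint an internal identifier when it does not, so every product lives in one namespace. The prefix, numbering, and placeholder ISBN are invented for the example, not Amazon's actual ASIN rules.

```python
# Hypothetical product-identifier namespace: reuse the ISBN when present,
# otherwise mint an internal id so every product still gets one key.

import itertools

class ProductCatalog:
    def __init__(self):
        self._counter = itertools.count(1)
        self._products = {}

    def add(self, title: str, isbn: str = "") -> str:
        # Books keep their public ISBN; everything else gets an internal id.
        product_id = isbn if isbn else f"X{next(self._counter):09d}"
        self._products[product_id] = {"title": title, "isbn": isbn}
        return product_id

    def get(self, product_id: str) -> dict:
        return self._products[product_id]

catalog = ProductCatalog()
print(catalog.add("Some Book Title", isbn="978-0-0000-0000-0"))  # placeholder ISBN
print(catalog.add("Garden hose, 25 ft"))   # no ISBN -> e.g. X000000001
```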
Imagine if MapQuest had done the same thing, harnessing their users to annotate maps and directions, adding layers of value. It would have been much more difficult for competitors to enter the market just by licensing the base data.
The recent introduction of Google Maps provides a living laboratory for the competition between application vendors and their data suppliers. Google's lightweight programming model has led to the creation of numerous value-added services in the form of mashups that link Google Maps with other internet-accessible data sources. Paul Rademacher's housingmaps.com, which combines Google Maps with Craigslist apartment rental and home purchase data to create an interactive housing search tool, is the pre-eminent example of such a mashup.
At present, these mashups are mostly innovative experiments, done by hackers. But entrepreneurial activity follows close behind. And already, one can see that for at least one class of developer, Google has taken the role of data source away from Navteq and inserted themselves as a favored intermediary. We expect to see battles between data suppliers and application vendors in the next few years, as both realize just how important certain classes of data will become as building blocks for Web 2.0 applications.
The race is on to own certain classes of core data: location, identity, calendaring of public events, product identifiers and namespaces. In many cases, where there is significant cost to create the data, there may be an opportunity for an Intel Inside style play, with a single source for the data. In others, the winner will be the company that first reaches critical mass via user aggregation, and turns that aggregated data into a system service.
For example, in the area of identity, PayPal, Amazon's 1-click, and the millions of users of communications systems, may all be legitimate contenders to build a network-wide identity database. (In this regard, Google's recent attempt to use cell phone numbers as an identifier for Gmail accounts may be a step towards embracing and extending the phone system.) Meanwhile, startups like Sxip are exploring the potential of federated identity, in quest of a kind of "distributed 1-click" that will provide a seamless Web 2.0 identity subsystem. In the area of calendaring, EVDB is an attempt to build the world's largest shared calendar via a wiki-style architecture of participation. While the jury's still out on the success of any particular startup or approach, it's clear that standards and solutions in these areas, effectively turning certain classes of data into reliable subsystems of the "internet operating system", will enable the next generation of applications.
A further point must be noted with regard to data, and that is user concerns about privacy and their rights to their own data. In many of the early web applications, copyright is only loosely enforced. For example, Amazon lays claim to any reviews submitted to the site, but in the absence of enforcement, people may repost the same review elsewhere. However, as companies begin to realize that control over data may be their chief source of competitive advantage, we may see heightened attempts at control.
Much as the rise of proprietary software led to the Free Software movement, we expect the rise of proprietary databases to result in a Free Data movement within the next decade. One can see early signs of this countervailing trend in open data projects such as Wikipedia, the Creative Commons, and in software projects like Greasemonkey, which allow users to take control of how data is displayed on their computer.

4. End of the Software Release Cycle
As noted above in the discussion of Google vs. Netscape, one of the defining characteristics of internet era software is that it is delivered as a service, not as a product. This fact leads to a number of fundamental changes in the business model of such a company:
Operations must become a core competency. Google's or Yahoo!'s expertise in product development must be matched by an expertise in daily operations. So fundamental is the shift from software as artifact to software as service that the software will cease to perform unless it is maintained on a daily basis. Google must continuously crawl the web and update its indices, continuously filter out link spam and other attempts to influence its results, continuously and dynamically respond to hundreds of millions of asynchronous user queries, simultaneously matching them with context-appropriate advertisements.
It's no accident that Google's system administration, networking, and load balancing techniques are perhaps even more closely guarded secrets than their search algorithms. Google's success at automating these processes is a key part of their cost advantage over competitors.
It's also no accident that scripting languages such as Perl, Python, PHP, and now Ruby, play such a large role at web 2.0 companies. Perl was famously described by Hassan Schroeder, Sun's first webmaster, as "the duct tape of the internet." Dynamic languages (often called scripting languages and looked down on by the software engineers of the era of software artifacts) are the tool of choice for system and network administrators, as well as application developers building dynamic systems that require constant change.
Users must be treated as co-developers, in a reflection of open source development practices (even if the software in question is unlikely to be released under an open source license.) The open source dictum, "release early and release often" in fact has morphed into an even more radical position, "the perpetual beta," in which the product is developed in the open, with new features slipstreamed in on a monthly, weekly, or even daily basis. It's no accident that services such as Gmail, Google Maps, Flickr, del.icio.us, and the like may be expected to bear a "Beta" logo for years at a time.
Real time monitoring of user behavior to see just which new features are used, and how they are used, thus becomes another required core competency. A web developer at a major online service remarked: "We put up two or three new features on some part of the site every day, and if users don't adopt them, we take them down. If they like them, we roll them out to the entire site."
Cal Henderson, the lead developer of Flickr, recently revealed that they deploy new builds up to every half hour. This is clearly a radically different development model! While not all web applications are developed in as extreme a style as Flickr, almost all web applications have a development cycle that is radically unlike anything from the PC or client-server era. It is for this reason that a recent ZDnet editorial concluded that Microsoft won't be able to beat Google: "Microsoft's business model depends on everyone upgrading their computing environment every two to three years. Google's depends on everyone exploring what's new in their computing environment every day."
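A hedged sketch of the practice described above, with invented names rather than any particular company's tooling: expose a new feature behind a flag, count how often users actually touch it, and retire it if adoption stays below some threshold.

```python
# Hypothetical sketch of "perpetual beta" feature tracking: ship a feature
# behind a flag, count real usage, and retire features nobody adopts.

from collections import Counter

class FeatureRollout:
    def __init__(self, min_uses_per_day: int = 100):
        self.enabled = set()
        self.daily_uses = Counter()
        self.min_uses_per_day = min_uses_per_day

    def launch(self, feature: str) -> None:
        self.enabled.add(feature)

    def record_use(self, feature: str) -> None:
        if feature in self.enabled:
            self.daily_uses[feature] += 1

    def end_of_day_review(self) -> list:
        """Take down features users did not adopt; keep the rest."""
        retired = [f for f in self.enabled
                   if self.daily_uses[f] < self.min_uses_per_day]
        for f in retired:
            self.enabled.discard(f)
        self.daily_uses.clear()
        return retired

rollout = FeatureRollout(min_uses_per_day=3)
rollout.launch("inline-photo-notes")
rollout.launch("map-on-profile")
for _ in range(5):
    rollout.record_use("inline-photo-notes")
print(rollout.end_of_day_review())     # ['map-on-profile'] is taken down
```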
While Microsoft has demonstrated enormous ability to learn from and ultimately best its competition, there's no question that this time, the competition will require Microsoft (and by extension, every other existing software company) to become a deeply different kind of company. Native Web 2.0 companies enjoy a natural advantage, as they don't have old patterns (and corresponding business models and revenue sources) to shed.
A Web 2.0 Investment Thesis
Venture capitalist Paul Kedrosky writes: "The key is to find the actionable investments where you disagree with the consensus". It's interesting to see how each Web 2.0 facet involves disagreeing with the consensus: everyone was emphasizing keeping data private, Flickr/Napster/et al. make it public. It's not just disagreeing to be disagreeable (pet food! online!), it's disagreeing where you can build something out of the differences. Flickr builds communities, Napster built breadth of collection.
Another way to look at it is that the successful companies all give up something expensive but considered critical to get something valuable for free that was once expensive. For example, Wikipedia gives up central editorial control in return for speed and breadth. Napster gave up on the idea of "the catalog" (all the songs the vendor was selling) and got breadth. Amazon gave up on the idea of having a physical storefront but got to serve the entire world. Google gave up on the big customers (initially) and got the 80% whose needs weren't being met. There's something very aikido (using your opponent's force against them) in saying "you know, you're right--absolutely anyone in the whole world CAN update this article. And guess what, that's bad news for you."
--Nat Torkington
5. Lightweight Programming Models
Once the idea of web services became au courant, large companies jumped into the fray with a complex web services stack designed to create highly reliable programming environments for distributed applications.
But much as the web succeeded precisely because it overthrew much of hypertext theory, substituting a simple pragmatism for ideal design, RSS has become perhaps the single most widely deployed web service because of its simplicity, while the complex corporate web services stacks have yet to achieve wide deployment.
Similarly, Amazon.com's web services are provided in two forms: one adhering to the formalisms of the SOAP (Simple Object Access Protocol) web services stack, the other simply providing XML data over HTTP, in a lightweight approach sometimes referred to as REST (Representational State Transfer). While high value B2B connections (like those between Amazon and retail partners like ToysRUs) use the SOAP stack, Amazon reports that 95% of the usage is of the lightweight REST service.
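As a sketch of the lightweight style (against a hypothetical endpoint, not Amazon's actual web services API): the client simply issues an HTTP GET with its parameters in the query string and parses the XML that comes back, with no SOAP envelope or WSDL tooling in between. The endpoint, parameter names, and response shape below are all assumptions made for illustration.

```python
# REST-style web service call: plain HTTP GET with query parameters,
# XML document in the response. The endpoint below is hypothetical.

import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ENDPOINT = "https://webservices.example.com/catalog"   # placeholder URL

def search_products(keyword: str, api_key: str) -> list:
    query = urllib.parse.urlencode({"operation": "ItemSearch",
                                    "keyword": keyword,
                                    "key": api_key})
    with urllib.request.urlopen(f"{ENDPOINT}?{query}") as response:
        root = ET.fromstring(response.read())
    # Assume a simple response shape: <items><item><title/><price/></item>...</items>
    return [{"title": item.findtext("title", ""),
             "price": item.findtext("price", "")}
            for item in root.findall(".//item")]

# Hypothetical usage:
# for product in search_products("web 2.0", api_key="YOUR-KEY"):
#     print(product["title"], product["price"])
```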
This same quest for simplicity can be seen in other "organic" web services. Google's recent release of Google Maps is a case in point. Google Maps' simple AJAX (Javascript and XML) interface was quickly decrypted by hackers, who then proceeded to remix the data into new services.
Mapping-related web services had been available for some time from GIS vendors such as ESRI as well as from MapQuest and Microsoft MapPoint. But Google Maps set the world on fire because of its simplicity. While experimenting with any of the formal vendor-supported web services required a formal contract between the parties, the way Google Maps was implemented left the data for the taking, and hackers soon found ways to creatively re-use that data.
There are several significant lessons here:
Support lightweight programming models that allow for loosely coupled systems. The complexity of the corporate-sponsored web services stack is designed to enable tight coupling. While this is necessary in many cases, many of the most interesting applications can indeed remain loosely coupled, and even fragile. The Web 2.0 mindset is very different from the traditional IT mindset!
Think syndication, not coordination. Simple web services, like RSS and REST-based web services, are about syndicating data outwards, not controlling what happens when it gets to the other end of the connection. This idea is fundamental to the internet itself, a reflection of what is known as the end-to-end principle.
Design for "hackability" and remixability. Systems like the original web, RSS, and AJAX all have this in common: the barriers to re-use are extremely low. Much of the useful software is actually open source, but even when it isn't, there is little in the way of intellectual property protection. The web browser's "View Source" option made it possible for any user to copy any other user's web page; RSS was designed to empower the user to view the content he or she wants, when it's wanted, not at the behest of the information provider; the most successful web services are those that have been easiest to take in new directions unimagined by their creators. The phrase "some rights reserved," which was popularized by the Creative Commons to contrast with the more typical "all rights reserved," is a useful guidepost.
Innovation in Assembly
Lightweight business models are a natural concomitant of lightweight programming and lightweight connections. The Web 2.0 mindset is good at re-use. A new service like housingmaps.com was built simply by snapping together two existing services. Housingmaps.com doesn't have a business model (yet)--but for many small-scale services, Google AdSense (or perhaps Amazon associates fees, or both) provides the snap-in equivalent of a revenue model.
These examples provide an insight into another key web 2.0 principle, which we call "innovation in assembly." When commodity components are abundant, you can create value simply by assembling them in novel or effective ways. Much as the PC revolution provided many opportunities for innovation in assembly of commodity hardware, with companies like Dell making a science out of such assembly, thereby defeating companies whose business model required innovation in product development, we believe that Web 2.0 will provide opportunities for companies to beat the competition by getting better at harnessing and integrating services provided by others.
6. Software Above the Level of a Single Device
One other feature of Web 2.0 that deserves mention is the fact that it's no longer limited to the PC platform. In his parting advice to Microsoft, long time Microsoft developer Dave Stutz pointed out that "Useful software written above the level of the single device will command high margins for a long time to come."
Of course, any web application can be seen as software above the level of a single device. After all, even the simplest web application involves at least two computers: the one hosting the web server and the one hosting the browser. And as we've discussed, the development of the web as platform extends this idea to synthetic applications composed of services provided by multiple computers.
But as with many areas of Web 2.0, where the "2.0-ness" is not something new, but rather a fuller realization of the true potential of the web platform, this phrase gives us a key insight into how to design applications and services for the new platform.
To date, iTunes is the best exemplar of this principle. This application seamlessly reaches from the handheld device to a massive web back-end, with the PC acting as a local cache and control station. There have been many previous attempts to bring web content to portable devices, but the iPod/iTunes combination is one of the first such applications designed from the ground up to span multiple devices. TiVo is another good example.
iTunes and TiVo also demonstrate many of the other core principles of Web 2.0. They are not web applications per se, but they leverage the power of the web platform, making it a seamless, almost invisible part of their infrastructure. Data management is most clearly the heart of their offering. They are services, not packaged applications (although in the case of iTunes, it can be used as a packaged application, managing only the user's local data.) What's more, both TiVo and iTunes show some budding use of collective intelligence, although in each case, their experiments are at war with the IP lobby's. There's only a limited architecture of participation in iTunes, though the recent addition of podcasting changes that equation substantially.
This is one of the areas of Web 2.0 where we expect to see some of the greatest change, as more and more devices are connected to the new platform. What applications become possible when our phones and our cars are not consuming data but reporting it? Real time traffic monitoring, flash mobs, and citizen journalism are only a few of the early warning signs of the capabilities of the new platform.

7. Rich User Experiences
As early as Pei Wei's Viola browser in 1992, the web was being used to deliver "applets" and other kinds of active content within the web browser. Java's introduction in 1995 was framed around the delivery of such applets. JavaScript and then DHTML were introduced as lightweight ways to provide client side programmability and richer user experiences. Several years ago, Macromedia coined the term "Rich Internet Applications" (which has also been picked up by open source Flash competitor Laszlo Systems) to highlight the capabilities of Flash to deliver not just multimedia content but also GUI-style application experiences.
However, the potential of the web to deliver full scale applications didn't hit the mainstream till Google introduced Gmail, quickly followed by Google Maps, web based applications with rich user interfaces and PC-equivalent interactivity. The collection of technologies used by Google was christened AJAX, in a seminal essay by Jesse James Garrett of web design firm Adaptive Path. He wrote:
"Ajax isn't a technology. It's really several technologies, each flourishing in its own right, coming together in powerful new ways. Ajax incorporates:
standards-based presentation using XHTML and CSS;
dynamic display and interaction using the Document Object Model;
data interchange and manipulation using XML and XSLT;
asynchronous data retrieval using XMLHttpRequest;
and JavaScript binding everything together."
Web 2.0 Design Patterns
In his book, A Pattern Language, Christopher Alexander prescribes a format for the concise description of the solution to architectural problems. He writes: "Each pattern describes a problem that occurs over and over again in our environment, and then describes the core of the solution to that problem, in such a way that you can use this solution a million times over, without ever doing it the same way twice."
The Long Tail
Small sites make up the bulk of the internet's content; narrow niches make up the bulk of the internet's possible applications. Therefore: Leverage customer self-service and algorithmic data management to reach out to the entire web, to the edges and not just the center, to the long tail and not just the head.
Data is the Next Intel Inside
Applications are increasingly data-driven. Therefore: For competitive advantage, seek to own a unique, hard-to-recreate source of data.
Users Add Value
The key to competitive advantage in internet applications is the extent to which users add their own data to that which you provide. Therefore: Don't restrict your "architecture of participation" to software development. Involve your users both implicitly and explicitly in adding value to your application.
Network Effects by Default
Only a small percentage of users will go to the trouble of adding value to your application. Therefore: Set inclusive defaults for aggregating user data as a side-effect of their use of the application.
Some Rights Reserved
Intellectual property protection limits re-use and prevents experimentation. Therefore: When benefits come from collective adoption, not private restriction, make sure that barriers to adoption are low. Follow existing standards, and use licenses with as few restrictions as possible. Design for "hackability" and "remixability."
The Perpetual Beta
When devices and programs are connected to the internet, applications are no longer software artifacts, they are ongoing services. Therefore: Don't package up new features into monolithic releases, but instead add them on a regular basis as part of the normal user experience. Engage your users as real-time testers, and instrument the service so that you know how people use the new features.
Cooperate, Don't Control
Web 2.0 applications are built of a network of cooperating data services. Therefore: Offer web services interfaces and content syndication, and re-use the data services of others. Support lightweight programming models that allow for loosely-coupled systems.
Software Above the Level of a Single Device
The PC is no longer the only access device for internet applications, and applications that are limited to a single device are less valuable than those that are connected. Therefore: Design your application from the get-go to integrate services across handheld devices, PCs, and internet servers.
AJAX is also a key component of Web 2.0 applications such as Flickr, now part of Yahoo!, 37signals' applications basecamp and backpack, as well as other Google applications such as Gmail and Orkut. We're entering an unprecedented period of user interface innovation, as web developers are finally able to build web applications as rich as local PC-based applications.
Interestingly, many of the capabilities now being explored have been around for many years. In the late '90s, both Microsoft and Netscape had a vision of the kind of capabilities that are now finally being realized, but their battle over the standards to be used made cross-browser applications difficult. It was only when Microsoft definitively won the browser wars, and there was a single de-facto browser standard to write to, that this kind of application became possible. And while Firefox has reintroduced competition to the browser market, at least so far we haven't seen the destructive competition over web standards that held back progress in the '90s.
We expect to see many new web applications over the next few years, both truly novel applications, and rich web reimplementations of PC applications. Every platform change to date has also created opportunities for a leadership change in the dominant applications of the previous platform.
Gmail has already provided some interesting innovations in email, combining the strengths of the web (accessible from anywhere, deep database competencies, searchability) with user interfaces that approach PC interfaces in usability. Meanwhile, other mail clients on the PC platform are nibbling away at the problem from the other end, adding IM and presence capabilities. How far are we from an integrated communications client combining the best of email, IM, and the cell phone, using VoIP to add voice capabilities to the rich capabilities of web applications? The race is on.
It's easy to see how Web 2.0 will also remake the address book. A Web 2.0-style address book would treat the local address book on the PC or phone merely as a cache of the contacts you've explicitly asked the system to remember. Meanwhile, a web-based synchronization agent, Gmail-style, would remember every message sent or received, every email address and every phone number used, and build social networking heuristics to decide which ones to offer up as alternatives when an answer wasn't found in the local cache. Lacking an answer there, the system would query the broader social network.
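A hedged sketch of the lookup cascade described above, with invented data structures: check the contacts the user explicitly saved, then the addresses quietly harvested from message history, and only then fall back to a wider (here stubbed-out) social-network query.

```python
# Sketch of a Web 2.0-style address lookup cascade: explicit local contacts
# first, then addresses mined from mail history, then the wider network.
# All data structures and the network stub are invented for illustration.

from typing import Optional

class AddressBook:
    def __init__(self):
        self.local_contacts = {}     # name -> email, saved explicitly by the user
        self.message_history = {}    # name -> email, seen in sent/received mail

    def remember_message(self, name: str, email: str) -> None:
        """Every message sent or received quietly enriches the data."""
        self.message_history[name] = email

    def lookup(self, name: str) -> Optional[str]:
        if name in self.local_contacts:              # 1. explicit local cache
            return self.local_contacts[name]
        if name in self.message_history:             # 2. implicit, usage-derived data
            return self.message_history[name]
        return self._query_social_network(name)      # 3. broader network (stub)

    def _query_social_network(self, name: str) -> Optional[str]:
        return None   # placeholder for a remote, network-wide identity service

book = AddressBook()
book.remember_message("Dale Dougherty", "dale@example.com")
print(book.lookup("Dale Dougherty"))   # found via message history, never explicitly saved
```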
A Web 2.0 word processor would support wiki-style collaborative editing, not just standalone documents. But it would also support the rich formatting we've come to expect in PC-based word processors. Writely is a good example of such an application, although it hasn't yet gained wide traction.
Nor will the Web 2.0 revolution be limited to PC applications. Salesforce.com demonstrates how the web can be used to deliver software as a service, in enterprise scale applications such as CRM.
The competitive opportunity for new entrants is to fully embrace the potential of Web 2.0. Companies that succeed will create applications that learn from their users, using an architecture of participation to build a commanding advantage not just in the software interface, but in the richness of the shared data.
Core Competencies of Web 2.0 Companies
In exploring the seven principles above, we've highlighted some of the principal features of Web 2.0. Each of the examples we've explored demonstrates one or more of those key principles, but may miss others. Let's close, therefore, by summarizing what we believe to be the core competencies of Web 2.0 companies:
Services, not packaged software, with cost-effective scalability
Control over unique, hard-to-recreate data sources that get richer as more people use them
Trusting users as co-developers
Harnessing collective intelligence
Leveraging the long tail through customer self-service
Software above the level of a single device
Lightweight user interfaces, development models, AND business models
The next time a company claims that it's "Web 2.0," test their features against the list above. The more points they score, the more they are worthy of the name. Remember, though, that excellence in one area may be more telling than some small steps in all seven.
Tim O'Reilly
O’Reilly Media, Inc., tim@oreilly.com
President and CEO
Copyright © 2007 O'Reilly Media, Inc.