Overview of TCP/IP and the Internet
An Overview of TCP/IP Protocols
and the Internet
Gary C. Kessler
kumquat@sover.net
16 January 2007
This paper was originally submitted to the InterNIC and posted on their Gopher site on 5 August 1994. This document is a continually updated version of that paper.
Contents
1. Introduction
2. What are TCP/IP and the Internet?
2.1. The Evolution of TCP/IP (and the Internet)
2.2. Internet Growth
2.3. Internet Administration
2.4. Domain Names and IP Addresses (and Politics)
3. The TCP/IP Protocol Architecture
3.1. The Network Interface Layer
3.1.1. PPP
3.2. The Internet Layer
3.2.1. IP Addressing and Subnet Masks
3.2.2. Conserving IP Addresses: CIDR, DHCP, NAT, and PAT
3.2.3. The Domain Name System
3.2.4. ARP and Address Resolution
3.2.5. IP Routing: OSPF, RIP, and BGP
3.2.6. IP version 6
3.3. The Transport Layer Protocols
3.3.1. Ports
3.3.2. TCP
3.3.3. UDP
3.3.4. ICMP
3.3.5. TCP Logical Connections and ICMP
3.4. The TCP/IP Application Layer
3.4.1. TCP and UDP Applications
3.4.2. Protocol Analysis
3.5. Summary
4. Other Information Sources
5. Acronyms and Abbreviations
6. Author's Address
An increasing number of people are using the Internet and, many for the first time, are using the tools and utilities that at one time were only available on a limited number of computer systems (and only for really intense users!). One sign of this growth in use has been the significant number of TCP/IP and Internet books, articles, courses, and even TV shows that have become available in the last several years; there are so many such books that publishers are reluctant to authorize more because bookstores have reached their limit of shelf space! This memo provides a broad overview of the Internet and TCP/IP, with an emphasis on history, terms, and concepts. It is meant as a brief guide and starting point, referring to many other sources for more detailed information.
While the TCP/IP protocols and the Internet are different, their histories are most definitely intertwingled! This section will discuss some of the history. For additional information and insight, readers are urged to read two excellent histories of the Internet: Casting The Net: From ARPANET to INTERNET and beyond... by Peter Salus (Addison-Wesley, 1995) and Where Wizards Stay Up Late: The Origins of the Internet by Katie Hafner and Matthew Lyon (Simon & Schuster, 1997). In addition, the Internet Society maintains a number of on-line "Internet history" papers at http://www.isoc.org/internet/history/.
While the Internet today is recognized as a network that is fundamentally changing social, political, and economic structures, and in many ways obviating geographic boundaries, this potential is merely the realization of predictions that go back nearly forty years. In a series of memos dating back to August 1962, J.C.R. Licklider of MIT discussed his "Galactic Network" and how social interactions could be enabled through networking. The Internet certainly provides such a national and global infrastructure and, in fact, interplanetary Internet communication has already been seriously discussed.
Prior to the 1960s, what little computer communication existed comprised simple text and binary data, carried by the most common telecommunications network technology of the day; namely, circuit switching, the technology of the telephone networks for nearly a hundred years. Because most data traffic is bursty in nature (i.e., most of the transmissions occur during a very short period of time), circuit switching results in highly inefficient use of network resources.
The fundamental technology that makes the Internet work is called packet switching. In a packet-switched data network, all components (i.e., hosts and switches) operate independently, eliminating single points of failure. In addition, network communication resources appear to be dedicated to individual users but, in fact, statistical multiplexing and an upper limit on the size of a transmitted entity result in fast, economical networks.
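The efficiency argument can be made concrete with some back-of-the-envelope arithmetic. The user count, duty cycle, and line rate below are illustrative assumptions, not figures from the text:

```python
# Rough sketch of why statistical multiplexing pays off for bursty traffic.
# All numbers are illustrative assumptions.
users = 50               # hosts sharing the network
duty_cycle = 0.10        # each host transmits only ~10% of the time (bursty)
peak_rate_kbps = 50      # per-host peak rate, reminiscent of ARPANET-era lines

# Circuit switching must dedicate a full-rate circuit to every host.
circuit_capacity = users * peak_rate_kbps

# Packet switching only needs to carry the average offered load
# (plus headroom for queueing, ignored in this sketch).
average_load = users * duty_cycle * peak_rate_kbps

print(f"circuit switching: {circuit_capacity} kbps reserved")
print(f"packet switching:  {average_load:.0f} kbps average load")
```

With these assumed numbers, the dedicated-circuit approach reserves ten times the capacity that the traffic actually needs on average, which is exactly the inefficiency described above.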
In the 1960s, packet switching was ready to be discovered. In 1961, Leonard Kleinrock of MIT published the first paper on packet switching theory (and the first book on the subject in 1964). In 1962, Paul Baran of the Rand Corporation described a robust, efficient, store-and-forward data network in a report for the U.S. Air Force. At about the same time, Donald Davies and Roger Scantlebury suggested a similar idea from work at the National Physical Laboratory (NPL) in the U.K. The research at MIT (1961-1967), RAND (1962-1965), and NPL (1964-1967) occurred independently and the principal researchers did not all meet together until the Association for Computing Machinery (ACM) meeting in 1967. The term packet was adopted from the work at NPL.
The modern Internet began as a U.S. Department of Defense (DoD) funded experiment to interconnect DoD-funded research sites in the U.S. The 1967 ACM meeting was also where the initial design for the so-called ARPANET — named for the DoD's Advanced Research Projects Agency (ARPA) — was first published by Larry Roberts. In December 1968, ARPA awarded a contract to Bolt Beranek and Newman (BBN) to design and deploy a packet switching network with a proposed line speed of 50 kbps. In September 1969, the first node of the ARPANET was installed at the University of California at Los Angeles (UCLA), followed monthly with nodes at Stanford Research Institute (SRI), the University of California at Santa Barbara (UCSB), and the University of Utah. With four nodes by the end of 1969, the ARPANET spanned the continental U.S. by 1971 and had connections to Europe by 1973.
The original ARPANET gave life to a number of protocols that were new to packet switching. One of the most lasting results of the ARPANET was the development of a user-network protocol that has become the standard interface between users and packet switched networks; namely, ITU-T (formerly CCITT) Recommendation X.25. This "standard" interface encouraged BBN to start Telenet, a commercial packet-switched data service, in 1974; after much renaming, Telenet became a part of Sprint's X.25 service.
The initial host-to-host communications protocol introduced in the ARPANET was called the Network Control Protocol (NCP). Over time, however, NCP proved to be incapable of keeping up with the growing network traffic load. In 1974, a new, more robust suite of communications protocols was proposed and implemented throughout the ARPANET, based upon the Transmission Control Protocol (TCP) for end-to-end network communication. It seemed like overkill, however, for the intermediate gateways (what we would today call routers) to have to deal with an end-to-end protocol, so in 1978 a new design split the responsibilities between a pair of protocols: the new Internet Protocol (IP) for routing packets and device-to-device communication (i.e., host-to-gateway or gateway-to-gateway), and TCP for reliable, end-to-end host communication. Since TCP and IP were originally envisioned functionally as a single protocol, the protocol suite, which actually refers to a large collection of protocols and applications, is usually referred to simply as TCP/IP.
The original versions of both TCP and IP that are in common use today were written in September 1981, although both have had several modifications applied to them (in addition, the IP version 6, or IPv6, specification was released in December 1995). In 1983, the DoD mandated that all of their computer systems would use the TCP/IP protocol suite for long-haul communications, further enhancing the scope and importance of the ARPANET.
In 1983, the ARPANET was split into two components. One component, still called ARPANET, was used to interconnect research/development and academic sites; the other, called MILNET, was used to carry military traffic and became part of the Defense Data Network. That year also saw a huge boost in the popularity of TCP/IP with its inclusion in the communications kernel for the University of California's UNIX implementation, 4.2BSD (Berkeley Software Distribution) UNIX.
In 1986, the National Science Foundation (NSF) built a backbone network to interconnect four NSF-funded regional supercomputer centers and the National Center for Atmospheric Research (NCAR). This network, dubbed the NSFNET, was originally intended as a backbone for other networks, not as an interconnection mechanism for individual systems. Furthermore, the "Appropriate Use Policy" defined by the NSF limited traffic to non-commercial use. The NSFNET continued to grow and provide connectivity between both NSF-funded and non-NSF regional networks, eventually becoming the backbone that we know today as the Internet. Although early NSFNET applications were largely multiprotocol in nature, TCP/IP was employed for interconnectivity (with the ultimate goal of migration to Open Systems Interconnection).
The NSFNET originally comprised 56-kbps links and was completely upgraded to T1 (1.544 Mbps) links in 1989. Migration to a "professionally-managed" network was supervised by a consortium comprising Merit (a Michigan state regional network headquartered at the University of Michigan), IBM, and MCI. Advanced Network & Services, Inc. (ANS), a non-profit company formed by IBM and MCI, was responsible for managing the NSFNET and supervising the transition of the NSFNET backbone to T3 (44.736 Mbps) rates by the end of 1991. During this period of time, the NSF also funded a number of regional Internet service providers (ISPs) to provide local connection points for educational institutions and NSF-funded sites.
In 1993, the NSF decided that it did not want to be in the business of running and funding networks, but wanted instead to go back to the funding of research in the areas of supercomputing and high-speed communications. In addition, there was increased pressure to commercialize the Internet; in 1989, a trial gateway connected MCI, CompuServe, and Internet mail services, and commercial users were now finding out about all of the capabilities of the Internet that once belonged exclusively to academic and hard-core users! In 1991, the Commercial Internet Exchange (CIX) Association was formed by General Atomics, Performance Systems International (PSI), and UUNET Technologies to promote and provide a commercial Internet backbone service. Nevertheless, there remained intense pressure from non-NSF ISPs to open the network to all users.
FIGURE 1. NSFNET structure initiated in 1994 to merge the academic and commercial networks.
In 1994, a plan was put in place to reduce the NSF's role in the public Internet. The new structure comprised three parts:
- Network Access Points (NAPs), where individual ISPs would interconnect, as suggested in Figure 1. The NSF originally funded four such NAPs: Chicago (operated by Ameritech), New York (really Pensauken, NJ, operated by Sprint), San Francisco (operated by Pacific Bell, now SBC), and Washington, D.C. (MAE-East, operated by MFS, now part of Worldcom).
- The very High Speed Backbone Network Service (vBNS), a network interconnecting the NAPs and NSF-funded centers, operated by MCI. This network was installed in 1995 and operated at OC-3 (155.52 Mbps); it was completely upgraded to OC-12 (622.08 Mbps) in 1997.
- The Routing Arbiter, to ensure adequate routing protocols for the Internet.
In addition, NSF-funded ISPs were given five years of reduced funding to become commercially self-sufficient. This funding ended by 1998, and a proliferation of additional NAPs has created a "melting pot" of services. Today's terminology refers to three tiers of ISP:
- Tier 1 refers to national ISPs, or those that have a national presence and connect to at least three of the original four NAPs. National ISPs include AT&T, Sprint, and Worldcom.
- Tier 2 refers to regional ISPs, or those that have primarily a regional presence and connect to fewer than three of the original four NAPs. Regional ISPs include Adelphia, BellAtlantic.net, and BellSouth.net.
- Tier 3 refers to local ISPs, or those that do not connect to a NAP but offer services via an upstream ISP.
It is worth saying a few words about the NAPs. The NSF provided major funding for the four NAPs mentioned above, but they needed to have additional customers to remain economically viable. Some companies — such as then-Metropolitan Fiber Systems (MFS) — decided to build other NAP sites. One of MFS' first sites was MAE-East, where "MAE" stood for "Metropolitan Area Ethernet." MAE-East was merely a point where ISPs could interconnect, which they did by buying a router and placing it at the MAE-East facility. The original MAE-East provided a 10 Mbps Ethernet LAN to interconnect the ISPs' routers, hence the name. The Ethernet LAN was eventually replaced with a 100 Mbps FDDI ring, and the "E" then became "Exchange." Over the years, MFS/MCI Worldcom has added sites in San Jose, CA (MAE-West), Los Angeles, Dallas, and Houston.
Other companies also operate their own NAPs. Savvis, for example, operates an international Internet service and has built more than a dozen private NAPs in North America. Many large service providers bypass the NAPs entirely by creating bilateral agreements whereby they directly route traffic coming from one network and going to the other. Before their merger in 1998, for example, MCI and LDDS Worldcom had more than 10 DS-3 (44.736 Mbps) lines interconnecting the two networks.
The North American Network Operators Group (NANOG) provides a forum for the exchange of technical information and the discussion of implementation issues that require coordination among network service providers. Meeting three times a year, NANOG is an essential element in maintaining stable Internet services in North America. Initially funded by the NSF, NANOG currently receives funds from conference registration fees and vendor donations.
In 1988, meanwhile, the DoD and most of the U.S. Government chose to adopt OSI protocols. TCP/IP was now viewed as an interim, proprietary solution since it ran only on limited hardware platforms and OSI products were only a couple of years away. The DoD mandated that all computer communications products would have to use OSI protocols by August 1990 and use of TCP/IP would be phased out. Subsequently, the U.S. Government OSI Profile (GOSIP) defined the set of protocols that would have to be supported by products sold to the federal government and TCP/IP was not included.
Despite this mandate, development of TCP/IP continued during the late 1980s as the Internet grew. TCP/IP development had always been carried out in an open environment (although the size of this open community was small due to the small number of ARPA/NSF sites), based upon the creed "We reject kings, presidents, and voting. We believe in rough consensus and running code" [Dave Clark, M.I.T.]. OSI products were still a couple of years away while TCP/IP became, in the minds of many, the real open systems interconnection protocol suite.
It is not the purpose of this memo to take a position in the OSI vs. TCP/IP debate (although it is absolutely clear that TCP/IP offers the primary goals of OSI; namely, a universal, non-proprietary data communications protocol. In fact, TCP/IP does far more than was ever envisioned for OSI — or for TCP/IP itself, for that matter). But before TCP/IP prevailed and OSI sort of dwindled into nothingness, many efforts were made to bring the two communities together. The ISO Development Environment (ISODE) was developed in 1990, for example, to provide an approach for OSI migration for the DoD. ISODE software allows OSI applications to operate over TCP/IP. During this same period, the Internet and OSI communities started to work together to bring about the best of both worlds as many TCP and IP features started to migrate into OSI protocols, particularly the OSI Transport Protocol class 4 (TP4) and the Connectionless Network Layer Protocol (CLNP), respectively. Finally, a report from the National Institute of Standards and Technology (NIST) in 1994 suggested that GOSIP should incorporate TCP/IP and drop the "OSI-only" requirement. [NOTE: Some industry observers have pointed out that OSI represents the ultimate example of a sliding window; OSI protocols have been "two years away" since about 1986.]
None of this is meant to suggest that the NSF isn't funding Internet-class research networks anymore. That is precisely the function of Internet2 (http://www.internet2.edu/), a consortium of nearly 200 universities working in partnership with industry and government to develop and deploy advanced network applications and technologies for the next-generation Internet. Goals of Internet2 are to create a leading-edge network capability for the national research community, enable the development of new Internet-based applications, and quickly move these new network services and applications to the commercial sector.
In Douglas Adams' The Hitchhiker's Guide to the Galaxy (Pocket Books, 1979), the hitchhiker describes outer space as being "...big. Really big. ...vastly hugely mind-bogglingly big..." A similar description can be applied to the Internet. To paraphrase the hitchhiker, you may think that your 750 node LAN is big, but that's just peanuts compared to the Internet.
The ARPANET started with four nodes in 1969 and grew to just under 600 nodes before it was split in 1983. The NSFNET also started with a modest number of sites in 1986. After that, the network experienced literally exponential growth. Internet growth between 1981 and 1991 is documented in "Internet Growth (1981-1991)" (RFC 1296).
The Internet Software Consortium hosts the Internet Domain Survey (with technical support from Network Wizards, who originated the survey). According to their chart, the Internet had nearly 30 million reachable hosts by January 1998 and over 56 million by July 1999. Dedicated residential access methods, such as cable modem and asymmetric digital subscriber line (ADSL) technologies, are undoubtedly the reason that this number shot up to over 171 million by January 2003. During the boom years of the 1990s, the Internet was growing at a rate of about a new network attachment every half-hour, interconnecting hundreds of thousands of networks. It was estimated that the Internet was doubling in size every ten to twelve months and traffic was doubling every 100 days (for 1000% annual growth). For the last several years, the number of nodes has been growing at a rate of about 50% annually, and traffic continues to keep pace with that growth.
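A doubling period converts to an annual growth factor with simple arithmetic, which is worth checking against the quoted "1000%" figure:

```python
# Traffic that doubles every d days grows by a factor of 2**(365/d) per year.
doubling_period_days = 100
annual_factor = 2 ** (365 / doubling_period_days)
annual_growth_pct = (annual_factor - 1) * 100

print(f"~{annual_factor:.1f}x per year (~{annual_growth_pct:.0f}% growth)")
```

Doubling every 100 days actually works out to roughly a 12.5-fold increase per year, so the oft-quoted "1000% annual growth" is a slightly rounded-down figure.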
And what of the original ARPANET? It grew smaller and smaller during the late 1980s as sites and traffic moved to the Internet, and was decommissioned in July 1990. Cerf & Kahn ("Selected ARPANET Maps," Computer Communications Review, October 1990) re-printed a number of network maps documenting the growth (and demise) of the ARPANET.
The Internet has no single owner, yet everyone owns (a portion of) the Internet. The Internet has no central operator, yet everyone operates (a portion of) the Internet. The Internet has been compared to anarchy, but some claim that it is not nearly that well organized!
Some central authority is required for the Internet, however, to manage those things that can only be managed centrally, such as addressing, naming, protocol development, standardization, etc. Among the significant Internet authorities are:
- The Internet Society (ISOC), chartered in 1992, is a non-governmental international organization providing coordination for the Internet and its internetworking technologies and applications. ISOC also provides oversight and communications for the Internet Activities Board.
- The Internet Activities Board (IAB) governs administrative and technical activities on the Internet.
- The Internet Engineering Task Force (IETF) is one of the two primary bodies of the IAB. The IETF's working groups have primary responsibility for the technical activities of the Internet, including writing specifications and protocols. The impact of these specifications is significant enough that ISO accredited the IETF as an international standards body at the end of 1994. RFCs 2028 and 2031 describe the organizations involved in the IETF standards process and the relationship between the IETF and ISOC, respectively, while RFC 2418 describes the IETF working group guidelines and procedures. The background and history of the IETF and the Internet standards process can be found in "IETF—History, Background, and Role in Today's Internet."
- The Internet Engineering Steering Group (IESG) is the other body of the IAB. The IESG provides direction to the IETF.
- The Internet Research Task Force (IRTF) comprises a number of long-term research groups, promoting research of importance to the evolution of the future Internet.
- The Internet Engineering Planning Group (IEPG) coordinates worldwide Internet operations. This group also assists Internet Service Providers (ISPs) to interoperate within the global Internet.
- The Forum of Incident Response and Security Teams (FIRST) is the coordinator of a number of Computer Emergency Response Teams (CERTs) representing many countries, governmental agencies, and ISPs throughout the world. Internet network security is greatly enhanced and facilitated by the FIRST member organizations.
- The World Wide Web Consortium (W3C) is not an Internet administrative body, per se, but since October 1994 has taken a lead role in developing common protocols for the World Wide Web to promote its evolution and ensure its interoperability. W3C has more than 400 Member organizations internationally. The W3C, then, is leading the technical evolution of the Web, having already developed more than 20 technical specifications for the Web's infrastructure.
Although not directly related to the administration of the Internet for operational purposes, the assignment of Internet domain names (and IP addresses) is the subject of some controversy and a lot of current activity. Internet hosts use a hierarchical naming structure comprising a top-level domain (TLD), domain and subdomain (optional), and host name. The IP address space, and all TCP/IP-related numbers, have historically been managed by the Internet Assigned Numbers Authority (IANA). Domain names are assigned by the TLD naming authority; until April 1998, the Internet Network Information Center (InterNIC) had overall authority of these names, with NICs around the world handling non-U.S. domains. The InterNIC was also responsible for the overall coordination and management of the Domain Name System (DNS), the distributed database that reconciles host names and IP addresses on the Internet.
The InterNIC is an interesting example of the recent changes in the Internet. From early 1993, Network Solutions, Inc. (NSI) operated the registry tasks of the InterNIC on behalf of the NSF and had exclusive registration authority for the .com, .org, .net, and .edu domains. NSI's contract ran out in April 1998 and was extended several times because no other agency was in place to continue the registration for those domains. In October 1998, it was decided that NSI would remain the sole administrator for those domains but that a plan needed to be put into place so that users could register names in those domains with other firms. In addition, NSI's contract was extended to September 2000, although the registration business was opened to competition in June 1999. Nevertheless, when NSI's original InterNIC contract expired, IP address assignments moved to a new entity called the American Registry for Internet Numbers (ARIN). (And NSI itself was purchased by VeriSign in March 2000.)
The newest body to handle governance of global Top Level Domain (gTLD) registrations is the Internet Corporation for Assigned Names and Numbers (ICANN). Formed in October 1998, ICANN is the organization designated by the U.S. National Telecommunications and Information Administration (NTIA) to administer the DNS. Although surrounded by some early controversy (which is well beyond the scope of this paper!), ICANN has received wide industry support. ICANN has created several Support Organizations (SOs) to create policy for the administration of its areas of responsibility, including domain names (DNSO), IP addresses (ASO), and protocol parameter assignments (PSO).
On April 21, 1999, ICANN announced that five companies had been selected to be part of this new competitive Shared Registry System for the .com, .net, and .org domains:
- America Online, Inc. (U.S.)
- CORE (Internet Council of Registrars) (International)
- France Telecom/Oléane (France)
- Melbourne IT (Australia)
- register.com (U.S.)
Phase I of the competitive registrar testbed program was scheduled to run until June 1999, although that date was subsequently extended to August; at the end of Phase I, the Shared Registry System for the .com, .net, and .org domains was to be opened to all ICANN-accredited registrars. By the end of 1999, ICANN had added an additional 29 registrars, and more have been added since, so that there are about 100 different registrars today. Definitive ICANN registrar accreditation information can be found at ICANN's Web site.
The hierarchical structure of domain names is best understood if the domain name is read from right to left. Internet host names end with a top-level domain name. World-wide generic top-level domains (TLDs) include:
- .com: Commercial organizations (administered by VeriSign Global Registry Services through the Shared Registry System)
- .edu: Educational institutions; largely limited to 4-year colleges and universities from about 1994 to 2001, but also includes some community colleges (administered by EDUCAUSE)
- .net: Network providers; largely limited to hosts actually part of an operational network from about 1994 to 2001, but now open to anyone, including the author of this paper! (administered by VeriSign Global Registry Services through the Shared Registry System)
- .org: Non-profit organizations (administered by VeriSign; after January 2003, will be administered by the Public Interest Registry (PIR), an organization formed by ISOC with operational control subcontracted to Afilias, the operator of the .info domain)
- .int: Organizations established by international treaty
- .gov: U.S. Federal government agencies (managed by the U.S. General Services Administration, including the fed.us domain)
- .mil: U.S. military (managed by the U.S. Department of Defense Network Information Center)
The host name poodle.champlain.edu, for example, is assigned to a computer named poodle (don't ask why...) in the Accounting & Computing Systems Division at Champlain College (champlain), within the educational TLD (edu). The host name mail.sover.net refers to a host (mail) in the SoverNet domain (sover) within the network provider TLD (net). Guidelines for selecting host names are the subject of RFC 1178.
Other top-level domain names use the two-letter country codes defined in ISO standard 3166; munnari.oz.au, for example, is the address of the Internet gateway to Australia and myo.inst.keio.ac.jp is a host at the Science and Technology Department of Keio University in Yokohama, Japan. Other ISO 3166-based domain country codes are ca (Canada), de (Germany), es (Spain), fr (France), gb (Great Britain) [NOTE: For some historical reasons, the TLD .gb is rarely used; the TLD .uk (United Kingdom) seems to be preferred although UK is not an official ISO 3166 country code.], ie (Ireland), il (Israel), mx (Mexico), and us (United States). It is important to note that there is not necessarily any correlation between a country code and where a host is actually physically located.
There are several registries responsible for blocks of IP addresses and domain naming policies around the globe. The American Registry for Internet Numbers (ARIN) was originally responsible for the Americas (western hemisphere) and parts of Africa. In 2002, the Latin American and Caribbean Internet Addresses Registry (LACNIC) was officially recognized and now covers Central and South America, as well as some Caribbean nations. The African Regional Internet Registry (AfriNIC), still on a provisional status, will be assuming responsibility for sub-Saharan Africa. Eventually, ARIN will only cover North America and parts of the Caribbean. The European and Asia-Pacific naming registries are managed by Réseaux IP Européens (RIPE) and the Asia-Pacific NIC (APNIC), respectively.
These authorities, in turn, delegate most of the country TLDs to national registries (such as RNP in Brazil and NIC-Mexico), which have ultimate authority to assign local domain names. An excellent overview of the recent history and anticipated future of the registry system can be found in "Development of the Regional Internet Registry System" (D. Karrenberg et al.) in the IP Journal, Vol. 4, No. 4.
Different countries may organize the country-based subdomains in any way that they want. Many countries use a subdomain similar to the TLDs, so that .com.mx and .edu.mx are the suffixes for commercial and educational institutions in Mexico, and .co.uk and .ac.uk are the suffixes for commercial and educational institutions in the United Kingdom.
The us domain is largely organized on the basis of geography or function. Geographical names in the us name space use names of the form entity-name.city-telegraph-code.state-postal-code.us. The domain name cnri.reston.va.us, for example, refers to the Corporation for National Research Initiatives in Reston, Virginia. Functional branches are also reserved within the name space for schools (K12), community colleges (CC), technical schools (TEC), state government agencies (STATE), councils of governments (COG), libraries (LIB), museums (MUS), and several other generic types of entities. Domain names in the state government name space usually take the form department.state.state-postal-code.us (e.g., the domain name dps.state.vt.us points to the Vermont Department of Public Safety). The K12 name space can vary widely, usually using the form school.school-district.k12.state-postal-code.us (e.g., the domain ccs.cssd.k12.vt.us refers to the Charlotte Central School in the Chittenden South School District, which happens to be in Charlotte, Vermont). More information about the us domain may be found in RFC 1480.
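The right-to-left reading described above can be sketched in a few lines of code. The `name_hierarchy` helper below is hypothetical, written for this illustration only, not part of any DNS library:

```python
# Split a host name into its labels and reverse them, so the hierarchy
# reads from the most general label (the TLD) down to the host itself.
def name_hierarchy(hostname: str) -> list[str]:
    return list(reversed(hostname.lower().split(".")))

print(name_hierarchy("poodle.champlain.edu"))
# ['edu', 'champlain', 'poodle']  -- TLD, then domain, then host

print(name_hierarchy("ccs.cssd.k12.vt.us"))
# ['us', 'vt', 'k12', 'cssd', 'ccs']  -- country, state, branch, district, school
```

Each successive label narrows the scope, which is exactly why the DNS can delegate each level of the name to a different authority.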
The scheme of TLD assignment and management has worked well for many years, but the pressures of increased commercial activity, network size, and international use have caused controversy about how names can be fairly assigned without violating trademarks and conflicting claims to names. In November 1996, an Internet International Ad Hoc Committee (IAHC) was formed to resolve some of these naming issues and to act as a focal point for the international debate over a proposal to establish additional global naming registries and global gTLDs. The IAHC was dissolved in May 1997 with the publication of the Generic Top Level Domain Memorandum of Understanding (gTLD-MoU) framework. The Council of Registrars (CORE) is an operational body made up of all of the Registrars established under the gTLD-MoU framework.
In November 2000, the first new set of TLDs was approved by ICANN; the first of these went online in October 2001. The seven new TLDs, their purpose, and applicants are:
- .aero - aviation industry, application by Societe Internationale de Telecommunications Aeronautiques SC (SITA)
- .biz - businesses, application by JVTeam, LLC (administered by NeuLevel)
- .coop - business cooperatives, application by National Cooperative Business Association (NCBA)
- .info - general use, application by Afilias, LLC (administered by Afilias)
- .museum - museums, application by Museum Domain Management Association (MDMA)
- .name - individuals, application by Global Name Registry, LTD
- .pro - professionals, application by RegistryPro, LTD
More information about these TLDs, the registration process, and new TLDs can be found at the ICANN New TLD Program Web page.
Last but not least, there is the never-ending issue of who owns domain names and IP addresses. I will make no claim to provide an authoritative answer but... domain names are owned by whoever registers them. This alone is a potential problem. Some ISPs are obtaining names on behalf of their customers and paying the annual fee. The issue has already arisen: who owns the name, the ISP or the customer? Most ISPs have stated that they believe that the customer owns the name, even if the ISP registers it, because there would be no reason for the ISP to keep the name. Consider, however, that if an ISP insisted that it owned a name, it would essentially tie a customer to that ISP forever, destroying the concept of domain name portability.
There is also an issue of violation of trademark, service mark, or copyright in the choice and ownership of domain names. Consider this example from the 2001 era. A common Microsoft tag line is "Where Would You Like to Go Today?" It so happens that the domain name wherewouldyouliketogotoday.com was registered to The Eagles Nest in Corfu, NY. I don't know anything about The Eagles Nest of Corfu, NY, but it should not be mistaken for either Eagles Nest Enterprises of Grapevine, TX (the owner of eaglesnest.com) or The Eagles Nest Internet Services of Newark, OH (owner of theeaglesnest.com).
In any case, suppose that Microsoft decided that someone else's use of its tag line was not in its best interest and pursued the issue; could it wrest that domain name away from the registrant? Today's general rule of thumb is that if an organization believes that its name or mark is being used in someone else's domain name in an unfair or misleading way, it can take legal action against the name holder, and the assignment of the name will be held up pending the outcome of the legal action. More information about this issue can be found at ICANN's Uniform Domain-Name Dispute-Resolution Policy Web page. This is, of course, the question behind the cottage industry of cybersquatting, in which someone registers a domain name hoping that someone else will buy it from them later on!
And what about IP addresses? Prior to the widespread use of CIDR (see Section 3.2.1), individual organizations were assigned an address (usually a Class C!) and a domain name at the same time. In general, the holder of the domain name owned the IP address and, if they changed ISPs, routing tables throughout the Internet were updated.
Today, ISPs are assigned addresses in blocks called CIDR blocks. A customer today, whether they already own a domain name or are obtaining a new one, will be assigned an IP address from the ISP‘s CIDR block. If the customer changes ISP, they have to relinquish the IP address.
A good overview of the naming and addressing procedures can be found in RFC 2901, titled "Guide to Administrative Procedures of the Internet Infrastructure."
TCP/IP is most commonly associated with the Unix operating system. While developed separately, they have been historically tied, as mentioned above, since 4.2BSD Unix started bundling TCP/IP protocols with the operating system. Nevertheless, TCP/IP protocols are available for all widely-used operating systems today and native TCP/IP support is provided in OS/2, OS/400, and Windows 9x/NT/2000, as well as most Unix variants.
Figure 2 shows the TCP/IP protocol architecture; this diagram is by no means exhaustive, but shows the major protocol and application components common to most commercial TCP/IP software packages and their relationship.
Application Layer:       HTTP, FTP, Telnet, Finger, SSH, DNS, POP3/IMAP, SMTP,
                         Gopher, BGP, Time/NTP, Whois, TACACS+, SSL, SNMP, RIP,
                         RADIUS, Archie, Traceroute, TFTP, Ping

Transport Layer:         TCP, UDP, ICMP, OSPF

Internet Layer:          IP, ARP

Network Interface Layer: Ethernet/802.3, Token Ring (802.5), SNAP/802.2, X.25,
                         FDDI, ISDN, Frame Relay, SMDS, ATM,
                         Wireless (WAP, CDPD, 802.11), Fibre Channel,
                         DDS/DS0/T-carrier/E-carrier, SONET/SDH, DWDM,
                         PPP, HDLC, SLIP/CSLIP, xDSL, Cable Modem (DOCSIS)

FIGURE 2. Abbreviated TCP/IP protocol stack.
The sections below will provide a brief overview of each of the layers in the TCP/IP suite and the protocols that compose those layers. A large number of books and papers have been written that describe all aspects of TCP/IP as a protocol suite, including detailed information about use and implementation of the protocols. Some good TCP/IP references are:
TCP/IP Illustrated, Volume 1: The Protocols by W.R. Stevens (Addison-Wesley, 1994)
Troubleshooting TCP/IP by Mark Miller (John Wiley & Sons, 1999)
Guide to TCP/IP, 2/e by Laura A. Chappell and Ed Tittel (Thomson Course Technology, 2004)
TCP/IP: Architecture, Protocols, and Implementation with IPv6 and IP Security by S. Feit (McGraw-Hill, 2000)
Internetworking with TCP/IP, Vol. I: Principles, Protocols, and Architecture, 2/e, by D. Comer (Prentice-Hall, 1991)
"TCP/IP Tutorial" by T.J. Socolofsky and C.J. Kale (RFC 1180, Jan. 1991)
"TCP/IP and tcpdump Pocket Reference Guide", developed by the author for The SANS Institute
The TCP/IP protocols have been designed to operate over nearly any underlying local or wide area network technology. Although certain accommodations may need to be made, IP messages can be transported over all of the technologies shown in the figure, as well as numerous others. It is beyond the scope of this paper to describe most of these underlying protocols and technologies.
Two of the underlying network interface protocols, however, are particularly relevant to TCP/IP. The Serial Line Internet Protocol (SLIP, RFC 1055) and the Point-to-Point Protocol (PPP, RFC 1661) may be used to provide data link layer services where no other underlying data link protocol is in use, such as in leased line or dial-up environments. Most commercial TCP/IP software packages for PC-class systems include these two protocols. With SLIP or PPP, a remote computer can attach directly to a host server and, therefore, connect to the Internet using IP rather than being limited to an asynchronous connection.
It is worth spending a little bit of time discussing PPP because of its importance in Internet access today. As its name implies, PPP was designed to be used over point-to-point links. In fact, it is the prevalent IP encapsulation scheme for dedicated Internet access as well as dial-up access. One of PPP's significant strengths is its ability to negotiate a number of things upon initial connection, including passwords, IP addresses, compression schemes, and encryption schemes. In addition, PPP supports multiple protocols simultaneously over a single connection, an important consideration in environments where dial-up users can employ either IP or another Network Layer protocol. Finally, in environments such as ISDN, PPP supports inverse multiplexing and dynamic bandwidth allocation via Multilink PPP (ML-PPP), described in RFCs 1990 and 2125.
+----------+----------+----------+----------+-------------+---------+------------+----------+
|   Flag   | Address  | Control  | Protocol | Information | Padding |    FCS     |   Flag   |
| 01111110 | 11111111 | 00000011 | 8/16 bits|      *      |    *    | 16/32 bits | 01111110 |
+----------+----------+----------+----------+-------------+---------+------------+----------+

FIGURE 3. PPP frame format (using HDLC).
PPP generally uses an HDLC-like (bit-oriented protocol) frame format as shown in Figure 3, although RFC 1661 does not demand use of HDLC. HDLC framing (RFC 1662) defines the first three and last two fields in the frame:
Flag: The 8-bit pattern "01111110" used to delimit the beginning and end of the transmission.
Address: For PPP, always the 8-bit broadcast address, "11111111".
Control: For PPP, always the 8-bit pattern "00000011", indicating an Unnumbered Information (UI) frame.
Frame Check Sequence (FCS): The remainder from a cyclic redundancy check (CRC) calculation, used for bit error detection; 16 bits by default, although a 32-bit FCS can be negotiated.
RFC 1661 describes the use of the three other fields in the frame:
Protocol: An 8- or 16-bit value that indicates the type of datagram carried in this frame's Information field. This field can indicate use of a particular Network Layer protocol (such as IP, IPX, or DDP), a Network Control Protocol (NCP) in support of one of the Network Layer protocols, or a PPP Link Control Protocol (LCP). The entire list of possible values in this field can be found in the IANA list of PPP protocols.
Information: Contains the datagram for the protocol specified in the Protocol field. This field is zero or more octets in length, up to a (default) maximum of 1500 octets (although a different value can be negotiated).
Padding: Optional padding to add length to the Information field. May be required in some implementations to ensure some minimum frame length and/or alignment on computer word boundaries.
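As a concrete illustration of the FCS field, the following Python sketch implements the 16-bit PPP FCS from RFC 1662 bit by bit (the sample frame contents are invented for the example):

```python
def ppp_fcs16(data: bytes) -> int:
    """16-bit PPP FCS per RFC 1662: reflected CRC, polynomial 0x8408,
    initial value 0xFFFF, result ones-complemented before transmission."""
    fcs = 0xFFFF
    for byte in data:
        fcs ^= byte
        for _ in range(8):                      # process one bit at a time
            fcs = (fcs >> 1) ^ 0x8408 if fcs & 1 else fcs >> 1
    return fcs ^ 0xFFFF

# Invented sample frame body: Address, Control, Protocol (IP = 0x0021), payload
frame = bytes([0xFF, 0x03, 0x00, 0x21]) + b"hello"
fcs = ppp_fcs16(frame)
trailer = bytes([fcs & 0xFF, fcs >> 8])         # FCS is sent low byte first
# A receiver recomputes the CRC over data + FCS; per RFC 1662, the
# uncomplemented result for a good frame is always the constant 0xF0B8.
assert (ppp_fcs16(frame + trailer) ^ 0xFFFF) == 0xF0B8
```

The final check exploits a standard CRC property: appending a correct FCS makes the receiver's running calculation come out to a fixed "good frame" value.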
The operation of PPP is basically as follows. After the link is physically established, each host sends LCP packets to configure and test the data link; here the maximum frame length, authentication protocol (Password Authentication Protocol, PAP, or Challenge-Handshake Authentication Protocol, CHAP), link quality protocol, compression protocol, and other configuration parameters are negotiated. Authentication, if it is used, occurs after the link has been established. Next, one or more Network Layer protocol connections are configured using the appropriate NCP; if IP is to be used, for example, it is set up using PPP's IP Control Protocol (IPCP). Once each of the Network Layer protocols has been configured, datagrams from those protocols can be sent over the link. Control protocols are defined for IP, IPX (NetWare), DDP (AppleTalk), DECnet, and more. The link remains configured for communications until LCP and/or NCP packets close it down.
The Internet Protocol (RFC 791) provides services that are roughly equivalent to those of the OSI Network Layer. IP provides a datagram (connectionless) transport service across the network. This service is sometimes referred to as unreliable because the network does not guarantee delivery, nor does it notify the end host system about packets lost due to errors or network congestion. IP datagrams contain a message, or one fragment of a message, that may be up to 65,535 bytes (octets) in length. IP does not provide a mechanism for flow control.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version|  IHL  |      TOS      |         Total Length          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        Identification         |Flags|     Fragment Offset     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      TTL      |   Protocol    |        Header Checksum        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Source Address                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Destination Address                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Options....                       (Padding) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Data...
+-+-+-+-+-+-+-+-+-+-+-+-+-

FIGURE 4. IP packet (datagram) header format.
The basic IP packet header format is shown in Figure 4. The format of the diagram is consistent with the RFC; bits are numbered from left to right, starting at 0. Each row represents a single 32-bit word; note that an IP header will be at least 5 words (20 bytes) in length. The fields contained in the header, and their functions, are:
Version: Specifies the IP version of the packet. The current version of IP is version 4, so this field will contain the binary value 0100. [NOTE: Actually, many IP version numbers have been assigned besides 4 and 6; see the IANA's list of IP Version Numbers.]
Internet Header Length (IHL): Indicates the length of the datagram header in 32-bit (4-octet) words. A minimum-length header is 20 octets, so this field always has a value of at least 5 (0101). Since the maximum value of this field is 15, the IP header can be no longer than 60 octets.
Type of Service (TOS): Allows an originating host to request different classes of service for packets it transmits. Although not generally supported today in IPv4, the TOS field can be set by the originating host in response to service requests across the Transport Layer/Internet Layer service interface, and can specify a service priority (0-7) or request that the route be optimized for cost, delay, throughput, or reliability.
Total Length: Indicates the length (in bytes, or octets) of the entire packet, including both header and data. Given the size of this field, the maximum size of an IP packet is 64 KB, or 65,535 bytes. In practice, packet sizes are limited to the maximum transmission unit (MTU).
Identification: Used when a packet is fragmented into smaller pieces while traversing the Internet; this identifier is assigned by the transmitting host so that different fragments arriving at the destination can be associated with each other for reassembly.
Flags: Also used for fragmentation and reassembly. The first bit is reserved and always set to 0. The second bit is the Don't Fragment (DF) bit, which suppresses fragmentation. The third bit is the More Fragments (MF) bit; it is set in every fragment except the last, so the receiver knows when the packet can be reassembled.
Fragment Offset: Indicates the position of this fragment in the original packet. In the first fragment, the offset is 0; in subsequent fragments, this field indicates the offset in increments of 8 bytes.
Time-to-Live (TTL): A value from 0 to 255, indicating the number of hops that this packet is allowed to take before being discarded within the network. Every router that forwards the packet decrements the TTL value by one; if it reaches 0, the packet is discarded.
Protocol: Indicates the higher layer protocol carried in the packet; values include ICMP (1), TCP (6), UDP (17), and OSPF (89). A complete list of IP protocol numbers can be found in the IANA's list of Protocol Numbers. An implementation-specific list of supported protocols can be found in the protocol file, generally found in the /etc (Linux/Unix), c:\windows (Windows 9x, ME), or c:\winnt\system32\drivers\etc (Windows NT, 2000) directory.
Header Checksum: Carries information to ensure that the received IP header is error-free. Remember that IP provides an unreliable service; this field therefore checks only the IP header rather than the entire packet.
Source Address: IP address of the host sending the packet.
Destination Address: IP address of the host intended to receive the packet.
Options: A set of options which may be applied to any given packet, such as sender-specified source routing or security indication. The option list may use up to 40 bytes (10 words) and will be padded to a word boundary; IP options are taken from the IANA's list of IP Option Numbers.
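To make the header layout concrete, the following Python sketch packs and parses the fixed 20-byte IPv4 header; all field values here are invented for illustration:

```python
import struct

def parse_ipv4_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte IPv4 header laid out in Figure 4."""
    ver_ihl, tos, tot_len, ident, flags_frag, ttl, proto, cksum, src, dst = \
        struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version":         ver_ihl >> 4,
        "ihl":             ver_ihl & 0x0F,       # header length in 32-bit words
        "tos":             tos,
        "total_length":    tot_len,              # bytes, header plus data
        "identification":  ident,
        "df":              bool(flags_frag & 0x4000),   # Don't Fragment bit
        "mf":              bool(flags_frag & 0x2000),   # More Fragments bit
        "fragment_offset": flags_frag & 0x1FFF,  # in units of 8 bytes
        "ttl":             ttl,
        "protocol":        proto,                # 1=ICMP, 6=TCP, 17=UDP
        "header_checksum": cksum,
        "src":             ".".join(str(b) for b in src),
        "dst":             ".".join(str(b) for b in dst),
    }

# Build a sample header: version 4, IHL 5, total length 40, DF set,
# TTL 64, protocol TCP (6), checksum left at 0 for simplicity
hdr = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0x1C46, 0x4000, 64, 6, 0,
                  bytes([172, 16, 0, 1]), bytes([192, 168, 0, 1]))
fields = parse_ipv4_header(hdr)
print(fields["version"], fields["protocol"], fields["src"], fields["dst"])
# -> 4 6 172.16.0.1 192.168.0.1
```

Note how Version and IHL share one byte, and how the three flag bits sit in the top of the same 16-bit word as the Fragment Offset.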
IP addresses are 32 bits in length (Figure 5). They are typically written as a sequence of four numbers, representing the decimal value of each of the address bytes. Since the values are separated by periods, the notation is referred to as dotted decimal. A sample IP address is 208.162.106.17.
          0                   1                   2                   3
          0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
         --+-------------+------------------------------------------------
Class A  |0|   NET_ID    |                    HOST_ID                     |
         |-+-+-----------+---------------+--------------------------------|
Class B  |1|0|          NET_ID           |            HOST_ID             |
         |-+-+-+-------------------------+---------------+----------------|
Class C  |1|1|0|              NET_ID                     |    HOST_ID     |
         |-+-+-+-+---------------------------------------+----------------|
Class D  |1|1|1|0|                 MULTICAST_ID                           |
         |-+-+-+-+--------------------------------------------------------|
Class E  |1|1|1|1|               EXPERIMENTAL_ID                          |
         --+-+-+-+---------------------------------------------------------

FIGURE 5. IP Address Format.
IP addresses are hierarchical for routing purposes and are subdivided into two subfields. The Network Identifier (NET_ID) subfield identifies the TCP/IP subnetwork connected to the Internet. The NET_ID is used for high-level routing between networks, much the same way as the country code, city code, or area code is used in the telephone network. The Host Identifier (HOST_ID) subfield indicates the specific host within a subnetwork.
To accommodate different size networks, IP defines several address classes. Classes A, B, and C are used for host addressing and the only difference between the classes is the length of the NET_ID subfield:
A Class A address has an 8-bit NET_ID and 24-bit HOST_ID. Class A addresses are intended for very large networks and can address up to 16,777,214 (2^24 - 2) hosts per network. The first bit of a Class A address is a 0 and the NET_ID occupies the first byte, so there are only 128 (2^7) possible Class A NET_IDs. In fact, the first digit of a Class A address will be between 1 and 126, and only about 90 or so Class A addresses have been assigned.

A Class B address has a 16-bit NET_ID and 16-bit HOST_ID. Class B addresses are intended for moderate-sized networks and can address up to 65,534 (2^16 - 2) hosts per network. The first two bits of a Class B address are 10, so the first digit of a Class B address will be a number between 128 and 191; there are 16,384 (2^14) possible Class B NET_IDs. The Class B address space has long been threatened with exhaustion and it has been very difficult to get a new Class B address for some time.

A Class C address has a 24-bit NET_ID and 8-bit HOST_ID. These addresses are intended for small networks and can address only up to 254 (2^8 - 2) hosts per network. The first three bits of a Class C address are 110, so the first digit of a Class C address will be a number between 192 and 223. There are 2,097,152 (2^21) possible Class C NET_IDs and most addresses assigned to networks today are Class C (or sub-Class C!).
The remaining two address classes are used for special functions only and are not commonly assigned to individual hosts. Class D addresses may begin with a value between 224 and 239 (the first 4 bits are 1110), and are used for IP multicasting (i.e., sending a single datagram to multiple hosts); the IANA maintains a list ofInternet Multicast Addresses. Class E addresses begin with a value between 240 and 255 (the first 4 bits are 1111), and are reserved for experimental use.
Several address values are reserved and/or have special meaning. A HOST_ID of 0 (as used above) is a dummy value reserved as a place holder when referring to an entire subnetwork; the address 208.162.106.0, then, refers to the Class C address with a NET_ID of 208.162.106. A HOST_ID of all ones (usually written "255" when referring to an all-ones byte, but also denoted as "-1") is a broadcast address and refers to all hosts on a network. A NET_ID value of 127 is used for loopback testing and the specific host address 127.0.0.1 refers to the localhost.
Several NET_IDs have been reserved in RFC 1918 for private network addresses and packets will not be routed over the Internet to these networks. Reserved NET_IDs are the Class A address 10.0.0.0 (formerly assigned to ARPANET), the sixteen Class B addresses 172.16.0.0-172.31.0.0, and the 256 Class C addresses 192.168.0.0-192.168.255.0.
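Python's standard ipaddress module knows these RFC 1918 ranges, which makes for a quick sanity check (the public address below is just the sample address used earlier in this section):

```python
import ipaddress

# The three RFC 1918 private blocks vs. an ordinary public address
for addr in ["10.4.5.6", "172.16.0.9", "192.168.255.1", "208.162.106.17"]:
    print(addr, ipaddress.ip_address(addr).is_private)
# The first three print True; the public address prints False
```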
An additional addressing tool is the subnet mask. Subnet masks are used to indicate the portion of the address that identifies the network (and/or subnetwork) for routing purposes. The subnet mask is written in dotted decimal and the number of 1s indicates the significant NET_ID bits. For "classful" IP addresses, the subnet mask and number of significant address bits for the NET_ID are:
Class    Subnet Mask      Number of Bits
  A      255.0.0.0               8
  B      255.255.0.0            16
  C      255.255.255.0          24
Depending upon the context and literature, subnet masks may be written in dotted decimal form or just as a number representing the number of significant address bits for the NET_ID. Thus, 208.162.106.17 255.255.255.0 and 208.162.106.17/24 both refer to a Class C NET_ID of 208.162.106. Some, in fact, might refer to this 24-bit NET_ID as a "slash-24."
Subnet masks can also be used to subdivide a large address space into subnetworks or to combine multiple small address spaces. In the former case, a network may subdivide its address space to define multiple logical networks by segmenting the HOST_ID subfield into a Subnetwork Identifier (SUBNET_ID) and (smaller) HOST_ID. For example, a user assigned the Class B address space 172.16.0.0 could segment this into a 16-bit NET_ID, 4-bit SUBNET_ID, and 12-bit HOST_ID. In this case, the subnet mask for Internet routing purposes would be 255.255.0.0 (or "/16"), while the mask for routing to individual subnets within the larger Class B address space would be 255.255.240.0 (or "/20").
But how does a subnet mask work? To determine the subnet portion of the address, we simply perform a bit-by-bit logical AND of the IP address and the mask. Consider the following example: suppose we have a host with the IP address 172.20.134.164 and a subnet mask of 255.255.0.0. We write out the address and mask in decimal and binary as follows:
172.020.134.164 10101100.00010100.10000110.10100100
AND 255.255.000.000 11111111.11111111.00000000.00000000
--------------- -----------------------------------
172.020.000.000 10101100.00010100.00000000.00000000
From this we can easily find the NET_ID 172.20.0.0 (and can also infer the HOST_ID 134.164).
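The same AND operation is easy to reproduce in Python, treating each address as the 32-bit quantity it really is:

```python
import ipaddress

# Convert the dotted-decimal strings to 32-bit integers
addr = int(ipaddress.IPv4Address("172.20.134.164"))
mask = int(ipaddress.IPv4Address("255.255.0.0"))

# Bit-by-bit AND with the mask yields the NET_ID; ANDing with the
# inverted mask yields the HOST_ID (masked back to 32 bits)
net_id  = ipaddress.IPv4Address(addr & mask)                 # 172.20.0.0
host_id = ipaddress.IPv4Address(addr & ~mask & 0xFFFFFFFF)   # 0.0.134.164
print(net_id, host_id)
```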
As an aside, most ISPs use a /30 address for the WAN links between the network and the customer. The router on the customer's network will generally have two IP addresses: one on the LAN interface, using an address from the customer's public IP address space, and one on the WAN interface leading back to the ISP. Since the ISP would like to be able to ping both sides of the router for testing and maintenance, having an IP address for each router port is a good idea.
By using a /30 mask, a single Class C address space can be broken up into 64 smaller subnets. Here's an example. Suppose an ISP assigns a particular customer the address 24.48.165.130 and a subnet mask of 255.255.255.252. That would look like the following:
024.048.165.130 00011000.00110000.10100101.10000010
AND 255.255.255.252 11111111.11111111.11111111.11111100
--------------- -----------------------------------
024.048.165.128 00011000.00110000.10100101.10000000
So we find the NET_ID to be 24.48.165.128. Since there's a 30-bit NET_ID, we are left with a 2-bit HOST_ID; thus, there are four possible host addresses in this subnet: 24.48.165.128 (00), .129 (01), .130 (10), and .131 (11). The .128 address isn't used because it is all-zeroes, and .131 isn't used because it is all-ones. That leaves .129 and .130, which is fine since we only have two ends on the WAN link! So, in this case, the customer's router might be assigned 24.48.165.130/30 and the ISP's end of the link might get 24.48.165.129/30. Use of this subnet mask is very common today (so common that there is a proposal to allow the definition of 2-address NET_IDs specifically for point-to-point WAN links).
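Python's standard ipaddress module reproduces this /30 arithmetic directly, using the customer address and mask from the example:

```python
import ipaddress

# Start from the customer's address plus the /30 mask
iface = ipaddress.ip_interface("24.48.165.130/255.255.255.252")
print(iface.network)                            # 24.48.165.128/30
print([str(h) for h in iface.network.hosts()])  # the two usable addresses
print(iface.network.broadcast_address)          # 24.48.165.131 (all-ones HOST_ID)
```

The hosts() method excludes the all-zeroes and all-ones HOST_IDs automatically, leaving exactly the .129 and .130 addresses discussed above.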
A very good IP addressing tutorial can be found in Chuck Semeria's "Understanding IP Addressing: Everything You Ever Wanted to Know." If you are really interested in subnet masks, there are a number of subnet calculators on the Internet, including jafar.com's IP Subnet/Supernet Calculator, Net3 Group Inc.'s IP Subnet Calculator, and Super Shareware's Subnet Calculator.
A final word about IP addresses is in order. Most Internet protocols specify that addresses be supplied in the form of a fully-qualified host name or an IP address in dotted decimal form. However, spammers and others have found a way to obfuscate IP addresses by supplying the IP address as a single large decimal number. Remember that IP addresses are 32-bit quantities; we write them in dotted decimal for the convenience of humans, but the computer still interprets the address as a 32-bit number. Writing the address as a single large decimal number therefore works just as well. For that reason, the following URLs will all take you to the same Web page:
http://www.garykessler.net
http://209.198.111.31
http://3519442719
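The conversion is just a base-256 positional calculation; a short sketch:

```python
def dotted_to_decimal(dotted: str) -> int:
    """Fold a dotted-decimal IPv4 address into a single 32-bit integer."""
    value = 0
    for octet in dotted.split("."):
        value = (value << 8) | int(octet)   # each byte shifts the total left
    return value

print(dotted_to_decimal("209.198.111.31"))  # 3519442719
```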
The use of class-based (or classful) addresses in IP is one of the reasons that IP address exhaustion has been a concern since the early 1990s. Consider an organization, for example, that needs 1000 IP addresses. A Class C address is obviously too small, so a Class B address would be assigned. But a Class B address offers more than 64,000 addresses, so over 63,000 addresses are wasted by this assignment.
An alternative approach is to assign this organization a block of four Class C addresses, such as 192.168.128.0, 192.168.129.0, 192.168.130.0, and 192.168.131.0. By using a 22-bit subnet mask of 255.255.252.0 (or "/22") for routing to this block, the NET_ID assigned to this organization is 192.168.128.0.
This use of variable-size subnet masks is called Classless Interdomain Routing (CIDR), described in RFCs1518 and1519. In the example here, routing information for what is essentially four Class C addresses can be specified in a single router table entry.
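Python's ipaddress module can perform this aggregation; the four Class C blocks from the example collapse into a single /22:

```python
import ipaddress

# The four contiguous Class C networks from the example above
blocks = [ipaddress.ip_network(f"192.168.{n}.0/24") for n in (128, 129, 130, 131)]

# collapse_addresses merges contiguous, aligned networks into supernets
summary = list(ipaddress.collapse_addresses(blocks))
print(summary)   # [IPv4Network('192.168.128.0/22')]: one routing table entry
```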
But this concept can be expanded even more. CIDR is an important contribution to the Internet because it has dramatically limited the size of the Internet backbone's routing tables. Today, IP addresses are not assigned strictly on a first-come, first-served basis, but have been preallocated to various numbering authorities around the world. The numbering authorities, in turn, assign blocks of addresses to major (or first-tier) ISPs; these address blocks are called CIDR blocks. An ISP's customer (including ISPs that are customers of a first-tier ISP) will be assigned an IP NET_ID that is part of the ISP's CIDR block. So, for example, let's say that Gary Kessler ISP has a CIDR block containing the 256 Class C addresses in the range 196.168.0.0-196.168.255.0. This range of addresses can be represented in a routing table with the single entry 196.168.0.0/16. Once a packet reaches the Gary Kessler ISP, it will be routed to the correct end destination.
But don't stop now! Just as shortening the subnet mask lets a single NET_ID refer to multiple networks (shrinking routing tables), lengthening the subnet mask lets us assign an organization something smaller than a Class C address. As the Class C address space falls in danger of being exhausted, users are under increasing pressure to accept assignment of these sub-Class C addresses. An organization with just a few servers, for example, might be assigned, say, 64 addresses rather than the full 256. The standard subnet mask for a Class C is 24 bits, yielding a 24-bit NET_ID and 8-bit HOST_ID. If we use a "/26" mask (255.255.255.192), we can assign the same "Class C" to four different users, each getting one quarter of the address space (and a 6-bit HOST_ID). So, for example, the IP address space 208.162.106.0 might be assigned as follows:
NET_ID             HOST_ID range   Valid HOST_IDs
208.162.106.0          0-63             1-62
208.162.106.64        64-127           65-126
208.162.106.128      128-191          129-190
208.162.106.192      192-255          193-254
Note that in ordinary Class C usage, we would lose two addresses from the space — 0 and 255 — because addresses of all 0s and all 1s cannot be assigned as a HOST_ID. In the usage above, we would lose eight addresses from this space, because 0, 64, 128, and 192 have an all 0s HOST_ID and 63, 127, 191, and 255 have an all 1s HOST_ID. Each user, then, has 62 addresses that can be assigned to hosts.
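The subdivision described above can be generated with the ipaddress module, which also confirms the count of 62 usable addresses per /26:

```python
import ipaddress

net = ipaddress.ip_network("208.162.106.0/26").supernet(new_prefix=24)
net = ipaddress.ip_network("208.162.106.0/24")
for sub in net.subnets(new_prefix=26):     # the four /26 blocks of the Class C
    hosts = list(sub.hosts())              # excludes all-0s and all-1s HOST_IDs
    print(sub, "valid HOST_IDs:", hosts[0], "-", hosts[-1], f"({len(hosts)})")
```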
The pressure on the Class C address space continues to intensify. Today, the pressure is not only to limit the number of addresses assigned; organizations must also show why they need as many addresses as they request. Consider a company with 64 hosts and 3 servers. The ISP may request that the company obtain only 32 IP addresses. The rationale: the 3 servers need 3 addresses, but the other hosts might be able to "share" the remaining pool of 27 addresses (recall that we lose HOST_ID addresses 0 and 31).
A pool of IP addresses can be shared by multiple hosts using a mechanism called Network Address Translation (NAT). NAT, described in RFC 1631, is typically implemented in hosts, proxy servers, or routers. The scheme works because every host on the user's network can be assigned an IP address from the pool of RFC 1918 private addresses; since these addresses are never seen on the Internet, this is not a problem.
FIGURE 6. Network Address Translation (NAT).
Consider the scenario shown in Figure 6. When the user accesses a Web site on the Internet, the NAT server will translate the "private" IP address of the host (192.168.50.50) into a "public" IP address (220.16.16.5) from the pool of assigned addresses. NAT works because of the assumption that, in this example, no more than 27 of the 64 hosts will ever be accessing the Internet at a single time.
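The translation table at the heart of this scheme can be sketched in a few lines of Python. This is a toy model of the idea, not RFC 1631's full mechanism; the class name is invented, and the pool and private address come from the Figure 6 scenario:

```python
class SimpleNat:
    """Toy one-to-one NAT: each active private address borrows a public
    address from a finite pool assigned by the ISP."""
    def __init__(self, public_pool):
        self.free = list(public_pool)
        self.table = {}                     # private address -> public address

    def translate(self, private_addr):
        if private_addr not in self.table:
            if not self.free:               # the NAT sharing assumption fails
                raise RuntimeError("public address pool exhausted")
            self.table[private_addr] = self.free.pop(0)
        return self.table[private_addr]

# Public pool starting at 220.16.16.5, as in the Figure 6 example
nat = SimpleNat([f"220.16.16.{n}" for n in range(5, 10)])
print(nat.translate("192.168.50.50"))       # 220.16.16.5
```

Note the failure mode: if every host tries to reach the Internet at once and the pool runs dry, the translation simply cannot be made, which is exactly the assumption discussed above.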
But suppose that assumption is wrong. Another enhancement, called Port Address Translation (PAT) or Network Address Port Translation (NAPT), allows multiple hosts to share a single IP address by using different "port numbers" (ports are described more in Section 3.3).
FIGURE 7. Port Address Translation (PAT).
Port numbers are used by higher layer protocols (e.g., TCP and UDP) to identify a higher layer application. A TCP connection, for example, is uniquely identified on the Internet by the four values (aka 4-tuple) of source IP address, source port, destination IP address, and destination port.
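A similar toy sketch shows the PAT idea: many private (address, port) pairs share one public address and are told apart only by the translated source port. All names and numbers here are invented for illustration:

```python
class SimplePat:
    """Toy PAT/NAPT table: one public address, unique translated ports."""
    def __init__(self, public_addr, first_port=40000):
        self.public = public_addr
        self.next_port = first_port
        self.table = {}                     # (priv_addr, priv_port) -> pub port

    def translate(self, priv_addr, priv_port):
        key = (priv_addr, priv_port)
        if key not in self.table:           # hand out the next unused port
            self.table[key] = self.next_port
            self.next_port += 1
        return (self.public, self.table[key])

pat = SimplePat("220.16.16.5")
print(pat.translate("192.168.50.50", 1026))  # ('220.16.16.5', 40000)
print(pat.translate("192.168.50.51", 1026))  # ('220.16.16.5', 40001)
```

Because the 4-tuple seen by the remote server differs in the source port, both private hosts can share the single public address simultaneously.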
While the Internet today is recognized as a network that is fundamentally changing social, political, and economic structures, and in many ways obviating geographic boundaries, this potential is merely the realization of predictions that go back nearly forty years. In a series of memos dating back to August 1962, J.C.R. Licklider of MIT discussed his "Galactic Network" and how social interactions could be enabled through networking. The Internet certainly provides such a national and global infrastructure and, in fact, interplanetary Internet communication has already been seriously discussed.
Prior to the 1960s, what little computer communication existed comprised simple text and binary data, carried by the most common telecommunications network technology of the day; namely, circuit switching, the technology of the telephone networks for nearly a hundred years. Because most data traffic is bursty in nature (i.e., most of the transmissions occur during a very short period of time), circuit switching results in highly inefficient use of network resources.
The fundamental technology that makes the Internet work is called packet switching, a data network in which all components (i.e., hosts and switches) operate independently, eliminating single point-of-failure problems. In addition, network communication resources appear to be dedicated to individual users but, in fact, statistical multiplexing and an upper limit on the size of a transmitted entity result in fast, economical networks.
In the 1960s, packet switching was ready to be discovered. In 1961, Leonard Kleinrock of MIT published the first paper on packet switching theory (and the first book on the subject in 1964). In 1962, Paul Baran of the Rand Corporation described a robust, efficient, store-and-forward data network in a report for the U.S. Air Force. At about the same time, Donald Davies and Roger Scantlebury suggested a similar idea from work at the National Physical Laboratory (NPL) in the U.K. The research at MIT (1961-1967), RAND (1962-1965), and NPL (1964-1967) occurred independently and the principal researchers did not all meet together until the Association for Computing Machinery (ACM) meeting in 1967. The term packet was adopted from the work at NPL.
The modern Internet began as a U.S. Department of Defense (DoD) funded experiment to interconnect DoD-funded research sites in the U.S. The 1967 ACM meeting was also where the initial design for the so-called ARPANET — named for the DoD‘s Advanced Research Projects Agency (ARPA) — was first published by Larry Roberts. In December 1968, ARPA awarded a contract to Bolt Beranek and Newman (BBN) to design and deploy a packet switching network with a proposed line speed of 50 kbps. In September 1969, the first node of the ARPANET was installed at the University of California at Los Angeles (UCLA), followed monthly with nodes at Stanford Research Institute (SRI), the University of California at Santa Barbara (UCSB), and the University of Utah. With four nodes by the end of 1969, the ARPANET spanned the continental U.S. by 1971 and had connections to Europe by 1973.
The original ARPANET gave life to a number of protocols that were new to packet switching. One of the most lasting results of the ARPANET was the development of a user-network protocol that has become the standard interface between users and packet switched networks; namely, ITU-T (formerly CCITT) Recommendation X.25. This "standard" interface encouraged BBN to start Telenet, a commercial packet-switched data service, in 1974; after much renaming, Telenet became a part of Sprint‘s X.25 service.
The initial host-to-host communications protocol introduced in the ARPANET was called the Network Control Protocol (NCP). Over time, however, NCP proved to be incapable of keeping up with the growing network traffic load. In 1974, a new, more robust suite of communications protocols was proposed and implemented throughout the ARPANET, based upon the Transmission Control Protocol (TCP) for end-to-end network communication. But it seemed like overkill for the intermediate gateways (what we would today call routers) to needlessly have to deal with an end-to-end protocol so in 1978 a new design split responsibilities between a pair of protocols; the new Internet Protocol (IP) for routing packets and device-to-device communication (i.e., host-to-gateway or gateway-to-gateway) and TCP for reliable, end-to-end host communication. Since TCP and IP were originally envisioned functionally as a single protocol, the protocol suite, which actually refers to a large collection of protocols and applications, is usually referred to simply as TCP/IP.
The original versions of both TCP and IP that are in common use today were written in September 1981, although both have had several modifications applied to them (in addition, the IP version 6, or IPv6, specification was released in December 1995). In 1983, the DoD mandated that all of their computer systems would use the TCP/IP protocol suite for long-haul communications, further enhancing the scope and importance of the ARPANET.
In 1983, the ARPANET was split into two components. One component, still called ARPANET, was used to interconnect research/development and academic sites; the other, called MILNET, was used to carry military traffic and became part of the Defense Data Network. That year also saw a huge boost in the popularity of TCP/IP with its inclusion in the communications kernel for the University of California's UNIX implementation, 4.2BSD (Berkeley Software Distribution) UNIX.
In 1986, the National Science Foundation (NSF) built a backbone network to interconnect four NSF-funded regional supercomputer centers and the National Center for Atmospheric Research (NCAR). This network, dubbed the NSFNET, was originally intended as a backbone for other networks, not as an interconnection mechanism for individual systems. Furthermore, the "Appropriate Use Policy" defined by the NSF limited traffic to non-commercial use. The NSFNET continued to grow and provide connectivity between both NSF-funded and non-NSF regional networks, eventually becoming the backbone that we know today as the Internet. Although early NSFNET applications were largely multiprotocol in nature, TCP/IP was employed for interconnectivity (with the ultimate goal of migration to Open Systems Interconnection).
The NSFNET originally comprised 56-kbps links and was completely upgraded to T1 (1.544 Mbps) links in 1989. Migration to a "professionally-managed" network was supervised by a consortium comprising Merit (a Michigan state regional network headquartered at the University of Michigan), IBM, and MCI. Advanced Network & Services, Inc. (ANS), a non-profit company formed by IBM and MCI, was responsible for managing the NSFNET and supervising the transition of the NSFNET backbone to T3 (44.736 Mbps) rates by the end of 1991. During this period of time, the NSF also funded a number of regional Internet service providers (ISPs) to provide local connection points for educational institutions and NSF-funded sites.
In 1993, the NSF decided that it did not want to be in the business of running and funding networks, but wanted instead to go back to the funding of research in the areas of supercomputing and high-speed communications. In addition, there was increased pressure to commercialize the Internet; in 1989, a trial gateway connected MCI, CompuServe, and Internet mail services, and commercial users were now finding out about all of the capabilities of the Internet that once belonged exclusively to academic and hard-core users! In 1991, the Commercial Internet Exchange (CIX) Association was formed by General Atomics, Performance Systems International (PSI), and UUNET Technologies to promote and provide a commercial Internet backbone service. Nevertheless, there remained intense pressure from non-NSF ISPs to open the network to all users.
FIGURE 1. NSFNET structure initiated in 1994 to merge the academic and commercial networks.
In 1994, a plan was put in place to reduce the NSF‘s role in the public Internet. The new structure comprises three parts:
- Network Access Points (NAPs), where individual ISPs would interconnect, as suggested in Figure 1. The NSF originally funded four such NAPs: Chicago (operated by Ameritech), New York (really Pensauken, NJ, operated by Sprint), San Francisco (operated by Pacific Bell, now SBC), and Washington, D.C. (MAE-East, operated by MFS, now part of Worldcom).
- The very High Speed Backbone Network Service, a network interconnecting the NAPs and NSF-funded centers, operated by MCI. This network was installed in 1995 and operated at OC-3 (155.52 Mbps); it was completely upgraded to OC-12 (622.08 Mbps) in 1997.
- The Routing Arbiter, to ensure adequate routing protocols for the Internet.
In addition, NSF-funded ISPs were given five years of reduced funding to become commercially self-sufficient. This funding ended by 1998, and a proliferation of additional NAPs has created a "melting pot" of services. Today's terminology refers to three tiers of ISP:
- Tier 1 refers to national ISPs, or those that have a national presence and connect to at least three of the original four NAPs. National ISPs include AT&T, Sprint, and Worldcom.
- Tier 2 refers to regional ISPs, or those that have primarily a regional presence and connect to fewer than three of the original four NAPs. Regional ISPs include Adelphia, BellAtlantic.net, and BellSouth.net.
- Tier 3 refers to local ISPs, or those that do not connect to a NAP but offer services via an upstream ISP.
It is worth saying a few words about the NAPs. The NSF provided major funding for the four NAPs mentioned above, but they needed to have additional customers to remain economically viable. Some companies — such as then-Metropolitan Fiber Systems (MFS) — decided to build other NAP sites. One of MFS' first sites was MAE-East, where "MAE" stood for "Metropolitan Area Ethernet." MAE-East was merely a point where ISPs could interconnect, which they did by buying a router and placing it at the MAE-East facility. The original MAE-East provided a 10 Mbps Ethernet LAN to interconnect the ISPs' routers, hence the name. The Ethernet LAN was eventually replaced with a 100 Mbps FDDI ring and the "E" then became "Exchange." Over the years, MFS/MCI Worldcom has added sites in San Jose, CA (MAE-West), Los Angeles, Dallas, and Houston.
Other companies also operate their own NAPs. Savvis, for example, operates an international Internet service and has built more than a dozen private NAPs in North America. Many large service providers bypass the NAPs entirely by creating bilateral agreements whereby they directly route traffic coming from one network and going to the other. Before their merger in 1998, for example, MCI and LDDS Worldcom had more than 10 DS-3 (44.736 Mbps) lines interconnecting the two networks.
The North American Network Operators Group (NANOG) provides a forum for the exchange of technical information and the discussion of implementation issues that require coordination among network service providers. Meeting three times a year, NANOG is an essential element in maintaining stable Internet services in North America. Initially funded by the NSF, NANOG currently receives funds from conference registration fees and vendor donations.
In 1988, meanwhile, the DoD and most of the U.S. Government chose to adopt OSI protocols. TCP/IP was now viewed as an interim, proprietary solution since it ran only on limited hardware platforms and OSI products were only a couple of years away. The DoD mandated that all computer communications products would have to use OSI protocols by August 1990 and use of TCP/IP would be phased out. Subsequently, the U.S. Government OSI Profile (GOSIP) defined the set of protocols that would have to be supported by products sold to the federal government and TCP/IP was not included.
Despite this mandate, development of TCP/IP continued during the late 1980s as the Internet grew. TCP/IP development had always been carried out in an open environment (although the size of this open community was small due to the small number of ARPA/NSF sites), based upon the creed "We reject kings, presidents, and voting. We believe in rough consensus and running code" [Dave Clark, M.I.T.]. OSI products were still a couple of years away while TCP/IP became, in the minds of many, the real open systems interconnection protocol suite.
It is not the purpose of this memo to take a position in the OSI vs. TCP/IP debate (although it is absolutely clear that TCP/IP offers the primary goals of OSI; namely, a universal, non-proprietary data communications protocol. In fact, TCP/IP does far more than was ever envisioned for OSI — or for TCP/IP itself, for that matter). But before TCP/IP prevailed and OSI sort of dwindled into nothingness, many efforts were made to bring the two communities together. The ISO Development Environment (ISODE) was developed in 1990, for example, to provide an approach for OSI migration for the DoD. ISODE software allows OSI applications to operate over TCP/IP. During this same period, the Internet and OSI communities started to work together to bring about the best of both worlds as many TCP and IP features started to migrate into OSI protocols, particularly the OSI Transport Protocol class 4 (TP4) and the Connectionless Network Layer Protocol (CLNP), respectively. Finally, a report from the National Institute for Standards and Technology (NIST) in 1994 suggested that GOSIP should incorporate TCP/IP and drop the "OSI-only" requirement. [NOTE: Some industry observers have pointed out that OSI represents the ultimate example of a sliding window; OSI protocols have been "two years away" since about 1986.]
None of this is meant to suggest that the NSF isn't funding Internet-class research networks anymore. That is just the function of Internet2 (http://www.internet2.edu/), a consortium of nearly 200 universities working in partnership with industry and government to develop and deploy advanced network applications and technologies for the next generation Internet. The goals of Internet2 are to create a leading-edge network capability for the national research community, to enable the development of new Internet-based applications, and to quickly move these new network services and applications to the commercial sector.
In Douglas Adams‘ The Hitchhiker‘s Guide to the Galaxy (Pocket Books, 1979), the hitchhiker describes outer space as being "...big. Really big. ...vastly hugely mind-bogglingly big..." A similar description can be applied to the Internet. To paraphrase the hitchhiker, you may think that your 750 node LAN is big, but that‘s just peanuts compared to the Internet.
The ARPANET started with four nodes in 1969 and grew to just under 600 nodes before it was split in 1983. The NSFNET also started with a modest number of sites in 1986. After that, the network experienced literally exponential growth. Internet growth between 1981 and 1991 is documented in "Internet Growth (1981-1991)" (RFC 1296).
The Internet Software Consortium hosts the Internet Domain Survey (with technical support from Network Wizards, who originated the survey). According to their chart, the Internet had nearly 30 million reachable hosts by January 1998 and over 56 million by July 1999. Dedicated residential access methods, such as cable modem and asymmetrical digital subscriber line (ADSL) technologies, are undoubtedly the reason that this number shot up to over 171 million by January 2003. During the boom years of the 1990s, the Internet was growing at a rate of about one new network attachment every half-hour, interconnecting hundreds of thousands of networks. It was estimated that the Internet was doubling in size every ten to twelve months and traffic was doubling every 100 days (for roughly 1000% annual growth). For the last several years, the number of nodes has been growing at a rate of about 50% annually, and traffic continues to keep pace with that growth.
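The quoted growth figures follow from simple exponential arithmetic. A minimal sketch (the 100-day doubling period is the estimate cited above; the exact percentage is illustrative, assuming smooth exponential growth):

```python
# Convert "traffic doubles every 100 days" into an annual growth figure.
# Assumes smooth exponential growth over a 365-day year.
doubling_period_days = 100
annual_multiple = 2 ** (365 / doubling_period_days)  # about 12.6x per year
annual_growth_pct = (annual_multiple - 1) * 100      # about 1160%, roughly "1000% annual growth"

print(f"{annual_multiple:.1f}x per year ({annual_growth_pct:.0f}% growth)")
```

This shows why a 100-day doubling period corresponds to growth on the order of 1000% per year.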
And what of the original ARPANET? It grew smaller and smaller during the late 1980s as sites and traffic moved to the Internet, and was decommissioned in July 1990. Cerf & Kahn ("Selected ARPANET Maps," Computer Communications Review, October 1990) re-printed a number of network maps documenting the growth (and demise) of the ARPANET.
The Internet has no single owner, yet everyone owns (a portion of) the Internet. The Internet has no central operator, yet everyone operates (a portion of) the Internet. The Internet has been compared to anarchy, but some claim that it is not nearly that well organized!
Some central authority is required for the Internet, however, to manage those things that can only be managed centrally, such as addressing, naming, protocol development, and standardization. Among the significant Internet authorities are:
- The Internet Society (ISOC), chartered in 1992, is a non-governmental international organization providing coordination for the Internet and its internetworking technologies and applications. ISOC also provides oversight and communications for the Internet Activities Board.
- The Internet Activities Board (IAB) governs administrative and technical activities on the Internet.
- The Internet Engineering Task Force (IETF) is one of the two primary bodies of the IAB. The IETF's working groups have primary responsibility for the technical activities of the Internet, including writing specifications and protocols. The impact of these specifications is significant enough that ISO accredited the IETF as an international standards body at the end of 1994. RFCs 2028 and 2031 describe the organizations involved in the IETF standards process and the relationship between the IETF and ISOC, respectively, while RFC 2418 describes the IETF working group guidelines and procedures. The background and history of the IETF and the Internet standards process can be found in "IETF—History, Background, and Role in Today's Internet."
- The Internet Engineering Steering Group (IESG) is the other body of the IAB. The IESG provides direction to the IETF.
- The Internet Research Task Force (IRTF) comprises a number of long-term research groups, promoting research of importance to the evolution of the future Internet.
- The Internet Engineering Planning Group (IEPG) coordinates worldwide Internet operations. This group also assists Internet Service Providers (ISPs) to interoperate within the global Internet.
- The Forum of Incident Response and Security Teams (FIRST) is the coordinator of a number of Computer Emergency Response Teams (CERTs) representing many countries, governmental agencies, and ISPs throughout the world. Internet network security is greatly enhanced and facilitated by the FIRST member organizations.
- The World Wide Web Consortium (W3C) is not an Internet administrative body, per se, but since October 1994 has taken a lead role in developing common protocols for the World Wide Web to promote its evolution and ensure its interoperability. W3C has more than 400 Member organizations internationally. The W3C, then, is leading the technical evolution of the Web, having already developed more than 20 technical specifications for the Web's infrastructure.
Although not directly related to the administration of the Internet for operational purposes, the assignment of Internet domain names (and IP addresses) is the subject of some controversy and a lot of current activity. Internet hosts use a hierarchical naming structure comprising a top-level domain (TLD), domain and subdomain (optional), and host name. The IP address space, and all TCP/IP-related numbers, have historically been managed by the Internet Assigned Numbers Authority (IANA). Domain names are assigned by the TLD naming authority; until April 1998, the Internet Network Information Center (InterNIC) had overall authority of these names, with NICs around the world handling non-U.S. domains. The InterNIC was also responsible for the overall coordination and management of the Domain Name System (DNS), the distributed database that reconciles host names and IP addresses on the Internet.
The InterNIC is an interesting example of the recent changes in the Internet. Since early 1993, Network Solutions, Inc. (NSI) operated the registry tasks of the InterNIC on behalf of the NSF and had exclusive registration authority for the .com, .org, .net, and .edu domains. NSI's contract ran out in April 1998 and was extended several times because no other agency was in place to continue the registration for those domains. In October 1998, it was decided that NSI would remain the sole administrator for those domains but that a plan needed to be put into place so that users could register names in those domains with other firms. In addition, NSI's contract was extended to September 2000, although the registration business was opened to competition in June 1999. Nevertheless, when NSI's original InterNIC contract expired, IP address assignments moved to a new entity called the American Registry for Internet Numbers (ARIN). (And NSI itself was purchased by VeriSign in March 2000.)
The newest body to handle governance of global Top Level Domain (gTLD) registrations is the Internet Corporation for Assigned Names and Numbers (ICANN). Formed in October 1998, ICANN is the organization designated by the U.S. National Telecommunications and Information Administration (NTIA) to administer the DNS. Although surrounded by some early controversy (which is well beyond the scope of this paper!), ICANN has received wide industry support. ICANN has created several Support Organizations (SOs) to create policy for the administration of its areas of responsibility, including domain names (DNSO), IP addresses (ASO), and protocol parameter assignments (PSO).
On April 21, 1999, ICANN announced that five companies had been selected to be part of this new competitive Shared Registry System for the .com, .net, and .org domains:
- America Online, Inc. (U.S.)
- CORE (Internet Council of Registrars) (International)
- France Telecom/Oléane (France)
- Melbourne IT (Australia)
- register.com (U.S.)
Phase I of the competitive registrar testbed program was scheduled to run until June 1999, although that date was subsequently extended to August; at the end of Phase I, the Shared Registry System for the .com, .net, and .org domains was to be opened to all ICANN-accredited registrars. By the end of 1999, ICANN had added an additional 29 registrars, and more have been added since, so that there are about 100 different registrars today. Definitive ICANN registrar accreditation information can be found at ICANN's Web site.
The hierarchical structure of domain names is best understood if the domain name is read from right-to-left. Internet hosts names end with a top-level domain name. World-wide generic top-level domains (TLDs) include:
- .com: Commercial organizations (administered by VeriSign Global Registry Services through the Shared Registry System)
- .edu: Educational institutions; largely limited to 4-year colleges and universities from about 1994 to 2001, but also includes some community colleges (administered by EDUCAUSE)
- .net: Network providers; largely limited to hosts actually part of an operational network from about 1994 to 2001, but now open to anyone, including the author of this paper! (administered by VeriSign Global Registry Services through the Shared Registry System)
- .org: Non-profit organizations (administered by VeriSign; after January 2003, administered by the Public Interest Registry (PIR), an organization formed by ISOC with operational control subcontracted to Afilias, the operator of the .info domain)
- .int: Organizations established by international treaty
- .gov: U.S. Federal government agencies (managed by the U.S. General Services Administration, including the fed.us domain)
- .mil: U.S. military (managed by the U.S. Department of Defense Network Information Center)
The host name poodle.champlain.edu, for example, is assigned to a computer named poodle (don't ask why...) in the Accounting & Computing Systems Division at Champlain College (champlain), within the educational TLD (edu). The host name mail.sover.net refers to a host (mail) in the SoverNet domain (sover) within the network provider TLD (net). Guidelines for selecting host names are the subject of RFC 1178.
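The right-to-left hierarchy can be made concrete in a few lines of code. A minimal sketch (this is simple label splitting, not a DNS lookup; the host names are the illustrative ones from the text):

```python
# Split a host name into its dot-separated labels and reverse them,
# so the list reads from most general (TLD) to most specific (host).
def dns_hierarchy(hostname: str) -> list[str]:
    """Return labels ordered TLD-first, mirroring the DNS hierarchy."""
    return hostname.lower().rstrip(".").split(".")[::-1]

print(dns_hierarchy("poodle.champlain.edu"))  # ['edu', 'champlain', 'poodle']
print(dns_hierarchy("mail.sover.net"))        # ['net', 'sover', 'mail']
```

Reading the reversed list left to right walks down the tree exactly as a DNS resolver delegates: TLD, then domain, then host.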
Other top-level domain names use the two-letter country codes defined in ISO standard 3166; munnari.oz.au, for example, is the address of the Internet gateway to Australia and myo.inst.keio.ac.jp is a host at the Science and Technology Department of Keio University in Yokohama, Japan. Other ISO 3166-based domain country codes are ca (Canada), de (Germany), es (Spain), fr (France), gb (Great Britain) [NOTE: For some historical reasons, the TLD .gb is rarely used; the TLD .uk (United Kingdom) seems to be preferred although UK is not an official ISO 3166 country code.], ie (Ireland), il (Israel), mx (Mexico), and us (United States). It is important to note that there is not necessarily any correlation between a country code and where a host is actually physically located.
There are several registries responsible for blocks of IP addresses and domain naming policies around the globe. The American Registry for Internet Numbers (ARIN) was originally responsible for the Americas (western hemisphere) and parts of Africa. In 2002, the Latin American and Caribbean Internet Addresses Registry (LACNIC) was officially recognized and now covers Central and South America, as well as some Caribbean nations. The African Regional Internet Registry (AfriNIC), still on a provisional status, will be assuming responsibility for sub-Saharan Africa. Eventually, ARIN will cover only North America and parts of the Caribbean. The European and Asia-Pacific naming registries are managed by Réseaux IP Européens (RIPE) and the Asia-Pacific NIC (APNIC), respectively.
These authorities, in turn, delegate most of the country TLDs to national registries (such as RNP in Brazil and NIC-Mexico), which have ultimate authority to assign local domain names. An excellent overview of the recent history and anticipated future of the registry system can be found in "Development of the Regional Internet Registry System" (D. Karrenberg et al.) in the IP Journal, Vol. 4, No. 4.
Different countries may organize the country-based subdomains in any way that they want. Many countries use a subdomain similar to the TLDs, so that .com.mx and .edu.mx are the suffixes for commercial and educational institutions in Mexico, and .co.uk and .ac.uk are the suffixes for commercial and educational institutions in the United Kingdom.
The us domain is largely organized on the basis of geography or function. Geographical names in the us name space use names of the form entity-name.city-telegraph-code.state-postal-code.us. The domain name cnri.reston.va.us, for example, refers to the Corporation for National Research Initiatives in Reston, Virginia. Functional branches are also reserved within the name space for schools (K12), community colleges (CC), technical schools (TEC), state government agencies (STATE), councils of governments (COG), libraries (LIB), museums (MUS), and several other generic types of entities. Domain names in the state government name space usually take the form department.state.state-postal-code.us (e.g., the domain name dps.state.vt.us points to the Vermont Department of Public Safety). The K12 name space can vary widely, usually using the form school.school-district.k12.state-postal-code.us (e.g., the domain ccs.cssd.k12.vt.us refers to the Charlotte Central School in the Chittenden South School District, which happens to be in Charlotte, Vermont). More information about the us domain may be found in RFC 1480.
The scheme of TLD assignment and management has worked well for many years, but the pressures of increased commercial activity, network size, and international use have caused controversy about how names can be fairly assigned without violating trademarks and conflicting claims to names. In November 1996, an Internet International Ad Hoc Committee (IAHC) was formed to resolve some of these naming issues and to act as a focal point for the international debate over a proposal to establish additional global naming registries and global gTLDs. The IAHC was dissolved in May 1997 with the publication of the Generic Top Level Domain Memorandum of Understanding (gTLD-MoU) framework. The Council of Registrars (CORE) is an operational body made up of all of the Registrars established under the gTLD-MoU framework.
In November 2000, the first new set of TLDs was approved by ICANN, the first of which went online in October 2001. The seven new TLDs, their purpose, and applicants are:
- .aero - aviation industry; application by Societe Internationale de Telecommunications Aeronautiques SC (SITA)
- .biz - businesses; application by JVTeam, LLC (administered by NeuLevel)
- .coop - business cooperatives; application by National Cooperative Business Association (NCBA)
- .info - general use; application by Afilias, LLC (administered by Afilias)
- .museum - museums; application by Museum Domain Management Association (MDMA)
- .name - individuals; application by Global Name Registry, LTD
- .pro - professionals; application by RegistryPro, LTD
More information about these TLDs, the registration process, and new TLDs can be found at the ICANN New TLD Program Web page.
Last but not least, there is the never-ending issue of who owns domain names and IP addresses. I will make no claim to provide an authoritative answer but... domain names are owned by whoever registers them. This alone is a potential problem. Some ISPs are obtaining names on behalf of their customers and paying the annual fee. The issue has already arisen, "Who owns the name? The registrar or the customer?" Most ISPs have stated that they believe that the customer owns the name, even if the ISP registers the name, because there would be no reason for them to keep the name. Consider, however, that if an ISP insisted that it owned a name, it essentially ties a customer to an ISP forever, destroying the concept of domain name portability.
There is also an issue of violation of trade mark, service mark, or copyright in the choice and ownership of domain names. Consider this example from the 2001 era. A common Microsoft tag line is Where Would You Like to go Today? It so happens that the domain name wherewouldyouliketogotoday.com was registered to The Eagles Nest in Corfu, NY. I don‘t know anything about The Eagles Nest of Corfu, NY but it should not be mistaken for either Eagles Nest Enterprises of Grapevine, TX (the owner of eaglesnest.com) nor The Eagles Nest Internet Services of Newark, OH (owner of theeaglesnest.com).
In any case, suppose that Microsoft decided that someone using their service mark was not in their best interest and they pursued the issue; could they wrestle that domain name away from another registrant? Today's general rule of thumb is that if an organization believes that its name or mark is being used in someone else's domain name in an unfair or misleading way, then it can take legal action against the name holder, and the assignment of the name will be held up pending the outcome of the legal action. More information about this issue can be found at ICANN's Uniform Domain-Name Dispute-Resolution Policy Web page. By the way, this is, of course, the question behind the new industry of cybersquatting: someone registers a domain name hoping that someone else will buy it from them later on!
And what about IP addresses? Prior to the widespread use of CIDR (see Section 3.2.1), individual organizations were assigned an address (usually a Class C!) and domain name at the same time. In general, the holder of the domain name owned the IP address and, if they changed ISPs, routing tables throughout the Internet were updated.
Today, ISPs are assigned addresses in blocks called CIDR blocks. A customer today, whether they already own a domain name or are obtaining a new one, will be assigned an IP address from the ISP‘s CIDR block. If the customer changes ISP, they have to relinquish the IP address.
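This allocation model can be sketched with Python's standard ipaddress module. A minimal illustration (the 198.51.100.0/24 block is from the documentation-only address range, and the /28 customer allocation size is an arbitrary assumption for the example, not a description of real ISP practice):

```python
import ipaddress

# An ISP's CIDR block (documentation range used for illustration).
isp_block = ipaddress.ip_network("198.51.100.0/24")

# Carve the /24 into /28 customer allocations (16 addresses each).
customers = list(isp_block.subnets(new_prefix=28))

print(len(customers))      # 16 allocations
print(customers[0])        # 198.51.100.0/28
print(customers[1])        # 198.51.100.16/28

# Usable host addresses within one customer allocation (network and
# broadcast addresses are excluded by hosts()).
print(list(customers[1].hosts())[0])  # 198.51.100.17
```

A customer that leaves this ISP gives back its /28 and renumbers into the new ISP's block, which is exactly why provider-assigned addresses are not portable.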
A good overview of the naming and addressing procedures can be found inRFC 2901, titled "Guide to Administrative Procedures of the Internet Infrastructure."
TCP/IP is most commonly associated with the Unix operating system. While developed separately, they have been historically tied, as mentioned above, since 4.2BSD Unix started bundling TCP/IP protocols with the operating system. Nevertheless, TCP/IP protocols are available for all widely-used operating systems today and native TCP/IP support is provided in OS/2, OS/400, and Windows 9x/NT/2000, as well as most Unix variants.
Figure 2 shows the TCP/IP protocol architecture; this diagram is by no means exhaustive, but shows the major protocol and application components common to most commercial TCP/IP software packages and their relationship.
Application Layer:       HTTP, FTP, Telnet, Finger, SSH, DNS, POP3/IMAP, SMTP, Gopher, BGP, Time/NTP, Whois, TACACS+, SSL, SNMP, RIP, RADIUS, Archie, Traceroute, TFTP, Ping
Transport Layer:         TCP, UDP, ICMP, OSPF
Internet Layer:          IP, ARP
Network Interface Layer: Ethernet/802.3, Token Ring (802.5), SNAP/802.2, X.25, FDDI, ISDN, Frame Relay, SMDS, ATM, Wireless (WAP, CDPD, 802.11), Fibre Channel, DDS/DS0/T-carrier/E-carrier, SONET/SDH, DWDM, PPP, HDLC, SLIP/CSLIP, xDSL, Cable Modem (DOCSIS)
FIGURE 2. Abbreviated TCP/IP protocol stack.
The sections below will provide a brief overview of each of the layers in the TCP/IP suite and the protocols that compose those layers. A large number of books and papers have been written that describe all aspects of TCP/IP as a protocol suite, including detailed information about use and implementation of the protocols. Some good TCP/IP references are:
TCP/IP Illustrated, Volume 1: The Protocols by W.R. Stevens (Addison-Wesley, 1994)
Troubleshooting TCP/IP by Mark Miller (John Wiley & Sons, 1999)
Guide to TCP/IP, 2/e by Laura A. Chappell and Ed Tittel (Thomson Course Technology, 2004)
TCP/IP: Architecture, Protocols, and Implementation with IPv6 and IP Security by S. Feit (McGraw-Hill, 2000)
Internetworking with TCP/IP, Vol. I: Principles, Protocols, and Architecture, 2/e, by D. Comer (Prentice-Hall, 1991)
"TCP/IP Tutorial" by T.J. Socolofsky and C.J. Kale (RFC 1180, Jan. 1991)
"TCP/IP and tcpdump Pocket Reference Guide," developed by the author for The SANS Institute
The TCP/IP protocols have been designed to operate over nearly any underlying local or wide area network technology. Although certain accommodations may need to be made, IP messages can be transported over all of the technologies shown in the figure, as well as numerous others. It is beyond the scope of this paper to describe most of these underlying protocols and technologies.
Two of the underlying network interface protocols, however, are particularly relevant to TCP/IP. The Serial Line Internet Protocol (SLIP, RFC 1055) and Point-to-Point Protocol (PPP, RFC 1661) may be used to provide data link layer protocol services where no other underlying data link protocol is in use, such as in leased line or dial-up environments. Most commercial TCP/IP software packages for PC-class systems include these two protocols. With SLIP or PPP, a remote computer can attach directly to a host server and, therefore, connect to the Internet using IP rather than being limited to an asynchronous connection.
It is worth spending a little time discussing PPP because of its importance in Internet access today. As its name implies, PPP was designed to be used over point-to-point links. In fact, it is the prevalent IP encapsulation scheme for dedicated Internet access as well as dial-up access. One of the significant strengths of PPP is its ability to negotiate a number of things upon initial connection, including passwords, IP addresses, compression schemes, and encryption schemes. In addition, PPP provides support for simultaneous multiple protocols over a single connection, an important consideration in those environments where dial-up users can employ either IP or another Network Layer protocol. Finally, in environments such as ISDN, PPP supports inverse multiplexing and dynamic bandwidth allocation via Multilink PPP (ML-PPP), described in RFCs 1990 and 2125.
+----------+----------+-----------+-------------+---------+---------+----------+
|   Flag   | Address  | Protocol  | Information | Padding |   FCS   |   Flag   |
| 01111110 | 11111111 | 8/16 bits |      *      |    *    | 16 bits | 01111110 |
+----------+----------+-----------+-------------+---------+---------+----------+

FIGURE 3. PPP frame format (using HDLC).
PPP generally uses an HDLC-like (bit-oriented) frame format as shown in Figure 3, although RFC 1661 does not demand the use of HDLC. HDLC defines the first and last two fields in the frame:
Flag: The 8-bit pattern "01111110" used to delimit the beginning and end of the transmission.
Address: For PPP, uses the 8-bit broadcast address, "11111111".
Frame Check Sequence (FCS): A 16-bit remainder from a cyclic redundancy check (CRC) calculation, used for bit error detection.
RFC 1661 actually describes the use of the three other fields in the frame:
Protocol: An 8- or 16-bit value that indicates the type of datagram carried in this frame's Information field. This field can indicate use of a particular Network Layer protocol (such as IP, IPX, or DDP), a Network Control Protocol (NCP) in support of one of the Network Layer protocols, or a PPP Link Control Protocol (LCP). The entire list of possible values in this field can be found in the IANA list of PPP protocols.
Information: Contains the datagram for the protocol specified in the Protocol field. This field is zero or more octets in length, up to a (default) maximum of 1500 octets (although a different value can be negotiated).
Padding: Optional padding to add length to the Information field. May be required in some implementations to ensure some minimum frame length and/or alignment on computer word boundaries.
The operation of PPP is basically as follows: After the link is physically established, each host sends LCP packets to configure and test the data link. It is here that the maximum frame length, authentication protocol (Password Authentication Protocol, PAP, or Challenge-Handshake Authentication Protocol, CHAP), link quality protocol, compression protocol, and other configuration parameters are negotiated. Authentication, if it is used, will occur after the link has been established. After the link is established, one or more Network Layer protocol connections are configured using the appropriate NCP. If IP is to be used, for example, it will be set up using PPP's IP Control Protocol (IPCP). Once each of the Network Layer protocols has been configured, datagrams from those protocols can be sent over the link. Control protocols may be used for IP, IPX (NetWare), DDP (AppleTalk), DECnet, and more. The link will remain configured for communications until LCP and/or NCP packets close the link down.
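To make the FCS field above concrete, here is a minimal Python sketch of the 16-bit FCS calculation that PPP borrows from HDLC (RFC 1662, Appendix C): a CRC using the reflected polynomial 0x8408, an initial value of 0xFFFF, and a final one's complement. The function names are this sketch's own, not from any standard library.

```python
def _crc16(data: bytes, fcs: int = 0xFFFF) -> int:
    """Running CRC-16 as specified in RFC 1662, Appendix C."""
    for byte in data:
        fcs ^= byte
        for _ in range(8):
            fcs = (fcs >> 1) ^ 0x8408 if fcs & 1 else fcs >> 1
    return fcs

def ppp_fcs16(data: bytes) -> int:
    """FCS value to transmit: one's complement of the running CRC."""
    return _crc16(data) ^ 0xFFFF

def fcs_ok(frame: bytes) -> bool:
    """Receiver check: the CRC computed over the frame *including* the
    received FCS (sent least-significant byte first) equals the
    constant 'good FCS' value 0xF0B8 (RFC 1662, Section C.2)."""
    return _crc16(frame) == 0xF0B8
```

A sender would compute the FCS over the Address, Protocol, Information, and Padding fields and append it low byte first; the receiver simply runs the same CRC over the whole frame and compares against 0xF0B8.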
The Internet Protocol (RFC 791) provides services that are roughly equivalent to the OSI Network Layer. IP provides a datagram (connectionless) transport service across the network. This service is sometimes referred to as unreliable because the network does not guarantee delivery nor notify the end host system about packets lost due to errors or network congestion. IP datagrams contain a message, or one fragment of a message, that may be up to 65,535 bytes (octets) in length. IP does not provide a mechanism for flow control.
     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |Version|  IHL  |      TOS      |         Total Length          |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |         Identification        |Flags|    Fragment Offset      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |      TTL      |   Protocol    |        Header Checksum        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                         Source Address                        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                       Destination Address                     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                   Options....           (Padding)             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |    Data...
    +-+-+-+-+-+-+-+-+-+-+-+-+-

    FIGURE 4. IP packet (datagram) header format.
The basic IP packet header format is shown in Figure 4. The format of the diagram is consistent with the RFC; bits are numbered from left to right, starting at 0. Each row represents a single 32-bit word; note that an IP header will be at least 5 words (20 bytes) in length. The fields contained in the header, and their functions, are:
Version: Specifies the IP version of the packet. The current version of IP is version 4, so this field will contain the binary value 0100. [NOTE: Many IP version numbers have actually been assigned besides 4 and 6; see the IANA's list of IP Version Numbers.]
Internet Header Length (IHL): Indicates the length of the datagram header in 32-bit (4-octet) words. A minimum-length header is 20 octets, so this field always has a value of at least 5 (0101). Since the maximum value of this field is 15, the IP header can be no longer than 60 octets.
Type of Service (TOS): Allows an originating host to request different classes of service for packets it transmits. Although not generally supported today in IPv4, the TOS field can be set by the originating host in response to service requests across the Transport Layer/Internet Layer service interface, and can specify a service priority (0-7) or can request that the route be optimized for cost, delay, throughput, or reliability.
Total Length: Indicates the length (in bytes, or octets) of the entire packet, including both header and data. Given the size of this field, the maximum size of an IP packet is 64 KB, or 65,535 bytes. In practice, packet sizes are limited to the maximum transmission unit (MTU).
Identification: Used when a packet is fragmented into smaller pieces while traversing the Internet. This identifier is assigned by the transmitting host so that different fragments arriving at the destination can be associated with each other for reassembly.
Flags: Also used for fragmentation and reassembly. The first bit is reserved and always set to 0. The second bit is the Don't Fragment (DF) bit, which suppresses fragmentation. The third bit is the More Fragments (MF) bit; it is set on every fragment except the last, so that the receiver knows when the packet can be reassembled.
Fragment Offset: Indicates the position of this fragment in the original packet. In the first packet of a fragment stream, the offset will be 0; in subsequent fragments, this field indicates the offset in increments of 8 bytes.
Time-to-Live (TTL): A value from 0 to 255, indicating the number of hops that this packet is allowed to take before being discarded within the network. Every router that forwards this packet decrements the TTL value by one; if it reaches 0, the packet is discarded.
Protocol: Indicates the higher-layer protocol carried in the packet; options include ICMP (1), TCP (6), UDP (17), and OSPF (89). A complete list of IP protocol numbers can be found in the IANA's list of Protocol Numbers. An implementation-specific list of supported protocols can be found in the protocol file, generally found in the /etc (Linux/Unix), c:\windows (Windows 9x, ME), or c:\winnt\system32\drivers\etc (Windows NT, 2000) directory.
Header Checksum: Carries information to ensure that the received IP header is error-free. Remember that IP provides an unreliable service; this field, therefore, only checks the IP header rather than the entire packet.
Source Address: IP address of the host sending the packet.
Destination Address: IP address of the host intended to receive the packet.
Options: A set of options which may be applied to any given packet, such as sender-specified source routing or a security indication. The option list may use up to 40 bytes (10 words) and will be padded to a word boundary; IP options are taken from the IANA's list of IP Option Numbers.
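The field layout of Figure 4 can be decoded mechanically. The following sketch (a hypothetical example, not from the original article) unpacks the fixed 20-byte IPv4 header with Python's struct module; the dictionary keys follow the field names in the figure.

```python
import struct

def parse_ipv4_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte IPv4 header of Figure 4."""
    (version_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, cksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": version_ihl >> 4,               # 4 for IPv4
        "ihl_words": version_ihl & 0x0F,           # header length in 32-bit words
        "tos": tos,
        "total_length": total_len,                 # header + data, in octets
        "identification": ident,
        "df": bool(flags_frag & 0x4000),           # Don't Fragment bit
        "mf": bool(flags_frag & 0x2000),           # More Fragments bit
        "frag_offset": (flags_frag & 0x1FFF) * 8,  # offset in octets (8-byte units)
        "ttl": ttl,
        "protocol": proto,                         # 1=ICMP, 6=TCP, 17=UDP, ...
        "checksum": cksum,
        "src": ".".join(str(b) for b in src),      # dotted decimal source
        "dst": ".".join(str(b) for b in dst),      # dotted decimal destination
    }
```

The "!" in the format string requests network (big-endian) byte order, which is how all of the multi-byte header fields are transmitted on the wire.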
IP addresses are 32 bits in length (Figure 5). They are typically written as a sequence of four numbers, representing the decimal value of each of the address bytes. Since the values are separated by periods, the notation is referred to as dotted decimal. A sample IP address is 208.162.106.17.
         0                   1                   2                   3
         0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
        +-+---------------+-----------------------------------------------+
Class A |0|    NET_ID     |                   HOST_ID                     |
        +-+-+-------------+---------------+-------------------------------+
Class B |1|0|          NET_ID             |           HOST_ID             |
        +-+-+-+---------------------------+---------------+---------------+
Class C |1|1|0|                NET_ID                     |    HOST_ID    |
        +-+-+-+-+-------------------------------------------------------+
Class D |1|1|1|0|                  MULTICAST_ID                         |
        +-+-+-+-+-------------------------------------------------------+
Class E |1|1|1|1|                 EXPERIMENTAL_ID                       |
        +-+-+-+-+-------------------------------------------------------+

FIGURE 5. IP address formats.
IP addresses are hierarchical for routing purposes and are subdivided into two subfields. The Network Identifier (NET_ID) subfield identifies the TCP/IP subnetwork connected to the Internet. The NET_ID is used for high-level routing between networks, much the same way as the country code, city code, or area code is used in the telephone network. The Host Identifier (HOST_ID) subfield indicates the specific host within a subnetwork.
To accommodate different size networks, IP defines several address classes. Classes A, B, and C are used for host addressing and the only difference between the classes is the length of the NET_ID subfield:
A Class A address has an 8-bit NET_ID and 24-bit HOST_ID. Class A addresses are intended for very large networks and can address up to 16,777,214 (2^24 - 2) hosts per network. The first bit of a Class A address is 0 and the NET_ID occupies the first byte, so there are only 128 (2^7) possible Class A NET_IDs. In fact, the first digit of a Class A address will be between 1 and 126, and only about 90 or so Class A addresses have been assigned.
A Class B address has a 16-bit NET_ID and 16-bit HOST_ID. Class B addresses are intended for moderate-sized networks and can address up to 65,534 (2^16 - 2) hosts per network. The first two bits of a Class B address are 10, so the first digit of a Class B address will be a number between 128 and 191; there are 16,384 (2^14) possible Class B NET_IDs. The Class B address space has long been threatened with exhaustion, and it has been very difficult to get a new Class B address for some time.
A Class C address has a 24-bit NET_ID and 8-bit HOST_ID. These addresses are intended for small networks and can address only up to 254 (2^8 - 2) hosts per network. The first three bits of a Class C address are 110, so the first digit of a Class C address will be a number between 192 and 223. There are 2,097,152 (2^21) possible Class C NET_IDs, and most addresses assigned to networks today are Class C (or sub-Class C!).
The remaining two address classes are used for special functions only and are not commonly assigned to individual hosts. Class D addresses begin with a value between 224 and 239 (the first 4 bits are 1110) and are used for IP multicasting (i.e., sending a single datagram to multiple hosts); the IANA maintains a list of Internet Multicast Addresses. Class E addresses begin with a value between 240 and 255 (the first 4 bits are 1111) and are reserved for experimental use.
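The leading-bit rules of Figure 5 reduce to simple ranges of the first octet, which a few lines of Python can illustrate (a hypothetical helper, written for this overview):

```python
def address_class(ip: str) -> str:
    """Return the classful category of a dotted-decimal IPv4 address,
    based on the leading bits shown in Figure 5."""
    first = int(ip.split(".")[0])
    if first < 128:
        return "A"   # leading bit  0   (first octet   0-127)
    if first < 192:
        return "B"   # leading bits 10  (first octet 128-191)
    if first < 224:
        return "C"   # leading bits 110 (first octet 192-223)
    if first < 240:
        return "D"   # leading bits 1110 (multicast, 224-239)
    return "E"       # leading bits 1111 (experimental, 240-255)
```

For instance, the sample address 208.162.106.17 from above falls in the 192-223 range, so it is a Class C address.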
Several address values are reserved and/or have special meaning. A HOST_ID of 0 (as used above) is a dummy value reserved as a placeholder when referring to an entire subnetwork; the address 208.162.106.0, then, refers to the Class C address with a NET_ID of 208.162.106. A HOST_ID of all ones (usually written "255" when referring to an all-ones byte, but also denoted as "-1") is a broadcast address and refers to all hosts on a network. A NET_ID value of 127 is used for loopback testing, and the specific host address 127.0.0.1 refers to the localhost.
Several NET_IDs have been reserved in RFC 1918 for private network addresses, and packets will not be routed over the Internet to these networks. The reserved NET_IDs are the Class A address 10.0.0.0 (formerly assigned to ARPANET), the sixteen Class B addresses 172.16.0.0-172.31.0.0, and the 256 Class C addresses 192.168.0.0-192.168.255.0.
An additional addressing tool is the subnet mask. Subnet masks are used to indicate the portion of the address that identifies the network (and/or subnetwork) for routing purposes. The subnet mask is written in dotted decimal and the number of 1s indicates the significant NET_ID bits. For "classful" IP addresses, the subnet mask and number of significant address bits for the NET_ID are:
    Class     Subnet Mask       Number of Bits
      A       255.0.0.0               8
      B       255.255.0.0            16
      C       255.255.255.0          24
Depending upon the context and literature, subnet masks may be written in dotted decimal form or just as a number representing the number of significant address bits for the NET_ID. Thus, 208.162.106.17 255.255.255.0 and 208.162.106.17/24 both refer to a Class C NET_ID of 208.162.106. Some, in fact, might refer to this 24-bit NET_ID as a "slash-24."
Subnet masks can also be used to subdivide a large address space into subnetworks or to combine multiple small address spaces. In the former case, a network may subdivide its address space to define multiple logical networks by segmenting the HOST_ID subfield into a Subnetwork Identifier (SUBNET_ID) and a (smaller) HOST_ID. For example, a user assigned the Class B address space 172.16.0.0 could segment this into a 16-bit NET_ID, 4-bit SUBNET_ID, and 12-bit HOST_ID. In this case, the subnet mask for Internet routing purposes would be 255.255.0.0 (or "/16"), while the mask for routing to individual subnets within the larger Class B address space would be 255.255.240.0 (or "/20").
But how does a subnet mask work? To determine the subnet portion of the address, we simply perform a bit-by-bit logical AND of the IP address and the mask. Consider the following example: suppose we have a host with the IP address 172.20.134.164 and a subnet mask 255.255.0.0. We write out the address and mask in decimal and binary as follows:
172.020.134.164 10101100.00010100.10000110.10100100
AND 255.255.000.000 11111111.11111111.00000000.00000000
--------------- -----------------------------------
172.020.000.000 10101100.00010100.00000000.00000000
From this we can easily find the NET_ID 172.20.0.0 (and can also infer the HOST_ID 134.164).
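The same bitwise AND can be reproduced with Python's standard ipaddress module (a hypothetical example added for illustration; the module performs exactly this masking internally):

```python
import ipaddress

# The AND of the address and mask, done explicitly on 32-bit integers:
ip   = int(ipaddress.IPv4Address("172.20.134.164"))
mask = int(ipaddress.IPv4Address("255.255.0.0"))
net_id = ipaddress.IPv4Address(ip & mask)
print(net_id)  # 172.20.0.0

# Or let the module do the masking itself:
iface = ipaddress.IPv4Interface("172.20.134.164/255.255.0.0")
print(iface.network)  # 172.20.0.0/16
```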
As an aside, most ISPs use a /30 address for the WAN links between the network and the customer. The router on the customer‘s network will generally have two IP addresses; one on the LAN interface using an address from the customer‘s public IP address space and one on the WAN interface leading back to the ISP. Since the ISP would like to be able to ping both sides of the router for testing and maintenance, having an IP address for each router port is a good idea.
By using a /30 mask, a single Class C address space can be broken up into 64 smaller subnets, each containing four addresses. Here's an example. Suppose an ISP assigns a particular customer the address 24.48.165.130 and the subnet mask 255.255.255.252. That would look like the following:
024.048.165.130 00011000.00110000.10100101.10000010
AND 255.255.255.252 11111111.11111111.11111111.11111100
--------------- -----------------------------------
024.048.165.128 00011000.00110000.10100101.10000000
So we find the NET_ID to be 24.48.165.128. Since there's a 30-bit NET_ID, we are left with a 2-bit HOST_ID; thus, there are four possible host addresses in this subnet: 24.48.165.128 (00), .129 (01), .130 (10), and .131 (11). The .128 address isn't used because it is all-zeroes; .131 isn't used because it is all-ones. That leaves .129 and .130, which is fine since we only have two ends on the WAN link! So, in this case, the customer's router might be assigned 24.48.165.130/30 and the ISP's end of the link might get 24.48.165.129/30. Use of this subnet mask is very common today (so common, in fact, that RFC 3021 defines a 31-bit prefix with exactly two addresses specifically for point-to-point WAN links).
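The /30 arithmetic above can be checked with the ipaddress module (again, a hypothetical snippet added for illustration):

```python
import ipaddress

link = ipaddress.IPv4Network("24.48.165.128/30")
usable = [str(h) for h in link.hosts()]  # excludes all-zeroes and all-ones

print(link.network_address)    # 24.48.165.128 (all-zeroes HOST_ID)
print(usable)                  # ['24.48.165.129', '24.48.165.130']
print(link.broadcast_address)  # 24.48.165.131 (all-ones HOST_ID)
```

The two usable addresses, .129 and .130, are exactly the two ends of the WAN link.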
A very good IP addressing tutorial can be found in Chuck Semeria's "Understanding IP Addressing: Everything You Ever Wanted to Know." If you are really interested in subnet masks, there are a number of subnet calculators on the Internet, including jafar.com's IP Subnet/Supernet Calculator, Net3 Group Inc.'s IP Subnet Calculator, and Super Shareware's Subnet Calculator.
One final word about IP addresses is in order. Most Internet protocols specify that addresses be supplied in the form of a fully-qualified host name or an IP address in dotted decimal form. However, spammers and others have found a way to obfuscate IP addresses by supplying the address as a single large decimal number. Remember that IP addresses are 32-bit quantities; we write them in dotted decimal only for the convenience of humans, and the computer interprets either form as the same 32-bit number. For that reason, the following URLs will all take you to the same Web page:
http://www.garykessler.net
http://209.198.111.31
http://3519442719
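The conversion between the two forms is straightforward, as this small illustrative snippet shows:

```python
import ipaddress

# Dotted decimal is only a human convenience; the address itself is
# one 32-bit number, which is why the decimal-URL trick above works.
addr = ipaddress.IPv4Address("209.198.111.31")
print(int(addr))  # 3519442719

# The same arithmetic by hand: each octet shifts into its byte position.
o = [209, 198, 111, 31]
n = (o[0] << 24) | (o[1] << 16) | (o[2] << 8) | o[3]
print(n == int(addr))  # True
```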
The use of class-based (or classful) addresses in IP is one of the reasons that IP address exhaustion has been a concern since the early 1990s. Consider an organization, for example, that needs 1000 IP addresses. A Class C address is obviously too small, so a Class B address would get assigned. But a Class B address offers more than 64,000 addresses, so over 63,000 addresses are wasted in this assignment.
An alternative approach is to assign this organization a block of four Class C addresses, such as 192.168.128.0, 192.168.129.0, 192.168.130.0, and 192.168.131.0. By using a 22-bit subnet mask 255.255.252.0 (or "/22") for routing to this "block," the NET_ID assigned to this organization is 192.168.128.0.
This use of variable-size subnet masks is called Classless Interdomain Routing (CIDR), described in RFCs 1518 and 1519. In the example here, routing information for what is essentially four Class C addresses can be specified in a single router table entry.
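The aggregation of those four Class C blocks into one routing entry can be demonstrated with the ipaddress module (an illustrative snippet, not part of the original text):

```python
import ipaddress

# Four contiguous Class C (/24) blocks collapse into a single /22 entry,
# just as a CIDR-capable router table would record them.
blocks = [ipaddress.IPv4Network(f"192.168.{i}.0/24") for i in range(128, 132)]
collapsed = [str(n) for n in ipaddress.collapse_addresses(blocks)]
print(collapsed)  # ['192.168.128.0/22']
```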
But this concept can be expanded even more. CIDR is an important contribution to the Internet because it has dramatically limited the size of the Internet backbone's routing tables. Today, IP addresses are not assigned strictly on a first-come, first-served basis, but have been preallocated to various numbering authorities around the world. The numbering authorities, in turn, assign blocks of addresses to major (or first-tier) ISPs; these address blocks are called CIDR blocks. An ISP's customer (which includes ISPs that are customers of a first-tier ISP) will be assigned an IP NET_ID that is part of the ISP's CIDR block. So, for example, let's say that Gary Kessler ISP has a CIDR block containing the 256 Class C addresses in the range 196.168.0.0-196.168.255.0. This range of addresses could be represented in a routing table with the single entry 196.168.0.0/16. Once a packet hits the Gary Kessler ISP, it will be routed to the correct end destination.
But don't stop now! Just as shrinking the subnet mask lets a single NET_ID refer to multiple networks (shrinking router tables), extending the subnet mask lets us assign an organization something smaller than a Class C address. As the Class C address space falls in danger of being exhausted, users are under increasing pressure to accept assignment of these sub-Class C addresses. An organization with just a few servers, for example, might be assigned, say, 64 addresses rather than the full 256. The standard subnet mask for a Class C is 24 bits, yielding a 24-bit NET_ID and 8-bit HOST_ID. If we use a "/26" mask (255.255.255.192), we can assign the same "Class C" to four different users, each getting 1/4 of the address space (and a 6-bit HOST_ID). So, for example, the IP address space 208.162.106.0 might be assigned as follows:
    NET_ID              HOST_ID range    Valid HOST_IDs
    208.162.106.0           0-63              1-62
    208.162.106.64         64-127            65-126
    208.162.106.128       128-191           129-190
    208.162.106.192       192-255           193-254
Note that in ordinary Class C usage, we would lose two addresses from the space — 0 and 255 — because addresses of all 0s and all 1s cannot be assigned as a HOST_ID. In the usage above, we would lose eight addresses from this space, because 0, 64, 128, and 192 have an all 0s HOST_ID and 63, 127, 191, and 255 have an all 1s HOST_ID. Each user, then, has 62 addresses that can be assigned to hosts.
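The table and address counts above can be generated mechanically (an illustrative snippet using the ipaddress module):

```python
import ipaddress

# Split the "Class C" 208.162.106.0/24 into its four /26 subdivisions;
# hosts() excludes the all-zeroes and all-ones HOST_IDs in each subnet.
net = ipaddress.IPv4Network("208.162.106.0/24")
for sub in net.subnets(new_prefix=26):
    hosts = list(sub.hosts())
    print(sub, "valid HOST_IDs:", hosts[0], "-", hosts[-1])
```

Each of the four /26 subnets yields 62 assignable addresses, matching the eight addresses lost from the full /24 space.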
The pressure on the Class C address space continues to intensify. Today, the pressure is not only to limit the number of addresses assigned; organizations must also show why they need as many addresses as they request. Consider a company with 64 hosts and 3 servers. The ISP may request that the company obtain only 32 IP addresses. The rationale: the 3 servers need 3 addresses, but the other hosts might be able to "share" the remaining pool of 27 addresses (recall that we lose HOST_ID addresses 0 and 31).
A pool of IP addresses can be shared by multiple hosts using a mechanism called Network Address Translation (NAT). NAT, described in RFC 1631, is typically implemented in hosts, proxy servers, or routers. The scheme works because every host on the user's network can be assigned an IP address from the pool of RFC 1918 private addresses; since these addresses are never seen on the Internet, duplication across organizations is not a problem.
FIGURE 6. Network Address Translation (NAT).
Consider the scenario shown in Figure 6. When the user accesses a Web site on the Internet, the NAT server will translate the "private" IP address of the host (192.168.50.50) into a "public" IP address (220.16.16.5) from the pool of assigned addresses. NAT works because of the assumption that, in this example, no more than 27 of the 64 hosts will ever be accessing the Internet at a single time.
But suppose that assumption is wrong. Another enhancement, called Port Address Translation (PAT) or Network Address Port Translation (NAPT), allows multiple hosts to share a single IP address by using different "port numbers" (ports are described further in Section 3.3).
FIGURE 7. Port Address Translation (PAT).
Port numbers are used by higher layer protocols (e.g., TCP and UDP) to identify a higher layer application. A TCP connection, for example, is uniquely identified on the Internet by the four values (aka 4-tuple)