link to published version: Ralston, et al (eds.): Encyclopedia of Computer Science, 4th edition, John Wiley & Sons

The World Wide Web


Hal Berghel, University of Arkansas

NETWORK PERSPECTIVE

The World Wide Web, or "the Web" as it's commonly called, represents an important departure from more traditional network communications protocols such as Telnet and FTP. Where prior network protocols were special purpose in terms of both function and media formats, the Web is highly versatile. The Web was the first form of digital communication that had rendering and browsing utilities adequate to allow any person or group with network access to share media-rich information with anyone else. The Web represents a major paradigm shift in networked computing, both in terms of delivery of information and inter-personal, though not in-person, communication.

Formally, the Web is a client-server model for packet-switched, networked computer systems that utilizes a few key Internet protocols. The client handles all of the interaction with other components of the computing environment (i.e., other desktop applications and the server) and temporarily retains information for perusal. The networked servers are information repositories which host software to serve client requests. The procedural "glue" which makes the client-server interactivity possible is the concurrent support, by both client and server, of the protocol-pair, HyperText Transfer Protocol (HTTP) and HyperText Markup Language (HTML). The former establishes the basic handshaking procedures between client and server, while the latter defines the organization and structure of Web documents to be exchanged. At this writing, the current HTTP version is 1.0, while adoption of version 3.2 of HTML should take place by late 1997.
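As a rough illustration of the exchange described above, the following Python sketch shows the shape of a minimal HTTP/1.0 request and of a response carrying an HTML document. The helper names (build_request, parse_response) and the canned response are assumptions made for this example; no real server is contacted, and the parser handles only the simple case shown.

```python
# Hypothetical helpers: a sketch of HTTP/1.0 message shapes, not a full client.

def build_request(path):
    """Return the bytes of a minimal HTTP/1.0 GET request for `path`."""
    return f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii")


def parse_response(raw):
    """Split a raw HTTP response into (status_code, headers, body)."""
    # A blank line separates the status line and headers from the document.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    _version, status, _reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(status), headers, body


# A canned server reply standing in for a real network exchange (assumption).
SAMPLE_RESPONSE = (
    b"HTTP/1.0 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<html><head><title>Hello</title></head>"
    b"<body><h1>Hello, Web</h1></body></html>"
)

if __name__ == "__main__":
    print(build_request("/index.html"))
    status, headers, body = parse_response(SAMPLE_RESPONSE)
    print(status, headers["content-type"])
    print(body.decode("ascii"))
```

The request and status lines belong to HTTP, while everything after the blank line is the HTML document that the client renders.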

According to NSFNET Backbone statistics, the Web moved into first place both in terms of the percentage of total packets moved (21%) and percentage of total bytes moved (26%) along the NSF backbone in the first few months of 1995. This placed the Web well ahead of the traditional Internet activity leaders, FTP (14%/21%) and Telnet (7.5%/2.5%), as the most popular Internet service. A comparison of the evolutionary patterns of the Web, Gopher and FTP is graphically depicted in Figure 1.


Figure 1. Merit NIC Backbone statistics for the Web, Gopher and FTP from 1993-1995 in terms of both packet and byte counts. (source: Merit NIC and Jim Pitkow, used with permission)

The rapid escalation of the Web is a result of a unique combination of characteristics:

  1. the Web is an enabling technology - The Web was the first widespread network technology to extend the notion of virtual network machine to multimedia. While the ability to execute programs on, and retrieve content from, distributed computers was not new (e.g., Telnet and FTP were already in wide use by the time that the Web was conceived), the ability to produce and distribute media-rich documents via a common, platform-independent document structure was new to the Web.
  2. the Web is a unifying technology - The unification came through the Web's accommodation of a wide range of multimedia formats. Since such audio (e.g., .WAV, .AU), graphics (e.g., .GIF, .JPG) and animation (e.g., MPEG) formats are all digital, they were already unified in desktop applications prior to the Web. The Web, however, unified them for distributed, network applications. One Web "browser", as it later came to be called, would correctly render dozens of media formats regardless of network source.
  3. the Web is a social phenomenon - The Web social experience evolved in three stages. Stage one was the phenomenon of Web "surfing". The richness and variety of Web documents and the novelty of the experience made Web surfing the de facto standard for curiosity-driven networking behavior in the 1990s. The second stage involved such Web interactive communication forums as Internet Relay Chat (IRC), which provided a new outlet for interpersonal but not-in-person communication. The third stage, which is in its infancy as of this writing, involves the notion of virtual community. The widespread popularity and social implications of such network-based, interactive communication are becoming an active area in computing research.

END USER'S PERSPECTIVE

Extensive reporting on Web use and Web users may be found at a number of Web survey sites. Perhaps the most thorough is the biannual, self-selection World Wide Web Survey, which began in January 1994 (see For Further Reading, below). As this article is written, the seventh biannual Web survey (May 1997) has just been released. Some global summary information is reported in Table 1, below:

TABLE 1. Summary Information on Web use

A major problem of self-selection surveys, where subjects determine whether, or to what degree, they wish to participate in the survey, is that the samples are likely to be biased. In the case of the Web survey, for example, the authors recommend that readers assume a bias towards experienced users. As a consequence, they recommend that readers confirm the results through random sample surveys. Despite these limitations, however, the Web Surveys are widely used and referenced, and are among our best sources of information on Web use.
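The effect of self-selection can be made concrete with a small simulation. The Python sketch below is illustrative only: the population, the uniform distribution of experience, and the response model (in which the chance of volunteering rises with experience) are all assumptions invented for this example and are not taken from the Web surveys.

```python
import random


def simulate(seed=0, population_size=10_000):
    """Compare a self-selected sample with a random sample of equal size."""
    rng = random.Random(seed)
    # Hypothetical population: years of Web experience, uniform on [0, 5].
    population = [rng.uniform(0, 5) for _ in range(population_size)]

    # Assumed response model: the chance of volunteering rises with
    # experience, from 5% for a newcomer to 20% for a five-year veteran.
    volunteers = [x for x in population if rng.random() < 0.05 + 0.03 * x]

    # A random sample of the same size, drawn without regard to experience.
    random_sample = rng.sample(population, len(volunteers))

    def mean(xs):
        return sum(xs) / len(xs)

    return mean(population), mean(volunteers), mean(random_sample)


if __name__ == "__main__":
    pop, self_selected, randomized = simulate()
    print(f"population mean experience: {pop:.2f} years")
    print(f"self-selected sample mean:  {self_selected:.2f} years")
    print(f"random sample mean:         {randomized:.2f} years")
```

Under this assumed model the volunteers overstate average experience (about 3.0 years in expectation, against 2.5 for the population), while the random sample tracks the population closely, which is why confirmation through random sampling is recommended.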

An interesting byproduct of these surveys will be an increased understanding of the difference between traditional and electronic surveying methodologies, and a concern over possible population distortions under a new, digital lens. One may only conjecture at this point whether telephone respondents behave similarly to network respondents in survey settings. In addition, Web surveyors will develop new techniques for non-biased sampling that avoid the biases inherent in self-selection. The science and technology behind such electronic sampling may well be indispensable for future generations of Internet marketers, communicators, and organizers.

HISTORICAL PERSPECTIVE

The Web was conceived by Tim Berners-Lee and his colleagues at CERN (now called the European Laboratory for Particle Physics) in 1989 as a shared information space which would support collaborative work. Berners-Lee defined HTTP and HTML at that time. As a proof-of-concept prototype, he developed the first Web client navigator-browser in 1990 for the NeXTStep platform. Nicola Pellow developed the first cross-platform Web browser in 1991, while Berners-Lee and Bernd Pollermann developed the first server application - a phone book database. By 1992, interest in the Web was sufficient to produce four additional browsers - Erwise, Midas and Viola for X Windows, and Cello for Windows. The following year, Marc Andreessen of the National Center for Supercomputing Applications (NCSA) wrote Mosaic for X Windows, which soon became the browser standard against which all others would be compared. Andreessen went on to co-found Netscape Communications in 1994, whose browser, Netscape Navigator, is the current de facto standard Web browser (see Figure 2).


Figure 2. Navigator 3.x is a recent generic "navigator/browser" from Netscape Communications. Displayed is a vanilla "splash page" for the Web Test Pattern - a test bench for determining the level of HTML compliance of a browser.

Despite the original design goal of supporting collaborative work, Web use has become highly variegated. The Web has been extended into a wide range of products and services offered by individuals and organizations, for commerce, education, entertainment, "edutainment", and even propaganda. A partial list of popular Web applications includes:

  1. individual and organizational homepages
  2. sales prospecting via interactive forms-based surveys
  3. advertising and the distribution of product promotional material - new product information, product updates, product recall notices
  4. product support - manuals, technical support, frequently asked questions (FAQs)
  5. corporate record-keeping - usually via local area networks (LANs) and intranets
  6. electronic commerce - made possible by the advent of several secure HTTP transmission protocols and of electronic banking which can handle small charges (perhaps at the level of millicents)
  7. religious proselytizing
  8. propagandizing
  9. digital politics

Most Web resources at this writing are still set up for non-interactive, multimedia downloads (e.g., non-interactive Java animation applets, movie clips, real-time audio transmissions, text with graphics). This will change in the next decade as software developers and Web content-providers shift their attention to the interactive and participatory capabilities of the Internet, the Web, and their successor technologies. But at this writing, the dominant Web theme seems to remain static HTML documents and non-interactive animations.

Web technology evolved beyond the original concept in several important respects. For one, the support of the Common Gateway Interface (CGI) within HTTP in 1993 added interactive computing capability to the Web. Perhaps the most important use of CGI to this point has been the processing of CGI forms which enable input from the Web user-client to be passed to the server for processing. While, in theory, CGI programs can provide server-side programming for virtually any Web need, network bandwidth constraints and transmission delays may make some heavily interactive and volumetric applications infeasible.
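The forms processing described above can be sketched in a few lines of Python. This is a minimal illustration rather than a production program: it assumes a form submitted by GET, so that the server passes the encoded fields in the QUERY_STRING environment variable, and the function name handle_form and the form field "name" are hypothetical.

```python
import html
import os
from urllib.parse import parse_qs


def handle_form(environ):
    """Build a CGI response (header block plus HTML) from a GET form submission."""
    fields = parse_qs(environ.get("QUERY_STRING", ""))
    # parse_qs maps each field name to a list of values; take the first.
    name = fields.get("name", ["anonymous"])[0]
    # Escape user input before echoing it back inside the HTML document.
    body = f"<html><body><p>Hello, {html.escape(name)}!</p></body></html>"
    # A CGI program writes a header block, a blank line, then the document.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # The Web server sets QUERY_STRING before invoking the program.
    print(handle_form(os.environ), end="")
```

Each form submission costs a full round trip to the server, which is the source of the bandwidth and delay constraints noted above.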

Second, "plug-in" technology increased the media-rendering capability of browsers while avoiding the time-consuming spawning of so-called "helper apps" through the browser's launchpad. The speed advantage of the plug-ins, together with the tight coupling that exists between the plug-ins and the media formats which they render, makes them a highly useful extension.

Third, the advent of executable content added a high level of animated media rendering and interactive content on the client side. Such object-oriented network programming languages as Java produce platform-independent program modules which are executable on enabled Web browsers. Not surprisingly, this latest extension, which involves executing foreign programs that have been downloaded across the network, is not without security risk.

Fourth, we are beginning to see advanced information-gathering strategies which go beyond the original "information-pull" concept behind the Web. Where most users, perhaps through autonomous software agents, currently seek to draw information to them, solicited "push-phase" technology attempts to routinely and automatically dispense information to selected consumers (see Figure 3).


Figure 3. Marimba Corporation's Castanet tuner with two channels open - one which animates binary tree growth as random values are inserted, and the other which supports interactive Rubik's Cube play.

Within the past year several prototypes of solicited push netcasting have been deployed. Some, like PointCast, consolidate and distribute information via a proprietary server-"transmitter". In this case, the client-side software behaves as a dedicated "peruser" for the transmissions. Other solicited push technology, such as Marimba's Castanet, contains a "tuner" which allows the client to connect to an arbitrary number of different servers. Each connection from the client to a transmitter is called a channel.
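The tuner-and-channel arrangement can be modeled abstractly as follows. This Python sketch is a toy, in-memory model of solicited push; the class and method names are hypothetical and do not correspond to the actual PointCast or Castanet software or protocols.

```python
class Transmitter:
    """A server that pushes each new item to every tuner subscribed to it."""

    def __init__(self, name):
        self.name = name
        self._tuners = []

    def subscribe(self, tuner):
        self._tuners.append(tuner)

    def publish(self, item):
        # Push: the server initiates delivery; clients do not poll.
        for tuner in self._tuners:
            tuner.receive(self.name, item)


class Tuner:
    """A client holding one channel per transmitter it has tuned to."""

    def __init__(self):
        self.channels = {}

    def tune(self, transmitter):
        # Soliciting the push: the client opts in to this transmitter.
        self.channels[transmitter.name] = []
        transmitter.subscribe(self)

    def receive(self, channel, item):
        self.channels[channel].append(item)


if __name__ == "__main__":
    news, games = Transmitter("news"), Transmitter("games")
    tuner = Tuner()
    tuner.tune(news)
    tuner.tune(games)
    news.publish("headline 1")
    games.publish("puzzle 1")
    print(tuner.channels)
```

The push is "solicited" because nothing is delivered until the client tunes in; thereafter the transmitter, not the client, decides when content arrives.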

We observe that CGI, plug-ins, executable content and push technology represent significant departures from the original browser-centric paradigm of Web information exchange, and add considerably to Web capabilities.

THE WEB AS A SOCIAL PHENOMENON

The social effect of the Web is not well understood. Not surprisingly, the zeal to harness and exploit the richness of Web resources and technology, combined with the desire to capitalize on commercial Web services, has taken precedence over efforts to understand the social dimensions of Web use.

Much of what little we know of Web behavior seems to be derived from two disparate sources. The first is the descriptive statistics produced by the Web surveys, which are most useful for measuring isolated events and independent activities - e.g., how many Windows users used Netscape.

The second source is the study of the use of Email. Email's status as a de facto paradigm of "interpersonal though not-in-person communication" makes it a useful testbed for hypotheses about network behavior generally. Email and the Web share several characteristics: both minimize the effects of geographical distance between users, both are based on user-centric models of communication, both rely on self-imposed interrupts, both are paperless and archivable by default, both create potential security and privacy problems, and neither requires continuous endpoint-to-endpoint network connectivity. Email can therefore teach us something about Web behavior.

However, both sources provide incomplete views of Web behavior. Descriptive statistics tell us little about either the causes of emerging trends or the connections and associations between various aspects of Web use (e.g., to what extent, if any, do anonymous Web engagements promote discussion of controversial topics?).

There are differences between Email and the Web as well. Email deals with network, peer-to-peer communication partnerships, whereas the present Web remains primarily an information-delivery system. Email, in its most basic form at least, exemplifies push-phase technology, while the current Web is mostly pull-phase in orientation. Of course, the onset of new technologies such as Web teleconferencing and virtual communities will change the nature of such comparisons.

While definitive conclusions about the social aspects of Web use remain elusive, some central issues have been identified for future study (see Table 2).

TABLE 2. Social issues and Web behavior

We are slowly coming to understand the capabilities of the Web for selected applications and venues. To illustrate, early use convincingly demonstrated that the Web was a popular and worthwhile medium for presenting distributed multimedia, even though we can't as yet quantify the social benefits and institutional costs which result from this use. As CGI was added to the Web, it became clear that the Web would be an important location-independent, multi-modal form of interactivity - although we know little about the motivations behind such interactivity, and even less about how one would measure the long-term utility for the participants and their institutions.

VIRTUAL COMMUNITIES

As mentioned above, the Web's primary utility at the moment is as an information delivery device - what some authors have called the "document phase" of the Web. However, more powerful and robust Web applications will soon begin to take hold. Perhaps the most significant future application will involve the construction of virtual communities.

Virtual, or electronic, communities are examples of interactive and participatory forums conducted over digital networks for the mutual benefit of the participants and sponsors. They may take on any number of forms. The first attempts to establish virtual communities date back to the mid-1980s with the community "freenet" movement. While early freenets offered few services beyond Email and Telnet, many quickly expanded to offer access to documents in local libraries and government offices, Internet relay chats, community bulletin boards, and so forth, thereby giving participants an enhanced sense of community through another form of connectivity.

Virtual communities of the future are likely to have both advantages and disadvantages when compared to their veridical counterparts (Table 3).

TABLE 3. Potential Advantages and Disadvantages of Electronic Communities

CONCLUSION

The World Wide Web represents the closest technology to the ideal of a completely distributed network environment for multiform communication. As such, it may be thought of as a paradigm shift away from earlier network protocols. Many feel that the most significant impact of the Web will not be felt until the 21st century, when technologies are added to make the Web fully interactive, participatory and immersive by default.


FOR FURTHER READING

  1. ACM Electronic Communities Project (information on the use of the Web for Electronic Communities)
  2. Berghel, H., "The Client Side of the Web", Communications of the ACM, 39:1 (January 1996), pp. 30-40.
  3. Berghel, H., "Email: the Good, the Bad and the Ugly", Digital Village, Communications of the ACM, 40:4 (April 1997), pp. 11-15.
  4. Berners-Lee, T., "WWW: Past, Present and Future", Computer, 29:10 (October 1996), pp. 69-77.
  5. Comer, D., The Internet Book, Prentice-Hall, Upper Saddle River, 1997. (excellent introductory overview of the Internet and Web)
  6. NSFNET Backbone Traffic Distribution Statistics, April, 1995. http://www.cc.gatech.edu/gvu/stats/NSF/merit.html.
  7. Pitkow, J., et al, GVU's WWW User Surveys, http://www.cc.gatech.edu/gvu/user_surveys/.