JCMC 3 (2) September 1997


Bridging the Gulfs: From Hypertext to Cyberspace [1]

Thierry Bardini
Département de Communication
Université de Montréal






Journal entry 37. Thoughts of the Brain are experienced by us as arrangements and rearrangements--change--in a physical universe; but in fact it is really information and information processing that we substantialize. We do not merely see its thoughts as objects, but rather as movement, or, more precisely, the placement of objects: how they become linked to one another. But we cannot read the patterns of arrangement; we cannot extract the information in it--i.e. it as information, which is what it is. The linking and relinking of objects by the Brain is actually a language, but not a language like ours (since it is addressing itself and not someone or something outside itself).

Philip K. Dick, Valis.


Abstract


The purpose of this paper is to focus on two main conceptions at the origin of hypertext technology, and contrast the associationist and the connectionist views. From the starting point provided by this conceptual opposition, it surveys the relationships between users and developers of new computerized communication technologies as inscriptions at the interface. Upgrading Brenda Laurel's models of the interface, it proposes a new conception of the personal interface that acknowledges the virtual presence of the designer, and locates the space of the screen as a dialogic space of mutual engagement.

Defining Hypertext


In this paper we look back at the early history of hypertext technology through the alternative visions of two pioneers of the field, Douglas Engelbart and Ted Nelson, and propose issues for the current agenda of personal computing technology for the end of the twentieth century. Following the works of [Halasz (1988)] and [Conklin (1987)], we describe the structure and the legacy of these visions to raise questions concerning the current status and future of the technology. Our work, however, differs substantially from this earlier research in that we take a sociological perspective on hypertext.

Our thesis here is that Engelbart's and Nelson's visions of hypertext reveal two cultures deeply embedded in the technology, but organized on the same operating principle. Their perspectives take two views on the user: An individual or a member of a community. Engelbart and Nelson are the prophets of these two trends, deeply intertwined and absolutely indivisible, for which each of us is defined as a potential user, a class in itself or a member of an entity of greater importance. Both of them address the question of our relationship to the act of creation from diametrically opposite, and therefore complementary and irremediably opposed, points of view.

Virtuality and metaphor are two much-discussed aspects of current interface technologies. Here we propose to link them (and a set of related concepts such as hypertext, hypermedia, user-illusion, intelligent agents, and narrative) in an historical analysis of the development of the technology, going from the seminal work of Douglas Engelbart and Ted Nelson to the current issues in the field. In this set of related concepts, we begin with the notion of hypertext. Hypertext appears historically central to the analysis we are making, in that it can be considered as the first move away from the unidimensionality of the culture of print.

Defining hypertext can be confusing. [Gygi (1990, p. 282)] categorized available definitions of hypertext into two types, "broad-spectrum" (Group I) and the "more clinical variety" (Group II). She found Group I definitions in the popular press and in advertising and marketing literature, and Group II definitions in technical journals and research efforts at developing computer-supported hypertext systems. She gave the following examples:

Group I

Group II

The definition of hypertext is the result of an historical process, in which the meaning of the term "hypertext" is progressively stabilized through negotiations among actors of the field. The term "hypertext" is usually credited to Ted Nelson, who says that he coined the term in 1962 with the idea of hyperspace in his mind. According to Nelson, his influence was mainly found in the vocabulary of mathematics, where the prefix "hyper" means "extended and generalized" [(Nelson, personal interview, 3/17/93)]. To Nelson, hypertext was a necessary tool for his work as an author, what he calls "the most fundamental tool of human thought," a tool that:

Allows you to see alternative versions on the same screen on parallel windows and mark side by side what the differences are. Not by scanning but by analysis of data structure. Now the system I started designing in the 1960s, allows you, would have allowed you, will allow you to see connections between the contents of different windows, like rubber bands between the middles of the windows [(Nelson, personal interview, 3/17/93)].

For Nelson then, hypertext was first conceived as a literary tool that enables the author of a text to extend his or her text to the multiple and successive versions of it, in order to compare them. It is a fundamental tool because "any piece of writing evolves to the very end of its creation. And the real issue is how can we hold partially organized materials for inter-comparison" [(Nelson, personal interview, 3/17/93)].

At the same time that Ted Nelson coined the term hypertext, Douglas Engelbart was beginning to implement his framework for the Augmentation of Human Intellect at Stanford Research Institute (SRI, in Menlo Park, CA). Although his framework itself did not directly mention hypertext, the core of Douglas Engelbart's vision was based on a very similar premise:

I just almost remember the events, about 1960 or 1961, I was starting looking at this kind of an augmentation system and saying if I really think that there's gonna be drastic qualitative change throughout that, then we can't start a research program which tries to cover everything, so where would be the most leverage? And I started ticking my mind in realizing we do have a totally different medium and that we know that your concepts and your mind don't seem to be just linearly thinkious {sic} core through which you jump, and that you can jump and look at different abstract levels...we've got this extremely flexible way in which computers can represent modules of symbols and can tie them together with any structuring relationship we can conceive of [(Engelbart, personal interview, 12/15/92)].

The introduction of a hypertext-like capability in Engelbart's framework responded, however, to a very different motivation than Nelson's. Engelbart's framework was based on the premise that computers should be able to perform as a powerful auxiliary to human communication and collaboration if they were to manipulate the symbols that human beings manipulate. For such augmentation to take place, a co-evolution of the computer and the human being was necessary--as in the biological notion of symbiotic association, where both entities co-evolve for an ever better fit: The computer should learn to manipulate the human language, and the human being should learn to use the computer. Our analysis [(Bardini, forthcoming)] of Engelbart's vision is that it is based on the assumption that language is more than symbolic representation, better seen as a social construction.

For Nelson, hypertext is a fundamental tool for individual creativity, and for Engelbart, hypertext is a necessary capability of a system designed to improve communication. These two alternatives parallel two different conceptions of the user, seen either as a creative individual or as a member of a community in a human organization.

Association versus Connection


Full understanding of the origins of hypertext technology must go back to the ideas of Vannevar Bush on "association" and of Benjamin Lee Whorf on "connection". Bush's influence on hypertext is now widely acknowledged [(Nyce & Kahn, 1991)]. Scholars of the technology usually consider his 1945 article "As We May Think" as the conceptual origin of the technology and unanimously quote the following lines as the first expression of the seminal idea of hypertext:

The human mind...operates by association. With one item in its grasp, it snaps instantly to the next that is suggested by the association of thoughts, in accordance with some intricate web of trails carried by the cells of the brain [(Bush, 1991, p. 101)].

A deeper study of the work of Douglas Engelbart [(Bardini, 1995)] and Ted Nelson reveals that the emphasis on "association" and Bush's legacy neglects half of the influence on them. Both Engelbart and Nelson acknowledge that they were very familiar with the work of Benjamin Lee Whorf [(Engelbart, personal interview, 12/15/92; Nelson, personal interview, 3/17/93)]. Engelbart quoted Whorf in his 1962 report to the Director of Information Sciences of the U.S. Air Force Office of Scientific Research, the first comprehensive report on his augmentation framework. And Nelson learned about Whorf's theory during the course of his studies in sociology at Harvard.

In a 1927 letter printed in the first edition (1956) of Language, Thought, and Reality, Whorf introduced the concept of the connection of ideas as "quite another thing from the association of ideas." Whereas the latter has "an accidental character," as the subject "jumps at the first idea that comes to [his] mind," the former corresponds to a "controlled association." The difference mirrors the opposition between the two purposes of hypertext technology, and between the representations of the user embedded in it:

"Connection" is important from a linguistic standpoint because it is bound up with the communication of ideas. One of the necessary criteria of a connection is that it be intelligible to others, and therefore the individuality of the subject cannot enter to the extent that it does in free association, while a correspondingly greater part is played by the stock of conceptions common to people [(Whorf, 1927, p. 37)].

If we consider "association" and "connection" as the two opposite poles of a continuum describing hypertext systems, we can represent Engelbart's and Nelson's locations on this axis as shown in Figure 1.



Figure 1.

The comparison of their two positions on this axis allows us to understand the main ways in which they differ. The degree of freedom of the possible associations permitted in the system, ranging from free individual association to controlled connection, describes the level of rules envisioned by the designer of the system, with which the user must comply. The conception of the system thus mirrors the importance of the rules and limits imposed by the designer on the user. Ted Nelson stressed that his views differ from those of Engelbart in structure and hierarchy [(Nelson, 1987)]:

To me hierarchy is a special case. I don't say that hierarchies are always invalid, it's just that because they're so convenient they've been used too much. And they represent many things very badly...So hierarchy is fine where it correctly and appropriately matches up. And forcing it where it doesn't is wrong. So the whole point is create the structures that map correctly whatever you do. And if you're mapping thought or trying to present ideas, the likelihood that they are non-hierarchical is greater [(Nelson, personal interview, 3/17/93)].

On the other hand, Douglas Engelbart stressed the importance of conventions that enable the user to improve the efficiency of his computerized work:

With the view that the symbols one works with are supposed to represent a mapping of one's associated concepts, and further that one's concepts exist in a "network" of relationships as opposed to the essentially linear form of actual printed records, it was decided that the concept-manipulation aids derivable from real-time computer support could be appreciably enhanced by structuring conventions that would make explicit (for both the user and the computer) the various types of network relationships among concepts [(Engelbart & English, 1968)].

There were thus two cultures, two world-views at the origin of hypertext. The first is represented by Ted Nelson and his Xanadu Project, aiming at facilitating individual literary creativity. The second is represented by Douglas Engelbart and his NLS system, a support for group collaboration. The opposition between "association" and "connection" mirrors the opposition between these two projects as two trends for future hypertext systems. Randall H. Trigg [(1983; 1991)] examined the legacy of these two seminal works and characterized further hypertext systems (second-generation systems such as Notecards, Neptune, or Intermedia) as network or outline-based. The network-based systems are the children or the grand-children of Ted Nelson's Xanadu, and the outline-based systems are those of Douglas Engelbart's NLS.

Computing Metaphors


A common ground to all hypertext systems, regardless of their location on the association/connection continuum, is the issue of non-linearity of access to information. For Nelson and Engelbart, such non-linearity comes from the thinking or creative process. A tool that enables a more efficient creative process, whether it is individual or collaborative, should therefore allow a non-linear access and display of information. The difference between the two kinds of hypertext systems is the organization of access to non-linear representations of information. For Engelbart,

No human being can hold very many concepts in his head at one time. If he is dealing with more than a few, he must have some way to store and order these in some external medium, preferably a medium that can provide him with spatial patterns to associate with the ordering, e.g., an ordered list of possible courses of action. Beyond a certain number and complexity of interrelationships, he cannot depend upon spatial-pattern help alone and seeks other more abstract associations and linkages [(Engelbart, 1961, p. 122)].

The question of the design and representation of these "more abstract associations and linkages" is a fundamental question in user-interface design, and has attracted much attention since the end of the 1960s. The most important aspect of the efforts to design an adequate user interface (understood here as adequate patterns of interaction between the user and the computer) has been the introduction of powerful metaphors, as David Smith explained in his Ph.D. dissertation [(1977, pp. 23-24)]:

Images are metaphors for concepts. They provide an alternate reality which is simultaneously concrete in structure and analogic in representation...The visual medium is an extremely useful metaphorical tool not only because it has powerful representational capabilities but also because it has a rich set of topological transformations within its own domain. Two- and higher-dimensional media possess far more versatile structural operations than do one-dimensional media.

The opening of the visual dimension of the computer as a communication medium is often thought to be one of the major contributions of Alan Kay and his team (including David Smith) at Xerox Palo Alto Research Center (PARC) in the 1970s [(Bardini & Horvath, 1995)]. A major contribution of this outstanding set of computer scientists is the "desktop metaphor," which many regard as today's "dominant paradigm of interaction with a personal computer" [(e.g., Oren, 1991)]. First on the Star computer designed at Xerox PARC's System Development Division (SDD), then on the Lisa and the Macintosh at Apple Computer, and eventually on the IBM PC and its clones with Microsoft's Windows, the "desktop" is the most common "alternate reality" that allows personal computer users to visualize the computer environment in which they work. But this tremendous achievement is not without limitations, as its creators realized:

One of the most compelling snares is the use of the term metaphor to describe a correspondence between what the users see on the screen and how they should think about what they are manipulating. My main complaint is that metaphor is a poor metaphor for what needs to be done. At PARC we coined the phrase user illusion to describe what we were about when designing user interface. There are clear connotations to the stage, theatrics and magic--all of which give much stronger hints as to the direction to be followed [(Kay, 1990, p. 199)].

As Tim Oren [(1991)] says, the desktop metaphor was originally designed for systems like the Xerox Star with a few hundred files on 5 to 10 megabytes of storage: "The purely user-directed browsing style of the desktop is approaching its limits of utility, with the number of files on a single user's machine reaching 10,000 and with easy access to even more information across networks." The problem is, to paraphrase Ted Nelson [(personal interview, 3/17/93)], that we are now "trapped by the success" of the desktop metaphor.

As a representation of the working environment, the "desktop" metaphor is limited to two dimensions, by mimicking the physical desktop of the information worker. The main historical development that led to this situation was the progressive realization of the user as the individual owner of a personal stand-alone computing system [(Bardini & Horvath, 1995)]. In the process, the connectivity of the system to similar systems and users was somehow lost, as can be seen in the [non] functionality of AppleTalk. The individual user definitively prevailed over the member of a users' community. Most of today's mainstream commercial computing products are at the "association" end of the continuum rather than at the "connection" end.

Connectivity at the Interface


According to Alan Kay, the correspondence between what the user sees on the screen and what he/she thinks he/she manipulates (in other words between what the user visualizes and his/her internal model of action) is better seen as an illusion than as a metaphor. Designing this illusion is designing the user-interface. As Allen Newell explained:

Metaphor is a particularly disarming way of arriving at truth. It invites the listener to find within the metaphor those aspects that apply, leaving the rest as the false residual, necessary to the essence of metaphor. And since within the metaphor is always, by the linguistic turn...within the listener, the invitation is to find within oneself some truth that fits [(Newell, 1991, p. 160)].

In the illusion conception of the user-interface, the user's quest for the truth of the interaction is directed (as in theater) by the designer of the interface. When a user is creating a document (task) in the desktop environment, he/she manipulates (opens, moves, closes, etc.) an iconic representation of this document that is designed to stand for the document in his/her internal model of what he/she is doing. For most users, moving the icon of a document on the desktop is a quite straightforward action, analogous to moving the "real" document on his/her "real" desktop. Or so says the metaphorical conception of what is happening: It is up to the user to realize this analogy. But if, like Alan Kay, you consider this analogy as an illusion, the role of the designer is to make the user believe that what he/she does when he/she moves the document's representation is analogous to moving the real document.

In her remarkable book, Computer as Theater, Brenda Laurel [(1991, pp. 12-14)] narrates the attempt made to define the user-interface by her and the participants of a seminar at the Atari Company (where she was then working). They rapidly dismissed the simplest model of the interface (Figure 2), represented as a shadowed rectangle between the person (user) and the computer, that "encompasses what appears on the screen, hardware input/output devices, and their drivers."



Figure 2. The Pre-Cognitive-Science View of the Interface.
Source: Laurel (1991)

This over-simplistic model of the interface was dismissed as "pre-cognitive-science":

In order for an interface to work, the person has to have some idea about what the computer expects and can handle, and the computer has to incorporate some information about what the person's goals and behaviors are likely to be. These two phenomena--a person's "mental model" of the computer and the computer's "understanding" of the person--are just as much a part of the interface as its physical and sensory manifestations [(Laurel, 1991, pp. 12-13)].

This new model that encompasses the "mental model" of the computer and the computer's "understanding" of the person, is represented in Figure 3.



Figure 3. The Mental-Models View of the Interface.
Source: Laurel (1991)

They called the main problem that they then encountered with this updated model of the interface the "horrible recursion":

If you are going to admit that what the two parties "think" about each other is part of what is going on, you will have to agree that what the two parties think about what the other is thinking about them must perforce be included in the model...This elaboration has dizzying ramifications [(Laurel, 1991, p. 14)].

Facing this "nightmare," the seminar turned its attention to "more manageable concepts." Since then a manageable concept of the user interface settled down to a simpler one: "How humans and computers interact." Figure 4 represents this simpler model, where "the interface is that which joins human and computer, conforming to the needs of each." Laurel concluded that this viewpoint "avoids the central issue of what this all means in terms of reality and representation," and that "when we have such trouble defining a concept, it usually means that we are barking up the wrong tree."



Figure 4. A Simple Model of the Interface, Circa 1989.
Source: Laurel (1991).

We would much rather think that Brenda Laurel's assumption of "barking up the wrong tree" is a mere translation of a more complex process of "suspension," or more precisely, of Hegelian Aufhebung that "combines the ideas of suspension, with its connotation of temporary cessation; transcendence, which suggests a going beyond; and a kind of preservation" [(Ashmore, 1989, p. 111)]. We can even find a sign of such a process when Brenda Laurel says that "you can demonstrate Zeno's paradox on the user's side of the barrier until you're blue in the face, but it's only when you traverse it that things get real."

Who is that "you"? Our guess is that it applies as well to the user as to the designer of the interface: The problem is to enable both of them to act within a representation [(Laurel, 1991, p. 21)]. The problem in defining the user-interface arose in Laurel's narrative with the cognitive science assumption that representations are part of an interface: The user's representation of the computer, on one hand, and the "computer's understanding of the person," on the other hand. What do you think that the computer understands of the person? Not much in itself, unless "you" anthropomorphize the computer [(Laurel, 1990b)]. A lot more, that is, if you understand that the computer represents the designer. In other words, the computer might get to learn about the user from the representation of the user that the designer of the interface embodies in his/her design.

Our point here is that following Laurel in her attempt to carry on in spite of the fear of the infinite regress, we realize that the interface is the representational space where user and designer meet. Her project to introduce and implement a dramatic theory of the human-computer interaction proposes to enable them to act and communicate within the interface. The infinite regress of the mutual representation of each other becomes the convergence process common to the communication process [(Rogers & Kincaid, 1981)]. The anthropomorphization of the computer translates the user's desire for the computer's responsiveness and capacity to perform action [(Laurel, 1990b, p. 358)], two major anthropomorphic qualities of the interface implemented by the designer.

Agents, Actants, and Narrative


Brenda Laurel [(1990b, p. 358)] realized that these two qualities, responsiveness and capacity to perform action, in fact "comprise the metaphor of agency," and in her defense of anthropomorphism, she stressed later [(1991, p. 143)] that this does not argue for the personification of the computer, but for its invisibility. We now argue that this invisibility must be the result of a negotiation between user and designer on the competence of the interface agents. The first step in this direction is to realize that human agents in the interface, like characters in a play, cannot be separated from the plot itself [2]:

The decisive step in the direction of a narrative conception of personal identity is taken when one passes from the action to the character. A character is one who performs the action in the narrative. The category of character is therefore a narrative category as well, and its role in the narrative involves the same narrative understanding as the plot itself. The question is then to determine what the narrative category of character contributes to the discussion of personal identity. The thesis supported here will be that the identity of the character is comprehensible through the transfer to the character of the operation of emplotment, first applied to the action recounted; characters, we will say, are themselves plots [(Ricoeur, 1992, p. 143)].

We propose that characters such as desktop icons or interface agents in general similarly define, and are defined by, the theatrical frame of the interface as a whole. The efficacy of the computer interface as actant depends on developing convincing "characters" in the "narrative" of the user-interface. If their negotiation is successful, user and designer reach a consensus on the competence of the agent to perform a task (an action), and the medium (the computer) disappears in the process: User and designer agree on the "truth" of the representation embodied in the agent, and, in consequence, his/her/its action appears as "real." The object of the negotiation is the plot (or the narrative) itself, and in the present case, the alternative representations of the user and the designer of the task to be performed.

As human figures (characters), interface agents are most likely to enable consensus-reaching through a process of identification. Conversely, the call for actants (defined as entities capable of action, but not necessarily human) is a radical step toward an extension of the process to allocate representational truth to any entity present on the interface. At the analytical level, both agents and actants are valid representational devices.[3] Even if it is possible to imagine a play without characters (in a sort of "nouveau roman" fashion), it is so only if the actants are taken for granted from the beginning of the play: To do so one would have to neglect the question "Who created them?"

At the interface, we distinguish two types of participants: (1) human-like agents, and (2) actants. What makes the agents especially interesting is that their social production enables us to consider them as a specific kind of actant in the interface. Being the result of a consensus between designer and user, the interface agent combines the two orthogonal dimensions of a representation (delegation and inscription). Delegation is the process by which the agent is granted the right to represent action in the interface, and inscription is the process that enables the agent to perform this action. These two processes conjointly define the competence of the agent as the embodiment of the consensus reached by users and designers (Figure 5).



Figure 5. The Competence of the Agents as the Result
of the Dual Process of Delegation and Inscription

This joint construction of the competence of the agent has important consequences for the nature of the agency of the interface agent, in relationship to the human actors who negotiated his/her competence. The problem of the anthropomorphization of the computer and the negotiation of the competence of human-like interface agents should provide the ground for a sociology of computing that respects the uncertainty of this phenomenon, as Woolgar and Grint remind us:

The sociology of computing already assumes that the object is of a different order than those entities that enjoy social relationships. The computer is assumed to be an object, not a human, and is thus allowed a different set of actions, behavior, and so on. This premise, this presumed moral order, is built into a sociological approach that construes perspectives as formulas applicable to entities more or less indifferently to problems of their description and constitution. The root difficulty is that the essentially uncertain character of the object (the computer) is ignored. But how and in what circumstances actants like computers assume characteristics different from humans is part of the problem to be addressed [(Woolgar & Grint, 1991, p. 376)].

The French philosopher Michel Serres gave us the key to construct such a sociology of computing. We have described the joint construction of interface agents as the result and the communication process of a dialog between user and designer on the ways to perform a specific task in the computing environment. We also said, following Brenda Laurel, that if this dialog is successful, the computer (the medium) disappears in the process. By referring to a renovated conception of the dialogue, Michel Serres enabled us to shed new light on this process:

Following scientific tradition, let us call noise the set of these phenomena of interference that become obstacles to communication...This set of phenomena has appeared so important to certain theoreticians of language that they have not hesitated to transform our current conception of dialogue in reference to it: Such communication is a sort of game played by two interlocutors considered as united against the phenomena of interference and confusion, or against individuals with some stake in interrupting communication. These interlocutors are in no way opposed, as in the traditional conception of the dialectic game; on the contrary, they are on the same side, tied together by a mutual interest; They battle together against noise...They exchange roles sufficiently often for us to view them as struggling together against a common enemy. To hold a dialogue is to suppose a third man and to seek to exclude him; a successful communication is the exclusion of the third man [(Serres, 1982, pp. 66-67, emphasis in the original)].

If we follow Serres, we understand that it is not the medium that disappears in the successful negotiation of the competence of the interface agents, but the "third man," the source of noise. This also means that the successful process of negotiation of the competence of the interface agent at the same time constitutes him/her as a human-like actant and excludes him/her as a human actor. In the case of a successful negotiation between user and designer, the interface agent is under control: he/she will not create noise but will docilely perform what he/she is expected to. He/she will become the perfectly servile actant, and in the process will be robbed of his/her humanity.[4] Woolgar's and Grint's uncertainty lies exactly here: The interface agent is neither human nor non-human by definition, but it is the process of negotiation of his/her competence that gives or denies this unique quality, his/her humanity. And a valid sociology of computing should provide a way to describe this process.

By ascribing the processes of delegation and inscription to the user and the designer respectively, we now realize that we were considering a radical case. In fact, it is likely that both user and designer delegate and inscribe agents. This exchange of the roles is itself a result of the communication process, which, according to Serres, is meant to be the exclusion of the "third man." In this sense, the degree of interactivity of the interface can be seen as the relative opportunity for both user and designer to take part in the two dimensions of the representation process. The joint construction of the plot is the consensus reached on a set of agents and actants whose competences are negotiated between user and designer. In setting the negotiation at the level of the entire set of agents and actants, we consider here the representativity of the interface as a whole, that is, as a narrative socially constructed between user and designer. If we return to our previous opposition between metaphor and illusion, we now see how absurd it can be to try to ascribe the leading role exclusively to the designer (as the director of the illusion) or to the user (as the one who realizes the metaphor). In most cases, we now realize that both are actively involved in the process.

Navigation and the Space of the Interface


We first characterized the interface as the space in which user and designer meet. Then, we focused on the modalities of their meeting, and described it as the negotiation leading to the construction of the entities capable of action (agents, actants) that inhabit this space. Now, we direct our attention to the "physical" properties of the space of the interface. By "physical" we mean here those properties of the interface that refer to the body and the perceptions of the person using or designing the system.

So far, what we have called "user" and "designer" are abstract macro-actors, disembodied characters in our narrative, pure essences. The central tension between the two dimensions of their being, biologically concrete individual or abstract member of a social community, has been somehow evacuated from our narrative, through the artifice of directing our discourse to their action. It is now time to consider how this action is possible. However multiple and collective they may be in action through time, user and designer are, beyond any possible doubt, flesh and blood at any given moment, embodied in organic individual sensorimotor systems that we call their bodies. It all starts with perceptions:

Basically, embodied (sensorimotor) structures are the substance of experience, and experiential structures "motivate" conceptual understanding and rational thought. As I have emphasized, perception and action are embodied in self-organizing sensorimotor processes; it follows, then, that cognitive structures emerge from recurrent patterns of sensorimotor activity. In either case, it is not, as Lakoff would have it, that experience strictly determines conceptual structures and modes of thought; it is, rather, that experience both makes possible and constrains conceptual understanding across the multitude of cognitive domains [(Varela, 1992, 335)].

The historical sketch of the evolution of the models of the user-interface that we proposed previously must then be paralleled by a historical sketch of the ways that the sensorimotor system of the user interacts with the computer through time [(see Walker, 1990, for such an attempt)]. We examine the turning point that led Douglas Engelbart and his colleagues at SRI to introduce the body of the person into his framework for the augmentation of human intellect. For Engelbart, the augmentation of the human intellect should start with a systematic analysis of the potential candidates for change, beginning with what he refers to as "the basic human capabilities" (Figure 6).



Figure 6. Engelbart's Simplified Model of the Basic Human Capabilities.
Source: Engelbart (1988, p. 215).

For Engelbart, the sensorimotor system ("the body") is at the interface between the "mental part" of the human being and the "outside world." The deliberate decision to "begin with the basics" led Engelbart and his group to develop a series of artifacts that mirrors the importance of the body on the computer side of the interface. We refer here to the display system (the eye), the mouse, and the keyset (the hands). Prior to the work of Engelbart's ARC group, the physical interaction between human and computer was mostly limited to typing. In many ways, the keyboard and the teletype display available in the early 1960s were a mere extension of the punch card as a communication medium between human and computer. The communication was based on the manipulation of symbols, first numeric, and then alphanumeric. The body of the user entered the picture only as a medium to transfer his/her symbolic manipulations, and the connection of the body to the rest of the world was merely incidental. Hands and eyes were extensions of the tool system, as input and output devices.

In developing the mouse and the chord keyset in 1964, Engelbart and his ARC group at SRI made a quantum leap in human-computer interaction: The introduction of the body as a whole, as a set of connected basic sensorimotor capabilities. The experiments that the group conducted were not limited to the hands and the eye, but involved many other parts of the body (the knee, the back, the head) as potential sensorimotor ways to control a pointer on the screen. The liberation of the left hand from the typing process, made possible by the invention of a chord keyset (one-handed typing), allowed a direct connection between the eye (perception) and the hand (motor). [5]

Among the various devices developed by Engelbart and his group at SRI, the most famous is the mouse. But a little-remarked aspect of the invention of the mouse is crucial to our thesis, as well as, we may claim, to its fate as the most used pointing device in modern user-interfaces. We refer here to the origin of the idea of the mouse as revealed by Douglas Engelbart:

I remember thinking, "Oh, how would you control a cursor in different ways?" I remember how my head went back to a device called a planimeter that engineering uses. It's a little simple mechanical thing that has a bent arm and the elbow of the arm has a little disc that rides on it and this little disc is out here and you start out following some closed path and when you are all done, you can read the two discs and do something about it and calculate the actual area that's included inside [(SOHP, I3, 3/4/1987)].

What Engelbart does not say here, however, is that the planimeter was re-invented by Vannevar Bush in 1913 as his Master's thesis project at Tufts College. He called the device that he invented (by using the principle of the planimeter) a Profile Tracer, "an arrangement of gears, shafts, and servo-driven pens which translated mechanical motion into graphical mathematics" [(Owens, 1991, p. 29)]. [6] As the conceptual grandchild of the planimeter, the mouse also translates motion (that of the holder's arm) into graphical mathematics. It therefore not only allows the user to point at any object on the screen, but also introduces a direct connection between the topographical space of the interface and the human gesture of the user. By extension, the invention of the mouse opens space for any translation of human motion into the electronic space of the computer interface. This point is fundamental in that it allows us to set aside definitively the notion of cognition as purely intellectual representation, and to introduce instead "embodied action" in the computer space:

I have argued that perception does not consist of the recovery of a pregiven world, but rather from the perceptual guidance of action in a world that is inseparable from our sensorimotor capacities. Cognitive structures emerge from recurrent patterns of perceptually guided action. I can summarize, then, by saying that cognition consists not of representations but of embodied action. Correlatively, the world we know is not pregiven; it is, rather, enacted through our history of structural coupling [(Varela, 1992, 336)].

Engagement and the Holodeck


If you have ever watched Star Trek: The Next Generation, you know what the holodeck is. The holodeck is a computerized room and the room is the interface. First, a character launches an application from outside the room. When he enters the room, he is projected into the four-dimensional environment he selected. Most characters use the holodeck for their leisure, and the most common environment is Conan Doyle's Sherlock Holmes adventures.

In the episode entitled "Ship in a Bottle," Dr. Moriarty manages to control the computer from inside the fictitious space of the holodeck and literally comes alive, as alive as the "real" characters of the series. So alive, in fact, that he threatens to rule the ship. The ultimate solution found by Captain Picard and his crew is to recreate a simulation of the ship environment itself and lure Moriarty into believing that it is the real ship.

The whole point of this episode is that Picard's solution is to create an illusion within the holodeck illusion, to give Moriarty the feeling that he is going through the mirror. To us spectators, the whole plot works because once we accept that it could be real in the first place, we cannot tell the difference between the "real ship" and its holodeck recreation.

Brenda Laurel [(1991, p. 113)] calls this decision that it "could be real" engagement, a notion similar to the theatrical "willing suspension of disbelief" that she attributes to the early nineteenth-century critic and poet Samuel Taylor Coleridge [7]:

Coleridge believed that any idiot could see that a play on a stage was not real life...Coleridge noticed that, in order to enjoy a play, we must temporarily suspend (or attenuate) our knowledge that it is "pretend." We do this "willingly" in order to experience other emotional responses as a result of viewing the action. When the heroine is threatened, we feel a kind of fear for and with her that is recognizable as fear but different from the fear we would feel if we were tied to the railroad track ourselves. Pretending that the action is real affords us the thrill of fear; knowing that the action is pretend saves us from the pain of fear. Furthermore, our fear is flavored by the delicious expectation that the young lady will be saved in a heroic manner--an emotional response that derives from knowledge about the form of melodrama [(Laurel, 1991, p. 113, emphasis in the original)].

Why should we temporarily suspend our disbelief? In the case of theater or fiction, Coleridge argued that it is for the sake of experiencing other emotional responses, in Aristotelian terms catharsis, the pleasurable release of emotion. After Coleridge, [Laurel (1991, pp. 120-122)] argued that the same process should occur in interacting with a computer, where catharsis stems from achieving a given task. If we push the analogy further, we can infer that the user's reason for doing so is his/her trust that the computer will indeed enable him/her to achieve the task (just as the play will enable us to experience other emotions).

When a play fails, when we do not feel the release of emotion associated with the characters' experiences, we blame the director or the actors: They were not "real enough," we do not buy it. When the interface fails, most of us blame ourselves: I don't get it, I must be stupid. Donald Norman [(1988)] studied the problem in detail and ended up with two important definitions to analyze it: The gulfs of execution and evaluation. These gulfs reflect the distance between the mental representations of the user and the physical components of the system. The gulf of execution is the difference between the intention of the user and the allowable actions in the system he/she uses. The gulf of evaluation "reflects the amount of effort that the person must exert to interpret the physical state of the system and to determine how well the expectations and intentions have been met."

Bruno Latour [(1992, n. 14)] noticed that Norman's book is "an excellent introduction to the study of the tense relations between inscribed and real users", but that it "never considers the shaping of the artifact by the engineer [sic] themselves." Following Latour, we argue here for a symmetric account of the gulfs, where the problem lies in the reciprocal gulfs between the user's and the designer's mental models of each other. If we agree on the idea of the interface as a socially constructed narrative involving both user and designer at any given time, we must then call for a mutual engagement of both user and designer in the process. In other words, there is also an inscribed designer in the system, and this inscribed designer should be able to act within the representational space. Now, if we look at cognition as embodied action, we also have to realize that the design and use of an interface require the embodied action of both of its main protagonists, the user and the designer.

Cyberspace and Simstim


The word cyberspace was coined by science fiction writer William Gibson in his book Neuromancer [(1984, p. 5)]:

Case was twenty-four. At twenty-two, he'd been a cowboy, a rustler, one of the best in the Sprawl...He'd operated on an almost permanent adrenaline high, a byproduct of youth and proficiency, jacked into a custom cyberspace deck that projected his disembodied consciousness into the consensual hallucination that was the matrix.

In the same book, Gibson also introduced Simstim, an interactive simulation system that allows the user to experience (to "flip in") somebody else's perception as if he were in his or her body:

Cowboys didn't get into simstim, he thought, because it was basically a meat toy. He knew that the trodes he used and the little plastic tiara dangling from a simstim deck were basically the same, and the cyberspace matrix was actually a drastic simplification of the human sensorium, at least in terms of presentation, but simstim itself struck him as a gratuitous multiplication of flesh input (p.55).

Cyberspace and Simstim are the two polar opposite representations that inform our imagination about virtual reality: Disembodied but highly interactive ("a consensual hallucination") and/or vivid but passive ("the passenger behind the eyes"). The coupling of both of these representations makes possible what Brenda Laurel [(1991, p. 188)] considers as the confluence of the three enactment capabilities of virtual reality: Sensory immersion, remote presence, and tele-operation. Simstim is sensory immersion, cyberspace is the remote presence of the consciousness, and tele-operation is the resulting action, action at a distance, action of a body that is not exactly the body of the user anymore, available only in cyberspace.

But if cyberspace was coined by Gibson, it was anticipated--and in a limited way, achieved--by Engelbart, whose hypertext vision was of a multi-dimensional data-space in which users can "fly" [(Engelbart, personal interview, 12/15/1992)]. For Engelbart the "workstation is the portal into a person's 'augmented knowledge workshop'--the [virtual] place in which he [sic] finds the data and tools with which he does his knowledge work, and through which he collaborates with similarly equipped workers" [(Engelbart, 1988, p. 187, emphasis added)]. From the beginning, the sensorimotor aspects of this "portal" were emphasized, leading to the development of the mouse and the principle that workstations need "three-dimensional color display[s]" within which symbolic representations can be directly manipulated [(Engelbart, 1962, p. 14)].

Ted Nelson also insists on the convergence of movie-making and software or user-interface design [(Nelson, personal interview, 3/17/93)]. He used the term "virtuality" to refer to their common ground:

I believe that movies and computer screen are both best understood in still larger terms. It is for this I propose the term virtuality. The virtuality of a thing is what it seems to be, rather than its reality, the technical or physical underpinning on which it rests. Virtuality has two aspects: conceptual structure--the ideas of the thing--and feel--its qualitative and sensory particular [(Nelson, 1990, p. 239)].

We propose here a categorization of the various media technologies described in this paper, on the basis of their relative degree of interactivity, or, in other words, of the relative enactment capabilities they provide (Table 1). Broadly, some media are for "being there"; observing the (real or contrived) actions of others with various degrees of the characteristics we have attributed to media and user-interfaces. Theater is the oldest user-interface. Available only near the physical stage of the action and (usually) non-interactive, theater is the fundamental "spectative" technology that has served as a model for the other real-time spectative media forms of film, television, and even radio [(McLuhan, 1964)]. [8]


Table 1. Classification of Technologies for "Being There," "Doing," and "Doing There."

Enactment Capabilities
Technology        Sensory Immersion   Local Tele-Operation   Remote Presence   Remote Tele-Operation
1. Spectative
    Theatre       *                   .                      .                 .
    Simstim       *                   *                      .                 .
2. Associative
    Desktop       .                   *                      .                 .
    HyperCard     .                   *                      .                 .
    Holodeck      *                   *                      .                 .
3. Connective
    Hypertext     .                   *                      .                 *
    Cyberspace    .                   *                      *                 *

Computer user-interfaces are operative in that the user's role in the dialogue is active and instrumental. We propose two dimensions of the above-introduced concept of "tele-operation." "Tele-" means distance, but the distance need not be spatial. Local tele-operation is operation separated by one or more levels of abstraction, or representation, from what the user and designer mutually understand as the "real" or "physical" operation. Moving an icon on a desktop, following a link from a document to a video clip in HyperCard, interacting with Moriarty in the Star Trek Holodeck--these are operations at one remove from the manipulations of memory, disks, and files that the programmer and user "know" underlie the symbolic representations of the interface. But such systems are confined to the local arena in a spatial sense, and largely to the individual user. They are associative, not connective.

Connective technologies are operative not only across conceptual but also physical distance. Normally they are interactive in that other actors, working with similar interfaces, can join the dialogue with the individual actor and his/her interface. Engelbart's conception of Hypertext as implemented in his NLS is essentially a spatially distributed system for what has now come to be known as computer-supported cooperative work (or play). Here, Gibson's notion of cyberspace adds the telepresence and sensorimotor dimensions to connective interface technology. Our categorization is convenient for separating the streams of development in both computer user-interfaces and theater, which we agree with Laurel [(1991)] are closely allied in concept and method.

Conclusion


In the endless dialog between the designer and the user of a specific human-computer interaction, one of the two participants is almost irremediably virtual when the other is present. At the origin, at the very moment of the invention, the designer talks to his alter-ego, the user, through the representative and constitutive process of his own creative mirror image. This first phase of the life-cycle of a technology was the "era of the reflexive user" [(Bardini and Horvath, 1995)]. The process here at stake is the process of inscription, by which the designer of the system inscribes the user (for a related analysis, see [(Woolgar, 1991)]). At the end, after the widespread diffusion of the technology, the user usually has no way to escape the endless dialog with his creator, forever hidden behind his interface: The mirror is stained.

But this metaphor feels especially wrong to most users in its asymmetry: Why can't we, users, inscribe the designer? The designer's mental model of the user does not evolve with the user's use of the system he/she designed, because the user is constrained in a passive role regarding the design of the system, even if the system is supposed to be interactive. A call for symmetry is a call for a reflexive designer. When using a system, the user should be able to change it, to re-design it. To call for an extended virtuality is to call for an alternate reality where roles can be exchanged.

The only interesting virtuality to us is the virtuality of communication, of potential mutual understanding and collaboration. Most designers are now in the business of designing a product. Ideally, they should think of themselves as providing a service, engaging in a communicative act whose purpose is to help the client in doing something. In the joint design of the interface agents and actants, future interfaces should involve the embodied action of users and designers alike. Design should be open-ended and subject to transformations resulting from the interaction of designers and users. Just as hypertext modifies the relationships between author and reader of the text and blurs the distinction between them [(Landow, 1992; Tuman, 1992)], the implementation of hypermedia interfaces in virtual space-like designs ought to allow a real connection of users and designers. This would be the ultimate meaning of interactivity.

In [1942], Benjamin Lee Whorf wrote the following lines:

A noumenal world--a world of hyperspace, of higher dimensions--awaits discovery by all the sciences, which it will unite and unify, awaits discovery under its first aspect of a realm of PATTERNED RELATIONS, inconceivably manifold and yet bearing a recognizable affinity to the rich and systematic organization of LANGUAGE, including au fond mathematics and music, which are ultimately the same kindred language. The idea is older than Plato, and at the same time as new as our most revolutionary thinkers...All that I have to say on the subject that may be new is of the PREMONITION IN LANGUAGE of the unknown, vaster world--that world of which the physical is but a surface or skin, and yet which we ARE IN, and BELONG TO.

More than 50 years later, we are just beginning to experience such a premonition. But more than the widespread availability of easy-to-use programming languages to users, this vision still requires one more shift in our ways of thinking about the computer as a medium. We have argued here that this shift is best characterized as the acknowledgment of the necessity of an open dialog between users and designers of the technology, based on a mutual engagement. If the computer-as-a-medium is destined to disappear to leave room for the messages it carries, it will only happen once we all realize that we engage in a communicative act each time we strike a key, move a mouse, and tomorrow maybe, enter the interface.

Footnotes


[1] The present work was funded by Mitsubishi International Corporation. Several other papers [(Bardini and Horvath, 1995; Bardini, 1993 and Bardini, 1996)] and a forthcoming book [(Bardini, forthcoming)] report our research project on "the social construction of personal computing." The present paper draws on two series of personal interviews carried out in December, 1992 and March, 1993 in Silicon Valley. We thank our respondents in these interviews.

[2] Numerous examples of the interconnection of character and narrative can be found in the literature. For example, Stoppard's [(1967)] Rosencrantz & Guildenstern are Dead, by shifting the focus of Shakespeare's Hamlet to two "minor" characters, recasts the entire narrative such that the original play can never be approached in the same way by an audience familiar with Stoppard's interpretation.

[3] This point is made by actor-network theory as developed by [Callon (1991)] and [Latour (1992)] at the Ecole des Mines in Paris. Their analysis of a set of negotiations describes the progressive constitution of a network in which both human and non-human actors assume identities according to prevailing strategies of interaction. Actors' identities and qualities are defined during negotiations between representatives of human and non-human actants. In this perspective, "representation" is understood in its political dimension, as a process of delegation. The most important of these negotiations is "translation," a multifaceted interaction in which actors (1) construct common definitions and meanings, (2) define representativities, and (3) co-opt each other in the pursuit of individual and collective objectives. In actor-network theory, both actors and actants share the scene in the reconstruction of the network of interactions leading to the stabilization of the system. But the crucial difference between them is that only actors are able to put actants in circulation in the system.

[4] Brenda Laurel is well aware of this point when she makes it a principle of good interface design to "think of agents as characters, not people" [(Laurel, 1991, p. 145)]: "All agents must be represented in such a way that the appropriate traits are apparent and the associated styles and behaviors can be successfully employed to establish probability and causality. Too much "noise" in the system (that is too much complexity in character--as would probably result from an "accurate" model of human personality) makes probability and causality harder to deploy in the formulation of action."

[5] The visions of Engelbart and the ARC group may have been influenced by McLuhan's [(1964)] view of technologies as direct extensions of sensorimotor organs, both in their function and psychological impact. Engelbart was certainly aware of McLuhan's writings [(Engelbart, personal interview, 12/15/1992)].

[6] The Profile Tracer was intended to mechanize the tedious work of tracing a topographical profile. It was an instrument box slung between bicycle wheels, continuously recording the elevation along a path for which one desired the profile [(Owens, 1991)].

[7] Coleridge also happens to be (do you still believe in coincidences?) the author of "Kubla Khan," a hypertextual poem that gave Nelson's Project Xanadu its name.

[8] The latest (speculative) elaboration of these media is Gibson's Simstim, a way to tap directly into the sensorimotor inputs of actors (here meant in both agent and performance senses).


References



About the Author


Thierry Bardini is currently Assistant Professor at the Department of Communication, Université de Montréal, where he co-directs the Research Laboratory in Multimedia Communication. He joined the department in 1993 after two years as a visiting scholar at the Annenberg School for Communication, University of Southern California. His research program, entitled "The Social Construction of the New Media User", is devoted to the social history of the new communication technologies, with an emphasis on the history of personal computing and the sociology of artistic development and uses of the new media. Author of approximately a dozen papers or book chapters in Science, Technology and Human Values, Knowledge and Policy, Technologies de l'Information et Société, Réseaux, and other journals, Bardini is currently working on the manuscript of his first book, The Personal Interface, forthcoming in 1998 from Stanford University Press.
Address: Departement de Communication, Université de Montréal, C.P. 6128, Succursale Centre-ville, Montréal (Québec) H3C 3J7 CANADA.