Back to Vol. 2, No. 3 Table of Contents

A MOO-Based Virtual Training Environment

Michael Mateas
School of Computer Science
Carnegie Mellon University

Scott Lewis
Intel Architecture Laboratories
Intel Corporation




Abstract

We have implemented a virtual environment to support the training of engineers in Panels of Experts (POE), a vehicle for gathering customer data. The environment, which is implemented using multi-user domain (MUD) technology, simulates a hotel conference facility, the context in which POEs generally take place. Within the environment, simulated customer data gathering activities support training through practice. We describe the environment, discuss some issues of communication and interaction raised by the technology, and relay the experiences of new users within this environment.


Introduction

The User Interface Research Group at Tektronix has developed and successfully deployed an interface design methodology called Customer Centered Design (CCD) (Grossman, Lynch, & Stempski, 1992; Knox, Bailey, & Lynch, 1989; Palmiter, Lynch, Lewis, & Stempski, 1994). This methodology involves collecting information regarding the projected customer's tasks and domain. The primary data collection vehicle is the Panel of Experts (POE). It is the responsibility of the UI Research Group and the CCD Group (in one of the divisions) to train engineers in POEs and the rest of the CCD methodology. This paper describes the difficulties of providing POE training to non-user-interface specialists, presents the features of a virtual learning environment that addresses some of these difficulties, and describes an implementation of this learning environment using Multi-User Domain (MUD) technology.

The Problem: Training Engineers in Customer Data Gathering Techniques

POE Description

A POE takes place in a hotel conference facility. Nine to twelve potential customers gather with Tektronix engineers for a half day to a day. In a typical POE, each customer participates in five structured data gathering activities: questionnaire, brainstorming, process maps (PM), directed dialogs (DD), and Feature/Function Tradeoff (FF).

In the Process Map technique, a customer uses sticky notes and pens to draw a picture of a recent work experience on a large sheet of paper. A Tektronix engineer asks the customer questions while they draw their picture. In the Directed Dialog technique, a customer is asked to perform a series of tasks using a simulated product. The Directed Dialog captures task information while simultaneously testing early user interface concepts. In the Feature/Function Tradeoff, the customer is asked to design an ideal product for their work situation within a constrained budget. This forces the customer to make cost-benefit tradeoffs regarding product features.

Training Difficulties

Number of Experts. The number of POE experts within Tektronix is limited. Questions regarding POEs must funnel through these people. There is currently no mechanism to ask questions of these people as a group or for engineers to leverage off each other's experience. An internal mailing list or news group could address this problem by providing asynchronous communication. We would prefer a solution, however, that also supports serendipitous synchronous group communication in the informal style of a hallway conversation. What is needed is a place where people can drop by to discuss POEs with experts and other interested parties.

Expertise is Heavily Dependent on Practice. Expertise in many of the POE activities is a matter of practice. In particular, successful Process Map and Directed Dialog sessions depend on the facilitator's skill in asking the right question at the right time. This skill is not conveyed well by static artifacts such as class notes. Current POE training consists of a one-day class on CCD techniques and a half-day practice POE where Tektronix personnel play the role of customers. This half-day practice POE is the only opportunity engineers have to practice the techniques in the dynamic setting of a POE before performing the POE with real customers.

Engineers are Geographically Distributed. Tektronix's engineers are geographically distributed across four sites. All of the CCD experts reside at one of these four sites. If one or more of the experts are consulting with an engineering group at a remote site, they must make at least two trips: one for the one-day seminar and one for the practice POE.

The Solution: A Virtual Training Environment

This section provides an implementation independent description of a training environment that addresses the training difficulties above. The third section will describe a specific implementation of a training environment having these features.

To meet the training difficulties outlined above, we designed a virtual training environment. We chose the environment approach to computer-mediated training over more traditional computer-aided instruction or intelligent tutoring system approaches because of our desire to support an experiential approach to learning (Soloway, 1993). The experiential approach emphasizes learning by doing and learning through collaboration. This is a good fit for POE training, given that acquiring skill in the data gathering activities is so dependent on practice, and that POEs are inherently collaborative.

Features of the Environment

Awareness of Others. The environment provides telepresent awareness of other participants. Students and instructors are aware of each other's activities and can communicate synchronously. This allows students to engage collaboratively in learning activities.

Hotel Conference Facilities. The environment models the context in which POEs occur: a hotel conference facility. Besides providing the context for other learning activities, the virtual conference facilities are the explicit mechanism for teaching engineers the spatial and temporal structure of a POE. Data gathering activities are spread across several meeting rooms and suites. Every hour, customers and engineers change rooms, as participants move from activity to activity. By virtualizing this environment, groups of engineers have a safe environment within which to experience common mistakes, such as forgetting a Directed Dialog script or forgetting to label a tape.

Simulated Activities. The environment contains agents representing customers and engineers. These agents engage in dynamic simulations of data gathering activities. Students within the environment can interact with these agents in two modes. In the first mode, they can watch two agents, one playing the role of a Tektronix engineer and the other the role of a customer, engage in an activity such as Directed Dialog or Process Map. The simulation can be started, stopped, and cued to specific points. In this mode, students can discuss the techniques employed by the simulated engineer, an expert can provide real time commentary regarding the simulated engineer's technique, or a simulated expert can provide canned commentary. In the second mode, a student can take the role of the Tektronix engineer and engage the simulated customer. At each step in the interaction, a student is presented with a set of choices. Each choice causes the simulated customer to react differently. Other students can watch and ask questions or provide suggestions, an expert can provide coaching, or a simulated expert can provide suggestions and comments.

Access To Information Sources. The environment facilitates asynchronous access to information. A bulletin board allows students to post questions to be answered by an expert or another student. A mail system and online documents are also available from within the environment. These documents contain POE training materials as well as Directed Dialog scripts, weighted sets of features and functions and questionnaires developed for previous POEs.

Benefits

The environment provides a place to meet with experts and other interested parties to discuss POEs, allows engineers to practice POEs, and supports remote training. The telepresence technology within the environment allows distributed parties to have synchronous or asynchronous interactions.

Communication with Experts. Student interactions with experts can take the form of lecture, coaching, informal synchronous, and informal asynchronous interaction. In the lecture model, a POE class is scheduled for a specific time. Students and experts enter the environment at that time. An expert gives a guided tour of the environment, presenting supplementary materials (such as slides) on students' screens. The class has a definite agenda and time limit. The coaching style of interaction may occur within a class context or on its own. Here the expert provides comments and suggestions while students interact with the environment. Informal interaction occurs when a student enters the environment at an unplanned time and finds an expert present. Finally, asynchronous interaction occurs when a student leaves a note on the bulletin board or sends email. An expert can respond to these messages at a later time.

Collaborative Learning. Students can engage in collaborative learning without the presence of an expert by practicing the simulated activities and discussing or commenting on each other's performance. The environment also serves as a meeting place for informal discussions of problems and experiences. The bulletin board provides a mechanism for asynchronous student conversations.

Individual Learning. Students can engage in individual learning within the environment. Simulated experts give tours of the conference facility and provide commentary on student performance of the data gathering activities. Because of telepresence facilities, other students or experts entering the environment are aware of students engaged in individual activities. This facilitates engaging in informal discussion.

The Technology: LambdaMOO

Multi-User Dungeons

Our virtual learning environment is built using a virtual environment tool called LambdaMOO. LambdaMOO is an object-oriented Multi-User Domain (MUD; MOO stands for MUD, Object Oriented) developed by Pavel Curtis at Xerox PARC (Curtis, 1992). A MUD is a text-based virtual environment reminiscent of text-based adventure games. Upon entering a MUD, one sees a description of the starting room.

The Starting Place

You are outside the Overlook Hotel. To the north is the entrance to the hotel.

A player (participants in a MOO are referred to as players) can navigate through the world and manipulate objects by typing commands.

north

The Overlook Hotel Lobby

A typical hotel lobby, with ornate but inexpensive decoration. You see a stairway to the west, conference rooms to the northwest and northeast, a check-in counter straight ahead, a hallway to the east and a restaurant to the southeast.

You see a newspaper, Check-In Desk, and Concierge here.

take newspaper

You take newspaper.

drop newspaper

You drop newspaper.

What differentiates MUDs from text-based adventures is their support for multi-user telepresence and user modification of the environment.

Multi-user Telepresence. Players are aware of other players in the same room (locations within a MUD are referred to as rooms). Every player sees descriptions of other players' activities within the room. In addition, players can communicate with each other. The primary facilities for communication are say and emote. Say is the vehicle for verbal communication. When a player says something, other players in the room see a description of what was said.

say Do you know the way to San Jose?

You say, "Do you know the way to San Jose?"

Emote is the vehicle for non-verbal communication. When a player emotes something, other players in the same room see a description of the player performing the emote action.

emote shrugs his shoulders

michamat shrugs his shoulders
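
The say and emote behavior shown in the transcripts above can be sketched in a few lines. This is an illustrative Python model, not LambdaMOO code: the Player and Room classes, the inbox list, and the exact wording other players see for say are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Player:
    name: str
    inbox: list = field(default_factory=list)  # lines this player has seen

    def tell(self, line: str) -> None:
        self.inbox.append(line)


@dataclass
class Room:
    name: str
    players: list = field(default_factory=list)

    def say(self, speaker: Player, text: str) -> None:
        # The speaker sees a second-person echo; everyone else sees the name.
        for p in self.players:
            if p is speaker:
                p.tell(f'You say, "{text}"')
            else:
                p.tell(f'{speaker.name} says, "{text}"')

    def emote(self, actor: Player, action: str) -> None:
        # Assumption: an emote is rendered identically for everyone present.
        for p in self.players:
            p.tell(f"{actor.name} {action}")
```

Every player in the room receives a line for each say or emote, which is what gives each participant an awareness of the others' activity.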

Construction. Every MUD provides a programming language that allows players to endow their objects and locations with arbitrary dynamic properties. Over time, a world evolves as players add more objects and locations. A world may, depending on its purpose, limit construction privileges to a subset of players.

Uses for MUDs

Astro-VR. Curtis and Nichols are using LambdaMOO to implement an environment supporting astronomy research (Curtis, 1993). Astronomers entering the environment can have informal conversations, listen to lectures presented by colleagues, and explore each other's research results. Astro-VR extends the MOO to include other media types besides text. An astronomer giving a lecture will be able to display slides on colleagues' screens.

The Infopark. Evard built an environment using LambdaMOO to support system administration activities at Northeastern University (Evard, 1993). System administrators are distributed at several locations throughout the campus. This decentralization made it difficult for them to coordinate their activities. The MOO provides an environment where they can meet and discuss issues. Several of the users found it useful to have a window into the environment open on their workstation at all times. If any activity occurs in the environment, they can attend to it.

MediaMOO. Amy Bruckman used LambdaMOO to build an environment for media researchers called MediaMOO (Bruckman & Resnick). MediaMOO models a portion of the MIT campus. Regularly scheduled discussion groups meet to discuss computer-based media.

The next section describes the infrastructure we built using LambdaMOO to support POE training.

The Infrastructure

Environment Topology

The overall structure of the training environment is a textual description of a hotel conference facility. Students can wander to see how different activities are laid out in different rooms. Asynchronous communication takes place via a bulletin board object in the lobby. Bulletin boards, which allow players to post new notes and read existing ones, have already been implemented in other MOOs. One section of the MOO, known as the Secret Control Center, is off limits to normal players. This area is open only to system maintainers and contains objects associated with the MOO's operation.

User Interface

The conventional interface to a MOO is a telnet connection. The user types commands and textual responses scroll past. This approach has difficulties if a room contains more than one player. Speech and actions may issue from other players at any moment. Output text from these other players is mixed with commands. Though the output text does not interfere with the command (other than visually), the effect can be disconcerting.

Peter Murray-Rust has written a graphical interface to MOOs called exMOO (Murray-Rust). exMOO provides separate scrolling windows for input and output, separate windows to display a room's contents and a list of other connected players, as well as buttons for commonly executed commands.

[Figure: example screen from the exMOO user interface.]

Because of these advantages, we are using exMOO as our MOO interface. We have altered exMOO to support the execution of programs initiated by actions within the MOO. This is done by adding an out-of-band protocol between exMOO and the MOO itself. Support for execution of external programs facilitates multimedia extensions to the MOO.

Multi-Media Support

Our basic approach to multi-media MOO extensions is to explicitly represent every piece of media (video segment, audio segment, slides, etc.) as an object within the MOO. A player object is also defined for each media type (a video player, a tape recorder, a slide projector, etc.). We have defined verbs within the environment for manipulating the media objects. When a player performs these actions, the MOO sends an out-of-band message to exMOO. exMOO then starts the appropriate executable (e.g., a video or audio player) and plays the file in the file system that corresponds to the media object represented within the MOO.
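
The client side of such an out-of-band scheme can be sketched as a small line filter. The sketch below is in Python and is only an assumption about how the idea could work: the "@@oob " prefix, the message layout (media type followed by a file path), and the executable names are all hypothetical, and the function returns the command it would launch rather than launching it.

```python
OOB_PREFIX = "@@oob "  # hypothetical marker for out-of-band lines

# Hypothetical mapping from media type to a local player executable.
MEDIA_PLAYERS = {
    "video": "video_player",
    "audio": "audio_player",
    "slide": "slide_viewer",
}


def handle_line(line):
    """Classify one line received from the MOO.

    Returns ("display", text) for ordinary output,
    ("launch", argv) for a recognized out-of-band media request,
    and ("ignore", line) for a malformed or unknown request.
    """
    if not line.startswith(OOB_PREFIX):
        return ("display", line)
    parts = line[len(OOB_PREFIX):].split(maxsplit=1)
    if len(parts) != 2 or parts[0] not in MEDIA_PLAYERS:
        return ("ignore", line)
    media_type, path = parts
    return ("launch", [MEDIA_PLAYERS[media_type], path])
```

Keeping the media request on the same connection as ordinary text, but marked with a prefix, means the MOO itself needs no knowledge of how a given workstation plays video or audio.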

Audio and video support are used to provide a richer experience of the environment. Media objects which provide a representation of some other object in the environment (such as an agent) are stored in the Secret Control Center. If we need to perform maintenance on the MOO, we can watch the media objects be accessed from within the Secret Control Center as players interact with the simulations.

Audio Teleconferencing

Room based audio teleconferencing is supported using the audio teleconferencing tool vat (Eriksson, 1994). vat is one of the applications built on top of the Internet multicast facility. A vat session is started for every room, with the name of each session being the name of the associated room. As players move from room to room, they are automatically disconnected from the vat session associated with the room the player is leaving, and connected to the vat session associated with the room the player is entering.
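
The room-to-session bookkeeping described above amounts to a leave followed by a join whenever a player changes rooms. A minimal Python sketch follows; the RoomAudio class is hypothetical, and it records the leave and join requests in a log instead of invoking vat, whose command-line interface is not assumed here.

```python
class RoomAudio:
    """Keeps one player's audio session in step with the room they occupy."""

    def __init__(self):
        self.current = None  # name of the session (and room) currently joined
        self.log = []        # ("leave" | "join", session name) requests, in order

    def move_to(self, room_name):
        if room_name == self.current:
            return  # already in the right session
        if self.current is not None:
            self.log.append(("leave", self.current))
        self.log.append(("join", room_name))
        self.current = room_name
```

Because each session is named after its room, the room name is the only state the client needs in order to pick the right session.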

Agents

Agents are used in the MOO to play the role of customers, engineers and experts in the simulated data gathering activities. The interactions available to an agent are represented by coupled sets of state machines. The two state machines in Figure 2 represent a simple conversation between A and B.

Figure 2. Simple conversation modeled with coupled state machines.

A transition made in one state machine causes a transition to occur in the coupled state machine. For example, when A says "How are you?", B transitions from start to b1. Now B chooses one of the two options available. Whichever option is chosen will cause a transition in A. In this manner the conversation continues.
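
The coupling can be made concrete with a small Python sketch. Only the opening move ("How are you?", which takes B from start to b1) comes from the example above; the remaining state names, B's two replies, and the representation of an arc as a triple are assumptions made for illustration.

```python
# Each machine maps a state to its outgoing arcs.
# An arc is (utterance, own_next_state, partner_next_state).
MACHINES = {
    "A": {
        "start": [("How are you?", "a1", "b1")],
        "a1": [],  # A waits here for B's reply
        "a2": [],
        "a3": [],
    },
    "B": {
        "start": [],  # B waits here for A's opening
        "b1": [
            ("Fine, thanks.", "b2", "a2"),  # hypothetical reply
            ("Not so good.", "b3", "a3"),   # hypothetical reply
        ],
        "b2": [],
        "b3": [],
    },
}


def fire(state, speaker, arc_index, machines=MACHINES):
    """Take one arc: advance the speaker and move the coupled partner too."""
    partner = "B" if speaker == "A" else "A"
    utterance, own_next, partner_next = machines[speaker][state[speaker]][arc_index]
    new_state = dict(state)
    new_state[speaker] = own_next
    new_state[partner] = partner_next
    return utterance, new_state
```

Starting from both machines in start, firing A's only arc leaves B in b1 with two arcs to choose from, and whichever B fires determines A's next state.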

Currently, five action types can be defined on an arc: talk, emote, video, audio, and executable. If a video or audio action type is defined for an arc, a corresponding video or audio file is played on the screen. This allows an agent to have a multi-media presence. As a student interacts with an agent, a video window displays the agent's response. The executable action type is there to provide a hook for attaching arbitrary actions to an agent's utterance. For example, in a Process Map simulation, it would be nice to have a graphic window displaying the process map. As the customer and engineer agent interact, the process map in the graphic window should update. This could be accomplished by using the executable action type to start and stop a bitmap viewer.
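
One way to picture the five action types is as a dispatch on the kind of action attached to an arc. The Python sketch below is an assumption about structure, not a description of the MOO implementation: the Action class, the render function, and the "room" and "client" channel names are hypothetical.

```python
from dataclasses import dataclass

ACTION_TYPES = ("talk", "emote", "video", "audio", "executable")


@dataclass(frozen=True)
class Action:
    kind: str     # one of ACTION_TYPES
    payload: str  # text to say or emote, a media file, or a program to run

    def __post_init__(self):
        if self.kind not in ACTION_TYPES:
            raise ValueError(f"unknown action type: {self.kind}")


def render(agent_name, action):
    """Turn one arc action into a (channel, content) pair.

    talk and emote become text shown in the room; video, audio, and
    executable become requests handed to the player's client program.
    """
    if action.kind == "talk":
        return ("room", f'{agent_name} says, "{action.payload}"')
    if action.kind == "emote":
        return ("room", f"{agent_name} {action.payload}")
    return ("client", f"{action.kind} {action.payload}")
```

An arc carrying both a talk action and a video action would yield one room line and one client request, which is how an agent could speak in text while a video window shows its response.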

The coupled state machines support both agent-agent and human-agent interaction. In the case of agent-agent interaction, an agent randomly selects one of the arcs available to it from the currently active node. This causes a transition to occur in its partner agent's state machine. The partner then randomly selects an arc and so on. The simplest interaction structure occurs if each agent has a linear state machine. In this case the two agents engage in the same conversation every time. In the case of human-agent interaction, the human's possible choices are represented by a state machine. At each point in the conversation, the user is presented with a list of choices corresponding to the outward arcs from the currently active node. When the user makes a choice, this causes a transition in the agent's state machine. The agent then makes a conversational move in the manner described above, and the conversation continues.
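
Both interaction modes can share one conversation loop if the only difference is how the next arc is chosen. The following Python sketch assumes this design; the machines, utterances, and function names are hypothetical, and a scripted list of menu selections stands in for a human picking from the list of choices.

```python
import random

# state -> list of (utterance, own_next_state, partner_next_state)
ENGINEER = {
    "start": [
        ("What did you do first?", "wait", "q1"),
        ("Which instrument did you use?", "wait", "q2"),
    ],
    "wait": [],
    "done": [],
}
CUSTOMER = {
    "start": [],
    "q1": [("I set up the test bench.", "done", "done")],
    "q2": [("An oscilloscope.", "done", "done")],
    "done": [],
}


def random_chooser(arcs, rng=random):
    """Agent behavior: pick any available arc at random."""
    return rng.randrange(len(arcs))


def scripted_chooser(choices):
    """Stand-in for a human: replay a fixed sequence of menu selections."""
    remaining = iter(choices)

    def choose(arcs):
        return next(remaining)

    return choose


def converse(machines, choosers, first, max_turns=20):
    """Alternate turns until the current speaker has no outgoing arcs."""
    names = list(machines)
    state = {name: "start" for name in names}
    speaker = first
    transcript = []
    for _ in range(max_turns):
        arcs = machines[speaker][state[speaker]]
        if not arcs:
            break
        utterance, own_next, partner_next = arcs[choosers[speaker](arcs)]
        partner = names[1] if speaker == names[0] else names[0]
        state[speaker], state[partner] = own_next, partner_next
        transcript.append((speaker, utterance))
        speaker = partner
    return transcript
```

Passing random_chooser for both participants gives agent-agent interaction; if every state has at most one outgoing arc, the random choice is forced and the same conversation is produced every time.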

The state machine framework is used to design pedagogical conversations. For example, a student performing a Process Map with a customer agent is presented with a list of possible questions they can ask the customer. At any given point during the Process Map, some of these questions are more effective than others. Depending on the choice made, the student gathers more or less information from the customer. A human or agent expert comments on the performance in real time. Alternatively, the student could be given an exam by an expert agent to determine which pieces of information the student missed.

Lecture Support

We have added class management support for facilitating teacher-student interactions. Teachers can add students to a class and expel them from a class. Students in a class automatically follow the teacher as the teacher moves from room to room. A student may choose to leave the class, however, if they wish to stay behind to look at something while the class moves on. In addition to navigation support, the teacher can start programs on the students' computers. This allows the teacher to do things like display online slides on the terminals of all the students in the class, display video for the students, launch World-Wide Web browsers, etc.
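
The class management behavior can be summarized in a short Python sketch. The ClassSession class, its method names, and the shared locations map are hypothetical; the sketch only illustrates membership, following, and fanning a program request out to current class members.

```python
class ClassSession:
    """Tracks who is in a class and moves members along with the teacher."""

    def __init__(self, teacher, locations):
        self.teacher = teacher
        self.locations = locations  # shared map: player name -> room name
        self.students = set()

    def add(self, student):
        self.students.add(student)

    def remove(self, student):
        # Covers both a teacher expelling a student and a student leaving.
        self.students.discard(student)

    def move_teacher(self, room):
        # Everyone still in the class follows the teacher automatically.
        self.locations[self.teacher] = room
        for student in self.students:
            self.locations[student] = room

    def start_program(self, command):
        """Return the (student, command) requests a teacher action would send."""
        return sorted((student, command) for student in self.students)
```

A student who has left the class simply stops appearing in the follow loop and in the program requests, so they stay where they are while the class moves on.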

We have designed, but not yet implemented, support for canned lecture material. When a teacher is asked a common question for which they already have an answer, they will be able to select the answer off a menu and have the response directed to the questioner or all students present. This speeds textual communication within the MOO.

User Feedback and Interpretation

Our first opportunity for user feedback was an informal presentation to colleagues and students. We presented the ideas in a lecture format, with slides describing the material and periodic short demonstrations of the implementation on networked workstations in the lecture room. The system demonstrations were primarily non-interactive for the audience members, as we presented the main concepts and then interacted with each other and the training environment in the roles of student and instructor.

The dominant response of this audience to the lecture and demonstration was content overload. Audience members commented that they were overwhelmed by the richness and depth of the MUD. This is interesting, we think, because even with little exposure the MUD was experienced by this audience as having great depth. Our belief is that the MUD provides subtle cues that it is a complex environment, with the potential to match the complexity of a real environment. Further, users seem inclined to attribute all of the interaction potential of a real environment to the MUD given the presence of only a few cues.

Even with the difficulty for new users created by such a rich environment, the audience members were generally intrigued and wanted to have actual individual experience with it. We felt that a weakness of the presentation was that the users could only reflect on the demonstration of interaction with the MUD, and did not have an opportunity to experience the MUD first hand. We resolved to give new users first-hand access to the MUD in subsequent tests, and to use the MUD itself as a presentation vehicle.

Our second presentation was given in a computer training room with nine networked Sparcstations. The audience was co-located but still had full access to the virtual environment. The slides for the talk were presented within the MUD. One of the workstation screens was also projected at the front of the room. We took turns lecturing from our respective workstations. At various points in the talk, we would demonstrate features of the MUD environment. The majority of the communication was verbal, but some communication took place within the MUD.

The primary difficulty with this approach was that participants had to pay attention to two environments at once. Some participants had previous MUD experience. They were quickly communicating and moving in the MUD. This MUD activity distracted these participants from the content of the talk. Other participants had never seen a MUD; they were distracted by the mechanics of the interface. The MUD novices, once they understood the basics of the interface, also wanted to explore. We observed a clear tension between lecture style communication and the free-wheeling exploration encouraged by MUDs.

Our third presentation was given to two usability specialists in the divisions. When the environment is deployed, they will be the primary instructors. The purpose of this presentation was to gather feedback as to the suitability (at least from an instructor perspective) of the MUD-based training environment.

Based on our first two presentation experiences, we decided to do this presentation completely within the MUD. All communication between the usability specialists and ourselves was mediated by the MUD. No formal presentation was prepared in advance. We wished to leverage a MUD's facility for supporting informal exploration and conversation. We gave a tour of the hotel topology, demonstrating the branching video and other media extensions. Because the usability specialists did not have multi-cast OS kernels on their machines, we were not able to use the audio conferencing. One of the instructors had no previous MUD experience and the other had a small amount of previous experience.

In general, the instructors did not think that the MUD would be an appropriate training environment. They had difficulties engaging in conversations. The multithreaded nature of MUD conversations was difficult for them to follow. It is interesting to contrast this view with that of more experienced MUD users, who generally find multithreaded conversations a positive feature. One instructor felt that they "might as well have been a robot" since it was so difficult to express oneself. Both would have preferred to be silent observers, with no communication expectation placed upon them.

They found the MUD interface difficult to learn. Commands are not readily apparent. As new users, they often knew what they wanted to do but did not know how to do it (poor affordances). Also, the constantly scrolling text was a problem. They found it difficult to follow all of the activities in the MUD.

The first reaction to the primarily textual nature of the environment was that it made the interface look unfinished ("20 year old technology stuck in a window"). However, when asked if they would prefer a graphical rendering of the environment, they replied that the textual description was preferable because it was more compelling! On the positive side, they enjoyed exploring the environment. One of the usability specialists felt that he was sucked into the environment. He was concentrating so hard on what was happening that occasionally he had to pull away from the screen and take a deep breath to relax! Both of them found the MUD engaging but felt that the engagement might be a distraction to training. There was a feeling that the fun aspects of the environment were at cross purposes to training. Finally, while they agreed that MUDs have educational potential, they felt that usability techniques were not an appropriate content area. No clear reason was given.

Interpretation

The communication difficulties indicate that textualizing verbal and non-verbal communication channels is a learned skill. A MUD requires a user to organize thoughts to a greater degree before making an utterance. This is probably due to the lack of prosodic cues which allow a speaker to make corrections and change course in mid sentence. Also, the emote channel requires users to textualize nonverbal responses. In the presentation to the usability specialists, we often had to explicitly elicit feedback ("are you understanding me?"). They did not give the nods, smiles, etc. which more experienced MUD users constantly emote to each other.

The interface to most MUDs needs improvement. Formatting the text and providing clickable commands is a first step towards improving the readability of MUD text and providing more affordances on possible actions in the world. One approach currently being explored is to use a web browser as a front end to MUD output. However, this approach only works for static text. The dynamic text produced by players and objects appears in a separate telnet window. Even if a transport protocol supporting dynamic pages were added to the web infrastructure, the deeper question is how a dynamically changing page should be rendered. Is the entire screen redrawn any time some screen element changes? The problem of formatting MUD text is a superset of the problem of formatting hypertext; a MUD is a dynamic hypertext.

MUD environments seem to foster playful relationships between users. The usability specialists felt that the playful nature of a MUD environment was not appropriate for training. There is a cultural assumption that playful relationships and meaningful work are mutually exclusive. New communication technologies such as MUDs challenge this assumption. We predict that the line separating work and play will blur as mediated environments become more common.

All of our new users found the environment intimidating. We suspect that this was due to two factors: 1) the interface problems described above; and 2) the difficulty of mapping intentions to actions in a novel environment. Our environment provides cues that it can mimic or exceed the real world's level of complexity and novelty. The text-based nature of the interaction provides few of the normal affordances that exist in the real world. It is immediately clear to the new user that much can happen in the MUD, but it is very unclear how one can respond to this complexity. We suspect that the difficulty of mapping intention into action within the MUD contributes to intimidation and performance pressure for the new user.

The apparent complexity of interacting with the environment is heightened for the new user by "unusual" or "strange" events. For example, a player may leave a room with the announcement: "slewis shrinks to a single point of green light, buzzes about your head and exits to the northeast". What are the social conventions for responding to this behavior? Such an event is outside the range of a new user's experience and therefore adds new complexity.

Introducing MUDs to New Users

We have experimented with three ways of presenting this virtual environment to new users (demonstration only, combined interaction and demonstration, interaction only) and found difficulties for new users with each of these presentation strategies. Demonstration or short-term use of the environment does not provide a representative experience for the new user. The success of MUDs on the Internet indicates that long-term use of MUDs is very engaging for some users (Dibbel, 1993), but initial exposure to the environment is consistently less compelling.

References