Collaboration Online: The Example of Distributed Computing
Anne Holohan
Department of Sociology University of Trento, Italy
Anurag Garg
Department of Computer Science and Telecommunications University of Trento, Italy
Abstract
Distributed Computing is a new form of online collaboration; such projects divide a large computational problem into small tasks that are sent out over the Internet to be completed on personal computers. Millions of people all over the world participate voluntarily in such projects, providing computing resources that would otherwise cost millions of dollars. However, Distributed Computing only works if many people participate. The technical challenge is to slice a problem into thousands of tiny pieces that can be solved independently, and then to reassemble the solutions. The social problem is how to find all those widely dispersed computers and persuade their owners to participate. This article examines what makes a collaborative Distributed Computing project successful. We report on data from a quantitative survey and a qualitative study of participants on several online forums, and discuss and analyze Distributed Computing using Arquilla and Ronfeldt's (2001) five-level network organization framework.
Introduction
The world's computing power is no longer primarily centered in supercomputers and institutional computer rooms, but is instead concentrated in the millions of Personal Computers (PCs) that are installed world-wide in households, businesses, hospitals, airports, and just about every center of human activity. In the last 15 years, the processing speed of PCs has increased rapidly and the Internet has expanded to the consumer market, creating a network of millions of fast computers connected to each other. At the same time, the computational needs of a typical user's applications have not kept pace, resulting in a vast pool of idle processing power. One activity that has grown in the past decade to utilize this spare processing power is Distributed Computing (DC).
In a DC project, the resources of a large number of standard desktop computers are pooled to tackle a computational problem that would otherwise require very powerful and expensive computers that are normally available only to large institutions. In Distributed Computing, a large computing problem is divided into small tasks that are assigned over the Internet to be processed by individual users on their own computers. In order to participate in a DC project, a user must download the project software from the DC project website and install it on their computer. Once installed, the project software contacts the project website and is assigned a task. Since DC software is designed to run in the background, the task takes up only the idle processing power of the computer and does not interfere with the normal use of the computer. Upon completion of this task, the result is sent back to the project server and a new task is downloaded if the user so desires. Typically, individual users' contributions are listed publicly on the DC project's website in the form of ranking tables that display the number of tasks completed or the amount of computer time contributed by an individual member, allowing contributions to be compared across all participants.
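The cycle described above (split the problem, hand out tasks, collect results, publish a ranking table) can be illustrated with a minimal in-process sketch. It is not the software of any actual DC project: the toy problem (summing squares), the function names, the round-robin assignment, and the volunteer names are all assumptions made for illustration, and real projects exchange tasks over the Internet rather than through function calls.

```python
from collections import Counter

def split_into_tasks(start, stop, chunk):
    """Server side: slice the range [start, stop) into independent work units."""
    return [(lo, min(lo + chunk, stop)) for lo in range(start, stop, chunk)]

def process_task(task):
    """Client side: the toy computation a volunteer's idle PC would perform."""
    lo, hi = task
    return sum(n * n for n in range(lo, hi))

def run_project(tasks, volunteers):
    """Assign tasks round-robin, collect results, and record credit per volunteer."""
    results = {}
    credit = Counter()
    for i, task in enumerate(tasks):
        volunteer = volunteers[i % len(volunteers)]
        # Stands in for download, background computation, and upload.
        results[task] = process_task(task)
        credit[volunteer] += 1  # one completed work unit
    total = sum(results.values())   # server reassembles the partial results
    ranking = credit.most_common()  # ranking table, top contributor first
    return total, ranking

tasks = split_into_tasks(0, 1000, 100)
total, ranking = run_project(tasks, ["alice", "bob", "carol"])
print(total)
print(ranking)
```

Because each work unit is independent, the order in which results arrive does not affect the reassembled total, which is what allows the work to be spread over many unrelated machines.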
Participation in DC is voluntary and participants do not receive compensation for their work. Because of the voluntary nature of the collaboration and the lack of a central authority, there is a continual need for participant recruitment and retention. The technical challenge lies in slicing a problem into millions of tiny pieces that can be solved independently, and then reassembling the solutions. A challenge also resides in distributing this work to, and collecting the results from, millions of participants around the globe without interruption, and in making sure that the contributions are properly recorded. The social problem lies in finding all those widely-dispersed computers and in persuading their owners to participate. This article argues that the technical and social aspects are mutually reinforcing—for many participants, the social aspects of Distributed Computing are what drive the scientific endeavors, and vice versa, and this becomes a virtuous circle. Emerging from individual participation is a non-traditional form of organization. Instead of scientific research conducted inside a hierarchical institution, such as a university, participants in DC projects can either work alone or elect to join a team of individuals working together on the same project.
All DC projects have their own website that is used to disseminate information about the project. The websites also host project software for downloading by participants and publicly display participant contributions in a ranking table of top contributors. Top contributing teams are similarly recognized through a table of top teams. Many teams have their own websites dedicated to the project and may also host online discussion forums for discussing the progress of the team and individual team members. Similarly, many DC projects also have their own discussion forums where members of different teams can come together to discuss the practical and political aspects of Distributed Computing and specific projects and attempt to resolve any technical problems that may arise during the running of the project. What emerges is a dispersed network of individuals and teams working autonomously towards the common goal of scientific research on the Internet and communicating through online forums, team home pages, and the DC project website.
This article investigates the motivations and experiences of participants in DC, and documents the emergence of a network organization formed around teams, project websites, and online forums. Arquilla and Ronfeldt (2001) argued that a network organization can be described and evaluated by examining the organization at five levels: organizational, technological, narrative, social, and doctrinal. Distributed Computing is an example of the emergence of such a network organization.
In order to better understand the demographics and the motivations of the participants, we conducted a qualitative survey in August 2004, posting questions on the forums that the participants themselves use. This was followed by a quantitative online survey conducted in January 2005 on the website www.surveymonkey.com. The targets of these activities were DC participants who took part in one or more of five online DC forums; three of the forums belong to DC teams (who work on one or more DC projects) and two to DC projects themselves. The goal of our research is to glean knowledge that will deepen our understanding of this emerging type of scientific collaboration and facilitate the efficacy of future collaborative projects. We aim to find out how DC works in practice, who is involved, why they participate, and how this knowledge can be used to better design future collaborations.
The Phenomenon of Distributed Computing
Before we proceed, a brief note on terminology is in order. The use of the term Distributed Computing to describe the specific form of activity we have researched is somewhat atypical, as the term signifies something more generic to computer scientists. Technically, distributed computing is the act of performing a computation on a number of different processors, which may reside in the same machine (as in the case of a multi-processor machine), on different computers on the same network (as in the case of clusters), or on different machines on different networks that are physically located far away from each other. In the Distributed Computing community, the term has been appropriated to signify only the last of these three senses. In this article, we use the term in this restricted sense as used by the participants and project leaders to describe their activities. To make this distinction clear, we capitalize the term "Distributed Computing" (or use the abbreviation DC) in the rest of the article wherever it is used in this restricted sense.
Distributed Computing is often confused with Peer-to-Peer (P2P) Computing and with Grid Computing. For any application or software to be truly peer-to-peer, the clients must communicate with each other directly. Unlike P2P computing, Distributed Computing relies on the standard client-server architecture where clients (software programs) "talk" only to the project server and do not talk to one another directly. The users receive the data to be processed by downloading it from the project website, and once it is processed on the user's personal computer, they upload the results to a central server. Similarly, Grid Computing is an architecture that makes computing resources available through a service where "clients or users plug in to the service and consume it in much the same way that a utility customer plugs into the electrical grid and consumes electricity" (GP2, 2003). The Grid is an emerging software and hardware architecture that coordinates resources that are not subject to any centralized control. It uses standard, open, general-purpose protocols and interfaces to deliver nontrivial qualities of service (Foster, 2002). Unlike in Distributed Computing, in Grid Computing both the providers and the consumers of computational power are drawn from the same set of users. Moreover, the grid architecture endeavors to provide some guarantees as to task completion, which are completely absent in Distributed and P2P Computing.
One of the first widespread Distributed Computing projects was the Great Internet Mersenne Prime Search (GIMPS) that started in late 1995. The objective of this project is to perform a systematic search in order to find new Mersenne primes (prime numbers of the special form 2^n - 1, where n is itself a prime number). Since the fastest known algorithm for testing the primality of Mersenne numbers is much faster than the fastest known algorithms for testing the primality of other numbers of a similar size, new Mersenne primes are usually extremely large. Until GIMPS began, finding extremely large primes had mostly been the preserve of people with access to fast (and expensive) supercomputers. In fact, almost all of the 22 Mersenne primes found from 1952 to 1996 were found using supercomputers (Caldwell, n.d.). Within a year of GIMPS' arrival, this changed and all eight Mersenne primes found since then (each the largest known prime at the time of discovery) have been found by GIMPS participants using consumer hardware such as ordinary desktop computers. The project initially started by distributing work via email to about 30 participants using Pentium 90s. It later graduated to using a server for assigning work to participants over the Internet automatically, and now boasts some 48,000 participants using over 71,000 machines (as of February 1, 2005).
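The fast primality test for Mersenne numbers alluded to above is generally known as the Lucas-Lehmer test. The following is a minimal sketch of that test for small exponents only. It is not the code GIMPS participants run, and the helper names and the search bound of 64 are assumptions chosen to keep the example short.

```python
def is_prime(n):
    """Trial division; adequate for the small exponents used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def lucas_lehmer(p):
    """Return True if 2**p - 1 is prime, given a prime exponent p."""
    if p == 2:
        return True  # 2**2 - 1 = 3 is prime; the recurrence below needs p > 2
    m = (1 << p) - 1
    s = 4
    # Iterate s -> s*s - 2 (mod m) exactly p - 2 times.
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0  # 2**p - 1 is prime exactly when the final residue is zero

# Prime exponents below 64 whose Mersenne number is itself prime.
mersenne_exponents = [p for p in range(2, 64) if is_prime(p) and lucas_lehmer(p)]
print(mersenne_exponents)
```

The test needs only p - 2 squarings modulo 2^p - 1, which is why candidates with millions of digits can be checked at all. Each candidate exponent can also be tested independently of the others, which makes the search easy to divide among volunteers.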
The most famous Distributed Computing project is undoubtedly SETI@home (SETI stands for the "Search for Extra-Terrestrial Intelligence") that originated at the University of California, Berkeley in May 1999. The project asks people to download data recorded on a telescope from the servers at Berkeley and to analyze it for alien signals. The project has captured the public imagination and has been enormously successful, with over 5.3 million users as of February 2005. News stories about the project have appeared in almost all major media outlets, including the New York Times, CNN, the Discovery Channel, and National Public Radio. Ironically, the project was born due to the United States Congress's decision in 1993 to end SETI funding. After several years of planning, the project leaders decided to launch a project that appealed to the general public's fascination with extra-terrestrial life and their desire to utilize an otherwise wasted resource in opposition to the grant-withholding bureaucrats and politicians.
The success of SETI@home has led to a spate of new projects, with no fewer than a dozen launched from 2000 to 2005. The aims of these projects, their backers, and their level of seriousness vary widely. They range from the Stanford University-backed Folding@home project, which looks at how proteins fold (this project has led to several academic articles and carries the hope of future breakthroughs in medical research), to the tongue-in-cheek Project Dolphin, which counts the number of keys you press on your keyboard every day.
An integral part of Distributed Computing since the beginning has been the existence of teams. A team typically consists of a group of people with a common institutional affiliation—such as the same school, university, or place of work—who pool their contributions to compete against other teams. However, a shared institutional affiliation is not necessary, and some of the most successful teams have been those that have formed out of a common interest. Examples include "Team Art Bell" in SETI, which was formed out of the listeners of Art Bell's late night radio show, or "The Knights Who Say Ni!", formed by a group of Monty Python fans. The level of interaction among team members varies widely. Typically, teams that formed due to a common interest of the members have greater interaction than teams formed due to common institutional affiliation. The former are composed of "active" participants who identify themselves with the group of which they are a part, as opposed to teams formed due to some institutional affiliation where membership is a passive activity and does not require exercise of free choice. The size of a team also varies widely, from teams consisting of only one or two members to Team Art Bell and SETI.Germany with over 10,000 participants each.
Motivations for Participating in Collaborative Projects
Research on motivations for involvement in collaborative technological or scientific projects on the Internet has focused almost exclusively on the Open Source Software (OSS) community. However, participants in the OSS communities do not form teams and there is no element of direct competition. As a result, the research focuses only on the motivation of individual participants in a non-directly-competitive environment. In the following sections, we look at the motivations to join DC individually and the motivations to join and continue in teams and online forums. We find that motivation is fueled by the opportunity for both cooperation and competition individually and in combination, "co-opetition" (Brandenburger & Nalebuff, 1996).
Individual Participant Motivation
There are illuminating similarities and differences between individual participants in the OSS movement and individual participants in DC projects. In an OSS project, programmers write code voluntarily. The source code for software is made available for anyone to modify, improve, or extend, on condition that any programmer will then share the resulting code and software with anyone who wishes to use it. OSS has been so successful that many important components of the Internet, such as the Apache web server and the Mozilla Firefox web browser, have been created through OSS projects. The seemingly counter-intuitive idea of programmers working for free in the highly competitive and commercial world of software and computing has in recent years been the subject of growing theorizing and analysis, particularly with regard to programmers' motivation. DC is similarly driven by individuals using their resources (mostly computers, and time if they are active on teams or in the forums) in a voluntary capacity for the advancement of scientific research.
While research on motivation in OSS has posited that how creative a person feels when working on the project is the strongest and most pervasive driver (Lakhani & Wolf, 2003), basic participation in DC requires no extra computer knowledge. Participants in teams and forums do figure out ways to speed up and expand the capability of their computer resources, and participation in many of the teams and forums does require considerable technical comfort and knowledge. Overall, however, participating in DC is nowhere near as creative as being involved in OSS, and the technical expertise required is much less than that in OSS.
A cost-benefit analysis approach posits that motivation stems from immediate and delayed payoffs associated with participation (many people are involved in OSS as part of their job) and the user need for particular software (Von Hippel, 2001). Lerner and Tirole (2000, 2002) forcefully argue for motivations of practical benefits to users, such as having good software, enhanced reputation from being associated with a successful project, and potential for OSS projects to lead to further commercial opportunities. However, the technical involvement in a DC project will typically not have an impact on the everyday use of computer technology by the participant, and will not aid their chances for further commercial opportunities through being associated with a particular DC project.
Economists, including Benkler (2002), have argued that the benefit of peer production in OSS is the reduction in transaction costs (Coase, 1988; Williamson, 1994), in particular the matching of human capital to projects in a way that is superior to that produced by price signals in the market. However, in DC the matching that is required is not that of skill to task but of idle computing capacity to task: anyone who has a PC can participate.
The most relevant insights into the motivations of DC participants can be gleaned from work done on OSS participation using Maslow's (1987) theory on motivation, which describes a hierarchy of needs that drive people, ranging from the satisfaction of physiological needs to the need for self-actualization. Hars and Ou (2001), looking at OSS participants, distinguished between intrinsic motivation and extrinsic motivation, with the former including the desire to feel competent and self-determining. Another variant of intrinsic motivation is altruism, where a person seeks to increase the welfare of others. DC participants provide something for others (processing data for scientific projects) at their own costs (time, energy, opportunity costs, use of PC resources), and therefore belong to this category. A variant of this internal motivation of altruism is what Hars and Ou (2001) label "community identification." Participants may identify themselves with the DC project or team and align their goals with those of the community. They may treat other members of the community as their kin, and thus be willing to do something beneficial for them.
Motivation in Teams and Online Forums
Individually manifested, altruism alone in its various forms is not enough to explain why millions of people participate, and why for thousands it becomes a regular and intrinsic part of their lives through participation in teams and online forums. Our research indicates that the answer lies at the intersection of motivation and the interactional and organizational possibilities emerging through the Internet. The network organizational form can capture people's motivation better than hierarchical organizations and channel it in a self-sustaining mode (Weber, 2004). Specifically, an Internet-based community simultaneously allows competition and cooperation. The Internet allows the formation of virtual teams of participants who cooperate by sharing knowledge and expertise, and who compete against other such teams.
Through the Internet, both individual and team participants in DC can be rewarded in ways that encourage further participation. Beenan et al. (2004) state that participants in collective projects will not typically persist in participation, or will participate minimally, if they feel their contribution is not "special," no matter how important the altruistic elements of the project are to them; they will become "social loafers." Beenan et al. (2004) begin by observing that people exert less effort on a collective task than they do on a comparable individual task. They posit a collective effort model that identifies conditions under which people will be less inclined to social loafing. These conditions include: (a) believing that their effort is important to the group's performance, (b) believing that their contributions to the group are identifiable, and (c) liking the group they are working with.
In DC, an individual's prominence in terms of the contribution to the project and compared to other participants is rendered visible through project websites where statistical tables of contributions are posted. Participants can also work in teams that come together to compete within a particular project, and in forums where teams discuss strategy and share technical tips to aid their quest to achieve higher rankings in the statistics tables.
In addition, abundant research since the 1960s shows that providing people with specific, high-challenge goals stimulates higher task performance than easy or "do your best" goals (Locke & Latham, 2002). The design recommendation from the goal-setting literature for online communities is that these communities should set specific and challenging contribution goals for their members, both individuals and teams, and on the project websites provide statistical information on achievement of those goals by individuals and teams (Soares, Silva, & Silva, 1998).
Why, then, if there are millions of potential participants in a DC project and the number of people involved in teams is—as we show further below—nowhere near that, are teams important? The answer, as Weber (2004) noted with regard to OSS projects, is that even if the number of participants is very big, it says nothing about the scope of contributions. In OSS, only a subset of users contributes in significant ways. In a case study of the open source Apache server, Mockus, Fielding, and Herbsleb (2000) showed that 85-90% of the code was written by about 15 developers; a greater number helped fix defects and a greater number again helped report problems. In a survey of 100 OSS projects by Krishnamurthy (2002), it was found that the majority of mature OSS projects were carried out by a very small number of developers, most commonly one person. Healy and Schussman (2003) argue from this that participation is skewed both within and across projects, with a small number of people doing the hard-core programming and a small number of projects attracting the most activity. In DC, too, the contribution of a small proportion of participants accounts for a disproportionately large amount of results and has a disproportionately large impact on the rest of the DC community.
In general, a relatively small percentage of participants are part of a team in any given Distributed Computing project. However, participants who are part of a team are usually more "productive" and more enthusiastic about the project. This is particularly true of teams that organize around forums, and is evidenced by their success in terms of their contribution to the projects themselves. The proportional contribution from participants who are members of a team varies considerably from project to project, as some projects are more conducive to team formation whereas others are not. It also depends on the history of the project and the stage at which it was adopted by the major teams in Distributed Computing. In GIMPS, for instance, as of February 2005, the top two teams contribute over 10% of the total project output and the top 10 teams over 18%. In SETI@Home, which is a much larger project, the top 10 teams still combine to produce 7.6% of the work. This is a significant amount, as the combined 50,000 or so participants who are part of these 10 teams make up less than 1% of the over 5 million participants in SETI.
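The SETI@home figures can be turned into a rough per-capita comparison. The short calculation below is only a back-of-the-envelope sketch: it assumes the 5.3 million user count quoted earlier for February 2005 as the total, and treats "50,000 or so" as exactly 50,000.

```python
team_members = 50_000          # "50,000 or so" participants in the top 10 teams
all_participants = 5_300_000   # assumption: the 5.3 million users quoted for February 2005
team_output_share = 0.076      # the top 10 teams produce 7.6% of the work

# Fraction of all participants who belong to the top 10 teams.
team_population_share = team_members / all_participants

# Output per top-team member relative to the project-wide average participant.
relative_productivity = team_output_share / team_population_share

print(f"population share: {team_population_share:.2%}")
print(f"relative productivity: {relative_productivity:.1f}x")
```

Under these assumptions, a member of a top-10 team contributes on the order of eight times as much work as the average participant.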
Thus, teams are important because they reward participation. As we shall also show, teams are particularly important for people who are the most productive participants.
Network Organization
The network organization operationalizes "co-opetition." The DC community, formed through the teams, project sites, and online forums whose members are voluntary participants, is an example of the emergent network organizational form (Castells, 1996; Nohria & Eccles, 1992). In providing the means for channeling participants' motivations to compete and cooperate, i.e. "co-opetition" (Brandenburger & Nalebuff, 1996), teams and forums provide powerful new insights into the recruitment and retention of volunteer collaborators in this new type of collaborative network.
Arquilla and Ronfeldt (2001) posit that the design and performance of networks depend on what happens across five levels of analysis (which are also levels of practice). They argue that the strongest networks will be those in which the organizational design is sustained by a winning story and a well-defined doctrine, and in which all this is layered atop advanced communications systems and rests on strong personal and social ties at the base. Their model reflects the emerging organization of DC, and the social and narrative aspects capture the "co-opetition" that is a very powerful motivator in DC. We give a brief summary of the five levels of a network organization according to Arquilla and Ronfeldt (2001), and proceed to show that DC bears the characteristics of a network organization.
Organizational Level
Network forms of organization have the advantage of flexibility, adaptability, and greater speed of response over other forms of organization, such as the hierarchical. A network organization typically has a flat organizational chart, with expertise rather than roles being paramount, and with information flowing where needed. The value of information is seen as increasing as it is shared, rather than decreasing, as in hierarchical organizations. DC, as scientific research conducted on the Internet, is the most open, public, and interactive form of scientific research yet.
Distributed Computing projects are by their nature decentralized. However, they also have the form of hub-and-spokes network organizations. The project server is the hub—from which work assignments are downloaded and to which the results are uploaded. The project managers have no authority over the participants or any control over how much or when they participate. They do have administrative control over the server, the distribution of work, and most importantly, the accounting mechanism that records the contribution of each participant.
Each project has a website where the contributions of each individual are recorded and tables rank teams, and within teams, individuals are ranked for their processing contribution. To properly motivate the participants it is critical that an appropriate metric be used to measure the amount of computational effort expended by each participant. As one of the forum members pointed out in the August 2004 survey, this metric must be
…fair, accurate, and cheater-proof…This metric is more or less cast in stone, because zeroing out the scoreboard and starting over will cause many participants to quit the project. The metric must take care to motivate the right participant behavior and not create a conflict of interest between the project leadership and the volunteer participants, because most project participants are caught up in the competitive aspect and if there is a conflict between goals of the project (scientific benefits) and maximizing their own score, many participants will choose the latter.
This metric is usually expressed in terms of the number of work units completed. However, when tasks vary in size, this is not sufficient and a better metric is the amount of computational time expended. If the correlation between the credit awarded and the actual time spent completing a task is not precise, it gives the participants who are caught up in the competitive aspect of the project an incentive to do only tasks that require proportionately less time for the credit awarded. While this is not necessarily cheating, it does create an incentive that is not necessarily in the interest of the project. On the other hand, this mechanism may be exploited by the project managers to give priority to a work category within the project if required. For example, in the GIMPS project, in order to encourage more participants to do more of a task called "Lucas-Lehmer testing" as opposed to the other type of task called "Factoring," 30% less credit is awarded for factoring when the same amount of computing time is spent on each work type. This helps keep the right balance of participants performing both work types, as "factoring" requires only about 10% as much work as "Lucas-Lehmer testing" does.
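A time-based credit metric with per-category weights might look like the sketch below. It is not the accounting code of GIMPS or any other project. The weight of 0.7 encodes the 30% reduction under the assumption that the reduction applies to factoring, and the work-type labels, participant names, and log format are hypothetical.

```python
from collections import defaultdict

# Hypothetical credit awarded per hour of computing time, by work type.
# Factoring earns 30% less than Lucas-Lehmer testing for the same time.
CREDIT_PER_HOUR = {"lucas_lehmer": 1.0, "factoring": 0.7}

def award_credit(log):
    """Sum weighted computing time per participant.

    `log` is a list of (participant, work_type, hours) records.
    """
    scores = defaultdict(float)
    for participant, work_type, hours in log:
        scores[participant] += hours * CREDIT_PER_HOUR[work_type]
    return dict(scores)

log = [
    ("alice", "lucas_lehmer", 10.0),
    ("bob", "factoring", 10.0),
    ("bob", "lucas_lehmer", 5.0),
]
scores = award_credit(log)
print(scores)
```

Keeping the weights in a single table makes the incentive explicit: a participant who wants to maximize their score is steered toward the work type the project leadership values more, without any change to how time is measured.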
A responsive project manager can make a big difference. George Woltman at GIMPS is known to respond promptly to questions and suggestions for improvement, and to provide prompt solutions to any problems. This has contributed in large measure to the large and loyal participant base the project has developed despite the lack of any institutional support. Participants can become disillusioned by the lack of support being provided by some project managers and switch to another project where their "needs" are better met. This was evidenced in the large number of participants who left the SETI@Home project in early 2001 when the network connection to the server that assigned work to participants was overloaded, resulting in participant computers being idle for long periods of time.
Discussion forums provide another hub-and-spokes network organization form. There is usually one forum administrator and several moderators. Anyone who is a member can make posts and reply to them. Since the forums are public, anyone can read all posts without being a member. The administrator and moderators can edit or delete posts that are deemed objectionable. They can also ban members who indulge in repeated abuses, although this step is seldom taken in the forums we have studied. SETI@home started their own message boards in 2002 after noting the success of "team message boards in encouraging participation." This is one case where the project managers (the hierarchical organization) have learned from the participants (the network organization).
Thus, the organizational structure of the Distributed Computing network of projects and forums is that of a decentralized network organization where there is no hierarchical structure. The hub facilitates but has little authority. If the radical step of cutting someone off is taken, this would be for actions that violate the overall ethos of the project, and on the few occasions that it has happened, it has had the consensus of all other members. If the moderators or the project leaders exercise their authority in a way deemed objectionable or arbitrary by other participants, the team or the project stands to lose participants. In this way, the organizational structure is relatively flat and democratic.
If the participants do not want to be part of the social structure, they can choose to operate autonomously. However, once they begin to collaborate, the need for some form of leadership emerges, although it remains consensual and facilitative.
Technological Infrastructure
This level looks at the pattern of, and capacity for, information and communication flows within an organizational network, and whether there are appropriate technologies to support them. How well do the information technologies suit the organizational design and the narrative and doctrinal levels? The new information and communications technologies are crucial for enabling network forms of organization and doctrine (DeSanctis & Fulk, 1999). Technology shapes collaboration and the collaboration in turn influences how the technology is used, as is evident in DC, which would not exist, either technically or socially, without the Internet.
The Internet is critical for DC projects, as it is in OSS. We expected to find (and did find), in Weber's words, "that open source developers [substitute DC participants] make enormous use of Internet-enabled communications to coordinate their behavior" (2004, p. 84). Collaborative research done through Distributed Computing is not possible without the Internet. The Internet is the primary tool of the research (the technological requirement) and also serves the social side, facilitating recruitment and retention of collaborators. Participants' computers do the "data crunching," results are posted on the Internet, and forums provide a shared interactive space for participants.
Distributed Computing is a relatively new phenomenon and was simply not technically feasible before the mid-1990s. Its expansion has been a direct manifestation of increased computer power and increased network connectivity for the consumer. Computing power has gone from being a precious resource available only to the scientific community, and even then only through time-sharing on large expensive computers, to being available to everyone. Initially, ordinary computer users had limited capacity on their PCs, but today the majority of PCs have enormous spare computing capacity. The huge amount of computing power that was going to waste every day was a factor in motivating nearly a quarter of the respondents to begin their involvement with Distributed Computing. Indeed, only a minuscule fraction of worldwide PC computing power is actually used every day. As computers become faster, they stay idle more and more of the time.
In the mid- to late 1990s, Internet connections at home became affordable and widespread, and broadband connections became increasingly common; the number of connected computers and the amount of data that people could "download" from the project server increased exponentially. Without the Internet, work units could not be distributed and therefore no computing could be performed. As the number of people owning computers and, most importantly, having Internet connections increased, it became feasible to start large-scale Distributed Computing projects.
Doctrinal Level
A set of guiding principles and practices (a doctrine) enables members of a network to be "all of one mind" even though they are dispersed and perhaps devoted to different tasks. It can provide a central, ideational, strategic, and operational coherence that allows for tactical decentralization. The network can be "leaderless," having no single leader who stands out, or have multiple leaders, and can use consultative and consensus-building mechanisms for decision-making. DC projects have websites, and the teams and forums have leaders, but leadership is by consensus, not by fiat, and participants who do not want to be part of the social collaboration can opt to work alone.
There is no centralized authority directing how DC projects should be run; there are millions of people simultaneously doing the same thing: downloading data and processing it without any central instruction. The use of teams to drive the production of work units (using cooperation on the team to drive competition that contributes to the larger cooperative task of producing work units) is the most successful strategy used so far in Distributed Computing. Participants cooperate with all the other participants on a particular project, most broadly with the goal of scientific research and with the specific scientific goals of the project, e.g., finding life in outer space in SETI. If they are members of a team, they are simultaneously competing with the other teams on the project on the more immediate goal of racking up the most contributions and coming out on top of the table of statistics documenting contributions. It is a form of cooperation and competition, co-opetition, which is very successful here, as it is in the business and R&D communities of Silicon Valley.
Distributed Computing teams also use "swarming" (Arquilla & Ronfeldt, 2001) tactics, in which team members communicate in order to coordinate and boost performance in competitive circumstances with other teams. For instance, the Ars Technica forum has created a team called the Ars Flying Squad (AFS) that consists of members who are flexible about what project they participate in, as their loyalties lie more with Ars Technica than with any particular project. As a result, AFS members switch projects on short notice in order to shore up the team's standing in a given project, usually to prevent another team from overtaking Ars Technica in the project or to gain a place in the team standings from another team. AFS activity is usually accompanied by a flurry of activity on the online forum as well as in the IRC chat rooms, as decisions are made and members mobilized in decentralized fashion in a non-hierarchical organizational structure.
Individual participants who are not part of a team are also roped in by the competitive element. Goals are set by project websites and teams and in SETI, for example, individual users are recognized on the website and thanked by email and through certificates when they achieve milestones such as completing 1,000 tasks (known as workunits in SETI@Home).
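The statistics that drive this co-opetition can be illustrated with a minimal sketch. It assumes a simple list of per-participant records; the participant names, team names, and work-unit counts are hypothetical and are not taken from any project. Only the 1,000-task milestone comes from the SETI@Home example above. The sketch shows how individual contributions roll up into competing team standings while also adding to a shared project total.

```python
from collections import defaultdict

MILESTONE = 1000  # work units; the SETI@Home milestone mentioned in the text

# Hypothetical (participant, team, completed work units) records.
# A team of None marks a participant who works alone.
RECORDS = [
    ("alice", "Team A", 1200),
    ("bob", "Team A", 450),
    ("carol", "Team B", 980),
    ("dave", "Team B", 900),
    ("erin", None, 1000),
]


def team_standings(records):
    """Sum work units per team and rank teams, highest total first."""
    totals = defaultdict(int)
    for _, team, units in records:
        if team is not None:
            totals[team] += units
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def milestone_reached(records, milestone=MILESTONE):
    """Names of participants whose own total has reached the milestone."""
    return [name for name, _, units in records if units >= milestone]


def project_total(records):
    """Every work unit counts toward the shared project goal."""
    return sum(units for _, _, units in records)


if __name__ == "__main__":
    print(team_standings(RECORDS))
    print(milestone_reached(RECORDS))
    print(project_total(RECORDS))
```

In this toy data the teams compete for rank, yet every work unit, including those from the participant without a team, raises the same project total, which is the sense in which competition serves the larger cooperative task.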
Narrative Level
Networks are held together by the narratives or stories that people tell. These are grounded expressions of people's experiences, interests, and values. A story expresses a sense of identity and belonging: who "we" are, why we have come together, and what makes us different from "them." The stories also communicate a sense of cause, purpose, and mission. They express aims and methods as well as cultural dispositions: what "we" believe in, what we mean to do, and how.
As Agre (2003, p. 39) put it, "Technologies often come wrapped in stories about politics. These stories may not explain the motives of the technologists, but they do often explain the social energy that propels the technology into the larger world." The technology and the social aspect drive each other on in a virtuous circle and both are essential for the success of DC. The narrative of the projects and the narratives of the teams are of central importance to the success of DC projects. As noted above, both SETI@Home and GIMPS had a political story of an anti-establishment nature. The former was started in response to the U.S. Congress cutting funding, whereas the latter took on supercomputers and won.
Social Underpinnings
The full functioning of a network also depends on how well, and in what ways, the members are personally known and connected to each other. To function well, networks may require higher degrees of interpersonal trust than do other approaches to organization, i.e., hierarchies. Personal friendships and bonding experiences are often necessary for the successful formation and functioning of collaborative groups. This, as Arquilla and Ronfeldt (2001) point out, reflects the ancient, vital necessity of belonging to a group and associating one's identity with it. Although successful participation in DC does not require social interaction, for participants who are the most diligent, the social interaction is a vital element of the whole enterprise.
The degree to which there are social connections underpinning the collaborative relationships is a significant factor for retention of participants, as is evident from our survey results. Social connection is more likely where participants share several aspects of their identity, and can feel part of the group. At the level of a project (without participation in a forum), it is enough to be part of that project. Persisting with the work indicates a trust in the project and one's fellow participants, particularly if one is a member of a team. That trust stems from sharing the "process" and being part of the same project, which can be seen to be an institution (body of people organized around unspoken but agreed upon rules) (Zucker, 1986; 1987; Zucker & Darby, 1995).
Trust resulting from working together is "process based trust," and trust resulting from being part of the same DC project is "institutional based trust" (Zucker, 1987). "Institutional based trust" is reinforced greatly through the additional activity of participation in a team and in a forum. The forums are a place where there is friendly competition among members of a particular team, as well as cooperation among team members, in order to improve the relative standing of their own team. This structure is replicated at the team level, where there is competition among the teams and at the same time cooperation among the teams to help achieve the project goals. In this way, a hierarchy of interests is created with the interests of the individual, the team, and the project coming into conflict and confluence at different times. The forums provide a place to discuss and analyze the performance of different teams as well as of members within a team. All the techniques to improve performance are shared with everyone on the forum. On the Ars Technica and Free-DC forums, the participants are all members of the same team; on Mersenneforum, they come from diverse teams. This sharing evolves into a sense of community, which in turn reinforces competition within and between teams.
Methodology
The data discussed here were collected in two surveys: (1) an online survey of Distributed Computing participants given in August 2004, using mostly open-ended qualitative questions, and (2) an online quantitative survey of a wider sample of DC participants in January 2005.
The first survey consisted of 18 questions that were posted on three online discussion forums. Two of these forums (Ars Technica and Free-DC) were set up by Distributed Computing teams, while the third (Mersenneforum) discusses the GIMPS project. We received a total of 37 complete responses. The survey began with questions on the age and gender of the respondent, followed by questions on the geographical location and computer literacy level of the respondent, the number and names of the projects they were involved in, the number and names of the teams they were involved in, the average number of daily visits to projects and forums, and the average amount of time spent daily on the forums. These were followed by nine semi-structured questions asking how the respondents heard of and were recruited into Distributed Computing, what motivated them to participate, and what influenced their choice of project(s). We asked about their contacts and relationships with the other participants in the projects and teams they were involved in, and the role of trust in these relationships. Finally, we asked them what they thought about the significance of Distributed Computing for themselves, and for society. All the direct quotes in this article come from this first survey.
For the second, quantitative, survey, we posted a 10-item questionnaire on surveymonkey.com and received 323 responses. We posted a link to the survey on several forums. Three of these, the Ars Technica Distributed Computing Arcana, the Free-DC Forum, and the AnandTech Distributed Computing Forum, are for participants affiliated with a particular team or organization (Ars Technica, Free-DC, and AnandTech respectively) and are used to discuss a variety of Distributed Computing projects. Another forum on which we posted a notice, Mersenneforum.org, was initially dedicated to a single project, the Great Internet Mersenne Prime Search, but now serves as the forum for several mathematical Distributed Computing projects. Our final point of advertisement, the SETI@home Message Board, is dedicated only to participants in SETI@home. The last two, therefore, do not cater to an audience with a particular team affiliation and the respondents from these are not necessarily part of a team. All the tables and the statistics in this article come from the second survey.
In the 2005 quantitative survey, the first three questions were demographic: the age, gender, and geographical location of the respondent. We asked about how the respondent had been introduced to DC, how many projects they have been involved in, how many teams they have been part of, and how many hours a week they spent on DC-related interaction. We then asked questions investigating motivation, with scaled options on motivation for participation (from very important to not important), scaled options for goals of participation, and a simple question that asked whether they would participate in DC if there were no interaction with other participants. We analyzed these data using simple descriptive statistics to compile a picture of who is participating, what they are doing, what motivates them, and the importance of the teams and forums. We supplemented this analysis with data from the answers to the qualitative questions we asked in August 2004.
In addition, a check on the veracity of the responses and the salience of the questions was made by one of the authors of this article, who has been a participant in Distributed Computing since June 1999 and has been active since January 2001 in three of the forums where we posted our questions. This has given us an in-depth working knowledge of this kind of forum, the terminology used therein, the formal and informal rules of interaction, and the evolution of the collaborative process.
Respondent Demographics
The 323 respondents in the 2005 study were broadly similar demographically. They were overwhelmingly male (318 out of 323) and a majority was between the ages of 26 and 49 (190 out of 323). Our surveys received responses from five sites (two project-specific and three team-specific), while there currently exist approximately 20 DC sites with up to 200 affiliated team sites and forums. We suspect that the high representation of males in our surveys is true of the DC community as a whole, but this needs further investigation. Most of the survey respondents (227 out of 323) were in the United States or Canada, which is not surprising, given that all the forums that we advertised the survey on were hosted in the United States.
| Age | Respondents |
| --- | --- |
| 0-18 | 18 (6%) |
| 19-25 | 64 (20%) |
| 26-49 | 190 (59%) |
| 50+ | 51 (16%) |
Table 1. Age distribution of survey respondents
| Location | Respondents |
| --- | --- |
| U.S./Canada | 227 (70%) |
| Europe | 76 (24%) |
| Australasia | 11 (3%) |
| Asia | 7 (2%) |
| Central and South America | 1 (0.3%) |
| Africa | 1 (0.3%) |
Table 2. Location of respondents
Our sample of 323 respondents is not necessarily representative of the average Distributed Computing participant. These respondents are all active participants in the forums and are thus highly motivated participants in Distributed Computing. However, as we show later, it is precisely this group of participants that has the maximum impact on the DC projects and the teams therein. Moreover, as the objective of this article is to analyze collaborative processes, it makes sense to focus on this self-selected group of intensively collaborative participants.
Survey Results and Analysis
Technology
Technology played a key role in recruiting people, with most respondents already using the Internet as a routine part of their lives. Of the 323 respondents in the 2005 study, 43% (137) heard about Distributed Computing first through an online newsgroup, online discussion board, online news article, or online comic strip. Another 21% (66) heard about DC through non-Internet media: newspapers, magazines, TV, or radio, and 15% (47) heard about Distributed Computing through websites of the projects or the parent organizations of the teams themselves. A further 14% (44) were introduced to DC through a friend or acquaintance offline, and 8% (26) through a friend online.
The technology at the project center is also a factor in retaining participants. A client—the software program that does the processing—should be stable and well-behaved (i.e., should not crash). It should not interfere with normal usage of the PC, and it must be easily deployable by the user on his or her home or office computer. Distributed computing participants who are network or system administrators and receive permission from their employers for a company-wide or department-wide deployment bring in a much larger number of computers. Such participants need a project that caters to their special needs; a project that can be deployed en masse with very little human intervention.
From the qualitative survey of 2004, it emerged that technological factors are directly behind the most cited reasons by heavily involved participants for choosing one project over another. The most basic factor is the characteristics of the Distributed Computing program (client) itself. Some participants prefer an unobtrusive client that does not appear on their computer screen at all, whereas others prefer a program with pretty graphics. For instance, SETI has the option of a SETI screensaver built into the software program. Other users want clients that do not require too much memory, disk space, or network usage. Still others prefer clients that do not require a constant Internet connection. Hence, the technology behind the client software plays a major role in recruiting participants to a particular DC project.
Doctrine: Teams
The importance of "co-opetition" in the form of teams and forums cannot be overestimated in encouraging participation in DC. Of the 323 respondents in 2005, 177 (55%) said they would not participate in DC if there were no interaction with other participants through teams, statistics, and discussion forums. This percentage rises sharply as the participants become more involved; of the 90 respondents who are involved in six or more projects, 60 (67%) would not participate in DC without the presence of the interactive element.
In our qualitative 2004 study, the respondents gave several reasons for being part of a team. Many respondents cited loyalty to the team's parent organization, such as the Ars Technica website or the Free-DC website, as a reason for being part of the team. SETI@Home, a very large project that incorporated team formation from the beginning, offers an example. It allowed teams to be formed in the following categories: schools and colleges, companies, clubs, and government agencies. The first two categories were further subdivided into primary and secondary schools, junior colleges, universities, and departments and into small, medium, and large companies. Of these, clubs are perhaps the only group that relies on self-organization and thus uses forums as an online meeting point for the members, since the members of "club" teams usually do not have any meeting place in real life. It is an example of the strength of the collaborative process that nine of the top 10 teams in the SETI team contribution statistics are club teams, with the Hewlett-Packard team being the only exception.
Other reasons respondents in the 2004 study cited for joining a particular team include: being referred through a friend, getting online technical help through the forum of a particular team, the team having a good forum community, the team "having a homey feel to it," the people being friendly and intelligent, liking the attitude of other team members, liking the philosophy of the team, the team having a good website dedicated to participant contribution statistics, and country affiliation.
Narrative Level
The teams (and their online forums) develop a certain reputation, a certain story about "who we are." Figure 1 summarizes the reasons listed by the respondents in the quantitative survey for participating in Distributed Computing and the importance they attributed to each one.
This story of "who we are" is constituted by and reflected in the personal narrative for each participant. There are several "stories" that the respondents cited as to who they are. The "official" story is that they are contributing to a worthy cause. 205 respondents (64%) cited the scientific contributions being made by the project as a "very important" reason for participating in Distributed Computing. The second most commonly cited "very important" reason was contributing to the statistics/friendly competition. However, not wanting to waste resources, the social aspect, the feeling of being part of something important, and the gain of technical knowledge all had considerable numbers of people citing them as "very important."
Figure 1. Reasons cited by participants for participating in distributed computing
The 2004 qualitative survey respondents further indicated the existence of a narrative-building process. They called their teams "cool," consisting of intelligent people, easy going, relaxed and committed to scientific research and worthy causes. Some typical responses from the qualitative study in 2004 that summarized the feelings of several participants are the following: "It's enjoyable for most of the same reasons online gaming is enjoyable. Friendly competition between geographically separated people of similar mindedness…" Another replied: "It's fun to be a part of something much bigger than just me."
Their self-identity as "techies" put their opposition to wastage of computer power, and the importance of monitoring hardware and network health and learning more about computers, at the center of their narrative for belonging to the collaborative network. Mobilizing people towards a goal of scientific advancement and doing things that establishment science is neglecting or under-funding, "fits" the self-representation of the respondents.
The "unofficial" story, indicated by the often "sheepish" framing of the primary factor that motivates them, is the competition for success in the statistics tables on the Distributed Computing websites. In the 2005 survey, about two-fifths (133, or 42%) of the respondents said that the statistics and the friendly competition are very important. In the qualitative responses from 2004, the respondents cited the thrill of seeing their name up on the leader-board and getting respect from fellow techies.
In both studies, almost all respondents cited the statistics, contributing to a worthy cause, and their own and broader technological advancement in their "story" of why they are involved in Distributed Computing. These are all facets of the same fluid "techie" identity; respondents in the 2004 study often cited one reason for starting and another for staying on. In general, respondents decided to try out Distributed Computing, as there was nothing to lose. Soon they got addicted to the point of responding to the survey question of why they did Distributed Computing by asking in turn, "Why not DC?"
Social Underpinnings
Trust results from individuals working on the same DC project, producing "process based trust" and "institutional based trust." Trust is reinforced greatly through the additional activity of participation in a team and in a forum. The forums provide a place to discuss and analyze the performance of different teams, as well as of members within a team. All the techniques to improve performance are shared with everyone on the forum. This sharing evolves into a sense of community. This sense of community in turn reinforces competition within and between teams.
Participating in DC and spending time on DC forums is a regular activity for most participants: 212 (66%) of the 323 respondents in the 2005 survey spent an hour or more each week on average communicating with other DC participants on the forums or on Internet Relay Chat (IRC).
| Time per week | Respondents |
| --- | --- |
| Less than one hour | 111 (34%) |
| 1-3 hours | 115 (36%) |
| 4-10 hours | 66 (20%) |
| 10+ hours | 31 (10%) |
Table 3. Time spent by participants on DC each week
However, the most active members, who are part of six or more projects, show a different pattern in the amount of time they spend online on DC-related activities. Table 4 illustrates the association between the number of projects a participant is involved in and the amount of time they spend on DC-related activity every week. In the most active group, 15 out of 90 spend less than an hour, 29 spend 1-3 hours, 26 spend 4-10 hours, and 20 respondents spend more than 10 hours a week. The amount of time spent and the level of commitment increase as the number of projects a participant is involved in increases.
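| Number of projects | Less than one hour | 1-3 hours | 4-10 hours | 10+ hours |
| --- | --- | --- | --- | --- |
| 1 | 32 | 28 | 10 | 0 |
| 2 or 3 | 46 | 34 | 12 | 7 |
| 4 or 5 | 18 | 24 | 18 | 4 |
| 6 + | 15 | 29 | 26 | 20 |
Table 4. Number of hours spent on DC-related activities vs. number of projects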
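The association in Table 4 can be checked with a short calculation. The sketch below is illustrative only: it assumes that the four rows of Table 4 correspond to the project groups of Table 5 (1, 2 or 3, 4 or 5, and 6+) and that the four columns correspond to the time bands of Table 3. That reading is an inference, supported by the fact that the row and column totals reproduce the counts in those two tables.

```python
# Counts transcribed from Table 4. The row and column labels are an
# assumption inferred from the matching totals in Tables 5 and 3.
TIME_BANDS = ["Less than one hour", "1-3 hours", "4-10 hours", "10+ hours"]
TABLE_4 = {
    "1": [32, 28, 10, 0],
    "2 or 3": [46, 34, 12, 7],
    "4 or 5": [18, 24, 18, 4],
    "6+": [15, 29, 26, 20],
}


def row_totals(table):
    """Respondents per project group (should match Table 5)."""
    return {group: sum(counts) for group, counts in table.items()}


def column_totals(table):
    """Respondents per time band (should match Table 3)."""
    return [sum(column) for column in zip(*table.values())]


def share_four_plus_hours(table):
    """Fraction of each project group spending 4 or more hours a week."""
    return {
        group: (counts[2] + counts[3]) / sum(counts)
        for group, counts in table.items()
    }


if __name__ == "__main__":
    print(row_totals(TABLE_4))
    print(dict(zip(TIME_BANDS, column_totals(TABLE_4))))
    for group, share in share_four_plus_hours(TABLE_4).items():
        print(f"{group:>6} project(s): {share:.0%} spend 4+ hours a week")
```

Under that reading, the share of respondents spending four or more hours a week rises from about 14% of those in a single project to about 51% of those in six or more, which is the pattern described in the text.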
For the small core of enthusiasts, involvement has a spin-off pattern. In the 2005 survey, 78% of the respondents had participated in more than one project at one point or another, leading us to conclude that the concept of Distributed Computing appeals to the participants more than any single project and that more often than not, the first project serves as an "entry" project. The usual pattern seems to be of participants being interested in one project and starting with just one. As they participate more actively in the forums, they become aware of the existence of more projects and start trying them out. The reasons for this diversification are many. Sometimes, the project they started on ends. Sometimes, they "help out" by participating in a project to help their team or organization improve its standings in the statistics. At other times, they find another project more interesting and switch allegiances. As in most human activity, most participants become "bored" doing the same thing all the time. Hence, they constantly seek new things to do within Distributed Computing, either within the same project or in a different project. This theme emerged repeatedly in the comments made by respondents of the 2004 qualitative survey on the forums.
| Number of projects | Respondents |
| --- | --- |
| 1 | 70 |
| 2 or 3 | 99 |
| 4 or 5 | 64 |
| 6 + | 90 |
Table 5. Number of projects participants are involved in
The sense of community is important for attracting and retaining volunteers. As Wellman has noted: "Communities are networks of interpersonal ties that provide sociability, support, information, a sense of belonging and social identity" (2001, p. 1). As one participant in a forum on Ars Technica put it: "Team Vodka Martini became my home." Although they are typically young male professionals working in computing, they are avowedly open and inclusive, the principal bond being that of working and having a love of computers, and then commitment to the project. Learning from each other is one of the dominant themes in discussions on the forums: 248 (77%) of the participants of the 2005 survey said that the gain of technical knowledge motivated them to take part in DC. Similar feelings were reflected in the qualitative responses from the 2004 survey with a typical comment being: "DC has increased my knowledge of pc's 10x [times]."
Thus, the social bonds were critical in maintaining and increasing involvement once the initial entry was made. The social ties contributed to cooperation and competition, building a virtuous circle.
Conclusion
The most active and productive participants in DC exemplify "co-opetition," cooperation through competition within the network organization of a DC project. Primarily driven by a desire to contribute to scientific research, participants' willingness to cooperate is enhanced through feeling that their contribution is important and unique, as manifested in their placement in the statistical tables of contributors in the DC projects. This would not be possible without the Internet and without the role that project websites, teams, and forums play. Recognizing this should help in the design of, recruitment to, and retention of participants in online collaborative projects. Although a high level of technical expertise is displayed by those most active in Distributed Computing, this is not necessary for participation. Similarly, in the wider context of collaborative activity, the majority of participants do not have to be very skilled, but a core of highly skilled participants is required to drive the project forward. This core imparts its knowledge to the less skilled participants, eventually bringing some of them to the same level, and provides the vision and leadership necessary to the success of the collaboration.
The Internet solves the technical challenge of DC and online collaboration in general, and has shown itself to play a key role in the social challenge of inducing participation. The technical challenge of slicing up large projects into pieces and putting them together again is met by the ubiquity of PCs and Internet connectivity. The social aspect, getting the millions of PC owners to participate and to maintain participation, needs continued attention. This study provides further insight that can help to make online collaborations more successful.
A Distributed Computing project is a new type of network collaboration not dependent on geographical or professional linkages. Enabled by and centered on information technology, the most enthusiastic and productive participants display a longevity and loyalty typical of long-time employees and/or scientific collaborators. Distributed Computing has the potential to harness an enormous resource of people and computing power previously untapped, and which is growing all the time as computers get faster and the number of highly trained technological workers increases.
As David Anderson of the Space Sciences Laboratory at Berkeley, home of one of the largest DC projects, argues, the development of Distributed Computing provides "a basis for global communities centered around common interests and goals" (2004, p. 1). This is particularly the case for people interested in scientific research. Most of the respondents from our 2004 qualitative study felt that the online collaboration of Distributed Computing is in its infancy and see it as a tool that has great potential for scientific and humanitarian contributions. Their comments included the following:
"Computer modeling is an important tool in the world of mathematics, medicine and science."
"Everyone should be encouraged to join in on the effort when possible."
"It makes available resources that would otherwise be financially unattainable."
"DC is a great tool to solve various problems but above all a way to get people working together to find solutions to various projects from physics to math to biology."
Several respondents believe that Distributed Computing will move to a different level when a big breakthrough happens with the aid of Distributed Computing itself, and the idea of Distributed Computing as an effective problem solver is communicated worldwide. The initiatives towards Peer-to-Peer computing and Grid Computing that are emerging target the same untapped resource. Some experts believe that the lines between these three will blur and a "global overlay" computer will eventually emerge, causing the Sun Microsystems tagline, "The Network is the Computer," finally to come true.
The study of online collaboration is still in its infancy. The lessons from Distributed Computing can be summarized as follows: (1) the network organization form can encompass millions of unskilled participants around the world; (2) at the same time, a core of skilled and productive participants is critical to success; and (3) the optimal way to harness the core, and potentially the remaining participants, is cooperation through competition, i.e., co-opetition. These are early markers that we can make use of to improve existing collaborations and design future collaborative systems.
Acknowledgments
Anne Holohan's research was funded by the European Commission's FP6 Marie Curie Incoming International Fellowship. Anurag Garg's research was sponsored by Project Wilma, which is funded by the Autonomous Province of Trento.
References
Agre, P.E. (2003). Peer-to-peer and the promise of Internet equality. Communications of the ACM, 46 (2), 39-42.
Anderson, D. (2004, March). Public Computing: Reconnecting People to Science. Retrieved June 26, 2005, from http://boinc.berkeley.edu/boinc2.pdf
Arquilla, J., & Ronfeldt, D. (2001). Social Networks and Netwar. Santa Monica, CA: Rand Corporation.
Beenen, G., Ling, K., Wang, X., Chang, K., Frankowski, D., Resnick, P., & Kraut, R. E. (2004). Using social psychology to motivate contributions to online communities. Proceedings of the 2004 ACM Conference on Computer Supported Cooperative Work.
Benkler, Y. (2002). Coase's penguin, or Linux and the nature of the firm. Yale Law Journal, 112 (3), 339-446.
Brandenburger, A. M., & Nalebuff, B. J. (1996). Co-Opetition: A Revolutionary Mindset That Combines Competition and Co-Operation. Garden City, NY: Doubleday.
Caldwell, C. (n.d.). The largest known prime by year: A brief history. The Prime Pages. Retrieved June 24, 2005 from http://www.utm.edu/research/primes/notes/by_year.html
Castells, M. (1996). The Information Age: Economy, Society and Culture. Volume 1: The Rise of the Network Society. Oxford: Blackwell.
Coase, R. ([1937] 1988). The Firm, the Market and the Law. Chicago: University of Chicago Press.
DeSanctis, G., & Fulk, J. (Eds.). (1999). Shaping Organizational Form: Communication, Connection and Community. Walnut Creek, CA: AltaMira.
Foster, I. (2002). What is the Grid? A three point checklist. GRIDtoday, 1 (6). Retrieved June 24, 2005 from http://www.gridtoday.com/02/0722/100136.html
GP2. (2003, December). The difference between P2P and distributed computing and grid computing. Retrieved June 24, 2005 from http://www.mersenneforum.org/showthread.php?t=1549
Hars, A., & Ou, S. (2001). Working for free? Motivations for participating in open-source projects. International Journal of Electronic Commerce, 6 (2), 25-39.
Healy, K., & Schussman, A. (2003). The ecology of open-source software development. Working Paper. Retrieved June 26, 2005 from http://opensource.mit.edu/papers/healyschussman.pdf
Krishnamurthy, S. (2002). Cave or community? An empirical examination of 100 mature open source projects. First Monday, 7 (6). Retrieved June 24, 2005 from http://www.firstmonday.org/issues/issue7_6/krishnamurthy/index.html
Lakhani, K., & Wolf, R. G. (2003). Why hackers do what they do: Understanding motivation and effort in free/open source software projects. MIT Sloan Working Paper No. 4425-03. Retrieved June 24, 2005 from http://ssrn.com/abstract=443040
Lerner, J., & Tirole, J. (2000). The simple economics of Open Source. National Bureau of Economic Research (NBER), Working Paper 7600. Cambridge, MA: National Bureau of Economic Research.
Lerner, J., & Tirole, J. (2002). Some simple economics of Open Source. Journal of Industrial Economics, 50 (2), 197-234.
Locke, E. A., & Latham, G.P. (2002). Building a practically useful theory of goal setting and task motivation: A 35 year odyssey. American Psychologist, 57 (9), 705-717.
Maslow, A. (1987). Motivation and Personality, 3rd edition. London: HarperCollins Publishers.
Mockus, A., Fielding, R. T., & Herbsleb, J. (2000). A case study of open source software development: the Apache Server. Proceedings of the 22nd International Conference on Software Engineering, 263-272.
Nohria, N., & Eccles, R.G. (1992). Networks and Organizations: Structure, Form, and Action. Boston: Harvard Business School Press.
Soares, F., Silva, L. M., & Silva, J. G. (1998, June). How to get volunteers for web-based metacomputing. Proceedings of Distributed Computing on the Web (DCW '98), Rostock, Germany.
Von Hippel, E. (2001). Innovation by user communities: Learning from open source software. Sloan Management Review, 42 (4), 82-86.
Weber, S. (2004). The Success of Open Source. Cambridge: Harvard University Press.
Wellman, B. (2001). The rise of networked individualism. In L. Keeble (Ed.), Community Networks Online (pp. 17-42). London: Taylor & Francis.
Williamson, O. (1994). Transaction cost economics and organization theory. In N. J. Smelser & R. Swedberg (Eds.), The Handbook of Economic Sociology (pp. 77-107). Princeton, N.J.: Princeton University Press.
Zucker, L. (1986). Production of trust: Institutional sources of economic structure 1840 to 1920. Research in Organizational Behavior, 8, 53-111.
Zucker, L. (1987). Institutional theories of organization. Annual Review of Sociology, 13, 443-464.
Zucker, L., & Darby, M. (1995). Social construction of trust to protect ideas and data in space science and geophysics. National Bureau of Economic Research, Working Paper 5373. Cambridge, MA: National Bureau of Economic Research.
Appendices
Appendix 1
Questionnaire Posted on SurveyMonkey, http://www.surveymonkey.com, January 2005
- Are you
  - Male
  - Female
- Are you between the ages of?
  - 0-18
  - 19-25
  - 26-49
  - 50+
- Where do you live?
  - Europe
  - Asia
  - Central or South America
  - Australia/New Zealand
  - Africa
  - United States/Canada
- How did you get introduced to/first hear about the first (or only) dc project you participate in?
  - Website of the projects
  - Friend or acquaintance online
  - Friend or acquaintance offline
  - Non-Internet based media
  - Online (newsgroup, cartoon, IRC, non-dc related message boards)
- How many projects have you been/are you involved in?
  - 1
  - 2 or 3
  - 4 or 5
  - 6 plus
- What is the total number of teams you have been or are part of?
  - 1
  - 2 or 3
  - 4 or 5
  - 6 plus
- What motivates you to give resources - your computer power, your time - to the project(s)? (Choosing one of not important/quite important/very important)
  - Contributing to scientific research
  - Not wanting to waste resources
  - Contributing to the statistics/friendly competition
  - Social aspect
  - Being part of something bigger than oneself
  - Gain of technical knowledge.
- How many hours do you spend per week communicating with other participants in DC, e.g. on forums or on IRC?
  - Less than one hour
  - 1-3 hours
  - 4-10 hours
  - 10 plus hours
- Would you participate in Distributed Computing if there was no interaction with other participants (no forums, statistics tables, etc?)?
  - Yes
  - No
- How important are the following goals to you? Please choose one of very important/quite important/not important.
  - Your personal ranking
  - The objective of your chosen project
  - Your team ranking
  - The Contribution to Science that DC makes
Appendix 2
Questionnaire posted on Ars Technica, Free-DC and Mersenneforum in August 2004
- Are you
  - Male
  - Female
- Are you between the ages of:
  - 0-18
  - 18-25
  - 25-50
  - 50+
- Where do you live?
  - United States
  - Europe (specify which country)
  - Asia (specify which country)
  - South America
  - Canada
  - Australasia including Pacific Islands
  - Africa
- How did you get introduced to/first hear about Distributed Computing?
- How many projects are you involved in?
- Are you on any teams? If so, how many? Why did you choose that particular one? Here please consider all teams from the same "organization" say ArsTechnica, or AnandTech or Free-DC as the same team.
- Why do you give resources e.g. your computer power, your time - to the project(s)? Can you describe exactly why this is rewarding for you?
- How did you choose each particular project? What factors influence this decision? (e.g. Project objective, Client suits your needs, Project Management, Server Reliability, Good Team etc.)
- How computer literate are you? Do you consider yourself
  - An expert
  - Knowledgeable
  - A beginner?
- Has distributed computing increased your knowledge of technology?
- Do you know other people involved in the project (i.e. apart from online interaction)?
- How frequently do you typically communicate with them?
  - Daily?
  - Weekly?
  - Monthly?
  - Every few months?
- How many minutes or hours do you spend per visit to any Forum that you participate in?
- How would you describe your relationship with the other members of a) your team b) the forum that team participates in? (Like work colleagues/acquaintances/casual friends/close friends/strangers)
- Do you trust the people you interact with in distributed computing and if so, why?
- How would you feel if you discovered that somebody was not being honest in their statistics? Will you report it, confront him/her or not worry too much about it?
- Why is distributed computing important to you?
- Do you think is distributed computing important to society and if so, why?
Appendix 3
List of Distributed Computing Websites Referenced in this Article
- Ars Technica Distributed Computing Arcana: http://episteme.arstechnica.com/eve/ubb.x?a=frm&s=50009562&f=122097561
- Ars Openforum: http://episteme.arstechnica.com/
- Ars Technica Main Website: http://arstechnica.com
- Free-DC Forum: http://www.free-dc.org/forum/
- Mersenne.org Forum: http://mersenneforum.org
- Folding@Home: http://folding.stanford.edu
- Distributed.net: http://www.distributed.net/
- Great Internet Mersenne Prime Search (GIMPS): http://www.mersenne.org
- SETI@Home: http://setiathome.berkeley.edu
About the Authors
Anne Holohan received a Ph.D. in Sociology from the University of California, Los Angeles in 2002. She is currently a Marie Curie International Post-Doctoral Fellow in the Department of Sociology and Social Research at the University of Trento, Italy. Her research focuses on inter-organizational cooperation and the role of information and communication technologies in network organizations. Her book, Networks of Democracy: Lessons from Kosovo for Afghanistan, Iraq and Beyond, was published by Stanford University Press in 2005.
Address: Department of Sociology, University of Trento, Via Verdi 26, Trento 38100 Italy
Anurag Garg received his Ph.D. in Computer Science from the University of California, Los Angeles in 2003. He is currently a Post-Doctoral Fellow in the Department of Computer Science and Telecommunications at the University of Trento, Italy. His research interests include computer networks, trust management on the Internet, and edge computing.
Address: Department of Computer Science and Telecommunications, University of Trento, Via Sommarive 14, Povo 38050 Italy