Workshop on Mass Collaboration - Day One

Introduction to the Workshop - Ulrike Cress

Why a workshop on mass collaboration? Recent mass phenomena: Wikipedia, tagging, blogging, Scratch, massive open online courses and connectivism, citizen science, maker spaces.

Who is creating these? Nature article on massively collaborative mathematics (see the wiki PolyMath).

How do we describe these phenomena? Is it just aggregation? What role does coordination play? Is it a mass phenomenon? Is it an emergent phenomenon? Is it collective intelligence? And what are the processes behind this?

In science we need new methods for this. Previously, we would passively observe - but now we want people interacting. We have to try and find what these methods can do.

Can we design mass collaboration? Is it just something that self-starts, or can we create this?

At CSCL 2013, we brought together people to talk about this. This led to the larger workshop we are hosting today.

A Brief History of Mass Collaboration
Allan Collins, Northwestern University

Homo sapiens traded with others many miles away, while Neanderthals did not. Trading leads to specialization, which is the first basis for mass collaboration - it leads to people getting better at producing things, via division of labour. This altogether creates a virtuous cycle of increasing trade, specialization and learning.

The next major development is the development of cities. Geoffrey West - when creativity is measured by patents, by researchers, etc., a city 10 times larger is 17 times as creative. Examples of hotbed communities include Cremona, Hollywood and Silicon Valley.
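The '10 times larger, 17 times as creative' figure can be read as a scaling exponent. The sketch below is only an illustration: it assumes creativity follows a power law in city size (an assumption of this sketch, not something stated in the talk), and the names `beta` and `creativity_multiplier` are made up here.

```python
import math

# Assumption: creativity Y scales with city size N as Y = c * N**beta.
# The talk gives one data point: 10x the size -> 17x the creativity.
size_ratio = 10.0
creativity_ratio = 17.0

# Solve 17 = 10**beta for beta.
beta = math.log(creativity_ratio) / math.log(size_ratio)


def creativity_multiplier(ratio, exponent=beta):
    """How many times more creative a city `ratio` times larger would be."""
    return ratio ** exponent


print(round(beta, 2))                     # exponent above 1 means superlinear growth
print(round(creativity_multiplier(100)))  # a city 100 times larger, under the same assumption
print(creativity_ratio / size_ratio)      # per-person gain for the 10x city
```

An exponent above 1 is what makes the growth 'superlinear': each resident of the larger city is, on this measure, 1.7 times as productive.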

Marshall's theory of hot spots: these areas exist for three reasons:
- pooled resources - workers and firms are drawn to these places
- specialized products and services - for example, hairdressers and agents in Hollywood
- 'ideas in the air' - information and skills flow easily between people

Brown and Duguid on Silicon Valley - 'The Social Life of Information' - there are these 'networks of practice' across firms, eg. software engineers, LAN designers, etc. So knowledge flows across firms to find its most successful niche. Even where there are failures, that seeds different companies in the Valley. People can see what's doing, what's doable, and what's being done.

The third major development was the invention of writing and printing. Writing allows you to share ideas at a great distance and hand down ideas to later generations. It is what makes 'study' possible (Walter Ong) which is critical to science, history and geography. Printing led to the spread of books and universal literacy.

The fourth development, the world scientific community, created a set of norms and a set of structures. For example, scientific meetings like this foster interaction among scientists. It produced scientific journals to spread findings and ideas. And it produced government support, because the more science you have the more invention you have. Scientific norms include: objectivity, replication, equal standing, and sharing of data.

Clay Shirky, in 'Here Comes Everybody', argues that the internet and the web are making it much more possible for all sorts of new ways to collaborate to occur. Here's the list:
- web communities: Xanga, fan fiction, Scratch
- collaboratories - share tools, data, designs
- digital libraries: videos, satellite data, models
- public repositories: Flickr, YouTube, Wikipedia
- collective action: Twitter, Facebook
- crowdfunding: ArtistShare, Kickstarter
- MOOCs: Coursera, Udacity, edX
- Games: Foldit
- Open source: Linux, Wikipedia

Shirky makes the point that weird and risky ideas have a much better chance of taking hold and that 'collaborative entrepreneurs' like Linus Torvalds and Jimmy Wales can succeed. This process, he writes, undermines existing hierarchy - eg. 'Voice of the Faithful', which responded to the Church's suppression of reports of priests molesting young people - it is something that only happens in this particular communications world.

I would like to think the ideas of Seymour Papert and Ivan Illich are now more likely to be brought to fruition - for example, the 'Samba Schools' where people teach each other in Brazil. Schooling segregates us, it creates peer culture. Illich wrote of (a) resources everybody could have a hold of, which is what collaboratories enable; (b) skill exchanges where people who had expertise can share it with others; (c) peer communities.

As you get more and more collaboration, you get a speed-up of innovation. This leads to what Toffler called Future Shock.

Gerhard Fischer, Center for LifeLong Learning & Design, University of Colorado, Boulder

There are two types of problem transcending individual human minds or individual disciplines:
- problems of magnitude which individuals and even large teams cannot solve and require a contribution from all stakeholders, for example, Scratch, Wikipedia, 3D-Warehouse
- problems of a systemic nature requiring the collaboration of many different minds from different backgrounds, for example, approaches to lifelong learning, aging population, etc.

For example, an article from Alex Pentland (MIT) in Der Spiegel asks the question: where does mass collaboration start? At six people, at 60 people? Units with more people than the size of a mid-sized city are difficult to organize with today's technology.

So, what is the research methodology here: from how things are, to how things should be. Marx: philosophers interpret the world in different ways, but what matters is to change it. So, how do we study how things could be? This requires theoretical frameworks that go beyond anecdotal examples and help to interpret data in order to understand the context- and application-specific nature of mass collaboration.

People have employed new media in learning organizations as gift-wrapped around our existing understanding of learning and education. But we need a rethink: "distance education is not learning in a classroom at a distance." We evolve new forms of learning. For example, we define rich multidimensional landscapes of learning. Eg. 'how' - in a classroom the instructionist domain scales well, but problem-based learning does not scale. (Say).

Looking at the shift from consumer cultures to cultures of participation, where everyone can participate in personally meaningful problems. The Encyclopedia Britannica is an example of consumer culture; but Wikipedia is an example of a culture of participation (of course, we should differentiate between different levels of participation). Other examples: iTunesU, YouTube, Encyclopedia, PatientsLikeMe, Scratch, Stepgreen, SketchUp and 3D Warehouse.

Two different models for knowledge construction and sharing:
- model-authoritative (you first filter, then you publish - the output filters are not needed because the content is authoritative);
- model-democratic (you publish, and then you filter - you need better output filters to (eg.) find the information).

We had a research project that analyzed the SAP Peer Community Network (SCN). It was designed to help companies know what they know. What is the 'tea time' for mass collaboration networks? Some of the dimensions investigated included:
- responsiveness
- engagement intensity
- role distribution
- reward mechanisms

Another project: the CreativeIT wiki - in which we learned that most wikis are dead on arrival. You see this a lot - they set up a wiki where everyone contributes, but you go back 6 months, and there's nothing there. Our wiki - we put a lot of effort into it, we seeded it, and it did not take off. So we are studying why this is the case.

There are ecologies of participation - we can find clearly identified roles, from project leader and core members out to bug reporters, readers and passive users. (SD - and then there are a series of mechanisms designed to move participants up one level of participation).

So we turn to MOOCs - the hype is that MOOCs will revolutionize higher education. There is both over-hype and under-estimation of MOOCs. So what did MOOCs contribute? They generated discussion transcending the narrow confines of academic circles, they represented an innovative model that shook up models of learning and institutions, and they might force residential research-based universities to focus on core competencies.

But we need frames of reference on MOOCs: many of them are looked at from economic (scale, productivity, cost) and technology perspectives. But another perspective: global versus local. MOOCs can reach out beyond national boundaries. In the US you have miles per gallon, while in Europe it goes litres per 100km. Now if you think about mass collaboration and trying to create a common understanding among people, this is probably a common problem. (SD - yes!)

We worked on the Envisionment and Discovery Collaboratory (EDC) - we created 'reflection spaces' where people could act on what they were reflecting. But we found there were only 12 people around the board - what about mass collaboration? Perhaps a virtual equivalent to the face-to-face meeting? This would fit the mass collaboration paradigm. Then you can have the local one having theirs, but you could collaborate with people in Costa Rica.

So - what are the open issues?
- can there be a lower limit for the number of participants? Is this number context-dependent?
- is there a difference between collaboration, cooperation, coordination, participation, etc.?
- in MOOCs (often with over 10K people) - does any mass collaboration take place among the participants?
- does any mass collaboration take place in Facebook and Twitter? If no, what are the future developments to create mass collaboration?
- is there 'participation overload' in the way there is information overload?
- is active participation in whatever form an absolute prerequisite for mass collaboration?
- are there problems society is facing that make mass collaboration a necessity?
- what is the role of personal idiosyncrasies?

In summary: mass collaboration and education is an important theme for further research.

Mass Collaboration as Co-Evolution of Cognitive and Social Systems
Ulrike Cress

Mass Collaboration is typically presented as an artifact - for example, Wikipedia. So we see mass collaboration as a conflict of two systems, a cognitive system and a social system. The cognitive system is autopoietic - it exists through its own operations, it operates by a process of meaning, and is operationally closed; thoughts build on thoughts.

On the other side, the social system is also seen as a system, but it operates not through thinking, but through communication, which entails a reciprocal understanding. It happens between people, but it is stimulated or irritated by its environment. It as a system tries to make meaning - it processes information, some things become central, other things die out - it's the system that decides over time.

These systems interact - both systems can be an environment for the other, both systems can be irritated by the other. Each can stimulate the other's development. This is co-evolution - both systems develop each other. How do we study these systems?

1. Wikipedia - how it operates, how it builds meaning. For example, the coverage of Fukushima (Daiichi Nuclear Power Plant). Point of reference: the Wikipedia 'norms' - neutral point of view, citations from authority, etc. In the first 9 days: 1200 edits, 213 substantial. 194 had a reference, 19 did not. For the references, some came with the edit, some were added after the fact. There were only 4 deviations from neutral point of view, and these were deleted almost immediately. So the principles were followed.

The quality of the constructed knowledge - by day 9, according to experts, the Wikipedia article was a balanced and objective presentation of what happened (even though most authors had no formal knowledge in engineering or nuclear power). So, laypeople wrote the article, but as a collective the social system could make meaning.

2. What triggers co-evolution? The difference between both systems - how they are able to irritate each other. There must be a difference between the personal and the social system. We ran a test where a person and the system had different 'pro' and 'con' arguments for an issue. What we found was that a middle level of incongruity created the most change (and the most learning).

3. Large-scale study of Wikipedia - to confirm this result. Eg., a domain 'traditional and alternative medicine'. We found about 45,000 articles (via machine learning) - these were being modified by a large number of people. We created article profiles to determine whether articles were more or less in favour of alternative medicine. We could also do the same for the authors. We could thus calculate the incongruity between the author and the article.
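To make the idea of author-article incongruity concrete, here is a minimal sketch. It is not the study's method: the stance scale (-1 for traditional medicine, +1 for alternative medicine), the function names and the band thresholds are all hypothetical choices made for illustration.

```python
def incongruity(author_score, article_score):
    """Absolute gap between two stance scores, each assumed to lie in [-1, 1]."""
    return abs(author_score - article_score)


def band(gap, low=0.5, high=1.5):
    """Label a gap as low, medium or high. Thresholds are arbitrary assumptions."""
    if gap < low:
        return "low"
    if gap > high:
        return "high"
    return "medium"


# Hypothetical profiles: one author leaning towards alternative medicine,
# and three articles with different leanings.
author = 0.75
articles = {"article_a": 0.75, "article_b": 0.0, "article_c": -1.0}

for name, score in articles.items():
    gap = incongruity(author, score)
    print(name, gap, band(gap))
```

On the account given in the talk, the 'medium' case is where an author would be most likely to start editing.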

When authors started working on articles, they were at a middle level of incongruity. So the most productive activity took place at the middle level. It is the incongruity that triggers co-evolution. This created productive heterogeneity. But there's an optimal level of heterogeneity. If a person wants to change the system, he/she must adapt to the system. Hence mass collaboration is not free or not democratic at all. If you bring in an idea that is not accepted by the system you will have no impact on the system.

Individual vs Collaborative Information Processing
Aileen Oeberst

There is a great deal of literature about the benefits of collaboration, but from psychological literature we know that individual information processing is biased. Does this bias level out in collaboration? Or does this translate to collective bias? Or does it become more extreme?

So, for a bit of research, we took as a question, whether individual biases are mirrored in collaboratively authored Wikipedia articles. Wikipedia has of course strict rules such as verifiability and neutral point of view. These rules are intended to prevent bias, and they're pretty good, but there is a concrete counterexample.

The bias is hindsight bias. We say 'we could have foreseen this'. The bias is that your perceptions after the event are different from what they were before the event. In hindsight, the likelihood, inevitability and foreseeability of an event are always increased. Take Fukushima. Or the Turkey coal mine disaster. People try to explain these things, to make meaning out of them. So you selectively focus on information that is consistent with the outcome of the event, and ignore the information that would have spoken for a different outcome.

Hindsight bias has been repeatedly demonstrated. Once you know about it, you see it repeatedly in newspapers. It's widespread, difficult to avoid, and people usually are not aware of it. So it is reasonable to assume that hindsight bias is shared by Wikipedia authors, and that it enters into Wikipedia articles. So is there evidence for hindsight distortion in Wikipedia articles?

The method was to find things where an article existed before the event. For example, there was an article about the Fukushima power plant. The before and after versions were analyzed to ask 'to what extent does the article suggest the event was likely to happen'? The number would be the same if there is no hindsight bias. But there was a significant increase. Of 17 events, 6 or 8 events did not have any tendency at all, while the others demonstrated a range of between factors of 1 to 7. Eg., before the accident, there's a small 'accidents' section; after the event, there was a large section on design issues and risks (mostly from data that existed before the event).

So why select these and not other references? First, because it added to the explanation. But also, there was a selection based on relevance.

Limits: we can't conclude that all of Wikipedia is biased, nor say we've found an overall pattern. There is a substantial number of articles without any tendency.

A second study looked at more events, including both unknown (disasters, etc.) and known (eg., elections) events. Categories included disasters, decisions, elections, personal decisions, etc. The same mechanism for evaluating the events was used. What we see is that only for disasters do we see the significant hindsight bias.

So: there was no hindsight bias based on whether the event was known in advance or not. Nor is there a general hindsight bias. But there was a hindsight bias for disasters. But - in hindsight - this makes sense. They have considerable impact. You would like to prevent them. So they elicit a particular need to be explained. And this creates a search for antecedents.

Future work: to examine whether collaboration increases bias, whether using biased resources increases biases, and to look at other biases (eg., ingroup biases, such as distorted representations of their own group - is there a difference of representation in different language versions in Wikipedia? eg. Manypedia analysis of the 6-day war in different languages).

Wai-Tat Fu - Illinois

From Distributed Cognition to Collective Intelligence

Perspectives, from cognitive science, and from HCI/CSCW.

How do we define success in mass collaboration?

How do we define success for a cognitive system? Perhaps in a competition, eg. Deep Blue versus Kasparov. The outcomes were controversial - Deep Blue did win. But also the process - did the program just do search? A human can evaluate 100 times fewer moves than the machine. But maybe the outcome is not what we like to consider success.

Physical Symbol System - cognitive computations involve operations of symbolic structures in a PSS. How about collaboration? Maybe we can expand from computations inside-the-head to those that involve multiple heads (cf Andy Clark's 'extended mind' theory) (cf also 'the Siemens answer' in connectivism - SD)

So, why does search become important? Cognitive computations *always* involve search. All computations are local in the sense that there's no way what happens here will impact something else. When a local computation needs more information it needs to know where this information is and how to access it. And local symbol structures make heuristic search possible. What matters most is whether you have enough local symbol structures that make such a search possible.

(By putting these terms all in computational terms that means you can actually implement them.)

So - even though you have local computation, you always have to access the symbol structures from a distance ('distal access'). The crux of the argument is this: local to distal access to information. This has to keep on going until you've found the distal symbol structures that you need. This local to distal symbol processing is the key to intelligence.


Chess: from 'Deep Blue' to 'modern' chess programs. They represent a shift from searching through massive numbers of moves, to 'intelligent search' - learning from a lot of patterns. 'Success' is defined by how much searching the computer does *not* have to do.

Web information search: based on representations of web data, so you don't have to search every listing. The agent needs 'knowledge' to choose the right action based on local information, and local information has to have sufficient structures to enable this.

(Video - Aibo robots playing soccer - better 'square robots' actually passing the ball - (Stone and Veloso video))

How do robots know how to do that? There must be some kind of rules to tell them what the others are doing. So what sort of rules should we pay attention to? You have to have some kind of model for what the other person's doing.

So - for collaborative systems - what kind of rules do we want people to have, and what kind of rules might we want to impose?

Successful collaboration involves local to distal heuristic search processes to achieve a set of goals. ie., using local information, each agent needs to infer how to find the information.
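A toy illustration of the local-to-distal idea, counting how much searching an agent does with and without local cues. Everything here is hypothetical: the link graph, the cue values (lower means the link looks closer to the target) and the function names are invented for the sketch, and greedy best-first search is just one simple stand-in for 'heuristic search'.

```python
import heapq
from collections import deque

# Hypothetical link structure: each node lists the nodes it points to.
LINKS = {
    "start": ["a", "b", "c"],
    "a": ["a1", "a2"],
    "b": ["b1", "b2"],
    "c": ["c1", "goal"],
    "a1": [], "a2": [], "b1": [], "b2": [], "c1": [], "goal": [],
}

# Hypothetical local cues: how far from the target each node *looks*.
CUE = {"start": 3, "a": 2, "b": 2, "c": 1,
       "a1": 2, "a2": 2, "b1": 2, "b2": 2, "c1": 1, "goal": 0}


def blind_search(start, goal):
    """Breadth-first search; returns how many nodes were visited."""
    seen, queue, visited = {start}, deque([start]), 0
    while queue:
        node = queue.popleft()
        visited += 1
        if node == goal:
            return visited
        for nxt in LINKS[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None


def cue_guided_search(start, goal):
    """Greedy best-first search: always visit the most promising-looking node."""
    seen, frontier, visited = {start}, [(CUE[start], start)], 0
    while frontier:
        _, node = heapq.heappop(frontier)
        visited += 1
        if node == goal:
            return visited
        for nxt in LINKS[node]:
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (CUE[nxt], nxt))
    return None


print(blind_search("start", "goal"))       # visits every node before the goal
print(cue_guided_search("start", "goal"))  # follows the cues straight to the goal
```

With good local cues the agent visits 3 nodes instead of 10. With poor cues the guided search degrades towards the blind one, which mirrors the talk's point that what matters is whether the local structures are rich enough to support the search.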

Eg. in mass collaboration - we look at how multiple minds collectively 'search' for information, where knowledge is a set of symbol structures that allow efficient search.

Eg. in animals, success depends on how well they can exploit the local-to-distal symbol structures.

Eg. cognitive psychology / science - success depends on how well knowledge is structured.

Eg. sociology - network analysis shows the importance of networks, of nodes and edges, where success depends on crucial network structures.

Another example: Wikipedia. Wikipedia has very good local structures. The problem with Wikipedia is that the structures seem very local - the method for linking is perhaps too stone age - they don't provide much local information for assessing the right distal knowledge. So the challenge is: how to make local structures more coherent and local. Eg., maybe by having individuals provide semantic structures.

An example of such an approach is social tagging. These tags generate local structures, but not enough to help access distal information. And again, the challenge is, how to improve it.

- local-to-distal heuristic search plays a central role in collaboration/cooperation
- social technologies that support collaboration will benefit from design features that facilitate generation of local-distal structures to support distal search

Question: what is 'distal'? We have distal processes, search, structures, etc...
What we have locally is a symbol: the distal is whatever it stands for.
Question: the argument depends on the idea that success is how easy it is to access the thoughts of another person.

Interesting concept: tagging recommendations

Proposition (from Ulrike): intelligence is 'metaknowledge' (not in the sense of 'knowing how to know' but in the sense of having higher order mental structures or representations (but does mass collaboration require some sort of collective intelligence?))

