Wednesday, June 06, 2007

Open Source Assessment

This has come up in a couple of places lately, and I'd like to get the concept down on paper (as it were) so people have a sense of what I mean when I talk about 'open source assessment'.

The conversation comes up in the context of open educational resources (OERs). When posed the question in Winnipeg regarding what I thought the ideal open online course would look like, my eventual response was that it would not look like a course at all, just the assessment.

The reasoning was this: were students given the opportunity to attempt the assessment, without the requirement that they sit through lectures or otherwise proprietary forms of learning, then they would create their own learning resources.

Certainly, educational institutions could continue to offer guidance and support - professors, for example, could post guides and resources - but these would not constitute any sort of required reading, and could indeed be taken by students and incorporated into their own support materials.

This is the sort of system I have been talking about when I talk about open educational resources. Instead of envisioning a system that focuses on producers (such as universities and publishers) who produce resources that consumers (students and other learners) consume, we think of a system where communities produce and consume their own resources.

So far so good. But where does this leave assessment? It remains a barrier for students. Even where assessment-only processes are in place, it costs quite a bit to access them, in the form of examination fees. So should knowledge be available to everyone, and credentials only to those who can afford them? That doesn't sound like a very good solution.

In Holland I encountered a person from an organization that does nothing but test students. This is the sort of thing I long ago predicted (in my 1998 Future of Online Learning) so I wasn't that surprised. But when I pressed the discussion the gulf between different models of assessment became apparent.

Designers of learning resources, for example, have only the vaguest indication of what will be on the test. They have a general idea of the subject area and recommendations for reading resources. Why not list the exact questions? I asked. Because students would just memorize the answers, I was told. I was unsure how this differed from the current system, except in the amount of material that must be memorized.

As I think about it, I realize that what we have in assessment is now an exact analogy to what we have in software or learning content. We have proprietary tests or examinations, the content of which is held to be secret by the publishers. You cannot share the contents of these tests (at least, not openly). Only specially licensed institutions can offer the tests. The tests cost money.

There is a range here. Because a widespread test like the SAT is hard to keep secret, various training materials and programs exist. The commercial packages give students who can afford them an advantage. Other tests, which are more specialized, are much more jealously guarded.

There are several things at work here:

- first, the openness of the tests. Without a public examination of the questions, how can we be sure they are reliable? We are forced to rely on 'peer reviews' or similar closed and expert-based evaluation mechanisms.

- second, the fact that they are tests. It is not clear that offering tests is the best way to evaluate learning. Just as teaching has for generations depended on the lecture, so assessment has for generations depended on the test. If the system were opened up, would we see better post-industrial mechanisms of assessment?

- third, there is the question of who is doing the assessing. Again, the people (or machines) that grade the assessments work in secret. It is expert-based, which creates a resource bottleneck. The criteria they use are not always apparent (and there is no shortage of literature pointing to the randomness of the grading). There is an analogy here with peer-review processes (as compared to recommender system processes).

- fourth, the testing industry is a closed market. Universities and colleges have a virtual monopoly over degrees. Other certifications are similarly based on a closed network of providers. This creates what might be considered an artificial scarcity, driving up the cost.

The proposition here is that, if the assessment of learning becomes an open, and community, enterprise, rather than closed and proprietary, then the cost of assessment would be reduced and the quality (and fairness) of assessment would be increased, thus making credentialing accessible.

We now turn to the question of what such a system would look like. Here I want to point to a line of demarcation that will characterize future debate in the field.

What constitutes achievement in a field? What constitutes, for example, 'being a physicist'? As I discussed a few days ago, it is not reducible to a set of necessary and sufficient conditions (we can't find a list of competences, for example, or course outcomes, etc., that will define a physicist).

This is important, of course, because there is a whole movement in development today around the question of competences. The idea here is that accomplishment in specific disciplines - first-year math, say - can be characterized as mastery of a set of competences.

This is a reductive theory of assessment. It is the theory that the assessment of a big thing can be reduced to the assessment of a set of (necessary and sufficient) little things. It is a standards-based theory of assessment. It suggests that we can measure accomplishment by testing for accomplishment of a predefined set of learning objectives.

Left to its own devices, though, an open system of assessment is more likely to become non-reductive and non-standards based. Even if we consider the mastery of a subject or field of study to consist of the accomplishment of smaller components, there will be no widespread agreement on what those components are, much less how to measure them or how to test for them.

Consequently, instead of very specific forms of evaluation, intended to measure particular competences, a wide variety of assessment methods will be devised. Assessment in such an environment might not even be subject-related. We won't think of, say, a person who has mastered 'physics'. Rather, we might say that they 'know how to use a scanning electron microscope' or 'developed a foundational idea'.

While assessment in a standards-based system depends on measurement, in a non-reductive system accomplishment in a discipline is recognized. The process is not one of counting achievements but rather of seeing that a person has mastered a discipline.

We are certainly familiar with the use of recognition, rather than measurement, as a means of evaluating achievement. Ludwig Wittgenstein is 'recognized' as a great philosopher, for example. He didn't pass a series of tests to prove this. Mahatma Gandhi is 'recognized' as a great leader. We didn't count successful election results or measure his economic output to determine his stature.

In a more mundane manner, professors typically 'recognize' an A paper. They don't measure the number of salient points made nor do they count spelling errors. This is the purpose of an oral exam at the end of a higher degree program. Everything else is used to create hurdles for the student to pass. But this final process involves one of 'seeing' that a person is the master of the field they profess to be.

What we can expect in an open system of assessment is that achievement will be in some way 'recognized' by a community. This removes assessment from the hands of 'experts' who continue to 'measure' achievement. And it places assessment into the hands of the wider community. Individuals will be accorded credentials as they are recognized, by the community, to deserve them.

How does this happen? It breaks down into two parts:

- first, a mechanism whereby a person's accomplishments may be displayed and observed.

- second, a mechanism which constitutes the actual recognition of those accomplishments.

We have already seen quite a bit of work devoted to the first part. We have seen, for example, the creation of e-portfolios, intended as a place where a person can showcase their best work.

The concept of the portfolio is drawn from the artistic community and will typically be applied in cases where the accomplishments are creative and content-based. In other disciplines, where accomplishments consist more in the development of skills than in the production of creations, they will resemble more the completion of tasks - 'quests' or 'levels' in online games, say.

Eventually, over time, a person will accumulate a 'profile' (much as described in 'Resource Profiles'). We can see this already in systems like Yahoo Games, where an individual's profile lists the games they play and the tournaments they've won.

For the most part, recognition will be informal rather than formal. People can look at the individual's profile and make a direct assessment of the person's credentials. This direct assessment may well replace the short-hand we use today, in the form of degrees.

In other cases, the evaluation of achievement will resemble more a reputation system. Through some combination of inputs, from a more or less defined community, a person may achieve a composite score called a 'reputation'. This will vary from community to community. The score will never be the final word (especially so long as such systems can be gamed) but can be used to identify leaders in a field. Technorati's 'authority' system is a very crude and overly global attempt to accomplish such a thing.

In still other cases, organizations - such as universities, professional associations, governments and companies - may grant specific credentials. In such cases, the person may put forward their portfolios and profiles for consideration for the credential. This will be a public process, with everyone able to view the presentation. Institutions will be called to account for what the public may view to be fair or unfair assessments. Institutions will, over time, accumulate their own reputations. The value of a degree will not be based on its cost, as is the case currently, but on the level of achievement required.

Most of the latter part of this post consists of speculation, based on models we have already seen implemented on the web. But the speculation nonetheless points to a credible alternative to proprietary testing systems. Specifically:

- credentials are not reduced to necessary and sufficient conditions (competences). Any body of achievement may be posited as evidence for a credential.

- these bodies of achievement - profiles and portfolios - result from interactions with a wide range of agencies and represent a person's creative and skill-based capacities.

- considerations of these achievements for credentials are open, that is, the public at large may view the profiles and portfolios being accepted, and rejected, for given credentials.

- there is no monopoly on the offering of credentials; any agency may offer credentials, and the credibility of the agency will be based on the fairness of the process and the difficulty of the achievement.

Yes, this is a very different picture of assessment than we have today. It replaces a system in which a single set of standards was applied to the population as a whole. This was an appropriate system when it was not possible for people to view, and assess, a person's accomplishments directly. No such limitation will exist in the future, and hence, there is no need to continue to judge humans as 'grade A', 'grade B' and 'grade C'.


  1. Wow, this is a pretty remarkable and rigorous challenge to how we might approach a common space of "competencies" (to use a loaded term) through a distributed teaching and learning network. This notion of open source assessment is fascinating in that it re-invents the ways in which we worship iconic letter grades and builds something meaningful in their place - like a portfolio of learning. Schools in the States like UCSB, Reed College, and Bowling Green (I believe) have played with the portfolio concept as an alternative to grades, but the PLE (or however you term this) makes such an enticing approach to reflecting a learning experience that much more possible in terms of labor and sharing. More often than not, a portfolio system in disciplines other than art was kiboshed due to the intensive labor placed on any one expert. But, as you say, what if the labor of a distributed community of "experts" made this that much easier and relevant? This is a really powerful set of re-conceptualizations you're working through here Stephen, bravo!

  2. Much of this echoes quite a few of my thoughts on assessment.

    I hope you don't mind a quick question or two so I can get a clearer picture of what you envisage.

    There's the stuff that falls between competencies and things that can easily be presented as accomplishments. For instance, I'm very good at solving mathematical problems. At the point when I finished my degree, the only proof I had of that was my examination results and the reputation of the difficulty of Oxford finals. How do you see evidence for something like that fitting into your system?

    Also, in practice, is it possible that providing a meaningful credential for some things is such a time-consuming or specialist task that the only people who will do so are those essentially paid by the person in question to do so anyway?

  3. Making Questions Available

    When I teach at the university level, the students are given a list of 40 questions before the exam.

    At exam time they are asked to answer 20 of the 40 questions.

    The students have to explore the complete course materials to find the answers. There is more material to read than can be easily assimilated at the students' level.

    From what I have seen, giving out the questions works well in the type of courses I teach (Software Engineering, Project Management, Quality Management, etc.).

    The students wouldn't really need to show up in class to pass the exam but they do show up and rarely refer to the already disclosed questions.

    There is a small learning curve for the teacher when using this approach. You get better at this approach with time. Students seem to like it.


  4. Hi Stephen,

    Thanks for some interesting ideas and futuring.
    What you have suggested would decimate traditional universities in their current configuration, however, if the universities altered their role to that of diagnosticians and cognitive coaches, perhaps there is still a role for them.
    Back in 2004 I suggested that universities needed to change, and this is even more the case three years later.

    The assessment does bother me though and I suspect that professional and peak industry bodies may well become those who develop and administer assessment in the future rather than an amalgam of community and interested stakeholders.

    Thanks again for your thought provoking article.



  5. "As I think about it, I realize that what we have in assessment is now an exact analogy to what we have in software or learning content. We have proprietary tests or examinations, the content of which is held to be secret by the publishers."

    Whilst understanding the thrust of this post and agreeing with much of it, I think that in your generalisations you misrepresent some of what is happening in practice.

    By no means all assessment is based on "proprietary tests or examinations, the content of which is held to be secret by the publishers". Much learning and subsequent assessment uses an inquiry led approach where there is no 'answer' as such but work is evaluated against criteria that are not only shared but discussed between students and their teachers/facilitators, etc. This is particularly the case in programmes that are work focussed/integrated.

    There are hot debates about what these criteria should be, but in terms of mastery of a specific domain it isn't the subject knowledge that is being assessed but things like students' 'critical thinking skills'; their ability to be critically reflective, problem solvers, analytical, make an impact on the workplace/colleagues, etc.

    Cheers, Stephen.

  6. Hi Stephen. As part of a Grad Cert in Teaching & Learning I undertook in 2005, I suggested that if we were to get serious about effective teaching and assessment we ought to run the program in reverse.

    On their first day, the new students would undertake their 'final exam' for the year.

    They'd then get their results. Along with their results they'd get the course program that would help them fill in any gaps in their understanding which would include a program of lectures, course resources, practise sessions and so on.

    The student could then align their studentship specifically to those events/sessions/resources that help them gain the level of understanding required to pass the subject year. That way their time would be spent in classes etc. doing the stuff they need to know (according to the institution) in order to pass, rather than rehashing stuff they are already competent at.

    Oh and if they pass the end of year exam on that first day? They've finished for the year and can start on year two subjects! That'd give them more time to spend on the stuff they really need to learn.

    Marcus Barber

  7. Assessment typically leads to accreditation, which is the competitive edge that "official" institutions have over informal competition. If we can find ways to make peer assessment work a lot of the institutional scaffolding will come down.

    I believe we will see two things happening in this area:

    (1) Increased competition in the accreditation space from commercial providers.

    In South Africa, the private sector supported a business university for undergraduate students. Since the university was not able to award degrees initially, its students received official accreditation from the existing distance education institution. Yet, because employers knew that these students enjoyed a much richer education experience than those at the distance education institution, their degrees counted a lot more. I believe that more firms will get involved indirectly in similar arrangements.

    (2) Public reputation will replace paper

    Similar to free and open source software projects, students will do increasing amounts of their learning online, in ways that can be reviewed and scrutinized by others. The trust and peer reputation that emerge from such participation can equally be reviewed online. Accreditation reduces transaction costs of assessing knowledge, but if such assessments can be done quickly and easily on the basis of demonstrated work that is available online - the days of "paper tigers" are numbered.

    I am currently at the UNU/UNESCO conference on the future of higher education, and I wish there were more stimulating ideas like this one to discuss.

  8. I wrote an article about this hypothesized system more than a year ago. Only recently I picked up the subject again, because I might turn to researching this subject. I wrote a draft proposal, which can be found on my website (About me/Study & Work/Reports: Proposed PhD Thesis).

  9. Stephen,

    This piece has been important to several strands of our thinking. Now they are coming together as we try to think about a grade book outside of WebCT -- and in the process are transforming both the grade book and our thinking about it.

  10. As a product of H.E. institutions that embraced constructivist pedagogies and having spent a decade in Los Angeles building video games, I think electronic games provide a model for how open source assessment could work.

    At the start of a game the rules and mission are clearly spelled out so that the gamer can weigh their personal strengths and weaknesses against the challenge and develop a strategy for meeting it.

    In this vein I propose that the key elements of open assessment are a) a clear goal b) transparent and consistent rules provided at the start c) multi-paths to the goal.

  11. This post contains a number of interesting ideas. I'm still not sure why a community would want to assess the work. You might have a tragedy of the commons-type situation with the assessment.

  12. I do agree with much of what is being said here - certainly the idea of reducing the glories of any discipline to a set of competencies has always struck me as absurd. And you're right that really successful people are indeed judged to be so by their peers. But that raises all sorts of questions about the values of the peer community, pressures to conform, pressures to keep on producing, especially in a capitalist economy based on the protestant work ethic. It would take a brave student to do something different in this kind of environment, and I think that might be the source of Matt's "tragedy of the commons".

    The current system falls short in many of the ways you've illustrated, but one of its small kindnesses is that it does give students a break from time to time. It would be interesting to think of a way of incorporating systems.


I welcome your comments - I'm really sorry about the moderation, but Google's filters are basically ineffective.