Wednesday, June 06, 2007

Open Source Assessment

This has come up in a couple of places lately, and I'd like to get the concept down on paper (as it were) so people have a sense of what I mean when I talk about 'open source assessment'.

The conversation comes up in the context of open educational resources (OERs). When posed the question in Winnipeg regarding what I thought the ideal open online course would look like, my eventual response was that it would not look like a course at all, just the assessment.

The reasoning was this: were students given the opportunity to attempt the assessment, without the requirement that they sit through lectures or otherwise proprietary forms of learning, then they would create their own learning resources.

Certainly, educational institutions could continue to offer guidance and support - professors, for example, could post guides and resources - but these would not constitute any sort of required reading, and could indeed be taken by students and incorporated into their own support materials.

This is the sort of system I have been talking about when I talk about open educational resources. Instead of envisioning a system that focuses on producers (such as universities and publishers) who produce resources that consumers (students and other learners) consume, we think of a system where communities produce and consume their own resources.

So far so good. But where does this leave assessment? It remains a barrier for students. Even where assessment-only processes are in place, it costs quite a bit to access them, in the form of examination fees. So should knowledge be available to everyone, and credentials only to those who can afford them? That doesn't sound like a very good solution.

In Holland I encountered a person from an organization that does nothing but test students. This is the sort of thing I long ago predicted (in my 1998 Future of Online Learning) so I wasn't that surprised. But when I pressed the discussion the gulf between different models of assessment became apparent.

Designers of learning resources, for example, have only the vaguest indication of what will be on the test. They have a general idea of the subject area and recommendations for reading resources. Why not list the exact questions, I asked? Because students would just memorize the answers, I was told. I was unsure how this varied from the current system, except for the amount of stuff that must be memorized.

As I think about it, I realize that what we have in assessment is now an exact analogy to what we have in software or learning content. We have proprietary tests or examinations, the content of which is held to be secret by the publishers. You cannot share the contents of these tests (at least, not openly). Only specially licensed institutions can offer the tests. The tests cost money.

There is a range here. Because a widespread test like the SAT is hard to keep secret, various training materials and programs exist. The commercial packages give students who can afford them an advantage. Other tests, which are more specialized, are much more jealously guarded.

There are several things at work here:

- first, the openness of the tests. Without a public examination of the questions, how can we be sure they are reliable? We are forced to rely on 'peer reviews' or similar closed and expert-based evaluation mechanisms.

- second, the fact that they are tests. It is not clear that offering tests is the best way to evaluate learning. Just like teaching has for generations depended on the lecture, so also assessment has for generations depended on the test. If the system were opened up, would we see better post-industrial mechanisms of assessment?

- third, there is the question of who is doing the assessing. Again, the people (or machines) that grade the assessments work in secret. It is expert-based, which creates a resource bottleneck. The criteria they use are not always apparent (and there is no shortage of literature pointing to the randomness of the grading). There is an analogy here with peer-review processes (as compared to recommender system processes).

- fourth, the testing industry is a closed market. Universities and colleges have a virtual monopoly over degrees. Other certifications are similarly based on a closed network of providers. This creates what might be considered an artificial scarcity, driving up the cost.

The proposition here is that, if the assessment of learning becomes an open, and community, enterprise, rather than closed and proprietary, then the cost of assessment would be reduced and the quality (and fairness) of assessment would be increased, thus making credentialing accessible.

We now turn to the question of what such a system would look like. Here I want to point to a line of demarcation that will characterize future debate in the field.

What constitutes achievement in a field? What constitutes, for example, 'being a physicist'? As I discussed a few days ago, it is not reducible to a set of necessary and sufficient conditions (we can't find a list of competences, for example, or course outcomes, etc., that will define a physicist).

This is important, of course, because there is a whole movement in development today around the question of competences. The idea here is that accomplishment in specific disciplines - first-year math, say - can be characterized as mastery of a set of competences.

This is a reductive theory of assessment. It is the theory that the assessment of a big thing can be reduced to the assessment of a set of (necessary and sufficient) little things. It is a standards-based theory of assessment. It suggests that we can measure accomplishment by testing for accomplishment of a predefined set of learning objectives.

Left to its own devices, though, an open system of assessment is more likely to become non-reductive and non-standards based. Even if we consider the mastery of a subject or field of study to consist of the accomplishment of smaller components, there will be no widespread agreement on what those components are, much less how to measure them or how to test for them.

Consequently, instead of very specific forms of evaluation, intended to measure particular competences, a wide variety of assessment methods will be devised. Assessment in such an environment might not even be subject-related. We won't think of, say, a person who has mastered 'physics'. Rather, we might say that they 'know how to use a scanning electron microscope' or 'developed a foundational idea'.

While assessment in a standards-based system depends on measurement, in a non-reductive system accomplishment in a discipline is recognized. The process is not one of counting achievements but rather of seeing that a person has mastered a discipline.

We are certainly familiar with the use of recognition, rather than measurement, as a means of evaluating achievement. Ludwig Wittgenstein is 'recognized' as a great philosopher, for example. He didn't pass a series of tests to prove this. Mahatma Gandhi is 'recognized' as a great leader. We didn't count successful election results or measure his economic output to determine his stature.

In a more mundane manner, professors typically 'recognize' an A paper. They don't measure the number of salient points made nor do they count spelling errors. This is the purpose of an oral exam at the end of a higher degree program. Everything else is used to create hurdles for the student to pass. But this final process is one of 'seeing' that a person is the master of the field they profess to be.

What we can expect in an open system of assessment is that achievement will be in some way 'recognized' by a community. This removes assessment from the hands of 'experts' who continue to 'measure' achievement. And it places assessment into the hands of the wider community. Individuals will be accorded credentials as they are recognized, by the community, to deserve them.

How does this happen? It breaks down into two parts:

- first, a mechanism whereby a person's accomplishments may be displayed and observed.

- second, a mechanism which constitutes the actual recognition of those accomplishments.

We have already seen quite a bit of work devoted to the first part. We have seen, for example, descriptions of the creation of e-portfolios, intended as a place where a person can showcase their best work.

The concept of the portfolio is drawn from the artistic community and will typically be applied in cases where the accomplishments are creative and content-based. In other disciplines, where the accomplishments are more a matter of developing skills than of producing creations, they will look more like the completion of tasks, like 'quests' or 'levels' in online games, say.

Over time, a person will accumulate a 'profile' (much as described in 'Resource Profiles'). We can see this already in systems like Yahoo Games, where an individual's profile lists the games they play and the tournaments they've won.

For the most part, recognition will be informal rather than formal. People can look at the individual's profile and make a direct assessment of the person's credentials. This direct assessment may well replace the short-hand we use today, in the form of degrees.

In other cases, the evaluation of achievement will resemble more a reputation system. Through some combination of inputs, from a more or less defined community, a person may achieve a composite score called a 'reputation'. This will vary from community to community. The score will never be the final word (especially so long as such systems can be gamed) but can be used to identify leaders in a field. Technorati's 'authority' system is a very crude and overly global attempt to accomplish such a thing.
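To make the idea of a composite score concrete, here is a minimal sketch of one way such a 'reputation' might be computed. It is only an illustration, not a description of how Technorati or any real system works. The names `Endorsement` and `reputation`, and the choice of a weighted average, are my own assumptions for the sake of the example; a real community would choose its own inputs and its own way of combining them.

```python
from dataclasses import dataclass


@dataclass
class Endorsement:
    """One community member's view of a person's work (hypothetical structure)."""
    endorser: str   # identifier of the community member giving the input
    weight: float   # how much this community trusts the endorser (assumed >= 0)
    rating: float   # the endorser's rating of the person's work, from 0.0 to 1.0


def reputation(endorsements):
    """Combine endorsements into one composite score using a weighted average.

    Returns 0.0 when there are no endorsements or all weights are zero.
    """
    total_weight = sum(e.weight for e in endorsements)
    if total_weight == 0:
        return 0.0
    return sum(e.weight * e.rating for e in endorsements) / total_weight


# Example: the same ratings yield different scores in communities
# that weight their members differently.
community_a = [Endorsement("ana", 1.0, 1.0), Endorsement("ben", 1.0, 0.0)]
community_b = [Endorsement("ana", 3.0, 1.0), Endorsement("ben", 1.0, 0.0)]
score_a = reputation(community_a)
score_b = reputation(community_b)
```

The two example communities give the same person different scores, which is the point: the number depends on who is doing the recognizing, and it remains only one input into a judgment, not the final word.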

In still other cases, organizations - such as universities, professional associations, governments and companies - may grant specific credentials. In such cases, the person may put forward their portfolios and profiles for consideration for the credential. This will be a public process, with everyone able to view the presentation. Institutions will be called to account for what the public may view to be fair or unfair assessments. Institutions will, over time, accumulate their own reputations. The value of a degree will not be based on its cost, as is the case currently, but on the level of achievement required.

Most of the latter part of this post consists of speculation, based on models we have already seen implemented on the web. But the speculation nonetheless points to a credible alternative to proprietary testing systems. Specifically:

- credentials are not reduced to necessary and sufficient conditions (competences). Any body of achievement may be posited as evidence for a credential.

- these bodies of achievement - profiles and portfolios - result from interactions with a wide range of agencies and represent a person's creative and skill-based capacities.

- considerations of these achievements for credentials are open, that is, the public at large may view the profiles and portfolios being accepted, and rejected, for given credentials.

- there is no monopoly on the offering of credentials; any agency may offer credentials, and credibility of the agency will be based on the fairness of the process and the difficulty of the achievement.

Yes, this is a very different picture of assessment than we have today. It replaces a system in which a single set of standards was applied to the population as a whole. This was an appropriate system when it was not possible for people to view, and assess, a person's accomplishments directly. No such limitation will exist in the future, and hence, there is no need to continue to judge humans as 'grade A', 'grade B' and 'grade C'.