In 1989 I was reaching the peak of my career. My PhD coursework was complete and behind me, I was gainfully employed (if underpaid) teaching logic and philosophy for the U of A and Athabasca University, I was elected for my first term as president of the Graduate Students' Association, and I was riding a wave of personal and political popularity.
More importantly for me, I was finally understanding the problems that had drawn me to formal learning in the first place. Though I had started university simply because it was a requirement for advancement in the work world, over the years I had been drawn increasingly to political activism and philosophical exploration. In 1989, the pieces came together. I watched the rise of 'people power' around the world. I had seen Francisco Varela speak on AIDS and immunology at the University of Alberta hospital. I began to see how networks, whether of individuals or of cells, could take shape, form patterns, act with purpose. And how this would reshape how we understood the world.
In 1990 I attended, along with a number of the other graduate students at the University of Alberta, the Connectionism conference at Simon Fraser (downtown), combining it with a National Graduate Council meeting and a week-long vacation in New Westminster I spent reinterpreting the Tao Te Ching. That summer I sat at the very top of the hill at the Edmonton Folk Festival and in a frenzy of writing, completed the first draft of what would eventually become The Network Phenomenon: Empiricism and the New Connectionism. In the fall of that year, I presented it to my doctoral committee as a proposal for the work I wished to do to complete my PhD.
It has been almost 20 years, and I thought I had put it behind me, but recently I see that my former supervisor, and chair of that committee, is now one of the people blogging on a philosophy website.
Now you can read the proposal for yourself - that's why I put it online. Having just retyped it (I'll use OCR for my other work, but I wanted to revisit this paper personally) I can see that it is an overly ambitious work covering a wide swath of theory and evidence. As a proposal, it also lacks a lot of the depth and research that one would want of a completed dissertation. Yet, still, 20 years later, the paper strikes me as genuine, original, and important. A dissertation based on this work - or even just a chapter of this work, which might have been more appropriate - would have been a worthwhile contribution to the field.
The committee didn't see it that way. Led by the chair, they engaged in an attack on the basic premises of the work, of the idea of associationist forms of reasoning and connectionist models of cognition. The idea that cognition could be non-propositional, the idea that proof would proceed by metaphor and similarity, rather than form and validity, they rejected as ridiculous. For good measure they offered the opinion that even if the work were worthwhile, it would be well beyond my capability. The committee felt that my PhD would be better spent in an investigation of mental content - something I had denied in the paper even existed! - rather than this fool's errand.
I submitted a dissertation proposal based on mental content a couple of weeks later, a 30-page overview of the field they were quite enthused about. But my heart had fallen out of the project. I wrote for myself a long diatribe attacking a book the committee was enthusiastically recommending, Jerry Fodor's Psychosemantics, called "Trash Fodor" (when I find it, I'll post it). I thought the book represented the epitome of the inanity of the cognitivist approach. I gradually turned my back on the program and on philosophy in general. I retreated to my little cabin in northern Alberta, taught logic, and worked on my computer.
I have never forgotten - or stopped believing - the work I presented in that paper. About five years later I began writing again - you can see it as the beginning of the work on Stephen's Web - and began rebuilding my understanding of learning, inference and discovery. My work continued to be informed by my understanding of connectionism, people power, the Tao, and related concepts. The structure of content networks, the organization of metadata, and my description of connective knowledge, all are based on this basic foundation.
I struggle every day with the question of whether my work is genuine, original and important, whether, indeed, it is even academically and scientifically sound. I look at the work of others - like Varela's, for example - and I am daunted and humbled. But such work, too, is rare. And what I leave behind is so different in format and method and in style and structure a comparison is probably impossible. The best I can do is to work as honestly and as openly as possible, consistent in my practice and my principles.
So when I realized how angry I was, even these many years later, I concluded that the best - and only - response would be to put the material into the open, and let people decide for themselves. Because there's a certain sense in which I feel I have missed out. And I'm sure some people will find it trivial and others obscure, some will find it too dense and others too simplistic, some will see in it a naive foray into amateur epistemology while others will see it as part of a wider discipline. Some will think I should have been able to complete my PhD, while others will question whether I have any academic merit at all. And I - well, I will see it as mine. As me.
Being angry was cathartic, because it made me see what I've had to come through, and I'm over it now.
You know, in life, you have certain kinds of regrets. One kind of regret revolves around the opportunities you never had - what if I had had better schools, better teachers, better jobs, better finances. What if I had been treated fairly here, rewarded justly there, shown this in that place. Things I could never be, places I could never go. These are regrets over things I cannot control. But the other kind of regret - ah. The regret of a man who was not true to himself, who did not give his all, who held himself back or conformed for the sake of advancement, of the man who stopped seeking because he was told what to believe: these are the regrets I could not bear to feel.
I guess I had a choice, back in 1990, about which kind of regret I would feel 20 years later. I do not, for an instant, think I made the wrong choice.
Tuesday, March 24, 2009
TNP: 20 Years On
Posted by Downes at 5:57 AM
Monday, March 23, 2009
TNP 11. Projects and Investigations
The Network Phenomenon: Empiricism and the New Connectionism
Stephen Downes, 1990
(The whole document in MS-Word)
XI. Projects and Investigations
A. Computational Difficulties
Let me now conclude this paper by outlining a number of areas of further investigation which ought to be pursued in order to accomplish the fullest and most useful presentation. These areas divide into two distinct categories: conceptual difficulties and computational difficulties. Let me outline some computational difficulties first.
By computational difficulties I mean aspects of the implementation of connectionist theories on computers. A number of concerns can be raised by viewing connectionism within a philosophical framework and some additional features are required. What I would like to do here is actually build a connectionist system using the C programming language and intended for application on an IBM XT compatible or clone. The large number of options, for example, different learning rules, will be incorporated as options on my own system. This system will fill a void on the market: an easy to use connectionist system which costs less than $1,000.
Having developed a connectionist system (which I'll call SDPDP) I want first to look at network variability in a connectionist system. First of all, I want to construct nets in which different options may be employed in different parts of the same net at the same time. For example, in a PDP net [83] either every unit employs a stochastic on-off activation or every unit is activated in degrees. But in some systems, we want to be able to have units of both varieties. In addition to variable structure, I want to incorporate some mechanisms of network plasticity. For in human systems, not only the connections, but the units themselves grow in response to input, especially in early life. Finally, I want to consider what I call "dimensions". For we want it to be the case that such things as religious conversions and scientific revolutions are possible. This requires that a network be able to construct [alternative] pairs of stable representations at the same time, which may alternate in priority. [84] Each of these alternative representations I call a "dimension".
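To make the idea of mixing unit types within one net concrete, here is a minimal sketch in Python (not the C system proposed above, and not SDPDP itself). The per-unit `kind` option, the logistic squashing function and the synchronous update are all assumptions made for illustration.

```python
import math
import random

def logistic(x, temperature=1.0):
    """Squash a net input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x / temperature))

def update_unit(kind, net_input, rng, temperature=1.0):
    """Return the new activation of one unit.

    kind is a hypothetical per-unit option: "graded" units take the
    logistic value itself; "stochastic" units switch fully on (1.0)
    with that probability and are otherwise off (0.0).
    """
    p = logistic(net_input, temperature)
    if kind == "graded":
        return p
    if kind == "stochastic":
        return 1.0 if rng.random() < p else 0.0
    raise ValueError(f"unknown unit kind: {kind}")

def step(kinds, weights, activations, rng, temperature=1.0):
    """Update every unit at once from the current activations."""
    new = []
    for i, kind in enumerate(kinds):
        net = sum(weights[i][j] * activations[j] for j in range(len(activations)))
        new.append(update_unit(kind, net, rng, temperature))
    return new

# Three units of two varieties in the same net (illustrative weights).
kinds = ["graded", "stochastic", "graded"]
weights = [[0.0, 0.5, -0.5], [0.5, 0.0, 0.5], [-0.5, 0.5, 0.0]]
print(step(kinds, weights, [1.0, 0.0, 1.0], random.Random(0)))
```

Keeping the unit type as data attached to each unit, rather than as a property of the whole net, is what allows both varieties to coexist in a single update pass.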
Another computational problem which I wish to consider concerns learning schedules and annealing. Currently, PDP systems employ a system which is very similar to that employed in physics. But, first, it is not clear that an annealing equation which is suitable for thermodynamics is suitable for human brains. I would like to investigate grounds for choosing one, rather than another, annealing equation. Second, it is clear to me that the annealing schedule employed is inadequate. In my view, temperature increases and decreases ought to be cyclic, as for example patterns of increased brain activity when we sleep. In addition, temperature ought to be sensitive to input, so that we can rapidly process conflicting input.
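One way to picture a cyclic, input-sensitive schedule is the following Python sketch. The cosine cycle, the `conflict` signal and every parameter value are assumptions chosen for illustration, and the acceptance rule is the Metropolis-style rule from simulated annealing, swapped in here as a stand-in rather than taken from any particular PDP system.

```python
import math

def cyclic_temperature(t, base=1.0, amplitude=0.5, period=100,
                       conflict=0.0, gain=2.0):
    """Temperature at time step t.

    base, amplitude, period and gain are illustrative parameters.
    The cosine term raises and lowers temperature cyclically; the
    conflict term (0.0 = inputs agree, 1.0 = inputs fully conflict)
    raises it further so the net can leave its current settled state.
    """
    cycle = amplitude * math.cos(2.0 * math.pi * t / period)
    return max(1e-6, base + cycle + gain * conflict)

def acceptance_probability(energy_increase, temperature):
    """Chance of accepting a change that raises energy by the given amount."""
    if energy_increase <= 0:
        return 1.0
    return math.exp(-energy_increase / temperature)

for t in (0, 25, 50, 75, 100):
    print(t, round(cyclic_temperature(t), 3))
```

At a higher temperature the same uphill change is accepted more often, so a burst of conflicting input temporarily loosens the net before it cools and settles again.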
Finally, there is the hardware itself to think about. Human hardware is much smaller and more complex [than] contemporary computer technology. Perhaps we will not be able to build actual neurons; however, it seems reasonable that, now that we know exactly what we are looking for, we can make some plausible suggestions regarding how to build a computer neuron. I think that it would be best if many of the features currently represented by parameters, for example, threshold or rest values, can be implemented physically.
B. Conceptual Questions
By conceptual questions, I mean investigations into some of the things which connectionism can tell us about epistemology and the philosophy of mind. For, if the arguments concerning rules and categories are sufficiently strong, then we will want to reevaluate such concepts as knowledge and belief. For example, I would like to say that an item of knowledge is a stable pattern of activation, a pattern which tends not to change given varying input. If this is the case, then I may want to say with Feldman that "you do not have a store of knowledge, you are your knowledge." [85] In such a case, then, it becomes necessary to explore what we are and what part of us it is which is our knowledge.
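The notion of a stable pattern of activation can be illustrated with a small Hopfield-style net, used here only as a familiar stand-in; the Hebbian storage rule, the +1/-1 coding and the example pattern are assumptions, not part of the proposal. A stored pattern is stable in the sense that a perturbed input settles back to it.

```python
def train_hebbian(patterns):
    """Build a symmetric weight matrix from +1/-1 patterns (Hebb rule)."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w

def settle(w, state, max_sweeps=20):
    """Update units one at a time until no unit changes."""
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            net = sum(w[i][j] * state[j] for j in range(len(state)))
            new = 1 if net >= 0 else -1
            if new != state[i]:
                state[i] = new
                changed = True
        if not changed:
            break
    return state

stored = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train_hebbian([stored])
noisy = [1, -1, 1, -1, 1, -1, -1, -1]  # one unit disturbed by varying input
print(settle(weights, noisy) == stored)
```

Nothing in the net holds the pattern as a separate stored item; the pattern exists only as the state the connections pull the units toward, which is one way to read Feldman's remark.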
In addition, I want to consider questions concerning theoretical and physical parallelism which arise. For example, through this paper I have used the terms "neuron" and "unit" roughly equivalently. I have also talked of the advisability of using this or that learning equation according to whether or not humans actually employ (or instantiate) the equation. We need to ask, first, whether or not we should design systems in parallel with human neural structure, and if so, what they would look like, and even further, how we would determine what they would look like.
As another investigation, I want to make some remarks about the nature of knowledge (as opposed to the definition of knowledge). For, if knowledge consists of stable patterns of activation then we cannot think in terms of knowledge as being sentences which have a given propositional content. It is unclear whether we can assign propositional content to patterns of activation. If that difficulty does arise, then we may want to consider some other relation between that which would serve as content (for example, representations of events in the real world) and patterns of activation. Here, perhaps, one could follow Armstrong and Goldman and assert that there is a causal connection (and distinguish between appropriate and inappropriate causes). In order to successfully defend this approach, it is necessary to give a full account of how we learn about causes.
Yet another investigation concerns consciousness. I have suggested above that there are conscious and unconscious regions of the brain. My belief is that those regions which are conscious are those which correspond to the activation of sensory input areas. In other words, my hearing someone speak a sentence and my thinking in a sentence is an activation of the same set of neurons (or an overlapping set). This solves the problem of how we can have a "stream of consciousness" [86] in a non-linear network. But a much more detailed story is required here.
Finally, it is worth posing the question of whether connectionism is a type of scientific revolution, in the Kuhnian sense, or whether it is not. Some philosophers, for example Stich and Johnson-Laird, have expressed the opinion that it is not. In my own view, since so many traditional concepts must be overturned, then it is a scientific revolution. Having said that, however, I must ask whether or not we are working within an eliminativist paradigm, as suggested by, say, Churchland, or not. In my view, there is still a role for words such as "knowledge" and "belief". If I believe this, then first I must explain this role, and then show how this role makes sense within the new paradigm.
C. Other Projects
When I began by asserting that connectionism vindicates empiricism, I embarked on a philosophical enterprise. What followed has been primarily technical and non-philosophical. I would like to return to a connectionist treatment of some philosophical issues.
For example, some contemporary philosophers [87] advocate a form of nominalism. While the philosophical debates concerning realism and nominalism are peripheral to this project, it is still the case that connectionism, if successful, should shed some light in this direction. I assume that it would support a form of nominalism, but this should be more fully explained.
Another project of a philosophical nature concerns the foundationalism-coherence debate. If we employ relevant similarity instead of truth-preservation as a means of evaluating inference then the traditional concept of justification, if it must not be abandoned altogether, must be radically altered. This sheds a completely new light on the traditional problem and is worth investigating.
[83] Rumelhart and McClelland, Explorations.
[84] For example, we may switch back and forth between views of a Necker Cube.
[85] J.A. Feldman, "A Connectionist Model of Visual Memory", in Hinton and Anderson (eds.), Parallel Models of Associative Memory, p. 51.
[86] See William James, The Principles of Psychology, p. 279.
[87] Like Nelson Goodman.
Posted by Downes at 3:01 PM
TNP 10. Summary
The Network Phenomenon: Empiricism and the New Connectionism
Stephen Downes, 1990
X. Summary
This concludes the presentation of the theory of learning and cognition which I wish to present. Before describing some of the further avenues of investigation I wish to pursue, let me summarize what I have asserted to this point in this paper.
I began by proposing a new theory of learning, connectionism, and described some prima facie objections to the theory. In order to respond to those objections, I argued that we need to reconsider some paradigms concerning rules and categories. Then I developed connectionism as an alternative theory of rules and categories. On this new theory, any concept is represented as a pattern of activations in a network of interconnected units. A category, on this view, is represented by a unit which can be activated, and the members of the category are the concepts whose connections activate the unit which represents the category. Connectionist networks not only store categories in this manner, they can learn them on their own. In order to develop this idea further, I examined a number of objections to the concept of distributed representation. To meet these objections, I described how patterns are developed from perceptual input and described and defended the "picture" theory of representation.
I then turned to considering detailed objections to associationism and connectionism. These divided into problems concerning distributed representation, problems concerning perception, and problems concerning associative mechanisms in general. In order to defend against problems of distributed representation, three types of patterns of connectivity were identified and the concept of similarity was defined in terms of activation vectors. In order to develop a theory of perception, I defined perception as input activations and two types of perception, conscious and real perception, were identified. This successfully explained theory-ladenness and the development of three-dimensional representations without the requirement of a priori or innate knowledge. Finally, a number of arguments against associationism were considered. In order to show that associationist and connectionist systems can perform higher level cognitive functions, I argued that a two-stage process is employed. First, prototypical representations are constructed, and then second, these are used to support inference by analogy or metaphor. This is in turn supported by the observation that such processes [can] be viewed as operations. Finally, I considered the problem of the evaluation of models and inferences in connectionist systems, and argued that we should employ the concept of relevant similarity.
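The summary does not restate how similarity was defined over activation vectors, so the following Python sketch simply assumes one common measure, the cosine of the angle between two vectors. The four hypothetical units and the three example concepts are invented for illustration.

```python
import math

def cosine_similarity(a, b):
    """Similarity of two activation vectors: 1.0 means the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a silent pattern is treated as similar to nothing
    return dot / (norm_a * norm_b)

# Activations over four hypothetical units: feathers, flies, scales, small.
robin = [1.0, 1.0, 0.0, 1.0]
sparrow = [1.0, 1.0, 0.0, 0.0]
trout = [0.0, 0.0, 1.0, 1.0]

print(cosine_similarity(robin, sparrow) > cosine_similarity(robin, trout))
```

On a measure of this kind similarity comes in degrees and depends on which units are active together, not on a shared symbol or definition.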
Posted by Downes at 2:43 PM
Sunday, March 22, 2009
More on New Knowledge
Responding to Tony Bates, Bates and Downes on new knowledge: Round 3
You say > However, I don’t believe the distinction between ‘academic’ knowledge and ‘applied’ knowledge is particularly useful.
Here we agree.
You say > What is useful is a distinction between academic and non-academic knowledge, as measured by the values or propositions that underpin each kind of knowledge.
Here we disagree.
First, I'm not sure you can make the distinction stick.
Second, even if you make the distinction stick, then so much the worse for academic knowledge, because the values or propositions that underpin academic method are unsound.
You say academic method > AIMS for deep understanding, general principles, empirically-based theories, timelessness, etc
Yes. But it shouldn't. That's my point.
You say > Academic knowledge is not perfect, but does have value because of the standards it requires.
This is a statement deserving of more discussion, because I think that either academics have lost track of the standards, being devoted to process over rigor, or that the standards adhered to are in fact no guarantor of worthwhile results.
You say > I also agree with Stephen that knowledge is not just ’stuff’, as Jane Gilbert puts it, but is dynamic. However, I also believe that knowledge is also not just ‘flow’.
It is neither 'stuff' nor 'flow', in my view. I explicitly reject both views in my post and in the comment that follows.
As I wrote:
"The central tenet of emergence theory is that even if stuff flows from entity to entity, that stuff is not knowledge; knowledge, rather, is something that 'emerges' from the activity of the system as a whole.
"This network - and subnets with the network (aka 'patterns of connectivity') - may be depicted as knowledge...
"A second way of representing knowledge, and one that I embrace in addition to the first for a variety of reasons, is that patterns of connectivity can be recognized or interpreted as salient by a perceiver."
The reason why this depiction is important is that knowledge, on this view, is *not* "deep understanding, general principles, empirically-based theories, timelessness, etc."
So whatever it is that academic method is aiming for, it is not knowledge.
This is a key point of contention between us:
You write > at some point each person does settle, if only for a brief time, on what they think knowledge to be. At this point it does become ’stuff’ or content. I still contend then that ’stuff’ or content does matter, though recognising that what we do with the stuff is even more important.
I disagree with this.
I do describe (following others) 'settling mechanisms' in the brain. We can say that we 'settle'. We can hypothesize, at least, a (thermodynamically) stable state of connections and activations in the brain.
But the 'entities' in such a system (if we can call them that) that constitute 'knowledge' do NOT have the properties of 'stuff' or 'content'. This is the key and fundamental point of my argument:
Not 'stuff' - not discrete, not localized, not atomic
Not 'content' - not semantical, not propositional, not symbolic
And that's my problem with academic method. It seeks out specifically propositions - symbolic or semantical - that are discrete, localized and atomic. Things that are _candidates_ for deep understanding, general principles, empirically-based theories, timelessness.
I think that maybe if we can untangle the vocabulary we might come to agreement on this. After all,
You say > this is likely to result in a shift in knowledge that may be very important, and it is in this area where I think Stephen and I may have some agreement.
This encourages me.
Skipping ahead quite a bit...
You write > My concern about much of the discussion of the ‘new’ knowledge is that it seems to depend on what I might call majority voting - it is the number of hits that matter, not the quality of the content.
Quite so.
Voting - and counting generally - record only the mass of a thing. They require some sort of identity (in order to identify that which is being counted).
This is distinct from the type of knowledge I have been trying to describe, which depends not on the quantity of things assembled, but on the way those things are interconnected.
This is what I have tried to clarify with the distinction between 'groups' and 'networks'. http://www.downes.ca/post/42521
The properties found in the group are (to my way of seeing) just those embraced by what we have been calling the academic method. If you look at the diagram http://www.flickr.com/photos/stephen_downes/252157734/ you see typical academic values: unity (of purpose, of workers, of science), coordination, closed systems, distributive (expert-based) knowledge.
Knowledge based on networks is not based on counting - not on votes, on surveys, on mass, on category or type, etc. because knowledge is not the sort of thing that can be counted, not the sort of thing that can be generalized (as a mass).
The objection to voting *is* an objection to academic method.
The new knowledge is precisely *not* knowledge by counting, knowledge by popularity.
But it's not knowledge by experts either. Because if we say that knowledge is based on experts and expertise, then we are saying that knowledge is the 'stuff' that's in people's heads that goes from place to place. Which - again - it isn't.
Now it is reasonable to disagree with my position on knowledge, but it's important to recognize that 'network knowledge' isn't based on counting or popularity - no matter how much this is emphasized by the (popular) media.
Finally,
> Lastly, Stephen was puzzled as to why I felt a blog was not the best way to discuss this issue. What I feel the topic needs is more space and time, and a critique from philosophers would also add to the discussion, I am sure, because I do not have specialist knowledge or training in epistemology. I would like to have had more time to review other writers on this topic, and more space to elaborate my views. I feel that I could do a better job that way.
Well - take all the time and space you need. Neither is in short supply on blogs.
Indeed - and this is one thing I like - you can go back over again, return to the same point again, attack it from various angles - a whole range of things you can't really strive for in any other forum.
> It was not because I needed the discussion to be academically reviewed in the way that journals are reviewed
Good. Because if we were restricted by reviewers, we could never be having this discussion. Which would be a pity.
Posted by Downes at 3:09 PM
TNP 9. Connectionism and Justification
The Network Phenomenon: Empiricism and the New Connectionism
Stephen Downes, 1990
IX. Connectionism and Justification
A. When Some Connections Are Better Than Others
An objection exactly analogous to the objection to operationalism may be brought against connectionism in general. In connectionist systems, anything may be connected with anything else. However, it is clear that there must be some subset of the set of all possible connections such that the connections in this subset are better than the other connections. For example, among the types of connection which are possible, there is a subset of connections which corresponds to logical inference. [74] We want to distinguish these logical connections from those connections which are (for lack of a better term) merely accidental. However, there is no means, from within a strictly connectionist framework, of establishing this distinction. Therefore, connections must be evaluated according to constraints over and above any given connectionist system.
One weakness of the objection just stated is that there is no clear agreement regarding what constitutes the proper constraints for such an evaluation. Suppose, for example, we are attempting to parse a sentence in order to determine its meaning. According to some philosophers, for example, Fodor, this task may be accomplished with reference to grammar, that is, rules and structure. Others, for example Winograd, argue that semantical considerations need sometimes to be taken into account. It is also reasonable to argue that the meaning of a sentence can only be determined with respect to pragmatic, or context-dependent, constraints.
Similarly, in the philosophy of science, there is no clear agreement regarding what constitutes a good scientific theory. Some philosophers, for example van Fraassen, argue that theories ought to be evaluated according to their empirical adequacy. Others, such as Hooker, argue that "epistemic virtues" such as simplicity and coherence are what guides the evaluation of a theory. According to many philosophers, most prominent among them being Popper, a scientific theory ought to be testable, but this does not stop some theorists, for example von Däniken, from proposing untestable theories. And finally, some philosophers follow Feyerabend and assert that there are no standards of goodness for scientific theories.
These examples may appear to be out of place on the ground that, in the formal disciplines, there are clear standards for the evaluation of operations. In logic, we have the constraint of truth-preservation, specifically, an inference is valid if and only if it preserves truth, and is invalid otherwise. In mathematical equations, similarly, an operation is correct if and only if it preserves equivalence, and incorrect otherwise. Therefore, if a connectionist system cannot distinguish between, say, truth-preserving and non-truth-preserving operations, then the system must be guided by some set of constraints over and above itself, that is to say, specifically, it must be guided by innate constraints. There are several examples in the literature of this sort of consideration. Fodor [75] criticizes the "picture" theory of representation on similar grounds, and Holland (et al.) [76] build such constraints into their system of inductive inference.
The idea here is that in any representation, there will be representational content. Representational content may be more or less representative of what it represents. For example, if the representation is propositional in form, then the proposition will be either true or false according to whether whatever is asserted by the proposition is in fact the case. The criticism, therefore, of connectionist systems is that there is no means of evaluating connections such that it can be determined that their representational content corresponds, or does not correspond, to whatever happens to be the case. [77]
If I were to use the general response to objections outlined above, then if this were an item of knowledge, I would deny it, and if it were a skill or capacity, I would explain how it can be accomplished using an associationist (connectionist) mechanism. However, truth does not appear to fall under either category, and hence, needs a special discussion of its own.
B. Truth
Let me examine the concept of truth more closely. The standard, naive definition of truth is correspondence with reality, for example, a proposition P is true if and only if P. This definition of truth is inadequate because there are many propositions which are true, for example, predictions and other subjunctive conditionals, or statements about possibility, to which by definition nothing in the world corresponds. A better definition of truth is provided by Tarski: P is true if and only if it corresponds with a model of the world.
But this is a different definition of truth than the definition of truth which is considered to apply in formal inferences, for in this case, we are talking about truth-preservation and not truth per se. A logical inference is valid strictly according to its form; the world is not a factor to be taken into consideration. Thus, the claim that logical inferences are truth-preserving by itself has nothing to do with the nature of the world or models of the world. An additional link - between truth-preservation and correspondence - must be established independently. For, without such a link, truth-preservation by itself is no virtue. It must be shown that truth-preservation is a good means of constructing inference about the world or about models of the world.
For a certain set of inferences, we can concede that this is the case. Take, for example, an inference about points on a journey. If x arrived at A before B and x arrived at B before C, then the rules of truth-preservation tell us that x arrived at A before C. This inference is confirmed by observation. It is however by no means clear that the rules of truth-preservation always apply when we are talking about the world. First, there is no reason to believe that these laws actually apply to the real world or even to models of the real world (unless the models are governed by an a priori stipulation that they must adhere to such rules, in which case holding up the model as an example is a fancy way of begging the question). [78] And second, it is clear that we want to make many [other] inferences about the world or models of the world, for example, inductive inferences, for which the rule of truth-preservation [is] of little or no use. Therefore, in at least some cases, something other than the rule of truth-preservation must be employed in order to evaluate our inference.
This is an important criticism of the objection to connectionism and associationism. In response to the objection that connectionist systems cannot provide an evaluation of this or that representation, the response is that traditional systems fare no better, or at least, are only a very slight improvement.
C. Relevant Similarity
Opposed to the concept of truth as our standard of evaluation, I wish to propose the standard of "relevant similarity". This standard has a number o advantages. First, it works, in the sense that successful inferences can be distinguished from unsuccessful inferences using relevant similarity. Second, in order to employ relevant similarity, no innate or a priori constraints are required. We know this because systems which naturally employ relevant similarity, connectionist and associationist systems, require no innate or a priori constraints. And third, the standard of relevant similarity is exremely powerful. For example, inferences ma be evaluated drecly according to relevant similarity, for example, the sample of a generalization must be relevantly similar to the whole. Or at another level, an inference may be evaluated according to whether or not its form (that is, some abstraction of the inference) is relevantly similar to previously successful inferences. Let me sketch these in a bit more detail.
Consider the typical inductive inference. The premises consist of a set of instances of some phenomenon or state of affairs, for example, "A1 is a B", "A2 is a B", etc. The conclusion is either a generalization of these observations, for example, "All A are B", or a prediction about the next instance, for example, "An+1 is a B". Standard textbooks [79] list two major fallacies which can occur in such inferences: hasty generalization, in which too few instances are observed, and unrepresentative sample, in which observations are biased in some way. Both of these fallacies can be explained with reference to relevant similarity. An inductive argument works because the premises and the conclusion all describe similar phenomena, so, if the phenomena described are not sufficiently similar, the inference fails. An unrepresentative sample is significantly different from the [sample described in the] conclusion, thus, the inference fails. In a hasty generalization, we have not seen enough samples to be sure we have established similarity, hence, the inference fails.
Connectionist systems using relevant similarity for the evaluation of inductive inferences avoid many of the problems which plague standard work in induction. [80] For example, one may ask why we use one particular set of premises, and not some other set of premises. The answer is naturally provided by the clustering mechanism described above. Another problem is the question of how many instances are required before we are able to say we have sufficient grounds to draw a conclusion. This answer is given by the activation value of the abstraction from a given cluster. If that abstraction has a sufficiently high activation value compared to other evaluations, then the inference works. Otherwise, it does not. There is no clear-cut numerical answer to these questions: it will always be relative to the structure of the net as a whole. What connectionism provides, and what traditional theories do not provide, is a mechanism for determining the answer in particular cases, rather than a mechanism which determines one answer for all cases.
I have already mentioned a few cases of the second sort of evaluation, that is, an evaluation which asserts that an inference is successful if its form is sufficiently similar to some previously correct or successful inference. So, for example, a person learns modus ponens by being shown examples similar to "If I am in Edmonton then I am in Alberta..." and learns not to deny the antecedent in the same way.
As I mentioned above, a connectionist system will attempt to employ relevant similarity on its own. It does this because such a system tends to adjust connection weights and unit activation until it reaches a stable or "rest" position. The exact nature of this rest position depends to some degree on how the system is constructed: change the learning rule and you change the rest position. However, in all cases, the settled state will be one in which all and only those units whose vectors are similar to the input activation will themselves be activated (or as nearly so as possible [81]). I have illustrated how we might develop a rule of transitivity which is useful on journeys from place to place. For example, Lakoff suggests that we develop the concept of cause by analogy with human actions.
TNP Part X Next Post
[74] These are described in Rumelhart and McClelland, Parallel Distributed Processing.
[75] In Ned Block, Imagery.
[76] In Holland, Holyoak, Nisbett and Thagard, Induction: Processes of Inference, Learning and Discovery.
[77] Here I am assuming a correspondence definition of truth. Other definitions are available, see, for example, Rescher.
[78] This is very similar to the point made about scientific theories, above, for if a scientific theory is a model of the world, then, as noted, there are innumerable possible ways of building such models.
[79] For example, Jerry Cederblom and David Paulsen.
[80] See Henry Kyburg, "Recent work in the Problem of Induction."
[81] See Rumelhart and McClelland on satisfying multiple simultaneous constraints in Explorations in Parallel Distributed Processing, ch. 3.
[82] Philip Kitcher, The Nature of Mathematical Knowledge.
Posted by Downes at 7:40 AM
What I Do
As you all know, I work for the Government of Canada, as part of the National Research Council's Institute for Information Technology, Learning and Collaborative Technologies group, a position I have held since November, 2001.
My thanks to the people of Canada for funding this work. As I do from time to time, in the spirit of open and transparent public service, I offer, directly from my yearly performance review, a statement of my 2008 work report and 2009 work objectives. I have reordered the document a bit and ensured that nothing personal or proprietary was included.
Career Aspirations
To be recognized as a leading voice in the field of learning and learning technology, to advance the state of knowledge in these fields, and in particular, to identify and describe new forms of knowledge and learning enabled by, and suggested by, network technologies.
To apply this knowledge to the service of Canadians and of people worldwide, thereby promoting the educational aspirations of all by supporting access to free learning resources and tools, and the fostering of skills and aptitudes that enable people to take advantage of these resources to the greatest degree possible.
Overall research objective
Development of key elements of learning networks infrastructure, which includes ongoing contributions to the SynergiC3 project, initiation and ramp-up of the PLE project, and integration of video support via the BVC project.
Note that the work in each of the areas below is intensively collaborative, involving close work with people in NRC, in the government of Canada, and in companies and universities around the world.
SynergiC3 (30 percent)
2008: Chaired R&D workgroup and led research in DDRM and Metadata research areas. Ongoing work involved continuing coordination of the research effort, including maintenance of the research plan, reporting on research activities, chairing of R&D WG meetings, and supervision of two co-op students. Significant support and guidance offered to NRC researchers participating in the R&D workgroup. Co-recipient of NRC National Award for this work.
2009 Plan: Chair, R&D workgroup and lead researcher in DDRM and Metadata research areas. Ongoing work will involve continuing coordination of the research effort, including maintenance of the research plan, reporting on research activities, conduct of R&D WG meetings, supervision of students, etc.
Learning Networks (5 percent)
2008: Released prototype gRSShopper software as an instance of a personal learning environment (PLE). This software played a major role in the Connectivism & Connective Knowledge course, offered in cooperation with the University of Manitoba. Received NRC-IIT Award for this work. While the learning networks foundational project will continue, obtained approval and funding for a spin-off Personal Learning Environment (PLE) project.
2009 Plan: This is foundational research. It includes the continuing development of the theory of network learning, as well as the instantiation of that theory in prototype software such as gRSShopper. Related to this work is the presentation of talks on e-learning, co-teaching of a course on Connectivism, etc. Will maintain active membership in IEEE-LTSC, IMS (Common Cartridge), SCC JTC1-SC36.
Personal Learning Environments (25 percent) New
2009 Plan: Project manager for a 2-year research and development project, including all areas of staffing and staff supervision, budget management, project management, planning and development. Staffing and project plan should be completed in calendar 2009, with significant work undertaken and possible release of initial prototypes.
E-Learning Cluster (10 percent)
2008: Offered general support to the E-Learning Center of Excellence concept in New Brunswick, including work supporting the Canadian Forces and its allies, presenting to Canadian Forces, in Cornwall and Fredericton, and U.S. Forces, in Fairfax, as well as to the e-learning industry in eastern Canada in general. Provided support for AIF in the form of reviews.
2009 Plan: This involves general support to the E-Learning Center of Excellence concept in New Brunswick, and includes work supporting the Canadian Forces and its allies. Also includes support for local events and cluster-building activities, such as the innovation forum. Also includes support for AIF and other granting bodies.
OLDaily (15 percent)
2008: Continued daily newsletter on the field of online learning. Current subscriptions are 4000 (email), 4000 (RSS), 2000 (web).
2009 Plan: Production and publication of a daily newsletter on the field of online learning.
Broadband Visual Communication (15 percent)
2008: Continued involvement on the BVC Social Analysis committee lent support to that activity, most especially through discussion and support to members' work. Learned about video server-side management and ongoing research in video-conferencing and event recording, including 12 videoconference or webcast presentations over the course of the year.
2009 Plan: Continued involvement will include two aspects. First, continued membership on the BVC Social Analysis committee will lend support to that activity, most especially through discussion and support to members' work. Second, application of BVC tools, including video production tools to the personal learning environment (PLE) and related projects.
Learning and Collaborative Technologies Group
2008: Ongoing support administratively, including time recording, attending group meetings, attending Koffee Klatches (including a Koffee Klatch presentation), and occasional training.
2009 Plan: Not counted as part of the percentages above, but worth noting, is ongoing support administratively, including time recording, attending group meetings, attending Koffee Klatches, occasional training, including French training (noted below). Additionally, a key activity for 2009 will be the collection of papers and transcripts for the publication of a book or collection of important works.
Posted by Downes at 5:34 AM
Saturday, March 21, 2009
TNP 8. Associationism: Inferential Processes
The Network Phenomenon: Empiricism and the New Connectionism
Stephen Downes, 1990
(The whole document in MS-Word)
TNP Part VII Previous Post
VIII. Associationism: Inferential Processes
A. The Structure in Review
Before proceeding to a description of associationist inferential structure, I would like to draw together some of the conclusions from preceding sections in order to outline the structure in which associationist inference occurs.
The computational structure follows a connectionist model. The system consists of interconnected units which are activated or inactivated according to external input and input via connections to each other. Such systems, I have noted, automatically, via various learning mechanisms, perform various associative tasks, for example, generalization. I have suggested that the human brain is actually constructed according to connectionist principles, therefore, the computational structure is actually built into the brain.
At the data level, mental representations are distributed representations, that is, no one unit contains a given representation, but rather, the representation consists of the set of connections between a given unit and a set of other units. This set of connections can be represented by a vector which displays the pattern of connectivity which activates the unit in question. Various representations cluster according to the similarity of their respective vectors, producing abstractions and categories.
External input to the system is entered via the senses. This input consists in the activation of what I have called real input units. This input is processed unconsciously according to connectionist principles and at a certain point we become conscious of this processing. At this point, I describe the set of activations as conscious input. We produce abstractions by processing conscious input. Any input from any sensory modality will consist of a pattern of unit activation. These patterns of activation are the input patterns for the vectors referred to above.
At no point in the system described thus far is anything like a symbol or a sentence expected or required. Categorization and abstraction from external input occurs as a form of subsymbolic processing. The data from which we form categories and abstractions consists not of symbols or sentences, but rather, the data consists of what may loosely be called pictures or mental images. Mental images, at least at the conscious level, are formed by a conjunction of external input [and] input from previously formed associations at higher levels.
In all of this processing, no formal rules of inference are expected or required. Abstractions, generalizations and categorizations are formed automatically. One way of describing the process is to say that units with similar vectors will tend to be clustered. The same process can be described in a more complex manner with reference directly and only to the connectionist principles outlined above.
B. Inference by Prototype
Let me now describe the process of inference with reference to an example. Suppose we have constructed a prototype bird (which looks pretty much like a robin). This prototype consists of a unit which is connected to a set of other units, some of which may themselves be prototypes. One of these prototypes, which happens to be strongly connected to the bird prototype, represents "flight".
Now for the inference part. Suppose we have a completely new experience, say, for example, an alien being walks off a spaceship. We see this, and this establishes a certain set of input patterns. The input patterns are such that a reasonable portion of the bird vector is activated (one might say, simplistically, that it looks like a bird). The activation of the bird unit in turn tends to activate all the units to which it is connected (that is, the activation of the bird unit consists in the activation of a partial vector for some other unit, which activates that unit, and which in the end results in the activation of the entire vector). Thus, in association with our perceiving an alien, the unit representing flight is activated. From our seeing an alien which looks like a bird, we have formed the expectation that it can fly.
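The alien example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feature names, the connection weights and the threshold are all hypothetical, and a real connectionist system would settle over many units at once rather than follow two hand-written steps.

```python
# Hypothetical feature units making up the bird prototype's vector.
# None of these names or numbers come from the paper.
BIRD_VECTOR = {"wings": 1.0, "feathers": 1.0, "beak": 1.0, "two_legs": 1.0}

# Hypothetical connections from the bird unit to other prototype units.
BIRD_LINKS = {"flight": 0.9, "lays_eggs": 0.8}

THRESHOLD = 0.5  # assumed activation threshold


def activation(input_features, vector):
    """Weighted fraction of a unit's vector that the input activates."""
    total = sum(vector.values())
    matched = sum(w for f, w in vector.items() if f in input_features)
    return matched / total


def expectations(input_features):
    """Units activated by way of the bird unit, if the bird unit is active."""
    bird = activation(input_features, BIRD_VECTOR)
    if bird < THRESHOLD:
        return {}
    spread = {unit: bird * weight for unit, weight in BIRD_LINKS.items()}
    return {unit: value for unit, value in spread.items() if value >= THRESHOLD}


# The alien matches three of the four bird features, so "flight" is activated.
alien = {"wings", "beak", "two_legs", "antennae"}
print(expectations(alien))
```

An input which shares too little with the bird vector, say `{"antennae"}`, leaves the bird unit inactive and produces no expectation of flight at all.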
There is reasonable evidence that something like this actually occurs. One clear example is the manner in which we stereotype people according to their skin colour or their country of origin. What is happening here is just an instance of inductive inference: from similar phenomena, we expect similar effects. This is not a rule-governed process. The occurrence and reliability of a particular inductive inference depends on the repetition of similar phenomena in previous experience (we have to have seen birds fly) and the particular set of mental representations in a given observer. Some of our previous experiences may inhibit our comparison of the alien with a bird, in which case, we might not form the expectation that it will fly.
For any given experience, various units at various levels of abstraction will be activated or inactivated. These units will be affected not only by the input experience but also by each other. Initial expectations may be revised according to activations at higher levels of abstraction (for example, we may initially expect the alien to be able to fly, but then only later remember that aliens never fly). The process being described here is similar to Pollock's system of prima facie knowledge and defeaters. [62] The difference is, first, we are not working with propositions as our items of knowledge, and second, anything can count as an instance of prima facie knowledge or a defeater, depending on patterns of connectivity and activation values. (But with Pollock, I would say that prima facie knowledge tends to be that produced by sensory input, and that defeaters tend to be produced by abstracted general knowledge.)
When we say that from similar phenomena we expect similar effects, it should be pointed out that this sort of inference need not apply only to similar cases where "similarity" is conceived to be similar appearance (feel, sight, sound, etc.). We have a much more precise definition of similarity which can be employed here: two concept or representation units, each of which is associated with a particular vector, are similar if and only if their vectors overlap to a sufficient degree. Now I realize that "to a sufficient degree" is rather vague. In any given system, it can be predicted with mathematical certainty what input will activate what concept (that's what computer emulations do). However, there are no general principles which can describe "sufficiency" since there are innumerable ways two units can have similar vectors. See figure 18.
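One way to make "overlap to a sufficient degree" concrete is sketched below. The measure used (shared connections divided by all connections, the Jaccard index) and the cut-off of 0.5 are assumptions chosen for illustration; the point of the passage is precisely that the cut-off is relative to a given system and not a general principle.

```python
def overlap(vector_a, vector_b):
    """Shared active connections divided by all active connections."""
    a, b = set(vector_a), set(vector_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar(vector_a, vector_b, sufficient=0.5):
    """True if the overlap reaches the (system-relative, assumed) cut-off."""
    return overlap(vector_a, vector_b) >= sufficient


# Hypothetical vectors, each given as the set of units it connects to.
robin = {"wings", "feathers", "beak", "red_breast"}
crow = {"wings", "feathers", "beak", "black"}
rock = {"grey", "hard"}

print(similar(robin, crow))  # three of five connections are shared
print(similar(robin, rock))  # nothing is shared
```

Changing `sufficient` changes which units count as similar, which mirrors the claim that sufficiency can be computed for a particular system but not stated once for all systems.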
C. Inferences About Abstract Entities
One of the major stumbling blocks for empirical and associationist theories is the problem of abstract entities. We talk about such unobserved abstracts as time, space, mass and even love, yet since there are no direct observations of such entities, there are no empirical bases for such inferences.
But now we are in a position to explain how humans reason about abstract or unobserved phenomena. Consider, for example, time. A number of linguists have pointed out that humans appear to talk about these entities, and thence, to reason about them, in terms of metaphors. [63] So, for example, we think of time as linear distance, for example, a road. Or we think of time as a resource to be bought, sold, stolen and the like (think of the term "time theft", which is currently in vogue in business journals). We draw conclusions about the nature of time by analogy with the metaphor. So, for example, we might argue that since a journey has a beginning, an end, and a 'line' between them, so does time.
An interesting observation is that these inferences vary from culture to culture. For example, there is no analogue in "undeveloped" cultures to the metaphor of time as a resource. Hence, it is not surprising to see people from such cultures treating time quite differently. Some cultures have never developed the analogy of time as a journey, but rather, identify points in time according to events (even we do this to some degree, for example, "1990 AD"). If our knowledge about time and space were, as Kant suggests, determined a priori, then we should not expect differences in our understanding and reasonings about time. Yet these differences are verified observationally. Therefore, it seems reasonable to conclude that our knowledge about space and time is not a priori knowledge. It must be learned from experience.
One question which arises is the question of why we would develop such concepts in the first place. In order to explain this, I must do a bit of borrowing from the arguments below, but let me sketch how this is done for now. I will proceed by means of an example.
Consider "mass". Mass is unobserved, and indeed, unobservable. There are no direct measurements of mass to be had. Yet mass is central to most of our scientific theories and one of the central concepts not to be tossed aside by Einstein's revision of Newtonian physics. It appears, therefore, that Newton would have had to [have] intuitively or mystically 'discovered' mass. I think that we can allow that Newton observed such things as force and acceleration. Let me borrow from below, and say he could measure these. [64] By employing Mill's fourth method of induction, he would discover that force and acceleration are proportional. This suggests an equality, so he could borrow from previously established identities the idea that there might be a similar identity at work here. Because he was seeking to establish an identity, he invented a new term, mass, which converts proportionality to an equation.
The idea is that Newton wanted his equations to 'look like' other successful equations such as those of Kepler and Galileo. In order to accomplish this, he needed to invent a new term. The question remains, of course, where did the invention come from? Computationally, if we compare the vector which represents the proportionality [of] force and acceleration and the vector which represents, say, some equation from Euclid, there will be a difference. This difference is itself a vector and is determinable by, say, XOR addition or whatever. A unit which is activated by this vector becomes, in the first instance, the vector which represents mass. Later, of course, when our understanding of mass becomes enhanced by other experiments, other scientists represent mass with quite different vectors.
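The computation of a difference vector by XOR can be illustrated as follows. The two binary vectors are invented for the example and stand in for whatever patterns of connectivity would actually represent the proportionality and the established equation; only the XOR step itself is taken from the text.

```python
def difference(vector_a, vector_b):
    """Elementwise XOR: 1 wherever exactly one of the two vectors is active."""
    if len(vector_a) != len(vector_b):
        raise ValueError("vectors must have the same length")
    return [a ^ b for a, b in zip(vector_a, vector_b)]


# Hypothetical binary vectors over the same six units.
proportionality = [1, 1, 0, 1, 0, 0]  # "force varies with acceleration"
equation = [1, 1, 0, 1, 1, 0]         # some previously successful identity

# The unit activated by this difference would, on this account,
# be the first representation of the new term.
new_term = difference(proportionality, equation)
print(new_term)  # [0, 0, 0, 0, 1, 0]
```

Here the two patterns differ in a single position, so the difference vector picks out exactly the piece that the proportionality lacks relative to the equation.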
This last remark is an important point. There is no one vector which represents abstracts such as time, mass, love and the like. Rather, each individual human may represent these abstracts in quite different ways, depending on the metaphors available. If these concepts were innate, then we would not expect people to have such differing concepts. Whether or not people do have different understandings of time, space and the like is empirically measurable. Therefore, again, there is a means of confirming empirically this theory as compared to innateness theories.
D. Grammar, Mathematics, and Formal Inference
The three systems of grammar, mathematics and formal inference have in common the fact that they are characterized according to a set of formal rules in which abstract terms stand for well formed formulae, terms, and the like. Here I am thinking of such diverse examples as Bever, Fodor and Garrett's "mirror image" language, modus ponens, transformation rules (which require an abstract "trace" to keep track of transformations), x+y = y+x, and the like. Since all of these systems employ abstract terms, they then pose a challenge for empirical and associationist theories.
It is possible to construct abstracts in a connectionist system, as I have shown above. These abstractions are useful when we want to describe formal systems. It is quite a different matter, however, to assert that we actually employ rules containing these abstract entities when we speak, reason or add. I suggest that we do not. Rather, each of, say, a "correct" logical inference or a "grammatical" sentence is a phenomenon which is sufficiently similar to some or another "exemplar" (as I call it) or prototype of such phenomena. Again, let me give an example in order to illustrate.
Suppose we want to teach people basic propositional logic. Either we show them a set of examples and say that inferences like these are good inferences, or we teach them the rules of inference and how to apply them. So if we want to teach, say, modus ponens, then either we give students a set of examples such as "If I am in Edmonton then I am in Alberta, I am in Edmonton, thus, I am in Alberta", or we give them the logical form "If A then B, A, thus B". According to what I suggest, we employ the former method, not the latter. The use of rules alone is insufficient to teach propositional logic; no logic text is or could be written without examples. Thus, the examples are used to teach propositional logic.
I argue that a person learns grammar in a similar fashion. A person is shown instances of correct sentences. Then, when she attempts to construct sentences of her own, she attempts to emulate what she has been shown. A particular sentence is constructed by the activation of several types of units, in particular, units which represent exemplar sentences, and units which represent concepts to be represented in the new sentence. Such behaviour looks like rule-based performance, but that is because the new sentence will be similar to the old sentence.
This is a theory which can be tested empirically. In a population of students with similar skills, one group could be taught logic via rules and substitutions, while another could be taught by examples. If this theory is correct, then the group using examples should demonstrate better performance. Holland et al. describe a series of experiments in which rule-based learning is compared to example-based learning. [65] Their findings are that persons who are subjected to example-based learning do about as well as persons given only rules. The best results are obtained by a combination of the two methods. [66] In my opinion, their results are not conclusive. They use as subjects college students who (presumably) have been exposed to abstract reasoning. In such cases, the rules themselves can function as prototypes. This occurs in people who are used to working with symbolic notation, for example, students with a substantial computer science or mathematics background. Further experimentation with more neutral subjects would be useful.
Connectionist systems can be shown to learn by example. In one instance, a network was trained to predict the order of words in a sentence by having been given examples of correct sentences. [67] The idea here is that different types of words, for example, nouns, verbs, and so on, are used in different contexts. A given class of words, say, a noun, will be used in similar contexts. Words are clustered according to the similarity of the contexts in which they appear. Clustering is described above. When a similar context appears in the future, a pool of words is available for use. This pool consists of words which tend to be employed in similar contexts. Selection of the exact word may depend on broader constraints, for example, visual input.
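The idea that words cluster according to the contexts in which they appear can be illustrated without a network at all. The sketch below is not the trained network described above: it is a plain count of neighbouring words over a made-up four-sentence corpus, offered only to show that "similar contexts" is something which can be computed from examples alone.

```python
from collections import defaultdict

# A tiny, invented corpus of "correct sentences".
SENTENCES = [
    "the dog chases the cat",
    "the cat chases the dog",
    "the dog sees the cat",
    "the cat sees the dog",
]


def context_sets(sentences):
    """Map each word to the set of (previous word, next word) pairs seen."""
    contexts = defaultdict(set)
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        for i in range(1, len(words) - 1):
            contexts[words[i]].add((words[i - 1], words[i + 1]))
    return contexts


def context_similarity(word_a, word_b, contexts):
    """Shared contexts divided by all contexts of the two words."""
    a, b = contexts[word_a], contexts[word_b]
    union = a | b
    return len(a & b) / len(union) if union else 0.0


contexts = context_sets(SENTENCES)
print(context_similarity("dog", "cat", contexts))     # nouns share contexts
print(context_similarity("dog", "chases", contexts))  # noun and verb do not
```

In this corpus "dog" and "cat" occur in exactly the same contexts, as do "chases" and "sees", so a pool of interchangeable words falls out of the examples with no rule about nouns or verbs ever being stated.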
Let me emphasize that while appropriate word selection may look like rule-based behaviour, it is not necessarily rule-based behaviour, and in connectionist machines it is certainly not rule-based behaviour. As Johnson-Laird writes, "what evidence is there for mental representation of explicit grammatical rules? The answer is: people produce and understand sentences that appear to follow such rules, and their judgments about sentences appear to be governed by them. But that is all. What is left open is the possibility that formal rules of grammar are not to be found inside the head, just as formal rules of logic are not to be found there." [68]
I would like to suggest at this point that the theory that people learn formal systems by exemplar provides a solution to the Bever-Fodor-Garrett problem described above. Recall that the problem was to explain how people can determine whether or not a given string of letters is a wff in a mirror-image language. The problem for the empirical or connectionist approach was that, in order to explain how this is done, it was necessary to postulate that people follow a set of rules containing abstract [entities]. Yet associationism (which, of course, is characteristic of empirical and connectionist systems) is constrained by the "terminal meta-postulate", which stipulates that no term not used in the description of the input can be used in the description of the rule.
It is possible merely to deny the postulate and construct a finite-state algorithm, as Anderson and Bower have done. [69] In such a case it would be necessary to construct abstracts from partial vectors as described above. However, it is much more natural and direct to use examples of mirror-image languages to teach a connectionist system. This would be an interesting test for connectionism (and if it worked, a conclusive refutation of the problem). But I do not believe it will be that simple.
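To make the proposed test a little more concrete, here is a sketch of how labelled examples for such an experiment might be generated. It assumes, and this is an assumption and not something stated above, that a wff in the mirror-image language is a nonempty string over the letters a and b followed by its own reversal (for example, "abba"). The checker is used only to label the examples; the question at issue is whether a network could learn the distinction from the examples without being given the checker.

```python
from itertools import product

ALPHABET = "ab"  # assumed two-letter alphabet


def is_mirror_wff(string):
    """True if the string is a nonempty half followed by that half reversed."""
    if not string or len(string) % 2 != 0:
        return False
    half = len(string) // 2
    return string[:half] == string[half:][::-1]


def labelled_examples(max_half_length):
    """Every string up to twice the given length, labelled wff or not."""
    examples = []
    for length in range(1, 2 * max_half_length + 1):
        for letters in product(ALPHABET, repeat=length):
            string = "".join(letters)
            examples.append((string, is_mirror_wff(string)))
    return examples


examples = labelled_examples(2)
positives = [string for string, is_wff in examples if is_wff]
print(positives)
```

Both the positive and the negative strings would be presented to the system, since learning by exemplar depends on correction as well as on correct instances.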
Recall how Bever, Fodor and Garrett introduced the language: it is a mirror-image language. When they introduce the language in this way, they call to the reader's mind past recollection of mirrors and how they work. While the language does not, in a technical sense, preserve mirror images (the letters are not reversed), there is in a sense an analogy between the performance of mirrors and wffs in the language. In order to adequately test a connectionist system, this information would have to be provided. Clearly, this would be a complex problem. Let me suggest, however, in the absence of an experiment, that there is no a priori reason why a connectionist system, given the relevant information, could not solve this problem.
In my opinion, this is a problem common to many of the challenges to associationism. It is perhaps true that in a narrowly defined context, no associationist system can solve this or that problem. But humans do not work in narrowly defined contexts. In order to adequately test a connectionist system, it is necessary to provide the context.
E. Operationalism
The way to think of such diverse behaviours as riding a bicycle, speaking a sentence, or solving mathematical equations is to think of such behaviours as learned behaviours, learned from examples and by practice and correction. There is a wealth of literature in diverse areas which makes this same point. Kripke's account of Wittgenstein on rules is explicit about the need for practice and correction. Polanyi represents knowledge as a skill, like riding a bicycle, which can be practiced but not described. Dreyfus and Dreyfus talk about expert knowledge as being, in a manner of speaking, intuitive. Kuhn writes that learning science is not a matter of learning formulae, it's a matter of learning how to solve the problems at the back of the book. Educational and psychological literature standardly speaks of knowledge being "internalized".
What I am proposing here has its similarities to a movement in the philosophy of science called "Operationalism". First clearly formulated by Bridgman, [70] it was modified considerably by Carnap. [71] It is difficult to disentangle early operationalism from some of the Logical Positivist theses with which it is associated, for example, reductionism. In its first formulation, the idea of operationalism is to reduce all physical concepts and terms to operations. What I propose is a modification: all formal concepts and terms should be understood as operations. There are several contemporary versions of operationalism. For example, Kitcher, using Mill's axioms as a starting point, formalizes mathematical knowledge in terms of operations. [72] Similarly, Johnson-Laird describes what he calls a "procedural semantics". [73]
The key objection to operationalism - and indeed, to much of what I am proposing in this paper - was stated by L.J. Russell in his review of Bridgman. Russell noted that scientists often consider one type of operation to be better than another. Therefore, operations are evaluated according to something over and above themselves. A similar criticism could be made of Kitcher's axioms. Consider set theory. According to Kitcher, we define a set according to the operations of grouping or collecting. However, the objection runs, some groupings are better than others. For example, we prefer a grouping which collects ducks, robins and crows to one which collects typewriter[s], rocks and sheep. Therefore, something over and above any given operation of collecting is employed in order to evaluate that operation.
This is a very general objection to connectionism and deserves a section of its own.
TNP Part IX Next Post
[62] Pollock
[63] George Lakoff, Women, Fire and Dangerous Things, surveys these results. See also Lakoff's "Connectionist Semantics" from the Connectionism conference, Simon Fraser University, 1990.
[64] That is, I still need to explain counting.
[65] Induction, pp. 273-279. They cite Cheng, Holyoak, Nisbett, and Oliver (1986), "Pragmatic versus Syntactic Approaches to Training Deductive Reasoning". Cognitive Psychology 16.
[66] Induction, p.276.
[67] Jeff Elman, "Representation in Connectionist Models", Connectionism conference, Simon Fraser University, 1990.
[68] Philip Johnson-Laird, The Computer and the Mind, p. 326.
[69] John Anderson and Gordon Bower, Human Associative Memory, pp. 12-16.
[70] Percy Bridgman, The Logic of Modern Physics.
[71] Rudolf Carnap, "The Methodological Character of Theoretical Concepts".
[72] Philip Kitcher, The Nature of Mathematical Knowledge.
[73] Philip Johnson-Laird, The Computer and the Mind.
Posted by Downes at 2:26 PM
Friday, March 20, 2009
The New Nature of Knowledge
I have written on various occasions in the past that the nature of knowledge is changing, a premise that is directly addressed - and challenged - by Tony Bates in his blog post, Does technology change the nature of knowledge?
I want to go through his post more or less point by point, not to be annoying, but as necessary in order to unravel a thread of reasoning that, I would argue, leads him astray.
Because, right from the beginning, I think, Bates has an idea that there are different types of writing, and different types of knowledge. He writes, "I should warn you that this is probably not a particularly suitable topic for a blog - an academic paper might be more appropriate to do the subject full justice."
One must ask, right off the bat, what he can mean by that. Because certainly it is not the placement of the body of reasoning into a printed paper and journal-bound form that renders it more appropriate. No, there is a supposition that the type of writing in an "academic paper" is a different type of writing from what he is offering here.
In what way? This begins to be a bit more difficult to pin down. Certainly it is not a matter of references or scholarly ability: Bates's article is filled with both. He is current on the academic literature - much more so than I - and covers his subject with an easy facility. At most, one can suppose it is some matter of the process of academic writing, then? The matter of reviewing and editing? Ah, but no; Bates's blog post could easily fit unedited into almost any journal one cares to name, unless it is a point in principle (and this I have seen) that he reference a particular body of literature that he is not covering here.
To Bates's argument, therefore, I must pose this first challenge, that there is nothing in principle that distinguishes the content of a blog post from that of an academic article. The same content may very well be presented in either, and the difference lies only in how that content is treated: subject to secret review and editing in the one case, and open scrutiny in the other.
Ah - but then, one argues, his case is made: that there is no distinction between knowledge of the past and knowledge of today. No, this is not established: only that the distinction is not one between academic and non-academic writing. The barbarians are not at the gates; they arise from within as well as without.
Bates next captures very nicely the nature of the new sort of knowledge with some astute citations from relevant works in academia: Jane Gilbert, citing Manuel Castells, writes, "knowledge is not an object but a series of networks and flows…the new knowledge is a process not a product…it is produced not in the minds of individuals but in the interactions between people," and Jean-François Lyotard, "the traditional idea that acquiring knowledge trains the mind would become obsolete, as would the idea of knowledge as a set of universal truths. Instead, there will be many truths, many knowledges and many forms of reason."
We see the result, that "the boundaries between traditional disciplines are dissolving, traditional methods of representing knowledge (books, academic papers, and so on) are becoming less important, and the role of traditional academics or experts are undergoing major change," in the graphs that represent the state of knowledge today:
http://www.downes.ca/post/48207
These are points that have been captured in a wide body of writings, from Gibson's depiction of Cyberspace to the perceptron of the 1950s and the connectionist literature of the 1980s to populist works such as Rushkoff's Cyberia and the widely popular Cluetrain Manifesto. It is hard to know where this account originates; everybody (including the academics) acts as though they have discovered it for the first time.
What is important is not who came up with the theory (because we know that what I will say is that the theory is emergent from the works of numerous writers) but rather what the salient points are of the theory. From the work just cited, we can identify three major points (and those who care to look will find those points repeated throughout my own writing):
- knowledge is not an object, but a series of flows; it is a process, not a product
- it is produced not in the minds of people but in the interactions between people
- the idea of acquiring knowledge, as a series of truths, is obsolete
In other words, knowledge on this account is:
- non-propositional, that is, not sharp, definite, precise, expressible in language
- non-discrete, that is, not located in any given place or instantiated in any particular form
- non-objective, that is, not independent of any given perspective, point of view, or experience
Bates identifies a singular feature of knowledge as discussed by Gilbert, Castells and Lyotard: "All these authors agree that the ‘new’ knowledge in the knowledge society is about the commercialisation or commodification of knowledge."
We get to this conclusion through an odd route: "'it is defined not through what it is, but through what it can do.’ (Gilbert, p.35). ‘The capacity to own, buy and sell knowledge has contributed, in major ways, to the development of the new, knowledge-based societies.’ (p.39)"
This is an oblique reference to what might be called a functional definition of knowledge, one that has its roots in the philosophical school of functionalism, "what makes something a mental state of a particular type does not depend on its internal constitution, but rather on the way it functions, or the role it plays, in the system of which it is a part," and this in turn is perhaps derived from the Wittgensteinian doctrine of "meaning as use".
But functionalism is very distinct from commercialism, and it is a great leap to infer from a 'definition' of knowledge based on "what you can do" to an assessment of knowledge as a "commodification" - a turn, indeed, that turns the new definition of knowledge on its head, and returns it to the status of object, and in particular, a medium of exchange. The retreat from some account of functionalism, which is more or less accurate, to one of commercialism, is an unjustified turn, and one which should not be accepted without significant dispute.
What would explain it? I would suggest that it is explained by the fact that networks of knowledge resemble networks of commerce, that there is a similarity between 'emergent knowledge' and 'the invisible hand of the marketplace', through to the overt endorsement of market logic we see in works such as Surowiecki's The Wisdom of Crowds. But one should not read into the advocacy of a network theory of knowledge (as we have been describing) anything like a market theory of economics, at least (crucially) not to the degree of mistaking a descriptive interpretation for a causal agent.
Return to the definition of knowledge above. It is not an object (or objective), it is not discrete, it is not a causal agent. It is emergent, which means that it exists only by virtue of a process of recognition, as a matter of subjective interpretation. Mistaking a perception of value for 'value' as an objective driver is a classic mistake of market economics (in my view) and certainly a significant misinterpretation of network theories of knowledge.
But Bates has taken that road wholeheartedly: "I have no argument with the point of view that knowledge is the driver of most modern economies, and that this represents a major shift from the ‘old’ industrial economy, where natural resources (coal, oil, iron), machinery and cheap manual labour were the predominant drivers. I do though challenge the idea that knowledge itself has undergone radical changes."
Let us be clear about the view of knowledge that Bates has explicitly endorsed: one in which knowledge has causal efficacy, one where it is a "driver", more similar to objects (like coal or iron) than ephemera (like attitudes and expectations).
Bates then sets up what we have to uncharitably (but regretfully) call the straw man. Skipping the story, we can read: "in education academic knowledge has always been more highly valued in education than ‘everyday’ knowledge. However, in the ‘real’ world, all kinds of knowledge are valued, depending on the context. Thus while values regarding what constitutes ‘important’ knowledge may be changing, this does not mean that knowledge itself is changing."
To be more charitable, what we have here (I would say) is Bates distinguishing between the two types of knowledge according to the different types of uses to which they are put. This has the merit of being consistent with a form of functionalism, and at the same time allowing two different 'types' of knowledge to be (essentially) the same, but applied in different endeavours.
This, though, nonetheless commits two errors:
- first of all, while endorsing a functionalist definition of knowledge, it assumes an as yet undefended essentialist definition of knowledge (because, if functionalism were true, then two items of knowledge which were put to different uses would in fact be two types of knowledge, since function defines typology).
- second, the depiction of knowledge that I have been calling the network account of knowledge is not simply a functionalist theory of knowledge; it has an entirely different ontology in which the former objects, however defined, no longer exist, and something that is non-discrete and non-localized and non-specific is postulated as performing the function we formerly ascribed (mistakenly) to some sort of discrete entity.
Anyhow, having made the distinction between 'academic' and 'commercial' knowledge, Bates will (with reference to Gilbert) expand on the definition of 'academic' knowledge as "‘authoritative, objective, and universal knowledge. It is abstract, rigorous, timeless - and difficult. It is knowledge that goes beyond the here and now knowledge of everyday experience to a higher plane of understanding…..In contrast, applied knowledge is practical knowledge that is produced by putting academic knowledge into practice. It is gained through experience, by trying things out until they work in real-world situations.’"
In fact, this conflates two distinct types of knowledge:
- knowledge that is academic, and
- knowledge that is abstract, rigorous, timeless
This is an important distinction to make because, first, the properties of being abstract, rigorous and timeless characterize what might be called common, practical, or 'folk' knowledge as much as they ever did academic knowledge, and second, what constitutes 'academic' knowledge is (as we see from the diagram near the head of this post) less and less abstract, rigorous and timeless.
This is what makes it possible to claim that the definition of academic knowledge is "too narrow" - much of what is represented as academic knowledge - "engineering, medicine, law, business" - applies academic knowledge, and academic knowledge (at least when well formulated) is "built on experience, traditional crafts, trial-and-error, and quality improvement through continuous minor change built on front-line worker experience."
There was, in the past, no significant distinction between 'academic' knowledge and 'practical' knowledge except where it was applied: and we could see 'abstract, rigorous, timeless' knowledge equally well in the church service, the farmer's field, or the grandmother's advice on weather. Knowledge was, in all cases, timeless wisdom. Such knowledge was power whether applied to engineering feats or to winning at three card brag.
Bates next considers the applicability of academic knowledge. It's a bit difficult to work with the argument now, since we are at such a fundamental divide, but let's consider the proposition: "my other quibble is that ‘academic knowledge’ is implicitly seen in these arguments as not relevant to the knowledge society - it is only applied knowledge now that matters. However - and this is the critical point - it has been the explosion in academic knowledge that has formed the basis of the knowledge society."
This goes to the point that academic knowledge can be used in a practical - even commercial - context, and therefore must not be distinct even functionally. The purpose to which we formerly ascribed only practical knowledge is found to result from academic knowledge (almost to the point of exclusivity): "It was academic development in sciences, medicine and engineering that led to the development of the Internet, biotechnology, digital financial services, computer software and telecommunication, etc. Indeed, it is no co-incidence that those countries most advanced in knowledge-based industries were those that have the highest participation rates in university education."
Leaving aside the question of whether these advances were in fact developed in academia or through some process we might call the academic method, let me focus on the question of the nature of these advances. Did, in all these developments - the internet, biotechnology, and the rest - did academia contribute abstract, rigorous and timeless knowledge? Certainly, there was some point at which it did. Newton's three laws were classical instances of such. The laws of thermodynamics equally so. And even in the last century, Einstein contributed to the paradigm with E=mc². But recently?
I would argue - and this is a matter for empirical investigation - that the research paradigm based on "abstract, rigorous, timeless" knowledge has stalled, and that what researchers have in fact been harvesting over the last few decades is something much more like network knowledge, as I have described it above. This is a distinct form of knowledge that is not based on simple causality, laws of nature, objective perspectives, and the rest. It is (in the words of Polanyi) tacit and ineffable.
The internet is a classic example. While there are protocols, no law governs how computers interact - this is strictly a matter of agreement and individual choice. In biotechnology scientists are looking at systems and networks in everything from immunology to ecology. Financial services proves to be based on, well, Ponzi schemes rather than anything that might be called 'timeless'. And telecommunications are based on laws that have been known for decades, depending more and more on protocol and agreement, rather than natural law, for improvements.
Indeed, the sorts of knowledge that Bates identifies as important resemble more and more dynamic, interpretive, chaotic types of phenomena - our capacity, as Rushkoff said, to navigate or surf through a dynamic information field, as though it were a gigantic wave (or office block parking garage), rather than an attempt to capture and hold: "it is not just knowledge - both pure and applied - that is important," he says, "but also IT literacy, skills associated with lifelong learning, and attitudes/ethics and social behaviour." But the point is: these are types of knowledge - they are, indeed, the new literacy, 21st century literacy.
The problem is, Bates hasn't let go of the old account of knowledge, the one with abstract, rigorous and timeless truths, knowledge based on objects, the acquisition of content. He writes, "My point is that it is not sufficient just to teach academic content (applied or not)." No, it is not sufficient to teach this type of (old-style) knowledge. It is (arguably) not even necessary. Because what we want are the new skills, based on the new more formless type of knowledge, skills that allow people to get by when nothing is abstract, rigorous, timeless: "the ability to know how to find, analyse, organise and apply information/content within their professional and personal activities, to take responsibility for their own learning, and to be flexible and adaptable in developing new knowledge and skills."
But Bates doesn't admit of this; he explicitly rejects it. "These skills and attitudes may also be seen as knowledge, although I would prefer to distinguish between knowledge and education, and I would see these changes more as changes in education. What is changing then is not necessarily knowledge itself, but our views on what educators need to do to ‘deliver’ knowledge in ways that better serve the needs of society."
This may be the case if, as he suggests, we are simply facing an explosion of new knowledge. But while we are seeing an explosion of content, our stock of abstract, rigorous and timeless truths remains constant - indeed, arguably, it has been on the decline, as we realize more and more that the laws and principles of nature that we took for granted were at best approximations of reality and at worst projections of our own thoughts, values and beliefs on nature (how else does one explain an economic system based on the infinite expansion of capital?).
What we are experiencing a proliferation of is points of view, and with each iteration of points of view it becomes apparent that the former world in which there was only one (authoritative, lawlike and Catholic) point of view is more and more misrepresentative. The new form of knowledge is a recognition that the propositions in our content, no matter how apparently abstract, rigorous and timeless, are in fact not knowledge, but merely more sea through which we must navigate.
This is why we must change our educational system, indeed, even as Bates says, "moving away from a focus on teaching content, and instead on creating learning environments that enable learners to develop skills and networks within their area of study." Because, contra Bates, content is not still crucial (or, more accurately, no particular bit of content is crucial) and academic values that propel enquiry toward abstract, rigorous and timeless truths are not only obsolete, they are dangerous.
Indeed, I would argue even that what might (again) be called 'academic method' is itself under siege. Bates writes, "we need to sustain the elements of academic knowledge, such as rigor, abstraction and generalization, empirical evidence, and rationalism." But these very principles misconstrue what it means to reason - the practices of abstraction and generalization, for example, ought to be understood not as mechanisms for finding more truth (as the old inductivist interpretations made out) but rather as ad hoc means of creating less (but more manageable) truth.
The very forms of reason and enquiry employed in the classroom must change. Instead of seeking facts and underlying principles, students need to be able to recognize patterns and use things in novel ways. Instead of systematic methodical enquiry, such as might be characterized by Hempel's Deductive-Nomological method, students need to learn active and participative forms of enquiry. Instead of deference to authority, students need to embrace diversity and recognize (and live with) multiple perspectives and points of view.
I think that there is a new type of knowledge, that we recognize it - and are forced to recognize it - only because new technologies have enabled many perspectives, many points of view, to be expressed, to interact, to forge new realities, and that this form of knowledge emerges from our cooperative interactions with each other, and is not found in the doctrines or dictates of any one of us.
Posted by Downes at 4:23 PM
Tuesday, March 10, 2009
TNP 7. Associationism: Cognitive Structures
The Network Phenomenon: Empiricism and the New Connectionism
Stephen Downes, 1990
(The whole document in MS-Word)
VII. Associationism: Cognitive Structures
A. Objections to Associationism
Above, I have outlined what I mean by associationism and sketched some objections. At the risk of repetition, I would now like to describe these objections in greater detail. By considering these objections, I will be able to describe a theory of associationist inference in more detail. This description depends to some extent on some of the conclusions already established regarding representations and perceptions, and will be employed below in a discussion of language and logical inference.
The general form of objections to associationism is as follows: people have the ability to know or do X, associationism is not sufficiently powerful to explain how people know or do X, therefore, people employ some means of knowing or doing X other than associationism. For example, "We know that the external world exists. However, empiricism (which depends on associationism) cannot prove that the external world exists. Hence, we must have some non-empirical means of knowing that the external world exists."
As an example of this form of argument, consider the following from Leibniz's New Essays. "The senses, although sufficient for all our actual knowledge, are not sufficient to give it all to us, since the senses never give us anything but examples, that is, individual or particular truths. Now all the examples which confirm a general truth, whatever their number, do not suffice to establish the universal necessity of that same truth.... necessary truths... must have principles whose proof does not depend on examples, nor consequently on the testimony of the senses." [51]
As another example of the same sort of argument, consider Chomsky. He argues, correctly, that certain features of language use, for example, transformation, depend on knowledge of the structure of a given sentence in the language. Step-by-step inductive operations (that is, those which employ finite state devices) are inadequate to produce this knowledge. Therefore, we must have this knowledge independently of experience. It is innate, perhaps, or the product of evolution, and is not learned from experience. [52]
Bever, Fodor and Garrett also describe what they call a formal limit to associationism. [53] According to these authors, we are able to recognize that a certain string of characters is a well-formed formula (wff) in a language L only with respect to a set of rules which contain an abstract character. Since association is subject to what they call the "terminal meta-postulate", which asserts that associationist rules may be described only in those terms which describe behaviour, no associationist principle may contain an abstract character. [54] Therefore it follows that on the basis of associationist principles alone we cannot determine whether or not a given string of letters is a wff in L.
These arguments are all valid arguments. Thus, in order to refute them, it is necessary to show that either the first premise is false or the second premise is false. Which of these two options we employ will vary according to circumstances. In general I take the following route. Those arguments which assert that we have this or that knowledge are refuted by a denial of the first premise; I argue that we have no such knowledge. Those arguments which assert that we have a demonstrated capacity I refute by a denial of the second premise; I argue that associationism can produce such a capacity.
B. Scepticism and Knowledge Claims
Let me consider only briefly instances of the first sort of refutation. Consider Leibniz's argument, stated above, that the "universal necessity" of some general truths must be known by some means other than the senses. One part of Leibniz's argument is certainly correct: we do not arrive at such knowledge from the senses. Further, it could be taken as arguable that we do not even know general principles, such as laws of nature, from the senses, nor can we even establish that one or another such principle is probably true. In my opinion, Popper's arguments on this point are conclusive. [55]
Contra Leibniz, I argue that we do not have any cognitive access to any such universal necessity, and therefore, do not in fact know that this or that principle is universal or necessary. Here is my argument.
Leibniz's own theory of necessity and possibility is very similar to that which we employ today: a proposition is necessarily true if and only if it is true in all possible worlds. Now either possible worlds are something which we create in our own minds or they are not. If they are, then while we may be certain that a given proposition is true or not true in all (conceived) possible worlds, since there may be possible worlds which we have not thought of yet (alternatively: since there are worlds which we cannot imagine), then our knowledge that a proposition is true in all (conceived) possible worlds is insufficient for us to know that it is universally or necessarily true. Thus, whatever we know about possible worlds in our own mind is distinct from the possible worlds in question. Hence, our knowledge about possible worlds might be incorrect. So even if a proposition is true in all (conceived) possible worlds, we cannot know it is true in all possible worlds. Therefore, we cannot know that any proposition is universally or necessarily true.
It is of course true that there are some things which we can know, for example, I know that I exist. What I am arguing here is that experience, for example, my experience of myself, is sufficient to establish those things which I do know. Scepticism serves as a good rough-and-ready means of distinguishing what I know from what I don't. In general, those things which it is claimed that we know and which associationism cannot prove (that is, for which we cannot construct associative processes for knowing) are those things that can be undermined by a sceptical argument.
There is an alternative approach for those people who don't like scepticism. Suppose it is claimed that we know some proposition, say, that the ground will not disappear under my next step. Instead of asking how we know (for which there is probably no answer, but this is the sceptical move to be avoided) we ask how we know that we know. In such cases, typically, it is necessary to argue that we behave as though we know (direct introspection tends to be unconvincing in such cases and is the only alternative answer). But now it is not necessary to explain the knowledge; it is only necessary to explain the behaviour. Connectionism allows that a person can behave in this or that way without ever knowing the principle which underlies the behaviour. Thus, we can respond to an apparent knowledge claim by saying not only that we can't know, but further, that we don't need to know. (Human beings managed to stay attached to the Earth without difficulty for centuries prior to the discovery of gravity.)
C. Association and Cognitive Capacities
In general (exceptions noted), scepticism can refute any knowledge claim. Thus, the only means of establishing that associationism is inadequate to explain human cognition is to establish that we have some demonstrated capacity which, in principle, could not have been produced employing associative mechanisms.
The "in principle" part of the argument is the tough part to establish. Above, I have sketched a new theory, connectionism, which employs associationist principles. Although the exact limits of this new theory are difficult to define, nonetheless, first, we know that it is a very powerful theory, and second, we know exactly how it works. Hence, we are now in a position to describe in detail associationist mechanisms for producing previously unexplainable behaviour (unexplainable, that is, except with reference to some innate knowledge or capacity).
At the core of my objection to theorists such as Fodor and Chomsky is a related theory which I have sketched above, specifically the theory which asserts that cognition does not necessarily proceed according to rules and clear and distinct categories. Therefore, it will not do to argue that associationism must produce a principled mechanism for performing this or that cognitive feat. All that is necessary is that some mechanism be described, even if we allow that particular instantiations may vary, perhaps considerably. (This latter should be expected, for human capacities vary considerably.)
The theory I wish to propose in response to the Fodor-Chomsky argument has two parts. In the first part, during the course of experience, human beings detect repeated experiences of similar phenomena. From these, characteristic or prototype representations of those phenomena are constructed. Then, in the second part, these prototypes are employed to produce the cognitive behaviours various philosophers have argued cannot be created by association.
D. Essences and Accidents
It is to me a mystery why people argue that an abstract is something different from an experience. Let us examine how we developed a theory of abstractions in the first place. Its origin is Aristotelian, though it receives its clearest formulation in Medieval philosophy. In order to examine essences, let us consider, for example, the essence of something concrete, say, Socrates.
Medieval philosophers such as Ockham and Scotus agreed that Socrates was composed of two parts: his essence, and his accident. His essence is that attribute which Socrates must possess in order to be Socrates. His accident is that set of features which are not necessarily particular to Socrates. We might say that the essence is that which continues, unchanging, to be Socrates, and his accident is that which may change from time to time without changing the fact that Socrates is Socrates. For example, Socrates is essentially human, but only accidentally snub-nosed.
So, for example, Ockham characterizes Scotus's view as follows: "a nature is this by something added that is formally distinct (from the nature)". [56] The 'something added' is called a "contracting difference", which contracts it (the nature) to a "determinate individual". To 'contract', or in Latin, 'contrahere', is, for example, to apply the genus to some species, or some species to some individual. For example, 'Socrates contracts the species of humanity'. [57]
The point I wish to emphasize here is that Socrates, the single individual, is composed of two parts: the essence and the accident. If we take away the accident, then we have the essence. For any given experience, it is no difficult matter to take away a part of that experience, particularly if that experience consists of, as I have suggested above, a set of activations of neural cells. If only some of those cells activate a further set of cells, then we have succeeded in taking away some of the experience. So we can, via a connectionist process, construct something which could be the essence of Socrates. We do so by deleting from the representation some or another features of Socrates, for example, his snub nose.
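The deletion described here can be pictured with a toy example in Python. This is only a sketch under loud assumptions: the feature names other than "human" and "snub-nosed", the choice of which units connect onward, and the function name `propagate` are all hypothetical and are not part of the original proposal.

```python
# Hypothetical units activated by an experience of Socrates.
# Only "human" and "snub-nosed" come from the text; the rest are invented.
socrates = {"human", "rational", "mortal", "snub-nosed", "bearded"}

# Assumption: only these units are connected to the further set of cells.
connected_onward = {"human", "rational", "mortal"}


def propagate(activation, connections):
    """Return the part of an activation pattern that reaches the next layer."""
    return activation & connections


essence = propagate(socrates, connected_onward)   # what survives
accident = socrates - essence                     # what was taken away

print(sorted(essence))
print(sorted(accident))
```

Nothing in the sketch fixes which units connect onward; a different set of connections would strip away different features and leave a different candidate essence.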
A key point: this essence just is what we mean by an abstract. The debate between Ockham and Scotus illustrates the contemporary debate concerning abstracts. According to Scotus, the essence of Socrates exists. [58] Socrates just happens to be a "contraction", or a particular instantiation, of that essence. Other human beings, for example, Aristotle, are different instantiations of that same essence. For after all, both Aristotle and Socrates are essentially human. Ockham's response to Scotus is well known in its outline. If Scotus is right, then we have two distinct types of entities: particular things, for example, Socrates, and essences, for example, humanness. However, as a methodological principle, it is better not to multiply entities beyond necessity. Since we do not have to postulate some independently existing essence, it follows that we should not.
Some philosophers, for example, Kripke, apparently still believe that there are independently existing essences. [59] Most philosophers do not. From my point of view, it does not matter whether essences have independent existence. The question is whether or not, by virtue of experience alone, we can detect them. I argue that we can, and I argue that the process just is as described above: we strip the accidental features from a given experience, and are left with a representation of the essence.
E. Evaluation of Essences
Where the real dispute lies, in my opinion, is whether there is one and only one set of permissible essences. For example, it is arguable that Socrates is essentially human. But it is also arguable that Socrates is essentially snub-nosed. There are several ways to pose this question. Must we identify one, rather than another, set of essences of things? Is some or another set of essences better? Or is the determination of essences ad hoc and random? In my opinion, some types of essences are better than others, but there is no [one] way that we must define the essences of things.
I believe that the essence of Socrates is the way that Socrates is similar to other things, and that the accident of Socrates is the way in which he is different. For example, Socrates is similar to Aristotle in that they are both human, yet they are different in that only Socrates is snub-nosed. The reason why humanness is a better essence than snub-nosedness is that snub-nosed and non-snub nosed people are otherwise very similar, while humans and non-humans tend to be quite different.
Another way of saying the same thing is as follows. Recall that a given representation, say, of Socrates, consists of a set of connections between a given unit and some set of units, and that this set of connections may be represented as a vector. See figure 14. These vectors may be more or less similar, for example, "1011" is more similar to "10010" and less similar to "00001".
Now suppose that we have the following set of vectors:
111000
111001
111010
001101
001000
111100
These vectors can be clustered according to similarity:
111000
111001
111010
111100
001101
001000
It is by virtue of and only because of these clusterings that this or that identification of an essence is to be preferred. [60] In the former case, we may have the essence:
111xxx
and in the latter:
001x0x
The "x"s in this example indicate that there is no connection between a given unit in the vector and the unit which represents the essence. These are partial vectors; see figure 15.
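The step from a cluster to its essence can be sketched in a few lines of Python. This is only an illustration under two assumptions that are not in the original text: the clusters are taken as already given, and a position counts as connected only when every member of the cluster agrees on it. The names hamming and essence are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def essence(cluster):
    """Keep each position on which all members agree; mark the rest 'x'."""
    return "".join(
        column[0] if len(set(column)) == 1 else "x"
        for column in zip(*cluster)
    )


# The two clusters from the example above.
former = ["111000", "111001", "111010", "111100"]
latter = ["001101", "001000"]

print(essence(former))  # 111xxx
print(essence(latter))  # 001x0x
```

The hamming function is one simple way to make "more or less similar" concrete for vectors of equal length; other similarity measures would serve as well.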
We can produce a measure of the 'betterness' of a given essence by considering, first, the number of "x"s in a given vector, and second, the number of instances of the given essence. Suppose there are n instances of "111xxx" and there are m "x"s (in this case, m=3). Then, to use a simple example, the betterness b of "111xxx" is b=f(n,m) where f is a betterness function.
It is worth noting that this system of betterness is exactly what we would expect from a connectionist system. Take any unit "i" which is connected to a set of other units. The fewer the number of x's the greater the number of input units, hence, since input is summed, then (other things being equal) at any given time t, an essence with fewer x's will have greater activation than one with more x's. Second, if a given vector is activated frequently, then (other things being equal) a unit the activation of which depends on the activation of that vector will be activated more frequently. Since in connectionist systems, unit activation values tend to decay, then the more frequently a unit is activated, the higher its activation value will be. The function f takes into account the decay rate and the rest position toward which the unit tends to decay.
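To make the idea of a betterness function concrete, here is a minimal sketch. The text does not specify f, so everything here is an assumption made for illustration: the essence-unit drifts toward a rest value at a fixed decay rate, each instance of the essence adds one unit of input per connected position, and betterness is read off as the mean activation over a stream of experiences. The names is_instance, summed_input and betterness, and the default parameter values, are hypothetical.

```python
def is_instance(essence, vector):
    """True when the vector agrees with the essence on every connected position."""
    return all(e == "x" or e == v for e, v in zip(essence, vector))


def summed_input(essence):
    """Number of connected (non-'x') positions, that is, vector length minus m."""
    return sum(1 for e in essence if e != "x")


def betterness(essence, stream, decay=0.2, rest=0.0):
    """Mean activation of the essence-unit over a stream of experiences."""
    activation = rest
    total = 0.0
    for vector in stream:
        # The unit decays toward its rest position at every step.
        activation += decay * (rest - activation)
        # An instance of the essence adds the summed input of its connected units.
        if is_instance(essence, vector):
            activation += summed_input(essence)
        total += activation
    return total / len(stream)


# The six example vectors, experienced repeatedly.
stream = ["111000", "111001", "111010", "001101", "001000", "111100"] * 20

for candidate in ("111xxx", "11xxxx", "001x0x"):
    print(candidate, round(betterness(candidate, stream), 2))
```

In this sketch "111xxx" scores above "11xxxx" (same instances, fewer "x"s) and above "001x0x" (fewer "x"s, but far fewer instances), which is the trade-off between n and m described above.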
F. Abstractions, Categories, and Prototypes
What I wish to point out immediately is that an essence, defined above as a vector with some "x"s, just is an abstraction. The more "x"s a given essence has, the more abstract it will be. Abstractions, by virtue of the fact that they have many "x"s, tend at first glance to not have very much betterness; they hardly correspond to any input activation (i.e., experience) at all. However, since they are so frequently activated, this initial weakness is overcome.
The definition of a category can proceed with reference to the essence or the abstract feature of the members of that category. A category just is the set of those instantiations which result in the activation of, say, vector "111xxx". This is a normal and standard type of definition of categorization: the necessary and sufficient conditions for membership in any given category will be the set of activations which correspond to "111xxx". But the story does not end there.
Suppose we have a given category, the essence of which is activated by "111xxx". However, since partial vectors can result in the activation of a given unit, the unit will be activated by "110xxx". In this case, the activation will be only two thirds as strong as in a normal case. But since this is possible, no one of the units will be a necessary condition for the activation of a given essence-unit. If the clustering is such that there is no other place to put an instance of "110xxx", then we will typically assign whatever corresponds to "110xxx" to the category defined by "111xxx". Note that we have not defined a new category "11xxxx", since the third spot on the vector remains connected. Rather, we have extended what counts as an instance of "111xxx". See figure 17.
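A small sketch of this graded membership follows. It assumes, for illustration only, that the strength of activation is the fraction of an essence's connected positions on which an input agrees, and that an input is assigned to whichever category it activates most strongly. The names activation and assign, and the concrete input "110001" (one possible instance of "110xxx"), are hypothetical.

```python
def activation(essence, vector):
    """Fraction of the essence's connected positions on which the vector agrees."""
    connected = [(e, v) for e, v in zip(essence, vector) if e != "x"]
    return sum(1 for e, v in connected if e == v) / len(connected)


def assign(vector, essences):
    """Assign the vector to the category whose essence-unit it activates most."""
    return max(essences, key=lambda e: activation(e, vector))


categories = ["111xxx", "001x0x"]

print(activation("111xxx", "110001"))  # two of three connected units agree
print(assign("110001", categories))    # 111xxx
```

No single position is required here: an input that misses one connected unit still activates the category, only less strongly.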

To change the example slightly in order to make the next point, suppose we have a category defined by "11111x". Any and all of the following will stimulate activation of that essence:
111111
111110
011111
110111
111100
and so on. It is clear from this example that some sets of activation are better than others, that is, they result in a greater activation of the essence-unit. In this case, the activation of
111111
will create the strongest activation. Whatever it is which corresponds with this vector constitutes a "prototype" of the category defined by "11111x". [61]
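The ranking of instances by activation can be sketched as follows. The scoring rule is an assumption made for illustration: each connected position on which the input agrees with the essence contributes one unit of summed input. The name summed_input is hypothetical. Note that under this particular reading "111110" ties with "111111", since the sixth unit is not connected to the essence-unit.

```python
def summed_input(essence, vector):
    """One unit of input for each connected position on which the vector agrees."""
    return sum(1 for e, v in zip(essence, vector) if e != "x" and e == v)


category = "11111x"
candidates = ["111111", "111110", "011111", "110111", "111100"]

# Strongest activation first; ties keep their original order.
ranked = sorted(candidates, key=lambda v: summed_input(category, v), reverse=True)

for vector in ranked:
    print(vector, summed_input(category, vector))
```

The instances at the top of the ranking are the candidates for the prototype; those lower down are still members of the category, but weaker ones.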
Human beings actually do this. Consider, for example, the category "bird". Birds are grouped into a given category because they have some features in common, for example, they are warm-blooded, lay eggs, have wings, beaks and claws, fly, and the like. Some birds, such as robins, have all of those features. A robin is therefore a prototypical bird. Others, for example, penguins, have most but not all of these features (they don't fly). While they are still birds, we do not consider penguins to be prototypical birds.
Think about this. Imagine a "dog". Now - did you imagine a collie or German shepherd, or did you imagine a Mexican hairless?
G. Are There Real Essences?
The one objection I can think of to this sort of story is that there are "real" essences which, first, do not correspond to any given experience, and which, second, we must employ in order to construct our system of categorizations. This objection is first raised by Descartes and has its modern instantiation in Kripke.
In my opinion, whether or not there are real essences does not matter. Suppose they exist. Either we detect them or we do not. If we do not, then we have no means of employing them in order to construct categories. Therefore, if they are of any importance at all, then we must detect them. Suppose we detect them. Then we either detect them as they are, or we do not. If we detect them as they are, then whatever they are (according to connectionist theory) will be reflected in our actual system of categorizations. If we do not detect them as they are, then the way they are does not affect our categorization. Therefore, the only case in which real essences can affect our system of categorization is a case in which, first, they exist, and second, we detect them as they are.
Suppose they exist and we detect them as they are. Either we detect them through the senses or we do not. Suppose we believe, like Descartes, that we do not detect them through the senses. Then they must be, as Descartes suggests, innate. If they are innate, however, then there could be no disagreement regarding the best system of categorization (recall that we are detecting them as they are). However, there is such a disagreement, for I disagree. Therefore, they cannot be innate. Thus, we must detect them by experience.
If they are detected by experience, however, since what we experience is distinct from that which is experienced, then even if we detect them as they are, we cannot ever know that we detect them as they are. Therefore, whether or not we detect them as they are is irrelevant, for all we can work with is the experience. This is exactly what I am proposing.
Finally, let me propose the following challenge to those people who propose that there are real essences and that we detect those essences via some non-empirical mechanism. Since according to the theory I have proposed I have an exact and clearly detailed mechanism for identifying and evaluating different schemes of categorization, then let me challenge those who propose an alternative mechanism to detail exactly how these categorizations are detected and how disputes concerning the relative merits of different systems of categorizations are to be evaluated. There is only one condition to this challenge: the system cannot refer to experience in order to detect and evaluate systems of categorization. I propose that it cannot be done.
[51] G.W. Leibniz, New Essays Concerning Human Understanding, pp. 42-44.
[52] Noam Chomsky, Syntactic Structures. Cited in P. Johnson-Laird, The Computer and the Mind, pp. 306-314.
[53] T.G. Bever, J.A. Fodor, M. Garrett, "A Formal Limit of Associationism", from Verbal Behaviour and General Behaviour Theory, T.R. Dixon and D.L. Horton, editors. Prentice-Hall, 1968.
[54] J.R. Anderson and G.H. Bower, Human Associative Memory; A Brief Edition, p. 15.
[55] Karl Popper, The Logic of Scientific Discovery and Postscripts, pp. 363-366.
[56] William of Ockham, Ordinatio, from Martin Tweedale (ed. trans.) "Selections from William of Ockham's Ordinatio Concerning Universals." Mss.
[57] Richard McKeon, Selections from Medieval Philosophers, Vol. 2, p. 441.
[58] Though not independently, that is, not in the absence of a contraction. John Duns Scotus, Opera Omnia, Vol. XVI, sec. 275. See also Martin Tweedale, "Does Scotus' Doctrine on Universals Make any Sense?", p. 104.
[59] Saul Kripke, Naming and Necessity, p. 127. After asserting that gold is essentially element 79, he writes that "According to the view I advocate, then, terms for natural kinds are much closer to proper names than is ordinarily supposed."
[60] See Jeff Elman, "Representation in Connectionist Models", Connectionism conference, for an account of how clustering occurs according to word functionality. See also George Lakoff, Women, Fire and Dangerous Things, ch. 2.
[61] This is a simplified version of the theory proposed by Anderson and Mozer in "Categorization and Selective Neurons", in Anderson and Hinton, Parallel Models of Associative Memory, pp. 213-236.
Posted by Downes at 12:34 PM
TNP 6. The Problems of Perception
The Network Phenomenon: Empiricism and the New Connectionism
Stephen Downes, 1990
VI. The Problems of Perception
A. The Veracity of Experience
Let me turn now to the problems concerning perception which were outlined above. These problems were then introduced as problems concerning the theory-ladenness of perceptions. In order to address this question, it is useful to begin with more traditional problems of perception. These divide roughly into two categories: first, can our perceptions be mistaken, that is, is the world really the way we perceive it to be; and second, can we be mistaken about our perceptions, that is, do we really perceive what we think we perceive? By means of a discussion of these questions I put into place the theoretical structure necessary to consider in more detail a response to concerns regarding theory-laden perceptions.
Philosophers have historically attacked the veracity of experience. Plato, in the Republic, employs an analogy of shadows on the wall of a cave to make this point. [40] The way the world appears to us is represented by the shadows, but of course, if we want to see the way the world really is, we must leave the cave. Descartes, in Meditations, begins his sceptical argument with an attack on the senses. We can be fooled by the senses, argues Descartes, and if we can be fooled by them, then we should not place our trust in them. [41] Contemporary empiricists, for example Ayer, considered this to be a problem. [42]
There are two approaches to this argument. First, we can argue that the premise, that we are fooled by the senses, is false. Second, we can argue that the argument is irrelevant from the point of view of cognition; even if we are fooled by the senses, we nonetheless rely on the senses.
The first response is provided by Austin. [43] He considers examples of cases in which we are "fooled" by the senses and argues that we are not in fact fooled. For example, a stick appears bent when placed in water. We know that the stick is not bent; therefore, it appears as though we are fooled by our senses. However, nobody is actually fooled by this experience. For one thing, when we perceive a stick in water, we perceive not only the stick, but also the water. Since we expect a straight stick to appear bent when in water, we are not fooled into believing that the stick is bent. And for another thing, even had we never seen a stick in water, we would have seen the stick before and after it was placed in water. Our evaluation of the shape of the stick is based on all of these observations, not only the observation of the stick in the water.
Both of these responses make the same point. We cannot evaluate the veracity of any experience by considering only that experience in isolation. We must consider that experience within the totality of experiences, and experience as a whole as occurring within a given context, for a given person. And when we consider our experiences as a whole it turns out that we are not fooled by our experiences. What happens is that when we have apparently conflicting experiences, we revise our understanding of what we have seen in order to explain the experiences. This process of revision is easily explainable in terms of a connectionist system; it is a process of readjusting weights in order to reduce error.
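As a rough illustration of "readjusting weights in order to reduce error", here is a sketch using the simple delta rule on a single linear unit, a much simpler technique than back propagation and chosen only for brevity. The encoding is an assumption made for this example: the two inputs stand for "the stick looks bent" and "the stick is in water", and the target is whether the stick is judged to be bent. The names train and predict, the learning rate and the number of epochs are hypothetical.

```python
def predict(weights, inputs):
    """Output of a single linear unit: the weighted sum of its inputs."""
    return sum(w * x for w, x in zip(weights, inputs))


def train(patterns, rate=0.2, epochs=200):
    """Delta rule: nudge each weight in proportion to the error and its input."""
    weights = [0.0, 0.0]
    for _ in range(epochs):
        for inputs, target in patterns:
            error = target - predict(weights, inputs)
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
    return weights


# inputs: (looks_bent, in_water); target: 1 if the stick is judged bent
patterns = [
    ((1, 0), 1),  # looks bent in air: judged bent
    ((1, 1), 0),  # looks bent in water: judged straight
    ((0, 0), 0),  # looks straight in air: judged straight
]

weights = train(patterns)
print([round(w, 3) for w in weights])
print(round(predict(weights, (1, 1)), 3))  # judgement for a stick seen in water
```

After training, the weight from "in water" has become inhibitory, so the same bent appearance no longer produces the judgement that the stick is bent once the water is also perceived.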
The reason why it is arguable that questions regarding the veracity of experience are not relevant to a study of cognition is that in a certain sense it cannot be argued that we are wrong about our experiences. Now here it is important to be careful, for I do not wish to assert some sort of infallibility thesis. What I mean to say is that questions concerning the correctness or incorrectness of experience have no bearing on whether or not a given experience is experienced. Let me outline the argument before giving details. The infallibility thesis states that we cannot be wrong when we think we are experiencing, say, red. This can be criticized on the grounds that our perception of what we experience can be distinct from what we actually experience. [44] In response, on the one hand, we can say that the experience is whatever it is, and what we think about it is irrelevant, or on the other hand, what we think about the experience is what it is, and what the actual experience is is irrelevant. Arguments concerning the veracity of experience make sense only if we insist that there be some one-to-one correspondence between the actual experience and what we think we experience, and there is no reason to suppose that such a correspondence is the case.
Now let me proceed with this argument in more detail. Suppose I proceed by stipulation and assert that whatever constitutes the activation of input units in a given network is by definition experience. Either this input is in some way isomorphic with or representative of some external reality, or it is not. Sceptics about experience (such as Plato and Descartes) assert that it is not. However, whether or not the input is isomorphic with or representative of some external reality has no bearing on whether the input is input. No matter what the relation between the input and the external world, there is some state of activation of input units, and that state is called "experience", and the argument that this state of activation is somehow "wrong" does not change the fact that it exists.
The infallibility thesis rests on the assertion that one cannot be wrong about the content of one's own perceptions. In one sense, this is meaningless; if one's perceptions are a part of oneself, then how can a person be wrong about them? In order to make the argument against the infallibility thesis work, it is necessary to separate the person from the experience, or at the very least, to separate one's consciousness from the experience. Some philosophers, for example Hume in the Treatise, argue that this cannot be done. If, however, we can somehow separate the two, then it is in principle possible for a person to be wrong about his perceptions. But here we must step carefully: what does it mean to separate a person or a consciousness from that person's experience?
We must have two "experiences": one which is the actual input, and one which is a representation of that actual input. Arguments against the infallibility thesis are arguments that the representation of the experience is not isomorphic with or representative of the actual experience, which has some sort of "reality" of its own. But if this "reality" is outside our consciousness, then it cannot be considered to be a part of experience; our experience per se is our consciousness of this reality, and about that we cannot be mistaken. And our consciousness of an experience must be whatever it is, no matter what the "reality" of the experience is. (Please notice that I am arguing by analogy of form here; this argument has exactly the same form as the argument concerning experience versus reality a few paragraphs above.)
In my opinion, the major questions concerning perception and experience are questions regarding what counts as a perception or an experience. Above, I stipulated that the activation of input units is what counts as experience. This needs to be qualified. For if what I have just asserted immediately above is true, then since we are not consciously aware of the actual input activation, then the actual input activation cannot be considered to be experience. What we need to do is distinguish different areas of the network. Let me, quite arbitrarily, divide the network into two parts: that part of the brain the processes of which are consciously accessible to us; and that part of the processes which is not accessible to us. Then I can employ the definition of "experience" given above in the following consistent manner. There are two types of experience: on the one hand, "real experience", which is the input activation to the network as a whole; and "conscious experience" which is the input of the consciously accessible part of the network.
In human brains, my divisions work roughly as follows. The activation of, say, rods and cones on the retina in the eye constitutes "real" experience. We are not consciously aware of these activations. These activations spread through the network and eventually reach the cerebral cortex. Then, since the activations spreading from these initial activations arrive first in the V-IV section of the cerebral cortex, the activation of V-IV cells constitutes "conscious experience". The precise division of conscious and non-conscious areas, as well as input and non-input areas, is subject to empirical investigation. It is worth noting that at the other end, perhaps, say, the cerebellum, we are again unaware of states of activation.
B. The Nature of Experience
Now let me turn to the questions concerning the nature of experience. It is arguable, as I have mentioned above, that there are no "pure" perceptions, that is, that there is no means of distinguishing between the experiential and the non-experiential components of experience. What we believe about the world, or what theories govern our beliefs about the world, are a part of and indistinguishable from what we see or observe. In order to conduct this discussion, I will review the arguments concerning the theory-ladenness of perception and then, drawing on the lessons of the previous section, provide a reasonable and empirical explanation of such phenomena.
There are two ways to argue for the theory-ladenness of perception. First, it is arguable that one's beliefs or other cognitive state can affect perception. And second, it is arguable that there are innate constraints which affect perception.
The examples given above are examples of the first. Consider, for example, Hanson's argument. He suggests that when Ptolemy, who believes that the Sun revolves around the Earth, and Kepler, who believes that the Earth revolves around the Sun, look to the east at dawn, they see two different things. Ptolemy sees the Sun rising above the horizon, while Kepler sees the horizon turning to face the Sun. [45] This is an instance of one's beliefs about the world affecting the way one sees the world. Churchland [46] makes a similar point quite vividly. If we view the horizon as level, the stars just look like, well, stars. If, however, we (in the northern hemisphere) tilt our heads, and if we are aware that the planets orbit the Sun on a certain plane, then we can see this plane and further we can see that we are on the side of a planet on this plane. This really is a most vivid experience.
On the other hand, one may argue that there are innate constraints which affect the way we perceive the world and which constitute some item of knowledge about the world. According to Marr, for example, in order to construct three dimensional representations of the world from two dimensional input, as is for example received at the retina, we must be guided by two constraints: the uniqueness constraint and the continuity constraint. [47] In order to construct three dimensional representations, we need to match a given input from one eye to a given input from the other. The uniqueness constraint asserts that one thing cannot be in two places at the same time, thus, we match the input from one eye to the one and only input from the other. The continuity constraint asserts that we can't see through objects, so adjacent inputs tend to be considered to be at the same depth.
It is possible to create three-dimensional representations by employing these two constraints and a network as described above. The matrix consists of possible matchings between input from each eye. The two constraints represent inhibitory and excitatory connections in this matrix. For any given input from a given eye, the possible matches with inputs from the other eye are inhibited with respect to each other; we want to pick only one such match. Matches which represent the same depth excite each other; we think that objects have relatively continuous surfaces. The reason why these constraints must be innate, it is argued, is that there is nothing in the input from the eyes which, even given the learning processes described above, would result in this pattern of connectivity. Thus it must be built in. See figure 12.
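The matrix of matches and its two kinds of connections can be sketched for a single row of pixels. This is a heavily simplified illustration and not Marr and Poggio's actual algorithm: it assumes binary one-dimensional images, only two candidate depths (disparities 0 and 1), excitation from the immediate neighbours at the same depth, inhibition from the competing match at the same position, and a fixed threshold. The function name cooperative_stereo and all parameter values are hypothetical.

```python
def cooperative_stereo(left, right, disparities=(0, 1), theta=2, steps=5):
    """Iterate a small match matrix under uniqueness and continuity constraints."""
    n = len(left)
    # initial[d][x] is 1 when left[x] could be matched with right[x - d].
    initial = {
        d: [1 if x - d >= 0 and left[x] == right[x - d] else 0 for x in range(n)]
        for d in disparities
    }
    state = {d: row[:] for d, row in initial.items()}
    for _ in range(steps):
        new_state = {}
        for d in disparities:
            row = []
            for x in range(n):
                # Continuity: neighbouring matches at the same depth excite each other.
                excite = sum(state[d][j] for j in (x - 1, x + 1) if 0 <= j < n)
                # Uniqueness: competing matches for the same input inhibit each other.
                inhibit = sum(state[other][x] for other in disparities if other != d)
                row.append(1 if initial[d][x] + excite - inhibit >= theta else 0)
            new_state[d] = row
        state = new_state  # synchronous update: all units change together
    return state


# The right image is the left image shifted by one position (true disparity 1).
left = [0, 1, 0, 0, 1, 0, 1, 1]
right = [1, 0, 0, 1, 0, 1, 1, 0]

result = cooperative_stereo(left, right)
print(result[0])  # accidental matches at disparity 0 are suppressed
print(result[1])  # the continuous run of matches at disparity 1 survives
```

In this toy case the isolated, accidental matches at the wrong depth lack support from their neighbours and are inhibited by the competing correct matches, so they switch off, while the correct matches sustain one another.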
Let me now respond to these considerations. If anything like back propagation is correct, then we have a reasonable explanation for both types of theory-ladenness. We continue to define experiences as above, that is, as the activation of input units. This initial activation in turn causes the activation of other units. Patterns and associations are detected by the system. Once these patterns are detected, they provide grounds for correcting various input activations. Input to the eye is "clamped", that is, it is fixed and cannot be changed. So real experience is unchangeable. However, conscious experience is not clamped. It is initiated by the activation of real input units, and this input is filtered and modified by the pattern of connections between the real input units and the conscious input units. This pattern of connectivity can be altered via back propagation. Thus, our conscious experience can be altered by previous experience. See figure 13.
This is an extremely powerful theory. First, it explains why we are not fooled by appearances. As mentioned above, we are not fooled because we take into consideration the totality of our experiences. Since previous experience creates a pattern of connectivity in the brain, this previous experience is effectively stored and plays an active role in our processing of subsequent experience. So when we see, say, a bent stick, we take into account the totality of our experiences of that stick and, as Austin suggests, are not fooled by the illusion. Second, it explains how beliefs and other cognition can affect what we consciously see. Once again, since beliefs are in fact patterns of connectivity, these patterns, via back propagation, can affect the connections between real input and conscious input, and hence, affect the states of activation of conscious input neurons.
There is empirical evidence that something like back-propagation may be at work here. Past experience plays a role in present perception. We can pick out objects when they resemble objects we have seen before, such as for example picking out a dalmatian in a birch forest. We can also pick out objects, such as birds flying in the dark, by their motions. [48]
Third, it explains Marr's uniqueness and continuity constraints. The particular patterns of connectivity described emerge because, if they did not, visual input would clash with tactile input. We know, for example, that surfaces are solid because we can't put our hands through them. Our experience of touching walls and the like creates a certain pattern of connectivity. In order to minimize error, it is necessary to adjust the connections in the three-d vision matrix to fit this pattern.
This last claim is something that can be tested empirically. In essence, what I am asserting is that other senses affect how we see. We can either proceed with biological experiments, testing the effect of sensory deprivation on visual processing, or we can proceed computationally, testing for improvements in visual processing as a consequence of the addition of other sensory modalities.
This theory can also explain why it is impossible to distinguish the theoretical component of conscious experience from the visual component. Suppose "i" is a unit at the level of conscious input. Then "i" is receiving excitatory or inhibitory input from both visual cells, such as those in the retina, and from higher level cognitive cells in the form of back propagation. These input activations are summed, so that if the retinal cells are sending a signal of '4' and the cognitive cells are sending a signal of '5', the net input to "i" is '9'. This is what shows up in "i"; it increases activation by 9. Once, however, the input is summed, there is no means of determining which part of the sum is the retinal input and which part is the cognitive input. Given only '9', we can't tell whether it is '3'+'6', '5'+'4', or whatever.
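The arithmetic can be made explicit with a short sketch. It assumes, purely for illustration, that both signals are non-negative whole numbers; with inhibitory (negative) or fractional signals the number of possible decompositions would only grow. The names net_input and possible_sources are hypothetical.

```python
def net_input(retinal, cognitive):
    """The unit sees only the sum of its retinal and cognitive input."""
    return retinal + cognitive


def possible_sources(total):
    """Every (retinal, cognitive) pair of non-negative integers with this sum."""
    return [(retinal, total - retinal) for retinal in range(total + 1)]


observed = net_input(4, 5)
print(observed)                    # 9
print(possible_sources(observed))  # ten pairs, all indistinguishable to the unit
```

Every pair in the list produces exactly the same activation in "i", which is why the summed value alone cannot tell us how much came from the retina and how much from prior belief.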
C. A Theory of Perception
I have, in the course of responding to these various objections, sketched the outline of a theory of perception, which I would now like to make clear.
Our conscious experience just is the activation of a certain set of cells. These cells are the input cells to whatever network constitutes the conscious part of the brain, that is, the part of whose activities we are aware. Input to these input cells comes from two sources: on the one hand, from input cells in our various sensory modalities, for example, retinal cells; and on the other hand, via back propagation from cells which are at higher levels, which I have called cognitive cells. These inputs are summed to produce activation, and that activation is what we call conscious experience.
Input activation produces what we call a representation. Above, I have explicitly endorsed what is called the "picture" theory of representation. The term "picture" is a bit misleading, since input activation may be augmented and corrected by back propagation from previously formed associations and other sensory modalities, and there is no requirement that a given representation be isomorphic with or represent input activation. It is not necessary, and rather unlikely, that representations are composed of symbols and sentences. Human beings do not appear to be constrained by the sort of principles which constrain symbolic representations; rather, what appears to be the case is that representations are constructed and manipulated according to their relevant similarity with other representations.
I would like to briefly address the categorization of the different regions of human neural networks which I have identified above. I suggest that there are no fixed delineations regarding what counts as the conscious network or what counts as the input part of the conscious network. It seems reasonable to believe that the exact boundaries of these regions may vary from person to person, and indeed, may vary over time in a single person. Therefore, in a precise sense, there is no set of necessary and sufficient conditions which define what counts as an "experience" and a "representation". Let me describe a well-known phenomenon which substantiates this claim.
Professional athletes, for example, skiers, practice their sport by "imagining" or "visualizing" perfect performances. [49] It is worth noting, first, that they are able to do this, and second, that their performance improves as a result. For the most part, further, this is not an ability they have had from birth; they must be trained in, and then practice, mental imagery. What I believe they are doing is learning to stimulate activation of conscious input cells according to previously learned association. In a sense, this requires extending the conscious region of the brain, for not everybody has the ability to vividly "picture" a sequence of events.
In my own case, I am quite convinced that this is a skill which can be learned, for over the course of the last two or three years I have practiced it myself. When I started, I had no such skills, yet now, I am able to evoke vivid mental images. I am able to control the subject matter of these images, but I prefer to allow them to occur at random. It really is an extraordinary experience and I recommend it for fun and relaxation. [50]
[40] Plato, Republic, p. 168.
[41] Rene Descartes, The Philosophical Works of Descartes (Haldane and Ross, eds.) Vol. 1, p. 145.
[42] A.J. Ayer, The Problem of Knowledge, p. 37.
[43] J.L. Austin, Sense and Sensibilia, ch. III.
[44] Here I am thinking of arguments such as Armstrong's SEEG argument. Pollock also argues this way. See John Pollock, Contemporary Theories of Knowledge, p. 34.
[45] N.R. Hanson, Patterns of Discovery, p. 5. See also Thomas Kuhn, The Structure of Scientific Revolutions, p. 150.
[46] Paul Churchland, Scientific Realism and the Plasticity of Mind, Ch. 1.
[47] See D. Marr and T. Poggio, "A Computational Theory of Human Stereo Vision", Science 194, pp. 283-7, 1976. Also see P. Johnson-Laird, The Computer and the Mind, pp. 86-87.
[48] See Johnson-Laird, The Computer and the Mind, pp. 100-110. He describes this as 'top-down' processing. Typically, when people think of top-down processing, they think of pre-established or innate knowledge. But there is no reason to suppose this; it could just as easily and more consistently be the result of back propagation.
[49] Arnold Lazarus, In the Mind's Eye, p. 30.
[50] It is worth noting that when I started this, I tried to visualize a circle as though it were really in front of me. I had almost no success for about six months. Quite by accident, however, I once visualized a baseball, which of course appears as a circle. Since then, I have had no trouble producing circles. I start with a circle I have actually seen, then allow it to become an outline form.
Posted by Downes at 9:25 AM
Monday, March 09, 2009
TNP 5. Distributed Representation
The Network Phenomenon: Empiricism and the New Connectionism
Stephen Downes, 1990
(The whole document in MS-Word)
TNP Part IV Previous Post
V. Distributed Representation
A. A First Glance at Distributed Representation
Above, when discussing the "Jets and Sharks" example, I mentioned that the hidden units represent individual people. This representation occurs not in virtue of any property or characteristic of the unit in question, but rather, it occurs in virtue of the connections between the unit in question and other units throughout the network. Another way of saying this is to say that the representation of a given individual is "distributed" across a number of units.
The concept of distributed representation is, first, a completely novel concept, and second, central to the replies to many of the objections which may be raised against empiricism and associationism. For that reason, it is best described clearly and it is best developed in contrast with traditional theories of representation.
B. Representation and the Imagery Debate
Traditional systems of representation are linguistic. What I mean by that is that individuals and properties of individuals are represented by symbols. More complex representations are obtained by combining these symbols. The content of the larger representations is determined exclusively by the contents of the individual symbols and the manner in which they are put together. For example, one such representation may be "Fa&Ga". The meaning of the sentence (that is [a bit controversially], the content) is determined by the meanings of "Fa", "Ga", and the logical connective "&".
There are two defining features of such a theory of representation. [18] First, representations are governed by a combinatorial semantics. That means there are no global properties of a representation which are over and above an aggregate of the properties of its atomic components. Second, they are structure-sensitive. By that, I mean that sentences in the representation may be manipulated strictly according to their form, and without regard to their content.
There are several advantages to traditional representations. First, they are exact. We can state, in clear and precise notation, exactly the content of a given representation. Second, the system is flexible. The symbols "F", "G" and "a" are abstracts and can stand for any properties or individuals. Thus, one small set of rules is applicable to a large number of distinct representations. That explains how we reason with and form new representations.
We can illustrate this latter advantage with an example. Human beings learn a finite number of words. However, since the rules are applicable to any set of words, they can be used to construct an infinite number of sentences. And it appears that human beings have the capacity to construct an infinite number of sentences. Now consider a system of representation which depends strictly on content. Then we would need one rule for each sentence. In order to construct an infinite number of sentences, we would need an infinite number of rules. Since it is implausible that we have an infinite number of rules at our disposal, then, we need rules which contain abstract terms and which apply to any number of instances. [19]
The linguistic theory of representation is often contrasted with what is called the "picture theory" of representation. Picture theories have in common the essential idea that our representations can in some cases consist of mental images or mental pictures. It appears that we manipulate these pictures according to the rules which govern our actual perceptions of similar events. For example, Shepard's experiments involving the rotation of a mental image suggest that there is a correlation between the time taken to complete such a task and the angle of rotation - just as though we had the object in our head and had to rotate it physically. [20] See figure 7.
Shepard's experiments are inconclusive, however. Pylyshyn [21] cautions that there is no reason to believe that laws which govern physical objects are the same as those which govern representations of physical objects (he calls the tendency to suppose that they are the "objective pull"). It could be the case, he argues, that Shepard's time-trial results are the result of a repeated series of mathematical calculations. It may be the case that we need to repeat the calculation for each degree of rotation. Thus, a mathematical representation could equally well explain Shepard's time-trial results.
I agree with Pylyshyn that the time-trial results are inconclusive. So we must look for other reasons in order to determine whether we would prefer to employ a linguistic or non-linguistic theory of representation.
C. Cognitive Penetrability
Pylyshyn argues that in computational systems there are different levels of description. Essentially, there is the cognitive (or software) level and the physical (or hardware) level. [22] Aspects of the software level, he argues quite reasonably, will be determined by the hardware. The evidence in favour of a language-based theory of representation is that there are some aspects of linguistic performance which cannot be changed by thought alone. Therefore, they must be built into the hardware. Further study is needed to distinguish these essential hardware constraints from accidental hardware constraints, but this need not concern us. [23] The core of the theory is that there are cognitively impenetrable features of representation, and that these features are linguistic.
These features are called the "functional architecture". When I assert that these features are linguistic, I do not mean that they are encoded in a language. Rather, what I mean is that the architecture is designed to be a formal, or principled architecture. It is structure-sensitive and abstract. Perhaps the most widely known functional architecture is Chomsky's "universal grammar" which contains "certain basic properties of the mental representations and rule systems that generate and relate them." [24] Similarly, Fodor proposes a "Language of Thought" which has as its functional architecture a "primitive basis", which includes an innate vocabulary, and from which all representations may be constructed. [25]
The best argument in favour of the functional architecture theory (in my opinion) is the following. In order to have higher-level cognitive functions, it is necessary to have the capacity to describe various phenomena at suitable degrees of abstraction. With respect to cognitive phenomena themselves, these degrees of abstraction are captured by 'folk psychological' terminology - the language of "beliefs", "intentions", "knowledge" and the like (as opposed to descriptions of neural states or some other low level description). The adequacy of folk psychology is easily demonstrated [26] and it is likely that humans employ similarly abstract descriptions in order to use language, do mathematics, etc.
It is important at this time to distinguish between describing a process in abstract terms and regulating a process by abstract rules. There is no doubt that abstractions are useful in description. But, as argued above, connectionist systems which are not governed by abstract rules can nonetheless generalize, and hence, describe in abstract terms. What Chomsky, Fodor and Pylyshyn are arguing is that these abstract processes govern human cognition. This may be a mistake. The tendency to say that rules which may be used to describe a process are also those which govern it may be what Johnson-Laird calls the "Symbolic Fallacy".
What is needed, if it is to be argued that abstract rules govern mental representations, is inescapable proof that they do. Pylyshyn opts for a cognitive approach; he attempts to determine those aspects of cognition which cannot be changed by thought alone. The other approach which could be pursued is the biological approach: study brains and see what makes them work. Although I do not believe that psychology begins and ends with the slicing of human brains, I nonetheless favour the biological approach in this instance. For it is arguable that nothing is cognitively impenetrable.
Let me turn directly to the attack, then. If the physical construction of the brain can be affected cognitively, then even if rules are hard-wired, they can be cognitively penetrable. If we allow that "cognitive phenomena" can include experience, then there is substantial experimental support which shows that the construction of the brain can be affected cognitively. There is no reason not to call a perceptual experience a cognitive phenomenon. For otherwise, Pylyshyn's argument begs the question, since it is easy, and evidently circular, to oppose empiricism if experience is not one of the cognitive phenomena permitted to exist in the brain.
A series of experiments [has] shown that the physical constitution of the brain is changed by experience and especially by experience at an early age. Hubel and Wiesel showed that a series of rather gruesome stitchings and injuries to cats' eyes changes the pattern of neural connectivity in the visual cortex. [27] Similar phenomena have been observed in humans born with eye disorders. Even after a disorder, such as crossed eyes, has been surgically repaired, the impairment in visual processing continues. Therefore, it is arguable that experience can change the physical constitution of the brain. There is no a priori reason to argue that experience might not also change some high-level processing capacities as well.
Let me suggest further that there is no linguistic aspect of cognition which cannot be changed by thoughts and beliefs. For example, one paradigmatic aspect of linguistic behaviour is that the rules governing the manipulation of a symbol ought to be truth-preserving. Thus, no rule can produce an internally self-contradictory representation. However, my understanding of religious belief leads me to believe that many religious beliefs are inherently self-contradictory. For example, some people really believe that God is all-powerful and yet cannot do some things. If it is possible even to entertain such beliefs, then it is possible at least to entertain the thought that the principle of non-contradiction could be suspended. Therefore, the principle does not act as an all-encompassing constraint on cognition. It is therefore not built into the human brain; it is learned.
Wittgenstein recognized that many supposedly unchangeable aspects of cognition are not, in fact, unchangeable. In On Certainty, Wittgenstein examines those facts and rules which constitute the "foundation" or "framework" of cognition. He concludes, first, that there are always exceptions to be found to those rules, and second, that these rules change over time. He employs the analogy of a "riverbed" to describe how these rules, while foundational, may shift over time.
If the rules which govern representations are learned, and not innate, then it does not follow that it is necessary that these rules ought to follow any given a priori structure. And if this does not follow, then it further does not follow that rules ought to follow a priori rules of linguistic behaviour. Since the rules which govern representations are learned, then, it follows that there is no a priori reason to suppose that human representations are linguistic in form. They could be, but there is no reason to suppose that they must be. The question, then, of what sort of rules representations will follow will not be determined by an analysis of representations in the search of governing a priori principles. We must look at actual representations to see what they are made of and how they work. In a word, we must look at the brain.
D. Two Short Objections to Fodor
Before looking at brains, I would like in this section [to] propose two short objections to Fodor's thesis that language use is governed by a set of rules which is combinatorial and structure-sensitive.
The first objection is that it would be odd if the meaning of sentences was determined strictly by the meanings of their atomic components. The reason why it would be odd is that we do not determine the meaning of words according to the meanings of their atomic components, namely, letters. It is in fact arguable that even the pronunciation of words is not determined strictly according to the pronunciation of individual letters (here I am reminded of "ghoti", the alternative spelling for "fish" often attributed to Shaw). Further, just as the pronunciation of individual letters is affected by the word as a whole (the letter "b" is pronounced differently in "but" and "bought" - the shape of the mouth differs in each case), so may also the meaning of a word be affected by the sentence as a whole and the context in which it is uttered. [28] See figure 8.
The second objection is that a strict application of grammatical rules results in absurd sentences. Therefore, something other than grammatical rules governs the construction of sentences. Rice [29] points out that the transformation from "John loves Mary" to "Mary is loved by John" ought to follow only from the transitivity of "loves". However, the transformation from "John loves pizza" to "Pizza is loved by John" is, to say the least, odd. In addition, grammatical rules, if they are genuinely structure-sensitive, ought to be recursive. Thus, the rule which allows us to construct the sentence "The door the boy opened is green" from "The door is green" ought to allow us to construct "The door the boy the girl hated opened is green" and the even more absurd "The door the boy the girl the bot bit hated opened is green" and so on ad infinitum ad absurdum (or whatever).
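The recursive rule at issue can be written out mechanically. The following is only a sketch: the function name and its arguments are invented for illustration, and it models nothing more than the nesting of relative clauses in the door example. It shows that the rule applies just as readily at depth two as at depth one, even though the result quickly stops sounding like English.

```python
def center_embed(clauses, head="The door", predicate="is green"):
    """Nest relative clauses inside one another.

    `clauses` is a list of (noun phrase, verb) pairs, outermost clause first.
    The verbs come out in reverse order, which is what makes deep nesting
    so hard to read.
    """
    nouns = [noun for noun, _ in clauses]
    verbs = [verb for _, verb in reversed(clauses)]
    return " ".join([head, *nouns, *verbs, predicate])


if __name__ == "__main__":
    print(center_embed([]))
    print(center_embed([("the boy", "opened")]))
    print(center_embed([("the boy", "opened"), ("the girl", "hated")]))
```

Nothing in the rule itself marks the point at which the output becomes unacceptable, which is the difficulty for a purely structure-sensitive account.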
E. Brains and Distributed Representations
The human brain is composed of a set of interconnected neurons. Therefore, in order to determine whether or not the brain, without the assistance of higher-level rules, can construct representations, it is useful to determine, first, whether or not representations can be constructed in systems comprised solely of interconnected neurons, and second, whether or not they are in fact constructed in brains in that manner.
At least at some levels, there is substantial evidence that representations both can be and are stored non-linguistically. Sensory processing systems, routed first through the thalamus and then into the cerebral cortex, produce not sentences and words, but rather, representational fields (much like Quine's 'quality spaces' [30]) which correspond to the structure of the input sensory modality. For example, the neurons in level IV of the visual cortex are arranged in a single sheet. The relative positions of the neurons on this sheet correspond to the relative positions of the input neurons in the retina. Variations in the number of cells producing output from the retina produce corresponding variations in the arrangement of the cells in V-IV. Similarly, the neurons in the cortex connected to input cells in the ear are arranged according to frequency. In fact, the cells of most of the cerebral cortex may be conceived to be arranged into a "map" of the human body. [31] See figure 9.
The connections between neurons in the human visual system are just the same as those in the connectionist networks described above. There are essentially two types of connections: excitatory connections, which tend to connect one layer of cells with the next, as for example the connections between retinal cells and ganglia are excitatory; and inhibitory connections, which tend to extend horizontally at a given level, as for example the connections between the horizontal cells and the bi-polar cells in the retina are inhibitory. [32] See figure 10. Thus, the structure of cells processing human vision parallels the structure of IAC networks described above.
It is, however, arguable that while lower-level cognitive processes such as vision can occur in non-linguistic systems, higher-level cognitive processes, and specifically those in which we entertain beliefs, knowledge, and the like, must be linguistically based. [33] In order to show that this is wrong, it is necessary to show that higher-level cognitive processes may be carried out by an exclusively lower-level structure. The lower level structure which I believe accomplishes this is distributed representation.
I think that the best way to show that these higher-level functions can be done by connectionist systems is to illustrate those arguments which conclude that they cannot be done and to show how those arguments are misconstrued. In the process of responding to these objections, I will describe in more detail the idea of distributed representation.
F. Distributed Representations: Objections and Responses
There are two types of objections to distributed representations.
The first of these is outlined by Katz, though Fodor provides a version of it. The idea is that if representations depend only on connected sets of units or neurons, then it is not possible to sort out two distinct representations or two distinct kinds of representations. Katz writes, "Given that two ideas are associated, each with a certain strength of association, we cannot decide whether one has the same meaning as the other, whether they are different in meaning... etc." For example, "ham" and "eggs" are strongly associated, yet we cannot tell whether this is a similarity in meaning or not. [34]
Fodor's argument is similar. Suppose each unit stands for a sentence fragment, for example, "John -" is connected to " - is going to the store." Suppose further that a person at the same time entertains a connection between "Mary -" and " - is going to school". In such a situation, we are unable to determine which connections, such as that between "John -" and " - is going to the store", represent a sentence and which, for example that between "John -" and "Mary -", represent some other association, for example, between brother and sister. [35]
In order to respond to this objection, it is necessary to show that there can be different types of connections. Thus Katz suggests, for example, that there ought to be meaning connections, similarity connections, and the like (I am here using "connection" synonymously with "associations", which is not exactly right, but close enough: think of an association as a set of connections). If there are distinct types of connections, then there is a distinguishable subset of connections, say, meaning-connections or similarity-connections, which exist necessarily in order to distinguish sentences from, say, similarities. But the set of sentence-connections would be just the functional architecture or primitive basis described above.
There is, however, another way to respond to this objection. First, I would like to suggest that sentence-construction and internal representation are different sorts of activities. Sentence-formation is a behavioural or output process, while representation is a cognitive process. The only way to argue against this suggestion is to argue that representations are irreducibly linguistic. Since this is exactly the issue of debate, it will not do to stipulate that representations must be irreducibly linguistic. So it is permissible for me to suggest, at least as a hypothesis, that the two functions are separate.
Now let me consider how representations such as "John -", "Ham", etc. are stored in connectionist systems. [36] It is not the case that they are stored as sentence fragments, as Fodor misleadingly suggests. Rather, a representation of an individual thing, such as "ham" (ignoring such questions as whether we are talking about one ham or ham in general) is a set of connections between one unit at a hidden layer and a number of other units at a different layer. In other words, a representation is a pattern of connectivity. These patterns may be represented as "vectors" of connections between the hidden unit and a matrix of other units at another layer. Therefore, the representation of "ham" is a set of active and non-active units connected to a given unit and the activation of the representation is the activation of the appropriate neurons in that pattern. It is often convenient to line up the matrix of units and display the vector which corresponds to the hidden units as a series of 1s and 0s according to whether the units are on or off. So the vector for "Ham" could be represented as "10010010010...10010".
Now we can distinguish between several types of connections. Two representations, that is, two units at the hidden level, are "similar" if and only if their vectors of connections are similar. In turn, two representations, that is again, two units at the hidden layer, are in some other way associated if they are both units which constitute part of a vector of a higher layer unit. This is a different sort of association, although the mechanism which produce[s] this distinct type of association is exactly the same as the former. A third type of connection exists between units at a given layer, and that is if the units are clustered with each other via inhibitory connections to form a competitive pool.
Therefore, there can be three types of association which can be defined in a connectionist system: relations of similarity, which correspond to similarities of vector; relations of category, which correspond to competitive pools at a given layer [37]; and conceptual relations, which correspond to two different units being a part of the same vector. It should be evident that there can be many types of conceptual relations, in fact, one conceptual relation for each vector for each unit at the next layer. One type of conceptual relation is membership in the same sentence. How this process occurs will be described below. See figure 11.
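The difference between similarity and conceptual relation can be made concrete with a small sketch. Everything here is assumed for the sake of illustration: the eight-place vectors, the unit names, the single higher-layer unit "breakfast", and the choice of "fraction of matching places" as the measure of similarity. The point is only that Katz's "ham" and "eggs" can be strongly associated without being similar.

```python
# Hypothetical connection vectors for three hidden units (1 = connected unit on).
VECTORS = {
    "ham":   "10010010",
    "bacon": "10010110",
    "eggs":  "01101001",
}

# Hypothetical higher-layer units, each listing the hidden units in its vector.
HIGHER_UNITS = {
    "breakfast": {"ham", "eggs"},
}


def similarity(a, b):
    """Fraction of places at which the two units' vectors agree."""
    u, v = VECTORS[a], VECTORS[b]
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(x == y for x, y in zip(u, v)) / len(u)


def conceptually_related(a, b):
    """True if both units belong to the vector of some higher-layer unit."""
    return any(a in members and b in members for members in HIGHER_UNITS.values())


if __name__ == "__main__":
    for pair in (("ham", "bacon"), ("ham", "eggs")):
        print(pair, similarity(*pair), conceptually_related(*pair))
```

On these made-up vectors, "ham" and "bacon" come out similar but not conceptually related, while "ham" and "eggs" come out conceptually related but not similar.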
The second sort of criticism of distributed representation is that if representations are distributed, then two representations of the same thing, say "Ham", might be different from each other (Fodor: "no two people ever are in the same intentional stance" [38]). Suppose then we have two different patterns of connectivity, each of which we'll say stands for "Fa" (of course, it doesn't stand for, or correspond to, the sentence per se, but that's the way Fodor puts it so we'll leave it like that). According to Fodor, one of these must be the representation, while the other is only an approximation of the representation. The problem lies in determining which of these two patterns of connectivity actually stands for "Fa". [39]
In response, one might wonder why there ought to be one and only one meaning or representation of "Fa". Fodor considers this solution to be "the kind of yucky solution they're crazy about in AI". Ad hominem aside, it is far from common-sense to believe, as Fodor believes, that there can be one and only one representation of "John Lennon is a better lyricist than Paul McCartney". Further, connectionist systems provide an understanding of how there could be indeterminate representations. This provides a flexibility which language-based systems cannot provide, a flexibility which is essential in our everyday lives.
Let us suppose that a given unit stands for "Fa" (a bit incorrect, but let's suppose). Then this unit may be activated by the vector "11001100". Yet (and this is easily proven on connectionist systems) even a partially complete or incorrect vector will activate the unit in question. For example, it is easily shown that the unit will be activated by "11001101". Fodor's objection consists in the objection that there is no one vector that is the vector that ought to represent "Fa". But that is like attacking a critic of Platonic forms on the ground that there is no means of determining which of two hand drawn triangles is the triangle.
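A minimal sketch of this behaviour follows. It assumes, purely for illustration, that the unit fires whenever at least three quarters of the places in the incoming vector match its stored pattern; the threshold value and the function name are not drawn from any particular model.

```python
PATTERN = "11001100"  # the vector the unit was trained on, as in the text


def unit_activates(vector, pattern=PATTERN, threshold=0.75):
    """True if enough places of `vector` match `pattern` to activate the unit.

    The 0.75 threshold is an assumption made for this sketch.
    """
    if len(vector) != len(pattern):
        raise ValueError("vector and pattern must have the same length")
    matches = sum(a == b for a, b in zip(vector, pattern))
    return matches / len(pattern) >= threshold


if __name__ == "__main__":
    for candidate in ("11001100", "11001101", "00110011"):
        print(candidate, unit_activates(candidate))
```

Both the exact vector and the one-place variant activate the unit, and neither has a better claim to being "the" vector for "Fa".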
This is an important concept, and I wish to linger just a moment, for it will surface again. Any given concept, and indeed any given representation, need not correspond to a particular vector, but rather, may consist of a set of vectors. And further, this set of vectors may not have precise boundaries, for what counts as an instance of a concept may vary according to what other concepts are contained in the system. If, for example, there are three colour concepts, then the colour concept "red" may correspond to a wide set of vectors. On the other hand, if there are eight colour concepts, then the set of vectors which correspond to "red" may be much narrower.
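The colour example can be sketched numerically. The sketch replaces vectors with a single number, a hue angle from 0 to 359, and assigns each hue to whichever colour concept has the nearest prototype; the prototype angles and the nearest-prototype rule are assumptions chosen only to make the point visible.

```python
def hue_distance(a, b):
    """Distance between two hue angles on a 360-degree colour wheel."""
    d = abs(a - b) % 360
    return min(d, 360 - d)


def extent(concept, prototypes):
    """All whole-degree hues whose nearest prototype is `concept`."""
    members = []
    for hue in range(360):
        nearest = min(prototypes, key=lambda name: hue_distance(hue, prototypes[name]))
        if nearest == concept:
            members.append(hue)
    return members


# Hypothetical prototype hues for a three-concept and an eight-concept system.
THREE = {"red": 0, "green": 120, "blue": 240}
EIGHT = {"red": 0, "orange": 30, "yellow": 60, "green": 120,
         "cyan": 180, "blue": 240, "purple": 280, "pink": 330}

if __name__ == "__main__":
    print(len(extent("red", THREE)), len(extent("red", EIGHT)))
```

The same prototype for "red" covers far fewer hues once orange and pink are present in the system, although nothing about "red" itself has changed.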
In summary, first, I argue that representational structures are learned; they are not innate. Second, I argue that they are distributed, and not symbolic. And third, I argue that concepts are fuzzy, and not precise. The third is what we would expect were the second true, and the second is what we would expect if the first were true. But my third claim can be empirically tested. It is possible to examine actual human concepts, categories and the like in order to determine whether or not they are vague or fixed. If, as I suspect a rigorous empirical examination will show, they are in fact vague, then, first, I have a confirming instance for my own theory, and second, the language-of-thought theory has a serious difficulty with which it must contend.
TNP Part VI Next Post
[18] See Fodor and Pylyshyn, "Connectionism and Cognitive Architecture: A Critical Analysis", in Steven Pinker and Jacques Mehler, Connections and Symbols, pp. 12-13.
[19] Pylyshyn's "Dial 911 in the event of a fire" example contains much the same argument. There are many ways to recognize a fire, and many ways to dial 911. It is implausible that we have a rule for each possible case. Thus, we have a general rule which covers these sorts of situations.
[20] Shepard's experiments and others are summarized in Kosslyn et al., "On the Demystification of Mental Imagery" in Ned Block (editor), Imagery. See also Stephen Kosslyn, Ghosts in the Mind's Machine.
[21] in Block, Imagery.
[22] Zenon Pylyshyn, Computation and Cognition. Pylyshyn does not discuss what I call the data level. Others, for example McCorduck, suggest that we ought to contemplate a further "knowledge" level. Pamela McCorduck, "Artificial Intelligence: An Apercu", in Stephen Graubard (editor), The Artificial Intelligence Debate, p. 75.
[23] For example, one accidental feature of the hardware might be the material that it is built from. Theorists who assert that only humans have the appropriate hardware are called identity theorists. See U.T. Place, "Is Consciousness a Brain Process?", and J.C.C. Smart, "Sensations and Brain Processes", both in V.C. Chappell (editor), The Philosophy of Mind (1962), pp. 101-109 and 160-172 respectively.
[24] Noam Chomsky, "Rules and Representations", Behavioral and Brain Sciences 3 (1980), p. 10.
[25] Jerry Fodor, Language of Thought.
[26] See Jerry Fodor, Psychosemantics, ch. 1.
[27] See James Kalat, Biological Psychology (1988), p. 192.
[28] Philip Johnson-Laird, The Computer and the Mind, p. 290, cites A.M. Liberman et al., "Perception of the Speech Code", Psychological Review 74 (1967) to make the point.
[29] Sally Rice, PhD Thesis, cited in Jeff Elman, "Representation in Connectionist Models", Connectionism Conference, Simon Fraser University, 1990.
[30] W.V.O. Quine, Word and Object.
[31] All this is surveyed in P.S. Churchland, Neurophilosophy and James Kalat, Biological Psychology.
[32] Kalat, Biological Psychology, p.185.
[33] E.g. Jerry Fodor and Zenon Pylyshyn, "Connectionism and Cognitive Architecture: A Critical Analysis" in S. Pinker and J. Mehler, Connections and Symbols. Also, in Anderson and Bower, Human Associative Memory: A Brief Edition (1980), p. 65.
[34] Jerrold Katz, The Philosophy of Language, ch. 5.
[35] Fodor and Pylyshyn, "Connectionism and Cognitive Architecture", in Connections and Symbols, p. 18.
[36] This account is drawn from Rumelhart and McClelland, Parallel Distributed Processing, Vol. 2, Ch. 17, and from Hinton and Anderson, Parallel Models of Associative Memory.
[37] Categories are also defined by similarities. This is discussed below.
[38] Fodor, Psychosemantics, p. 57.
[39] Fodor, Psychosemantics.
Posted by Downes at 4:42 PM
TNP 4. Connectionism
The Network Phenomenon: Empiricism and the New Connectionism
Stephen Downes, 1990
(The whole document in MS-Word)
TNP Part III Previous Post
IV. Connectionism
A. Basic Connectionist Structures
A connectionist system consists of a set of neurons, or "units", and a set of connections between those units. The units may be activated or inactivated. Most systems employ simple on-off activations, although other systems allow for degrees of activation. The motivation for this basic structure is biological. Connectionist systems emulate human brains, and human brains consist of interconnected neurons which may be activated (spiking) or inactivated.
The idea is that a unit "i", if activated, sends signals or "output" via connections to other units. Other units, in turn, send signals to other units, including unit "i". These signals comprise part of the "input" to "i". It is also possible to provide input to "i" via some external mechanism, in which case the input is referred to as "external input". For any given unit "i", the state of activation of "i" at time t depends on its external and internal input at time t and the state of activation at time t-1. Input can be excitatory or inhibitory. If it is excitatory, then the unit tends to become activated, while if it is inhibitory, then the unit tends to become inactivated.
Connections between units may similarly be excitatory or inhibitory. If a connection between two units "i" and "j" is excitatory, then if the output from "i" is excitatory, then the input to "j" will be excitatory, and if the output from "i" is inhibitory, then the input to "j" will be inhibitory. If the connection between "i" and "j" is inhibitory, then excitatory output will produce inhibitory input, and inhibitory output will produce excitatory input. See figure 1.
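The sign rule and the dependence of a unit's state on its previous state can be written down compactly. The sketch below is illustrative only: the numeric coding (+1 for excitatory, -1 for inhibitory), the function names, the decay parameter and the clipping range are assumptions made for the example, not features of any particular connectionist model.

```python
EXCITATORY, INHIBITORY = 1, -1  # assumed numeric coding of the two signal types


def input_sign(output_sign, connection_sign):
    """Sign of the input reaching unit j, given i's output and the i-j connection."""
    return output_sign * connection_sign


def next_activation(previous, internal_inputs, external_input=0.0, decay=0.5):
    """Activation at time t from the activation at t-1 plus all current input.

    The decay factor and the clipping to [-1, 1] are assumptions of this sketch.
    """
    net = sum(internal_inputs) + external_input
    value = (1.0 - decay) * previous + net
    return max(-1.0, min(1.0, value))


if __name__ == "__main__":
    print(input_sign(EXCITATORY, INHIBITORY))   # excitatory output, inhibitory link
    print(input_sign(INHIBITORY, INHIBITORY))   # inhibitory output, inhibitory link
    print(next_activation(0.2, [0.5, -0.2], external_input=0.1))
```

Treating the two kinds of signal as +1 and -1 makes the four cases in the text fall out of a single multiplication.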
Typically, units in a given network are arranged into visible and hidden layers. The visible layers are in turn divided into input and output units. The idea is that the hidden units are sandwiched between the input and the output layers. The input units are activated by external input, and these in turn activate appropriately connected hidden units. These hidden units in turn activate output units. Thus, one might say that input stimulus produces output response. See figure 2.
B. Pools
The designation of one or another set of the visible units into input or output units is to some degree arbitrary. It is often more useful to think of visible units as being divided into "pools" such that any given pool or set of pools may be a set of input or output units depending on circumstances. The idea is that units in a given pool may be connected to each other and to units at higher or lower levels, but not to units in other pools at the same level.
The basic idea is derived from Feldman. [12] Suppose we have two units "i" and "j", each with a single input and a single output. Excitatory input will activate the unit and it will in turn send excitatory output. Suppose now that each unit is connected to the other such that if "i" is activated, it will tend to inhibit "j", and if "j" is activated, it will tend to inhibit "i". Then over time, whichever unit receives more input activation will tend toward maximum activation, while the other will tend toward minimum activation (that is, maximum inhibition). A network like this is called a "Winner-Take-All (WTA)" network. See figure 3.
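The dynamics just described can be simulated in a few lines. This is a sketch under stated assumptions rather than Feldman's own formulation: the update rule, the inhibition strength, the step size, and the activation range of -1 to 1 are all chosen for illustration.

```python
def winner_take_all(external_input, steps=100, inhibition=2.0, rate=0.2):
    """Simulate a pool of mutually inhibiting units; return final activations.

    Each unit receives its own external input minus inhibition from every
    currently active rival. Activations are clipped to the range [-1, 1].
    """
    activations = [0.0] * len(external_input)
    for _ in range(steps):
        updated = []
        for i, current in enumerate(activations):
            # Only active (positive) rivals send inhibition to unit i.
            rivals = sum(max(0.0, a) for j, a in enumerate(activations) if j != i)
            net = external_input[i] - inhibition * rivals
            updated.append(max(-1.0, min(1.0, current + rate * net)))
        activations = updated
    return activations


if __name__ == "__main__":
    print(winner_take_all([0.6, 0.5]))
    print(winner_take_all([0.3, 0.7, 0.5]))
```

Even a small difference in input (0.6 against 0.5) is enough: the better-supported unit climbs to maximum activation and drives the other to minimum.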
If you do this with two or more units, you have a pool. An example of this sort of structure is McClelland and Rumelhart's "IAC (Interactive Activation and Competition)" network. [13] This is an interesting network because it shows how networks can categorize and generalize.
McClelland and Rumelhart use as an example a network called "Jets and Sharks". Each unit at the visible level stands for some property of a person, for example, his age (in20s, in30s, in40s), his occupation (burglar, pusher, etc.), his name, his education, and so on. Each of these sets of properties (age, occupation, name, education) constitutes a single pool. Units in a given pool are connected to other units in the pool and to units in a second, hidden layer of units. The connections have been predefined such that connections between members of a given pool are inhibitory and connections between the visible and the hidden layers of units are excitatory. See figure 4.
The idea is that each unit at the hidden layer stands for an individual person. What characterizes that person (that is, the knowledge stored about the person) is not some property of the unit in the hidden layer, but rather, the set of connections between that unit and the units at the visible layer. Take, for example, some hidden unit "i". This unit contains no information about the person. However, it is connected to the units at the visible layer standing for Jake, burglar, in20s, high school, Jet (the gang name), etc.
Suppose now that the network could learn the connections just described. Then it would have learned a system of categorization. Each of the pools constitutes a distinct category. The fact that it is a category is established by the inhibitory connections between all and only units of a given pool and by the fact that one, and only one, unit in a given pool is connected to any given unit at the hidden layer. What makes, say, in20s, in30s, and in40s a single category is first the fact that they have something in common - they are all associated with some person - and second that they are mutually exclusive. The activation of one inhibits the rest.
The IAC network can perform a number of associative and inductive tasks. For example, suppose we activated "burglar", "in20s" and "high school". Then, because of the excitatory connections to a given hidden unit, that unit would become activated. In turn, the hidden unit would send excitatory output via an excitatory connection to the visible unit representing "Jake". Thus, by input to a set of features, an individual's name may be recognized. What is interesting is that the name may be activated even if the description is incomplete or incorrect. [14] The reliability of such a conclusion drawn in such circumstances varies according to the scale of the missing or incorrect information and according to other connections in the network.
Such a network can also generalize. For example, suppose the unit for "Jet" were activated. This unit is connected to a number of hidden units, and these will be activated. Each of the hidden units will send excitatory output to units in the other visible pools. Several units in each pool may be excited. However, since the connections between the units of a given [pool] are inhibitory, then only the unit with the greatest excitatory input will be activated; the rest will be inhibited. Thus, upon the activation of "Jets", a set of units, one in each pool, will also become activated. This set of units is a "stereotypical" picture of the Jets. For example, activating "Jets" may result in the activation of "in20s", "pusher", "high school", etc. One might say that these features constitute a definition of the category "Jets" even though no individual Jet has all and only those stereotypical features.
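The recognition and generalization behaviour described above can be imitated with a much-simplified sketch. It is not McClelland and Rumelhart's IAC update rule: excitation is passed once up to the hidden layer and once back down, and the inhibition within a pool is reduced to "the strongest unit wins". Only Jake's properties come from the text; the other three records, and the function name `settle`, are invented for the illustration.

```python
# Hidden units: one per person. Only Jake's properties come from the text;
# the other three records are invented for the illustration.
PEOPLE = [
    {"name": "Jake", "gang": "Jet", "age": "in20s", "occupation": "burglar", "education": "high school"},
    {"name": "Al", "gang": "Jet", "age": "in20s", "occupation": "pusher", "education": "junior high"},
    {"name": "Ben", "gang": "Jet", "age": "in30s", "occupation": "pusher", "education": "high school"},
    {"name": "Carl", "gang": "Shark", "age": "in40s", "occupation": "bookie", "education": "college"},
]


def settle(cue):
    """Activate the visible units named in `cue` and return the winner in each pool."""
    # Upward pass: each hidden unit sums excitation from the cued visible units.
    hidden = [sum(1 for value in person.values() if value in cue) for person in PEOPLE]
    # Downward pass: each hidden unit excites the visible units it is connected to.
    pools = {}
    for person, activation in zip(PEOPLE, hidden):
        for pool, value in person.items():
            units = pools.setdefault(pool, {})
            units[value] = units.get(value, 0) + activation
    # Inhibition within a pool, simplified to "the strongest unit wins".
    winners = {}
    for pool, units in pools.items():
        best = max(units.values())
        leaders = [unit for unit, total in units.items() if total == best]
        winners[pool] = leaders[0] if len(leaders) == 1 else None  # None marks a tie
    return winners


print(settle({"burglar", "in20s", "high school"})["name"])  # recognition from features
print(settle({"burglar", "in20s", "college"})["name"])      # one feature is wrong
print(settle({"Jet"}))                                      # a "stereotypical" Jet
```

In this made-up data the stereotype returned for "Jet" is in20s, pusher and high school, although no single Jet in the list has all three of those features.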
A similar sort of network performs well in visual recognition or multiple constraint tasks. One could input observed features of a person at a distance and the output could be that person's name. As with the previous example, the nature and reliability of the conclusion will vary according to circumstances. For example, a distinctive walk could very quickly aid in the determination of a given person's name, but if the system has information about two such people with the same distinctive walk, then this determination will not be so quick and so sure.
C. Learning in Connectionist Networks
In my mind, the real advantage of connectionism does not lie in the features just described, for those features could be realized by a system with enough predefined rules. Rather, the advantage is that such a system can learn its own connective structure. A network learns by adjusting connection weights between different units. There are several ways of doing this, and this accounts for one of the major differences between types of connectionist systems.
The simplest sort of system employs the Hebb rule. According to the Hebb rule, if two units are simultaneously activated, then the connection between them should be strengthened. Similarly, the connection between two units should be strengthened if the two units are simultaneously inhibited. If the two units are at a given time at different states of activation (one is excited, the other inhibited) then the connection is weakened. The major problem (in my mind) with the Hebb rule is that it doesn't work in networks with more than two layers. This is a substantial weakness, since as Minsky and Papert point out, a two-layer network cannot distinguish between, say, exclusive and inclusive disjunctions. [15]
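The Hebb rule as stated here can be written as a single line of arithmetic. The sketch below assumes a coding of +1 for an excited unit and -1 for an inhibited one, and a learning rate of 0.1; both are choices made for the illustration.

```python
def hebb_update(weight, a_i, a_j, rate=0.1):
    """Return the new connection weight after one application of the Hebb rule.

    Activations are coded +1 (excited) or -1 (inhibited), so the product
    a_i * a_j is +1 when the two units agree and -1 when they differ.
    """
    return weight + rate * a_i * a_j


weight = 0.0
for a_i, a_j in [(1, 1), (-1, -1), (1, -1)]:
    weight = hebb_update(weight, a_i, a_j)
    print(a_i, a_j, round(weight, 2))
```

The first two updates strengthen the connection (both units excited, then both inhibited) and the third weakens it (the units disagree).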
Most contemporary systems use a version of the "delta-learning rule". In such a system it is important to distinguish between input and output units. Input units are excited and consequent output noted (if there are more than two layers and no connections are yet established there will be no output at first). The output obtained is compared to the desired output and an error is computed. From this error a correction can be calculated (take the error and apply a linear or non-linear function to it. This produces a curve, the slope of which will be zero at the point of minimum error. There are various strategies for going 'down' the curve.) This correction is then propagated back through the network and the correction is distributed across the connections which contributed to the error. This process is called back-propagation.
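The compare-and-correct cycle can be sketched for the simplest case: a single linear output unit with no hidden layer, so that nothing has to be propagated back. This is the plain delta rule and not full back-propagation; the function names, the learning rate and the number of passes are assumptions made for the illustration.

```python
def train(samples, rate=0.5, epochs=500):
    """Fit one linear output unit to (inputs, target) pairs with the delta rule."""
    w1, w2, bias = 0.0, 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        d1 = d2 = db = 0.0
        for (x1, x2), target in samples:
            error = target - (w1 * x1 + w2 * x2 + bias)  # desired minus obtained
            d1 += error * x1
            d2 += error * x2
            db += error
        # Correct each connection in proportion to its share of the error.
        w1 += rate * d1 / n
        w2 += rate * d2 / n
        bias += rate * db / n
    return w1, w2, bias


def outputs(samples, weights):
    """Output of the trained unit for each input pattern, rounded for display."""
    w1, w2, bias = weights
    return [round(w1 * x1 + w2 * x2 + bias, 3) for (x1, x2), _ in samples]


INCLUSIVE_OR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
EXCLUSIVE_OR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

print(outputs(INCLUSIVE_OR, train(INCLUSIVE_OR)))
print(outputs(EXCLUSIVE_OR, train(EXCLUSIVE_OR)))
```

For the inclusive disjunction the outputs fall on the correct side of 0.5 for every pattern. For the exclusive disjunction the same unit settles on an output of 0.5 for all four patterns, which is the two-layer limitation attributed above to Minsky and Papert.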
Such a system depends on a teacher. This sometimes seems to pose a problem for artificial intelligence theorists who would rather the machine learned completely on its own. [16] However, experience can teach. The concept of "lessons of nature" dates back at least to the Scholastic philosophy of the Middle Ages and is a central thesis of empiricism. One can imagine how some particular output (a response or behaviour) might require correction because it causes, or fails to prevent, pain.
One of the difficulties encountered in back-propagation systems is the problem of "local minima". What happens when you calculate the error curve for a number of variables is that you might not get one single location where the slope of the curve is zero; you might get several. One such point may still be an error, but since the slope is zero, there is no means for the system to correct itself, since the degree of correction is usually a function of the slope of the curve. What you want to do is to "shake" the system so that it reaches the lowest minimum. See figures 5 and 6.
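A one-variable example shows how a system that only follows the slope can stall. The error curve below is invented for the illustration (it is not taken from figures 5 and 6); it has a shallow dip near x = 1.13 and a deeper one near x = -1.30.

```python
def error(x):
    """A made-up error curve with a shallow dip near 1.13 and a deeper one near -1.30."""
    return x**4 - 3 * x**2 + x


def slope(x):
    """Derivative of the error curve."""
    return 4 * x**3 - 6 * x + 1


def descend(x, rate=0.01, steps=2000):
    """Follow the slope downhill; the correction vanishes as the slope approaches zero."""
    for _ in range(steps):
        x -= rate * slope(x)
    return x


right = descend(2.0)   # starts on the right-hand side of the curve
left = descend(-2.0)   # starts on the left-hand side of the curve
print(round(right, 2), round(error(right), 2))
print(round(left, 2), round(error(left), 2))
```

Started from the right, the descent stops in the shallow dip, where the slope is zero but the error is still well above the lowest value the curve can reach.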

This is accomplished in two stages. First, each unit is considered to have two possible states of activation, namely, activated and inactivated. Each unit has another state, which is its probability of activation. Input from other units or from external units affects the probability of activation and not the state of activation itself. Then, in the second stage, the probability that a given unit will be activated is represented by the function:
where E stands for the "energy" of the system and "T" stands for the "temperature" of the system. [17]
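In the Boltzmann machine versions referred to in note [17], the activation probability is usually written in the following form. This is the standard statement from that literature and may differ in notation from the function intended here; the symbols p_i (the probability that unit i is activated) and \Delta E_i (the energy saved when unit i is activated rather than inactivated) are assumptions.

```latex
p_i = \frac{1}{1 + e^{-\Delta E_i / T}}
```

When T is large the exponent is close to zero and p_i is close to 1/2 whatever the energy; when T is small p_i is close to 1 if activation lowers the energy and close to 0 if it raises it.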
The reason why the terms "energy" and "temperature" are employed is that the equation above is borrowed directly from physics. Essentially, the higher the temperature, the more random the activation or inactivation of a given unit will be. As it turns out, if a system is started at a high temperature and then, as processing continues, the temperature is lowered, the system is much less likely to settle into a local minimum. This process is called "annealing" and is exactly analogous to the physical process (used to produce stable crystalline formations) of the same name.
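A minimal sketch of annealing, assuming the usual Boltzmann machine activation rule and a hypothetical two-unit network, looks like this. The weight, the biases, the cooling schedule and all function names are invented for the illustration. Because the updates are random, the final state depends on the seed, so no particular outcome is claimed.

```python
import math
import random

# Hypothetical two-unit network: a strong positive connection, and a bias
# that discourages each unit from switching on by itself.
WEIGHT = 3.0
BIAS = [-1.0, -1.0]


def energy(state):
    """Energy of a state (a pair of 0/1 activations); lower is better."""
    s0, s1 = state
    return -WEIGHT * s0 * s1 - BIAS[0] * s0 - BIAS[1] * s1


def activation_probability(gap, temperature):
    """Chance that a unit switches on, given the energy it would save by doing so."""
    return 1.0 / (1.0 + math.exp(-gap / temperature))


def anneal(state, start=10.0, stop=0.05, cooling=0.9, seed=0):
    """Update the units at random while the temperature is gradually lowered."""
    rng = random.Random(seed)
    state = list(state)
    temperature = start
    while temperature > stop:
        for i in (0, 1):
            other = state[1 - i]
            gap = WEIGHT * other + BIAS[i]  # energy when off minus energy when on
            state[i] = 1 if rng.random() < activation_probability(gap, temperature) else 0
        temperature *= cooling
    return state


print(energy([0, 0]), energy([1, 0]), energy([1, 1]))  # [0, 0] is a local minimum
print(anneal([0, 0]))
```

In this network the state with both units off is a local minimum: switching on either unit alone raises the energy, although switching on both lowers it. At high temperature the units flip almost at random, which is what gives the system a chance to leave that state before the temperature falls.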
There are several useful features of this process which I won't detail. However, I will mention that there is an equation such that the energy (and hence, error) of any given connection can be determined. Hence, energy minimization (and hence, error correction) can be accomplished at a local level, with no regard to the global properties of a system. This means that "higher level" knowledge is not required for error correction.
[12] Feldman, J.A. and D.H. Ballard, "Connectionist Models and Their Properties", Cognitive Science 6 (1982), pp. 205-254; cited in Alvin Goldman, "Epistemology and the New Connectionism", in N. Garver and P. Hare, Naturalism and Rationality (1986), p. 84.
[13] Parallel Distributed Processing I, p. 28.
[14] This is called content addressability and differs from traditional systems. See James Anderson and Geoffrey Hinton, "Models of Information Processing in the Brain", in Anderson and Hinton, editors, Parallel Models of Associative Memory, p. 11.
[15] Marvin Minsky and Seymour Papert, Perceptrons.
[16] E.g., Hinton, who proposed that learning could be accomplished without training if the system had a built-in coherence requirement. Geoffrey Hinton, "The Social Construction of Objective Reality in a Neural Network". Connectionism conference, Simon Fraser University, 1990.
[17] This equation is specific to "Boltzmann" machine versions; there are other equations that accomplish the same thing. These are described in Parallel Distributed Processing I, ch. 6 and 7.
Posted by Downes at 3:09 PM
TNP 3. Three Objections to Empiricism
The Network Phenomenon: Empiricism and the New Connectionism
Stephen Downes, 1990
(The whole document in MS-Word)
III. Three Objections to Empiricism
A. Objections to Associationism
There are three essential objections to empiricism. The first objection is that associative principles are not sufficiently powerful to explain human cognition. The second is that there is no means to distinguish input from other cognitive phenomena. And the third is that associative inferences can never be justified. I will discuss each in more detail and show what I need to prove in order to meet the objection.
The first objection always has the following form: "Human beings can know or do X, no associative system can know or do X, therefore, humans use something other than associative principles."
A paradigm example of such an objection mentions humans' use of abstractions. Consider the following example. [5] Suppose we had to determine whether a string of letters is a well formed formula (wff) in a language L. Language L is a "mirror image" language; only strings the two halves of which are mirror images are wffs in L. In order to distinguish wffs in L from non-wffs in L, it is necessary to employ an abstract term. Since it is impossible for an associationist principle to employ an abstract term, then the principle we employ to distinguish wffs and non-wffs cannot be an associationist principle.
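To make the example concrete, membership in a mirror-image language can be tested as follows. This is only a sketch of the language as described here; the function name `is_wff`, and the decisions to require an even number of letters and to exclude the empty string, are assumptions.

```python
def is_wff(string):
    """True if the second half of the string is the first half read backwards."""
    if len(string) == 0 or len(string) % 2 != 0:
        return False
    half = len(string) // 2
    return string[half:] == string[:half][::-1]


for candidate in ["abba", "aa", "abab", "aba"]:
    print(candidate, is_wff(candidate))
```

The test makes no mention of any particular letter or string; it is stated entirely in terms of "first half" and "reversed", which is arguably the kind of abstract term the objection has in mind.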
Another example may be found in Leibniz. [6] While we need experience to suggest to us universal general principles, we cannot derive these principles from experience. For experience consists entirely of particulars, and no set of particulars is ever sufficient for the derivation of a universal. Therefore, we must employ some means other than experience in order to derive universal general principles.
These examples could be multiplied, but they give the general idea. Since the argument is valid, then the only means of answering such objections is to deny either (a) that human beings can know or do X, or (b) that associationist principles are insufficient for X. In general, I take the following approach. If the claim is that we know X, then I deny (a). If the claim is that we do X, then I deny (b). Classical scepticism is all that I need to deny (a). The real challenge lies in the denial of (b).
B. Theory-Laden Perceptions
The second objection to empiricism reaches the conclusion that, since there is no means of distinguishing perception from other aspects of cognition, it follows that we cannot say that other aspects of cognition have their origin in perception.
The premise is well supported. For example, Quine argues that we cannot distinguish between the analytic (for example, formal rules of inference) and the synthetic (empirical content). [7] A similar point is made by Hanson. According to Hanson, what we see is affected by what we believe, that is, all our experiences are "theory-laden". [8]
Since the premise of this argument is well supported, the only means of responding to this argument is to show that the conclusion that there are no pure perceptions does not follow from the premise. There are two ways of stating the premise. I will consider each in turn.
The first way to state the premise is to say that perceptual terms are theory-laden. [9] The second is that perceptions themselves are theory-laden. The first formulation I embrace, since all aspects of language are theory-laden. Language depends to a large degree on rules and categories, and these, I believe, are theories. However, it does not follow from this premise that there are no pure perceptions, since perceptions are distinct from descriptions of them.
The second way of stating the premise is to the effect that perceptions themselves are theory-laden. I can agree that at some level, perceptions are theory-laden. This is a natural and expected consequence of the theory of learning which I will propose below. If it is true that perceptions are always theory-laden, then the conclusion, that there are no pure perceptions, follows. So, in order to show that this conclusion does not follow, I will need to show, first, that there is some level of perception that is not theory-laden, and second, that there was some point in time at which no perception was theory-laden.
This will not be an easy task. It is arguable, for example, that even if there are pure perceptions, they cannot be used unless combined with some theoretical structure or another. [10] It is also arguable that in order to perceive objects in three dimensions, some higher-level built-in constraints are required. [11] Therefore, in order to meet this objection, not only is it necessary to show that perceptions are pure at some level at some time, it is also necessary to show that these perceptions are sufficient for all other cognitive activity.
C. Justification
The whole idea behind justification is that of distinguishing between right and wrong (correct and incorrect, justified or unjustified) inferences, and the third objection to empiricism is that it cannot distinguish between right and wrong inferences.
The ground for this objection is that associationism does not distinguish between truth-preserving inferences and other sorts of inferences. For example, suppose we adopt a causal theory of cognition (following, say, Armstrong and Goldman). Some causal interactions, for example, the triggering of relays in a computer, are truth-preserving. Others, for example, a bat striking a ball, are not.
What is needed, the objection continues, is a formal representation of the sort of causal interactions which are truth-preserving, in order to distinguish them from those which are not. This representation must exist at a level over and above mere physical instantiation. An example of the sort of principles that we require is the set of rules of logical inference.
In response to this argument, I will argue, first, that the distinction between right and wrong inferences is sufficiently drawn by the concept of relevant similarity, and second, that associationist systems allow only inferences which preserve relevant similarity. Therefore, associationist systems, by the fact that they are associationist systems, provide sufficient justification for their conclusions.
An objection to this conclusion may state that there is a rather large difference between truth and similarity, and thus one cannot equate a mechanism which preserves similarity with a mechanism which preserves truth. Therefore, associationist systems do not provide sufficient justification for their conclusions.
The reasons why I believe that justification must be defined in terms of similarity are complex. They will be discussed below. At this point, however, I may say that, if it is indeed the case that justification can be defined in terms of similarity, then the third objection can no longer be sustained.
[5] T.G. Bever, J.A. Fodor, M. Garrett, "A Formal Limit of Associationism," from Verbal Behaviour and General Behaviour Theory, T.R. Dixon and D.L. Horton, editors. Prentice-Hall, 1968.
[6] New Essays Concerning Human Understanding.
[7] W.V.O. Quine, "Two Dogmas of Empiricism", From a Logical Point of View.
[8] N.R. Hanson, Patterns of Discovery.
[9] E.g., Paul Churchland, "Two Kinds of Evidential Bias". I have only a manuscript of this.
[10] Laurence BonJour, The Structure of Empirical Knowledge.
[11] Marr, Vision. See also Philip Johnson-Laird, The Computer and the Mind.
Posted by Downes at 2:25 PM
TNP 2. Empiricism
The Network Phenomenon: Empiricism and the New Connectionism
Stephen Downes, 1990
(The whole document in MS-Word)
II. Empiricism
A. What I Mean By Empiricism
When I speak of "empiricism" I wish to make it clear that I am not discussing logical positivism or other contemporary theories which have been described as empirical. Rather, what I mean has a much closer affinity to the philosophies of David Hume and John Stuart Mill. To employ Hume's terminology, what I wish to assert is that all ideas are copies of impressions. Modern terminology demands a more precise definition.
We may identify three distinct levels of human cognition. [1] The first, or lowest level is the "hardware" level, that is, the physical structure in which cognition occurs. The second level is the "software" level, that is, a set of rules or procedures which govern cognitive processes. Third, there is the "data" level, which contains the contents of cognition, for example, mental representations.
Empiricists assert that all content at the data level has its origin in experience. What this means is that all content must have been, at some time or another, input through one or another of the senses. The contemporary dispute between empiricists and other philosophers concerns the origin of the software level. Empiricists believe that these rules or processes are learned, while other philosophers believe that they are innate or otherwise directly intuited.
Insofar as we are talking about formal rules, for example, the rules of logical inference, grammar, or mathematics, then I am in agreement with the empiricists. However, I believe that these rules belong to the data level. I think that they describe and do not govern. The rules or principles which actually govern cognition are of a different type: they are associative, not formal, principles. In this way, I believe my thesis differs significantly from contemporary or positivist forms of empiricism.
Like Hume, I believe that human cognition is governed by human nature. Thus, I believe that the associative principles which govern human cognition are a part of, or instantiated in, human nature. Therefore, according to the sort of empiricism I am proposing, instead of there being three levels of cognition, there are only two levels: the hardware level, which contains the associative principles which govern cognition, and the data level, which contains the content of cognition including descriptive 'formal' principles.
Why am I the sort of empiricist I say I am? First, I do not believe that human cognition is governed by formal rules. Otherwise, we would never be able to break these rules, and there is substantial evidence that we can. So I believe we are not governed by formal rules, and further, I believe that these rules cannot be innate.
And second, even were we governed, innately or otherwise, by formal rules, I would argue that we should not be. For formal rules are abstractions, and while abstractions are useful, they are insufficient to respond to sufficiently complex phenomena. As Wittgenstein [2] points out, we can always find an exception to a rule, and we need to be able to respond effectively even in the case of an exception.
B. Association, Rules and Categories
Before the advent of logical positivism, empiricists such as Mill and Mach argued that general principles, such as formal rules or laws of nature, are summaries of previously experienced phenomena. [3] In my opinion, this view of rules is correct. Rules, as Wittgenstein argued, describe, and they do not prescribe. [4]
Generalized descriptions such as rules or laws of nature may be derived employing associative principles. Simply put, the idea is that, when we observe a sequence of similar events in which two things go together, we generalize and say those things generally go together.
There is in my mind a close link between rules and categorizations. When we place two things into a category, we are saying that those two things are similar in some way. We use categories in order to generalize. If most members of a given category are associated with something, say, some sort of behaviour, we tend to associate all members of the category with that behaviour.
The concept of "similarity" is central to empiricism, for all association and categorization depends on similarity. By "similarity", I wish to emphasize, I do not mean identity of a set of observational predicates. In my opinion, similarity is a pre-linguistic concept. Below, I will say that two things are similar if they have sufficiently overlapping vectors.
The entire principle of associationism may be defined by the following paradigm example: "dogs are similar to cats, dogs are associated with fur, therefore, cats are associated with fur." While this may seem to be a very weak principle, once it is recognized that the 'cat', 'dog' and 'fur' in the example can be anything, for example, '1101', '1100', and '0001' respectively, then we can see that this is a very powerful principle.
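The paradigm example can be worked through with the bit strings given. The sketch below is one possible reading, not a definition from this paper: "overlap" is taken to be the fraction of positions at which two vectors agree, and "sufficiently overlapping" is taken to mean at least 0.75. Both the measure and the threshold, like the function names, are assumptions.

```python
def similarity(a, b):
    """Fraction of positions at which two equal-length bit strings agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def associate(item, known, threshold=0.75):
    """Carry over the associations of every known item that is similar enough."""
    return [target for source, target in known.items()
            if similarity(item, source) >= threshold]


known = {"1100": "0001"}            # "dog" is associated with "fur"
print(similarity("1101", "1100"))   # "cat" compared with "dog"
print(associate("1101", known))     # so "cat" is associated with "fur"
```

Nothing in the two functions refers to cats, dogs or fur; the same inference goes through for any vectors that overlap to the required degree.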
[1] I use the term "levels" in much the same manner as Pylyshyn, Computation and Cognition.
[2] On Certainty. "A rule is shewn by its exception."
[3] John Stuart Mill, A System of Logic, and Ernst Mach, The Analysis of Sensations.
[4] Ludwig Wittgenstein, Philosophical Investigations.
Posted by Downes at 2:01 PM
TNP 1. Introduction
The Network Phenomenon: Empiricism and the New Connectionism
Stephen Downes, 1990
(The whole document in MS-Word)
1. Introduction
2. Empiricism
3. The Objections to Empiricism
4. Connectionism
5. Distributed Representations
6. The Problems of Perception
7. Associationism: Cognitive Structures
8. Associationism: Inferential Structures
9. Connectionism and Justification
10. Summary
11. Projects and Investigations
TNP: 20 Years On
I. Introduction
I wish to argue in this paper that the new connectionism provides a vindication for classical empiricism. By "empiricism" I mean the philosophy that all knowledge has its origin in experience, that is, that there is no innate or otherwise intuited knowledge. My argument is that connectionism provides a computational framework within which traditional objections to empiricism may be met.
The structure of this paper is as follows. I will begin with a description of what I mean by empiricism. Then I will set out a series of objections to the philosophy I describe. In order to meet those objections, I will first describe connectionism, then outline some objections particular to connectionism, and then finally respond to each of the objections mentioned. Finally, I will discuss some further avenues of investigation.
I would like to caution the reader that this is to a large degree a survey paper. While arguments are sketched in order to demonstrate the plausibility of the thesis being proposed, I do not claim to have completely solved all the problems and to have met all the objections. In addition, the reader will note that many of the arguments given are only sketches and do not consider a number of important yet intricate points. The reason for this is that length was, believe it or not, a limiting factor in this presentation.
Posted by Downes at 1:50 PM
Wednesday, March 04, 2009
School Choice
Responding to Joanne Jacobs:
Without school choice, Ty’Sheoma Bethea will stay in her second-rate school
What does she think would happen if 'school choice' should suddenly appear? That this one person - and no other - would go to the first-rate school? No, of course not - but then, would everyone go to the first-rate school? No, that wouldn't work either - there aren't enough spaces, and creating them would ruin the first-rate school.
The reasoning, of course, is that choice would create competition, which would magically make underpaid and underfunded schools suddenly become better. As good as the first-rate schools, even - because, otherwise, the logic simply doesn't work.
In fact, it doesn't work at all. The idea of school choice being the answer to someone stuck in a 'second-rate' school is a farce. You won't make all schools first-rate, and you won't get nearly all of the students into the first-rate school.
The only way school choice makes sense is in supporting the *type* of education that is more appropriate for people (this, though, doesn't fly with the standards crowd because it allows that people have different learning styles, different needs, different interests, and that these could be served by the school board).
The fact is, "school choice" - at least how it is being used here - is code for "private". And these days, the people supporting privatization bear the onus of proof. The privatization crowd has basically wrecked the economy and the parts of the school system they touch - like, say, Edison schools - often end up as a wreck as well.
There is such a thing as genuine choice. I wrote about it here: http://www.downes.ca/post/44259 But it has nothing to do with privatization, and everything to do with quality education. So it's probably not of interest here.
Posted by Downes at 5:42 PM
Monday, March 02, 2009
The Monkeysphere Ideology
Goodness knows, I don't want to cite Cracked as my source.
My brother liked Cracked. I was always a Mad reader. He also liked Pepsi and CFGO in Ottawa. Myself, I was always a devotee of Coke and listened to CFRA. The originals.
But sometimes you follow the leads wherever they take you. But let me digress first.
Unlike most people, I did not lose any money in the economic crash. At least, no money that I know of yet - I may have something in some pension account somewhere. But I don't have investments, retirement accounts, or any of that sort of thing.
"I won't get to retire," I always said when people asked me. "Whatever retirement money I could ever save, they would figure out some way to steal it." And so they have, and now with the Dow passing 7000 and continuing its downward plunge, it feels like I'm watching the fall of a civilization.
I watched The Day After on YouTube today. You can watch the entire length of the controversial 1983 made-for-TV movie. The story centers around the survivors of a nuclear attack. The few that made it endured chaos, disease and hardship. Their entire way of life disappeared in just a few minutes.
I have always wondered why people go on with their daily lives in cities on the brink of disaster. How the residents of Pompeii, for example, were cooking bread and weaving cloth right up to the time of the fateful eruption. How villages continued as normal up to the very minute of Genghis Khan's golden horde.
Now I know: what else can you do? The disasters cut a swath through society, everything changes, and then you try to make do with whatever you have left.
So what does this have to do with Cracked?
Well - it's this. Great societies, they endure. Their fabric withstands the blows of fate and fortune and there is enough in their people to carry on after. To carry on in an altered, reformed, fundamentally different state, perhaps, but to carry on.
But what gives you that capacity is not typically your technology or your wealth or your dominions - all of which are characteristically wiped out in a crash. No, it is your character, your capacity not simply to carry on, but to have a reason to carry on, to rebuild what you have lost.
Now let's look at the Cracked article, which suggests that each of us has a limit of about 150 people we can know and understand and relate to. The theory is based on Dunbar's number, and Cracked calls it - with more than a little alacrity - the 'monkeysphere'. The article, which was written in 2005, is making the rounds again.
In our complex society, writes Cracked, "Most of us do not have room in our Monkeysphere for our friendly neighborhood sanitation worker. So, we don't think of him as a person. We think of him as The Thing That Makes The Trash Go Away."
So far, this is fine. We have limits to our capacity. We are monkey brains. We all know that. But the writer takes it a step further. "We are hard-wired to have a drastic double standard for the people inside our Monkeysphere versus the 99.999% of the world's population who are on the outside."
My friend Bob Armstrong used to say, when we worked on the Gauntlet together, that the importance of a story to the media was inversely proportional to the distance from us and proportional to the amount of blood shed and the whiteness of their skin. An observation, not a thesis, and one that remains true today, at least in western media.
But being Cracked, the thesis is pressed one fatal step further: "The problem is that eventually, the needs of you or those within your Monkeysphere will require screwing someone outside it (even if that need is just venting some tension and anger via exaggerated insults). This is why most of us wouldn't dream of stealing money from the pocket of the old lady next door, but don't mind stealing cable, adding a shady exemption on our tax return, or quietly celebrating when they forget to charge us for something at the restaurant."
Except... that's not true.
Oh, wait a minute. It's true for some people, at least. "There is a reason why all of the really phat-ass nations with the biggest SUV's with the shiniest 22-inch rims all have some kind of representative democracy (where you vote for people to do the governing for you) and all of them are, to some degree, capitalist (where people actually get to buy property and keep some of what they earn)."
And this is what I have always known about what would be the fate of my putative retirement savings plan.
And what we are going through now is the logical consequence of thinking like monkeys. If we can't even get through a day without yelling at people on the road, stealing money from old ladies, or cheating on our taxes, cable bills or restaurant cheques, then any hope we have of building a modern technological society is probably doomed. They're too fragile. They require a high degree of intelligent behaviour on the parts of their citizens.
And we have spent the last few decades fostering, nay celebrating, the ethos of the monkeysphere. Believing that if each of us looked out solely and entirely for our own interests (and that it wasn't cheating unless you got caught and convicted). A nation of Conrad Blacks, looking at us smugly, derisively, snarling at our inability to understand the realities of our times.
And even as our society heads toward the precipice, life continues on as usual. Television channels continue to play Jerry Springer and Dr. Phil. The newspapers continue to publish stories about the needs of business and retirement savings. Our society continues to slide - and, one thinks, it will not cease to slide until people get the point.
The point is that the monkeysphere ethos that has been informing our society over the last three decades or so is fundamentally wrong.
Our failure lies not in the fact that we cannot know and understand more than 150 people. That's just a fact of physiology. Rather, our failure lies in how we characterize the remaining 99.99 percent of humanity: as though they were automatons.
This is the fundamental error of our times. It is the error that allows us to characterize entire societies as 'ragheads', the culture that allows us to say "it's not personal, it's business" as we evict someone from their home or cheat them out of their life's savings, the ethos that allows us in the western world to build a society based on consumption and ownership of more and more even as starvation and disease wrack the remainder of the world.
This is what allows us to treat politics and warfare as games, that allows people like Rush Limbaugh to say he hopes Obama's plan will fail, that allows us to treat education as though it were economics, able to be sceptical about reform but not really caring, because those kids aren't people, because success has nothing to do with lives, everything to do with test scores.
This fallacy persists. The failure to understand just what has gone wrong with our society continues in our government halls, where our own ministers are sacrificing humanities and the arts for business - “[s]cholarships granted by the Social Sciences and Humanities Research Council will be focused on business-related degrees.”
And I read this headline in the New York Times: "In tough times, the humanities must justify their worth." Not business, which caused this mess, not media, which propagated the monkeysphere ideology, not accounting, law or political science, which participated by stumbling over each other in their haste to see who could be corrupted most quickly. No - humanities. And arts.
Because the humanities continue to be portrayed as the pastime of the idle rich: "a traditional liberal arts education is, by definition, not intended to prepare students for a specific vocation. Rather, the critical thinking, civic and historical knowledge and ethical reasoning that the humanities develop have a different purpose: They are prerequisites for personal growth and participation in a free democracy, regardless of career choice."
Our failure is not a failure of business, which performed as intended (at least for those who made off with the wealth). It is a failure of the humanities, a failure of humanity, the study of which has been in notable decline throughout these last few decades, having, if you will, no measurable worth, no valuation, it being nothing more than a pastime and a recreation.
Ironic then that Obama's success in the United States is the very antithesis of that: “He does something academic humanists have not been doing well in recent years,” [Andrew Delbanco] said of a president who invokes Shakespeare and Faulkner, Lincoln and W. E. B. Du Bois. “He makes people feel there is some kind of a common enterprise, that history, with its tragedies and travesties, belongs to all of us, that we have something in common as Americans.”
The case has been made before. Our media, one of the early victims of the rise of corporatism, has transformed us from a society of thinkers and reflectors to a society of passive consumers of slapstick. A society where a Jerry Springer retort is what constitutes a reasoned argument, a society where lies and deception become standard fare in the media, a society in which cardboard caricatures substitute themselves in our awareness for reality.
I find myself asking this a lot, "How can you find that moral?" or "how can you find that ethically defensible?" Business practices that depend on preventing poor people from obtaining an education. World financial systems that require deceptive advertising and child labour in work camps in the third world. A network of luxuries and resources that are based in the systemic looting and deprivation of entire populations.
Andrea and I went out to see The Reader last night, a film that had enough conflict, sex and nudity to catch the attention of the box-office-sensitive critics and Oscar voters. "What would you have done?" asks the illiterate prison camp guard quite reasonably. "There were more people coming. We had no place to put them. What would you have done differently? Should I have not joined the SS?"
What is society, other than law? We are tempted to say that it must be more - that it must be morality, say - but even that is far too shallow a notion. Law and morality are not what make us obey even the little principles that create a society. Law and morality depend, even in themselves, on self-interest, on reward and punishment, on monkey teleology. We know that when law and morality are all that hold us together, things fall apart as soon as the source of order is removed.
Our society is founded, and made possible, through an act of mind: and that act is the capacity to empathize - to see, through reason, the consequences of our actions on others, and to feel the impact of those consequences in ourselves. We even have bits of monkey brain specifically designed for that purpose. Until we shut them off. Until we deliberately erase their impact, because they have no 'value'.
This is not simply a matter of schooling, not simply a matter of going to college. Perhaps there was a time when we could afford to have a society where education was available only to the elite. It isn't even a matter of preparing students "for professional success, responsible citizenship, and fulfilling lives." It's not a matter of preparing at all.
James Bloom writes, "When we start telling students, their families and the public who pay for our services: 'Trust us. Don’t ask questions. We know what we’re doing,' instead of encouraging them to ask, 'Why do you do what you do?' or 'What’s the point of studying literature and philosophy?,' we’ll deserve to go out of business." But it isn't a matter of being or not being in business.
The economy is just numbers. Education is just facts. Business is just commerce. None of these will offer our society any sort of hope in the current crisis, or the numerous crises that are coming. Yes, the economy is crashing, yes, millions of people are losing their homes and their jobs, yes, we may be only weeks and months away from riots in the streets and civil insurrection - none of this is at the core of our despair.
I finished Hemingway's For Whom The Bell Tolls last night. A story of civil insurrection, of the end of society and the rise of fascism, of casual murder, betrayal, and love. "'Then you will have to fight in your country as we fight here.' 'Yes, we will have to fight.' 'But are there not many fascists in your country?' 'There are many who do not know they are fascists but will find it out when the time comes.'"
In Hemingway's time, as in our own time, society falls, and fascism rises, when the humanity is erased from its citizens. The soldiers using those weapons are simple brutes; they lack 'all conception of dignity', as Fernando remarked. Anselmo insisted, "We must teach them. We must take away their planes, their automatic weapons, their tanks, their artillery and teach them dignity."
When we live our lives in the monkeysphere, we have no comprehension of any of this. We see glimpses only of the lives of the participants, and mostly, see that they do not see each other as people - as hurting, feeling beings. "Because thou art a miracle of deafness....It is not that thou art stupid. Thou art simply deaf. One who is deaf cannot hear music. Neither can he hear the radio. So he might say, never having heard them, that such things do not exist."
What we need, to survive this crisis and the next, is to get beyond the crass calculations of statistics and value, beyond the idea of "proving your worth", beyond seeing people as caricatures, as cardboard cutouts populating the backdrop of our lives, and to see them instead as beings worthy of consideration, nay, worthy of sacrifice.
This is more than "a common enterprise, that history, with its tragedies and travesties." This is, rather, a way of seeing the world, or as Wittgenstein would say, a way of being, a way of living. Our fundamental bedrock assumption must be, as Kant said, that we treat people as having inherent value in and of themselves. "Act in such a way that you treat humanity both in your own person and in the person of all others, never as a means only but always equally as an end."
I have, from time to time in the past, advocated that educators ought to adhere to something like the Hippocratic oath, a commitment to, above all, do no harm. This ought to be the end of statistical education, the end of the idea that students are mere caricatures, the end of the idea that educational innovation that satisfies the needs of the many, or the needs of society, or the needs of business or the rich, can be accomplished by the sacrifice of even one person.
And, were a similar standard adopted in our processes of politics and business, it would be the end of government by statistics. The end of the depiction of unemployment as a rate. The end of the accounting of poverty as a percentage. The end of the idea that "it's just business" when we sacrifice a life, and the beginning of the idea that, not only is it morally and legally wrong, it is also fundamentally opposed to our idea of ourselves as humanity. Inhumane.
So how do we get there?
Ideologically, we have to get beyond the mass. We have to get beyond the idea of seeing ourselves as being nothing more than the corporate entity to which we belong, whether that entity be a business, a religion, a discipline, a nationality. We are each of us members of all of these things, and more, and yet they form only the shallowest part of ourselves.
Yes, though it is empowering and aspirational to be a part of something that is 'greater than ourselves', it is key and fundamental to understand that, whatever this thing may be, it is a fiction, an artifice, that we have created in order to more efficiently express our thoughts, feelings and affiliations. The moment we subvert ourselves to the mass is the moment we can see all other humans as similarly subverted.
Conceptually, we need to begin to think and reason and act in terms of the concrete rather than the abstract. That does not entail the end of abstract reason - far from it - but rather it is to foster in ourselves a clear and precise understanding that the abstract is an artifice, an invention, that we use to facilitate thought and reasoning.
Probably the most evident example of the abstract that has become reality, and the form of artifice most often promulgated in our mass media, is that of simple causality, whether that of a war, a depression, a successful education, an election. We hear constantly the idea of some 'leader' or 'great person' (usually from MIT or Princeton or something) having 'done' something, whether it be as mundane as raising money through a football program or being 'the inventor of' something, as though there were no players or society or funding or staff or support or janitorial service that made it possible. Our mass media idolize the famous, and in so doing, relegate the rest of us to being bit players. Props.
This understanding, this way of seeing the world, is wrong, and demonstrably wrong. Place Alexander Graham Bell in the Middle Ages and - guaranteed - he does not invent the telephone. Place Rene Descartes in Tsarist Russia and the Meditations never sees the light of day. Under slightly different circumstances, Elisha Gray is the inventor of the telephone, or Blaise Pascal the inventor of Cartesian Geometry.
When we magnify the importance of the corporate entity above all else, we hurt society. And when we magnify the importance of the individual actor above all else, we hurt society. The monkeysphere ideology is based on both of those fallacies. Business, media, and the rest of them, are based on that fallacy.
Practically, we must immerse ourselves in our own humanity. We must talk to each other. We must communicate with each other. We must be open about our own lives, and curious about others. We must transcend the limit of the monkeysphere by constructing for ourselves concrete understandings of what it is to be human, to live, to have hopes and dreams, and to die. We must read each other's stories, listen to each other's music, and, above all, communicate.
We may not be able to know, in a personal sense, more than 150 people. But we can know of many, many more, and we can know, in a concrete sense, that each of these people lives a life of value, and cannot simply be thrown away or discounted, not through any sense of law or ethics, but because that's how it feels to be human.
Which returns us to the monkeysphere. Which returns us to the fact that most of us would not cheat on our neighbours, steal from the blind, swindle old ladies, and all the rest of it, not because it's against the law, not because it's immoral, but because of the way we feel when we do it. As Hume attempted to explain, our sense of humanity and decency is based on a sense of feeling inside us, a passion, and this passion is in turn born in us through a process of experience and education, the process of living in the world, interacting with others, and understanding them.
In our society today a great many people live without this sense of feeling for others. It's a sad thing, and the result of decades of deliberate desensitization. These are the people who, above all, will not be able to comprehend the economic collapse (or global warming, or resource scarcity, or the rest of it) and, with its onset, will be poorly placed to survive it.
These are the people who, through the decades of the monkeysphere, laughed at us from their SUVs, blamed poverty on the indigent, and championed the unique acumen and skill of CEOs who, by luck and a similar narrowly focused ethic, managed to steal success and create an empire. These are the people who will be least stable in the coming years, which is why reconciliation - as hard as it may be - will have to serve as a touchstone for our post-crash society.
And the other touchstone will be even more simple and more basic - the preservation and promotion of individual human worth and dignity, for each and every person in society - no exceptions. The understanding that our first response to the crisis will have to be to ensure that everyone remains housed and healthy, nourished and educated. The understanding that acquisition and hoarding are dysfunctional, that the chronically wealthy are, in a certain sense, disabled, and that the wealth of society is the birthright of each and every individual of which it forms a part.
Posted by Downes at 11:42 AM
