

Neural Networks 20 (2007) 1032–1039, 2007 Special Issue

Consciousness and metarepresentation: A computational sketch

Axel Cleeremans, Bert Timmermans, Antoine Pasquali

Cognitive Science Research Unit, Université Libre de Bruxelles CP 191, 50 ave. F.-D. Roosevelt, B1050 Bruxelles, Belgium

Abstract

When one is conscious of something, one is also conscious that one is conscious. Higher-Order Thought Theory [Rosenthal, D. (1997). A theory of consciousness. In N. Block, O. Flanagan, & G. Güzeldere (Eds.), The nature of consciousness: Philosophical debates. Cambridge, MA: MIT Press] takes it that it is in virtue of the fact that one is conscious of being conscious, that one is conscious. Here, we ask what the computational mechanisms may be that implement this intuition. Our starting point is Clark and Karmiloff-Smith’s [Clark, A., & Karmiloff-Smith, A. (1993). The cognizer’s innards: A psychological and philosophical perspective on the development of thought. Mind and Language, 8, 487–519] point that knowledge acquired by a connectionist network always remains “knowledge in the network rather than knowledge for the network”. That is, while connectionist networks may become exquisitely sensitive to regularities contained in their input–output environment, they never exhibit the ability to access and manipulate this knowledge as knowledge: The knowledge can only be expressed through performing the task upon which the network was trained; it remains forever embedded in the causal pathways that developed as a result of training. To address this issue, we present simulations in which two networks interact. The states of a first-order network trained to perform a simple categorization task become input to a second-order network trained either as an encoder or on another categorization task. Thus, the second-order network “observes” the states of the first-order network and has, in the first case, to reproduce these states on its output units, and in the second case, to use the states as cues in order to solve the secondary task. This implements a limited form of metarepresentation, to the extent that the second-order network’s internal representations become re-representations of the first-order network’s internal states. We conclude that this mechanism provides the beginnings of a computational mechanism to account for mental attitudes, that is, an understanding by a cognitive system of the manner in which its first-order knowledge is held (belief, hope, fear, etc.). Consciousness, in this light, thus involves knowledge of the geography of one’s own internal representations — a geography that is itself learned over time as a result of an agent’s attributing value to the various experiences it enjoys through interaction with itself, the world, and others.

© 2007 Elsevier Ltd. All rights reserved.

Keywords: Consciousness; Representation; Higher-order thought; Neural networks

As abundantly demonstrated not only by empirical evidence but also by the very fact that extremely powerful information-processing machines, namely, computers, have now become ubiquitous, information processing can undoubtedly take place without consciousness. Only a few would be willing to grant any quantum of conscious experience to contemporary computers, yet they are undeniably capable of sophisticated information processing — from recognizing faces to analyzing speech, from winning chess tournaments to helping prove theorems.
Likewise, it is hard to discern anything intrinsic to neural activity that mandates that such activity be associated with or produce conscious experience. Thus, consciousness is not information processing tout court; experience is an “extra ingredient” (Chalmers, 2007) that comes over and beyond mere computation. With this premise in mind (a premise that just restates Chalmers’ hard problem, that is, the question of why it is the case that information processing is accompanied by experience in humans and other higher animals), there are several ways in which one can think about the problem of consciousness. One is to simply state, as per Dennett (e.g., Dennett, 1991, 2001), that there is nothing more to explain. Experience is just (a specific kind of) information processing in the brain; the contents of experience are just whatever representations have come to dominate processing at some point in time (“fame in the brain”); consciousness is just a harmless illusion. From this perspective, it is easy to imagine that machines will be conscious when they have accrued sufficient complexity; the reason they are not conscious now is simply because they are not sophisticated enough: They lack the appropriate architecture perhaps, they lack sufficiently broad and diverse information processing abilities, and so on. Regardless of what is missing, the basic point here is that, contra Chalmers, there is no reason to assume that conscious experience is anything special. Instead, all that is required is one or several yet-to-be-identified functional mechanisms: Recurrence, perhaps (Lamme, 2003), stability of representation (O’Brien & Opie, 1999), global availability (Baars, 1988; Dehaene, Kerszberg, & Changeux, 1998), integration and differentiation of information (Tononi, 2003), or the involvement of higher-order representations (Rosenthal, 1997), to name just a few (see Atkinson, Thomas, and Cleeremans (2000) and Maia and Cleeremans (2005) for reviews).

Let us try to engage in some phenomenological analysis at this point in an attempt to capture what it means for each of us to have an experience. Imagine you see a patch of red (Humphrey, 2006). You now have a red experience — something that a camera recording the same patch of red will most definitely not have. What is the difference between you and the camera? Tononi (2007), from whom we borrow this simple thought experiment, points out that one key difference is that when you see the patch of red, the state you find yourself in is but one among billions, whereas for a simple light-sensitive device, it is perhaps one of only two possible states — thus the state conveys a lot more differentiated information for you than for a light-sensitive diode. A further difference is that you are able to integrate the information conveyed by many different inputs, whereas the chip of a camera can be thought of as a mere array of independent sensors among which there is no interaction. Both Chalmers’ (somewhat paradoxically) and Tononi’s analyses, however, describe conscious experience as a rather abstract dimension or aspect of information, whereas our intuition is that what it feels like is anything but abstract.
On the contrary, what we mean when we say that seeing a patch of red elicits an “experience” is that the seeing does something to us — in particular, we might feel one or several emotions, and we may associate the redness with memories of red. Perhaps seeing the patch of red makes you remember the color of the dress that your prom night date wore 20 years ago. Perhaps it evokes a vague anxiety, which we now know is also shared by monkeys (Humphrey, 1971). To a synaesthete, perhaps seeing the color red will evoke the number 5. The point is that if conscious experience is what it feels like to be in a certain state, then “what it feels like” can only mean the specific set of associations that have been established by experience between the stimulus or the situation you now find yourself in, on the one hand, and your memories, on the other. This is what one means by saying that there is something it is like to be you in this state rather than nobody or somebody else: The set of memories evoked by the stimulus (or by actions you perform, etc.), and, crucially, the set of emotional states associated with each of these memories. It is interesting to note that Indian philosophical traditions have placed similar emphasis on the role that emotion plays in shaping conscious experience (Banerjee, 2007).

Hence, a first point about what we mean by “experience” is that there is nothing it is like for the camera to see the patch of red simply because it does not care: The stimulus is meaningless; the camera lacks even the most basic machinery that would make it possible to ascribe any interpretation to the patch of red; it is instead just a mere recording device for which nothing matters. There is nothing it is like to be that camera at that point in time simply because (1) the experience of different colors does not do anything to the camera; that is, colors are not associated with different emotional valences; and (2) the camera has no brain with which to register and process its own states. It is easy to imagine how this could be different. To hint at our forthcoming argument, a camera could, for instance, keep a record of the colors it is exposed to, and come to “like” some colors better than others. Over time, your camera would like different colors than mine, and it would also know that in some non-trivial sense. Appropriating one’s mental contents for oneself is the beginning of individuation, and hence the beginning of a self.

A second point about experience that we perceive as crucially important is that it does not make any sense to speak of experience without an experiencer who experiences the experiences. Experience is, almost by definition (“what it feels like”), something that takes place not in any physical entity but rather only in special physical entities, namely cognitive agents. Chalmers’ thermostat (Chalmers, 1996) fails to be conscious because, despite the fact that it can find itself in different internal states, it lacks the ability to remove itself from the causal chain in which it is embedded. In other words, it lacks knowledge that it can find itself in different states. While there is indeed something to be experienced there (the different states the thermostat can find itself in), there is no one home to be the subject of these experiences — the thermostat simply lacks the appropriate machinery to do so. This point can be illustrated by means of well-known results in the connectionist, or artificial neural network, modelling literature.
Consider for instance Hinton’s (1986) famous demonstration that a simple back-propagation network can learn about abstract dimensions of the training set. Hinton’s network was a relatively simple back-propagation network trained to process linguistic expressions consisting of an agent, a relationship, and a patient, such as for instance “Maria is the wife of Roberto”. The stimulus material consisted of a series of such expressions, which together described some of the relationships that exist in the family trees of an Italian family and of an English family. The network was required to produce the patient of each agent–relationship pair it was given as input. For instance, the network should produce “Roberto” when presented with “Maria” and “wife”. Crucially, each person and each relationship were presented to the network by activating a single input unit. Hence there was no overlap whatsoever between the input representations of, say, Maria and Victoria. Yet, despite this complete absence of surface similarity between training exemplars, Hinton showed that, after training, the network could, under certain conditions, develop internal representations that capture relevant abstract dimensions of the domain, such as nationality, sex, or age! Hinton’s point was to demonstrate that such networks were capable of learning richly structured internal representations as a result of merely being required to process exemplars of the domain. Crucially, the structure of the internal representations learned by the network is determined by the manner in which different exemplars interact with each other, that is, by their functional similarity, rather than by their mere physical similarity expressed, for instance, in terms of how many features (input units) they share. Hinton thus provided a striking demonstration of this important and often misunderstood aspect of associative learning procedures by showing that under some circumstances, specific hidden units of the network had come to act as detectors for dimensions of the material that had never been presented explicitly to the network. These results truly flesh out the notion that rich, abstract knowledge can simply emerge as a by-product of processing structured domains. It is interesting to note that such single-unit “detectors” have recently been shown to exist in the human neocortex (Kreiman, Fried, & Koch, 2002): Single-neuron recordings in the hippocampus, for instance, have shown that some individual neurons respond exclusively to highly abstract entities, such as the words “Bill Clinton” and images of the American president.

Now, the point we want to make with this example is as follows: One could certainly describe the network as being aware of nationality, in the sense that it is sensitive to the concept: It exhibits differential responding (hence, behavioural sensitivity) to inputs that involve Italian agents vs. English agents. But, obviously, the network does not know anything about nationality. It does not even know that it has such and such representations of the inputs, nor does it know anything about its own, self-acquired sensitivity or awareness of the relevant dimensions. Instead, the rich, abstract, structured representations that the network has acquired over training forever remain embedded in a causal chain that begins with the input and ends with the network’s responses.
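To make this kind of setup concrete, the following toy sketch (Python/NumPy) trains a small feedforward network on localist (agent, relationship) → patient triples. It is not Hinton’s original architecture, which used dedicated encoding layers and the full family-tree corpus; the triples, network sizes, and variable names below are our own simplified, illustrative choices. The small MLP class defined here is reused in the simulation sketches further below.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class MLP:
    """Single-hidden-layer sigmoid network trained with plain backprop plus momentum."""

    def __init__(self, n_in, n_hid, n_out, lr=0.1, momentum=0.9):
        self.W1 = rng.uniform(-0.5, 0.5, (n_in, n_hid)); self.b1 = np.zeros(n_hid)
        self.W2 = rng.uniform(-0.5, 0.5, (n_hid, n_out)); self.b2 = np.zeros(n_out)
        self.lr, self.mom = lr, momentum
        self.v = [np.zeros_like(p) for p in (self.W1, self.b1, self.W2, self.b2)]

    def forward(self, x):
        self.x = x
        self.h = sigmoid(x @ self.W1 + self.b1)      # hidden activation pattern
        self.o = sigmoid(self.h @ self.W2 + self.b2)
        return self.o

    def backward(self, target):
        # Gradients for a squared-error loss with sigmoid units.
        d_o = (self.o - target) * self.o * (1 - self.o)
        d_h = (d_o @ self.W2.T) * self.h * (1 - self.h)
        grads = [np.outer(self.x, d_h), d_h, np.outer(self.h, d_o), d_o]
        for i, (p, g) in enumerate(zip((self.W1, self.b1, self.W2, self.b2), grads)):
            self.v[i] = self.mom * self.v[i] - self.lr * g
            p += self.v[i]                            # in-place weight update
        return np.sqrt(np.mean((self.o - target) ** 2))  # RMS error on this pattern

# Localist ("one unit per concept") codes, as in Hinton's task.
persons = ["Roberto", "Maria", "Gina", "Emilio",
           "Christopher", "Penelope", "Victoria", "James"]
relations = ["wife", "husband", "daughter", "son"]
# Invented (agent, relation, patient) triples in the spirit of the original corpus.
triples = [("Maria", "wife", "Roberto"), ("Roberto", "husband", "Maria"),
           ("Gina", "daughter", "Maria"), ("Emilio", "son", "Maria"),
           ("Penelope", "wife", "Christopher"), ("Christopher", "husband", "Penelope"),
           ("Victoria", "daughter", "Penelope"), ("James", "son", "Penelope")]

def one_hot(item, vocab):
    v = np.zeros(len(vocab)); v[vocab.index(item)] = 1.0
    return v

net = MLP(len(persons) + len(relations), 6, len(persons))
for epoch in range(3000):
    for agent, rel, patient in triples:
        x = np.concatenate([one_hot(agent, persons), one_hot(rel, relations)])
        net.forward(x)
        net.backward(one_hot(patient, persons))

# Inspect the hidden code elicited by each (agent, relation) pair: although the
# input codes of different persons share no units, one can ask whether functionally
# related persons (e.g. members of the same family) end up with similar hidden codes.
for agent, rel, _ in triples:
    x = np.concatenate([one_hot(agent, persons), one_hot(rel, relations)])
    net.forward(x)
    print(f"{agent:12s} {rel:10s}", np.round(net.h, 2))
```

The point of the sketch is only that the hidden codes, which the network never reports as such, are shaped by functional rather than surface similarity; the network itself has no access to them except through its trained input–output behaviour.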
As Clark and Karmiloff-Smith (1993) insightfully pointed out, such representations are “first-order” representations to the extent that they are representations in the system rather than representations for the system; that is, such representations are not accessible to the network as representations. In this context, what would it take for a network like Hinton’s to be able to access its own representations; and what difference would that make with respect to consciousness?

To answer the first question, the required machinery is the machinery of agenthood; in a nutshell, the ability to do something not just with external states of affairs, but rather with one’s own representations of such external states. This crucially requires that the agent be able to access, inspect, and otherwise manipulate its own representations, and this in turn, we surmise, requires mechanisms that make it possible for an agent to redescribe its own representations to itself. The outcome of this continuous “representational redescription” (Karmiloff-Smith, 1992) process is that the agent ends up knowing something about the geography of its own internal states: It has, in effect, learned about its own representations. Minimally, this could be achieved rather simply, for instance by having another network take both the input (i.e., the external stimulus as represented proximally) to the first-order network and its internal representations of that stimulus as inputs themselves and do something with them. One elementary thing the system consisting of the two interconnected networks (the first-order, observed network and the second-order, observing network) would now be able to do is to make decisions, for instance, about the extent to which an external input to the first-order network elicits a familiar pattern of activation over its hidden units or not. This would in turn enable the system to come up with judgments about the performance of the first-order network (Dienes, 2007; Persaud, McLeod, & Cowey, 2007). This is just what we propose below in a preliminary set of simulations.

To address the second question (what difference would representational redescription make in terms of consciousness), if you think this is starting to sound like a higher-order thought theory of consciousness (Rosenthal, 1997), you may be right: Higher-order representations (which we will call metarepresentations in what follows) play a crucial role in consciousness. An immediate objection to this idea is as follows: If there is nothing intrinsic to the existence of a representation in a cognitive system that makes this representation conscious, why should things be different for metarepresentations? After all, metarepresentations are representations also. Yes indeed, but with a crucial difference: Metarepresentations inform the agent about its own internal states, making it possible for it to develop an understanding of its own workings. And this, we argue, forms the basis for the contents of conscious experience, provided of course – which cannot be the case in any contemporary artificial system – that the system has learned about its representations by itself, over its development, and provided that it cares about what happens to it, that is, provided its behaviour is rooted in emotion-laden motivation (to survive, to mate, to find food, etc.).
1. The radical plasticity thesis

We would thus like to defend the following claim: Conscious experience occurs if and only if an information processing system has learned about its own representations of the world. To put this claim even more provocatively: Consciousness is the brain’s theory about itself, gained through experience interacting with the world, others, and, crucially, with itself. We call this claim the “Radical Plasticity Thesis”, for its core is the notion that learning is what makes us conscious. How so? The short answer, as hinted above, is that consciousness involves not only knowledge about the world, but, crucially, knowledge about our own internal states, or mental representations. In the following, we describe some preliminary simulation work aimed at capturing these intuitions about the possible role that metarepresentations may play in shaping consciousness.

2. Simulations: The digits problem

We illustrate two ways in which metarepresentations can be operationalized and what this might teach us about consciousness. Both simulations involve a first-order network that has to perform a simple task such as digit recognition, and a higher-order network that “observes” the internal states of the first-order network. This second network is thus wholly independent of the causal chain set up by the first-order network. In the first simulation the higher-order network is simply trained to act as an encoder of the first-order internal states. It learns to reproduce the state of the entire first-order network based on that network’s hidden unit patterns. In the second simulation the higher-order network is given the more complex task of evaluating the first-order network’s performance by wagering. In other words, it has to distinguish between “correct” and “wrong” internal states of the first-order network.

2.1. Higher-order encoding of first-order internal states

For the first simulations, we constructed a first-order feedforward backpropagation network consisting of 20 input units representing digit shapes, 5 hidden units, and 10 output units representing the 10 digits. Immediately following each presentation, the hidden unit activation pattern was copied onto the 5 input units of the higher-order feedforward network, connected to either 10 or 5 hidden units, in turn connected to 35 output units, corresponding to the total number of units (20 + 5 + 10) in the entire first-order network, as shown in Fig. 1. One epoch consisted of the presentation of all 10 digits. For each of the two architectures (higher-order network with 10 or 5 hidden units, identical first-order networks) we trained 5 networks over 1000 epochs with a learning rate of .1 and a momentum of .9, measuring the error proportion (defined, for a particular epoch of training, as the magnitude of the RMS output error for that epoch divided by the maximum output error, i.e. the output error prior to any training) separately across the output units of the first-order and the higher-order network.

Fig. 1. Architecture of the first network, in which the higher-order network serves as an encoder of the hidden unit patterns of the first-order network.
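The following sketch shows how this encoder setup could be implemented, reusing the MLP class defined in the earlier sketch. Since the exact 20-unit digit shape codes are not specified here, random binary patterns serve as stand-ins, and the observer is given 10 hidden units (the 5-hidden-unit variant differs only in that argument); all names and details below are illustrative assumptions rather than the original simulation code.

```python
import numpy as np

rng = np.random.default_rng(0)
first_order = MLP(20, 5, 10, lr=0.1, momentum=0.9)    # digit shapes -> 10 digit units
higher_order = MLP(5, 10, 35, lr=0.1, momentum=0.9)   # 5 hidden units -> full 35-unit state

digit_shapes = rng.integers(0, 2, (10, 20)).astype(float)  # stand-in 20-unit shape codes
digit_targets = np.eye(10)                                  # one output unit per digit

base1 = base2 = None
for epoch in range(1000):
    e1 = e2 = 0.0
    for x, t in zip(digit_shapes, digit_targets):            # one epoch = all 10 digits
        first_order.forward(x)
        hidden_copy = first_order.h.copy()                   # copied onto the observer's inputs
        full_state = np.concatenate([x, hidden_copy, first_order.o])   # 20 + 5 + 10 units
        e1 += first_order.backward(t)
        higher_order.forward(hidden_copy)
        e2 += higher_order.backward(full_state)              # reproduce the whole first-order state
    if base1 is None:
        base1, base2 = e1, e2                                # approximates error prior to training
    if epoch % 100 == 0:
        # "Error proportion": epoch RMS error divided by its initial magnitude.
        print(epoch, round(e1 / base1, 3), round(e2 / base2, 3))
```

The design point is simply that the observer sits outside the first-order causal chain: it receives a copy of the hidden activation pattern and is trained on a target defined over the first-order network’s own states, not over the external task.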
Fig. 2. Error proportion (see text for details) for the first-order network and for both higher-order networks (10 and 5 hidden units).

Results are shown in Fig. 2 and reveal comparable learning curves for both architectures. Initially, the first-order network learns at a faster rate than the higher-order network. However, after 50–100 epochs the higher-order network actually becomes better at predicting the entire state of the first-order network based on its hidden unit patterns than the first-order network is at predicting the correct digit from the input pattern. This difference decreases gradually, and for the higher-order network with 5 hidden units we can see that eventually the first-order network again outperforms the higher-order network. This suggests that as soon as some activation stability is achieved in the first-order network’s hidden units, these patterns, even though they do not yet permit the first-order network itself to optimize its performance beyond an error proportion of .40, become available to a higher-order network that is able to extract from these hidden units information about the overall state of the first-order network — information that is in itself not available to that first-order network. In terms of awareness, this would mean that at some point during the early stages of learning, some aspects of the learned knowledge become available as targets of higher-order representations. In other words, whereas initially unstable first-order representations make it impossible for the higher-order network to consistently learn about them, this changes with training in such a manner that once first-order representations have become sufficiently stable, the higher-order network can then use the structure that they contain so as to improve its own ability to reconstruct the input and the output of the first-order network successfully.

In the next simulation study, we will explore how a higher-order network can make use of this capacity to re-represent first-order internal states so as to perform a different task, namely, evaluating the performance of the first-order network.

2.2. Higher-order classification of first-order internal states: A wagering network

Recently, Persaud et al. (2007) introduced wagering as a measure of awareness, where participants are required to place a high or a low wager on their decision, relative to stimulus identification for example. The intuition behind this measure is that people will place a high wager when they have conscious knowledge about the reasons for their decisions, and a low wager when they are uncertain of their decisions. In this, wagering is thus similar to other subjective measures of awareness (Dienes, 2004; Gaillard, Vandenberghe, Destrebecqz, & Cleeremans, 2006). According to Persaud et al., wagering provides an incentive for participants not to withhold any conscious information, as well as not to guess, making it a more objective measure of awareness than confidence judgments. Despite recent criticism of Persaud et al.’s claims (Seth, 2007), wagering certainly reflects the extent to which an agent is sensitive to its own internal states. This may perhaps be captured by training a higher-order network to use first-order information so as to evaluate the performance of the latter. We therefore aimed at creating a wagering network.
For this simulation, the first-order feedforward backpropagation network consisted of 7 input units representing digit shapes (as on a digital watch), 100 hidden units, and 10 output units for the 10 digits. The 100 first-order hidden units connected to a different pool of 100 hidden units of a higher-order feedforward network, with 2 output units representing a high and a low wager, as shown in Fig. 3. A learning rate of .15 and a momentum of .5 were used during training of the first-order network. However, in a first condition of high awareness, the second network was trained with a learning rate of .1, and in a second condition of low awareness, a learning rate of 10⁻⁷ was applied. The task of the higher-order network consisted of wagering high if it “thought” that the first-order network was providing a correct answer (correct identification of the digit), and of wagering low in case the first network gave a wrong answer (misidentification of the digit).

Fig. 3. Architecture of the second network, in which the higher-order network classifies the hidden unit patterns of the first-order network.

Fig. 4. Error proportion (see text for details) for the first-order network and for both higher-order networks (“high and low consciousness”, reflected by using .1 and 10⁻⁷ learning rates).

Fig. 4 displays the average error curves of 10 networks throughout 200 epochs of training. Despite the gradual learning exhibited by the first-order network, error in wagering increases during the first 40 epochs in both conditions of high and low awareness. Only from the 40th epoch onwards does the higher-order network start to improve the quality of its wagering. In order to understand the reason for this initial increase, we need to evaluate the networks’ performance through an analysis of the recognition rate for the first-order network, and of the wagering strategy for the higher-order network. The first-order network’s performance is represented by the percentage of correct identifications (the chance level is at .1 since 10 digits are available). The wagering strategy is considered good if the network wagered high in case of correct identification and low in case of misidentification. Conversely, the strategy is considered to be poor if a high wager accompanies an incorrect classification, or when a correct identification is associated with only a low wager. As the strategy has the same probability of being good or bad, the chance level is at 50%.

Fig. 5. Performance of the first-order network in terms of correct identifications, and of the higher-order networks in terms of advantageous wagers (high wagers when the first-order network is correct and low wagers when it is wrong).

The results of this analysis are shown in Fig. 5. As can be seen, the previously identified error extremum at the 40th epoch corresponds in fact to chance-level wagering performance. Further analysis revealed that the higher-order networks mainly used a low-wagering strategy during the first epochs, during which the first-order network is misclassifying most of the digits, whereas they develop a high-wagering strategy at a later stage in learning, when first-order identification becomes progressively more accurate. Thus the error extremum observed in Fig. 5 characterizes the higher-order network’s “most doubtful moment”, when identification is correct only 50% of the time and no strategy can be applied. One could view this as the moment at which the higher-order network abandons a simple “safe” strategy of low wagers and explores the space of first-order hidden unit representations, looking for a criterion or a categorization that will allow it to separate correct identifications from wrong identifications.
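As a companion to the encoder sketch above, the following snippet illustrates how such a wagering setup could be put together, again reusing the MLP class defined earlier. The seven-segment digit codes, the momentum of the wagering networks, and the coding of high and low wagers on the two output units are our own assumptions, so this is a sketch of the idea rather than a reproduction of the reported simulations.

```python
import numpy as np

SEGMENTS = np.array([            # digits 0-9 as seven-segment patterns (segments a..g)
    [1,1,1,1,1,1,0], [0,1,1,0,0,0,0], [1,1,0,1,1,0,1], [1,1,1,1,0,0,1],
    [0,1,1,0,0,1,1], [1,0,1,1,0,1,1], [1,0,1,1,1,1,1], [1,1,1,0,0,0,0],
    [1,1,1,1,1,1,1], [1,1,1,1,0,1,1]], dtype=float)
digit_targets = np.eye(10)
HIGH, LOW = np.array([1.0, 0.0]), np.array([0.0, 1.0])   # wager coded on two output units

first_order = MLP(7, 100, 10, lr=0.15, momentum=0.5)
# One wagering ("observer") network per awareness condition; only the learning rate differs.
observers = {"high awareness": MLP(100, 100, 2, lr=0.1, momentum=0.5),
             "low awareness": MLP(100, 100, 2, lr=1e-7, momentum=0.5)}

for epoch in range(200):
    advantageous = {name: 0 for name in observers}
    for x, t in zip(SEGMENTS, digit_targets):
        out = first_order.forward(x)
        hidden_copy = first_order.h.copy()            # observer sees the first-order hidden state
        correct = out.argmax() == t.argmax()          # did the first-order net identify the digit?
        first_order.backward(t)
        for name, net in observers.items():
            wager = net.forward(hidden_copy)
            net.backward(HIGH if correct else LOW)    # target: wager high iff the answer was correct
            placed_high = wager.argmax() == 0
            advantageous[name] += int(placed_high == correct)
    if epoch % 20 == 0:
        print(epoch, {name: f"{n}/10 advantageous wagers" for name, n in advantageous.items()})
```

Under this scheme the observer never sees the stimulus or the response directly; it has to learn to separate “correct” from “wrong” first-order hidden states, which is exactly the kind of sensitivity to one’s own internal states that wagering is meant to index.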
3. Metarepresentation

The simulations sketched above illustrate how a network can be trained to observe the internal states of another network in such a manner that it can use this information to perform tasks that require knowledge of the structure of these internal states — either to reconstruct the corresponding inputs and outputs, or to actually evaluate the extent to which these internal representations will result in successful performance. In both cases, it is interesting to note that while the higher-order, observing network initially performs poorly, it quickly learns enough about the structure of the first-order internal representations to become more accurate in performing its own task. This captures the processes involved in the development of expertise, whereby learning might initially take place in an essentially implicit manner, and be subsequently followed by a period where explicit knowledge becomes available (Bechara, Damasio, Tranel, & Damasio, 1997; Bierman, Destrebecqz, & Cleeremans, 2005; Cleeremans, 2005, 2006; Cleeremans, Destrebecqz, & Boyer, 1998). Automaticity (Shiffrin & Schneider, 1977) would correspond to a third period in skill learning where the acquired metarepresentations become optional or otherwise detached from first-order representations.

What are the conditions under which metarepresentations emerge? Strong, stable, and distinctive representations as they occur in trained neural networks are explicit representations, at least in the sense put forward by Koch (2004): They indicate what they stand for in such a manner that their reference can be retrieved directly through processes involving low computational complexity (see also Kirsh (1991, 2003)). Conscious representations, in this sense, are explicit representations that have come to play, through processes of learning, adaptation, and evolution, the functional role of denoting a particular content for a cognitive system. Once a representation has accrued sufficient strength, stability, and distinctiveness, it may be the target of metarepresentations: The system may then “realize” – provided it is equipped with the mechanisms necessary to support self-inspection, which here take the form of an “observer” network – that it has learned a novel partition of the input; that it now possesses a new “detector” that only fires when a particular kind of stimulus, or a particular condition, is present. Humphrey (2006) emphasizes the same point when he states that “This self-monitoring by the subject of his own response is the prototype of the ‘feeling sensation’ as we humans know it” (p. 90). Importantly, our claim here is that such metarepresentations are learned in just the same way as first-order representations, that is, in virtue of continuously operating learning mechanisms. Because metarepresentations are also representations, the same principles that make first-order representations explicit therefore apply. An important implication of this observation is that activation of metarepresentations can become automatic, just as it is the case for first-order representations.

What might be the function of such metarepresentations? One intriguing possibility is that their function is to indicate the mental attitude through which a first-order representation is held: Is this something I know, hope, fear or regret?
Possessing such metaknowledge about one’s knowledge has obvious adaptive advantages, not only with respect to the agent himself, but also because of the important role that communicating such mental attitudes to others plays in both competitive and cooperative social environments. In the simulations we have described, metarepresentations as they occur in the second-order network take the more limited role of indicating relationships between internal representations and the input–output representations. However, there is another important function that metarepresentations may play: They can also be used to anticipate the future occurrences of first-order representations. Thus, for instance, if my brain learns that SMA (Supplementary Motor Area) is systematically active before M1 (Primary Motor Cortex), then it can use SMA representations to explicitly represent their consequences downstream, that is, M1 activation, and ultimately, action. If neurons in SMA systematically become active before an action is carried out, a metarepresentation can link the two and represent this fact explicitly in a manner that will be experienced as intention. That is: When neurons in the SMA become active, I experience the feeling of intention because my brain has learned, unconsciously, that such activity in SMA precedes action. It is this knowledge that gives qualitative character to experience, for, as a result of learning, each stimulus that I see, hear, feel, or smell is now not only represented, but also re-represented through metarepresentations that enrich and augment the original representation(s) with knowledge about (1) how similar the stimulus’ representation is to those associated with other stimuli, (2) how similar the stimulus’ representation is now with respect to what it was before, (3) how consistent a stimulus’ representation is with what it typically is, and (4) what other regions of my brain are active at the same time as the stimulus’ representation, etc. This perspective is akin to the sensorimotor perspective (O’Regan & Noë, 2001) in the sense that awareness is linked with knowledge of the consequences of our actions, but, crucially, the argument is extended to the entire domain of neural representations.

4. Conclusion

Thus we end with the following idea, which is the heart of the “Radical Plasticity Thesis”: The brain continuously and unconsciously learns not only about the external world, but also about its own representations of it. The result of this unconscious learning is conscious experience, in virtue of the fact that each representational state is now accompanied by (unconsciously learnt) metarepresentations that convey the mental attitude with which these first-order representations are held. Thus, from this perspective, there is nothing intrinsic to neural activity, or to information per se, that makes it conscious. Conscious experience involves specific mechanisms through which particular (i.e., stable, strong, and distinctive) unconscious neural states become the target of further processing, which we surmise involves some form of representational redescription in the sense described by Karmiloff-Smith (1992). These ideas are congruent not only with higher-order theories in general (Dienes, 2007; Dienes & Perner, 1999; Rosenthal, 1997), but also with those of Lau (2007), who characterizes consciousness as “signal detection on the mind”.
The simulation work we have sketched here is a first step in implementing these ideas in the form of actual computational principles.

Acknowledgements

A.C. is a Research Director with the National Fund for Scientific Research (F.R.S.-FNRS, Belgium). B.T. is a postdoctoral fellow at the ULB — Université Libre de Bruxelles. A.P. is a Research Fellow of the Fonds National de la Recherche Scientifique (F.R.S.-FNRS, Belgium). This work was supported by an institutional grant from the Université Libre de Bruxelles to A.C., by Concerted Research Action 06/11342 titled “Culturally modified organisms: What it means to be human in the age of culture”, financed by the Ministère de la Communauté Française – Direction Générale de l’Enseignement non obligatoire et de la Recherche scientifique (Belgium), by European Commission Grant #043457 “Mindbridge – Measuring Consciousness”, and by F.R.F.C./E.S.F. Grant #2.4577.06 “Mechanisms of serial action”. Portions of this article were adapted from the following publication: Cleeremans, A. (in press), The radical plasticity thesis, in R. Banerjee & B. Chakrabarti (Eds.), Models of Brain and Mind: Physical, Computational and Psychological Approaches; Progress in Brain Research. Amsterdam: Elsevier.

References

Atkinson, A. P., Thomas, M. S. C., & Cleeremans, A. (2000). Consciousness: Mapping the theoretical landscape. Trends in Cognitive Sciences, 4(10), 372–382.
Baars, B. J. (1988). A cognitive theory of consciousness. Cambridge: Cambridge University Press.
Banerjee, R. (2007). Buddha and the bridging relations. In R. Banerjee & B. Chakrabarti (Eds.), Progress in brain research: Models of brain and mind: Physical, computational and psychological approaches. Amsterdam: Elsevier.
Bechara, A., Damasio, H., Tranel, D., & Damasio, A. (1997). Deciding advantageously before knowing the advantageous strategy. Science, 275(5304), 1293–1295.
Bierman, D., Destrebecqz, A., & Cleeremans, A. (2005). Intuitive decision making in complex situations: Somatic markers in an artificial grammar learning task. Cognitive, Affective & Behavioral Neuroscience, 5(3), 297–305.
Chalmers, D. J. (1996). The conscious mind: In search of a fundamental theory. Oxford University Press.
Chalmers, D. J. (2007). The hard problem of consciousness. In M. Velmans & S. Schneider (Eds.), The Blackwell companion to consciousness (pp. 225–235). Oxford, UK: Blackwell Publishing.
Clark, A., & Karmiloff-Smith, A. (1993). The cognizer’s innards: A psychological and philosophical perspective on the development of thought. Mind and Language, 8, 487–519.
Cleeremans, A. (2005). Computational correlates of consciousness. In S. Laureys (Ed.), Progress in brain research: Vol. 150 (pp. 81–98). Amsterdam: Elsevier.
Cleeremans, A. (2006). Conscious and unconscious cognition: A graded, dynamic perspective. In Q. Jing, M. R. Rosenzweig, G. d’Ydewalle, H. Zhang, H.-C. Chen, & C. Zhang (Eds.), Progress in psychological science around the world: Vol. 1. Neural, cognitive, and developmental issues (pp. 401–418). Hove, UK: Psychology Press.
Cleeremans, A., Destrebecqz, A., & Boyer, M. (1998). Implicit learning: News from the front. Trends in Cognitive Sciences, 2, 406–416.
Dehaene, S., Kerszberg, M., & Changeux, J.-P. (1998). A neuronal model of a global workspace in effortful cognitive tasks. Proceedings of the National Academy of Sciences of the United States of America, 95(24), 14529–14534.
Dennett, D. C. (1991). Consciousness explained. Boston, MA: Little, Brown & Co.
Dennett, D. C. (2001). Are we explaining consciousness yet? Cognition, 79, 221–237.
Dienes, Z. (2004). Assumptions of subjective measures of unconscious mental states: Higher order thoughts and bias. Journal of Consciousness Studies, 11(9), 25–45.
Dienes, Z. (2007). Subjective measures of unconscious knowledge. In R. Banerjee & B. Chakrabarti (Eds.), Progress in brain research: Models of brain and mind: Physical, computational and psychological approaches. Amsterdam: Elsevier.
Dienes, Z., & Perner, J. (1999). A theory of implicit and explicit knowledge. Behavioral and Brain Sciences, 22, 735–808.
Gaillard, V., Vandenberghe, M., Destrebecqz, A., & Cleeremans, A. (2006). Third- and first-person approaches in implicit learning research. Consciousness and Cognition, 15, 709–722.
Hinton, G. E. (1986). Learning distributed representations of concepts. Paper presented at the 8th annual conference of the Cognitive Science Society.
Humphrey, N. (1971). Colour and brightness preferences in monkeys. Nature, 229, 615–617.
Humphrey, N. (2006). Seeing red. Cambridge, MA: Harvard University Press.
Karmiloff-Smith, A. (1992). Beyond modularity: A developmental perspective on cognitive science. Cambridge: MIT Press.
Kirsh, D. (1991). When is information explicitly represented? In P. P. Hanson (Ed.), Information, language, and cognition. New York, NY: Oxford University Press.
Kirsh, D. (2003). Implicit and explicit representation. In L. Nadel (Ed.), Encyclopedia of cognitive science: Vol. 2 (pp. 478–481). London, UK: Macmillan.
Koch, C. (2004). The quest for consciousness: A neurobiological approach. Englewood, CO: Roberts & Company Publishers.
Kreiman, G., Fried, I., & Koch, C. (2002). Single-neuron correlates of subjective vision in the human medial temporal lobe. Proceedings of the National Academy of Sciences of the United States of America, 99, 8378–8383.
Lamme, V. A. F. (2003). Why visual attention and awareness are different. Trends in Cognitive Sciences, 7(1), 12–18.
Lau, H. (2007). A higher-order Bayesian decision theory of consciousness. In R. Banerjee & B. Chakrabarti (Eds.), Progress in brain research: Models of brain and mind: Physical, computational and psychological approaches. Amsterdam: Elsevier.
Maia, T. V., & Cleeremans, A. (2005). Consciousness: Converging insights from connectionist modeling and neuroscience. Trends in Cognitive Sciences, 9(8), 397–404.
O’Brien, G., & Opie, J. (1999). A connectionist theory of phenomenal experience. Behavioral and Brain Sciences, 22, 175–196.
O’Regan, J. K., & Noë, A. (2001). What it is like to see: A sensorimotor theory of visual experience. Synthèse, 129(1), 79–103.
Persaud, N., McLeod, P., & Cowey, A. (2007). Post-decision wagering objectively measures awareness. Nature Neuroscience, 10, 257–261.
Rosenthal, D. (1997). A theory of consciousness. In N. Block, O. Flanagan, & G. Güzeldere (Eds.), The nature of consciousness: Philosophical debates. Cambridge, MA: MIT Press.
Seth, A. K. (2007). Post-decision wagering measures metacognitive content, not sensory consciousness. Consciousness and Cognition. doi:10.1016/j.concog.2007.05.008.
Shiffrin, R. M., & Schneider, W. (1977). Controlled and automatic human information processing: II. Perceptual learning, automatic attending, and a general theory. Psychological Review, 84, 127–190.
Tononi, G. (2003). Consciousness differentiated and integrated. In A. Cleeremans (Ed.), The unity of consciousness: Binding, integration, and dissociation (pp. 253–265). Oxford, UK: Oxford University Press.
Tononi, G. (2007). The information integration theory. In M. Velmans & S. Schneider (Eds.), The Blackwell companion to consciousness (pp. 287–299). Oxford, UK: Blackwell Publishing.