The Forced-Choice Paradigm and The Perception of Facial Expressions of Emotion
1, 75-85
Copyright 2001 by the American Psychological Association, Inc. 0022-3514/01/$5.00 DOI: 10.1037//0022-3514.80.1.75
Janine Stennett
University of New South Wales
The view that certain facial expressions of emotion are universally agreed on has been challenged by studies showing that the forced-choice paradigm may have artificially forced agreement. This article addressed this methodological criticism by offering participants the opportunity to select a none of these terms are correct option from a list of emotion labels in a modified forced-choice paradigm. The results show that agreement on the emotion label for particular facial expressions is still greater than chance, that artifactual agreement on incorrect emotion labels is obviated, that participants select the none option when asked to judge a novel expression, and that adding 4 more emotion labels does not change the pattern of agreement reported in universality studies. Although the original forced-choice format may have been prone to artifactual agreement, the modified forced-choice format appears to remedy that problem.
One of the most thought-provoking findings in all of social science is that a limited number of facial expressions of emotion are universally recognized (e.g., Ekman, 1992, 1994; Ekman et al., 1987; Izard, 1971, 1992, 1994; Matsumoto, 1990). Across numerous experiments and participant and cultural groups, the advocates of this universal view (who are known as universalists) have pointed to evidence that shows that people agree, at rates greater than chance, as to which facial expressions best represent the emotions of anger, disgust, fear, happiness, sadness and distress, and surprise (although others have suggested contempt, interest, shame, and embarrassment as well; e.g., Ekman, 1992; Izard, 1977; Keltner, 1995). The bulk of the evidence for universality is based on experiments in which a display of still photos of models posing these facial expressions was shown to various observer groups. These groups then selected from a list of six to nine emotion labels the label that best represented the emotions depicted in the photographs. This forced-choice response paradigm, as it is called, is a simple, clear, and methodologically strong technique that provides robust results (e.g., Ekman, 1972; Ekman et al., 1987). The findings derived from the forced-choice paradigm led researchers to propose that the reason these aforementioned facial
Mark G. Frank, School of Communication, Rutgers, The State University of New Jersey; Janine Stennett, School of Psychology, University of New South Wales, Sydney, New South Wales, Australia. Janine Stennett is now at the New South Wales Department of Health, Sydney, New South Wales, Australia. Portions of this research were supported by grants from the Australian Research Council and the Rutgers Research Council. We thank Antonia Catanzaro and Kimberly Bongiovi for their help with the last two experiments. Correspondence concerning this article should be addressed to Mark G. Frank, Rutgers, The State University of New Jersey, School of Communication, Information, and Library Studies, 4 Huntington Street, New Brunswick, New Jersey 08903-1071. Electronic mail may be sent to mgfrank@scils.rutgers.edu.
expressions are universally recognized is that they are the external signals of universally experienced human emotions. These emotions, researchers have argued, are physiological reactions critical to survival and thus have been selected through evolutionary processes (e.g., Darwin, 1872/1998; Ekman, 1972; Izard, 1971; Plutchik, 1962; Tomkins, 1962, 1963). Thus, the evidence for the universality of some facial expressions of emotion, and the ideas they inspired and the data they generated, has caused social scientists to reevaluate the strict social learning perspective of human nature that prevailed up through the 1970s (e.g., Buss, 1992). The universal, biological nature of human emotion and its expression is not without controversy. First, it is in stark contrast to the cultural relativity position, which states that all emotions and their expression are socially derived and defined and are dependent on context for meaning (e.g., Lutz & White, 1986). As evidence, these researchers (henceforth, relativists) have pointed to the fact that the smile is not the universal expression of happiness, in that it can be used to express happiness, disgust, or surprise in the United States (Landis, 1924), sadness in Japan (Klineberg, 1940), or uncertainty in Africa (LeBarre, 1947). Second, others have similarly argued that all facial expressions are simply communicative gestures; that is, they are not the result of internal emotional states but only the result of the social motives of a person within a particular context (the behavioral ecology view; Fridlund, 1994). As evidence, research found that the amount of smiling was not related to self-reports of felt happiness but to the actual or implied presence of others (Fridlund, 1991). 
Third, other researchers have argued that the forced-choice paradigm used to generate the bulk of the evidence for the universality of facial expressions of emotion contained a number of methodological weaknesses that, when compounded, might have artifactually generated the high agreement rates reported by the universalists (Russell, 1994; but see replies by Ekman, 1994, and Izard, 1994). Although the universalists have generated evidence and arguments against the first two relativist challenges to universality
(e.g., Ekman, 1993; Ekman, Davidson, & Friesen, 1990; Frank & Ekman, 1993; Frank, Ekman, & Friesen, 1993; Izard, 1992), the third challenge (the flawed methodology used to discover universality) has not been addressed systematically. Specifically, the methodological critique argued that there are six potentially problematic aspects of the forced-choice paradigm (Russell, 1994). The first problem is that participants were allowed to preview the facial expressions, thus exposing the full range of expressions, which in turn may allow for sharper distinctions between expressions than would occur in day-to-day life. The second problem is that these studies used a within-subject design, which again might allow for participants to make sharper distinctions than they would in the real world, where people typically show expressions one at a time. The third problem is that these within-subject presentations did not systematically manipulate the order of presentation of the expressions. The fourth problem is that many of these facial-expression photographs were preselected by the researchers on the basis of normative judgments by a panel of observers. Thus, it is not surprising that agreement rates might be artificially higher. The fifth problem is that these expressions were posed and, thus, may have been better exemplars of these expressions than are those observed in the real world. The sixth problem is that the universality findings were most often based on a response form that forced judges to choose a single emotion from a limited number of emotion terms. Although the first five problems might each add a small push toward artificially increasing the agreement rates, it is the sixth problem (forcing judges to choose a single emotion) that is most serious.
First, offering only specific emotion terms suggests that these emotion options are mutually exclusive, which in turn implies that facial expressions of emotion are perceived categorically, when possibly a more continuous view of emotion is more appropriate (e.g., Russell & Fehr, 1987). Second, presenting only six emotion terms might be too limiting, because judges might have a vastly larger emotion vocabulary that would better convey their impressions of these expressions than do the six labels offered by researchers. Third, there is evidence that this forced-choice technique can artifactually generate agreement. Research shows that when the "correct"¹ emotion term is removed from the list of emotion terms, participants will agree on a different term also at rates greater than chance. For example, observers presented with a standard forced-choice format will agree at rates greater than chance that a particular facial expression represents anger. However, when the label anger is not available for selection, observers will agree that this same facial expression might represent contempt, frustration, or disgust, again at rates greater than chance (Russell, 1993). The relativists have argued that if observers show significant agreement on what might be considered an incorrect emotion label, then maybe their agreement on the correct emotion label, suggested by years of work on the universality of facial expressions of emotion, may be incorrect as well. Thus, it is possible that the "fact" of universal agreement on certain facial expressions of emotion, one of the most fundamental, central, and fascinating in all of the social sciences, is not fact but artifact. The strongest universalist response to these methodological problems has been to point to studies in which observers are provided the opportunity to describe freely the emotion depicted in a target facial expression. This research has shown a tendency for observers to respond using emotion labels that are synonymous
with those selected in the forced-choice paradigm (e.g., Boucher & Carlson, 1980; Haidt & Keltner, 1999; Izard, 1971; Rosenberg & Ekman, 1995), thus confirming the results of the forced-choice experiments. As compelling as that might seem, the free-response evidence cited in support of the universalist position does not entirely diminish the relativist argument. First, only a handful of studies have used the free-response paradigm, and these studies have typically reported considerably lower levels of agreement than have studies using the forced-choice paradigm. It is more important to note that the evidence that there are flaws within the forced-choice paradigm remains unchallenged. Whether the forced-choice paradigm is artifactually responsible for the evidence cited in support of the universalists' position is an empirical question. One way to test this proposal is to create a forced-choice response scale that negates the forced (and most problematic) aspect of the scale. This might be as simple as adding an escape option such as none of these terms are correct. This option seems to allow observers the flexibility to indicate whether the correct term is not present among the response options. Given the demands of empirical research, it is not certain that participants will choose that option even if it is available (e.g., Orne, 1962). This is also an empirical question. The four experiments in this article are designed to test whether adding a none of these terms are correct option to the standard forced-choice paradigm alleviates the problems inherent in the paradigm. The first experiment tests the effects of this none correct option on Australian participants' agreement rates. The second experiment tests whether the finding that incorrect agreement on certain emotion labels occurs when the correct label is removed (Russell, 1993) will still hold true if participants have a none correct option.
The third experiment tests how this none correct option affects agreement rates on a facial expression never before seen by humans. Will observers choose the none correct option, select randomly from the alternatives, or agree on an inaccurate label? The fourth experiment tests whether adding additional emotion response options to the standard forced-choice paradigm affects agreement rates, both with and without the none correct option. This is important because the typical forced-choice experiment uses about seven emotion options, which might be too limited a selection for participants. This could have caused observers to settle on an emotion label such as fear when the label alarmed might better represent the observer's judgment (Russell, 1994).
Experiment 1
The first step in addressing issues with the forced-choice paradigm is to see whether agreement on emotion labels occurs when
1 We use correct and incorrect in relation to the emotion labels proposed by the universalists as a shorthand to represent the label agreed on by observers for certain facial-emotion-expression configurations. This particularly applies to the Pictures of Facial Affect set (Ekman & Friesen, 1976). However, because this article deals with perceptions of facial expressions of emotion and not with accuracy, we acknowledge that our results do not provide direct evidence for accuracy. To know whether these expressions are "correct" or not involves techniques beyond the scope of this article.
observers are presented with a forced-choice paradigm that has an escape option of none of these terms are correct. The second is to compare the agreement rates with and without this option. We used an entirely between-subjects design for this experiment to shed some light on three of the six methodological criticisms leveled at the forced-choice paradigm. This change eliminated the problems associated with a within-subject design as well as the problems of previewing and order effects. However, as in other forced-choice paradigm research, we used posed facial expressions of emotion taken from a standard set, the Pictures of Facial Affect (POFA; Ekman & Friesen, 1976), whose agreement rates were normed on Americans in the mid 1970s. We showed these to a group clearly different from the norm group: Australians in the 1990s. Thus, on the basis of the large body of work on the universal agreement on facial expressions of emotion using the forced-choice paradigm, as well as the studies showing agreement using a free-choice paradigm, we predicted that Australian participants would agree on the terms predicted by the universalists at rates greater than chance, regardless of the response format.
mind volunteering 2 min to answer anonymously a quick question for a university class project. Those who consented and were fluent in English were then handed the target stimulus from the top of the pile. Participants were asked to work quickly, and the only prompt, if one was needed, was "just pick the best answer." Design. The independent variables were the type of response form (standard set vs. modified set), the gender of the poser in the stimulus photo, and the emotion displayed in the poser's face, as stipulated in the POFA norms. The dependent variable was the proportion of participants selecting each response option when judging a target stimulus. We hypothesized that participants would agree on the emotion displayed in each facial expression at rates greater than expected by chance, irrespective of response form.
Results
Participants' judgments of each of the 12 target stimuli are displayed in Table 1. It is clear from these data that the majority of participants agreed on the emotion term predicted by the universalists, irrespective of the emotion portrayed, the gender of the poser, or the type of response form (see Table 1). To test the significance of this pattern, we chose to set chance at one quarter, or 25%. This seemed justified for two reasons. First, although there are six or seven responses and chance may be seen as one sixth or one seventh, critics of the universalist position have argued that some of these emotion terms are so similar, and some so different, that in reality, observers are choosing among four options rather than six or seven (e.g., Russell, 1994). Second, other researchers see emotional expressions as occupying a position along two independent bipolar dimensions or qualities, namely pleasant-unpleasant and aroused-unaroused. If, as these researchers have proposed, all facial expressions can be placed reliably within the four quadrants defined by crossing these two dimensions (e.g., Russell & Fehr, 1987), then chance might be expected to be one quarter rather than one sixth or one seventh. Taken together, these reasons suggest that setting chance at 25% is a sufficiently conservative test of participant agreement rates. A binomial test showed that participants selected the term predicted by the universalists at rates greater than chance (all ps < .01). Moreover, this pattern was similar whether the expression was posed by a man or a woman (all ps < .01) and whether participants judged the expressions on the standard-set or the modified-set response forms (all ps < .01). It is important to note that none of the other terms were selected at rates above chance; this included the none correct option when it was available.
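The argument that a 25% chance level is the conservative choice can be illustrated with a small sketch. It assumes a one-sided exact binomial test (the article does not say which form of the test was used) and n = 54 judges per photo per response form; the function names are hypothetical and exist only for this illustration.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """One-sided exact binomial p-value: P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def min_significant_count(n: int, chance: float, alpha: float = 0.01) -> int:
    """Smallest number of agreeing judges whose tail probability is below alpha."""
    for k in range(n + 1):
        if binomial_tail(k, n, chance) < alpha:
            return k
    raise ValueError("no count reaches significance")


# ASSUMPTION: 54 judges per cell, as in the per-gender rows of Table 1.
N_JUDGES = 54
thresholds = {
    label: min_significant_count(N_JUDGES, chance)
    for label, chance in [("1/4", 1 / 4), ("1/6", 1 / 6), ("1/7", 1 / 7)]
}

# Lowest agreement in Table 1 is 54% of 54 judges, taken here as 29 judges
# (a back-calculated count, not a figure reported in the article).
p_lowest = binomial_tail(29, N_JUDGES, 0.25)

if __name__ == "__main__":
    print(thresholds)
    print(f"p for 29 of 54 at chance = .25: {p_lowest:.2e}")
```

A higher assumed chance level demands more agreeing judges before a result counts as significant, which is the sense in which 25% is stricter than one sixth or one seventh.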
In comparing the agreement rates across all emotions, participants selected the term predicted by the universalists at rates greater than chance for both the standard set (84.8% agreement overall, p < .01) and the modified set (80.9% agreement overall, p < .01). When we compare the agreement rates between the two response forms, a contingency table chi-square shows that the ratio of agreements and disagreements for the modified set did not differ significantly from the ratio of agreements and disagreements for the standard set, χ²(1, N = 1,296) = 3.38, ns. Thus adding the option of none correct did not significantly reduce the overall level of agreement on these expressions.
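The contingency-table comparison can be sketched as follows. The cell counts are assumptions: the Method implies 648 forms per response format (54 copies of each of 12 photos), and the agreement counts are rounded from the reported 84.8% and 80.9%. The article states neither the exact counts nor whether a continuity correction was applied, so this sketch is not expected to reproduce 3.38 exactly.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int, yates: bool = False) -> float:
    """Pearson chi-square (df = 1) for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Yates continuity correction, floored at zero.
        diff = max(0.0, diff - n / 2)
    return n * diff**2 / ((a + b) * (c + d) * (a + c) * (b + d))


# ASSUMED counts, back-calculated from the reported percentages.
standard_agree, standard_disagree = 550, 98    # about 84.8% of 648
modified_agree, modified_disagree = 524, 124   # about 80.9% of 648

CRITICAL_05_DF1 = 3.841  # chi-square critical value for df = 1, alpha = .05

uncorrected = chi_square_2x2(
    standard_agree, standard_disagree, modified_agree, modified_disagree
)
corrected = chi_square_2x2(
    standard_agree, standard_disagree, modified_agree, modified_disagree, yates=True
)

if __name__ == "__main__":
    print(f"uncorrected: {uncorrected:.2f}, Yates-corrected: {corrected:.2f}")
    print("significant at .05:", uncorrected > CRITICAL_05_DF1)
```

Under these assumed counts both versions of the statistic fall below the .05 critical value of 3.84, which is consistent with the nonsignificant result reported above.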
Method
Participants. The participants were 1,296 Australians recruited from public settings in the greater Sydney metropolitan area, including university campuses and shopping malls. The age range for the sample was 18 to 64 years old, and there were approximately equal numbers of men and women. Almost all of those approached to be in this experiment consented to participate. Materials. We selected photographs to show the participants from the POFA set (Ekman & Friesen, 1976). POFA is a standard slide set of 110 black and white facial close-ups of six men and eight women, each posing the emotion expressions of anger, disgust, fear, happiness, sadness, and surprise as well as a neutral expression. The normative data that accompany POFA suggest that observers agree, at rates greater than chance, that each of these posed expressions does represent the emotion it is intended to depict. We selected one male and one female poser for each of these six emotion expressions, such that no poser would appear more than once in the set of 12. We selected the particular expression from each poser that featured all the specific muscle action patterns proposed by Ekman and Friesen (1975) as representing the emotional expressions of anger, disgust, fear, happiness, sadness, and surprise. According to the POFA norms, all of these expressions were agreed on by at least 80% of the observers. These expressions were converted from slides to 5 × 7 in. (12.7 × 17.78 cm) photographic prints, and then each print was photocopied onto a single sheet of paper. The top part of this sheet of paper asked for the participant's age and gender; the bottom part read "Please refer to the face above when making your judgment. Please circle the term that best describes what this person is feeling." This statement was followed by one of two different sets of response options. The standard forced-choice set contained the words anger, disgust, fear, happiness, sadness, and surprise arranged in that order.
The modified forced-choice set contained the words anger, disgust, fear, happiness, sadness, surprise, and none of these terms are correct arranged in that order. We made 108 photocopies of each of the 12 slides; 54 copies featured the standard set, and 54 copies featured the modified set. Procedure. We shuffled the response forms to get them into a random order and then gave them to a research team composed of university students enrolled in a final-year social psychology elective. The response forms were given to the team face down, and the team members were instructed not to look at the photograph side of the sheet, so that they were unaware of both the facial expression and the response-form condition. The participants were approached by a member of the research team and asked whether they were fluent in English and then whether they would not
Discussion
The results of Experiment 1 with Australian participants replicated the normative data reported for the POFA set (Ekman &
Table 1
Participants' Categorical Judgments of Facial Expressions With or Without a None of the Above Terms Are Correct Option, for Male or Female Posers

                            Categorical choice (%)
Photo and gender            Anger   Disgust  Fear    Happy   Sad     Surprise  None   n
Anger, none option          73**    11       2       0       0       6         7      108
  Male face                 76**    6        4       0       0       4         11     54
  Female face               70**    17       0       0       0       7         4      54
Anger, no none option       81**    11       3       0       2       4         --     108
  Male face                 83**    7        2       0       4       4         --     54
  Female face               78**    15       4       0       0       4         --     54
Disgust, none option        9       78**     1       0       2       0         10     108
  Male face                 19      70**     2       0       2       0         7      54
  Female face               0       85**     0       0       2       0         13     54
Disgust, no none option     9       87**     0       0       4       0         --     108
  Male face                 15      80**     0       0       6       0         --     54
  Female face               4       94**     0       0       2       0         --     54
Fear, none option           8       4        69**    0       1       13        5      108
  Male face                 0       6        80**    0       2       11        2      54
  Female face               17      2        60**    0       0       15        7      54
Fear, no none option        10      3        76**    0       1       10        --     108
  Male face                 4       0        89**    0       0       7         --     54
  Female face               17      6        63**    0       2       13        --     54
Happy, none option          0       0        1       95**    0       1         3      108
  Male face                 0       0        0       96**    0       2         2      54
  Female face               0       0        2       94**    0       0         4      54
Happy, no none option       0       0        0       98**    0       2         --     108
  Male face                 0       0        0       96**    0       4         --     54
  Female face               0       0        0       100**   0       0         --     54
Sad, none option            0       3        1       2       77**    0         18     108
  Male face                 0       6        2       4       54**    0         35     54
  Female face               0       0        0       0       100**   0         0      54
Sad, no none option         0       9        3       2       84**    2         --     108
  Male face                 0       19       6       4       69**    4         --     54
  Female face               0       0        0       0       100**   0         --     54
Surprise, none option       1       1        8       1       0       85**      4      108
  Male face                 0       2        13      2       0       81**      2      54
  Female face               2       0        4       0       0       89**      6      54
Surprise, no none option    0       1        8       0       1       90**      --     108
  Male face                 0       2        4       0       0       94**      --     54
  Female face               0       0        13      0       2       85**      --     54

Note. Percentages may not add to 100% because of rounding. Dashes (--) indicate that this option was not available. ** p < .01.
Friesen, 1976) derived from Americans. Australian participants agreed, at rates greater than chance, that the faces stipulated by the POFA set and described by the universalists as representing anger, disgust, fear, happiness, sadness, and surprise did in fact depict these emotions. We found this pattern of agreement despite the conservative chance level we adopted; moreover, agreement was not affected by the gender of the poser in the target stimulus nor by the type of response form. That is, the participants did not choose the none correct option when they had the opportunity to do so. Moreover, we found this pattern of results using a between-subjects design where participants saw only one expression, and thus there was no previewing or potential order effect confounds. It seems, therefore, that the modified forced-choice paradigm provides an adequate test of the position that certain facial expressions are recognized universally. However, one problem emerges. It is possible that observers will not select none of these terms are correct with any zeal
even if they consider it to be the most appropriate option. We found that less than 8% of the participants chose this option when it was available to them. A recent article exploring American and Indian judgments of facial expressions of emotion used this none correct option in a typical forced-choice response format and found that less than 5% of participants would choose that option (Haidt & Keltner, 1999). However, this study did not manipulate the presence or absence of this option to measure its effect on agreement rates. Also, it is well known that participants will bow to the demand characteristics of social psychological experiments (e.g., Orne, 1962). Even in real-life situations like criminal suspect lineups, police have noted that witnesses often feel compelled to identify someone (anyone) rather than no one, because of the demands of the situation (Bartol, 1983). In Experiment 1, participants may have felt subtle social pressure to choose an emotion term (any emotion term) rather than none, because of the demands of the
situation. After all, their instructions were to judge the feelings being expressed in the face. Consequently, although the findings of Experiment 1 are consistent with the position that the universalist findings are not artifact, they do not entirely contradict the relativist position that agreement on particular emotion labels may be an artifact of the forced-choice methodology. The fact that the relativists have shown that people can be made to agree on emotion labels that are not those proposed by the universalists (Russell, 1993) suggests that the demands of the experimental situation may lead participants to select any emotion label as descriptive of a particular facial expression of emotion. However, in the relativist studies, agreement on incorrect emotion terms occurred in the absence of both the correct emotion label and a none of these terms are correct option (Russell, 1993). It is an empirical question whether observers will actually choose the none correct option when it is available or whether they will agree on an incorrect emotion label because of the demands of the experiment.
Experiment 2
This experiment aimed to replicate the relativist finding that in a forced-choice format, people will agree on incorrect labels for facial expressions in the absence of the correct label. This experiment extended the relativist research to assess whether this phenomenon occurs when observers are provided the opportunity to select the option none of these terms are correct. We predicted that in the absence of the emotion label considered correct by the universalists and in the presence of a none of these terms are correct option, participants would select the none correct option for three reasons. First, the results of Experiment 1 showed that participants will agree that certain facial expressions represent specific emotions when a none of these terms are correct option is available.
Second, we noted earlier that people generally agree on emotion descriptors for facial expressions when they are tested using a free-response paradigm (e.g., Boucher & Carlson, 1980; Haidt & Keltner, 1999; Izard, 1971). Third, research on the physiological expression of emotion shows that when people of different cultures are asked to pose particular facial expressions they (a) report "feeling" the emotion and (b) show an emotion-specific pattern of physiology (Ekman, Levenson, & Friesen, 1983; Levenson, Ekman, & Friesen, 1990; Levenson, Ekman, Heider, & Friesen, 1992; although see critiques by Cacioppo, Klein, Berntson, & Hatfield, 1993). This suggests that certain facial expressions are not randomly related to physiology but tied to specific emotional experiences. It seems reasonable to conclude that the emotion-physiology link reinforces beliefs about which expressions represent which emotions. At the same time, we expected that participants would agree on an incorrect emotion label when the correct term was missing and no none correct option was available, as was found in previous work (Russell, 1993).
single exemplar of each of the six emotional expressions from the POFA set. Three of these photos were of men, and three were of women. As in Experiment 1, the stimulus photos were photocopied onto a response form. In contrast to Experiment 1, however, the correct emotion for the target stimulus was removed from the list of response options presented at the bottom of the response form. For example, if the target stimulus was a purported anger expression, participants were asked to choose among the labels disgust, fear, happiness, sadness, and surprise. For each target stimulus, half of the response forms listed the five emotion labels only (the adjusted standard response form), and the remaining half listed the same five emotion labels and the none of these terms are correct option (the adjusted modified response form). The procedure was the same as in Experiment 1. Design. The independent variables were the facial expression depicted in the target stimulus and the type of response form (adjusted standard vs. adjusted modified). The dependent variable was the proportion of participants selecting each response option. We predicted greater than chance agreement on an incorrect response for the adjusted standard response form and greater than chance agreement on the none correct option for the adjusted modified response form.
Results
The number of participants who agreed on each emotion label for each target stimulus is shown in Table 2. Consistent with the relativists' findings, these data demonstrate that when the correct emotion label is missing, participants will agree on an incorrect emotion label at rates greater than chance (Russell, 1993). For example, according to a binomial test set at 25%, participants agreed on disgust for an anger expression, anger for a disgust expression, surprise for a happy expression, disgust for a sad expression, and fear for a surprise expression (all ps < .05). The only exception was a fear expression, for which participants did not agree on any emotion label at rates greater than chance (see Table 2). As we predicted, when participants were allowed to choose none of these terms are correct, they did choose that option at rates greater than chance. Participants were more likely to select none correct when the correct option was not available for the disgust, fear, happy, sad, and surprise faces (all ps < .05).
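The binomial test against a 25% chance level can be sketched for the cell size used here (18 judges per photo and response form). Two things are assumed rather than reported: that the test is a one-sided exact test, and that the counts can be recovered by rounding each Table 2 percentage back to a whole number of judges out of 18. The helper names are hypothetical.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """One-sided exact binomial p-value: P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


N_JUDGES = 18   # judges per cell in Experiment 2
CHANCE = 0.25   # chance level adopted in the article


def significance_marker(count: int, n: int = N_JUDGES, chance: float = CHANCE) -> str:
    """Return '**' for p < .01, '*' for p < .05, and '' otherwise."""
    p_value = binomial_tail(count, n, chance)
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""


# ASSUMED counts, back-calculated from Table 2 percentages (count / 18).
examples = {
    "anger photo, with none option, label disgust (50%)": 9,
    "surprise photo, with none option, label none (56%)": 10,
    "fear photo, without none option, label disgust (39%)": 7,
}
markers = {label: significance_marker(count) for label, count in examples.items()}

if __name__ == "__main__":
    for label, marker in markers.items():
        print(f"{label}: {marker or 'ns'}")
```

With only 18 judges per cell, at least 9 must pick the same label to clear the .05 level under these assumptions, which is why 39% agreement on the fear photo does not count as agreement.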
Discussion
The results of Experiment 2 replicate the relativist finding that participants will agree on an incorrect emotion label when judging facial expressions using a forced-choice paradigm. However, these results also demonstrate that the antidote to this potentially fatal weakness of the forced-choice paradigm is the addition of the none of these terms are correct option. Table 2 shows that in the absence of the correct term, participants will select the none of these terms are correct option; this finding seems independent of the emotional expression judged and the gender of the poser. Moreover, we found this pattern using a between-subjects design, so none of the issues associated with previewing the photos or order could have affected the results. Taken together, Experiments 1 and 2 suggest that at least for the POFA set of facial expressions, the labels given to these emotional expressions by the universalists appear to be appropriate. It seems that the findings obtained by the universalists using the old forced-choice paradigm and these particular photos may not be artifactual after all.
Method
Participants. We recruited 216 observers in the same manner as in Experiment 1. Their ages ranged from 17 to 62 years, and approximately half were men and half women. Materials and procedure. Because the characteristics of the specific poser did not affect the judgments in Experiment 1, we selected only a
Table 2
Comparison of Percentage of Categorical Judgments With or Without a None of the Above Terms Are Correct Option, as Well as the Presence or Absence of the Correct Option

                               Categorical choice (%)
Photo and response form        Anger   Disgust  Fear    Happy   Sad     Surprise  None    n
Anger, with none option        --      50*      6       0       6       0         39      18
Anger, without none option     --      72**     11      0       0       17        --      18
Disgust, with none option      28      --       0       0       6       6         61**    18
Disgust, without none option   78**    --       0       0       22      0         --      18
Fear, with none option         11      17       --      0       0       17        56**    18
Fear, without none option      22      39       --      0       6       33        --      18
Happy, with none option        0       0        0       --      0       0         100**   18
Happy, without none option     6       11       0       --      6       78**      --      18
Sad, with none option          6       6        11      6       --      6         67**    18
Sad, without none option       11      56**     22      0       --      11        --      18
Surprise, with none option     0       6        28      6       6       --        56**    18
Surprise, without none option  6       6        61**    11      17      --        --      18

Note. Percentages may not add to 100% because of rounding. Dashes (--) indicate that this option was not available. * p < .05. ** p < .01.
Experiment 3
However, a third challenge awaits a forced-choice response format that includes a none correct option. Specifically, how would this response format handle a novel facial expression? For example, we can only ponder what would happen if we tried to replicate the original universalist experiments on visually isolated peoples with this modified forced-choice response format. We say ponder because it appears as if there are no more visually isolated peoples left on this planet. However, we can conduct a conceptual replication: We can test what would happen to observer agreement rates if the observers were presented with a facial expression of emotion that they did not recognize. For example, what if the Dani people of New Guinea did not show, or did not recognize, the facial expression of anger: would the forced-choice response format allow researchers to determine that? Would observers incorrectly agree on this expression using the standard or the modified forced-choice format, or would each format allow us to detect a lack of agreement by judges that should occur for this particular emotion expression? Again, this is an empirical question. We examined these questions by creating a novel facial expression, one impossible to demonstrate physically or experience physiologically, to show to American observers. Observers then judged this face using either the standard or the modified forced-choice response format to enable us to determine whether the modified form would prevent accidental agreement on an incorrect option.
Method
Participants. One hundred and twenty people aged 18-54 years were recruited for this study from the New Brunswick, New Jersey area, in the same manner as the participants in Experiments 1 and 2 were recruited. There were approximately equal numbers of men and women.
Materials and procedure. We used a line tracing of the photograph of the fear expression from the POFA set used in Experiment 2 as one target stimulus (the fear expression). This served as a control measure to enable us to determine whether participants would judge a line tracing and a photograph of a facial expression similarly. We also chose this particular fear expression because it had one of the lowest agreement rates (65%) for the photos in the POFA set and thus provided the most stringent test of the agreement rates for a line tracing of an emotional expression. We created the second target stimulus by taking the line tracing outline of the head of the person showing the fear expression and drawing in a pattern of wrinkles and bulges in the mouth, eyes, cheeks, and forehead that does not occur in natural facial movements (the nonsense expression). For example, we drew in vertical wrinkles on the lateral portion of the forehead, we made the eye corners go up on one side and down on the other, and we added horizontal wrinkles on the cheeks (see Figure 1). These faces were then photocopied onto one of two different response forms: the modified forced-choice format response form (with the none correct option) and the standard forced-choice format response form (see Experiment 1). The procedure was the same as in Experiments 1 and 2.
Design. The independent variables were the line tracings of the target stimulus (fear expression vs. nonsense expression) and the response form (modified set vs. standard set). The dependent variable was the proportion of participants selecting each of the response options. On the basis of our previous findings, we predicted that participants would agree on the label fear for the fear expression at rates greater than chance, regardless of the response form. In light of our findings in Experiment 2, we predicted that participants would agree on the label none of these terms are correct for the nonsense expression at rates greater than chance when they had the modified forced-choice format and would agree incorrectly on an emotion label when they had the standard forced-choice format.
Results
Figure 1. Line drawings of the nonsense face (left) and the fear face (right).
The data are presented in Table 3. As shown in the table, participants agreed on the label fear for the fear expression at rates greater than chance (chance = 25%) for both the standard (67% agreement) and the modified forced-choice response format (73% agreement; binomial test, both ps < .01). This suggests that the line tracing was an adequate representation of a facial expression, and participants viewed it much as they viewed the photograph of the same expression. Comparing these results for the fear expression with those derived from the photographs shown in Table 1, we find that, collapsed across the response forms, agreement on the photos of fear averaged 72.5%, whereas agreement on the line drawings averaged 70.0%. The results for the nonsense expression were as predicted. The modal choice for the nonsense expression in the standard response format was the label disgust (47%; p < .01). The modal choice in the modified response format was the label none of these terms are correct (43%; p < .05). No other response options approached statistical significance.
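The binomial tests reported for Table 3 can be approximated with a short calculation. The sketch below is an illustration, not the authors' analysis: it assumes a one-tailed exact binomial test against the stated 25% chance level (the text says only "binomial test"), and it recovers the counts from the rounded percentages with 30 observers per cell (73% as 22, 67% as 20, 47% as 14, 43% as 13). The function name binomial_upper_tail is hypothetical.

```python
from math import comb


def binomial_upper_tail(successes, n, chance):
    """Exact probability of at least `successes` agreements out of n under `chance`."""
    return sum(
        comb(n, k) * chance**k * (1 - chance) ** (n - k)
        for k in range(successes, n + 1)
    )


CHANCE = 0.25  # chance level stated in the text
N = 30         # observers per cell in Table 3

# Counts are an assumption: inferred from the rounded percentages in Table 3.
cells = {
    "fear drawing, modified form, label fear (73%)": 22,
    "fear drawing, standard form, label fear (67%)": 20,
    "nonsense drawing, standard form, label disgust (47%)": 14,
    "nonsense drawing, modified form, label none (43%)": 13,
}

p_values = {label: binomial_upper_tail(count, N, CHANCE) for label, count in cells.items()}
for label, p in p_values.items():
    print(f"{label}: one-tailed p = {p:.4f}")
```

Under these assumptions the tail probabilities fall below .01 for the first three cells and between .01 and .05 for the last, which agrees with the significance markers in Table 3.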
Discussion
The results of Experiment 3 show that the standard forced-choice paradigm can cause a nonsense face to become incorrectly classified as a particular emotion expression at rates greater than chance, as predicted by the relativists (Russell, 1993). However, as the other experiments in this article have shown, the modified forced-choice format once again prevented that error. We note that these results are not a function of using line tracings; participants did agree on a line tracing of a fear facial expression at rates no different from those for the photograph of that same fear expression, regardless of response format. Also, we found this result using a between-subjects design. Although it seems we can no longer perform a replication of the universalist work with visually isolated peoples (e.g., Ekman, 1972) using the modified forced-choice format, the results of this experiment raise the possibility that the original agreement rates on particular expressions found for these isolated peoples may have been a function of the standard forced-choice paradigm. The results of our studies cast doubt on but do not eliminate this possibility. We do find levels of agreement for these facial expressions at rates similar to the other research on universality of expression, even with a response option of none of these terms are correct. Also, we now know that participants will use that option when it is available, and use it at rates greater than chance. However, we acknowledge that one must be very cautious about comparing accuracy rates across different studies, participant groups, and time periods.

Table 3
Comparison of Percentage of Categorical Judgments With and Without a None of the Above Terms Are Correct Option, for Fear and Nonsense Facial Expressions

                                      Categorical choice (%)
Drawing                  Anger   Disgust  Fear    Happy   Sad     Surprise  None    n
Fear
  With none option       0       0        73**    0       0       20        7       30
  Without none option    0       3        67**    0       0       30        --      30
Nonsense
  With none option       13      20       10      0       10      3         43*     30
  Without none option    20      47**     13      0       10      10        --      30

Note. Dashes indicate that this option was not available. *p < .05. **p < .01.
Experiment 4
A fourth criticism of the forced-choice response form used in facial-expression-of-emotion research is that its response options may be too limited (e.g., Russell, 1994). The relativists have argued that any number of other emotion labels might better describe the facial expressions considered by the universalists to represent anger, disgust, fear, happiness, sadness, and surprise. As we noted in Experiment 1, they also argue that these facial expressions may not be perceived categorically at all but judged according to their position along the two continuous dimensions of arousal and pleasure (Russell & Fehr, 1987). For example, the relativists have proposed that a facial expression that combines high pleasure and high arousal might be judged as excited, one that combines low pleasure and high arousal might be judged as alarmed, and so forth. The relativists have argued that reliance on the six emotion labels may have had the effect of slanting the playing field such that these labels are seen as the best labels for particular facial expressions of emotion, when other labels like alarmed may be more accurate (Ortony & Turner, 1990). Specifically, one can infer from the relativist position that emotion labels derived from the ratings of arousal and pleasure may better characterize the expressions shown by stimulus targets used in universality studies (Russell, 1993, 1994).
We do not know what effect adding response options to the modified response form will have on participants' judgments and on the rates of agreement on specific facial expressions of emotion. One possibility is that in the previous three experiments, the none correct option worked as it did because of the limited number of emotion labels from which to choose. If adding more response options "dilutes" the effect of providing a none correct option, then this would provide some support for the relativist position, or at least support for a weaker universality position (e.g., Russell, 1995). However, we note that with more options available and the greater possibility of error in measurement, it is potentially less likely that agreement on a particular emotion label will occur at rates greater than chance. Again, this is an empirical question. This experiment added four more emotion labels to the options presented to observers. The labels added were those suggested by the poles of the dimensional account of facial expressions of emotion. If the relativists are correct, then a facial expression that is high in arousal and low in pleasure is as likely to be judged as fear or alarm as it is to be judged as anger. If the universalists are correct, then an anger expression should be judged as anger, not as alarm or fear, and disgust should be judged as disgust, and so on.
Method
Participants. We solicited 241 participants aged 17-50 years in the same manner and from the same locations as in Experiment 3. There were approximately equal numbers of men and women.
Materials and procedure. The target stimuli from Experiment 2 and the response options from Experiment 1 were used in this experiment. This time, however, the standard and modified (which included the none correct option) response forms were enhanced by the inclusion of four additional response options, namely alarmed, bored, contempt, and excited; these labels were chosen because each is located in one of the quadrants of the relativist circumplex model of facial expression perception (Russell & Fehr, 1987). All 10 or 11 options were listed in alphabetical order along the bottom of the response form. The procedure was identical to that of Experiments 1-3.
Design. The independent variables were the emotion expressed in the target stimulus and the response form (enhanced modified vs. enhanced standard). The dependent variable was the proportion of participants selecting each response option for each of the target stimuli.
Results
We tabulated the participants' responses, and these are presented in Table 4. Once again, we set chance at 25% agreement. As shown in Table 4, the labels predicted by the universalists were agreed on at rates greater than chance, regardless of the response format. The lowest agreement level was 65%, for surprise. A binomial test showed that this rate was significantly different from chance (p < .001). Outside of the universalists' predicted emotion label, no other emotion label was agreed on at a rate greater than 30% (ns). The correct response, as predicted by the universalists, was chosen 83% of the time in the enhanced standard response format and 78% of the time in the enhanced modified response format. This difference in agreement rates was not significant, χ²(1, N = 241) = 1.22, ns, and was comparable to the agreement rates shown in Experiment 1 (85% for the standard set, and 81% for the modified set).
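The chi-square comparison of the two response formats can be checked against the cell sizes and percentages in Table 4. The sketch below is an illustration, not the authors' analysis: it assumes a Pearson chi-square test without continuity correction on a 2 x 2 table (response format by predicted label vs. any other response), and it recovers the counts from the rounded percentages (94 of 121 observers chose the predicted label on the enhanced modified form, and 100 of 120 on the enhanced standard form). The function name pearson_chi_square_2x2 is hypothetical.

```python
def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Counts are an assumption: inferred from the rounded percentages and n values in Table 4.
modified_correct, modified_other = 94, 121 - 94    # enhanced modified form (78% correct)
standard_correct, standard_other = 100, 120 - 100  # enhanced standard form (83% correct)

chi2 = pearson_chi_square_2x2(
    modified_correct, modified_other, standard_correct, standard_other
)
print(f"chi-square(1, N = 241) = {chi2:.2f}")  # prints 1.22
```

Under these assumptions the statistic matches the reported value and lies well below 3.84, the critical value for p = .05 with 1 degree of freedom.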
Discussion
These results show that the pattern of findings we obtained using the modified forced-choice response format remains largely unaffected when additional emotion labels are added to the list of possible options. Once again, it is possible to conclude that results obtained in previous studies using the forced-choice paradigm are unlikely to have been an experimental artifact and, furthermore, that the emotion labels provided by the researchers for selection by observers were appropriate. Regardless of experimental condition, participants did not select the labels alarmed, bored, excited, or contempt, despite the relativist argument that these might be more appropriate labels for some of the emotions depicted in the POFA set. We note that we tested the effect of including only four additional emotion labels. It remains to be seen whether the findings reported here can be replicated when observers are provided the opportunity to select from a wider array of response options. We also note that although the relativists have argued that the emotion label contempt is equally descriptive of the facial expression labeled disgust by the universalists, we found no evidence for this (cf. Russell, 1993). In fact, the universalists have suggested that contempt may be another universal facial expression of emotion (Izard & Haynes, 1988) and should only be labeled as such by observers when the particular configuration of facial muscles is displayed. We did not show that particular facial configuration, and thus it is not surprising that participants did not select contempt at rates greater than chance.

Table 4
Comparison of Percentage of Categorical Judgments With or Without a None of These Terms Are Correct Option, With Proposed Alternative Choices

                                      Categorical choice (%)
Photo                   AL    AN    BO    CO    DI    EX    FE    HA    SA    SU    None  n
Anger
  With none option      0     79**  0     16    0     0     0     0     0     0     5     19
  Without none option   0     90**  0     10    0     0     0     0     0     0     --    20
Disgust
  With none option      0     0     5     10    86**  0     0     0     0     0     0     21
  Without none option   0     0     0     5     95**  0     0     0     0     0     --    20
Fear
  With none option      25    5     0     0     0     0     65**  0     0     5     0     20
  Without none option   10    0     0     0     0     0     90**  0     0     0     --    20
Happiness
  With none option      0     0     0     0     0     5     0     90**  0     0     5     20
  Without none option   0     0     0     0     0     15    0     80**  0     5     --    20
Sadness
  With none option      0     5     10    0     5     0     0     0     81**  0     0     21
  Without none option   0     0     10    10    0     0     0     0     80**  0     --    20
Surprise
  With none option      20    0     0     0     0     0     15    0     0     65**  0     20
  Without none option   30    0     0     0     0     0     5     0     0     65**  --    20

Note. AL = alarmed; AN = anger; BO = bored; CO = contempt; DI = disgust; EX = excited; FE = fear; HA = happiness; SA = sadness; SU = surprise. **p < .01.

General Discussion
These four experiments demonstrate the benefits of adding the response option none of these terms are correct to a standard forced-choice response paradigm. First, this none correct option does not significantly change the pattern of agreement rates reported by the universalists using the standard forced-choice response form. Second, the significant agreement is not due to a demand to avoid the none correct option; observers use it when it seems appropriate. Third, the none correct option prevents artifactual agreement on a novel facial expression. Fourth, this none option does not appreciably affect the pattern of results reported by the universalists even when additional emotion labels are added. Finally, we found these results consistent with universality using a between-subjects design, which eliminates the previewing and order effect problems of the previous work.
Taken together, these studies show that the relativists were correct when they pointed out that the standard forced-choice paradigm has a propensity to demand agreement (Russell, 1993). However, the addition of the none correct option prevented that artifactual agreement. We found that people will agree, at rates greater than chance, that particular facial configurations represent the emotions of anger, disgust, fear, happiness, sadness, and surprise, even when they were no longer forced into choosing an emotion term. Our results suggest that although it is potentially problematic methodologically, the modified forced-choice paradigm, in a between-subjects method, did replicate the universalist findings in Australian and American participants. The fact that people will agree on a label, even an incorrect label, does highlight the findings that facial expressions of emotion are perceived categorically (Alvarado, 1996; Etcoff & Magee, 1992).
The best test of the merits of continuing the categorical approach to understanding facial expressions of emotion is seen in the new avenues of research generated by that approach (Ekman, 1993). For example, researchers using a categorical approach to facial expressions of emotion have been able to discover (a) emotion-specific physiology (Ekman et al., 1983; Levenson et al., 1990, 1992; cf. Schachter & Singer, 1962); (b) emotion-specific brain circuitry, including isolating locations within the brain that respond to these specific facial expressions of emotion (e.g., anger and sadness; Blair, Morris, Frith, Perrett, & Dolan, 1999; fear, Whalen et al., 1998); and (c) nonverbal clues to deceit (Ekman, Friesen, & O'Sullivan, 1988; Frank & Ekman, 1997). These findings would not have been predicted by a noncategorical approach to perceiving facial expressions of emotion.
It is also not clear exactly what a noncategorical approach would predict (e.g., Ortony & Turner, 1990). For example, one relativist approach argues that facial expressions are perceived along continuous dimensions such as arousal and pleasure rather than categorically (Russell & Fehr, 1987). Although it is likely to be true that facial expressions are perceived continuously, so are most other objects in the world. For example, we can rate cars, dogs, or trees on continuous dimensions of arousal and pleasure (e.g., Osgood, Suci, & Tannenbaum, 1957). In fact, evidence shows that fear and anger expressions are rated very closely in multidimensional space on the basis of these continuous dimensions (high arousal, low pleasure; Russell & Fehr, 1987). However, to confuse
these expressions in day-to-day life can have life-threatening implications. A prison inmate who shows fear expressions will be assaulted, and an inmate who shows anger expressions will be left alone (J. J. Newberry, personal communication, June 22, 1998).2 One can also imagine a therapist observing the facial expression of a patient complaining about his or her partner: whether that patient expresses fear or anger would have dramatically different implications for treatment. These examples suggest that although rating facial expressions on two or three continuous dimensions seems more parsimonious than does dealing with six or eight categories, this parsimony comes at the expense of losing critically important social and interpersonal information.
The preceding evidence strongly suggests that researchers not only continue the categorical approach but refine the instruments used to measure judgments of emotion. All future research on the topic of facial expressions should stop using the standard forced-choice format and instead use the modified forced-choice format. It offers the advantage of ease of use and scoring, yet it avoids the pitfall of artificial agreement on emotion terms. By having this safety valve option of selecting none of these terms are correct, this response form adequately allows one to address novel faces, missing terms, and agreement and accuracy issues.
It is important to note that although our results suggest that the universalist findings were not artifact, they cannot prove it. We did not apply this modified forced-choice response format to non-Western participant groups, and thus we did not collect the data necessary to definitively replicate the cross-cultural findings on universality. The interconnectedness and access to media throughout the world may have made that forever impossible to test. Thus, we must develop other techniques to address the concerns of both the relativists and the universalists to fully understand the nature of human emotion.

2 This information was conveyed to us by Special Agent James J. Newberry (retired), formerly of the Bureau of Alcohol, Tobacco, and Firearms.

References
Alvarado, N. (1996). Congruence of meaning between facial expressions of emotion and selected emotion terms. Motivation and Emotion, 20, 33-61.
Bartol, C. R. (1983). Psychology and American law. Belmont, CA: Wadsworth.
Blair, R. J. R., Morris, J. S., Frith, C. D., Perrett, D. I., & Dolan, R. J. (1999). Dissociable neural responses to facial expressions of sadness and anger. Brain, 122, 883-893.
Boucher, J. D., & Carlson, G. E. (1980). Recognition of facial expression in three cultures. Journal of Cross-Cultural Psychology, 11, 263-280.
Buss, D. (1992). Is there a universal human nature? Contemporary Psychology, 37, 1262-1263.
Cacioppo, J. T., Klein, D. J., Berntson, G. G., & Hatfield, E. (1993). The psychophysiology of emotion. In M. Lewis & J. M. Haviland (Eds.), The handbook of emotions (pp. 119-142). New York: Guilford.
Darwin, C. (1998). In P. Ekman (Ed.), The expression of the emotions in man and animals (3rd ed.). New York: Oxford. (Original work published 1872).
Ekman, P. (1972). Universals and cultural differences in facial expressions of emotion. In J. Cole (Ed.), Nebraska Symposium on Motivation 1971 (pp. 207-283). Lincoln, NE: University of Nebraska Press.
Ekman, P. (1992). An argument for basic emotions. Cognition and Emotion, 6, 169-200.
Ekman, P. (1993). Facial expression and emotion. American Psychologist, 48, 384-392.
Ekman, P. (1994). Strong evidence for universals in facial expressions: A reply to Russell's mistaken critique. Psychological Bulletin, 115, 268-287.
Ekman, P., Davidson, R. J., & Friesen, W. V. (1990). The Duchenne smile: Emotional expression and brain physiology II. Journal of Personality and Social Psychology, 58, 342-353.
Ekman, P., & Friesen, W. V. (1975). Unmasking the face: A guide to recognizing emotions from facial clues. Englewood Cliffs, NJ: Prentice-Hall.
Ekman, P., & Friesen, W. V. (1976). Pictures of facial affect. Palo Alto, CA: Consulting Psychologists Press.
Ekman, P., Friesen, W. V., & O'Sullivan, M. (1988). Smiles when lying. Journal of Personality and Social Psychology, 54, 414-420.
Ekman, P., Friesen, W. V., O'Sullivan, M., Chan, A., Diacoyanni-Tarlatzis, I., Heider, K., Krause, R., LeCompte, W. A., Pitcairn, T., Ricci-Bitti, P. E., Scherer, K., Tomita, M., & Tzavaras, A. (1987). Universals and cultural differences in the judgments of facial expressions of emotion. Journal of Personality and Social Psychology, 53, 712-717.
Ekman, P., Levenson, R. W., & Friesen, W. V. (1983, September 16). Autonomic nervous system activity distinguishes among emotions. Science, 221, 1208-1210.
Etcoff, N. L., & Magee, J. J. (1992). Categorical perception of facial expressions. Cognition, 44, 227-240.
Frank, M. G., & Ekman, P. (1993). Not all smiles are created equal: The differences between enjoyment and nonenjoyment smiles. Humor: The International Journal for Research in Humor, 6, 9-26.
Frank, M. G., & Ekman, P. (1997). The ability to detect deceit generalizes across different types of high stake lies. Journal of Personality and Social Psychology, 72, 1429-1439.
Frank, M. G., Ekman, P., & Friesen, W. V. (1993). Behavioral markers and recognizability of the smile of enjoyment. Journal of Personality and Social Psychology, 64, 83-93.
Fridlund, A. J. (1991). Sociality of solitary smiling: Potentiation by an implicit audience. Journal of Personality and Social Psychology, 60, 229-240.
Fridlund, A. J. (1994). Human facial expression: An evolutionary view. San Diego, CA: Academic Press.
Haidt, J., & Keltner, D. (1999). Culture and facial expression: Open-ended methods find more expressions and a gradient of recognition. Cognition and Emotion, 13, 225-266.
Izard, C. E. (1971). The face of emotion. New York: Appleton-Century-Crofts.
Izard, C. E. (1977). Human emotions. New York: Plenum.
Izard, C. E. (1992). Basic emotions, relations among emotions, and emotion-cognition relations. Psychological Review, 99, 561-565.
Izard, C. E. (1994). Innate and universal facial expressions: Evidence from developmental and cross-cultural research. Psychological Bulletin, 115, 288-299.
Izard, C. E., & Haynes, O. M. (1988). On the form and universality of the contempt expression: A challenge to Ekman and Friesen's claim of discovery. Motivation and Emotion, 12, 1-16.
Keltner, D. (1995). The signs of appeasement: Evidence for the distinct displays of embarrassment, amusement, and shame. Journal of Personality and Social Psychology, 68, 441-454.
Klineberg, O. (1940). Social psychology. New York: Holt.
Landis, C. (1924). Studies of emotional reactions: II. General behavior and facial expression. Journal of Comparative Psychology, 4, 447-509.
LeBarre, W. (1947). The cultural basis of emotions and gestures. Journal of Personality, 16, 49-68.
Levenson, R. W., Ekman, P., & Friesen, W. V. (1990). Voluntary facial action generates emotion-specific autonomic nervous system activity. Psychophysiology, 27, 363-384.
Levenson, R. W., Ekman, P., Heider, K., & Friesen, W. V. (1992). Emotion and autonomic nervous system activity in the Minangkabau of West Sumatra. Journal of Personality and Social Psychology, 62, 972-988.
Lutz, C., & White, G. M. (1986). The anthropology of the emotions. Annual Review of Anthropology, 15, 405-436.
Matsumoto, D. (1990). Cultural similarities and differences in display rules. Motivation and Emotion, 14, 195-214.
Orne, M. (1962). On the social psychology of the psychological experiment. American Psychologist, 17, 776-783.
Ortony, A., & Turner, T. J. (1990). What's basic about basic emotions? Psychological Review, 97, 315-331.
Osgood, C. E., Suci, G. J., & Tannenbaum, P. H. (1957). The measurement of meaning. Urbana, IL: University of Illinois Press.
Plutchik, R. (1962). The emotions: Facts, theories, and a new model. New York: Random House.
Rosenberg, E. L., & Ekman, P. (1995). Conceptual and methodological issues in the judgment of facial expressions of emotion. Motivation and Emotion, 19, 111-138.
Russell, J. A. (1993). Forced-choice response format in the study of facial expression. Motivation and Emotion, 17, 41-51.
Russell, J. A. (1994). Is there universal recognition of emotion from facial expressions? A review of the cross-cultural studies. Psychological Bul-
Received February 25, 2000 Revision received September 12, 2000 Accepted September 14, 2000