Papers by Johannes Eichstaedt
Assessment, 2013
We present a new open language analysis approach that identifies and visually summarizes the dominant naturally occurring words and phrases that most distinguished each Big Five personality trait. Using millions of posts from 69,792 Facebook users, we examined the correlation of personality traits with online word usage. Our analysis method consists of feature extraction, correlational analysis, and visualization. The distinguishing words and phrases were face valid and provide insight into processes that underlie the Big Five traits. Open-ended, data-driven exploration of large datasets combined with established psychological theory and measures offers new tools to further understand the human psyche.
Proceedings of the 2nd Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality, 2015
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2015
People vary widely in their temporal orientation (how often they emphasize the past, present, and future), and this affects their finances, health, and happiness. Traditionally, temporal orientation has been assessed by self-report questionnaires. In this paper, we develop a novel behavior-based assessment using human language on Facebook. We first create a past, present, and future message classifier, engineering features and evaluating a variety of classification techniques. Our message classifier achieves an accuracy of 71.8%, compared with 52.8% from the most frequent class and 58.6% from a model based entirely on time expression features. We quantify a user's overall temporal orientation based on their distribution of messages and validate it against known human correlates: conscientiousness, age, and gender. We then explore social scientific questions, finding novel associations with the factors openness to experience, satisfaction with life, depression, IQ, and one's number of friends. Further, demonstrating how one can track orientation over time, we find differences in future orientation around birthdays.
Proceedings of the Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality, 2014
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014
Journal of personality and social psychology, Jan 3, 2014
Language use is a psychologically rich, stable individual difference with well-established correlations to personality. We describe a method for assessing personality using an open-vocabulary analysis of language from social media. We compiled the written language from 66,732 Facebook users and their questionnaire-based self-reported Big Five personality traits, and then we built a predictive model of personality based on their language. We used this model to predict the 5 personality factors in a separate sample of 4,824 Facebook users, examining (a) convergence with self-reports of personality at the domain- and facet-level; (b) discriminant validity between predictions of distinct traits; (c) agreement with informant reports of personality; (d) patterns of correlations with external criteria (e.g., number of friends, political attitudes, impulsiveness); and (e) test-retest reliability over 6-month intervals. Results indicated that language-based assessments can constitute valid p...
Psychological science, 2015
Hostility and chronic stress are known risk factors for heart disease, but they are costly to assess on a large scale. We used language expressed on Twitter to characterize community-level psychological correlates of age-adjusted mortality from atherosclerotic heart disease (AHD). Language patterns reflecting negative social relationships, disengagement, and negative emotions (especially anger) emerged as risk factors; positive emotions and psychological engagement emerged as protective factors. Most correlations remained significant after controlling for income and education. A cross-sectional regression model based only on Twitter language predicted AHD mortality significantly better than did a model that combined 10 common demographic, socioeconomic, and health risk factors, including smoking, diabetes, hypertension, and obesity. Capturing community psychological characteristics through social media is feasible, and these characteristics are strong markers of cardiovascular mortali...
PLoS ONE, 2013
We analyzed 700 million words, phrases, and topic instances collected from the Facebook messages of 75,000 volunteers, who also took standard personality tests, and found striking variations in language with personality, gender, and age. In our open-vocabulary technique, the data itself drives a comprehensive exploration of language that distinguishes people, finding connections that are not captured with traditional closed-vocabulary word-category analyses. Our analyses shed new light on psychosocial processes, yielding results that are face valid (e.g., subjects living in high elevations talk about the mountains), tie in with other research (e.g., neurotic people disproportionately use the phrase 'sick of' and the word 'depressed'), suggest new hypotheses (e.g., an active life implies emotional stability), and give detailed insights (males use the possessive 'my' when mentioning their 'wife' or 'girlfriend' more often than females use 'my' with 'husband' or 'boyfriend'). To date, this represents the largest study, by an order of magnitude, of language and personality.
Developmental Psychology, 2014
We introduce a new method, differential language analysis (DLA), for studying human development in which computational linguistics are used to analyze the big data available through online social media in light of psychological theory. Our open vocabulary DLA approach finds words, phrases, and topics that distinguish groups of people based on 1 or more characteristics. Using a data set of over 70,000 Facebook users, we identify how word and topic use vary as a function of age and compile cohort specific words and phrases into visual summaries that are face valid and intuitively meaningful. We demonstrate how this methodology can be used to test developmental hypotheses, using the aging positivity effect as an example. While in this study we focused primarily on common trends across age-related cohorts, the same methodology can be used to explore heterogeneity within developmental stages or to explore other characteristics that differentiate groups of people. Our comprehensive list of words and topics is available on our web site for deeper exploration by the research community.
Social and personality psychology compass, 2015
Countless studies have addressed why some individuals achieve more than others. Nevertheless, the psychology of achievement lacks a unifying conceptual framework for synthesizing these empirical insights. We propose organizing achievement-related traits by two possible mechanisms of action: Traits that determine the rate at which an individual learns a skill are talent variables and can be distinguished conceptually from traits that determine the effort an individual puts forth. This approach takes inspiration from Newtonian mechanics: achievement is akin to distance traveled, effort to time, skill to speed, and talent to acceleration. A novel prediction from this model is that individual differences in effort (but not talent) influence achievement (but not skill) more substantially over longer (rather than shorter) time intervals. Conceptualizing skill as the multiplicative product of talent and effort, and achievement as the multiplicative product of skill and effort, advances sim...