Do Mathematicians Agree About Mathematical Beauty
https://doi.org/10.1007/s13164-022-00669-3
Abstract
Mathematicians often make aesthetic judgements to evaluate mathematical
objects such as equations or proofs. But is there a consensus about which
mathematical objects are beautiful? We used a comparative judgement technique to
measure aesthetic intuitions among British mathematicians, Chinese mathematicians, and
British mathematics undergraduates, with the aim of assessing whether judgements
of mathematical beauty are influenced by cultural differences or levels of expertise.
We found aesthetic agreement both within and across these demographic groups.
We conclude that judgements of mathematical beauty are not strongly influenced by
cultural differences, levels of expertise, or types of mathematical objects. Our
findings contrast with recent studies that found mathematicians often disagree with each
other about mathematical beauty.
* Rentuya Sa
R.Sa@lboro.ac.uk
1
Department of Mathematics Education, Loughborough University, Schofield Building,
University Road, Loughborough LE11 3TU, UK
2
Centre for Logic and Philosophy of Science, Vrije Universiteit Brussel, Brussels, Belgium
R. Sa et al.
objective property, but where individual aesthetic preferences might differ idiosyn-
cratically, or perhaps be systematically influenced by social contexts.
As a contribution to the field of experimental philosophy, this paper empirically
measures the level of aesthetic consensus among and across British mathematicians,
Chinese mathematicians, and British undergraduate mathematics students. We begin
by examining philosophical accounts of mathematical beauty and methods previ-
ously used to assess aesthetic intuitions. We then introduce comparative judgment as
a means of investigating aesthetics, and describe the stimuli, participants, and proce-
dure we employed. Finally, we discuss our empirical findings in the light of current
understandings of mathematical beauty, and summarise both our substantive contri-
bution to debates about philosophical intuitions and our methodological contribu-
tion to advances in experimental philosophy.
1981, p.254). Erdős frequently referred to beautiful proofs as coming ‘straight out
of The Book’ (Erdős 1983, p.37), and famously claimed that mathematicians do not
necessarily need to believe in the existence of God, but they should believe in the
existence of ‘The Book’ (Aigner and Ziegler 2010, p.v). According to Cherniwchan
et al. (2010), one of Erdős’s great ambitions in life was to find such beautiful proofs.
His students Aigner and Ziegler (2010) published a collection entitled Proofs from
The Book, based on some of Erdős’s own suggestions. According to Aigner and Zie-
gler (2010, p.v), a degree of mathematical proficiency is required to comprehend
‘The Book’, although only at the undergraduate level. This suggests that undergradu-
ate mathematical training would suffice for making at least some aesthetic judgments.
The above accounts make two assumptions regarding mathematical beauty that
are subject to empirical investigation: (i) there should be agreement about aesthet-
ics among mathematicians, and (ii) judgements about aesthetics require a degree of
mathematical proficiency, although it is unclear what level of training is needed. It is
worth noting that the theoretical basis for the former assumption is potentially chal-
lenged by arguments that mathematical beauty is reducible to non-aesthetic properties
that are epistemically centered. Rota (1997, p.175), for instance, agrees that ‘both the
truth of a theorem and its beauty are…equally shared and agreed upon by the com-
munity of mathematicians’, but suggests that this is due to a sense of enlightenment
that is mistakenly referred to as ‘beauty’. Similarly, Todd (2008, pp.71–72) claims
that the ‘normative strength of the putative aesthetic claims’ is due to a relationship
between truth and epistemic warrants, and Dutilh Novaes (2019) argues that Hardy’s
aesthetic criteria are reducible to non-aesthetic properties that facilitate the epistemic
function of explanatoriness. These claims highlight ambiguity between aesthetic and
epistemic dimensions of mathematics, and we address them further later. However,
these authors still assume a degree of agreement between different mathematicians’
aesthetic intuitions, so evidence on (i) pertains to their views too.
with 60.4% scoring the proof below the midpoint of the aesthetic scale, and only 31.5%
scoring it above. For a ‘Book proof’ to score so low seems to challenge the existence of
universal agreement about mathematical beauty; it certainly seems that Aigner and Zie-
gler’s aesthetic preferences might not be reflected in the wider community.
On simplicity, Inglis and Aberdein (2015) asked 225 mathematicians to rate the
extent to which 80 adjectives described a proof that they could think of or had recently
read. Using an exploratory factor analysis, they found four main dimensions on which
proofs varied: aesthetics, intricacy, precision, and utility. Neither ‘beautiful’ nor ‘ele-
gant’ correlated strongly with ‘simple’. Thus, even if there is agreement, this might not
be due to traditionally listed criteria.
On social influence, Inglis and Aberdein (2020) replicated their 2016 study, this
time with the manipulation that half of their 203 mathematician participants were
told the proof’s source, Aigner and Ziegler’s (2010) attempt to produce a version of
Erdős’s ‘The Book’. Pure mathematicians who were given the source rated the proof
more highly than those who were not, but applied mathematicians did not show such an
effect. This suggests that mathematicians’ aesthetic judgement might indeed be socially
influenced: Erdős was most active in pure mathematics, so ‘The Book’ is likely better
known among pure than applied mathematicians. This result also relates to the
mere-exposure account of aesthetics, which suggests that an individual’s aesthetic
appreciation is developed through repeated exposure to the same item (Zajonc 2001). Famous
mathematical objects, such as proofs from Aigner and Ziegler’s ‘The Book’, would
have more exposure within the field, and so mathematicians’ aesthetic appreciation of
such proofs could be socially developed through repeated exposure. If accounts such
as the mere-exposure effect, coupled with social conformity effects of the type studied
by Inglis and Aberdein, can successfully explain mathematicians’ judgements of
mathematical beauty, then its objective existence would be an unnecessary assumption.
Overall, these empirical results show disagreement between different mathematicians’
intuitions of mathematical beauty. The evidence is not decisive, and there are reasons to
be cautious about the methodological approaches adopted in these studies, as discussed
below. However, it seems reasonable to conclude that investigations into mathematical
beauty should not assume that mathematicians all agree. With this in mind, we examine
judgments of mathematical beauty in relation to simplicity, cultural context, and pro-
ficiency by measuring and comparing the degree of aesthetic agreement among British
mathematicians, Chinese mathematicians, and British mathematics undergraduates.
To further situate this work and to raise issues in methodology, we next elaborate on
relevant studies of cross-cultural and cross-expertise philosophical intuitions.
Work in this area would also contribute to wider disputes on the degree of cross-
cultural consensus about philosophical intuitions. In early cross-cultural experimental
works, Westerners and East Asians were found to have different patterns of epistemic
and semantic intuitions (Weinberg et al. 2001; Machery et al. 2004), challenging the
normativity of those intuitions and posing a serious problem for the standard philo-
sophical approach of using intuition as evidence (Stich 2001). However, these findings
failed to replicate (e.g. Seyedsayamdost 2015; Kim and Yuan 2015), arguably due to
methodological weaknesses: Knobe (2019, 2021) pointed out that the number of Asian
participants in Weinberg et al.’s study was only 24, and Lam (2010) argued that it was
problematic for Machery et al. to present questions to Asian participants in English
instead of their native languages. Machery et al. (2017) recently addressed these meth-
odological limitations by having a larger sample (N = 521) from four different cultural
backgrounds, and using materials written in native languages. Contrary to the early
experimental results, Machery et al. (2017) identified that people from different cultural
backgrounds exhibit similar patterns of epistemic intuitions. Knobe (2019) cited this
result in support of the hypothesis that philosophical intuitions are robust across cul-
tures. However, this position was criticized by Stich and Machery (2022), who argued
that Knobe had presented an unbalanced summary of the literature: although some
studies failed to replicate earlier findings of cross-cultural differences in intuitions,
many such findings have successfully replicated (Stich and Machery 2022, Table 1).
Similar points have been raised in work directly related to aesthetics, in the
more obvious domain of perceptual beauty (Che et al. 2018). Cross-cultural disa-
greement was initially detected by McElroy (1952) and Lawlor (1955), but agree-
ment was found in a series of later investigations (Eysenck and Iwawaki 1971;
Soueif and Eysenck 1971) and in cross-cultural studies on basic visual features
such as symmetry, proportion, curvature, brightness, and contrast (Che et al.
2018). These contrasting empirical results could be influenced not only by the
choice of stimuli but also by the types of judgements, since these studies varied
in asking participants to judge the stimuli in isolation or in comparison with one
another: McElroy and Lawlor asked participants to rank an entire set of artworks
presented simultaneously, whereas Eysenck and his collaborators asked par-
ticipants to make comparative or individual judgements. It is certainly possible
that the apparent degree of aesthetic consensus is influenced by methodological
approach, and we pick up this point below.
As noted above, there are debates about the extent to which aesthetics overlaps with
epistemology. If mathematical beauty requires significant mathematical insight,
then recognizing beauty should be impossible for people without adequate training.
This would be consistent with an early study by Dreyfus and Eisenberg (1986), who
designed a set of problems to evoke elegant solutions and found that undergraduates
struggled to come up with such solutions and were unable to distinguish aesthetically
pleasing solutions from others when prompted. Dreyfus and Eisenberg interpreted
this as indicating that people need mathematical training beyond undergraduate level
in order to appreciate mathematical beauty. Of course, it could be that Dreyfus and
Eisenberg’s own criteria for elegance or beauty are not widely shared, or that under-
graduates can appreciate beauty only in relatively simple mathematical contexts.
It could also be that mathematicians, who do have the potential to make aesthetic
judgements based on epistemology, do so only partly on that basis. Starikova (2017)
argues in this direction, distinguishing intellectual and perceptual aspects of math-
ematical beauty. She suggests that intellectual beauty is the aesthetic response to
abstract properties of a mathematical object, such as structure or degree of gener-
ality; sufficient proficiency is required to detect and appreciate these. Appreciating
perceptual beauty, on the other hand, does not necessarily require mathematical
understanding. Similarly, Montano (2014) and Pearcy (2020) distinguish the per-
formative appreciation response, which is a reaction of active intellectual engage-
ment, from the basic appreciation response, a passive and automatic reaction. Pearcy
(2020, pp.59–60) illustrates this with the example of the physicist Richard Feynman
and his artist friend, who could both see a flower as beautiful, but for different
reasons: Feynman has a basic appreciation response in visual aesthetics but a
performative appreciation response in scientific aesthetics, and his artist friend vice versa.
These theoretical suggestions are consistent with empirical evidence. Zeki et al.
(2014), for instance, asked 15 mathematicians (postgraduate students and postdoc-
toral researchers) to study 60 equations, rating each for beauty from -5 to 5. After
about 2 to 3 weeks, the mathematicians were asked to re-rate the equations as ugly,
neutral, or beautiful while their brain activity was fMRI-scanned. A few days after
scanning, they were asked to rate their understanding of each equation from 0 to 3.
Zeki et al. found a significant positive correlation between understanding and scan-
time beauty rating, and a significant difference in brain activity in a region associ-
ated with appreciating beauty when participants were viewing equations they rated
as beautiful as opposed to ugly or neutral. The latter was driven by beauty ratings
after accounting for understanding, so there is room for aesthetic judgements to be
based partly on understanding and partly on visual appearance. Consistent with
this interpretation, Zeki et al. also found that 12 non-expert participants (educated
in mathematics only to the age of 16) indicated that they had no understanding of
the vast majority of the equations, but some did give positive beauty ratings for a
minority.
For our purposes, this provides evidence of an imperfect overlap between
aesthetics and epistemology for experts, but no indication of how this develops prior to
expertise or of whether there is aesthetic agreement. In fact, although Zeki
et al. found a highly significant positive correlation between pre-scan and scan-time
beauty ratings, the correlation coefficient of r = .612 is some way from perfect, and
some large shifts in ratings were seen between the two times. If mathematicians do
not always agree with themselves, perhaps it is unreasonable to expect them to agree
with one another. The aesthetic judgement of experts and non-experts has been fur-
ther studied by Johnson and Steinerberger (2019) who asked two groups of experts
(mathematicians and mathematics undergraduates) and a group of non-experts to
rate the similarity of mathematical arguments to artworks (paintings and classical
music). Perhaps surprisingly, they found that participants could associate each
mathematical argument with an artwork, with agreement at above-chance levels, suggesting
some degree of shared consensus about this kind of aesthetic correspondence.
Using different methods, Hayn-Leichsenring et al. (2021) studied both
undergraduate students and aesthetic agreement. They asked twenty mathematics
undergraduates and twenty undergraduates without university-level mathematical
training to distribute 64 equations into 9 piles ranging from “extremely unaesthetic”
to “extremely aesthetic” with predetermined numbers in each pile to form a normally
distributed pattern. After participants completed their judgements, they were asked
to state which equations they were familiar with, and to indicate the criteria behind
their judgements from options including “meaning”. In line with the works of Zeki
et al. and Johnson and Steinerberger, Hayn-Leichsenring et al. found a positive
relationship between understanding and perceived beauty: in both groups, ratings
were significantly higher for familiar equations. This was more pronounced for the
mathematics undergraduates, who were familiar with more equations and who more
often stated that their aesthetic judgement relied on meaning. This seems to imply that
greater understanding would result in greater aesthetic appreciation in mathematics.
However, an alternative account would be that understanding is merely an essential
pre-condition for making any form of judgement about equations or proofs. More
investigations are needed to examine how intuitions about mathematical aesthetics are
related to familiarity and understanding.
Hayn-Leichsenring et al. also looked explicitly at simplicity, finding a significant
negative relationship between the number of elements (numbers, letters and
mathematical signs) in an equation and its aesthetic rating for the mathematics
undergraduates but not the other group. They also found that compared to
undergraduates without mathematical training, mathematics undergraduates shared a
higher level of aesthetic agreement.
These results suggest that undergraduates with university-level mathematical
training have attained sufficient proficiency to share a performative appreciation
response to mathematical beauty. And this returns us to questions about methodology.
In some of these empirical studies, it seems that mathematicians do not agree about
beauty as much as traditional philosophical accounts suppose. In others, it seems
that agreement is present even among comparatively inexperienced undergraduates.
We suggest that method might be one reason for this. Notably, in Johnson and
Steinerberger’s study, participants’ aesthetic judgement was conducted through
comparing and contrasting mathematical arguments with artworks. Similarly,
in Hayn-Leichsenring et al.’s study, participants also compared and contrasted
equations for fine-grained aesthetic classification. In both studies, participants’
aesthetic judgements were relative rather than absolute. We believe it is plausible that
mathematicians might have different absolute standards for beauty and thus appear
to disagree when asked for absolute judgements, but might nevertheless agree about
which objects are more or less beautiful. If this is the case, then such agreement is
best sought with methods involving relative judgements. We used one such method,
as described below.
3.1 Stimuli
The stimuli for this study were chosen from Zeki et al.’s (2014) list of 60 equations; 20
out of the 60 equations were used (selected by taking every third one), along with the
brief descriptions written by Zeki et al. For the Chinese participants, the descriptions were
translated into simplified Chinese. The selected equations and their descriptions were
formatted and uploaded to the online CJ platform No More Marking
(https://www.nomoremarking.com). The 20 equations appear in Table 1 in the Results section (along with
their CJ scores, to be explained below); their descriptions appear in Table 3 in the Appendix.
To assess factors that might predict judgements of mathematical beauty, we counted
the number of characters in each equation, the number of words in its description, and
the number of mathematicians’ names mentioned in the description. For the equations,
we counted individual characters ignoring any commas, so that $\frac{dx}{dt}$
and $(\alpha - \beta y)$ have 5
and 6 characters respectively. For the descriptions, we counted words ignoring punc-
tuation and brackets; any mathematical element that appeared within the description
was counted as one word. For the counts for each equation, see again Table 3.
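The counting rule for equations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `count_equation_characters` is hypothetical, equations are assumed to be typed in a linear form in which a fraction bar is written as `/` and counted as one character, and whitespace is assumed not to count.

```python
def count_equation_characters(equation: str) -> int:
    """Count the characters of an equation, ignoring commas and whitespace.

    Assumption: the equation is written linearly, so a fraction bar
    appears as a single '/' character.
    """
    return sum(1 for ch in equation if ch != "," and not ch.isspace())


# The two fragments mentioned in the text.
print(count_equation_characters("dx/dt"))     # 5
print(count_equation_characters("(α − βy)"))  # 6
```

Under these assumptions the two fragments give the counts of 5 and 6 stated above.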
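Table 1 The ranking and the β score for each equation, separately for each demographic group

| Equation | British Mathematicians: Ranking | British Mathematicians: Score | British Undergraduates: Ranking | British Undergraduates: Score | Chinese Mathematicians: Ranking | Chinese Mathematicians: Score |
|---|---|---|---|---|---|---|
| $f(x) = \int_{-\infty}^{\infty} \delta(x - y) f(y)\, dy$ | 14 | -0.482 | 11 | 0.074 | 11 | -0.143 |
| $1 = \sum_{n=2}^{\infty} (\zeta(n) - 1)$ | 15 | -0.607 | 14 | -0.355 | 8 | 0.287 |
| $\frac{1}{\zeta(s)} = \sum_{n=1}^{\infty} \frac{\mu(n)}{n^{s}}, \; s \in \mathbb{C}, \; \mathrm{Re}(s) > 1$ | 17 | -0.930 | 20 | -1.624 | 13 | -0.212 |
| $\frac{dx}{dt} = x(\alpha - \beta y), \; \frac{dy}{dt} = -y(\gamma - \delta x)$ | 18 | -1.048 | 12 | -0.033 | 19 | -0.964 |
| $T = de + \omega \wedge e, \; R = d\omega + \omega \wedge \omega$ | 19 | -1.387 | 17 | -0.765 | 17 | -0.729 |
| $R_{\alpha\beta[\gamma\delta;\lambda]} = 0$ | 20 | -1.461 | 19 | -1.341 | 20 | -2.081 |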
3.3 Results
For each demographic group, we used the Bradley-Terry Model, which assigns each
equation i a parameter 𝛽i to estimate its beauty. It does this via a process based on using
the judgements to iteratively update the probability that equation i is judged to be more
beautiful than equation j (Bradley and Terry 1952). To check that this yielded meaning-
ful scores, we then calculated inter-rater reliability (IRR) for each demographic group.
We randomly split each group into subgroups of 12, splitting the mathematician groups
evenly and randomly selecting two groups of 12 from among the undergraduates. We
then calculated new estimates of each equation’s aesthetic quality from each subgroup’s
judgements. This process was repeated 1000 times to calculate the average Pearson cor-
relation coefficient between the scores for the two subgroups.
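To make the scoring step concrete, the following is a minimal sketch of fitting a Bradley-Terry model, in which the probability that equation i is judged more beautiful than equation j is exp(β_i) / (exp(β_i) + exp(β_j)). It uses a standard minorisation-maximisation style iterative update; whether the software used in the study applies this exact routine is an assumption. The function name `fit_bradley_terry` and the three-item win matrix are hypothetical illustrations, not the study's data.

```python
import math


def fit_bradley_terry(wins, n_iter=500, tol=1e-10):
    """Estimate Bradley-Terry scores (beta) from a square matrix of win counts.

    wins[i][j] is the number of times item i was preferred to item j.
    Assumption: every item wins and loses at least once, so all
    strengths stay strictly positive.
    """
    n = len(wins)
    p = [1.0] * n  # start from equal strengths
    for _ in range(n_iter):
        new_p = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(n)
                if j != i
            )
            new_p.append(total_wins / denom)
        # Fix the scale so that the log-strengths sum to zero.
        log_mean = sum(math.log(x) for x in new_p) / n
        new_p = [x / math.exp(log_mean) for x in new_p]
        converged = max(abs(a - b) for a, b in zip(new_p, p)) < tol
        p = new_p
        if converged:
            break
    return [math.log(x) for x in p]


# Hypothetical toy data: three items, ten comparisons per pair.
toy_wins = [
    [0, 7, 9],
    [3, 0, 6],
    [1, 4, 0],
]
beta = fit_bradley_terry(toy_wins)
print([round(b, 3) for b in beta])
```

In a split-half reliability check of the kind described above, a routine like this would be run once per subgroup and the two resulting score vectors correlated.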
The IRR of the British mathematicians’ judgements was r = .721, the IRR of
the British undergraduates’ judgements was r = .701, and the IRR of the Chinese
mathematicians’ judgements was r = .722. These results indicate relatively consistent
aesthetic agreement within each demographic group: the CJ approach does detect
agreement based on relative judgements for all three groups. We thus treat the equa-
tion scores for the complete groups as reliable, and these scores are shown in Table 1,
ordered from most to least beautiful by the British mathematicians’ rankings.
A first indication of agreement not just within but across the demographic groups
is visible in the rankings and scores in Table 1. Both the British and the Chinese
mathematicians judged Euler’s identity the most beautiful equation and the
Second Bianchi Identity the least beautiful. Moreover, although the rankings do not
match perfectly, equations judged more beautiful by one group were generally
judged more beautiful by the other. This is reflected in a statistical analysis: there
is a significant and strong positive correlation between the two sets of scores,
r = .846, 95% CI [.645, .937]; see Fig. 1. At least for equations, it seems, relative
mathematical beauty is judged fairly consistently across these two cultures.
[Scatter plot; x-axis: British Mathematicians’ Perceived Aesthetics; y-axis: Chinese Mathematicians’ Perceived Aesthetics; r = 0.846]
Fig. 1 The correlation between CJ scores derived from British mathematicians’ and Chinese
mathematicians’ aesthetic judgements. Error bars show ±1 standard error
A first indication of agreement across expertise levels is also visible in Table 1. The
British undergraduates, like both groups of mathematicians, judged Euler’s identity the most
beautiful equation. They also, like the Chinese mathematicians, broadly agreed with
the British mathematicians: again we found a significant and strong positive correlation
between the two sets of scores, r = .781, 95% CI [.518, .909]; see Fig. 2. Again, at least
for equations, it seems that relative mathematical beauty is judged similarly by British
undergraduate mathematics students and more experienced mathematicians.
[Scatter plot; x-axis: British Mathematicians’ Perceived Aesthetics; y-axis: British Undergraduates’ Perceived Aesthetics; r = 0.781]
Fig. 2 The correlation between CJ scores derived from British mathematicians’ and British
undergraduates’ aesthetic judgements. Error bars show ±1 standard error
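As a rough check on the reported intervals, a confidence interval for a Pearson correlation can be sketched with the Fisher z-transformation. The use of this particular method is an assumption (the text does not state how the intervals were computed), as is the function name `pearson_ci`; n = 20 is the number of equations scored by each group.

```python
import math
from statistics import NormalDist
from typing import Tuple


def pearson_ci(r: float, n: int, level: float = 0.95) -> Tuple[float, float]:
    """Confidence interval for a Pearson correlation via Fisher's z-transformation."""
    z = math.atanh(r)                 # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)       # approximate standard error of z
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


comparisons = [
    ("British vs Chinese mathematicians", 0.846),
    ("British mathematicians vs British undergraduates", 0.781),
]
for label, r in comparisons:
    lower, upper = pearson_ci(r, n=20)
    print(f"{label}: r = {r:.3f}, 95% CI [{lower:.3f}, {upper:.3f}]")
```

Under these assumptions the intervals come out within about .001 of those reported above, which is consistent with small rounding differences in r.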
Finally for this study, Table 1 suggests that characteristics of the equations might predict
judgements of beauty: equations judged more beautiful tend to be shorter. To formally
investigate this, along with our other possible predictors, we conducted linear regression
analyses, separately for each group, predicting CJ scores from the number of characters in
the equations and the numbers of words and mathematicians’ names in the descriptions.
The results appear in Table 2. For no group did either number of words or num-
ber of names predict the beauty scores. Although, in line with earlier studies (Wells
1990), it seems that everyone thinks Euler’s identity is beautiful, it appears unlikely
that this is due to a generally positive view of equations with names attached (nota-
bly, our measure does not capture relative renown). It could well be the case, how-
ever, that its perceived beauty is related to its simplicity: for both the British math-
ematicians and the British undergraduates, the number of characters in an equation
did significantly predict beauty score (p = .008 and p = .010 respectively). To
contextualize these estimates, for the British mathematicians, an extra 10 characters in an
equation predicted a drop in beauty score of 0.89, nearly a quarter of the overall range
from 2.390 to -1.460; for the British undergraduates, it predicted a drop of 0.676 (score
range 1.347 to -1.623). For the Chinese mathematicians, the number of characters was
not a significant predictor of beauty score (p = .087), but the analysis nevertheless
indicated a predictive pattern similar to that of the other two demographic
groups. Number of characters in an equation is clearly a crude measure of
simplicity, but the direction of these results is in line with philosophical claims that
simplicity is related to mathematical beauty.
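The "nearly a quarter" comparison can be reproduced from the figures quoted in the text alone. The sketch below divides the predicted drop per 10 extra characters by the observed score range for each group; the helper name `share_of_range` is hypothetical.

```python
def share_of_range(drop_per_10_chars: float, max_score: float, min_score: float) -> float:
    """Express a predicted drop in beauty score as a fraction of the score range."""
    return drop_per_10_chars / (max_score - min_score)


# Values quoted in the text for each group.
british_mathematicians = share_of_range(0.89, 2.390, -1.460)    # 0.89 / 3.850
british_undergraduates = share_of_range(0.676, 1.347, -1.623)   # 0.676 / 2.970

print(round(british_mathematicians, 3))   # 0.231
print(round(british_undergraduates, 3))   # 0.228
```

On these figures, ten extra characters correspond to a little under a quarter of the observed range in both British groups.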
Study 1 found not only consistent aesthetic judgements within our demographic
groups of British mathematicians, Chinese mathematicians and British under-
graduates, but also agreement across culture and expertise. These findings
contrast with some earlier empirical work (Wells 1990; Inglis and Aberdein
2016, 2020), which found more disagreement than might have been expected
based on traditional accounts. One possible explanation was suggested above:
ratings against absolute scales might fail to capture underlying consensus on
relative mathematical beauty. However, it could also be that level of consen-
sus is affected by the type of stimuli: Inglis and Aberdein (2016, 2020), for
instance, found disagreement on proofs rather than equations. Hence, Study 2
aimed to measure mathematicians’ aesthetic agreement in relation to proofs,
to examine whether a change to a different type of stimulus would influence the
agreement found in Study 1.
4.1 Stimuli
Eight proofs were employed in Study 2 (we used fewer proofs than equations for
the obvious reason that these take longer to read). Five proofs were selected from
Aigner and Ziegler’s (2010) collection of Proofs from The Book, two were from
Pearcy’s (2020) Mathematical Beauty, and one was from Nelsen’s (2000) Proofs
Without Words II: More Exercises in Visual Thinking. All eight proof stimuli
appear in Table 4 in the Appendix.
5 Discussion
5.1 Summary
The studies in this paper investigated whether mathematicians agree about math-
ematical beauty. Using comparative judgement methods, Study 1 found agreement
about the aesthetics of equations both within and across three demographic groups:
British mathematicians, Chinese mathematicians, and British undergraduate math-
ematics students. It also found that simplicity – operationalized by counting charac-
ters in each equation – predicted collective judgements of beauty. Study 2 broadened
the range of stimuli, finding a similar level of between-participant agreement among
British mathematicians about the aesthetics of proofs.
Together, these studies constitute evidence that relative judgements about beauty
in mathematics are fairly stable and robust within and across cultures, and that
undergraduates have learned enough to be able to judge beauty in a way similar
to expert mathematicians, at least for the equations we considered. In this last sec-
tion, we discuss the implications of these findings for views on beauty in relation to
agreement and simplicity, to cross-cultural studies, and to epistemology; throughout,
we consider issues of methodology in experimental philosophy.
Two of the most interesting results of this paper are the level of aesthetic agreement
found and the result that short equations tend to be judged more beautiful. Both find-
ings are in line with traditional accounts of mathematical beauty, but they go against
prevailing trends in recent empirical work, which has found aesthetic disagreement
among mathematicians (Wells 1990; Inglis and Aberdein 2016, 2020) and a lack
of relationship between beauty and simplicity (Inglis and Aberdein 2015). We sug-
gest that, in both cases, methodological factors might account for these apparent
contradictions.
Our first study deliberately examined judgements of beauty from multiple demo-
graphic groups, answering calls to consider culture in mathematical practice (Larvor
2016) and addressing the issue of what expertise is required to exercise aesthetic
judgement. Our finding of a strong degree of aesthetic consensus across the Brit-
ish and Chinese groups suggests that mathematicians’ aesthetic judgements are not
strongly influenced by cultural differences. This is consistent with the moderate aes-
thetic agreement on basic visual properties found elsewhere in the field of cross-
cultural empirical aesthetics (Che et al. 2018), and is good news for those inclined
towards aesthetic realism: although agreement among mathematicians does not
imply that mathematical beauty is objective or that aesthetic intuitions can be nor-
matively correct, it provides no reason to reject that position.
Another way of accounting for our finding of aesthetic consensus across
the British and Chinese groups would be to suggest that the two groups share a
similar mathematical culture. In other words, perhaps mathematics is so
interconnected in the modern world that it no longer makes sense (if it ever did) to talk
about distinct mathematical cultures. We doubt that this is the case. While it is
certainly true that there has been a great deal of interaction between Western
and Chinese mathematics, both historically and today, this has led to concerted
efforts by some Chinese mathematicians to try to preserve what they consider
to be distinctive about Chinese mathematics. For instance, following the arrival
in China of Jesuit missionaries with Western mathematical texts in the late sev-
enteenth century, political movements such as “Chinese Origins of Western
Science” were founded. These aimed to minimize the significance of Western
influence in the development of Chinese mathematics by valuing and maintain-
ing its traditional culture (Bréard 2019, p.82).
The aim to preserve Chinese cultural identity in mathematics was again found
during the early development of modern mathematics in China. By the 1930s, the
first group of mathematics departments were founded in Chinese universities, which
led to exchanges with Western institutions. For instance, mathematicians such as
Bertrand Russell and William F. Osgood visited Peking University during the 1930s,
and their visits were significantly valued by the Chinese mathematics community
(Zong 2020). But a number of well-respected Chinese mathematicians responded
by emphasising the need to preserve the Chinese approach to mathematics. For
example, Shiing-Shen Chern advocated that “Chinese mathematics must be on the
same level as its Western counterpart, though not necessarily bending its effect in
the same direction” (Hudecek 2014, p.166). Chern’s student Wu Wen Tsun noted
that “there is an essentially Chinese mathematical style, and that Chinese mathema-
ticians have a patriotic duty to study it and build upon it” (Hudecek 2014, p.161).
After Wu returned to China from France in 1951, he focused on promoting the
ancient Chinese style of mathematics characterised by algorithms and the
“mechanisation of mathematics” (Hudecek 2012). In sum, there are reasons to suppose that
distinctive cultural aspects of Chinese mathematics were, and still are, valued by
Chinese mathematicians, despite the international mobility that characterises mod-
ern academia.
The fact that the way mathematics is taught in China contrasts with the way it is taught in many
Western countries gives us further reasons to suppose that Chinese and Western mathematical
practices do not share identical cultural norms. Recent decades have seen the devel-
opment of international comparison studies where student achievement is compared
between educational jurisdictions. These have tended to find that Chinese students,
and indeed students in the Pacific Rim more generally, outperform Western
students of the same age in mathematics (Fan and Zhu 2004). This, in turn, has led
to systematic investigations into how typical pedagogy in Chinese classrooms differs
from typical pedagogy in Western classrooms (Fan et al. 2004). A common obser-
vation is that mathematics education in China tends to place a relatively stronger
emphasis on the acquisition of mathematical content through rote learning and hard
work than is normal in the West (Leung 2001). Again, these findings support the
view that Chinese and Western mathematical cultures are not identical in general,
despite our findings that Chinese and British mathematicians’ aesthetic tastes appear
to be largely shared.
Moreover, we have narrowed down the way in which other social influences
might affect judgements about beauty: if social conformity plays a role in math-
ematicians’ aesthetic judgements, this is not visible in effects of the number of
influential names attached to an equation. That said, our findings cannot in this
case be said to contradict those of Inglis and Aberdein (2020): Euler’s identity,
with its longstanding aesthetic status, was consistently judged more beautiful
than the rest of our stimuli.
With regard to expertise, we found not only that mathematics undergradu-
ates are capable of making collectively consistent aesthetic judgements, but
also that they seem to share aesthetic criteria with mathematicians. This pro-
vides evidence against the claims of Hardy that judging mathematical beauty
requires advanced mathematical proficiency, at least as applies to equations.
However, the nature of the criteria remains unclear. Starikova (2017) or Pearcy
(2020) could argue that this cross-expertise agreement might be derived from
either perceptual or basic appreciation responses: perhaps both mathemati-
cians and students have similar responses to the visual appearance of the equa-
tions, and only mathematicians go beyond this. Taking our work in conjunction
with Johnson and Steinerberger (2019) and Hayn-Leichsenring et al. (2021),
however, we consider it more likely that epistemology plays a role: undergraduates
have developed sufficient proficiency to engage an intellectual or
performative appreciation response at a level that might not match that of
professional mathematicians but does reflect shared values beyond those
accessible to the general population. In addition, since our stimuli include some
famous equations – such as Euler’s identity, the Pythagorean theorem and
Fermat’s last theorem – the mere-exposure effect could be an alternative
explanation of the aesthetic agreement that we have found between mathematicians and
undergraduates. However, since we did not empirically measure participants’
familiarity with each equation, we cannot test this hypothesis. Further work
that directly measures how familiar mathematicians are with stimuli such as
ours would be worthwhile.
Finally, we highlight that our use of a comparative judgement method might
be a useful technique more generally in experimental philosophy. A common
goal of experimental philosophers is to empirically assess philosophical intui-
tions (e.g., Heintz and Taraborelli 2010). The psychophysics literature has estab-
lished that humans are more reliable at making judgements of physical properties
such as height, weight and brightness when asked to compare stimuli rather than
judge them in isolation. Our findings suggest that the same may be true when par-
ticipants are asked to make philosophical judgements. If this conclusion is correct
then the method of comparative judgement might be widely useful to experimental
philosophers.
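To illustrate how a set of pairwise judgements can be turned into a scale, the following is a minimal sketch that assumes a Bradley-Terry model (Bradley and Terry 1952) fitted with a standard minorise-maximise iteration. It is not the analysis code used in this study: the function name `fit_bradley_terry`, the three stimuli and the judgement counts are all hypothetical and serve only to show the general idea.

```python
import math


def fit_bradley_terry(n_items, comparisons, iterations=200):
    """Estimate Bradley-Terry log-strengths from (winner, loser) pairs.

    Assumption: every item wins at least once and loses at least once,
    otherwise its estimate is not finite and this sketch will fail.
    """
    wins = [0] * n_items
    pair_counts = {}  # number of judgements made for each unordered pair
    for winner, loser in comparisons:
        wins[winner] += 1
        key = (min(winner, loser), max(winner, loser))
        pair_counts[key] = pair_counts.get(key, 0) + 1

    strength = [1.0] * n_items
    for _ in range(iterations):
        # Minorise-maximise update: wins divided by expected exposure.
        denom = [0.0] * n_items
        for (i, j), n_ij in pair_counts.items():
            shared = n_ij / (strength[i] + strength[j])
            denom[i] += shared
            denom[j] += shared
        strength = [wins[i] / denom[i] for i in range(n_items)]
        # Fix the scale: rescale so that the log-strengths sum to zero.
        log_mean = sum(math.log(s) for s in strength) / n_items
        strength = [s / math.exp(log_mean) for s in strength]
    return [math.log(s) for s in strength]


# Hypothetical judgements: each tuple is (preferred stimulus, other stimulus).
judgements = (
    [(0, 1)] * 4 + [(1, 0)] * 1
    + [(0, 2)] * 4 + [(2, 0)] * 1
    + [(1, 2)] * 3 + [(2, 1)] * 2
)
scores = fit_bradley_terry(3, judgements)
for stimulus, score in enumerate(scores):
    print(f"stimulus {stimulus}: {score:+.3f}")
```

In this sketch a higher score means that a stimulus was preferred more often. Because the scores are centred to sum to zero, only the differences between them are meaningful.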
To conclude, we stress that the conflicting empirical findings about aesthetic
agreement in mathematics echo the broader evolution of experimental philosophy.
Although this paper’s finding of cross-cultural aesthetic agreement is consistent with
the stability of philosophical intuitions found in the more recent works of experimental
philosophy, this does not mean that one should disregard findings to the contrary.
Rather, it highlights the degree of complexity in constructing measures to operation-
alize the relevant constructs. This paper’s findings that aesthetic judgments in math-
ematics are relatively stable and robust across expertise and cultures suggest that
empirical findings of aesthetic disagreement based on absolute scales do not paint a
full picture of the nature of mathematical beauty. As Knobe (2019) has suggested, it
is important to seek that full picture, which demands triangulation (Löwe and Van
Kerkhove 2019). In this case, investigations might profitably work towards directly
comparing absolute with relative judgement approaches, and towards developing more
sophisticated operationalizations of simplicity and understanding. Our work sug-
gests that more nuanced accounts might successfully marry empirical findings with
traditional accounts of mathematical beauty.
Appendix

Table 3 Equation stimuli
Stimuli ID    Equations    Description    Characters    Names    Words
[table body not recovered]

Table 4 [caption and column headings not recovered]
1    1     1.132
2    2     0.444
3    3     0.304
4    4     0.185
5    5     0.012
6    6    -0.569
7    7    -0.678
8    8    -0.830
Acknowledgements Many thanks to Dave Sirl, Ouhao Chen, Michael Barany, Andrew Aberdein,
Brendan Larvor, Paola Iannone, Paul Hasselkuß, two anonymous reviewers, and attendees at the Novem-
bertagung for helpful discussions about this work.
Declarations
Conflict of Interest None.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License,
which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as
you give appropriate credit to the original author(s) and the source, provide a link to the Creative Com-
mons licence, and indicate if changes were made. The images or other third party material in this article
are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the
material. If material is not included in the article’s Creative Commons licence and your intended use is
not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission
directly from the copyright holder. To view a copy of this licence, visit
http://creativecommons.org/licenses/by/4.0/.
References
Aigner, M., and G.M. Ziegler. 2010. Proofs from THE BOOK, 4th ed. Springer.
Alexanderson, G.L. 1981. An interview with Paul Erdős. The Two-Year College Mathematics Journal 12
(4): 249–259.
Berghaus, G. 1992. Neoplatonic and Pythagorean notions of world harmony and unity and their influence
on Renaissance dance theory. Dance Research: The Journal of the Society for Dance Research 10
(2): 43–70.
Bisson, M.-J., C. Gilmore, M. Inglis, and I. Jones. 2016. Measuring conceptual understanding using com-
parative judgement. International Journal of Research in Undergraduate Mathematics Education 2
(2): 141–164. https://doi.org/10.1007/s40753-016-0024-3.
Bradley, R., and M. Terry. 1952. Rank analysis of incomplete block designs: I. The method of paired
comparisons. Biometrika 39: 324–345.
Bréard, A. 2019. Nine Chapters on Mathematical Modernity: Essay on the global historical entangle-
ments of the science of numbers in China, 1st ed. Cham: Springer.
Che, J., X. Sun, V. Gallardo, and M. Nadal. 2018. Cross-cultural empirical aesthetics. In The arts and
the brain: Psychology and physiology beyond pleasure (Progress in Brain Research, vol. 237), ed. J.
Christensen, A. Gomila, and V. Walsh, 77–103. Elsevier.
Cherniwchan, C., A. Ghassemi, and J. Keating. 2010. Is a mathematical proof beautiful? Mathematical
ethnographies project. [Video]. YouTube. https://www.youtube.com/watch?v=tANB6hXmY7U.
Cova, F., and N. Pain. 2012. Can folk aesthetics ground aesthetic realism? The Monist 95 (2): 241–263.
Davies, B., L. Alcock, and I. Jones. 2020. Comparative judgement, proof summaries and proof compre-
hension. Educational Studies in Mathematics 105 (2): 181–197.
Dreyfus, T., and T. Eisenberg. 1986. On the aesthetics of mathematical thought. For the Learning of
Mathematics 6 (1): 2–10.
Dutilh Novaes, C. 2019. The beauty (?) of mathematical proofs. In Advances in experimental philosophy
of logic and mathematics, ed. A. Aberdein and M. Inglis, 63–94. Bloomsbury.
Erdős, P. 1983. Combinatorial problems in geometry. Mathematical Chronicle 12 (1): 35–54.
Eysenck, H.J., and S. Iwawaki. 1971. Cultural relativity in aesthetic judgements: An empirical study. Per-
ceptual and Motor Skills 32 (3): 817–818.
Fan, L., and Y. Zhu. 2004. How have Chinese students performed in mathematics? A perspective from
large-scale international mathematics comparisons. In How Chinese learn mathematics: Perspectives
from insiders, ed. L. Fan, N.-Y. Wong, J. Cai, and S. Li, 3–26. World Scientific.
Fan, L., N.-Y. Wong, J. Cai, and S. Li. 2004. How Chinese learn mathematics: Perspectives. In How
Chinese learn mathematics: Perspectives from insiders, ed. L. Fan, N.-Y. Wong, J. Cai, and S. Li,
vii–xii. World Scientific.
Hardy, G. 1940. A Mathematician’s Apology, 1st ed. Cambridge University Press.
Hayn-Leichsenring, G., O. Vartanian, and A. Chatterjee. 2021. The role of expertise in the aesthetic
evaluation of mathematical equations. Psychological Research Psychologische Forschung.
https://doi.org/10.1007/s00426-021-01592-5.
Heine, S.J., D.R. Lehman, K. Peng, and J. Greenholtz. 2002. What’s wrong with cross-cultural com-
parisons of subjective likert scales?: The reference-group effect. Journal of Personality and Social
Psychology 82 (6): 903–918.
Heintz, C., and D. Taraborelli. 2010. Editorial: Folk epistemology. The cognitive bases of epistemic eval-
uation. Review of Philosophy and Psychology 1 (4): 477–482.
Hudecek, J. 2012. Ancient Chinese mathematics in action: Wu Wen-Tsun’s nationalist historicism after
the cultural revolution. East Asian Science, Technology and Society: An International Journal 6 (1):
41–64.
Hudecek, J. 2014. Reviving ancient Chinese mathematics: Mathematics, history and politics in the work of
Wu Wen-Tsun, 1st ed. Springer.
Inglis, M., and A. Aberdein. 2015. Beauty is not simplicity: An analysis of mathematicians’ proof
appraisals. Philosophia Mathematica 23 (1): 87–109.
Inglis, M., and A. Aberdein. 2016. Diversity in proof appraisal. In Mathematical cultures: The London
meetings 2012–2014, ed. B. Larvor, 163–179. Springer.
Inglis, M., and A. Aberdein. 2020. Are aesthetic judgements purely aesthetic? Testing the Social Con-
formity Account. ZDM 52 (6): 1127–1136.
Johnson, S.G.B., and S. Steinerberger. 2019. Intuitions about mathematical beauty: A case study in the
aesthetic experience of ideas. Cognition 189: 242–259.
Jones, I., M. Bisson, C. Gilmore, and M. Inglis. 2019. Measuring conceptual understanding in ran-
domised controlled trials: Can comparative judgement help? British Educational Research Journal
45 (3): 662–680.
Kim, M., and Y. Yuan. 2015. No cross-cultural differences in Gettier car case intuition: A replication
study of Weinberg et al. 2001. Episteme 12 (3): 355–361.
Knobe, J. 2007. Experimental philosophy. Philosophy Compass 2 (1): 81–92.
Knobe, J. 2019. Philosophical intuitions are surprisingly robust across demographic differences. Episte-
mology & Philosophy of Science 56 (2): 29–36.
Knobe, J. 2021. Philosophical intuitions are surprisingly stable across both demographic groups and
situations. Filozofia Nauki 29 (2): 11–76.
Lam, B. 2010. Are Cantonese-speakers really descriptivists? Revisiting cross-cultural semantics.
Cognition 115 (2): 320–329.
Laming, D. 2003. Human Judgment: The eye of the beholder, 1st ed. CENGAGE Learning.
Larvor, B. 2016. What are mathematical cultures? In Cultures of mathematics and logic, ed. S. Ju, B.
Löwe, T. Müller, and X. Yun, 1–22. Birkhäuser.
Lawlor, M. 1955. Cultural influences on preference for designs. The Journal of Abnormal and Social
Psychology 51 (3): 690–692.
Leung, F.K.S. 2001. In search of an East Asian identity in mathematics education. Educational Studies in
Mathematics 47 (1): 35–51.
Löwe, B., and B. Van Kerkhove. 2019. Methodological triangulation in empirical philosophy (of math-
ematics). In Advances in experimental philosophy of logic and mathematics, ed. A. Aberdein and M.
Inglis, 15–37. Bloomsbury.
Machery, E., R. Mallon, S. Nichols, and S. Stich. 2004. Semantics, cross-cultural style. Cognition 92 (3):
B1–B12.
Machery, E., S. Stich, D. Rose, A. Chatterjee, K. Karasawa, N. Struchiner, and T. Hashimoto. 2017.
Gettier across cultures. Noûs 51 (3): 645–664.
McAllister, J.W. 1996. Beauty and revolution in science. New York: Cornell University Press.
McAllister, J.W. 2005. Mathematical beauty and the evolution of the standards of mathematical proof. In
The Visual Mind II, ed. M. Emmer, 15–34. The MIT Press.
McElroy, W. 1952. Aesthetic appreciation in Aborigines of Arnhem Land: A comparative experimental
study. Oceania 23 (2): 81–95.
Mejía Ramos, J., T. Evans, C. Rittberg, and M. Inglis. 2021. Mathematicians’ assessments of the explana-
tory value of proofs. Axiomathes 31 (5): 575–599.
Montano, U. 2014. Explaining beauty in mathematics: An aesthetic theory of mathematics, 1st ed. Cham:
Springer.
Nelsen, R.B. 2000. Proofs without words II: More exercises in visual thinking, 1st ed. The Mathematical
Association of America.
Pearcy, D. 2020. Mathematical beauty: What is mathematical beauty and can anyone experience it?, 1st
ed. John Catt Educational.
Plato. 1993. Philebus. Hackett Publishing Company.
Pollack, I. 1952. The information of elementary auditory displays. Journal of the Acoustical Society of
America 24 (1): 745–749.
Rota, G.-C. 1997. The phenomenology of mathematical beauty. Synthese 111 (2): 171–182.
Seyedsayamdost, H. 2015. On normativity and epistemic intuitions: Failure of replication. Episteme 12
(1): 95–116.
Simoniti, V. 2017. Aesthetic properties as powers. European Journal of Philosophy 25 (4): 1434–1453.
Sinclair, N., and D. Pimm. 2006. A historical gaze at the mathematical aesthetic. In Mathematics and the
aesthetic: New approaches to an ancient affinity, ed. N. Sinclair, D. Pimm, and W. Higginson, 1–17.
Springer.
Soueif, M., and H. Eysenck. 1971. Cultural differences in aesthetic preferences. International Journal of
Psychology 6 (4): 293–298.
Starikova, I. 2017. Aesthetic preferences in mathematics: A case study. Philosophia Mathematica 26 (2):
161–183.
Stich, S. 2001. Plato’s method meets cognitive science. Free Inquiry 21 (2): 36–38.
Stich, S., and E. Machery. 2022. Demographic Differences in Philosophical Intuition: A reply to Joshua
Knobe. Review of Philosophy and Psychology. https://doi.org/10.1007/s13164-021-00609-7.
Tatarkiewicz, W. 1963. Objectivity and subjectivity in the history of aesthetics. Philosophy and Phenom-
enological Research 24 (2): 157–173.
Thurstone, L. 1928. Attitudes can be measured. The American Journal of Sociology 33 (4): 529–554.
Thurstone, L. 1994. A law of comparative judgement. Psychological Review 101 (2): 266–270.
Todd, C. 2008. Unmasking the truth beneath the beauty: Why the supposed aesthetic judgements made in
science may not be aesthetic at all. International Studies in the Philosophy of Science 22 (1): 61–79.
Weinberg, J., S. Nichols, and S. Stich. 2001. Normativity and epistemic intuitions. Philosophical Topics
29 (1): 429–460.
Wells, D. 1990. Are these the most beautiful? The Mathematical Intelligencer 12 (3): 37–41.
Zajonc, R.B. 2001. Mere Exposure: A gateway to the subliminal. Current Directions in Psychological
Science 10 (6): 224–228.
Zeki, S., J.P. Romaya, D.T. Benincasa, and M. Atiyah. 2014. The experience of mathematical beauty and
its neural correlates. Frontiers in Human Neuroscience 8 (1): 1–12.
Zong, C. 2020. Blaschke, Osgood, Wiener, Hadamard and the early development of modern mathematics
in China. 1–8. https://doi.org/10.48550/arXiv.2009.13688.
Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps
and institutional affiliations.