Yuri I. Manin
Max–Planck–Institut für Mathematik, Bonn, Germany,
Summary
The word “complexity” is most often used as a meta-linguistic expression referring to certain intuitive characteristics of a natural system and/or its scientific description. These characteristics may include: the sheer amount of data that must be taken into account; the visibly “chaotic” character of these data and/or of the spatial distribution/time evolution of a system, etc.
This talk is centered around the precise mathematical notion of “Kolmogorov complexity”, which originated in early theoretical computer science and measures the degree to which available information can be compressed.
In the first part, I will argue that a characteristic feature of basic scientific theo-
ries, from Ptolemy’s epicycles to the Standard Model of elementary particles, is their
splitting into two very distinct parts: the part of relatively small Kolmogorov com-
plexity (”laws”, ”basic equations”, ”periodic table”, ”natural selection, genotypes,
mutations”) and another part, of indefinitely large Kolmogorov complexity (”initial
and boundary conditions”, ”phenotypes”, “populations”). The data constituting
this latter part are obtained by planned observations, focussed experiments, and
afterwards collected in growing databases (formerly known as ”books”, ”tables”,
“encyclopaedias”, etc.). In this discussion Kolmogorov complexity plays the role of the central metaphor.
The second part and Appendix 1 are dedicated to more precise definitions and
examples of complexity.
Finally, the last part briefly touches upon attempts to deal directly with Kol-
mogorov complex massifs of data and the “End of Science” prophecies.
1 Talk at the Plenary Session of the Pontifical Academy of Sciences on “Complexity and Analogy in Science: Theoretical, Methodological and Epistemological Aspects”, Casina Pio IV, November 5–7, 2012.
gravity law $F = G\,\dfrac{m_1 m_2}{r^2}$, dynamic law $F = ma$,
and the resulting solution of the “two-body problem”, planets “started moving” along elliptical orbits (with the Sun at one focus rather than at the center). It is less well known to the general public that this approximation, too, is valid only insofar as we can consider negligible the gravitational forces with which the planets interact among themselves.
If we intend to obtain a more precise picture, we have to consider the system of differential equations defining the set of curves parametrized by time $t$ in the $6n$-dimensional phase space, where $n$ is the number of planets (including the Sun) taken into consideration:
$$
\frac{d^2 q_i}{dt^2} \;=\; \sum_{j \neq i} \frac{m_i m_j\,(q_i - q_j)}{|q_i - q_j|^3}
$$
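To connect this with the central metaphor of the talk, here is a minimal numerical sketch: the “law” above fits into a few lines of code, while the observed masses, positions and velocities that must be fed into it form the indefinitely growing part. The sketch assumes normalized units ($G = 1$), a naive leapfrog integrator, and the usual acceleration form of the equations; it is an illustration, not an astronomical tool:

```python
import numpy as np

def accelerations(q, masses, G=1.0):
    """Newtonian gravitational accelerations; q has shape (n, 3)."""
    a = np.zeros_like(q)
    n = len(masses)
    for i in range(n):
        for j in range(n):
            if i != j:
                d = q[j] - q[i]
                a[i] += G * masses[j] * d / np.linalg.norm(d) ** 3
    return a

def evolve(q0, v0, masses, dt=1e-3, steps=10_000):
    """Leapfrog (velocity Verlet) integration of the n-body equations.
    The short functions above are the 'theory'; q0, v0, masses are the data."""
    q, v = q0.astype(float).copy(), v0.astype(float).copy()
    a = accelerations(q, masses)
    for _ in range(steps):
        v += 0.5 * dt * a
        q += dt * v
        a = accelerations(q, masses)
        v += 0.5 * dt * a
    return q, v
```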
retical construction in the framework of the Quantum Field Theory. The Standard
Model got its first important experimental correlates with the discovery of quarks
(components of nuclear “elementary” particles) and W and Z bosons, quanta of
interactions. For a very rich and complex history of this stage of theoretical physics, stressing the role of experiments and experimenters, see the fascinating account [Zi] by Antonio Zichichi. The Standard Model recently reappeared on the front pages of the world press thanks to the renewed hope that the last critically missing component of the Model, the Higgs boson, has finally been observed.
Somewhat paradoxically, one can say that the mathematics of the Standard Model is firmly based on the same ancient archetypes of human thought as that of Hipparchus and Ptolemy: symmetry and uniform movement along circles.
More precisely, the basic idea of symmetry in modern classical (as opposed to quantum) non-relativistic physics involves the symmetry group of rigid movements of three-dimensional Euclidean space, that is, combinations of parallel shifts and rotations around a point. The group of rotations is denoted SO(3), and celestial spheres are the unique objects invariant with respect to rotations. Passing from Hipparchus and Ptolemy to modernity involves two decisive steps: adding shifts (the Earth, and then the Sun, cease being centers of the Universe), and, crucially, understanding the new meta-law of physics: symmetry must govern the laws of physics themselves rather than the objects/processes, etc., that these laws are supposed to govern (such as the Solar System).
When we now pass to quantum mechanics, and further to Quantum Field Theory (not involving gravitation), the group SO(3) (together with shifts) should be extended, in particular, by several copies of such groups as SU(2) and SU(3), describing rotations in the internal degrees of freedom of elementary particles, such as spin, colour, etc. The basic “law” that should be invariant with respect to this big group is encoded in the Lagrangian density: it is a “mathematical formula” that is considerably longer than anything we are exposed to in our high school and even college courses: cf. Appendix 2.
Finally, the Ptolemaic celestial movements, superpositions of rotations of rigid spheres, now transcend our space-time and happen in the infinite-dimensional Hilbert space of wave functions: this is the image describing, say, a hydrogen atom in the paradigm of the first decades of the twentieth century.
Scientific laws (at least those that are expressed by mathematical constructions)
can be considered as programs for computation, whereas observations produce in-
puts to these programs.
Outputs of these computations serve first to check/establish the domain of applicability of our theories. We compare the predicted behaviour of a system with the observed one; we are happy when our predictions agree quantitatively and/or qualitatively with the observed behaviour; and we set up the border signs signalling that at this point we have gone too far.
Afterwards, the outputs are used for practical/theoretical purposes, e.g. in engineering, weather prediction, etc., but also to formulate new challenges arising before scientific thinking.
This comparison of scientific laws with programs is, of course, only a metaphor, but it will also allow us to construct a precise model of the kind of complexity inherently associated with this metaphor of science: Kolmogorov complexity.
The next section is dedicated to a sketch of this notion in the framework of mathematics, again in its historical perspective.
is approximately the logarithm of the number of strokes (in the base 10 or 2 re-
spectively).
More generally, we can speak about “size”, or “volume” of any finite text based
upon a fixed finite alphabet.
The discovery of this logarithmic upper bound of the Kolmogorov complexity of
numbers was a leap in the development of humanity on the scale of civilizations.
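To make the bound concrete (a standard observation, stated here only as an illustration): the length of the base-$b$ notation of a positive integer $n$ is
$$
\ell_b(n) = \lfloor \log_b n \rfloor + 1,
$$
so, for instance, $n = 10^{10}$ occupies eleven decimal digits, whereas recording the same number by strokes would require $10^{10}$ of them.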
However, if one makes some slight additional conventions in the system of notation, it will turn out that some integers admit a much shorter notation. For example, let us allow ourselves to use the vertical dimension and write, e.g. $10^{10^{10}}$. The logarithm of the last number is about $10^{10}$, much larger than the length of the notation, for which we used only 6 signs! And if we are unhappy about non-linear notation, we may add to the basic alphabet two brackets ( and ) and postulate that $a(b)$ means $a^b$. Then $10^{10^{10}}$ will be linearly written as 10(10(10)), using only 10 signs, still much less than the $10^{10} + 1$ decimal digits (of course, $10^{10}$ of them will be just zeroes).
Then, perhaps, all integers can be produced by notations/programs that are much shorter than the logarithm of their size?
No! It turns out that the absolute majority of numbers (or texts) cannot be significantly compressed, although infinitely many integers can be written in a much shorter way than can be done in any chosen system of positional notation.
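A short counting argument, standard in this theory, shows why this must be so. Writing $K(x)$ for the length of the shortest program producing the string $x$, descriptions are themselves strings, and there are too few short ones:
$$
\#\{\text{binary programs of length} < n - c\} \;\le\; 2^{n-c} - 1,
\qquad\text{hence}\qquad
\frac{\#\{x \in \{0,1\}^n : K(x) < n - c\}}{2^n} \;<\; 2^{-c}.
$$
Thus fewer than one string in a thousand of any given length can be compressed by even 10 bits.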
If we leave the domain of integers and leap to, say, such a number as $\pi = 3.1415926\ldots$, it looks as if it had infinite complexity. However, this is not so. There exists a program that can take as input the (variable) position of a decimal digit (an integer) and give as output the respective digit. Such a program is itself a text in a chosen algorithmic language, and as such it also has a complexity: its own Kolmogorov complexity. One agrees that this is the complexity of $\pi$.
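For illustration only: one concrete program of this kind is the streaming (“spigot”) algorithm of Gibbons, sketched below in Python. The point is not the particular algorithm but the fact that a short, fixed text determines every decimal digit of $\pi$.

```python
from itertools import islice

def pi_digits():
    """Yield the decimal digits of pi one by one (Gibbons' streaming spigot)."""
    q, r, t, k, m, x = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < m * t:
            yield m                      # the next digit is settled: emit it
            q, r, t, k, m, x = (10 * q, 10 * (r - m * t), t, k,
                                (10 * (3 * q + r)) // t - 10 * m, x)
        else:                            # otherwise consume one more term of the series
            q, r, t, k, m, x = (q * k, (2 * q + r) * x, t * x, k + 1,
                                (q * (7 * k + 2) + r * x) // (t * x), x + 2)

print(list(islice(pi_digits(), 10)))     # [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
```

The Kolmogorov complexity of $\pi$ is thus bounded by the length of such a text (plus a constant), however many digits one asks for.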
A reader should be aware that I have left many subtle points of the definition of Kolmogorov complexity in the shadow, in particular the fact that its dependence on the chosen system of encoding and on the computation model can change it only by a bounded quantity, etc. A reader who would like to see some more mathematics about this matter is referred to the brief Appendix 1 and the relevant references.
Here I will mention two other remarkable facts related to the Kolmogorov com-
plexity of numbers: one regarding its unexpected relation to the idea of randomness,
and another one showing that some psychological data make explicit the role of this
complexity in the cognitive activity of our mind.
Complexity and randomness. Consider arbitrarily long finite sequences of
zeroes and ones, say, starting with one so that each such sequence could be inter-
preted as a binary notation of an integer.
There is an intuitive notion of “randomness” of such a sequence. In contemporary technology, “random” sequences of digits and similar random objects are used for encoding information in order to make it inaccessible to third parties. In fact, a small distributed industry producing such random sequences (and, say, random big primes) has been created. A standard way to produce random objects is to leave mathematics and to resort to physics: from tossing a coin to registering white noise.
One remarkable property of Kolmogorov complexity is this: those sequences of digits whose Kolmogorov complexity is approximately the same as their length are random in any meaningful sense of the word. In particular, they cannot be generated by a program essentially shorter than the sequence itself.
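Kolmogorov complexity itself is uncomputable, but its flavour can be tasted with an ordinary compressor, whose output length is only a crude upper-bound proxy. A minimal sketch, assuming Python's standard zlib module as the stand-in compressor:

```python
import os
import zlib

def compressed_length(data: bytes) -> int:
    # Length after generic lossless compression: an upper-bound proxy,
    # not Kolmogorov complexity itself (which is uncomputable).
    return len(zlib.compress(data, 9))

n = 10_000
random_bytes = os.urandom(n)                      # physically sourced randomness
periodic_bytes = bytes(i % 2 for i in range(n))   # 0,1,0,1,... pattern

print(compressed_length(random_bytes))    # close to n: essentially incompressible
print(compressed_length(periodic_bytes))  # a few dozen bytes: highly compressible
```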
Complexity and human mind. In the history of humanity, the discovery of the laws of classical and quantum physics, which represent an incredible compression of complex information, stresses the role of Kolmogorov complexity, at least as a relevant metaphor for understanding the laws of cognition.
In his very informative book [De], Stanislas Dehaene considers certain experimental results about the statistics of the appearance of numerals and other names of numbers; cf. especially pp. 110–115, subsection “Why are some numerals more frequent than others?”.
As mathematicians, let us consider the following abstract question: can one say anything non-obvious about possible probability distributions on the set of all natural numbers? More precisely, one such distribution is a sequence of non-negative real numbers $p_n$, $n = 1, 2, \ldots$, such that $\sum_n p_n = 1$. Of course, from the last formula it follows that $p_n$ must tend to zero as $n$ tends to infinity; moreover, $p_n$ cannot tend to zero too slowly: for example, $p_n = n^{-1}$ will not do. But two different distributions can be widely incomparable.
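Two concrete sequences illustrate where the border lies (a standard calculus fact, added only as an example):
$$
\sum_{n=1}^{\infty} \frac{1}{n} = \infty,
\qquad\text{while}\qquad
\sum_{n=1}^{\infty} \frac{1}{n(n+1)} = \sum_{n=1}^{\infty}\left(\frac{1}{n} - \frac{1}{n+1}\right) = 1,
$$
so no multiple of $n^{-1}$ can be normalized into a distribution, whereas $p_n = \frac{1}{n(n+1)}$ is a perfectly good (and computable) one.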
Remarkably, it turns out that if we restrict our class of distributions to the computable from below ones, that is, those in which $p_n$ can be computed as a function of $n$ (in a certain precise sense), then there is a distinguished and small subclass $C$ of such distributions that are, in a sense, maximal ones. Any member $(p_n)$ of this class has the following unexpected property (see [Lev]):
the probability pn of the number n, up to a bounded (from above and below)
factor, equals the inverse of the exponentiated Kolmogorov complexity of n.
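In formula form, with $K$ denoting the (prefix-free, see below) Kolmogorov complexity, the statement says that there are constants $c_1, c_2 > 0$ such that
$$
c_1\, 2^{-K(n)} \;\le\; p_n \;\le\; c_2\, 2^{-K(n)} \qquad\text{for all } n.
$$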
This statement needs additional qualifications: the most important one is that
we need here not the original Kolmogorov complexity but the so called prefix–free
version of it. We omit technical details, because they are not essential here. But
the following properties of any distribution (pn ) ∈ C are worth stressing in our
context:
(i) Most of the numbers $n$, those that are Kolmogorov “maximally complex”, appear with probability comparable with $n^{-1}(\log n)^{-1-\varepsilon}$, with a small $\varepsilon$: “most large numbers appear with frequency inverse to their size” (in fact, a somewhat smaller one).
(ii) However, the frequencies of those numbers that are Kolmogorov very simple, such as $10^3$ (thousand), $10^6$ (million), $10^9$ (billion), produce sharp local peaks in the graph of $(p_n)$; a toy numerical sketch of this effect follows below.
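These peaks can be imitated with a deliberately naive toy model: as a stand-in for the uncomputable complexity, take the length of the shortest notation for $n$ among plain decimal notation and expressions of the form a*10**k, and weight $n$ by $2^{-\text{length}}$. Everything in the sketch below (the proxy, the weighting) is an illustrative assumption, not Levin's construction:

```python
def proxy_description_length(n: int) -> int:
    """A toy stand-in for K(n): length of the shortest of a few ad hoc notations."""
    s = str(n)
    candidates = [s]                       # plain decimal notation
    stripped = s.rstrip('0')
    zeros = len(s) - len(stripped)
    if zeros >= 2:                         # round numbers get short names, e.g. 10**6
        candidates.append(f"10**{zeros}" if stripped == "1"
                          else f"{stripped}*10**{zeros}")
    return min(len(c) for c in candidates)

def toy_weight(n: int) -> float:
    return 2.0 ** (-proxy_description_length(n))

for n in (999_998, 999_999, 1_000_000, 1_000_001, 1_000_002):
    print(n, proxy_description_length(n), toy_weight(n))
# 1_000_000 admits the 5-symbol name 10**6 and gets a weight several times
# larger than its 7-digit neighbours: a local peak of the kind described in (ii).
```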
The reader may compare these properties of the discussed class of distributions, which can be called a priori distributions, with the observed frequencies of numerals (number words) in printed and oral texts in various languages: cf. Dehaene, loc. cit., p. 111, Figure 4.4. To me, their qualitative agreement looks very convincing: brains and their societies do reproduce a priori probabilities.
Notice that those parts of the Dehaene and Mehler graphs in loc. cit. that refer to large numbers are somewhat misleading: they might create the impression that the frequencies of the numerals, say, between $10^6$ and $10^9$ smoothly interpolate between those of $10^6$ and $10^9$ themselves, whereas in fact they drop down abruptly.
Finally, I want to stress that the class of a priori probability distributions that we are considering here is qualitatively distinct from those that now form a common stock of sociological and sometimes scientific analysis: cf. the beautiful synopsis by Terence Tao in [Ta]. The appeal to the uncomputable degree of maximal compression is exactly what can make such a distribution an eye-opener. As I have written at the end of [Ma2]:
“One can argue that all cognitive activity of our civilization, based upon symbolic
(in particular, mathematical) representations of reality, deals actually with the
initial Kolmogorov segments of potentially infinite linguistic constructions, always
replacing vast volumes of data by their compressed descriptions. This is especially
visible in the outputs of the modern genome projects.
The message of this essay, written by the Editor-in-Chief Chris Anderson, was summarized in the following words:
“The new availability of huge amounts of data, along with statistical tools to crunch these numbers, offers a whole new way of understanding the world. Correlation supersedes causation, and science can advance even without coherent models, unified theories, or really any mechanical explanation at all. There’s no reason to cling to our old ways. It’s time to ask: What can science learn from Google?”
I will return to this rhetorical question at the end of this talk. Right now I want only to stress that, just as in the scientific models of “bygone days”, basic theory is unavoidable in this brave new Petabyte World: encoding and decoding of data, search algorithms, and of course computers themselves are just engineering embodiments of some very basic and very abstract notions of mathematics. The mathematical idea underlying the structure of modern computers is the Turing machine (or one of several other equivalent formulations of the concept of computability). We know that the universal Turing machine has a very small Kolmogorov complexity, and therefore, using the basic metaphor of this talk, we can say that the bipartite structure of the classical scientific theories is reproduced at this historical stage.
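To make the remark about universality concrete: the core of a Turing-machine interpreter is itself a very short text; fed with a coded description of any particular machine (its transition table) plus that machine's input, it reproduces that machine's work. The sketch below is a generic simulator written for illustration, with all names and conventions chosen ad hoc:

```python
def run_turing_machine(rules, tape, state="start", blank="_", max_steps=100_000):
    """rules: dict (state, symbol) -> (new_state, written_symbol, move),
    with move = -1 (left) or +1 (right); the machine halts on a missing rule."""
    cells = dict(enumerate(tape))          # sparse tape, unbounded in both directions
    head = 0
    for _ in range(max_steps):
        symbol = cells.get(head, blank)
        if (state, symbol) not in rules:
            break                          # no applicable rule: halt
        state, cells[head], move = rules[(state, symbol)]
        head += move
    positions = sorted(cells)
    return "".join(cells[i] for i in range(positions[0], positions[-1] + 1))

# Example machine: flip 0s and 1s until the first blank is reached.
flip = {
    ("start", "0"): ("start", "1", +1),
    ("start", "1"): ("start", "0", +1),
}
print(run_turing_machine(flip, "0110"))    # -> 1001
```

The interpreter, not any particular table, plays the role of the universal machine here, and its own description is short; the tables and the tapes are the potentially unbounded data.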
Moreover, what Chris Anderson calls “the new availability of huge amounts of data” is by itself not very new: after the spread of printing, astronomical observatories, scientific laboratories, and statistical studies, the amount of data available to any visitor of a big public library was always huge, and studies of correlations have proliferated for at least the last two centuries.
Charles Darwin himself collected the database of his observations, and the result
of his pondering over it was the theory of evolution.
Coda
What can science learn from Google:
“Think! Otherwise no Google will help you.”
References
[An] Ch. Anderson. The End of Theory. In: Wired, 17.06, 2008.
[ChCoMa] A. H. Chamseddine, A. Connes, M. Marcolli. Gravity and the standard model with neutrino mixing. Adv. Theor. Math. Phys. 11 (2007), 991–1089. arXiv:hep-th/0610241
[De] S. Dehaene. The Number Sense. How the Mind creates Mathematics. Ox-
ford UP, 1997.
[FlFoHaSCH] R. Floud, R. W. Fogel, B. Harris, Sok Chul Hong. The Changing
Body: Health, Nutrition and Human Development in the Western World Since
1700. Cambridge UP, 431 pp.
[Gr] J. Groopman. The Body and the Human Progress. In: NYRB, Oct. 27,
2011
[Lev] L. A. Levin. Various measures of complexity for finite objects (axiomatic description). Soviet Math. Dokl. 17 (1976), no. 2, 522–526.
[Ma1] Yu. Manin. A Course of Mathematical Logic for Mathematicians. 2nd
Edition, with new Chapters written by Yu. Manin and B. Zilber. Springer, 2010.
[Ma2] Yu. Manin. Renormalization and Computation II: Time Cut-off and the
Halting Problem. In: Math. Struct. in Comp. Science, pp. 1–23, 2012, Cambridge
UP. Preprint math.QA/0908.3430
[Pa] D. Park. The How and the Why. An Essay on the Origins and Development
of Physical Theory. Princeton UP, 1988.
[Ta] T. Tao. E pluribus unum: From Complexity, Universality. Daedalus, Journ.
of the AAAS, Summer 2012, 23–34.
[Zi] A. Zichichi. Subnuclear Physics. The first 50 years: Highlights from Erice
to ELN. World Scientific, 1999.
$$
P(x_1, \ldots, x_m;\; y_1, \ldots, y_n;\; t_1, \ldots, t_q) \in \mathbb{Z}[x, y, t]
$$
$$
\begin{aligned}
\mathcal{L}_{SM} ={}& -\tfrac{1}{2}\partial_\nu g^a_\mu \partial_\nu g^a_\mu - g_s f^{abc}\partial_\mu g^a_\nu g^b_\mu g^c_\nu - \tfrac{1}{4} g_s^2 f^{abc} f^{ade} g^b_\mu g^c_\nu g^d_\mu g^e_\nu - \partial_\nu W^+_\mu \partial_\nu W^-_\mu - M^2 W^+_\mu W^-_\mu \\
& -\tfrac{1}{2}\partial_\nu Z^0_\mu \partial_\nu Z^0_\mu - \tfrac{1}{2c_w^2} M^2 Z^0_\mu Z^0_\mu - \tfrac{1}{2}\partial_\mu A_\nu \partial_\mu A_\nu \\
& - igc_w\bigl(\partial_\nu Z^0_\mu(W^+_\mu W^-_\nu - W^+_\nu W^-_\mu) - Z^0_\nu(W^+_\mu\partial_\nu W^-_\mu - W^-_\mu\partial_\nu W^+_\mu) + Z^0_\mu(W^+_\nu\partial_\nu W^-_\mu - W^-_\nu\partial_\nu W^+_\mu)\bigr) \\
& - igs_w\bigl(\partial_\nu A_\mu(W^+_\mu W^-_\nu - W^+_\nu W^-_\mu) - A_\nu(W^+_\mu\partial_\nu W^-_\mu - W^-_\mu\partial_\nu W^+_\mu) + A_\mu(W^+_\nu\partial_\nu W^-_\mu - W^-_\nu\partial_\nu W^+_\mu)\bigr) \\
& - \tfrac{1}{2} g^2 W^+_\mu W^-_\mu W^+_\nu W^-_\nu + \tfrac{1}{2} g^2 W^+_\mu W^-_\nu W^+_\mu W^-_\nu + g^2 c_w^2\bigl(Z^0_\mu W^+_\mu Z^0_\nu W^-_\nu - Z^0_\mu Z^0_\mu W^+_\nu W^-_\nu\bigr) \\
& + g^2 s_w^2\bigl(A_\mu W^+_\mu A_\nu W^-_\nu - A_\mu A_\mu W^+_\nu W^-_\nu\bigr) + g^2 s_w c_w\bigl(A_\mu Z^0_\nu(W^+_\mu W^-_\nu - W^+_\nu W^-_\mu) - 2 A_\mu Z^0_\mu W^+_\nu W^-_\nu\bigr) \\
& - \tfrac{1}{2}\partial_\mu H\partial_\mu H - 2 M^2\alpha_h H^2 - \partial_\mu\phi^+\partial_\mu\phi^- - \tfrac{1}{2}\partial_\mu\phi^0\partial_\mu\phi^0 \\
& - \beta_h\Bigl(\tfrac{2M^2}{g^2} + \tfrac{2M}{g}H + \tfrac{1}{2}\bigl(H^2 + \phi^0\phi^0 + 2\phi^+\phi^-\bigr)\Bigr) + \tfrac{2M^4}{g^2}\alpha_h - g\alpha_h M\bigl(H^3 + H\phi^0\phi^0 + 2H\phi^+\phi^-\bigr) \\
& - \tfrac{1}{8} g^2\alpha_h\bigl(H^4 + (\phi^0)^4 + 4(\phi^+\phi^-)^2 + 4(\phi^0)^2\phi^+\phi^- + 4H^2\phi^+\phi^- + 2(\phi^0)^2 H^2\bigr) \\
& - gMW^+_\mu W^-_\mu H - \tfrac{1}{2} g\tfrac{M}{c_w^2} Z^0_\mu Z^0_\mu H - \tfrac{1}{2} ig\bigl(W^+_\mu(\phi^0\partial_\mu\phi^- - \phi^-\partial_\mu\phi^0) - W^-_\mu(\phi^0\partial_\mu\phi^+ - \phi^+\partial_\mu\phi^0)\bigr) \\
& + \tfrac{1}{2} g\bigl(W^+_\mu(H\partial_\mu\phi^- - \phi^-\partial_\mu H) + W^-_\mu(H\partial_\mu\phi^+ - \phi^+\partial_\mu H)\bigr) + \tfrac{1}{2}\tfrac{g}{c_w} Z^0_\mu(H\partial_\mu\phi^0 - \phi^0\partial_\mu H) \\
& + M\Bigl(\tfrac{1}{c_w} Z^0_\mu\partial_\mu\phi^0 + W^+_\mu\partial_\mu\phi^- + W^-_\mu\partial_\mu\phi^+\Bigr) - ig\tfrac{s_w^2}{c_w} M Z^0_\mu(W^+_\mu\phi^- - W^-_\mu\phi^+) + igs_w M A_\mu(W^+_\mu\phi^- - W^-_\mu\phi^+) \\
& - ig\tfrac{1-2c_w^2}{2c_w} Z^0_\mu(\phi^+\partial_\mu\phi^- - \phi^-\partial_\mu\phi^+) + igs_w A_\mu(\phi^+\partial_\mu\phi^- - \phi^-\partial_\mu\phi^+) \\
& - \tfrac{1}{4} g^2 W^+_\mu W^-_\mu\bigl(H^2 + (\phi^0)^2 + 2\phi^+\phi^-\bigr) - \tfrac{1}{8}\tfrac{g^2}{c_w^2} Z^0_\mu Z^0_\mu\bigl(H^2 + (\phi^0)^2 + 2(2s_w^2 - 1)^2\phi^+\phi^-\bigr) \\
& - \tfrac{1}{2} g^2\tfrac{s_w^2}{c_w} Z^0_\mu\phi^0(W^+_\mu\phi^- + W^-_\mu\phi^+) - \tfrac{1}{2} ig^2\tfrac{s_w^2}{c_w} Z^0_\mu H(W^+_\mu\phi^- - W^-_\mu\phi^+) \\
& + \tfrac{1}{2} g^2 s_w A_\mu\phi^0(W^+_\mu\phi^- + W^-_\mu\phi^+) + \tfrac{1}{2} ig^2 s_w A_\mu H(W^+_\mu\phi^- - W^-_\mu\phi^+) \\
& - g^2\tfrac{s_w}{c_w}(2c_w^2 - 1) Z^0_\mu A_\mu\phi^+\phi^- - g^2 s_w^2 A_\mu A_\mu\phi^+\phi^- \\
& + \tfrac{1}{2} ig_s\lambda^a_{ij}(\bar q^\sigma_i\gamma^\mu q^\sigma_j) g^a_\mu - \bar e^\lambda(\gamma\partial + m^\lambda_e)e^\lambda - \bar\nu^\lambda(\gamma\partial + m^\lambda_\nu)\nu^\lambda - \bar u^\lambda_j(\gamma\partial + m^\lambda_u)u^\lambda_j - \bar d^\lambda_j(\gamma\partial + m^\lambda_d)d^\lambda_j \\
& + igs_w A_\mu\bigl(-(\bar e^\lambda\gamma^\mu e^\lambda) + \tfrac{2}{3}(\bar u^\lambda_j\gamma^\mu u^\lambda_j) - \tfrac{1}{3}(\bar d^\lambda_j\gamma^\mu d^\lambda_j)\bigr) \\
& + \tfrac{ig}{4c_w} Z^0_\mu\bigl\{(\bar\nu^\lambda\gamma^\mu(1+\gamma^5)\nu^\lambda) + (\bar e^\lambda\gamma^\mu(4s_w^2 - 1 - \gamma^5)e^\lambda) + (\bar d^\lambda_j\gamma^\mu(\tfrac{4}{3}s_w^2 - 1 - \gamma^5)d^\lambda_j) + (\bar u^\lambda_j\gamma^\mu(1 - \tfrac{8}{3}s_w^2 + \gamma^5)u^\lambda_j)\bigr\} \\
& + \tfrac{ig}{2\sqrt{2}} W^+_\mu\bigl((\bar\nu^\lambda\gamma^\mu(1+\gamma^5)U^{lep}_{\lambda\kappa}e^\kappa) + (\bar u^\lambda_j\gamma^\mu(1+\gamma^5)C_{\lambda\kappa}d^\kappa_j)\bigr) \\
& + \tfrac{ig}{2\sqrt{2}} W^-_\mu\bigl((\bar e^\kappa U^{lep\,\dagger}_{\kappa\lambda}\gamma^\mu(1+\gamma^5)\nu^\lambda) + (\bar d^\kappa_j C^\dagger_{\kappa\lambda}\gamma^\mu(1+\gamma^5)u^\lambda_j)\bigr) \\
& + \tfrac{ig}{2M\sqrt{2}}\,\phi^+\bigl(-m^\kappa_e(\bar\nu^\lambda U^{lep}_{\lambda\kappa}(1-\gamma^5)e^\kappa) + m^\lambda_\nu(\bar\nu^\lambda U^{lep}_{\lambda\kappa}(1+\gamma^5)e^\kappa)\bigr) \\
& + \tfrac{ig}{2M\sqrt{2}}\,\phi^-\bigl(m^\lambda_e(\bar e^\lambda U^{lep\,\dagger}_{\lambda\kappa}(1+\gamma^5)\nu^\kappa) - m^\kappa_\nu(\bar e^\lambda U^{lep\,\dagger}_{\lambda\kappa}(1-\gamma^5)\nu^\kappa)\bigr) \\
& - \tfrac{g}{2}\tfrac{m^\lambda_\nu}{M} H(\bar\nu^\lambda\nu^\lambda) - \tfrac{g}{2}\tfrac{m^\lambda_e}{M} H(\bar e^\lambda e^\lambda) + \tfrac{ig}{2}\tfrac{m^\lambda_\nu}{M}\phi^0(\bar\nu^\lambda\gamma^5\nu^\lambda) - \tfrac{ig}{2}\tfrac{m^\lambda_e}{M}\phi^0(\bar e^\lambda\gamma^5 e^\lambda) \\
& - \tfrac{1}{4}\,\bar\nu_\lambda M^R_{\lambda\kappa}(1-\gamma_5)\hat\nu_\kappa - \tfrac{1}{4}\,\overline{\bar\nu_\lambda M^R_{\lambda\kappa}(1-\gamma_5)\hat\nu_\kappa} \\
& + \tfrac{ig}{2M\sqrt{2}}\,\phi^+\bigl(-m^\kappa_d(\bar u^\lambda_j C_{\lambda\kappa}(1-\gamma^5)d^\kappa_j) + m^\lambda_u(\bar u^\lambda_j C_{\lambda\kappa}(1+\gamma^5)d^\kappa_j)\bigr) \\
& + \tfrac{ig}{2M\sqrt{2}}\,\phi^-\bigl(m^\lambda_d(\bar d^\lambda_j C^\dagger_{\lambda\kappa}(1+\gamma^5)u^\kappa_j) - m^\kappa_u(\bar d^\lambda_j C^\dagger_{\lambda\kappa}(1-\gamma^5)u^\kappa_j)\bigr) \\
& - \tfrac{g}{2}\tfrac{m^\lambda_u}{M} H(\bar u^\lambda_j u^\lambda_j) - \tfrac{g}{2}\tfrac{m^\lambda_d}{M} H(\bar d^\lambda_j d^\lambda_j) + \tfrac{ig}{2}\tfrac{m^\lambda_u}{M}\phi^0(\bar u^\lambda_j\gamma^5 u^\lambda_j) - \tfrac{ig}{2}\tfrac{m^\lambda_d}{M}\phi^0(\bar d^\lambda_j\gamma^5 d^\lambda_j)
\end{aligned}
$$