First Order Theories
Articles
Logic
Propositional calculus  First-order logic  Second-order logic  Decidability (logic)  List of first-order theories  Complete theory  Gödel's completeness theorem  Gödel's incompleteness theorems  Recursively enumerable set  Model theory  Compactness theorem  Löwenheim–Skolem theorem  Elementary class  Saturated model  Kripke semantics  Forcing (mathematics)  Proof theory  Hilbert system  Natural deduction  Sequent calculus  Resolution (logic)  Method of analytic tableaux  Boolean satisfiability problem  Satisfiability Modulo Theories
Arithmetic
Presburger arithmetic Robinson arithmetic Peano axioms Gentzen's consistency proof Second-order arithmetic Reverse mathematics
Set theory
General set theory  Kripke–Platek set theory  Zermelo set theory  Ackermann set theory  Zermelo–Fraenkel set theory  Von Neumann–Bernays–Gödel set theory  Morse–Kelley set theory  New Foundations  Scott–Potter set theory  Positive set theory  Axiom of choice  Axiom of dependent choice  Continuum hypothesis  Martin's axiom  Diamond principle  Clubsuit  Axiom of constructibility  Proper forcing axiom
References
Article Sources and Contributors  Image Sources, Licenses and Contributors
Article Licenses
License
Logic
Propositional calculus
In mathematical logic, a propositional calculus or logic (also called sentential calculus or sentential logic) is a formal system in which formulas of a formal language may be interpreted as representing propositions. A system of inference rules and axioms allows certain formulas to be derived, called theorems, which may be interpreted as true propositions. The series of formulas that is constructed within such a system is called a derivation, and the last formula of the series is a theorem, whose derivation may be interpreted as a proof of the truth of the proposition represented by the theorem. Truth-functional propositional logic is a propositional logic whose interpretation limits the truth values of its propositions to two, usually true and false. Truth-functional propositional logic and systems isomorphic to it are considered to be zeroth-order logic.
Terminology
In general terms, a calculus is a formal system that consists of a set of syntactic expressions (well-formed formulas or wffs), a distinguished subset of these expressions (axioms), plus a set of formal rules that define a specific binary relation, intended to be interpreted as logical equivalence, on the space of expressions. When the formal system is intended to be a logical system, the expressions are meant to be interpreted as statements, and the rules, known as inference rules, are typically intended to be truth-preserving. In this setting, the rules (which may include axioms) can then be used to derive ("infer") formulas representing true statements from given formulas representing true statements. The set of axioms may be empty, a nonempty finite set, a countably infinite set, or be given by axiom schemata. A formal grammar recursively defines the expressions and well-formed formulas (wffs) of the language. In addition a semantics may be given which defines truth and valuations (or interpretations). The language of a propositional calculus consists of 1. a set of primitive symbols, variously referred to as atomic formulas, placeholders, proposition letters, or variables, and 2. a set of operator symbols, variously interpreted as logical operators or logical connectives. A well-formed formula (wff) is any atomic formula, or any formula that can be built up from atomic formulas by means of operator symbols according to the rules of the grammar. Mathematicians sometimes distinguish between propositional constants, propositional variables, and schemata. Propositional constants represent some particular proposition, while propositional variables range over the set of all atomic propositions. Schemata, however, range over all propositions. It is common to represent propositional constants by A, B, and C, propositional variables by P, Q, and R, and schematic letters are often Greek letters, most often φ, ψ, and χ.
Basic concepts
The following outlines a standard propositional calculus. Many different formulations exist which are all more or less equivalent but differ in the details of:
1. their language, that is, the particular collection of primitive symbols and operator symbols,
2. the set of axioms, or distinguished formulas, and
3. the set of inference rules.
We may represent any given proposition with a letter which we call a propositional constant, analogous to representing a number by a letter in mathematics, for instance, a = 5. We require that all propositions have exactly one of two truth-values: true or false. To take an example, let P be the proposition that it is raining outside. This will be true if it is raining outside and false otherwise.
We then define truth-functional operators, beginning with negation. We write ¬P to represent the negation of P, which can be thought of as the denial of P. In the example above, ¬P expresses that it is not raining outside, or by a more standard reading: "It is not the case that it is raining outside." When P is true, ¬P is false; and when P is false, ¬P is true. ¬¬P always has the same truth-value as P.
Conjunction is a truth-functional connective which forms a proposition out of two simpler propositions, for example, P and Q. The conjunction of P and Q is written P ∧ Q, and expresses that each is true. We read P ∧ Q as "P and Q". For any two propositions, there are four possible assignments of truth values:
1. P is true and Q is true
2. P is true and Q is false
3. P is false and Q is true
4. P is false and Q is false
The conjunction of P and Q is true in case 1 and is false otherwise. Where P is the proposition that it is raining outside and Q is the proposition that a cold-front is over Kansas, P ∧ Q is true when it is raining outside and there is a cold-front over Kansas. If it is not raining outside, then P ∧ Q is false; and if there is no cold-front over Kansas, then P ∧ Q is also false.
Disjunction resembles conjunction in that it forms a proposition out of two simpler propositions. We write it P ∨ Q, and it is read "P or Q". It expresses that either P or Q is true. Thus, in the cases listed above, the disjunction of P and Q is true in all cases except case 4. Using the example above, the disjunction expresses that it is either raining outside or there is a cold-front over Kansas. (Note, this use of disjunction is supposed to resemble the use of the English word "or". However, it is most like the English inclusive "or", which can be used to express the truth of at least one of two propositions. It is not like the English exclusive "or", which expresses the truth of exactly one of two propositions. That is to say, the exclusive "or" is false when both P and Q are true (case 1). An example of the exclusive or is: You may have a bagel or a pastry, but not both. Sometimes, given the appropriate context, the addendum "but not both" is omitted but implied.)
Material conditional also joins two simpler propositions, and we write P → Q, which is read "if P then Q". The proposition to the left of the arrow is called the antecedent and the proposition to the right is called the consequent. (There is no such designation for conjunction or disjunction, since they are commutative operations.) It expresses that Q is true whenever P is true. Thus it is true in every case above except case 2, because this is the only case when P is true but Q is not. Using the example, "if P then Q" expresses that if it is raining outside then there is a cold-front over Kansas. The material conditional is often confused with physical causation. The material conditional, however, only relates two propositions by their truth-values, which is not the relation of cause and effect. It is contentious in the literature whether the material implication represents logical causation.
Biconditional joins two simpler propositions, and we write P ↔ Q, which is read "P if and only if Q". It expresses that P and Q have the same truth-value; thus "P if and only if Q" is true in cases 1 and 4, and false otherwise.
It is extremely helpful to look at the truth tables for these different operators, as well as the method of analytic tableaux.
Propositional logic is closed under its truth-functional connectives: the result of connecting propositions with one of the operators above is again a proposition, and so it can itself be conjoined with another proposition. In order to represent this, we need to use parentheses to indicate which proposition is conjoined with which. For instance, P ∧ Q ∧ R is not a well-formed formula, because we do not know if we are conjoining P ∧ Q with R or if we are conjoining P with Q ∧ R. Thus we must write either (P ∧ Q) ∧ R to represent the former, or P ∧ (Q ∧ R) to represent the latter. By evaluating the truth conditions, we see that both expressions have the same truth conditions (will be true in the same cases), and moreover that any proposition formed by arbitrary conjunctions will have the same truth conditions, regardless of the location of the parentheses. This means that conjunction is associative; however, one should not assume that parentheses never serve a purpose. For instance, the sentence P ∧ (Q ∨ R) does not have the same truth conditions as (P ∧ Q) ∨ R, so they are different sentences distinguished only by the parentheses. One can verify this by the truth-table method referenced above.
Note: For any arbitrary number of propositional constants, we can form a finite number of cases which list their possible truth-values. A simple way to generate this is by truth-tables, in which one writes P, Q, ..., Z across the top, for any list of propositional constants with n entries. Below this list, one writes 2ⁿ rows, and below P one fills in the first half of the rows with true (or T) and the second half with false (or F). Below Q one fills in one-quarter of the rows with T, then one-quarter with F, then one-quarter with T and the last quarter with F. The next column alternates between true and false for each eighth of the rows, then sixteenths, and so on, until the last propositional constant varies between T and F for each row. This will give a complete listing of the cases or truth-value assignments possible for those propositional constants.
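As a sketch of that construction, the following Python snippet (illustrative only; the function name and the use of itertools are our choices, not part of the calculus) generates the rows of a truth table in exactly the halves, quarters, and eighths pattern just described.

```python
from itertools import product

def truth_table_rows(constants):
    """Yield every truth-value assignment for the given propositional
    constants, in the order described above: the first constant is T for
    the first half of the rows and F for the second half, the next
    alternates by quarters, and so on."""
    for values in product((True, False), repeat=len(constants)):
        yield dict(zip(constants, values))

# Example: three constants give 2**3 = 8 rows.
for row in truth_table_rows(["P", "Q", "R"]):
    print(" ".join("T" if row[c] else "F" for c in ["P", "Q", "R"]))
```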
Argument
The propositional calculus then defines an argument as a set of propositions. A valid argument is a set of propositions, the last of which follows from, or is implied by, the rest. All other arguments are invalid. The simplest valid argument is modus ponens, one instance of which is the following set of propositions:
1. P → Q
2. P
3. ∴ Q
This is a set of three propositions, each line is a proposition, and the last follows from the rest. The first two lines are called premises, and the last line the conclusion. We say that any proposition C follows from any set of propositions {P1, ..., Pn} if C must be true whenever every member of the set is true. In the argument above, for any P and Q, whenever P → Q and P are true, necessarily Q is true. Notice that, when P is true, we cannot consider cases 3 and 4 (from the truth table). When P → Q is true, we cannot consider case 2. This leaves only case 1, in which Q is also true. Thus Q is implied by the premises.
This generalizes schematically. Thus, where φ and ψ may be any propositions at all:
1. φ → ψ
2. φ
3. ∴ ψ
Other argument forms are convenient, but not necessary. Given a complete set of axioms (see below for one such set), modus ponens is sufficient to prove all other argument forms in propositional logic, and so we may think of them as derivative. Note, this is not true of the extension of propositional logic to other logics like first-order logic. First-order logic requires at least one additional rule of inference in order to obtain completeness.
The significance of argument in formal logic is that one may obtain new truths from established truths. In the first example above, given the two premises, the truth of Q is not yet known or stated. After the argument is made, Q is deduced. In this way, we define a deduction system as a set of all propositions that may be deduced from another set of propositions. For instance, given the set of propositions A = {P ∨ Q, ¬Q ∧ R, (P ∨ Q) → R}, we can define a deduction system Γ, which is the set of all propositions that follow from A. Reiteration is always assumed, so P ∨ Q, ¬Q ∧ R, and (P ∨ Q) → R are all in Γ. Also, from the first element of A, the last element of A, and modus ponens, R is a consequence, and so R is in Γ. Because we have not included sufficiently complete axioms, though, nothing else may be deduced. Thus, even though most deduction systems studied in propositional logic are able to deduce (P ∨ Q) ↔ (¬P → Q), this one is too weak to prove such a proposition.
In a generic description of a propositional calculus, the omega set Ω is a finite set of elements called operator symbols or logical connectives. The set Ω is partitioned into disjoint subsets according to arity. In this partition, Ωⱼ is the set of operator symbols of arity j. In the more familiar propositional calculi, Ω is typically partitioned as follows:
Ω₁ = {¬}
Ω₂ = {∧, ∨, →, ↔}
A frequently adopted convention treats the constant logical values as operators of arity zero, thus:
Ω₀ = {0, 1}
Some writers use the tilde (~), or N, instead of ¬; and some use the ampersand (&), the prefixed K, or ⋅ instead of ∧. Notation varies even more for the set of logical values, with symbols like {false, true}, {F, T}, or {⊥, ⊤} all being seen in various contexts instead of {0, 1}.
The zeta set Ζ is a finite set of transformation rules that are called inference rules when they acquire logical applications. The iota set Ι is a finite set of initial points that are called axioms when they receive logical interpretations.
The language of the calculus, also known as its set of formulas, well-formed formulas or wffs, is inductively defined by the following rules:
1. Base: Any element of the alpha set (the set of proposition symbols or propositional variables) is a formula.
2. If p₁, p₂, ..., pⱼ are formulas and f is in Ωⱼ, then (f(p₁, p₂, ..., pⱼ)) is a formula.
3. Closed: Nothing else is a formula.
Repeated applications of these rules permit the construction of complex formulas. For example:
1. By rule 1, p is a formula.
2. By rule 2, ¬p is a formula.
3. By rule 1, q is a formula.
4. By rule 2, (¬p ∨ q) is a formula.
Of the three connectives for conjunction, disjunction, and implication (∧, ∨, and →), one can be taken as primitive and the other two can be defined in terms of it and negation (¬). Indeed, all of the logical connectives can be defined in terms of a sole sufficient operator. The biconditional (↔) can of course be defined in terms of conjunction and implication, with a ↔ b defined as (a → b) ∧ (b → a). Adopting negation and implication as the two primitive operations of a propositional calculus is tantamount to having the omega set partition as follows: Ω₁ = {¬}, Ω₂ = {→}.
An axiom system discovered by Jan Łukasiewicz formulates a propositional calculus in this language as follows. The axioms are all substitution instances of:
p → (q → p)
(p → (q → r)) → ((p → q) → (p → r))
(¬p → ¬q) → (q → p)
The rule of inference is modus ponens (i.e., from p and p → q, infer q). Then a ∨ b is defined as ¬a → b, and a ∧ b is defined as ¬(a → ¬b).
In the following example of a propositional calculus, the transformation rules are intended to be interpreted as the inference rules of a so-called natural deduction system. The particular system presented here has no initial points, which means that its interpretation for logical applications derives its theorems from an empty axiom set. The set of initial points is empty, that is, Ι = ∅. The set of transformation rules, Ζ, is described as follows: our propositional calculus has ten inference rules. These rules allow us to derive other true formulas given a set of formulas that are assumed to be true. The first nine simply state that we can infer certain wffs from other wffs. The last rule however uses hypothetical reasoning in the sense that in the premise of the rule we temporarily assume an (unproven) hypothesis to be part of the set of inferred formulas to see if we can infer a certain other formula. Since the first nine rules don't do this they are usually described as non-hypothetical rules, and the last one as a hypothetical rule. The first of these rules is reductio ad absurdum (negation introduction): from q, and from the fact that accepting p leads to a proof of ¬q, infer ¬p.
Basic and derived argument forms include the following (each is written as a sequent, followed by an informal reading):
Modus Ponens: ((p → q) ∧ p) ⊢ q. If p then q; p; therefore q.
Modus Tollens: ((p → q) ∧ ¬q) ⊢ ¬p. If p then q; not q; therefore not p.
Hypothetical Syllogism: ((p → q) ∧ (q → r)) ⊢ (p → r). If p then q; and if q then r; therefore, if p then r.
Disjunctive Syllogism: ((p ∨ q) ∧ ¬p) ⊢ q. Either p or q, or both; not p; therefore, q.
Constructive Dilemma: ((p → q) ∧ (r → s) ∧ (p ∨ r)) ⊢ (q ∨ s). If p then q; and if r then s; but p or r; therefore q or s.
Destructive Dilemma: ((p → q) ∧ (r → s) ∧ (¬q ∨ ¬s)) ⊢ (¬p ∨ ¬r). If p then q; and if r then s; but not q or not s; therefore not p or not r.
Bidirectional Dilemma: ((p → q) ∧ (r → s) ∧ (p ∨ ¬s)) ⊢ (q ∨ ¬r). If p then q; and if r then s; but p or not s; therefore q or not r.
Simplification: (p ∧ q) ⊢ p. p and q are true; therefore p is true.
Conjunction: p, q ⊢ (p ∧ q). p and q are true separately; therefore they are true conjointly.
Addition: p ⊢ (p ∨ q). p is true; therefore the disjunction (p or q) is true.
Composition: ((p → q) ∧ (p → r)) ⊢ (p → (q ∧ r)). If p then q; and if p then r; therefore if p is true then q and r are true.
De Morgan's Theorem (1): ¬(p ∧ q) ⊢ (¬p ∨ ¬q). The negation of (p and q) is equiv. to (not p or not q).
De Morgan's Theorem (2): ¬(p ∨ q) ⊢ (¬p ∧ ¬q). The negation of (p or q) is equiv. to (not p and not q).
Commutation (1): (p ∨ q) ⊢ (q ∨ p). (p or q) is equiv. to (q or p).
Commutation (2): (p ∧ q) ⊢ (q ∧ p). (p and q) is equiv. to (q and p).
Commutation (3): (p ↔ q) ⊢ (q ↔ p). (p is equiv. to q) is equiv. to (q is equiv. to p).
Association (1): (p ∨ (q ∨ r)) ⊢ ((p ∨ q) ∨ r). p or (q or r) is equiv. to (p or q) or r.
Association (2): (p ∧ (q ∧ r)) ⊢ ((p ∧ q) ∧ r). p and (q and r) is equiv. to (p and q) and r.
Distribution (1): (p ∧ (q ∨ r)) ⊢ ((p ∧ q) ∨ (p ∧ r)). p and (q or r) is equiv. to (p and q) or (p and r).
Distribution (2): (p ∨ (q ∧ r)) ⊢ ((p ∨ q) ∧ (p ∨ r)). p or (q and r) is equiv. to (p or q) and (p or r).
Double Negation: p ⊢ ¬¬p. p is equivalent to the negation of not p.
Transposition: (p → q) ⊢ (¬q → ¬p). If p then q is equiv. to if not q then not p.
Material Implication: (p → q) ⊢ (¬p ∨ q). If p then q is equiv. to not p or q.
Material Equivalence (1): (p ↔ q) ⊢ ((p → q) ∧ (q → p)). (p is equiv. to q) means (if p is true then q is true) and (if q is true then p is true).
Material Equivalence (2): (p ↔ q) ⊢ ((p ∧ q) ∨ (¬p ∧ ¬q)). (p is equiv. to q) means either (p and q are true) or (both p and q are false).
Material Equivalence (3): (p ↔ q) ⊢ ((p ∨ ¬q) ∧ (¬p ∨ q)). (p is equiv. to q) means both (p or not q is true) and (not p or q is true).
Exportation: ((p ∧ q) → r) ⊢ (p → (q → r)). From (if p and q are true then r is true) we can prove (if q is true then r is true, if p is true).[1]
Importation: (p → (q → r)) ⊢ ((p ∧ q) → r). If p then (if q then r) is equivalent to if p and q then r.
Tautology (1): p ⊢ (p ∨ p). p is true is equiv. to p is true or p is true.
Tautology (2): p ⊢ (p ∧ p). p is true is equiv. to p is true and p is true.
Tertium non datur (Law of Excluded Middle): ⊢ (p ∨ ¬p). p or not p is true.
Law of Non-Contradiction: ⊢ ¬(p ∧ ¬p). p and not p is false.
Example of a proof
To be shown: A → (A ∨ A). One possible proof of this (which, though valid, happens to contain more steps than are necessary) may be arranged as follows:
1. A (premise)
2. A ∨ A (from (1) by disjunction introduction)
3. A ∧ (A ∨ A) (from (1) and (2) by conjunction introduction)
4. A ∨ A (from (3) by conjunction elimination)
5. A ⊢ A ∨ A (summary of (1) through (4))
6. ⊢ A → (A ∨ A) (from (5) by conditional proof)
Interpret A ⊢ A ∨ A as "Assuming A, infer A ∨ A". Read ⊢ A → (A ∨ A) as "Assuming nothing, infer that A implies A ∨ A", or "It is a tautology that A implies A ∨ A", or "It is always true that A implies A ∨ A".
Soundness and completeness of the rules
The crucial properties of this set of rules are that they are sound and complete. Informally this means that the rules are correct and that no other rules are required. To make these notions precise, we first define a truth assignment as a function A that maps propositional variables to true or false. Informally such a truth assignment can be understood as the description of a possible state of affairs (or possible world) where certain statements are true and others are not. We say that a truth assignment A satisfies a wff according to the following rules:
A satisfies the propositional variable P if and only if A(P) = true
A satisfies ¬φ if and only if A does not satisfy φ
A satisfies (φ ∧ ψ) if and only if A satisfies both φ and ψ
A satisfies (φ ∨ ψ) if and only if A satisfies at least one of either φ or ψ
A satisfies (φ → ψ) if and only if it is not the case that A satisfies φ but not ψ
A satisfies (φ ↔ ψ) if and only if A satisfies both φ and ψ or satisfies neither one of them
With this definition we can now formalize what it means for a formula φ to be implied by a certain set S of formulas. Informally this is true if in all worlds that are possible given the set of formulas S the formula φ also holds. This leads to the following formal definition: we say that a set S of wffs semantically entails (or implies) a certain wff φ if all truth assignments that satisfy all the formulas in S also satisfy φ.
Finally we define syntactical entailment such that φ is syntactically entailed by S if and only if we can derive it with the inference rules that were presented above in a finite number of steps. This allows us to formulate exactly what it means for the set of inference rules to be sound and complete:
Soundness: If the set of wffs S syntactically entails wff φ, then S semantically entails φ.
Completeness: If the set of wffs S semantically entails wff φ, then S syntactically entails φ.
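The satisfaction clauses and the definition of semantic entailment can be checked mechanically for small formulas. The following Python sketch is illustrative only: the tuple encoding of formulas and the function names are our own, and it decides entailment by brute force over all truth assignments.

```python
from itertools import product

# Formulas are nested tuples: ("var", "P"), ("not", f), ("and", f, g),
# ("or", f, g), ("implies", f, g), ("iff", f, g).

def satisfies(assignment, formula):
    """Return True when the truth assignment satisfies the formula,
    following the clauses listed above."""
    op = formula[0]
    if op == "var":
        return assignment[formula[1]]
    if op == "not":
        return not satisfies(assignment, formula[1])
    if op == "and":
        return satisfies(assignment, formula[1]) and satisfies(assignment, formula[2])
    if op == "or":
        return satisfies(assignment, formula[1]) or satisfies(assignment, formula[2])
    if op == "implies":
        return (not satisfies(assignment, formula[1])) or satisfies(assignment, formula[2])
    if op == "iff":
        return satisfies(assignment, formula[1]) == satisfies(assignment, formula[2])
    raise ValueError(f"unknown connective: {op}")

def variables(formula):
    if formula[0] == "var":
        return {formula[1]}
    return set().union(*(variables(sub) for sub in formula[1:]))

def semantically_entails(premises, conclusion):
    """S semantically entails phi iff every assignment satisfying all of S
    also satisfies phi."""
    vs = sorted(variables(conclusion).union(*map(variables, premises)))
    for values in product((True, False), repeat=len(vs)):
        a = dict(zip(vs, values))
        if all(satisfies(a, p) for p in premises) and not satisfies(a, conclusion):
            return False
    return True

P, Q = ("var", "P"), ("var", "Q")
print(semantically_entails([("implies", P, Q), P], Q))  # True: modus ponens is valid
print(semantically_entails([("or", P, Q)], P))          # False: P or Q does not imply P
```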
" has an inductive definition, and that gives us the immediate resources for proves , then ...". So our proof proceeds by induction.
I. Basis. Show: If is a member of , then implies . II. Basis. Show: If is an axiom, then implies . III. Inductive step (induction on , the length of the proof): a. Assume for arbitrary and that if proves in or fewer steps, then implies . b. For each possible application of a rule of inference at step , leading to a new theorem implies . , show that
Notice that Basis Step II can be omitted for natural deduction systems because they have no axioms. When used, Step II involves showing that each of the axioms is a (semantic) logical truth. The Basis step(s) demonstrate(s) that the simplest provable sentences from are also implied by , for any . (The is simple, since the semantic fact that a set implies any of its members, is also trivial.) The Inductive step will systematically cover all the further sentences that might be provableby considering each case where we might reach a logical conclusion using an inference ruleand shows that if a new sentence is provable, it is also logically implied. (For example, we might have a rule telling us that from " " we can derive " or ". In III.a We assume that if is provable it is implied. We also know that if or is provable from true makes is provable then " or " is provable. We . So any semantic or " true, by the or " is have to show that then " assumption we just made. valuation making all of " too is implied. We do so by appeal to the semantic definition and the , we assume. So it is also implied by true makes " or true makes " true. But any valuation making
implied.) Generally, the Inductive step will consist of a lengthy but simple case-by-case analysis of all the rules of inference, showing that each "preserves" semantic implication. By the definition of provability, there are no sentences provable other than by being a member of following by a rule; so if all of those are semantically implied, the deduction calculus is sound. , an axiom, or
Sketch of completeness proof
We proceed by contraposition: we show that if G does not prove A, then G does not imply A.
I. G does not prove A. (Assumption)
II. If G does not prove A, then we can construct an (infinite) "Maximal Set", G*, which is a superset of G and which also does not prove A.
1. Place an "ordering" on all the sentences in the language (e.g., shortest first, and equally long ones in extended alphabetical ordering), and number them E₁, E₂, ...
2. Define a series Gₙ of sets (G₀, G₁, ...) inductively:
i. G₀ = G
ii. If Gₖ ∪ {Eₖ₊₁} proves A, then Gₖ₊₁ = Gₖ
iii. If Gₖ ∪ {Eₖ₊₁} does not prove A, then Gₖ₊₁ = Gₖ ∪ {Eₖ₊₁}
3. Define G* as the union of all the Gₙ. (That is, G* is the set of all sentences that are in any Gₙ.)
4. It can be easily shown that
i. G* contains (is a superset of) G (by (2.i));
ii. G* does not prove A (because if it proved A then some sentence was added to some Gₙ which caused it to prove A; but this was ruled out by definition); and
iii. G* is a "Maximal Set" (with respect to A): if any more sentences whatever were added to G*, it would prove A. (Because if it were possible to add any more sentences, they should have been added when they were encountered during the construction of the Gₙ, again by definition.)
III. If G* is a Maximal Set (with respect to A), then it is "truth-like". This means that it contains the sentence C only if it does not contain the sentence not-C; if it contains C and contains "If C then B", then it also contains B; and so forth.
IV. If G* is truth-like there is a "G*-canonical" valuation of the language: one that makes every sentence in G* true and everything outside G* false while still obeying the laws of semantic composition in the language.
V. A G*-canonical valuation will make our original set G all true, and make A false.
VI. If there is a valuation on which G is true and A is false, then G does not (semantically) imply A. QED
An interpretation of a truth-functional propositional calculus may also be expressed in terms of truth tables.[2] For n distinct propositional symbols there are 2ⁿ distinct possible interpretations. For any particular symbol a, for example, there are 2¹ = 2 possible interpretations: a is assigned T, or a is assigned F. For the pair a, b there are 2² = 4 possible interpretations:
1. both are assigned T,
2. both are assigned F,
3. a is assigned T and b is assigned F, or
4. a is assigned F and b is assigned T.[2]
Since the calculus has ℵ₀, that is, denumerably many propositional symbols, there are 2^ℵ₀, and therefore uncountably many, distinct possible interpretations of it.[2]
If a sentence φ is true under an interpretation I, then that interpretation is called a model of that sentence. φ is false under an interpretation I iff φ is not true under I.[2]
A sentence φ of propositional logic is logically valid iff it is true under every interpretation; ⊨ φ means that φ is logically valid. A sentence ψ of propositional logic is a semantic consequence of a sentence φ iff there is no interpretation under which φ is true and ψ is false. A sentence of propositional logic is consistent iff it is true under at least one interpretation. It is inconsistent if it is not consistent.
Some consequences of these definitions:
For any given interpretation a given formula is either true or false.[2]
No formula is both true and false under the same interpretation.[2]
φ is false for a given interpretation iff ¬φ is true for that interpretation; and φ is true under an interpretation iff ¬φ is false under that interpretation.[2]
If φ and (φ → ψ) are both true under a given interpretation, then ψ is true under that interpretation.[2]
If ⊨ φ and ⊨ (φ → ψ), then ⊨ ψ.[2]
¬φ is true under I iff φ is not true under I.
(φ → ψ) is true under I iff either φ is not true under I or ψ is true under I.[2]
A sentence ψ of propositional logic is a semantic consequence of a sentence φ iff (φ → ψ) is logically valid, that is, φ ⊨ ψ iff ⊨ (φ → ψ).[2]
Alternative calculus
It is possible to define another version of propositional calculus, which defines most of the syntax of the logical operators by means of axioms, and which uses only one inference rule.
Axioms
Let φ, χ, and ψ stand for well-formed formulas. (The wffs themselves would not contain any Greek letters, but only capital Roman letters, connective operators, and parentheses.) Then the axioms are as follows (name, axiom schema, description):
THEN-1: φ → (χ → φ). Add hypothesis χ, implication introduction.
THEN-2: (φ → (χ → ψ)) → ((φ → χ) → (φ → ψ)). Distribute hypothesis φ over implication.
AND-1: (φ ∧ χ) → φ. Eliminate conjunction.
AND-2: (φ ∧ χ) → χ. Eliminate conjunction.
AND-3: φ → (χ → (φ ∧ χ)). Introduce conjunction.
OR-1: φ → (φ ∨ χ). Introduce disjunction.
OR-2: χ → (φ ∨ χ). Introduce disjunction.
OR-3: (φ → ψ) → ((χ → ψ) → ((φ ∨ χ) → ψ)). Eliminate disjunction.
NOT-1: (φ → χ) → ((φ → ¬χ) → ¬φ). Introduce negation.
NOT-2: ¬φ → (φ → χ). Eliminate negation.
NOT-3: φ ∨ ¬φ. Excluded middle, classical logic.
IFF-1: (φ ↔ χ) → (φ → χ). Eliminate equivalence.
IFF-2: (φ ↔ χ) → (χ → φ). Eliminate equivalence.
IFF-3: (φ → χ) → ((χ → φ) → (φ ↔ χ)). Introduce equivalence.
Axiom THEN-2 may be considered to be a "distributive property of implication with respect to implication." Axioms AND-1 and AND-2 correspond to "conjunction elimination". The relation between AND-1 and AND-2 reflects the commutativity of the conjunction operator. Axiom AND-3 corresponds to "conjunction introduction." Axioms OR-1 and OR-2 correspond to "disjunction introduction." The relation between OR-1 and OR-2 reflects the commutativity of the disjunction operator. Axiom NOT-1 corresponds to "reductio ad absurdum." Axiom NOT-2 says that "anything can be deduced from a contradiction." Axiom NOT-3 is called "tertium non datur" (Latin: "a third is not given") and reflects the semantic valuation of propositional formulae: a formula can have a truth-value of either true or false. There is no third truth-value, at least not in classical logic. Intuitionistic logicians do not accept the axiom NOT-3.
Inference rule
The inference rule is modus ponens: from φ and (φ → χ), infer χ.
Meta-inference rule
Let a demonstration be represented by a sequence, with hypotheses to the left of the turnstile and the conclusion to the right of the turnstile. Then the deduction theorem can be stated as follows: If the sequence
φ₁, φ₂, ..., φₙ, χ ⊢ ψ
has been demonstrated, then it is also possible to demonstrate the sequence
φ₁, φ₂, ..., φₙ ⊢ χ → ψ.
This deduction theorem (DT) is not itself formulated with propositional calculus: it is not a theorem of propositional calculus, but a theorem about propositional calculus. In this sense, it is a meta-theorem, comparable to theorems about the soundness or completeness of propositional calculus. On the other hand, DT is so useful for simplifying the syntactical proof process that it can be considered and used as another inference rule, accompanying modus ponens. In this sense, DT corresponds to the natural conditional proof inference rule which is part of the first version of propositional calculus introduced in this article. The converse of DT is also valid: If the sequence
φ₁, φ₂, ..., φₙ ⊢ χ → ψ
has been demonstrated, then it is also possible to demonstrate the sequence
φ₁, φ₂, ..., φₙ, χ ⊢ ψ.
In fact, the validity of the converse of DT is almost trivial compared to that of DT: If
φ₁, ..., φₙ ⊢ χ → ψ
then
1: φ₁, ..., φₙ, χ ⊢ χ → ψ
2: φ₁, ..., φₙ, χ ⊢ χ
and from (1) and (2) can be deduced
3: φ₁, ..., φₙ, χ ⊢ ψ
by means of modus ponens, Q.E.D.
The converse of DT has powerful implications: it can be used to convert an axiom into an inference rule. For example, the axiom AND-1,
⊢ (φ ∧ χ) → φ,
can be transformed by means of the converse of the deduction theorem into the inference rule
φ ∧ χ ⊢ φ,
which is conjunction elimination, one of the ten inference rules used in the first version (in this article) of the propositional calculus.
Example of a proof
The following is an example of a (syntactical) demonstration, involving only axioms THEN-1 and THEN-2.
Prove: A → A (reflexivity of implication).
Proof:
1. (A → ((B → A) → A)) → ((A → (B → A)) → (A → A)). Axiom THEN-2 with φ = A, χ = B → A, ψ = A.
2. A → ((B → A) → A). Axiom THEN-1 with φ = A, χ = B → A.
3. (A → (B → A)) → (A → A). From (1) and (2) by modus ponens.
4. A → (B → A). Axiom THEN-1 with φ = A, χ = B.
5. A → A. From (3) and (4) by modus ponens.
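The mechanical character of such a demonstration can be illustrated with a small script. The following Python sketch is purely illustrative: the tuple encoding of formulas and the helper names (then_1, then_2, modus_ponens) are ours, not part of any standard system; it simply rebuilds the five steps above and checks that each application of modus ponens is legitimate.

```python
# Formulas are nested tuples ("->", antecedent, consequent) over the atoms "A" and "B".

def then_1(phi, chi):
    # THEN-1: phi -> (chi -> phi)
    return ("->", phi, ("->", chi, phi))

def then_2(phi, chi, psi):
    # THEN-2: (phi -> (chi -> psi)) -> ((phi -> chi) -> (phi -> psi))
    return ("->", ("->", phi, ("->", chi, psi)),
                  ("->", ("->", phi, chi), ("->", phi, psi)))

def modus_ponens(implication, antecedent):
    # From (phi -> psi) and phi, infer psi.
    op, phi, psi = implication
    assert op == "->" and phi == antecedent, "modus ponens does not apply"
    return psi

A, B = "A", "B"
step1 = then_2(A, ("->", B, A), A)        # axiom THEN-2 instance
step2 = then_1(A, ("->", B, A))           # axiom THEN-1 instance
step3 = modus_ponens(step1, step2)        # (A -> (B -> A)) -> (A -> A)
step4 = then_1(A, B)                      # axiom THEN-1 instance
step5 = modus_ponens(step3, step4)        # A -> A
assert step5 == ("->", A, A)
print("proved:", step5)
```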
Classically, P → Q can be translated as ¬P ∨ Q, but this translation is incorrect intuitionistically. In both Boolean and Heyting algebra, inequality ≤ can be used in place of equality. The equality x = y is expressible as the pair of inequalities x ≤ y and y ≤ x. Conversely, the inequality x ≤ y is expressible as the equality x ∧ y = x, or as x ∨ y = y. The significance of inequality for Hilbert-style systems is that it corresponds to the latter's deduction or entailment symbol ⊢: an entailment φ ⊢ ψ is translated in the inequality version of the algebraic framework as φ ≤ ψ. The difference between implication φ → ψ and inequality or entailment φ ≤ ψ or φ ⊢ ψ is that the former is
internal to the logic while the latter is external. Internal implication between two terms is another term of the same kind. Entailment as external implication between two terms expresses a metatruth outside the language of the logic, and is considered part of the metalanguage. Even when the logic under study is intuitionistic, entailment is ordinarily understood classically as two-valued: either the left side entails, or is less-or-equal to, the right side, or it is not. Similar but more complex translations to and from algebraic logics are possible for natural deduction systems as described above and for the sequent calculus. The entailments of the latter can be interpreted as two-valued, but a more insightful interpretation is as a set, the elements of which can be understood as abstract proofs organized as the morphisms of a category. In this interpretation the cut rule of the sequent calculus corresponds to composition in the category. Boolean and Heyting algebras enter this picture as special categories having at most one morphism per homset, i.e., one proof per entailment, corresponding to the idea that existence of proofs is all that matters: any proof will do and there is no point in distinguishing them.
Graphical calculi
It is possible to generalize the definition of a formal language from a set of finite sequences over a finite basis to include many other sets of mathematical structures, so long as they are built up by finitary means from finite materials. What's more, many of these families of formal structures are especially well-suited for use in logic. For example, there are many families of graphs that are close enough analogues of formal languages that the concept of a calculus is quite easily and naturally extended to them. Indeed, many species of graphs arise as parse graphs in the syntactic analysis of the corresponding families of text structures. The exigencies of practical computation on formal languages frequently demand that text strings be converted into pointer structure renditions of parse graphs, simply as a matter of checking whether strings are wffs or not. Once this is done, there are many advantages to be gained from developing the graphical analogue of the calculus on strings. The mapping from strings to parse graphs is called parsing and the inverse mapping from parse graphs to strings is achieved by an operation that is called traversing the graph.
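As a small illustration of the parsing step just described, the following Python sketch turns a fully parenthesized propositional wff into a parse tree (the inverse, traversing, would print the string back out). The grammar and operator spellings are our own toy choices, not a standard syntax.

```python
# Toy recursive-descent parser for fully parenthesized wffs such as "((P & Q) -> ~R)".

def parse(s):
    tokens = s.replace("(", " ( ").replace(")", " ) ").replace("~", " ~ ").split()
    tree, rest = parse_wff(tokens)
    assert not rest, f"trailing input: {rest}"
    return tree

def parse_wff(tokens):
    head, *rest = tokens
    if head == "~":                       # negation applies to the next wff
        sub, rest = parse_wff(rest)
        return ("not", sub), rest
    if head == "(":                       # a parenthesized binary compound: (wff op wff)
        left, rest = parse_wff(rest)
        op, *rest = rest
        right, rest = parse_wff(rest)
        assert rest and rest[0] == ")", "missing closing parenthesis"
        return (op, left, right), rest[1:]
    return ("var", head), rest            # otherwise a proposition letter

print(parse("((P & Q) -> ~R)"))
# ('->', ('&', ('var', 'P'), ('var', 'Q')), ('not', ('var', 'R')))
```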
Other logical calculi
Many mathematical theories can be formulated in first-order logic; set theory and mereology are among these. Second-order logic and other higher-order logics are formal extensions of first-order logic. Thus, it makes sense to refer to propositional logic as "zeroth-order logic", when comparing it with these logics.
Modal logic also offers a variety of inferences that cannot be captured in propositional calculus. For example, from "Necessarily p" we may infer that p. From p we may infer "It is possible that p". The translation between modal logics and algebraic logics is as for classical and intuitionistic logics, but with the introduction of a unary operator on Boolean or Heyting algebras, different from the Boolean operations, interpreting the possibility modality, and in the case of Heyting algebra a second operator interpreting necessity (for Boolean algebra this is redundant, since necessity is the De Morgan dual of possibility). The first operator preserves 0 and disjunction while the second preserves 1 and conjunction.
Many-valued logics are those allowing sentences to have values other than true and false. (For example, neither and both are standard "extra values"; "continuum logic" allows each sentence to have any of an infinite number of "degrees of truth" between true and false.) These logics often require calculational devices quite distinct from propositional calculus. When the values form a Boolean algebra (which may have more than two or even infinitely many values), many-valued logic reduces to classical logic; many-valued logics are therefore only of independent interest when the values form an algebra that is not Boolean.
Solvers
Finding solutions to propositional logic formulas is an NP-complete problem. However, practical methods exist (e.g., DPLL algorithm, 1962; Chaff algorithm, 2001) that are very fast for many useful cases. Recent work has extended the SAT solver algorithms to work with propositions containing arithmetic expressions; these are the SMT solvers.
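The following Python fragment sketches the core idea behind the DPLL procedure mentioned above (unit propagation plus branching). It is a bare-bones illustration over a DIMACS-style clause encoding, not the original 1962 algorithm's implementation or any production solver.

```python
# CNF formulas: a list of clauses; a clause is a list of non-zero integers,
# negative meaning a negated variable (DIMACS-style convention).

def dpll(clauses, assignment=None):
    assignment = dict(assignment or {})
    changed = True
    while changed:                                  # unit propagation
        changed = False
        simplified = []
        for clause in clauses:
            lits, satisfied = [], False
            for lit in clause:
                val = assignment.get(abs(lit))
                if val is None:
                    lits.append(lit)
                elif (lit > 0) == val:
                    satisfied = True
                    break
            if satisfied:
                continue
            if not lits:
                return None                         # empty clause: conflict
            if len(lits) == 1:
                assignment[abs(lits[0])] = lits[0] > 0
                changed = True
            simplified.append(lits)
        clauses = simplified
    if not clauses:
        return assignment                           # every clause satisfied
    var = abs(clauses[0][0])                        # branch on an unassigned variable
    for value in (True, False):
        result = dpll(clauses, {**assignment, var: value})
        if result is not None:
            return result
    return None

# (P or Q) and (not P or Q) and (not Q or R), with P=1, Q=2, R=3:
print(dpll([[1, 2], [-1, 2], [-2, 3]]))             # {1: True, 2: True, 3: True}
```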
References
[1] Toida, Shunichi (2 August 2009). "Proof of Implications" (http://www.cs.odu.edu/~toida/nerzic/content/logic/prop_logic/implications/implication_proof.html). CS381 Discrete Structures/Discrete Mathematics Web Course Material. Department of Computer Science, Old Dominion University. Retrieved 10 March 2010.
[2] Hunter, Geoffrey (1971). Metalogic: An Introduction to the Metatheory of Standard First-Order Logic. University of California Press. ISBN 0-520-02356-0.
Further reading
Brown, Frank Markham (2003), Boolean Reasoning: The Logic of Boolean Equations, 1st edition, Kluwer Academic Publishers, Norwell, MA. 2nd edition, Dover Publications, Mineola, NY.
Chang, C.C. and Keisler, H.J. (1973), Model Theory, North-Holland, Amsterdam, Netherlands.
Kohavi, Zvi (1978), Switching and Finite Automata Theory, 1st edition, McGraw-Hill, 1970. 2nd edition, McGraw-Hill, 1978.
Korfhage, Robert R. (1974), Discrete Computational Structures, Academic Press, New York, NY.
Lambek, J. and Scott, P.J. (1986), Introduction to Higher Order Categorical Logic, Cambridge University Press, Cambridge, UK.
Mendelson, Elliott (1964), Introduction to Mathematical Logic, D. Van Nostrand Company.
Related works
Hofstadter, Douglas (1979). Gödel, Escher, Bach: An Eternal Golden Braid. Basic Books. ISBN 978-0-465-02656-2.
External links
Klement, Kevin C. (2006), "Propositional Logic", in James Fieser and Bradley Dowden (eds.), Internet Encyclopedia of Philosophy, Eprint (http://www.iep.utm.edu/p/prop-log.htm). Introduction to Mathematical Logic (http://www.ltn.lv/~podnieks/mlog/ml2.htm) Formal Predicate Calculus (http://www.qedeq.org/current/doc/math/qedeq_formal_logic_v1_en.pdf), contains a systematic formal development along the lines of Alternative calculus Elements of Propositional Calculus (http://www.visualstatistics.net/Scaling/Propositional Calculus/Elements of Propositional Calculus.htm) forall x: an introduction to formal logic (http://www.fecundity.com/logic/), by P.D. Magnus, covers formal semantics and proof theory for sentential logic. Propositional Logic (GFDLed)
First-order logic
First-order logic is a formal logical system used in mathematics, philosophy, linguistics, and computer science. It goes by many names, including: first-order predicate calculus, the lower predicate calculus, quantification theory, and predicate logic (a less precise term). First-order logic is distinguished from propositional logic by its use of quantifiers; each interpretation of first-order logic includes a domain of discourse over which the quantifiers range.
The adjective "first-order" is used to distinguish first-order theories from higher-order theories in which there are predicates having other predicates or functions as arguments or in which predicate quantifiers or function quantifiers are permitted or both.[1] In interpretations of first-order theories, predicates are associated with sets. In interpretations of higher-order theories, they may also be associated with sets of sets.
There are many deductive systems for first-order logic that are sound (only deriving correct results) and complete (able to derive any logically valid implication). Although the logical consequence relation is only semidecidable, much progress has been made in automated theorem proving in first-order logic. First-order logic also satisfies several metalogical theorems that make it amenable to analysis in proof theory, such as the Löwenheim–Skolem theorem and the compactness theorem.
First-order logic is of great importance to the foundations of mathematics, where it has become the standard formal logic for axiomatic systems. It has sufficient expressive power to formalize two important mathematical theories: Zermelo–Fraenkel set theory (ZF) and first-order Peano arithmetic. However, no axiom system in first-order logic is strong enough to fully (categorically) describe infinite structures such as the natural numbers or the real line. Categorical axiom systems for these structures can be obtained in stronger logics such as second-order logic.
A history of first-order logic and an account of its emergence over other formal logics is provided by Ferreirós (2001).
Introduction
While propositional logic deals with simple declarative propositions, first-order logic additionally covers predicates and quantification. A predicate resembles a function that returns either True or False. Consider the following sentences: "Socrates is a philosopher", "Plato is a philosopher". In propositional logic these are treated as two unrelated propositions, denoted for example by p and q. In first-order logic, however, the sentences can be expressed in a more parallel manner using the predicate Phil(a), which asserts that the object represented by a is a philosopher. Thus if a represents Socrates then Phil(a) asserts the first proposition, p; if a instead represents Plato then Phil(a) asserts the second proposition, q. A key aspect of first-order logic is visible here: the string "Phil" is a syntactic entity which is given semantic meaning by declaring that Phil(a) holds exactly when a is a philosopher. An assignment of semantic meaning is called an interpretation. First-order logic allows reasoning about properties that are shared by many objects, through the use of variables. For example, let Phil(a) assert that a is a philosopher and let Schol(a) assert that a is a scholar. Then the formula
Phil(a) → Schol(a)
asserts that if a is a philosopher then a is a scholar. The symbol → is used to denote a conditional (if/then) statement. The hypothesis lies to the left of the arrow and the conclusion to the right. The truth of this formula depends on which object is denoted by a, and on the interpretations of "Phil" and "Schol".
Assertions of the form "for every a, if a is a philosopher then a is a scholar" require both the use of variables and the use of a quantifier. Again, let Phil(a) assert a is a philosopher and let Schol(a) assert that a is a scholar. Then the first-order sentence
∀a (Phil(a) → Schol(a))
asserts that no matter what a represents, if a is a philosopher then a is a scholar. Here ∀, the universal quantifier, expresses the idea that the claim in parentheses holds for all choices of a.
To show that the claim "If a is a philosopher then a is a scholar" is false, one would show there is some philosopher who is not a scholar. This counterclaim can be expressed with the existential quantifier ∃:
∃a (Phil(a) ∧ ¬Schol(a))
Here:
¬ is the negation operator: ¬Schol(a) is true if and only if Schol(a) is false, in other words if and only if a is not a scholar.
∧ is the conjunction operator: Phil(a) ∧ ¬Schol(a) asserts that a is a philosopher and also not a scholar.
The predicates Phil(a) and Schol(a) take only one parameter each. First-order logic can also express predicates with more than one parameter. For example, "there is someone who can be fooled every time" can be expressed as:
∃x (Person(x) ∧ ∀y (Time(y) → Canfool(x, y)))
Here Person(x) is interpreted to mean x is a person, Time(y) to mean that y is a moment of time, and Canfool(x,y) to mean that (person) x can be fooled at (time) y. For clarity, this statement asserts that there is at least one person who can be fooled at all times, which is stronger than asserting that at all times at least one person exists who can be fooled. Asserting the latter (that there is always at least one foolable person) does not signify whether this foolable person is always the same for all moments of time.
The range of the quantifiers is the set of objects that can be used to satisfy them. (In the informal examples in this section, the range of the quantifiers was left unspecified.) In addition to specifying the meaning of predicate symbols such as Person and Time, an interpretation must specify a nonempty set, known as the domain of discourse or universe, as a range for the quantifiers. Thus a statement of the form ∃x Phil(x) is said to be true, under a particular interpretation, if there is some object in the domain of discourse of that interpretation that satisfies the predicate that the interpretation uses to assign meaning to the symbol Phil.
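Over a finite domain the two readings can be compared directly. The following Python fragment is only an illustration: the people, times, and can_fool relation are invented, and the sets of people and times themselves are used as the quantifier domains in place of the Person and Time predicates.

```python
# Invented data, purely for illustration.
people = {"alice", "bob"}
times = {"mon", "tue"}
can_fool = {("alice", "mon"), ("bob", "tue")}   # (person, time) pairs

# "There is someone who can be fooled every time": exists x, for all y, Canfool(x, y)
someone_fooled_always = any(all((x, y) in can_fool for y in times) for x in people)

# "At every time someone can be fooled": for all y, exists x, Canfool(x, y)
always_someone_fooled = all(any((x, y) in can_fool for x in people) for y in times)

print(someone_fooled_always, always_someone_fooled)   # False True
```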
Syntax
There are two key parts of first order logic. The syntax determines which collections of symbols are legal expressions in first-order logic, while the semantics determine the meanings behind these expressions.
Alphabet
Unlike natural languages, such as English, the language of first-order logic is completely formal, so that it can be mechanically determined whether a given expression is legal. There are two key types of legal expressions: terms, which intuitively represent objects, and formulas, which intuitively express predicates that can be true or false. The terms and formulas of first-order logic are strings of symbols which together form the alphabet of the language. As with all formal languages, the nature of the symbols themselves is outside the scope of formal logic; they are often regarded simply as letters and punctuation symbols. It is common to divide the symbols of the alphabet into logical symbols, which always have the same meaning, and non-logical symbols, whose meaning varies by interpretation. For example, the logical symbol always represents "and"; it is never interpreted as "or". On the other hand, a non-logical predicate symbol such as Phil(x) could be interpreted to mean "x is a philosopher", "x is a man named Philip", or any other unary predicate, depending on the interpretation at hand. Logical symbols There are several logical symbols in the alphabet, which vary by author but usually include: The quantifier symbols and The logical connectives: for conjunction, for disjunction, for implication, for biconditional, for negation. Occasionally other logical connective symbols are included. Some authors use , or Cpq, instead of , and , or Epq, instead of , especially in contexts where is used for other purposes. Moreover, the horseshoe may replace ; the triple-bar may replace , and a tilde (~), Np, or Fpq, may replace ; ||, or Apq may replace ; and &, or Kpq, may replace , especially if these symbols are not available for technical reasons. Parentheses, brackets, and other punctuation symbols. The choice of such symbols varies depending on context. An infinite set of variables, often denoted by lowercase letters at the end of the alphabet x, y, z, . Subscripts are often used to distinguish variables: x0, x1, x2, . An equality symbol (sometimes, identity symbol) =; see the section on equality below. It should be noted that not all of these symbols are required - only one of the quantifiers, negation and conjunction, variables, brackets and equality suffice. There are numerous minor variations that may define additional logical symbols: Sometimes the truth constants T, Vpq, or , for "true" and F, Opq, or , for "false" are included. Without any such logical operators of valence 0, these two constants can only be expressed using quantifiers. Sometimes additional logical connectives are included, such as the Sheffer stroke, Dpq (NAND), and exclusive or, Jpq.
Non-logical symbols
The non-logical symbols represent predicates (relations), functions and constants on the domain of discourse. It used to be standard practice to use a fixed, infinite set of non-logical symbols for all purposes. A more recent practice is to use different non-logical symbols according to the application one has in mind. Therefore it has become necessary to name the set of all non-logical symbols used in a particular application. This choice is made via a signature.[2]
The traditional approach is to have only one, infinite, set of non-logical symbols (one signature) for all applications. Consequently, under the traditional approach there is only one language of first-order logic.[3] This approach is still common, especially in philosophically oriented books.
1. For every integer n ≥ 0 there is a collection of n-ary, or n-place, predicate symbols. Because they represent relations between n elements, they are also called relation symbols. For each arity n we have an infinite supply of them: Pⁿ₀, Pⁿ₁, Pⁿ₂, Pⁿ₃, ...
2. For every integer n ≥ 0 there are infinitely many n-ary function symbols: fⁿ₀, fⁿ₁, fⁿ₂, fⁿ₃, ...
In contemporary mathematical logic, the signature varies by application. Typical signatures in mathematics are {1, ×} or just {×} for groups, or {0, 1, +, ×, <} for ordered fields. There are no restrictions on the number of non-logical symbols. The signature can be empty, finite, or infinite, even uncountable. Uncountable signatures occur for example in modern proofs of the Löwenheim–Skolem theorem. In this approach, every non-logical symbol is of one of the following types.
1. A predicate symbol (or relation symbol) with some valence (or arity, number of arguments) greater than or equal to 0. These are often denoted by uppercase letters P, Q, R, ... . Relations of valence 0 can be identified with propositional variables; for example, P, which can stand for any statement. P(x) is a predicate variable of valence 1; one possible interpretation is "x is a man". Q(x,y) is a predicate variable of valence 2; possible interpretations include "x is greater than y" and "x is the father of y".
2. A function symbol, with some valence greater than or equal to 0. These are often denoted by lowercase letters f, g, h, ... . Examples: f(x) may be interpreted as "the father of x". In arithmetic, it may stand for "−x". In set theory, it may stand for "the power set of x". In arithmetic, g(x,y) may stand for "x + y". In set theory, it may stand for "the union of x and y". Function symbols of valence 0 are called constant symbols, and are often denoted by lowercase letters at the beginning of the alphabet a, b, c, ... . The symbol a may stand for Socrates. In arithmetic, it may stand for 0. In set theory, such a constant may stand for the empty set.
The traditional approach can be recovered in the modern approach by simply specifying the "custom" signature to consist of the traditional sequences of non-logical symbols.
Formation rules
The formation rules define the terms and formulas of first-order logic. When terms and formulas are represented as strings of symbols, these rules can be used to write a formal grammar for terms and formulas. These rules are generally context-free (each production has a single symbol on the left side), except that the set of symbols may be allowed to be infinite and there may be many start symbols, for example the variables in the case of terms.
Terms
The set of terms is inductively defined by the following rules:
1. Variables. Any variable is a term.
2. Functions. Any expression f(t₁,...,tₙ) of n arguments (where each argument tᵢ is a term and f is a function symbol of valence n) is a term. In particular, symbols denoting individual constants are 0-ary function symbols, and are thus terms.
Only expressions which can be obtained by finitely many applications of rules 1 and 2 are terms. For example, no expression involving a predicate symbol is a term.
Formulas
The set of formulas (also called well-formed formulas[4] or wffs) is inductively defined by the following rules:
1. Predicate symbols. If P is an n-ary predicate symbol and t₁, ..., tₙ are terms then P(t₁,...,tₙ) is a formula.
2. Equality. If the equality symbol is considered part of logic, and t₁ and t₂ are terms, then t₁ = t₂ is a formula.
3. Negation. If φ is a formula, then ¬φ is a formula.
4. Binary connectives. If φ and ψ are formulas, then (φ → ψ) is a formula. Similar rules apply to other binary logical connectives.
5. Quantifiers. If φ is a formula and x is a variable, then ∀x φ and ∃x φ are formulas.
Only expressions which can be obtained by finitely many applications of rules 1 to 5 are formulas. The formulas obtained from the first two rules are said to be atomic formulas.
For example,
∀x ∀y (P(f(x)) → ¬(P(x) → Q(f(y), x, z)))
is a formula, if f is a unary function symbol, P a unary predicate symbol, and Q a ternary predicate symbol. On the other hand, ∀x x → is not a formula, although it is a string of symbols from the alphabet.
The role of the parentheses in the definition is to ensure that any formula can only be obtained in one way by following the inductive definition (in other words, there is a unique parse tree for each formula). This property is known as unique readability of formulas. There are many conventions for where parentheses are used in formulas. For example, some authors use colons or full stops instead of parentheses, or change the places in which parentheses are inserted. Each author's particular definition must be accompanied by a proof of unique readability.
This definition of a formula does not support defining an if-then-else function ite(c, a, b), where "c" is a condition expressed as a formula, that would return "a" if c is true, and "b" if it is false. This is because both predicates and functions can only accept terms as parameters, but the first parameter is a formula. Some languages built on first-order logic, such as SMT-LIB 2.0, add this.[5]
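The inductive definitions of terms and formulas map naturally onto a small abstract-syntax representation. The Python sketch below is illustrative only; the class names are our own and only a few connectives are included.

```python
from dataclasses import dataclass

# Terms (term rules 1-2)
@dataclass(frozen=True)
class Var:                  # rule 1: any variable is a term
    name: str

@dataclass(frozen=True)
class Func:                 # rule 2: f(t1, ..., tn) is a term
    symbol: str
    args: tuple

# Formulas (formula rules 1-5, a representative subset)
@dataclass(frozen=True)
class Pred:                 # a predicate symbol applied to terms
    symbol: str
    args: tuple

@dataclass(frozen=True)
class Not:                  # negation
    body: object

@dataclass(frozen=True)
class Implies:              # one of the binary connectives
    left: object
    right: object

@dataclass(frozen=True)
class ForAll:               # universal quantification
    var: Var
    body: object

# forall x forall y (P(f(x)) -> not (P(x) -> Q(f(y), x, z)))
x, y, z = Var("x"), Var("y"), Var("z")
phi = ForAll(x, ForAll(y, Implies(
    Pred("P", (Func("f", (x,)),)),
    Not(Implies(Pred("P", (x,)), Pred("Q", (Func("f", (y,)), x, z)))))))
print(phi)
```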
Notational conventions
For convenience, conventions have been developed about the precedence of the logical operators, to avoid the need to write parentheses in some cases. These rules are similar to the order of operations in arithmetic. A common convention is:
¬ is evaluated first;
∧ and ∨ are evaluated next;
quantifiers are evaluated next;
→ is evaluated last.
Moreover, extra punctuation not required by the definition may be inserted to make formulas easier to read. Thus the formula
(∀x ∀y (P(f(x)) → ¬(P(x) → Q(f(y), x, z))))
might be written as
(∀x)(∀y) [ P(f(x)) → ¬ (P(x) → Q(f(y), x, z)) ].
In some fields, it is common to use infix notation for binary relations and functions, instead of the prefix notation defined above. For example, in arithmetic, one typically writes "2 + 2 = 4" instead of "=(+(2,2),4)". It is common to regard formulas in infix notation as abbreviations for the corresponding formulas in prefix notation. The definitions above use infix notation for binary connectives such as →.
A less common convention is Polish notation, in which one writes →, ∧, and so on in front of their arguments rather than between them. This convention allows all punctuation symbols to be discarded. Polish notation is compact and elegant, but rarely used in practice because it is hard for humans to read. In Polish notation, the formula above becomes, writing ∀ as Π, → as C, and ¬ as N:
ΠxΠyCPfxNCPxQfyxz.
Example
In mathematics the language of ordered abelian groups has one constant symbol 0, one unary function symbol −, one binary function symbol +, and one binary relation symbol ≤. Then:
The expressions +(x, y) and +(x, +(y, −(z))) are terms. These are usually written as x + y and x + y − z.
The expressions +(x, y) = 0 and ≤(+(x, +(y, −(z))), +(x, y)) are atomic formulas. These are usually written as x + y = 0 and x + y − z ≤ x + y.
The expression ∀x ∀y ≤(+(x, y), z) is a formula, which is usually written as ∀x ∀y (x + y ≤ z).
For example, in ∀x ∀y (P(x) → Q(x, f(x), z)), x and y are bound variables, z is a free variable, and w is neither, because it does not occur in the formula.
Freeness and boundness can be also specialized to specific occurrences of variables in a formula. For example, in P(x) → ∀x Q(x), the first occurrence of x is free while the second is bound. In other words, the x in P(x) is free while the x in ∀x Q(x) is bound.
A formula in first-order logic with no free variables is called a first-order sentence. These are the formulas that will have well-defined truth values under an interpretation. For example, whether a formula such as Phil(x) is true must depend on what x represents. But the sentence ∃x Phil(x) will be either true or false in a given interpretation.
Semantics
An interpretation of a first-order language assigns a denotation to all non-logical constants in that language. It also determines a domain of discourse that specifies the range of the quantifiers. The result is that each term is assigned an object that it represents, and each sentence is assigned a truth value. In this way, an interpretation provides semantic meaning to the terms and formulas of the language. The study of the interpretations of formal languages is called formal semantics.
The domain of discourse D is a nonempty set of "objects" of some kind. Intuitively, a first-order formula is a statement about these objects; for example, ∃x P(x) states the existence of an object x such that the predicate P is true of the object referred to by x. The domain of discourse is the set of considered objects. For example, one can take D to be the set of integer numbers.
The interpretation of a function symbol is a function. For example, if the domain of discourse consists of integers, a function symbol f of arity 2 can be interpreted as the function that gives the sum of its arguments. In other words, the symbol f is associated with the function I(f) which, in this interpretation, is addition.
The interpretation of a constant symbol is a function from the one-element set D⁰ to D, which can be simply identified with an object in D. For example, an interpretation may assign a particular integer to a constant symbol c.
The interpretation of an n-ary predicate symbol is a set of n-tuples of elements of the domain of discourse. This means that, given an interpretation, a predicate symbol, and n elements of the domain of discourse, one can tell whether the predicate is true of those elements according to the given interpretation. For example, an interpretation I(P) of a binary predicate symbol P may be the set of pairs of integers such that the first one is less than the second. According to this interpretation, the predicate P would be true if its first argument is less than the second.
First-order structures
The most common way of specifying an interpretation (especially in mathematics) is to specify a structure (also called a model; see below). The structure consists of a nonempty set D that forms the domain of discourse and an interpretation I of the non-logical terms of the signature. This interpretation is itself a function:
Each function symbol f of arity n is assigned a function I(f) from Dⁿ to D. In particular, each constant symbol of the signature is assigned an individual in the domain of discourse.
Each predicate symbol P of arity n is assigned a relation I(P) over Dⁿ or, equivalently, a function from Dⁿ to {true, false}. Thus each predicate symbol is interpreted by a Boolean-valued function on D.
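As a concrete, purely illustrative sketch, a finite structure for a small invented signature {0, s, <} can be written down directly in Python: the domain is a finite set, the constant and function symbols are mapped to elements of and functions on that set, and the predicate symbol is mapped to a set of tuples.

```python
# Domain of discourse: a finite set of integers (our choice, for illustration).
D = {0, 1, 2, 3, 4}

# Interpretation I of the non-logical symbols of the invented signature {0, s, <}.
I = {
    "0": 0,                                        # constant symbol -> element of D
    "s": lambda n: (n + 1) % 5,                    # unary function symbol -> function D -> D
    "<": {(a, b) for a in D for b in D if a < b},  # binary predicate symbol -> subset of D x D
}
```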
Next, each formula is assigned a truth value. The inductive definition used to make this assignment is called the T-schema. In the clauses below, M is an interpretation (with domain of discourse D) and μ is a variable assignment, a function mapping each variable to an element of the domain.
1. Atomic formulas (1). A formula P(t₁,...,tₙ) is associated the value true or false depending on whether ⟨v₁,...,vₙ⟩ ∈ I(P), where v₁,...,vₙ are the evaluations of the terms t₁,...,tₙ and I(P) is the interpretation of P, which by assumption is a subset of Dⁿ.
2. Atomic formulas (2). A formula t₁ = t₂ is assigned true if t₁ and t₂ evaluate to the same object of the domain of discourse (see the section on equality below).
3. Logical connectives. A formula in the form ¬φ, φ → ψ, etc. is evaluated according to the truth table for the connective in question, as in propositional logic.
4. Existential quantifiers. A formula ∃x φ(x) is true according to M and μ if there exists an evaluation μ′ of the variables that only differs from μ regarding the evaluation of x and such that φ is true according to the interpretation M and the variable assignment μ′. This formal definition captures the idea that ∃x φ(x) is true if and only if there is a way to choose a value for x such that φ(x) is satisfied.
5. Universal quantifiers. A formula ∀x φ(x) is true according to M and μ if φ(x) is true for every pair composed by the interpretation M and some variable assignment μ′ that differs from μ only on the value of x. This captures the idea that ∀x φ(x) is true if every possible choice of a value for x causes φ(x) to be true.
If a formula does not contain free variables, and so is a sentence, then the initial variable assignment does not affect its truth value. In other words, a sentence is true according to M and μ if and only if it is true according to M and every other variable assignment μ′.
There is a second common approach to defining truth values that does not rely on variable assignment functions. Instead, given an interpretation M, one first adds to the signature a collection of constant symbols, one for each element of the domain of discourse in M; say that for each d in the domain the constant symbol c_d is fixed. The interpretation is extended so that each new constant symbol is assigned to its corresponding element of the domain. One now defines truth for quantified formulas syntactically, as follows:
1. Existential quantifiers (alternate). A formula ∃x φ(x) is true according to M if there is some d in the domain of discourse such that φ(c_d) holds. Here φ(c_d) is the result of substituting c_d for every free occurrence of x in φ.
2. Universal quantifiers (alternate). A formula ∀x φ(x) is true according to M if, for every d in the domain of discourse, φ(c_d) is true according to M.
This alternate approach gives exactly the same truth values to all sentences as the approach via variable assignments.
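Over a finite structure like the one sketched earlier, the T-schema can be evaluated directly by recursion on the formula. The Python fragment below is a sketch only: the tuple encoding of terms and formulas and the function names are our own conventions, not a standard library.

```python
# Terms:    ("var", "x"), ("const", "0"), ("func", "s", (term, ...))
# Formulas: ("pred", "<", (t1, t2)), ("eq", t1, t2), ("not", f),
#           ("implies", f, g), ("exists", "x", f), ("forall", "x", f)

def eval_term(term, I, mu):
    kind = term[0]
    if kind == "var":
        return mu[term[1]]                         # look up the variable assignment mu
    if kind == "const":
        return I[term[1]]
    if kind == "func":
        return I[term[1]](*(eval_term(t, I, mu) for t in term[2]))
    raise ValueError(kind)

def holds(formula, D, I, mu):
    kind = formula[0]
    if kind == "pred":                             # atomic formulas (1)
        return tuple(eval_term(t, I, mu) for t in formula[2]) in I[formula[1]]
    if kind == "eq":                               # atomic formulas (2)
        return eval_term(formula[1], I, mu) == eval_term(formula[2], I, mu)
    if kind == "not":                              # logical connectives
        return not holds(formula[1], D, I, mu)
    if kind == "implies":
        return (not holds(formula[1], D, I, mu)) or holds(formula[2], D, I, mu)
    if kind == "exists":                           # vary the assignment at one variable
        return any(holds(formula[2], D, I, {**mu, formula[1]: d}) for d in D)
    if kind == "forall":
        return all(holds(formula[2], D, I, {**mu, formula[1]: d}) for d in D)
    raise ValueError(kind)

# A tiny structure (same shape as the earlier sketch) and a sentence to evaluate.
D = {0, 1, 2, 3, 4}
I = {"0": 0, "s": lambda n: (n + 1) % 5, "<": {(a, b) for a in D for b in D if a < b}}

# forall x exists y (x < y): false here, since no y in D satisfies 4 < y.
sentence = ("forall", "x", ("exists", "y", ("pred", "<", (("var", "x"), ("var", "y")))))
print(holds(sentence, D, I, {}))   # False
```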
Algebraizations
An alternate approach to the semantics of first-order logic proceeds via abstract algebra. This approach generalizes the LindenbaumTarski algebras of propositional logic. There are three ways of eliminating quantified variables from first-order logic, that do not involve replacing quantifiers with other variable binding term operators: Cylindric algebra, by Alfred Tarski and his coworkers; Polyadic algebra, by Paul Halmos; Predicate functor logic, mainly due to Willard Quine. These algebras are all lattices that properly extend the two-element Boolean algebra. Tarski and Givant (1987) showed that the fragment of first-order logic that has no atomic sentence lying in the scope of more than three quantifiers, has the same expressive power as relation algebra. This fragment is of great interest because it suffices for Peano arithmetic and most axiomatic set theory, including the canonical ZFC. They also prove that first-order logic with a primitive ordered pair is equivalent to a relation algebra with two ordered pair projection functions.
Empty domains
The definition above requires that the domain of discourse of any interpretation must be a nonempty set. There are settings, such as inclusive logic, where empty domains are permitted. Moreover, if a class of algebraic structures includes an empty structure (for example, there is an empty poset), that class can only be an elementary class in first-order logic if empty domains are permitted or the empty structure is removed from the class.
There are several difficulties with empty domains, however:
Many common rules of inference are only valid when the domain of discourse is required to be nonempty. One example is the rule stating that φ ∨ ∃x ψ implies ∃x (φ ∨ ψ) when x is not a free variable in φ. This rule, which is used to put formulas into prenex normal form, is sound in nonempty domains, but unsound if the empty domain is permitted.
The definition of truth in an interpretation that uses a variable assignment function cannot work with empty domains, because there are no variable assignment functions whose range is empty. (Similarly, one cannot assign interpretations to constant symbols.) This truth definition requires that one must select a variable assignment function (μ above) before truth values for even atomic formulas can be defined. Then the truth value of a sentence is defined to be its truth value under any variable assignment, and it is proved that this truth value does not depend on which assignment is chosen. This technique does not work if there are no assignment functions at all; it must be changed to accommodate empty domains.
Thus, when the empty domain is permitted, it must often be treated as a special case. Most authors, however, simply exclude the empty domain by definition.
Deductive systems
A deductive system is used to demonstrate, on a purely syntactic basis, that one formula is a logical consequence of another formula. There are many such systems for first-order logic, including Hilbert-style deductive systems, natural deduction, the sequent calculus, the tableaux method, and resolution. These share the common property that a deduction is a finite syntactic object; the format of this object, and the way it is constructed, vary widely. These finite deductions themselves are often called derivations in proof theory. They are also often called proofs, but are completely formalized, unlike natural-language mathematical proofs.
A deductive system is sound if any formula that can be derived in the system is logically valid. Conversely, a deductive system is complete if every logically valid formula is derivable. All of the systems discussed in this article are both sound and complete. They also share the property that it is possible to effectively verify that a purportedly valid deduction is actually a deduction; such deduction systems are called effective.
A key property of deductive systems is that they are purely syntactic, so that derivations can be verified without considering any interpretation. Thus a sound argument is correct in every possible interpretation of the language, regardless of whether that interpretation is about mathematics, economics, or some other area.
In general, logical consequence in first-order logic is only semidecidable: if a sentence A logically implies a sentence B then this can be discovered (for example, by searching for a proof until one is found, using some effective, sound, complete proof system). However, if A does not logically imply B, this does not mean that A logically implies the negation of B. There is no effective procedure that, given formulas A and B, always correctly decides whether A logically implies B.
Rules of inference
Further information: List of rules of inference
A rule of inference states that, given a particular formula (or set of formulas) with a certain property as a hypothesis, another specific formula (or set of formulas) can be derived as a conclusion. The rule is sound (or truth-preserving) if it preserves validity in the sense that whenever any interpretation satisfies the hypothesis, that interpretation also satisfies the conclusion.
For example, one common rule of inference is the rule of substitution. If t is a term and φ is a formula possibly containing the variable x, then φ[t/x] (often denoted φ[x/t]) is the result of replacing all free instances of x by t in φ. The substitution rule states that for any φ and any term t, one can conclude φ[t/x] from ∀x φ provided that no free variable of t becomes bound during the substitution process. (If some free variable of t becomes bound, then to substitute t for x it is first necessary to change the bound variables of φ to differ from the free variables of t.)
To see why the restriction on bound variables is necessary, consider the logically valid formula φ given by ∃x (x = y), in the signature of (0, 1, +, ×, =) of arithmetic. If t is the term "x + 1", the formula φ[t/y] is ∃x (x = x + 1), which will be false in many interpretations. The problem is that the free variable x of t became bound during the substitution. The intended replacement can be obtained by renaming the bound variable x of φ to something else, say z, so that the formula after substitution is ∃z (z = x + 1), which is again logically valid.
The substitution rule demonstrates several common aspects of rules of inference. It is entirely syntactical; one can tell whether it was correctly applied without appeal to any interpretation. It has (syntactically-defined) limitations on when it can be applied, which must be respected to preserve the correctness of derivations. Moreover, as is often the case, these limitations are necessary because of interactions between free and bound variables that occur during syntactic manipulations of the formulas involved in the inference rule.
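The capture problem just described can also be illustrated mechanically. The following sketch, with an invented tuple representation of formulas and terms, substitutes a term for a free variable and renames a bound variable whenever it would capture a free variable of the term.

# Minimal sketch of capture-avoiding substitution φ[t/x], assuming formulas are
# nested tuples such as ('exists', 'x', ('=', 'x', 'y')) and terms are variable
# names or tuples like ('+', 'x', '1'). All names here are illustrative.

import itertools

def term_vars(t):
    return {t} if isinstance(t, str) else set().union(*(term_vars(a) for a in t[1:]))

def subst_term(t, x, new):
    if isinstance(t, str):
        return new if t == x else t
    return (t[0],) + tuple(subst_term(a, x, new) for a in t[1:])

def subst(formula, x, t, fresh=itertools.count()):
    op = formula[0]
    if op in ('exists', 'forall'):
        _, var, body = formula
        if var == x:                      # x is bound here: nothing to substitute
            return formula
        if var in term_vars(t):           # rename the bound variable to avoid capture
            z = f"z{next(fresh)}"
            body = subst(body, var, z, fresh)
            var = z
        return (op, var, subst(body, x, t, fresh))
    if op == 'not':
        return ('not', subst(formula[1], x, t, fresh))
    if op in ('and', 'or', 'implies'):
        return (op,) + tuple(subst(a, x, t, fresh) for a in formula[1:])
    # atomic formula ('=' or a predicate symbol): substitute inside its argument terms
    return (op,) + tuple(subst_term(a, x, t) for a in formula[1:])

# Substituting the term x+1 for y in  exists x (x = y)  renames the bound x,
# giving  exists z0 (z0 = x+1)  rather than the incorrect  exists x (x = x+1).
phi = ('exists', 'x', ('=', 'x', 'y'))
print(subst(phi, 'y', ('+', 'x', '1')))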
Sequent calculus
Further information: Sequent calculus
The sequent calculus was developed to study the properties of natural deduction systems. Instead of working with one formula at a time, it uses sequents, which are expressions of the form
A1, ..., An ⊢ B1, ..., Bk
where A1, ..., An, B1, ..., Bk are formulas and the turnstile symbol ⊢ is used as punctuation to separate the two halves. Intuitively, a sequent expresses the idea that (A1 ∧ ... ∧ An) implies (B1 ∨ ... ∨ Bk).
Tableaux method
Further information: Method of analytic tableaux
Unlike the methods just described, the derivations in the tableaux method are not lists of formulas. Instead, a derivation is a tree of formulas. To show that a formula A is provable, the tableaux method attempts to demonstrate that the negation of A is unsatisfiable. The tree of the derivation has ¬A at its root; the tree branches in a way that reflects the structure of the formula. For example, to show that C ∨ D is unsatisfiable requires showing that C and D are each unsatisfiable; this corresponds to a branching point in the tree with parent C ∨ D and children C and D.
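The branching behaviour is easiest to see in the propositional case. The sketch below is an illustrative propositional tableau expander, not the full first-order method: conjunctions extend a branch, disjunctions split it, and a branch closes when it contains a formula together with its negation; the input is unsatisfiable exactly when every branch closes. The representation and names are invented for this example.

# A minimal propositional tableau sketch. Formulas are nested tuples:
# ('and', f, g), ('or', f, g), ('not', f), or a string for an atom.
# To prove A, one would run closed([('not', A)]) on the negation of A.

def negate(f):
    return f[1] if isinstance(f, tuple) and f[0] == 'not' else ('not', f)

def closed(branch):
    """Return True if every branch of the tableau rooted at this set of formulas closes."""
    # a branch closes when it contains both a formula and its negation
    if any(negate(f) in branch for f in branch):
        return True
    for f in branch:
        if isinstance(f, tuple):
            rest = [g for g in branch if g != f]
            if f[0] == 'and':                      # both conjuncts extend the same branch
                return closed(rest + [f[1], f[2]])
            if f[0] == 'or':                       # branching point: two children
                return closed(rest + [f[1]]) and closed(rest + [f[2]])
            if f[0] == 'not' and isinstance(f[1], tuple):
                g = f[1]
                if g[0] == 'not':                  # double negation
                    return closed(rest + [g[1]])
                if g[0] == 'and':                  # negated conjunction branches
                    return closed(rest + [negate(g[1])]) and closed(rest + [negate(g[2])])
                if g[0] == 'or':                   # negated disjunction extends the branch
                    return closed(rest + [negate(g[1]), negate(g[2])])
    return False                                   # an open, fully expanded branch remains

# ((a or not-b) and b) implies a  is valid, so the tableau for its negation closes.
formula = ('and', ('and', ('or', 'a', ('not', 'b')), 'b'), ('not', 'a'))
print(closed([formula]))   # True: every branch closes, so the negated formula is unsatisfiable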
Resolution
The resolution rule is a single rule of inference that, together with unification, is sound and complete for first-order logic. As with the tableaux method, a formula is proved by showing that the negation of the formula is unsatisfiable. Resolution is commonly used in automated theorem proving.
[Figure] A tableau proof for the propositional formula ((a ∨ ~b) & b) → a.
The resolution method works only with formulas that are disjunctions of atomic formulas; arbitrary formulas must first be converted to this form through Skolemization. The resolution rule states that from the hypotheses A1 ∨ ... ∨ Ak ∨ C and B1 ∨ ... ∨ Bl ∨ ¬C, the conclusion A1 ∨ ... ∨ Ak ∨ B1 ∨ ... ∨ Bl can be obtained.
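A propositional version of the resolution step can be written out directly; the first-order method adds Skolemization and unification on top of it. The sketch below uses an invented clause representation (frozensets of signed atoms) and saturates a clause set, reporting unsatisfiability when the empty clause appears.

# Minimal propositional resolution sketch. A literal is ('p', True) or ('p', False);
# a clause is a frozenset of literals; a formula in clause form is a set of clauses.

def resolve(c1, c2):
    """All resolvents of two clauses: cancel one complementary pair at a time."""
    resolvents = set()
    for (atom, sign) in c1:
        if (atom, not sign) in c2:
            resolvents.add((c1 - {(atom, sign)}) | (c2 - {(atom, not sign)}))
    return resolvents

def unsatisfiable(clauses):
    """Saturate the clause set; the empty clause witnesses unsatisfiability."""
    clauses = set(clauses)
    while True:
        new = set()
        pairs = [(c1, c2) for c1 in clauses for c2 in clauses if c1 != c2]
        for c1, c2 in pairs:
            for r in resolve(c1, c2):
                if not r:
                    return True          # derived the empty clause
                new.add(r)
        if new <= clauses:
            return False                 # saturated without deriving the empty clause
        clauses |= new

# Clause form of the negation of ((a or not-b) and b) implies a : {a, not-b}, {b}, {not-a}.
clauses = {frozenset({('a', True), ('b', False)}),
           frozenset({('b', True)}),
           frozenset({('a', False)})}
print(unsatisfiable(clauses))   # True, so the original implication is valid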
Provable identities
The following sentences can be called "identities" because the main connective in each is the biconditional.
¬∀x P(x) ⇔ ∃x ¬P(x)
¬∃x P(x) ⇔ ∀x ¬P(x)
∀x ∀y P(x,y) ⇔ ∀y ∀x P(x,y)
∃x ∃y P(x,y) ⇔ ∃y ∃x P(x,y)
∀x P(x) ∧ ∀x Q(x) ⇔ ∀x (P(x) ∧ Q(x))
∃x P(x) ∨ ∃x Q(x) ⇔ ∃x (P(x) ∨ Q(x))
P ∧ ∃x Q(x) ⇔ ∃x (P ∧ Q(x)) (where x is not a free variable of P)
P ∨ ∀x Q(x) ⇔ ∀x (P ∨ Q(x)) (where x is not a free variable of P)
2. Substitution for functions. For all variables x and y, and any function symbol f, x = y → f(...,x,...) = f(...,y,...).
3. Substitution for formulas. For any variables x and y and any formula φ(x), if φ' is obtained by replacing any number of free occurrences of x in φ with y, such that these remain free occurrences of y, then x = y → (φ → φ').
These are axiom schemes, each of which specifies an infinite set of axioms. The third scheme is known as Leibniz's law, "the principle of substitutivity", "the indiscernibility of identicals", or "the replacement property". The second scheme, involving the function symbol f, is (equivalent to) a special case of the third scheme, using the formula x = y → (f(...,x,...) = z → f(...,y,...) = z).
Many other properties of equality are consequences of the axioms above, for example:
1. Symmetry. If x = y then y = x.
2. Transitivity. If x = y and y = z then x = z.
Metalogical properties
One motivation for the use of first-order logic, rather than higher-order logic, is that first-order logic has many metalogical properties that stronger logics do not have. These results concern general properties of first-order logic itself, rather than properties of individual theories. They provide fundamental tools for the construction of models of first-order theories.
model. Thus the class of all finite graphs is not an elementary class (the same holds for many other algebraic structures).
There are also more subtle limitations of first-order logic that are implied by the compactness theorem. For example, in computer science, many situations can be modeled as a directed graph of states (nodes) and connections (directed edges). Validating such a system may require showing that no "bad" state can be reached from any "good" state. Thus one seeks to determine if the good and bad states are in different connected components of the graph. However, the compactness theorem can be used to show that the class of connected graphs is not an elementary class in first-order logic, and there is no formula φ(x,y) of first-order logic, in the signature of graphs, that expresses the idea that there is a path from x to y. Connectedness can be expressed in second-order logic, however, but not with only existential set quantifiers, as Σ¹₁ also enjoys compactness.
Lindström's theorem
Per Lindström showed that the metalogical properties just discussed actually characterize first-order logic in the sense that no stronger logic has the properties (Ebbinghaus and Flum 1994, Chapter XIII). Lindström defined a class of abstract logical systems, so that it makes sense to say that one system is stronger than another. He established two theorems for systems of this type:
A logical system satisfying Lindström's definition that contains first-order logic and satisfies both the Löwenheim–Skolem theorem and the compactness theorem must be equivalent to first-order logic.
A logical system satisfying Lindström's definition that has a semidecidable logical consequence relation and satisfies the Löwenheim–Skolem theorem must be equivalent to first-order logic.
Limitations
Although first-order logic is sufficient for formalizing much of mathematics, and is commonly used in computer science and other fields, it has certain limitations. These include limitations on its expressiveness and limitations of the fragments of natural languages that it can describe.
Expressiveness
The Löwenheim–Skolem theorem shows that if a first-order theory has any infinite model, then it has infinite models of every cardinality. In particular, no first-order theory with an infinite model can be categorical. Thus there is no first-order theory whose only model has the set of natural numbers as its domain, or whose only model has the set of real numbers as its domain. Many extensions of first-order logic, including infinitary logics and higher-order logics, are more expressive in the sense that they do admit categorical axiomatizations of the natural numbers or real numbers. This expressiveness has a cost in terms of metalogical properties; by Lindström's theorem, any logic stronger than first-order logic must fail the conclusion of the compactness theorem or the conclusion of the downward Löwenheim–Skolem theorem.
Relative adjective modifier: an expression such as "terribly", when applied to a relative adjective such as "small", results in a new composite relative adjective "terribly small".
Preposition: the preposition "next to", when applied to "John", results in the predicate adverbial "next to John".
Restricted languages
First-order logic can be studied in languages with fewer logical symbols than were described above. Because ∃x φ(x) can be expressed as ¬∀x ¬φ(x), and ∀x φ(x) can be expressed as ¬∃x ¬φ(x), either of the two quantifiers ∃ and ∀ can be dropped. Because φ ∨ ψ can be expressed as ¬(¬φ ∧ ¬ψ), and φ ∧ ψ can be expressed as ¬(¬φ ∨ ¬ψ), either ∨ or ∧ can be dropped. In other words, it is sufficient to have ¬ and ∨, or ¬ and ∧, as the only logical connectives. Similarly, it is sufficient to have only ¬ and → as logical connectives, or to have only the Sheffer stroke (NAND) or the Peirce arrow (NOR) operator.
It is possible to entirely avoid function symbols and constant symbols, rewriting them via predicate symbols in an appropriate way. For example, instead of using a constant symbol 0 one may use a predicate 0(x) (interpreted as x = 0), and replace every predicate such as P(0,y) with ∀x (0(x) → P(x,y)). A function such as f(x1, ..., xn) will similarly be replaced by a predicate F(x1, ..., xn, y) interpreted as y = f(x1, ..., xn). This change requires adding additional axioms to the theory at hand, so that interpretations of the predicate symbols used have the correct semantics.
Restrictions such as these are useful as a technique to reduce the number of inference rules or axiom schemes in deductive systems, which leads to shorter proofs of metalogical results. The cost of the restrictions is that it becomes more difficult to express natural-language statements in the formal system at hand, because the logical connectives
used in the natural language statements must be replaced by their (longer) definitions in terms of the restricted collection of logical connectives. Similarly, derivations in the limited systems may be longer than derivations in systems that include additional connectives. There is thus a trade-off between the ease of working within the formal system and the ease of proving results about the formal system. It is also possible to restrict the arities of function symbols and predicate symbols, in sufficiently expressive theories. One can in principle dispense entirely with functions of arity greater than 2 and predicates of arity greater than 1 in theories that include a pairing function. This is a function of arity 2 that takes pairs of elements of the domain and returns an ordered pair containing them. It is also sufficient to have two predicate symbols of arity 2 that define projection functions from an ordered pair to its components. In either case it is necessary that the natural axioms for a pairing function and its projections are satisfied.
Many-sorted logic
Ordinary first-order interpretations have a single domain of discourse over which all quantifiers range. Many-sorted first-order logic allows variables to have different sorts, which have different domains. This is also called typed first-order logic, and the sorts are called types (as in data type), but it is not the same as first-order type theory. Many-sorted first-order logic is often used in the study of second-order arithmetic.
When there are only finitely many sorts in a theory, many-sorted first-order logic can be reduced to single-sorted first-order logic. One introduces into the single-sorted theory a unary predicate symbol for each sort in the many-sorted theory, and adds an axiom saying that these unary predicates partition the domain of discourse. For example, if there are two sorts, one adds predicate symbols P1(x) and P2(x) and the axiom
∀x (P1(x) ∨ P2(x)) ∧ ¬∃x (P1(x) ∧ P2(x)).
Then the elements satisfying P1 are thought of as elements of the first sort, and elements satisfying P2 as elements of the second sort. One can quantify over each sort by using the corresponding predicate symbol to limit the range of quantification. For example, to say there is an element of the first sort satisfying formula φ(x), one writes
∃x (P1(x) ∧ φ(x)).
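The reduction can be shown as a small formula transformation: each sorted quantifier is relativized to the unary predicate for its sort. The sketch below uses an invented tuple representation of sorted formulas; P1 and P2 play the role of the sort predicates introduced above.

# A sketch of relativizing sorted quantifiers, assuming formulas are tuples such as
# ('exists', ('x', 'sort1'), body) and ('forall', ('x', 'sort2'), body); the sort
# predicates P1, P2 are the unary symbols introduced in the single-sorted theory.

SORT_PREDICATE = {'sort1': 'P1', 'sort2': 'P2'}

def relativize(formula):
    op = formula[0]
    if op == 'exists':                    # exists x:s phi  becomes  exists x (Ps(x) and phi)
        (var, sort), body = formula[1], formula[2]
        return ('exists', var, ('and', (SORT_PREDICATE[sort], var), relativize(body)))
    if op == 'forall':                    # forall x:s phi  becomes  forall x (Ps(x) implies phi)
        (var, sort), body = formula[1], formula[2]
        return ('forall', var, ('implies', (SORT_PREDICATE[sort], var), relativize(body)))
    if op in ('and', 'or', 'implies'):
        return (op, relativize(formula[1]), relativize(formula[2]))
    if op == 'not':
        return ('not', relativize(formula[1]))
    return formula                        # atomic formulas are left unchanged

# "There is an element of the first sort satisfying R(x)":
print(relativize(('exists', ('x', 'sort1'), ('R', 'x'))))
# ('exists', 'x', ('and', ('P1', 'x'), ('R', 'x')))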
Additional quantifiers
Additional quantifiers can be added to first-order logic. Sometimes it is useful to say that "P(x) holds for exactly one x", which can be expressed as ∃!x P(x). This notation, called uniqueness quantification, may be taken to abbreviate a formula such as ∃x (P(x) ∧ ∀y (P(y) → (x = y))).
First-order logic with extra quantifiers has new quantifiers Qx, ..., with meanings such as "there are many x such that ...". Also see branching quantifiers and the plural quantifiers of George Boolos and others. Bounded quantifiers are often used in the study of set theory or arithmetic.
Infinitary logics
Infinitary logic allows infinitely long sentences. For example, one may allow a conjunction or disjunction of infinitely many formulas, or quantification over infinitely many variables. Infinitely long sentences arise in areas of mathematics including topology and model theory. Infinitary logic generalizes first-order logic to allow formulas of infinite length. The most common way in which formulas can become infinite is through infinite conjunctions and disjunctions. However, it is also possible to admit generalized signatures in which function and relation symbols are allowed to have infinite arities, or in which quantifiers can bind infinitely many variables. Because an infinite formula cannot be represented by a finite string, it is necessary to choose some other representation of formulas; the usual representation in this context is a tree. Thus formulas are, essentially, identified with their parse trees, rather than with the strings being parsed.
The most commonly studied infinitary logics are denoted Lαβ, where α and β are each either cardinal numbers or the symbol ∞. In this notation, ordinary first-order logic is Lωω. In the logic L∞ω, arbitrary conjunctions or disjunctions are allowed when building formulas, and there is an unlimited supply of variables. More generally, the logic that permits conjunctions or disjunctions with less than κ constituents is known as Lκω. For example, Lω1ω permits countable conjunctions and disjunctions.
The set of free variables in a formula of Lκω can have any cardinality strictly less than κ, yet only finitely many of them can be in the scope of any quantifier when a formula appears as a subformula of another.[6] In other infinitary logics, a subformula may be in the scope of infinitely many quantifiers. For example, in Lκ∞, a single universal or existential quantifier may bind arbitrarily many variables simultaneously. Similarly, the logic Lκλ permits simultaneous quantification over fewer than λ variables, as well as conjunctions and disjunctions of size less than κ.
Higher-order logics
The characteristic feature of first-order logic is that individuals can be quantified, but not predicates. Thus a formula such as
∃x Phil(x)
is a legal first-order formula, but
∃P P(a),
in which P is a predicate variable, is not, in most formalizations of first-order logic. Second-order logic extends first-order logic by adding the latter type of quantification. Other higher-order logics allow quantification over even higher types than second-order logic permits. These higher types include relations between relations, functions from relations to relations between relations, and other higher-type objects. Thus the "first" in first-order logic describes the type of objects that can be quantified.
Unlike first-order logic, for which only one semantics is studied, there are several possible semantics for second-order logic. The most commonly employed semantics for second-order and higher-order logic is known as full semantics. The combination of additional quantifiers and the full semantics for these quantifiers makes higher-order logic stronger than first-order logic. In particular, the (semantic) logical consequence relation for second-order and higher-order logic is not semidecidable; there is no effective deduction system for second-order logic that is sound and complete under full semantics.
Second-order logic with full semantics is more expressive than first-order logic. For example, it is possible to create axiom systems in second-order logic that uniquely characterize the natural numbers and the real line. The cost of this expressiveness is that second-order and higher-order logics have fewer attractive metalogical properties than
first-order logic. For example, the Löwenheim–Skolem theorem and compactness theorem of first-order logic become false when generalized to higher-order logics with full semantics.
Notes
[1] Mendelson, Elliott (1964). Introduction to Mathematical Logic. Van Nostrand Reinhold. pp. 56.
[2] The word language is sometimes used as a synonym for signature, but this can be confusing because "language" can also refer to the set of formulas.
[3] More precisely, there is only one language of each variant of one-sorted first-order logic: with or without equality, with or without functions, with or without propositional variables, and so on.
[4] Some authors who use the term "well-formed formula" use "formula" to mean any string of symbols from the alphabet. However, most authors in mathematical logic use "formula" to mean "well-formed formula" and have no term for non-well-formed formulas. In every context, it is only the well-formed formulas that are of interest.
[5] The SMT-LIB Standard: Version 2.0, by Clark Barrett, Aaron Stump, and Cesare Tinelli. http://goedel.cs.uiowa.edu/smtlib/
[6] Some authors only admit formulas with finitely many free variables in Lκω, and more generally only formulas with fewer than λ free variables in Lκλ.
[7] Avigad et al. (2007) discuss the process of formally verifying a proof of the prime number theorem. The formalized proof required approximately 30,000 lines of input to the Isabelle proof verifier.
References
Peter Andrews, 2002. An Introduction to Mathematical Logic and Type Theory: To Truth Through Proof, 2nd ed. Berlin: Kluwer Academic Publishers, available from Springer.
Jeremy Avigad, Kevin Donnelly, David Gray, and Paul Raff, 2007. "A formally verified proof of the prime number theorem", ACM Transactions on Computational Logic, v. 9 n. 1. doi:10.1145/1297658.1297660
Jon Barwise, 1977. "An introduction to first-order logic", in Barwise, Jon, ed. (1982). Handbook of Mathematical Logic. Studies in Logic and the Foundations of Mathematics. Amsterdam: North-Holland. ISBN 978-0-444-86388-1.
Jon Barwise and John Etchemendy, 2000. Language, Proof and Logic. Stanford, CA: CSLI Publications (distributed by the University of Chicago Press).
Józef Maria Bocheński, 2007. A Précis of Mathematical Logic. Translated from the French and German editions by Otto Bird. Dordrecht, South Holland: D. Reidel.
José Ferreirós, 2001. "The Road to Modern Logic – An Interpretation", Bulletin of Symbolic Logic, Volume 7, Issue 4, pp. 441–484. doi:10.2307/2687794. http://jstor.org/stable/2687794 JSTOR (http://links.jstor.org/sici?sici=1079-8986(200112)7:4<441:TRTMLI>2.0.CO;2-O)
L. T. F. Gamut, 1991. Logic, Language, and Meaning, Volume 2: Introduction to Logic. Chicago: University of Chicago Press. ISBN 0-226-28088-8.
David Hilbert and Wilhelm Ackermann, 1950. Principles of Mathematical Logic (English translation). Chelsea. The 1928 first German edition was titled Grundzüge der theoretischen Logik.
Wilfrid Hodges, 2001. "Classical Logic I: First Order Logic", in Lou Goble, ed., The Blackwell Guide to Philosophical Logic. Blackwell.
Heinz-Dieter Ebbinghaus, Jörg Flum, and Wolfgang Thomas, 1994. Mathematical Logic, 2nd ed. Berlin, New York: Springer-Verlag. Undergraduate Texts in Mathematics. ISBN 978-0-387-94258-2.
External links
Stanford Encyclopedia of Philosophy: "Classical Logic" (http://plato.stanford.edu/entries/logic-classical/), by Stewart Shapiro. Covers syntax, model theory, and metatheory for first-order logic in the natural deduction style.
forall x: an introduction to formal logic (http://www.fecundity.com/logic/), by P. D. Magnus, covers formal semantics and proof theory for first-order logic.
Metamath (http://us.metamath.org/index.html): an ongoing online project to reconstruct mathematics as a huge first-order theory, using first-order logic and the axiomatic set theory ZFC. Principia Mathematica modernized.
Podnieks, Karl. Introduction to mathematical logic. (http://www.ltn.lv/~podnieks/)
Cambridge Mathematics Tripos Notes (http://john.fremlin.de/schoolwork/logic/index.html) (typeset by John Fremlin). These notes cover part of a past Cambridge Mathematics Tripos course taught to undergraduate students (usually) within their third year. The course is entitled "Logic, Computation and Set Theory" and covers ordinals and cardinals, posets and Zorn's lemma, propositional logic, predicate logic, set theory, and consistency issues related to ZFC and other set theories.
Second-order logic
In logic and mathematics second-order logic is an extension of first-order logic, which itself is an extension of propositional logic.[1] Second-order logic is in turn extended by higher-order logic and type theory.
First-order logic uses only variables that range over individuals (elements of the domain of discourse); second-order logic has these variables as well as additional variables that range over sets of individuals. For example, the second-order sentence
∀P ∀x (Px ∨ ¬Px)
says that for every set P of individuals and every individual x, either x is in P or it is not (this is the principle of bivalence). Second-order logic also includes variables quantifying over functions, and other variables as explained in the section Syntax below. Both first-order and second-order logic use the idea of a domain of discourse (often called simply the "domain" or the "universe"). The domain is a set of individual elements which can be quantified over.
Expressive power
Second-order logic is more expressive than first-order logic. For example, if the domain is the set of all real numbers, one can assert in first-order logic the existence of an additive inverse of each real number by writing ∀x ∃y (x + y = 0), but one needs second-order logic to assert the least-upper-bound property for sets of real numbers, which states that every bounded, nonempty set of real numbers has a supremum. If the domain is the set of all real numbers, the following second-order sentence expresses the least upper bound property:
∀A ((∃w (w ∈ A) ∧ ∃z ∀u (u ∈ A → u ≤ z)) → ∃x (∀u (u ∈ A → u ≤ x) ∧ ∀y (∀u (u ∈ A → u ≤ y) → x ≤ y)))
In second-order logic, it is possible to write formal sentences which say "the domain is finite" or "the domain is of countable cardinality." To say that the domain is finite, use the sentence that says that every surjective function from the domain to itself is injective. To say that the domain has countable cardinality, use the sentence that says that there is a bijection between every two infinite subsets of the domain. It follows from the compactness theorem and the upward Löwenheim–Skolem theorem that it is not possible to characterize finiteness or countability, respectively, in first-order logic.
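The characterization of finiteness can be checked by brute force on a small domain: every surjective self-map of a finite set is injective. The following sketch verifies this for a three-element domain; it is an illustration of the second-order sentence's intent, not a proof.

# A brute-force illustration of the second-order characterization of finiteness:
# on a finite domain, every surjective self-map is injective. This only checks
# one small domain; the domain chosen is illustrative.

from itertools import product

domain = [0, 1, 2]

def surjective(f):
    return set(f) == set(domain)

def injective(f):
    return len(set(f)) == len(f)

# enumerate all functions from the domain to itself as tuples of images
all_maps = product(domain, repeat=len(domain))
print(all(injective(f) for f in all_maps if surjective(f)))   # True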
Syntax
The syntax of second-order logic tells which expressions are well formed formulas. In addition to the syntax of first-order logic, second-order logic includes many new sorts (sometimes called types) of variables. These are:
A sort of variables that range over sets of individuals. If S is a variable of this sort and t is a first-order term then the expression t ∈ S (also written S(t) or St) is an atomic formula. Sets of individuals can also be viewed as unary relations on the domain.
For each natural number k there is a sort of variable that ranges over all k-ary relations on the individuals. If R is such a k-ary relation variable and t1, ..., tk are first-order terms then the expression R(t1, ..., tk) is an atomic formula.
For each natural number k there is a sort of variable that ranges over functions that take k elements of the domain and return a single element of the domain. If f is such a k-ary function variable and t1, ..., tk are first-order terms then the expression f(t1, ..., tk) is a first-order term.
For each of the sorts of variable just defined, it is permissible to build up formulas by using universal and/or existential quantifiers. Thus there are many sorts of quantifiers, two for each sort of variable. A sentence in second-order logic, as in first-order logic, is a well-formed formula with no free variables (of any sort).
In monadic second-order logic (MSOL), only variables for subsets of the domain are added. The second-order logic with all the sorts of variables just described is sometimes called full second-order logic to distinguish it from the monadic version.
Just as in first-order logic, second-order logic may include non-logical symbols in a particular second-order language. These are restricted, however, in that all terms that they form must be either first-order terms (which can be substituted for a first-order variable) or second-order terms (which can be substituted for a second-order variable of an appropriate sort).
Semantics
The semantics of second-order logic establish the meaning of each sentence. Unlike first-order logic, which has only one standard semantics, there are two different semantics that are commonly used for second-order logic: standard semantics and Henkin semantics. In each of these semantics, the interpretations of the first-order quantifiers and the logical connectives are the same as in first-order logic. Only the ranges of quantifiers over second-order variables differ in the two types of semantics. In standard semantics, also called full semantics, the quantifiers range over all sets or functions of the appropriate sort. Thus once the domain of the first-order variables is established, the meaning of the remaining quantifiers is fixed. It is these semantics that give second-order logic its expressive power, and they will be assumed for the remainder of this article. In Henkin semantics, each sort of second-order variable has a particular domain of its own to range over, which may be a proper subset of all sets or functions of that sort. Leon Henkin (1950) defined these semantics and proved that Gödel's completeness theorem and compactness theorem, which hold for first-order logic, carry over to second-order logic with Henkin semantics. This is because Henkin semantics are almost identical to many-sorted first-order semantics, where additional sorts of variables are added to simulate the new variables of second-order logic. Second-order logic with Henkin semantics is not more expressive than first-order logic. Henkin semantics are commonly used in the study of second-order arithmetic.
Deductive systems
A deductive system for a logic is a set of inference rules and logical axioms that determine which sequences of formulas constitute valid proofs. Several deductive systems can be used for second-order logic, although none can be complete for the standard semantics (see below). Each of these systems is sound, which means any sentence they can be used to prove is logically valid in the appropriate semantics. The weakest deductive system that can be used consists of a standard deductive system for first-order logic (such as natural deduction) augmented with substitution rules for second-order terms.[2] This deductive system is commonly used in the study of second-order arithmetic. The deductive systems considered by Shapiro (1991) and Henkin (1950) add to the augmented first-order deductive scheme both comprehension axioms and choice axioms. These axioms are sound for standard second-order semantics. They are sound for Henkin semantics if only Henkin models that satisfy the comprehension and choice axioms are considered.[3]
But notice that the domain was asserted to include all sets of real numbers. That requirement cannot be reduced to a first-order sentence, as the Löwenheim–Skolem theorem shows. That theorem implies that there is some countably infinite subset of the real numbers, whose members we will call internal numbers, and some countably infinite collection of sets of internal numbers, whose members we will call "internal sets", such that the domain consisting of internal numbers and internal sets satisfies exactly the same first-order sentences as are satisfied by the domain of real-numbers-and-sets-of-real-numbers. In particular, it satisfies a sort of least-upper-bound axiom that says, in effect:
Every nonempty internal set that has an internal upper bound has a least internal upper bound.
Countability of the set of all internal numbers (in conjunction with the fact that those form a densely ordered set) implies that that set does not satisfy the full least-upper-bound axiom. Countability of the set of all internal sets implies that it is not the set of all subsets of the set of all internal numbers (since Cantor's theorem implies that the set of all subsets of a countably infinite set is an uncountably infinite set). This construction is closely related to Skolem's paradox.
Thus the first-order theory of real numbers and sets of real numbers has many models, some of which are countable. The second-order theory of the real numbers has only one model, however. This follows from the classical theorem that there is only one Archimedean complete ordered field, along with the fact that all the axioms of an Archimedean complete ordered field are expressible in second-order logic. This shows that the second-order theory of the real numbers cannot be reduced to a first-order theory, in the sense that the second-order theory of the real numbers has only one model but the corresponding first-order theory has many models.
There are more extreme examples showing that second-order logic with standard semantics is more expressive than first-order logic. There is a finite second-order theory whose only model is the real numbers if the continuum hypothesis holds and which has no model if the continuum hypothesis does not hold (cf. Shapiro 2000, p. 105). This theory consists of a finite theory characterizing the real numbers as a complete Archimedean ordered field plus an axiom saying that the domain is of the first uncountable cardinality. This example illustrates that the question of whether a sentence in second-order logic is consistent is extremely subtle.
Additional limitations of second-order logic are described in the next section.
Metalogical results
It is a corollary of Gödel's incompleteness theorem that there is no deductive system (that is, no notion of provability) for second-order formulas that simultaneously satisfies these three desired attributes:[4]
(Soundness) Every provable second-order sentence is universally valid, i.e., true in all domains under standard semantics.
(Completeness) Every universally valid second-order formula, under standard semantics, is provable.
(Effectiveness) There is a proof-checking algorithm that can correctly decide whether a given sequence of symbols is a valid proof or not.
This corollary is sometimes expressed by saying that second-order logic does not admit a complete proof theory. In this respect second-order logic with standard semantics differs from first-order logic; Quine (1970, pp. 90–91) pointed to the lack of a complete proof system as a reason for thinking of second-order logic as not logic, properly speaking.
As mentioned above, Henkin proved that the standard deductive system for first-order logic is sound, complete, and effective for second-order logic with Henkin semantics, and the deductive system with comprehension and choice principles is sound, complete, and effective for Henkin semantics using only models that satisfy these principles.
Applications to complexity
The expressive power of various forms of second-order logic on finite structures is intimately tied to computational complexity theory. The field of descriptive complexity studies which computational complexity classes can be characterized by the power of the logic needed to express languages (sets of finite strings) in them. A string w = w1···wn in a finite alphabet A can be represented by a finite structure with domain D = {1, ..., n}, unary predicates Pa for each a ∈ A, satisfied by those indices i such that wi = a, and additional predicates which serve to uniquely identify which index is which (typically, one takes the graph of the successor function on D or the order relation <, possibly with other arithmetic predicates). Conversely, the table of any finite structure can be encoded by a finite string (a small encoding sketch follows the characterizations below). With this identification, we have the following characterizations of variants of second-order logic over finite structures:
NP is the set of languages definable by existential, second-order formulas (Fagin's theorem, 1974).
co-NP is the set of languages definable by universal, second-order formulas.
PH is the set of languages definable by second-order formulas.
PSPACE is the set of languages definable by second-order formulas with an added transitive closure operator.
EXPTIME is the set of languages definable by second-order formulas with an added least fixed point operator.
Relationships among these classes directly impact the relative expressiveness of the logics over finite structures; for example, if PH = PSPACE, then adding a transitive closure operator to second-order logic would not make it any more expressive over finite structures.
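The encoding of a word as a finite structure mentioned above can be spelled out in a few lines. The sketch below builds, for a word over a small alphabet, the domain of positions, one unary predicate per letter, and the order relation on positions; the alphabet and word are illustrative.

# Encoding a finite word as a finite relational structure, as used in
# descriptive complexity. The example word is illustrative.

def word_structure(w):
    domain = list(range(1, len(w) + 1))
    # unary predicate P_a holds at position i when the i-th letter is a
    letter_predicates = {a: {i for i, c in enumerate(w, start=1) if c == a}
                         for a in set(w)}
    # the linear order on positions distinguishes, e.g., "ab" from "ba"
    less_than = {(i, j) for i in domain for j in domain if i < j}
    return domain, letter_predicates, less_than

domain, preds, lt = word_structure("abba")
print(domain)          # [1, 2, 3, 4]
print(preds['a'])      # {1, 4}
print(preds['b'])      # {2, 3}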
Notes
[1] Shapiro (1991) and Hinman (2005) give complete introductions to the subject, with full definitions.
[2] Such a system is used without comment by Hinman (2005).
[3] These are the models originally studied by Henkin (1950).
[4] The proof of this corollary is that a sound, complete, and effective deduction system for standard semantics could be used to produce a recursively enumerable completion of Peano arithmetic, which Gödel's theorem shows cannot exist.
References
Andrews, Peter (2002). An Introduction to Mathematical Logic and Type Theory: To Truth Through Proof (2nd ed.). Kluwer Academic Publishers.
Boolos, George (1984). "To Be Is To Be a Value of a Variable (or to Be Some Values of Some Variables)". Journal of Philosophy 81 (8): 430–450. doi:10.2307/2026308. JSTOR 2026308. Reprinted in Boolos, Logic, Logic and Logic, 1998.
Henkin, L. (1950). "Completeness in the theory of types". Journal of Symbolic Logic 15 (2): 81–91. doi:10.2307/2266967. JSTOR 2266967.
Hinman, P. (2005). Fundamentals of Mathematical Logic. A K Peters. ISBN 1-56881-262-0.
Putnam, Hilary (1982). "Peirce the Logician". Historia Mathematica 9 (3): 290–301. doi:10.1016/0315-0860(82)90123-9. Reprinted in Putnam, Hilary (1990), Realism with a Human Face, Harvard University Press, pp. 252–260.
W. V. Quine (1970). Philosophy of Logic. Prentice-Hall.
Rossberg, M. (2004). "First-Order Logic, Second-Order Logic, and Completeness" (http://homepages.uconn.edu/~mar08022/papers/RossbergCompleteness.pdf). In V. Hendricks et al., eds., First-order logic revisited. Berlin: Logos-Verlag.
Shapiro, S. (2000). Foundations without Foundationalism: A Case for Second-order Logic. Oxford University Press. ISBN 0-19-825029-0.
Väänänen, J. (2001). "Second-Order Logic and Foundations of Mathematics" (http://www.math.ucla.edu/~asl/bsl/0704/0704-003.ps). Bulletin of Symbolic Logic 7 (4): 504–520. doi:10.2307/2687796. JSTOR 2687796.
Decidability (logic)
In logic, the term decidable refers to the decision problem, the question of the existence of an effective method for determining membership in a set of formulas. Logical systems such as propositional logic are decidable if membership in their set of logically valid formulas (or theorems) can be effectively determined. A theory (set of formulas closed under logical consequence) in a fixed logical system is decidable if there is an effective method for determining whether arbitrary formulas are included in the theory. Many important problems are undecidable.
Relationship to computability
As with the concept of a decidable set, the definition of a decidable theory or logical system can be given either in terms of effective methods or in terms of computable functions. These are generally considered equivalent per Church's thesis. Indeed, the proof that a logical system or theory is undecidable will use the formal definition of computability to show that an appropriate set is not a decidable set, and then invoke Church's thesis to show that the theory or logical system is not decidable by any effective method (Enderton 2001, pp.206ff.).
Decidability of a theory
A theory is a set of formulas, which here is assumed to be closed under logical consequence. The question of decidability for a theory is whether there is an effective procedure that, given an arbitrary formula in the signature of the theory, decides whether the formula is a member of the theory or not. This problem arises naturally when a theory is defined as the set of logical consequences of a fixed set of axioms. Examples of decidable first-order theories include the theory of real closed fields, and Presburger arithmetic, while the theory of groups and Robinson arithmetic are examples of undecidable theories.
There are several basic results about decidability of theories.
Every inconsistent theory is decidable, as every formula in the signature of the theory will be a logical consequence of, and thus a member of, the theory.
Every complete recursively enumerable first-order theory is decidable.
An extension of a decidable theory may not be decidable. For example, there are undecidable theories in propositional logic, although the set of validities (the smallest theory) is decidable.
A consistent theory which has the property that every consistent extension is undecidable is said to be essentially undecidable. In fact, every consistent extension will be essentially undecidable. The theory of fields is undecidable but not essentially undecidable. Robinson arithmetic is known to be essentially undecidable, and thus every consistent theory which includes or interprets Robinson arithmetic is also (essentially) undecidable.
Semidecidability
A property of a theory or logical system weaker than decidability is semidecidability. A theory is semidecidable if there is an effective method which, given an arbitrary formula, will always tell correctly when the formula is in the theory, but may give either a negative answer or no answer at all when the formula is not in the theory. A logical system is semidecidable if there is an effective method for generating theorems (and only theorems) such that every theorem will eventually be generated. This is different from decidability because in a semidecidable system there may be no effective procedure for checking that a formula is not a theorem. Every decidable theory or logical system is semidecidable, but in general the converse is not true; a theory is decidable if and only if both it and its complement are semi-decidable. For example, the set of logical validities V of first-order logic is semi-decidable, but not decidable. In this case, it is because there is no effective method for determining for an arbitrary formula A whether A is not in V. Similarly, the set of logical consequences of any recursively enumerable set of first-order axioms is semidecidable. Many of the examples of undecidable first-order theories given above are of this form.
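The asymmetry between positive and negative answers can be illustrated with a toy enumerator: listing theorems one by one eventually confirms membership, but the search alone never refutes it. The generator below is a stand-in for an effective enumeration of theorems; its names and the cut-off bound are invented for the illustration.

# A toy illustration of semidecidability, using a hypothetical generator of
# theorems. Enumerating theorems confirms membership eventually, but a naive
# search cannot confirm non-membership; here we cut the search off at a bound
# only to make the point observable.

def theorems():
    """Stand-in for an effective enumeration of the theorems of some theory."""
    n = 0
    while True:
        yield f"theorem_{n}"   # placeholder theorem names, illustrative only
        n += 1

def is_theorem(candidate, bound=10_000):
    for i, t in enumerate(theorems()):
        if t == candidate:
            return True        # positive answers always arrive, given enough time
        if i >= bound:
            return None        # no answer: the search alone cannot refute membership
    # unreachable: the enumeration never ends

print(is_theorem("theorem_42"))        # True
print(is_theorem("not_a_theorem"))     # None (gave up at the bound)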
References
Notes
[1] Trakhtenbrot, 1953
Bibliography
Barwise, Jon (1982), "Introduction to first-order logic", in Barwise, Jon, Handbook of Mathematical Logic, Studies in Logic and the Foundations of Mathematics, Amsterdam: North-Holland, ISBN978-0-444-86388-1 Cantone, D., E. G. Omodeo and A. Policriti, "Set Theory for Computing. From Decision Procedures to Logic Programming with Sets," Monographs in Computer Science, Springer, 2001. Chagrov, Alexander; Zakharyaschev, Michael (1997), Modal logic, Oxford Logic Guides, 35, The Clarendon Press Oxford University Press, ISBN978-0-19-853779-3, MR1464942 Davis, Martin (1958), Computability and Unsolvability, McGraw-Hill Book Company, Inc, New York Enderton, Herbert (2001), A mathematical introduction to logic (2nd ed.), Boston, MA: Academic Press, ISBN978-0-12-238452-3 Keisler, H. J. (1982), "Fundamentals of model theory", in Barwise, Jon, Handbook of Mathematical Logic, Studies in Logic and the Foundations of Mathematics, Amsterdam: North-Holland, ISBN978-0-444-86388-1 Monk, J. Donald (1976), Mathematical Logic, Berlin, New York: Springer-Verlag
List of first-order theories
Preliminaries
For every natural mathematical structure there is a signature σ listing the constants, functions, and relations of the theory together with their arities, so that the object is naturally a σ-structure. Given a signature σ there is a unique first-order language Lσ that can be used to capture the first-order expressible facts about the σ-structure.
There are two common ways to specify theories:
1. List or describe a set of sentences in the language Lσ, called the axioms of the theory.
2. Give a set of σ-structures, and define a theory to be the set of sentences in Lσ holding in all these models. For example, the "theory of finite fields" consists of all sentences in the language of fields that are true in all finite fields.
An Lσ theory may:
be consistent: no proof of contradiction exists;
be satisfiable: there exists a σ-structure for which the sentences of the theory are all true (by the completeness theorem, satisfiability is equivalent to consistency);
be complete: for any statement, either it or its negation is provable;
have quantifier elimination;
eliminate imaginaries;
be finitely axiomatizable;
be decidable: there is an algorithm to decide which statements are provable;
be recursively axiomatizable;
be model complete or sub-model complete;
be κ-categorical: all models of cardinality κ are isomorphic;
be stable or unstable;
be ω-stable (same as totally transcendental for countable theories);
be superstable;
have an atomic model;
have a prime model;
have a saturated model.
logic, but if a property needs an infinite number of sentences then its opposite property cannot be stated in first-order logic. Any statement of pure identity theory is equivalent to either σ(N) or to ¬σ(N) for some finite subset N of the non-negative integers, where σ(N) is the statement that the number of elements is in N. It is even possible to describe all possible theories in this language as follows. Any theory is either the theory of all sets of cardinality in N for some finite subset N of the non-negative integers, or the theory of all sets whose cardinality is not in N, for some finite or infinite subset N of the non-negative integers. (There are no theories whose models are exactly sets of cardinality in N if N is an infinite subset of the integers.) The complete theories are the theories of sets of cardinality n for some finite n, and the theory of infinite sets.
One special case of this is the inconsistent theory defined by the axiom ∃x ¬(x = x). It is a perfectly good theory with many good properties: it is complete, decidable, finitely axiomatizable, and so on. The only problem is that it has no models at all. By Gödel's completeness theorem, it is the only theory (for any given language) with no models.
Unary relations
A set of unary relations Pi for i in some set I is called independent if for every two disjoint finite subsets A and B of I there is some element x such that Pi(x) is true for i in A and false for i in B. Independence can be expressed by a set of first-order statements. The theory of a countable number of independent unary relations is complete, but has no atomic models. It is also an example of a theory that is superstable but not totally transcendental.
Equivalence relations
The signature of equivalence relations has one binary infix relation symbol ~, no constants, and no functions. Equivalence relations satisfy the axioms:
Reflexivity: ∀x x~x;
Symmetry: ∀x ∀y (x~y → y~x);
Transitivity: ∀x ∀y ∀z ((x~y ∧ y~z) → x~z).
Some first-order properties of equivalence relations are:
~ has an infinite number of equivalence classes;
~ has exactly n equivalence classes (for any fixed positive integer n);
All equivalence classes are infinite;
All equivalence classes have size exactly n (for any fixed positive integer n).
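These axioms and properties are easy to check on a finite structure. The sketch below verifies the three axioms for a finite binary relation and counts its equivalence classes; the example relation (same parity on {0, 1, 2, 3}) is illustrative.

# A small checker for the equivalence-relation axioms on a finite structure,
# plus a count of equivalence classes (one of the first-order properties above
# fixes this count). The example relation is illustrative.

def is_equivalence(domain, rel):
    reflexive = all((x, x) in rel for x in domain)
    symmetric = all((y, x) in rel for (x, y) in rel)
    transitive = all((x, z) in rel
                     for (x, y) in rel for (y2, z) in rel if y == y2)
    return reflexive and symmetric and transitive

def classes(domain, rel):
    return {frozenset(y for y in domain if (x, y) in rel) for x in domain}

domain = {0, 1, 2, 3}
rel = {(x, y) for x in domain for y in domain if x % 2 == y % 2}  # same parity
print(is_equivalence(domain, rel))   # True
print(len(classes(domain, rel)))     # 2 equivalence classes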
The theory of an equivalence relation with exactly 2 infinite equivalence classes is an easy example of a theory which is ω-categorical but not categorical for any larger cardinal.
The equivalence relation ~ should not be confused with the identity symbol '=': if x=y then x~y, but the converse is not necessarily true. Theories of equivalence relations are not all that difficult or interesting, but often give easy examples or counterexamples for various statements.
The following constructions are sometimes used to produce examples of theories with certain spectra; in fact by applying them to a small number of explicit theories T one gets examples of complete countable theories with all possible uncountable spectra. If T is a theory in some language, we define a new theory 2T by adding a new binary relation to the language, and adding axioms stating that it is an equivalence relation, such that there are an infinite number of equivalence classes all of which are models of T. It is possible to iterate this construction transfinitely: given an ordinal α, define a new theory by adding an equivalence relation Eβ for each β < α, together with axioms stating that whenever β < γ then each Eγ equivalence class is the union of infinitely many Eβ equivalence classes, and each E0 equivalence class is a model of T. Informally, one can visualize models of this theory as infinitely branching
trees of height α with models of T attached to all leaves.
Orders
The signature of orders has no constants or functions, and one binary relation symbol ≤. (It is of course possible to use ≥, < or > instead as the basic relation, with the obvious minor changes to the axioms.) We define x≥y, x<y, x>y as abbreviations for y≤x, x≤y ∧ ¬y≤x, y<x.
Some first-order properties of orders:
Transitive: ∀x ∀y ∀z ((x≤y ∧ y≤z) → x≤z)
Reflexive: ∀x x≤x
Antisymmetric: ∀x ∀y ((x≤y ∧ y≤x) → x = y)
Partial: Transitive ∧ Reflexive ∧ Antisymmetric
Linear (or total): Partial ∧ ∀x ∀y (x≤y ∨ y≤x)
Dense: ∀x ∀z (x<z → ∃y (x<y ∧ y<z)) ("Between any 2 distinct elements there is another element")
There is a smallest element: ∃x ∀y x≤y
There is a largest element: ∃x ∀y y≤x
Every element has an immediate successor: ∀x ∃y ∀z (x<z ↔ y≤z)
The theory DLO of dense linear orders with no endpoints (i.e. no smallest or largest element) is complete, ω-categorical, but not categorical for any uncountable cardinal. There are 3 other very similar theories: the theory of dense linear orders with a: Smallest but no largest element; Largest but no smallest element; Largest and smallest element. Being well ordered ("any non-empty subset has a minimal element") is not a first-order property; the usual definition involves quantifying over all subsets.
Lattices
Lattices can be considered either as special sorts of partially ordered sets, with a signature consisting of one binary relation symbol ≤, or as algebraic structures with a signature consisting of two binary operations ∧ and ∨. The two approaches can be related by defining a ≤ b to mean a∧b = a.
For two binary operations the axioms for a lattice are:
Commutative laws: a∨b = b∨a, a∧b = b∧a
Associative laws: a∨(b∨c) = (a∨b)∨c, a∧(b∧c) = (a∧b)∧c
Absorption laws: a∨(a∧b) = a, a∧(a∨b) = a
For one relation ≤ the axioms are:
Axioms stating ≤ is a partial order, as above.
∀a ∀b ∃c (c≤a ∧ c≤b ∧ ∀d ((d≤a ∧ d≤b) → d≤c)) (existence of c = a∧b)
∀a ∀b ∃c (a≤c ∧ b≤c ∧ ∀d ((a≤d ∧ b≤d) → c≤d)) (existence of c = a∨b)
First-order properties include:
x∨(y∧z) = (x∨y)∧(x∨z) (distributive lattices)
x∧(y∨(x∧z)) = (x∧y)∨(x∧z) (modular lattices)
Completeness is not a first-order property of lattices.
Graphs
The signature of graphs has no constants or functions, and one binary relation symbol R, where R(x,y) is read as "there is an edge from x to y". The axioms for the theory of graphs are
Symmetric: ∀x ∀y (R(x,y) → R(y,x))
Anti-reflexive: ∀x ¬R(x,x) ("no loops")
The theory of random graphs has the following extra axioms for each positive integer n:
For any two disjoint finite sets of size n, there is a point joined to all points of the first set and to no points of the second set. (For each fixed n, it is easy to write this statement in the language of graphs.)
The theory of random graphs is ω-categorical, complete, and decidable, and its countable model is called the Rado graph. A statement in the language of graphs is true in this theory if and only if it is true with probability 1 for a random graph on a countable number of points.
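The extension axioms can be tested empirically on a finite random graph; for small n they hold with high probability on a moderately large graph, which is the finite shadow of their holding with probability 1 in the countable limit. The sketch below checks the n = 1 axiom; the graph size and edge probability are illustrative, and the outcome is only very likely, not guaranteed.

# Checking the n = 1 extension axiom on a random graph: for every two distinct
# vertices u, v there should be a vertex joined to u and not to v. On a large
# enough random graph this holds with high probability; it is not guaranteed.

import random

def random_graph(size, p=0.5, seed=0):
    rng = random.Random(seed)
    edges = set()
    for u in range(size):
        for v in range(u + 1, size):
            if rng.random() < p:
                edges.add((u, v))
                edges.add((v, u))       # symmetric, no loops
    return edges

def extension_axiom_holds(size, edges):
    return all(any((w, u) in edges and (w, v) not in edges
                   for w in range(size) if w not in (u, v))
               for u in range(size) for v in range(size) if u != v)

edges = random_graph(60)
print(extension_axiom_holds(60, edges))   # very likely True for this size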
Boolean algebras
There are several different signatures and conventions used for Boolean algebras:
1. The signature has 2 constants, 0 and 1, two binary functions ∧ and ∨ ("and" and "or"), and one unary function ¬ ("not"). This is a little confusing as the functions use the same symbols as the propositional functions of first-order logic.
2. In set theory, a common convention is that the language has 2 constants, 0 and 1, two binary functions · and +, and one unary function −. The three functions have the same interpretation as the functions in the first convention. Unfortunately, this convention clashes badly with the next convention:
3. In algebra, the usual convention is that the language has 2 constants, 0 and 1, and two binary functions · and +. The function · has the same meaning as ∧, but a+b means (a∨b)∧¬(a∧b). The reason for this is that the axioms for a Boolean algebra are then just the axioms for a ring with 1 plus ∀x x·x = x. Unfortunately this clashes with the standard convention in set theory given above.
The axioms are:
The axioms for a distributive lattice (see above)
∀a a∧¬a = 0, ∀a a∨¬a = 1 (properties of negation)
Some authors add the extra axiom 0 ≠ 1, to exclude the trivial algebra with one element.
Tarski proved that the theory of Boolean algebras is decidable.
We write x ≤ y as an abbreviation for x∧y = x, and atom(x) as an abbreviation for ¬(x = 0) ∧ ∀y (y≤x → (y = 0 ∨ y = x)), read as "x is an atom", in other words a non-zero element with nothing between it and 0. Here are some first-order properties of Boolean algebras:
Atomic: ∀x (x = 0 ∨ ∃y (y≤x ∧ atom(y)))
Atomless: ∀x ¬atom(x)
The theory of atomless Boolean algebras is ω-categorical and complete.
For any Boolean algebra B, there are several invariants defined as follows.
The ideal I(B) consists of elements that are the sum of an atomic and an atomless element.
The quotient algebras Bi of B are defined inductively by B0 = B, Bk+1 = Bk/I(Bk).
The invariant m(B) is the smallest integer such that Bm+1 is trivial, or ∞ if no such integer exists.
If m(B) is finite, the invariant n(B) is the number of atoms of Bm(B) if this number is finite, or ∞ if this number is infinite.
The invariant l(B) is 0 if Bm(B) is atomic or if m(B) is ∞, and 1 otherwise.
Then two Boolean algebras are elementarily equivalent if and only if their invariants l, m, and n are the same. In other words, the values of these invariants classify the possible completions of the theory of Boolean algebras. So the possible complete theories are:
The trivial algebra (if this is allowed; sometimes 0 ≠ 1 is included as an axiom.)
The theory with m = ∞
The theories with m a natural number, n a natural number or ∞, and l = 0 or 1 (with l = 0 if n = 0).
Groups
The signature of group theory has one constant 1 (the identity), one function of arity 1 (the inverse) whose value on t is denoted by t−1, and one function of arity 2, which is usually omitted from terms. For any integer n, tn is an abbreviation for the obvious term for the nth power of t.
Groups are defined by the axioms
Identity: ∀x (1x = x ∧ x1 = x)
Inverse: ∀x (x−1x = 1 ∧ xx−1 = 1)
Associative: ∀x ∀y ∀z (xy)z = x(yz)
Some properties of groups that can be defined in the first-order language of groups are:
Abelian: ∀x ∀y xy = yx
Torsion free: ∀x (x2 = 1 → x = 1), ∀x (x3 = 1 → x = 1), ∀x (x4 = 1 → x = 1), ...
Divisible: ∀x ∃y y2 = x, ∀x ∃y y3 = x, ∀x ∃y y4 = x, ...
Infinite (as in identity theory)
Exponent n (for any fixed positive integer n): ∀x xn = 1
Nilpotent of class n (for any fixed positive integer n)
Solvable of class n (for any fixed positive integer n)
The theory of abelian groups is decidable. The theory of infinite divisible torsion-free abelian groups is complete, as is the theory of infinite abelian groups of exponent p (for p prime). The theory of finite groups is the set of first-order statements in the language of groups that are true in all finite groups (there are plenty of infinite models of this theory). It is not completely trivial to find any such statement that is not true for all groups: one example is "given two elements of order 2, either they are conjugate or there is a non-trivial element commuting with both of them". The properties of being finite, or free, or simple, or torsion are not first-order. More precisely, the first-order theory of all groups with one of these properties has models that do not have this property.
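Several of the first-order properties listed above can be checked directly on a finite group given by its multiplication table. The sketch below verifies the group axioms and the abelian property for Z/3Z; the representation is illustrative.

# Checking first-order group properties on a finite structure given by its
# multiplication table. The table below is Z/3Z, an illustrative example.

def is_group(elems, mul, e, inv):
    identity = all(mul[(e, x)] == x and mul[(x, e)] == x for x in elems)
    inverses = all(mul[(inv[x], x)] == e and mul[(x, inv[x])] == e for x in elems)
    associative = all(mul[(mul[(x, y)], z)] == mul[(x, mul[(y, z)])]
                      for x in elems for y in elems for z in elems)
    return identity and inverses and associative

def is_abelian(elems, mul):
    return all(mul[(x, y)] == mul[(y, x)] for x in elems for y in elems)

elems = [0, 1, 2]
mul = {(x, y): (x + y) % 3 for x in elems for y in elems}
inv = {x: (-x) % 3 for x in elems}
print(is_group(elems, mul, 0, inv))   # True
print(is_abelian(elems, mul))         # True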
For any positive integer n the property that all equations of degree n have a root can be expressed by a single first-order sentence: ∀a1 ∀a2 ... ∀an ∃x (...((x+a1)x+a2)x+ ...)x+an = 0
Perfect fields: The axioms for fields, plus axioms for each prime number p stating that if p·1 = 0 (i.e. the field has characteristic p), then every field element has a pth root.
Algebraically closed fields of characteristic p: The axioms for fields, plus for every positive n the axiom that all polynomials of degree n have a root, plus axioms fixing the characteristic. The classical examples of complete theories. Categorical in all uncountable cardinals. The theory ACFp has a universal domain property, in the sense that every structure N satisfying the universal axioms of ACFp is a substructure of a sufficiently large algebraically closed field M, and additionally any two such embeddings N → M induce an automorphism of M.
Finite fields: The theory of finite fields is the set of all first-order statements that are true in all finite fields. Significant examples of such statements can, for example, be given by applying the Chevalley–Warning theorem over the prime fields. The name is a little misleading as the theory has plenty of infinite models. Ax proved that the theory is decidable.
Formally real fields: These are fields with the axiom, for every positive n, ∀a1 ∀a2 ... ∀an ((a1a1 + a2a2 + ... + anan = 0) → (a1 = 0 ∧ a2 = 0 ∧ ... ∧ an = 0)) (0 is not a non-trivial sum of squares).
Real closed fields: Axioms:
∀x ∃y (x = yy ∨ x + yy = 0).
For every odd positive n, the axiom stating that every polynomial of degree n has a root.
For every positive n, the axiom ∀a1 ∀a2 ... ∀an ((a1a1 + a2a2 + ... + anan = 0) → (a1 = 0 ∧ a2 = 0 ∧ ... ∧ an = 0)) (0 is not a non-trivial sum of squares).
The theory of real closed fields is decidable (Tarski) and therefore complete.
p-adic fields: Ax & Kochen (1965) showed that the theory of p-adic fields is decidable and gave a set of axioms for it.
Geometry
Axioms for various systems of geometry usually use a typed language, with the different types corresponding to different geometric objects such as points, lines, circles, planes, and so on. The signature will often consist of binary incidence relations between objects of different types; for example, the relation that a point lies on a line. The signature may have more complicated relations; for example ordered geometry might have a ternary "betweenness" relation for 3 points, which says whether one lies between two others, or a "congruence" relation between 2 pairs of points. Some examples of axiomatized systems of geometry include ordered geometry, absolute geometry, affine geometry, Euclidean geometry, projective geometry, and hyperbolic geometry. For each of these geometries there are many different and inequivalent systems of axioms for various dimensions. Some of these axiom systems include "completeness" axioms that are not first order. As a typical example, the axioms for projective geometry use 2 types, points and lines, and a binary incidence relation between points and lines. If point and line variables are indicated by small and capital letters, and a incident to A is written as aA, then one set of axioms consists of: an axiom stating that there is a line through any 2 distinct points a, b; an axiom stating that this line is unique; and Veblen's axiom: if ab and cd lie on intersecting lines, then so do ac and bd.
Euclid did not state all the axioms for Euclidean geometry explicitly, and the first complete list was given by Hilbert in Hilbert's axioms. This is not a first order axiomatization as one of Hilbert's axioms is a second order completeness axiom. Tarski's axioms are a first order axiomatization of Euclidean geometry. Tarski showed this axiom system is complete and decidable by relating it to the complete and decidable theory of real closed fields.
Differential algebra
The theory DF of differential fields. The signature is that of fields (0, 1, +, −, ×) together with a unary function ∂, the derivation. The axioms are those for fields together with
∂(u + v) = ∂u + ∂v
∂(uv) = u ∂v + v ∂u
For this theory one can add the condition that the characteristic is p, a prime or zero, to get the theory DFp of differential fields of characteristic p (and similarly with the other theories below). If K is a differential field then the field of constants is k = {u ∈ K : ∂u = 0}. The theory of differentially perfect fields is the theory of differential fields together with the condition that the field of constants is perfect; in other words, for each prime p it has the axiom ∀u (∂u = 0 → ∃v vp = u). (There is little point in demanding that the whole field should be a perfect field, because in non-zero characteristic this implies the differential is 0.) For technical reasons to do with quantifier elimination it is sometimes more convenient to force the constant field to be perfect by adding a new symbol r to the signature, with axioms making r pick out pth roots of constants.
The theory DCF of differentially closed fields is the theory of differentially perfect fields with axioms saying that if f and g are differential polynomials and the separant of f is nonzero and g ≠ 0 and f has order greater than that of g, then there is some x in the field with f(x) = 0 and g(x) ≠ 0.
Addition
The theory of the natural numbers with a successor function has signature consisting of a constant 0 and a unary function S ("successor": S(x) is interpreted as x+1), and has axioms:
1. ∀x Sx ≠ 0
2. ∀x∀y (Sx = Sy → x = y)
3. Let P(x) be a first-order formula with a single free variable x. Then the following formula is an axiom: (P(0) ∧ ∀x (P(x) → P(Sx))) → ∀y P(y).
The last axiom (induction) can be replaced by the axioms
For each integer n > 0, the axiom ∀x SS…Sx ≠ x (with n copies of S)
∀x (x ≠ 0 → ∃y Sy = x)
The theory of the natural numbers with a successor function is complete and decidable, and is κ-categorical for uncountable κ but not for countable κ. Presburger arithmetic is the theory of the natural numbers under addition, with signature consisting of a constant 0, a unary function S, and a binary function +. It is complete and decidable (a sketch of deciding a sample Presburger sentence with an off-the-shelf solver follows the axiom list below). The axioms are
1. ∀x Sx ≠ 0
2. ∀x∀y (Sx = Sy → x = y)
3. ∀x x + 0 = x
4. ∀x∀y x + Sy = S(x + y)
5. Let P(x) be a first-order formula with a single free variable x. Then the following formula is an axiom: (P(0) ∧ ∀x (P(x) → P(Sx))) → ∀y P(y).
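Since Presburger arithmetic is decidable, sentences in its language can in principle be settled mechanically. The following sketch is not part of the article; it assumes the third-party z3-solver Python package, and the sample sentence and the encoding (relativizing quantifiers to the non-negative integers) are illustrative choices only.

```python
# Decide a sample Presburger sentence with an SMT solver (sketch).
from z3 import Int, ForAll, Exists, Or, And, Solver, sat

x, y = Int('x'), Int('y')

# "Every natural number is even or odd":
#   for all x >= 0 there is y >= 0 with x = y+y or x = y+y+1
sentence = ForAll([x], Or(x < 0,
                          Exists([y], And(y >= 0,
                                          Or(x == y + y, x == y + y + 1)))))

s = Solver()
s.add(sentence)
print(s.check() == sat)   # expected: True, the sentence holds over the naturals
```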
Arithmetic
Many of the first order theories described above can be extended to complete recursively enumerable consistent theories. This is no longer true for most of the following theories; they can usually encode both multiplication and addition of natural numbers, and this gives them enough power to encode themselves, which implies that Gödel's incompleteness theorem applies and the theories can no longer be both complete and recursively enumerable (unless they are inconsistent). The signature of a theory of arithmetic has: the constant 0; the unary successor function, here denoted by prefix S (written as a prefix or postfix symbol elsewhere); and two binary functions, denoted by infix + and ·, called "addition" and "multiplication." Some authors take the signature to contain a constant 1 instead of the function S, then define S in the obvious way as St = 1 + t.
Robinson arithmetic (also called Q). Axioms (1) and (2) govern the distinguished element 0. (3) assures that S is an injection. Axioms (4) and (5) are the standard recursive definition of addition; (6) and (7) do the same for multiplication. Robinson arithmetic can be thought of as Peano arithmetic without induction. Q is a weak theory for which Gödel's incompleteness theorem holds. Axioms:
1. ∀x Sx ≠ 0
2. ∀x (x ≠ 0 → ∃y Sy = x)
3. ∀x∀y (Sx = Sy → x = y)
4. ∀x x + 0 = x
5. ∀x∀y x + Sy = S(x + y)
6. ∀x x · 0 = 0
7. ∀x∀y x · Sy = (x · y) + x
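A small sketch (not from the article) of how axioms (4)-(7) act as recursive definitions: reading them as computation rules over the standard natural numbers gives the usual recursions for addition and multiplication.

```python
# Axioms (4)-(7) of Robinson arithmetic read as recursion equations (sketch).
def succ(x):            # S
    return x + 1

def add(x, y):          # (4) x + 0 = x,      (5) x + Sy = S(x + y)
    return x if y == 0 else succ(add(x, y - 1))

def mul(x, y):          # (6) x * 0 = 0,      (7) x * Sy = (x * y) + x
    return 0 if y == 0 else add(mul(x, y - 1), x)

assert add(3, 4) == 7 and mul(3, 4) == 12
```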
IΣn is first-order Peano arithmetic with induction restricted to Σn formulas (for n = 0, 1, 2, ...). The theory IΣ0 is often denoted by IΔ0. This is a series of more and more powerful fragments of Peano arithmetic. The case n = 1 has about the same strength as primitive recursive arithmetic (PRA). Exponential function arithmetic (EFA) is IΣ0 with an axiom stating that x^y exists for all x and y (with the usual properties).
First-order Peano arithmetic, PA. The "standard" theory of arithmetic. The axioms are the axioms of Robinson arithmetic above, together with the axiom scheme of induction: (φ(0) ∧ ∀x (φ(x) → φ(Sx))) → ∀x φ(x) for any formula φ in the language of PA. φ may contain free variables other than x. Kurt Gödel's 1931 paper proved that PA is incomplete, and has no consistent recursively enumerable completions.
Complete arithmetic (also known as true arithmetic) is the theory of the standard model of arithmetic, the natural numbers N. It is complete but does not have a recursively enumerable set of axioms.
Second-order arithmetic and its subsystems are defined in detail in the articles on second-order arithmetic and reverse mathematics.
Set theories
The usual signature of set theory has one binary relation ∈, no constants, and no functions. Some of the theories below are "class theories" which have two sorts of object, sets and classes. There are three common ways of handling this in first-order logic:
1. Use first-order logic with two types.
2. Use ordinary first-order logic, but add a new unary predicate "Set", where "Set(t)" means informally "t is a set".
3. Use ordinary first-order logic, and instead of adding a new predicate to the language, treat "Set(t)" as an abbreviation for "∃y t ∈ y".
Some first-order set theories include:
Weak theories lacking powersets: S' (Tarski, Mostowski, and Robinson, 1953) (finitely axiomatizable)
General set theory
Kripke–Platek set theory
Zermelo set theory
Ackermann set theory
Zermelo–Fraenkel set theory
Von Neumann–Bernays–Gödel set theory (finitely axiomatizable)
Morse–Kelley set theory
New Foundations (finitely axiomatizable)
Scott–Potter set theory
Positive set theory
Some extra first-order axioms that can be added to one of these (usually ZF) include:
the axiom of choice, the axiom of dependent choice
the generalized continuum hypothesis
Martin's axiom (usually together with the negation of the continuum hypothesis), Martin's maximum
analytic determinacy, projective determinacy, the axiom of determinacy
many large cardinal axioms
References
Ax, James; Kochen, Simon (1965), "Diophantine problems over local fields. II. A complete set of axioms for p-adic number theory.", Amer. J. Math. (The Johns Hopkins University Press) 87 (3): 631–648, doi:10.2307/2373066, JSTOR 2373066, MR 0184931
Chang, C.C.; Keisler, H. Jerome (1989), Model Theory (3 ed.), Elsevier, ISBN 0-7204-0692-7
Hodges, Wilfrid (1997), A shorter model theory, Cambridge University Press, ISBN 0-521-58713-1
Marker, David (2002), Model Theory: An Introduction, Graduate Texts in Mathematics, 217, Springer, ISBN 0-387-98760-6
Complete theory
In mathematical logic, a theory is complete if it is a maximal consistent set of sentences, i.e., if it is consistent, and none of its proper extensions is consistent. For theories in logics which contain classical propositional logic, this is equivalent to asking that for every sentence φ in the language of the theory it contains either φ itself or its negation ¬φ. Recursively axiomatizable first-order theories that are rich enough to allow general mathematical reasoning to be formulated cannot be complete, as demonstrated by Gödel's incompleteness theorem. This sense of complete is distinct from the notion of a complete logic, which asserts that for every theory that can be formulated in the logic, all semantically valid statements are provable theorems (for an appropriate sense of "semantically valid"). Gödel's completeness theorem is about this latter kind of completeness. Complete theories are closed under a number of conditions internally modelling the T-schema:
For a set S: A ∧ B ∈ S if and only if A ∈ S and B ∈ S.
For a set S: A ∨ B ∈ S if and only if A ∈ S or B ∈ S.
Maximal consistent sets are a fundamental tool in the model theory of classical logic and modal logic. Their existence in a given case is usually a straightforward consequence of Zorn's lemma, based on the idea that a contradiction involves use of only finitely many premises. In the case of modal logics, the collection of maximal consistent sets extending a theory T (closed under the necessitation rule) can be given the structure of a model of T, called the canonical model.
Examples
Some examples of complete theories are:
Presburger arithmetic
Tarski's axioms for Euclidean geometry
The theory of dense linear orders
The theory of algebraically closed fields of a given characteristic
The theory of real closed fields
Every uncountably categorical countable theory
Every countably categorical countable theory
References
Mendelson, Elliott (1997). Introduction to Mathematical Logic (4th ed.). Chapman & Hall. p. 86. ISBN 978-0-412-80830-2.
Gödel's completeness theorem
Background
There are numerous deductive systems for first-order logic, including systems of natural deduction and Hilbert-style systems. Common to all deductive systems is the notion of a formal deduction. This is a sequence (or, in some cases, a finite tree) of formulas with a specially-designated conclusion. The definition of a deduction is such that it is finite and that it is possible to verify algorithmically (by a computer, for example, or by hand) that a given collection of formulas is indeed a deduction. A formula is logically valid if it is true in every structure for the language of the formula. To formally state, and then prove, the completeness theorem, it is necessary to also define a deductive system. A deductive system is called complete if every logically valid formula is the conclusion of some formal deduction, and the completeness theorem for a particular deductive system is the theorem that it is complete in this sense. Thus, in a sense, there is a different completeness theorem for each deductive system.
Gödel's completeness theorem establishes a fundamental connection between these two branches, giving a link between semantics and syntax. The completeness theorem should not, however, be misinterpreted as obliterating the difference between these two concepts; in fact Gödel's incompleteness theorem, another celebrated result, shows that there are inherent limitations in what can be achieved with formal proofs in mathematics. The name for the incompleteness theorem refers to another meaning of complete (see model theory - Using the compactness and completeness theorems). In particular, Gödel's completeness theorem deals with formulas that are logical consequences of a first-order theory, while the incompleteness theorem constructs formulas that are not logical consequences of certain theories. An important consequence of the completeness theorem is that the set of logical consequences of an effective first-order theory is an enumerable set. The definition of logical consequence quantifies over all structures in a particular language, and thus does not give an immediate method for algorithmically testing whether a formula is logically valid. Moreover, a consequence of Gödel's incompleteness theorem is that the set of logically valid formulas is not decidable. But the completeness theorem implies that the set of consequences of an effective theory is enumerable; the algorithm is to first construct an enumeration of all formal deductions from the theory, and use this to produce an enumeration of their conclusions. The finitary, syntactic nature of formal deductions makes it possible to enumerate them.
Proofs
Gödel's original proof of the theorem proceeded by reducing the problem to a special case for formulas in a certain syntactic form, and then handling this form with an ad hoc argument. In modern logic texts, Gödel's completeness theorem is usually proved with Henkin's proof, rather than with Gödel's original proof. Henkin's proof directly constructs a term model for any consistent first-order theory. James Margetson (2004) developed a computerized formal proof using the Isabelle theorem prover. [1] Other proofs are also known.
Further reading
Gödel, K (1929). Über die Vollständigkeit des Logikkalküls. Doctoral dissertation. University of Vienna. The first proof of the completeness theorem.
Gödel, K (1930). "Die Vollständigkeit der Axiome des logischen Funktionenkalküls" (in German). Monatshefte für Mathematik 37 (1): 349–360. doi:10.1007/BF01696781. JFM 56.0046.04. The same material as the dissertation, except with briefer proofs, more succinct explanations, and omitting the lengthy introduction.
External links
Stanford Encyclopedia of Philosophy: "Kurt Gödel [2]" -- by Juliette Kennedy. MacTutor biography: Kurt Gödel. [3] Detlovs, Vilnis, and Podnieks, Karlis, "Introduction to mathematical logic. [4]"
References
[1] http://afp.sourceforge.net/entries/Completeness-paper.pdf
[2] http://plato.stanford.edu/entries/goedel/
[3] http://www-groups.dcs.st-and.ac.uk/~history/Mathematicians/Godel.html
[4] http://www.ltn.lv/~podnieks/
Gödel's incompleteness theorems
Background
Because statements of a formal theory are written in symbolic form, it is possible to mechanically verify that a formal proof from a finite set of axioms is valid. This task, known as automatic proof verification, is closely related to automated theorem proving. The difference is that instead of constructing a new proof, the proof verifier simply checks that a provided formal proof (or, in some cases, instructions that can be followed to create a formal proof) is correct. This process is not merely hypothetical; systems such as Isabelle are used today to formalize proofs and then check their validity. Many theories of interest include an infinite set of axioms, however. To verify a formal proof when the set of axioms is infinite, it must be possible to determine whether a statement that is claimed to be an axiom is actually an axiom. This issue arises in first order theories of arithmetic, such as Peano arithmetic, because the principle of mathematical induction is expressed as an infinite set of axioms (an axiom schema). A formal theory is said to be effectively generated if its set of axioms is a recursively enumerable set. This means that there is a computer program that, in principle, could enumerate all the axioms of the theory without listing any statements that are not axioms. This is equivalent to the existence of a program that enumerates all the theorems of the theory without enumerating any statements that are not theorems. Examples of effectively generated theories with infinite sets of axioms include Peano arithmetic and Zermelo–Fraenkel set theory. In choosing a set of axioms, one goal is to be able to prove as many correct results as possible, without proving any incorrect results. A set of axioms is complete if, for any statement in the axioms' language, either that statement or its negation is provable from the axioms. A set of axioms is (simply) consistent if there is no statement such that both the statement and its negation are provable from the axioms. In the standard system of first-order logic, an inconsistent set of axioms will prove every statement in its language (this is sometimes called the principle of explosion), and is thus automatically complete. A set of axioms that is both complete and consistent, however, proves a maximal set of non-contradictory theorems. Gödel's incompleteness theorems show that in certain cases it is not possible to obtain an effectively generated, complete, consistent theory.
The existence of an incomplete formal system is, in itself, not particularly surprising. A system may be incomplete simply because not all the necessary axioms have been discovered. For example, Euclidean geometry without the parallel postulate is incomplete; it is not possible to prove or disprove the parallel postulate from the remaining axioms. Gödel's theorem shows that, in theories that include a small portion of number theory, a complete and consistent finite list of axioms can never be created, nor even an infinite list that can be enumerated by a computer program. Each time a new statement is added as an axiom, there are other true statements that still cannot be proved, even with the new axiom. If an axiom is ever added that makes the system complete, it does so at the cost of making the system inconsistent. There are complete and consistent lists of axioms for arithmetic that cannot be enumerated by a computer program. For example, one might take all true statements about the natural numbers to be axioms (and no false statements), which gives the theory known as "true arithmetic". The difficulty is that there is no mechanical way to decide, given a statement about the natural numbers, whether it is an axiom of this theory, and thus there is no effective way to verify a formal proof in this theory. Many logicians believe that Gödel's incompleteness theorems struck a fatal blow to David Hilbert's second problem, which asked for a finitary consistency proof for mathematics. The second incompleteness theorem, in particular, is often viewed as making the problem impossible. Not all mathematicians agree with this analysis, however, and the status of Hilbert's second problem is not yet decided (see "Modern viewpoints on the status of the problem").
Original statements
The first incompleteness theorem first appeared as "Theorem VI" in Gödel's 1931 paper On Formally Undecidable Propositions in Principia Mathematica and Related Systems I. The second incompleteness theorem appeared as "Theorem XI" in the same paper.
A theory is ω-inconsistent if, for some property P, it proves ¬P(n) for each specific natural number n and yet also proves that there exists a natural number n such that P(n). That is, the theory says that a number with property P exists while denying that it has any specific value. The ω-consistency of a theory implies its consistency, but consistency does not imply ω-consistency. J. Barkley Rosser (1936) strengthened the incompleteness theorem by finding a variation of the proof (Rosser's trick) that only requires the theory to be consistent, rather than ω-consistent. This is mostly of technical interest, since all true formal theories of arithmetic (theories whose axioms are all true statements about natural numbers) are ω-consistent, and thus Gödel's theorem as originally stated applies to them. The stronger version of the incompleteness theorem that only assumes consistency, rather than ω-consistency, is now commonly known as Gödel's incompleteness theorem and as the Gödel–Rosser theorem.
The combined work of Gödel and Paul Cohen has given two concrete examples of undecidable statements (in the first sense of the term): The continuum hypothesis can neither be proved nor refuted in ZFC (the standard axiomatization of set theory), and the axiom of choice can neither be proved nor refuted in ZF (which is all the ZFC axioms except the axiom of choice). These results do not require the incompleteness theorem. Gödel proved in 1940 that neither of these statements could be disproved in ZF or ZFC set theory. In the 1960s, Cohen proved that neither is provable from ZF, and the continuum hypothesis cannot be proven from ZFC. In 1973, the Whitehead problem in group theory was shown to be undecidable, in the first sense of the term, in standard set theory. In 1977, Paris and Harrington proved that the Paris–Harrington principle, a version of the Ramsey theorem, is undecidable in the first-order axiomatization of arithmetic called Peano arithmetic, but can be proven in the larger system of second-order arithmetic. Kirby and Paris later showed Goodstein's theorem, a statement about sequences of natural numbers somewhat simpler than the Paris–Harrington principle, to be undecidable in Peano arithmetic. Kruskal's tree theorem, which has applications in computer science, is also undecidable from Peano arithmetic but provable in set theory. In fact Kruskal's tree theorem (or its finite form) is undecidable in a much stronger system codifying the principles acceptable on the basis of a philosophy of mathematics called predicativism. The related but more general graph minor theorem (2003) has consequences for computational complexity theory. Gregory Chaitin produced undecidable statements in algorithmic information theory and proved another incompleteness theorem in that setting. Chaitin's theorem states that for any theory that can represent enough arithmetic, there is an upper bound c such that no specific number can be proven in that theory to have Kolmogorov complexity greater than c. While Gödel's theorem is related to the liar paradox, Chaitin's result is related to Berry's paradox.
Arithmetization of syntax
The main problem in fleshing out the proof described above is that it seems at first that to construct a statement p that is equivalent to "p cannot be proved", p would have to somehow contain a reference to p, which could easily give rise to an infinite regress. Gödel's ingenious trick is to show that statements can be matched with numbers (often called the arithmetization of syntax) in such a way that "proving a statement" can be replaced with "testing whether a number has a given property". This allows a self-referential formula to be constructed in a way that avoids any infinite regress of definitions. The same trick was later used by Alan Turing in his work on the Entscheidungsproblem. In simple terms, a method can be devised so that every formula or statement that can be formulated in the system gets a unique number, called its Gödel number, in such a way that it is possible to mechanically convert back and forth between formulas and Gödel numbers. The numbers involved might be very long indeed (in terms of number of digits), but this is not a barrier; all that matters is that such numbers can be constructed. A simple example is the way in which English is stored as a sequence of numbers in computers using ASCII or Unicode: The word HELLO is represented by 72-69-76-76-79 using decimal ASCII, i.e. the number 7269767679. The logical statement x=y => y=x is represented by 120-061-121-032-061-062-032-121-061-120 using 3-digit decimal ASCII, i.e. the number 120061121032061062032121061120. In principle, proving a statement true or false can be shown to be equivalent to proving that the number matching the statement does or doesn't have a given property. Because the formal system is strong enough to support reasoning about numbers in general, it can support reasoning about numbers which represent formulae and statements as well. Crucially, because the system can support reasoning about properties of numbers, the results are equivalent to reasoning about provability of their equivalent statements.
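The following sketch (not part of the article) makes the ASCII-based numbering above executable: it converts a formula to a single number and back, showing that the coding is mechanical and invertible. The fixed-width padding convention used here is only one of many possible Gödel numberings.

```python
# A toy Goedel numbering via ASCII: each character becomes a 3-digit decimal
# code, and the codes are concatenated into one number.  The fixed width keeps
# the encoding uniquely decodable.

def godel_number(formula: str) -> int:
    return int("".join(f"{ord(c):03d}" for c in formula))

def decode(n: int) -> str:
    digits = str(n)
    digits = digits.zfill(len(digits) + (-len(digits) % 3))  # restore leading zeros
    return "".join(chr(int(digits[i:i + 3])) for i in range(0, len(digits), 3))

g = godel_number("x=y => y=x")
print(g)             # 120061121032061062032121061120, the number quoted in the text
print(decode(g))     # "x=y => y=x" again
```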
An important feature of the formula Bew(y) is that if a statement p is provable in the system then Bew(G(p)) is also provable. This is because any proof of p would have a corresponding Gödel number, the existence of which causes Bew(G(p)) to be satisfied.
Diagonalization
The next step in the proof is to obtain a statement that says it is unprovable. Although Gödel constructed this statement directly, the existence of at least one such statement follows from the diagonal lemma, which says that for any sufficiently strong formal system and any statement form F there is a statement p such that the system proves p ↔ F(G(p)). By letting F be the negation of Bew(x), p is obtained: p roughly states that its own Gödel number is the Gödel number of an unprovable formula. The statement p is not literally equal to ~Bew(G(p)); rather, p states that if a certain calculation is performed, the resulting Gödel number will be that of an unprovable statement. But when this calculation is performed, the resulting Gödel number turns out to be the Gödel number of p itself. This is similar to the following sentence in English: ", when preceded by itself in quotes, is unprovable.", when preceded by itself in quotes, is unprovable. This sentence does not directly refer to itself, but when the stated transformation is made the original sentence is obtained as a result, and thus this sentence asserts its own unprovability. The proof of the diagonal lemma employs a similar method.
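A small sketch (not from the article) of the same "preceded by itself in quotes" construction: a string fragment that never mentions itself, yet whose transformed version reproduces and describes itself. This is the mechanical core of diagonalization; the property "is unprovable" here is just an inert label.

```python
# Build the self-describing sentence from the text.
fragment = ', when preceded by itself in quotes, is unprovable.'

def preceded_by_itself_in_quotes(s: str) -> str:
    return '"' + s + '"' + s

sentence = preceded_by_itself_in_quotes(fragment)

# The sentence is about the object obtained by taking its own quoted part and
# preceding it by itself in quotes.  Recover that quoted part ...
quoted_part = sentence[1:sentence.index('"', 1)]
# ... apply the transformation the sentence describes, and compare:
assert preceded_by_itself_in_quotes(quoted_part) == sentence
print(sentence)
```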
Proof of independence
Now assume that the formal system is ω-consistent. Let p be the statement obtained in the previous section. If p were provable, then Bew(G(p)) would be provable, as argued above. But p asserts the negation of Bew(G(p)). Thus the system would be inconsistent, proving both a statement and its negation. This contradiction shows that p cannot be provable. If the negation of p were provable, then Bew(G(p)) would be provable (because p was constructed to be equivalent to the negation of Bew(G(p))). However, for each specific number x, x cannot be the Gödel number of the proof of p, because p is not provable (from the previous paragraph). Thus on one hand the system supports construction of a number with a certain property (that it is the Gödel number of the proof of p), but on the other hand, for every specific number x, it can be proved that the number does not have this property. This is impossible in an ω-consistent system. Thus the negation of p is not provable. Thus the statement p is undecidable: it can neither be proved nor disproved within the chosen system. So the chosen system is either inconsistent or incomplete. This logic can be applied to any formal system meeting the criteria. The conclusion is that all formal systems meeting the criteria are either inconsistent or incomplete. It should be noted that p is not provable (and thus true) in every consistent system. The assumption of ω-consistency is only required for the negation of p to be not provable. So: In an ω-consistent formal system, neither p nor its negation can be proved, and so p is undecidable. In a consistent formal system either the same situation occurs, or the negation of p can be proved; in the latter case, a statement ("not p") is false but provable. Note that if one tries to fix this by "adding the missing axioms" to avoid the undecidability of the system, then one has to add either p or "not p" as axioms. But this then creates a new formal system2 (old system + p), to which exactly the same process can be applied, creating a new statement form Bew2(x) for this new system. When the diagonal lemma is applied to this new form Bew2, a new statement p2 is obtained; this statement will be different from the previous one, and this new statement will be undecidable in the new system if it is ω-consistent, thus showing that system2 is equally incomplete. So adding extra axioms cannot fix the problem.
Formalized proofs
Formalized proofs of versions of the incompleteness theorem have been developed by Natarajan Shankar in 1986 using Nqthm (Shankar 1994) and by R.O'Connor in 2003 using Coq (O'Connor 2005).
Paraconsistent logic
Although Gödel's theorems are usually studied in the context of classical logic, they also have a role in the study of paraconsistent logic and of inherently contradictory statements (dialetheia). Graham Priest (1984, 2006) argues that replacing the notion of formal proof in Gödel's theorem with the usual notion of informal proof can be used to show that naive mathematics is inconsistent, and uses this as evidence for dialetheism. The cause of this inconsistency is the inclusion of a truth predicate for a theory within the language of the theory (Priest 2006:47). Stewart Shapiro (2002) gives a more mixed appraisal of the applications of Gödel's theorems to dialetheism. Carl Hewitt (2008) has proposed that (inconsistent) paraconsistent logics that prove their own Gödel sentences may have applications in software engineering.
History
After Gödel published his proof of the completeness theorem as his doctoral thesis in 1929, he turned to a second problem for his habilitation. His original goal was to obtain a positive solution to Hilbert's second problem (Dawson 1997, p. 63). At the time, theories of the natural numbers and real numbers similar to second-order arithmetic were known as "analysis", while theories of the natural numbers alone were known as "arithmetic". Gödel was not the only person working on the consistency problem. Ackermann had published a flawed consistency proof for analysis in 1925, in which he attempted to use the method of ε-substitution originally developed by Hilbert. Later that year, von Neumann was able to correct the proof for a theory of arithmetic without any axioms of induction. By 1928, Ackermann had communicated a modified proof to Bernays; this modified proof led Hilbert to announce his belief in 1929 that the consistency of arithmetic had been demonstrated and that a consistency proof of analysis would likely soon follow. After the publication of the incompleteness theorems showed that Ackermann's modified proof must be erroneous, von Neumann produced a concrete example showing that its main technique was unsound (Zach 2006, p. 418, Zach 2003, p. 33). In the course of his research, Gödel discovered that although a sentence which asserts its own falsehood leads to paradox, a sentence that asserts its own non-provability does not. In particular, Gödel was aware of the result now called Tarski's indefinability theorem, although he never published it. Gödel announced his first incompleteness theorem to Carnap, Feigl and Waismann on August 26, 1930; all four would attend a key conference in Königsberg the following week.
Announcement
The 1930 Königsberg conference was a joint meeting of three academic societies, with many of the key logicians of the time in attendance. Carnap, Heyting, and von Neumann delivered one-hour addresses on the mathematical philosophies of logicism, intuitionism, and formalism, respectively (Dawson 1996, p. 69). The conference also included Hilbert's retirement address, as he was leaving his position at the University of Göttingen. Hilbert used the speech to argue his belief that all mathematical problems can be solved. He ended his address by saying, "For the mathematician there is no Ignorabimus, and, in my opinion, not at all for natural science either. ... The true reason why [no one] has succeeded in finding an unsolvable problem is, in my opinion, that there is no unsolvable problem. In contrast to the foolish Ignorabimus, our credo avers: We must know. We shall know!" This speech quickly became known as a summary of Hilbert's beliefs on mathematics (its final six words, "Wir müssen wissen. Wir werden wissen!", were used as Hilbert's epitaph in 1943). Although Gödel was likely in attendance for Hilbert's address, the two never met face to face (Dawson 1996, p. 72). Gödel announced his first incompleteness theorem at a roundtable discussion session on the third day of the conference. The announcement drew little attention apart from that of von Neumann, who pulled Gödel aside for conversation. Later that year, working independently with knowledge of the first incompleteness theorem, von Neumann obtained a proof of the second incompleteness theorem, which he announced to Gödel in a letter dated November 20, 1930 (Dawson 1996, p. 70). Gödel had independently obtained the second incompleteness theorem and included it in his submitted manuscript, which was received by Monatshefte für Mathematik on November 17, 1930. Gödel's paper was published in the Monatshefte in 1931 under the title Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I (On Formally Undecidable Propositions in Principia Mathematica and Related Systems I). As the title implies, Gödel originally planned to publish a second part of the paper; it was never written.
Criticism
In September 1931, Ernst Zermelo wrote Gödel to announce what he described as an "essential gap" in Gödel's argument (Dawson:76). In October, Gödel replied with a 10-page letter (Dawson:76, Grattan-Guinness:512-513). But Zermelo did not relent and published his criticisms in print with a rather scathing paragraph on his young competitor (Grattan-Guinness:513). Gödel decided that to pursue the matter further was pointless, and Carnap agreed (Dawson:77). Much of Zermelo's subsequent work was related to logics stronger than first-order logic, with which he hoped to show both the consistency and categoricity of mathematical theories. Paul Finsler (1926) used a version of Richard's paradox to construct an expression that was false but unprovable in a particular, informal framework he had developed. Gödel was unaware of this paper when he proved the incompleteness theorems (Collected Works Vol. IV., p. 9). Finsler wrote Gödel in 1931 to inform him about this paper, which Finsler felt had priority for an incompleteness theorem. Finsler's methods did not rely on formalized provability, and had only a superficial resemblance to Gödel's work (van Heijenoort 1967:328). Gödel read the paper but found it deeply flawed, and his response to Finsler laid out concerns about the lack of formalization (Dawson:89). Finsler continued to argue for his philosophy of mathematics, which eschewed formalization, for the remainder of his career.
Wittgenstein and Gödel
Ludwig Wittgenstein wrote several passages about the incompleteness theorems that were published posthumously in his 1953 Remarks on the Foundations of Mathematics. Gödel was a member of the Vienna Circle during the period in which Wittgenstein's early ideal language philosophy and Tractatus Logico-Philosophicus dominated the circle's thinking; writings of Gödel in his Nachlass express the belief that Wittgenstein willfully misread Gödel's theorems. Multiple commentators have read Wittgenstein as misunderstanding Gödel (Rodych 2003), although Juliet Floyd and Hilary Putnam (2000) have suggested that the majority of commentary misunderstands Wittgenstein. On their release, Bernays, Dummett, and Kreisel wrote separate reviews on Wittgenstein's remarks, all of which were extremely negative (Berto 2009:208). The unanimity of this criticism caused Wittgenstein's remarks on the incompleteness theorems to have little impact on the logic community. In 1972, Gödel wrote to Karl Menger that Wittgenstein's comments demonstrate a fundamental misunderstanding of the incompleteness theorems. "It is clear from the passages you cite that Wittgenstein did "not" understand [the first incompleteness theorem] (or pretended not to understand it). He interpreted it as a kind of logical paradox, while in fact is just the opposite, namely a mathematical theorem within an absolutely uncontroversial part of mathematics
(finitary number theory or combinatorics)." (Wang 1996:197) Since the publication of Wittgenstein's Nachlass in 2000, a series of papers in philosophy have sought to evaluate whether the original criticism of Wittgenstein's remarks was justified. Floyd and Putnam (2000) argue that Wittgenstein had a more complete understanding of the incompleteness theorem than was previously assumed. They are particularly concerned with the interpretation of a Gödel sentence for an ω-inconsistent theory as actually saying "I am not provable", since the theory has no models in which the provability predicate corresponds to actual provability. Rodych (2003) argues that their interpretation of Wittgenstein is not historically justified, while Bays (2004) argues against Floyd and Putnam's philosophical analysis of the provability predicate. Berto (2009) explores the relationship between Wittgenstein's writing and theories of paraconsistent logic.
Notes
[1] The word "true" is used disquotationally here: the Gödel sentence is true in this sense because it "asserts its own unprovability and it is indeed unprovable" (Smoryński 1977 p. 825; also see Franzén 2005 pp. 28–33). It is also possible to read "GT is true" in the formal sense that primitive recursive arithmetic proves the implication Con(T)→GT, where Con(T) is a canonical sentence asserting the consistency of T (Smoryński 1977 p. 840, Kikuchi and Tanaka 1994 p. 403)
References
Articles by Gdel
1931, Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme, I. Monatshefte für Mathematik und Physik 38: 173–98.
1931, Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme, I. and On formally undecidable propositions of Principia Mathematica and related systems I in Solomon Feferman, ed., 1986. Kurt Gödel Collected works, Vol. I. Oxford University Press: 144–195. The original German with a facing English translation, preceded by a very illuminating introductory note by Kleene.
Hirzel, Martin, 2000, On formally undecidable propositions of Principia Mathematica and related systems I. (http://www.research.ibm.com/people/h/hirzel/papers/canon00-goedel.pdf). A modern translation by Hirzel.
1951, Some basic theorems on the foundations of mathematics and their implications in Solomon Feferman, ed., 1995. Kurt Gödel Collected works, Vol. III. Oxford University Press: 304–23.
B. Meltzer (translation) and R. B. Braithwaite (Introduction), 1962. On Formally Undecidable Propositions of Principia Mathematica and Related Systems, Dover Publications, New York (Dover edition 1992), ISBN 0-486-66980-7 (pbk.) This contains a useful translation of Gödel's German abbreviations on pp. 33–34. As noted above, typography, translation and commentary is suspect. Unfortunately, this translation was reprinted with all its suspect content by Stephen Hawking editor, 2005. God Created the Integers: The Mathematical Breakthroughs That Changed History, Running Press, Philadelphia, ISBN 0-7624-1922-9. Gödel's paper appears starting on p. 1097, with Hawking's commentary starting on p. 1089.
Martin Davis editor, 1965. The Undecidable: Basic Papers on Undecidable Propositions, Unsolvable problems and Computable Functions, Raven Press, New York, no ISBN. Gödel's paper begins on page 5, preceded by one page of commentary.
Jean van Heijenoort editor, 1967, 3rd edition 1967. From Frege to Gödel: A Source Book in Mathematical Logic, 1879–1931, Harvard University Press, Cambridge Mass., ISBN 0-674-32449-8 (pbk). van Heijenoort did the translation. He states that Professor Gödel approved the translation, which in many places was accommodated to his wishes (p. 595). Gödel's paper begins on p. 595; van Heijenoort's commentary begins on p. 592.
Martin Davis editor, 1965, ibid. "On Undecidable Propositions of Formal Mathematical Systems." A copy with Gödel's corrections of errata and Gödel's added notes begins on page 41, preceded by two pages of Davis's commentary. Until Davis included this in his volume this lecture existed only as mimeographed notes.
Articles by others
George Boolos, 1989, "A New Proof of the Gödel Incompleteness Theorem", Notices of the American Mathematical Society v. 36, pp. 388–390 and p. 676, reprinted in Boolos, 1998, Logic, Logic, and Logic, Harvard Univ. Press. ISBN 0-674-53766-1
Arthur Charlesworth, 1980, "A Proof of Godel's Theorem in Terms of Computer Programs," Mathematics Magazine, v. 54 n. 3, pp. 109–121. JStor (http://links.jstor.org/sici?sici=0025-570X(198105)54:3<109:APOGTI>2.0.CO;2-1&size=LARGE&origin=JSTOR-enlargePage)
Martin Davis, "The Incompleteness Theorem (http://www.ams.org/notices/200604/fea-davis.pdf)", in Notices of the AMS vol. 53 no. 4 (April 2006), p. 414.
Jean van Heijenoort, 1963. "Gödel's Theorem" in Edwards, Paul, ed., Encyclopedia of Philosophy, Vol. 3. Macmillan: 348–57.
Geoffrey Hellman, How to Gödel a Frege-Russell: Gödel's Incompleteness Theorems and Logicism. Noûs, Vol. 15, No. 4, Special Issue on Philosophy of Mathematics. (Nov., 1981), pp. 451–468.
David Hilbert, 1900, "Mathematical Problems. (http://aleph0.clarku.edu/~djoyce/hilbert/problems.html#prob2)" English translation of a lecture delivered before the International Congress of Mathematicians at Paris, containing Hilbert's statement of his Second Problem.
Kikuchi, Makoto; Tanaka, Kazuyuki (1994), "On formalization of model-theoretic proofs of Gödel's theorems", Notre Dame Journal of Formal Logic 35 (3): 403–412, doi:10.1305/ndjfl/1040511346, ISSN 0029-4527, MR 1326122
Stephen Cole Kleene, 1943, "Recursive predicates and quantifiers," reprinted from Transactions of the American Mathematical Society, v. 53 n. 1, pp. 41–73 in Martin Davis 1965, The Undecidable (loc. cit.) pp. 255–287.
John Barkley Rosser, 1936, "Extensions of some theorems of Gödel and Church," reprinted from the Journal of Symbolic Logic vol. 1 (1936) pp. 87–91, in Martin Davis 1965, The Undecidable (loc. cit.) pp. 230–235.
John Barkley Rosser, 1939, "An Informal Exposition of proofs of Gödel's Theorem and Church's Theorem", reprinted from the Journal of Symbolic Logic, vol. 4 (1939) pp. 53–60, in Martin Davis 1965, The Undecidable (loc. cit.) pp. 223–230
C. Smoryński, "The incompleteness theorems", in J. Barwise, ed., Handbook of Mathematical Logic, North-Holland 1982 ISBN 978-0444863881, pp. 821–866.
Dan E. Willard (2001), "Self-Verifying Axiom Systems, the Incompleteness Theorem and Related Reflection Principles (http://projecteuclid.org/DPubS?service=UI&version=1.0&verb=Display&handle=euclid.jsl/1183746459)", Journal of Symbolic Logic, v. 66 n. 2, pp. 536–596. doi:10.2307/2695030
Zach, Richard (2003), "The Practice of Finitism: Epsilon Calculus and Consistency Proofs in Hilbert's Program" (http://www.ucalgary.ca/~rzach/static/conprf.pdf), Synthese (Berlin, New York: Springer-Verlag) 137 (1): 211–259, doi:10.1023/A:1026247421383, ISSN 0039-7857
Richard Zach, 2005, "Paper on the incompleteness theorems" in Grattan-Guinness, I., ed., Landmark Writings in Western Mathematics. Elsevier: 917–25.
Miscellaneous references
Francesco Berto. "The Gödel Paradox and Wittgenstein's Reasons" Philosophia Mathematica (III) 17. 2009.
John W. Dawson, Jr., 1997. Logical Dilemmas: The Life and Work of Kurt Gödel, A.K. Peters, Wellesley Mass, ISBN 1-56881-256-6.
Goldstein, Rebecca, 2005, Incompleteness: the Proof and Paradox of Kurt Gödel, W. W. Norton & Company. ISBN 0-393-05169-2
Juliet Floyd and Hilary Putnam, 2000, "A Note on Wittgenstein's 'Notorious Paragraph' About the Gödel Theorem", Journal of Philosophy v. 97 n. 11, pp. 624–632.
Carl Hewitt, 2008, "Large-scale Organizational Computing requires Unstratified Reflection and Strong Paraconsistency", Coordination, Organizations, Institutions, and Norms in Agent Systems III, Springer-Verlag.
David Hilbert and Paul Bernays, Grundlagen der Mathematik, Springer-Verlag.
John Hopcroft and Jeffrey Ullman 1979, Introduction to Automata theory, Addison-Wesley, ISBN 0-201-02988-X
Stephen Cole Kleene, 1967, Mathematical Logic. Reprinted by Dover, 2002. ISBN 0-486-42533-9
Graham Priest, 2006, In Contradiction: A Study of the Transconsistent, Oxford University Press, ISBN 0-199-26329-9
Graham Priest, 1984, "Logic of Paradox Revisited", Journal of Philosophical Logic, v. 13, n. 2, pp. 153–179
Hilary Putnam, 1960, Minds and Machines in Sidney Hook, ed., Dimensions of Mind: A Symposium. New York University Press. Reprinted in Anderson, A. R., ed., 1964. Minds and Machines. Prentice-Hall: 77.
Russell O'Connor, 2005, "Essential Incompleteness of Arithmetic Verified by Coq (http://arxiv.org/abs/cs/0505034)", Lecture Notes in Computer Science v. 3603, pp. 245–260.
Victor Rodych, 2003, "Misunderstanding Gödel: New Arguments about Wittgenstein and New Remarks by Wittgenstein", Dialectica v. 57 n. 3, pp. 279–313. doi:10.1111/j.1746-8361.2003.tb00272.x
Stewart Shapiro, 2002, "Incompleteness and Inconsistency", Mind, v. 111, pp. 817–32. doi:10.1093/mind/111.444.817
Alan Sokal and Jean Bricmont, 1999, Fashionable Nonsense: Postmodern Intellectuals' Abuse of Science, Picador. ISBN 0-31-220407-8
Joseph R. Shoenfield (1967), Mathematical Logic. Reprinted by A.K. Peters for the Association of Symbolic Logic, 2001. ISBN 978-156881135-2
Jeremy Stangroom and Ophelia Benson, Why Truth Matters, Continuum. ISBN 0-82-649528-1
George Tourlakis, Lectures in Logic and Set Theory, Volume 1, Mathematical Logic, Cambridge University Press, 2003. ISBN 978-0-52175373-9
Wigderson, Avi (2010), "The Gödel Phenomena in Mathematics: A Modern View" (http://www.math.ias.edu/~avi/BOOKS/Godel_Widgerson_Text.pdf), Kurt Gödel and the Foundations of Mathematics: Horizons of Truth, Cambridge University Press
Hao Wang, 1996, A Logical Journey: From Gödel to Philosophy, The MIT Press, Cambridge MA, ISBN 0-262-23189-1.
Richard Zach, 2006, "Hilbert's program then and now" (http://www.ucalgary.ca/~rzach/static/hptn.pdf), in Philosophy of Logic, Dale Jacquette (ed.), Handbook of the Philosophy of Science, v. 5., Elsevier, pp. 411–447.
External links
Godel's Incompleteness Theorems (http://www.bbc.co.uk/programmes/b00dshx3) on In Our Time at the BBC. (listen now (http://www.bbc.co.uk/iplayer/console/b00dshx3/In_Our_Time_Godel's_Incompleteness_Theorems))
Stanford Encyclopedia of Philosophy: "Kurt Gödel (http://plato.stanford.edu/entries/goedel/)" -- by Juliette Kennedy.
MacTutor biographies: Kurt Gödel. (http://www-groups.dcs.st-and.ac.uk/~history/Mathematicians/Godel.html) Gerhard Gentzen. (http://www-groups.dcs.st-and.ac.uk/~history/Mathematicians/Gentzen.html)
What is Mathematics: Gödel's Theorem and Around (http://www.ltn.lv/~podnieks/index.html) by Karlis Podnieks. An online free book.
World's shortest explanation of Gödel's theorem (http://blog.plover.com/math/Gdl-Smullyan.html) using a printing machine as an example.
October 2011 RadioLab episode (http://www.radiolab.org/2011/oct/04/break-cycle/) about/including Gödel's Incompleteness theorem
Recursively enumerable set
Formal definition
A set S of natural numbers is called recursively enumerable if there is a partial recursive function (synonymously, a partial computable function) whose domain is exactly S, meaning that the function is defined if and only if its input is a member of S. The definition can be extended to an arbitrary countable set A by using Gödel numbers to represent elements of the set and declaring a subset of A to be recursively enumerable if the set of corresponding Gödel numbers is recursively enumerable.
Equivalent formulations
The following are all equivalent properties of a set S of natural numbers:
Semidecidability: The set S is recursively enumerable. That is, S is the domain (co-range) of a partial recursive function. There is a partial recursive function f such that f(x) = 0 if x is in S, and f(x) is undefined if x is not in S.
Enumerability: The set S is the range of a partial recursive function. The set S is the range of a total recursive function or empty. If S is infinite, the function can be chosen to be injective. The set S is the range of a primitive recursive function or empty. Even if S is infinite, repetition of values may be necessary in this case.
Diophantine: There is a polynomial p with integer coefficients and variables x, a, b, c, d, e, f, g, h, i ranging over the natural numbers such that x is in S if and only if there exist a, b, c, d, e, f, g, h, i with p(x, a, b, c, d, e, f, g, h, i) = 0.
There is a polynomial from the integers to the integers such that the set S contains exactly the non-negative numbers in its range. The equivalence of semidecidability and enumerability can be obtained by the technique of dovetailing.
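A small sketch (not part of the article) of dovetailing: to enumerate the domain of a partial computable function, run it on inputs 0..k for a bounded number of steps, for k = 0, 1, 2, ...; every input on which it halts eventually appears. The step-bounded simulation via Python generators and the particular partial function below are illustrative choices.

```python
# Dovetailing over inputs and step bounds to enumerate a function's domain.
def run_bounded(func, x, steps):
    """Run the step-counting generator func(x) for at most `steps` steps.
    Return True if it halted within the bound."""
    gen = func(x)
    for _ in range(steps):
        try:
            next(gen)
        except StopIteration:
            return True
    return False

def f(x):
    """A partial function written as a generator: halts exactly when x is even,
    so its domain is the set of even numbers."""
    if x % 2 != 0:
        while True:          # diverge on odd inputs
            yield None
    for _ in range(x):
        yield None

def enumerate_domain(func, rounds=50):
    """Dovetail: emit each x for which func(x) halts within some bound."""
    seen = set()
    for k in range(rounds):
        for x in range(k + 1):
            if x not in seen and run_bounded(func, x, k + 1):
                seen.add(x)
                yield x

print(sorted(enumerate_domain(f)))   # the even numbers discovered so far
```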
The Diophantine characterizations of a recursively enumerable set, while not as straightforward or intuitive as the first definitions, were found by Yuri Matiyasevich as part of the negative solution to Hilbert's Tenth Problem. Diophantine sets predate recursion theory and are therefore historically the first way to describe these sets (although this equivalence was only remarked more than three decades after the introduction of recursively enumerable sets). The number of bound variables in the above definition of the Diophantine set is the best known so far; it might be that a lower number can be used to define all diophantine sets.
Examples
Every recursive set is recursively enumerable, but it is not true that every recursively enumerable set is recursive.
A recursively enumerable language is a recursively enumerable subset of a formal language.
The set of all provable sentences in an effectively presented axiomatic system is a recursively enumerable set.
Matiyasevich's theorem states that every recursively enumerable set is a Diophantine set (the converse is trivially true).
The simple sets are recursively enumerable but not recursive.
The creative sets are recursively enumerable but not recursive.
Any productive set is not recursively enumerable.
Given a Gödel numbering φ of the computable functions, the set {⟨i, x⟩ : φi(x) is defined} (where ⟨i, x⟩ is the Cantor pairing function) is recursively enumerable. This set encodes the halting problem as it describes the input parameters for which each Turing machine halts.
Given a Gödel numbering φ of the computable functions, the set {⟨x, y, z⟩ : φx(y) = z} is recursively enumerable. This set encodes the problem of deciding a function value.
Given a partial function f from the natural numbers into the natural numbers, f is a partial recursive function if and only if the graph of f, that is, the set of all pairs ⟨x, f(x)⟩ such that f(x) is defined, is recursively enumerable.
Properties
If A and B are recursively enumerable sets then A ∩ B, A ∪ B and A × B (with the ordered pair of natural numbers mapped to a single natural number with the Cantor pairing function) are recursively enumerable sets (see the sketch below for the union). The preimage of a recursively enumerable set under a partial recursive function is a recursively enumerable set.
A set is recursively enumerable if and only if it is at level Σ⁰₁ of the arithmetical hierarchy. A set is called co-recursively enumerable (co-r.e.) if its complement is recursively enumerable; equivalently, a set is co-r.e. if and only if it is at level Π⁰₁ of the arithmetical hierarchy.
A set A is recursive (synonym: computable) if and only if both A and the complement of A are recursively enumerable. A set is recursive if and only if it is either the range of an increasing total recursive function or finite.
Some pairs of recursively enumerable sets are effectively separable and some are not.
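As an illustration of closure under union (a sketch, not from the article): given programs enumerating A and B, interleaving their outputs enumerates A ∪ B. No similar trick produces the complement, which is why recursively enumerable sets need not be recursive.

```python
# Interleave two enumerations to enumerate their union (sketch).
# The generators stand in for arbitrary effective enumerations and are
# assumed infinite here; a finite enumeration would need padding.
from itertools import count, islice

def squares():                 # an enumeration of A = {0, 1, 4, 9, ...}
    return (n * n for n in count())

def evens():                   # an enumeration of B = {0, 2, 4, 6, ...}
    return (2 * n for n in count())

def union(enum_a, enum_b):
    """Alternate between the two enumerations, so every element of either
    set is emitted after finitely many steps."""
    for a, b in zip(enum_a, enum_b):
        yield a
        yield b

print(list(islice(union(squares(), evens()), 12)))
```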
Remarks
According to the Church–Turing thesis, any effectively calculable function is calculable by a Turing machine, and thus a set S is recursively enumerable if and only if there is some algorithm which yields an enumeration of S. This cannot be taken as a formal definition, however, because the Church–Turing thesis is an informal conjecture rather than a formal axiom. The definition of a recursively enumerable set as the domain of a partial function, rather than the range of a total recursive function, is common in contemporary texts. This choice is motivated by the fact that in generalized recursion theories, such as α-recursion theory, the definition corresponding to domains has been found to be more natural. Other texts use the definition in terms of enumerations, which is equivalent for recursively enumerable sets.
References
Rogers, H. The Theory of Recursive Functions and Effective Computability, MIT Press. ISBN 0-262-68052-1; ISBN 0-07-053522-1. Soare, R. Recursively enumerable sets and degrees. Perspectives in Mathematical Logic. Springer-Verlag, Berlin, 1987. ISBN 3-540-15299-7. Soare, Robert I. Recursively enumerable sets and degrees. Bull. Amer. Math. Soc. 84 (1978), no. 6, 11491181.
Model theory
In mathematics, model theory is the study of (classes of) mathematical structures (e.g. groups, fields, graphs, universes of set theory) using tools from mathematical logic. Objects of study in model theory are models for formal languages which are structures that give meaning to the sentences of these formal languages. If a model for a language moreover satisfies a particular sentence or theory (set of sentences satisfying special conditions), it is called a model of the sentence or theory. Model theory has close ties to algebra and universal algebra. This article focuses on finitary first order model theory of infinite structures. Finite model theory, which concentrates on finite structures, diverges significantly from the study of infinite structures in both the problems studied and the techniques used. Model theory in higher-order logics or infinitary logics is hampered by the fact that completeness does not in general hold for these logics. However, a great deal of study has also been done in such languages.
Introduction
Model theory recognises and is intimately concerned with a duality: it examines semantical elements by means of syntactical elements of a corresponding language. To quote the first page of Chang and Keisler (1990): universal algebra + logic = model theory. Model theory developed rapidly during the 1990s, and a more modern definition is provided by Wilfrid Hodges (1997): model theory = algebraic geometry − fields. In a similar way to proof theory, model theory is situated in an area of interdisciplinarity between mathematics, philosophy, and computer science. The most important professional organization in the field of model theory is the Association for Symbolic Logic. An incomplete and somewhat arbitrary subdivision of model theory is into classical model theory, model theory applied to groups and fields, and geometric model theory. A missing subdivision is computable model theory, but this can arguably be viewed as an independent subfield of logic. Examples of early theorems from classical model theory include Gödel's completeness theorem, the upward and downward Löwenheim–Skolem theorems, Vaught's two-cardinal theorem, Scott's isomorphism theorem, the omitting types theorem, and the Ryll-Nardzewski theorem. Examples of early results from model theory applied to fields are Tarski's elimination of quantifiers for real closed fields, Ax's theorem on pseudo-finite fields, and Robinson's development of nonstandard analysis. An important step in the evolution of classical model theory occurred with the birth of stability theory (through Morley's theorem on uncountably categorical theories and Shelah's classification program), which developed a calculus of independence and rank based on syntactical conditions satisfied by theories. During the last several decades applied model theory has repeatedly merged with the more pure stability theory. The result of this synthesis is called geometric model theory in this article (which is taken to include o-minimality, for example, as well as classical geometric stability theory). An example of a theorem from geometric model theory is Hrushovski's proof of the Mordell–Lang conjecture for function fields. The ambition of geometric model theory is to provide a geography of mathematics by
embarking on a detailed study of definable sets in various mathematical structures, aided by the substantial tools developed in the study of pure model theory.
Example
To illustrate the basic relationship involving syntax and semantics in the context of a non-trivial model, one can start, on the syntactic side, with suitable axioms for the natural numbers such as Peano axioms, and the associated theory. Going on to the semantic side, one has the usual counting numbers as a model. In the 1930s, Skolem developed alternative models satisfying the axioms. This illustrates what is meant by interpreting a language or theory in a particular model. A more traditional example is interpreting the axioms of a particular algebraic system such as a group, in the context of a model provided by a specific group.
Universal algebra
Fundamental concepts in universal algebra are signatures σ and σ-algebras. Since these concepts are formally defined in the article on structures, the present article can content itself with an informal introduction which consists in examples of how these terms are used.

The standard signature of rings is σ_ring = {×, +, −, 0, 1}, where × and + are binary, − is unary, and 0 and 1 are nullary. The standard signature of semirings is σ_smr = {×, +, 0, 1}, where the arities are as above. The standard signature of (multiplicative) groups is σ_grp = {×, −1, 1}, where × is binary, −1 is unary and 1 is nullary. The standard signature of monoids is σ_mnd = {×, 1}.

A ring is a σ_ring-structure which satisfies the identities u + (v + w) = (u + v) + w, u + v = v + u, u + 0 = u, u + (−u) = 0, u × (v × w) = (u × v) × w, u × 1 = u, 1 × u = u, u × (v + w) = (u × v) + (u × w) and (v + w) × u = (v × u) + (w × u). A group is a σ_grp-structure which satisfies the identities u × (v × w) = (u × v) × w, u × 1 = u, 1 × u = u, u × u−1 = 1 and u−1 × u = 1. A monoid is a σ_mnd-structure which satisfies the identities u × (v × w) = (u × v) × w, u × 1 = u and 1 × u = u. A semigroup is a {×}-structure which satisfies the identity u × (v × w) = (u × v) × w. A magma is just a {×}-structure.

This is a very efficient way to define most classes of algebraic structures, because there is also the concept of σ-homomorphism, which correctly specializes to the usual notions of homomorphism for groups, semigroups, magmas and rings. For this to work, the signature must be chosen well. Terms such as the σ_ring-term t(u,v,w) given by (u + (v × w)) + (−1) are used to define identities t = t′, but also to construct free algebras. An equational class is a class of structures which, like the examples above and many others, is defined as the class of all σ-structures which satisfy a certain set of identities. Birkhoff's theorem states: A class of σ-structures is an equational class if and only if it is not empty and closed under subalgebras, homomorphic images, and direct products.

An important non-trivial tool in universal algebra are ultraproducts ∏ A_i / U, where I is an infinite set indexing a system of σ-structures A_i, and U is an ultrafilter on I. While model theory is generally considered a part of mathematical logic, universal algebra, which grew out of Alfred North Whitehead's (1898) work on abstract algebra, is part of algebra. This is reflected by their respective MSC classifications. Nevertheless model theory can be seen as an extension of universal algebra.
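Whether a given finite structure satisfies a set of identities can be checked mechanically. The following Python sketch is illustrative only (the structure and helper names are ad hoc, not from the article): it interprets the group signature {×, −1, 1} in the multiplicative group of integers modulo 7 and verifies the group identities by brute force.

from itertools import product

universe = [1, 2, 3, 4, 5, 6]          # multiplicative group of integers mod 7
ops = {
    "*":   lambda u, v: (u * v) % 7,
    "inv": lambda u: pow(u, 5, 7),     # u^-1 mod 7, since u^6 = 1 here
    "e":   1,
}

def is_group(universe, ops):
    # Check the identities u*(v*w)=(u*v)*w, u*e=e*u=u, u*inv(u)=inv(u)*u=e.
    mul, inv, e = ops["*"], ops["inv"], ops["e"]
    assoc = all(mul(u, mul(v, w)) == mul(mul(u, v), w)
                for u, v, w in product(universe, repeat=3))
    ident = all(mul(u, e) == u and mul(e, u) == u for u in universe)
    invs = all(mul(u, inv(u)) == e and mul(inv(u), u) == e for u in universe)
    return assoc and ident and invs

print(is_group(universe, ops))         # True

The same pattern, with different operations and identities, checks the ring, monoid, semigroup and magma cases.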
First-order logic
Whereas universal algebra provides the semantics for a signature, logic provides the syntax. With terms, identities and quasi-identities, even universal algebra has some limited syntactic tools; first-order logic is the result of making quantification explicit and adding negation into the picture. A first-order formula is built out of atomic formulas such as R(f(x,y),z) or y = x + 1 by means of the Boolean connectives and prefixing of quantifiers ∀ or ∃. A sentence is a formula in which each occurrence of a variable is in the scope of a corresponding quantifier. Examples of formulas are φ (or φ(x), to mark the fact that at most x is an unbound variable in φ) and ψ, two formulas in the signature of semirings with one free variable. (Note that the equality symbol has a double meaning here.) It is intuitively clear how to translate such formulas into mathematical meaning. In the σ_smr-structure N of the natural numbers, for example, an element n satisfies the formula φ if and only if n is a prime number. The formula ψ similarly defines irreducibility. Tarski gave a rigorous definition, sometimes called "Tarski's definition of truth", for the satisfaction relation ⊨, so that one easily proves: N ⊨ φ(n) if and only if n is a prime number, and N ⊨ ψ(n) if and only if n is irreducible.
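Satisfaction of such arithmetical formulas in the standard model can be checked mechanically once the quantifiers are bounded (in the natural numbers any divisor of n is at most n, so bounding the quantified variables by n loses nothing). A minimal Python sketch with ad-hoc names; the exact first-order shape of φ was not preserved in this copy, so the divisor-style definition used below is an assumption, though in N it picks out the same elements:

def satisfies_phi(n):
    # n != 0, n != 1, and for all u, v <= n: u*v = n implies u = 1 or v = 1.
    # Bounding the quantifiers by n is harmless: any divisor of n is at most n.
    if n in (0, 1):
        return False
    return all(u == 1 or v == 1
               for u in range(n + 1)
               for v in range(n + 1)
               if u * v == n)

print([n for n in range(20) if satisfies_phi(n)])
# [2, 3, 5, 7, 11, 13, 17, 19] -- exactly the primes, as stated above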
A set T of sentences is called a (first-order) theory. A theory is satisfiable if it has a model M ⊨ T, i.e. a structure (of the appropriate signature) which satisfies all the sentences in the set T. Consistency of a theory is usually defined in a syntactical way, but in first-order logic, by the completeness theorem, there is no need to distinguish between satisfiability and consistency. Therefore model theorists often use "consistent" as a synonym for "satisfiable".

A theory is called categorical if it determines a structure up to isomorphism, but it turns out that this definition is not useful, due to serious restrictions in the expressivity of first-order logic. The Löwenheim–Skolem theorem implies that for every theory T[1] which has an infinite model and for every infinite cardinal number κ, there is a model M ⊨ T such that the number of elements of M is exactly κ. Therefore only finite structures can be described by a categorical theory.

Lack of expressivity (when compared to higher logics such as second-order logic) has its advantages, though. For model theorists the Löwenheim–Skolem theorem is an important practical tool rather than the source of Skolem's paradox. First-order logic is in some sense (for which see Lindström's theorem) the most expressive logic for which both the Löwenheim–Skolem theorem and the compactness theorem hold:

Compactness theorem: Every unsatisfiable first-order theory has a finite unsatisfiable subset.

This important theorem, due to Gödel, is of central importance in infinite model theory, where the words "by compactness" are commonplace. One way to prove it is by means of ultraproducts. An alternative proof uses the completeness theorem, which is otherwise reduced to a marginal role in most of modern model theory.
Categoricity
As observed in the section on first-order logic, first-order theories cannot be categorical, i.e. they cannot describe a unique model up to isomorphism, unless that model is finite. But two famous model-theoretic theorems deal with the weaker notion of κ-categoricity for a cardinal κ. A theory T is called κ-categorical if any two models of T that are of cardinality κ are isomorphic. It turns out that the question of κ-categoricity depends critically on whether κ is bigger than the cardinality of the language (i.e. ℵ0 + |σ|, where |σ| is the cardinality of the signature). For finite or countable signatures this means that there is a fundamental difference between ω-cardinality and κ-cardinality for uncountable κ.

A few characterizations of ω-categoricity include: For a complete first-order theory T in a finite or countable signature the following conditions are equivalent:
1. T is ω-categorical.
2. For every natural number n, the Stone space Sn(T) is finite.
3. For every natural number n, the number of formulas φ(x1, ..., xn) in n free variables, up to equivalence modulo T, is finite.

This result, due independently to Engeler, Ryll-Nardzewski and Svenonius, is sometimes referred to as the Ryll-Nardzewski theorem. Further, ω-categorical theories and their countable models have strong ties with oligomorphic groups. They are often constructed as Fraïssé limits.

Michael Morley's highly non-trivial result that (for countable languages) there is only one notion of uncountable categoricity was the starting point for modern model theory, and in particular classification theory and stability theory:

Morley's categoricity theorem: If a first-order theory T in a finite or countable signature is κ-categorical for some uncountable cardinal κ, then T is κ-categorical for all uncountable cardinals κ.

Uncountably categorical (i.e. κ-categorical for all uncountable cardinals κ) theories are from many points of view the most well-behaved theories. A theory that is both ω-categorical and uncountably categorical is called totally categorical.
Interpretability
Given a mathematical structure, there are very often associated structures which can be constructed as a quotient of part of the original structure via an equivalence relation. An important example is a quotient group of a group. One might say that to understand the full structure one must understand these quotients. When the equivalence relation is definable, we can give the previous sentence a precise meaning. We say that these structures are interpretable. A key fact is that one can translate sentences from the language of the interpreted structures to the language of the original structure. Thus one can show that if a structure M interprets another whose theory is undecidable, then M itself is undecidable.
Types
Fix a first-order structure M and a natural number n. The set of definable subsets of M^n over some parameters A is a Boolean algebra. By Stone's representation theorem for Boolean algebras there is a natural dual notion to this: one can consider the topological space consisting of maximal consistent sets of formulae over A. We call this the space of (complete) n-types over A, and write S_n(A).

Now consider an element m ∈ M^n. Then the set of all formulae φ with parameters in A in free variables x_1, …, x_n such that M ⊨ φ(m) is consistent and maximal such; it is called the type of m over A. One can show that for any n-type p over A, there exists some elementary extension N of M and some element a of N^n such that p is the type of a over A.

Many important properties in model theory can be expressed with types. Further, many proofs go via constructing models containing elements with certain types and then using these elements.

Illustrative example: Suppose M is an algebraically closed field. The theory has quantifier elimination. This allows us to show that a type is determined exactly by the polynomial equations it contains. Thus the space of n-types over a subfield A is bijective with the set of prime ideals of the polynomial ring A[x_1, …, x_n]. This is the same set as the spectrum of A[x_1, …, x_n]. Note, however, that the topology considered on the type space is the constructible topology: a set of types is basic open iff it is of the form {p : φ(x) ∈ p} or of the form {p : φ(x) ∉ p}. This is finer than the Zariski topology.
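For algebraic elements the correspondence between types over a field and prime ideals of the polynomial ring can be made concrete: the 1-type of an algebraic element over Q is determined by its minimal polynomial, which generates the corresponding prime ideal. A hedged sketch using SymPy (minimal_polynomial is a SymPy function; the surrounding framing is an illustration, not part of the article):

from sympy import Symbol, sqrt, minimal_polynomial

x = Symbol('x')
# The 1-type of sqrt(2) over Q (inside an algebraically closed extension) is
# determined by the polynomial equations it satisfies; these are exactly the
# multiples of its minimal polynomial, i.e. the prime ideal (x**2 - 2) of Q[x].
p = minimal_polynomial(sqrt(2), x)
print(p)                                  # x**2 - 2
print(p.subs(x, sqrt(2)) == 0)            # True: sqrt(2) realizes this type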
Early history
Model theory as a subject has existed since approximately the middle of the 20th century. However some earlier research, especially in mathematical logic, is often regarded as being of a model-theoretical nature in retrospect. The first significant result in what is now model theory was a special case of the downward Löwenheim–Skolem theorem, published by Leopold Löwenheim in 1915. The compactness theorem was implicit in work by Thoralf Skolem,[2] but it was first published in 1930, as a lemma in Kurt Gödel's proof of his completeness theorem. The Löwenheim–Skolem theorem and the compactness theorem received their respective general forms in 1936 and 1941 from Anatoly Maltsev.
Notes
[1] In a countable signature. The theorem has a straightforward generalization to uncountable signatures. [2] All three commentators [i.e. Vaught, van Heijenoort and Dreben] agree that both the completeness and compactness theorems were implicit in Skolem 1923 [...], Dawson (1993).
References
Canonical textbooks
Chang, Chen Chung; Keisler, H. Jerome (1990) [1973]. Model Theory. Studies in Logic and the Foundations of Mathematics (3rd ed.). Elsevier. ISBN 978-0-444-88054-3.
Hodges, Wilfrid (1997). A shorter model theory. Cambridge: Cambridge University Press. ISBN 978-0-521-58713-6.
Marker, David (2002). Model Theory: An Introduction. Graduate Texts in Mathematics 217. Springer. ISBN 0-387-98760-6.
Other textbooks
Bell, John L.; Slomson, Alan B. (2006) [1969]. Models and Ultraproducts: An Introduction (reprint of 1974 ed.). Dover Publications. ISBN 0-486-44979-3.
Ebbinghaus, Heinz-Dieter; Flum, Jörg; Thomas, Wolfgang (1994). Mathematical Logic. Springer. ISBN 0-387-94258-0.
Hinman, Peter G. (2005). Fundamentals of Mathematical Logic. A K Peters. ISBN 1-568-81262-0.
Manzano, Maria (1989). Teoria de modelos. Alianza editorial. ISBN 84-206-8126-1.
Hodges, Wilfrid (1993). Model theory. Cambridge University Press. ISBN 0-521-30442-3.
Manzano, Maria (1999). Model theory. Oxford University Press. ISBN 0-19-853851-0.
Poizat, Bruno (2000). A Course in Model Theory. Springer. ISBN 0-387-98655-3.
Rautenberg, Wolfgang (2010). A Concise Introduction to Mathematical Logic (http://www.springerlink.com/content/978-1-4419-1220-6/) (3rd ed.). New York: Springer Science+Business Media. doi:10.1007/978-1-4419-1221-3. ISBN 978-1-4419-1220-6.
Rothmaler, Philipp (2000). Introduction to Model Theory (new ed.). Taylor & Francis. ISBN 9056993135.
Compactness theorem
In mathematical logic, the compactness theorem states that a set of first-order sentences has a model if and only if every finite subset of it has a model. This theorem is an important tool in model theory, as it provides a useful method for constructing models of any set of sentences that is finitely consistent. The compactness theorem for the propositional calculus is a consequence of Tychonoff's theorem (which says that the product of compact spaces is compact) applied to compact Stone spaces;[1] hence, the theorem's name. Likewise, it is analogous to the finite intersection property characterization of compactness in topological spaces: a collection of closed sets in a compact space has a non-empty intersection if every finite subcollection has a non-empty intersection. The compactness theorem is one of the two key properties, along with the downward Löwenheim–Skolem theorem, that is used in Lindström's theorem to characterize first-order logic. Although there are some generalizations of the compactness theorem to non-first-order logics, the compactness theorem itself does not hold in them.
Applications
The compactness theorem has many applications in model theory; a few typical results are sketched here.

The compactness theorem implies Robinson's principle: if a first-order sentence holds in every field of characteristic zero, then there exists a constant p such that the sentence holds for every field of characteristic larger than p. This can be seen as follows: suppose φ is a sentence that holds in every field of characteristic zero. Then its negation ¬φ, together with the field axioms and the infinite sequence of sentences 1+1 ≠ 0, 1+1+1 ≠ 0, …, is not satisfiable (because there is no field of characteristic 0 in which ¬φ holds, and the infinite sequence of sentences ensures any model would be a field of characteristic 0). Therefore, there is a finite subset A of these sentences that is not satisfiable. We can assume that A contains ¬φ, the field axioms, and, for some k, the first k sentences of the form 1+1+…+1 ≠ 0 (because adding more sentences doesn't change unsatisfiability). Let B contain all the sentences of A except ¬φ. Then any model of B is a field of characteristic greater than k, and ¬φ together with B is not satisfiable. This means that φ must hold in every model of B, which means precisely that φ holds in every field of characteristic greater than k.

A second application of the compactness theorem shows that any theory that has arbitrarily large finite models, or a single infinite model, has models of arbitrarily large cardinality (this is the upward Löwenheim–Skolem theorem). So, for instance, there are nonstandard models of Peano arithmetic with uncountably many 'natural numbers'. To achieve this, let T be the initial theory and let κ be any cardinal number. Add to the language of T one constant symbol for every element of κ. Then add to T a collection of sentences that say that the objects denoted by any two distinct constant symbols from the new collection are distinct (this is a collection of κ² sentences). Since every finite subset of this new theory is satisfiable by a sufficiently large finite model of T, or by any infinite model, the entire extended theory is satisfiable. But any model of the extended theory has cardinality at least κ.

A third application of the compactness theorem is the construction of nonstandard models of the real numbers, that is, consistent extensions of the theory of the real numbers that contain "infinitesimal" numbers. To see this, let Σ be a first-order axiomatization of the theory of the real numbers. Consider the theory obtained by adding a new constant symbol ε to the language and adjoining to Σ the axiom ε > 0 and the axioms ε < 1/n for all positive integers n. Clearly, the standard real numbers R are a model for every finite subset of these axioms, because the real numbers satisfy everything in Σ and, by suitable choice of ε, can be made to satisfy any finite subset of the axioms about ε. By the compactness theorem, there is a model *R that satisfies Σ and also contains an infinitesimal element ε. A similar argument, adjoining axioms ω > 0, ω > 1, etc., shows that the existence of infinitely large integers cannot be ruled out by any axiomatization of the reals.[2]
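The finite-satisfiability step in the third application can be made concrete: for any finite subset of the new axioms, the largest index n that occurs bounds how small ε must be, and an ordinary rational witness exists. A small Python sketch (the names are ad hoc):

from fractions import Fraction

def witness(indices):
    # For a finite subset {eps > 0} plus {eps < 1/n : n in indices} of the new
    # axioms, exhibit a standard rational value of eps satisfying all of them.
    n_max = max(indices)
    eps = Fraction(1, 2 * n_max)
    assert eps > 0 and all(eps < Fraction(1, n) for n in indices)
    return eps

print(witness({1, 5, 1000}))     # 1/2000: the reals model every finite subset
# No standard real works for all n simultaneously; compactness instead yields a
# nonstandard model *R containing a genuine infinitesimal.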
Proofs
One can prove the compactness theorem using Gödel's completeness theorem, which establishes that a set of sentences is satisfiable if and only if no contradiction can be proven from it. Since proofs are always finite and therefore involve only finitely many of the given sentences, the compactness theorem follows. In fact, the compactness theorem is equivalent to Gödel's completeness theorem, and both are equivalent to the Boolean prime ideal theorem, a weak form of the axiom of choice.[3]

Gödel originally proved the compactness theorem in just this way, but later some "purely semantic" proofs of the compactness theorem were found, i.e., proofs that refer to truth but not to provability. One of those proofs relies on ultraproducts hinging on the axiom of choice as follows:

Proof: Fix a first-order language L, and let Σ be a collection of L-sentences such that every finite subcollection i ⊆ Σ of it has a model M_i. Let I be the collection of finite subsets of Σ, and let ∏ M_i be the direct product of the structures M_i. For each i in I let A_i := { j ∈ I : j ⊇ i }. The family of all these sets A_i generates a filter, so there is an ultrafilter U containing all sets of the form A_i. Now for any formula φ in Σ we have: the set A_{{φ}} is in U; whenever j ∈ A_{{φ}}, then φ ∈ j, hence φ holds in M_j; the set of all j with the property that φ holds in M_j is a superset of A_{{φ}}, hence also in U. By Łoś's theorem, φ therefore holds in the ultraproduct ∏ M_i / U. So this ultraproduct satisfies all formulas in Σ.
Notes
[1] See Truss (1997). [2] Goldblatt, Robert (1998). Lectures on the Hyperreals. New York: Springer. pp. 10–11. ISBN 0-387-98464-X. [3] See Hodges (1993).
References
Boolos, George; Jeffrey, Richard; Burgess, John (2004). Computability and Logic (fourth ed.). Cambridge University Press.
Chang, C. C.; Keisler, H. Jerome (1989). Model Theory (third ed.). Elsevier. ISBN 0-7204-0692-7.
Dawson, John W. junior (1993). "The compactness of first-order logic: From Gödel to Lindström". History and Philosophy of Logic 14: 15–37. doi:10.1080/01445349308837208.
Hodges, Wilfrid (1993). Model theory. Cambridge University Press. ISBN 0-521-30442-3.
Marker, David (2002). Model Theory: An Introduction. Graduate Texts in Mathematics 217. Springer. ISBN 0-387-98760-6.
Truss, John K. (1997). Foundations of Mathematical Analysis. Oxford University Press. ISBN 0198533756.
Further reading
Hummel, Christoph (1997). Gromov's compactness theorem for pseudo-holomorphic curves. Basel, Switzerland: Birkhäuser. ISBN 3-7643-5735-5.
Löwenheim–Skolem theorem
In mathematical logic, the Löwenheim–Skolem theorem, named for Leopold Löwenheim and Thoralf Skolem, states that if a countable first-order theory has an infinite model, then for every infinite cardinal number κ it has a model of size κ. The result implies that first-order theories are unable to control the cardinality of their infinite models, and that no first-order theory with an infinite model can have exactly one model up to isomorphism. The (downward) Löwenheim–Skolem theorem is one of the two key properties, along with the compactness theorem, that are used in Lindström's theorem to characterize first-order logic. In general, the Löwenheim–Skolem theorem does not hold in stronger logics such as second-order logic.
Background
A signature consists of a set of function symbols Sfunc, a set of relation symbols Srel, and a function representing the arity of function and relation symbols. (A nullary function symbol is called a constant symbol.) In the context of first-order logic, a signature is sometimes called a language. It is called countable if the set of function and relation symbols in it is countable, and in general the cardinality of a signature is the cardinality of the set of all the symbols it contains. A first-order theory consists of a fixed signature and a fixed set of sentences (formulas with no free variables) in that signature. Theories are often specified by giving a list of axioms that generate the theory, or by giving a structure and taking the theory to consist of the sentences satisfied by the structure. Given a signature σ, a σ-structure M is a concrete interpretation of the symbols in σ. It consists of an underlying set (often also denoted by "M") together with an interpretation of the function and relation symbols of σ. An interpretation of a constant symbol of σ in M is simply an element of M. More generally, an interpretation of an n-ary function symbol f is a function from M^n to M. Similarly, an interpretation of a relation symbol R is an n-ary relation on M, i.e. a subset of M^n. A substructure of a σ-structure M is obtained by taking a subset N of M which is closed under the interpretations of all the function symbols in σ (hence includes the interpretations of all constant symbols in σ), and then restricting the interpretations of the relation symbols to N. An elementary substructure is a very special case of this; in particular an elementary substructure satisfies exactly the same first-order sentences as the original structure (its elementary extension).
Precise statement
The modern statement of the theorem is both more general and stronger than the version for countable signatures stated in the introduction. In its general form, the Löwenheim–Skolem Theorem states that for every signature σ, every infinite σ-structure M and every infinite cardinal number κ ≥ |σ|, there is a σ-structure N such that |N| = κ and
if κ < |M| then N is an elementary substructure of M;
if κ > |M| then N is an elementary extension of M.
The theorem is often divided into two parts corresponding to the two cases above. The part of the theorem asserting that a structure has elementary substructures of all smaller infinite cardinalities is known as the downward Löwenheim–Skolem Theorem. The part of the theorem asserting that a structure has elementary extensions of all larger cardinalities is known as the upward Löwenheim–Skolem Theorem. The statement given in the introduction follows immediately by taking M to be an infinite model of the theory. The proof of the upward part of the theorem also shows that a theory with arbitrarily large finite models must have an infinite model; sometimes this is considered to be part of the theorem. For historical variants of the theorem, see the historical notes below.
Proof sketch
Downward part
For each first-order σ-formula φ(y, x1, …, xn) the axiom of choice implies the existence of a function f_φ: M^n → M such that, for all a1, …, an ∈ M, either

M ⊨ φ(f_φ(a1, …, an), a1, …, an)

or

M ⊨ ¬∃y φ(y, a1, …, an).

Applying the axiom of choice again we get a function from the first-order formulas φ to such functions f_φ. The family of functions f_φ gives rise to a preclosure operator F on the power set of M:

F(A) = { f_φ(a1, …, an) ∈ M : φ a first-order σ-formula; a1, …, an ∈ A }, for A ⊆ M.

Iterating F countably many times results in a closure operator F^ω. Taking an arbitrary subset A ⊆ M such that |A| = κ, and having defined N = F^ω(A), one sees that |N| = κ as well. Then N is an elementary substructure of M by the Tarski–Vaught test.

The trick used in this proof is essentially due to Skolem, who introduced function symbols for the Skolem functions f_φ into the language. One could also define the f_φ as partial functions such that f_φ is defined if and only if M ⊨ ∃y φ(y, a1, …, an). The only important point is that F is a preclosure operator such that F(A) contains a solution for every formula with parameters in A which has a solution in M, and that |F(A)| ≤ |A| + |σ| + ℵ0.
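The closure construction at the heart of this argument, iterating a family of functions until a fixpoint is reached, is the same idea as computing the substructure generated by a set. A toy Python sketch under that analogy (Skolem functions are replaced by ordinary operations; all names are ad hoc):

from itertools import product

def closure(start, functions):
    # Iterate F(A) = A plus { f(a1, ..., ak) : (f, k) in functions, ai in A }
    # until a fixpoint is reached, mimicking the operator F of the proof sketch.
    closed = set(start)
    while True:
        new = set()
        for f, arity in functions:
            for args in product(closed, repeat=arity):
                new.add(f(*args))
        if new <= closed:
            return closed
        closed |= new

# Substructure of (Z/12, +) generated by {4}: close under binary addition mod 12.
print(sorted(closure({4}, [(lambda a, b: (a + b) % 12, 2)])))   # [0, 4, 8]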
Upward part
First, one extends the signature by adding a new constant symbol for every element of M. The complete theory of M for the extended signature σ' is called the elementary diagram of M. In the next step one adds κ many new constant symbols to the signature and adds to the elementary diagram of M the sentences c ≠ c' for any two distinct new constant symbols c and c'. Using the compactness theorem, the resulting theory is easily seen to be consistent. Since its models must have cardinality at least κ, the downward part of this theorem guarantees the existence of a model N which has cardinality exactly κ. It contains an isomorphic copy of M as an elementary substructure.
Historical notes
This account is based mainly on Dawson (1993). To understand the early history of model theory one must distinguish between syntactical consistency (no contradiction can be derived using the deduction rules for first-order logic) and satisfiability (there is a model). Somewhat surprisingly, even before the completeness theorem made the distinction unnecessary, the term consistent was used sometimes in one sense and sometimes in the other.

The first significant result in what later became model theory was Löwenheim's theorem in Leopold Löwenheim's publication "Über Möglichkeiten im Relativkalkül" (1915): For every countable signature σ, every σ-sentence which is satisfiable is satisfiable in a countable model. Löwenheim's proof, however, was faulty. Thoralf Skolem (1920) gave a correct proof using formulas in what would later be called Skolem normal form and relying on the axiom of choice: Every countable theory which is satisfiable in a model M is satisfiable in a countable substructure of M. Skolem (1923) also proved the following weaker version without the axiom of choice: Every countable theory which is satisfiable in a model is also satisfiable in a countable model. Skolem (1929) simplified Skolem (1920). Finally, Anatoly Ivanovich Maltsev (1936) proved the Löwenheim–Skolem theorem in its full generality. He cited a note by Skolem, according to which the theorem had been proved by Alfred Tarski in a seminar in 1928. Therefore the general theorem is sometimes known as the Löwenheim–Skolem–Tarski theorem. But Tarski did not remember his proof, and it remains a mystery how he could do it without the compactness theorem.

It is somewhat ironic that Skolem's name is connected with the upward direction of the theorem as well as with the downward direction:
"I follow custom in calling Corollary 6.1.4 the upward Löwenheim–Skolem theorem. But in fact Skolem didn't even believe it, because he didn't believe in the existence of uncountable sets." Hodges (1993).
"Skolem [...] rejected the result as meaningless; Tarski [...] very reasonably responded that Skolem's formalist viewpoint ought to reckon the downward Löwenheim–Skolem theorem meaningless just like the upward." Hodges (1993).
"Legend has it that Thoralf Skolem, up until the end of his life, was scandalized by the association of his name to a result of this type, which he considered an absurdity, nondenumerable sets being, for him, fictions without real existence." Poizat (2000).
References
The Löwenheim–Skolem theorem is treated in all introductory texts on model theory or mathematical logic.
Historical publications
Veblen, Oswald (1904), "A System of Axioms for Geometry", Transactions of the American Mathematical Society 5 (3): 343–384, doi:10.2307/1986462, ISSN 0002-9947, JSTOR 1986462.
Löwenheim, Leopold (1915), "Über Möglichkeiten im Relativkalkül", Mathematische Annalen 76 (4): 447–470, doi:10.1007/BF01458217, ISSN 0025-5831.
Skolem, Thoralf (1920), "Logisch-kombinatorische Untersuchungen über die Erfüllbarkeit oder Beweisbarkeit mathematischer Sätze nebst einem Theoreme über dichte Mengen", Videnskapsselskapet Skrifter, I. Matematisk-naturvidenskabelig Klasse 6: 1–36.
Skolem, Thoralf (1923), "Einige Bemerkungen zur axiomatischen Begründung der Mengenlehre", Mathematikerkongressen i Helsingfors 4.–7. Juli 1922, Den femte skandinaviska matematikerkongressen, Redogörelse: 217–232.
Skolem, Thoralf (1929), "Über einige Grundlagenfragen der Mathematik", Skrifter utgitt av det Norske Videnskaps-Akademi i Oslo, I. Matematisk-naturvidenskabelig Klasse 7: 1–49.
Maltsev, Anatoly Ivanovich (1936), "Untersuchungen aus dem Gebiete der mathematischen Logik", Matematicheskii Sbornik, n.s. 1: 323–336.
Secondary sources
Badesa, Calixto (2004), The Birth of Model Theory: Löwenheim's Theorem in the Frame of the Theory of Relatives, Princeton, NJ: Princeton University Press, ISBN 978-0-691-05853-5.
Burris, Stanley N., Contributions of the Logicians, Part II, From Richard Dedekind to Gerhard Gentzen [1].
Burris, Stanley N., Downward Löwenheim–Skolem theorem [2].
Dawson, John W., Jr. (1993), "The compactness of First-Order Logic: From Gödel to Lindström", History and Philosophy of Logic 14: 15–37, doi:10.1080/01445349308837208.
Hodges, Wilfrid (1993), Model theory, Cambridge: Cambridge Univ. Pr., ISBN 978-0-521-30442-9.
Poizat, Bruno (2000), A Course in Model Theory: An Introduction to Contemporary Mathematical Logic, Berlin, New York: Springer, ISBN 978-0-387-98655-5.
Simpson, Stephen G. (1998), Model Theory [3].
References
[1] http://www.math.uwaterloo.ca/~snburris/htdocs/LOGIC/LOGICIANS/notes2.pdf
[2] http://www.math.uwaterloo.ca/~snburris/htdocs/WWW/PDF/downward.pdf
[3] http://www.math.psu.edu/simpson/notes/master.pdf
Elementary class
In the branch of mathematical logic called model theory, an elementary class (or axiomatizable class) is a class consisting of all structures satisfying a fixed first-order theory.
Definition
A class K of structures of a signature σ is called an elementary class if there is a first-order theory T of signature σ such that K consists of all models of T, i.e., of all σ-structures that satisfy T. If T can be chosen as a theory consisting of a single first-order sentence, then K is called a basic elementary class. More generally, K is a pseudo-elementary class if there is a first-order theory T of a signature σ' that extends σ, such that K consists of all σ-structures that are reducts to σ of models of T. In other words, a class K of σ-structures is pseudo-elementary iff there is an elementary class K' such that K consists of precisely the reducts to σ of the structures in K'. For obvious reasons, elementary classes are also called axiomatizable in first-order logic, and basic elementary classes are called finitely axiomatizable in first-order logic. These definitions extend to other logics in the obvious way, but since the first-order case is by far the most important, axiomatizable implicitly refers to this case when no other logic is specified.
Examples
A basic elementary class
Let σ be a signature consisting only of a unary function symbol f. The class K of σ-structures in which f is one-to-one is a basic elementary class. This is witnessed by the theory T, which consists only of the single sentence ∀x ∀y (f(x) = f(y) → x = y).
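For finite σ-structures, membership in this class K can be checked directly. A minimal Python sketch (illustrative names only):

def satisfies_injectivity(universe, f):
    # The sentence: for all x, y, f(x) = f(y) implies x = y, checked in a finite
    # structure whose single unary function is given as a dict.
    return all(x == y for x in universe for y in universe if f[x] == f[y])

print(satisfies_injectivity({0, 1, 2}, {0: 1, 1: 2, 2: 0}))   # True: in K
print(satisfies_injectivity({0, 1, 2}, {0: 1, 1: 1, 2: 0}))   # False: not in K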
Non-pseudo-elementary class
Let σ be an arbitrary signature. The class K of all finite σ-structures is not elementary, because (as shown above) its complement is elementary but not basic elementary. Since this is also true for every signature extending σ, K is not even a pseudo-elementary class. This example demonstrates the limits of expressive power inherent in first-order logic as opposed to the far more expressive second-order logic. Second-order logic, however, fails to retain many desirable properties of first-order logic, such as the compactness theorem.
References
Chang, Chen Chung; Keisler, H. Jerome (1990) [1973], Model Theory, Studies in Logic and the Foundations of Mathematics (3rd ed.), Elsevier, ISBN 978-0-444-88054-3.
Ebbinghaus, Heinz-Dieter; Flum, Jörg (2005) [1995], Finite model theory, Berlin, New York: Springer-Verlag, pp. 360, ISBN 978-3-540-28787-2.
Ebbinghaus, Heinz-Dieter; Flum, Jörg; Thomas, Wolfgang (1994), Mathematical Logic (2nd ed.), Berlin, New York: Springer-Verlag, ISBN 978-0-387-94258-2.
Hodges, Wilfrid (1997), A shorter model theory, Cambridge University Press, ISBN 978-0-521-58713-6.
Poizat, Bruno (2000), A Course in Model Theory: An Introduction to Contemporary Mathematical Logic, Berlin, New York: Springer-Verlag, ISBN 978-0-387-98655-5.
Saturated model
In mathematical logic, and particularly in its subfield model theory, a saturated model M is one which realizes as many complete types as may be "reasonably expected" given its size. For example, an ultrapower model of the hyperreals is countably saturated, meaning that every descending nested sequence of internal sets has a nonempty intersection; see Goldblatt (1998).
Definition
Let κ be a finite or infinite cardinal number and M a model in some first-order language. Then M is called κ-saturated if for all subsets A ⊆ M of cardinality less than κ, M realizes all complete types over A. The model M is called saturated if it is |M|-saturated where |M| denotes the cardinality of M. That is, it realizes all complete types over sets of parameters of size less than |M|. According to some authors, a model M is called countably saturated if it is ℵ1-saturated; that is, it realizes all complete types over countable sets of parameters. According to others, it is countably saturated if it is ℵ0-saturated; i.e. it realizes all complete types over finite parameter sets.
Motivation
The seemingly more intuitive notion that all complete types of the language are realized turns out to be too weak (and is, appropriately, named weak saturation, which is the same as 1-saturation). The difference lies in the fact that many structures contain elements which are not definable (for example, any transcendental element of R is, by definition of the word, not definable in the field language). However, they still form a part of the structure, so we need types to describe relationships with them. Thus we allow sets of parameters from the structure in our definition of types. This argument allows us to discuss specific features of the model which we may otherwise miss; for example, a specific increasing sequence cn having a bound can be expressed as realizing the type {x > cn : n ∈ ω}, which uses countably many parameters. If the sequence is not definable, this fact about the structure cannot be described using the base language, so a weakly saturated structure may not bound the sequence, while an ℵ1-saturated structure will.

The reason we only require parameter sets which are strictly smaller than the model is trivial: without this restriction, no infinite model is saturated. Consider a model M, and the type {x ≠ m : m ∈ M}. Each finite subset of this type is realized in the (infinite) model M, so by compactness it is consistent with M, but it is trivially not realized. Any definition which is universally unsatisfied is useless; hence the restriction.
Examples
Saturated models exist for certain theories and cardinalities:

(Q, <), the set of rational numbers with their usual ordering, is saturated. Intuitively, this is because any type consistent with the theory is implied by the order type; that is, the order the variables come in tells you everything there is to know about their role in the structure.

(R, <), the set of real numbers with their usual ordering, is not saturated. For example, take the type (in one variable x) which contains the formula x > −1/n for every natural number n, as well as the formula x < 0. This type uses ω different parameters from R. Every finite subset of the type is realized on R by some real x, so by compactness it is consistent with the structure, but it is not realized, as it would imply an upper bound to the sequence −1/n which is less than 0 (its least upper bound). Thus (R, <) is not ω1-saturated, and not saturated. However, it is ω-saturated, for essentially the same reason as Q: every finite type is given by the order type, which, if consistent, is always realized, because of the density of the order.

The countable random graph, with the only non-logical symbol being the edge existence relation, is also saturated, because any complete type is implied by the finite subgraph consisting of the variables and parameters used to define the type.

Both of these theories can be shown to be ω-categorical through the back-and-forth method, sketched in code below. This can be generalized as follows: the unique model of cardinality ω of a countable ω-categorical theory is saturated.

However, the statement that every model has a saturated elementary extension is not provable in ZFC. In fact, this statement is equivalent to the existence of a proper class of cardinals κ such that κ^{<κ} = κ. The latter identity implies that either κ = λ⁺ = 2^λ for some λ, or κ is weakly inaccessible.
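The back-and-forth method mentioned above can be sketched in code for (Q, <): at each stage one side nominates an element, and density plus the absence of endpoints guarantee a partner on the other side that keeps the finite matching order-preserving. A hedged Python sketch (the helper names and the particular enumerations are ad hoc):

from fractions import Fraction

def partner(pairs, a, forward=True):
    # Given a finite order-preserving matching `pairs` between two copies of (Q, <),
    # find a partner for `a` on the other side, using density and lack of endpoints.
    lo = hi = None
    for x, y in pairs:
        s, t = (x, y) if forward else (y, x)
        if s < a and (lo is None or s > lo[0]):
            lo = (s, t)
        if s > a and (hi is None or s < hi[0]):
            hi = (s, t)
    if lo and hi:
        return (lo[1] + hi[1]) / 2       # density: something strictly in the gap
    if lo:
        return lo[1] + 1                 # no right endpoint
    if hi:
        return hi[1] - 1                 # no left endpoint
    return Fraction(0)

A = [Fraction(1, 2), Fraction(1, 4), Fraction(3, 4)]   # elements offered by one side
B = [Fraction(0), Fraction(1, 3), Fraction(2, 3)]      # elements offered by the other
pairs = []
for a, b in zip(A, B):
    if all(a != x for x, _ in pairs):
        pairs.append((a, partner(pairs, a, forward=True)))
    if all(b != y for _, y in pairs):
        pairs.append((partner(pairs, b, forward=False), b))
xs = sorted(pairs)
assert all(xs[i][1] < xs[i + 1][1] for i in range(len(xs) - 1))   # order-preserving
print(xs)

Run forever over enumerations of two countable dense orders without endpoints, this process exhausts both sides and produces an isomorphism.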
References
Chang, C. C.; Keisler, H. J. Model theory. Third edition. Studies in Logic and the Foundations of Mathematics, 73. North-Holland Publishing Co., Amsterdam, 1990. xvi+650 pp. ISBN 0-444-88054-2.
Goldblatt, R. (1998). Lectures on the hyperreals. An introduction to nonstandard analysis. Springer.
Marker, David (2002). Model Theory: An Introduction. New York: Springer-Verlag. ISBN 0-387-98760-6.
Poizat, Bruno; Trans: Klein, Moses (2000), A Course in Model Theory, New York: Springer-Verlag. ISBN 0-387-98655-3.
Sacks, Gerald E. (1972), Saturated model theory, W. A. Benjamin, Inc., Reading, Mass., MR0398817.
Kripke semantics
Kripke semantics (also known as relational semantics or frame semantics, and often confused with possible world semantics) is a formal semantics for non-classical logic systems created in the late 1950s and early 1960s by Saul Kripke. It was first devised for modal logics, and later adapted to intuitionistic logic and other non-classical systems. The discovery of Kripke semantics was a breakthrough in the theory of non-classical logics, because the model theory of such logics was nonexistent before Kripke.
Basic definitions
A Kripke frame or modal frame is a pair ⟨W, R⟩, where W is a non-empty set, and R is a binary relation on W. Elements of W are called nodes or worlds, and R is known as the accessibility relation.

A Kripke model is a triple ⟨W, R, ⊩⟩, where ⟨W, R⟩ is a Kripke frame, and ⊩ is a relation between nodes of W and modal formulas, such that:
w ⊩ ¬A if and only if w ⊮ A,
w ⊩ A → B if and only if w ⊮ A or w ⊩ B,
w ⊩ □A if and only if u ⊩ A for all u such that w R u.
We read w ⊩ A as "w satisfies A", "A is satisfied in w", or "w forces A". The relation ⊩ is called the satisfaction relation, evaluation, or forcing relation. The satisfaction relation is uniquely determined by its value on propositional variables.

A formula A is valid in:
a model ⟨W, R, ⊩⟩, if w ⊩ A for all w ∈ W,
a frame ⟨W, R⟩, if it is valid in ⟨W, R, ⊩⟩ for all possible choices of ⊩,
a class C of frames or models, if it is valid in every member of C.

We define Thm(C) to be the set of all formulas that are valid in C. Conversely, if X is a set of formulas, let Mod(X) be the class of all frames which validate every formula from X. A modal logic (i.e., a set of formulas) L is sound with respect to a class of frames C if L ⊆ Thm(C). L is complete with respect to C if L ⊇ Thm(C).
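The satisfaction clauses translate directly into a recursive evaluator on finite models. A small Python sketch (formulas encoded as nested tuples; all names are ad hoc, not from the article):

def forces(w, formula, R, val):
    # Evaluate a modal formula at world w.  R maps each world to the set of worlds
    # it accesses; val maps each world to the set of variables true there.
    # Formulas are encoded as 'p', ('not', A), ('imp', A, B) or ('box', A).
    if isinstance(formula, str):
        return formula in val[w]
    op = formula[0]
    if op == 'not':
        return not forces(w, formula[1], R, val)
    if op == 'imp':
        return (not forces(w, formula[1], R, val)) or forces(w, formula[2], R, val)
    if op == 'box':
        return all(forces(u, formula[1], R, val) for u in R[w])
    raise ValueError(op)

# World 1 sees only world 2, and p holds only at 2: then box p holds at 1 but p does
# not, so this (non-reflexive) frame does not validate box p -> p.
R = {1: {2}, 2: set()}
val = {1: set(), 2: {'p'}}
print(forces(1, ('box', 'p'), R, val), forces(1, 'p', R, val))   # True False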
A normal modal logic L corresponds to a class of frames C if C = Mod(L). In other words, C is the largest class of frames such that L is sound with respect to C. It follows that L is Kripke complete if and only if it is complete with respect to its corresponding class.

Consider the schema T: □A → A. T is valid in any reflexive frame ⟨W, R⟩: if w ⊩ □A, then w ⊩ A since w R w. On the other hand, a frame which validates T has to be reflexive: fix w ∈ W, and define satisfaction of a propositional variable p as follows: u ⊩ p if and only if w R u. Then w ⊩ □p, thus w ⊩ p by T, which means w R w using the definition of ⊩. T corresponds to the class of reflexive Kripke frames.

It is often much easier to characterize the corresponding class of L than to prove its completeness, thus correspondence serves as a guide to completeness proofs. Correspondence is also used to show incompleteness of modal logics: suppose L1 ⊆ L2 are normal modal logics that correspond to the same class of frames, but L1 does not prove all theorems of L2. Then L1 is Kripke incomplete. For example, the schema □(A ↔ □A) → □A generates an incomplete logic, as it corresponds to the same class of frames as GL (viz. transitive and converse well-founded frames), but does not prove the GL-tautology □A → □□A.

The table below is a list of common modal axioms together with their corresponding classes. The naming of the axioms often varies.
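The correspondence just derived for T can also be verified by brute force on small frames: over every frame on a fixed small set of worlds, validity of □p → p under all valuations coincides with reflexivity. A Python sketch (names ad hoc):

from itertools import combinations, product

worlds = [0, 1]
arrows = [(u, v) for u in worlds for v in worlds]

def T_valid(R):
    # Frame validity of box p -> p: for every extension of p and every world w,
    # if p holds at all R-successors of w then p holds at w.
    for r in range(len(worlds) + 1):
        for ext in combinations(worlds, r):
            for w in worlds:
                if all(v in ext for v in R[w]) and w not in ext:
                    return False
    return True

for edges in product([False, True], repeat=len(arrows)):
    R = {w: {v for (u, v), on in zip(arrows, edges) if on and u == w} for w in worlds}
    assert T_valid(R) == all(w in R[w] for w in worlds)   # valid iff reflexive
print("T corresponds to reflexivity over all frames on two worlds")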
Here is a list of several common modal systems. Frame conditions for some of them were simplified: the logics are complete with respect to the frame classes given in the table, but they may correspond to a larger class of frames.
Among the entries: Grz, axiomatized by T, 4 and Grz, is matched with finite partial orders; D with serial frames; and D45, axiomatized by D, 4 and 5, with transitive, serial, and Euclidean frames.
Canonical models
For any normal modal logic L, a Kripke model (called the canonical model) can be constructed, which validates precisely the theorems of L, by an adaptation of the standard technique of using maximal consistent sets as models. Canonical Kripke models play a role similar to the Lindenbaum–Tarski algebra construction in algebraic semantics. A set of formulas is L-consistent if no contradiction can be derived from them using the axioms of L and Modus Ponens. A maximal L-consistent set (an L-MCS for short) is an L-consistent set which has no proper L-consistent superset. The canonical model of L is a Kripke model ⟨W, R, ⊩⟩, where W is the set of all L-MCS, and the relations R and ⊩ are as follows:
X R Y if and only if for every formula A, if □A ∈ X then A ∈ Y,
X ⊩ A if and only if A ∈ X.
The canonical model is a model of L, as every L-MCS contains all theorems of L. By Zorn's lemma, each L-consistent set is contained in an L-MCS, in particular every formula unprovable in L has a counterexample in the canonical model. The main application of canonical models is completeness proofs. Properties of the canonical model of K immediately imply completeness of K with respect to the class of all Kripke frames. This argument does not work for arbitrary L, because there is no guarantee that the underlying frame of the canonical model satisfies the frame conditions of L. We say that a formula or a set X of formulas is canonical with respect to a property P of Kripke frames, if
X is valid in every frame which satisfies P, and
for any normal modal logic L which contains X, the underlying frame of the canonical model of L satisfies P.
A union of canonical sets of formulas is itself canonical. It follows from the preceding discussion that any logic axiomatized by a canonical set of formulas is Kripke complete, and compact. The axioms T, 4, D, B, 5, H, G (and thus any combination of them) are canonical. GL and Grz are not canonical, because they are not compact. The axiom M by itself is not canonical (Goldblatt, 1991), but the combined logic S4.1 (in fact, even K4.1) is canonical.
In general, it is undecidable whether a given axiom is canonical. We know a nice sufficient condition: H. Sahlqvist identified a broad class of formulas (now called Sahlqvist formulas) such that
a Sahlqvist formula is canonical,
the class of frames corresponding to a Sahlqvist formula is first-order definable, and
there is an algorithm which computes the corresponding frame condition to a given Sahlqvist formula.
This is a powerful criterion: for example, all axioms listed above as canonical are (equivalent to) Sahlqvist formulas.
Multimodal logics
Kripke semantics has a straightforward generalization to logics with more than one modality. A Kripke frame for a language with {□_i : i ∈ I} as the set of its necessity operators consists of a non-empty set W equipped with binary relations R_i for each i ∈ I. The definition of a satisfaction relation is modified as follows: w ⊩ □_i A if and only if u ⊩ A for all u such that w R_i u.

A simplified semantics, discovered by Tim Carlson, is often used for polymodal provability logics. A Carlson model is a structure ⟨W, R, {D_i}_{i ∈ I}, ⊩⟩ with a single accessibility relation R, and subsets D_i ⊆ W for each modality. Satisfaction is defined as: w ⊩ □_i A if and only if u ⊩ A for all u ∈ D_i such that w R u. Carlson models are easier to visualize and to work with than usual polymodal Kripke models; there are, however, Kripke complete polymodal logics which are Carlson incomplete.
Semantics of intuitionistic logic

The negation of A, ¬A, can be defined as an abbreviation for A → ⊥. If, for all u such that w ≤ u, u does not force A, then w ⊩ A → ⊥ is vacuously true, so w ⊩ ¬A. Intuitionistic logic is sound and complete with respect to its Kripke semantics, and it has the finite model property (FMP).
Here e(x→a) is the evaluation which gives x the value a, and otherwise agrees with e. See a slightly different formalization in [1].
Kripke–Joyal semantics
As part of the independent development of sheaf theory, it was realised around 1965 that Kripke semantics was intimately related to the treatment of existential quantification in topos theory.[2] That is, the 'local' aspect of existence for sections of a sheaf was a kind of logic of the 'possible'. Though this development was the work of a number of people, the name Kripke–Joyal semantics is often used in this connection.
Model constructions
As in classical model theory, there are methods for constructing a new Kripke model from other models.

The natural homomorphisms in Kripke semantics are called p-morphisms (which is short for pseudo-epimorphism, but the latter term is rarely used). A p-morphism of Kripke frames ⟨W, R⟩ and ⟨W', R'⟩ is a mapping f: W → W' such that
f preserves the accessibility relation, i.e., u R v implies f(u) R' f(v),
whenever f(u) R' v', there is a v ∈ W such that u R v and f(v) = v'.
A p-morphism of Kripke models ⟨W, R, ⊩⟩ and ⟨W', R', ⊩'⟩ is a p-morphism of their underlying frames f: W → W' which satisfies w ⊩ p if and only if f(w) ⊩' p, for any propositional variable p.

P-morphisms are a special kind of bisimulations. In general, a bisimulation between frames ⟨W, R⟩ and ⟨W', R'⟩ is a relation B ⊆ W × W' which satisfies the following zig-zag property:
if u B u' and u R v, there exists v' ∈ W' such that v B v' and u' R' v',
if u B u' and u' R' v', there exists v ∈ W such that v B v' and u R v.
A bisimulation of models is additionally required to preserve forcing of atomic formulas: if w B w', then w ⊩ p if and only if w' ⊩' p, for any propositional variable p. The key property which follows from this definition is that bisimulations (hence also p-morphisms) of models preserve the satisfaction of all formulas, not only propositional variables.

We can transform a Kripke model into a tree using unravelling. Given a model M = ⟨W, R, ⊩⟩ and a fixed node w0 ∈ W, we define a model M' = ⟨W', R', ⊩'⟩, where W' is the set of all finite sequences s = ⟨w0, w1, …, wn⟩ such that wi R wi+1 for all i < n, and s ⊩ p if and only if wn ⊩ p for a propositional variable p. The definition of the accessibility relation R' varies; in the simplest case we put ⟨w0, w1, …, wn⟩ R' ⟨w0, w1, …, wn, wn+1⟩, but many applications need the reflexive and/or transitive closure of this relation, or similar modifications.

Filtration is a variant of a p-morphism. Let X be a set of formulas closed under taking subformulas. An X-filtration of a model ⟨W, R, ⊩⟩ is a mapping f from W to a model ⟨W', R', ⊩'⟩ such that
f is a surjection,
f preserves the accessibility relation, and (in both directions) satisfaction of variables p ∈ X,
if f(u) R' f(v) and u ⊩ □A, where □A ∈ X, then v ⊩ A.
It follows that f preserves satisfaction of all formulas from X. In typical applications, we take f as the projection onto the quotient of W over the relation u ≡_X v if and only if for all A ∈ X: u ⊩ A if and only if v ⊩ A.
As in the case of unravelling, the definition of the accessibility relation on the quotient varies.
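For finite models the bisimulation conditions can be checked directly. A Python sketch (illustrative names), testing the zig-zag clauses and the preservation of atomic formulas for a candidate relation B:

def is_bisimulation(B, R1, R2, val1, val2):
    # B is a set of pairs (u, u'); R1, R2 are successor maps; val1, val2 send each
    # world to the set of propositional variables true there.
    for u, u2 in B:
        if val1[u] != val2[u2]:                        # atomic harmony
            return False
        for v in R1[u]:                                # zig
            if not any((v, v2) in B for v2 in R2[u2]):
                return False
        for v2 in R2[u2]:                              # zag
            if not any((v, v2) in B for v in R1[u]):
                return False
    return True

# A single reflexive world and a two-world cycle, with p true everywhere, are bisimilar.
R1, val1 = {0: {0}}, {0: {'p'}}
R2, val2 = {0: {1}, 1: {0}}, {0: {'p'}, 1: {'p'}}
print(is_bisimulation({(0, 0), (0, 1)}, R1, R2, val1, val2))   # True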
History and terminology

Arthur Prior, building on unpublished work of C. A. Meredith, developed a translation of sentential modal logic into classical predicate logic that, if he had combined it with the usual model theory for the latter, would have produced a model theory equivalent to Kripke models for the former. But his approach was resolutely syntactic and anti-model-theoretic. Stig Kanger gave a rather more complex approach to the interpretation of modal logic, but one that contains many of the key ideas of Kripke's approach. He first noted the relationship between conditions on accessibility relations and Lewis-style axioms for modal logic. Kanger failed, however, to give a completeness proof for his system. Jaakko Hintikka gave a semantics, in his papers introducing epistemic logic, that is a simple variation of Kripke's semantics, equivalent to the characterisation of valuations by means of maximal consistent sets. He doesn't give inference rules for epistemic logic, and so cannot give a completeness proof. Richard Montague had many of the key ideas contained in Kripke's work, but he did not regard them as significant, because he had no completeness proof, and so did not publish until after Kripke's papers had created a sensation in the logic community. Evert Willem Beth presented a semantics of intuitionistic logic based on trees, which closely resembles Kripke semantics, except for using a more cumbersome definition of satisfaction.

Though the essential ideas of Kripke semantics were very much in the air by the time Kripke first published, Saul Kripke's work on modal logic is rightly regarded as ground-breaking. Most importantly, it was Kripke who proved the completeness theorems for modal logic, and Kripke who identified the weakest normal modal logic.

Despite the seminal contribution of Kripke's work, many modal logicians deprecate the term Kripke semantics as disrespectful of the important contributions these other pioneers made. The other most widely used term, possible world semantics, is deprecated as inappropriate when applied to modalities other than possibility and necessity, such as in epistemic or deontic logic. Instead they prefer the terms relational semantics or frame semantics. The use of "semantics" for "model theory" has been objected to as well, on the grounds that it invites confusion with linguistic semantics: whether the apparatus of "possible worlds" that appears in models has anything to do with the linguistic meaning of modal constructions in natural language is a contentious issue.
Notes
[1] Intuitionistic Logic (http://www.seop.leeds.ac.uk/archives/spr2004/entries/logic-intuitionistic/). Written by Joan Moschovakis (http://www.math.ucla.edu/~joan/). Published in the Stanford Encyclopedia of Philosophy.
[2] Goldblatt, Robert; A Kripke-Joyal Semantics for Noncommutative Logic in Quantales; Advances in Modal Logic; Volume 6; 2006.
References
Blackburn, P., M. de Rijke, and Y. Venema, 2001. Modal Logic. Cambridge University Press.
Bull, Robert A., and K. Segerberg, 1984, "Basic Modal Logic" in The Handbook of Philosophical Logic, vol. 2. Kluwer: 1–88.
Chagrov, A., and Zakharyaschev, M., 1997. Modal Logic. Oxford University Press.
Michael Dummett, 1977. Elements of Intuitionism. Oxford Univ. Press.
Fitting, Melvin, 1969. Intuitionistic Logic, Model Theory and Forcing. North Holland.
Robert Goldblatt (http://www.msor.vuw.ac.nz/~rob/), 2003, "Mathematical Modal Logic: a View of its Evolution" (http://www.msor.vuw.ac.nz/~rob/papers/modalhist.pdf), in Logic & the Modalities in the Twentieth Century, volume 7 of the Handbook of the History of Logic, edited by Dov M. Gabbay and John Woods, Elsevier, 2006, 1–98.
Hughes, G. E., and M. J. Cresswell, 1996. A New Introduction to Modal Logic. Routledge.
Saunders Mac Lane and Moerdijk, I., 1991. Sheaves in Geometry and Logic. Springer-Verlag.
van Dalen, Dirk, 1986, "Intuitionistic Logic" in The Handbook of Philosophical Logic, vol. 3. Reidel: 225–339.
External links
The Stanford Encyclopedia of Philosophy: "Modal Logic" (http://plato.stanford.edu/archives/win2001/entries/logic-modal) by James Garson.
Intuitionistic Logic (http://www.seop.leeds.ac.uk/archives/spr2004/entries/logic-intuitionistic/). Written by Joan Moschovakis (http://www.math.ucla.edu/~joan/). Published in the Stanford Encyclopedia of Philosophy.
Detlovs and Podnieks, K., "Constructive Propositional Logic - Kripke Semantics" (http://www.ltn.lv/~podnieks/mlog/ml4a.htm#s44). Chapter 4.4 of Introduction to Mathematical Logic. N.B.: Constructive = intuitionistic.
Burgess, John P., "Kripke Models" (http://www.princeton.edu/~jburgess/Kripke1.doc).
Forcing (mathematics)
In the mathematical discipline of set theory, forcing is a technique invented by Paul Cohen for proving consistency and independence results. It was first used, in 1963, to prove the independence of the axiom of choice and the continuum hypothesis from Zermelo–Fraenkel set theory. Forcing was considerably reworked and simplified in the 1960s, and has proven to be an extremely powerful technique both within set theory and in areas of mathematical logic such as recursion theory. Descriptive set theory uses both the notion of forcing from recursion theory as well as set theoretic forcing. Forcing has also been used in model theory but it is common in model theory to define genericity directly without mention of forcing.
Intuitions
Forcing is equivalent to the method of Boolean-valued models, which some feel is conceptually more natural and intuitive, but usually much more difficult to apply. Intuitively, forcing consists of expanding the set theoretical universe V to a larger universe V*. In this bigger universe, for example, one might have lots of new subsets of ω = {0, 1, 2, …} that were not there in the old universe, and thereby violate the continuum hypothesis. While impossible on the face of it, this is just another version of Cantor's paradox about infinity. In principle, one could consider V* = V × {0, 1}, identify x with (x, 0), and then introduce an expanded membership relation involving the "new" sets of the form (x, 1). Forcing is a more elaborate version of this idea, reducing the expansion to the existence of one new set, and allowing for fine control over the properties of the expanded universe. Cohen's original technique, now called ramified forcing, is slightly different from the unramified forcing expounded here.
Forcing posets
A forcing poset is an ordered triple (P, ≤, 1) where ≤ is a preorder on P that satisfies the following splitting condition: for all p ∈ P, there are q, r ∈ P such that q, r ≤ p with no s ∈ P such that s ≤ q, r; and 1 is a largest element, that is, p ≤ 1 for all p ∈ P. Members of P are called conditions. One reads p ≤ q as "p is stronger than q". Intuitively, the "smaller" condition provides "more" information, just as the smaller interval [3.1415926, 3.1415927] provides more information about the number π than the interval [3.1, 3.2] does.

(There are various conventions here. Some authors require ≤ to also be antisymmetric, so that the relation is a partial order. Some use the term partial order anyway, conflicting with standard terminology, while some use the term preorder. The largest element can be dispensed with. The reverse ordering is also used, most notably by Saharon Shelah and his co-authors.)

Associated with a forcing poset P are the P-names. P-names are sets of the form {(u, p) : u is a P-name and p ∈ P and (some criterion involving u and p)}. This definition is circular, which in set theory means it is really a definition by transfinite recursion. In long form, one defines
Name(0) = {};
Name(α + 1) = a well-defined subset of the power set of (Name(α) × P);
Name(λ) = ∪{Name(α) : α < λ}, for λ a limit ordinal,
and then defines the class of P-names to be V(P) = ∪{Name(α) : α is an ordinal}. The P-names are, in fact, an expansion of the universe. Given x in V, one defines x̌ to be the P-name {(y̌, 1) : y ∈ x}. Again, this is really a definition by transfinite recursion.

Given any subset G of P, one next defines the interpretation or valuation map from names by val(u, G) = {val(v, G) : ∃p ∈ G, (v, p) ∈ u}. (Again a definition by transfinite recursion.) Note that if 1 is in G, then val(x̌, G) = x. One defines Ġ = {(p̌, p) : p ∈ G}; then val(Ġ, G) = G.

A good example of a forcing poset is (Bor(I), ⊆, I), where I = [0, 1] and Bor(I) are the Borel subsets of I having non-zero Lebesgue measure. In this case, one can talk about the conditions as being probabilities, and a Bor(I)-name assigns membership in a probabilistic sense. Because of the ready intuition this example can provide, probabilistic language is sometimes used with other forcing posets.
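The combinatorics of conditions is easy to experiment with for the poset of finite partial functions from ω to 2 (treated at length in the Cohen forcing section below). A Python sketch with ad-hoc names: conditions are dicts, q is stronger than p when q extends p, and every condition splits into two incompatible extensions.

def stronger(q, p):
    # q <= p in the forcing order: q extends p as a finite partial function.
    return all(n in q and q[n] == v for n, v in p.items())

def compatible(p, q):
    # p and q have a common extension iff they agree wherever both are defined.
    return all(q[n] == v for n, v in p.items() if n in q)

def split(p):
    # The splitting condition: two incompatible extensions of p.
    n = max(p.keys(), default=-1) + 1      # a point outside dom(p)
    return {**p, n: 0}, {**p, n: 1}

p = {0: 1, 3: 0}
q0, q1 = split(p)
print(stronger(q0, p), stronger(q1, p), compatible(q0, q1))    # True True False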
Forcing
Given a generic filter G ⊆ P, one proceeds as follows. The subclass of P-names in M is denoted M(P). Let M[G] = {val(u, G) : u ∈ M(P)}. To reduce the study of the set theory of M[G] to that of M, one works with the forcing language, which is built up like ordinary first-order logic, with membership as binary relation and all the names as constants.

Define p ⊩ φ(u1, …, un) (read "p forces φ in model M with poset P"), where p is a condition, φ is a formula in the forcing language, and the ui are names, to mean that if G is a generic filter containing p, then M[G] ⊨ φ(val(u1, G), …, val(un, G)). The special case 1 ⊩ φ is often written "P ⊩ φ" or "⊩ φ". Such statements are true in M[G] no matter what G is.

What is important is that this "external" definition of the forcing relation p ⊩ φ is equivalent to an "internal" definition, defined by transfinite induction over the names on instances of u ∈ v and u = v, and then by ordinary induction over the complexity of formulas. This has the effect that all the properties of M[G] are really properties of M, and the verification of ZFC in M[G] becomes straightforward. This is usually summarized as three key properties:
Truth: M[G] ⊨ φ(val(u1, G), …, val(un, G)) if and only if it is forced by G, that is, for some condition p ∈ G, p ⊩ φ(u1, …, un).
Definability: The statement "p ⊩ φ(u1, …, un)" is definable in M.
Coherence: If p ⊩ φ(u1, …, un) and q ≤ p, then q ⊩ φ(u1, …, un).
We define the forcing relation ⊩ in V by induction on the complexity of formulas, in which we simultaneously define forcing of atomic formulas by ∈-induction.
1. p ⊩ a ∈ b if for any q ≤ p there is an r ≤ q such that there is (s, c) ∈ b with r ≤ s and r ⊩ a = c.
2. p ⊩ a = b if p ⊩ a ⊆ b and p ⊩ b ⊆ a, where p ⊩ a ⊆ b if for all q ≤ p and for all (r, c) ∈ a, if q ≤ r then q ⊩ c ∈ b.
3. p ⊩ ¬f if there is no q ≤ p such that q ⊩ f.
4. p ⊩ (f ∧ g) if p ⊩ f and p ⊩ g.
5. p ⊩ ∀x f if p ⊩ f(a) for any name a, where f(a) is the result of replacing all free occurrences of x in f by a.

In 1–5, p is an arbitrary condition. In 1 and 2, a and b are arbitrary names, and in 3–5, f and g are arbitrary formulas. This definition provides the possibility of working in V without any countable transitive model M. The following statement gives the announced definability: p ⊩ f in the external sense if and only if the internal statement "p ⊩ f" holds in M.
Consistency
The above can be summarized by saying the fundamental consistency result is that given a forcing poset P, we may assume that there exists a generic filter G, not in the universe V, such that V[G] is again a set theoretic universe, modelling ZFC. Furthermore, all truths in V[G] can be reduced to truths in V regarding the forcing relation. Both styles, adjoining G to a countable transitive model M or to the whole universe V, are commonly used. Less commonly seen is the approach using the "internal" definition of forcing, and no mention of set or class models is made. This was Cohen's original method, and in one elaboration, it becomes the method of Boolean-valued analysis.
Cohen forcing
The simplest nontrivial forcing poset is (Fin(ω, 2), ⊇, 0), the finite partial functions from ω to 2 = {0,1} under reverse inclusion. That is, a condition p is essentially two disjoint finite subsets p⁻¹[1] and p⁻¹[0] of ω, to be thought of as the "yes" and "no" parts of p, with no information provided on values outside the domain of p. "q is stronger than p" means that q ⊇ p, in other words, the "yes" and "no" parts of q are supersets of the "yes" and "no" parts of p, and in that sense, provide more information.
Let G be a generic filter for this poset. If p and q are both in G, then p ∪ q is a condition, because G is a filter. This means that g = ∪G is a well-defined partial function from ω to 2, because any two conditions in G agree on their common domain.
g is in fact a total function. Given n ∈ ω, let Dn = {p : p(n) is defined}; then Dn is dense. (Given any p, if n is not in p's domain, adjoin a value for n; the result is in Dn.) A condition p ∈ G ∩ Dn has n in its domain, and since p ⊆ g, g(n) is defined.
Let X = g⁻¹[1], the set of all "yes" members of the generic conditions. It is possible to give a name for X directly. Let Ẋ = {(ň, p) : p(n) = 1}; then val(Ẋ, G) = X. Now suppose A ∈ V is a subset of ω. We claim that X ≠ A. Let DA = {p : ∃n, n ∈ dom(p) and p(n) = 1 if and only if n ∉ A}. DA is dense. (Given any p, if n is not in p's domain, adjoin a value for n contrary to the status of "n ∈ A".) Then any p ∈ G ∩ DA witnesses X ≠ A. To summarize, X is a new subset of ω, necessarily infinite.
Replacing ω with ω × ω₂, that is, considering instead finite partial functions whose inputs are of the form (n, α), with n < ω and α < ω₂, and whose outputs are 0 or 1, one gets ω₂ new subsets of ω. They are all distinct, by a density argument: given α < β < ω₂, let Dα,β = {p : ∃n, p(n, α) ≠ p(n, β)}; then each Dα,β is dense, and a generic condition in it proves that the αth new set disagrees somewhere with the βth new set.
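The density arguments above all follow one pattern: a given condition is extended at finitely many unused arguments. With the notation of the preceding paragraphs, the dense sets used can be collected as follows; only the one-line verification for Dn is spelled out, the other two being entirely similar.
\[
D_n=\{p : n\in\operatorname{dom}(p)\},\qquad
D_A=\{p : \exists n\,(n\in\operatorname{dom}(p)\ \wedge\ (p(n)=1\leftrightarrow n\notin A))\},\qquad
D_{\alpha,\beta}=\{p : \exists n\,(p(n,\alpha)\neq p(n,\beta))\},
\]
\[
\text{density of }D_n:\quad p\notin D_n\ \Longrightarrow\ p\cup\{(n,0)\}\supseteq p\ \text{ and }\ p\cup\{(n,0)\}\in D_n .
\]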
This is not yet the falsification of the continuum hypothesis. One must prove that no new maps have been introduced which map ω onto ω₁, or ω₁ onto ω₂. For example, if one considers instead Fin(ω, ω₁), finite partial functions from ω to ω₁, the first uncountable ordinal, one gets in V[G] a bijection from ω to ω₁. In other words, ω₁ has collapsed, and in the forcing extension it is a countable ordinal. The last step in showing the independence of the continuum hypothesis, then, is to show that Cohen forcing does not collapse cardinals. For this, a sufficient combinatorial property is that all of the antichains of this poset are countable.
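The collapse just mentioned is itself a density argument, parallel to the one above. With conditions taken from Fin(ω, ω₁), one may use, for instance, the sets
\[
D_n=\{p : n\in\operatorname{dom}(p)\},\qquad
E_\alpha=\{p : \alpha\in\operatorname{ran}(p)\}\qquad(n<\omega,\ \alpha<\omega_1),
\]
each of which is dense, since a finite condition can always be extended at an unused argument; hence the generic function g = ∪G is a total map of ω onto ω₁, so that ω₁ is countable in V[G].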
Easton forcing
The exact value of the continuum in the above Cohen model, and variants like Fin(ω × κ, 2) for cardinals κ in general, was worked out by Robert M. Solovay, who also worked out how to violate GCH (the generalized continuum hypothesis), for regular cardinals only, a finite number of times. For example, in the above Cohen model, if CH holds in V, then 2^ℵ0 = ℵ2 holds in V[G]. W. B. Easton worked out the infinite and proper class version of violating the GCH for regular cardinals, basically showing that the known restrictions (monotonicity, Cantor's theorem, and König's theorem) were the only ZFC-provable restrictions. See Easton's theorem. Easton's work was notable in that it involved forcing with a proper class of conditions. In general, the method of forcing with a proper class of conditions will fail to give a model of ZFC. For example, Fin(ω × On, 2), where "On" is the proper class of all ordinals, will make the continuum a proper class. Fin(ω, On) will introduce a countable enumeration of the ordinals. In both cases, the resulting V[G] is visibly not a model of ZFC. At one time, it was thought that more sophisticated forcing would also allow arbitrary variation in the powers of singular cardinals. But this has turned out to be a difficult, subtle and even surprising problem, with several more restrictions provable in ZFC, and with the forcing models depending on the consistency of various large cardinal properties. Many open problems remain.
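Easton's theorem, referred to above, is usually stated along the following lines; the formulation below (with F a class function on the regular cardinals and GCH assumed in the ground model) is one standard way of making the listed restrictions explicit.
\[
\text{If }\ \kappa\le\lambda\ \Rightarrow\ F(\kappa)\le F(\lambda)\quad\text{and}\quad\operatorname{cf}(F(\kappa))>\kappa\ \text{ for all regular }\kappa,
\]
\[
\text{then there is a cofinality-preserving class forcing extension }V[G]\ \text{with}\ 2^{\kappa}=F(\kappa)\ \text{for every regular }\kappa .
\]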
Random reals
In the Borel sets (Bor(I), ⊆, I) example, the generic filter converges to a real number r, called a random real. A name for the decimal expansion of r (in the sense of the canonical set of decimal intervals that converge to r) can be given by letting ṙ = {(Ě, E) : E = [k⋅10⁻ⁿ, (k+1)⋅10⁻ⁿ], 0 ≤ k < 10ⁿ}. This is, in some sense, just a subname of Ǧ. To recover G from r, one takes those Borel subsets of I that "contain" r. Since the forcing poset is in V, but r is not in V, this containment is actually impossible. But there is a natural sense in which the interval [.5, .6] in V "contains" a random real whose decimal expansion begins .5. This is formalized by the notion of "Borel code". Every Borel set can, non-uniquely, be built up, starting from intervals with rational endpoints and applying the operations of complement and countable union, a countable number of times. The record of such a construction is called a Borel code. Given a Borel set B in V, one recovers a Borel code, and then applies the same construction sequence in V[G], getting a Borel set B*. One can prove that one gets the same set independent of the construction of B, and that basic properties are preserved. For example, if B ⊆ C, then B* ⊆ C*. If B has measure zero, then B* has measure zero. So given r, a random real, one can show that G = {B (in V) : r ∈ B* (in V[G])}. Because of the mutual interdefinability between r and G, one generally writes V[r] for V[G]. A different interpretation of reals in V[G] was provided by Dana Scott. Rational numbers in V[G] have names that correspond to countably many distinct rational values assigned to a maximal antichain of Borel sets, in other words, a certain rational-valued function on I = [0,1]. Real numbers in V[G] then correspond to Dedekind cuts of such functions, that is, measurable functions.
Boolean-valued models
Main article: Boolean-valued model Perhaps more clearly, the method can be explained in terms of Boolean-valued models. In these, any statement is assigned a truth value from some complete atomless Boolean algebra, rather than just a true/false value. Then an ultrafilter is picked in this Boolean algebra, which assigns values true/false to statements of our theory. The point is that the resulting theory has a model which contains this ultrafilter, which can be understood as a new model obtained by extending the old one with this ultrafilter. By picking a Boolean-valued model in an appropriate way, we
can get a model that has the desired property. In it, only statements which must be true (are "forced" to be true) will be true, in a sense (since it has this extension/minimality property).
Meta-mathematical explanation
In forcing we usually seek to show some sentence is consistent with ZFC (or optionally some extension of ZFC). One way to interpret the argument is that we assume ZFC is consistent and use it to prove ZFC combined with our new sentence is also consistent. Each "condition" is a finite piece of information - the idea is that only finite pieces are relevant for consistency, since by the compactness theorem a theory is satisfiable if and only if every finite subset of its axioms is satisfiable. Then, we can pick an infinite set of consistent conditions to extend our model. Thus, assuming consistency of set theory, we prove consistency of the theory extended with this infinite set.
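The compactness theorem appealed to here can be written as a single equivalence, for an arbitrary first-order theory T:
\[
T\ \text{has a model}\iff\text{every finite }T_0\subseteq T\ \text{has a model}.
\]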
Logical explanation
By Gödel's second incompleteness theorem one cannot prove the consistency of ZFC using the ZFC axioms themselves, and consequently the same holds for ZFC + H, for the hypothesis H under consideration. For this reason one proves the consistency of ZFC + H relative to the consistency of ZFC. Such problems are known as problems of relative consistency. In fact one proves
(*)  Con(ZFC) → Con(ZFC + H).
The general schema of relative consistency proofs is as follows. Because any proof is finite, it uses only a finite number of axioms:
ZFC + ¬Con(ZFC + H) ⊢ "there is a finite set T of ZFC axioms such that ¬Con(T + H)".
For any given proof, ZFC can verify the validity of this proof; this is provable by induction on the length of the proof. Suppose now that
(**)  ZFC ⊢ Con(T + H) for every finite set T of ZFC axioms.
Then we obtain
ZFC ⊢ ¬Con(ZFC + H) → ¬Con(ZFC),
which is equivalent to
ZFC ⊢ Con(ZFC) → Con(ZFC + H),
which gives (*). The core of a relative consistency proof is proving (**): one has to construct a ZFC proof of Con(T + H) for any given finite set T of ZFC axioms (by the means of ZFC, of course); there is no single universal proof of Con(T + H). In ZFC it is provable that for any condition p the set of formulas (evaluated by names) forced by p is deductively closed. Moreover, for any ZFC axiom, ZFC proves that this axiom is forced by 1. Then it suffices to prove that there is at least one condition which forces H. In the case of Boolean-valued forcing the procedure is similar: one has to prove that the Boolean value of H is not 0.
Another approach uses the reflection theorem. For any given finite set of ZFC axioms there is a ZFC proof that this set of axioms has a countable transitive model. For any given finite set T of ZFC axioms there is a finite set T' of ZFC axioms such that ZFC proves that if a countable transitive model M satisfies T' then M[G] satisfies T. One has to prove that there is a finite set T" of ZFC axioms such that if a countable transitive model M satisfies T" then M[G] satisfies the hypothesis H under consideration. Then, for any given finite set T of ZFC axioms, ZFC proves Con(T + H).
Sometimes in (**) a theory S stronger than ZFC is used for proving Con(T + H). Then we have a proof of the consistency of ZFC + H relative to the consistency of S. Note that Con(ZFC) ↔ Con(ZFL), where ZFL is ZF + V = L (the axiom of constructibility).
References
Bell, J. L. (1985). Boolean-Valued Models and Independence Proofs in Set Theory. Oxford. ISBN 0-19-853241-5.
Cohen, P. J. (1966). Set Theory and the Continuum Hypothesis. Addison-Wesley. ISBN 0-8053-2327-9.
Grishin, V. N. (2001), "Forcing method" [1], in Hazewinkel, Michiel, Encyclopaedia of Mathematics, Springer, ISBN 978-1556080104.
Kunen, Kenneth (1980). Set Theory: An Introduction to Independence Proofs. North-Holland. ISBN 0-444-85401-0.
External links
Tim Chow's article A Beginner's Guide to Forcing [2] is a good introduction to the concepts of forcing that avoids a lot of technical detail. This paper grew out of Chow's newsgroup article Forcing for dummies [3]. In addition to improved exposition, the Beginner's Guide includes a section on Boolean Valued Models. See also Kenny Easwaran's article A Cheerful Introduction to Forcing and the Continuum Hypothesis [4], which is also aimed at the beginner but includes more technical details than Chow's article. The Independence of the Continuum Hypothesis [5] Paul J. Cohen, Proceedings of the National Academy of Sciences of the United States of America, Vol. 50, No. 6. (Dec. 15, 1963), pp. 11431148. The Independence of the Continuum Hypothesis, II [6] Paul J. Cohen Proceedings of the National Academy of Sciences of the United States of America, Vol. 51, No. 1. (Jan. 15, 1964), pp. 105110.
References
[1] http://eom.springer.de/F/f040770.htm
[2] http://arxiv.org/abs/0712.1320
[3] http://alum.mit.edu/www/tchow/mathstuff/forcingdum
[4] http://arxiv.org/abs/0712.2279
[5] http://dx.doi.org/10.1073/pnas.50.6.1143
[6] http://dx.doi.org/10.1073/pnas.51.1.105
Proof theory
Proof theory is a branch of mathematical logic that represents proofs as formal mathematical objects, facilitating their analysis by mathematical techniques. Proofs are typically presented as inductively-defined data structures such as plain lists, boxed lists, or trees, which are constructed according to the axioms and rules of inference of the logical system. As such, proof theory is syntactic in nature, in contrast to model theory, which is semantic in nature. Together with model theory, axiomatic set theory, and recursion theory, proof theory is one of the so-called four pillars of the foundations of mathematics.[1] Proof theory is important in philosophical logic, where the primary interest is in the idea of a proof-theoretic semantics, an idea which depends upon technical ideas in structural proof theory to be feasible.
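For instance, a derivation that applies modus ponens twice can be pictured as a small tree; the nested-fraction rendering below is meant only to suggest the underlying data structure.
\[
\dfrac{\dfrac{\varphi\qquad\varphi\rightarrow\psi}{\psi}\qquad\psi\rightarrow\chi}{\chi}
\]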
History
Although the formalisation of logic was much advanced by the work of such figures as Gottlob Frege, Giuseppe Peano, Bertrand Russell, and Richard Dedekind, the story of modern proof theory is often seen as being established by David Hilbert, who initiated what is called Hilbert's program in the foundations of mathematics. Kurt Gödel's seminal work on proof theory first advanced, then refuted this program: his completeness theorem initially seemed to bode well for Hilbert's aim of reducing all mathematics to a finitist formal system; then his incompleteness theorems showed that this is unattainable. All of this work was carried out with the proof calculi called the Hilbert systems. In parallel, the foundations of structural proof theory were being founded. Jan Łukasiewicz suggested in 1926 that one could improve on Hilbert systems as a basis for the axiomatic presentation of logic if one allowed the drawing of conclusions from assumptions in the inference rules of the logic. In response to this Stanisław Jaśkowski (1929) and Gerhard Gentzen (1934) independently provided such systems, called calculi of natural deduction, with Gentzen's approach introducing the idea of symmetry between the grounds for asserting propositions, expressed in introduction rules, and the consequences of accepting propositions in the elimination rules, an idea that has proved very important in proof theory.[2] Gentzen (1934) further introduced the idea of the sequent calculus, a calculus advanced in a similar spirit that better expressed the duality of the logical connectives,[3] and went on to make fundamental advances in the formalisation of intuitionistic logic, and provide the first combinatorial proof of the consistency of Peano arithmetic. Together, the presentation of natural deduction and the sequent calculus introduced the fundamental idea of analytic proof to proof theory.
Consistency proofs
As previously mentioned, the spur for the mathematical investigation of proofs in formal theories was Hilbert's program. The central idea of this program was that if we could give finitary proofs of consistency for all the sophisticated formal theories needed by mathematicians, then we could ground these theories by means of a metamathematical argument, which shows that all of their purely universal assertions (more technically their provable sentences) are finitarily true; once so grounded we do not care about the non-finitary meaning of their existential theorems, regarding these as pseudo-meaningful stipulations of the existence of ideal entities. The failure of the program was induced by Kurt Gödel's incompleteness theorems, which showed that any ω-consistent theory that is sufficiently strong to express certain simple arithmetic truths cannot prove its own consistency, which on Gödel's formulation is a Π⁰₁ sentence. Much investigation has been carried out on this topic since, which has in particular led to:
Refinement of Gödel's result, particularly J. Barkley Rosser's refinement, weakening the above requirement of ω-consistency to simple consistency;
Axiomatisation of the core of Gödel's result in terms of a modal language, provability logic;
Transfinite iteration of theories, due to Alan Turing and Solomon Feferman;
The recent discovery of self-verifying theories, systems strong enough to talk about themselves, but too weak to carry out the diagonal argument that is the key to Gödel's unprovability argument.
See also
Mathematical logic
Proof-theoretic semantics
In linguistics, type-logical grammar, categorial grammar and Montague grammar apply formalisms based on structural proof theory to give a formal natural language semantics.
Tableau systems
Analytic tableaux apply the central idea of analytic proof from structural proof theory to provide decision procedures and semi-decision procedures for a wide range of logics.
Ordinal analysis
Ordinal analysis is a powerful technique for providing combinatorial consistency proofs for theories formalising arithmetic and analysis.
Notes
[1] E.g., Wang (1981), pp. 34, and Barwise (1978). [2] Prawitz (1965). [3] Girard, Lafont, and Taylor (1988).
References
J. Avigad, E.H. Reck (2001). Clarifying the nature of the infinite: the development of metamathematics and proof theory (http://www.andrew.cmu.edu/user/avigad/Papers/infinite.pdf). Carnegie-Mellon Technical Report CMU-PHIL-120. J. Barwise (ed., 1978). Handbook of Mathematical Logic. North-Holland. 2ix.com: Logic (http://2piix.com/articles/title/Logic/) Part of a series of articles covering mathematics and logic. A. S. Troelstra, H. Schwichtenberg (1996). Basic Proof Theory. In series Cambridge Tracts in Theoretical Computer Science, Cambridge University Press, ISBN 0-521-77911-1. G. Gentzen (1935/1969). Investigations into logical deduction. In M. E. Szabo, editor, Collected Papers of Gerhard Gentzen. North-Holland. Translated by Szabo from Untersuchungen ber das logische Schliessen, Mathematisches Zeitschrift 39: 176-210, 405-431. Luis Moreno & Bharath Sriraman (2005).Structural Stability and Dynamic Geometry: Some Ideas on Situated Proof. International Reviews on Mathematical Education. Vol. 37, no.3, pp.130139 (http://www.springerlink. com/content/n602313107541846/?p=74ab8879ce75445da488d5744cbc3818&pi=0) J. von Plato (2008). The Development of Proof Theory (http://plato.stanford.edu/entries/ proof-theory-development/). Stanford Encyclopedia of Philosophy. Wang, Hao (1981). Popular Lectures on Mathematical Logic. Van Nostrand Reinhold Company. ISBN0442231091.
Hilbert system
In mathematical physics, Hilbert system is an infrequently used term for a physical system described by a C*-algebra. In logic, especially mathematical logic, a Hilbert system, sometimes called Hilbert calculus or Hilbert-Ackermann system, is a type of system of formal deduction attributed to Gottlob Frege[1] and David Hilbert. These deductive systems are most often studied for first-order logic, but are of interest for other logics as well. Most variants of Hilbert systems take a characteristic tack in the way they balance a trade-off between logical axioms and rules of inference.[1] Hilbert systems can be characterised by the choice of a large number of schemes of logical axioms and a small set of rules of inference. The most commonly studied Hilbert systems have either just one rule of inference (modus ponens, for propositional logics) or two (adding generalisation, to handle predicate logics as well), and several infinite axiom schemes. Hilbert systems for propositional modal logics, sometimes called Hilbert-Lewis systems, are generally axiomatised with two additional rules, the necessitation rule and the uniform substitution rule. A characteristic feature of the many variants of Hilbert systems is that the context is not changed in any of their rules of inference, while both natural deduction and sequent calculus contain some context-changing rules. Thus, if we are interested only in the derivability of tautologies, not in hypothetical judgments, then we can formalize the Hilbert system in such a way that its rules of inference contain only judgments of a rather simple form. The same cannot be done with the other two deduction systems: as context is changed in some of their rules of inference, they cannot be formalized so that hypothetical judgments could be avoided, not even if we want to use them just for proving derivability of tautologies. Systems of natural deduction take the opposite tack, including many deduction rules but very few or no axiom schemes.
Formal deductions
In a Hilbert-style deduction system, a formal deduction is a finite sequence of formulas in which each formula is either an axiom or is obtained from previous formulas by a rule of inference. These formal deductions are meant to mirror natural-language proofs, although they are far more detailed.
Suppose Γ is a set of formulas, considered as hypotheses. For example, Γ could be a set of axioms for group theory or set theory. The notation Γ ⊢ φ means that there is a deduction that ends with φ, using as axioms only logical axioms and elements of Γ. Thus, informally, Γ ⊢ φ means that φ is provable assuming all the formulas in Γ.
Hilbert-style deduction systems are characterized by the use of numerous schemes of logical axioms. An axiom scheme is an infinite set of axioms obtained by substituting all formulas of some form into a specific pattern. The set of logical axioms includes not only those axioms generated from this pattern, but also any generalization of one of those axioms. A generalization of a formula is obtained by prefixing zero or more universal quantifiers on the formula; thus, for example, ∀y ∀x φ is a generalization of ∀x φ.
Logical axioms
There are several variant axiomatisations of predicate logic, since for any logic there is freedom in choosing axioms and rules that characterise that logic. We describe here a Hilbert system with nine axioms and just the rule modus ponens, which we call the one-rule axiomatisation and which describes classical equational logic. We deal with a minimal language for this logic, where formulas use only the connectives ¬ and → and only the quantifier ∀. Later we show how the system can be extended to include additional logical connectives, such as ∧, ∨ and ∃, without enlarging the class of deducible formulas.
The first four logical axiom schemes allow (together with modus ponens) for the manipulation of logical connectives.
P1. φ → φ
P2. φ → (ψ → φ)
P3. (φ → (ψ → ξ)) → ((φ → ψ) → (φ → ξ))
P4. (¬φ → ¬ψ) → (ψ → φ)
The axiom P1 is redundant, as it follows from P3, P2 and modus ponens (a derivation is sketched at the end of this section). These axioms describe classical propositional logic; without axiom P4 we get (minimal) intuitionistic logic. Full intuitionistic logic is achieved by adding instead the axiom P4i for ex falso quodlibet, which is an axiom of classical propositional logic.
P4i. ¬φ → (φ → ψ)
Note that these are axiom schemes, which represent infinitely many specific instances of axioms. For example, P1 might represent the particular axiom instance p → p, or it might represent (p → q) → (p → q): the φ is a place where any formula can be placed. A variable such as this that ranges over formulae is called a 'schematic variable'. With a second rule of uniform substitution (US), we can change each of these axiom schemes into a single axiom, replacing each schematic variable by some propositional variable that isn't mentioned in any axiom to get what we call the substitutional axiomatisation. Both formalisations have variables, but where the one-rule axiomatisation has schematic variables that are outside the logic's language, the substitutional axiomatisation uses propositional variables that do the same work by expressing the idea of a variable ranging over formulae with a rule that uses substitution.
US. Let φ(p) be a formula with one or more instances of the propositional variable p, and let ψ be another formula. Then from φ(p), infer φ(ψ).
The next three logical axiom schemes provide ways to add, manipulate, and remove universal quantifiers.
Q5. ∀x φ → φ[x := t], where the term t may be substituted for x in φ
Q6. ∀x (φ → ψ) → (∀x φ → ∀x ψ)
Q7. φ → ∀x φ, where x is not a free variable of φ
These three additional schemes extend the propositional system to axiomatise classical predicate logic. Likewise, they extend the system for intuitionistic propositional logic (with P1-3 and P4i) to intuitionistic predicate logic. Universal quantification is often given an alternative axiomatisation using an extra rule of generalisation (see the section on Metatheorems), in which case the schemes Q5 and Q6 are redundant.
The final axiom schemes are required to work with formulas involving the equality symbol.
I8. x = x, for every variable x
I9. (x = y) → (φ[z := x] → φ[z := y]), for variables x, y, z and any formula φ
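The redundancy of P1 noted above can be checked by a short derivation; fixing a formula φ, one possible five-line proof (a routine exercise, shown only to make the claim concrete) is:
\begin{align*}
1.\ & (\varphi\to((\varphi\to\varphi)\to\varphi))\to((\varphi\to(\varphi\to\varphi))\to(\varphi\to\varphi)) && \text{P3}\\
2.\ & \varphi\to((\varphi\to\varphi)\to\varphi) && \text{P2}\\
3.\ & (\varphi\to(\varphi\to\varphi))\to(\varphi\to\varphi) && \text{modus ponens, 1, 2}\\
4.\ & \varphi\to(\varphi\to\varphi) && \text{P2}\\
5.\ & \varphi\to\varphi && \text{modus ponens, 3, 4}
\end{align*}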
Conservative extensions
It is common to include in a Hilbert-style deduction system only axioms for implication and negation. Given these axioms, it is possible to form conservative extensions of the deduction theorem that permit the use of additional connectives. These extensions are called conservative because if a formula φ involving new connectives is rewritten as a logically equivalent formula ψ involving only negation, implication, and universal quantification, then φ is derivable in the extended system if and only if ψ is derivable in the original system. When fully extended, a Hilbert-style system will resemble more closely a system of natural deduction.
Existential quantification
Introduction
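A standard presentation, and only one of several equivalent choices, defines the existential quantifier from the universal one and adds a corresponding introduction axiom and elimination rule:
\[
\exists x\,\varphi\ :=\ \neg\forall x\,\neg\varphi,
\qquad\qquad
\varphi[x:=t]\to\exists x\,\varphi\quad(\text{introduction, }t\text{ substitutable for }x\text{ in }\varphi),
\]
\[
\text{from }\ \varphi\to\psi\ \text{ infer }\ \exists x\,\varphi\to\psi,\quad\text{provided }x\text{ is not free in }\psi\quad(\text{elimination}).
\]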
Metatheorems
Because Hilbert-style systems have very few deduction rules, it is common to prove metatheorems that show that additional deduction rules add no deductive power, in the sense that a deduction using the new deduction rules can be converted into a deduction using only the original deduction rules. Some common metatheorems of this form are:
The deduction theorem: Γ, φ ⊢ ψ if and only if Γ ⊢ φ → ψ.
Γ ⊢ φ ↔ ψ if and only if Γ ⊢ φ → ψ and Γ ⊢ ψ → φ.
Contraposition: If Γ, φ ⊢ ψ then Γ, ¬ψ ⊢ ¬φ.
Generalization: If Γ ⊢ φ and x does not occur free in any formula of Γ, then Γ ⊢ ∀x φ.
Alternative axiomatizations
The axiom 3 above is credited to Łukasiewicz.[2] The original system by Frege had axioms P2 and P3 but four other axioms instead of axiom P4 (see Frege's propositional calculus). Russell and Whitehead also suggested a system with five propositional axioms.
Further connections
Axioms P1, P2 and P3, with the deduction rule modus ponens (formalising intuitionistic propositional logic), correspond to combinatory logic base combinators I, K and S with the application operator. Proofs in the Hilbert system then correspond to combinator terms in combinatory logic. See also Curry-Howard correspondence.
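Concretely, under this correspondence the three axioms type the standard combinators; the lambda terms below, with I obtained from S and K, are one common way of making the correspondence explicit.
\[
\mathbf{K}=\lambda x.\lambda y.\,x\ :\ \varphi\to(\psi\to\varphi),
\qquad
\mathbf{S}=\lambda x.\lambda y.\lambda z.\,x\,z\,(y\,z)\ :\ (\varphi\to(\psi\to\xi))\to((\varphi\to\psi)\to(\varphi\to\xi)),
\]
\[
\mathbf{I}=\mathbf{S}\,\mathbf{K}\,\mathbf{K}\ :\ \varphi\to\varphi .
\]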
Notes
[1] Máté & Ruzsa 1997:129
[2] A. Tarski, Logic, semantics, metamathematics, Oxford, 1956
References
Curry, Haskell B.; Robert Feys (1958). Combinatory Logic Vol. I. 1. Amsterdam: North Holland. Monk, J. Donald (1976). Mathematical Logic. Graduate Texts in Mathematics. Berlin, New York: Springer-Verlag. ISBN978-0-387-90170-1. Ruzsa, Imre; Mt, Andrs (1997) (in Hungarian). Bevezets a modern logikba. Budapest: Osiris Kiad. Tarski, Alfred (1990) (in Hungarian). Bizonyts s igazsg. Budapest: Gondolat. It is a Hungarian translation of Alfred Tarski's selected papers on semantic theory of truth. David Hilbert (1927) "The foundations of mathematics", translated by Stephan Bauer-Menglerberg and Dagfinn Fllesdal (pp.464479). in: van Heijenoort, Jean (1967, 3rd printing 1976). From Frege to Gdel: A Source Book in Mathematical Logic, 18791931. Cambrdige MA: Harvard University Press. ISBN0-674-32449-8 (pbk.). Hilbert's 1927, Based on an earlier 1925 "foundations" lecture (pp. 367392), presents his 17 axioms -axioms of implication #1-4, axioms about & and V #5-10, axioms of negation #11-12, his logical -axiom #13, axioms of equality #14-15, and axioms of number #16-17 -- along with the other necessary elements of his Formalist "proof theory" -- e.g. induction axioms, recursion axioms, etc; he also offers up a spirited defense against L.E.J. Brouwer's Intuitionism. Also see Hermann Weyl's (1927) comments and rebuttal (pp. 480484), Paul Bernay's (1927) appendix to Hilbert's lecture (pp. 485489) and Luitzen Egbertus Jan Brouwer's (1927) response (pp. 490495) Kleene, Stephen Cole (1952, 10th impression with 1971 corrections). Introduction to Metamathematics. Amsterdam NY: North Holland Publishing Company. ISBN0 7204 2103 9. See in particular Chapter IV Formal System (pp. 6985) wherein Kleene presents subchapters 16 Formal symbols, 17 Formation rules, 18 Free and bound variables (including substitution), 19 Transformation rules (e.g. modus ponens) -- and from these he presents 21 "postulates" -- 18 axioms and 3 "immediate-consequence" relations divided as follows: Postulates for the propostional calculus #1-8, Additional postulates for the predicate calculus #9-12, and Additional postulates for number theory #13-21.
External links
Farmer, W. M. "Propositional logic" (http:/ / imps. mcmaster. ca/ courses/ SE-2F03-05/ slides/ 02-prop-logic. pdf) (pdf). It describes (among others) a part of the Hilbert-style deduction system (restricted to propositional calculus).
Natural deduction
In logic and proof theory, natural deduction is a kind of proof calculus in which logical reasoning is expressed by inference rules closely related to the "natural" way of reasoning. This contrasts with the axiomatic systems which instead use axioms as much as possible to express the logical laws of deductive reasoning.
Motivation
Natural deduction grew out of a context of dissatisfaction with the axiomatizations of deductive reasoning common to the systems of Hilbert, Frege, and Russell (see, e.g., Hilbert system). Such axiomatizations were most famously used by Russell and Whitehead in their mathematical treatise Principia Mathematica. Spurred on by a series of seminars in Poland in 1926 by Łukasiewicz that advocated a more natural treatment of logic, Jaśkowski made the earliest attempts at defining a more natural deduction, first in 1929 using a diagrammatic notation, and later updating his proposal in a sequence of papers in 1934 and 1935. His proposals led to different notations such as Fitch-style calculus (or Fitch's diagrams) or Suppes' method of which e.g. Lemmon gave a variant called system L. Natural deduction in its modern form was independently proposed by the German mathematician Gentzen in 1935, in a dissertation delivered to the faculty of mathematical sciences of the University of Göttingen. The term natural deduction (or rather, its German equivalent natürliches Schließen) was coined in that paper: Ich wollte zunächst einmal einen Formalismus aufstellen, der dem wirklichen Schließen möglichst nahe kommt. So ergab sich ein "Kalkül des natürlichen Schließens". (First I wished to construct a formalism that comes as close as possible to actual reasoning. Thus arose a "calculus of natural deduction".) Gentzen, Untersuchungen über das logische Schließen (Mathematische Zeitschrift 39, pp. 176-210, 1935) Gentzen was motivated by a desire to establish the consistency of number theory. He was unable to prove the main result required for the consistency result, the cut elimination theorem (the Hauptsatz), directly for natural deduction. For this reason he introduced his alternative system, the sequent calculus, for which he proved the Hauptsatz both for classical and intuitionistic logic. In a series of seminars in 1961 and 1962 Prawitz gave a comprehensive summary of natural deduction calculi, and transported much of Gentzen's work with sequent calculi into the natural deduction framework. His 1965 monograph Natural deduction: a proof-theoretical study was to become a reference work on natural deduction, and included applications for modal and second-order logic. In natural deduction, a proposition is deduced from a collection of premises by applying inference rules repeatedly. The system presented in this article is a minor variation of Gentzen's or Prawitz's formulation, but with a closer adherence to Martin-Löf's description of logical judgments and connectives (Martin-Löf, 1996).
To start with, we shall concern ourselves with the simplest two judgments "A is a proposition" and "A is true", abbreviated as "A prop" and "A true" respectively. The judgment "A prop" defines the structure of valid proofs of A, which in turn defines the structure of propositions. For this reason, the inference rules for this judgment are sometimes known as formation rules. To illustrate, if we have two propositions A and B (that is, the judgments "A prop" and "B prop" are evident), then we form the compound proposition A and B, written symbolically as "A ∧ B". We can write this in the form of an inference rule:
This inference rule is schematic: A and B can be instantiated with any expression. The general form of an inference rule is:
where each of the Ji and J is a judgment and the inference rule is named "name". The judgments above the line are known as premises, and those below the line are conclusions. Other common logical propositions are disjunction (A ∨ B), implication (A ⊃ B), and the logical constants truth (⊤) and falsehood (⊥). Their formation rules are below.
It must be understood that in such rules the objects are propositions. That is, the above rule is really an abbreviation for:
form. In this article we shall elide the "prop" judgments where they are understood. In the nullary case, one can derive truth from no premises.
If the truth of a proposition can be established in more than one way, the corresponding connective has multiple introduction rules.
Note that in the nullary case, i.e., for falsehood, there are no introduction rules. Thus one can never infer falsehood from simpler judgments.
Natural deduction Dual to introduction rules are elimination rules to describe how to de-construct information about a compound proposition into information about its constituents. Thus, from "A B true", we can conclude "A true" and "B true":
As an example of the use of inference rules, consider commutativity of conjunction. If A ∧ B is true, then B ∧ A is true. This derivation can be drawn by composing inference rules in such a fashion that premises of a lower inference match the conclusion of the next higher inference.
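One way to draw it, with the two elimination rules feeding the introduction rule (nested fractions standing in for the usual two-dimensional figure), is:
\[
\frac{\dfrac{A\wedge B\ \text{true}}{B\ \text{true}}\ \wedge E_2
\qquad
\dfrac{A\wedge B\ \text{true}}{A\ \text{true}}\ \wedge E_1}
{B\wedge A\ \text{true}}\ \wedge I
\]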
The inference figures we have seen so far are not sufficient to state the rules of implication introduction or disjunction elimination; for these, we need a more general notion of hypothetical derivation.
Hypothetical derivations
A pervasive operation in mathematical logic is reasoning from assumptions. For example, consider the following derivation:
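One derivation of this kind, using the conjunction elimination rules introduced above (reconstructed here in nested-fraction form), is:
\[
\frac{\dfrac{A\wedge(B\wedge C)\ \text{true}}{B\wedge C\ \text{true}}\ \wedge E_2}{B\ \text{true}}\ \wedge E_1
\]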
This derivation does not establish the truth of B as such; rather, it establishes the following fact: If A ∧ (B ∧ C) is true then B is true. In logic, one says "assuming A ∧ (B ∧ C) is true, we show that B is true"; in other words, the judgement "B true" depends on the assumed judgement "A ∧ (B ∧ C) true". This is a hypothetical derivation, which we write as follows:
The interpretation is: "B true is derivable from A ∧ (B ∧ C) true". Of course, in this specific example we actually know the derivation of "B true" from "A ∧ (B ∧ C) true", but in general we may not a priori know the derivation. The general form of a hypothetical derivation is:
Each hypothetical derivation has a collection of antecedent derivations (the Di) written on the top line, and a succedent judgement (J) written on the bottom line. Each of the premises may itself be a hypothetical derivation. (For simplicity, we treat a judgement as a premise-less derivation.) The notion of hypothetical judgement is internalised as the connective of implication. The introduction and elimination rules are as follows.
In the introduction rule, the antecedent named u is discharged in the conclusion. This is a mechanism for delimiting the scope of the hypothesis: its sole reason for existence is to establish "B true"; it cannot be used for any other purpose, and in particular, it cannot be used below the introduction. As an example, consider the derivation of "A ⊃ (B ⊃ (A ∧ B)) true":
This full derivation has no unsatisfied premises; however, sub-derivations are hypothetical. For instance, the derivation of "B (A B) true" is hypothetical with antecedent "A true" (named u). With hypothetical derivations, we can now write the elimination rule for disjunction:
In words, if A ∨ B is true, and we can derive C true both from A true and from B true, then C is indeed true. Note that this rule does not commit to either A true or B true. In the zero-ary case, i.e. for falsehood, we obtain the following elimination rule:
This is read as: if falsehood is true, then any proposition C is true. Negation is similar to implication.
The introduction rule discharges both the name of the hypothesis u, and the succedent p, i.e., the proposition p must not occur in the conclusion A. Since these rules are schematic, the interpretation of the introduction rule is: if from "A true" we can derive for every proposition p that "p true", then A must be false, i.e., "not A true". For the elimination, if both A and not A are shown to be true, then there is a contradiction, in which case every proposition C is true. Because the rules for implication and negation are so similar, it should be fairly easy to see that not A and A are equivalent, i.e., each is derivable from the other.
Natural deduction
Dually, local completeness says that the elimination rules are strong enough to decompose a connective into the forms suitable for its introduction rule. Again for conjunctions:
---------- u A B true ---------- u ---------- u A B true A B true ---------- E1 ---------- E2 A true B true ----------------------- I A B true
These notions correspond exactly to -reduction (beta reduction) and -conversion (eta conversion) in the lambda calculus, using the CurryHoward isomorphism. By local completeness, we see that every derivation can be converted to an equivalent derivation where the principal connective is introduced. In fact, if the entire derivation obeys this ordering of eliminations followed by introductions, then it is said to be normal. In a normal derivation all eliminations happen above introductions. In most logics, every derivation has an equivalent normal derivation, called a normal form. The existence of normal forms is generally hard to prove using natural deduction alone, though such accounts do exist in the literature, most notably by Dag Prawitz in 1961; see his book Natural deduction: a proof-theoretical study, A&W Stockholm 1965, no ISBN. It is much easier to show this indirectly by means of a cut-free sequent calculus presentation.
f F t1 term t2 term ... tn term ------------------------------------------ app-F f (t1, t2, ..., tn) term
For propositions, we consider a third countable set P of predicates, and define atomic predicates over terms with the following formation rule:
P t1 term t2 term ... tn term ------------------------------------------ pred-F (t1, t2, ..., tn) prop
Natural deduction In addition, we add a pair of quantified propositions: universal () and existential ():
------ u x term A prop ---------- Fu x. A prop ------ u x term A prop ---------- Fu x. A prop
These quantified propositions have the following introduction and elimination rules.
------ u a term [a/x] A true ------------ Iu, x. A true [t/x] A true ------------ I x. A true x. A true t term -------------------- E [t/x] A true
------ u ------------ v a term [a/x] A true x. A true C true -------------------------- Ea, u,v C true
In these rules, the notation [t/x] A stands for the substitution of t for every (visible) instance of x in A, avoiding capture; see the article on lambda calculus for more detail about this standard operation. As before the superscripts on the name stand for the components that are discharged: the term a cannot occur in the conclusion of I (such terms are known as eigenvariables or parameters), and the hypotheses named u and v in E are localised to the second premise in a hypothetical derivation. Although the propositional logic of earlier sections was decidable, adding the quantifiers makes the logic undecidable. So far the quantified extensions are first-order: they distinguish propositions from the kinds of objects quantified over. Higher-order logic takes a different approach and has only a single sort of propositions. The quantifiers have as the domain of quantification the very same sort of propositions, as reflected in the formation rules:
------ u p prop A prop ---------- Fu p. A prop ------ u p prop A prop ---------- Fu p. A prop
A discussion of the introduction and elimination forms for higher-order logic is beyond the scope of this article. It is possible to be in between first-order and higher-order logics. For example, second-order logic has two kinds of propositions, one kind quantifying over terms, and the second kind quantifying over propositions of the first kind.
Sequential presentations
Jakowski's representations of natural deduction led to different notations such as Fitch-style calculus (or Fitch's diagrams) or Suppes' method of which e.g. Lemmon gave a variant called system L.
The collection of hypotheses will be written as when their exact composition is not relevant. To make proofs explicit, we move from the proof-less judgement "A true" to a judgement: " is a proof of (A true)", which is written symbolically as " : A true". Following the standard approach, proofs are specified with their own formation rules for the judgement " proof". The simplest possible proof is the use of a labelled hypothesis; in this case the evidence is the label itself.
u V ------- proof-F u proof --------------------- hyp u:A true u : A true
For brevity, we shall leave off the judgemental label true in the rest of this article, i.e., write " : A". Let us re-examine some of the connectives with explicit proofs. For conjunction, we look at the introduction rule I to discover the form of proofs of conjunction: they must be a pair of proofs of the two conjuncts. Thus:
1 proof 2 proof -------------------- pair-F (1, 2) proof 1 : A 2 : B ------------------------ I (1, 2) : A B
The elimination rules E1 and E2 select either the left or the right conjunct; thus the proofs are a pair of projections first (fst) and second (snd).
proof ----------- fst-F fst proof proof ----------- snd-F snd proof : A B ------------- E1 fst : A : A B ------------- E2 snd : B
Natural deduction For implication, the introduction form localises or binds the hypothesis, written using a ; this corresponds to the discharged label. In the rule, ", u:A" stands for the collection of hypotheses , together with the additional hypothesis u.
proof ------------ -F u. proof 1 proof 2 proof ------------------- app-F 1 2 proof , u:A : B ----------------- I u. : A B 1 : A B 2 : A ---------------------------- E 1 2 : B
With proofs available explicitly, one can manipulate and reason about proofs. The key operation on proofs is the substitution of one proof for an assumption used in another proof. This is commonly known as a substitution theorem, and can be proved by induction on the depth (or structure) of the second judgement. Substitution theorem If 1 : A and , u:A 2 : B, then [1/u] 2 : B. So far the judgement " : A" has had a purely logical interpretation. In type theory, the logical view is exchanged for a more computational view of objects. Propositions in the logical interpretation are now viewed as types, and proofs as programs in the lambda calculus. Thus the interpretation of " : A" is "the program has type A". The logical connectives are also given a different reading: conjunction is viewed as product (), implication as the function arrow (), etc. The differences are only cosmetic, however. Type theory has a natural deduction presentation in terms of formation, introduction and elimination rules; in fact, the reader can easily reconstruct what is known as simple type theory from the previous sections. The difference between logic and type theory is primarily a shift of focus from the types (propositions) to the programs (proofs). Type theory is chiefly interested in the convertibility or reducibility of programs. For every type, there are canonical programs of that type which are irreducible; these are known as canonical forms or values. If every program can be reduced to a canonical form, then the type theory is said to be normalising (or weakly normalising). If the canonical form is unique, then the theory is said to be strongly normalising. Normalisability is a rare feature of most non-trivial type theories, which is a big departure from the logical world. (Recall that every logical derivation has an equivalent normal derivation.) To sketch the reason: in type theories that admit recursive definitions, it is possible to write programs that never reduce to a value; such looping programs can generally be given any type. In particular, the looping program has type , although there is no logical proof of " true". For this reason, the propositions as types; proofs as programs paradigm only works in one direction, if at all: interpreting a type theory as a logic generally gives an inconsistent logic. Like logic, type theory has many extensions and variants, including first-order and higher-order versions. An interesting branch of type theory, known as dependent type theory, allows quantifiers to range over programs themselves. These quantified types are written as and instead of and , and have the following formation rules:
A type , x:A B type ----------------------------- -F x:A. B type A type , x:A B type ---------------------------- -F x:A. B type
These types are generalisations of the arrow and product types, respectively, as witnessed by their introduction and elimination rules.
, x:A : B -------------------- I x. : x:A. B 1 : x:A. B 2 : A ----------------------------- E 1 2 : [2/x] B
Dependent type theory in full generality is very powerful: it is able to express almost any conceivable property of programs directly in the types of the program. This generality comes at a steep price checking that a given program is of a given type is undecidable. For this reason, dependent type theories in practice do not allow quantification over arbitrary programs, but rather restrict to programs of a given decidable index domain, for example integers, strings, or linear programs. Since dependent type theories allow types to depend on programs, a natural question to ask is whether it is possible for programs to depend on types, or any other combination. There are many kinds of answers to such questions. A popular approach in type theory is to allow programs to be quantified over types, also known as parametric polymorphism; of this there are two main kinds: if types and programs are kept separate, then one obtains a somewhat more well-behaved system called predicative polymorphism; if the distinction between program and type is blurred, one obtains the type-theoretic analogue of higher-order logic, also known as impredicative polymorphism. Various combinations of dependency and polymorphism have been considered in the literature, the most famous being the lambda cube of Henk Barendregt. The intersection of logic and type theory is a vast and active research area. New logics are usually formalised in a general type theoretic setting, known as a logical framework. Popular modern logical frameworks such as the calculus of constructions and LF are based on higher-order dependent type theory, with various trade-offs in terms of decidability and expressive power. These logical frameworks are themselves always specified as natural deduction systems, which is a testament to the versatility of the natural deduction approach.
(XM3 is merely XM2 expressed in terms of E.) This treatment of excluded middle, in addition to being objectionable from a purist's standpoint, introduces additional complications in the definition of normal forms. A comparatively more satisfactory treatment of classical natural deduction in terms of introduction and elimination rules alone was first proposed by Parigot in 1992 in the form of a classical lambda calculus called . The key insight of his approach was to replace a truth-centric judgement A true with a more classical notion, reminiscent of the sequent calculus: in localised form, instead of A, he used , with a collection of propositions similar to . was treated as a conjunction, and as a disjunction. This structure is essentially lifted directly from classical sequent calculi, but the innovation in was to give a computational meaning to classical natural deduction proofs in terms of a callcc or a throw/catch mechanism seen in LISP and its descendants. (See also: first class control.)
Natural deduction Another important extension was for modal and other logics that need more than just the basic judgement of truth. These were first described, for the alethic modal logics S4 and S5, in a natural deduction style by Prawitz in 1965, and have since accumulated a large body of related work. To give a simple example, the modal logic S4 requires one new judgement, "A valid", that is categorical with respect to truth: If "A true" under no assumptions of the form "B true", then "A valid". This categorical judgement is internalised as a unary connective A (read "necessarily A") with the following introduction and elimination rules:
A valid -------- I A true A true -------- E A true
Note that the premise "A valid" has no defining rules; instead, the categorical definition of validity is used in its place. This mode becomes clearer in the localised form when the hypotheses are explicit. We write "; A true" where contains the true hypotheses as before, and contains valid hypotheses. On the right there is just a single judgement "A true"; validity is not needed here since " A valid" is by definition the same as "; A true". The introduction and elimination forms are then:
; : A true -------------------- I ; box : A true ; : A true ---------------------- E ; unbox : A true
The modal hypotheses have their own version of the hypothesis rule and substitution theorem.
------------------------------- valid-hyp , u: (A valid) ; u : A true
Modal substitution theorem If ; 1 : A true and , u: (A valid) ; 2 : C true, then ; [1/u] 2 : C true. This framework of separating judgements into distinct collections of hypotheses, also known as multi-zoned or polyadic contexts, is very powerful and extensible; it has been applied for many different modal logics, and also for linear and other substructural logics, to give a few examples. However, relatively few systems of modal logic can be formalised directly in natural deduction. To give proof-theoretic characterisations of these systems, extensions such as labelling or systems of deep inference. The addition of labels to formulae permits much finer control of the conditions under which rules apply, allowing the more flexible techniques of analytic tableaux to be applied, as has been done in the case of labelled deduction. Labels also allow the naming of worlds in Kripke semantics; Simpson (1993) presents an influential technique for converting frame conditions of modal logics in Kripke semantics into inference rules in a natural deduction formalisation of hybrid logic. Stouppa (2004) surveys the application of many proof theories, such as Avron and Pottinger's hypersequents and Belnap's display logic to such modal logics as S5 and B.
Natural deduction though he initially intended it as a technical device for clarifying the consistency of predicate logic. Kleene, in his seminal 1952 book Introduction to Metamathematics (ISBN 0-7204-2103-9), gave the first formulation of the sequent calculus in the modern style. In the sequent calculus all inference rules have a purely bottom-up reading. Inference rules can apply to elements on both sides of the turnstile. (To differentiate from natural deduction, this article uses a double arrow instead of the right tack for sequents.) The introduction rules of natural deduction are viewed as right rules in the sequent calculus, and are structurally very similar. The elimination rules on the other hand turn into left rules in the sequent calculus. To give an example, consider disjunction; the right rules are familiar:
A --------- R1 A B B --------- R2 A B
On the left:
, u:A C , v:B C --------------------------- L , w: (A B) C
The proposition A B, which is the succedent of a premise in E, turns into a hypothesis of the conclusion in the left rule L. Thus, left rules can be seen as a sort of inverted elimination rule. This observation can be illustrated as follows:
natural deduction ------ hyp | | elim. rules | ---------------------- meet | | intro. rules | conclusion sequent calculus --------------------------- init | | | left rules | right rules | | conclusion
In the sequent calculus, the left and right rules are performed in lock-step until one reaches the initial sequent, which corresponds to the meeting point of elimination and introduction rules in natural deduction. These initial rules are superficially similar to the hypothesis rule of natural deduction, but in the sequent calculus they describe a transposition or a handshake of a left and a right proposition:
---------- init , u:A A
The correspondence between the sequent calculus and natural deduction is a pair of soundness and completeness theorems, which are both provable by means of an inductive argument.
Soundness of ⇒ with respect to ⊢: if Γ ⇒ A, then Γ ⊢ A.
Completeness of ⇒ with respect to ⊢: if Γ ⊢ A, then Γ ⇒ A.
It is clear by these theorems that the sequent calculus does not change the notion of truth, because the same collection of propositions remains true. Thus, one can use the same proof objects as before in sequent calculus derivations. As an example, consider the conjunctions. The right rule is virtually identical to the introduction rule
sequent calculus 1 : A 2 : B --------------------------- R (1, 2) : A B natural deduction 1 : A 2 : B ------------------------- I (1, 2) : A B
The left rule, however, performs some additional substitutions that are not performed in the corresponding elimination rules.
sequent calculus , v: (A B), u:A : C -------------------------------- L1 , v: (A B) [fst v/u] : C , v: (A B), u:B : C -------------------------------- L2 , v: (A B) [snd v/u] : C natural deduction : A B ------------- E1 fst : A : A B ------------- E2 snd : B
The kinds of proofs generated in the sequent calculus are therefore rather different from those of natural deduction. The sequent calculus produces proofs in what is known as the -normal -long form, which corresponds to a canonical representation of the normal form of the natural deduction proof. If one attempts to describe these proofs using natural deduction itself, one obtains what is called the intercalation calculus (first described by John Byrnes [3]), which can be used to formally define the notion of a normal form for natural deduction. The substitution theorem of natural deduction takes the form of a structural rule or structural theorem known as cut in the sequent calculus. Cut (substitution) If 1 : A and , u:A 2 : C, then [1/u] 2 : C. In most well behaved logics, cut is unnecessary as an inference rule, though it remains provable as a meta-theorem; the superfluousness of the cut rule is usually presented as a computational process, known as cut elimination. This has an interesting application for natural deduction; usually it is extremely tedious to prove certain properties directly in natural deduction because of an unbounded number of cases. For example, consider showing that a given proposition is not provable in natural deduction. A simple inductive argument fails because of rules like E or E which can introduce arbitrary propositions. However, we know that the sequent calculus is complete with respect to natural deduction, so it is enough to show this unprovability in the sequent calculus. Now, if cut is not available as an inference rule, then all sequent rules either introduce a connective on the right or the left, so the depth of a sequent derivation is fully bounded by the connectives in the final conclusion. Thus, showing unprovability is much easier, because there are only a finite number of cases to consider, and each case is composed entirely of sub-propositions of the conclusion. A simple instance of this is the global consistency theorem: " true" is not provable. In the sequent calculus version, this is manifestly true because there is no rule that can have " " as a conclusion! Proof theorists often prefer to work on cut-free sequent calculus formulations because of such properties.
Notes
References
Historical references
Stanisław Jaśkowski, 1934. On the Rules of Suppositions in Formal Logic.
Gerhard Gentzen, 1934/5. Untersuchungen über das logische Schließen (English translation Investigations into Logical Deduction in Szabo).
Other references
Frank Pfenning and Rowan Davies (2001). "A judgmental reconstruction of modal logic" (http://www-2.cs. cmu.edu/~fp/papers/mscs00.pdf). Mathematical Structures in Computer Science 11 (4): 511540. doi:10.1017/S0960129501003322. Alex Simpson, 1993. The Proof Theory and Semantics of Intuitionistic Modal Logic. PhD thesis, University of Edinburgh. Phiniki Stouppa, 2004. The Design of Modal Proof Theories: The Case of S5. MSc thesis, University of Dresden.
External links
Clemente, Daniel, "Introduction to natural deduction" (http://www.danielclemente.com/logica/dn.en.pdf).
Domino On Acid (http://www.winterdrache.de/freeware/domino/). Natural deduction visualized as a game of dominoes.
Pelletier, Jeff, "A History of Natural Deduction and Elementary Logic Textbooks" (http://www.sfu.ca/~jeffpell/papers/pelletierNDtexts.pdf).
Levy, Michel, A Propositional Prover (http://teachinglogic.liglab.fr/DN/).
Sequent calculus
In proof theory and mathematical logic, sequent calculus is a family of formal systems sharing a certain style of inference and certain formal properties. The first sequent calculi, systems LK and LJ, were introduced by Gerhard Gentzen in 1934 as a tool for studying natural deduction in first-order logic (in classical and intuitionistic versions, respectively). Gentzen's so-called "Main Theorem" (Hauptsatz) about LK and LJ was the cut-elimination theorem, a result with far-reaching meta-theoretic consequences, including consistency. Gentzen further demonstrated the power and flexibility of this technique a few years later, applying a cut-elimination argument to give a (transfinite) proof of the consistency of Peano arithmetic, in surprising response to Gödel's incompleteness theorems. Since this early work, sequent calculi (also called Gentzen systems) and the general concepts relating to them have been widely applied in the fields of proof theory, mathematical logic, and automated deduction.
Introduction
One way to classify different styles of deduction systems is to look at the form of judgments in the system, i.e., which things may appear as the conclusion of a (sub)proof. The simplest judgment form is used in Hilbert-style deduction systems, where a judgment has the form
⊢ A
where A is any formula of first-order logic (or whatever logic the deduction system applies to, e.g., propositional calculus or a higher-order logic or a modal logic). The theorems are those formulae that appear as the concluding judgment in a valid proof. A Hilbert-style system needs no distinction between formulae and judgments; we make one here solely for comparison with the cases that follow. The price paid for the simple syntax of a Hilbert-style system is that complete formal proofs tend to get extremely long. Concrete arguments about proofs in such a system almost always appeal to the deduction theorem. This leads to the idea of including the deduction theorem as a formal rule in the system, which happens in natural deduction. In natural deduction, judgments have the shape
A1, A2, ..., An ⊢ B
where the Ai's and B are formulae. In words, a judgment consists of a list (possibly empty) of formulae on the left-hand side of a turnstile symbol "⊢", with a single formula on the right-hand side. The theorems are those formulae B such that ⊢ B (with an empty left-hand side) is the conclusion of a valid proof. (In some presentations of natural deduction, the Ai's and the turnstile are not written down explicitly; instead a two-dimensional notation from which they can be inferred is used.)
The standard semantics of a judgment in natural deduction is that it asserts that whenever[1] A1, A2, etc., are all true, B will also be true. The judgments
A1, ..., An ⊢ B   and   ⊢ (A1 ∧ ... ∧ An) → B
are equivalent in the strong sense that a proof of either one may be extended to a proof of the other. Finally, sequent calculus generalizes the form of a natural deduction judgment to
A1, ..., An ⊢ B1, ..., Bk,
a syntactic object called a sequent. The formulas on the left-hand side of the turnstile are called the antecedent, and the formulas on the right-hand side are called the succedent; together they are called cedents. Again, the Ai and Bj are formulae, and n and k are nonnegative integers, that is, the left-hand side or the right-hand side (or neither or both) may be empty. As in natural deduction, theorems are those B where ⊢ B is the conclusion of a valid proof.
The empty sequent, having both cedents empty, is defined to be false. The standard semantics of a sequent is an assertion that whenever every Ai is true, at least one Bj will also be true. One way to express this is that a comma to the left of the turnstile should be thought of as an "and", and a comma to the right of the turnstile should be thought of as an (inclusive) "or". The sequents
A1, ..., An ⊢ B1, ..., Bk   and   ⊢ (A1 ∧ ... ∧ An) → (B1 ∨ ... ∨ Bk)
are equivalent in the strong sense that a proof of either one may be extended to a proof of the other. At first sight, this extension of the judgment form may appear to be a strange complication: it is not motivated by an obvious shortcoming of natural deduction, and it is initially confusing that the comma seems to mean entirely different things on the two sides of the turnstile. However, in a classical context the semantics of the sequent can also (by propositional tautology) be expressed either as
⊢ ¬A1 ∨ ¬A2 ∨ ... ∨ ¬An ∨ B1 ∨ ... ∨ Bk
or as
⊢ ¬(A1 ∧ A2 ∧ ... ∧ An ∧ ¬B1 ∧ ... ∧ ¬Bk)
(it cannot be the case that all of the As are true and all of the Bs are false). In these formulations, the only difference between formulae on either side of the turnstile is that one side is negated. Thus, swapping left for right in a sequent corresponds to negating all of the constituent formulae. This means that a symmetry such as De Morgan's laws, which manifests itself as logical negation on the semantic level, translates directly into a left-right symmetry of sequents, and indeed, the inference rules in sequent calculus for dealing with conjunction (∧) are mirror images of those dealing with disjunction (∨). Many logicians feel that this symmetric presentation offers a deeper insight into the structure of the logic than other styles of proof system, where the classical duality of negation is not as apparent in the rules.
The system LK
This section introduces the rules of the sequent calculus LK (which is short for logistischer klassischer Kalkül), as introduced by Gentzen in 1934.[2] A (formal) proof in this calculus is a sequence of sequents, where each of the sequents is derivable from sequents appearing earlier in the sequence by using one of the rules below.
Inference rules
The following notation will be used:
⊢, known as the turnstile, separates the assumptions on the left from the propositions on the right;
A and B denote formulae of first-order predicate logic (one may also restrict this to propositional logic);
Γ, Δ, Σ, and Π are finite (possibly empty) sequences of formulae (in fact, the order of formulae does not matter; see the subsection Structural Rules), called contexts; when on the left of the ⊢, the sequence of formulas is considered conjunctively (all assumed to hold at the same time), while on the right of the ⊢, the sequence of formulas is considered disjunctively (at least one of the formulas must hold for any assignment of variables);
t denotes an arbitrary term;
x and y denote variables;
A[t/x] denotes the formula that is obtained by substituting the term t for every free occurrence of the variable x in formula A;
a variable is said to occur free within a formula if it occurs outside the scope of the quantifiers ∀ or ∃;
WL and WR stand for Weakening Left/Right, CL and CR for Contraction, and PL and PR for Permutation.
Axiom:
A ⊢ A    (I)
Cut:
Γ ⊢ Δ, A    A, Σ ⊢ Π
--------------------- (Cut)
Γ, Σ ⊢ Δ, Π
Restrictions: In the rules (∀R) and (∃L), the variable y must not occur free within Γ, A, and Δ. Alternatively, the variable y must not appear anywhere in the respective lower sequents.
An intuitive explanation
The above rules can be divided into two major groups: logical and structural ones. Each of the logical rules introduces a new logical formula either on the left or on the right of the turnstile ⊢. In contrast, the structural rules operate on the structure of the sequents, ignoring the exact shape of the formulae. The two exceptions to this general scheme are the axiom of identity (I) and the rule of (Cut). Although stated in a formal way, the above rules allow for a very intuitive reading in terms of classical logic. Consider, for example, the rule (∧L1). It says that, whenever one can prove that Δ can be concluded from some sequence of formulae that contain A, then one can also conclude Δ from the (stronger) assumption that A ∧ B holds. Likewise, the rule (¬R) states that, if Γ and A suffice to conclude Δ, then from Γ alone one can either still conclude Δ or A must be false, i.e. ¬A holds. All the rules can be interpreted in this way. For an intuition about the quantifier rules, consider the rule (∀R). Of course concluding that ∀x A holds just from the fact that A[y/x] is true is not in general possible. If, however, the variable y is not mentioned elsewhere (i.e. it can still be chosen freely, without influencing the other formulae), then one may assume that A[y/x] holds for any value of y. The other rules should then be pretty straightforward. Instead of viewing the rules as descriptions for legal derivations in predicate logic, one may also consider them as instructions for the construction of a proof for a given statement. In this case the rules can be read bottom-up; for example, (∧R) says that, to prove that A ∧ B follows from the assumptions Γ and Σ, it suffices to prove that A can be concluded from Γ and B can be concluded from Σ, respectively. Note that, given some antecedent, it is not clear how this is to be split into Γ and Σ. However, there are only finitely many possibilities to be checked since the antecedent by assumption is finite. This also illustrates how proof theory can be viewed as operating on proofs in a combinatorial fashion: given proofs for both A and B, one can construct a proof for A ∧ B. When looking for some proof, most of the rules offer more or less direct recipes of how to do this. The rule of cut is different: it states that, when a formula A can be concluded and this formula may also serve as a premise for concluding other statements, then the formula A can be "cut out" and the respective derivations are joined. When constructing a proof bottom-up, this creates the problem of guessing A (since it does not appear at all below).
The cut-elimination theorem is thus crucial to the applications of sequent calculus in automated deduction: it states that all uses of the cut rule can be eliminated from a proof, implying that any provable sequent can be given a cut-free proof. The second rule that is somewhat special is the axiom of identity (I). The intuitive reading of this is obvious: every formula proves itself. Like the cut rule, the axiom of identity is somewhat redundant: the completeness of atomic initial sequents states that the rule can be restricted to atomic formulas without any loss of provability. Observe that all rules have mirror companions, except the ones for implication. This reflects the fact that the usual language of first-order logic does not include the "is not implied by" connective that would be the De Morgan dual of implication. Adding such a connective with its natural rules would make the calculus completely left-right symmetric.
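Because cut-free proofs only decompose the connectives of the final conclusion, reading the rules bottom-up yields a terminating proof-search procedure for the propositional fragment. The following Python sketch illustrates this idea; it is an illustration only, not code from the article, and it uses an invertible, multi-succedent presentation of the rules rather than Gentzen's original LK with explicit structural rules. The formula encoding (nested tuples over string atoms) and the function name provable are assumptions made for the example.

def provable(gamma, delta):
    """Backward (bottom-up) proof search for a classical propositional sequent
    gamma |- delta.  Formulae: atoms as strings, or tuples ('not', F),
    ('and', F, G), ('or', F, G), ('implies', F, G)."""
    # Axiom: some formula occurs on both sides of the turnstile.
    if any(f in delta for f in gamma):
        return True
    # Apply the left rule for the first compound formula in the antecedent.
    for f in gamma:
        if not isinstance(f, str):
            rest = [g for g in gamma if g is not f]
            if f[0] == 'and':
                return provable(rest + [f[1], f[2]], delta)
            if f[0] == 'or':
                return (provable(rest + [f[1]], delta) and
                        provable(rest + [f[2]], delta))
            if f[0] == 'implies':
                return (provable(rest, delta + [f[1]]) and
                        provable(rest + [f[2]], delta))
            if f[0] == 'not':
                return provable(rest, delta + [f[1]])
    # Otherwise apply the right rule for the first compound formula in the succedent.
    for f in delta:
        if not isinstance(f, str):
            rest = [g for g in delta if g is not f]
            if f[0] == 'and':
                return (provable(gamma, rest + [f[1]]) and
                        provable(gamma, rest + [f[2]]))
            if f[0] == 'or':
                return provable(gamma, rest + [f[1], f[2]])
            if f[0] == 'implies':
                return provable(gamma + [f[1]], rest + [f[2]])
            if f[0] == 'not':
                return provable(gamma + [f[1]], rest)
    return False   # only atoms remain and no axiom applies

# Law of excluded middle: |- a v ~a
print(provable([], [('or', 'a', ('not', 'a'))]))           # True
# a & b |- b v c
print(provable([('and', 'a', 'b')], [('or', 'b', 'c')]))   # True
print(provable([], ['a']))                                  # False

Since every rule in this presentation is invertible, the order in which formulae are decomposed does not affect provability, and each step removes one connective; this is the same boundedness property exploited above for unprovability arguments.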
Example derivations
Here is the derivation of "⊢ A ∨ ¬A", known as the law of excluded middle (tertium non datur in Latin).
Next is the proof of a simple fact involving quantifiers. Note that the converse is not true, and its falsity can be seen when attempting to derive it bottom-up, because an existing free variable cannot be used in substitution in the rules (∀R) and (∃L).
For something more interesting we shall prove a further formula. It is straightforward to find the derivation, which exemplifies the usefulness of LK in automated proving.
These derivations also emphasize the strictly formal structure of the sequent calculus. For example, the logical rules as defined above always act on a formula immediately adjacent to the turnstile, such that the permutation rules are necessary. Note, however, that this is in part an artifact of the presentation, in the original style of Gentzen. A common simplification involves the use of multisets of formulas in the interpretation of the sequent, rather than sequences, eliminating the need for an explicit permutation rule. This corresponds to shifting commutativity of assumptions and derivations outside the sequent calculus, whereas LK embeds it within the system itself.
Structural rules
The structural rules deserve some additional discussion. Weakening (W) allows the addition of arbitrary elements to a sequence. Intuitively, this is allowed in the antecedent because we can always add assumptions to our proof, and in the succedent because we can always allow for alternative conclusions. Contraction (C) and Permutation (P) assure that neither the order (P) nor the multiplicity of occurrences (C) of elements of the sequences matters. Thus, one could instead of sequences also consider sets. The extra effort of using sequences, however, is justified since part or all of the structural rules may be omitted. Doing so, one obtains the so-called substructural logics.
Variants
The above rules can be modified in various ways.
Substructural logics
Alternatively, one may restrict or forbid the use of some of the structural rules. This yields a variety of substructural logic systems. They are generally weaker than LK (i.e., they have fewer theorems), and thus not complete with respect to the standard semantics of first-order logic. However, they have other interesting properties that have led to applications in theoretical computer science and artificial intelligence.
One notable variant restricts sequents to at most one formula on the right-hand side. The resulting system is called LJ. It is sound and complete with respect to intuitionistic logic and admits a similar cut-elimination proof.
Notes
[1] Here, "whenever" is used as an informal abbreviation for "for every assignment of values to the free variables in the judgment".
[2] Gentzen, Gerhard (1934/1935). "Untersuchungen über das logische Schließen. I". Mathematische Zeitschrift 39 (2): 176–210. doi:10.1007/BF01201353.
References
Girard, Jean-Yves; Paul Taylor; Yves Lafont (1990) [1989]. Proofs and Types (http://www.paultaylor.eu/stable/Proofs+Types.html). Cambridge University Press (Cambridge Tracts in Theoretical Computer Science, 7). ISBN 0-521-37181-3.
Samuel R. Buss (1998). "An introduction to proof theory" (http://math.ucsd.edu/~sbuss/ResearchWeb/handbookI/). In Samuel R. Buss. Handbook of proof theory. Elsevier. pp. 1–78. ISBN 0-444-89840-9.
External links
A Brief Diversion: Sequent Calculus (http://scienceblogs.com/goodmath/2006/07/a_brief_diversion_sequent_calc.php)
Resolution (logic)
In mathematical logic and automated theorem proving, resolution is a rule of inference leading to a refutation theorem-proving technique for sentences in propositional logic and first-order logic. In other words, iteratively applying the resolution rule in a suitable way allows for telling whether a propositional formula is satisfiable and for proving that a first-order formula is unsatisfiable; this method may prove the satisfiability of a first-order satisfiable formula, but not always, as is the case for all methods for first-order logic. Resolution was introduced by John Alan Robinson in 1965.
The resolution rule in propositional logic is a single valid inference rule that, from two clauses containing complementary literals (a literal in one clause whose complement appears in the other), produces a new clause containing all the remaining literals of both clauses; in the usual two-line presentation of the rule, the dividing line stands for "entails". The clause produced by the resolution rule is called the resolvent of the two input clauses. When the two clauses contain more than one pair of complementary literals, the resolution rule can be applied (independently) for each such pair. However, only the pair of literals that are resolved upon can be removed: all other pairs of literals remain in the resolvent clause. Note that resolving any two clauses that can be resolved over more than one variable always results in a tautology.
Modus ponens can be seen as a special case of resolution of a one-literal clause and a two-literal clause.
A resolution technique
When coupled with a complete search algorithm, the resolution rule yields a sound and complete algorithm for deciding the satisfiability of a propositional formula, and, by extension, the validity of a sentence under a set of axioms. This resolution technique uses proof by contradiction and is based on the fact that any sentence in propositional logic can be transformed into an equivalent sentence in conjunctive normal form. The steps are as follows.
All sentences in the knowledge base and the negation of the sentence to be proved (the conjecture) are conjunctively connected.
The resulting sentence is transformed into a conjunctive normal form with the conjuncts viewed as elements in a set, S, of clauses.
The resolution rule is applied to all possible pairs of clauses that contain complementary literals. After each application of the resolution rule, the resulting sentence is simplified by removing repeated literals. If the sentence contains complementary literals, it is discarded (as a tautology). If not, and if it is not yet present in the clause set S, it is added to S, and is considered for further resolution inferences.
If after applying a resolution rule the empty clause is derived, the original formula is unsatisfiable (or contradictory), and hence it can be concluded that the initial conjecture follows from the axioms.
If, on the other hand, the empty clause cannot be derived, and the resolution rule cannot be applied to derive any more new clauses, the conjecture is not a theorem of the original knowledge base.
One instance of this algorithm is the original Davis–Putnam algorithm that was later refined into the DPLL algorithm that removed the need for explicit representation of the resolvents.
This description of the resolution technique uses a set S as the underlying data structure to represent resolution derivations. Lists, trees and directed acyclic graphs are other possible and common alternatives. Tree representations are more faithful to the fact that the resolution rule is binary. Together with a sequent notation for clauses, a tree representation also makes it clear how the resolution rule is related to a special case of the cut rule, restricted to atomic cut-formulas. However, tree representations are not as compact as set or list representations, because they explicitly show redundant subderivations of clauses that are used more than once in the derivation of the empty clause. Graph representations can be as compact in the number of clauses as list representations and they also store structural information regarding which clauses were resolved to derive each resolvent.
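The saturation loop just described is short enough to sketch directly. The following Python fragment is an illustration under assumed encodings (literals as integers, negative for negated atoms, clauses as frozensets); these conventions are not taken from the article.

from itertools import combinations

def resolve(ci, cj):
    """Return all resolvents of two propositional clauses (sets of integer literals)."""
    resolvents = []
    for lit in ci:
        if -lit in cj:
            resolvent = frozenset((ci - {lit}) | (cj - {-lit}))
            # Skip tautologies (clauses containing a literal and its complement).
            if not any(-l in resolvent for l in resolvent):
                resolvents.append(resolvent)
    return resolvents

def resolution_unsatisfiable(clauses):
    """Return True if the empty clause is derivable, i.e. the clause set is unsatisfiable."""
    clauses = set(frozenset(c) for c in clauses)
    while True:
        new = set()
        for ci, cj in combinations(clauses, 2):
            for r in resolve(ci, cj):
                if not r:            # empty clause derived: contradiction
                    return True
                if r not in clauses:
                    new.add(r)
        if not new:                  # saturation reached without the empty clause
            return False
        clauses |= new

# Example: knowledge base {a -> b, a} and conjecture b.
# Negate the conjecture and check for unsatisfiability.
kb_and_negated_goal = [{-1, 2}, {1}, {-2}]     # (~a v b), a, ~b  with a=1, b=2
print(resolution_unsatisfiable(kb_and_negated_goal))   # True: b follows from the axioms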
A simple example
a ∨ b,   ¬a ∨ c
----------------
b ∨ c
In English: suppose a is false. In order for the premise a ∨ b to be true, b must be true. Suppose instead that a is true. In order for the premise ¬a ∨ c to be true, c must be true. Therefore, regardless of the falsehood or veracity of a, if both premises hold, then the conclusion b ∨ c is true.
∀X P(X) → Q(X)
P(a)
Therefore, Q(a)
To recast the reasoning using the resolution technique, first the clauses must be converted to conjunctive normal form. In this form, all quantification becomes implicit: universal quantifiers on variables (X, Y, ...) are simply omitted as understood, while existentially-quantified variables are replaced by Skolem functions.
¬P(X) ∨ Q(X)
P(a)
Therefore, Q(a)
So the question is, how does the resolution technique derive the last clause from the first two? The rule is simple: Find two clauses containing the same predicate, where it is negated in one clause but not in the other. Perform a unification on the two predicates. (If the unification fails, you made a bad choice of predicates. Go back to the previous step and try again.) If any unbound variables which were bound in the unified predicates also occur in other predicates in the two clauses, replace them with their bound values (terms) there as well. Discard the unified predicates, and combine the remaining ones from the two clauses into a new clause, also joined by the "∨" operator. To apply this rule to the above example, we find the predicate P occurs in negated form ¬P(X) in the first clause, and in non-negated form P(a) in the second clause. X is an unbound variable, while a is a bound value (term). Unifying the two produces the substitution X ↦ a. Discarding the unified predicates, and applying this substitution to the remaining predicates (just Q(X), in this case), produces the conclusion: Q(a). For another example, consider the syllogistic form
All Cretans are islanders.
All islanders are liars.
Therefore all Cretans are liars.
Or more generally:
∀X P(X) → Q(X)
∀X Q(X) → R(X)
Therefore, ∀X P(X) → R(X)
In CNF, the antecedents become:
¬P(X) ∨ Q(X)
¬Q(Y) ∨ R(Y)
(Note that the variable in the second clause was renamed to make it clear that variables in different clauses are distinct.)
Now, unifying Q(X) in the first clause with Q(Y) in the second clause means that X and Y become the same variable anyway. Substituting this into the remaining clauses and combining them gives the conclusion:
¬P(X) ∨ R(X)
The resolution rule, as defined by Robinson, also incorporated factoring, which unifies two literals in the same clause, before or during the application of resolution as defined above. The resulting inference rule is refutation-complete, in that a set of clauses is unsatisfiable if and only if there exists a derivation of the empty clause using resolution alone.
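The unification step is the only non-trivial ingredient of the procedure just described, so a compact sketch may help. The term encoding below (tuples for applications, strings beginning with an uppercase letter for variables, lowercase strings for constants) is an assumed convention for illustration, not notation from the article; the occurs check is omitted for brevity.

def is_var(t):
    # Variables are strings beginning with an uppercase letter (assumed convention).
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def unify(s, t, subst):
    """Robinson-style unification; returns an extended substitution or None."""
    s, t = walk(s, subst), walk(t, subst)
    if s == t:
        return subst
    if is_var(s):
        return {**subst, s: t}
    if is_var(t):
        return {**subst, t: s}
    if isinstance(s, tuple) and isinstance(t, tuple) and len(s) == len(t) and s[0] == t[0]:
        for a, b in zip(s[1:], t[1:]):
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None

def apply_subst(t, subst):
    t = walk(t, subst)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply_subst(a, subst) for a in t[1:])
    return t

def resolve_on(clause1, clause2, atom1, atom2):
    """Resolve clause1 (containing the positive literal atom1) with clause2
    (containing the negative literal ('not', atom2)), if the atoms unify."""
    subst = unify(atom1, atom2, {})
    if subst is None:
        return None
    rest = [l for l in clause1 if l != atom1] + \
           [l for l in clause2 if l != ('not', atom2)]
    return [apply_subst(l, subst) for l in rest]

# The syllogism above: ~P(X) v Q(X) and ~Q(Y) v R(Y) resolve on Q.
c1 = [('not', ('P', 'X')), ('Q', 'X')]
c2 = [('not', ('Q', 'Y')), ('R', 'Y')]
print(resolve_on(c1, c2, ('Q', 'X'), ('Q', 'Y')))
# [('not', ('P', 'Y')), ('R', 'Y')]  i.e. ~P(Y) v R(Y), the stated conclusion up to renaming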
Implementations
Carine
Gandalf
Otter
Prover9
SNARK
SPASS
Vampire
References
Robinson, J. Alan (1965). "A Machine-Oriented Logic Based on the Resolution Principle". Journal of the ACM (JACM) 12 (1): 23–41.
Leitsch, Alexander (1997). The Resolution Calculus. Springer-Verlag.
Gallier, Jean H. (1986). Logic for Computer Science: Foundations of Automatic Theorem Proving [1]. Harper & Row Publishers.
Chang, Chin-Liang; Lee, Richard Char-Tung (1987). Symbolic Logic and Mechanical Theorem Proving ([Nachdr.] ed.). San Diego: Academic Press. ISBN 0121703509.
External links
Alex Sakharov, "Resolution Principle [2]" from MathWorld. Alex Sakharov, "Resolution [3]" from MathWorld. Notes on computability and resolution [4]
References
[1] http://www.cis.upenn.edu/~jean/gbooks/logic.html
[2] http://mathworld.wolfram.com/ResolutionPrinciple.html
[3] http://mathworld.wolfram.com/Resolution.html
[4] http://www.cs.uu.nl/docs/vakken/pv/resources/computational_prop_of_fol.pdf
Method of analytic tableaux
Introduction
For refutation tableaux, the objective is to show that the negation of a formula cannot be satisfied. There are rules for handling each of the usual connectives, starting with the main connective. In many cases, applying these rules causes the subtableau to divide into two. Quantifiers are instantiated. If any branch of a tableau leads to an evident contradiction, the branch closes. If all branches close, the proof is complete and the original formula is a logical truth.
Although the fundamental idea behind the analytic tableau method is derived from the cut-elimination theorem of structural proof theory, the origins of tableau calculi lie in the meaning (or semantics) of the logical connectives, as the connection with proof theory was made only in recent decades. More specifically, a tableau calculus consists of a finite collection of rules with each rule specifying how to break down one logical connective into its constituent parts. The rules typically are expressed in terms of finite sets of formulae, although there are logics for which we must use more complicated data structures, such as multisets, lists, or even trees of formulas. Henceforth, "set" denotes any of {set, multiset, list, tree}.
If there is such a rule for every logical connective then the procedure will eventually produce a set which consists only of atomic formulae and their negations, which cannot be broken down any further. Such a set is easily recognizable as satisfiable or unsatisfiable with respect to the semantics of the logic in question. To keep track of this process, the nodes of a tableau itself are set out in the form of a tree and the branches of this tree are created and assessed in a systematic way. Such a systematic method for searching this tree gives rise to an algorithm for performing deduction and automated reasoning. Note that this larger tree is present regardless of whether the nodes contain sets, multisets, lists or trees.
A graphical representation of a partially built propositional tableau.
Propositional logic
This section presents the tableau calculus for classical propositional logic. A tableau checks whether a given set of formulae is satisfiable or not. It can be used to check either validity or entailment: a formula is valid if its negation is unsatisfiable, and formulae A1, ..., An imply B if the set {A1, ..., An, ¬B} is unsatisfiable. The main principle of propositional tableaux is to attempt to "break" complex formulae into smaller ones until complementary pairs of literals are produced or no further expansion is possible.
The method works on a tree whose nodes are labeled with formulae. At each step, this tree is modified; in the propositional case, the only allowed changes are additions of a node as descendant of a leaf. The procedure starts by generating the tree made of a chain of all formulae in the set to prove unsatisfiability. A variant of this starting step is to begin with a single-node tree whose root is labeled by true; in this second case, the procedure can always copy a formula in the set below a leaf. As a running example, the tableau for a small set of formulae is shown in the figure. The principle of tableaux is that formulae in nodes of the same branch are considered in conjunction while the different branches are considered to be disjoined. As a result, a tableau is a tree-like representation of a formula that is a disjunction of conjunctions. This formula is equivalent to the set whose unsatisfiability is to be proved. The procedure modifies the tableau in such a way that the formula represented by the resulting tableau is equivalent to the original one. One of these conjunctions may contain a pair of complementary literals, in which case that conjunction is proved to be unsatisfiable. If all conjunctions are proved unsatisfiable, the original set of formulae is unsatisfiable.
And
Whenever a branch of a tableau contains a formula that is the conjunction of two formulae, these two formulae are both consequences of that formula. This fact can be formalized by the following rule for expansion of a tableau:
(∧) If a branch of the tableau contains a conjunctive formula A ∧ B, add to its leaf the chain of two nodes containing the formulae A and B.
This rule is generally written as follows:
A ∧ B
-----
A
B
A variant of this rule allows a node to contain a set of formulae rather than a single one. In this case, the formulae in this set are considered in conjunction, so one can add {A, B} at the end of a branch containing A ∧ B. More precisely, if a node on a branch is labeled X ∪ {A ∧ B}, one can add to the branch the new leaf X ∪ {A, B}.
Or
If a branch of a tableau contains a formula that is a disjunction of two formulae, such as A ∨ B, the following rule can be applied:
(∨) If a node on a branch contains a disjunctive formula A ∨ B, then create two sibling children to the leaf of the branch, containing the formulae A and B, respectively.
This rule splits a branch into two, differing only in the final node. Since branches are considered in disjunction to each other, the two resulting branches are equivalent to the original one, as the disjunction of their non-common nodes is precisely A ∨ B. The rule for disjunction is generally formally written using the symbol | for separating the formulae of the two distinct nodes to be created:
A ∨ B
-----
A | B
For example, a ∨ b generates a and b.
If nodes are assumed to contain sets of formulae, this rule is replaced by: if a node is labeled X ∪ {A ∨ B}, the leaf of the branch this node is in can be appended two sibling child nodes labeled X ∪ {A} and X ∪ {B}, respectively.
Not
The aim of tableaux is to generate progressively simpler formulae until pairs of opposite literals are produced or no other rule can be applied. Negation can be treated by initially putting formulae in negation normal form, so that negation only occurs in front of literals. Alternatively, one can use De Morgan's laws during the expansion of the tableau, so that for example ¬(A ∧ B) is treated as ¬A ∨ ¬B. Rules that introduce or remove a pair of negations (such as in ¬¬A : A) are also used in this case (otherwise, there would be no way of expanding a formula like ¬¬(A ∧ B)).
Closure
Every tableau can be considered as a graphical representation of a formula, which is equivalent to the set the tableau is built from. This formula is as follows: each branch of the tableau represents the conjunction of its formulae; the tableau represents the disjunction of its branches. The expansion rules transform a tableau into one having an equivalent represented formula. Since the tableau is initialized as a single branch containing the formulae of the input set, all subsequent tableaux obtained from it represent formulae which are equivalent to that set (in the variant where the initial tableau is the single node labeled true, the formulae represented by tableaux are consequences of the original set).
The method of tableaux works by starting with the initial set of formulae and then adding to the tableau simpler and simpler formulae until contradiction is shown in the simple form of opposite literals. Since the formula represented by a tableau is the disjunction of the formulae represented by its branches, contradiction is obtained when every branch contains a pair of opposite literals. Once a branch contains a literal and its negation, its corresponding formula is unsatisfiable. As a result, this branch can now be "closed", as there is no need to further expand it. If all branches of a tableau are closed, the formula represented by the tableau is unsatisfiable; therefore, the original set is unsatisfiable as well. Obtaining a tableau where all branches are closed is a way of proving the unsatisfiability of the original set. In the propositional case, one can also prove that satisfiability is established by the impossibility of finding a closed tableau, provided that every expansion rule has been applied everywhere it could be applied. In particular, if a tableau contains some open (non-closed) branches and every formula that is not a literal has been used by a rule to generate a new node on every branch the formula is in, the set is satisfiable.
This rule takes into account that a formula may occur in more than one branch (this is the case if there is at least a branching point "below" the node). In this case, the rule for expanding the formula has to be applied so that its conclusion(s) are appended to all of these branches that are still open, before one can conclude that the tableau cannot be further expanded and that the formula is therefore satisfiable.
A tableau for the satisfiable set {ac,ab}: all rules have been applied to every formula on every branch, but the tableau is not closed (only the left branch is closed), as expected for satisfiable sets
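The expansion and closure rules just described are easy to prototype. The Python sketch below is an illustration under assumed conventions (formulae as nested tuples built from 'and', 'or' and 'not' over string atoms); the encoding and the function name are not taken from the article. It expands one branch at a time and reports whether some branch stays open.

def tableau_satisfiable(formulas):
    """Decide satisfiability of a finite set of propositional formulae by the
    tableau method.  Formulae are atoms given as strings, or tuples
    ('not', F), ('and', F, G), ('or', F, G)."""

    def complement(lit):
        return lit[1] if isinstance(lit, tuple) else ('not', lit)

    def expand(todo, literals):
        # todo: formulae still to break down on this branch;
        # literals: atoms and negated atoms already on the branch.
        while todo:
            f = todo.pop()
            if isinstance(f, str) or (f[0] == 'not' and isinstance(f[1], str)):
                if complement(f) in literals:
                    return False                 # branch closes on opposite literals
                literals = literals | {f}
            elif f[0] == 'and':
                todo = todo + [f[1], f[2]]       # (and): add both conjuncts to the branch
            elif f[0] == 'or':
                # (or): split the branch into two, one per disjunct
                return (expand(todo + [f[1]], literals) or
                        expand(todo + [f[2]], literals))
            elif f[1][0] == 'not':               # double negation
                todo = todo + [f[1][1]]
            elif f[1][0] == 'and':               # De Morgan: not(F and G) = (not F) or (not G)
                todo = todo + [('or', ('not', f[1][1]), ('not', f[1][2]))]
            else:                                # De Morgan: not(F or G) = (not F) and (not G)
                todo = todo + [('and', ('not', f[1][1]), ('not', f[1][2]))]
        return True                              # no rule applicable: the branch stays open

    return expand(list(formulas), frozenset())

# A is a tautology iff the tableau for {not A} has no open branch.
excluded_middle = ('or', 'a', ('not', 'a'))
print(tableau_satisfiable([('not', excluded_middle)]))        # False: not(a or not a) is unsatisfiable
print(tableau_satisfiable(['a', ('or', ('not', 'a'), 'b')]))  # True, e.g. a and b both true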
Set-labeled tableau
A variant of tableaux is to label nodes with sets of formulae rather than single formulae. In this case, the initial tableau is a single node labeled with the set to be proved satisfiable. The formulae in a set are therefore considered to be in conjunction. The rules of expansion of the tableau can now work on the leaves of the tableau, ignoring all internal nodes. For conjunction, the rule is based on the equivalence of a set containing a conjunction A ∧ B with the set containing both A and B in place of it. In particular, if a leaf is labeled with X ∪ {A ∧ B}, a node can be appended to it with label X ∪ {A, B}. For disjunction, a set X ∪ {A ∨ B} is equivalent to the disjunction of the two sets X ∪ {A} and X ∪ {B}. As a result, if the first set labels a leaf, two children can be appended to it, labeled with the latter two formulae.
Finally, if a set contains both a literal and its negation, this branch can be closed:
A tableau for a given finite set X is a finite (upside down) tree with root X in which all child nodes are obtained by applying the tableau rules to their parents. A branch in such a tableau is closed if its leaf node contains "closed". A tableau is closed if all its branches are closed. A tableau is open if at least one branch is not closed. Here are two closed tableaux for the set X = {r0 & ~r0, p0 & ((~p0 v q0) & ~q0)}, with each rule application marked at the right hand side (& and ~ stand for ∧ and ¬, respectively):
Left-hand tableau:
{r0 & ~r0, p0 & ((~p0 v q0) & ~q0)}
----------------------------------- (&)
{r0, ~r0, p0 & ((~p0 v q0) & ~q0)}
----------------------------------- (id)
closed

Right-hand tableau:
{r0 & ~r0, p0 & ((~p0 v q0) & ~q0)}
----------------------------------- (&)
{r0 & ~r0, p0, ((~p0 v q0) & ~q0)}
----------------------------------- (&)
{r0 & ~r0, p0, (~p0 v q0), ~q0}
----------------------------------- (v)
{r0 & ~r0, p0, ~p0, ~q0}   |   {r0 & ~r0, p0, q0, ~q0}
------------------------ (id)  ----------------------- (id)
closed                          closed
The left-hand tableau closes after only one rule application while the right-hand one misses the mark and takes a lot longer to close. Clearly, we would prefer to always find the shortest closed tableaux, but it can be shown that one single algorithm that finds the shortest closed tableaux for all input sets of formulae cannot exist. The three rules (∧), (∨) and (id) given above are then enough to decide if a given set X of formulae in negated normal form are jointly satisfiable: just apply all possible rules in all possible orders until we find a closed tableau for X or until we exhaust all possibilities and conclude that every tableau for X is open. In the first case, X is jointly unsatisfiable, and in the second case the leaf node of the open branch gives an assignment to the atomic formulae and negated atomic formulae which makes X jointly satisfiable.
Classical logic actually has the rather nice property that we need to investigate only (any) one tableau completely: if it closes then X is unsatisfiable and if it is open then X is satisfiable. But this property is not generally enjoyed by other logics. These rules suffice for all of classical logic by taking an initial set of formulae X and replacing each member C by its logically equivalent negated normal form C', giving a set of formulae X'. We know that X is satisfiable if and only if X' is satisfiable, so it suffices to search for a closed tableau for X' using the procedure outlined above. By setting X = {¬A} we can test whether the formula A is a tautology of classical logic: if the tableau for {¬A} closes then ¬A is unsatisfiable and so A is a tautology, since no assignment of truth values will ever make A false. Otherwise any open leaf of any open branch of any open tableau gives an assignment that falsifies A.
Conditional
Classical propositional logic usually has a connective to denote material implication. If we write this connective as →, then the formula A → B stands for "if A then B". It is possible to give a tableau rule for breaking down A → B into its constituent formulae. Similarly, we can give one rule each for breaking down each of ¬(A ∧ B), ¬(A ∨ B), ¬(¬A), and ¬(A → B). Together these rules would give a terminating procedure for deciding whether a given set of formulae is simultaneously satisfiable in classical logic, since each rule breaks down one formula into its constituents but no rule builds larger formulae out of smaller constituents. Thus we must eventually reach a node that contains only atoms and negations of atoms. If this last node matches (id) then we can close the branch, otherwise it remains open. But note that the following equivalences hold in classical logic, where (...) = (...) means that the left hand side formula is logically equivalent to the right hand side formula:
¬(A ∧ B) = ¬A ∨ ¬B
¬(A ∨ B) = ¬A ∧ ¬B
¬(¬A) = A
(A → B) = ¬A ∨ B
¬(A → B) = A ∧ ¬B
If we start with an arbitrary formula C of classical logic and apply these equivalences repeatedly to replace the left hand sides with the right hand sides in C, then we will obtain a formula C' which is logically equivalent to C but which has the property that C' contains no implications, and ¬ appears in front of atomic formulae only. Such a formula is said to be in negation normal form, and it is possible to prove formally that every formula C of classical logic has a logically equivalent formula C' in negation normal form. That is, C is satisfiable if and only if C' is satisfiable.
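The rewriting just described is mechanical, and a short sketch may make it concrete. The encoding below (tuples with 'not', 'and', 'or', 'implies' over string atoms) is an assumed convention for illustration.

def nnf(f):
    """Push negations inward and remove implications, yielding negation normal form.
    Formulae are atoms (strings) or tuples ('not', F), ('and', F, G),
    ('or', F, G), ('implies', F, G)."""
    if isinstance(f, str):
        return f
    op = f[0]
    if op == 'implies':                       # A -> B  =  (not A) or B
        return ('or', nnf(('not', f[1])), nnf(f[2]))
    if op in ('and', 'or'):
        return (op, nnf(f[1]), nnf(f[2]))
    # op == 'not'
    g = f[1]
    if isinstance(g, str):
        return f                              # negation of an atom stays as is
    if g[0] == 'not':                         # not(not A) = A
        return nnf(g[1])
    if g[0] == 'and':                         # not(A and B) = (not A) or (not B)
        return ('or', nnf(('not', g[1])), nnf(('not', g[2])))
    if g[0] == 'or':                          # not(A or B) = (not A) and (not B)
        return ('and', nnf(('not', g[1])), nnf(('not', g[2])))
    # g[0] == 'implies':                        not(A -> B) = A and (not B)
    return ('and', nnf(g[1]), nnf(('not', g[2])))

print(nnf(('not', ('implies', 'a', ('and', 'b', 'c')))))
# ('and', 'a', ('or', ('not', 'b'), ('not', 'c')))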
Contrary to the rules for the propositional connectives, multiple applications of the rule for universal quantifiers to the same formula may be necessary. As an example, a set such as {¬P(a) ∨ ¬P(b), ∀x.P(x)} can only be proved unsatisfiable if both P(a) and P(b) are generated from ∀x.P(x). Existential quantifiers are dealt with by Skolemization. In particular, a formula with a leading existential quantifier like ∃x.A generates its Skolemization A[c/x], where c is a new constant symbol.
The Skolem term c is a constant (a function of arity 0) because the quantification over x does not occur within the scope of any universal quantifier. If the original formula contained some universal quantifiers such that the quantification over x was within their scope, these quantifiers have evidently been removed by the application of the rule for universal quantifiers.
A tableau without unification for {∀x.P(x), ∃x.(¬P(x) ∨ ¬P(f(x)))}. For clarity, formulae are numbered on the left and the formula and rule used at each step is on the right.
The rule for existential quantifiers introduces new constant symbols. These symbols can be used by the rule for universal quantifiers, so that ∀x.A can generate A[c/x] even if c was not in the original formula but is a Skolem constant created by the rule for existential quantifiers. The above two rules for universal and existential quantifiers are correct, and so are the propositional rules: if a set of formulae generates a closed tableau, this set is unsatisfiable. Completeness can also be proved: if a set of formulae is unsatisfiable, there exists a closed tableau built from it by these rules. However, actually finding such a closed tableau requires a suitable policy of application of rules. Otherwise, an unsatisfiable set can generate an infinite-growing tableau. As an example, an unsatisfiable set containing ∀x.P(x) may never yield a closed tableau if one unwisely keeps applying the rule for universal quantifiers to ∀x.P(x), generating instances that are never useful for closure. A closed tableau can always be found by ruling out this and similar
"unfair" policies of application of tableau rules. The rule for universal quantifiers is the only non-deterministic rule, as it does not specify which term to instantiate with. Moreover, while the other rules need to be applied only once for each formula and each path the formula is in, this one may require multiple applications. Application of this rule can however be restricted by delaying the application of the rule until no other rule is applicable and by restricting the application of the rule to ground terms that already appear in the path of the tableau. The variant of tableaux with unification shown below aims at solving the problem of non-determinism.
In tableaux with unification, the rule for universal quantifiers instead generates A[x'/x] from ∀x.A, where x' is a new (free) variable.
While the initial set of formulae is supposed not to contain free variables, a formula of the tableau may contain the free variables generated by this rule. These free variables are implicitly considered universally quantified. This rule employs a variable instead of a ground term. The gain of this change is that these variables can then be given a value when a branch of the tableau can be closed, solving the problem of generating terms that might be useless.
If σ is the most general unifier of two literals A and B, where A and the negation of B occur in the same branch of the tableau, σ can be applied at the same time to all formulae of the tableau.
As an example, a set such as {P(a), ∀x.¬P(x)} can be proved unsatisfiable by first generating ¬P(x1) from the universally quantified formula; the most general unifier of P(a) and P(x1) is the substitution that replaces x1 with a; applying this substitution results in replacing ¬P(x1) with ¬P(a), which closes the tableau. This rule closes at least a branch of the tableau, namely the one containing the considered pair of literals. However, the substitution has to be applied to the whole tableau, not only to these two literals. This is expressed by saying that the free variables of the tableau are rigid: if an occurrence of a variable is replaced by something else, all other occurrences of the same variable must be replaced in the same way. Formally, the free variables are (implicitly) universally quantified and all formulae of the tableau are within the scope of these quantifiers.
Existential quantifiers are dealt with by Skolemization. Contrary to the tableau without unification, Skolem terms may not be simple constants. Indeed, formulae in a tableau with unification may contain free variables, which are implicitly considered universally quantified. As a result, a formula like ∃x.A may be within the scope of universal quantifiers; if this is the case, the Skolem term is not a simple constant but a term made of a new function symbol and the free variables of the formula: ∃x.A generates A[f(x1, ..., xn)/x], where f is a new function symbol and x1, ..., xn are the free variables of A. This rule incorporates a simplification over a rule where x1, ..., xn are the free variables of the branch, not of A alone. The rule can be further simplified by the reuse of a function symbol if it has already been used in a formula that is identical to A up to variable renaming.
A first-order tableau with unification for {∀x.P(x), ∃x.(¬P(x) ∨ ¬P(f(x)))}. For clarity, formulae are numbered on the left and the formula and rule used at each step is on the right.
The formula represented by a tableau is obtained in a way that is similar to the propositional case, with the additional assumption that free variables are considered universally quantified. As for the propositional case, formulae in each branch are conjoined and the resulting formulae are disjoined. In addition, all free variables of the resulting formula are universally quantified. All these quantifiers have the whole formula in their scope. In other words, if F is the formula obtained by disjoining the conjunction of the formulae in each branch, and x1, ..., xn are the free variables in it, then ∀x1...∀xn.F is the formula represented by the tableau. The following considerations apply:
The assumption that free variables are universally quantified is what makes the application of a most general unifier a sound rule: since ∀x.F means that F is true for every possible value of x, then F is also true for the term that the most general unifier replaces x with.
Free variables in a tableau are rigid: all occurrences of the same variable have to be replaced with the same term. Every variable can be considered a symbol representing a term that is yet to be decided. This is a consequence of free variables being assumed universally quantified over the whole formula represented by the tableau: if the same variable occurs free in two different nodes, both occurrences are in the scope of the same quantifier. As an example, if the formulae in two nodes are A(x) and B(x), where x is free in both, the formula represented by the tableau is something of the form ∀x.(... A(x) ... B(x) ...). This formula implies that (... A(x) ... B(x) ...) is true for any value of x, but does not in general imply (... A(t) ... B(t') ...) for two different terms t and t', as these two terms may in general take different values. This means that x cannot be replaced by two different terms in A(x) and B(x).
Free variables in a formula to check for validity are also considered universally quantified. However, these variables cannot be left free when building a tableau, because tableau rules work on the converse of the formula but still treat free variables as universally quantified. If the universal quantification is left implicit, an incorrect substitution for such a free variable may generate a closure even though the original formula is not valid and its negation is satisfiable. A correct procedure is to first make universal quantifiers explicit before negating the formula and building the tableau.
The following two variants are also correct.
Applying to the whole tableau a substitution to the free variables of the tableau is a correct rule, provided that this substitution is free for the formula representing the tableau. In other words, applying such a substitution leads to a tableau whose formula is still a consequence of the input set. Using most general unifiers automatically ensures that the condition of freeness for the tableau is met.
While in general every variable has to be replaced with the same term in the whole tableau, there are some special cases in which this is not necessary.
Tableaux with unification can be proved complete: if a set of formulae is unsatisfiable, it has a tableau-with-unification proof. However, actually finding such a proof may be a difficult problem. Contrary to the case without unification, applying a substitution can modify the existing part of a tableau; while applying a substitution closes at least a branch, it may make other branches impossible to close (even if the set is unsatisfiable). A solution to this problem is delayed instantiation: no substitution is applied until one that closes all branches at the same time is found. With this variant, a proof for an unsatisfiable set can always be found by a suitable policy of application of the other rules. This method however requires the whole tableau to be kept in memory: the general method closes branches which can then be discarded, while this variant does not close any branch until the end. The problem that some tableaux that can be generated are impossible to close even if the set is unsatisfiable is common to other sets of tableau expansion rules: even if some specific sequences of application of these rules allow constructing a closed tableau (if the set is unsatisfiable), some other sequences lead to tableaux that cannot be closed. General solutions for these cases are outlined in the "Searching for a tableau" section.
Proof procedures
A tableau calculus is simply a set of rules that tells how a tableau can be modified. A proof procedure is a method for actually finding a proof (if one exists). In other words, a tableau calculus is a set of rules, while a proof procedure is a policy of application of these rules. Even if a calculus is complete, not every possible choice of application of rules leads to a proof of an unsatisfiable set. For example, a set may be unsatisfiable while both tableaux with unification and tableaux without unification allow the rule for universal quantifiers to be applied repeatedly to one of its formulae without making progress, whereas simply applying the rule for disjunction to another formula would directly lead to closure.
For proof procedures, a definition of completeness has been given: a proof procedure is strongly complete if it allows finding a closed tableau for any given unsatisfiable set of formulae. Proof confluence of the underlying calculus is relevant to completeness: proof confluence is the guarantee that a closed tableau can always be generated from an arbitrary partially constructed tableau (if the set is unsatisfiable). Without proof confluence, the application of a 'wrong' rule may result in the impossibility of making the tableau complete by applying other rules.
Propositional tableaux and tableaux without unification have strongly complete proof procedures. In particular, a complete proof procedure is that of applying the rules in a fair way. This is because the only way such calculi cannot generate a closed tableau from an unsatisfiable set is by not applying some applicable rules. For propositional tableaux, fairness amounts to expanding every formula in every branch. More precisely, for every formula and every branch the formula is in, the rule having the formula as a precondition has been used to expand the branch. A fair proof procedure for propositional tableaux is strongly complete.
For first-order tableaux without unification, the condition of fairness is similar, with the exception that the rule for universal quantifiers might require more than one application. Fairness amounts to expanding every universal quantifier infinitely often. In other words, a fair policy of application of rules cannot keep applying other rules without expanding every universal quantifier in every branch that is still open once in a while.
A search tree in the space of tableau for {x.P(x),P(c)Q(c),y.Q(c)}. For simplicity, the formulae of the set have been omitted from all tableau in the figure and a rectangle used in their place. A closed tableau is in the bold box; the other branches could be still expanded.
Since one such branch can be infinite, this tree has to be visited breadth-first rather than depth-first. This requires a large amount of space, as the breadth of the tree can grow exponentially. A method that may visit some nodes more than once but works in polynomial space is to visit in a depth-first manner with iterative deepening: one first visits the tree up to a certain depth, then increases the depth and performs the visit again. This particular procedure uses the depth (which is also the number of tableau rules that have been applied) for deciding when to stop at each step. Various other parameters (such as the size of the tableau labeling a node) have been used instead.
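The iterative-deepening strategy can be written generically over any successor relation on tableaux. The sketch below is an illustration only: the expansions function (which enumerates the tableaux obtainable by one rule application) and the is_closed test stand in for a tableau-specific implementation and are assumptions, not part of the article.

def depth_limited(state, is_closed, expansions, limit):
    """Depth-first search for a closed tableau, using at most `limit` rule applications."""
    if is_closed(state):
        return state
    if limit == 0:
        return None
    for next_state in expansions(state):     # children: tableaux obtained by one rule application
        found = depth_limited(next_state, is_closed, expansions, limit - 1)
        if found is not None:
            return found
    return None

def iterative_deepening(initial, is_closed, expansions, max_depth=50):
    """Re-run depth-first search with increasing depth bounds (polynomial space)."""
    for depth in range(max_depth + 1):
        found = depth_limited(initial, is_closed, expansions, depth)
        if found is not None:
            return found                     # a closed tableau reachable within `depth` steps
    return None                              # none found up to max_depth; the set may be satisfiable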
Reducing search
The size of the search tree depends on the number of (children) tableaux that can be generated from a given (parent) one. Reducing the number of such tableaux therefore reduces the required search.
A way of reducing this number is to disallow the generation of some tableaux based on their internal structure. An example is the condition of regularity: if a branch contains a literal, using an expansion rule that generates the same literal is useless, because the branch containing two copies of the literal would have the same set of formulae as the original one. This expansion can be disallowed because if a closed tableau exists, it can be found without it. This restriction is structural because it can be checked by looking at the structure of the tableau to expand only.
Different methods for reducing search disallow the generation of some tableaux on the grounds that a closed tableau can still be found by expanding the other ones. These restrictions are called global. As an example of a global restriction, one may employ a rule that specifies which of the open branches is to be expanded. As a result, if a tableau has for example two non-closed branches, the rule tells which one is to be expanded, disallowing the expansion of the second one. This restriction reduces the search space because one possible choice is now forbidden; completeness is however not harmed, as the second branch will still be expanded if the first one is eventually closed. As an example, a tableau with two open leaves can typically be closed in two ways: expanding first one leaf and then the other, or vice versa. There is clearly no need to follow both possibilities; one may consider only the case in which the first leaf is expanded first and disregard the case in which it is expanded second. This is a global restriction because what allows neglecting this second expansion is the presence of the other tableau, in which the expansions are performed in the opposite order.
Clause tableaux
When applied to sets of clauses (rather than of arbitrary formulae), tableaux methods allow for a number of efficiency improvements. A first-order clause is a formula ∀x1...∀xn.(L1 ∨ ... ∨ Lm) that does not contain free variables and such that each Li is a literal. The universal quantifiers are often omitted for clarity, so that a clause written L1 ∨ ... ∨ Lm actually means ∀x1...∀xn.(L1 ∨ ... ∨ Lm). Note that, if taken literally, these two formulae are not the same as for satisfiability: rather, the satisfiability of the quantifier-free form is the same as that of its existential closure. That universal quantifiers are implicitly assumed is therefore not part of the definition of first-order satisfiability; it is rather used as an implicit common assumption when dealing with clauses.
The only expansion rules that are applicable to a clause are (∨) and (∀); these two rules can be replaced by their combination without losing completeness. In particular, the following rule corresponds to applying in sequence the rules (∀) and (∨) of the first-order calculus with unification: a clause in the branch is expanded into one sibling leaf per literal of a copy of the clause, where the copy is obtained by replacing every variable with a new one in the clause.
When the set to be checked for satisfiability is only composed of clauses, this rule and the unification rule are sufficient to prove unsatisfiability. In other words, the tableau calculus composed of these two rules is complete. Since the clause expansion rule only generates literals and never new clauses, the clauses to which it can be applied are only clauses of the input set. As a result, the clause expansion rule can be further restricted to the case where the clause is in the input set (again, expanded after replacing every variable with a new one).
Since this rule directly exploits the clauses in the input set, there is no need to initialize the tableau to the chain of the input clauses. The initial tableau can therefore be initialized with the single node labeled true; this label is often omitted as implicit. As a result of this further simplification, every node of the tableau (apart from the root) is labeled with a literal.
A number of optimizations can be used for clause tableaux. These optimizations are aimed at reducing the number of possible tableaux to be explored when searching for a closed tableau, as described in the "Searching for a closed tableau" section above.
Connection tableau
Connection is a condition over tableaux that forbids expanding a branch using clauses that are unrelated to the literals that are already in the branch. Connection can be defined in two ways:
strong connectedness: when expanding a branch, use an input clause only if it contains a literal that can be unified with the negation of the literal in the current leaf;
weak connectedness: allow the use of clauses that contain a literal that unifies with the negation of a literal on the branch.
Both conditions apply only to branches consisting not only of the root. The second definition allows for the use of a clause containing a literal that unifies with the negation of a literal in the branch, while the first further constrains that literal to be in the leaf of the current branch.
If clause expansion is restricted by connectedness (either strong or weak), its application produces a tableau in which a substitution can be applied to one of the new leaves, closing its branch. In particular, this is the leaf containing the literal of the clause that unifies with the negation of a literal in the branch (or the negation of the literal in the parent, in case of strong connection). Both conditions of connectedness lead to a complete first-order calculus: if a set of clauses is unsatisfiable, it has a closed connected (strongly or weakly) tableau. Such a closed tableau can be found by searching in the space of tableaux as explained in the "Searching for a closed tableau" section. During this search, connectedness eliminates some possible choices of expansion, thus reducing search. In other words, while the tableau in a node of the tree can in general be expanded in several different ways, connection may allow only few of them, thus reducing the number of resulting tableaux that need to be further expanded. This can be seen on the following (propositional) example. A tableau made of a chain whose leaf is a single literal can in general be expanded using each of the four input clauses of the set, but connection only allows the expansion that uses the clause containing the negation of that literal. This means that the tree of tableaux has four leaves in general but
only one if connectedness is imposed. This means that connectedness leaves only one tableau to try and expand, instead of the four ones to consider in general. In spite of this reduction of choices, the completeness theorem implies that a closed tableau can be found if the set is unsatisfiable.
The connectedness conditions, when applied to the propositional (clausal) case, make the resulting calculus non-confluent. As an example, a clause set may be unsatisfiable while expanding the root with one of its clauses generates a chain that is not closed and to which no other expansion rule can be applied without violating either strong or weak connectedness. In the case of weak connectedness, confluence holds provided that the clause used for expanding the root is relevant to unsatisfiability, that is, it is contained in a minimally unsatisfiable subset of the set of clauses. Unfortunately, the problem of checking whether a clause meets this condition is itself a hard problem. In spite of non-confluence, a closed tableau can be found using search, as presented in the "Searching for a closed tableau" section above. While search is made necessary, connectedness reduces the possible choices of expansion, thus making search more efficient.
Regular tableaux
A tableau is regular if no literal occurs twice in the same branch. Enforcing this condition allows for a reduction of the possible choices of tableau expansion, as the clauses that would generate a non-regular tableau cannot be expanded. These disallowed expansion steps are however useless. If B is a branch containing a literal l, and C is a clause whose expansion violates regularity, then C contains l. In order to close the tableau, one needs to expand and close, among others, the branch where C has been expanded, that is, the branch where l occurs twice. However, the formulae in this branch are exactly the same as the formulae of B alone. As a result, the same expansion steps that close this branch also close B. This means that expanding C was unnecessary; moreover, if C contained other literals, its expansion generated other leaves that needed to be closed. In the propositional case, the expansions needed to close these leaves are completely useless; in the first-order case, they may only affect the rest of the tableau because of some unifications; these can however be combined to the substitutions used to close the rest of the tableau.
Tableaux for modal logics
In propositional tableaux all formulae refer to the same truth evaluation, but in modal logics the precondition of an expansion rule may hold in one world while the consequence holds in another. Not taking this into account would generate wrong results. For example, a formula such as a ∧ ◇¬a states that a is true in the current world and a is false in a world that is accessible from it. Simply applying the rule for conjunction and the rule expanding ◇ would produce a and ¬a, but these two formulae should not in general generate a contradiction, as they hold in different worlds. Modal tableaux calculi do contain rules of this kind, but include mechanisms to avoid the incorrect interaction of formulae referring to different worlds. Technically, tableaux for modal logics check the satisfiability of a set of formulae: they check whether there exists a model and world such that the formulae in the set are true in that model and world. In the example above, while a states the truth of a in the current world, the formula ◇¬a states the truth of ¬a in some world that is accessible from the current one and which may in general be different from it. Tableaux calculi for modal logic take into account that formulae may refer to different worlds.
This fact has an important consequence: formulae that hold in a world may imply conditions over different successors of that world. Unsatisfiability may then be proved from the subset of formulae referring to a single successor. This holds if a world may have more than one successor, which is true for most modal logics. If this is the case, a formula of the form "possibly A and possibly B" is true if a successor where A holds exists and a successor where B holds exists. In the other direction, if one can show unsatisfiability of A in an arbitrary successor, the formula is proved unsatisfiable without checking for worlds where B holds. At the same time, if one can show unsatisfiability of B, there is no need to check A. As a result, while there are two possible ways to expand such a formula, one of these two ways is always sufficient to prove unsatisfiability if the formula is unsatisfiable. For example, one may expand the tableau by considering an arbitrary world where A holds; if this expansion leads to unsatisfiability, the original formula is unsatisfiable. However, it is also possible that unsatisfiability cannot be proved this way, and that the world where B holds should have been considered instead. As a result, one can always prove unsatisfiability by expanding either only the first subformula or only the second; however, if the wrong choice is made the resulting tableau may not be closed. Expanding either subformula leads to tableau calculi that are complete but not proof-confluent. Searching as described in the "Searching for a closed tableau" section may therefore be necessary.
Depending on whether the precondition and consequence of a tableau expansion rule refer to the same world or not, the rule is called static or transactional. While rules for propositional connectives are all static, not all rules for modal connectives are transactional: for example, in every modal logic including axiom T, it holds that "necessarily A" implies A in the same world. As a result, the relative (modal) tableau expansion rule is static, as both its precondition and consequence refer to the same world.
Formula-deleting tableau
A way of preventing formulae that refer to different worlds from interacting in the wrong way is to make sure that all formulae of a branch refer to the same world. This condition is initially true, as all formulae in the set to be checked for consistency are assumed to refer to the same world. When expanding a branch, two situations are possible: either the new formulae refer to the same world as the other ones in the branch, or not. In the first case, the rule is applied normally. In the second case, all formulae of the branch that do not also hold in the new world are deleted from the branch, and possibly added to all other branches that are still relative to the old world.
As an example, in S5 every formula "necessarily A" that is true in a world is also true in all accessible worlds (that is, in all accessible worlds both A and "necessarily A" are true). Therefore, when applying a rule whose consequence holds in a different world, one deletes all formulae from the branch but can keep all such boxed formulae and their consequences, as these hold in the new world as well. In order to retain completeness, the deleted formulae are then added to all other branches that still refer to the old world.
World-labeled tableau
A different mechanism for ensuring the correct interaction between formulae referring to different worlds is to switch from formulae to labeled formulae: instead of writing A, one writes w:A to make it explicit that A holds in world w. All propositional expansion rules are adapted to this variant by stating that they all refer to formulae with the same world label. For example, w:A∧B generates two nodes labeled with w:A and w:B; a branch is closed only if it contains two opposite literals of the same world, like w:a and w:¬a; no closure is generated if the two world labels are different, like in w:a and w':¬a.
The modal expansion rules may have a consequence that refers to a different world. For example, the rule for w:¬□A introduces the labeled formula w':¬A, where w' is a new world accessible from w.
The various calculi use different methods for keeping track of the accessibility of the worlds used as labels. Some include pseudo-formulae like wRw' to denote that w' is accessible from w. Others use sequences of integers as world labels, this notation implicitly representing the accessibility relation (for example, the world labeled 1.4.2.3 is accessible from the one labeled 1.4.2).
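The closure condition for world-labeled branches can be stated operationally. The sketch below is not from the original article; it assumes an ad-hoc encoding (labels as tuples of integers in the sequence style just described, atoms as strings, negation as a tagged tuple) and only checks whether a branch is closed.

```python
def branch_closed(branch):
    """Return True if some world carries both a literal and its negation.

    Each branch entry is (label, formula): label is a tuple of integers
    naming a world (e.g. (1, 4, 2, 3) is accessible from (1, 4, 2));
    a formula is either an atom such as 'p' or a negation ('not', 'p').
    Opposite literals in different worlds do not close the branch.
    """
    positive = {(label, f) for label, f in branch if isinstance(f, str)}
    return any((label, f[1]) in positive
               for label, f in branch
               if isinstance(f, tuple) and f[0] == 'not')

# p true in world (1,) but false in the accessible world (1, 1): no clash
assert not branch_closed([((1,), 'p'), ((1, 1), ('not', 'p'))])
# p true and false in the same world (1,): the branch closes
assert branch_closed([((1,), 'p'), ((1,), ('not', 'p'))])
```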
Set-labeling tableaux
The problem of interaction between formulae holding in different worlds can be overcome by using set-labeling tableaux. These are trees whose nodes are labeled with sets of formulae; the expansion rules tell how to attach new nodes to a leaf, based only on the label of the leaf (and not on the label of other nodes in the branch). Tableaux for modal logics are used to verify the satisfiability of a set of modal formulae in a given modal logic. Given a set of formulae , they check the existence of a model and a world such that . The expansion rules depend on the particular modal logic used. A tableau system for the basic modal logic K can be obtained by adding to the propositional tableau rules the following one:
Intuitively, the precondition of this rule expresses the truth of each boxed formula at every accessible world, together with the falsity of one boxed formula at some accessible world. Its consequence is the set of formulae that must therefore all be true at one of the worlds witnessing that falsity.
More technically, modal tableaux methods check the existence of a model M and a world w that make a set of formulae true. If the preconditions of the rule are assumed satisfied by M and w, there must be a world w' accessible from w in which the formula under the negated box is false; every formula under a box also holds in w'. The rule therefore amounts to deriving a set of formulae that must be satisfied in such a w'. While the preconditions are assumed satisfied by M and w, the consequences are assumed satisfied by M and w': same model but possibly different worlds. Set-labeled tableaux do not explicitly keep track of the world where each formula is assumed true: two nodes may or may not refer to the same world. However, the formulae labeling any given node are assumed true at the same world.
As a result of the possibly different worlds where formulae are assumed true, a formula in a node is not automatically valid in all its descendants, as every application of the modal rule corresponds to a move from one world to another. This condition is automatically captured by set-labeling tableaux, as expansion rules are based only on the leaf where they are applied and not on its ancestors.
Remarkably, the rule does not directly extend to multiple negated boxed formulae occurring in the same node: while there exists an accessible world where the first is false and one where the second is false, these two worlds are not necessarily the same. Differently from the propositional rules, the rule states conditions over all of its preconditions: it can only be applied to a node whose label consists entirely of boxed formulae and a single negated boxed formula. For example, it cannot be applied to a node whose label also contains an unboxed formula: even if the rest of the label is inconsistent and this could easily be proved by applying the rule, the rule cannot be applied because of that extra formula, which may not even be relevant to the inconsistency.
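As a rough illustration of how this works on set-labeled nodes, the following sketch (not from the original article; the formula encoding is made up) expands a node of the required shape and deliberately refuses nodes containing other kinds of formulae, which is exactly the situation the thinning rule discussed next is meant to handle.

```python
def k_rule(node):
    """One expansion step of the standard rule for modal logic K on a
    set-labeled node (a sketch with an ad-hoc encoding).

    Formulas: ('box', f) stands for "necessarily f", ('notbox', f) for
    "not necessarily f"; anything else counts as a non-modal formula.
    The rule only fires if the node consists of boxed formulae plus
    exactly one negated boxed formula; the child node then contains the
    negation of that formula together with all the boxed bodies.
    """
    boxed = [f[1] for f in node if isinstance(f, tuple) and f[0] == 'box']
    negated = [f[1] for f in node if isinstance(f, tuple) and f[0] == 'notbox']
    if len(negated) != 1 or len(boxed) + len(negated) != len(node):
        return None          # rule not applicable: thinning would be needed first
    return frozenset([('not', negated[0])] + boxed)

# {box p, box q, not-box r} expands to the child node {p, q, not r}
print(k_rule({('box', 'p'), ('box', 'q'), ('notbox', 'r')}))
# an extra unboxed formula blocks the rule, even if it is irrelevant
print(k_rule({'a', ('box', 'p'), ('notbox', 'r')}))   # -> None
```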
In order to make the rule applicable to such nodes, a thinning rule is added: it allows an arbitrary subset of the formulae labeling a node to be discarded. The addition of this rule makes the resulting calculus non-confluent: a tableau for an inconsistent set may be impossible to close, even if a closed tableau for the same set exists. The thinning rule is non-deterministic: the set of formulae to be removed (or to be kept) can be chosen arbitrarily; this
creates the problem of choosing a set of formulae to discard that is not so large that it makes the resulting set satisfiable and not so small that it makes the necessary expansion rules inapplicable. Having a large number of possible choices makes the problem of searching for a closed tableau harder. This non-determinism can be avoided by restricting the usage of thinning so that it is only applied right before a modal expansion rule, and so that it only removes the formulae that make that rule inapplicable. This condition can also be formulated by merging the two rules into a single one. The resulting rule produces the same result as the old one, but implicitly discards all formulae that made the old rule inapplicable. This mechanism for removing formulae has been proved to preserve completeness for many modal logics.
Axiom T expresses reflexivity of the accessibility relation: every world is accessible from itself. The corresponding tableau expansion rule is:
This rule relates conditions over the same world: if a boxed formula is true in a world, then the formula itself also holds in that same world. The rule is static, not transactional, as both its precondition and consequent refer to the same world. Note that the rule copies the boxed formula from the precondition to the consequent, in spite of this formula having been "used" to generate the new formula; this is correct, as the considered world is the same, so the boxed formula still holds there. This "copying" is necessary in some cases: for some inconsistent sets, the only applicable rules can be used in just one order, and that order becomes blocked if the boxed formula is not kept after the rule is applied.
Auxiliary tableaux
A different method for dealing with formulae holding in alternate worlds is to start a different tableau for each new world that is introduced in the tableau. For example, a negated boxed formula implies that some formula is false in an accessible world, so one starts a new tableau rooted by the negation of that formula. This new tableau is attached to the node of the original tableau where the expansion rule has been applied; a closure of this auxiliary tableau immediately generates a closure of all branches containing that node, regardless of whether the same node is associated with other auxiliary tableaux. The expansion rules for the auxiliary tableaux are the same as for the original one; therefore, an auxiliary tableau can in turn have other (sub-)auxiliary tableaux.
Global assumptions
The above modal tableaux establish the consistency of a set of formulae, and can be used for solving the local logical consequence problem. This is the problem of telling whether, for each model, whenever an assumption formula is true in a world, a given conclusion formula is also true in that same world. This is the same as checking whether the conclusion is true in a world of a model under the assumption that the assumption formula is also true in the same world of the same model.
A related problem is the global consequence problem, where the assumption is that a formula (or set of formulae) is true in all possible worlds of the model. The problem is that of checking whether, in all models in which the assumption is true in all worlds, the conclusion is also true in all worlds.
Local and global assumptions differ on models where the assumed formula is true in some worlds but not in others. As an example, a formula entails its necessitation globally but not locally. Local entailment fails in a model consisting of two worlds that make the formula true and false, respectively, where the second world is accessible from the first: in the first world the assumption is true, but its necessitation is false. This counterexample works because the formula can be assumed true in one world and false in another. If, however, the same assumption is considered global, its falsity is not allowed in any world of the model.
These two problems can be combined, so that one can check whether a formula is a local consequence of another under a given global assumption. Tableaux calculi can deal with a global assumption by a rule allowing its addition to every node, regardless of the world it refers to.
Notations
The following conventions are sometimes used.
Uniform notation
When writing tableaux expansion rules, formulae are often denoted using a uniform notation, so that, for example, every formula of conjunctive type is written α and its two components α1 and α2, regardless of which connective actually occurs in it. The following table provides the notation for formulae in propositional, first-order, and modal logic.
Each label in the first column stands for any of the formulae listed in the other columns. An overlined formula indicates the negation of whatever formula appears in its place. Since every label indicates many equivalent formulae, this notation allows writing a single rule for all of these equivalent formulae. For example, the conjunction expansion rule is formulated as: from a formula of type α, both components α1 and α2 are added to the branch.
Signed formulae
A formula in a tableau is assumed true. Signed tableaux allow stating that a formula is false. This is generally achieved by adding a label to each formula, where the label T indicates formulae assumed true and F those assumed false. A different but equivalent notation is to write formulae that are assumed true at the left of the node and formulae assumed false at its right.
References
Beth, Evert W., 1955. "Semantic entailment and formal derivability", Mededelingen van de Koninklijke Nederlandse Akademie van Wetenschappen, Afdeling Letterkunde, N.R. Vol 18, no 13, 1955, pp 309–42. Reprinted in Jaakko Hintikka (ed.) The Philosophy of Mathematics, Oxford University Press, 1969.
Bostock, David, 1997. Intermediate Logic. Oxford Univ. Press.
M. D'Agostino, D. Gabbay, R. Haehnle, J. Posegga (eds.), Handbook of Tableau Methods, Kluwer, 1999.
Girle, Rod, 2000. Modal Logics and Philosophy. Teddington UK: Acumen.
Goré, Rajeev (1999) "Tableau Methods for Modal and Temporal Logics" in D'Agostino, M., Dov Gabbay, R. Haehnle, and J. Posegga, eds., Handbook of Tableau Methods. Kluwer: 297–396.
Richard Jeffrey, 1990 (1967). Formal Logic: Its Scope and Limits, 3rd ed. McGraw Hill.
Raymond Smullyan, 1995 (1968). First-Order Logic. Dover Publications.
Melvin Fitting (1996). First-order logic and automated theorem proving (2nd ed.). Springer-Verlag.
Reiner Hähnle (2001). Tableaux and Related Methods. Handbook of Automated Reasoning.
Reinhold Letz, Gernot Stenz (2001). Model Elimination and Connection Tableau Procedures. Handbook of Automated Reasoning.
Zeman, J. J. (1973) Modal Logic. [1] Reidel.
External links
TABLEAUX [2]: an annual international conference on automated reasoning with analytic tableaux and related methods. JAR [3]: Journal of Automated Reasoning. lolo [4]: a simple theorem prover written in Haskell that uses analytic tableaux for propositional logic. tableaux.cgi [5]: an interactive prover for propositional and first-order logic using tableaux.
References
[1] http://www.clas.ufl.edu/users/jzeman/modallogic/
[2] http://i12www.ira.uka.de/TABLEAUX/
[3] http://www-unix.mcs.anl.gov/JAR/
[4] http://lolo.svn.sourceforge.net/viewvc/lolo/
[5] http://www.ncc.up.pt/~pbv/cgi/tableaux.cgi
Boolean satisfiability problem
There are several special cases of the Boolean satisfiability problem in which the formulae are required to be conjunctions of clauses (i.e. formulae in conjunctive normal form). Determining the satisfiability of a formula in conjunctive normal form where each clause is limited to at most three literals is NP-complete; this problem is called "3SAT", "3CNFSAT", or "3-satisfiability". Determining the satisfiability of a formula in which each clause is limited to at most two literals is NL-complete; this problem is called "2SAT". Determining the satisfiability of a formula in which each clause is a Horn clause (i.e. it contains at most one positive literal) is P-complete; this problem is called Horn-satisfiability.
The Cook–Levin theorem states that the Boolean satisfiability problem is NP-complete, and in fact, this was the first decision problem proved to be NP-complete. However, beyond this theoretical significance, efficient and scalable algorithms for SAT that were developed over the last decade have contributed to dramatic advances in our ability to automatically solve problem instances involving tens of thousands of variables and millions of constraints. Examples of such problems in electronic design automation (EDA) include formal equivalence checking, model checking, formal verification of pipelined microprocessors, automatic test pattern generation, routing of FPGAs, and so on. A SAT-solving engine is now considered to be an essential component in the EDA toolbox.
2-satisfiability
SAT is also easier if the number of literals in a clause is limited to 2, in which case the problem is called 2SAT. This problem can also be solved in polynomial time, and in fact is complete for the class NL. Similarly, if we limit the number of literals per clause to 2 and change the AND operations to XOR operations, the result is exclusive-or 2-satisfiability, a problem complete for SL = L. One of the most important restrictions of SAT is HORNSAT, where the formula is a conjunction of Horn clauses. This problem is solved by the polynomial-time Horn-satisfiability algorithm, and is in fact P-complete. It can be seen as P's version of the Boolean satisfiability problem. Provided that the complexity classes P and NP are not equal, none of these restrictions are NP-complete, unlike SAT. The assumption that P and NP are not equal is currently not proven.
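To make the tractability of 2SAT concrete, here is a small sketch (not from the article; the names and clause encoding are made up). Each clause (a ∨ b) contributes the implications ¬a → b and ¬b → a; by the Aspvall–Plass–Tarjan characterization the instance is unsatisfiable exactly when some variable and its negation are mutually reachable, i.e. lie in the same strongly connected component. Real 2SAT solvers compute components in linear time; plain reachability is used here only to keep the sketch short.

```python
def solve_2sat(n_vars, clauses):
    """Decide a 2SAT instance given as a list of 2-literal clauses.

    Literals are non-zero integers in DIMACS style: k means variable k,
    -k means its negation.  Returns True iff the formula is satisfiable.
    """
    def idx(lit):                         # map a literal to a graph node
        return 2 * (abs(lit) - 1) + (0 if lit > 0 else 1)

    graph = {i: [] for i in range(2 * n_vars)}
    for a, b in clauses:                  # (a or b)  ==  (!a -> b) and (!b -> a)
        graph[idx(-a)].append(idx(b))
        graph[idx(-b)].append(idx(a))

    def reachable(src, dst):              # iterative depth-first reachability
        seen, stack = {src}, [src]
        while stack:
            u = stack.pop()
            if u == dst:
                return True
            for v in graph[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        return False

    return not any(reachable(idx(v), idx(-v)) and reachable(idx(-v), idx(v))
                   for v in range(1, n_vars + 1))

# (x1 or x2) and (not x1 or x2) and (x1 or not x2) is satisfiable (x1 = x2 = True)
print(solve_2sat(2, [(1, 2), (-1, 2), (1, -2)]))              # True
# adding (not x1 or not x2) makes it unsatisfiable
print(solve_2sat(2, [(1, 2), (-1, 2), (1, -2), (-1, -2)]))    # False
```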
3-satisfiability
3-satisfiability is a special case of k-satisfiability (k-SAT) or simply satisfiability (SAT), when each clause contains exactly k = 3 literals. It was one of Karp's 21 NP-complete problems. Here is an example, where ¬ indicates negation:
E has two clauses (denoted by parentheses), four variables (x1, x2, x3, x4), and k=3 (three literals per clause). To solve this instance of the decision problem we must determine whether there is a truth value (TRUE or FALSE) we can assign to each of the variables (x1 through x4) such that the entire expression is TRUE. In this instance, there is such an assignment (x1 = TRUE, x2 = TRUE, x3 = TRUE, x4 = TRUE), so the answer to this instance is YES. This is one of many possible assignments, with, for instance, any set of assignments including x1 = TRUE being sufficient. If there were no such assignment(s), the answer would be NO.
3-SAT is NP-complete and it is used as a starting point for proving that other problems are also NP-hard. This is done by polynomial-time reduction from 3-SAT to the other problem. An example of a problem where this method has been used is the Clique problem. 3-SAT can be further restricted to One-in-three 3SAT, where we ask if exactly one of the literals in each clause is true, rather than at least one. This restriction remains NP-complete.
There is a simple randomized algorithm due to Schöning (1999) that runs in time (4/3)^n, where n is the number of variables of the 3-SAT formula, and succeeds with high probability in correctly deciding 3-SAT. The exponential time hypothesis asserts that no algorithm can solve 3-SAT in time 2^o(n).
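A toy version of this random-walk procedure is sketched below (not from the article). The instance E shown is only an illustrative 3-SAT formula matching the description above, since the original one was lost in extraction; with the number of restarts grown appropriately, repeating the inner walk gives the (4/3)^n-style success guarantee, whereas here the counts are fixed small constants.

```python
import random

def schoening_3sat(clauses, n_vars, tries=200):
    """Randomized 3-SAT search in the spirit of Schoening (1999).

    clauses: list of 3-literal tuples of non-zero ints (DIMACS style).
    Each try starts from a uniformly random assignment and performs
    3*n_vars steps, each flipping one variable of a currently falsified
    clause.  Returns a satisfying assignment, or None if every try failed
    (which only suggests, but does not prove, unsatisfiability).
    """
    def falsified(assign):
        return [c for c in clauses
                if not any((lit > 0) == assign[abs(lit)] for lit in c)]

    for _ in range(tries):
        assign = {v: random.random() < 0.5 for v in range(1, n_vars + 1)}
        for _ in range(3 * n_vars):
            bad = falsified(assign)
            if not bad:
                return assign
            lit = random.choice(random.choice(bad))   # falsified clause, then a literal in it
            assign[abs(lit)] = not assign[abs(lit)]   # flip that variable
        if not falsified(assign):
            return assign
    return None

# Illustrative instance E with two clauses over x1..x4; any assignment with
# x1 = True satisfies it, matching the discussion above.
E = [(1, -2, 3), (1, -3, 4)]
print(schoening_3sat(E, 4))
```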
Horn-satisfiability
A clause is Horn if it contains at most one positive literal. Such clauses are of interest because they are able to express implication of one variable from a set of other variables. Indeed, one such clause ¬x1 ∨ … ∨ ¬xn ∨ y can be rewritten as x1 ∧ … ∧ xn → y; that is, if x1, …, xn are all true, then y needs to be true as well. The problem of deciding whether a set of Horn clauses is satisfiable is in P. This problem can indeed be solved by a single run of unit propagation, which produces the single minimal (w.r.t. the set of literals assigned to true) model of the set of Horn clauses. A generalization of the class of Horn formulae is that of renamable-Horn formulae, which is the set of formulae that can be placed in Horn form by replacing some variables with their respective negation. Checking the existence of such a replacement can be done in linear time; therefore, the satisfiability of such formulae is in P as it can be solved by first performing this replacement and then checking the satisfiability of the resulting Horn formula.
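The minimal-model computation can be sketched as follows (not from the article; the clause encoding is invented for the example): starting from the all-false assignment, a variable is set to true only when some clause forces it, and a goal clause with no positive literal whose body becomes true signals unsatisfiability.

```python
def horn_sat(clauses):
    """Decide a set of Horn clauses by unit propagation.

    Each clause is a pair (head, body): body is a set of variables that
    appear negated, head is the single positive variable or None.  The
    clause encodes body -> head (with head None it is a goal clause).
    Returns the minimal set of true variables, or None if unsatisfiable.
    """
    true_vars, changed = set(), True
    while changed:
        changed = False
        for head, body in clauses:
            if body <= true_vars:            # the body is already forced true
                if head is None:
                    return None              # a goal clause is violated
                if head not in true_vars:
                    true_vars.add(head)      # the head is forced true as well
                    changed = True
    return true_vars

# x,  x -> y,  (x and y) -> z  is satisfiable with minimal model {x, y, z};
# adding the goal clause "not (y and z)" makes the set unsatisfiable.
print(horn_sat([('x', set()), ('y', {'x'}), ('z', {'x', 'y'})]))
print(horn_sat([('x', set()), ('y', {'x'}), ('z', {'x', 'y'}), (None, {'y', 'z'})]))
```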
XOR-satisfiability
Another special case is the class of problems where each clause contains only exclusive or (XOR) operators. Because the exclusive or operation is equivalent to addition in the Galois field of size 2 (see also modular arithmetic), the clauses can be viewed as a system of linear equations, and corresponding methods such as Gaussian elimination can be used to find a solution.
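A compact sketch of that reduction (not from the article; the encoding of equations is made up) performs Gaussian elimination over GF(2), with XOR playing the role of addition:

```python
def xor_sat(equations, n_vars):
    """Solve an XOR-SAT instance by Gaussian elimination over GF(2).

    equations: list of (coeffs, rhs) with coeffs a list of n_vars bits;
    (coeffs, rhs) encodes  x_i1 xor ... xor x_ik = rhs  where the i_j are
    the positions holding a 1 bit.  Returns one solution (free variables
    set to 0) as a list of bits, or None if the system is inconsistent.
    """
    rows = [(coeffs[:], rhs) for coeffs, rhs in equations]
    pivot_row_of, rank = {}, 0
    for col in range(n_vars):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][0][col]), None)
        if pivot is None:
            continue                                   # no pivot: free variable
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):                     # eliminate col everywhere else
            if i != rank and rows[i][0][col]:
                merged = [a ^ b for a, b in zip(rows[i][0], rows[rank][0])]
                rows[i] = (merged, rows[i][1] ^ rows[rank][1])
        pivot_row_of[col] = rank
        rank += 1
    if any(not any(coeffs) and rhs for coeffs, rhs in rows):
        return None                                    # a row reads 0 = 1: inconsistent
    solution = [0] * n_vars
    for col, row in pivot_row_of.items():
        solution[col] = rows[row][1]                   # free variables stay 0
    return solution

# (x1 xor x2 = 1), (x2 xor x3 = 1), (x1 xor x3 = 0): prints [0, 1, 0]
print(xor_sat([([1, 1, 0], 1), ([0, 1, 1], 1), ([1, 0, 1], 0)], 3))
```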
Runtime behavior
As mentioned briefly above, though the problem is NP-complete, many practical instances can be solved much more quickly. Many practical problems are actually "easy", so the SAT solver can easily find a solution, or prove that none exists, relatively quickly, even though the instance has thousands of variables and tens of thousands of constraints. Other much smaller problems exhibit run-times that are exponential in the problem size, and rapidly become impractical. Unfortunately, there is no reliable way to tell the difficulty of the problem without trying it. Therefore, almost all SAT solvers include time-outs, so they will terminate even if they cannot find a solution. Finally, different SAT solvers will find different instances easy or hard, and some excel at proving unsatisfiability, and others at finding solutions. All of these behaviors can be seen in the SAT solving contests.[1]
Extensions of SAT
An extension that has gained significant popularity since 2003 is satisfiability modulo theories (SMT), which can enrich CNF formulas with linear constraints, arrays, all-different constraints, uninterpreted functions, etc. Such extensions typically remain NP-complete, but very efficient solvers are now available that can handle many such kinds of constraints.
The satisfiability problem becomes more difficult (PSPACE-complete) if we extend our logic to include second-order Booleans, allowing quantifiers such as "for all" and "there exists" that bind the Boolean variables. An example of such an expression would be ∀x ∃y ∃z ((x ∨ y) ∧ (¬x ∨ z)). SAT itself uses only existential quantifiers; if instead only universal quantifiers are allowed, the problem becomes the co-NP-complete tautology problem. If we allow both, the problem is called the quantified Boolean formula problem (QBF), which can be shown to be PSPACE-complete. It is widely believed that PSPACE-complete problems are strictly harder than any problem in NP, although this has not yet been proved.
A number of variants deal with the number of variable assignments making the formula true. Ordinary SAT asks if there is at least one such assignment. MAJSAT, which asks if the majority of all assignments make the formula true, is complete for PP, a probabilistic class. The problem of counting how many variable assignments satisfy a formula, not a decision problem, is in #P. UNIQUE-SAT or USAT or Unambiguous SAT is the problem of determining whether a formula known to have either zero or one satisfying assignments has zero or has one. Although this problem seems easier, it has been shown that if there is a practical (randomized polynomial-time) algorithm to solve this problem, then all problems in NP can be solved just as easily.
The maximum satisfiability problem, an FNP generalization of SAT, asks for the maximum number of clauses which can be satisfied by any assignment. It has efficient approximation algorithms, but is NP-hard to solve exactly. Worse still, it is APX-complete, meaning there is no polynomial-time approximation scheme (PTAS) for this problem unless P=NP.
Self-reducibility
An algorithm which correctly answers whether an instance of SAT is solvable can be used to find a satisfying assignment. First, the question is asked of the formula itself. If the answer is "no", the formula is unsatisfiable. Otherwise, the question is asked of the formula with its first variable assumed to be 0. If the answer is "no", it is assumed that the variable is 1; otherwise it is fixed to 0. Values of the other variables are found subsequently in the same way. This property is used in several theorems in complexity theory; for example, it appears in the proof that NP ⊆ P/poly implies a collapse of the polynomial hierarchy to its second level (the Karp–Lipton theorem).
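The sketch below (names invented; `is_satisfiable` stands for any black-box SAT decision procedure) carries out exactly this variable-by-variable fixing, using added unit clauses in place of literal substitution:

```python
def extract_assignment(clauses, n_vars, is_satisfiable):
    """Turn a SAT *decision* oracle into a search procedure (self-reducibility).

    clauses are CNF clauses as tuples of DIMACS-style literals, and
    is_satisfiable(clauses) is an assumed black-box decision procedure.
    Variable v is tentatively forced to False with the unit clause (-v);
    if that makes the formula unsatisfiable, v must be True instead.
    """
    if not is_satisfiable(clauses):
        return None                        # no satisfying assignment exists
    assignment = {}
    for v in range(1, n_vars + 1):
        if is_satisfiable(clauses + [(-v,)]):
            clauses = clauses + [(-v,)]
            assignment[v] = False
        else:
            clauses = clauses + [(v,)]
            assignment[v] = True
    return assignment
```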
Notes
[1] "The international SAT Competitions web page" (http://www.satcompetition.org/). Retrieved 2007-11-15.
[2] http://minisat.se/
[3] http://www.satcompetition.org/
[4] http://www.st.ewi.tudelft.nl/sat/march_dl.php
References
References are ordered by date of publication: Davis, M.; Putnam, H. (1960). "A Computing Procedure for Quantification Theory". Journal of the ACM 7: 201. doi:10.1145/321033.321034. Davis, M.; Logemann, G.; Loveland, D. (1962). "A machine program for theorem-proving". Communications of the ACM 5: 394397. doi:10.1145/368273.368557. Cook, S. A. (1971). "The complexity of theorem-proving procedures". Proceedings of the 3rd Annual ACM Symposium on Theory of Computing: 151158. doi:10.1145/800157.805047.
Michael R. Garey and David S. Johnson (1979). Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman. ISBN 0-7167-1045-5. A9.1: LO1–LO7, pp. 259–260.
Marques-Silva, J. P.; Sakallah, K. A. (1999). "GRASP: a search algorithm for propositional satisfiability". IEEE Transactions on Computers 48: 506. doi:10.1109/12.769433.
Marques-Silva, J.; Glass, T. (1999). Combinational equivalence checking using satisfiability and recursive learning. pp. 145. doi:10.1109/DATE.1999.761110.
R. E. Bryant, S. M. German, and M. N. Velev, Microprocessor Verification Using Efficient Decision Procedures for a Logic of Equality with Uninterpreted Functions (http://portal.acm.org/citation.cfm?id=709275), in Analytic Tableaux and Related Methods, pp. 1–13, 1999.
Schöning, U. (1999). A probabilistic algorithm for k-SAT and constraint satisfaction problems. pp. 410. doi:10.1109/SFFCS.1999.814612.
Moskewicz, M. W.; Madigan, C. F.; Zhao, Y.; Zhang, L.; Malik, S. (2001). Chaff. pp. 530. doi:10.1145/378239.379017.
Clarke, E.; Biere, A.; Raimi, R.; Zhu, Y. (2001). Formal Methods in System Design 19: 7. doi:10.1023/A:1011276507260.
Gi-Joon Nam; Sakallah, K. A.; Rutenbar, R. A. (2002). "A new FPGA detailed routing approach via search-based Boolean satisfiability". IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 21: 674. doi:10.1109/TCAD.2002.1004311.
Giunchiglia, E.; Tacchella, A. (2004). Giunchiglia, Enrico; Tacchella, Armando, eds. 2919. doi:10.1007/b95238.
Babic, D.; Bingham, J.; Hu, A. J. (2006). "B-Cubing: New Possibilities for Efficient SAT-Solving". IEEE Transactions on Computers 55: 1315. doi:10.1109/TC.2006.175.
Rodriguez, C.; Villagra, M.; Baran, B. (2007). Asynchronous team algorithms for Boolean Satisfiability. pp. 66. doi:10.1109/BIMNICS.2007.4610083.
Further reading
Carla P. Gomes, Henry Kautz, Ashish Sabharwal, Bart Selman (2008). "Satisfiability Solvers". In Frank Van Harmelen, Vladimir Lifschitz, Bruce Porter. Handbook of knowledge representation. Foundations of Artificial Intelligence. 3. Elsevier. pp.89134. doi:10.1016/S1574-6526(07)03002-7. ISBN9780444522115.
External links
More information on SAT: SAT and MAX-SAT for the Lay-researcher (http://www.mqasem.net/sat/sat) SAT Applications: WinSAT v2.04 (http://www.mqasem.net/sat/winsat/): A Windows-based SAT application made particularly for researchers. SAT Solvers: Chaff (http://www.princeton.edu/~chaff/) HyperSAT (http://www.domagoj-babic.com/index.php/ResearchProjects/HyperSAT) Spear (http://www.domagoj-babic.com/index.php/ResearchProjects/Spear) The MiniSAT Solver (http://minisat.se/) UBCSAT (http://www.satlib.org/ubcsat/) Sat4j (http://www.sat4j.org/) RSat (http://reasoning.cs.ucla.edu/rsat/home.html)
Fast SAT Solver (http://dudka.cz/fss) - simple but fast implementation of SAT solver based on genetic algorithms
PicoSAT (http://fmv.jku.at/picosat/) CryptoMiniSat (http://www.msoos.org/cryptominisat2) International Conference on Theory and Applications of Satisfiability Testing: SAT 2011 (http://www.lri.fr/SAT2011/) SAT 2010 (http://ie.technion.ac.il/SAT10/) SAT 2009 (http://www.cs.swansea.ac.uk/~csoliver/SAT2009/) SAT 2008 (http://wwwcs.uni-paderborn.de/cs/ag-klbue/en/workshops/sat-08/sat08-main.php) SAT 2007 (http://sat07.ecs.soton.ac.uk/)
Publications: Journal on Satisfiability, Boolean Modeling and Computation (http://jsat.ewi.tudelft.nl) Survey Propagation (http://www.ictp.trieste.it/~zecchina/SP/)
Benchmarks: Forced Satisfiable SAT Benchmarks (http://www.nlsde.buaa.edu.cn/~kexu/benchmarks/benchmarks.htm) IBM Formal Verification SAT Benchmarks (http://www.haifa.il.ibm.com/projects/verification/RB_Homepage/bmcbenchmarks.html) SATLIB (http://www.satlib.org) Software Verification Benchmarks (http://www.cs.ubc.ca/~babic/index_benchmarks.htm) Fadi Aloul SAT Benchmarks (http://www.aloul.net/benchmarks.html)
SAT solving in general: http://www.satlive.org http://www.satisfiability.org
Evaluation of SAT solvers: Yearly evaluation of SAT solvers (http://www.maxsat.udl.cat/) SAT solvers evaluation results for 2008 (http://www.maxsat.udl.cat/08/ms08.pdf)
This article includes material from a column in the ACM SIGDA (http://www.sigda.org) e-newsletter (http://www.sigda.org/newsletter/index.html) by Prof. Karem Sakallah (http://www.eecs.umich.edu/~karem) Original text is available here (http://www.sigda.org/newsletter/2006/eNews_061201.html)
Satisfiability Modulo Theories
Basic terminology
Formally speaking, an SMT instance is a formula in first-order logic, where some function and predicate symbols have additional interpretations, and SMT is the problem of determining whether such a formula is satisfiable. In other words, imagine an instance of the Boolean satisfiability problem (SAT) in which some of the binary variables are replaced by predicates over a suitable set of non-binary variables. A predicate is basically a binary-valued function of non-binary variables. Example predicates include linear inequalities (e.g., 3x + 2y − z ≥ 4) or equalities involving uninterpreted terms and function symbols (e.g., f(x, y) = f(y, x), where f is some unspecified function of two unspecified arguments). These predicates are classified according to the theory they belong to. For instance, linear inequalities over real variables are evaluated using the rules of the theory of linear real arithmetic, whereas predicates involving uninterpreted terms and function symbols are evaluated using the rules of the theory of uninterpreted functions with equality (sometimes referred to as the empty theory). Other theories include the theories of arrays and list structures (useful for modeling and verifying software programs), and the theory of bit vectors (useful in modeling and verifying hardware designs). Subtheories are also possible: for example, difference logic is a sub-theory of linear arithmetic in which each inequality is restricted to have the form x − y ≤ c for variables x and y and constant c. Most SMT solvers support only quantifier-free fragments of their logics.
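As a concrete illustration (not part of the original article), the fragment below uses the z3 Python bindings, assuming the z3-solver package is available, to pose a tiny SMT instance mixing linear integer arithmetic with an uninterpreted function. The combination is unsatisfiable because x + 2 = y forces x = y − 2, so f(x) and f(y − 2) must be equal.

```python
from z3 import Ints, Function, IntSort, Solver

x, y = Ints('x y')
f = Function('f', IntSort(), IntSort())   # an uninterpreted unary function

s = Solver()
s.add(x + 2 == y)          # atom from linear integer arithmetic
s.add(f(x) != f(y - 2))    # atom from uninterpreted functions with equality
print(s.check())           # prints: unsat
```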
SMT solvers
Early attempts for solving SMT instances involved translating them to Boolean SAT instances (e.g., a 32-bit integer variable would be encoded by 32 bit variables with appropriate weights and word-level operations such as 'plus' would be replaced by lower-level logic operations on the bits) and passing this formula to a Boolean SAT solver. This approach, which is referred to as the eager approach, has its merits: by pre-processing the SMT formula into an equivalent Boolean SAT formula we can use existing Boolean SAT solvers "as-is" and leverage their performance
and capacity improvements over time. On the other hand, the loss of the high-level semantics of the underlying theories means that the Boolean SAT solver has to work a lot harder than necessary to discover "obvious" facts (such as x + y = y + x for integer addition). This observation led to the development of a number of SMT solvers that tightly integrate the Boolean reasoning of a DPLL-style search with theory-specific solvers (T-solvers) that handle conjunctions (ANDs) of predicates from a given theory. This approach is referred to as the lazy approach. Dubbed DPLL(T) [1], this architecture gives the responsibility of Boolean reasoning to the DPLL-based SAT solver which, in turn, interacts with a solver for theory T through a well-defined interface. The theory solver need only worry about checking the feasibility of conjunctions of theory predicates passed on to it from the SAT solver as it explores the Boolean search space of the formula. For this integration to work well, however, the theory solver must be able to participate in propagation and conflict analysis, i.e., it must be able to infer new facts from already established facts, as well as to supply succinct explanations of infeasibility when theory conflicts arise. In other words, the theory solver must be incremental and backtrackable.
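The interaction can be pictured with the following schematic loop (a sketch only, with hypothetical `sat_solve` and `theory_check` interfaces; a real DPLL(T) solver interleaves the two far more tightly, with incremental theory propagation inside the SAT search rather than full restarts):

```python
def lazy_smt(cnf, theory_vars, sat_solve, theory_check):
    """Naive lazy-approach loop over a Boolean abstraction of an SMT formula.

    cnf          : clauses over propositional variables (positive ints),
                   some of which abstract theory atoms
    theory_vars  : the variables that stand for theory atoms
    sat_solve    : black box; returns a model {var: bool} or None
    theory_check : black box; given truth values for the theory atoms,
                   returns None if they are jointly feasible in the theory,
                   or an infeasible subset of variables (an "explanation")
    """
    cnf = [list(clause) for clause in cnf]
    while True:
        model = sat_solve(cnf)
        if model is None:
            return "unsat"                       # even the abstraction has no model
        conflict = theory_check({v: model[v] for v in theory_vars})
        if conflict is None:
            return "sat"                         # propositional model is theory-consistent
        # learn a clause forbidding this infeasible combination and iterate
        cnf.append([-v if model[v] else v for v in conflict])
```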
Once quantifiers are allowed, for instance when asking whether a formula of the form ∀x ∃y φ(x, y) over an arithmetic theory is satisfiable, such problems become undecidable in general. (It is important to note, however, that the theory of real closed fields, and thus the full first-order theory of the real numbers, is decidable using quantifier elimination. This is due to Alfred Tarski.) The first-order theory of the natural numbers with addition (but not multiplication), called Presburger arithmetic, is also decidable. Since multiplication by constants can be implemented as nested additions, the arithmetic in many computer programs can be expressed using Presburger arithmetic, resulting in decidable formulas.
Examples of SMT solvers addressing Boolean combinations of theory atoms from undecidable arithmetic theories over the reals are ABsolver[2], which employs a classical DPLL(T) architecture with a non-linear optimization packet as (necessarily incomplete) subordinate theory solver, and HySAT-2, building on a unification of DPLL SAT-solving and interval constraint propagation called the iSAT algorithm [3].
SMT solvers
The table below summarizes some of the features of the many available SMT solvers. The column "SMT-LIB" indicates compatibility with the SMT-LIB language; many systems marked 'yes' may support only older versions of SMT-LIB, or offer only partial support for the language. The column "CVC" indicates support for the CVC language. The column "DIMACS" indicates support for the DIMACS [4] format. Projects differ not only in features and performance, but also in the viability of the surrounding community, its ongoing interest in a project, and its ability to contribute documentation, fixes, tests and enhancements. Based on these measures, it appears that the most vibrant, well-organized projects are OpenSMT, STP and CVC4.
The comparison table listed the available SMT solvers (ABsolver [6], Alt-Ergo [7], Barcelogic [8], Beaver [9], Boolector [10], CVC3 [11], CVC4 [12], DPT [13], iSAT [14], MathSAT [15], MiniSmt [16], OpenSMT [17], SatEEn [18], Sonolar [19], Spear [20], STP [21], SWORD [22], UCLID [23], veriT [24], Yices [25], and Z3 [26]) and gave, for each solver, its platform, license (free software, BSD, GPLv3, LGPL, Apache, MIT, or proprietary), implementation language (typically C/C++ or OCaml), support for the SMT-LIB (v1.2 or partial v2.0), CVC and DIMACS input formats, built-in theories (such as the empty theory, linear and difference arithmetic, non-linear arithmetic, bitvectors, arrays, tuples, records, and quantifiers), available APIs, year of SMT-COMP [5] participation (2008–2010), and architectural notes such as "DPLL-based", "SAT-solver based", or "Yices-based".
Notes
[1] Nieuwenhuis, R.; Oliveras, A.; Tinelli, C. (2006), "Solving SAT and SAT Modulo Theories: From an Abstract Davis-Putnam-Logemann-Loveland Procedure to DPLL(T)" (ftp://ftp.cs.uiowa.edu/pub/tinelli/papers/NieOT-JACM-06.pdf), Journal of the ACM, 53, pp. 937–977.
[2] Bauer, A.; Pister, M.; Tautschnig, M. (2007), "Tool-support for the analysis of hybrid systems and models", Proceedings of the 2007 Conference on Design, Automation and Test in Europe (DATE'07), IEEE Computer Society, p. 1, doi:10.1109/DATE.2007.364411
[3] Fränzle, M.; Herde, C.; Ratschan, S.; Schubert, T.; Teige, T. (2007), "Efficient Solving of Large Non-linear Arithmetic Constraint Systems with Complex Boolean Structure" (http://jsat.ewi.tudelft.nl/content/volume1/JSAT1_11_Fraenzle.pdf), JSAT Special Issue on SAT/CP Integration, 1, pp. 209–236
[4] http://www.satcompetition.org/2009/format-benchmarks2009.html
[5] http://www.smtcomp.org/
[6] http://absolver.sourceforge.net/
[7] http://ergo.lri.fr/
[8] http://www.lsi.upc.edu/~oliveras/bclt-main.html
[9] http://uclid.eecs.berkeley.edu/jha/beaver-dist/beaver.html
[10] http://fmv.jku.at/boolector/index.html
[11] http://www.cs.nyu.edu/acsys/cvc3/
[12] http://www.cs.nyu.edu/acsys/cvc4/
[13] http://sourceforge.net/projects/dpt
[14] http://isat.gforge.avacs.org/
[15] http://mathsat4.disi.unitn.it/
[16] http://cl-informatik.uibk.ac.at/software/minismt/
[17] http://verify.inf.unisi.ch/opensmt
[18] http://vlsi.colorado.edu/~hhkim/sateen/
[19] http://www.informatik.uni-bremen.de/~florian/sonolar/
[20] http://www.cs.ubc.ca/~babic/index_spear.htm
[21] http://sites.google.com/site/stpfastprover/
[22] http://www.informatik.uni-bremen.de/agra/eng/sword.php
[23] http://uclid.eecs.berkeley.edu/wiki/index.php/Main_Page
[24] http://www.verit-solver.org/
[25] http://yices.csl.sri.com/
[26] http://research.microsoft.com/projects/z3/
References
Vijay Ganesh (PhD. Thesis 2007), Decision Procedures for Bit-Vectors, Arrays and Integers (http://people.csail.mit.edu/vganesh/Publications_files/vg2007-PhD-STANFORD.pdf), Computer Science Department, Stanford University, Stanford, CA, U.S., Sept 2007
Susmit Jha, Rhishikesh Limaye, and Sanjit A. Seshia. Beaver: Engineering an efficient SMT solver for bit-vector arithmetic. (http://dx.doi.org/10.1007/978-3-642-02658-4_53) In Proceedings of 21st International Conference on Computer-Aided Verification, pp. 668–674, 2009.
R. E. Bryant, S. M. German, and M. N. Velev, "Microprocessor Verification Using Efficient Decision Procedures for a Logic of Equality with Uninterpreted Functions," in Analytic Tableaux and Related Methods, pp. 1–13, 1999.
M. Davis and H. Putnam, A Computing Procedure for Quantification Theory (doi:10.1145/321033.321034), Journal of the Association for Computing Machinery, vol. 7, no., pp. 201–215, 1960.
M. Davis, G. Logemann, and D. Loveland, A Machine Program for Theorem-Proving (doi:10.1145/368273.368557), Communications of the ACM, vol. 5, no. 7, pp. 394–397, 1962.
D. Kroening and O. Strichman, Decision Procedures - an algorithmic point of view (2008), Springer (Theoretical Computer Science series) ISBN 978-3540741046.
G.-J. Nam, K. A. Sakallah, and R. Rutenbar, A New FPGA Detailed Routing Approach via Search-Based Boolean Satisfiability (doi:10.1109/TCAD.2002.1004311), IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 21, no. 6, pp. 674–684, 2002.
SMT-LIB: The Satisfiability Modulo Theories Library (http://smtlib.org/)
SMT-COMP: The Satisfiability Modulo Theories Competition (http://www.smtcomp.org)
Decision procedures - an algorithmic point of view (http://www.decision-procedures.org)
Summer School on SAT/SMT solvers and their applications (http://people.csail.mit.edu/vganesh/summerschool)
This article is adapted from a column in the ACM SIGDA (http://www.sigda.org) e-newsletter (http://www.sigda.org/newsletter/index.html) by Prof. Karem Sakallah (http://www.eecs.umich.edu/~karem). Original text is available here (http://archive.sigda.org/newsletter/2006/061215.txt)
Arithmetic
Presburger arithmetic
Presburger arithmetic is the first-order theory of the natural numbers with addition, named in honor of Mojżesz Presburger, who introduced it in 1929. The signature of Presburger arithmetic contains only the addition operation and equality, omitting the multiplication operation entirely. The axioms include a schema of induction. Presburger arithmetic is much weaker than Peano arithmetic, which includes both addition and multiplication operations. Unlike Peano arithmetic, Presburger arithmetic is a decidable theory. This means it is possible to effectively determine, for any sentence in the language of Presburger arithmetic, whether that sentence is provable from the axioms of Presburger arithmetic. The asymptotic running-time computational complexity of this decision problem is doubly exponential, however, as shown by Fischer and Rabin (1974).
Overview
The language of Presburger arithmetic contains constants 0 and 1 and a binary function +, interpreted as addition. In this language, the axioms of Presburger arithmetic are the universal closures of the following:
1. ¬(0 = x + 1)
2. x + 1 = y + 1 → x = y
3. x + 0 = x
4. (x + y) + 1 = x + (y + 1)
5. Let P(x) be a first-order formula in the language of Presburger arithmetic with a free variable x (and possibly other free variables). Then the following formula is an axiom: (P(0) ∧ ∀x(P(x) → P(x + 1))) → ∀y P(y).
(5) is an axiom schema of induction, representing infinitely many axioms. Since the axioms in the schema in (5) cannot be replaced by any finite number of axioms, Presburger arithmetic is not finitely axiomatizable.
Presburger arithmetic cannot formalize concepts such as divisibility or prime number. Generally, any number concept leading to multiplication cannot be defined in Presburger arithmetic, since that leads to incompleteness and undecidability. However, it can formulate individual instances of divisibility; for example, it proves "for all x, there exists y: (y + y = x) ∨ (y + y + 1 = x)". This states that every number is either even or odd.
Properties
Mojżesz Presburger proved Presburger arithmetic to be:
consistent: There is no statement in Presburger arithmetic which can be deduced from the axioms such that its negation can also be deduced.
complete: For each statement in Presburger arithmetic, either it is possible to deduce it from the axioms or it is possible to deduce its negation.
decidable: There exists an algorithm which decides whether any given statement in Presburger arithmetic is true or false.
The decidability of Presburger arithmetic can be shown using quantifier elimination, supplemented by reasoning about arithmetical congruence (Enderton 2001, p. 188).
Peano arithmetic, which is Presburger arithmetic augmented with multiplication, cannot be decidable, as a consequence of the negative answer to the Entscheidungsproblem. By Gödel's incompleteness theorem, Peano arithmetic is incomplete and its consistency is not internally provable. The decision problem for Presburger arithmetic is an interesting example in computational complexity theory and computation. Let n be the length of a statement in Presburger arithmetic. Then Fischer and Rabin (1974) proved that any decision algorithm for Presburger arithmetic has a worst-case runtime of at least 2^(2^(cn)), for some constant c > 0. Hence, the decision problem for Presburger arithmetic is an example of a decision problem that has been proved to require more than exponential run time. Fischer and Rabin also proved that for any reasonable axiomatization (defined precisely in their paper), there exist theorems of length n which have doubly exponential length proofs. Intuitively, this means there are computational limits on what can be proven by computer programs. Fischer and Rabin's work also implies that Presburger arithmetic can be used to define formulas which correctly calculate any algorithm as long as the inputs are less than relatively large bounds. The bounds can be increased, but only by using new formulas. On the other hand, a triply exponential upper bound on a decision procedure for Presburger arithmetic was proved by Oppen (1978).
Applications
Because Presburger arithmetic is decidable, automatic theorem provers for Presburger arithmetic are possible in principle, and such provers have indeed been implemented. The double exponential complexity of the theory makes it infeasible to use the theorem provers on complicated formulas, but this behavior occurs only in the presence of nested quantifiers: Oppen and Nelson (1980) describe an automatic theorem prover which uses the simplex algorithm on an extended Presburger arithmetic without nested quantifiers. The simplex algorithm has exponential worst-case running time, but displays considerably better efficiency for typical real-life instances. Exponential running time is only observed for specially constructed cases. This makes a simplex-based approach practical in a working system.
Presburger arithmetic can be extended to include multiplication by constants, since multiplication by a fixed constant is just repeated addition, as illustrated below. Most array subscript calculations then fall within the region of decidable problems. This approach is the basis of at least five proof-of-correctness systems for computer programs, beginning with the Stanford Pascal Verifier in the late 1970s and continuing through to Microsoft's Spec# system of 2005.
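For instance, a bound with a constant coefficient of the kind that arises in array subscript checks (the particular inequality below is only an illustration, not taken from the article) stays within Presburger arithmetic once the coefficient is unfolded:

```latex
% Multiplication by the fixed constant 3 unfolds into repeated addition,
% so the constraint remains a formula of Presburger arithmetic:
3 \cdot i + 1 \le n
  \quad\Longleftrightarrow\quad
  (i + i + i) + 1 \le n
```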
References
Cooper, D. C., 1972, "Theorem Proving in Arithmetic without Multiplication" in B. Meltzer and D. Michie, eds., Machine Intelligence. Edinburgh University Press: 91–100.
Enderton, Herbert (2001). A mathematical introduction to logic (2nd ed.). Boston, MA: Academic Press. ISBN 978-0-12-238452-3
Ferrante, Jeanne, and Charles W. Rackoff, 1979. The Computational Complexity of Logical Theories. Lecture Notes in Mathematics 718. Springer-Verlag.
Fischer, M. J., and Michael O. Rabin, 1974, "Super-Exponential Complexity of Presburger Arithmetic. [1]" Proceedings of the SIAM-AMS Symposium in Applied Mathematics Vol. 7: 27–41.
G. Nelson and D. C. Oppen (Apr. 1978). "A simplifier based on efficient decision algorithms". Proc. 5th ACM SIGACT-SIGPLAN symposium on Principles of programming languages: 141–150. doi:10.1145/512760.512775.
Mojżesz Presburger, 1929, "Über die Vollständigkeit eines gewissen Systems der Arithmetik ganzer Zahlen, in welchem die Addition als einzige Operation hervortritt" in Comptes Rendus du I congrès de Mathématiciens des Pays Slaves. Warszawa: 92–101.
William Pugh, 1991, "The Omega test: a fast and practical integer programming algorithm for dependence analysis. [2]"
Reddy, C. R., and D. W. Loveland, 1978, "Presburger Arithmetic with Bounded Quantifier Alternation. [3]" ACM Symposium on Theory of Computing: 320–325.
Young, P., 1985, "Gödel theorems, exponential difficulty and undecidability of arithmetic theories: an exposition" in A. Nerode and R. Shore, Recursion Theory, American Mathematical Society: 503–522.
Derek C. Oppen: A 2^(2^(2^(pn))) Upper Bound on the Complexity of Presburger Arithmetic. J. Comput. Syst. Sci. 16(3): 323–332 (1978) doi:10.1016/0022-0000(78)90021-1
External links
online prover [4]: A Java applet that proves or disproves arbitrary formulas of Presburger arithmetic (in German).
[5]: A complete theorem prover for Presburger arithmetic by Philipp Rümmer.
References
[1] http://www.lcs.mit.edu/publications/pubs/ps/MIT-LCS-TM-043.ps
[2] http://doi.acm.org/10.1145/125826.125848
[3] http://doi.acm.org/10.1145/800133.804361
[4] http://www.stefan-baur.de/priv.studies.studienarbeit.html
[5] http://www.philipp.ruemmer.org/princess.shtml
Robinson arithmetic
In mathematics, Robinson arithmetic, or Q, is a finitely axiomatized fragment of Peano arithmetic (PA), first set out in R. M. Robinson (1950). Q is essentially PA without the axiom schema of induction. Since Q is weaker than PA, it is incomplete. Q is important and interesting because it is a finitely axiomatized fragment of PA that is recursively incompletable and essentially undecidable.
Axioms
The background logic of Q is first-order logic with identity, denoted by infix '='. The individuals, called natural numbers, are members of a set called N with a distinguished member 0, called zero. There are three operations over N:
A unary operation called successor and denoted by prefix S;
Two binary operations, addition and multiplication, denoted by infix + and by concatenation, respectively.
The following axioms for Q are Q1–Q7 in Burgess (2005: 56), and are also the first seven axioms of second-order arithmetic. Variables not bound by an existential quantifier are bound by an implicit universal quantifier.
1. Sx ≠ 0
0 is not the successor of any number.
2. (Sx = Sy) → x = y
If the successor of x is identical to the successor of y, then x and y are identical. (1) and (2) yield the minimum of facts about N (it is an infinite set bounded by 0) and S (it is an injective function whose domain is N) needed for non-triviality. The converse of (2) follows from the properties of identity.
3. y = 0 ∨ ∃x (Sx = y)
Every number is either 0 or the successor of some number. The axiom schema of mathematical induction present in arithmetics stronger than Q turns this axiom into a theorem.
4. x + 0 = x
5. x + Sy = S(x + y)
(4) and (5) are the recursive definition of addition.
6. x·0 = 0
7. x·Sy = (x·y) + x
(6) and (7) are the recursive definition of multiplication.
Variant axiomatizations
The axioms in Robinson (1950) are (1)–(13) in Mendelson (1997: 201). The first 6 of Robinson's 13 axioms are required only when, unlike here, the background logic does not include identity. Machover (1996: 256–57) dispenses with axiom (3).
The usual strict total order on N, "less than" (denoted by "<"), can be defined in terms of addition via the rule x < y ↔ ∃z (Sz + x = y) (Burgess 2005: 230, fn. 24). Taking "<" as primitive requires adding four axioms to (1)–(7) above:
¬(x < 0)
0 = x ∨ 0 < x
x < y ↔ (Sx < y ∨ Sx = y)
x < Sy ↔ (x < y ∨ x = y)
Metamathematics
On the metamathematics of Q, see Boolos et al. (2002: chpt. 14), Tarski, Mostowski, and Robinson (1953), Smullyan (1991), Mendelson (1997: 201–03), and Burgess (2005: 1.5a, 2.2). The intended interpretation of Q is the natural numbers and their usual arithmetic. Hence addition and multiplication have their customary meaning, identity is equality, Sx = x + 1, and 0 is the natural number zero.
Q, like Peano arithmetic, has nonstandard models of all infinite cardinalities. However, unlike Peano arithmetic, Tennenbaum's theorem does not apply to Q, and it has computable non-standard models. For instance, there is a computable model of Q consisting of integer-coefficient polynomials with positive leading coefficient, plus the zero polynomial, with their usual arithmetic.
The defining characteristic of Q is the absence of the axiom scheme of induction. Hence it is often possible to prove in Q every specific instance of a fact about the natural numbers, but not the associated general theorem. For example, 5 + 7 = 7 + 5 is provable in Q, but the general statement x + y = y + x is not. Similarly, one cannot prove that Sx ≠ x (Burgess 2005: 56).
Q is interpretable in a fragment of Zermelo's axiomatic set theory, consisting of extensionality, existence of the empty set, and the axiom of adjunction. This theory is S' in Tarski et al. (1953: 34) and ST in Burgess (2005: 90–91; 223). See general set theory for more details.
Q fascinates because it is a finitely axiomatized first-order theory that is considerably weaker than Peano arithmetic (PA), and whose axioms contain only one existential quantifier, yet like PA is incomplete and incompletable in the sense of Gödel's Incompleteness Theorems, and essentially undecidable. Robinson (1950) derived the Q axioms (1)–(7) above by noting just what PA axioms are required to prove (Mendelson 1997: Th. 3.24) that every computable function is representable in PA. The only use this proof makes of the PA axiom schema of induction is to prove a statement that is axiom (3) above, and so all computable functions are representable in Q (Mendelson 1997: Th. 3.33). The conclusion of Gödel's second incompleteness theorem also holds for Q: no consistent recursively axiomatized extension of Q can prove its own consistency, even if we additionally restrict Gödel numbers of proofs to a definable cut (Bezboruah and Shepherdson 1976; Pudlák 1985; Hájek & Pudlák 1993: 387).
The first incompleteness theorem applies only to axiomatic systems defining sufficient arithmetic to carry out the necessary coding constructions (of which Gödel numbering forms a part). The axioms of Q were chosen specifically to ensure they are strong enough for this purpose. Thus the usual proof of the first incompleteness theorem can be
used to show that Q is incomplete and undecidable. This indicates that the incompleteness and undecidability of PA cannot be blamed on the only aspect of PA differentiating it from Q, namely the axiom schema of induction.
Gödel's theorems do not hold when any one of the seven axioms above is dropped. These fragments of Q remain undecidable, but they are no longer essentially undecidable: they have consistent decidable extensions, as well as uninteresting models (i.e., models which do not extend the standard natural numbers).
References
A. Bezboruah and John C. Shepherdson, 1976. Gödel's Second Incompleteness Theorem for Q. Journal of Symbolic Logic v. 41 n. 2, pp. 503–512.
George Boolos, John P. Burgess, and Richard Jeffrey, 2002. Computability and Logic, 4th ed. Cambridge University Press.
Burgess, John P., 2005. Fixing Frege. Princeton University Press.
Petr Hájek and Pavel Pudlák (1998) [1993]. Metamathematics of first-order arithmetic, 2nd ed. Springer-Verlag.
Lucas, J. R., 1999. Conceptual Roots of Mathematics. Routledge.
Machover, Moshe, 1996. Set Theory, Logic, and Their Limitation. Cambridge University Press.
Mendelson, Elliott, 1997. Introduction to Mathematical Logic, 4th ed. Chapman & Hall.
Pavel Pudlák, 1985. "Cuts, consistency statements and interpretations". Journal of Symbolic Logic v. 50 n. 2, pp. 423–441.
R. M. Robinson, 1950, "An Essentially Undecidable Axiom System" in Proceedings of the International Congress of Mathematics 1950, pp. 729–730.
Raymond Smullyan, 1991. Gödel's Incompleteness Theorems. Oxford University Press.
Alfred Tarski, A. Mostowski, and R. M. Robinson, 1953. Undecidable theories. North Holland.
Peano axioms
In mathematical logic, the Peano axioms, also known as the Dedekind–Peano axioms or the Peano postulates, are a set of axioms for the natural numbers presented by the 19th century Italian mathematician Giuseppe Peano. These axioms have been used nearly unchanged in a number of metamathematical investigations, including research into fundamental questions of consistency and completeness of number theory.
The need for formalism in arithmetic was not well appreciated until the work of Hermann Grassmann, who showed in the 1860s that many facts in arithmetic could be derived from more basic facts about the successor operation and induction.[1] In 1881, Charles Sanders Peirce provided an axiomatization of natural-number arithmetic.[2] In 1888, Richard Dedekind proposed a collection of axioms about the numbers, and in 1889 Peano published a more precisely formulated version of them as a collection of axioms in his book, The principles of arithmetic presented by a new method (Latin: Arithmetices principia, nova methodo exposita).
The Peano axioms contain three types of statements. The first axiom asserts the existence of at least one member of the set "number". The next four are general statements about equality; in modern treatments these are often considered axioms of the "underlying logic".[3] The next three axioms are first-order statements about natural numbers expressing the fundamental properties of the successor operation. The ninth, final axiom is a second-order statement of the principle of mathematical induction over the natural numbers. A weaker first-order system called Peano arithmetic is obtained by explicitly adding the addition and multiplication operation symbols and replacing the second-order induction axiom with a first-order axiom schema.
The axioms
When Peano formulated his axioms, the language of mathematical logic was in its infancy. The system of logical notation he created to present the axioms did not prove to be popular, although it was the genesis of the modern notation for set membership (∈, which comes from Peano's ε) and implication (⊃, which comes from Peano's reversed 'C'). Peano maintained a clear distinction between mathematical and logical symbols, which was not yet common in mathematics; such a separation had first been introduced in the Begriffsschrift by Gottlob Frege, published in 1879.[4] Peano was unaware of Frege's work and independently recreated his logical apparatus based on the work of Boole and Schröder.[5]
The Peano axioms define the arithmetical properties of natural numbers, usually represented as a set N or ℕ. The signature (a formal language's non-logical symbols) for the axioms includes a constant symbol 0 and a unary function symbol S.
The constant 0 is assumed to be a natural number:
1. 0 is a natural number.
The next four axioms describe the equality relation.
2. For every natural number x, x = x. That is, equality is reflexive.
3. For all natural numbers x and y, if x = y, then y = x. That is, equality is symmetric.
4. For all natural numbers x, y and z, if x = y and y = z, then x = z. That is, equality is transitive.
5. For all a and b, if a is a natural number and a = b, then b is also a natural number. That is, the natural numbers are closed under equality.
The remaining axioms define the arithmetical properties of the natural numbers. The naturals are assumed to be closed under a single-valued "successor" function S.
6. For every natural number n, S(n) is a natural number.
Peano's original formulation of the axioms used 1 instead of 0 as the "first" natural number. This choice is arbitrary, as axiom 1 does not endow the constant 0 with any additional properties. However, because 0 is the additive identity in arithmetic, most modern formulations of the Peano axioms start from 0. Axioms 1 and 6 define a unary representation of the natural numbers: the number 1 is S(0), 2 is S(S(0)) (which is also S(1)), and, in general, any natural number n is S^n(0). The next two axioms define the properties of this representation.
7. For every natural number n, S(n) = 0 is false. That is, there is no natural number whose successor is 0.
8. For all natural numbers m and n, if S(m) = S(n), then m = n. That is, S is an injection.
Axioms 1, 6, 7 and 8 imply that the set of natural numbers is infinite, because it contains at least the infinite subset { 0, S(0), S(S(0)), … }, each element of which differs from the rest. To show that every natural number is included in this set requires an additional axiom, which is sometimes called the axiom of induction. This axiom provides a method for reasoning about the set of all natural numbers.
9. If K is a set such that: 0 is in K, and for every natural number n, if n is in K, then S(n) is in K, then K contains every natural number.
The induction axiom is sometimes stated in the following form: if φ is a unary predicate such that φ(0) is true, and for every natural number n, if φ(n) is true, then φ(S(n)) is true, then φ(n) is true for every natural number n.
In Peano's original formulation, the induction axiom is a second-order axiom. It is now common to replace this second-order principle with a weaker first-order induction scheme. There are important differences between the second-order and first-order formulations, as discussed in the section Models below. Without the axiom of induction, the remaining Peano axioms give a theory equivalent to Robinson arithmetic, which can be expressed without second-order logic.
Arithmetic
The Peano axioms can be augmented with the operations of addition and multiplication and the usual total (linear) ordering on N. The respective functions and relations are constructed in second-order logic, and are shown to be unique using the Peano axioms.
Addition
Addition is the function + : N × N → N (written in the usual infix notation, mapping pairs of elements of N to elements of N), defined recursively as: a + 0 = a, and a + S(b) = S(a + b).
For example, a + 1 = a + S(0) = S(a + 0) = S(a). The structure (N, +) is a commutative semigroup with identity element 0. (N, +) is also a cancellative magma, and thus embeddable in a group. The smallest group embedding N is the integers.
Multiplication
Given addition, multiplication is the function · : N × N → N defined recursively as: a · 0 = 0, and a · S(b) = a + (a · b).
It is easy to see that 1 is the multiplicative identity: a · 1 = a · S(0) = a + (a · 0) = a + 0 = a. Moreover, multiplication distributes over addition: a · (b + c) = (a · b) + (a · c). Thus, (N, +, 0, ·, 1) is a commutative semiring.
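The recursive defining equations translate directly into code. The following Python sketch (illustrative only; add and mul are ad hoc names, and ordinary integers stand in for unary numerals) recurses on the second argument exactly as the equations a + 0 = a, a + S(b) = S(a + b), a · 0 = 0 and a · S(b) = a + (a · b) prescribe:

def add(a, b):
    """a + b, by recursion on b: a + 0 = a, a + S(b) = S(a + b)."""
    if b == 0:
        return a
    return add(a, b - 1) + 1          # S(a + b)

def mul(a, b):
    """a * b, by recursion on b: a * 0 = 0, a * S(b) = a + (a * b)."""
    if b == 0:
        return 0
    return add(a, mul(a, b - 1))      # a + (a * b)

assert add(3, 4) == 7
assert mul(3, 4) == 12
assert mul(5, 1) == add(5, mul(5, 0)) == 5   # 1 acts as the multiplicative identity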
Inequalities
The usual total order relation ≤ on N can be defined as follows, assuming 0 is a natural number: For all a, b ∈ N, a ≤ b if and only if there exists some c ∈ N such that a + c = b. This relation is stable under addition and multiplication: for all a, b, c ∈ N, if a ≤ b, then a + c ≤ b + c and a · c ≤ b · c. Thus, the structure (N, +, ·, 1, 0, ≤) is an ordered semiring; because there is no natural number between 0 and 1, it is a discrete ordered semiring. The axiom of induction is sometimes stated in the following strong form, making use of the ≤ order: For any predicate φ, if φ(0) is true, and if for every n ∈ N, φ(k) being true for every k ≤ n implies that φ(S(n)) is true,
then for every n ∈ N, φ(n) is true. This form of the induction axiom is a simple consequence of the standard formulation, but is often better suited for reasoning about the ≤ order. For example, to show that the naturals are well-ordered (every nonempty subset of N has a least element) one can reason as follows. Let a nonempty X ⊆ N be given and assume X has no least element. Because 0 is the least element of N, it must be that 0 ∉ X. For any n ∈ N, suppose that for every k ≤ n, k ∉ X. Then S(n) ∉ X, for otherwise it would be the least element of X. Thus, by the strong induction principle, for every n ∈ N, n ∉ X. Thus, X ∩ N = ∅, which contradicts X being a nonempty subset of N. Thus X has a least element.
Models
A model of the Peano axioms is a triple (N, 0, S), where N is an infinite set, 0 ∈ N and S : N → N satisfies the axioms above. Dedekind proved in his 1888 book, What are numbers and what should they be (German: Was sind und was sollen die Zahlen) that any two models of the Peano axioms (including the second-order induction axiom) are isomorphic. In particular, given two models (N_A, 0_A, S_A) and (N_B, 0_B, S_B) of the Peano axioms, there is a unique homomorphism f : N_A → N_B satisfying f(0_A) = 0_B and f(S_A(n)) = S_B(f(n)) for every n in N_A,
and it is a bijection. The second-order Peano axioms are thus categorical; this is not the case with any first-order reformulation of the Peano axioms, however.
In first-order formulations, the second-order induction axiom is replaced by an axiom schema. The first-order induction axiom for a formula φ(x, y1, ..., yk) is
∀ȳ ((φ(0, ȳ) ∧ ∀x (φ(x, ȳ) → φ(S(x), ȳ))) → ∀x φ(x, ȳ)),
where ȳ is an abbreviation for y1,...,yk. The first-order induction schema includes every instance of the first-order induction axiom, that is, it includes the induction axiom for every formula φ. This schema avoids quantification over sets of natural numbers, which is impossible in first-order logic. For instance, it is not possible in first-order logic to say that any set of natural numbers containing 0 and closed under successor is
the entire set of natural numbers. What can be expressed is that any definable set of natural numbers has this property. Because it is not possible to quantify over definable subsets explicitly with a single axiom, the induction schema includes one instance of the induction axiom for every definition of a subset of the naturals.
Equivalent axiomatizations
There are many different, but equivalent, axiomatizations of Peano arithmetic. While some axiomatizations, such as the one just described, use a signature that only has symbols for 0 and the successor, addition, and multiplication operations, other axiomatizations use the language of ordered semirings, including an additional order relation symbol. One such axiomatization begins with the following axioms that describe a discrete ordered semiring.[7]
1. (x + y) + z = x + (y + z), i.e., addition is associative.
2. x + y = y + x, i.e., addition is commutative.
3. (x · y) · z = x · (y · z), i.e., multiplication is associative.
4. x · y = y · x, i.e., multiplication is commutative.
5. x · (y + z) = (x · y) + (x · z), i.e., the distributive law.
6. x + 0 = x, i.e., zero is the identity element for addition.
7. x · 1 = x, i.e., one is the identity element for multiplication.
8. (x < y ∧ y < z) → x < z, i.e., the '<' operator is transitive.
9. ¬(x < x), i.e., the '<' operator is irreflexive.
10.–15. The remaining axioms state that '<' is a total order compatible with addition and with multiplication by positive elements, that whenever x < y there is a z with x + z = y, that 0 < 1 and every element greater than 0 is at least 1, and that every element is greater than or equal to 0.
The theory defined by these axioms is known as PA⁻; the theory PA is obtained by adding the first-order induction schema. An important property of PA⁻ is that any structure M satisfying this theory has an initial segment (ordered by ≤) isomorphic to N. Elements of M \ N are known as nonstandard elements.
Nonstandard models
Although the usual natural numbers satisfy the axioms of PA, there are other non-standard models as well; the compactness theorem implies that the existence of nonstandard elements cannot be excluded in first-order logic. The upward Löwenheim–Skolem theorem shows that there are nonstandard models of PA of all infinite cardinalities. This is not the case for the original (second-order) Peano axioms, which have only one model, up to isomorphism. This illustrates one way the first-order system PA is weaker than the second-order Peano axioms. When interpreted as a proof within a first-order set theory, such as ZFC, Dedekind's categoricity proof for PA shows that each model of set theory has a unique model of the Peano axioms, up to isomorphism, that embeds as an initial segment of all other models of PA contained within that model of set theory. In the standard model of set theory, this smallest model of PA is the standard model of PA; however, in a nonstandard model of set theory, it may be a nonstandard model of PA. This situation cannot be avoided with any first-order formalization of set theory. It is natural to ask whether a countable nonstandard model can be explicitly constructed. Tennenbaum's theorem, proved in 1959, shows that there is no countable nonstandard model of PA in which either the addition or multiplication operation is computable.[8] This result shows it is difficult to be completely explicit in describing the addition and multiplication operations of a countable nonstandard model of PA. However, there is only one possible order type of a countable nonstandard model. Letting ω be the order type of the natural numbers, ζ be the order type of the integers, and η be the order type of the rationals, the order type of any countable nonstandard model of PA is ω + ζ·η, which can be visualized as a copy of the natural numbers followed by a dense linear ordering of copies of the integers.
Set-theoretic models
The Peano axioms can be derived from set-theoretic constructions of the natural numbers and axioms of set theory such as ZF.[9] The standard construction of the naturals, due to John von Neumann, starts from a definition of 0 as the empty set, ∅, and an operator s on sets defined as: s(a) = a ∪ {a}. The set of natural numbers N is defined as the intersection of all sets closed under s that contain the empty set. Each natural number is equal (as a set) to the set of natural numbers less than it: 0 = ∅, 1 = s(0) = {0}, 2 = s(1) = {0, 1}, 3 = s(2) = {0, 1, 2},
and so on. The set N together with 0 and the successor function s : N → N satisfies the Peano axioms. Peano arithmetic is equiconsistent with several weak systems of set theory.[10] One such system is ZFC with the axiom of infinity replaced by its negation. Another such system consists of general set theory (extensionality, existence of the empty set, and the axiom of adjunction), augmented by an axiom schema stating that a property that holds for the empty set and holds of an adjunction whenever it holds of the adjunct must hold for all sets.
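The von Neumann construction can be imitated directly with Python frozensets; in this illustrative sketch (the helper s mirrors the operator defined above) each numeral ends up containing exactly the smaller numerals:

# Illustrative sketch of von Neumann numerals as frozensets.
def s(a):
    """Successor operator on sets: s(a) is the union of a and {a}."""
    return frozenset(a) | {frozenset(a)}

zero = frozenset()        # 0 is the empty set
one = s(zero)             # {0}
two = s(one)              # {0, 1}
three = s(two)            # {0, 1, 2}

assert len(three) == 3                                   # as many elements as its value
assert zero in three and one in three and two in three   # contains all smaller numerals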
Consistency
When the Peano axioms were first proposed, Bertrand Russell and others agreed that these axioms implicitly defined what we mean by a "natural number". Henri Poincaré was more cautious, saying they only defined natural numbers if they were consistent; if there is a proof that starts from just these axioms and derives a contradiction such as 0 = 1, then the axioms are inconsistent, and don't define anything. In 1900, David Hilbert posed the problem of proving their consistency using only finitistic methods as the second of his twenty-three problems.[11] In 1931, Kurt Gödel proved his second incompleteness theorem, which shows that such a consistency proof cannot be formalized within Peano arithmetic itself.[12] Although it is widely claimed that Gödel's theorem rules out the possibility of a finitistic consistency proof for Peano arithmetic, this depends on exactly what one means by a finitistic proof. Gödel himself pointed out the possibility of giving a finitistic consistency proof of Peano arithmetic or stronger systems by using finitistic methods that are not formalizable in Peano arithmetic, and in 1958 Gödel published a method for proving the consistency of arithmetic
using type theory.[13] In 1936, Gerhard Gentzen gave a proof of the consistency of Peano's axioms, using transfinite induction up to an ordinal called ε0.[14] Gentzen explained: "The aim of the present paper is to prove the consistency of elementary number theory or, rather, to reduce the question of consistency to certain fundamental principles". Gentzen's proof is arguably finitistic, since the transfinite ordinal ε0 can be encoded in terms of finite objects (for example, as a Turing machine describing a suitable order on the integers, or more abstractly as consisting of the finite trees, suitably linearly ordered). Whether or not Gentzen's proof meets the requirements Hilbert envisioned is unclear: there is no generally accepted definition of exactly what is meant by a finitistic proof, and Hilbert himself never gave a precise definition. The vast majority of contemporary mathematicians believe that Peano's axioms are consistent, relying either on intuition or the acceptance of a consistency proof such as Gentzen's proof. The small number of mathematicians who advocate ultrafinitism reject Peano's axioms because the axioms require an infinite set of natural numbers.
Footnotes
[1] Grassmann 1861
[2] Peirce 1881; also see Shields 1997
[3] van Heijenoort 1967:94
[4] Van Heijenoort 1967, p. 2
[5] Van Heijenoort 1967, p. 83
[6] Mendelson 1997:155
[7] Kaye 1991, pp. 16–18
[8] Kaye 1991, sec. 11.3
[9] Suppes 1960; Hatcher 1982
[10] Tarski & Givant 1987, sec. 7.6
[11] Hilbert 1900
[12] Gödel 1931
[13] Gödel 1958
[14] Gentzen 1936
References
Martin Davis, 1974. Computability. Notes by Barry Jacobs. Courant Institute of Mathematical Sciences, New York University.
Richard Dedekind, 1888. Was sind und was sollen die Zahlen? (What are and what should the numbers be?). Braunschweig. Two English translations: 1963 (1901). Essays on the Theory of Numbers. Beman, W. W., ed. and trans. Dover. 1996. In From Kant to Hilbert: A Source Book in the Foundations of Mathematics, 2 vols, Ewald, William B., ed. Oxford University Press: 787–832.
Gentzen, G., 1936. Die Widerspruchsfreiheit der reinen Zahlentheorie. Mathematische Annalen 112: 132–213. Reprinted in English translation in his 1969 Collected works, M. E. Szabo, ed. Amsterdam: North-Holland.
K. Gödel, 1931. Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme, I. Monatshefte für Mathematik und Physik 38: 173–98. See On Formally Undecidable Propositions of Principia Mathematica and Related Systems for details on English translations.
--------, 1958, "Über eine bisher noch nicht benützte Erweiterung des finiten Standpunktes," Dialectica 12: 280–87. Reprinted in English translation in 1990. Gödel's Collected Works, Vol II. Solomon Feferman et al., eds. Oxford University Press.
Hermann Grassmann, 1861. Lehrbuch der Arithmetik (A tutorial in arithmetic). Berlin.
Hatcher, William S., 1982. The Logical Foundations of Mathematics. Pergamon. Derives the Peano axioms (called S) from several axiomatic set theories and from category theory.
David Hilbert, 1901, "Mathematische Probleme". Archiv der Mathematik und Physik 3(1): 44–63, 213–37. English translation by Maby Winton, 1902, "Mathematical Problems" (http://aleph0.clarku.edu/~djoyce/hilbert/problems.html), Bulletin of the American Mathematical Society 8: 437–79.
Kaye, Richard, 1991. Models of Peano arithmetic. Oxford University Press. ISBN 0-19-853213-X.
Peirce, C.S. (1881). "On the Logic of Number" (http://books.google.com/books?id=LQgPAAAAIAAJ&jtp=85). American Journal of Mathematics 4 (14): pp. 85–95. doi:10.2307/2369151. JSTOR 2369151. MR 1507856. Reprinted (CP 3.252-88), (W 4:299-309).
Paul Shields (1997), "Peirce's Axiomatization of Arithmetic", in Houser et al., eds., Studies in the Logic of Charles S. Peirce.
Patrick Suppes, 1972 (1960). Axiomatic Set Theory. Dover. ISBN 0486616304. Derives the Peano axioms from ZFC.
Alfred Tarski, and Givant, Steven, 1987. A Formalization of Set Theory without Variables. AMS Colloquium Publications, vol. 41.
Edmund Landau, 1965. Grundlagen Der Analysis. AMS Chelsea Publishing. Derives the basic number systems from the Peano axioms. English/German vocabulary included. ISBN 978-0828401418.
Jean van Heijenoort, ed. (1967, 1976 3rd printing with corrections). From Frege to Gödel: A Source Book in Mathematical Logic, 1879–1931 (3rd ed.). Cambridge, Mass: Harvard University Press. ISBN 0-674-32449-8 (pbk.). Contains translations of the following two papers, with valuable commentary:
Richard Dedekind, 1890, "Letter to Keferstein." pp. 98–103. On p. 100, he restates and defends his axioms of 1888.
Giuseppe Peano, 1889. Arithmetices principia, nova methodo exposita (The principles of arithmetic, presented by a new method), pp. 83–97. An excerpt of the treatise where Peano first presented his axioms, and recursively defined arithmetical operations.
This article incorporates material from PA on PlanetMath, which is licensed under the Creative Commons Attribution/Share-Alike License.
External links
Internet Encyclopedia of Philosophy: "Henri Poincaré (http://www.utm.edu/research/iep/p/poincare.htm)", by Mauro Murzi. Includes a discussion of Poincaré's critique of Peano's axioms. First-order arithmetic (http://www.ltn.lv/~podnieks/gt3.html), a chapter of a book on the incompleteness theorems by Karl Podnieks. Peano arithmetic (http://planetmath.org/?op=getobj&from=objects&id=2789) on PlanetMath. Weisstein, Eric W., "Peano's Axioms (http://mathworld.wolfram.com/PeanosAxioms.html)" from MathWorld. What are numbers, and what is their meaning?: Dedekind (http://www.math.uwaterloo.ca/~snburris/htdocs/scav/dedek/dedek.html), commentary on Dedekind's work, Stanley N. Burris, 2001.
Gentzen's theorem
In 1936 Gerhard Gentzen proved the consistency of first-order arithmetic using combinatorial methods. Gentzen's proof shows much more than merely that first-order arithmetic is consistent. Gentzen showed that the consistency of first-order arithmetic is provable, over the base theory of primitive recursive arithmetic with the additional principle of quantifier-free transfinite induction up to the ordinal ε0 (epsilon nought). Informally, this additional principle means that there is a well-ordering on the set of finite rooted trees. The principle of quantifier-free transfinite induction up to ε0 says that for any formula A(x) with no bound variables transfinite induction up to ε0 holds. ε0 is the first ordinal α such that ω^α = α, i.e. the limit of the sequence ω, ω^ω, ω^(ω^ω), ... To express ordinals in the language of arithmetic an ordinal notation is needed, i.e. a way to assign natural numbers to ordinals less than ε0. This can be done in various ways, one example provided by Cantor's normal form theorem. That transfinite induction holds for a formula A(x) means that A does not define an infinite descending sequence of ordinals smaller than ε0 (in which case ε0 would not be well-ordered). Gentzen assigned ordinals smaller than ε0 to proofs in first-order arithmetic and showed that if there is a proof of contradiction, then there is an infinite descending sequence of ordinals < ε0 produced by a primitive recursive operation on proofs corresponding to a quantifier-free formula.
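As a small illustration of such an ordinal notation (a sketch only, not Gentzen's own coding), an ordinal below ε0 can be represented in Cantor normal form by a list of exponents [a1, a2, ...] with a1 >= a2 >= ..., standing for ω^a1 + ω^a2 + ..., where each exponent is again such a list and 0 is the empty list. Comparing two notations is then an elementary recursive operation, as this Python sketch shows:

def cmp_ord(a, b):
    """Compare two Cantor-normal-form notations; return -1, 0, or 1."""
    for x, y in zip(a, b):
        c = cmp_ord(x, y)
        if c != 0:
            return c
    # equal on the common prefix: the notation with more terms is larger
    return (len(a) > len(b)) - (len(a) < len(b))

zero = []                  # 0
one = [zero]               # omega^0 = 1
two = [zero, zero]         # 1 + 1
omega = [one]              # omega^1
omega_omega = [omega]      # omega^omega

assert cmp_ord(one, omega) == -1
assert cmp_ord(omega, two) == 1
assert cmp_ord(omega_omega, omega) == 1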
References
G. Gentzen, 1936. 'Die Widerspruchsfreiheit der reinen Zahlentheorie'. Mathematische Annalen, 112: 493–565. Translated as 'The consistency of arithmetic', in (M. E. Szabo 1969). G. Gentzen, 1938. 'Neue Fassung des Widerspruchsfreiheitsbeweises für die reine Zahlentheorie'. Translated as 'New version of the consistency proof for elementary number theory', in (M. E. Szabo 1969). K. Gödel, 1938. Lecture at Zilsel's, in Feferman et al., Kurt Gödel: Collected Works, Vol III, pp. 87–113. Herman Ruge Jervell, 1999. A course in proof theory [1], textbook draft. M. E. Szabo (ed.), 1969. The collected works of Gerhard Gentzen. North-Holland, Amsterdam. W. W. Tait, 2005. Gödel's reformulation of Gentzen's first consistency proof for arithmetic: the no-counterexample interpretation [2]. The Bulletin of Symbolic Logic 11(2): 225–238. Kirby, L. and Paris, J., Accessible independence results for Peano arithmetic, Bull. London Math. Soc., 14 (1982), 285–93.
References
[1] http://folk.uio.no/herman/bevisteori.ps
[2] http://home.uchicago.edu/~wwtx/GoedelandNCInew1.pdf
Second-order arithmetic
In mathematical logic, second-order arithmetic is a collection of axiomatic systems that formalize the natural numbers and their subsets. It is an alternative to axiomatic set theory as a foundation for much, but not all, of mathematics. It was introduced by Hilbert & Bernays (1934) in their book Grundlagen der Mathematik. The standard axiomatization of second-order arithmetic is denoted Z2. Second-order arithmetic includes, but is significantly stronger than, its first-order counterpart Peano arithmetic. Unlike Peano arithmetic, second-order arithmetic allows quantification over sets of numbers as well as numbers themselves. Because real numbers can be represented as (infinite) sets of natural numbers in well-known ways, and because second order arithmetic allows quantification over such sets, it is possible to formalize the real numbers in second-order arithmetic. For this reason, second-order arithmetic is sometimes called analysis. Second-order arithmetic can also be seen as a weak version of set theory in which every element is either a natural number or a set of natural numbers. Although it is much weaker than Zermelo-Fraenkel set theory, second-order arithmetic can prove essentially all of the results of classical mathematics expressible in its language. A subsystem of second-order arithmetic is a theory in the language of second-order arithmetic each axiom of which is a theorem of full second-order arithmetic (Z2). Such subsystems are essential to reverse mathematics, a research program investigating how much of classical mathematics can be derived in certain weak subsystems of varying strength. Much of core mathematics can be formalized in these weak subsystems, some of which are defined below. Reverse mathematics also clarifies the extent and manner in which classical mathematics is nonconstructive.
Definition
Syntax
The language of second-order arithmetic is two-sorted. The first sort of terms and variables, usually denoted by lower case letters, consists of individuals, whose intended interpretation is as natural numbers. The other sort of variables, variously called set variables, class variables, or even predicates are usually denoted by upper case letters. They refer to classes/predicates/properties of individuals, and so can be thought of as sets of natural numbers. Both individuals and set variables can be quantified universally or existentially. A formula with no bound set
variables (that is, no quantifiers over set variables) is called arithmetical. An arithmetical formula may have free set variables and bound individual variables. Individual terms are formed from the constant 0, the unary function S (the successor function), and the binary operations + and · (addition and multiplication). The successor function adds 1 (= S0) to its input. The relations = (equality) and < (comparison of natural numbers) relate two individuals, whereas the relation ∈ (membership) relates an individual and a set (or class). For example, ∀n (n ∈ X → Sn ∈ X) is a well-formed formula of second-order arithmetic that is arithmetical; it has one free set variable X and one bound individual variable n (but no bound set variables, as is required of an arithmetical formula), whereas ∃X ∀n (n ∈ X → Sn ∈ X) is a well-formed formula that is not arithmetical, with one bound set variable X and one bound individual variable n.
Semantics
Several different interpretations of the quantifiers are possible. If second-order arithmetic is studied using the full semantics of second-order logic then the set quantifiers range over all subsets of the range of the number variables. If second-order arithmetic is formalized using the semantics of first-order logic then any model includes a domain for the set variables to range over, and this domain may be a proper subset of the full powerset of the domain of number variables. Although second-order arithmetic was originally studied using full second-order semantics, the vast majority of current research treats second-order arithmetic in first-order predicate calculus. This is because the model theory of subsystems of second-order arithmetic is more interesting in the setting of first-order logic.
Axioms
Basic
The following axioms are known as the basic axioms, or sometimes the Robinson axioms. The resulting first-order theory, known as Robinson arithmetic, is essentially Peano arithmetic without induction. The domain of discourse for the quantified variables is the natural numbers, collectively denoted by N, and including the distinguished member 0, called "zero." The primitive functions are the unary successor function, denoted by prefix S, and two binary operations, addition and multiplication, denoted by infix "+" and "·", respectively. There is also a primitive binary relation called order, denoted by infix "<".
Axioms governing the successor function and zero:
1. ∀m (Sm ≠ 0) (the successor of a natural number is never zero)
2. ∀m ∀n (Sm = Sn → m = n) (the successor function is injective)
3. ∀n (0 = n ∨ ∃m (Sm = n)) (every natural number is zero or a successor)
Addition defined recursively:
4. ∀m (m + 0 = m)
5. ∀m ∀n (m + Sn = S(m + n))
Multiplication defined recursively:
6. ∀m (m · 0 = 0)
7. ∀m ∀n (m · Sn = (m · n) + m)
Axioms governing the order relation "<":
8. ∀m ¬(m < 0) (no natural number is smaller than zero)
9. ∀m ∀n (m < Sn ↔ (m < n ∨ m = n))
10. ∀n (0 = n ∨ 0 < n) (every natural number is zero or bigger than zero)
11. ∀m ∀n (m < n ∨ m = n ∨ n < m)
These axioms are all first-order statements. That is, all variables range over the natural numbers and not sets thereof, a fact even stronger than their being arithmetical. Moreover, there is but one existential quantifier, in axiom 3. Axioms 1 and 2, together with an axiom schema of induction, make up the usual Peano–Dedekind definition of N. Adding to these axioms any sort of axiom schema of induction makes redundant the axioms 3, 10, and 11.
Induction and comprehension schema
If φ(n) is a formula of second-order arithmetic with a free number variable n and possibly other free number or set variables (written m and X), the induction axiom for φ is the axiom
(φ(0) ∧ ∀n (φ(n) → φ(Sn))) → ∀n φ(n),
with the other free variables m and X quantified universally. The (full) second-order induction scheme consists of all instances of this axiom, over all second-order formulas. One particularly important instance of the induction scheme is when φ is the formula "n ∈ X", asserting that n is a member of X (X being a free set variable): in this case, the induction axiom for φ is
∀X ((0 ∈ X ∧ ∀n (n ∈ X → Sn ∈ X)) → ∀n (n ∈ X)).
This sentence is called the second-order induction axiom. Returning to the case where φ(n) is a formula with a free variable n and possibly other free variables, we define the comprehension axiom for φ to be
∃Z ∀n (n ∈ Z ↔ φ(n)),
with the other free variables quantified universally. This axiom makes it possible to form the set Z = { n : φ(n) } of natural numbers satisfying φ(n), subject to the restriction that the formula φ may not contain the variable Z, for otherwise the instance ∃Z ∀n (n ∈ Z ↔ n ∉ Z) of the scheme would be contradictory.
Models of second-order arithmetic
A model of the language of second-order arithmetic consists of a set M (the range of the individual variables) together with a collection D of subsets of M (the range of the set variables), along with interpretations of the arithmetical symbols. The basic axioms together with the second-order induction axiom have only one model (up to isomorphism) under second-order semantics. When M is the usual set of natural numbers with its usual operations, the model is called an ω-model. In this case we may identify the model with D, its collection of sets of naturals, because this set is enough to completely determine an ω-model. The unique full model, which is the usual set of natural numbers with its usual structure and all its subsets, is called the intended or standard model of second-order arithmetic.
Arithmetical comprehension
Many of the well-studied subsystems are related to closure properties of models. For example, it can be shown that every ω-model of full second-order arithmetic is closed under Turing jump, but not every ω-model closed under Turing jump is a model of full second-order arithmetic. We may ask whether there is a subsystem of second-order arithmetic satisfied by every ω-model that is closed under Turing jump and satisfies some other, more mild, closure conditions. The subsystem just described is called ACA0. ACA0 is defined as the theory consisting of the basic axioms, the arithmetical comprehension axiom scheme (in other words, the comprehension axiom for every arithmetical formula φ) and the ordinary second-order induction axiom; again, we could also choose to include the arithmetical induction axiom scheme, in other words the induction axiom for every arithmetical formula φ, without making a difference. It can be seen that a collection S of subsets of ω determines an ω-model of ACA0 if and only if S is closed under Turing jump, Turing reducibility, and Turing join. The subscript 0 in ACA0 indicates that we have not included every instance of the induction axiom in this subsystem. This makes no difference when we study only ω-models, which automatically satisfy every instance of the induction axiom. It is of crucial importance, however, when we study models that are not ω-models. The system consisting of ACA0 plus induction for all formulas is sometimes called ACA. The system ACA0 is a conservative extension of first-order arithmetic (or first-order Peano axioms), defined as the basic axioms, plus the first-order induction axiom scheme (for all formulas φ involving no class variables at all, bound or otherwise), in the language of first-order arithmetic (which does not permit class variables at all). In particular it has the same proof-theoretic ordinal ε0 as first-order arithmetic, owing to the limited induction schema.
The arithmetical hierarchy for formulas
A formula is called bounded arithmetical, or Δ⁰₀, when all its quantifiers are of the form ∀n < t or ∃n < t (where n is the individual variable being quantified and t is an individual term), where ∃n < t φ stands for ∃n (n < t ∧ φ) and ∀n < t φ stands for ∀n (n < t → φ). A formula is called Σ⁰₁ (or sometimes Σ₁), respectively Π⁰₁ (or sometimes Π₁), when it is of the form ∃m φ, respectively ∀m φ, where φ is a bounded arithmetical formula and m is an individual variable (that is free in φ). More generally, a formula is called Σ⁰ₙ, respectively Π⁰ₙ, when it is obtained by adding existential, respectively universal, individual quantifiers to a Π⁰ₙ₋₁, respectively Σ⁰ₙ₋₁, formula (and Σ⁰₀ and Π⁰₀ are all equivalent to Δ⁰₀). Note that by construction all these formulas are arithmetical (no class variables are ever bound) and, in fact, by putting the formula in Skolem prenex form one can see that every arithmetical formula is equivalent to a Σ⁰ₙ or Π⁰ₙ formula for all large enough n.
Recursive comprehension
The subsystem RCA0 is an even weaker system than ACA0 and is often used as the base system in reverse mathematics. It consists of: the basic axioms, the Σ⁰₁ induction scheme, and the Δ⁰₁ comprehension scheme. The former term is clear: the Σ⁰₁ induction scheme is the induction axiom for every Σ⁰₁ formula φ. The term "Δ⁰₁ comprehension" requires a little more explaining, however: there is no such thing as a Δ⁰₁ formula (the intended meaning is a formula that is both Σ⁰₁ and Π⁰₁), but we are instead postulating the comprehension axiom for every Σ⁰₁ formula subject to the condition that it is equivalent to a Π⁰₁ formula; in other words, for every Σ⁰₁ formula φ and every Π⁰₁ formula ψ we postulate
∀n (φ(n) ↔ ψ(n)) → ∃Z ∀n (n ∈ Z ↔ φ(n)).
The set of first-order consequences of RCA0 is the same as those of the subsystem IΣ1 of Peano arithmetic in which induction is restricted to Σ⁰₁ formulas. In turn, IΣ1 is conservative over primitive recursive arithmetic (PRA) for Π⁰₂ sentences. Moreover, the proof-theoretic ordinal of RCA0 is ω^ω, the same as that of PRA. It can be seen that a collection S of subsets of ω determines an ω-model of RCA0 if and only if S is closed under Turing reducibility and Turing join. In particular, the collection of all computable subsets of ω gives an ω-model of RCA0. This is the motivation behind the name of this system: if a set can be proved to exist using RCA0, then the set is computable (i.e. recursive).
Weaker systems
Sometimes an even weaker system than RCA0 is desired. One such system is defined as follows: one must first augment the language of arithmetic with an exponential function (in stronger systems the exponential can be defined in terms of addition and multiplication by the usual trick, but when the system becomes too weak this is no longer possible) and the basic axioms by the obvious axioms defining exponentiation inductively from multiplication; then the system consists of the (enriched) basic axioms, plus Δ⁰₁ comprehension, plus Δ⁰₀ induction.
Stronger systems
Much as we have defined Σₙ and Πₙ (or, more accurately, Σ⁰ₙ and Π⁰ₙ) formulae, we can define Σ¹ₙ and Π¹ₙ formulae in the following way: a Δ¹₀ (or Σ¹₀ or Π¹₀) formula is just an arithmetical formula, and a Σ¹ₙ, respectively Π¹ₙ, formula is obtained by adding existential, respectively universal, class quantifiers in front of a Π¹ₙ₋₁, respectively Σ¹ₙ₋₁, formula. It is not too hard to see that over a not too weak system, any formula of second-order arithmetic is equivalent to a Σ¹ₙ or Π¹ₙ formula for all large enough n. The system Π¹₁-comprehension is the system consisting of the basic axioms, plus the ordinary second-order induction axiom and the comprehension axiom for every Π¹₁ formula φ. It is an easy exercise to show that this is actually equivalent to Σ¹₁-comprehension (on the other hand, Δ¹₁-comprehension, defined by the same trick as introduced earlier for Δ⁰₁ comprehension, is actually weaker).
References
Burgess, John P., 2005. Fixing Frege. Princeton University Press. Buss, S. R., Handbook of proof theory ISBN 0-444-89840-9 Friedman, Harvey. "Systems of second order arithmetic with restricted induction," I, II (Abstracts). Journal of Symbolic Logic, v.41, pp. 557-- 559, 1976. JStor [1] Girard, Lafont and Taylor, 1987. Proofs and Types [2]. Cambridge University Press. Hilbert, David; Bernays, Paul (1934), Grundlagen der Mathematik, Die Grundlehren der mathematischen Wissenschaften, Band 40, 50, Berlin, New York: Springer-Verlag, MR0237246 Simpson, Stephen G. (2009), Subsystems of second order arithmetic [3], Perspectives in Logic (2nd ed.), Cambridge University Press, ISBN978-0-521-88439-6, MR2517689 Gaisi Takeuti (1975) Proof theory ISBN 0-444-10492-5
References
[1] http://www.jstor.org/stable/2272259
[2] http://www.monad.me.uk/stable/Proofs%2BTypes.html
[3] http://www.math.psu.edu/simpson/sosoa/
Reverse mathematics
Reverse mathematics is a program in mathematical logic that seeks to determine which axioms are required to prove theorems of mathematics. Its defining method can briefly be described as "going backwards from the theorems to the axioms", in contrast to the ordinary mathematical practice of deriving theorems from axioms. The reverse mathematics program was foreshadowed by results in set theory such as the classical theorem that the axiom of choice and Zorn's lemma are equivalent over ZF set theory. The goal of reverse mathematics, however, is to study ordinary theorems of mathematics rather than possible axioms for set theory. Reverse mathematics is usually carried out using subsystems of second-order arithmetic, where many of its definitions and methods are inspired by previous work in constructive analysis and proof theory. The use of second-order arithmetic also allows many techniques from recursion theory to be employed; many results in reverse mathematics have corresponding results in computable analysis. The program was founded by Harvey Friedman(1975, 1976). A standard reference for the subject is (Simpson 2009).
General principles
In reverse mathematics, one starts with a framework language and a base theory (a core axiom system) that is too weak to prove most of the theorems one might be interested in, but still powerful enough to develop the definitions necessary to state these theorems. For example, to study the theorem "Every bounded sequence of real numbers has a supremum" it is necessary to use a base system which can speak of real numbers and sequences of real numbers. For each theorem that can be stated in the base system but is not provable in the base system, the goal is to determine the particular axiom system (stronger than the base system) that is necessary to prove that theorem. To show that a system S is required to prove a theorem T, two proofs are required. The first proof shows T is provable from S; this is an ordinary mathematical proof along with a justification that it can be carried out in the system S. The second proof, known as a reversal, shows that T itself implies S; this proof is carried out in the base system. The reversal establishes that no axiom system S′ that extends the base system can be weaker than S while still proving T.
Second-order arithmetic can express the principle "Every countable vector space has a basis", but it cannot express the principle "Every vector space has a basis". In practical terms, this means that theorems of algebra and combinatorics are restricted to countable structures, while theorems of analysis and topology are restricted to separable spaces. Many principles that imply the axiom of choice in their general form (such as "Every vector space has a basis") become provable in weak subsystems of second-order arithmetic when they are restricted. For example, "every field has an algebraic closure" is not provable in ZF set theory, but the restricted form "every countable field has an algebraic closure" is provable in RCA0, the weakest system typically employed in reverse mathematics.
Reverse mathematics makes particular use of five basic subsystems of second-order arithmetic, often called the "big five". In order of increasing strength they are:
RCA0 (recursive comprehension): proof-theoretic ordinal ω^ω; conservative over PRA for Π⁰₂ sentences.
WKL0 (weak Kőnig's lemma): proof-theoretic ordinal ω^ω; conservative over PRA for Π⁰₂ sentences and over RCA0 for Π¹₁ sentences.
ACA0 (arithmetical comprehension): proof-theoretic ordinal ε0; conservative over Peano arithmetic for arithmetical sentences.
ATR0 (arithmetical transfinite recursion): proof-theoretic ordinal Γ0; conservative over Feferman's system IR for Π¹₁ sentences.
Π¹₁-CA0 (Π¹₁ comprehension): proof-theoretic ordinal Ψ0(Ωω).
The subscript 0 in these names means that the induction scheme has been restricted from the full second-order induction scheme (Simpson 2009, p. 6). For example, ACA0 includes the induction axiom (0 ∈ X ∧ ∀n (n ∈ X → n+1 ∈ X)) → ∀n (n ∈ X). This together with the full comprehension axiom of second-order arithmetic implies the full second-order induction scheme given by the universal closure of (φ(0) ∧ ∀n (φ(n) → φ(n+1))) → ∀n φ(n) for any second-order formula φ. However ACA0 does not have the full comprehension axiom, and the subscript 0 is a reminder that it does not have the full second-order induction scheme either. This restriction is important: systems with restricted induction have significantly lower proof-theoretical ordinals than systems with the full second-order induction scheme.
Weak Kőnig's lemma WKL0
The subsystem WKL0 consists of RCA0 plus a weak form of Kőnig's lemma, namely the statement that every infinite subtree of the full binary tree has an infinite path. Over RCA0 this principle is equivalent to a Σ⁰₁ separation principle, and it fails in the ω-model consisting of the computable sets, as witnessed by pairs of effectively inseparable recursively enumerable sets. It turns out that RCA0 and WKL0 have the same first-order part, meaning that they prove the same first-order sentences. WKL0 can prove a good number of classical mathematical results which do not follow from RCA0, however. These results are not expressible as first-order statements but can be expressed as second-order statements. The following results are equivalent to weak Kőnig's lemma and thus to WKL0 over RCA0:
The Heine–Borel theorem for the closed unit real interval, in the following sense: every covering by a sequence of open intervals has a finite subcovering.
The Heine–Borel theorem for complete totally bounded separable metric spaces (where covering is by a sequence of open balls).
A continuous real function on the closed unit interval (or on any compact separable metric space, as above) is bounded (or: bounded and reaches its bounds).
A continuous real function on the closed unit interval can be uniformly approximated by polynomials (with rational coefficients).
A continuous real function on the closed unit interval is uniformly continuous.
A continuous real function on the closed unit interval is Riemann integrable.
The Brouwer fixed point theorem (for continuous functions on a finite product of copies of the closed unit interval).
The separable Hahn–Banach theorem in the form: a bounded linear form on a subspace of a separable Banach space extends to a bounded linear form on the whole space.
The Jordan curve theorem.
Gödel's completeness theorem (for a countable language).
Every countable commutative ring has a prime ideal.
Every countable formally real field is orderable.
Uniqueness of algebraic closure (for a countable field).
Arithmetical comprehension ACA0
The following statements are equivalent to ACA0 over RCA0:
Every countable vector space over the rationals (or over any countable field) has a basis.
Every countable field has a transcendence basis.
Kőnig's lemma (for arbitrary finitely branching trees, as opposed to the weak version described above).
Various theorems in combinatorics, such as certain forms of Ramsey's theorem.
Π¹₁ comprehension Π¹₁-CA0
Π¹₁-CA0 is stronger than arithmetical transfinite recursion and is fully impredicative. It consists of RCA0 plus the comprehension scheme for Π¹₁ formulas. In a sense, Π¹₁ comprehension is to arithmetical transfinite recursion (Σ¹₁ separation) as ACA0 is to weak Kőnig's lemma (Σ⁰₁ separation). It is equivalent to several statements of descriptive set theory whose proofs make use of strongly impredicative arguments; this equivalence shows that these impredicative arguments cannot be removed. The following theorems are equivalent to Π¹₁-CA0 over RCA0:
The Cantor–Bendixson theorem (every closed set of reals is the union of a perfect set and a countable set).
Every countable abelian group is the direct sum of a divisible group and a reduced group.
Additional systems
Weaker systems than recursive comprehension can be defined. The weak system RCA*0 consists of elementary function arithmetic EFA (the basic axioms plus Δ⁰₀ induction in the enriched language with an exponential operation) plus Δ⁰₁ comprehension. Over RCA*0, recursive comprehension as defined earlier (that is, with Σ⁰₁ induction) is equivalent to the statement that a polynomial (over a countable field) has only finitely many roots and to the classification theorem for finitely generated Abelian groups. The system RCA*0 has the same proof-theoretic ordinal ω³ as EFA and is conservative over EFA for Π⁰₂ sentences. Weak weak Kőnig's lemma is the statement that a subtree of the infinite binary tree having no infinite paths has an asymptotically vanishing proportion of the leaves at length n (with a uniform estimate as to how many leaves of length n exist). An equivalent formulation is that any subset of Cantor space that has positive measure is nonempty (this is not provable in RCA0). WWKL0 is obtained by adjoining this axiom to RCA0. It is equivalent to the statement that if the unit real interval is covered by a sequence of intervals then the sum of their lengths is at least one. The model theory of WWKL0 is closely connected to the theory of algorithmically random
sequences. In particular, an ω-model of RCA0 satisfies weak weak Kőnig's lemma if and only if for every set X there is a set Y which is 1-random relative to X. DNR (short for "diagonally non-recursive") adds to RCA0 an axiom asserting the existence of a diagonally non-recursive function relative to every set. That is, DNR states that, for any set A, there exists a total function f such that for all e the eth partial recursive function with oracle A is not equal to f. DNR is strictly weaker than WWKL (Lempp et al., 2004). Δ¹₁-comprehension is in certain ways analogous to arithmetical transfinite recursion as recursive comprehension is to weak Kőnig's lemma. It has the hyperarithmetical sets as minimal ω-model. Arithmetical transfinite recursion proves Δ¹₁-comprehension but not the other way around. Σ¹₁-choice is the statement that if φ(n,X) is a Σ¹₁ formula such that for each n there exists an X satisfying φ(n,X), then there is a sequence of sets Xn such that φ(n,Xn) holds for each n. Σ¹₁-choice also has the hyperarithmetical sets as minimal ω-model. Arithmetical transfinite recursion proves Σ¹₁-choice but not the other way around.
References
Ambos-Spies, K.; Kjos-Hanssen, B.; Lempp, S.; Slaman, T.A. (2004), "Comparing DNR and WWKL", Journal of Symbolic Logic 69 (4): 1089, doi:10.2178/jsl/1102022212. Friedman, Harvey (1975), "Some systems of second order arithmetic and their use", Proceedings of the International Congress of Mathematicians (Vancouver, B. C., 1974), Vol. 1, Canad. Math. Congress, Montreal, Que., pp.235242, MR0429508 Friedman, Harvey; Martin, D. A.; Soare, R. I.; Tait, W. W. (1976), Systems of second order arithmetic with restricted induction, I, II, "Meeting of the Association for Symbolic Logic", The Journal of Symbolic Logic (Association for Symbolic Logic) 41 (2): 557559, ISSN0022-4812, JSTOR2272259 Simpson, Stephen G. (2009), Subsystems of second order arithmetic [3], Perspectives in Logic (2nd ed.), Cambridge University Press, ISBN978-0-521-88439-6, MR2517689 Solomon, Reed (1999), "Ordered groups: a case study in reverse mathematics", The Bulletin of Symbolic Logic 5 (1): 4558, doi:10.2307/421140, ISSN1079-8986, JSTOR421140, MR1681895
External links
Harvey Friedman's home page [1] Stephen G. Simpson's home page [2]
References
[1] http://www.math.ohio-state.edu/~friedman/
[2] http://www.math.psu.edu/simpson/
Ring (mathematics)
Existence of additive inverse: For any integer a, there exists an integer denoted by −a such that a + (−a) = (−a) + a = 0. The element −a is called the additive inverse of a because adding −a to a (in any order) returns the identity.
Commutativity of addition: For any two integers a and b, a + b = b + a. So the order in which two integers are added is irrelevant.
The integers form a multiplicative monoid (a monoid under multiplication); that is:
Closure axiom for multiplication: Given two integers a and b, their product, a · b, is also an integer.
Associativity of multiplication: Given any integers a, b and c, (a · b) · c = a · (b · c). So multiplying b with a, and then multiplying c to this result, is the same as multiplying c with b, and then multiplying a to this result.
Existence of multiplicative identity: For any integer a, a · 1 = 1 · a = a. So multiplying any integer with 1 (in any order) gives back that integer. One is therefore called the multiplicative identity.
Multiplication is distributive over addition: These two structures on the integers (addition and multiplication) are compatible in the sense that a · (b + c) = (a · b) + (a · c), and (a + b) · c = (a · c) + (b · c) for any three integers a, b and c.
Formal definition
There are some differences in exactly what axioms are used to define a ring. Here one set of axioms for a ring with identity is given, and comments on variations follow. A ring is a set R equipped with two binary operations + : R × R → R and · : R × R → R (where × denotes the Cartesian product), called addition and multiplication. To qualify as a ring, the set and two operations, (R, +, ·), must satisfy the following requirements known as the ring axioms.[4] (R, +) is required to be an abelian group under addition:
1. Closure under addition. For all a, b in R, the result of the operation a + b is also in R.c[] 2. Associativity of addition. For all a, b, c in R, the equation (a + b) + c = a + (b + c) holds.
3. Existence of additive identity. There exists an element 0 in R, such that for all elements a in R, the equation 0 + a = a + 0 = a holds. 4. Existence of additive inverse. For each a in R, there exists an element b in R such that a + b = b + a = 0 5. Commutativity of addition. For all a, b in R, the equation a + b = b + a holds.
(R, ·) is required to be a monoid under multiplication:
1. Closure under multiplication. For all a, b in R, the result of the operation a · b is also in R.
2. Associativity of multiplication. For all a, b, c in R, the equation (a · b) · c = a · (b · c) holds.
3. Existence of multiplicative identity.a[] There exists an element 1 in R, such that for all elements a in R, the equation 1 · a = a · 1 = a holds.
The distributive laws:
1. For all a, b, c in R, the equation a · (b + c) = (a · b) + (a · c) holds.
2. For all a, b, c in R, the equation (a + b) · c = (a · c) + (b · c) holds.
This definition assumes that a binary operation on R is a function defined on R × R with values in R. Therefore, for any a and b in R, the addition a + b and the product a · b are elements of R.
The most familiar example of a ring is the set of all integers, Z = {..., −4, −3, −2, −1, 0, 1, 2, 3, 4, ...}, together with the usual operations of addition and multiplication.[3] Another familiar example is the set of real numbers R, equipped with the usual addition and multiplication. Another example of a ring is the set of all square matrices of a fixed size, with real elements, using the matrix addition and multiplication of linear algebra. In this case, the ring elements 0 and 1 are the zero matrix (with all entries equal to 0) and the identity matrix, respectively.
Consider the set Z4 = {0, 1, 2, 3}, with addition and multiplication defined modulo 4 by the following tables:
+ | 0 1 2 3
0 | 0 1 2 3
1 | 1 2 3 0
2 | 2 3 0 1
3 | 3 0 1 2

· | 0 1 2 3
0 | 0 0 0 0
1 | 0 1 2 3
2 | 0 2 0 2
3 | 0 3 2 1
It is simple (but tedious) to verify that Z4 is a ring under these operations. First of all, one can use the addition table to show that Z4 is closed under addition (any result is either 0, 1, 2 or 3). Associativity of addition in Z4 follows from associativity of addition in the set of all integers. The additive identity is 0, as can be verified by looking at the addition table. Given an element x of Z4, there is always an additive inverse of x; this inverse is given by (4 − x) mod 4, as one can verify from the addition table. Therefore, Z4 is an abelian group under addition. Similarly, Z4 is closed under multiplication as the multiplication table shows (any result is either 0, 1, 2 or 3). Associativity of multiplication in Z4 follows from associativity of multiplication in the set of all integers. The multiplicative identity is 1, as can be verified by looking at the multiplication table. Therefore, Z4 is a monoid under multiplication. Distributivity of the two operations over each other follows from distributivity of multiplication over addition in Z (the set of all integers). Therefore, this set does indeed form a ring under the given operations of addition and multiplication.
Properties of this ring
In general, given any two integers x and y, if x · y = 0, then either x is 0 or y is 0. It is interesting to note that this does not hold for the ring (Z4, +, ·): 2 · 2 = 0 although neither factor is 0. In general, a non-zero element a of a ring (R, +, ·) is said to be a zero divisor in (R, +, ·) if there exists a non-zero element b of R such that a · b = 0. So in this ring, the only zero divisor is 2 (note that 0 · a = 0 for any a in a ring (R, +, ·), so 0 is not considered to be a zero divisor). A commutative ring which has no zero divisors is called an integral domain (see below). So Z, the ring of all integers (see above), is an integral domain (and therefore a ring), although Z4 (the above example) does not form an integral domain (but is still a ring). So in general, every integral domain is a ring but not every ring is an integral domain.
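The same verification can be done by brute force. The following illustrative Python snippet checks the ring axioms for Z4 directly, using arithmetic modulo 4 in place of the tables:

from itertools import product

R = range(4)
add = lambda a, b: (a + b) % 4
mul = lambda a, b: (a * b) % 4

for a, b, c in product(R, repeat=3):
    assert add(add(a, b), c) == add(a, add(b, c))            # associativity of addition
    assert mul(mul(a, b), c) == mul(a, mul(b, c))            # associativity of multiplication
    assert mul(a, add(b, c)) == add(mul(a, b), mul(a, c))    # distributivity
for a in R:
    assert add(a, 0) == a and add(0, a) == a                 # additive identity
    assert mul(a, 1) == a and mul(1, a) == a                 # multiplicative identity
    assert add(a, (4 - a) % 4) == 0                          # additive inverse
assert mul(2, 2) == 0                                        # 2 is a zero divisor in Z4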
Basic concepts
Subring
Informally, a subring of a ring is another ring that uses the "same" operations and is contained in it. More formally, suppose (R, +, ·) is a ring, and S is a subset of R such that for every a, b in S, a + b is in S; for every a, b in S, a · b is in S; for every a in S, the additive inverse −a of a is in S; and the multiplicative identity '1' of R is in S.
Let '+_S' and '·_S' denote the operations '+' and '·', restricted to S × S. Then (S, +_S, ·_S) is a subring of (R, +, ·).[9] Since the restricted operations are completely determined by S and the original ones, the subring is often written simply as (S, +, ·). For example, a subring of the complex number ring C is any subset of C that includes 1 and is closed under addition, multiplication, and negation, such as: the rational numbers Q, the algebraic numbers A, and the real numbers R. If A is a subring of R, and B is a subset of A such that B is also a subring of R, then B is a subring of A.
Homomorphism
A homomorphism from a ring (R, +, ·) to a ring (S, ⊕, ∗) is a function f from R to S that commutes with the ring operations; namely, such that, for all a, b in R the following identities hold: f(a + b) = f(a) ⊕ f(b) and f(a · b) = f(a) ∗ f(b). Moreover, the function f must take the identity element 1_R of '·' to the identity element 1_S of '∗' (this is not required if an identity is not required). For example, the function that maps each integer x to its remainder modulo 4 (a number in {0, 1, 2, 3}) is a homomorphism from the ring Z to the ring Z4. A ring homomorphism is said to be an isomorphism if there exists an inverse homomorphism to f (i.e., a ring homomorphism which is an inverse function). Equivalently, any bijective ring homomorphism is a ring isomorphism.
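For instance, the reduction-modulo-4 map mentioned above can be checked, on a sample of integers, to respect both operations (an illustrative Python sketch; f is an ad hoc name):

def f(x):
    """The remainder of x modulo 4, viewed as a map from Z to Z4."""
    return x % 4

for a in range(-10, 10):
    for b in range(-10, 10):
        assert f(a + b) == (f(a) + f(b)) % 4    # f preserves addition
        assert f(a * b) == (f(a) * f(b)) % 4    # f preserves multiplication
assert f(1) == 1                                # f sends the identity to the identity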
Ideal
The purpose of an ideal in a ring is to somehow allow one to define the quotient ring of a ring (analogous to the quotient group of a group; see below). An ideal in a ring can therefore be thought of as a generalization of a normal subgroup in a group. More formally, let (R, +, ·) be a ring. A subset I of R is said to be a right ideal in R if: (I, +) is a subgroup of the underlying additive group in (R, +, ·) (i.e. (I, +) is a subgroup of (R, +)), and for every x in I and r in R, x · r is in I. A left ideal is similarly defined with the second condition being replaced. More specifically, a subset I of R is a left ideal in R if: (I, +) is a subgroup of the underlying additive group in (R, +, ·) (i.e. (I, +) is a subgroup of (R, +)), and for every x in I and r in R, r · x is in I.
Notes
If k is in R, then the set of elements k · R is a right ideal in R, and the set of elements R · k is a left ideal in R. These ideals (for any k in R) are called the principal right and left ideals generated by k. If every ideal in a ring (R, +, ·) is a principal ideal in (R, +, ·), then (R, +, ·) is said to be a principal ideal ring. An ideal in a ring (R, +, ·) is said to be a two-sided ideal if it is both a left ideal and a right ideal in (R, +, ·). It is preferred to call a two-sided ideal simply an ideal. If I = {0} (where 0 is the additive identity of the ring (R, +, ·)), then I is an ideal known as the trivial ideal. Similarly, R is also an ideal in (R, +, ·) called the unit ideal.
Examples
Any additive subgroup of the integers is an ideal in the integers with its natural ring structure. There are no non-trivial ideals in R (the ring of all real numbers), i.e., the only ideals in R are {0} and R itself. More generally, a field cannot contain any non-trivial ideals. From the previous example, every field must be a principal ideal ring. A subset, I, of a commutative ring (R, +, ·) is a left ideal if and only if it is a right ideal. So for simplicity's sake, we refer to any ideal in a commutative ring as just an ideal.
History
The study of rings originated from the theory of polynomial rings and the theory of algebraic integers. Furthermore, the appearance of hypercomplex numbers in the mid-19th century undercut the pre-eminence of fields in mathematical analysis. In the 1880s Richard Dedekind introduced the concept of a ring,[2] and the term ring (Zahlring) was coined by David Hilbert in 1892 and published in the article Die Theorie der algebraischen Zahlkörper, Jahresbericht der Deutschen Mathematiker Vereinigung, Vol. 4, 1897. According to Harvey Cohn, Hilbert used the term for a specific ring that had the property of "circling directly back" to an element of itself.[10] The first axiomatic definition of a ring was given by Adolf Fraenkel in an essay in Journal für die reine und angewandte Mathematik (A. L. Crelle), vol. 145, 1914.[2] [11] In 1921, Emmy Noether gave the first axiomatic foundation of the theory of commutative rings in her monumental paper Ideal Theory in Rings.[2]
A portrait of Richard Dedekind: one of the founders of ring theory.
Polynomial ring
Formal definition
Let (R, +_R, ·_R) be a ring and let S denote the set of all functions f from N₀ to R that are zero at all but finitely many arguments, where N₀ is the set of all nonnegative integers; the convention that 0 belongs to N₀ is adopted. Define +_S : S × S → S and ·_S : S × S → S as follows, where f and g are arbitrary elements of S and n ranges over N₀:
(f +_S g)(n) = f(n) +_R g(n)
(f ·_S g)(n) = f(0) ·_R g(n) +_R f(1) ·_R g(n − 1) +_R ... +_R f(n) ·_R g(0)
Then (S, +_S, ·_S) is a ring referred to as the polynomial ring over R.
Notes
Most authors write S as R[X], where X denotes the element of S that takes the value 1 at the argument 1 and the value 0 elsewhere. This convention is adopted because if f is an element of S, then f can be written as a polynomial f = a_0 + a_1X + a_2X² + ... + a_nX^n with coefficients a_i = f(i) in R.
This allows one to view S as merely the set of all polynomials over R in the variable X, with multiplication and addition of polynomials defined in the canonical manner. Therefore, in all that follows, S, denoted by R[X], shall be identified in this fashion. Addition and multiplication in S, and those of the underlying ring R, will be denoted by juxtaposition.
Matrix ring
Formal definition
Let (R, +_R, ·_R) be a ring and let M_r(R) denote the set of all functions from {1, ..., r} × {1, ..., r} to R. Define +_M : M_r(R) × M_r(R) → M_r(R) and ·_M : M_r(R) × M_r(R) → M_r(R) as follows, where f and g are arbitrary elements of M_r(R) and 1 ≤ i, j ≤ r:
(f +_M g)(i, j) = f(i, j) +_R g(i, j)
(f ·_M g)(i, j) = f(i, 1) ·_R g(1, j) +_R f(i, 2) ·_R g(2, j) +_R ... +_R f(i, r) ·_R g(r, j)
Then (M_r(R), +_M, ·_M) is a ring referred to as the ring of r × r matrices over R.
Notes
The ring M_r(R) can be viewed as merely the ring of r × r matrices over R with matrix addition and multiplication defined in the canonical manner. For f given in M_r(R), f can be identified with the r × r matrix whose (i, j)-entry is f(i, j). Therefore, in all that follows, M_r(R) and each of its elements shall be identified in this fashion. Addition and multiplication in M_r(R), and those of the underlying ring R, will be denoted by juxtaposition.
useful tool for distinguishing between certain pairs of topological spaces, like the spheres and tori, for which the methods of point-set topology are not well-suited. Cohomology groups were later defined in terms of homology groups in a way which is roughly analogous to the dual of a vector space. To know each individual integral homology group is essentially the same as knowing each individual integral cohomology group, because of the universal coefficient theorem. However, the advantage of the cohomology groups is that there is a natural product, which is analogous to the observation that one can multiply pointwise a k-multilinear form and an l-multilinear form to get a (k + l)-mulilinear form. The ring structure in cohomology provides the foundation for characteristic classes of fiber bundles, intersection theory on manifolds and algebraic varieties, Schubert calculus and much more.
Associative algebras
An associative algebra is a ring that is also a vector space over a field K. For instance, the set of n by n matrices over the real field R has dimension n² as a real vector space, and the matrix multiplication corresponds to the ring multiplication. For a non-trivial but elementary example consider 2 × 2 real matrices.
Lie ring
A Lie ring is defined to be a ring that is nonassociative and anticommutative under multiplication, and that also satisfies the Jacobi identity. More specifically, we can define a Lie ring to be an abelian group under addition with an operation [·, ·] that has the following properties:
Bilinearity: [x + y, z] = [x, z] + [y, z] and [z, x + y] = [z, x] + [z, y] for all x, y, z.
The Jacobi identity: [[x, y], z] + [[y, z], x] + [[z, x], y] = 0 for all x, y, z.
For all x: [x, x] = 0.
Lie rings need not be Lie groups under addition. Any Lie algebra is an example of a Lie ring. Any associative ring can be made into a Lie ring by defining the bracket operator [x, y] = x·y − y·x. Conversely to any Lie algebra there is a corresponding ring, called the universal enveloping algebra. Lie rings are used in the study of finite p-groups through the Lazard correspondence. The lower central factors of a p-group are finite abelian p-groups, so modules over Z/pZ. The direct sum of the lower central factors is given the structure of a Lie ring by defining the bracket to be the commutator of two coset representatives. The Lie ring structure is enriched with another module homomorphism, the pth power map, making the associated Lie ring a so-called restricted Lie ring. Lie rings are also useful in the definition of p-adic analytic groups and their endomorphisms by studying Lie algebras over rings of integers such as the p-adic integers. The definition of finite groups of Lie type due to Chevalley involves restricting from a Lie algebra over the complex numbers to a Lie algebra over the integers, and then reducing modulo p to get a Lie algebra over a finite field.
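The commutator bracket [x, y] = xy - yx can be tested mechanically; the following illustrative Python sketch (the matrix helpers are ad hoc) verifies [x, x] = 0 and the Jacobi identity for a few random 2 × 2 integer matrices:

import random

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def mat_sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def bracket(x, y):
    """Commutator bracket [x, y] = x*y - y*x."""
    return mat_sub(mat_mul(x, y), mat_mul(y, x))

def rand_mat():
    return [[random.randint(-3, 3) for _ in range(2)] for _ in range(2)]

ZERO = [[0, 0], [0, 0]]
for _ in range(100):
    x, y, z = rand_mat(), rand_mat(), rand_mat()
    assert bracket(x, x) == ZERO                     # [x, x] = 0
    jacobi = mat_add(mat_add(bracket(bracket(x, y), z),
                             bracket(bracket(y, z), x)),
                     bracket(bracket(z, x), y))
    assert jacobi == ZERO                            # Jacobi identity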
Topological ring
Let (X, T) be a topological space and (X, +, ·) be a ring. Then (X, T, +, ·) is said to be a topological ring if its ring structure and topological structure are compatible with each other (i.e., work together). That is, the addition map ((x, y) ↦ x + y) and the multiplication map ((x, y) ↦ x · y) have to be both continuous as maps X × X → X, where X × X inherits the product topology. So clearly, any topological ring is a topological group (under addition). Examples: The set of all real numbers, R, with its natural ring structure and the standard topology forms a topological ring. The direct product of two topological rings is also a topological ring.
Commutative rings
Although ring addition is commutative, so that a + b = b + a for every a, b in R, ring multiplication is not required to be commutative; a · b need not equal b · a for all a, b in R. Rings that also satisfy commutativity for multiplication are called commutative rings.[e]
Formal definition
Let (R, +, ·) be a ring. Then (R, +, ·) is said to be a commutative ring if a · b = b · a for every a, b in R. That is, R is required to be a commutative monoid under multiplication.
Examples
The integers form a commutative ring under the natural operations of addition and multiplication. An example of a noncommutative ring is the ring of n × n matrices over a non-trivial field K, for n > 1. In particular, the 2 × 2 matrices over R (the set of all real numbers) do not form a commutative ring, as the following computation shows (the matrices are written row by row; any pair of non-commuting matrices would serve):
[[0, 1], [0, 0]] · [[0, 0], [1, 0]] = [[1, 0], [0, 0]], which is not equal to [[0, 0], [1, 0]] · [[0, 1], [0, 0]] = [[0, 0], [0, 1]].
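The computation above can be checked mechanically; a minimal Python sketch using the same two sample matrices (an illustrative choice, not prescribed by the article):

```python
# Illustration: 2x2 matrix multiplication is not commutative.

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

a = [[0, 1], [0, 0]]
b = [[0, 0], [1, 0]]

print(mat_mul(a, b))  # [[1, 0], [0, 0]]
print(mat_mul(b, a))  # [[0, 0], [0, 1]]
assert mat_mul(a, b) != mat_mul(b, a)
```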
and the b_j's, b_i = a_i·u_i where u_i is a unit in R. The second condition above guarantees that "non-trivial" elements of R can be decomposed into irreducibles, and according to the third condition, such a finite decomposition is unique "up to multiplication by unit elements." This weakened form of uniqueness is reasonable to assume, for otherwise even the integers would not satisfy the properties of being a UFD ((−2)·(−2) = 2·2 = 4 demonstrates two "distinct" decompositions of 4; however, both decompositions of 4 are equivalent up to multiplication by units (−1 and +1)). The fact that the integers constitute a UFD follows from the fundamental theorem of arithmetic. For arbitrary rings, one may define a prime element and an irreducible element; these two notions do not in general coincide. However, a prime element in a domain is always irreducible, and in a UFD irreducible elements are also prime. The class of unique factorization domains is related to other classes of rings. For instance, any Noetherian domain satisfies conditions 1 and 2 above, but in general Noetherian domains fail to satisfy condition 3. However, if the set of prime elements and the set of irreducible elements coincide for a Noetherian domain, the third condition of a UFD is satisfied. In particular, principal ideal domains are UFDs.
Examples
The integers form a commutative ring under the natural operations of addition and multiplication. In fact, the integers form what is known as an integral domain (a commutative ring with no zero divisors). A ring whose non-zero elements form an abelian group under multiplication (not just a commutative monoid) is called a field. So every field is an integral domain and every integral domain is a commutative ring. Furthermore, any finite integral domain is a field.
Relation to other algebraic structures
The following is a chain of class inclusions that describes the relationship between rings, domains and fields:
Commutative rings ⊃ integral domains ⊃ integrally closed domains ⊃ unique factorization domains ⊃ principal ideal domains ⊃ Euclidean domains ⊃ fields
Fields and integral domains are very important in modern algebra.
Noncommutative rings
The study of noncommutative rings is a major area in modern algebra, especially in ring theory. Often noncommutative rings possess interesting invariants that commutative rings do not. As an example, there exist rings which contain non-trivial proper left or right ideals but are still simple, that is, contain no non-trivial proper (two-sided) ideals. This example illustrates how much care must be taken when studying noncommutative rings, because intuition from the commutative case can be misleading.

The theory of vector spaces is one illustration of a special case of an object studied in noncommutative ring theory. In linear algebra, the "scalars of a vector space" are required to lie in a field, a commutative division ring. The concept of a module, however, requires only that the scalars lie in an abstract ring; neither commutativity nor the division ring assumption is required of the scalars in this case. Module theory has various applications in noncommutative ring theory, as one can often obtain information about the structure of a ring by making use of its modules. The concept of the Jacobson radical of a ring, that is, the intersection of the annihilators of all simple right (equivalently, left) modules over the ring, is one example. The fact that the Jacobson radical can be viewed as the intersection of all maximal right/left ideals in the ring shows how the internal structure of the ring is reflected by its modules. It is also remarkable that the intersection of all maximal right ideals in a ring is the same as the intersection of all maximal left ideals in the ring, for all rings, whether commutative or noncommutative. Therefore, the Jacobson radical captures a concept which may at first seem not to be well-defined for noncommutative rings.

Noncommutative rings are an active area of research because of their ubiquity in mathematics. For instance, the ring of n × n matrices over a field is noncommutative despite its natural occurrence in physics. More generally, endomorphism rings of abelian groups are rarely commutative.

Noncommutative rings, like noncommutative groups, are not very well understood. For instance, although every finite abelian group is the direct sum of (finite) cyclic groups of prime-power order, non-abelian groups do not possess such a simple structure. Likewise, various invariants exist for commutative rings, whereas invariants of noncommutative rings are difficult to find. As an example, the nilradical, although "innocent" in nature, need not be an ideal unless the ring is assumed to be commutative. Specifically, the set of all nilpotent elements in the ring of all n × n matrices over a division ring never forms an ideal, irrespective of the division ring chosen. Therefore, the nilradical cannot be studied in noncommutative ring theory in the same way; note, however, that there are analogues of the nilradical defined for noncommutative rings which coincide with the nilradical when commutativity is assumed. One of the best known noncommutative rings is the division ring of quaternions.
Notes
^a: Some authors only require that a ring be a semigroup under multiplication; that is, they do not require that there be a multiplicative identity (1). See the section Notes on the definition for more details.
^b: Elements which do have multiplicative inverses are called units; see Lang 2002, §II.1, p. 84.
^c: The closure axiom is already implied by the condition that + and · be binary operations. Some authors therefore omit this axiom. (Lang 2002)
^d: The transition from the integers to the rationals by adding fractions is generalized by the quotient field.
^e: Many authors include commutativity of rings in the set of ring axioms (see above) and therefore refer to "commutative rings" as just "rings".
Citations
[1] Herstein 1964, §3, p. 83
[2] The development of Ring Theory (http://www-gap.dcs.st-and.ac.uk/~history/HistTopics/Ring_theory.html)
[3] Lang 2005, App. 2, p. 360
[4] Herstein 1975, §2.1, p. 27
[5] Herstein, I. N. Topics in Algebra, Wiley; 2nd edition (June 20, 1975), ISBN 0-471-01090-1.
[6] Joseph Gallian (2004), Contemporary Abstract Algebra, Houghton Mifflin, ISBN 978-0618514717
[7] Neal H. McCoy (1964), The Theory of Rings, The Macmillan Company, p. 161, ISBN 978-1124045559
[8] Raymond Louis Wilder (1965), Introduction to Foundations of Mathematics, John Wiley and Sons, p. 176
[9] Lang 2005, §II.1, p. 90
[10] Cohn, Harvey (1980), Advanced Number Theory, New York: Dover Publications, p. 49, ISBN 978-0486640235
[11] Jacobson (2009), p. 86, footnote 1.
[12] Jacobson 1945
[13] Pinter-Lucke 2007
[14] http://www.research.att.com/~njas/sequences/A027623
[15] Jacobson (2009), p. 162, Theorem 3.2.
References
General references
R.B.J.T. Allenby (1991), Rings, Fields and Groups, Butterworth-Heinemann, ISBN 0-340-54440-6
Atiyah, M. F.; Macdonald, I. G., Introduction to commutative algebra. Addison-Wesley Publishing Co., Reading, Mass.-London-Don Mills, Ont., 1969. ix+128 pp.
Beachy, J. A. Introductory Lectures on Rings and Modules. Cambridge, England: Cambridge University Press, 1999.
T.S. Blyth and E.F. Robertson (1985), Groups, rings and fields: Algebra through practice, Book 3, Cambridge University Press, ISBN 0-521-27288-2
Dresden, G. "Small Rings." (http://home.wlu.edu/~dresdeng/smallrings/)
Ellis, G. Rings and Fields. Oxford, England: Oxford University Press, 1993.
Goodearl, K. R.; Warfield, R. B., Jr., An introduction to noncommutative Noetherian rings. London Mathematical Society Student Texts, 16. Cambridge University Press, Cambridge, 1989. xviii+303 pp. ISBN 0-521-36086-2
Herstein, I. N., Noncommutative rings. Reprint of the 1968 original. With an afterword by Lance W. Small. Carus Mathematical Monographs, 15. Mathematical Association of America, Washington, DC, 1994. xii+202 pp. ISBN 0-88385-015-X
Jacobson, Nathan (2009), Basic algebra, 1 (2nd ed.), Dover, ISBN 978-0-486-47189-1
Nagell, T. "Moduls, Rings, and Fields." §6 in Introduction to Number Theory. New York: Wiley, pp. 19–21, 1951.
Nathan Jacobson, Structure of rings. American Mathematical Society Colloquium Publications, Vol. 37. Revised edition. American Mathematical Society, Providence, R.I., 1964. ix+299 pp.
Nathan Jacobson, The Theory of Rings. American Mathematical Society Mathematical Surveys, vol. I. American Mathematical Society, New York, 1943. vi+150 pp.
Kaplansky, Irving (1974), Commutative rings (Revised ed.), University of Chicago Press, ISBN 0226424545, MR0345945
Lam, T. Y., A first course in noncommutative rings. Second edition. Graduate Texts in Mathematics, 131. Springer-Verlag, New York, 2001. xx+385 pp. ISBN 0-387-95183-0
Lam, T. Y., Exercises in classical ring theory. Second edition. Problem Books in Mathematics. Springer-Verlag, New York, 2003. xx+359 pp. ISBN 0-387-00500-5
Lam, T. Y., Lectures on modules and rings. Graduate Texts in Mathematics, 189. Springer-Verlag, New York, 1999. xxiv+557 pp. ISBN 0-387-98428-3
Lang, Serge (2002), Algebra, Graduate Texts in Mathematics, 211 (Revised third ed.), New York: Springer-Verlag, ISBN 978-0-387-95385-4, MR1878556
Lang, Serge (2005), Undergraduate Algebra (3rd ed.), Berlin, New York: Springer-Verlag, ISBN 978-0-387-22025-3.
Matsumura, Hideyuki (1989), Commutative Ring Theory, Cambridge Studies in Advanced Mathematics (2nd ed.), Cambridge University Press, ISBN 978-0-521-36764-6
McConnell, J. C.; Robson, J. C., Noncommutative Noetherian rings. Revised edition. Graduate Studies in Mathematics, 30. American Mathematical Society, Providence, RI, 2001. xx+636 pp. ISBN 0-8218-2169-5
Pinter-Lucke, James (2007), "Commutativity conditions for rings: 1950–2005", Expositiones Mathematicae 25 (2): 165–174, doi:10.1016/j.exmath.2006.07.001, ISSN 0723-0869
Rowen, Louis H., Ring theory. Vol. I, II. Pure and Applied Mathematics, 127, 128. Academic Press, Inc., Boston, MA, 1988. ISBN 0-12-599841-4, ISBN 0-12-599842-2
Sloane, N. J. A. Sequences A027623 and A037234 in "The On-Line Encyclopedia of Integer Sequences"
Zwillinger, D. (Ed.). "Rings." §2.6.3 in CRC Standard Mathematical Tables and Formulae. Boca Raton, FL: CRC Press, pp. 141–143, 1995.
Special references
Balcerzyk, Stanisław; Józefiak, Tadeusz (1989), Commutative Noetherian and Krull rings, Ellis Horwood Series: Mathematics and its Applications, Chichester: Ellis Horwood Ltd., ISBN 978-0-13-155615-7
Balcerzyk, Stanisław; Józefiak, Tadeusz (1989), Dimension, multiplicity and homological methods, Ellis Horwood Series: Mathematics and its Applications, Chichester: Ellis Horwood Ltd., ISBN 978-0-13-155623-2
Ballieu, R. "Anneaux finis; systèmes hypercomplexes de rang trois sur un corps commutatif." Ann. Soc. Sci. Bruxelles Sér. I 61, 222–227, 1947.
Berrick, A. J. and Keating, M. E. An Introduction to Rings and Modules with K-Theory in View. Cambridge, England: Cambridge University Press, 2000.
Eisenbud, David (1995), Commutative algebra. With a view toward algebraic geometry, Graduate Texts in Mathematics, 150, Berlin, New York: Springer-Verlag, ISBN 978-0-387-94268-1; 978-0-387-94269-8, MR1322960
Fine, B. "Classification of Finite Rings of Order p²." Math. Mag. 66, 248–252, 1993.
Fletcher, C. R. "Rings of Small Order." Math. Gaz. 64, 9–22, 1980.
Fraenkel, A. "Über die Teiler der Null und die Zerlegung von Ringen." J. reine angew. Math. 145, 139–176, 1914.
Gilmer, R. and Mott, J. "Associative Rings of Order p³." Proc. Japan Acad. 49, 795–799, 1973.
Harris, J. W. and Stocker, H. Handbook of Mathematics and Computational Science. New York: Springer-Verlag, 1998.
Jacobson, Nathan (1945), "Structure theory of algebraic algebras of bounded degree", Annals of Mathematics 46 (4): 695–707, doi:10.2307/1969205, ISSN 0003-486X, JSTOR 1969205
Knuth, D. E. The Art of Computer Programming, Vol. 2: Seminumerical Algorithms, 3rd ed. Reading, MA: Addison-Wesley, 1998.
Korn, G. A. and Korn, T. M. Mathematical Handbook for Scientists and Engineers. New York: Dover, 2000.
Nagata, Masayoshi (1962), Local rings, Interscience Tracts in Pure and Applied Mathematics, 13, Interscience Publishers, pp. xiii+234, ISBN 978-0-88275-228-0 (1975 reprint), MR0155856
Pierce, Richard S., Associative algebras. Graduate Texts in Mathematics, 88. Studies in the History of Modern Science, 9. Springer-Verlag, New York–Berlin, 1982. xii+436 pp. ISBN 0-387-90693-2
Zariski, Oscar; Samuel, Pierre (1975), Commutative algebra, Graduate Texts in Mathematics, 28, 29, Berlin, New York: Springer-Verlag, ISBN 0387900896
Historical references
History of ring theory at the MacTutor Archive (http://www-gap.dcs.st-and.ac.uk/~history/HistTopics/Ring_theory.html)
Birkhoff, G. and Mac Lane, S. A Survey of Modern Algebra, 5th ed. New York: Macmillan, 1996.
Bronshtein, I. N. and Semendyayev, K. A. Handbook of Mathematics, 4th ed. New York: Springer-Verlag, 2004. ISBN 3-540-43491-7
Faith, Carl, Rings and things and a fine array of twentieth century associative algebra. Mathematical Surveys and Monographs, 65. American Mathematical Society, Providence, RI, 1999. xxxiv+422 pp. ISBN 0-8218-0993-8
Itô, K. (Ed.). "Rings." §368 in Encyclopedic Dictionary of Mathematics, 2nd ed., Vol. 2. Cambridge, MA: MIT Press, 1986.
Kleiner, I. "The Genesis of the Abstract Ring Concept." Amer. Math. Monthly 103, 417–424, 1996.
Renteln, P. and Dundes, A. "Foolproof: A Sampling of Mathematical Folk Humor." Notices Amer. Math. Soc. 52, 24–34, 2005.
Singmaster, D. and Bloom, D. M. "Problem E1648." Amer. Math. Monthly 71, 918–920, 1964.
Van der Waerden, B. L. A History of Algebra. New York: Springer-Verlag, 1985.
Wolfram, S. A New Kind of Science. Champaign, IL: Wolfram Media, p. 1168, 2002.
Commutative ring
In ring theory, a branch of abstract algebra, a commutative ring is a ring in which the multiplication operation is commutative. The study of commutative rings is called commutative algebra. Some specific kinds of commutative rings are given with the following chain of class inclusions:
Commutative rings ⊃ integral domains ⊃ integrally closed domains ⊃ unique factorization domains ⊃ principal ideal domains ⊃ Euclidean domains ⊃ fields
First examples
An important example, and in some sense crucial, is the ring of integers Z with the two operations of addition and multiplication. As the multiplication of integers is a commutative operation, this is a commutative ring. It is usually denoted Z as an abbreviation of the German word Zahlen (numbers). A field is a commutative ring where every non-zero element a is invertible; i.e., has a multiplicative inverse b such that a · b = 1. Therefore, by definition, any field is a commutative ring. The rational, real and complex numbers form fields. The ring of 2 × 2 matrices is not commutative, since matrix multiplication fails to be commutative, as the following example shows (any pair of non-commuting matrices would serve):
[[0, 1], [0, 0]] · [[0, 0], [1, 0]] = [[1, 0], [0, 0]], whereas [[0, 0], [1, 0]] · [[0, 1], [0, 0]] = [[0, 0], [0, 1]].
However, matrices that can be diagonalized with the same similarity transformation do form a commutative ring. An example is the set of matrices of divided differences with respect to a fixed set of nodes. If R is a given commutative ring, then the set of all polynomials in the variable X whose coefficients are in R forms the polynomial ring, denoted R[X]. The same holds true for several variables. If V is some topological space, for example a subset of some R^n, real- or complex-valued continuous functions on V form a commutative ring. The same is true for differentiable or holomorphic functions, when the two concepts are defined, such as for V a complex manifold.
Localizations
The localization of a ring is the counterpart to factor rings insofar as in a factor ring R / I certain elements (namely the elements of I) become zero, whereas in the localization certain elements are rendered invertible, i.e. multiplicative inverses are added to the ring. Concretely, if S is a multiplicatively closed subset of R (i.e. whenever s, t ∈ S then so is st), then the localization of R at S, or ring of fractions with denominators in S, usually denoted S⁻¹R, consists of symbols r/s with r ∈ R, s ∈ S, subject to certain rules that mimic the cancellation familiar from rational numbers. Indeed, in this language Q is the localization of Z at all nonzero integers. This construction works for any integral domain R instead of Z. The localization (R \ {0})⁻¹R is called the quotient field of R. If S consists of the powers of one fixed element f, the localization is written R_f.
Ring homomorphisms
As usual in algebra, a function f between two objects that respects the structures of the objects in question is called a homomorphism. In the case of rings, a ring homomorphism is a map f : R → S such that f(a + b) = f(a) + f(b), f(ab) = f(a)f(b) and f(1) = 1. These conditions ensure f(0) = 0, but the requirement that the multiplicative identity element 1 be preserved under f does not follow from the two other properties. In such a situation S is also called an R-algebra, by understanding that s in S may be multiplied by some r of R, by setting r · s := f(r) · s. The kernel and image of f are defined by ker(f) = {r ∈ R : f(r) = 0} and im(f) = f(R) = {f(r) : r ∈ R}. The kernel is an ideal of R, and the image is a subring of S.
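As an illustration of these conditions (a hedged sketch; the choice n = 12 is arbitrary), reduction modulo n from Z to Z/nZ is a ring homomorphism, and the defining properties can be verified directly on sample values:

```python
# Reduction modulo n as a ring homomorphism f : Z -> Z/nZ (illustrative check).

n = 12
def f(a):
    return a % n

for a in range(-20, 20):
    for b in range(-20, 20):
        assert f(a + b) == (f(a) + f(b)) % n   # additivity
        assert f(a * b) == (f(a) * f(b)) % n   # multiplicativity
assert f(1) == 1                               # the identity is preserved
print("f is a ring homomorphism on the sampled values; its kernel is the ideal 12Z")
```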
Modules
The outer structure of a commutative ring is determined by considering linear algebra over that ring, i.e., by investigating the theory of its modules, which are similar to vector spaces, except that the base is not necessarily a field, but can be any ring R. The theory of R-modules is significantly more difficult than linear algebra of vector spaces. Module theory has to grapple with difficulties such as modules not having bases, that the rank of a free module (i.e. the analog of the dimension of vector spaces) may not be well-defined and that submodules of finitely generated modules need not be finitely generated (unless R is Noetherian, see below). Ideals within a ring R can be characterized as R-modules which are submodules of R. On the one hand, a good understanding of R-modules necessitates enough information about R. Vice versa, however, many techniques in commutative algebra that study the structure of R, by examining its ideals, proceed by studying modules in general.
Noetherian rings
A ring is called Noetherian (in honor of Emmy Noether, who developed this concept) if every ascending chain of ideals
0 ⊆ I_0 ⊆ I_1 ⊆ ... ⊆ I_n ⊆ I_{n+1} ⊆ ...
becomes stationary, i.e. becomes constant beyond some index n. Equivalently, any ideal is generated by finitely many elements, or, yet equivalently, submodules of finitely generated modules are finitely generated. A ring is called Artinian (after Emil Artin) if every descending chain of ideals
R ⊇ I_0 ⊇ I_1 ⊇ ... ⊇ I_n ⊇ I_{n+1} ⊇ ...
becomes stationary eventually. Despite the two conditions appearing symmetric, Noetherian rings are much more general than Artinian rings. For example, Z is Noetherian, since every ideal can be generated by one element, but is not Artinian, as the chain Z ⊋ 2Z ⊋ 4Z ⊋ 8Z ⊋ ... shows. In fact, every Artinian ring is Noetherian. Being Noetherian is an extremely important finiteness condition, and the condition is preserved under many operations that occur frequently in geometry: if R is Noetherian, then so is the polynomial ring R[X_1, X_2, ..., X_n] (by Hilbert's basis theorem), any localization S⁻¹R, and any factor ring R / I.
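A small sketch of what being Noetherian means in Z (assuming only the standard facts that every ideal of Z has the form dZ and that the ideal generated by several integers is generated by their gcd): adding generators one at a time gives an ascending chain of ideals that becomes stationary, whereas the descending chain Z ⊋ 2Z ⊋ 4Z ⊋ ... above never does.

```python
# Ascending chains of ideals in Z stabilize: (a_1, ..., a_k) = gcd(a_1, ..., a_k) * Z.

from math import gcd

generators = [360, 84, 90, 25, 1050, 7]   # an arbitrary sample
d = 0                                     # the zero ideal
chain = []
for a in generators:
    d = gcd(d, a)                         # ideal generated so far, as gcd * Z
    chain.append(d)

print(chain)   # [360, 12, 6, 1, 1, 1] -- constant from some point on
```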
Dimension
The Krull dimension (or simply dimension) dim R of a ring R is a notion to measure the "size" of a ring, very roughly by counting independent elements in R. Precisely, it is defined as the supremum of lengths n of chains of prime ideals
p_0 ⊊ p_1 ⊊ ... ⊊ p_n.
For example, a field is zero-dimensional, since the only prime ideal is the zero ideal. It is also known that a commutative ring is Artinian if and only if it is Noetherian and zero-dimensional (i.e., all its prime ideals are maximal). The integers are one-dimensional: any chain of prime ideals is of the form 0 = p_0 ⊊ pZ = p_1, where p is a prime number, since any ideal in Z is principal. The dimension behaves well if the rings in question are Noetherian: the expected equality dim R[X] = dim R + 1 holds in this case (in general, one has only dim R + 1 ≤ dim R[X] ≤ 2 · dim R + 1). Furthermore, since the dimension depends only on one maximal chain, the dimension of R is the supremum of all dimensions of its localizations R_p, where p is an arbitrary prime ideal. Intuitively, the dimension of R is a local property of the spectrum of R. Therefore, the dimension is often considered for local rings only, also since general Noetherian rings may still be infinite-dimensional, despite all their localizations being finite-dimensional. Determining the dimension of, say, k[X_1, X_2, ..., X_n] / (f_1, f_2, ..., f_m), where k is a field and the f_i are some polynomials in n variables, is generally not easy. For R Noetherian, the dimension of R / I is, by Krull's principal ideal theorem, at least dim R − n if I is generated by n elements. If the dimension drops as much as possible, i.e. dim R / I = dim R − n, then R / I is called a complete intersection. A local ring R, i.e. one with only one maximal ideal m, is called regular if the (Krull) dimension of R equals the dimension (as a vector space over the field R / m) of the cotangent space m / m².
Completions
If I is an ideal in a commutative ring R, the powers of I form topological neighborhoods of 0 which allow R to be viewed as a topological ring. This topology is called the I-adic topology. R can then be completed with respect to this topology. Formally, the I-adic completion is the inverse limit of the rings R/I^n. For example, if k is a field, k[[X]], the formal power series ring in one variable over k, is the I-adic completion of k[X], where I is the principal ideal generated by X. Analogously, the ring of p-adic integers is the I-adic completion of Z, where I is the principal ideal generated by p. Any ring that is isomorphic to its own completion is called complete.
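As a small illustration of the inverse-limit description (the element −1 of the 5-adic integers is an arbitrary choice made here), the I-adic completion of Z at I = (5) consists of sequences of residues modulo 5^n that are compatible under reduction:

```python
# The element -1 of Z_5 viewed as a compatible sequence in the inverse limit of Z/5^n Z.

p = 5
levels = [(-1) % p**n for n in range(1, 7)]
print(levels)   # [4, 24, 124, 624, 3124, 15624]

# Compatibility: each entry reduces to the previous one modulo the smaller power of p.
for n in range(1, len(levels)):
    assert levels[n] % p**n == levels[n - 1]
```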
Properties
By Wedderburn's theorem, every finite division ring is commutative, and therefore a finite field. Another condition ensuring commutativity of a ring, due to Jacobson, is the following: for every element r of R there exists an integer n > 1 such that r^n = r.[2] If r² = r for every r, the ring is called a Boolean ring. More general conditions which guarantee commutativity of a ring are also known.[3]
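Both conditions are easy to check computationally in the simplest cases (an illustrative sketch; p = 7 is an arbitrary choice): in Z/pZ Fermat's little theorem gives r^p = r, so Jacobson's criterion applies with n = p, and in Z/2Z every element satisfies r² = r.

```python
# Jacobson's condition r^n = r in Z/pZ (with n = p), and the Boolean ring Z/2Z.

p = 7
assert all(pow(r, p, p) == r for r in range(p))    # Fermat's little theorem
assert all((r * r) % 2 == r for r in range(2))     # r^2 = r in Z/2Z
print("Jacobson's condition holds for Z/7Z with n = 7")
```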
Notes
[1] This notion can be related to the spectrum of a linear operator, see Spectrum of a C*-algebra and Gelfand representation.
[2] Jacobson 1945
[3] Pinter-Lucke 2007
References
Atiyah, Michael; Macdonald, I. G. (1969), Introduction to commutative algebra, Addison-Wesley Publishing Co.
Balcerzyk, Stanisław; Józefiak, Tadeusz (1989), Commutative Noetherian and Krull rings, Ellis Horwood Series: Mathematics and its Applications, Chichester: Ellis Horwood Ltd., ISBN 978-0-13-155615-7
Balcerzyk, Stanisław; Józefiak, Tadeusz (1989), Dimension, multiplicity and homological methods, Ellis Horwood Series: Mathematics and its Applications, Chichester: Ellis Horwood Ltd., ISBN 978-0-13-155623-2
Eisenbud, David (1995), Commutative algebra. With a view toward algebraic geometry, Graduate Texts in Mathematics, 150, Berlin, New York: Springer-Verlag, ISBN 978-0-387-94268-1; 978-0-387-94269-8, MR1322960
Jacobson, Nathan (1945), "Structure theory of algebraic algebras of bounded degree", Annals of Mathematics 46 (4): 695–707, doi:10.2307/1969205, ISSN 0003-486X, JSTOR 1969205
Kaplansky, Irving (1974), Commutative rings (Revised ed.), University of Chicago Press, MR0345945
Matsumura, Hideyuki (1989), Commutative Ring Theory, Cambridge Studies in Advanced Mathematics (2nd ed.), Cambridge University Press, ISBN 978-0-521-36764-6
Nagata, Masayoshi (1962), Local rings, Interscience Tracts in Pure and Applied Mathematics, 13, Interscience Publishers, pp. xiii+234, ISBN 978-0-88275-228-0 (1975 reprint), MR0155856
Pinter-Lucke, James (2007), "Commutativity conditions for rings: 1950–2005", Expositiones Mathematicae 25 (2): 165–174, doi:10.1016/j.exmath.2006.07.001, ISSN 0723-0869
Zariski, Oscar; Samuel, Pierre (1958–60), Commutative Algebra I, II, University series in Higher Mathematics, Princeton, N.J.: D. van Nostrand, Inc. (Reprinted 1975–76 by Springer as volumes 28–29 of Graduate Texts in Mathematics.)
Field (mathematics)
In abstract algebra, a field is a commutative ring whose nonzero elements form a group under multiplication. As such it is an algebraic structure with notions of addition, subtraction, multiplication, and division, satisfying certain axioms. The most commonly used fields are the field of real numbers, the field of complex numbers, and the field of rational numbers, but there are also finite fields, fields of functions, various algebraic number fields, p-adic fields, and so forth. To avoid confusion with other uses of the word "field", the term "corpus" may also be used. Any field may be used as the scalars for a vector space, which is the standard general context for linear algebra. The theory of field extensions (including Galois theory) involves the roots of polynomials with coefficients in a field; among other results, this theory leads to impossibility proofs for the classical problems of angle trisection and squaring the circle with a compass and straightedge, as well as a proof of the Abel–Ruffini theorem on the algebraic insolubility of quintic equations. In modern mathematics, the theory of fields (or field theory) plays an essential role in number theory and algebraic geometry. As an algebraic structure, every field is a ring, but not every ring is a field. The most important difference is that fields allow for division (though not division by zero), while a ring need not possess multiplicative inverses; for example the integers form a ring, but 2x = 1 has no solution in integers. Also, the multiplication operation in a field is required to be commutative. A ring in which division is possible but commutativity is not assumed (such as the quaternions) is called a division ring or skew field. (Historically, division rings were sometimes referred to as fields, while fields were called commutative fields.) As a ring, a field may be classified as a specific type of integral domain, and can be characterized by the following (not exhaustive) chain of class inclusions:
Commutative rings ⊃ integral domains ⊃ integrally closed domains ⊃ unique factorization domains ⊃ principal ideal domains ⊃ Euclidean domains ⊃ fields ⊃ finite fields.
For every a in F, there exists an element −a in F, such that a + (−a) = 0. Similarly, for any a in F other than 0, there exists an element a⁻¹ in F, such that a · a⁻¹ = 1. (The elements a + (−b) and a · b⁻¹ are also denoted a − b and a/b, respectively.) In other words, subtraction and division operations exist.[1]
Distributivity of multiplication over addition
For all a, b and c in F, the following equality holds: a · (b + c) = (a · b) + (a · c).
A field is therefore an algebraic structure ⟨F, +, ·, −, ⁻¹, 0, 1⟩ consisting of two abelian groups: F under +, −, and 0; and F \ {0} under ·, ⁻¹, and 1, with 0 ≠ 1, and with · distributing over +.[2]
The abstractly required field axioms reduce to standard properties of rational numbers, such as the law of distributivity.
Multiplication table:
·  O  I  A  B
O  O  O  O  O
I  O  I  A  B
A  O  A  B  I
B  O  B  I  A

Addition table:
+  O  I  A  B
O  O  I  A  B
I  I  O  B  A
A  A  B  O  I
B  B  A  I  O
In addition to familiar number systems such as the rationals, there are other, less immediate examples of fields. The following example is a field consisting of four elements called O, I, A and B. The notation is chosen such that O plays the role of the additive identity element (denoted 0 in the axioms), and I is the multiplicative identity (denoted 1 above). One can check that all field axioms are satisfied. For example: A · (B + A) = A · I = A, which equals A · B + A · A = I + B = A, as required by the distributivity. The above field is called a finite field with four elements, and can be denoted F4. Field theory is concerned with understanding the reasons for the existence of this field, defined in a fairly ad hoc manner, and describing its inner
structure. For example, from a glance at the multiplication table, it can be seen that any non-zero element (i.e., I, A, and B) is a power of A: A = A¹, B = A² = A · A, and finally I = A³ = A · A · A. This is not a coincidence, but rather one of the starting points of a deeper understanding of (finite) fields.
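The four-element field can also be produced by the quotient construction described later in the article, as F2[x]/(x² + x + 1). The following Python sketch (with element names chosen to match the tables above, and the coefficient-pair representation an implementation choice made here) reproduces the multiplication facts used in the text:

```python
# F_4 built as F_2[x]/(x^2 + x + 1); elements (a, b) stand for a + b*x.

def add(u, v):
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)

def mul(u, v):
    a, b = u
    c, d = v
    # (a + bx)(c + dx) = ac + (ad + bc)x + bd*x^2, and x^2 = x + 1 mod (x^2 + x + 1)
    const = (a * c + b * d) % 2
    lin = (a * d + b * c + b * d) % 2
    return (const, lin)

O, I, A, B = (0, 0), (1, 0), (0, 1), (1, 1)

assert mul(A, A) == B and mul(A, B) == I and mul(B, B) == A   # matches the table
assert add(B, A) == I and mul(A, add(B, A)) == A              # the distributivity example
assert mul(A, mul(A, A)) == I                                 # A^3 = I, so A generates F_4 \ {O}
print("F_4 reconstructed from F_2[x]/(x^2 + x + 1)")
```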
Alternative axiomatizations
As with other algebraic structures, there exist alternative axiomatizations. Because of the relations between the operations, one can alternatively axiomatize a field by explicitly assuming that there are four binary operations (add, subtract, multiply, divide) with axioms relating these, or in terms of two binary operations (add, multiply) and two unary operations (additive inverse, multiplicative inverse), or other variants. The usual axiomatization in terms of the two operations of addition and multiplication is brief and allows the other operations to be defined in terms of these basic ones, but in other contexts, such as topology and category theory, it is important to include all operations as explicitly given, rather than implicitly defined (compare topological group). This is because without further assumptions, the implicitly defined inverses may not be continuous (in topology), or may not be able to be defined (in category theory): defining an inverse requires that one be working with a set, not a more general object. For a very economical axiomatization of the field of real numbers, whose primitives are merely a set R with a distinguished element 1, addition, and a binary relation <, see Tarski's axiomatization of the reals.
                    Abelian (additive)   Multiplicative structure   Commutativity        Multiplicative
                    group structure      and distributivity         of multiplication    inverses
Ring                Yes                  Yes                        No                   No
Commutative ring    Yes                  Yes                        Yes                  No
Skew field          Yes                  Yes                        No                   Yes
Field               Yes                  Yes                        Yes                  Yes
The axioms imposed above resemble the ones familiar from other algebraic structures. For example, the existence of the binary operation "·", together with its commutativity, associativity, (multiplicative) identity element and inverses are precisely the axioms for an abelian group. In other words, for any field, the subset of nonzero elements F \ {0}, also often denoted F×, is an abelian group (F×, ·), usually called the multiplicative group of the field. Likewise (F, +) is an abelian group. The structure of a field is hence the same as specifying such two group structures (on the same set), obeying the distributivity. Important other algebraic structures such as rings arise when requiring only part of the above axioms. For example, if the requirement of commutativity of the multiplication operation is dropped, one gets structures usually called division rings or skew fields.
Remarks
By elementary group theory, applied to the abelian groups (F \ {0}, ·) and (F, +), the additive inverse −a and the multiplicative inverse a⁻¹ are uniquely determined by a. Similar direct consequences of the field axioms include −(a · b) = (−a) · b = a · (−b), in particular −a = (−1) · a, as well as a · 0 = 0. Both can be shown by replacing b or c with 0 in the distributive property.
History
The concept of field was used implicitly by Niels Henrik Abel and Évariste Galois in their work on the solvability of polynomial equations with rational coefficients of degree five or higher. In 1857, Karl von Staudt published his Algebra of Throws which provided a geometric model satisfying the axioms of a field. This construction has been frequently recalled as a contribution to the foundations of mathematics. In 1871, Richard Dedekind introduced, for a set of real or complex numbers which is closed under the four arithmetic operations, the German word Körper, which means "body" or "corpus" (to suggest an organically closed entity), hence the common use of the letter K to denote a field. He also defined rings (then called order or order-modul), but the term "a ring" (Zahlring) was invented by Hilbert.[3] In 1893, Eliakim Hastings Moore called the concept "field" in English.[4] In 1881, Leopold Kronecker defined what he called a "domain of rationality", which is indeed a field of polynomials in modern terms. In 1893, Heinrich M. Weber gave the first clear definition of an abstract field.[5] In 1910, Ernst Steinitz published the very influential paper Algebraische Theorie der Körper (English: Algebraic Theory of Fields).[6] In this paper he axiomatically studies the properties of fields and defines many important field theoretic concepts like prime field, perfect field and the transcendence degree of a field extension. Emil Artin developed the relationship between groups and fields in great detail from 1928 through 1942.
Examples
Rationals and algebraic numbers
The field of rational numbers Q has been introduced above. A related class of fields very important in number theory are algebraic number fields. We will first give an example, namely the field Q(ζ) consisting of numbers of the form a + bζ with a, b ∈ Q, where ζ is a primitive third root of unity, i.e., a complex number satisfying ζ³ = 1, ζ ≠ 1. This field extension can be used to prove a special case of Fermat's last theorem, which asserts the non-existence of rational nonzero solutions to the equation x³ + y³ = z³. In the language of field extensions detailed below, Q(ζ) is a field extension of degree 2. Algebraic number fields are by definition finite field extensions of Q, that is, fields containing Q having finite dimension as a Q-vector space.
Constructible numbers
In antiquity, several geometric problems concerned the (in)feasibility of constructing certain numbers with compass and straightedge. For example it was unknown to the Greeks that it is in general impossible to trisect a given angle. Using the field notion and field theory allows these problems to be settled. To do so, the field of constructible numbers is considered. It contains, on the plane, the points 0 and 1, and all complex numbers that can be constructed from these two by a finite number of construction steps using only compass and straightedge. This set, endowed with the usual addition and multiplication of complex numbers, does form a field. For example, multiplying two (real) numbers r1 and r2 that have already been constructed can be done using a construction based on the intercept theorem (figure: given 0, 1, r1 and r2, the construction yields r1 · r2). This way, the obtained field F contains all rational numbers, but is bigger than Q, because for any f ∈ F, the square root of f is also a constructible number.
Finite fields
Finite fields (also called Galois fields) are fields with finitely many elements. The above introductory example F4 is a field with four elements. Contained in the multiplication and addition tables above is the field F2, consisting of the two elements O and I (that is, 0 and 1). This is the smallest field, because by definition a field has at least two distinct elements 1 ≠ 0. Interpreting the addition and multiplication in this latter field as XOR and AND operations, this field finds applications in computer science, especially in cryptography and coding theory. In a finite field there is necessarily an integer n such that 1 + 1 + ⋯ + 1 (n repeated terms) equals 0. It can be shown that the smallest such n must be a prime number, called the characteristic of the field. If a (necessarily infinite) field has the property that 1 + 1 + ⋯ + 1 is never zero, for any number of summands, such as in Q, for example, the characteristic is said to be zero. A basic class of finite fields are the fields Fp with p elements (p a prime number): Fp = Z/pZ = {0, 1, ..., p − 1}, where the operations are defined by performing the operation in the set of integers Z, dividing by p and taking the remainder; see modular arithmetic. A field K of characteristic p necessarily contains Fp,[7] and therefore may be
viewed as a vector space over Fp, of finite dimension if K is finite. Thus a finite field K has prime power order, i.e., K has q = p^n elements (where n > 0 is the number of elements in a basis of K over Fp). By developing more field theory, in particular the notion of the splitting field of a polynomial f over a field K, which is the smallest field containing K and all roots of f, one can show that two finite fields with the same number of elements are isomorphic, i.e., there is a one-to-one mapping of one field onto the other that preserves multiplication and addition. Thus we may speak of the finite field with q elements, usually denoted by Fq or GF(q).
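A brief sketch of arithmetic in a prime field (p = 13 is an arbitrary choice made here): operations are performed modulo p, every nonzero element has a multiplicative inverse, and adding 1 to itself p times gives 0, illustrating the characteristic.

```python
# Arithmetic in F_p = Z/pZ for p = 13.

p = 13
for a in range(1, p):
    inv = pow(a, p - 2, p)            # Fermat: a^(p-1) = 1, so a^(p-2) is the inverse of a
    assert (a * inv) % p == 1

# The characteristic: adding 1 to itself p times gives 0.
assert sum(1 for _ in range(p)) % p == 0
print("F_13: every nonzero element is invertible; 1 added 13 times is 0")
```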
Field of functions
Given a geometric object X, one can consider functions on such objects. Adding and multiplying them pointwise, i.e., (f · g)(x) = f(x) · g(x), this leads to a field. However, due to the presence of possible zeros, i.e., points x ∈ X where f(x) = 0, one has to take poles into account, i.e., formally allowing f(x) = ∞. If X is an algebraic variety over F, then the rational functions X → F, i.e., functions defined almost everywhere, form a field, the function field of X. Likewise, if S is a Riemann surface, then the meromorphic functions S → C form a field. Under certain circumstances, namely when S is compact, S can be reconstructed from this field.
Constructing fields
Closure operations
Assuming the axiom of choice, for every field F there exists a field F̄, called the algebraic closure of F, which contains F, is algebraic over F, which means that any element x of F̄ satisfies a polynomial equation
f_n x^n + f_{n−1} x^{n−1} + ⋯ + f_1 x + f_0 = 0, with coefficients f_n, ..., f_0 ∈ F,
and is algebraically closed, i.e., any such polynomial does have at least one solution in F̄. The algebraic closure is unique up to isomorphism inducing the identity on F. However, in many circumstances in mathematics, it is not appropriate to treat F̄ as being uniquely determined by F, since the isomorphism above is not itself unique. In these cases, one refers to such an F̄ as an algebraic closure of F. A similar concept is the separable closure, containing all roots of separable polynomials, instead of all polynomials. For example, if F = Q, the algebraic closure Q̄ is also called the field of algebraic numbers. The field of algebraic numbers is an example of an algebraically closed field of characteristic zero; as such it satisfies the same first-order sentences as the field of complex numbers C.
In general, all algebraic closures of a field are isomorphic. However, there is in general no preferred isomorphism between two closures. Likewise for separable closures.
The field E(X) of rational fractions p(X)/q(X), where p(X) and q(X) are polynomials with coefficients in E and q is not the zero polynomial, forms a field. This is the simplest example of a transcendental extension of E. It also is an example of a domain (the ring of polynomials E[X] in this case) being embedded into its field of fractions E(X). The ring of formal power series E[[X]] is also a domain, and again the (equivalence classes of) fractions of the form p(X)/q(X), where p and q are elements of E[[X]], form its field of fractions. This field is actually the ring of Laurent series over the field E, denoted E((X)).
In the above two cases, the added symbol X and its powers did not interact with elements of E. It is possible however that the adjoined symbol may interact with E. This idea will be illustrated by adjoining an element to the field of real numbers R. As explained above, C is an extension of R. C can be obtained from R by adjoining the imaginary symbol i which satisfies i² = −1. The result is that R[i] = C. This is different from adjoining the symbol X to R, because in that case, the powers of X are all distinct objects, but here, i² = −1 is actually an element of R. Another way to view this last example is to note that i is a zero of the polynomial p(X) = X² + 1. The quotient ring R[X]/(X² + 1) can be mapped onto C using the map sending the class of X to i. Since the ideal (X² + 1) is generated by a polynomial irreducible over R, the ideal is maximal, hence the quotient ring is a field. This nonzero ring map from the quotient to C is necessarily an isomorphism of rings. The above construction generalises to any irreducible polynomial in the polynomial ring E[X], i.e., a polynomial p(X) that cannot be written as a product of non-constant polynomials. The quotient ring F = E[X] / (p(X)) is again a field. Alternatively, constructing such field extensions can also be done if a bigger container is already given. Suppose given a field E, and a field G containing E as a subfield, for example G could be the algebraic closure of E. Let x be an element of G not in E. Then there is a smallest subfield of G containing E and x, denoted F = E(x) and called the field extension F / E generated by x in G.[8] Such extensions are also called simple extensions. Many extensions are of this type; see the primitive element theorem. For instance, Q(i) is the subfield of C consisting of all numbers of the form a + bi where both a and b are rational numbers. One distinguishes between extensions having various qualities. For example, an extension K of a field k is called algebraic if every element of K is a root of some polynomial with coefficients in k. Otherwise, the extension is called transcendental. The aim of Galois theory is the study of algebraic extensions of a field.
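A minimal computational sketch of this quotient construction (the representation of elements as coefficient pairs is an implementation choice made here, not notation from the article): multiplying in R[X]/(X² + 1) by reducing with X² = −1 reproduces ordinary complex multiplication.

```python
# R[X]/(X^2 + 1): elements are pairs (a, b) representing a + b*X, with X^2 = -1.

def mul(u, v):
    a, b = u
    c, d = v
    # (a + bX)(c + dX) = ac + (ad + bc)X + bd*X^2  ->  (ac - bd) + (ad + bc)X
    return (a * c - b * d, a * d + b * c)

u, v = (1.0, 2.0), (3.0, -1.5)
prod = mul(u, v)
expected = complex(*u) * complex(*v)
assert prod == (expected.real, expected.imag)
print(prod)   # (6.0, 4.5)
```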
Rings vs fields
Adding multiplicative inverses to an integral domain R yields the field of fractions of R. For example, the field of fractions of the integers Z is just Q. Also, the field F(X) is the quotient field of the ring of polynomials F[X]. "Getting back" the ring from the field is sometimes possible; see discrete valuation ring. Another method to obtain a field from a commutative ring R is taking the quotient R / m, where m is any maximal ideal of R. The above construction of F = E[X] / (p(X)) is an example, because the irreducibility of the polynomial p(X) is equivalent to the maximality of the ideal generated by this polynomial. Other examples are the finite fields Fp = Z / pZ.
Ultraproducts
If I is an index set, U is an ultrafilter on I, and Fi is a field for every i in I, the ultraproduct of the Fi with respect to U is a field. For example, a non-principal ultraproduct of finite fields is a pseudo finite field; i.e., a PAC field having exactly one extension of any degree. This construction is important to the study of the elementary theory of finite fields.
Galois theory
Galois theory aims to study the algebraic extensions of a field by studying the symmetry in the arithmetic operations of addition and multiplication. The fundamental theorem of Galois theory shows that there is a strong relation between the structure of the symmetry group and the set of algebraic extensions. In the case where F / E is a finite (Galois) extension, Galois theory studies the algebraic extensions of E that are subfields of F. Such fields are called intermediate extensions. Specifically, the Galois group of F over E, denoted Gal(F/E), is the group of field automorphisms of F that are trivial on E (i.e., the bijections σ : F → F that preserve addition and multiplication and that send elements of E to themselves), and the fundamental theorem of Galois theory states that there is a one-to-one correspondence between subgroups of Gal(F/E) and the set of intermediate extensions of the extension F/E. The theorem, in fact, gives an explicit correspondence and further properties. To study all (separable) algebraic extensions of E at once, one must consider the absolute Galois group of E, defined as the Galois group of the separable closure, Esep, of E over E (i.e., Gal(Esep/E)). It is possible that the degree of this extension is infinite (as in the case of E = Q). It is thus necessary to have a notion of Galois group for an infinite algebraic extension. The Galois group in this case is obtained as a "limit" (specifically an inverse limit) of the Galois groups of the finite Galois extensions of E. In this way, it acquires a topology.[9] The fundamental theorem of Galois theory can be generalized to the case of infinite Galois extensions by taking into consideration the topology of the Galois group, and in the case of Esep/E it states that there is a one-to-one correspondence between closed subgroups of Gal(Esep/E) and the set of all separable algebraic extensions of E (technically, one only obtains those separable algebraic extensions of E that occur as subfields of the chosen separable closure Esep, but since all separable closures of E are isomorphic, choosing a different separable closure would give the same Galois group and thus an "equivalent" set of algebraic extensions).
Generalizations
There are also proper classes with field structure, which are sometimes called Fields, with a capital F: The surreal numbers form a Field containing the reals, and would be a field except for the fact that they are a proper class, not a set. The set of all surreal numbers with birthday smaller than some inaccessible cardinal form a field. The nimbers form a Field. The set of nimbers with birthday smaller than 2^(2^n), and the set of nimbers with birthday smaller than any infinite cardinal, are examples of fields. In a different direction, differential fields are fields equipped with a derivation. For example, the field R(X), together with the standard derivative of polynomials, forms a differential field. These fields are central to differential Galois theory. Exponential fields, meanwhile, are fields equipped with an exponential function that provides a homomorphism between the additive and multiplicative groups within the field. The usual exponential function makes the real and complex numbers exponential fields, denoted Rexp and Cexp respectively. Generalizing in a more categorical direction yields the field with one element and related objects.
Exponentiation
One does not in general study generalizations of fields with three binary operations. The familiar addition/subtraction, multiplication/division, exponentiation/root-extraction operations from the natural numbers to the reals, each built up in terms of iteration of the last, mean that generalizing exponentiation as a binary operation is tempting, but this has generally not proven fruitful; instead, an exponential field assumes a unary exponential function from the additive group to the multiplicative group, not a partially defined binary function. Note that the exponential operation a^b is neither associative nor commutative, nor has a unique inverse (2 and −2 are both square roots of 4, for instance), unlike addition and multiplication, and further is not defined for many pairs; for example, raising a negative number to a fractional exponent does not define a single number. These all show that even for rational numbers exponentiation is not nearly as well-behaved as addition and multiplication, which is why one does not in general axiomatize exponentiation.
Applications
The concept of a field is of use, for example, in defining vectors and matrices, two structures in linear algebra whose components can be elements of an arbitrary field. Finite fields are used in number theory, Galois theory, coding theory and combinatorics; and again the notion of algebraic extension is an important tool. Fields of characteristic 2 are useful in computer science.
Notes
[1] That is, the axiom for addition only assumes a binary operation "+". The axiom of inverse allows one to define a unary operation "−" that sends an element to its negative (its additive inverse); this is not taken as given, but is implicitly defined in terms of addition as "−a is the unique b such that a + b = 0", "implicitly" because it is defined in terms of solving an equation, and one then defines the binary operation of subtraction, also denoted by "−", as a − b := a + (−b), in terms of addition and additive inverse.
In the same way, one defines the binary operation of division in terms of the assumed binary operation of multiplication and the implicitly defined operation of "reciprocal" (multiplicative inverse).
[2] Wallace, D. A. R. (1998) Groups, Rings, and Fields, SUMS. Springer-Verlag: 151, Th. 2.
[3] J. J. O'Connor and E. F. Robertson, The development of Ring Theory (http://www-history.mcs.st-andrews.ac.uk/HistTopics/Ring_theory.html), September 2004.
[4] Earliest Known Uses of Some of the Words of Mathematics (F) (http://jeff560.tripod.com/f.html)
[5] Fricke, Robert; Weber, Heinrich Martin (1924), Lehrbuch der Algebra (http://resolver.sub.uni-goettingen.de/purl?PPN234788267), Vieweg
[6] Steinitz, Ernst (1910), "Algebraische Theorie der Körper" (http://resolver.sub.uni-goettingen.de/purl?GDZPPN002167042), Journal für die reine und angewandte Mathematik 137: 167–309, ISSN 0075-4102
[7] Jacobson (2009), p. 213
[8] Jacobson (2009), p. 213
[9] As an inverse limit of finite discrete groups, it is equipped with the profinite topology, making it a profinite topological group.
References
Artin, Michael (1991), Algebra, Prentice Hall, ISBN 978-0130047632, especially Chapter 13
Allenby, R.B.J.T. (1991), Rings, Fields and Groups, Butterworth-Heinemann, ISBN 978-0-340-54440-2
Blyth, T.S.; Robertson, E. F. (1985), Groups, rings and fields: Algebra through practice, Cambridge University Press. See especially Book 3 (ISBN 0-521-27288-2) and Book 6 (ISBN 0-521-27291-2).
Jacobson, Nathan (2009), Basic algebra, 1 (2nd ed.), Dover, ISBN 978-0-486-47189-1
James Ax (1968), The elementary theory of finite fields, Ann. of Math. (2), 88, 239–271
External links
Field Theory Q&A (http://www.compsoc.nuigalway.ie/~pappasmurf/fields/index.php)
Fields at ProvenMath (http://www.apronus.com/provenmath/fields.htm) definition and basic properties.
Field (http://planetmath.org/?op=getobj&from=objects&id=355) on PlanetMath
Perfect field
In algebra, a field k is said to be perfect if any one of the following equivalent conditions holds:
Every irreducible polynomial over k has distinct roots.
Every polynomial over k is separable.
Every finite extension of k is separable. (This implies that every algebraic extension of k is separable.)
Either k has characteristic 0, or, when k has characteristic p > 0, every element of k is a pth power.
Every element of k is a qth power. (Here, q is the characteristic exponent, equal to 1 if k has characteristic 0, and equal to p if k has characteristic p > 0.)
The separable closure of k is algebraically closed.
Every k-algebra A is a separable algebra; i.e., A ⊗_k F is reduced for every field extension F/k.
Otherwise, k is called imperfect. In particular, every field of characteristic zero and every finite field is perfect. More generally, a ring of characteristic p (p a prime) is called perfect if the Frobenius endomorphism x ↦ x^p is an automorphism.[1]
Examples
Examples of perfect fields are:
a field of characteristic zero,
finite fields,
algebraically closed fields,
the union of perfect fields,
fields algebraic over a perfect field.
(In particular, an imperfect field is necessarily transcendental over its prime subfield, which is perfect.) On the other hand, if k has positive characteristic p, then the field of rational functions k(x), with x an indeterminate, is not perfect: the element x is not a pth power in k(x). In fact, most fields that appear in practice are perfect. The imperfect case arises mainly in algebraic geometry.
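For the simplest case, the prime fields, the "every element is a pth power" characterization can be verified directly (an illustrative check; the primes sampled are arbitrary): the Frobenius map x ↦ x^p is onto Fp.

```python
# F_p is perfect: the Frobenius map x -> x^p is surjective (here, checked by enumeration).

for p in (2, 3, 5, 7, 11, 13):
    frobenius_image = {pow(x, p, p) for x in range(p)}
    assert frobenius_image == set(range(p))
print("Frobenius is surjective on F_p for the sampled primes")
```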
inseparable subextension. If is a normal finite extension, then .[2]
In terms of universal properties, the perfect closure of a ring A of characteristic p is a perfect ring A_p of characteristic p together with a ring homomorphism u : A → A_p such that for any other perfect ring B of characteristic p with a homomorphism v : A → B there is a unique homomorphism f : A_p → B such that v factors through u (i.e. v = fu). The perfect closure always exists.[3]
The perfection of a ring A of characteristic p is the dual notion (though this term is sometimes used for the perfect closure). In other words, the perfection R(A) of A is a perfect ring of characteristic p together with a map θ : R(A) → A such that for any perfect ring B of characteristic p equipped with a map φ : B → A, there is a unique map f : B → R(A) such that φ factors through θ (i.e. φ = θf). The perfection of A may be constructed as follows. Consider the projective system
⋯ → A → A → A
where the transition maps are the Frobenius endomorphism. The inverse limit of this system is R(A); it consists of sequences (x_0, x_1, ...) of elements of A such that x_{i+1}^p = x_i for all i. The map θ : R(A) → A sends (x_i) to x_0.[4]
Notes
[1] Serre 1979, Section II.4
[2] Cohn, Theorem 11.4.10
[3] Bourbaki 2003, Section V.5.1.4, page 111
[4] Brinon & Conrad 2009, section 4.2
References
Bourbaki, Nicolas (2003), Algebra II, Springer, ISBN 978-3-540-00706-7
Brinon, Olivier; Conrad, Brian (2009), CMI Summer School notes on p-adic Hodge theory (http://math.stanford.edu/~conrad/papers/notes.pdf), retrieved 2010-02-05
Serre, Jean-Pierre (1979), Local fields, Graduate Texts in Mathematics, 67 (2nd ed.), Springer-Verlag, ISBN 978-0-387-90424-5, MR554237
Cohn, P.M. (2003), Basic Algebra: Groups, Rings and Fields
Finite field
In abstract algebra, a finite field or Galois field (so named in honor of Évariste Galois) is a field that contains a finite number of elements. Finite fields are important in number theory, algebraic geometry, Galois theory, cryptography, and coding theory. The finite fields are classified by size; there is exactly one finite field up to isomorphism of size p^k for each prime p and positive integer k. Each finite field of size q is the splitting field of the polynomial x^q − x, and thus the fixed field of the Frobenius endomorphism which takes x to x^q. Similarly, the multiplicative group of the field is a cyclic group. Wedderburn's little theorem states that the Brauer group of a finite field is trivial, so that every finite division ring is a finite field. Finite fields have applications in many areas of mathematics and computer science, including coding theory, LFSRs, modular representation theory, and the groups of Lie type. Finite fields are an active area of research, including recent results on the Kakeya conjecture and open problems on the size of the smallest primitive root. Finite fields appear in the following chain of class inclusions:
Commutative rings ⊃ integral domains ⊃ integrally closed domains ⊃ unique factorization domains ⊃ principal ideal domains ⊃ Euclidean domains ⊃ fields ⊃ finite fields.
Classification
The finite fields are classified as follows (Jacobson 2009, §4.13, p. 287): The order, or number of elements, of a finite field is of the form p^n, where p is a prime number called the characteristic of the field, and n is a positive integer. For every prime number p and positive integer n, there exists a finite field with p^n elements. Any two finite fields with the same number of elements are isomorphic. That is, under some renaming of the elements of one of these, both its addition and multiplication tables become identical to the corresponding tables of the other one. This classification justifies using a naming scheme for finite fields that specifies only the order of the field. One notation for a finite field is F_{p^n}. Another notation is GF(p^n), where the letters "GF" stand for "Galois field".
Examples
First we consider fields where the size is prime, i.e., n = 1. Such a field is also called a prime field. An example of such a finite field is the ring Z/pZ. It is a finite field with p elements, usually labelled 0, 1, 2, ..., p − 1, where arithmetic is performed modulo p. It is also sometimes denoted Zp, but within some areas of mathematics, particularly number theory, this may cause confusion because the same notation Zp is used for the ring of p-adic integers. Next we consider fields where the size is not prime, but is a prime power, i.e., n > 1. Two isomorphic constructions of the field with 4 elements are (Z/2Z)[T]/(T² + T + 1) and Z[ω]/(2Z[ω]), where ω is a primitive cube root of unity. A field with 8
elements is (Z/2Z)[T]/(T³ + T + 1). Two isomorphic constructions of the field with 9 elements are (Z/3Z)[T]/(T² + 1) and Z[i]/(3Z[i]). Even though all fields of size p are isomorphic to Z/pZ, for n ≥ 2 the ring Z/p^nZ (the ring of integers modulo p^n) is not a field. The element p (mod p^n) is nonzero and has no multiplicative inverse. By comparison with the ring Z/4Z of size 4, the underlying additive group of the field (Z/2Z)[T]/(T² + T + 1) of size 4 is not cyclic but rather is isomorphic to the Klein four-group, (Z/2Z)². A prime power field with p = 2 is also called a binary field.
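A hedged sketch of the 9-element construction (representing a + bT by the coefficient pair (a, b), an implementation choice made here): in (Z/3Z)[T]/(T² + 1) every nonzero element has a multiplicative inverse, whereas in Z/9Z the element 3 has none, illustrating why Z/p^nZ fails to be a field for n ≥ 2.

```python
# GF(9) as (Z/3Z)[T]/(T^2 + 1), versus the non-field Z/9Z.

from itertools import product

p = 3
def mul(u, v):
    a, b = u
    c, d = v
    # reduce with T^2 = -1
    return ((a * c - b * d) % p, (a * d + b * c) % p)

elements = list(product(range(p), repeat=2))            # 9 elements a + bT
one = (1, 0)
for u in elements:
    if u != (0, 0):
        assert any(mul(u, v) == one for v in elements)  # every nonzero u is invertible

# By contrast, Z/9Z is not a field: 3 * k is never 1 modulo 9.
assert all((3 * k) % 9 != 1 for k in range(9))
print("GF(9) built as (Z/3Z)[T]/(T^2 + 1); Z/9Z is not a field")
```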
Finally, we consider fields where the size is not a prime power. As it turns out, none exist. For example, there is no field with 6 elements, because 6 is not a prime power. Each and every pair of operations on a set of 6 elements fails to satisfy the mathematical definition of a field.
Proof outline
The characteristic of a finite field is a prime p (since a field has no zero divisors), and the field is a vector space of some finite dimension, say n, over Z/pZ, hence the field has p^n elements. A field of order p exists, because Fp = Z/pZ is a field, where primality is required for the nonzero elements to have multiplicative inverses. For any prime power q = p^n, Fq is the splitting field of the polynomial f(T) = T^q - T over Fp. This field exists and is unique up to isomorphism by the construction of splitting fields. The set of roots is a field, the fixed field of the nth iterate of the Frobenius endomorphism, so the splitting field consists exactly of the q roots of this polynomial, which are distinct because the polynomial T^q - T is separable over Fp: its derivative is -1, which has no roots.
Now the first proof, using linear algebra, is a lot shorter and is the standard argument found in (nearly) all textbooks that treat finite fields. The second proof is interesting because it gets the same result by working much more heavily with the additive structure of a finite field. Of course we had to use the multiplicative structure somewhere (after all, not all finite rings have prime-power order), and it was used right at the start: multiplication by b/a on F sends a to b. The second proof is actually the one which was used in E. H. Moore's 1903 paper which (for the first time) classified all finite fields.
Existence
The proof of the second statement, concerning the existence of a finite field of size q = p^n for any prime p and positive integer n, is more involved. We again give two arguments. The case n = 1 is easy: take Fp = Z/pZ. For general n, inside Fp[T] consider the polynomial f(T) = T^q - T. It is possible to construct a field F (called the splitting field of f over Fp), which contains Fp and which is large enough for f(T) to split completely into linear factors: f(T) = (T - r1)(T - r2)···(T - rq) in F[T]. The existence of splitting fields in general is discussed in construction of splitting fields. These q roots are distinct, because T^q - T is a polynomial of degree q which has no repeated roots in F: its derivative is qT^(q-1) - 1, which is -1 (because q = 0 in F), and therefore the derivative has no roots in common with f(T). Furthermore, setting R to be the set of these roots,
R = { r1, ..., rq } = { roots of the equation T^q = T },
one sees that R itself forms a field, as follows. Both 0 and 1 are in R, because 0^q = 0 and 1^q = 1. If r and s are in R, then (r+s)^q = r^q + s^q = r + s, so that r+s is in R. The first equality above follows from the binomial theorem and the fact that F has characteristic p. Therefore R is closed under addition. Similarly, R is closed under multiplication and taking inverses, because (rs)^q = r^q s^q = rs and (r^(-1))^q = (r^q)^(-1) = r^(-1). Therefore R is a field with q elements, proving the second statement.
For the second proof that a field of size q = p^n exists, we just sketch the ideas. We will give a combinatorial argument that a monic irreducible f(T) of degree n exists in Fp[T]. Then the quotient ring Fp[T] / (f(T)) is a field of size q. Because T^q - T has no repeated irreducible factors (it is a separable polynomial in Fp[T]), it is a product of distinct monic irreducibles. We ask: which monic irreducibles occur in the factorization? Using some group theory, one can show that a monic irreducible in Fp[T] is a factor precisely when its degree divides n. Writing Np(d) for the number of monic irreducibles of degree d in Fp[T], computing the degree of the irreducible factorization of T^q - T shows that q = p^n is the sum of d·Np(d) over all d dividing n. This holds for all n, so by Möbius inversion one can get a formula for Np(n) for all n, and a simple lower bound estimate using this formula shows that Np(n) is positive. Thus a (monic) irreducible of degree n in Fp[T] exists, for any n.
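The divisibility fact used in this counting argument can be checked by brute force for small cases. The following Python sketch (restricted to p = 2, with polynomials over Z/2Z encoded as integers; all helper functions are ad hoc for this illustration) verifies that a monic irreducible of degree d divides T^(2^n) - T exactly when d divides n.

# Polynomials over Z/2Z are encoded as Python ints (bit i = coefficient of T^i).

def pmod(a, b):
    # remainder of carry-less division of a by b over Z/2Z
    db = b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        a ^= b << (a.bit_length() - 1 - db)
    return a

def pmulmod(a, b, m):
    # product of a and b reduced modulo m, all over Z/2Z
    r, i = 0, 0
    while b >> i:
        if (b >> i) & 1:
            r ^= a << i
        i += 1
    return pmod(r, m)

def irreducible(f):
    d = f.bit_length() - 1
    return d >= 1 and all(pmod(f, g) != 0
                          for g in range(2, f)
                          if 1 <= g.bit_length() - 1 <= d // 2)

def divides_T_q_minus_T(f, n):
    # T^(2^n) mod f is obtained by squaring T modulo f, n times
    x = pmod(0b10, f)
    for _ in range(n):
        x = pmulmod(x, x, f)
    return x == pmod(0b10, f)

n = 4
for f in range(2, 1 << (n + 2)):          # all polynomials of degree 1..5
    d = f.bit_length() - 1
    if irreducible(f):
        assert divides_T_q_minus_T(f, n) == (n % d == 0), f
print("degree-d irreducibles divide T^(2^%d) - T exactly when d divides %d" % (n, n))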
Uniqueness
Finally the uniqueness statement: a field of size q = p^n is the splitting field of T^q - T over its subfield of size p, and for any field K, two splitting fields of a polynomial in K[T] are unique up to isomorphism over K. That is, the two splitting fields are isomorphic by an isomorphism extending the identification of the copies of K inside the two splitting fields. Since a field of size p can be embedded in a field of characteristic p in only one way (the multiplicative identity 1 in the field is unique, then 2 = 1 + 1, and so on up to p - 1), the condition of two fields of size q being isomorphic over their subfields of size p is the same as just being isomorphic fields. Warning: it is not the case that two finite fields of the same size are isomorphic in a unique way, unless the fields have size p. Two fields of size p^n are isomorphic to each other in n ways (because a field of size p^n is isomorphic to itself in n ways, from Galois theory for finite fields).
Examples
The polynomial f(T) = T^2 + T + 1 is irreducible over Z/2Z, and (Z/2Z)[T] / (T^2+T+1) has size 4. Its elements can be written as the set {0, 1, t, t+1}, where the multiplication is carried out by using the relation t^2 + t + 1 = 0. In fact, since we are working over Z/2Z (that is, in characteristic 2), we may write this as t^2 = t + 1. (This follows because -1 = 1 in Z/2Z.) Then, for example, to determine t^3, we calculate: t^3 = t·(t^2) = t·(t+1) = t^2 + t = (t+1) + t = 2t + 1 = 1, so t^3 = 1. In order to find the multiplicative inverse of t in this field, we have to find a polynomial p(T) such that T·p(T) = 1 modulo T^2 + T + 1. The polynomial p(T) = T + 1 works, and hence 1/t = t + 1. To construct a field of size 27, we could start for example with the irreducible polynomial T^3 + T^2 + T + 2 over Z/3Z. The field (Z/3Z)[T]/(T^3 + T^2 + T + 2) has size 27. Its elements have the form at^2 + bt + c where a, b, and c lie in Z/3Z and the multiplication is defined by t^3 + t^2 + t + 2 = 0, or by rearranging this equation, t^3 = 2t^2 + 2t + 1.
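The calculations above are easy to replay by machine. The Python sketch below (the coefficient-list representation and the helper polymulmod are illustrative choices) recomputes t^2, t^3 and 1/t in the field with 4 elements, and the reduction rule t^3 = 2t^2 + 2t + 1 in the field with 27 elements.

def polymulmod(a, b, m, p):
    # a, b, m are coefficient lists (lowest degree first) over Z/pZ; m is monic
    prod = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    d = len(m) - 1
    while len(prod) > d:                       # reduce modulo m
        lead = prod.pop()
        for k in range(d):
            prod[-d + k] = (prod[-d + k] - lead * m[k]) % p
    return prod + [0] * (d - len(prod))

# field with 4 elements: t^2 = t + 1, t^3 = 1, and 1/t = t + 1
m4, t = [1, 1, 1], [0, 1]
t2 = polymulmod(t, t, m4, 2)
t3 = polymulmod(t2, t, m4, 2)
print(t2, t3)                                  # [1, 1] (= t + 1) and [1, 0] (= 1)
print(polymulmod(t, [1, 1], m4, 2))            # [1, 0]: t * (t + 1) = 1

# field with 27 elements: modulo T^3 + T^2 + T + 2 over Z/3Z, t^3 = 2t^2 + 2t + 1
m27, s = [2, 1, 1, 1], [0, 1, 0]
s3 = polymulmod(polymulmod(s, s, m27, 3), s, m27, 3)
print(s3)                                      # [1, 2, 2], i.e. 1 + 2t + 2t^2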
Algebraic closure
Finite fields are not algebraically closed: for instance the polynomial f(T) = T^q - T + 1, where q is the number of elements of F, has no roots over F, as f(α) = 1 for all α in F. However, for each prime p there is an algebraic closure of any finite field of characteristic p, as described below.
Containment
The field Fp^n contains a copy of Fp^m if and only if m divides n. "Only if" is because the larger field is a vector space over the smaller field, of some finite dimension, say d, so it must have size (p^m)^d = p^(md), and hence m divides n. "If" is because there exist irreducible polynomials of every degree over Fp^m. The direct limit of this system is a field, and is an algebraic closure of Fp (or indeed of Fp^n for any n). This field is infinite, as it is algebraically closed, or more simply because it contains a subfield of size p^n for all n. The inclusions commute with the Frobenius map, as it is defined the same way on each field (it is still just the function raising to the pth power), so the Frobenius map defines an automorphism of the algebraic closure, which carries all subfields back to themselves. Unlike in the case of finite fields, the Frobenius automorphism on the algebraic closure of Fp has infinite order (no iterate of it is the identity function on the whole field), and it does not generate the full group of automorphisms of this field. That is, there are automorphisms of the algebraic closure which are not iterates of the pth power map. However, the iterates of the pth power map do form a dense subgroup of the automorphism group in the Krull topology. Algebraically, this corresponds to the additive group Z being dense in the profinite integers (the direct product of the p-adic integers over all primes p, with the product topology). The field Fp^n can be recovered as the fixed points of the nth iterate of the Frobenius map. If we actually construct our finite fields in such a fashion that Fp^n is contained in Fp^m whenever n divides m, then this direct limit can be constructed as the union of all these fields. Even if we do not construct our fields this way, we can still speak of the algebraic closure, but some more delicacy is required in its construction.
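The containment criterion can be illustrated numerically. In the Python sketch below (the defining polynomials T^3 + T + 1 and T^4 + T + 1 are fixed choices for the fields with 8 and 16 elements; the helpers are ad hoc), the number of solutions of x^(2^m) = x inside GF(2^n) is counted and found to be 2^gcd(m, n), so a copy of GF(2^m) sits inside GF(2^n) exactly when m divides n.

from math import gcd

def gf_mul(a, b, modpoly, n):
    # multiply in GF(2^n) = (Z/2Z)[T]/(modpoly), elements encoded as n-bit ints
    r = 0
    for i in range(n):
        if (b >> i) & 1:
            r ^= a << i
    for i in range(2 * n - 2, n - 1, -1):      # reduce degrees n..2n-2
        if (r >> i) & 1:
            r ^= modpoly << (i - n)
    return r

def frobenius_fixed_points(m, n, modpoly):
    count = 0
    for x in range(1 << n):
        y = x
        for _ in range(m):
            y = gf_mul(y, y, modpoly, n)        # y -> y^2, m times, gives y^(2^m)
        if y == x:
            count += 1
    return count

for n, modpoly in [(3, 0b1011), (4, 0b10011)]:  # T^3+T+1 and T^4+T+1
    for m in range(1, n + 1):
        assert frobenius_fixed_points(m, n, modpoly) == 2 ** gcd(m, n)
print("fixed points of the m-th Frobenius iterate in GF(2^n) number 2^gcd(m, n)")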
Irreducibility of polynomials
If F is a finite field, a polynomial f(X) with coefficients in F is said to be irreducible over F if and only if f(X) is irreducible as an element of the polynomial ring over F (that is, in F[X]). Note that since the polynomial ring F[X] is a unique factorization domain, a polynomial f(X) is irreducible if and only if it is prime as an element of F[X]. There are several fundamental questions one can ask about irreducible polynomials over a given finite field. Firstly, is it possible to give an explicit formula, in the variables q and n, that yields the number of irreducible polynomials over Fq of degree n? Note that since there are only finitely many polynomials of a given degree n over the finite field Fq, there can be only finitely many such irreducible polynomials. However, while little theory is required to compute the number of polynomials of degree n over Fq (there are precisely q^n(q - 1) such polynomials), it is not immediately obvious how to compute the number of irreducible polynomials of degree n over Fq. Secondly, is it possible to describe an algorithm that may be used to decide whether a given polynomial over Fq is irreducible? In fact, there exist two such (known) algorithms: the Berlekamp algorithm and the Cantor–Zassenhaus algorithm. Furthermore, these algorithms do much more than merely decide whether a given polynomial is irreducible; they may also be implemented to explicitly compute the irreducible factors of f.
Number of monic irreducible polynomials of a given degree over a finite field
If Fq denotes the finite field of order q, then the number N of monic irreducible polynomials of degree n over Fq is given by:[1]
N = (1/n) Σ_{d | n} μ(n/d) q^d,
where μ is the Möbius function. By the above formula, the number of irreducible (not necessarily monic) polynomials of degree n over Fq is (q - 1)N. A slightly simpler lower bound on N also exists.
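For small q and n the formula can be checked against a direct count. The Python sketch below (a hand-rolled Möbius function and a brute-force sieve over coefficient tuples; nothing here is a standard library call) compares the two.

from itertools import product as cartesian

def mobius(n):
    result, k = 1, 2
    while k * k <= n:
        if n % k == 0:
            n //= k
            if n % k == 0:
                return 0          # square factor
            result = -result
        k += 1
    return -result if n > 1 else result

def formula_count(q, n):
    # N = (1/n) * sum over d | n of mobius(n/d) * q^d
    return sum(mobius(n // d) * q ** d for d in range(1, n + 1) if n % d == 0) // n

def polymul(a, b, q):
    prod = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % q
    return tuple(prod)

def monic_polys(q, d):
    # coefficient tuples (constant term first) of monic polynomials of degree d
    for coeffs in cartesian(range(q), repeat=d):
        yield tuple(coeffs) + (1,)

def brute_count(q, n):
    reducible = set()
    for a in range(1, n):
        for f in monic_polys(q, a):
            for g in monic_polys(q, n - a):
                reducible.add(polymul(f, g, q))
    return q ** n - len(reducible)            # monic, degree n, not a proper product

for q in (2, 3):
    for n in range(1, 5):
        assert brute_count(q, n) == formula_count(q, n)
print("formula and brute-force counts agree for q in {2, 3}, n up to 4")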
Multiplicative structure
Cyclic
The multiplicative group of every finite field is cyclic, a special case of a theorem mentioned in the article about fields (see Field (mathematics)#Some first theorems). A generator for the multiplicative group is a primitive element. This means that if F is a finite field with q elements, then there exists an element x in F such that F = { 0, 1, x, x^2, ..., x^(q-2) }. The primitive element x is not unique (unless q = 2 or 3): the set of generators has size φ(q - 1), where φ is Euler's totient function. If we fix a generator, then for any non-zero element a in Fq, there is a unique integer n with 0 ≤ n ≤ q - 2 such that a = x^n. The value of n for a given a is called the discrete log of a (in the given field, to base x).
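A primitive element and the associated discrete logs can be found by exhaustive search for small fields. The Python sketch below (p = 13 is an arbitrary choice) does this for the prime field F_13.

def is_generator(x, p):
    # x generates the multiplicative group if its powers hit all p - 1 nonzero residues
    seen, y = set(), 1
    for _ in range(p - 1):
        y = (y * x) % p
        seen.add(y)
    return len(seen) == p - 1

p = 13
g = next(x for x in range(2, p) if is_generator(x, p))
dlog = {pow(g, n, p): n for n in range(p - 1)}    # discrete-log table to base g
print(g)                                           # 2 is a generator of F_13*
print(dlog[5])                                     # the unique n with g**n % p == 5
assert pow(g, dlog[5], p) == 5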
Applications
Discrete exponentiation, also known as calculating a = x^n from x and n, can be computed quickly using techniques of fast exponentiation such as binary exponentiation, which takes only O(log n) field operations. No fast way of computing the discrete logarithm n given a and x is known, and this has many applications in cryptography, such as the Diffie–Hellman protocol. Finite fields also find applications in coding theory: many codes are constructed as subspaces of vector spaces over finite fields. Within number theory, the significance of finite fields is their role in the definition of the Frobenius element (or, more accurately, Frobenius conjugacy class) attached to a prime ideal in a Galois extension of number fields, which in turn is needed to make sense of Artin L-functions of representations of the Galois group, the non-abelian generalization of Dirichlet L-functions. Counting solutions to equations over finite fields leads into deep questions in algebraic geometry, the Weil conjectures, and in fact was the motivation for Grothendieck's development of modern algebraic geometry.
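The asymmetry exploited by Diffie–Hellman is easy to demonstrate in miniature. The following Python sketch (the prime 2^61 - 1, the base 3 and the secret exponents are toy values, far too small for real cryptography) uses Python's built-in fast modular exponentiation; recovering a secret exponent from the public values would require solving a discrete logarithm.

# Toy Diffie-Hellman exchange over F_p.
p, g = 2 ** 61 - 1, 3              # a Mersenne prime and a small base
a_secret, b_secret = 123456789, 987654321

A = pow(g, a_secret, p)            # binary exponentiation: O(log exponent) multiplications
B = pow(g, b_secret, p)
shared_a = pow(B, a_secret, p)
shared_b = pow(A, b_secret, p)
assert shared_a == shared_b        # both parties derive the same value g^(a*b) mod p
print(shared_a)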
F3:
+ | 0 1 2
0 | 0 1 2
1 | 1 2 0
2 | 2 0 1

× | 0 1 2
0 | 0 0 0
1 | 0 1 2
2 | 0 2 1
F4:
+ | 0 1 A B
0 | 0 1 A B
1 | 1 0 B A
A | A B 0 1
B | B A 1 0

× | 0 1 A B
0 | 0 0 0 0
1 | 0 1 A B
A | 0 A B 1
B | 0 B 1 A
Notes
[1] Jacobson 2009, 4.13
References
Jacobson, Nathan (2009) [1985], Basic algebra I (Second ed.), Dover Publications, ISBN978-0-486-47189-1 Lidl, Rudolf; Niederreiter, Harald (1997), Finite Fields (2nd ed.), Cambridge University Press, ISBN0-521-39231-4
External links
Finite Fields (http://mathworld.wolfram.com/FiniteField.html) at Wolfram research.
Real closed field
Definitions
A real closed field is a field F in which any of the following equivalent conditions are true:
1. F is elementarily equivalent to the real numbers. In other words it has the same first-order properties as the reals: any sentence in the first-order language of fields is true in F if and only if it is true in the reals.
2. There is a total order on F making it an ordered field such that, in this ordering, every positive element of F is a square in F and any polynomial of odd degree with coefficients in F has at least one root in F.
3. F is a formally real field such that every polynomial of odd degree with coefficients in F has at least one root in F, and for every element a of F there is b in F such that a = b^2 or a = -b^2.
4. F is not algebraically closed but its algebraic closure is a finite extension.
5. F is not algebraically closed but the field extension F(√-1) is algebraically closed.
6. There is an ordering on F which does not extend to an ordering on any proper algebraic extension of F.
7. F is a formally real field such that no proper algebraic extension of F is formally real. (In other words, the field is maximal in an algebraic closure with respect to the property of being formally real.)
8. There is an ordering on F making it an ordered field such that, in this ordering, the intermediate value theorem holds for all polynomials over F.
9. F is a field and a real closed ring.
If F is an ordered field (not just orderable, but a definite ordering is fixed as part of the structure), the Artin–Schreier theorem states that F has an algebraic extension, called the real closure K of F, such that K is a real closed field whose ordering is an extension of the given ordering on F, and is unique up to a unique isomorphism of fields (note that every ring homomorphism between real closed fields automatically is order preserving, because x ≤ y if and only if y = x + z^2 for some z). For example, the real closure of the rational numbers is the field of real algebraic numbers. The theorem is named for Emil Artin and Otto Schreier, who proved it in 1926. If F is a field (so this time, no order is fixed, and it is even not necessary to assume that F is orderable) then F still has a real closure, which in general is not a field anymore, but a real closed ring. For example, the real closure of the field Q(√2) is the ring Ralg × Ralg (the two copies correspond to the two orderings of Q(√2)), whereas the real closure of the ordered subfield Q(√2) of R is again the field Ralg of real algebraic numbers.
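Condition 2 can be illustrated for the real closed field of real algebraic numbers with a small sympy computation (the particular polynomial x^5 - x - 1 and the element 7/4 are arbitrary examples chosen for this sketch).

from sympy import Symbol, Rational, real_roots, sqrt

x = Symbol('x')
roots = real_roots(x**5 - x - 1)     # an odd-degree polynomial has a real algebraic root
print(roots)                          # [CRootOf(x**5 - x - 1, 0)]
print(roots[0].evalf(20))             # a decimal approximation of that root

# every positive element is a square: here 7/4 = (sqrt(7)/2)**2
r = sqrt(Rational(7, 4))
print(r**2)                           # 7/4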
Order properties
A crucially important property of the real numbers is that it is an archimedean field, meaning it has the archimedean property that for any real number, there is an integer larger than it in absolute value. An equivalent statement is that for any real number, there are integers both larger and smaller. A non-archimedean field is, of course, a field that is not archimedean, and there are real closed non-archimedean fields; for example any field of hyperreal numbers is real closed and non-archimedean. The archimedean property is related to the concept of cofinality. A set X contained in an ordered set F is cofinal in F if for every y in F there is an x in X such that y < x. In other words, X is an unbounded sequence in F. The cofinality of F is the size of the smallest cofinal set, which is to say, the least cardinality of an unbounded subset. For example the natural numbers are cofinal in the reals, and the cofinality of the reals is therefore ℵ0. We have therefore the following invariants defining the nature of a real closed field F:
The cardinality of F.
The cofinality of F.
To this we may add
The weight of F, which is the minimum size of a dense subset of F.
These three cardinal numbers tell us much about the order properties of any real closed field, though it may be difficult to discover what they are, especially if we are not willing to invoke the generalized continuum hypothesis. There are also particular properties which may or may not hold: A field F is complete if there is no ordered field K properly containing F such that F is dense in K. If the cofinality of F is κ, this is equivalent to saying Cauchy sequences indexed by κ are convergent in F.
An ordered field F has the η_α property for the ordinal number α if for any two subsets L and U of F of cardinality less than ℵ_α, at least one of which is nonempty, and such that every element of L is less than every element of U, there is an element x in F with x larger than every element of L and smaller than every element of U. This is closely related to the model-theoretic property of being a saturated model; any two real closed fields are η_α if and only if they are ℵ_α-saturated, and moreover two real closed fields which are both η_α and both of cardinality ℵ_α are order isomorphic.
References
Basu, Saugata; Pollack, Richard; Roy, Marie-Françoise (2003) "Algorithms in real algebraic geometry" in Algorithms and Computation in Mathematics. Springer. ISBN 3540330984 (online version [1])
Caviness, B. F., and Jeremy R. Johnson, eds. (1998) Quantifier Elimination and Cylindrical Algebraic Decomposition. Springer. ISBN 3211827943
Chen Chung Chang and Howard Jerome Keisler (1989) Model Theory. North-Holland.
Dales, H. G., and W. Hugh Woodin (1996) Super-Real Fields. Oxford Univ. Press.
Mishra, Bhubaneswar (1997) "Computational Real Algebraic Geometry" [2], in Handbook of Discrete and Computational Geometry. CRC Press. 2004 edition, p. 743. ISBN 1-58488-301-4
Alfred Tarski (1951) A Decision Method for Elementary Algebra and Geometry. Univ. of California Press.
Davenport, J., and Heintz, J., "Real quantifier elimination is doubly exponential", Journal of Symbolic Computation 5:1–2 (1988), pp. 29–35.
External links
Real Algebraic and Analytic Geometry Preprint Server [3] Model Theory preprint server [4]
References
[1] http://perso.univ-rennes1.fr/marie-francoise.roy/bpr-ed2-posted1.html
[2] http://www.cs.nyu.edu/mishra/PUBLICATIONS/97.real-alg.ps
[3] http://www.maths.manchester.ac.uk/raag/
[4] http://www.logique.jussieu.fr/modnet/Publications/Preprint%20server/
Formally real field
Alternative Definitions
The definition given above is not a first-order definition, as it requires quantifiers over sets. However, the following criteria can be coded as first-order sentences in the language of fields, and are equivalent to the above definition. A formally real field F is a field that satisfies in addition one of the following equivalent properties:[1] [2]
-1 is not a sum of squares in F. (In particular, such a field must have characteristic 0, since in a field of characteristic p the element -1 is a sum of 1's.)
There exists an element of F which is not a sum of squares in F, and the characteristic of F is not 2.
If any sum of squares of elements of F equals zero, then each of those elements must be zero.
It is easy to see that these three properties are equivalent. It is also easy to see that a field which admits an ordering must satisfy these three properties. A proof that if F satisfies these three properties, then F admits an ordering uses the notion of prepositive cones and positive cones. Suppose -1 is not a sum of squares; then a Zorn's Lemma argument shows that the prepositive cone of sums of squares can be extended to a positive cone P ⊆ F. One uses this positive cone to define an ordering: a ≤ b if and only if b - a belongs to P.
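By contrast, no finite field is formally real: -1 is always a sum of two squares in F_p. The short Python sketch below (small odd primes only; the search is brute force) exhibits such a representation.

def minus_one_as_sum_of_two_squares(p):
    # find a, b with a^2 + b^2 = -1 (mod p), if they exist
    squares = {(a * a) % p for a in range(p)}
    for a in range(p):
        target = (-1 - a * a) % p
        if target in squares:
            b = next(b for b in range(p) if (b * b) % p == target)
            return a, b
    return None

for p in (3, 5, 7, 11, 13):
    a, b = minus_one_as_sum_of_two_squares(p)
    assert (a * a + b * b) % p == p - 1     # i.e. a^2 + b^2 = -1 in F_p
    print(p, a, b)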
Notes
[1] Rajwade, Theorem 15.1. [2] Milnor and Husemoller (1973) p.60
References
Milnor, John; Husemoller, Dale (1973). Symmetric bilinear forms. Springer. ISBN3-540-06009-X. Rajwade, A. R. (1993). Squares. Cambridge University Press. ISBN 0521426685.
Set theory
General set theory
General set theory (GST) is George Boolos's (1998) name for a fragment of the axiomatic set theory Z. GST is sufficient for all mathematics not requiring infinite sets, and is the weakest known set theory whose theorems include the Peano axioms.
Ontology
The ontology of GST is identical to that of ZFC, and hence is thoroughly canonical. GST features a single primitive ontological notion, that of set, and a single ontological assumption, namely that all individuals in the universe of discourse (hence all mathematical objects) are sets. There is a single primitive binary relation, set membership; that set a is a member of set b is written a ∈ b (usually read "a is an element of b").
Axioms
The symbolic axioms below are from Boolos (1998: 196), and govern how sets behave and interact. The natural language versions of the axioms are intended to aid the intuition. The background logic is first order logic with identity.
1) Axiom of Extensionality: The sets x and y are the same set if they have the same members:
∀x ∀y [∀z (z ∈ x ↔ z ∈ y) → x = y].
The converse of this axiom follows from the substitution property of equality.
2) Axiom Schema of Specification (or Separation or Restricted Comprehension): If z is a set and φ is any property which may be satisfied by all, some, or no elements of z, then there exists a subset y of z containing just those elements x in z which satisfy the property φ. The restriction to z is necessary to avoid Russell's paradox and its variants. More formally, let φ(x) be any formula in the language of GST in which x is free and y is not. Then all instances of the following schema are axioms:
∀z ∃y ∀x [x ∈ y ↔ (x ∈ z ∧ φ(x))].
3) Axiom of Adjunction: If x and y are sets, then there exists a set w, the adjunction of x and y, whose members are just y and the members of x:[1]
∀x ∀y ∃w ∀z [z ∈ w ↔ (z ∈ x ∨ z = y)].
Adjunction refers to an elementary operation on two sets, and has no bearing on the use of that term elsewhere in mathematics, including in category theory.
Discussion
GST is the fragment of Z obtained by omitting the axioms Union, Power Set, Infinity, and Choice, then taking Adjunction, a theorem of Z, as an axiom. The result is a first order theory. Setting φ(x) in Separation to x ≠ x, and assuming that the domain is nonempty, assures the existence of the empty set. Adjunction implies that if x is a set, then so is S(x) = x ∪ {x}. Given Adjunction, the usual construction of the successor ordinals from the empty set can proceed, one in which the natural numbers are defined as ∅, S(∅), S(S(∅)), ... (see Peano axioms). More generally, given any model M of ZFC, the collection of hereditarily finite sets in M will satisfy the GST axioms. Therefore, GST cannot prove the existence of even a countably infinite set, that is, of a set whose cardinality is ℵ0. Even if GST did afford a countably infinite set, GST could not prove the existence of a set whose cardinality is ℵ1, nor of one whose cardinality is that of the continuum, because GST lacks the axiom of power set. Hence GST cannot ground analysis and geometry, and is too weak to serve as a foundation for mathematics. Boolos was interested in GST only as a fragment of Z that is just powerful enough to interpret Peano arithmetic. He never lingered over GST, only mentioning it briefly in several papers discussing the systems of Frege's Grundlagen and Grundgesetze, and how they could be modified to eliminate Russell's paradox. The system A'[0] in Tarski and Givant (1987: 223) is essentially GST with an axiom schema of induction replacing Specification, and with the existence of a null set explicitly assumed. GST is called STZ in Burgess (2005), p. 223.[2] Burgess's theory ST[3] is GST with Null Set replacing the axiom schema of specification. That the letters "ST" also appear in "GST" is a coincidence.
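Returning to the role of Adjunction noted above: the construction of the natural numbers it supports can be mimicked concretely with hereditarily finite sets. The Python sketch below (frozensets stand in for sets; the function name adjunction is an ad hoc choice) builds the first few von Neumann numerals.

def adjunction(x, y):
    # the set whose members are y together with the members of x
    return x | frozenset({y})

empty = frozenset()
naturals = [empty]                       # 0 = {}
for _ in range(4):
    n = naturals[-1]
    naturals.append(adjunction(n, n))    # successor: S(n) = n union {n}

for i, n in enumerate(naturals):
    print(i, len(n))                     # the von Neumann numeral i has i elements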
Metamathematics
The most remarkable fact about ST (and hence GST) is that these tiny fragments of set theory give rise to such rich metamathematics. While ST is a small fragment of the well-known canonical set theories ZFC and NBG, ST interprets Robinson arithmetic (Q), so that ST inherits the nontrivial metamathematics of Q. For example, ST is essentially undecidable because Q is, and every consistent theory whose theorems include the ST axioms is also essentially undecidable.[4] This includes GST and every axiomatic set theory worth thinking about, assuming these are consistent. In fact, the undecidability of ST implies the undecidability of first-order logic with a single binary predicate letter.[5] Q is also incomplete in the sense of Gödel's incompleteness theorem. Any theory, such as ST and GST, whose theorems include the Q axioms is likewise incomplete. Moreover, the consistency of GST cannot be proved within GST itself, unless GST is in fact inconsistent. GST is:
Mutually interpretable with Peano arithmetic (thus it has the same proof-theoretic strength as PA);
Immune to the three great antinomies of naïve set theory: Russell's, Burali-Forti's, and Cantor's;
Not finitely axiomatizable. Montague (1961) showed that ZFC is not finitely axiomatizable, and his argument carries over to GST. Hence any axiomatization of GST must include at least one axiom schema such as Separation;
Interpretable in relation algebra, because no part of any GST axiom lies in the scope of more than three quantifiers. This is the necessary and sufficient condition given in Tarski and Givant (1987).
Footnotes
[1] This is seldom mentioned in the literature. Exceptions are Burgess (2005), passim, and QIII in Tarski and Givant (1987: 223).
[2] The Null Set axiom in STZ is redundant, because the existence of the null set is derivable from the axiom schema of Specification.
[3] Called S' in Tarski et al. (1953: 34).
[4] Burgess (2005), 2.2, p. 91.
[5] Tarski et al. (1953), p. 34.
References
George Boolos (1998) Logic, Logic, and Logic. Harvard Univ. Press. Burgess, John, 2005. Fixing Frege. Princeton Univ. Press. Richard Montague (1961) "Semantical closure and non-finite axiomatizability" in Infinistic Methods. Warsaw: 45-69. Alfred Tarski, Andrzej Mostowski, and Raphael Robinson (1953) Undecidable Theories. North Holland.
Tarski, A., and Givant, Steven (1987) A Formalization of Set Theory without Variables. Providence RI: AMS Colloquium Publications, v. 41.
External links
Stanford Encyclopedia of Philosophy: Set Theory (http://plato.stanford.edu/entries/set-theory/) -- by Thomas Jech.
Kripke–Platek set theory
The axioms of KP
Axiom of extensionality: Two sets are the same if and only if they have the same elements.
Axiom of induction: If φ(a) is a formula, and if for all sets x it follows from the fact that φ(y) is true for all elements y of x that φ(x) holds, then φ(x) holds for all sets x.
Axiom of empty set: There exists a set with no members, called the empty set and denoted {}. (Note: the existence of a member in the universe of discourse, i.e., ∃x (x = x), is implied in certain formulations[1] of first-order logic, in which case the axiom of empty set follows from the axiom of separation, and is thus redundant.)
Axiom of pairing: If x, y are sets, then so is {x, y}, a set containing x and y as its only elements.
Axiom of union: For any set x, there is a set y such that the elements of y are precisely the elements of the elements of x.
Axiom of Σ0-separation: Given any set and any Σ0 formula φ(x), there is a subset of the original set containing precisely those elements x for which φ(x) holds. (This is an axiom schema.)
Axiom of Σ0-collection: Given any Σ0 formula φ(x, y), if for every set x there exists a set y such that φ(x, y) holds, then for all sets u there exists a set v such that for every x in u there is a y in v such that φ(x, y) holds.
Here, a Σ0, Π0, or Δ0 formula is one all of whose quantifiers are bounded. This means any quantification is of the form ∀u ∈ v or ∃u ∈ v. (More generally, we would say that a formula is Σn+1 when it is obtained by adding existential quantifiers in front of a Πn formula, and that it is Πn+1 when it is obtained by adding universal quantifiers in front of a Σn formula: this is related to the arithmetical hierarchy but in the context of set theory.)
These axioms differ from ZFC in as much as they exclude the axioms of infinity, powerset, and choice. Also the axioms of separation and collection here are weaker than the corresponding axioms in ZFC because the formulas used in these are limited to bounded quantifiers only. The axiom of induction here is stronger than the usual axiom of regularity (which amounts to applying induction to the complement of a set, the class of all sets not in the given set).
Thus a superset of A×{b} = {(a, b) | a in A} exists by the axiom of collection. The formula expressing that v equals the pair (a, b) is Σ0, so A×{b} itself exists by the axiom of separation. Thus a superset of {A×{b} | b in B} exists by the axiom of collection, and putting an existential quantifier over b in front of that last formula, we get that the set {A×{b} | b in B} itself exists by the axiom of separation. Finally, A×B = ∪{A×{b} | b in B} exists by the axiom of union. This is what was to be proved.
Admissible sets
A set A is called admissible if it is transitive and A, with the membership relation, is a model of Kripke–Platek set theory. An ordinal number α is called an admissible ordinal if Lα is an admissible set. The ordinal α is an admissible ordinal if and only if α is a limit ordinal and there does not exist a β < α for which there is a Σ1(Lα) mapping from β onto α. If M is a standard model of KP, then the set of ordinals in M is an admissible ordinal. If Lα is a standard model of KP set theory without the axiom of Σ0-collection, then it is said to be an "amenable set".
References
Gostanian, Richard (1980). "Constructible Models of Subsystems of ZF". Journal of Symbolic Logic (Association for Symbolic Logic) 45 (2): 237. doi:10.2307/2273185. JSTOR2273185
[1] Poizat, Bruno (2000). A course in model theory: an introduction to contemporary mathematical logic. Springer. ISBN 0-387-98655-3. Note at end of §2.3 on page 27: "Those who do not allow relations on an empty universe consider (∃x) x = x and its consequences as theses; we, however, do not share this abhorrence, with so little logical ground, of a vacuum."
Zermelo set theory
In the usual cumulative hierarchy Vα of ZFC set theory (for ordinals α), any one of the sets Vα for α a limit ordinal larger than the first infinite ordinal ω (such as Vω·2) forms a model of Zermelo set theory. So the consistency of Zermelo set theory is a theorem of ZFC set theory. Zermelo's axioms do not imply the existence of ℵω or larger infinite cardinals, as the model Vω·2 does not contain such cardinals. (Cardinals have to be defined differently in Zermelo set theory, as the usual definition of cardinals and ordinals does not work very well: with the usual definition it is not even possible to prove the existence of the ordinal ω·2.) The axiom of infinity is usually now modified to assert the existence of the first infinite von Neumann ordinal ω; the original Zermelo axioms cannot prove the existence of this set, nor can the modified Zermelo axioms prove Zermelo's axiom of infinity. Zermelo's axioms (original or modified) cannot prove the existence of Vω as a set nor of any rank of the cumulative hierarchy of sets with infinite index. Zermelo set theory is similar in strength to topos theory with a natural number object, or to the system in Principia mathematica. It is strong enough to carry out almost all ordinary mathematics not directly connected with set theory or logic.
Every set M possesses at least one subset M0, namely the subset of those x in M with x not in x, given by the axiom of separation, that is not an element of M. For:
1. If M0 is in M0, then M0 contains an element x for which x is in x (namely M0 itself), which would contradict the definition of M0.
2. If M0 is not in M0, and assuming M0 is an element of M, then M0 satisfies the definition "x is not in x", and so is in M0, which is a contradiction.
Therefore the assumption that M0 is in M is wrong, proving the theorem. Hence not all objects of the universal domain B can be elements of one and the same set. "This disposes of the Russell antinomy as far as we are concerned". This left the problem of "the domain B" which seems to refer to something. This led to the idea of a proper class.
Cantor's theorem
Zermelo's paper is notable for what may be the first mention of Cantor's theorem explicitly and by name. This appeals strictly to set theoretical notions, and is thus not exactly the same as Cantor's diagonal argument. Cantor's theorem: "If M is an arbitrary set, then always M < P(M) [the power set of M]. Every set is of lower cardinality than the set of its subsets". Zermelo proves this by considering a function φ: M → P(M). By Axiom III this defines the following set M': M' = {m : m ∉ φ(m)}. But no element m' of M could correspond to M', i.e. be such that φ(m') = M'. Otherwise we can construct a contradiction: 1) If m' is in M', then by definition m' ∉ φ(m') = M', which is the first part of the contradiction; 2) If m' is not in M' but is in M, then by definition m' ∉ M' = φ(m'), which by definition implies that m' is in M', which is the second part of the contradiction. So by contradiction m' does not exist. Note the close resemblance of this proof to the way Zermelo disposes of Russell's paradox.
References
Zermelo, Ernst (1908), "Untersuchungen über die Grundlagen der Mengenlehre I", Mathematische Annalen 65 (2): 261–281, doi:10.1007/BF01449999. English translation in Heijenoort, Jean van (1967), "Investigations in the foundations of set theory", From Frege to Gödel: A Source Book in Mathematical Logic, 1879–1931, Source Books in the History of the Sciences, Harvard Univ. Press, pp. 199–215, ISBN 978-0-674-32449-7
Ackermann set theory
The language
Ackermann set theory is formulated in first-order logic. The language consists of one binary relation ∈ and one constant V (Ackermann used a predicate M instead). We will write x ∈ y for ∈(x, y). The intended interpretation of V is the class of all sets; x ∈ V says that the object x is a set.
The axioms
The axioms of Ackermann set theory, collectively referred to as A, consist of the universal closures of the following formulas in the language described above.
1) Axiom of extensionality: ∀x ∀y (∀z (z ∈ x ↔ z ∈ y) → x = y).
2) Class construction axiom schema: Let φ(a, b1, ..., bn) be any formula in which the variable c is not free. Then ∃c ∀a (a ∈ c ↔ a ∈ V ∧ φ(a, b1, ..., bn)) is an axiom.
3) Reflection axiom schema: Let φ(a, b1, ..., bn) be any formula in which the constant V does not occur and the variable c is not free. If b1, ..., bn ∈ V and ∀a (φ(a, b1, ..., bn) → a ∈ V), then ∃c (c ∈ V ∧ ∀a (a ∈ c ↔ φ(a, b1, ..., bn))).
References
Ackermann, Wilhelm "Zur Axiomatik der Mengenlehre" in Mathematische Annalen, 1956, Vol. 131, pp. 336--345. Levy, Azriel, "On Ackermann's set theory" Journal of Symbolic Logic Vol. 24, 1959 154--166 Reinhardt, William, "Ackermann's set theory equals ZF" Annals of Mathematical Logic Vol. 2, 1970 no. 2, 189--249
Zermelo–Fraenkel set theory
History
In 1908, Ernst Zermelo proposed the first axiomatic set theory, Zermelo set theory. This axiomatic theory did not allow the construction of the ordinal numbers. While most of "ordinary mathematics" can be developed without ever using ordinals, ordinals are an essential tool in most set-theoretic investigations. Moreover, one of Zermelo's axioms invoked a concept, that of a "definite" property, whose operational meaning was not clear. In 1922, Abraham Fraenkel and Thoralf Skolem independently proposed operationalizing a "definite" property as one that could be formulated as a first order theory whose atomic formulas were limited to set membership and identity. They also independently proposed replacing the axiom schema of specification with the axiom schema of replacement. Appending this schema, as well as the axiom of regularity (first proposed by Dimitry Mirimanoff in 1917), to Zermelo set theory yields the theory denoted by ZF. Adding to ZF either the axiom of choice (AC) or a statement that is equivalent to it yields ZFC.
The axioms
There are many equivalent formulations of the ZFC axioms; for a rich but somewhat dated discussion of this fact, see Fraenkel et al. (1973). The following particular axiom set is from Kunen (1980). The axioms per se are expressed in the symbolism of first order logic. The associated English prose is only intended to aid the intuition. All formulations of ZFC imply that at least one set exists. Kunen includes an axiom that directly asserts the existence of a set, in addition to the axioms given below. Its omission here can be justified in two ways. Many authors require a nonempty domain of discourse as part of the semantics of the first-order logic in which ZFC is formalized. Also the axiom of infinity (below) also implies that at least one set exists, as it begins with an existential quantifier, so its presence makes superfluous an axiom (merely) asserting the existence of a set.
1. Axiom of extensionality
Two sets are equal (are the same set) if they have the same elements:
∀x ∀y [∀z (z ∈ x ↔ z ∈ y) → x = y].
The converse of this axiom follows from the substitution property of equality. If the background logic does not include equality "=", x = y may be defined as an abbreviation for the following formula (Hatcher 1982, p. 138, def. 1):
∀z (z ∈ x ↔ z ∈ y) ∧ ∀w (x ∈ w ↔ y ∈ w),
which says that if x and y have the same elements, then they belong to the same sets (Fraenkel et al. 1973).
3. Axiom schema of specification (also called the axiom schema of separation or of restricted comprehension)
If z is a set, and φ is any property which may characterize the elements x of z, then there is a subset y of z containing those x in z which satisfy the property. The "restriction" to z is necessary to avoid Russell's paradox and its variants. More formally, let φ be any formula in the language of ZFC with free variables among x, z, w1, ..., wn (in particular, y is not free in φ). Then:
∀z ∀w1 ... ∀wn ∃y ∀x [x ∈ y ↔ (x ∈ z ∧ φ)].
This axiom is part of Z, but can be redundant in ZF, in that it may follow from the axiom schema of replacement, with (as here) or without the axiom of the empty set. The axiom of specification can be used to prove the existence of the empty set, denoted ∅, once the existence of at least one set is established (see above). A common way to do this is to use an instance of specification for a property which all sets do not have. For example, if w is a set which already exists, the empty set can be constructed as {u ∈ w : (u ∈ u) ∧ ¬(u ∈ u)}. If the background logic includes equality, it is also possible to define the empty set as {u ∈ w : ¬(u = u)}. Thus the axiom of the empty set is implied by the nine axioms presented here. The axiom of extensionality implies the empty set is unique, if it exists. It is common to make a definitional extension that adds the symbol ∅ to the language of ZFC.
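A finite-domain toy model makes the schema and the empty-set construction concrete. In the Python sketch below (frozensets model sets; the property passed to specify is arbitrary), specification carves a subset out of an already-existing set, and the empty set is obtained from the property "u is not equal to u".

def specify(z, phi):
    # subset of z consisting of the members x of z for which phi(x) holds
    return frozenset(x for x in z if phi(x))

w = frozenset({frozenset(), frozenset({frozenset()})})    # some already-existing set
empty = specify(w, lambda u: u != u)                       # { u in w : not (u = u) }
print(empty == frozenset())                                # True

evens = specify(frozenset(range(10)), lambda n: n % 2 == 0)
print(sorted(evens))                                       # [0, 2, 4, 6, 8]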
4. Axiom of pairing
If x and y are sets, then there exists a set which contains x and y as elements.
This axiom is part of Z, but is redundant in ZF because it follows from the axiom schema of replacement applied to any two-member set. The existence of such a set is assured by either the axiom of infinity, or by the axiom of the power set applied twice to the empty set.
5. Axiom of union
For any set F there is a set A containing every set that is a member of some member of F.
6. Axiom schema of replacement
Informally, this axiom schema states that if the domain of a definable function f is a set, and f(x) is a set for any x in that domain, then the range of f is a subclass of a set, subject to a restriction needed to avoid paradoxes. The form stated here, in which B may be larger than strictly necessary, is sometimes called the axiom schema of collection.
7. Axiom of infinity
Let S(w) abbreviate w ∪ {w}, where w is some set. Then there exists a set X such that the empty set ∅ is a member of X and, whenever a set y is a member of X, then S(y) is also a member of X.
More colloquially, there exists a set X having infinitely many members. The minimal set X satisfying the axiom of infinity is the von Neumann ordinal ω, which can also be thought of as the set of natural numbers N.
9. Well-ordering theorem
For any set X, there is a binary relation R which well-orders X. This means R is a linear order on X such that every nonempty subset of X has a member which is minimal under R.
Given axioms 1–8, there are many statements provably equivalent to axiom 9, the best known of which is the axiom of choice (AC), which goes as follows. Let X be a set whose members are all non-empty. Then there exists a function f from X to the union of the members of X, called a "choice function", such that for all Y ∈ X one has f(Y) ∈ Y. Since the existence of a choice function when X is a finite set is easily proved from axioms 1–8, AC only matters for certain infinite sets. AC is characterized as nonconstructive because it asserts the existence of a choice set but says nothing about how the choice set is to be "constructed." Much research has sought to characterize the definability (or lack thereof) of certain sets whose existence AC asserts.
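For a finite family a choice function can simply be written down, which is why AC only has force for infinite sets. A toy Python version (the sample family X is an arbitrary choice for this sketch):

def choice_function(family):
    # pick one element from each nonempty member of the family
    return {Y: next(iter(Y)) for Y in family}

X = [frozenset({1, 2}), frozenset({'a'}), frozenset({3.0, 4.0, 5.0})]
f = choice_function(X)
assert all(f[Y] in Y for Y in X)    # f(Y) is a member of Y for every Y in X
print(f)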
Metamathematics
The axiom schemata of replacement and separation each contain infinitely many instances. Montague (1961) included a result first proved in his 1957 Ph.D. thesis: if ZFC is consistent, it is impossible to axiomatize ZFC using only finitely many axioms. On the other hand, Von NeumannBernaysGdel set theory (NBG) can be finitely axiomatized. The ontology of NBG includes proper classes as well as sets; a set is any class that can be a member of another class. NBG and ZFC are equivalent set theories in the sense that any theorem not mentioning classes and provable in one theory can be proved in the other. Gdel's second incompleteness theorem says that a recursively axiomatizable system that can interpret Robinson arithmetic can prove its own consistency only if it is inconsistent. Moreover, Robinson arithmetic can be interpreted in general set theory, a small fragment of ZFC. Hence the consistency of ZFC cannot be proved within ZFC itself (unless it is actually inconsistent). Thus, to the extent that ZFC is identified with ordinary mathematics, the consistency of ZFC cannot be demonstrated in ordinary mathematics. The consistency of ZFC does follow from the existence of a weakly inaccessible cardinal, which is unprovable in ZFC if ZFC is consistent. Nevertheless, it is unlikely that ZFC harbors an unsuspected contradiction; if ZFC were inconsistent, it is widely believed that that fact would have been uncovered by now. This much is certain ZFC is immune to the classic paradoxes of naive set theory: Russell's paradox, the Burali-Forti paradox, and Cantor's paradox. Abian and LaMacchia (1978) studied a subtheory of ZFC consisting of the axioms of extensionality, union, powerset, replacement, and choice. Using models, they proved this subtheory consistent, and proved that each of the
axioms of extensionality, replacement, and power set is independent of the four remaining axioms of this subtheory. If this subtheory is augmented with the axiom of infinity, each of the axioms of union, choice, and infinity is independent of the five remaining axioms. Because non-well-founded set theory is a model of ZFC without the axiom of regularity, that axiom is independent of the other ZFC axioms. If consistent, ZFC cannot prove the existence of the inaccessible cardinals that category theory requires. Huge sets of this nature are possible if ZF is augmented with Tarski's axiom (Tarski 1939). Assuming that axiom turns the axioms of infinity, power set, and choice (7–9 above) into theorems.
Independence in ZFC
Many important statements are independent of ZFC (see list of statements undecidable in ZFC). The independence is usually proved by forcing, whereby it is shown that every countable transitive model of ZFC (sometimes augmented with large cardinal axioms) can be expanded to satisfy the statement in question. A different expansion is then shown to satisfy the negation of the statement. An independence proof by forcing automatically proves independence from arithmetical statements, other concrete statements, and large cardinal axioms. Some statements independent of ZFC can be proven to hold in particular inner models, such as in the constructible universe. However, some statements that are true about constructible sets are not consistent with hypothesized large cardinal axioms. Forcing proves that the following statements are independent of ZFC: Continuum hypothesis Diamond principle Suslin hypothesis Kurepa hypothesis Martin's axiom (which is not a ZFC axiom) Axiom of Constructibility (V=L) (which is also not a ZFC axiom).
Remarks: The consistency of V=L is provable by inner models but not forcing: every model of ZF can be trimmed to become a model of ZFC+V=L. The Diamond Principle implies the Continuum Hypothesis and the negation of the Suslin Hypothesis. Martin's axiom plus the negation of the Continuum Hypothesis implies the Suslin Hypothesis. The constructible universe satisfies the Generalized Continuum Hypothesis, the Diamond Principle, Martin's Axiom and the Kurepa Hypothesis. A variation on the method of forcing can also be used to demonstrate the consistency and unprovability of the axiom of choice, i.e., that the axiom of choice is independent of ZF. The consistency of choice can be (relatively) easily verified by proving that the inner model L satisfies choice. (Thus every model of ZF contains a submodel of ZFC, so that Con(ZF) implies Con(ZFC).) Since forcing preserves choice, we cannot directly produce a model contradicting choice from a model satisfying choice. However, we can use forcing to create a model which contains a suitable submodel, namely one satisfying ZF but not C. Another method of proving independence results, one owing nothing to forcing, is based on Gdel's second incompleteness theorem. This approach employs the statement whose independence is being examined, to prove the existence of a set model of ZFC, in which case Con(ZFC) is true. Since ZFC satisfies the conditions of Gdel's second theorem, the consistency of ZFC is unprovable in ZFC (provided that ZFC is, in fact, consistent). Hence no statement allowing such a proof can be proved in ZFC. This method can prove that the existence of large cardinals is not provable in ZFC, but cannot prove that assuming such cardinals, given ZFC, is free of contradiction.
Criticisms
For criticism of set theory in general, see Objections to set theory ZFC has been criticized both for being excessively strong and for being excessively weak, as well as for its failure to capture objects such as proper classes and the universal set. Many mathematical theorems can be proven in much weaker systems than ZFC, such as Peano arithmetic and second order arithmetic (as explored by the program of reverse mathematics). Saunders Mac Lane and Solomon Feferman have both made this point. Some of "mainstream mathematics" (mathematics not directly connected with axiomatic set theory) is beyond Peano arithmetic and second order arithmetic, but still, all such mathematics can be carried out in ZC (Zermelo set theory with choice), another theory weaker than ZFC. Much of the power of ZFC, including the axiom of regularity and the axiom schema of replacement, is included primarily to facilitate the study of the set theory itself. On the other hand, among axiomatic set theories, ZFC is comparatively weak. Unlike New Foundations, ZFC does not admit the existence of a universal set. Hence the universe of sets under ZFC is not closed under the elementary operations of the algebra of sets. Unlike von NeumannBernaysGdel set theory and MorseKelley set theory (MK), ZFC does not admit the existence of proper classes. These ontological restrictions are required for ZFC to avoid Russell's paradox, but critics argue these restrictions make the ZFC axioms fail to capture the informal concept of set. A further comparative weakness of ZFC is that the axiom of choice included in ZFC is weaker than the axiom of global choice included in MK. There are numerous mathematical statements undecidable in ZFC. These include the continuum hypothesis, the Whitehead problem, and the Normal Moore space conjecture. Some of these conjectures are provable with the addition of axioms such as Martin's axiom, large cardinal axioms to ZFC. Some others are decided in ZF+AD where AD is the axiom of determinacy, a strong supposition incompatible with choice. One attraction of large cardinal axioms is that they enable many results from ZF+AD to be established in ZFC adjoined by some large cardinal axiom (see projective determinacy). The Mizar system has adopted TarskiGrothendieck set theory instead of ZFC so that proofs involving Grothendieck universes (encountered in category theory and algebraic geometry) can be formalized.
References
Alexander Abian, 1965. The Theory of Sets and Transfinite Arithmetic. W B Saunders. -------- and LaMacchia, Samuel, 1978, "On the Consistency and Independence of Some Set-Theoretical Axioms, [1] " Notre Dame Journal of Formal Logic 19: 155-58. Keith Devlin, 1996 (1984). The Joy of Sets. Springer. Abraham Fraenkel, Yehoshua Bar-Hillel, and Azriel Levy, 1973 (1958). Foundations of Set Theory. North Holland. Fraenkel's final word on ZF and ZFC. Hatcher, William, 1982 (1968). The Logical Foundations of Mathematics. Pergamon. Peter Hinman, 2005, Fundamentals of Mathematical Logic, A K Peters. ISBN 978-1-56881-262-5 Thomas Jech, 2003. Set Theory: The Third Millennium Edition, Revised and Expanded. Springer. ISBN 3-540-44085-2. Kenneth Kunen, 1980. Set Theory: An Introduction to Independence Proofs. Elsevier. ISBN 0-444-86839-9. Richard Montague, 1961, "Semantic closure and non-finite axiomatizability" in Infinistic Methods. London: Pergamon: 45-69. Patrick Suppes, 1972 (1960). Axiomatic Set Theory. Dover reprint. Perhaps the best exposition of ZFC before the independence of AC and the Continuum hypothesis, and the emergence of large cardinals. Includes many theorems. Gaisi Takeuti and Zaring, W M, 1971. Introduction to Axiomatic Set Theory. Springer Verlag.
Alfred Tarski, 1939, "On well-ordered subsets of any set", Fundamenta Mathematicae 32: 176–83.
Tiles, Mary, 2004 (1989). The Philosophy of Set Theory. Dover reprint. Weak on metatheory; the author is not a mathematician.
Tourlakis, George, 2003. Lectures in Logic and Set Theory, Vol. 2. Cambridge Univ. Press.
Jean van Heijenoort, 1967. From Frege to Gödel: A Source Book in Mathematical Logic, 1879–1931. Harvard Univ. Press. Includes annotated English translations of the classic articles by Zermelo, Fraenkel, and Skolem bearing on ZFC.
Zermelo, Ernst (1908), "Untersuchungen über die Grundlagen der Mengenlehre I", Mathematische Annalen 65: 261–281, doi:10.1007/BF01449999. English translation in Heijenoort, Jean van (1967), "Investigations in the foundations of set theory", From Frege to Gödel: A Source Book in Mathematical Logic, 1879–1931, Source Books in the History of the Sciences, Harvard Univ. Press, pp. 199–215, ISBN 978-0-674-32449-7
Zermelo, Ernst (1930), "Über Grenzzahlen und Mengenbereiche" [2], Fundamenta Mathematicae 16: 29–47, ISSN 0016-2736
External links
Stanford Encyclopedia of Philosophy articles by Thomas Jech: Set Theory [3]; Axioms of Zermelo-Fraenkel Set Theory [4]. Metamath version of the ZFC axioms [5] A concise and nonredundant axiomatization. The background first order logic is defined especially to facilitate machine verification of proofs. A derivation [6] in Metamath of a version of the separation schema from a version of the replacement schema. Zermelo-Fraenkel Axioms [7] on PlanetMath
References
[1] http://projecteuclid.org/DPubS/Repository/1.0/Disseminate?view=body&id=pdf_1&handle=euclid.ndjfl/1093888220
[2] http://matwbn.icm.edu.pl/tresc.php?wyd=1&tom=16
[3] http://plato.stanford.edu/entries/set-theory/
[4] http://plato.stanford.edu/entries/set-theory/ZF.html
[5] http://us.metamath.org/mpegif/mmset.html#staxioms
[6] http://us.metamath.org/mpegif/axsep.html
[7] http://planetmath.org/?op=getobj&from=objects&id=317
Von Neumann–Bernays–Gödel set theory
Ontology
The defining aspect of NBG is the distinction between proper class and set. Let a and s be two individuals. Then the atomic sentence a ∈ s is defined if a is a set and s is a class. In other words, a ∈ s is defined unless a is a proper class. A proper class is very large; NBG even admits of "the class of all sets", the universal class called V. However, NBG does not admit "the class of all classes" (which fails because proper classes are not "objects" that can be put into classes in NBG) or "the set of all sets" (whose existence cannot be justified with NBG axioms). By NBG's axiom schema of Class Comprehension, all objects satisfying any given formula in the first order language of NBG form a class; if the class would not be a set in ZFC, it is an NBG proper class. The development of classes mirrors the development of naive set theory. The principle of abstraction is given, and thus classes can be formed out of all individuals satisfying any statement of first order logic whose atomic sentences all involve either the membership relation or predicates definable from membership. Equality, pairing, subclass, and such, are all definable and so need not be axiomatized: their definitions denote a particular abstraction of a formula. Sets are developed in a manner very similar to ZF. Let Rp(A,a), meaning "the set a represents the class A," denote a binary relation defined as follows:
Rp(A, a) if and only if ∀x (x ∈ A ↔ x ∈ a).
That is, a "represents" A if every element of a is an element of A, and conversely. Classes lacking representations, such as the class of all sets that do not contain themselves (the class invoked by the Russell paradox), are the proper classes.
History
The first variant of NBG, by John von Neumann in the 1920s, took functions and not sets as primitive. In a series of articles published 1937–54, Paul Bernays modified von Neumann's theory so as to make sets and set membership primitive; he also discovered that it could be finitely axiomatized. Gödel (1940), while investigating the independence of the Continuum hypothesis, further simplified and used the theory. Montague (1961) showed that ZFC cannot be finitely axiomatized.
Axiomatizing NBG
NBG is presented here as a two-sorted theory, with lower case letters denoting variables ranging over sets, and upper case letters denoting variables ranging over classes. Hence "x ∈ y" should be read "set x is a member of set y," and "x ∈ Y" as "set x is a member of class Y." Statements of equality may take the form "x = y" or "X = Y"; mixed forms such as "x = Y" are an abuse of notation. NBG can also be presented as a one-sorted theory of classes, with sets being those classes that are members of at least one other class.
We first axiomatize NBG using the axiom schema of Class Comprehension. This schema is provably equivalent[1] to 9 of its finite instances, stated in the following section. Hence these 9 finite axioms can replace Class Comprehension. This is the precise sense in which NBG can be finitely axiomatized.
The following axioms govern sets; they closely parallel their ZFC counterparts.
pairing: For any sets x and y, there is a set whose elements are exactly x and y.
pairing implies that for any set x, the set {x} (the singleton set) exists. Also, given any two sets x and y and the usual set-theoretic definition of the ordered pair, the ordered pair (x, y) exists and is a set. By Class Comprehension, all relations on sets are classes. Moreover, certain kinds of class relations are one or more of functions, injections, and bijections from one class to another. pairing is an axiom in Zermelo set theory and a theorem in ZFC.
union: For any set x, there is a set which contains exactly the elements of the elements of x.
power set: For any set x, there is a set which contains exactly the subsets of x.
infinity: There exists an inductive set, namely a set x whose members are (i) the empty set; (ii) for every member y of x, y ∪ {y} is also a member of x.
infinity can be formulated so as to imply the existence of the empty set.[2]
The remaining axioms have capitalized names because they are primarily concerned with classes rather than sets. The next two axioms differ from their ZFC counterparts only in that their quantified variables range over classes, not sets:
Extensionality: Classes with the same elements are the same class.
Foundation (Regularity): Each nonempty class is disjoint from one of its elements.
The last two axioms are peculiar to NBG:
Limitation of Size: For any class C, a set x such that x = C exists if and only if there is no bijection between C and the class V of all sets.
From this axiom, due to Von Neumann, Subsets, Replacement, and Global Choice can all be derived. This axiom implies the axiom of global choice because the class of ordinals is not a set; hence there exists a bijection between the ordinals and the universe. If Limitation of Size were weakened to "If the domain of a class function is a set, then the range of that function is likewise a set," then no form of the axiom of choice is an NBG theorem. In this case, any of the usual local forms of Choice may be taken as an added axiom, if desired. Limitation of Size cannot be found in Mendelson (1997) NBG. In its place, we find the usual axiom of choice for sets, and the following form of the axiom schema of replacement: if the class F is a function whose domain is a set, the range of F is also a set.[3]
Class Comprehension schema: For any formula φ(x) containing no quantifiers over classes (it may contain class and set parameters), there is a class A such that ∀x (x ∈ A ↔ φ(x)).
This axiom asserts that invoking the principle of unrestricted comprehension of naive set theory yields a class rather than a set, thereby banishing the paradoxes of set theory. Class Comprehension is the only axiom schema of NBG. In the next section, we show how this schema can be replaced by a number of its own instances. Hence NBG can be finitely axiomatized. If the quantified variables in φ(x) range over classes instead of sets, the result is Morse–Kelley set theory, a proper extension of ZFC which cannot be finitely axiomatized.
Complement: For any class A, the complement V ∖ A = {x : x ∉ A} is a class. Intersection: For any classes A and B, the intersection A ∩ B = {x : x ∈ A and x ∈ B} is a class. These axioms suffice for handling all sentential connectives, because negation and conjunction are a functionally complete set of connectives.
We now turn to quantification. In order to handle multiple variables, we need the ability to represent relations. Define the ordered pair (a, b) as {{a}, {a, b}}, as usual. Note that two applications of pairing to a and b assure that (a, b) is indeed a set.
Products: For any classes A and B, the class A × B exists. (In practice, only V × A is needed.)
Converses: For any class R, the classes Conv1(R) and Conv2(R) exist.
Association: For any class R, the classes Assoc1(R) and Assoc2(R) exist.
These axioms license adding dummy arguments, and rearranging the order of arguments, in relations of any arity. The peculiar form of Association is designed exactly to make it possible to bring any term in a list of arguments to the front (with the help of Converses). We represent the argument list as a pair with the first argument as its first projection and the "tail" of the argument list as the second projection. The idea is to apply Assoc1 until the argument to be brought to the front is second, then apply Conv1 or Conv2 as appropriate to bring the second argument to the front, then apply Assoc2 until the effects of the original applications of Assoc1 (which are now behind the moved argument) are corrected.
Ranges: For any class R, the class rng(R), the range of R, exists. If R is a class considered as a relation, then its range rng(R) is a class. This gives us the existential quantifier; the universal quantifier can be defined in terms of the existential quantifier and negation. The above axioms can reorder the arguments of any relation so as to bring any desired argument to the front of the argument list, where it can be quantified.
Finally, each atomic formula implies the existence of a corresponding class relation:
Membership: The class of all pairs (x, y) with x ∈ y exists.
Diagonal: The class of all pairs (x, x) exists.
Diagonal, together with addition of dummy arguments and rearrangement of arguments, can build a relation asserting the equality of any two of its arguments; thus repeated variables can be handled.
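As a rough illustration of how Ranges supplies the existential quantifier (the notation rng and the particular coding are assumed here for exposition, not taken from the text): if a class R collects the pairs witnessing a condition φ, then its range collects the objects for which a witness exists,

R = \{(y, x) : \varphi(y, x)\} \quad\Longrightarrow\quad \operatorname{rng}(R) = \{x : \exists y\,\varphi(y, x)\},

and the universal quantifier is then obtained by combining this with Complement.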
Mendelson's variant
Mendelson (1997: 230) refers to his axioms B1–B7 of class comprehension as "axioms of class existence." Four of these are identical to axioms already stated above: B1 is Membership; B2, Intersection; B3, Complement; B5, Product. B4 is Ranges modified to assert the existence of the domain of R (by existentially quantifying y instead of x). The last two axioms, B6 and B7, enable what Converses and Association enable: given any class X of ordered triples, there exists another class Y whose members are the members of X, each reordered in the same way.
Discussion
For a discussion of some ontological and other philosophical issues posed by NBG, especially when contrasted with ZFC and MK, see Appendix C of Potter (2004). Even though NBG is a conservative extension of ZFC, a theorem may have a shorter and more elegant proof in NBG than in ZFC (or vice versa). For a survey of known results of this nature, see Pudlak (1998).
Model theory
ZFC, NBG, and MK have models describable in terms of V, the standard model of ZFC and the von Neumann universe. Let the inaccessible cardinal κ be a member of V. Also let Def(X) denote the Δ0 definable subsets of X (see constructible universe). Then: Vκ is an intended model of ZFC; Def(Vκ) is an intended model of NBG; Vκ+1 is an intended model of MK.
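For reference, Def(X) here is, roughly, the collection of subsets of X that are first-order definable over (X, ∈) with parameters from X, the same operation used to build the constructible universe; schematically (a standard rendering, not a quotation):

\operatorname{Def}(X) = \bigl\{\, \{\, y \in X : (X, \in) \models \varphi(y, \bar a) \,\} \;:\; \varphi \text{ a first-order formula},\ \bar a \text{ parameters from } X \,\bigr\}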
Category theory
The ontology of NBG provides scaffolding for speaking about "large objects" without risking paradox. In some developments of category theory, for instance, a "large category" is defined as one whose objects make up a proper class, with the same being true of its morphisms. A "small category", on the other hand, is one whose objects and morphisms are members of some set. We can thus easily speak of the "category of all sets" or "category of all small categories" without risking paradox. Those categories are large, of course. There is no "category of all categories" since it would have to contain the category of small categories, although yet another ontological extension can enable one to talk formally about such a "category" (see for example the "quasicategory of all categories" of Adámek et al. (1990), whose objects and morphisms form a "proper conglomerate"). On whether an ontology including classes as well as sets is adequate for category theory, see Muller (2001).
Notes
[1] Mendelson (1997), p. 232, Prop. 4.4, proves Class Comprehension equivalent to the axioms B1-B7 shown on p. 230 and described below. [2] Mendelson (1997), p. 239, Ex. 4.22(b). [3] Mendelson (1997), p. 239, axiom R.
References
Adámek, Jiří; Herrlich, Horst; and Strecker, George E. (2004) [1990]. Abstract and Concrete Categories (The Joy of Cats) (http://katmat.math.uni-bremen.de/acc/). New York: Wiley & Sons. ISBN 0-471-60922-6. Bernays, Paul (1991). Axiomatic Set Theory. Dover Publications. ISBN 0-486-66637-9. Mendelson, Elliott, 1997. An Introduction to Mathematical Logic, 4th ed. London: Chapman & Hall. ISBN 0-412-80830-7. Pp. 225–86 contain the classic textbook treatment of NBG, showing how it does what we expect of set theory, by grounding relations, order theory, ordinal numbers, transfinite numbers, etc. Richard Montague, 1961, "Semantic Closure and Non-Finite Axiomatizability I," in Infinitistic Methods: Proceedings of the Symposium on Foundations of Mathematics (Warsaw, 2-9 September 1959). Pergamon: 45-69. Muller, F. A., 2001, "Sets, classes, and categories," British Journal of the Philosophy of Science 52: 539-73. Potter, Michael, 2004. Set Theory and Its Philosophy. Oxford Univ. Press. Pudlak, P., 1998, "The lengths of proofs" in Buss, S., ed., Handbook of Proof Theory. North-Holland: 547-637. John von Neumann, 1925, "An Axiomatization of Set Theory." English translation in Jean van Heijenoort, ed., 1967. From Frege to Gödel: A Source Book in Mathematical Logic, 1879-1931. Harvard University Press: 393-413.
External links
von Neumann–Bernays–Gödel set theory (http://planetmath.org/?op=getobj&from=objects&id=4395) on PlanetMath
Note that a set and a class having the same extension are identical. Hence MK is not a two-sorted theory, appearances to the contrary notwithstanding. Foundation: Each nonempty class A is disjoint from at least one of its members.
Class Comprehension: Let φ(x) be any formula in the language of MK in which x is a free variable and Y is not free. φ(x) may contain parameters which are either sets or proper classes. More consequentially, the quantified variables in φ(x) may range over all classes and not just over all sets; this is the only way MK differs from NBG. Then there exists a class whose members are exactly those sets x such that φ(x) comes out true. Formally, if Y is not free in φ: ∃Y ∀x (x ∈ Y ↔ φ(x)).
Pairing: For any sets x and y, there exists a set whose members are exactly x and y.
Pairing licenses the unordered pair {x, y}, in terms of which the ordered pair (x, y) may be defined in the usual way, as {{x}, {x, y}}.
With ordered pairs in hand, Class Comprehension enables defining relations and
functions on sets as sets of ordered pairs, making possible the next axiom: Limitation of Size: C is a proper class if and only if V can be mapped one-to-one into C.
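One way to render this axiom symbolically (a sketch; the article's own formal version, mentioned next, packages the mapping as a class function F):

\neg\,\exists x\,(x = C) \;\leftrightarrow\; \exists F\,\bigl(F \colon V \to C \text{ is one-to-one}\bigr)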
The formal version of this axiom resembles the axiom schema of replacement, and embodies the class function F. The next section explains how Limitation of Size is stronger than the usual forms of the axiom of choice. Power set: Let p be a class whose members are all possible subsets of the set a. Then p is a set.
Union: Let s = ∪a be the sum class of the set a, namely the union of all members of a. Then s is a set.
Infinity: There exists an inductive set y, meaning that (i) the empty set is a member of y; (ii) if x is a member of y, then so is x ∪ {x}.
Note that p and s in Power Set and Union are universally, not existentially, quantified, as Class Comprehension suffices to establish the existence of p and s. Power Set and Union only serve to establish that p and s cannot be proper classes. The above axioms are shared with other set theories as follows: ZFC and NBG: Pairing, Power Set, Union, Infinity; NBG (and ZFC, if quantified variables were restricted to sets): Extensionality, Foundation; NBG: Limitation of Size; ML: Extensionality, Class Comprehension (in NF, comprehension is restricted to stratified formulas).
Discussion
Monk (1980) and Rubin (1967) are set theory texts built around MK; Rubin's ontology includes urelements. These authors and Mendelson (1997: 287) submit that MK does what we expect of a set theory while being less cumbersome than ZFC and NBG. MK is strictly stronger than ZFC and its conservative extension NBG, the other well-known set theory with proper classes. In fact, NBG (and hence ZFC) can be proved consistent if MK is. MK's strength stems from its axiom schema of Class Comprehension being impredicative, meaning that φ(x) may contain quantified variables ranging over classes. The quantified variables in NBG's axiom schema of Class Comprehension are restricted to sets; hence Class Comprehension in NBG must be predicative. (Separation with respect to sets is still impredicative in NBG, because the quantifiers in φ(x) may range over all sets.) The NBG axiom schema of Class Comprehension can be replaced with finitely many of its instances; this is not possible in MK. MK is consistent relative to ZFC augmented by an axiom asserting the existence of strongly inaccessible ordinals. The only advantage of the axiom of limitation of size is that it implies the axiom of global choice. Limitation of Size does not appear in Rubin (1967), Monk (1980), or Mendelson (1997). Instead, these authors invoke a usual form of the local axiom of choice, and an "axiom of replacement,"[1] asserting that if the domain of a class function is a set, its range is also a set. Replacement can prove everything that Limitation of Size proves, except for some form of the axiom of choice. Limitation of Size plus I being a set (hence the universe is nonempty) renders provable the sethood of the empty set; hence there is no need for an axiom of empty set. Such an axiom could be added, of course, and minor perturbations of the
above axioms would necessitate this addition. The set I is not identified with the limit ordinal ω, as I could be a set larger than ω; in this case, the existence of ω would follow from either form of Limitation of Size.
The class of von Neumann ordinals can be well-ordered. It cannot be a set (under pain of paradox); hence that class is a proper class, and all proper classes have the same size as V. Hence V too can be well-ordered. MK can be confused with second-order ZFC, ZFC with second-order logic (representing second-order objects in set rather than predicate language) as its background logic. The language of second-order ZFC is similar to that of MK (although a set and a class having the same extension can no longer be identified), and their syntactical resources for practical proof are almost identical (and are identical if MK includes the strong form of Limitation of Size). But the semantics of second-order ZFC are quite different from those of MK. For example, if MK is consistent then it has a countable first-order model, while second-order ZFC has no countable models.
Model theory
ZFC, NBG, and MK each have models describable in terms of V, the standard model of ZFC and the von Neumann universe. Let the inaccessible cardinal κ be a member of V. Also let Def(X) denote the Δ0 definable subsets of X (see constructible universe). Then: Vκ is an intended model of ZFC; Def(Vκ) is an intended model of NBG; Vκ+1, the power set of Vκ, is an intended model of MK.
History
MK was first set out in an appendix to J. L. Kelley's (1955) General Topology, using the axioms given in the next section. The system of Anthony Morse's (1965) A Theory of Sets is equivalent to Kelley's, but formulated in an idiosyncratic formal language rather than, as is done here, in standard first order logic. The first set theory to include impredicative class comprehension was Quine's ML, which built on New Foundations rather than on ZFC.[2] Impredicative class comprehension was also proposed in Mostowski (1951) and Lewis (1991).
I is a variant of the axiom of extensionality, except that its scope includes proper classes as well as sets. II. Classification (schema): An axiom results if, in "For each β, β ∈ {α : A} if and only if β is a set and B,"
'α' and 'β' are replaced by variables, 'A' by a formula, and 'B' by the formula obtained from that formula by replacing each occurrence of the variable which replaced α by the variable which replaced β.
Develop: Algebra of sets. Existence of the null class and the universal class V. III. Subsets: If x is a set, there exists a set y such that for each z, if z ⊆ x, then z ∈ y.
Develop: V is not a set. Existence of singletons. Separation provable. Proof sketch of Power Set: for any class z which is a subclass of the set x, the class z is a member of the set y whose existence III asserts. Hence z is a set.
IV. Union: If x and y are both sets, then x ∪ y is a set.
The import of IV is that of Pairing above, whose proof sketch goes as follows. The singleton {x} of a set x is a set because it is a subclass of the power set of x (by two applications of III). Then IV implies that {x, y} is a set if x and y are sets.
Develop: Unordered and ordered pairs, relations, functions, domain, range, function composition.
V. Substitution: If f is a [class] function and domain f is a set, then range f is a set.
The import of V is that of the "axiom of replacement" appearing in textbook treatments of NBG and MK. VI. Amalgamation: If x is a set, then ∪x is a set.
The import of VI is that of Union above. V and VI may be combined into one axiom.[3] Develop: Cartesian product, injection, surjection, bijection, order theory. VII. Regularity: If x ≠ ∅, there is a member y of x such that x ∩ y = ∅.
The import of VII is that of Foundation above. Develop: Ordinal numbers, transfinite induction. VIII. Infinity: There exists a set y such that ∅ ∈ y, and x ∪ {x} ∈ y whenever x ∈ y.
VIII asserts the unconditional existence of two sets: the infinite inductive set y, and the null set ∅, which is a set simply because it is a member of y. Up to this point, everything that has been proved to exist is a class, and Kelley's discussion of sets was entirely hypothetical.
Develop: Natural numbers, N is a set, Peano axioms, integers, rational numbers, real numbers.
Definition: c is a choice function if c is a function and c(x) ∈ x for each member x of domain c.
IX. Choice: There exists a choice function c whose domain is V ∖ {∅}.
IX is very similar to the axiom of global choice derivable from Limitation of Size. Develop: Equivalents of the axiom of choice. As is the case with ZFC, the cardinal numbers require some form of this axiom.
Notes
[1] See, e.g., Mendelson (1997), p. 239, axiom R. [2] The locus citandum for ML is the 1951 ed. of Quine's Mathematical Logic. However, the summary of ML given in Mendelson (1997), p. 296, is easier to follow. Mendelson's axiom schema ML2 is identical to the above axiom schema of Class Comprehension. [3] Kelley (1955), p. 261, fn .
References
John L. Kelley 1975 (1955) General Topology. Springer. Earlier ed., Van Nostrand. Appendix, "Elementary Set Theory." Lemmon, E. J. (1986) Introduction to Axiomatic Set Theory. Routledge & Kegan Paul. David K. Lewis (1991) Parts of Classes. Oxford: Basil Blackwell. Mendelson, Elliott (1997). Introduction to Mathematical Logic. Chapman & Hall. ISBN0-534-06624-0. The definitive treatment of the closely related set theory NBG, followed by a page on MK. Harder than Monk or Rubin. Monk, J. Donald (1980) Introduction to Set Theory. Krieger. Easier and less thorough than Rubin.
Morse, A. P., (1965) A Theory of Sets. Academic Press. Mostowski, Andrzej (1950) "Some impredicative definitions in the axiomatic set theory," Fundamenta Mathematicae 37: 111-24. Rubin, Jean E. (1967) Set Theory for the Mathematician. San Francisco: Holden Day. More thorough than Monk; the ontology includes urelements.
External links
From Foundations of Mathematics (FOM) discussion group: Allen Hazen on set theory with classes. (http://www.cs.nyu.edu/pipermail/fom/2004-May/008208.html) Joseph Shoenfield's doubts about MK. (http://www.cs.nyu.edu/pipermail/fom/2000-February/003740.html)
New Foundations
In mathematical logic, New Foundations (NF) is an axiomatic set theory, conceived by Willard Van Orman Quine as a simplification of the theory of types of Principia Mathematica. Quine first proposed NF in a 1937 article titled "New Foundations for Mathematical Logic"; hence the name. Much of this entry discusses NFU, an important variant of NF due to Jensen (1969) and exposited in Holmes (1998)[1] .
This type theory is much less complicated than the one first set out in the Principia Mathematica, which included types for relations whose arguments were not necessarily all of the same type. In 1914, Norbert Wiener showed how to code the ordered pair as a set of sets, making it possible to eliminate relation types in favor of the linear hierarchy of sets described here.
Even the indirect reference to types implicit in the notion of stratification can be eliminated. Theodore Hailperin showed in 1944 that Comprehension is equivalent to a finite conjunction of its instances,[2] so that NF can be finitely axiomatized without any reference to the notion of type. Comprehension may seem to run afoul of problems similar to those in naive set theory, but this is not the case. For example, the impossible Russell class {x : x ∉ x} is not an NF set, because x ∉ x cannot be stratified.
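For orientation, the stratification condition can be stated roughly as follows (a standard formulation, not quoted from this text): a formula is stratified if there is an assignment \sigma of natural numbers to its variables such that

\sigma(y) = \sigma(x) + 1 \ \text{ for every atomic subformula } x \in y, \qquad \sigma(x) = \sigma(y) \ \text{ for every atomic subformula } x = y.

Under any such assignment, x \in x (and hence x \notin x) would force \sigma(x) = \sigma(x) + 1, which is impossible; this is why the Russell class is excluded.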
Ordered pairs
Relations and functions are defined in TST (and in NF and NFU) as sets of ordered pairs in the usual way. The usual definition of the ordered pair, first proposed by Kuratowski in 1921, has a serious drawback for NF and related theories: the resulting ordered pair necessarily has a type two higher than the type of its arguments (its left and right projections). Hence for purposes of determining stratification, a function is three types higher than the members of its field. If one can define a pair in such a way that its type is the same type as that of its arguments (resulting in a type-level ordered pair), then a relation or function is merely one type higher than the type of the members of its field. Hence NF and related theories usually employ Quine's set-theoretic definition of the ordered pair, which yields a type-level ordered pair. Holmes (1998) takes the ordered pair and its left and right projections as primitive. Fortunately, whether the ordered pair is type-level by definition or by assumption (i.e., taken as primitive) usually does not matter. The existence of a type-level ordered pair implies Infinity, and NFU + Infinity interprets NFU + "there is a type level ordered pair" (they are not quite the same theory, but the differences are inessential). Conversely, NFU + Infinity + Choice proves the existence of a type-level ordered pair.
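A quick type count may make the difference concrete (a sketch using the standard conventions, not a quotation): if a and b both have type n, then the Kuratowski pair

(a, b) = \{\{a\}, \{a, b\}\} \quad\text{has type } n + 2,

so a relation, being a set of such pairs, sits three types above the members of its field; with a type-level pair, (a, b) has type n and a relation sits only one type above the members of its field.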
Finite axiomatizability
New Foundations can be finitely axiomatized. Two such formulations are given here [3].
Cartesian closure
Unfortunately, the category whose objects are the sets of NF and whose morphisms are the functions between those sets is not cartesian closed;[4] this is a highly desirable property for any set theory to have. Intuitively, it means that the functions of NF do not curry as one would normally expect functions to. Furthermore, it means that NF is not a topos.
asserted by any instance of Comprehension. Quine presumably constructed NF with this paradox uppermost in mind. Cantor's paradox of the largest cardinal number exploits the application of Cantor's theorem to the universal set. Cantor's theorem says (given ZFC) that the power set P(A) of any set A is larger than A (there can be no injection (one-to-one map) from P(A) into A). Now of course there is an injection from P(V) into V, if V is the universal set! The resolution requires that we observe that |A| < |P(A)| makes no sense in the theory of types: the type of P(A) is one higher than the type of A. The correctly typed version (which is a theorem in the theory of types for essentially the same reasons that the original form of Cantor's theorem works in ZF) is |P1(A)| < |P(A)|, where P1(A) is the set of one-element subsets of A. The specific instance of this theorem that interests us is |P1(V)| < |P(V)|: there are fewer one-element sets than sets (and so fewer one-element sets than general objects, if we are in NFU). The "obvious" bijection x ↦ {x} from the universe to the one-element sets is not a set; it is not a set because its definition is unstratified. Note that in all known models of NFU it is the case that |P1(V)| < |P(V)| < |V|; Choice allows one to prove not only that there are urelements but that there are many cardinals between |P(V)| and |V|. We now introduce some useful notions. A set A which satisfies the intuitively appealing condition |A| = |P1(A)| is said to be cantorian: a cantorian set satisfies the usual form of Cantor's theorem. A set A which satisfies the further condition that the restriction of the singleton map x ↦ {x} to A is itself a set is not only a cantorian set but strongly
cantorian.
The Burali-Forti paradox of the largest ordinal number goes as follows. We define (following naive set theory) the ordinals as equivalence classes of well-orderings under similarity. There is an obvious natural well-ordering on the ordinals; since it is a well-ordering it belongs to an ordinal Ω. It is straightforward to prove (by transfinite induction) that the order type of the natural order on the ordinals less than a given ordinal α is α itself. But this means that Ω is the order type of the ordinals less than Ω, and so is strictly less than the order type of all the ordinals; but the latter is, by definition, Ω itself!
The solution to the paradox in NF(U) starts with the observation that the order type of the natural order on the ordinals less than α is of a higher type than α: it is two types higher with a type-level ordered pair, and four types higher with the usual Kuratowski ordered pair. For any order type α, we can define an order type T(α) one type higher: if W is a well-ordering of order type α, then T(α) is the order type of the well-ordering obtained from W by replacing each element of its field with its singleton. The triviality of the T operation is only a seeming one; it is easy to show that T is a strictly monotone (order preserving) operation on the ordinals. We can now restate the lemma on order types in a stratified manner: the order type of the natural order on the ordinals less than α is T²(α) or T⁴(α), depending on which pair is used (we assume the type-level pair hereinafter). From this we deduce that the order type of the ordinals less than Ω is T²(Ω), from which we deduce T²(Ω) < Ω.
Hence the T operation is not a function; we cannot have a strictly monotone set map from ordinals to ordinals which sends an ordinal downward! Since T is monotone, we have Ω > T²(Ω) > T⁴(Ω) > …, a "descending sequence" in the ordinals which cannot be a set. Some have asserted that this result shows that no model of NF(U) is "standard", since the ordinals in any model of NFU are externally not well-ordered. We do not take a position on this, but we note that it is also a theorem of NFU that any set model of NFU has non-well-ordered "ordinals"; NFU does not conclude that the universe V is a model of NFU, despite V being a set, because the membership relation is not a set relation. For a further development of mathematics in NFU, with a comparison to the development of the same in ZFC, see implementation of mathematics in set theory.
The set theory of the 1940 first edition of Quine's Mathematical Logic married NF to the proper classes of NBG set theory, and included an axiom schema of unrestricted comprehension for proper classes. In 1942, J. Barkley Rosser proved that the system presented in Mathematical Logic was subject to the Burali-Forti paradox. This result does not apply to NF. In 1950, Hao Wang showed how to amend Quine's axioms so as to avoid this problem, and Quine included the resulting axiomatization in the 1951 second and final edition of Mathematical Logic.
Models of NFU
There is a fairly simple method for producing models of NFU in bulk. Using well-known techniques of model theory, one can construct a nonstandard model of Zermelo set theory (nothing nearly as strong as full ZFC is needed for the basic technique) on which there is an external automorphism j (not a set of the model) which moves a rank Vα of the cumulative hierarchy of sets. We may suppose without loss of generality that j(α) < α. We talk about the automorphism moving the rank rather than the ordinal because we do not want to assume that every ordinal in the model is the index of a rank. The domain of the model of NFU will be the nonstandard rank Vα; the membership relation of the model of NFU is defined in terms of j and the rank V(j(α)+1) (a sketch of the standard definition is given after this passage).
We now prove that this actually is a model of NFU. Let φ be a stratified formula in the language of NFU. Choose an assignment of types to all variables in the formula which witnesses the fact that it is stratified. Choose a natural number N greater than all types assigned to variables by this stratification. Expand the formula φ into a formula φ1 in the language of the nonstandard model of Zermelo set theory with automorphism j, using the definition of membership in the model of NFU. Application of any power of j to both sides of an equation or membership statement preserves its truth value because j is an automorphism. Make such an application to each atomic formula in φ1 in such a way that each variable x assigned type i occurs with exactly N − i applications of j. This is possible thanks to the form of the atomic membership statements derived from NFU membership statements, and to the formula being stratified. Each quantified sentence in which the bound variable appears under a power of j can be converted to a form in which that power of j is applied instead to the domain over which the variable ranges (and similarly for existential quantifiers). Carry out this transformation everywhere and obtain a formula φ2 in which j is never applied to a bound variable. Choose any free variable y in φ assigned type i. Apply a suitable power of j uniformly to the entire formula to obtain a formula φ3 in which y appears without any application of j. Now the collection of those y in Vα satisfying φ3 exists as a set of the nonstandard model (because j appears applied only to free variables and constants), belongs to the rank above Vα, and contains exactly those y which satisfy the original formula φ in the model of NFU. Its image under j has this extension in the model of NFU (the application of j corrects for the different definition of membership in the model of NFU). This establishes that Stratified Comprehension holds in the model of NFU.
To see that weak Extensionality holds is straightforward: each nonempty element of V(j(α)+1) inherits a unique extension from the nonstandard model, the empty set inherits its usual extension as well, and all other objects are urelements. The basic idea is that the automorphism j codes the "power set" of our "universe" into its externally isomorphic copy inside our "universe." The remaining objects not coding subsets of the universe are treated
as urelements. If α is a natural number n, we get a model of NFU which claims that the universe is finite (it is externally infinite, of course). If α is infinite and Choice holds in the nonstandard model of ZFC, we obtain a model of NFU + Infinity + Choice.
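For concreteness, the membership relation of such a model is usually defined along the following lines (this is the standard presentation of the construction; the notation is assumed rather than quoted from the text):

x \in_{\mathrm{NFU}} y \quad\Longleftrightarrow\quad j(x) \in y \;\wedge\; y \in V_{j(\alpha)+1},

with the domain of the model being the rank V_{\alpha}; the elements of V_{\alpha} lying outside V_{j(\alpha)+1} are exactly the objects that become urelements.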
of the base "type" with subsets of the base type), one can define embeddings from each "type" into its successor in a natural way. This can be generalized to transfinite sequences with care. Note that the construction of such sequences of sets is limited by the size of the type in which they are being constructed; this prevents TSTU from proving its own consistency (TSTU + Infinity can prove the consistency of TSTU; to prove the consistency of TSTU+Infinity one needs a type containing a set of cardinality , which cannot be proved to exist in TSTU+Infinity without stronger assumptions). Now the same results of model theory can be used to build a model of NFU and verify that it is a model of NFU in much the same way, with the 's being used in place of in the usual construction. The final move is to observe that since NFU is consistent, we can drop the use of absolute types in our metatheory, bootstrapping the metatheory from TSTU to NFU.
cantorian (by a usual abuse of language, we refer to such cardinals as "cantorian cardinals"). It is straightforward to show that the assertion that each natural number is cantorian is equivalent to the assertion that the set of all natural numbers is strongly cantorian. Counting is consistent with NFU, but increases its consistency strength noticeably; not, as one would expect, in the area of arithmetic, but in higher set theory. NFU + Infinity proves that each ℶn exists, but not that ℶω exists; NFU + Counting (easily) proves Infinity, and further proves the existence of ℶω (see beth numbers). Counting implies immediately that one does not need to assign types to variables restricted to the set of natural numbers for purposes of stratification; it is a theorem that the power set of a strongly cantorian set is strongly cantorian, so it is further not necessary to assign types to variables restricted to any iterated power set of the natural numbers, or to such familiar sets as the set of real numbers, the set of functions from reals to reals, and so forth. The set-theoretical strength of Counting is less important in practice than the convenience of not having to annotate variables known to have natural number values (or related kinds of values) with singleton brackets, or to apply the T operation in order to get stratified set definitions. Counting implies Infinity; each of the axioms below needs to be adjoined to NFU + Infinity to get the effect of strong variants of Infinity; Ali Enayat has investigated the strength of some of these axioms in models of NFU + "the universe is finite". A model of the kind constructed above satisfies Counting just in case the automorphism j fixes all natural numbers in the underlying nonstandard model of Zermelo set theory. The next strong axiom we consider is the Axiom of strongly cantorian separation: For any strongly cantorian set A and any formula φ (not necessarily stratified!), the set {x ∈ A | φ} exists. Immediate consequences include Mathematical Induction for unstratified conditions (which is not a consequence of Counting; many but not all unstratified instances of induction on the natural numbers follow from Counting). This axiom is surprisingly strong. Unpublished work of Robert Solovay shows that the consistency strength of the theory NFU* = NFU + Counting + Strongly Cantorian Separation is the same as that of Zermelo set theory + Replacement. This axiom holds in a model of the kind constructed above (with Choice) if the ordinals which are fixed by j and dominate only ordinals fixed by j in the underlying nonstandard model of Zermelo set theory are standard, and the power set of any such ordinal in the model is also standard. This condition is sufficient but not necessary. Next is the Axiom of Cantorian Sets: Every cantorian set is strongly cantorian. This very simple and appealing assertion is extremely strong. Solovay has shown the precise equivalence of the consistency strength of the theory NFUA = NFU + Infinity + Cantorian Sets with that of ZFC + a schema asserting the existence of an n-Mahlo cardinal for each concrete natural number n. Ali Enayat has shown that the theory of cantorian equivalence classes of well-founded extensional relations (which gives a natural picture of an initial segment of the cumulative hierarchy of ZFC) interprets the extension of ZFC with n-Mahlo cardinals directly.
A permutation technique can be applied to a model of this theory to give a model in which the hereditarily strongly cantorian sets with the usual membership relation model the strong extension of ZFC. This axiom holds in a model of the kind constructed above (with Choice) just in case the ordinals fixed by j in the underlying nonstandard model of ZFC are an initial (proper class) segment of the ordinals of the model. Next consider the
Axiom of Cantorian Separation: For any cantorian set A and any formula φ (not necessarily stratified!), the set {x ∈ A | φ} exists. This combines the effect of the two preceding axioms and is actually even stronger (precisely how is not known). Unstratified mathematical induction enables proving that there are n-Mahlo cardinals for every n, given Cantorian Sets, which gives an extension of ZFC that is even stronger than the previous one, which only asserts that there are n-Mahlos for each concrete natural number (leaving open the possibility of nonstandard counterexamples). This axiom will hold in a model of the kind described above if every ordinal fixed by j is standard, and every power set of an ordinal fixed by j is also standard in the underlying model of ZFC. Again, this condition is sufficient but not necessary. An ordinal is said to be cantorian if it is fixed by T, and strongly cantorian if it dominates only cantorian ordinals (this implies that it is itself cantorian). In models of the kind constructed above, cantorian ordinals of NFU correspond to ordinals fixed by j (they are not the same objects because different definitions of ordinal numbers are used in the two theories). Equal in strength to Cantorian Sets is the Axiom of Large Ordinals: For each noncantorian ordinal α, there is a natural number n such that Tⁿ(Ω) ≤ α. Recall that Ω
is the order type of the natural order on all ordinals. This only implies Cantorian Sets if we have Choice (but it is at that level of consistency strength in any case). It is remarkable that one can even define Tⁿ(Ω): this is the nth term of any finite sequence of ordinals s of length n such that the first term is Ω and each later term is T applied to its predecessor, for each appropriate i. This definition is completely unstratified. The uniqueness of Tⁿ(Ω) can be proved (for each n for which it exists), and a certain amount of common-sense reasoning about this notion can be carried out, enough to show that Large Ordinals implies Cantorian Sets in the presence of Choice. In spite of the knotty formal statement of this axiom, it is a very natural assumption, amounting to making the action of T on the ordinals as simple as possible. A model of the kind constructed above will satisfy Large Ordinals if the ordinals moved by j are exactly the ordinals which dominate some iterated image of α under j in the underlying nonstandard model of ZFC.
Axiom of Small Ordinals: For any formula φ, there is a set A such that the elements of A which are strongly cantorian ordinals are exactly the strongly cantorian ordinals such that φ. Solovay has shown the precise equivalence in consistency strength of NFUB = NFU + Infinity + Cantorian Sets + Small Ordinals with Morse–Kelley set theory plus the assertion that the proper class ordinal (the class of all ordinals) is a weakly compact cardinal. This is very strong indeed! Moreover, NFUB-, which is NFUB with Cantorian Sets omitted, is easily seen to have the same strength as NFUB. A model of the kind constructed above will satisfy this axiom if every collection of ordinals fixed by j is the intersection of some set of ordinals with the ordinals fixed by j, in the underlying nonstandard model of ZFC. Even stronger is the theory NFUM = NFU + Infinity + Large Ordinals + Small Ordinals. This is equivalent to Morse–Kelley set theory with a predicate on the classes which is a κ-complete nonprincipal ultrafilter on the proper class ordinal κ; in effect, this is Morse–Kelley set theory + "the proper class ordinal is a measurable cardinal"! The technical details here are not the main point, which is that reasonable and natural (in the context of NFU) assertions turn out to be equivalent in power to very strong axioms of infinity in the ZFC context. This fact is related to the correlation between the existence of models of NFU, described above and satisfying these axioms, and the existence of models of ZFC with automorphisms having special properties.
Notes
[1] Holmes, Randall, 1998. Elementary Set Theory with a Universal Set (http://math.boisestate.edu/~holmes/holmes/head.pdf). Academia-Bruylant. The publisher has graciously consented to permit diffusion of this introduction to NFU via the web. Copyright is reserved. [2] Hailperin, T., "A set of axioms for logic," Journal of Symbolic Logic 9, pp. 1-19. [3] http://www.cedar-forest.org/forest/events/history/NF_at_History_Workshop.ps [4] http://www.dpmms.cam.ac.uk/~tf/cartesian-closed.pdf
References
Crabbé, Marcel, 1982, "On the consistency of an impredicative fragment of Quine's NF," The Journal of Symbolic Logic 47: 131-136. Jensen, R. B., 1969, "On the Consistency of a Slight(?) Modification of Quine's NF," Synthese 19: 250-63. With discussion by Quine. Quine, W. V., 1980, "New Foundations for Mathematical Logic" in From a Logical Point of View, 2nd ed., revised. Harvard Univ. Press: 80-101. The definitive version of where it all began, namely Quine's 1937 paper in the American Mathematical Monthly.
External links
http://math.stanford.edu/~feferman/papers/ess.pdf Stanford Encyclopedia of Philosophy: Quine's New Foundations (http://plato.stanford.edu/entries/quine-nf) by Thomas Forster. Alternative axiomatic set theories (http://setis.library.usyd.edu.au/stanford/entries/settheory-alternative/) by Randall Holmes. Randall Holmes: New Foundations Home Page. (http://math.boisestate.edu/~holmes/holmes/nf.html) Randall Holmes: Bibliography of Set Theory with a Universal Set. (http://math.boisestate.edu/~holmes/holmes/setbiblio.html) Randall Holmes: Symmetry as a Criterion for Comprehension Motivating Quine's New Foundations (http://dx.doi.org/10.1007/s11225-008-9107-8)
ZU etc.
Preliminaries
This section and the next follow Part I of Potter (2004) closely. The background logic is first-order logic with identity. The ontology includes urelements as well as sets, simply to allow the set theories described in this entry to have models that are not purely mathematical in nature. The urelements serve no essential mathematical purpose. Some terminology peculiar to Potter's set theory: a is a collection if a = {x : x ∈ a}. All sets are collections, but not all collections are sets. The accumulation of a, acc(a), is the set {x : x is an urelement or ∃b ∈ a (x ∈ b or x ⊆ b)}. If ∀U ∈ V (U = acc(V ∩ U)), then V is a history. A level is the accumulation of a history. An initial level has no other levels as members. A limit level is a level that is neither the initial level nor the level above any other level. The birthday of set a, denoted V(a), is the lowest level V such that a ⊆ V.
Axioms
The following three axioms define the theory ZU. Creation: ∀V ∃V′ (V ∈ V′). Remark: There is no highest level, hence there are infinitely many levels. This axiom establishes the ontology of levels. Separation: An axiom schema. For any first-order formula φ(x) with (bound) variables ranging over the level V, the collection {x ∈ V : φ(x)} is also a set. (See Axiom schema of separation.) Remark: Given the levels established by Creation, this schema establishes the existence of sets and how to form them. It tells us that a level is a set, and all subsets, definable via first-order logic, of levels are also sets. This schema can be seen as an extension of the background logic. Infinity: There exists at least one limit level. (See Axiom of infinity.) Remark: Among the sets Separation allows, at least one is infinite. This axiom is primarily mathematical, as there is no need for the actual infinite in other human contexts, the human sensory order being necessarily finite. For mathematical purposes, the axiom "There exists an inductive set" would suffice.
Discussion
The Von Neumann universe implements the "iterative conception of set" by stratifying the universe of sets into a series of "levels," with the sets at a given level being the members of the sets making up the next higher level. Hence the levels form a nested and well-ordered sequence, and would form a hierarchy if set membership were transitive. The resulting iterative conception steers clear, in a well-motivated way, of the well-known paradoxes of Russell, Burali-Forti, and Cantor. These paradoxes all result from the unrestricted use of the principle of comprehension that naive set theory allows. Collections such as "the class of all sets" or "the class of all ordinals" include sets from all levels of the hierarchy. Given the iterative conception, such collections cannot form sets at any given level of the hierarchy and thus cannot be sets at all. The iterative conception has gradually become more accepted over time, despite an imperfect understanding of its historical origins. Boolos's (1989) axiomatic treatment of the iterative conception is his set theory S, a two sorted first order theory involving sets and levels.
Scott's theory
Scott (1974) did not mention the "iterative conception of set," instead proposing his theory as a natural outgrowth of the simple theory of types. Nevertheless, Scott's theory can be seen as an axiomatization of the iterative conception and the associated iterative hierarchy. Scott began with an axiom he declined to name: the atomic formula x ∈ y implies that y is a set; in symbols, ∀x ∀y ∃a [x ∈ y → y = a]. His axiom of Extensionality and axiom schema of Comprehension (Separation) are strictly analogous to their ZF counterparts and so do not mention levels. He then invoked two axioms that do mention levels: Accumulation. A given level "accumulates" all members and subsets of all earlier levels. See the above definition of accumulation. Restriction. All collections belong to some level. Restriction also implies the existence of at least one level and assures that all sets are well-founded. Scott's final axiom, the Reflection schema, is identical to the above existence premise bearing the same name, and likewise does duty for ZF's Infinity and Replacement. Scott's system has the same strength as ZF.
Potter's theory
Potter (1990, 2004) introduced the idiosyncratic terminology described earlier in this entry, and discarded or replaced all of Scott's axioms except Reflection; the result is ZU. Russell's paradox is Potter's (2004) first theorem, and his very easy proof thereof requires no set theory axioms. Thus Potter establishes from the very outset the need for a more restricted kind of collection, namely sets, that steers clear of Russell's paradox. ZU, like ZF, cannot be finitely axiomatized. ZU differs from ZFC in that it: Includes no axiom of extensionality because the usual extensionality principle follows from the definition of collection and an easy lemma; Admits nonwellfounded sets. However Potter (2004) never invokes such sets, and no theorem in Potter would be overturned were Foundation or its equivalent added to ZU; Includes no equivalents of Choice or the axiom schema of Replacement. Hence ZU is equivalent to the Zermelo set theory of 1908, namely ZFC minus Choice, Replacement, and Foundation. The remaining differences between ZU and ZFC are mainly expositional. What is the strength of ZfU and ZFU relative to Z, ZF, and ZFC? The natural numbers are not defined as a particular set within the iterative hierarchy, but as models of a "pure" Dedekind algebra. "Dedekind algebra" is Potter's name for a set closed under a unary injective operation, successor, whose domain contains a unique element, zero, absent from its range. Because all Dedekind algebras with the lowest possible birthdays are categorical (all models are isomorphic), any such algebra can proxy for the natural numbers. The Frege–Russell definitions of the cardinal and ordinal numbers work in Scott–Potter set theory, because the equivalence classes these definitions require are indeed sets. Thus in ZU an equivalence class of: Equinumerous sets from a common level is a cardinal number; Isomorphic well-orderings, also from a common level, is an ordinal number. In ZFC, defining the cardinals and ordinals in this fashion gives rise to the Cantor and Burali-Forti paradoxes, respectively. Although Potter (2004) devotes an entire appendix to proper classes, the strength and merits of Scott–Potter set theory relative to the well-known rivals to ZFC that admit proper classes, namely NBG and Morse–Kelley set theory, have yet to be explored.
Scott–Potter set theory resembles NFU in that the latter is a recently devised axiomatic set theory admitting both urelements and sets that are not well-founded. But the urelements of NFU, unlike those of ZU, play an essential role; they and the resulting restrictions on Extensionality make possible a proof of NFU's consistency relative to Peano arithmetic. But nothing is known about the strength of NFU relative to Creation+Separation, NFU+Infinity relative to ZU, and of NFU+Infinity+Countable Choice relative to ZU+Countable Choice. Unlike nearly all writing on set theory in recent decades, Potter (2004) mentions mereological fusions. His collections are also synonymous with the "virtual sets" of Willard Quine and Richard Milton Martin: entities arising from the free use of the principle of comprehension that can never be admitted to the universe of discourse.
References
George Boolos, 1971, "The iterative conception of set," Journal of Philosophy 68: 215–31. Reprinted in Boolos 1999. Logic, Logic, and Logic. Harvard Univ. Press: 13-29. --------, 1989, "Iteration Again," Philosophical Topics 42: 5-21. Reprinted in Boolos 1999. Logic, Logic, and Logic. Harvard Univ. Press: 88-104. Potter, Michael, 1990. Sets: An Introduction. Oxford Univ. Press. ------, 2004. Set Theory and its Philosophy. Oxford Univ. Press. Dana Scott, 1974, "Axiomatizing set theory" in Jech, Thomas J., ed., Axiomatic Set Theory II, Proceedings of Symposia in Pure Mathematics 13. American Mathematical Society: 207–14.
External links
Reviews of Potter (2004): Bays, Timothy, 2005, "Review, [1]" Notre Dame Philosophical Reviews. Uzquiano, Gabriel, 2005, "Review, [2]" Philosophia Mathematica 13: 308-46.
References
[1] http://ndpr.nd.edu/review.cfm?id=2141 [2] http://philmat.oxfordjournals.org/cgi/content/full/13/3/308
equality formulas and closed under conjunction, disjunction, existential and universal quantification; that is, a formula in predicate logic using only ∈, =, ∧, ∨, ∃ and ∀, where the universal quantifier may be bounded in a set). Typically, the motivation for these theories is topological: the sets are the classes which are closed under a certain topology. The closure conditions for the various constructions allowed in building positive formulas are readily motivated (and one can further justify the use of universal quantifiers bounded in sets to get generalized positive comprehension): the justification of the existential quantifier seems to require that the topology be compact. The set theory of Olivier Esser consists of the following axioms:
The axiom of extensionality: sets with the same elements are the same set.
The axiom of empty set: there exists a set ∅ with no members (this axiom can be neatly dispensed with if a false formula is included as a positive formula).
The axiom of generalized positive comprehension: if φ is a generalized positive formula, then the set of all x such that φ exists. Note that negation is specifically not permitted.
The axiom of closure: for every formula φ(x), a set exists which is the intersection of all sets which contain every x such that φ(x); this set is called the closure of {x : φ(x)} and is written in any of the various ways that topological closures can be presented. This can be put more briefly if class language is allowed (any condition on sets defining a class as in NBG): for any class C there is a set which is the intersection of all sets which contain C as a subclass. This is obviously a reasonable principle if the sets are understood as closed classes in a topology.
The axiom of infinity: the von Neumann ordinal ω exists. This is not an axiom of infinity in the usual sense; if Infinity does not hold, the closure of ω exists and has itself as its sole additional member (it is certainly infinite); the point of this axiom is that ω contains no additional elements at all, which boosts the theory from the strength of second order arithmetic to the strength of Morse–Kelley set theory with the proper class ordinal a weakly compact cardinal.
Interesting properties
The universal set is a proper set in this theory. The sets of this theory are the collections of sets which are closed under a certain topology on the classes. The theory can interpret ZFC (by restricting oneself to the class of well-founded sets, which is not itself a set). It in fact interprets a stronger theory (Morse-Kelley set theory with the proper class ordinal a weakly compact cardinal).
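As a simple illustration of how far comprehension reaches here (a standard observation about such theories, stated as a sketch): the formula x = x is positive, so generalized positive comprehension immediately yields the universal set

V = \{x : x = x\}.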
Researchers
Isaac Malitz originally introduced Positive Set Theory in his 1976 PhD thesis at UCLA. Alonzo Church was the chairman of the committee supervising the aforementioned thesis. Olivier Esser seems to be the most active in this field.
References
Esser, Olivier (1999), "On the consistency of a positive theory.", MLQ Math. Log. Q. 45 (1): 105116, doi:10.1002/malq.19990450110, MR1669902
Axiom of choice
In mathematics, the axiom of choice, or AC, is an axiom of set theory stating that for every family (Si)i∈I of nonempty sets there exists a family (xi)i∈I of elements with xi ∈ Si for every i ∈ I. Informally put, the axiom of choice says that given any collection of bins, each containing at least one object, it is possible to make a selection of exactly one object from each bin. In many cases such a selection can be made without invoking the axiom of choice; this is in particular the case if the number of bins is finite, or if a selection rule is available: a distinguishing property that happens to hold for exactly one object in each bin. For example, for any (even infinite) collection of pairs of shoes, one can pick out the left shoe from each pair to obtain an appropriate selection, but for an infinite collection of pairs of socks (assumed to have no distinguishing features), such a selection can be obtained only by invoking the axiom of choice. The axiom of choice was formulated in 1904 by Ernst Zermelo.[1] Although originally controversial, it is now used without reservation by most mathematicians.[2] One motivation for this use is that a number of important mathematical results, such as Tychonoff's theorem, require the axiom of choice for their proofs. Contemporary set theorists also study axioms that are not compatible with the axiom of choice, such as the axiom of determinacy. Unlike the axiom of choice, these alternatives are not ordinarily proposed as axioms for mathematics, but only as principles in set theory with interesting consequences.
Statement
A choice function is a function f, defined on a collection X of nonempty sets, such that for every set s in X, f(s) is an element of s. With this concept, the axiom can be stated: For any set X of nonempty sets, there exists a choice function f defined on X. Thus the negation of the axiom of choice states that there exists a set of nonempty sets which has no choice function. Each choice function on a collection X of nonempty sets is an element of the Cartesian product of the sets in X. This is not the most general situation of a Cartesian product of a family of sets, where a same set can occur more than once as a factor; however, one can focus on elements of such a product that select the same element every time a given set appears as factor, and such elements correspond to an element of the Cartesian product of all distinct sets in the family. The axiom of choice asserts the existence of such elements; it is therefore equivalent to: Given any family of nonempty sets, their Cartesian product is a nonempty set.
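In symbols, with choice functions as just defined, the axiom and the product formulation can be written as follows (a routine rendering, not a quotation):

\forall X\,\Bigl[\varnothing \notin X \;\rightarrow\; \exists f \colon X \to \textstyle\bigcup X \ \ \forall A \in X\,\bigl(f(A) \in A\bigr)\Bigr], \qquad \prod_{A \in X} A \neq \varnothing.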
Variants
There are many other equivalent statements of the axiom of choice. These are equivalent in the sense that, in the presence of other basic axioms of set theory, they imply the axiom of choice and are implied by it. One variation avoids the use of choice functions by, in effect, replacing each choice function with its range. Given any set X of pairwise disjoint non-empty sets, there exists at least one set C that contains exactly one element in common with each of the sets in X.[3] Another equivalent axiom only considers collections X that are essentially powersets of other sets: For any set A, the power set of A (with the empty set removed) has a choice function.
Authors who use this formulation often speak of the choice function on A, but be advised that this is a slightly different notion of choice function. Its domain is the powerset of A (with the empty set removed), and so makes sense for any set A, whereas with the definition used elsewhere in this article, the domain of a choice function on a collection of sets is that collection, and so only makes sense for sets of sets. With this alternate notion of choice function, the axiom of choice can be compactly stated as Every set has a choice function.[4] which is equivalent to For any set A there is a function f such that for any non-empty subset B of A, f(B) lies in B. The negation of the axiom can thus be expressed as: There is a set A such that for all functions f (on the set of non-empty subsets of A), there is a B such that f(B) does not lie in B.
Usage
Until the late 19th century, the axiom of choice was often used implicitly, although it had not yet been formally stated. For example, after having established that the set X contains only non-empty sets, a mathematician might have said "let F(s) be one of the members of s for all s in X." In general, it is impossible to prove that F exists without the axiom of choice, but this seems to have gone unnoticed until Zermelo. Not every situation requires the axiom of choice. For finite sets X, the axiom of choice follows from the other axioms of set theory. In that case it is equivalent to saying that if we have several (a finite number of) boxes, each containing at least one item, then we can choose exactly one item from each box. Clearly we can do this: We start at the first box, choose an item; go to the second box, choose an item; and so on. The number of boxes is finite, so eventually our choice procedure comes to an end. The result is an explicit choice function: a function that takes the first box to the first element we chose, the second box to the second element we chose, and so on. (A formal proof for all finite sets would use the principle of mathematical induction to prove "for every natural number k, every family of k nonempty sets has a choice function.") This method cannot, however, be used to show that every countable family of nonempty sets has a choice function, as is asserted by the axiom of countable choice. If the method is applied to an infinite sequence (Xi : i ∈ ω) of nonempty sets, a function is obtained at each finite stage, but there is no stage at which a choice function for the entire family is constructed, and no "limiting" choice function can be constructed, in general, in ZF without the axiom of choice.
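The induction alluded to in parentheses can be sketched as follows (an outline, not a quotation): for k = 0 the empty function is vacuously a choice function; and if every family of k nonempty sets has a choice function, then given a family of k + 1 nonempty sets one removes a set A, takes a choice function g for the remaining k sets, picks some a ∈ A (which exists because A is nonempty, by ordinary first-order reasoning), and sets

f = g \cup \{(A, a)\}.

No appeal to the axiom of choice is needed at any step.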
Examples
The nature of the individual nonempty sets in the collection may make it possible to avoid the axiom of choice even for certain infinite collections. For example, suppose that each member of the collection X is a nonempty subset of the natural numbers. Every such subset has a smallest element, so to specify our choice function we can simply say that it maps each set to the least element of that set. This gives us a definite choice of an element from each set, and makes it unnecessary to apply the axiom of choice.
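Written out, the choice function described here is simply

f(A) = \min A \quad \text{for every nonempty } A \subseteq \mathbb{N},

which is well defined precisely because the usual order on the natural numbers is a well-ordering; no choice principle is invoked.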
The difficulty appears when there is no natural choice of elements from each set. If we cannot make explicit choices, how do we know that our set exists? For example, suppose that X is the set of all non-empty subsets of the real numbers. First we might try to proceed as if X were finite. If we try to choose an element from each set, then, because X is infinite, our choice procedure will never come to an end, and consequently, we will never be able to produce a choice function for all of X. Next we might try specifying the least element from each set. But some subsets of the real numbers do not have least elements. For example, the open interval (0,1) does not have a least element: if x is in (0,1), then so is x/2, and x/2 is always strictly smaller than x. So this attempt also fails. Additionally, consider for instance the unit circle S, and the action on S by a group G consisting of all rational rotations. Namely, these are rotations by angles which are rational multiples of π. Here G is countable while S is uncountable. Hence S breaks up into uncountably many orbits under G. Using the axiom of choice, we could pick a single point from each orbit, obtaining an uncountable subset X of S with the property that all of its translates by G are disjoint from X. In other words, the circle gets partitioned into a countable collection of disjoint sets, which are all pairwise congruent. Now it is easy to convince oneself that the set X could not possibly be measurable for a countably additive measure. Hence one couldn't expect to find an algorithm to find a point in each orbit, without using the axiom of choice. See non-measurable set#Example for more details.
Nonconstructive aspects
A proof requiring the axiom of choice is nonconstructive: even though the proof establishes the existence of an object, it may be impossible to define the object in the language of set theory. For example, while the axiom of choice implies that there is a well-ordering of the real numbers, there are models of set theory with the axiom of choice in which no well-ordering of the reals is definable. As another example, a subset of the real numbers that is not Lebesgue measurable can be proven to exist using the axiom of choice, but it is consistent that no such set is definable. The axiom of choice produces these intangibles (objects that are proven to exist by a nonconstructive proof, but cannot be explicitly constructed), which may conflict with some philosophical principles. Because there is no canonical well-ordering of all sets, a construction that relies on a well-ordering may not produce a canonical result, even if a canonical result is desired (as is often the case in category theory).

In constructivism, all existence proofs are required to be totally explicit. That is, one must be able to construct, in an explicit and canonical manner, anything that is proven to exist. This foundation rejects the full axiom of choice because it asserts the existence of an object without uniquely determining its structure. In fact the Diaconescu–Goodman–Myhill theorem shows how to derive the constructively unacceptable law of the excluded middle, or a restricted form of it, in constructive set theory from the assumption of the axiom of choice.

Another argument against the axiom of choice is that it implies the existence of counterintuitive objects. One example is the Banach–Tarski paradox, which says that it is possible to decompose ("carve up") the 3-dimensional solid unit ball into finitely many pieces and, using only rotations and translations, reassemble the pieces into two solid balls, each with the same volume as the original. The pieces in this decomposition, constructed using the axiom of choice, are non-measurable sets.

The majority of mathematicians accept the axiom of choice as a valid principle for proving new results in mathematics. The debate is interesting enough, however, that it is considered of note when a theorem in ZFC (ZF
plus AC) is logically equivalent (with just the ZF axioms) to the axiom of choice, and mathematicians look for results that require the axiom of choice to be false, though this type of deduction is less common than the type which requires the axiom of choice to be true.

It is possible to prove many theorems using neither the axiom of choice nor its negation; this is common in constructive mathematics. Such statements will be true in any model of Zermelo–Fraenkel set theory (ZF), regardless of the truth or falsity of the axiom of choice in that particular model. The restriction to ZF renders any claim that relies on either the axiom of choice or its negation unprovable. For example, the Banach–Tarski paradox is neither provable nor disprovable from ZF alone: it is impossible to construct the required decomposition of the unit ball in ZF, but also impossible to prove there is no such decomposition. Similarly, all the statements listed below which require choice or some weaker version thereof for their proof are unprovable in ZF; but since each is provable in ZF plus the axiom of choice, there are models of ZF in which each statement is true. Statements such as the Banach–Tarski paradox can be rephrased as conditional statements, for example, "If AC holds, the decomposition in the Banach–Tarski paradox exists." Such conditional statements are provable in ZF when the original statements are provable from ZF and the axiom of choice.
Independence
Assuming ZF is consistent, Kurt Gödel showed that the negation of the axiom of choice is not a theorem of ZF by constructing an inner model (the constructible universe) which satisfies ZFC, thus showing that ZFC is consistent. Assuming ZF is consistent, Paul Cohen employed the technique of forcing, developed for this purpose, to show that the axiom of choice itself is not a theorem of ZF by constructing a much more complex model which satisfies ZF¬C (ZF with the negation of AC added as an axiom), thus showing that ZF¬C is consistent. Together these results establish that the axiom of choice is logically independent of ZF. The assumption that ZF is consistent is harmless because adding another axiom to an already inconsistent system cannot make the situation worse.

Because of this independence, the decision whether to use the axiom of choice (or its negation) in a proof cannot be made by appeal to other axioms of set theory. The decision must be made on other grounds. One argument given in favor of using the axiom of choice is that it is convenient: it allows one to prove some simplifying propositions that otherwise could not be proved. Many theorems which are provable using choice are of an elegant general character: every ideal in a ring is contained in a maximal ideal, every vector space has a basis, and every product of compact spaces is compact. Without the axiom of choice, these theorems may not hold for mathematical objects of large cardinality.

The proof of the independence result also shows that a wide class of mathematical statements, including all statements that can be phrased in the language of Peano arithmetic, are provable in ZF if and only if they are provable in ZFC.[6] Statements in this class include the statement that P = NP, the Riemann hypothesis, and many other unsolved mathematical problems. When one attempts to solve problems in this class, it makes no difference whether ZF or ZFC is employed if the only question is the existence of a proof. It is possible, however, that there is a shorter proof of a theorem from ZFC than from ZF.

The axiom of choice is not the only significant statement which is independent of ZF. For example, the generalized continuum hypothesis (GCH) is not only independent of ZF, but also independent of ZFC. However, ZF plus GCH implies AC, making GCH a strictly stronger claim than AC, even though they are both independent of ZF.
Stronger axioms
The axiom of constructibility and the generalized continuum hypothesis both imply the axiom of choice, but are strictly stronger than it. In class theories such as Von Neumann–Bernays–Gödel set theory and Morse–Kelley set theory, there is a possible axiom called the axiom of global choice which is stronger than the axiom of choice for sets because it also applies to proper classes. And the axiom of global choice follows from the axiom of limitation of size.
Equivalents
There are important statements that, assuming the axioms of ZF but neither AC nor ¬AC, are equivalent to the axiom of choice. The most important among them are Zorn's lemma and the well-ordering theorem. In fact, Zermelo initially introduced the axiom of choice in order to formalize his proof of the well-ordering theorem.

Set theory
Well-ordering theorem: Every set can be well-ordered. Consequently, every cardinal has an initial ordinal.
Tarski's theorem: For every infinite set A, there is a bijective map between the sets A and A×A.
Trichotomy: If two sets are given, then either they have the same cardinality, or one has a smaller cardinality than the other.
The Cartesian product of any family of nonempty sets is nonempty.
König's theorem: Colloquially, the sum of a sequence of cardinals is strictly less than the product of a sequence of larger cardinals. (The reason for the term "colloquially" is that the sum or product of a "sequence" of cardinals cannot be defined without some aspect of the axiom of choice.)
Every surjective function has a right inverse. (A sketch of this implication is given after this list.)

Order theory
Zorn's lemma: Every non-empty partially ordered set in which every chain (i.e. totally ordered subset) has an upper bound contains at least one maximal element.
Hausdorff maximal principle: In any partially ordered set, every totally ordered subset is contained in a maximal totally ordered subset. The restricted principle "Every partially ordered set has a maximal totally ordered subset" is also equivalent to AC over ZF.
Tukey's lemma: Every non-empty collection of finite character has a maximal element with respect to inclusion.
Antichain principle: Every partially ordered set has a maximal antichain.

Abstract algebra
Every vector space has a basis.[7]
Every unital ring other than the trivial ring contains a maximal ideal.
For every non-empty set S there is a binary operation defined on S that makes it a group.[8] (A cancellative binary operation is enough.)

Functional analysis
The closed unit ball of the dual of a normed vector space over the reals has an extreme point.

General topology
Tychonoff's theorem: every product of compact topological spaces is compact.
In the product topology, the closure of a product of subsets is equal to the product of the closures.

Mathematical logic
If S is a set of sentences of first-order logic and B is a consistent subset of S, then B is included in a set that is maximal among consistent subsets of S. The special case where S is the set of all first-order sentences in a given signature is weaker, equivalent to the Boolean prime ideal theorem; see the section "Weaker forms".
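As one illustration of how the axiom of choice enters these equivalents, here is a short LaTeX sketch of the easy direction for the right-inverse statement (AC implies it; the converse also holds but is not shown here):

% Let f : A -> B be surjective. Each fiber f^{-1}(b) is nonempty, so AC gives a
% choice function c with c(b) in f^{-1}(b) for every b in B. Taking g = c:
\[
  (f \circ g)(b) \;=\; f\bigl(c(b)\bigr) \;=\; b \qquad \text{for all } b \in B,
\]
% so g is a right inverse of f.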
Category theory
There are several results in category theory which invoke the axiom of choice for their proof. These results might be weaker than, equivalent to, or stronger than the axiom of choice, depending on the strength of the technical foundations. For example, if one defines categories in terms of sets, that is, as sets of objects and morphisms (usually called a small category), or even locally small categories, whose hom-objects are sets, then there is no category of all sets, and so it is difficult for a category-theoretic formulation to apply to all sets. On the other hand, other foundational descriptions of category theory are considerably stronger, and an identical category-theoretic statement of choice may be stronger than the standard formulation, à la class theory, mentioned above.

Examples of category-theoretic statements which require choice include:
Every small category has a skeleton.
If two small categories are weakly equivalent, then they are equivalent.
Every continuous functor on a small-complete category which satisfies the appropriate solution set condition has a left adjoint (the Freyd adjoint functor theorem).
Weaker forms
There are several weaker statements that are not equivalent to the axiom of choice, but are closely related. One example is the axiom of dependent choice (DC). A still weaker example is the axiom of countable choice (ACω or CC), which states that a choice function exists for any countable set of nonempty sets. These axioms are sufficient for many proofs in elementary mathematical analysis, and are consistent with some principles, such as the Lebesgue measurability of all sets of reals, that are disprovable from the full axiom of choice. Other choice axioms weaker than the axiom of choice include the Boolean prime ideal theorem and the axiom of uniformization. The former is equivalent in ZF to the existence of an ultrafilter containing each given filter, proved by Tarski in 1930.
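For concreteness, the two principles can be stated formally; a standard LaTeX rendering (the symbols used here are not otherwise fixed in the text):

% Countable choice: every countable family of nonempty sets admits a choice function.
\[
  \mathrm{AC}_\omega:\quad \forall (X_n)_{n \in \omega}\,\Bigl(\,\forall n\ (X_n \neq \varnothing)\;\rightarrow\;\exists f\ \forall n\ f(n) \in X_n\Bigr)
\]
% Dependent choice: if the relation R is entire on a nonempty set X, one can pick an infinite R-chain.
\[
  \mathrm{DC}:\quad \forall X\,\forall R \subseteq X \times X\,\Bigl(X \neq \varnothing \wedge \forall x \in X\,\exists y \in X\ (x\,R\,y)\;\rightarrow\;\exists (x_n)_{n \in \omega}\ \forall n\ (x_n\,R\,x_{n+1})\Bigr)
\]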
Every field extension has a transcendence basis.
Stone's representation theorem for Boolean algebras needs the Boolean prime ideal theorem.
The Nielsen–Schreier theorem, that every subgroup of a free group is free.
The additive groups of R and C are isomorphic.[9][10]
Functional analysis
The Hahn–Banach theorem in functional analysis, allowing the extension of linear functionals.
The theorem that every Hilbert space has an orthonormal basis.
The Banach–Alaoglu theorem about compactness of sets of functionals.
The Baire category theorem about complete metric spaces, and its consequences, such as the open mapping theorem and the closed graph theorem.
On every infinite-dimensional topological vector space there is a discontinuous linear map.

General topology
A uniform space is compact if and only if it is complete and totally bounded.
Every Tychonoff space has a Stone–Čech compactification.

Mathematical logic
Gödel's completeness theorem for first-order logic: every consistent set of first-order sentences has a completion. That is, every consistent set of first-order sentences can be extended to a maximal consistent set.
For proofs, see Thomas Jech, The Axiom of Choice, American Elsevier Pub. Co., New York, 1973.

There exists a model of ZF¬C in which every set in Rn is measurable. Thus it is possible to exclude counterintuitive results like the Banach–Tarski paradox, which are provable in ZFC. Furthermore, this is possible whilst assuming the axiom of dependent choice, which is weaker than AC but sufficient to develop most of real analysis. In all models of ZF¬C, the generalized continuum hypothesis does not hold.
Quotes
"The Axiom of Choice is obviously true, the well-ordering principle obviously false, and who can tell about Zorn's lemma?" Jerry Bona This is a joke: although the three are all mathematically equivalent, many mathematicians find the axiom of choice to be intuitive, the well-ordering principle to be counterintuitive, and Zorn's lemma to be too complex for any intuition. "The Axiom of Choice is necessary to select a set from an infinite number of socks, but not an infinite number of shoes." Bertrand Russell The observation here is that one can define a function to select from an infinite number of pairs of shoes by stating for example, to choose the left shoe. Without the axiom of choice, one cannot assert that such a function exists for pairs of socks, because left and right socks are (presumably) indistinguishable from each other. "Tarski tried to publish his theorem [the equivalence between AC and 'every infinite set A has the same cardinality as AxA', see above] in Comptes Rendus, but Frchet and Lebesgue refused to present it. Frchet wrote that an implication between two well known [true] propositions is not a new result, and Lebesgue wrote that an implication between two false propositions is of no interest". Polish-American mathematician Jan Mycielski relates this anecdote in a 2006 article in the Notices of the AMS. "The axiom gets its name not because mathematicians prefer it to other axioms." A. K. Dewdney This quote comes from the famous April Fools' Day article in the computer recreations column of the Scientific American, April 1989.
Notes
[1] Zermelo, Ernst (1904). "Beweis, dass jede Menge wohlgeordnet werden kann" (http://gdz.sub.uni-goettingen.de/no_cache/en/dms/load/img/?IDDOC=28526) (reprint). Mathematische Annalen 59 (4): 514–516. doi:10.1007/BF01445300.
[2] Jech, 1977, p. 348ff; Martin-Löf 2008, p. 210.
[3] Herrlich, p. 9.
[4] Patrick Suppes, "Axiomatic Set Theory", Dover, 1972 (1960), ISBN 0-486-61630-4, p. 240.
[5] Tourlakis (2003), pp. 209–210, 215–216.
[6] This is because arithmetical statements are absolute to the constructible universe L. Shoenfield's absoluteness theorem gives a more general result.
[7] Blass, Andreas (1984). "Existence of bases implies the axiom of choice". Contemporary Mathematics 31.
[8] A. Hajnal, A. Kertész: Some new algebraic equivalents of the axiom of choice, Publ. Math. Debrecen, 19 (1972), 339–340; see also H. Rubin, J. Rubin, Equivalents of the axiom of choice, II, North-Holland, 1985, p. 111.
[9] http://www.cs.nyu.edu/pipermail/fom/2006-February/009959.html
[10] http://journals.cambridge.org/action/displayFulltext?type=1&fid=4931240&aid=4931232
[11] Axiom of dependent choice.
[12] Jech, Thomas (1973) "The Axiom of Choice", ISBN 0-444-10484-4, Ch. 10, p. 142.
[13] Stavi, Jonathan (1974). "A model of ZF with an infinite free complete Boolean algebra" (http://www.springerlink.com/content/d5710380t753621u/) (reprint). Israel Journal of Mathematics 20 (2): 149–163. doi:10.1007/BF02757883.
References
Horst Herrlich, Axiom of Choice, Springer Lecture Notes in Mathematics 1876, Springer-Verlag Berlin Heidelberg (2006). ISBN 3-540-30989-6.
Paul Howard and Jean Rubin, "Consequences of the Axiom of Choice". Mathematical Surveys and Monographs 59; American Mathematical Society; 1998.
Thomas Jech, "About the Axiom of Choice." Handbook of Mathematical Logic, John Barwise, ed., 1977.
Per Martin-Löf, "100 years of Zermelo's axiom of choice: What was the problem with it?", in Logicism, Intuitionism, and Formalism: What Has Become of Them?, Sten Lindström, Erik Palmgren, Krister Segerberg, and Viggo Stoltenberg-Hansen, editors (2008). ISBN 1-402-08925-2.
Gregory H. Moore, "Zermelo's axiom of choice, Its origins, development and influence", Springer; 1982. ISBN 0-387-90670-3.
Herman Rubin, Jean E. Rubin: Equivalents of the axiom of choice. North Holland, 1963. Reissued by Elsevier, April 1970. ISBN 0720422256.
Herman Rubin, Jean E. Rubin: Equivalents of the Axiom of Choice II. North Holland/Elsevier, July 1985. ISBN 0444877088.
George Tourlakis, Lectures in Logic and Set Theory. Vol. II: Set Theory, Cambridge University Press, 2003. ISBN 0-511-06659-7.
Ernst Zermelo, "Untersuchungen über die Grundlagen der Mengenlehre I," Mathematische Annalen 65 (1908), pp. 261–81. PDF download via digizeitschriften.de (http://www.digizeitschriften.de/no_cache/home/jkdigitools/loader/?tx_jkDigiTools_pi1[IDDOC]=361762)
Translated in: Jean van Heijenoort, 2002. From Frege to Gödel: A Source Book in Mathematical Logic, 1879–1931. New edition. Harvard University Press. ISBN 0-674-32449-8.
1904. "Proof that every set can be well-ordered," 139–41.
1908. "Investigations in the foundations of set theory I," 199–215.
External links
Axiom of Choice and Its Equivalents at ProvenMath (http://www.apronus.com/provenmath/choice.htm): includes a formal statement of the Axiom of Choice, Hausdorff's Maximal Principle, Zorn's Lemma and formal proofs of their equivalence down to the finest detail.
Consequences of the Axiom of Choice (http://www.math.purdue.edu/~hrubin/JeanRubin/Papers/conseq.html), based on the book by Paul Howard (http://www.emunix.emich.edu/~phoward/) and Jean Rubin.
The Axiom of Choice (http://plato.stanford.edu/entries/axiom-choice), entry by John Lane Bell in the Stanford Encyclopedia of Philosophy.
Axiom of dependent choice
Footnotes
[1] Blair, Charles E. The Baire category theorem implies the principle of dependent choices. Bull. Acad. Polon. Sci. Sér. Sci. Math. Astronom. Phys. 25 (1977), no. 10, 933–934.
References
Jech, Thomas, 2003. Set Theory: The Third Millennium Edition, Revised and Expanded. Springer. ISBN 3-540-44085-2.
Continuum hypothesis
In mathematics, the continuum hypothesis (abbreviated CH) is a hypothesis, advanced by Georg Cantor in 1874, about the possible sizes of infinite sets. It states: There is no set whose cardinality is strictly between that of the integers and that of the real numbers. Establishing the truth or falsehood of the continuum hypothesis is the first of Hilbert's twenty-three problems presented in the year 1900. The contributions of Kurt Gödel in 1940 and Paul Cohen in 1963 showed that the hypothesis can neither be disproved nor be proved using the axioms of Zermelo–Fraenkel set theory, the standard foundation of modern mathematics, provided ZF set theory is consistent. The name of the hypothesis comes from the term the continuum for the real numbers.
Assuming the axiom of choice, there is a smallest cardinal number ℵ1 greater than ℵ0, and the continuum hypothesis is in turn equivalent to the equality 2^ℵ0 = ℵ1.

There is also a generalization of the continuum hypothesis called the generalized continuum hypothesis (GCH), which says that for all ordinals α, 2^ℵα = ℵα+1 (the cardinal successor of ℵα).

A consequence of the hypothesis is that every infinite subset of the real numbers either has the same cardinality as the integers or the same cardinality as the entire set of the reals.
W. Hugh Woodin has argued against the continuum hypothesis (Woodin 2001a, 2001b). Foreman (2003) does not reject Woodin's argument outright but urges caution.
A result of Carmi Merimovich shows that, for each n ≥ 1, it is consistent with ZFC that for each infinite cardinal κ, 2^κ is the nth successor of κ. On the other hand, László Patai proved that if γ is an ordinal and for each infinite cardinal κ, 2^κ is the γth successor of κ, then γ is finite.

For any infinite sets A and B, if there is an injection from A to B then there is an injection from subsets of A to subsets of B. Thus for any infinite cardinals A and B, A < B implies 2^A ≤ 2^B. If A and B are finite, the stronger inequality A < B implies 2^A < 2^B holds. GCH implies that this strict, stronger inequality holds for infinite cardinals as well as finite cardinals.
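The step from an injection between two sets to an injection between their power sets, used above, can be spelled out; a short LaTeX sketch:

% Given an injection f : A -> B, map each subset of A to its image under f.
\[
  F : \mathcal{P}(A) \to \mathcal{P}(B), \qquad F(S) \;=\; f[S] \;=\; \{\, f(a) : a \in S \,\}.
\]
% If F(S) = F(T), then injectivity of f gives S = T (each element of S maps into f[T],
% hence lies in T, and symmetrically); so F is injective, and |A| <= |B| implies 2^{|A|} <= 2^{|B|}.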
References
Cohen, P. J. (1966). Set Theory and the Continuum Hypothesis. W. A. Benjamin.
Cohen, Paul J. (December 15, 1963). "The Independence of the Continuum Hypothesis". Proceedings of the National Academy of Sciences of the United States of America 50 (6): 1143–1148. doi:10.1073/pnas.50.6.1143. JSTOR 71858. PMC 221287. PMID 16578557.
Cohen, Paul J. (January 15, 1964). "The Independence of the Continuum Hypothesis, II". Proceedings of the National Academy of Sciences of the United States of America 51 (1): 105–110. doi:10.1073/pnas.51.1.105. JSTOR 72252. PMC 300611. PMID 16591132.
Dales, H. G.; W. H. Woodin (1987). An Introduction to Independence for Analysts. Cambridge.
Enderton, Herbert (1977). Elements of Set Theory. Academic Press.
Foreman, Matt (2003). "Has the Continuum Hypothesis been Settled?" [1] (PDF). Retrieved February 25, 2006.
Freiling, Chris (1986). "Axioms of Symmetry: Throwing Darts at the Real Number Line". Journal of Symbolic Logic (Association for Symbolic Logic) 51 (1): 190–200. doi:10.2307/2273955. JSTOR 2273955.
Gödel, K. (1940). The Consistency of the Continuum-Hypothesis. Princeton University Press.
Kunen, Kenneth (1980). Set Theory: An Introduction to Independence Proofs. Amsterdam: North-Holland. ISBN 978-0-444-85401-8.
Gödel, K.: What is Cantor's Continuum Problem?, reprinted in Benacerraf and Putnam's collection Philosophy of Mathematics, 2nd ed., Cambridge University Press, 1983. An outline of Gödel's arguments against CH.
Maddy, Penelope (June 1988). "Believing the Axioms, I". Journal of Symbolic Logic (Association for Symbolic Logic) 53 (2): 481–511. doi:10.2307/2274520. JSTOR 2274520.
Martin, D. (1976). "Hilbert's first problem: the continuum hypothesis," in Mathematical Developments Arising from Hilbert's Problems, Proceedings of Symposia in Pure Mathematics XXVIII, F. Browder, editor. American Mathematical Society, 1976, pp. 81–92. ISBN 0-8218-1428-1.
McGough, Nancy. "The Continuum Hypothesis" [2].
Merimovich, Carmi (2007). "A power function with a fixed finite gap everywhere". Journal of Symbolic Logic 72 (2): 361–417. doi:10.2178/jsl/1185803615.
Woodin, W. Hugh (2001a). "The Continuum Hypothesis, Part I" [3] (PDF). Notices of the AMS 48 (6): 567–576.
Woodin, W. Hugh (2001b). "The Continuum Hypothesis, Part II" [4] (PDF). Notices of the AMS 48 (7): 681–690.

Primary literature in German
Cantor, Georg (1874), "Über eine Eigenschaft des Inbegriffes aller reellen algebraischen Zahlen" [5], Journal für die Reine und Angewandte Mathematik 77: 258–262.

This article incorporates material from Generalized continuum hypothesis on PlanetMath, which is licensed under the Creative Commons Attribution/Share-Alike License.
References
[1] http://www.math.helsinki.fi/logic/LC2003/presentations/foreman.pdf
[2] http://www.ii.com/math/ch/
[3] http://www.ams.org/notices/200106/fea-woodin.pdf
[4] http://www.ams.org/notices/200107/fea-woodin.pdf
[5] http://bolyai.cs.elte.hu/~badam/matbsc/11o/cantor1874.pdf
Martin's axiom
In the mathematical field of set theory, Martin's axiom, introduced by Donald A. Martin and Robert M. Solovay (1970), is a statement which is independent of the usual axioms of ZFC set theory. It is implied by the continuum hypothesis, so it is certainly consistent with ZFC, but it is also known to be consistent with ZFC + ¬CH. Indeed, it is only really of interest when the continuum hypothesis fails (otherwise it adds nothing to ZFC). It can informally be considered to say that all cardinals less than the cardinality of the continuum, 𝔠, behave roughly like ℵ0. The intuition behind this can be understood by studying the proof of the Rasiowa–Sikorski lemma. More formally it is a principle that is used to control certain forcing arguments.
For a cardinal number k, MA(k) is the statement that for every partial order P satisfying the countable chain condition (ccc) and every family D of dense subsets of P with |D| ≤ k, there is a filter F on P such that F ∩ d is non-empty for every d in D. Martin's axiom is the statement that MA(k) holds for every k less than the continuum. (It is a theorem of ZFC that MA(2^ℵ0) fails.) Note that, in this case (for the application of ccc), an antichain is a subset A of P such that any two distinct members of A are incompatible (two elements are said to be compatible if there exists a common element below both of them in the partial order). This differs from, for example, the notion of antichain in the context of trees.

MA(ℵ0) is simply true. This is known as the Rasiowa–Sikorski lemma.

MA(2^ℵ0) is false: [0, 1] is a compact Hausdorff space, which is separable and so ccc. It has no isolated points, so points in it are nowhere dense, yet it is the union of 2^ℵ0 many points.
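A brief LaTeX sketch of why MA(ℵ0) holds, i.e. the Rasiowa–Sikorski lemma, under the filter formulation of MA(k) given above:

% Enumerate the countably many dense sets as D_0, D_1, D_2, ...
% Choose p_0 in D_0 and, given p_n, choose p_{n+1} <= p_n with p_{n+1} in D_{n+1}
% (possible because D_{n+1} is dense); the successive choices can be organized
% using the axiom of dependent choice.
\[
  G \;=\; \{\, q \in P : q \ge p_n \text{ for some } n \in \omega \,\}
\]
% is then a filter on P (it is upward closed, and any two of its elements have
% some p_n below both), and it meets every D_n because p_n belongs to G and to D_n.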
Consequences of MA(k)
Martin's axiom has a number of other interesting combinatorial, analytic and topological consequences:
The union of k or fewer null sets in an atomless σ-finite Borel measure on a Polish space is null. In particular, the union of k or fewer subsets of R of Lebesgue measure 0 also has Lebesgue measure 0.
A compact Hausdorff space X with fewer than 2^k points is sequentially compact, i.e., every sequence has a convergent subsequence.
No non-principal ultrafilter on N has a base of cardinality less than k.
MA(ℵ1) implies:
A product of ccc topological spaces is ccc (this in turn implies there are no Suslin lines).
MA together with the negation of the continuum hypothesis implies:
There exists a Whitehead group that is not free; Shelah used this to show that the Whitehead problem is independent of ZFC.
References
Fremlin, David H. (1984). Consequences of Martin's axiom. Cambridge Tracts in Mathematics, no. 84. Cambridge: Cambridge University Press. ISBN 0521250919.
Jech, Thomas, 2003. Set Theory: The Third Millennium Edition, Revised and Expanded. Springer. ISBN 3-540-44085-2.
Kunen, Kenneth, 1980. Set Theory: An Introduction to Independence Proofs. Elsevier. ISBN 0-444-86839-9.
Martin, D. A.; Solovay, R. M. (1970), "Internal Cohen extensions", Ann. Math. Logic 2 (2): 143–178, doi:10.1016/0003-4843(70)90009-4, MR 0270904.
Diamond principle
In mathematics, and particularly in axiomatic set theory, the diamond principle is a combinatorial principle introduced by Björn Jensen (1972) that holds in the constructible universe and that implies the continuum hypothesis. Jensen extracted the diamond principle from his proof that V = L implies the existence of a Suslin tree.
Definition
The diamond principle ◊ says that there exists a ◊-sequence, in other words sets Aα ⊆ α for α < ω1 such that for any subset A of ω1 the set of α with A ∩ α = Aα is stationary in ω1.

More generally, for a given cardinal number κ and a stationary set S ⊆ κ, the statement ◊S (sometimes written ◊(S) or ◊κ(S)) is the statement that there is a sequence ⟨Aα : α ∈ S⟩ such that each Aα ⊆ α and, for every A ⊆ κ, the set {α ∈ S : A ∩ α = Aα} is stationary in κ.

The principle ◊ω1 is the same as ◊.
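The claim that the diamond principle implies the continuum hypothesis can be read off from this definition; a short LaTeX sketch:

% Fix a diamond-sequence <A_alpha : alpha < omega_1> and let A be any subset of omega.
% Viewing A as a subset of omega_1, the set of alpha with A ∩ alpha = A_alpha is
% stationary, hence unbounded, so it contains some alpha >= omega; for such alpha,
% A = A ∩ alpha = A_alpha. Therefore
\[
  \mathcal{P}(\omega) \;\subseteq\; \{\, A_\alpha : \alpha < \omega_1 \,\},
  \qquad\text{so}\qquad 2^{\aleph_0} \le \aleph_1 ,
\]
% which is the continuum hypothesis.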
References
Akemann, Charles; Weaver, Nik (2004), "Consistency of a counterexample to Naimark's problem", Proceedings of the National Academy of Sciences of the United States of America 101 (20): 7522–7525, arXiv:math.OA/0312135, doi:10.1073/pnas.0401489101, MR 2057719.
Jensen, R. Björn (1972), "The fine structure of the constructible hierarchy", Annals of Mathematical Logic 4: 229–308, doi:10.1016/0003-4843(72)90001-0, MR 0309729.
Assaf Rinot, Jensen's diamond principle and its relatives, online [1]
S. Shelah: Whitehead groups may not be free, even assuming CH, II, Israel J. Math., 35 (1980), 257–285.
References
[1] http://papers.assafrinot.com/?num=s01
Clubsuit
In mathematics, and particularly in axiomatic set theory, ♣S (clubsuit) is a family of combinatorial principles that are a weaker version of the corresponding ◊S; it was introduced in 1975 by A. Ostaszewski.
Definition
For a given cardinal number κ and a stationary set S ⊆ κ, ♣S is the statement that there is a sequence ⟨Aδ : δ ∈ S⟩ such that every Aδ is a cofinal subset of δ and, for every unbounded subset A ⊆ κ, there is a δ so that Aδ ⊆ A. ♣ω1 is usually written as just ♣.

It is clear that ◊ ⇒ ♣, and A. J. Ostaszewski showed in 1975 that ♣ + CH ⇒ ◊; however, Saharon Shelah gave a proof in 1980 that there exists a model of ♣ in which CH does not hold, so ♣ and ◊ are not equivalent (since ◊ ⇒ CH).
References
A. J. Ostaszewski, On countably compact perfectly normal spaces, Journal of the London Mathematical Society, 1975 (2) 14, pp. 505–516.
S. Shelah, Whitehead groups may not be free, even assuming CH, II, Israel Journal of Mathematics, 1980 (35), pp. 257–285.
Axiom of constructibility
The axiom of constructibility is a possible axiom for set theory in mathematics that asserts that every set is constructible. The axiom is usually written as V = L, where V and L denote the von Neumann universe and the constructible universe, respectively.
Implications
The axiom of constructibility implies the axiom of choice over Zermelo–Fraenkel set theory. It also settles many natural mathematical questions independent of Zermelo–Fraenkel set theory with the axiom of choice (ZFC). For example, the axiom of constructibility implies the generalized continuum hypothesis, the negation of Suslin's hypothesis, and the existence of an analytical (in fact, Δ^1_2) non-measurable set of real numbers, all of which are independent of ZFC.

The axiom of constructibility implies the non-existence of those large cardinals with consistency strength greater than or equal to that of 0#, which includes some "relatively small" large cardinals. Thus, no cardinal can be ω1-Erdős in L. While L does contain the initial ordinals of those large cardinals (when they exist in a supermodel of L), and they are still initial ordinals in L, it excludes the auxiliary structures (e.g. measures) which endow those cardinals with their large cardinal properties.

Although the axiom of constructibility does resolve many set-theoretic questions, it is not typically accepted as an axiom for set theory in the same way as the ZFC axioms. Among set theorists of a realist bent, who believe that the axiom of constructibility is either true or false, most believe that it is false. This is in part because it seems unnecessarily "restrictive", as it allows only certain subsets of a given set, with no clear reason to believe that these are all of them. In part it is because the axiom is contradicted by sufficiently strong large cardinal axioms. This point of view is especially associated with the Cabal, or the "California school" as Saharon Shelah would have it.
References
Devlin, Keith (1984). Constructibility. Springer-Verlag. ISBN 3-540-13258-9.
External links
How many real numbers are there? [1], Keith Devlin, Mathematical Association of America, June 2001
References
[1] http://www.maa.org/devlin/devlin_6_01.html
Proper forcing axiom

In set theory, the proper forcing axiom (PFA) is a strengthening of Martin's axiom in which the countable chain condition is replaced by properness.
Statement
A forcing or partially ordered set P is proper if for all regular uncountable cardinals λ, forcing with P preserves stationary subsets of [λ]^ω.

The proper forcing axiom asserts that if P is proper and Dα is a dense subset of P for each α < ω1, then there is a filter G ⊆ P such that Dα ∩ G is nonempty for all α < ω1.

The class of proper forcings, to which PFA can be applied, is rather large. For example, standard arguments show that if P is ccc or ω-closed, then P is proper. If P is a countable support iteration of proper forcings, then P is proper. In general, proper forcings preserve ℵ1.
Consequences
PFA directly implies its version for ccc forcings, Martin's axiom. In cardinal arithmetic, PFA implies 2^ℵ0 = ℵ2. PFA implies that any two ℵ1-dense subsets of R are isomorphic, that any two Aronszajn trees are club-isomorphic, and that every automorphism of the Boolean algebra P(ω)/fin is trivial. PFA implies that the Singular Cardinals Hypothesis holds. An especially notable consequence proved by John R. Steel is that the axiom of determinacy holds in L(R), the smallest inner model containing the real numbers. Another consequence is the failure of the square principles and hence the existence of inner models with many Woodin cardinals.
Consistency strength
If there is a supercompact cardinal, then there is a model of set theory in which PFA holds. The proof uses the fact that proper forcings are preserved under countable support iteration, and the fact that if κ is supercompact, then there exists a Laver function for κ. It is not yet known how much large cardinal strength comes from PFA.
References
Jech, Thomas (2002). Set Theory, third millennium edition (revised and expanded). Springer. ISBN 3-540-44085-2.
Steel, John R. (2005). "PFA implies AD^L(R)". Journal of Symbolic Logic 70 (4): 1255–1296. doi:10.2178/jsl/1129642125.
License
Creative Commons Attribution-Share Alike 3.0 Unported: http://creativecommons.org/licenses/by-sa/3.0/