
TREE AUTOMATA

FERENC GÉCSEG
Bolyai Institute
József Attila University
Szeged, Hungary

MAGNUS STEINBY
Department of Mathematics
University of Turku
Turku, Finland
PREFACES

Preface to the Second Edition


When the present book was written in the early 1980s, the theory of tree automata, tree
languages and tree transformations was young but already quite extensive. Our aim
was to give a systematic and mathematically sound exposition of some central parts of
this subject. The presentation uses universal algebra in the spirit of J. R. Büchi and
J. B. Wright from whose ideas of automata as algebras tree automata once emerged. That
the algebraic formalism encourages and supports precise definitions and rigorous proofs
may explain why the book has remained a general reference for many mathematically
minded workers in the field ever since its publication in 1984. Unfortunately, it has long
been out of print and hard to obtain.
Soon after the regrettable death of Ferenc Gécseg in October 2014, Zoltán Fülöp
(Szeged) and Heiko Vogler (Dresden) proposed a reissue of this book. Akadémiai Kiadó,
the original publisher, did not find the project feasible but gave us a free hand to proceed
on our own. Professor Gécseg’s family also willingly endorsed the idea. Since the book
did not exist in any electronic form, the whole text had to be retyped in LaTeX. For this
exacting task Fülöp and Vogler quickly assembled a highly qualified team that, besides
themselves, included Johanna Björklund (Umeå), Frank Drewes (Umeå), Zsolt Gazdag
(Budapest), Eija Jurvanen (Turku), Andreas Maletti (Stuttgart), George Rahonis (Thes-
saloniki), Kai Salomaa (Kingston, Ontario), and Sándor Vágvölgyi (Szeged). Professors
Fülöp and Vogler also undertook the overall management of the work. The generous
contributions of all these individuals are acknowledged with many thanks.
From the very beginning it was decided that this new edition should be true to the
original one. In particular, the terminology was preserved even in cases in which some
alternative terms have become prevalent. However, a few mistakes were corrected and
a couple of obscure passages were clarified.
Of course, the book was never claimed to offer a complete presentation of its subject
matter. In fact, some important topics were totally left out. It was hoped that the ex-
tensive bibliography, fairly complete up to around 1982, and the notes and references at
the end of each chapter would, at least partly, make up for the shortcomings. Now, over
thirty years later, the incompleteness is naturally even more obvious. Much progress has
been made in already established areas and many new topics have emerged. Some of the
new work is strongly motivated by applications, old or new. No book of this size could do
justice to all these developments. Instead, we have to trust that the matters presented
here still belong to the core of the theory and are worth studying by anyone who wants
to work in this field. Moreover, to account for more recent contributions and lines of re-
search, an appendix has been added to the book. In it several topics are briefly surveyed

and some relevant references are given to help an interested reader get started on them.
I thank Heiko Vogler and Zoltán Fülöp for some important additions to the bibliography.

Turku, August 2015
Magnus Steinby

Preface to the Original Edition


The purpose of this book is to give a mathematically rigorous presentation of the theory
of tree automata, recognizable forests, and tree transformations. Apart from its intrinsic
interest this theory offers some new perspectives to various parts of mathematical lin-
guistics. It has also been applied to some decision problems of logic, and it provides tools
for syntactic pattern recognition. We have not even tried to discuss all aspects of the
subject or any of the applications, but enough central material has been included to give
the reader a firm basis for further studies. Being relatively new and very many-faceted,
the field still lacks a uniform, widely accepted formalism. We have chosen the language of
universal algebra as our vehicle of presentation. However, we have not assumed that the
reader is familiar with universal algebra; the preparatory sections in Chapter 1 should
make the book self-contained in this respect. On the other hand, it is natural to assume
that anyone interested in such a book has some general mathematical training and some
knowledge of finite automata and formal languages.
The book consists of four chapters, a bibliography and an index. The first chapter
contains an exposition of the necessary universal algebra and lattice theory, as well as a
quick review of finite automata and formal languages. We also recommend some books
on these subjects. In Chapter 2 trees, forests, tree recognizers, tree grammars, and some
operations on forests are introduced. Several characterizations and closure properties
of recognizable forests are presented. Chapter 3 is devoted to the connections between
recognizable forests and context-free languages. Chapter 4 deals with tree transducers
and tree transformations. Chapters 2–4 contain some exercises. Each of these chapters
is concluded with some historical and bibliographical comments. We also point out some
topics not discussed in the book. We have tried to make the Bibliography as complete as
possible. Of course, it has not always been easy to decide whether a given item should
be included or not.
We want to thank our colleagues and the staffs at our institutions for the good working
atmosphere in which this book was written. Dr. András Ádám and Professor István Peák
gave the text a careful scrutiny. We gratefully acknowledge their many remarks. We are
also indebted to Dr. Zoltán Ésik for his very helpful comments on Chapter 4. We wish to
express our warmest thanks to Mrs. Piroska Folberth for performing very competently
the difficult task of typing the manuscript. Also, we want to thank our wives and
daughters for their support and for putting up so gracefully with the inconveniences
inevitably caused by our undertaking.
The writing of the book has involved several trips between Turku and Szeged. We
gratefully acknowledge the financial support provided by the Academy of Finland, the

Hungarian Academy of Sciences, the János Bolyai Mathematical Society, the University
of Szeged, and the University of Turku. Our work was also furthered by the opportunity for
the first-named author to spend a term at the Tampere University of Technology. For
this thanks are due Professor Timo Lepistö.

Contents

Prefaces i

Notes to the reader 3

1 PRELIMINARIES 5
1.1 Sets, relations and mappings . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2 Universal algebras . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.3 Terms, polynomial functions and free algebras . . . . . . . . . . . . . . . . 16
1.4 Lattices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
1.5 Finite recognizers and regular languages . . . . . . . . . . . . . . . . . . . 27
1.6 Grammars and context-free languages . . . . . . . . . . . . . . . . . . . . 35
1.7 Sequential machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
1.8 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

2 TREE RECOGNIZERS AND RECOGNIZABLE FORESTS 47


2.1 Trees and forests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
2.2 Tree recognizers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
2.3 Regular tree grammars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
2.4 Operations on forests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
2.5 Regular expressions. Kleene’s theorem . . . . . . . . . . . . . . . . . . . . 75
2.6 Minimal tree recognizers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
2.7 Algebraic characterizations of recognizability . . . . . . . . . . . . . . . . 85
2.8 A Medvedev-type characterization . . . . . . . . . . . . . . . . . . . . . . 93
2.9 Local forests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
2.10 Some basic decision problems . . . . . . . . . . . . . . . . . . . . . . . . . 99
2.11 Deterministic R-recognizers . . . . . . . . . . . . . . . . . . . . . . . . . . 102
2.12 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
2.13 Notes and references . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113

3 CONTEXT-FREE LANGUAGES AND TREE RECOGNIZERS 117


3.1 The yield function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
3.2 Context-free languages and recognizable forests . . . . . . . . . . . . . . . 119
3.3 Further results and applications . . . . . . . . . . . . . . . . . . . . . . . . 122
3.4 Another way to recognize CF languages . . . . . . . . . . . . . . . . . . . 125
3.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
3.6 Notes and references . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129


4 TREE TRANSDUCERS AND TREE TRANSFORMATIONS 131


4.1 Basic concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
4.2 Some classes of tree transformations . . . . . . . . . . . . . . . . . . . . . 142
4.3 Compositions and decompositions of tree transformations . . . . . . . . . 147
4.4 Tree transducers with regular look-ahead . . . . . . . . . . . . . . . . . . 154
4.5 Generalized syntax directed translators . . . . . . . . . . . . . . . . . . . . 162
4.6 Surface forests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
4.7 Auxiliary concepts and results . . . . . . . . . . . . . . . . . . . . . . . . . 173
4.8 The hierarchies of tree transformations, surface forests and
transformational languages . . . . . . . . . . . . . . . . . . . . . . . . . . 188
4.9 The equivalence of tree transducers . . . . . . . . . . . . . . . . . . . . . . 193
4.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
4.11 Notes and references . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204

Bibliography 207

Index 225

5 APPENDIX 233

NOTES TO THE READER
Within each section, there is one counter which is incremented by each of the environ-
ments definition, lemma, theorem, corollary, and example. The end of a proof or an
example is indicated by the mark ✷. It appears immediately after a theorem, lemma or
corollary if this is not followed by a proof. The references to the literature are by the
author(s) and the number with which the publication occurs in the Bibliography. In a
few cases we refer to a book mentioned at the end of Chapter 1.

1 PRELIMINARIES
In this chapter we shall review some basic concepts and results from the theories of
automata, formal languages, and universal algebras. It is reasonable to assume that
a potential reader of this book already knows something about automata and formal
languages. On the other hand, we do not presuppose any knowledge of universal algebra.
These two assumptions suggested the styles and extents of the following seven sections.
Section 1.1 (Sets, relations and mappings) may be skimmed through for terminology
and notation.
Sections 1.2 and 1.3 present the required universal algebraic concepts and results.
These are not many, but they should be mastered well as the very basic concepts of the
theory of tree automata are defined in terms of universal algebra. We have tried to make
the book self-contained in this respect, but a reader who wants to pursue further the
algebraic aspects of the theory should certainly consult one of the references on universal
algebra.
The lattice theory presented in Section 1.4 is less important here, and the reading of
this section may be postponed until needed.
Sections 1.5, 1.6 and 1.7 survey some of the most essential facts about finite recognizers,
regular languages, context-free grammars, and (generalized) sequential machines. A
reader less familiar with these matters would do well to look up these subjects in some
of the references given at the end of the chapter.

1.1 SETS, RELATIONS AND MAPPINGS


The set theory needed here is very elementary and most of our set theoretic notation is
well-known. However, a few conventions should be pointed out:

(i) A ⊆ B means that the set A is a subset of the set B. Proper inclusion is denoted
by A ⊂ B.

(ii) ∅ denotes the empty set.

(iii) |A| denotes the cardinality of the set A.

(iv) The power set of a set A, i.e., the set of all subsets of A, is denoted by pA.

(v) The union of a family (Ai | i ∈ I) of subsets (indexed by I) of some set is written
as ⋃(Ai | i ∈ I). Similarly, ⋂(Ai | i ∈ I) is the intersection.

(vi) The set {x ∈ A | P1 (x), . . . , Pk (x)} of all elements x in A with the properties P1 ,
. . . , Pk may also be written as {x | P1 (x), . . . , Pk (x)} when A is understood from


the context. We shall use this notation in the following more general form, too.
Suppose f (x1 , . . . , xm ) is an object defined in some way in terms of the objects x1 ,
. . . , xm . Then
{f (x1 , . . . , xm ) | P (x1 , . . . , xm )}
is the set of all such objects constructed from objects x1 , . . . , xm satisfying the
condition P (x1 , . . . , xm ). Furthermore, we use

{f1 (x1 , . . . , xm ), . . . , fk (x1 , . . . , xm ) | P (x1 , . . . , xm )}

as a short form for the union

{f1 (x1 , . . . , xm ) | P (x1 , . . . , xm )} ∪ · · · ∪ {fk (x1 , . . . , xm ) | P (x1 , . . . , xm )}.

(vii) If there is no danger of confusion, we may write simply a for the one-element set
{a}. Of course, we should not write ∅ for {∅}.

Sometimes we employ some notation from logic as abbreviations:

(i) “(∀x ∈ A) P (x)” states that P (x) holds for all x ∈ A.

(ii) “(∃x ∈ A) P (x)” states that there exists an x in A such that P (x) holds.

(iii) “P =⇒ Q” means that Q holds if P holds.

(iv) “P ⇐⇒ Q” states that the conditions P and Q are equivalent, i.e., either both of them
hold or neither one holds.

(v) “P ∧ Q” is the statement that both P and Q hold. Similarly, “P ∨ Q” states that
at least one of P and Q holds.

The numbers dealt with here are always integers and mostly even non-negative integers.
When we write “. . . for all n ≥ 1” we mean, in fact, “. . . for all integers n ≥ 1”. The
set of all integers is denoted by Z, the set of the natural numbers 1, 2, . . . by N, and
the set of all non-negative integers by N0 .
Let A and B be sets and ̺ ⊆ A × B a (binary) relation from A to B. The fact that
(a, b) ∈ ̺ (a ∈ A, b ∈ B) is also expressed by writing a̺b or a ≡ b (̺). The opposite case
may be expressed by a ̸̺ b or by a ≢ b (̺). For any a ∈ A, we put

a̺ = {b ∈ B | a̺b}.

This notation is extended to subsets of A:


A1 ̺ = ⋃(a̺ | a ∈ A1 ) for A1 ⊆ A.

The converse of ̺ is the relation

̺−1 = {(b, a) | (a, b) ∈ ̺} ⊆ B × A.


Obviously,
b̺−1 = {a ∈ A | a̺b}
and
B1 ̺−1 = {a ∈ A | (∃b ∈ B1 )a̺b}
for all b ∈ B and B1 ⊆ B. The domain of ̺ is the subset dom(̺) = B̺−1 of A, and its
range is the subset range(̺) = A̺ of B.
The product or composition of two relations ̺ ⊆ A × B and τ ⊆ B × C is the relation

̺ ◦ τ = {(a, c) | (∃b ∈ B)a̺bτ c} ⊆ A × C.

In this definition we used the short form a̺bτ c to express the fact that a̺b and bτ c.
Often we write ̺τ for ̺ ◦ τ . The product of relations is associative. We note also the
equality (̺ ◦ τ )−1 = τ −1 ◦ ̺−1 .
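To see these definitions at work, the product and the converse of small finite relations can be computed directly; the following Python sketch (function names and the sample relations are ours, chosen only for illustration) does so and checks the last equality on an example.

    # A relation from A to B is represented as a set of pairs (a, b).
    def compose(rho, tau):
        # Product rho o tau = {(a, c) | (a, b) in rho and (b, c) in tau for some b}.
        return {(a, c) for (a, b1) in rho for (b2, c) in tau if b1 == b2}

    def converse(rho):
        # Converse rho^{-1} = {(b, a) | (a, b) in rho}.
        return {(b, a) for (a, b) in rho}

    rho = {(1, 'x'), (2, 'y')}
    tau = {('x', 'u'), ('y', 'v'), ('y', 'w')}
    assert compose(rho, tau) == {(1, 'u'), (2, 'v'), (2, 'w')}
    assert converse(compose(rho, tau)) == compose(converse(tau), converse(rho))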
Consider now (binary) relations on a set A, i.e. subsets of A × A. These include the
diagonal relation δA = {(a, a) | a ∈ A} and the total relation ιA = A × A. For any
relation ̺ on A we define the powers ̺n (n ≥ 0) with respect to the product of relations:

1◦ ̺0 = δA and
2◦ ̺n+1 = ̺n ◦ ̺ for n ≥ 0.

The relation ̺ ⊆ A × A is called

(a) reflexive if δA ⊆ ̺,

(b) symmetric if ̺−1 ⊆ ̺,

(c) antisymmetric if ̺ ∩ ̺−1 ⊆ δA and

(d) transitive if ̺2 ⊆ ̺.

The intersection of any reflexive relations (on a given A) is reflexive, and the intersec-
tion of transitive relations is transitive. Thus there exists for every ̺ ⊆ A × A a unique
minimal reflexive, transitive relation ̺∗ containing ̺. It is called the reflexive, transitive
closure of ̺. One verifies easily that

̺∗ = δA ∪ ̺ ∪ ̺2 ∪ ̺3 ∪ . . . ,

i.e., for any a, b ∈ A we have a̺∗ b iff

a = a1 ̺ a2 ̺ a3 . . . an−1 ̺ an = b

for some n ≥ 1 and a1 , . . . , an ∈ A.
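On a small finite set the closure ̺∗ can be computed by iterating the product of relations until nothing new appears; a brief Python sketch of this idea (the names are ours, purely illustrative):

    def reflexive_transitive_closure(rho, A):
        # The smallest reflexive, transitive relation on A that contains rho.
        closure = {(a, a) for a in A} | set(rho)        # start from delta_A united with rho
        changed = True
        while changed:
            new = {(a, c) for (a, b) in closure for (b2, c) in closure if b == b2}
            changed = not new <= closure
            closure |= new
        return closure

    A = {1, 2, 3, 4}
    rho = {(1, 2), (2, 3)}
    assert reflexive_transitive_closure(rho, A) == \
        {(1, 1), (2, 2), (3, 3), (4, 4), (1, 2), (2, 3), (1, 3)}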


A relation on A is called an equivalence relation on A, if it is reflexive, symmetric
and transitive. The set of all equivalence relations on A is denoted by E(A). Clearly,
δA ∈ E(A) and ιA ∈ E(A). Let ̺ be an equivalence relation on A. The ̺-class (or
the equivalence class modulo ̺) of an element a ∈ A is the set a̺. Obviously, a̺b iff


a̺ = b̺. We shall also write a/̺ for a̺ and extend this notation to subsets A1 ⊆ A
and n-tuples a = (a1 , . . . , an ) of elements of A (n ≥ 1): A1 /̺ = {a/̺ | a ∈ A1 } and
a/̺ = (a1 /̺, . . . , an /̺). The quotient set of A modulo ̺ is A/̺. Obviously, A/̺ is
a partition on A, that is, every element of A belongs to exactly one ̺-class. On the
other hand, every partition on A can be obtained this way as the quotient set from a
unique equivalence relation and there is a natural one-to-one correspondence between
the partitions on A and E(A). The cardinality of A/̺ is called the index of ̺ ∈ E(A).
If |A/̺| is finite, we say that ̺ is of finite index. We say that ̺ ∈ E(A) saturates the
subset H ⊆ A if H̺ = H, i.e., if H is the union of some ̺-classes.
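For finite sets these notions are straightforward to compute; the sketch below (an illustration with ad hoc names and data) forms ̺-classes, the quotient set, and tests whether a relation saturates a subset.

    def eq_class(a, rho):
        # The rho-class a rho of the element a.
        return frozenset(b for (x, b) in rho if x == a)

    def quotient(A, rho):
        # The quotient set A/rho.
        return {eq_class(a, rho) for a in A}

    def saturates(rho, H):
        # rho saturates H iff H rho = H, i.e. H is a union of rho-classes.
        return set().union(*(eq_class(a, rho) for a in H)) == set(H)

    A = {0, 1, 2, 3}
    rho = {(a, b) for a in A for b in A if (a - b) % 2 == 0}     # "same parity"
    assert quotient(A, rho) == {frozenset({0, 2}), frozenset({1, 3})}
    assert saturates(rho, {0, 2}) and not saturates(rho, {0, 1, 2})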
A mapping or a function from a set A to a set B is a triple (A, B, ϕ), where ϕ ⊆ A × B
is a relation such that for every a ∈ A there exists exactly one b ∈ B satisfying aϕb. As
usual we write ϕ : A → B and say that ϕ is a mapping from A to B. If aϕb (a ∈ A,
b ∈ B), b is called the image of a and a an inverse image of b. This is expressed by
writing b = aϕ, b = ϕ(a) or ϕ : a ↦ b. For a subset A1 of A we also use the two notations
A1 ϕ and ϕ(A1 ) for the set {aϕ | a ∈ A1 }. The converse ϕ−1 of ϕ is always defined as
a relation (⊆ B × A), but it is usually not a mapping from B to A. Again, ϕ−1 (B1 )
will sometimes be used instead of B1 ϕ−1 when B1 ⊆ B. Note that dom(ϕ) = A and
range(ϕ) ⊆ B. The set of all mappings from A to B is denoted by B A .
The composition or product of two mappings ϕ : A → B and ψ : B → C is the mapping

ϕψ : A → C

where ϕψ is the product of ϕ and ψ as relations. Clearly, aϕψ = (aϕ)ψ for all a ∈ A.
The restriction of a mapping ϕ : A → B to a subset C of A is the mapping

ϕ|C : C → B

where ϕ|C = ϕ ∩ (C × B). If ψ : C → B is obtained from ϕ : A → B as the restriction


of ϕ to C, i.e., C ⊆ A and ψ = ϕ|C, then we say also that ϕ is an extension of ψ to A.
The kernel ϕϕ−1 of a mapping ϕ : A → B is an equivalence relation on A and
a1 ≡ a2 (ϕϕ−1 ) iff a1 ϕ = a2 ϕ (a1 , a2 ∈ A). On the other hand, one can associate
with every θ ∈ E(A) a mapping

θ ♮ : A → A/θ, a ↦ aθ (a ∈ A)

such that the kernel of θ ♮ is θ. This θ ♮ is called the natural mapping associated with θ.
A mapping ϕ : A → B is called

(i) injective (or an injection), if ϕϕ−1 = δA ,

(ii) surjective (or a surjection), if range(ϕ) = B, and

(iii) bijective (or a bijection), if it is injective and surjective.

If ϕ : A → B is surjective, one says also that ϕ is a mapping of A onto B. It is obvious


that the natural mapping θ ♮ is always surjective (θ ∈ E(A)). The diagonal relation of a
set A defines the identity mapping A → A, a ↦ a (a ∈ A). It is denoted by 1A .


We shall also meet partial mappings, that is, mappings for which the image of some
elements may be undefined. A partial mapping from A to B is defined by a relation
ϕ ⊆ A × B such that |aϕ| ≤ 1 for all a ∈ A. Again, we write ϕ : A → B. If aϕ = ∅, then
we say that ϕ is undefined for a (a ∈ A). The notations and terminology introduced
above for mappings apply to partial mappings, too, although dom(ϕ) may be a proper
subset of A when ϕ : A → B is a partial mapping.
It is convenient to think of the elements of a cartesian product A1 ×· · ·×An as n-tuples
(a1 , . . . , an ) with a1 ∈ A1 , . . . , an ∈ An . We adopt the definition of an ordinal number n
as the set of all ordinals smaller than n: 0 = ∅, 1 = {0}, 2 = {0, 1} etc. and, in general,
n = {0, 1, . . . , n − 1}. Then A1 × · · · × An can also be defined as the set of all mappings
ϕ : n → A1 ∪ · · · ∪ An
such that iϕ ∈ Ai+1 for i = 0, 1, . . . , n − 1. Of course, we may identify such a ϕ with
the n-tuple (0ϕ, 1ϕ, . . . , (n − 1)ϕ). Now the cartesian power An = A × · · · × A (n times)
is the set of all mappings ϕ : n → A. In particular, A0 = {∅} since ∅ is the only mapping
from ∅ to A. Note that the notation An is consistent with our earlier notation B A for
the set of all mappings from A to B.
We shall also need countably infinite sequences of elements. Let ω = {0, 1, 2, . . . } be
the smallest infinite ordinal and A any set. The elements of Aω are called ω-sequences.
Thus an ω-sequence of elements of A is a mapping
ϕ: ω → A
which we may also write as
(0ϕ, 1ϕ, . . . , nϕ, . . . )n<ω .
We conclude the section by considering operations. These are special mappings and
are among the most fundamental concepts of algebra. Let m ≥ 0. An m-ary operation
on a set A is a mapping from Am to A. If ϕ : Am → A is an m-ary operation on A, then
ϕ assigns to every m-tuple (a1 , . . . , am ) of elements of A a unique element of A which
we write as ϕ(a1 , . . . , am ). The number m is called the arity or the rank of ϕ. Most
operations encountered in the usual algebraic systems (groups, rings, lattices etc.) have
rank 0, 1 or 2. A few comments on these special cases:
(i) A 0-ary operation ϕ : {∅} → A is completely determined by its only image ϕ(∅),
and often ϕ is given simply by naming this element. Note that here ∅ may also be
seen as the empty sequence of elements, and often one writes ϕ( ), or just ϕ, for
ϕ(∅).
(ii) When m = 1, we have a mapping from A to itself. Such operations are called
unary.
(iii) An operation of rank 2 is called a binary operation. For example, the addition and
the multiplication in a ring are binary operations. In most such concrete examples
one uses the infix notation for binary operations. Thus it is customary to write the
ring operations in the form a + b and a · b instead of +(a, b) and ·(a, b), respectively.


A partial m-ary operation on a set A is a partial mapping from Am to A. For any


partial m-ary operation ϕ : Am → A and subset B of A we have a partial mapping

ϕ|B : B m → B,

where ϕ|B = ϕ ∩ (B m × B). If ϕ is an operation and B is closed with respect to ϕ,


i.e., ϕ(a1 , . . . , am ) ∈ B whenever a1 , . . . , am ∈ B, then ϕ|B is an m-ary operation on B
called the restriction of ϕ to B. Often the same symbol is used to denote an operation
and its restrictions.
Suppose we are given a set A, k m-ary operations ϕ1 , . . . ,ϕk on A and a k-ary operation
ψ on A (m, k ≥ 0). The composition of ϕ1 , . . . , ϕk with ψ is the m-ary operation
ψ(ϕ1 , . . . , ϕk ) defined so that

ψ(ϕ1 , . . . , ϕk )(a1 , . . . , am ) = ψ(ϕ1 (a1 , . . . , am ), . . . , ϕk (a1 , . . . , am ))

for all a1 , . . . , am ∈ A. Note that the possibilities k = 0 or m = 0 are included. If k = 0,


then the composition is an m-ary operation with the constant image ψ(∅). If m = 0,
then the composition is a 0-ary operation with the single value ψ(ϕ1 (∅), . . . , ϕk (∅)).
Let ϕ be an m-ary operation on a set A and A1 , . . . , Am any subsets of A. Then we
write
ϕ(A1 , . . . , Am ) = {ϕ(a1 , . . . , am ) | a1 ∈ A1 , . . . , am ∈ Am }.
Thus ϕ is extended to an m-ary operation on the power set pA. In general, there is no
need to introduce a new notation for this extension.
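The composition ψ(ϕ1 , . . . , ϕk ) translates directly into a higher-order function; the following Python fragment (purely illustrative, with our own names) mirrors the definition and evaluates one composed operation over the integers.

    def compose_ops(psi, phis):
        # The m-ary operation psi(phi_1, ..., phi_k); each phi_i is m-ary, psi is k-ary.
        def composed(*args):
            return psi(*(phi(*args) for phi in phis))
        return composed

    def add(x, y):
        return x + y

    def mul(x, y):
        return x * y

    f = compose_ops(add, [mul, mul])      # f(a, b) = add(mul(a, b), mul(a, b)) = 2ab
    assert f(3, 4) == 24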

1.2 UNIVERSAL ALGEBRAS


In this and the next section some concepts and results from universal algebra are sur-
veyed. Universal algebra is an extensive field of mathematics, but we need really just
certain basic parts of it. On the other hand, a good grasp of the material of these sections
is essential to an understanding of the rest of the book.
Generally speaking, an algebra (or a universal algebra) is a set together with a set of
operations on this set. There may be a finite or an infinite number of operations, but we
insist that they all are finitary, i.e., the ranks are finite as in the definition of operations
given in the previous section. As a first example we consider the algebra of subsets of
a given set U . In the power set pU we have several naturally defined operations. For
example, there is a binary operation ∪ that forms the union A∪B of any two A, B ∈ pU .
Similarly, we have the binary operation ∩ that forms the intersection of two subsets of U .
A unary operation is obtained if we map every A ∈ pU to its complement Ac = U − A.
Furthermore, we introduce two 0-ary operations, one that has ∅ and one that has U as
its image. Of course, an infinite number of operations could be defined on pU , but if we
restrict ourselves to those defined above, we get the algebra

(pU, ∪, ∩, c , ∅, U )


with two binary, one unary and two 0-ary operations. Note that we get such an algebra
for each set U . In fact, all of these algebras can be viewed as special instances of a
general class of algebras known as Boolean algebras.
The example brings forth an important point. In algebra, and this will be the case
here, too, one is generally not interested just in individual algebras, but rather in whole
classes of algebras. Algebras in such a class are all “similar” in the sense that there
is a natural correspondence between the operations of any two algebras of the class.
Such a correspondence of operations is needed when one defines any concept, such as
homomorphisms or direct products, involving more than one algebra. For example, the
multiplications of any two groups correspond to each other, and a homomorphism of
groups should preserve the multiplication. We shall now introduce a convenient vehicle
to define such a class of similar algebras.

Definition 1.2.1 An operator domain is a set Σ together with a mapping

r : Σ → N0

that assigns to every σ ∈ Σ an arity, or rank, r(σ). For any m ≥ 0,

Σm = {σ ∈ Σ | r(σ) = m}

is the set of the m-ary operators (or operational symbols).

From now on Σ is an operator domain. The mapping r is usually not mentioned, but
we denote by r(Σ) the set of all m ≥ 0 such that Σm ≠ ∅. One can write Σ as the
disjoint union Σ0 ∪ Σ1 ∪ Σ2 ∪ . . . from which the empty sets will be omitted.

Definition 1.2.2 A Σ-algebra A is a pair consisting of a nonempty set A (of elements


of A) and a mapping that assigns to every operator σ ∈ Σ an m-ary operation

σ A : Am → A,

where m is the arity of σ. The operation σ A is called the realization of σ in A. The


mapping σ ↦ σ A will not be mentioned explicitly, but we write A = (A, Σ). The
Σ-algebra A is finite if A is finite, and it is of finite type if Σ is finite. When Σ is not
specified, or not emphasized, we speak simply about “algebras”. An algebra with just
one element is called trivial.
In general, A = (A, Σ), B = (B, Σ) and C = (C, Σ), possibly equipped with subscripts,
will be Σ-algebras. The realizations of an operator σ ∈ Σ in these algebras are denoted
by σ A , σ B and σ C , respectively.

In the previous example of subset algebras we would have Σ = Σ0 ∪ Σ1 ∪ Σ2 with


(for example) Σ0 = {0, 1}, Σ1 = {¬} and Σ2 = {∧, ∨}. The algebra of the subsets of a
set U is then the Σ-algebra A, where A = pU and the operators are realized as follows:
0A = ∅, 1A = U , ¬A = c (complement in U ), ∧A = ∩ (intersection) and ∨A = ∪ (union).


Note that the possibility m = 0 is not excluded when we consider generally an m-ary
operation. For σ ∈ Σ0 one often writes σ A instead of σ A ( ) or σ A (∅) (this involves the
harmless confusion of a 0-ary operation and its value). When Σ = {σ1 , . . . , σk } is finite,
one usually writes A = (A, σ1 , . . . , σk ) instead of A = (A, Σ).
We introduce now several concepts related to algebras.

Definition 1.2.3 The Σ-algebra B is a subalgebra of the Σ-algebra A if B ⊆ A and


σ B = σ A |B for all σ ∈ Σ.

If B is a subalgebra of A, then B is a closed subset of A, i.e., σ A (b1 , . . . , bm ) ∈ B for all


σ ∈ Σm (m ≥ 0) and b1 , . . . , bm ∈ B. For every nonempty closed subset B of A, there is
exactly one way to realize the operators on B in such a way that we get a subalgebra B
of A: obviously every σ B should be the restriction σ A |B of the corresponding operation
of A to B. Hence, a subalgebra is completely determined by its set of elements and one
may call this subset a subalgebra. If σ is a 0-ary operator, then every subalgebra of A
contains the element σ A . If Σ0 is empty, then ∅ is a closed subset, but we do not count
it among the subalgebras.
It is easy to see that the intersection of any family of closed subsets of a given algebra A
is again closed. Thus we have for any H ⊆ A a unique minimal closed subset containing
H:

[H] = ⋂(B | H ⊆ B ⊆ A, B closed).

If H ≠ ∅ or Σ0 ≠ ∅, then [H] is also nonempty and thus a subalgebra. It is called
the subalgebra generated by H. If Σ0 = ∅, then [∅] = ∅. A generating set of A is a
subset H ⊆ A such that [H] = A and A is said to be finitely generated if it has a finite
generating set. It is clear that every finite algebra is finitely generated.

Definition 1.2.4 A homomorphism from a Σ-algebra A to a Σ-algebra B is a mapping


ϕ : A → B such that for all m ≥ 0, σ ∈ Σm and a1 , . . . , am ∈ A,

σ A (a1 , . . . , am )ϕ = σ B (a1 ϕ, . . . , am ϕ).

We write then ϕ : A → B. This homomorphism is called


(a) an epimorphism, if ϕ is surjective,

(b) a monomorphism, if ϕ is injective, and

(c) an isomorphism, if ϕ is bijective.

If there exists an epimorphism from A to B, then B is said to be an epimorphic image


of A. A monomorphism is also called an embedding. If there is an isomorphism from A
to B, then A and B are isomorphic and we write A ≅ B. Homomorphisms are often also
called morphisms.
If A ≅ B, then A and B are the same algebra from the abstract point of view. An
easy computation shows that the composition ϕψ of two homomorphisms ϕ : A → B
and ψ : B → C is a homomorphism from A to C.


A homomorphism is a mapping that is compatible with the operations of the algebras.


For example, let A = (Z, +) be the algebra of the integers with the usual addition as the
only operation, n ≥ 1 and B = (Zn , +) the algebra where Zn = {0, 1, . . . , n − 1} and the
sum is formed modulo n. Then the mapping ϕ : Z → Zn that maps every a ∈ Z to its
remainder rn (a) modulo n (0 ≤ rn (a) < n) is an epimorphism from A to B. Of course,
the homomorphisms defined in group theory, lattice theory etc. provide further general
examples.
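For this particular example the homomorphism condition can also be checked mechanically on a finite sample of integers; a short illustrative Python sketch (n = 5 is an arbitrary choice):

    n = 5

    def phi(a):
        # The remainder mapping phi : Z -> Z_n, a |-> r_n(a).
        return a % n

    # phi(a + b) = phi(a) +_n phi(b), where +_n is addition modulo n:
    for a in range(-20, 20):
        for b in range(-20, 20):
            assert phi(a + b) == (phi(a) + phi(b)) % n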
The proof of the following lemma is straightforward and thus it is omitted.

Lemma 1.2.5 Let ϕ : A → B be a homomorphism. If C is a subalgebra of A, then Cϕ


is a subalgebra of B. If D is a subalgebra of B and Dϕ−1 is nonempty, then Dϕ−1 is a
subalgebra of A. ✷

The following lemma contains an important observation.

Lemma 1.2.6 Let ϕ : A → B and ψ : A → B be two homomorphisms and H a generat-


ing set of A. If ϕ|H = ψ|H, then ϕ = ψ. In other words, a homomorphism is completely
determined by its restriction to a generating set.

Proof. Let C = {a ∈ A | aϕ = aψ}. Then H ⊆ C by the assumption. If m ≥ 0, σ ∈ Σm


and a1 , . . . , am ∈ C, then σ A (a1 , . . . , am ) ∈ C:

σ A (a1 , . . . , am )ϕ = σ B (a1 ϕ, . . . , am ϕ) = σ B (a1 ψ, . . . , am ψ) = σ A (a1 , . . . , am )ψ.

Hence C is closed and we get C = A. This implies ϕ = ψ. ✷

We define now two concepts closely related to homomorphisms, namely congruences


and quotient algebras.

Definition 1.2.7 A congruence (relation) of A is an equivalence relation on A which is


invariant with respect to all operations σ A (σ ∈ Σ). A relation ̺ ⊆ A × A is said to be
invariant with respect to an m-ary operation f : Am → A if

f (a1 , . . . , am ) ≡ f (b1 , . . . , bm ) (̺)

for all elements a1 , . . . , am , b1 , . . . , bm ∈ A such that

a1 ≡ b1 , . . . , am ≡ bm (̺).

The set of all congruences of an algebra A is denoted by C(A).

Every algebra A has at least the trivial congruences δA and ιA . For ̺ ∈ C(A), the
̺-class a̺ of an element a ∈ A is also called a congruence class (modulo ̺). The partition
A/̺ of A defined by the congruence classes is compatible in the sense that for all m ≥ 0,
σ ∈ Σm and a1 ̺, . . . , am ̺ ∈ A/̺ there is a class a̺ such that

σ A (a1 ̺, . . . , am ̺) ⊆ a̺.


Obviously, we can choose a = σ A (a1 , . . . , am ). It is also easy to see that an equivalence


relation ̺ ∈ E(A) is a congruence of A only in case A/̺ is a compatible partition. In
fact, in automata theory it is usual to deal with compatible partitions (also called SP
partitions) rather than with congruences, but both concepts convey the same idea.
The fact that A/̺ is a compatible partition for any ̺ ∈ C(A) also justifies the following
definition; the operations are well-defined.

Definition 1.2.8 The quotient algebra A/̺ = (A/̺, Σ) of a Σ-algebra A by a congru-


ence ̺ ∈ C(A) is defined as follows. For any m ≥ 0, σ ∈ Σm and a1 , . . . , am ∈ A we
put
σ A/̺ (a1 ̺, . . . , am ̺) = σ A (a1 , . . . , am )̺.

The definition of σ A/̺ may be explained as follows. To compute σ A/̺ (a1 ̺, . . . , am ̺)


one takes a representative from each of the ̺-classes, say a1 , . . . , am , computes σ A for
the representatives and forms then the ̺-class of the resulting element.
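For a finite algebra and a given congruence this recipe can be carried out, and the independence of the choice of representatives checked, by a small computation. The sketch below (an illustration only) uses the algebra (Z6 , +) with addition modulo 6 and the congruence "same parity".

    from itertools import product

    A = range(6)                           # the set Z_6 = {0, 1, ..., 5}

    def sigma(x, y):
        return (x + y) % 6                 # sigma^A: addition modulo 6

    def rho_class(x):
        # The rho-class of x under the congruence "same parity".
        return frozenset(y for y in A if (x - y) % 2 == 0)

    def sigma_quotient(c1, c2):
        # sigma^{A/rho}(c1, c2): apply sigma to representatives, then take the class.
        values = {rho_class(sigma(x, y)) for x, y in product(c1, c2)}
        assert len(values) == 1            # well-defined: independent of the representatives
        return next(iter(values))

    evens, odds = rho_class(0), rho_class(1)
    assert sigma_quotient(odds, odds) == evens
    assert sigma_quotient(evens, odds) == odds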
Homomorphisms, congruences and quotient algebras are closely related to each other
as the following three theorems show.

Theorem 1.2.9 For any ̺ ∈ C(A), the natural mapping ̺♮ : a ↦ a̺ is an epimorphism


A → A/̺ (the natural homomorphism).

Proof. We know that ̺♮ is a surjection from A to A/̺ so it suffices to verify that it is


a homomorphism: for all m ≥ 0, σ ∈ Σm and a1 , . . . , am ∈ A,

σ A (a1 , . . . , am )̺♮ = σ A (a1 , . . . , am )̺ = σ A/̺ (a1 ̺, . . . , am ̺) = σ A/̺ (a1 ̺♮ , . . . , am ̺♮ ). ✷

Theorem 1.2.10 The kernel ϕϕ−1 of any homomorphism ϕ : A → B is a congruence


of A.

Proof. Consider any m ≥ 0, σ ∈ Σm and elements a1 , . . . , am , a′1 , . . . , a′m ∈ A such


that
a1 ≡ a′1 , . . . , am ≡ a′m (ϕϕ−1 ).
Then a1 ϕ = a′1 ϕ, . . . , am ϕ = a′m ϕ, which implies σ A (a1 , . . . , am )ϕ = σ B (a1 ϕ, . . . , am ϕ) =
σ B (a′1 ϕ, . . . , a′m ϕ) = σ A (a′1 , . . . , a′m )ϕ. This means that σ A (a1 , . . . , am ) ≡ σ A (a′1 , . . . , a′m )
(ϕϕ−1 ) as required. ✷

Theorem 1.2.11 Every epimorphic image of an algebra A is isomorphic to some quo-


tient algebra of A.

Proof. Let ϕ : A → B be an epimorphism and θ = ϕϕ−1 its kernel. We claim that
B ≅ A/θ. The required isomorphism A/θ → B is shown to be given by

ψ : aθ ↦ aϕ (a ∈ A).


For any a1 , a2 ∈ A,
a1 θψ = a2 θψ iff a1 ϕ = a2 ϕ
iff a1 ≡ a2 (θ).
This shows that ψ is well-defined (i.e., aθψ is independent of the choice of the represen-
tative a ∈ A of the θ-class aθ) and injective. Since ϕ is surjective, it is clear that ψ is
surjective, too. It remains to be shown that ψ is a homomorphism. Let m ≥ 0, σ ∈ Σm
and a1 , . . . , am ∈ A. Then
σ A/θ (a1 θ, . . . , am θ)ψ = σ A (a1 , . . . , am )θψ
= σ A (a1 , . . . , am )ϕ
= σ B (a1 ϕ, . . . , am ϕ)
= σ B (a1 θψ, . . . , am θψ). ✷

Taken together, Theorems 1.2.9 and 1.2.11 say that the epimorphic images of an algebra
are exactly its quotient algebras (when one does not distinguish between isomorphic
algebras).
Next, direct products of algebras are introduced. We may restrict ourselves to the case
of a finite number of factors.
Definition 1.2.12 The direct product of two Σ-algebras A and B is the Σ-algebra
A × B = (A × B, Σ),
where the operations are defined so that
σ A×B ((a1 , b1 ), . . . , (am , bm )) = (σ A (a1 , . . . , am ), σ B (b1 , . . . , bm ))
for all m ≥ 0, σ ∈ Σm and (a1 , b1 ), . . . , (am , bm ) ∈ A × B. The kth (k ≥ 0) direct power
Ak of the Σ-algebra A is defined inductively:
(i) A0 = ({∅}, Σ) is the trivial Σ-algebra.
(ii) Ak+1 = Ak × A for all k ≥ 0.
It is easy to see that direct products are associative in the sense that (A × B) × C ≅
A × (B × C) for all A, B and C. Both of these products can be written simply as
A × B × C and their elements may be identified with the triples (a, b, c) with a ∈ A,
b ∈ B and c ∈ C. More generally, one can define the direct product A1 × · · · × Ak of k
(k ≥ 0) Σ-algebras as an algebra with A1 × · · · × Ak as its set of elements and operations
performed componentwise. It is easy to see that the projections
πi : A1 × · · · × Ak → Ai , (a1 , . . . , ak ) ↦ ai
(i = 1, . . . , k) are epimorphisms from A1 × · · · × Ak to the respective factor algebras Ai .
Hence, every factor in a direct product is an epimorphic image of the direct product.
We shall also need the following, perhaps, less usual, way to construct a new algebra
from a given one.


Definition 1.2.13 The subset algebra (or power algebra) pA = (pA, Σ) of a Σ-algebra
A is defined as follows. If m ≥ 0, σ ∈ Σm and H1 , . . . , Hm ∈ pA, then put

σ pA (H1 , . . . , Hm ) = σ A (H1 , . . . , Hm ).

Note that the singleton sets {a} (a ∈ A) form in pA a subalgebra isomorphic to A. If


Σ0 = ∅, pA has the trivial subalgebra {∅}.
We conclude this section with a simple example illustrating these constructions.

Example 1.2.14 Suppose Σ consists of one binary operator σ and a nullary operator
γ. Let A = ({a, b}, Σ) be a Σ-algebra such that γ A = a and σ A (a, a) = σ A (a, b) =
σ A (b, a) = a, σ A (b, b) = b. Consider first the direct power A2 = A × A. If we write aa
for (a, a) etc., then γ A×A = aa and σ A×A is given by the following multiplication table:

σ A×A aa ab ba bb
aa aa aa aa aa
ab aa ab aa ab
ba aa aa ba ba
bb aa ab ba bb

Let us now construct the subset algebra. The value of the 0-ary operation is γ pA = {a}
and the operation σ pA is given by the table below.

σ pA ∅ {a} {b} {a, b}


∅ ∅ ∅ ∅ ∅
{a} ∅ {a} {a} {a}
{b} ∅ {a} {b} {a, b}
{a, b} ∅ {a} {a, b} {a, b}
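The second table can be generated mechanically from Definition 1.2.13; the following Python fragment (an illustration, with the operation table written out by hand) computes σ pA from σ A and reproduces a few of its entries.

    # sigma^A of Example 1.2.14 as a table:
    sigma_A = {('a', 'a'): 'a', ('a', 'b'): 'a', ('b', 'a'): 'a', ('b', 'b'): 'b'}

    def sigma_pA(H1, H2):
        # sigma^{pA}(H1, H2) = { sigma^A(x, y) | x in H1, y in H2 }.
        return frozenset(sigma_A[(x, y)] for x in H1 for y in H2)

    assert sigma_pA(frozenset(), frozenset({'a', 'b'})) == frozenset()
    assert sigma_pA(frozenset({'b'}), frozenset({'a', 'b'})) == frozenset({'a', 'b'})
    assert sigma_pA(frozenset({'a', 'b'}), frozenset({'b'})) == frozenset({'a', 'b'})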

1.3 TERMS, POLYNOMIAL FUNCTIONS AND FREE


ALGEBRAS
The concepts “term” and “polynomial function” are all-important in our modelling of
the theory of tree automata. Let us consider an introductory example. An expression
like (x + y)(y + z), such expressions are called terms, represents in a natural manner a
function of the three variables x, y, and z. Two things should be pointed out here. First
of all, the term defines such a function in any algebra with operations denoted by the
operators appearing in the term. In our case it could define, for example, a mapping
Z3 → Z or a mapping R3 → R depending on whether the addition and multiplication
are interpreted as those of integers or those of real numbers. Generally speaking, the
terms are determined by the operator domain, but they define operations in all algebras
with that operator domain. Secondly, we note that the term not only defines a function,
but it also describes a way to compute its values from the values of the variables once


the operations of the algebra in question are known. In fact, algebras can be viewed as
devices that evaluate terms. When we interpret (in Chapter 2) terms as trees, the step
from algebras to tree automata is not long.
From now on, X will be a set disjoint from the operator domain Σ. The elements of
X are called variables. Other symbols used for sets of variables are Y and Z.

Definition 1.3.1 The set FΣ (X) of Σ-terms in X, or ΣX-terms for short, is defined as
follows:

(i) X ⊆ FΣ (X),

(ii) σ(t1 , . . . , tm ) ∈ FΣ (X) whenever m ≥ 0, σ ∈ Σm and t1 , . . . , tm ∈ FΣ (X), and

(iii) every ΣX-term can be obtained by applying the rules (i) and (ii) a finite number
of times.

If σ is a 0-ary operator, then we get by rule (ii) the ΣX-term σ( ). It is convenient


to write just σ for such a term. Then the definition of FΣ (X) may be reformulated as
follows.

Definition 1.3.1’ The set FΣ (X) of ΣX-terms is defined as follows:

(i) X ∪ Σ0 ⊆ FΣ (X),

(ii) σ(t1 , . . . , tm ) ∈ FΣ (X) whenever m > 0, σ ∈ Σm and t1 , . . . , tm ∈ FΣ (X), and

(iii) every ΣX-term can be obtained by applying the rules (i) and (ii) a finite number
of times.

When Σ and X are unspecified or unemphasized, we shall speak simply about terms.
The inductive definition of FΣ (X) suggests a useful method to deal with terms. It could
be called term induction. If we want to define a property or quantity c(t) for every
ΣX-term t, it suffices

(i) to define c(t) for all t ∈ X, and then

(ii) to give a rule how to determine c(σ(t1 , . . . , tm )) in terms of σ (∈ Σm ) and c(t1 ), . . . , c(tm ) (m ≥ 0).

Sometimes the variation suggested by Definition 1.3.1’ is more convenient: in (i) one
defines c(t) for t ∈ Σ0 , too, but in (ii) one can then restrict oneself to values m > 0.
Proofs by term induction can be modelled according to the same pattern.
Note that FΣ (X) is empty iff Σ0 = X = ∅. Since we do not want to consider this
uninteresting case separately every time, we shall tacitly assume that always Σ0 ∪ X ≠ ∅.

Example 1.3.2 Let Σ = Σ0 ∪ Σ1 ∪ Σ2 , where Σ0 = {µ}, Σ1 = {τ } and Σ2 = {σ}.


If X = {x, y, z}, then x, z, µ, τ (z), τ (µ), σ(z, τ (µ)) and t = σ(x, σ(z, τ (µ))) are some
examples of ΣX-terms. ✷


A ΣX-term t is evaluated in a given Σ-algebra as follows. First we assign a value


xα ∈ A to every variable x ∈ X. Then the operations of A are applied to these elements
as indicated by the form of t. For example, given a mapping α : X → A, the t of the
previous example would yield the element

σ A (xα, σ A (zα, τ A (µA ))).

Of course, the result depends on the choice of α, too. This evaluation process can be
formalized as follows.

Definition 1.3.3 With every Σ-algebra A and ΣX-term t we associate a mapping

tA : AX → A

as follows: for any α : X → A

(i) xA (α) = xα (x ∈ X) and

(ii) tA (α) = σ A (t1A (α), . . . , tmA (α)) when t = σ(t1 , . . . , tm ) (m ≥ 0, σ ∈ Σm , t1 , . . . , tm ∈ FΣ (X)).

The mappings tA are called the polynomial functions of A in variables X and their set is denoted by PX (A).

It may seem strange that the polynomial functions tA ∈ PX (A) are evaluated on
mappings from X to A, but this is, in fact, just a modification of the usual way to
express polynomial functions. When one writes the value of a polynomial function in
the form p(a1 , . . . , an ), a given order of the variables is assumed, say X = {x1 , . . . , xn },
and the n-tuple (a1 , . . . , an ) is just a convenient way to give the mapping α : X → A
such that xi α = ai (i = 1, . . . , n).
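Definition 1.3.3 is precisely a recipe for evaluating a term by structural recursion, and it translates into a short program. In the Python sketch below (the term encoding and the interpreting algebra are our own choices) the term t = σ(x, σ(z, τ (µ))) of Example 1.3.2 is evaluated in the algebra (Z, +, −, 0), i.e. σ is interpreted as addition, τ as negation and µ as 0.

    # A term is a variable (a string) or a tuple (operator, subterm_1, ..., subterm_m).
    def evaluate(t, ops, alpha):
        # The value t^A(alpha); ops realizes each operator, alpha assigns values to variables.
        if isinstance(t, str) and t in alpha:                         # case (i): a variable
            return alpha[t]
        op, *subterms = t
        return ops[op](*(evaluate(s, ops, alpha) for s in subterms))  # case (ii)

    ops = {'sigma': lambda a, b: a + b, 'tau': lambda a: -a, 'mu': lambda: 0}
    t = ('sigma', 'x', ('sigma', 'z', ('tau', ('mu',))))
    assert evaluate(t, ops, {'x': 3, 'z': 5}) == 8                    # 3 + (5 + (-0)) = 8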
In a sense, the polynomial functions of an algebra A are the operations one can derive
by composition from the basic operations σ A (σ ∈ Σ) of A, and they share many
properties with these. This is exemplified by the following four lemmas.

Lemma 1.3.4 If B is a subalgebra of the Σ-algebra A and α : X → A a mapping such


that Xα ⊆ B, then tA (α) ∈ B for all t ∈ FΣ (X). ✷

The lemma states, in other words, that subalgebras are closed with respect to polyno-
mial functions. The proof is a simple exercise in term induction quite similar to that of
the next lemma which expresses formally the fact that congruences are invariant with
respect to polynomial functions.

Lemma 1.3.5 Let θ be a congruence of the Σ-algebra A and α : X → A, β : X → A


two mappings such that
xα ≡ xβ (θ) for all x ∈ X.
Then tA (α) ≡ tA (β) (θ) for all t ∈ FΣ (X).


Proof. We proceed by term induction on t. If t = x ∈ X, then

tA (α) = xα ≡ xβ = tA (β) (θ).

Let t = σ(t1 , . . . , tm ) and suppose

tiA (α) ≡ tiA (β) (θ) for all i = 1, . . . , m.

Then also

tA (α) = σ A (t1A (α), . . . , tmA (α)) ≡ σ A (t1A (β), . . . , tmA (β)) = tA (β) (θ)

as θ is a congruence. Here the possibility m = 0 can be allowed as a trivial special case.


Lemma 1.3.6 Let ϕ : A → B be a homomorphism of Σ-algebras. Then

tA (α)ϕ = tB (αϕ)

for each mapping α : X → A and each ΣX-term t. ✷

Lemma 1.3.7 Let A and B be Σ-algebras, and α : X → A and β : X → B any map-


pings. If we define a mapping γ : X → A × B by putting

xγ = (xα, xβ) for all x ∈ X,

then
tA×B (γ) = (tA (α), tB (β)) for all t ∈ FΣ (X). ✷

Lemmas 1.3.6 and 1.3.7 can easily be verified by term induction.


The subalgebra generated by a subset can also be described in terms of polynomial
functions.

Lemma 1.3.8 For any subset X of a Σ-algebra A we have [X] = {tA (αX ) | t ∈ FΣ (X)},
where αX = 1A |X, i.e., αX is the mapping from X to A such that xαX = x for all x ∈ X.

Proof. Denote {tA (αX ) | t ∈ FΣ (X)} by C. For every x ∈ X, x = xαX = xA (αX ) ∈ C.


Hence X ⊆ C. Also, C is closed under the operations of A:

σ A (t1A (αX ), . . . , tmA (αX )) = σ(t1 , . . . , tm )A (αX ) ∈ C

for all m ≥ 0, σ ∈ Σm and t1 , . . . , tm ∈ FΣ (X). Lemma 1.3.4 implies that C ⊆ B for


every subalgebra B which contains X. Hence C = [X]. Note that the result is true even
if Σ0 = X = ∅. In this case [X] = ∅. ✷

We shall now turn to the Σ-algebra formed by the ΣX-terms.


Definition 1.3.9 The Σ-algebra FΣ (X) = (FΣ (X), Σ) defined so that

σ FΣ (X) (t1 , . . . , tm ) = σ(t1 , . . . , tm )

for all m ≥ 0, σ ∈ Σm and t1 , . . . , tm ∈ FΣ (X), is called the ΣX-term algebra or the


free Σ-algebra generated by X.

We shall first account for the name “free algebra”.

Definition 1.3.10 Let K be a class of Σ-algebras. A Σ-algebra F = (F, Σ) is said to


be freely generated over K by a subset X ⊆ F , if the following conditions are satisfied:

(i) F ∈ K.

(ii) X generates F.

(iii) Every mapping α : X → A of X into any algebra A in K has an extension to a homomorphism α̂ : F → A.

If these conditions are satisfied for some subset X of F , then F is called a free algebra
over K (with |X| generators), and X is called a free generating set.

A well-known example is provided by the free semigroup X + generated by a set (al-


phabet) X. The elements of X + are all the finite nonempty strings of elements of X.
The product of two such strings u and v is simply their concatenation uv. The associa-
tivity of this product is obvious and thus X + is a semigroup. As every string u ∈ X +
is obtained by concatenating individual elements of X, it is clear that X generates X + .
To prove that X + is freely generated by X over the class of all semigroups we consider
any semigroup S and mapping α : X → S. The required (unique) homomorphism

α̂ : X + → S

is obtained by putting

(x1 x2 . . . xk )α̂ = (x1 α) · (x2 α) · . . . · (xk α)

for all x1 x2 . . . xk ∈ X + (products to the right are formed in S).
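For a concrete semigroup the extension α̂ can be written in one line; the Python fragment below (an illustration, with S = (N, +) and an arbitrary α) also makes the uniqueness plausible: the value of α̂ on a string is forced by the values of α on its letters.

    from functools import reduce

    def extend(alpha, product):
        # The homomorphic extension of alpha : X -> S to nonempty words over X.
        return lambda word: reduce(product, (alpha[x] for x in word))

    alpha_hat = extend({'a': 1, 'b': 2}, lambda s, t: s + t)   # X = {a, b}, S = (N, +)
    assert alpha_hat('abba') == 1 + 2 + 2 + 1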


Free semigroups are considered later again, but we return now to our term algebras.

Theorem 1.3.11 The ΣX-term algebra FΣ (X) is freely generated by X over the class
of all Σ-algebras.

Proof. That X generates FΣ (X) is quite obvious when we compare the definitions of
FΣ (X) and FΣ (X), but it follows also from the useful observation that

tFΣ (X) (αX ) = t for all t ∈ FΣ (X) (*)


(where αX = 1FΣ (X) |X). The proof of (*) goes again by term induction. Let A be any
Σ-algebra and α : X → A any mapping. We claim that the mapping

α̂ : FΣ (X) → A, t ↦ tA (α) (t ∈ FΣ (X))

is the required homomorphism. For every x ∈ X, xα̂ = xA (α) = xα. Hence, α̂|X = α.
It remains to be verified that α̂ is a homomorphism. Indeed,

σ FΣ (X) (t1 , . . . , tm )α̂ = σ(t1 , . . . , tm )A (α)
= σ A (t1A (α), . . . , tmA (α))
= σ A (t1 α̂, . . . , tm α̂)

for all m ≥ 0, σ ∈ Σm and t1 , . . . , tm ∈ FΣ (X). ✷

We add a few general comments on free algebras. First of all, one should note that
the homomorphic extension α̂ : F → A of a mapping α : X → A (A ∈ K) is unique.
This follows from Lemma 1.2.6. Free algebras over a given class do not always exist,
but when they do, they are determined up to isomorphism by the cardinality of the free
generating set. This is stated formally in the following lemma.

Lemma 1.3.12 Any two algebras freely generated over the same class of algebras by
sets of the same cardinality are isomorphic.

Proof. Suppose A and B both are free over the same class K and that they have free
generating sets X and Y , respectively, such that |X| = |Y |. Then there is a bijection
α : X → Y . The converse of it, β = α−1 , defines a bijection from Y to X. Now there
exist morphisms

α̂ : A → B and β̂ : B → A

such that α̂|X = α and β̂|Y = β. But then

α̂β̂ : A → A and β̂α̂ : B → B

are homomorphisms such that α̂β̂|X = 1X and β̂α̂|Y = 1Y . This means by Lemma 1.2.6
that α̂β̂ = 1A and β̂α̂ = 1B . Hence, α̂ and β̂ are isomorphisms inverse to each other.
This implies A ≅ B. ✷

Lemma 1.3.12 allows us to speak about the algebra freely generated over a class K by
a set X.
We shall fix the notation α̂ used above for the rest of the book: for any A and α : X → A,
α̂ : FΣ (X) → A is the homomorphism such that α̂|X = α. To evaluate a ΣX-term t in a
Σ-algebra A for a given assignment α : X → A of values to the variables amounts to the
computation of tα̂. Indeed, we showed in the proof of Theorem 1.3.11 that tA (α) = tα̂
for all A, α and t.
The polynomial functions in variables X of an algebra A are the mappings one can get
from the “projections” xA (x ∈ X) by iterated compositions with the basic operations


σ A (σ ∈ Σ). If the generating set of functions is enlarged by the set of all constant
mappings (c ∈ A)
γc : AX → A, α ↦ c (α ∈ AX ),
then we get, in general, a larger class of functions. These are called algebraic functions.
We shall need just the unary (i.e., 1-place) algebraic functions and these only are defined
below. In this special case X is a singleton {x} and we may identify any mapping
α : X → A with the element xα ∈ A. Then the unary algebraic functions can be defined
simply as certain mappings from A to A.

Definition 1.3.13 The set of unary algebraic functions Alg1 (A) of a Σ-algebra A is
defined as follows:
(i) 1A ∈ Alg1 (A).
(ii) For every c ∈ A, Alg1 (A) contains the constant mapping γc : A → A, a ↦ c
(a ∈ A).
(iii) The composition σ A (f1 , . . . , fm ) is in Alg1 (A) whenever m ≥ 0, σ ∈ Σm and f1 ,
. . . , fm ∈ Alg1 (A).
(iv) All members of Alg1 (A) are obtained by the rules (i)–(iii).

The constant mapping γc (c ∈ A) is usually denoted simply by c. It is intuitively


clear from Definition 1.3.13 that every f ∈ Alg1 (A) can be represented by an expression
similar to the terms that gave the polynomial functions. Let X = A ∪ {x} (x ∉ A).
Following the inductive form of Definition 1.3.13 we associate with every f ∈ Alg1 (A) a
ΣX-term tf as follows:
(i) t1A = x.
(ii) tc = c for all c (= γc ) (c ∈ A).
(iii) If f = σ A (f1 , . . . , fm ), then tf = σ(tf1 , . . . , tfm ).
It is now an easy task to verify that the following lemma holds.

Lemma 1.3.14 For every f ∈ Alg1 (A) there exists a term tf ∈ FΣ (A ∪ x) such that,
for all a ∈ A,
f (a) = tfA (αa )

when αa is the mapping such that αa |A = 1A and xαa = a. ✷

The assignment αa depends on a ∈ A only. We may think of tf as a ΣX-term for a


suitable X, in which all variables, save x, have been assigned constant values from A.
In other words, the unary algebraic functions are obtained from polynomial functions
by fixing the values of some variables. It is now obvious, in view of Lemma 1.3.5, that
congruences of A are invariant with respect to unary algebraic functions. The converse
of this observation holds also. In fact, it can be stated in a stronger form in terms of the
special unary algebraic functions introduced in the following definition.


Definition 1.3.15 A mapping f : A → A is called an elementary translation of the Σ-


algebra A, if there exist an m > 0, a σ ∈ Σm , a j (1 ≤ j ≤ m) and elements c1 , . . . ,
cj−1 , cj+1 , . . . , cm ∈ A such that

f (a) = σ A (c1 , . . . , cj−1 , a, cj+1 , . . . , cm ) for all a ∈ A.

The set of all elementary translations of A is denoted by ET(A).

It is obvious that ET(A) ⊆ Alg1 (A).

Lemma 1.3.16 An equivalence relation θ ∈ E(A) is a congruence of A iff θ is invariant


with respect to all elementary translations of A.

Proof. Suppose a ≡ b (θ) implies f (a) ≡ f (b) (θ) for all a, b ∈ A and f ∈ ET(A).
Consider any m > 0, σ ∈ Σm and elements a1 , . . . , am , b1 , . . . , bm ∈ A such that
a1 ≡ b1 , . . . , am ≡ bm (θ). Define the following m elementary translations:

fj (ξ) = σ A (b1 , . . . , bj−1 , ξ, aj+1 , . . . , am ) (j = 1, . . . , m).

Then

σ A (a1 , a2 , . . . , am ) = f1 (a1 ) ≡ f1 (b1 ) (θ)
= f2 (a2 ) ≡ f2 (b2 ) (θ)
...
= fm (am ) ≡ fm (bm ) (θ)
= σ A (b1 , b2 , . . . , bm ).

Hence σ A (a1 , . . . , am )θσ A (b1 , . . . , bm ) and we have verified that θ ∈ C(A). The converse
is obvious. ✷

1.4 LATTICES
We shall need a few facts from lattice theory, and these are quickly surveyed here.

Definition 1.4.1 Let A be a set. A relation ̺ ⊆ A × A is called a partial ordering of


A, if

(1) δA ⊆ ̺ (̺ is reflexive),

(2) ̺ ∩ ̺−1 ⊆ δA (̺ is antisymmetric), and

(3) ̺̺ ⊆ ̺ (̺ is transitive).

If ̺ is a partial ordering of A, then (A, ̺) is called a poset.


The usual symbol for a partial ordering is ≤. Often a set A is called a poset when a
certain partial ordering of A is understood.
An example of a poset is (pS, ⊆), where S is a set and ⊆ the usual subset relation
in the power set pS. Another simple example is (N, ≤) where ≤ is the “less than or
equal” -relation of natural numbers. This ≤ is a total ordering, which means that any
two elements of the poset are comparable, i.e., either a ≤ b or b ≤ a holds for any two
elements a and b. A poset (A, ≤) in which ≤ is a total ordering is called a chain.
Let (A, ≤) be a poset and a, b ∈ A. We may write a ≥ b when b ≤ a, a < b when
a ≤ b and a 6= b, and a > b when a ≥ b and a 6= b. Clearly ≥ is a partial ordering and
the poset (A, ≥) is said to be dual to (A, ≤). Each one of the relations ≥, < and >
determines ≤ completely.
An element a ∈ A is an upper bound of a subset H ⊆ A if b ≤ a for all b ∈ H. An
upper bound a of H ⊆ A is the least upper bound, or the supremum, of H, if a ≤ c
for all upper bounds c of H. Lower bounds and greatest lower bounds (infimums) are
defined similarly. The least upper bound and the greatest lower bound of a subset H
are denoted, respectively, by ⋁H and ⋀H. In case of an indexed family (ai | i ∈ I) of
elements the notations ⋁(ai | i ∈ I) and ⋀(ai | i ∈ I) may be used.
An element c ∈ A is a zero element of the poset A if c ≤ a for every a ∈ A. If a
poset has a zero element, it is unique and usually it is denoted by 0. Similarly, the unit
element 1 is defined by the condition that a ≤ 1 for all a ∈ A. Clearly, ⋀A exists iff the
poset has a zero element 0, and then ⋀A = 0. Similarly, ⋁A exists, and then equals 1,
iff A has a unit element 1.
Definition 1.4.2 A poset (A, ≤) is a lattice, if ⋁{a, b} and ⋀{a, b} exist for all a, b ∈ A.
It is a complete lattice, if ⋁H and ⋀H exist for all subsets H of A.

In a lattice one usually writes a ∨ b and a ∧ b for ⋁{a, b} and ⋀{a, b}, respectively. The
element a ∨ b is also called the join of a and b, and a ∧ b is the meet of a and b. It is easy
to see that ⋁H and ⋀H exist for every finite, nonempty subset H of a lattice. However,
⋁∅ exists only in case the lattice has a zero element 0. Then ⋁∅ = 0. Similarly, ⋀∅
exists iff the lattice has a unit element 1; then ⋀∅ = 1.
The following lemma follows directly from the definitions of the join and the meet.
Lemma 1.4.3 If (A, ≤) is a lattice then ∧ and ∨ satisfy the following identities:
(L1) x ∧ x = x, x ∨ x = x (idempotence).
(L2) x ∧ y = y ∧ x, x ∨ y = y ∨ x (commutativity).
(L3) x ∧ (y ∧ z) = (x ∧ y) ∧ z, x ∨ (y ∨ z) = (x ∨ y) ∨ z (associativity).
(L4) x ∧ (x ∨ y) = x, x ∨ (x ∧ y) = x (absorption). ✷
The identities (L1)–(L4) are characteristic of lattices in the following sense. If (A, ∧, ∨)
is an algebra with two binary operations that satisfy these identities, then (A, ≤) is a
lattice when ≤ is defined so that
a≤b iff a∧b=a (a, b ∈ A).


In this lattice ⋁{a, b} = a ∨ b and ⋀{a, b} = a ∧ b for all a, b ∈ A. In lattice theory
lattices are usually defined and considered in parallel both as posets and as algebras.
The two aspects of the theory complement each other.
The following lemma is often useful when one wants to show that a certain poset is a
complete lattice.

Lemma 1.4.4 A poset (A, ≤) is a complete lattice, if ⋀H exists for each subset H ⊆ A.

Note that the existence of ⋀∅ = 1 should also be ascertained when Lemma 1.4.4 is
used. We shall now apply the lemma to an important example. Let A be a set. It is
easy to see that the intersection ⋂(εi | i ∈ I) of any equivalence relations εi (i ∈ I) of A
is again in E(A). This means that

⋀(εi | i ∈ I) = ⋂(εi | i ∈ I)

always exists in the poset (E(A), ⊆). (In particular, ⋀∅ = ιA .) Hence, we get

Lemma 1.4.5 For each set A, (E(A), ⊆) is a complete lattice. ✷

In general, the union of equivalence relations is not an equivalence relation. For any
H ⊆ E(A), ⋁H is the intersection of all equivalence relations which contain the union
⋃H. A more useful description of the supremum is given in the following lemma.

Lemma 1.4.6 Let H ⊆ E(A) and a, b ∈ A. Then a ≡ b (⋁H) iff there exist an n ≥ 0,
ε1 , . . . , εn ∈ H and a1 , . . . , an−1 ∈ A such that

a ε1 a1 ε2 a2 . . . an−1 εn b. ✷
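
On a finite base set the chain condition of Lemma 1.4.6 amounts to merging blocks of the given partitions until nothing more can be merged, for instance with a union–find structure. The following Python sketch is one possible rendering; the representation of a relation as a list of blocks and all function names are ours.

def join_of_equivalences(elements, relations):
    # relations: a list of equivalence relations, each given as a list of blocks.
    # Elements a and b end up in one block exactly when a chain
    # a eps1 a1 eps2 ... epsn b as in Lemma 1.4.6 exists.
    parent = {a: a for a in elements}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path halving
            a = parent[a]
        return a

    def union(a, b):
        parent[find(a)] = find(b)

    for partition in relations:
        for block in partition:
            block = list(block)
            for x in block[1:]:
                union(block[0], x)          # merge everything inside one block

    classes = {}
    for a in elements:
        classes.setdefault(find(a), set()).add(a)
    return list(classes.values())

# The join of {{1,2},{3},{4}} and {{1},{2,3},{4}} identifies 1, 2 and 3
# but keeps 4 apart:
print(join_of_equivalences([1, 2, 3, 4],
                           [[{1, 2}, {3}, {4}], [{1}, {2, 3}, {4}]]))
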

The lemma may be used to prove the following important fact.

Theorem 1.4.7 For any algebra A = (A, Σ), C(A) forms a complete sublattice of
(E(A), ⊆), that is to say, ⋁H ∈ C(A) and ⋀H ∈ C(A) whenever H ⊆ C(A). ✷

The direct product (L1 × · · · × Ln , ≤) of posets (L1 , ≤), . . . , (Ln , ≤) is a poset when
we define ≤ in L1 × · · · × Ln so that

(a1 , . . . , an ) ≤ (b1 , . . . , bn ) iff ai ≤ bi for all i = 1, . . . , n.

If the (Li , ≤)’s are lattices, then the direct product is also a lattice in which

(a1 , . . . , an ) ∨ (b1 , . . . , bn ) = (a1 ∨ b1 , . . . , an ∨ bn )

and

(a1 , . . . , an ) ∧ (b1 , . . . , bn ) = (a1 ∧ b1 , . . . , an ∧ bn ).

An ideal of a lattice (A, ≤) is a nonempty subset I of A such that, for all a, b ∈ A,


(1) a, b ∈ I implies a ∨ b ∈ I, and


(2) a ≤ b ∈ I implies a ∈ I.
A dual ideal of a lattice (A, ≤) is a nonempty subset D of A such that, for all a, b ∈ A,
(1’) a, b ∈ D implies a ∧ b ∈ D, and
(2’) a ≥ b ∈ D implies a ∈ D.
General examples are provided by the
(i) principal ideal (a] = {x ∈ A | x ≤ a} generated by an element a ∈ A, and by the
(ii) principal dual ideal [a) = {x ∈ A | x ≥ a} generated by an element a ∈ A.
Let A and B be posets. A mapping ϕ : A → B is said to be isotone, if
(∀a1 , a2 ∈ A) a1 ≤ a2 =⇒ a1 ϕ ≤ a2 ϕ.
Suppose now that A and B are complete lattices. The mapping ϕ is ω-continuous, if
⋁(ai | i ≥ 0)ϕ = ⋁(ai ϕ | i ≥ 0)
for every ascending ω-sequence
a0 ≤ a1 ≤ a2 ≤ . . .
of elements ai ∈ A (0 ≤ i < ω). An ω-continuous mapping is always isotone, but the
converse is false.
Let A be a poset and ϕ : A → A a mapping. An element a ∈ A is a fixed-point of
ϕ, if aϕ = a. It is the least fixed-point of ϕ, if all other fixed-points of ϕ are above
it. Of course, there can be at most one least fixed-point. A well-known theorem by
A. Tarski states that every isotone mapping in a complete lattice has a fixed-point. For
ω-continuous mappings the following stronger result holds.
Theorem 1.4.8 Let (A, ≤) be a complete lattice and ϕ : A → A an ω-continuous map-
ping. Then
[ϕ] = ⋁(0ϕi | i ≥ 0)
is the least fixed-point of ϕ.
Proof. Since ϕ is isotone, 0 ≤ 0ϕ implies
0 ≤ 0ϕ ≤ 0ϕ2 ≤ 0ϕ3 ≤ . . . .
By ω-continuity, we get now
[ϕ]ϕ = ⋁(0ϕi+1 | i ≥ 0) = ⋁(0ϕi | i ≥ 0) = [ϕ].
For any fixed-point a of ϕ, 0 ≤ a implies
0ϕ ≤ aϕ = a,
and in general by induction on i ≥ 0, 0ϕi ≤ a. Hence [ϕ] ≤ a, and [ϕ] is the least
fixed-point of ϕ. ✷
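
When the lattice is finite, for instance a powerset lattice ordered by inclusion, Theorem 1.4.8 can be read directly as an algorithm: iterate ϕ starting from the zero element until the value stops changing. The following Python sketch, with an example mapping of our own choosing, illustrates this; it is not part of the text.

def least_fixed_point(phi, bottom):
    # Compute the join of the chain 0 <= 0*phi <= 0*phi^2 <= ... ; in a finite
    # lattice the chain becomes stationary and its last member is [phi].
    current = bottom
    while True:
        following = phi(current)
        if following == current:
            return current
        current = following

# In the powerset lattice of {0, ..., 9}: phi adds 0 and the "plus two" image
# of every member below 8; the least fixed-point is {0, 2, 4, 6, 8}.
phi = lambda s: frozenset({0} | {x + 2 for x in s if x + 2 < 10})
print(sorted(least_fixed_point(phi, frozenset())))    # [0, 2, 4, 6, 8]
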


1.5 FINITE RECOGNIZERS AND REGULAR LANGUAGES


In this section several basic concepts and facts from the theory of finite automata are
reviewed. For many readers there is probably nothing really new. The presentation
is quite telegraphic and proofs are sketched at most. Much of the material will be
generalized to tree automata in Chapter 2, and the present section is intended mainly
as an outline of the proper background scenery.
An alphabet is a finite nonempty set of symbols which are called letters. We shall
usually use the letters X, Y and Z to indicate alphabets. A finite string of letters from
an alphabet X is called an X-word or a word over X. Consider an arbitrary X-word

w = x1 x2 . . . xn (n ≥ 0, x1 , . . . , xn ∈ X).

Here xi = xj is possible even for i ≠ j. If n = 0, then w is the empty word which is


denoted by e. The length of w is n and we write it |w|. Obviously, |w| = 0 iff w = e. The
set of all X-words is denoted by X ∗ , and the set of all nonempty X-words is denoted
by X + . The letters of an alphabet are viewed as indivisible symbols. This means, in
particular, that for any m ≥ 0, n ≥ 0 and x1 , . . . , xm , y1 , . . . , yn ∈ X,

x1 x2 . . . xm = y 1 y 2 . . . y n

holds just in case m = n and xi = yi for all i = 1, ..., m. Letters are considered words of
length 1. Hence, we may write X ⊂ X + ⊂ X ∗ and X ∗ = X + ∪ e.
In Section 3 we noted that X + is the free semigroup generated by X, when the product
of two words is defined to be their catenation. Similarly, X ∗ is the free monoid generated
by X. The identity element is the empty word: ew = we = w for each w ∈ X ∗ .
A language over X, or an X-language, is simply a subset of X ∗ . An X-language is
e-free if it does not include the empty word. Of course, formal language theory concerns
itself only with languages that can be specified in some effective manner.
A family of languages L is defined by indicating for each alphabet the set L(X) of
X-languages belonging to the family. For example, L(X) could consist of all languages
recognized by automata of a given type with input alphabet X. If L ∈ L(X), one may
write just L ∈ L. Two families of languages K and L are equal, which we write K = L,
if K(X) = L(X) for every alphabet X. Similarly, the inclusion K ⊆ L means that
K(X) ⊆ L(X) for every X.
One way to specify a language L ⊆ X ∗ is to give an automaton that can examine any
given X-word and then tell whether the word is in L or not. Such automata are called
recognizers. The most basic type of recognizers is the following:

Definition 1.5.1 An X-recognizer (also called a Rabin-Scott recognizer ) A consists of

(1) a finite (nonvoid) set A of states,

(2) the input alphabet X,

(3) a next-state function δ : A × X → A,


(4) an initial state a0 ∈ A, and

(5) a set A′ ⊆ A of final states.


We write A = (A, X, δ, a0 , A′ ).

If the X-recognizer A of Definition 1.5.1 is in state a (∈ A) and receives the input


x (∈ X), it enters state δ(a, x) and remains in this state until it reads the next input
letter. The next-state function is extended to a function

δ̂ : A × X ∗ → A

as follows:
1◦ δ̂(a, e) = a for each a ∈ A, and

2◦ δ̂(a, wx) = δ(δ̂(a, w), x) for all a ∈ A, w ∈ X ∗ and x ∈ X.


We will omit the cap from δ̂. For any a ∈ A and w ∈ X ∗ , δ(a, w) is the state of A when
it has read the whole input word w, from left to right, and the state in the beginning
was a. As a language recognizer A operates as follows. The word w to be tested for
membership is entered to A so that the state of A initially is a0 . Now w is accepted
by A if δ(a0 , w) is a final state. Otherwise w is said to be rejected by A. The language
recognized by A consists of all X-words accepted by A, i.e., it is the X-language

L(A) = {w ∈ X ∗ | δ(a0 , w) ∈ A′ }.
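
The operation of an X-recognizer is straightforward to simulate. In the Python sketch below, which is our own illustration and not part of the text, the next-state function δ is a dictionary indexed by pairs (state, letter).

def accepts(delta, initial, finals, word):
    # Extend delta to words by clause 2 of the definition above and test
    # whether delta(a0, w) is a final state.
    state = initial
    for x in word:
        state = delta[(state, x)]
    return state in finals

# A two-state recognizer over X = {'a', 'b'} accepting the words with an
# even number of b's (a hypothetical example):
delta = {('even', 'a'): 'even', ('even', 'b'): 'odd',
         ('odd', 'a'): 'odd', ('odd', 'b'): 'even'}
print(accepts(delta, 'even', {'even'}, 'abba'))   # True
print(accepts(delta, 'even', {'even'}, 'ab'))     # False
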

An X-language L is called recognizable, if there exists an X-recognizer A such that


L = L(A). The family of recognizable languages is denoted by Rec, and Rec X denotes
the set of all recognizable X-languages.
In the definition of X-recognizers the finiteness of the state set is essential. Otherwise,
every X-language would be recognizable.
We shall now prepare for the first of the many characterizations of recognizable lan-
guages.
The product of two X-languages U and V is the X-language

U V = {uv | u ∈ U, v ∈ V }.

The product is associative:

U (V W ) = (U V )W for all U, V, W ⊆ X ∗ .

Furthermore,
U ∅ = ∅U = ∅ and U {e} = {e}U = U
for every X-language U .
The powers U n (n ≥ 0) of an X-language U are defined inductively:

1◦ U 0 = {e} and
2◦ U n = U n−1 U for n > 0.


By means of the powers we may define the iteration of U


U ∗ = ⋃(U n | n ≥ 0).

Excluding U 0 , we get the language


U + = ⋃(U n | n ≥ 1).

Clearly, U ∗ = U + ∪ {e}, and U + = U ∗ iff e ∈ U . A word w ∈ X ∗ belongs to U ∗ iff it


can be expressed in the form w = u1 u2 . . . un , where n ≥ 0 and u1 , . . . , un ∈ U .
Note that X n is the set of all X-words of length n (n ≥ 0) and the set X ∗ of all
X-words really is the iteration of X (when X is viewed as the set of X-words of length
1).
Union, product and iteration are called the regular language operations.

Definition 1.5.2 The set Reg X of regular X-languages is the smallest set R such that
1◦ ∅ ∈ R and {x} ∈ R for each x ∈ X, and
2◦ U, V ∈ R implies U ∪ V, U V, U ∗ ∈ R.

Regular languages are also called rational languages. All finite languages are regular.
Hence Reg X is the smallest set of X-languages containing the finite X-languages which
is closed under the three regular operations.
The form of Definition 1.5.2 implies that every regular X-language can be represented
by a regular expression which shows how the language is obtained from ∅ and the lan-
guages {x} by forming unions, products and iterations.

Example 1.5.3 Let X = {x, y}. Some members of Reg X are ∅, {x}, {y}, {xy} =
{x}{y}, {xy, yy} = {x}{y} ∪ {y}{y} = ({x} ∪ {y}){y} and

U = {xi y j | i ≥ 1, j ≥ 0} ∪ {yx2k | k ≥ 0}.

A possible regular expression for the language U would be η = (x(x)∗ (y)∗ ) + (y(xx)∗ )
(usually ‘+’ is used for union). If we agree on the usual hierarchy of regular operations
(first iterations, then products, and unions last), then some parentheses can be omitted
and η becomes xx∗ y ∗ +y(xx)∗ . The language U is recognized by the X-recognizer defined
by the state graph of Fig 1.1 (the initial state is a0 and the final states are a, b and c).


The following theorem is one of the cornerstones of finite automaton theory.

Theorem 1.5.4 (S. C. Kleene 1956) Rec = Reg. ✷

The theorem is effective in the following sense. There are algorithms to construct a
recognizer for any regular language given by a regular expression. Conversely, a regular
expression representing L(A) can be found for any given recognizer A.
Kleene’s theorem implies also that the family Rec is closed under the regular opera-
tions. We shall present some more closure properties of the family Rec.


Figure 1.1. [State graph not reproduced: the recognizer has the states a0 , a, b, c, d and q; the initial state is a0 and the final states are a, b and c.]

Theorem 1.5.5 Let X and Y be arbitrary alphabets.


(a) If U, V ∈ Rec X, then U ∩ V, U − V ∈ Rec X.
(b) If U is a recognizable X-language, then so is its mirror image (or reversal)
mi(U ) = {xn . . . x2 x1 | n ≥ 0, x1 x2 . . . xn ∈ U (xi ∈ X)}.

(c) If U and V are recognizable X-languages, then so are the quotient languages
U −1 V = {w ∈ X ∗ | uw = v for some u ∈ U, v ∈ V }
and
U V −1 = {w ∈ X ∗ | wv = u for some u ∈ U, v ∈ V }.

(d) Let ϕ : X ∗ → Y ∗ be a homomorphism (of monoids). If U ∈ Rec X, then U ϕ ∈


Rec Y . If V ∈ Rec Y , then V ϕ−1 ∈ Rec X.
(e) If U ∈ Rec X and ϕ : pX ∗ → pY ∗ is such a substitution mapping that xϕ ∈ Rec Y
for all x ∈ X, then U ϕ ∈ Rec Y . ✷

Recall that a mapping ϕ : pX ∗ → pY ∗ is a substitution, if


1◦ {e}ϕ = {e},
2◦ {wx}ϕ = (wϕ)(xϕ) for all w ∈ X ∗ , x ∈ X, and
3◦ U ϕ = ⋃(uϕ | u ∈ U ) for all U ⊆ X ∗ .
Obviously, the substitution is completely defined when the languages xϕ (x ∈ X) are
given. Extended to mappings of languages, homomorphisms ϕ : X ∗ → Y ∗ are special
substitutions for which every xϕ (x ∈ X) consists of exactly one word.
Often it is convenient to allow a recognizer to be nondeterministic. In a nondeterministic
X-recognizer A = (A, X, δ, A0 , A′ ) the next-state function is a mapping
δ : A × X → pA.
Also, the recognizer has a set A0 ⊆ A of initial states. If A receives in state a the input
letter x, then it may enter any one of the states in δ(a, x). The operation of A may be
started in any initial state a0 ∈ A0 . A word w = x1 x2 . . . xn (n ≥ 0, x1 , . . . , xn ∈ X) is
accepted by A if there is such a choice of states a0 , a1 , . . . , an that


(i) a0 ∈ A0 ,

(ii) ai ∈ δ(ai−1 , xi ) for all i = 1, . . . , n, and

(iii) an ∈ A′ .

The mapping δ extends to a mapping

δ̂ : pA × X ∗ → pA

as follows:

1◦ δ̂(H, e) = H for all H ⊆ A, and


2◦ δ̂(H, wx) = ⋃(δ(a, x) | a ∈ δ̂(H, w)) for all H ⊆ A, w ∈ X ∗ and x ∈ X.

Obviously, δ̂(H, w) is the set of states A may reach under the input word w from at
least one state in H. The language recognized by A can now be defined formally as

L(A) = {w ∈ X ∗ | δ̂(A0 , w) ∩ A′ ≠ ∅}.

Every X-recognizer may be interpreted as a nondeterministic X-recognizer A, where


A0 and the sets δ(a, x) all are singletons. On the other hand, every nondeterministic
X-recognizer A may be turned into the equivalent X-recognizer

B = (pA, X, δ̂, A0 , A′′ ),

where A′′ = {H ∈ pA | H ∩ A′ ≠ ∅}; this is the well-known “subset construction”. Hence,
a language can be recognized by a nondeterministic recognizer iff it is recognizable in
our original sense of the word.
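
The subset construction can be carried out so that only the subsets reachable from A0 are ever produced. The following Python sketch is one possible rendering; the representation of δ as a dictionary of sets is our own convention.

def subset_construction(alphabet, delta, initials, finals):
    # delta: dict (state, letter) -> set of states; missing pairs mean the
    # empty set. Returns the next-state function, initial state and final
    # states of an equivalent deterministic recognizer on subsets of A.
    start = frozenset(initials)
    det_delta, todo, seen = {}, [start], {start}
    while todo:
        H = todo.pop()
        for x in alphabet:
            # the union of delta(a, x) over all a in H
            K = frozenset().union(*(delta.get((a, x), set()) for a in H))
            det_delta[(H, x)] = K
            if K not in seen:
                seen.add(K)
                todo.append(K)
    det_finals = {H for H in seen if H & frozenset(finals)}
    return det_delta, start, det_finals

# A nondeterministic recognizer for the words over {x, y} ending in xy:
delta = {('0', 'x'): {'0', '1'}, ('0', 'y'): {'0'}, ('1', 'y'): {'2'}}
d, q0, F = subset_construction(['x', 'y'], delta, {'0'}, {'2'})
print(len({H for (H, _) in d}))       # 3 reachable subset states
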
Now we recall some algebraic characterizations of Rec.
An equivalence relation ̺ on a semigroup S is a right congruence, if a̺b implies ac̺bc
for all a, b, c ∈ S. Every X-recognizer A = (A, X, δ, a0 , A′ ) defines a right congruence
̺A of the free monoid X ∗ as follows:

u ≡ v (̺A ) iff δ(a0 , u) = δ(a0 , v) (u, v ∈ X ∗ ).

The index of ̺A is at most |A| and


L(A) = ⋃(u̺A | u ∈ X ∗ , δ(a0 , u) ∈ A′ ).

This shows that every recognizable X-language is saturated by a right congruence of X ∗


of finite index.
Suppose now that the X-language L is saturated by a right congruence ̺ of X ∗ of
finite index. The X-recognizer

A = (X ∗ /̺, X, δ, e̺, L/̺),


where δ is defined by the condition

δ(u̺, x) = (ux)̺ (u ∈ X ∗ , x ∈ X),

is then well-defined and


δ(e̺, u) = u̺
for each u ∈ X ∗ . This implies L(A) = L ∈ Rec X. Among all right congruences of
X ∗ saturating a given X-language there is a greatest one which is called the Nerode
congruence of L. We denote it by ̺L and it can be defined by the condition that

u ≡ v (̺L ) iff (∀w ∈ X ∗ ) (uw ∈ L ⇔ vw ∈ L)

for all u, v ∈ X ∗ . From these observations it is easy to construct a proof for the following
theorem.

Theorem 1.5.6 (A. Nerode 1957). For any X-language L the following three conditions
are equivalent:

(1) L ∈ Rec X.

(2) L is saturated by a right congruence of X ∗ of finite index.

(3) The Nerode congruence ̺L is of finite index. ✷

There is a similar characterization which uses congruences of X ∗ . Every X-recognizer


A defines a congruence θA of X ∗ of finite index which saturates L(A):

u ≡ v (θA ) iff (∀a ∈ A) δ(a, u) = δ(a, v).

If L ⊆ X ∗ is saturated by a congruence, then a recognizer for L can be constructed


as above in the case of right congruences. The greatest congruence θL saturating L is
called the syntactic congruence of L. It may be defined by the condition that

u ≡ v (θL ) iff (∀w, w′ ∈ X ∗ ) (wuw′ ∈ L ⇔ wvw′ ∈ L)

for all u, v ∈ X ∗ .

Theorem 1.5.7 (J. R. Myhill 1957). For every X-language L the following three con-
ditions are equivalent:

(1) L ∈ Rec X.

(2) L is saturated by a congruence of X ∗ of finite index.

(3) The syntactic congruence θL is of finite index. ✷


Let θ be a congruence of X ∗ saturating an X-language L. Then L = (Lθ ♮ )(θ ♮ )−1 , where

θ ♮ : X ∗ → X ∗ /θ

is the canonical homomorphism, and X ∗ /θ is finite iff θ is of finite index. This applies,
in particular, to the syntactic congruence θL . The monoid X ∗ /θL is called the syntactic
monoid of L. On the other hand, if we have a finite monoid M , a homomorphism

ϕ : X∗ → M

and a subset H ⊆ M for which L = Hϕ−1 , then ϕϕ−1 is a congruence of X ∗ of finite


index saturating L. It is now clear that Myhill’s theorem can be reformulated as follows.

Theorem 1.5.8 For any X-language L the following three conditions are equivalent:
(1) L ∈ Rec X.

(2) There exist a finite monoid M , a homomorphism ϕ : X ∗ → M and a subset H ⊆ M


such that L = Hϕ−1 .

(3) The syntactic monoid of L is finite. ✷

An X-language L is called local, if there exist sets H, K ⊆ X and I ⊆ X 2 such that

L − {e} = (HX ∗ ∩ X ∗ K) − X ∗ IX ∗ .

The membership of a nonempty word w in such an L can be tested by checking that


the first letter of w is in H, the last letter of w is in K, and that no two consecutive
letters of w form a pair belonging to I. Note that a local language may, according to
our definition, contain the empty word.
A homomorphism ϕ : X ∗ → Y ∗ is called length-preserving if |wϕ| = |w| for all w ∈ X ∗ .
Obviously ϕ is length-preserving iff Xϕ ⊆ Y .
In terms of these concepts one more characterization of Rec can be given.

Theorem 1.5.9 An X-language L is recognizable iff L = U ϕ for some alphabet Y , local


Y -language U and length-preserving morphism ϕ : Y ∗ → X ∗ . ✷

An X-recognizer A is said to be minimal, if no X-recognizer with fewer states recog-


nizes L(A). It is obvious that every regular language has a minimal recognizer. To say
more than that, we need a few concepts.
Let A = (A, X, δ, a0 , A′ ) be an X-recognizer. It is said to be connected, if there exists
for every a ∈ A a word w ∈ X ∗ such that a = δ(a0 , w). Two states a and b of A are said
to be equivalent, and we write a ∼ b, if

(∀w ∈ X ∗ ) (δ(a, w) ∈ A′ ⇐⇒ δ(b, w) ∈ A′ ).

The recognizer A is reduced, if a ∼ b implies a = b.


A relation θ ∈ E(A) is a congruence of A, if


(1) aθb implies δ(a, x)θδ(b, x) for all a, b ∈ A and x ∈ X, and

(2) θ saturates A′ .
Let C(A) be the set of all congruences of A. It is not hard to prove that ∼ is a congruence
of A. In fact, it is the greatest congruence of A.
If θ ∈ C(A), then one can define a quotient recognizer

A/θ = (A/θ, X, δ′ , a0 θ, A′ /θ)

by putting
δ′ (aθ, x) = δ(a, x)θ for all a ∈ A and x ∈ X.
The congruence property (1) guarantees that δ′ is well-defined. An easy induction on
|w| shows that
δ′ (aθ, w) = δ(a, w)θ for all a ∈ A and w ∈ X ∗ .
This implies L(A/θ) = L(A). In particular, L(A/∼) = L(A). It is now obvious that a
minimal recognizer should be reduced and, of course, connected.
Let A = (A, X, δ, a0 , A′ ) and B = (B, X, η, b0 , B ′ ) be two X-recognizers. A homomor-
phism ϕ : A → B is a mapping ϕ : A → B such that
(1) δ(a, x)ϕ = η(aϕ, x) for all a ∈ A and x ∈ X,

(2) a0 ϕ = b0 , and

(3) B ′ ϕ−1 = A′ .
Epimorphisms and isomorphisms of X-recognizers are, respectively, surjective and bi-
jective homomorphisms.
Homomorphisms, congruences and quotients of X-recognizers are related to each other
the same way as the corresponding concepts in algebra. Hence, for any θ ∈ C(A), the
natural mapping θ ♮ is an epimorphism A → A/θ. If ϕ : A → B is an epimorphism, then
ϕϕ−1 is a congruence of A and A/ϕϕ−1 is isomorphic to B. Moreover,

δ(a, w)ϕ = η(aϕ, w) for all a ∈ A, w ∈ X ∗ .

This implies L(A) = L(B).


The X-recognizer B is a subrecognizer of A if B ⊆ A, b0 = a0 , B ′ = A′ ∩ B and
η = δ|B × X. The subset B determines such a subrecognizer completely. The connected
part
Ac = {δ(a0 , w) | w ∈ X ∗ }
of an X-recognizer is the state set of a subrecognizer

Ac = (Ac , X, δc , a0 , A′ ∩ Ac )

where δc = δ|Ac × X.
The following theorem summarizes the main facts concerning minimal and reduced
recognizers.


Theorem 1.5.10 (a) The minimal recognizer of a regular language is unique up to


isomorphism, i.e., if two recognizers are minimal and equivalent to each other, then
they are isomorphic.

(b) A recognizer is minimal iff it is connected and reduced.

(c) For any recognizer A, the quotient A/∼ is reduced and its connected part (A/∼)c
is minimal. The recognizer Ac /∼ is isomorphic to (A/∼)c .

(d) If A is minimal, B is connected and L(A) = L(B), then there exists a unique
epimorphism ϕ : B → A. ✷

Theorem 1.5.10 implies that one can find a minimal recognizer for a regular language
L by starting with any recognizer A of L; first one finds the connected part Ac and
then one has to determine the equivalent pairs of states in Ac . For both tasks there are
simple algorithms. The order may also be reversed; first form A/∼ and then find the
connected part of this reduced recognizer.
The decidability of the emptiness, finiteness and equality questions for regular lan-
guages follows from the following simple observation.

Lemma 1.5.11 Let A be an X-recognizer with n states.

(a) If L(A) contains a word w of length ≥ n, then one may write w = uvz so that
0 < |v| ≤ n and uv k z ∈ L(A) for all k ≥ 0.

(b) L(A) is nonempty iff it contains a word of length < n.

(c) L(A) is infinite iff it contains a word w such that n ≤ |w| < 2n. ✷

Statement (a) is often referred to as the “pumping lemma” for finite recognizers.
To test whether L(A) is nonempty it suffices to try all input words of length < |A|.
Similarly, the finiteness of L(A) can be checked by applying all input words w such
that |A| ≤ |w| < 2|A|. From any two X-recognizers A and B one can construct a
recognizer for (L(A) − L(B)) ∪ (L(B) − L(A)). But this language is empty exactly in
case L(A) = L(B). Hence, the equivalence of A and B can also be decided.
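
In practice the emptiness test is usually programmed as a reachability search in the state graph rather than as an enumeration of short words; by Lemma 1.5.11 (b) the two are equivalent. The sketch below is our illustration of this, not an algorithm of the text.

def is_empty(alphabet, delta, initial, finals):
    # L(A) is empty iff no final state is reachable from the initial state.
    reachable, todo = {initial}, [initial]
    while todo:
        a = todo.pop()
        if a in finals:
            return False
        for x in alphabet:
            b = delta[(a, x)]
            if b not in reachable:
                reachable.add(b)
                todo.append(b)
    return True
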

1.6 GRAMMARS AND CONTEXT-FREE LANGUAGES


We shall now consider the most important tools of formal language theory, Chomsky’s
grammars. A grammar is a device to define a language by showing how to generate the
strings of the language. The concept is very flexible, and by imposing various restrictions
on grammars several interesting families of languages can be obtained. A good example
is provided by the celebrated Chomsky hierarchy consisting of four families of languages.
At the bottom of the hierarchy we find, once more, the recognizable languages. However,
most of this section will be devoted to context-free languages. These form the second
step in the hierarchy.


Definition 1.6.1 A grammar is a 4-tuple (N, X, P, a0 ), where

(1) N is a finite nonempty set of nonterminal symbols,

(2) X is the terminal alphabet,

(3) P is the finite set of productions, and

(4) a0 ∈ N is the initial symbol.

It is required that N ∩ X = ∅. Every production is of the form β → γ, where β, γ ∈


(N ∪ X)∗ and β contains at least one nonterminal symbol.
Let G = (N, X, P, a0 ) be a grammar. For u, v ∈ (N ∪ X)∗ we write u ⇒G v (or
just u ⇒ v, when G is understood) if there exist u′ , u′′ ∈ (N ∪ X)∗ and a production
β → γ ∈ P so that u = u′ βu′′ and v = u′ γu′′ . If u ⇒G v, then u is said to generate v
directly in G. If there exists a derivation

u0 ⇒G u1 ⇒G u2 ⇒G . . . ⇒G un (n ≥ 0)

such that u0 = u and un = v, then we write u ⇒∗G v (or just u ⇒∗ v). The language
generated by G is the X-language

L(G) = {w ∈ X ∗ | a0 ⇒∗G w}.

Two grammars are equivalent, if they generate the same language.


The grammars of Definition 1.6.1 are very general and every recursively enumerable
language can be generated by such a grammar.

Definition 1.6.2 A grammar (N, X, P, a0 ) is called right linear , if each production is


of the form
a → xb, a → x or a → e,
where a, b ∈ N and x ∈ X. A language is right linear , or of type 3 (in the Chomsky
hierarchy), if it can be generated by a right linear grammar.

A right linear grammar G = (N, X, P, a0 ) can be converted into a nondeterministic


X-recognizer
A = (N ∪ {c}, X, δ, {a0 }, A′ ) (c ∉ N )
which recognizes L(G) as follows. For any a, b ∈ N and x ∈ X, put

(i) b ∈ δ(a, x) iff a → xb ∈ P ,

(ii) c ∈ δ(a, x) iff a → x ∈ P , and

(iii) δ(c, x) = ∅.


Finally, let A′ = {c} ∪ {a ∈ N | a → e ∈ P }. Conversely, every X-recognizer A =


(A, X, δ, a0 , A′ ) can be replaced by the right linear grammar G = (A, X, P, a0 ), where

P = {a → xb | δ(a, x) = b} ∪ {a → e | a ∈ A′ }.
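
The first of these two constructions translates directly into code. The sketch below builds the nondeterministic recognizer from a right linear grammar following rules (i)–(iii); the encoding of productions as triples (a, x, b), with b = None for a → x and x = None for a → e, and the name of the extra state are our own choices.

def grammar_to_recognizer(productions, a0):
    # Returns (delta, set of initial states, set of final states).
    c = 'c*'                               # the extra state, assumed not in N
    delta, finals = {}, {c}
    for a, x, b in productions:
        if x is None:                      # a -> e
            finals.add(a)
        elif b is None:                    # a -> x          (rule (ii))
            delta.setdefault((a, x), set()).add(c)
        else:                              # a -> x b        (rule (i))
            delta.setdefault((a, x), set()).add(b)
    return delta, {a0}, finals             # delta(c, x) is empty by default (rule (iii))

# The right linear grammar a0 -> x a0 | y a0 | x generates the words over
# {x, y} ending in x:
delta, A0, F = grammar_to_recognizer(
    [('a0', 'x', 'a0'), ('a0', 'y', 'a0'), ('a0', 'x', None)], 'a0')
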

These observations lead to one more characterization of Rec:

Theorem 1.6.3 The type 3 languages are exactly the regular languages. ✷

Now we proceed to the main topic of this section.

Definition 1.6.4 A grammar (N, X, P, a0 ) is context-free (CF, for short) if each pro-
duction is of the form
a→γ

where a ∈ N and γ ∈ (N ∪ X)∗ . A language is context-free (CF) if it is generated by


a CF grammar. The family of all CF languages is denoted by CF and the set of CF
X-languages by CF(X).

The CF languages are the type 2 languages in Chomsky’s hierarchy. Every right linear
grammar is CF. Hence Rec ⊆ CF. If |X| = 1, then Rec X = CF(X), but in all other
cases the inclusion is proper.

Example 1.6.5 Suppose X contains two distinct letters x and y. Every derivation in
the CF grammar
G = ({a}, X, {a → xay, a → xy}, a)

is of the form

a ⇒ xay ⇒ xxayy ⇒ . . . ⇒ xn−1 ay n−1 ⇒ xn y n (n ≥ 1).

Hence, L(G) is the nonregular language {xn y n | n ≥ 1}. ✷

The main fact to connect CF languages with tree automata is that context-free deriva-
tions can be represented by derivation trees. A derivation tree is a description of the
syntax of a word of the CF language. (Here it would be more natural to speak about
“sentences” of a language.) Derivation trees have proved very useful tools in the theory
of CF languages. Later we shall define “trees” in a way suitable for our purposes, but
here there is no need to define the concept too formally.
Let G = (N, X, P, a0 ) be a CF grammar. The derivation tree representing a derivation
of a word u ∈ (X ∪ N )∗ from a symbol a ∈ (X ∪ N ) in G is defined by induction on the
number k of steps in the derivation:

1◦ If k = 0, then u = a and the derivation tree consists of a single node labelled by a.


2◦ Consider a derivation
a ⇒ u1 ⇒ u2 ⇒ . . . ⇒ uk−1 ⇒ u (*)
where k ≥ 1. Suppose u1 = d1 . . . dm , where m ≥ 0 and d1 , . . . , dm ∈ N ∪ X.
At this point the context-freeness of G becomes essential. Every application of a
production in (*) rewrites exactly one di or a nonterminal derived from exactly one
di . This means that (*) may be decomposed into a number of “subderivations”
di ⇒ . . . ⇒ vi (i = 1, . . . , m)
each of which yields a segment vi of u and u = v1 v2 . . . vm . If the derivation trees
of the subderivations are t1 , . . . , tm , respectively, then the derivation tree of (*) is
that shown in Fig. 1.2.
The possibility m = 0 was not excluded. Then k = 1, u = e and the derivation
tree reduces to a single node labelled by a.

The word xxxyyy has the derivation


a ⇒ xay ⇒ xxayy ⇒ xxxyyy
in the grammar of Example 1.6.5. The corresponding derivation tree is shown in Fig. 1.3.
Consider any derivation
a0 ⇒ . . . ⇒ w
of a terminal word w ∈ L(G) from the initial symbol. The corresponding derivation

Figure 1.2. [Derivation tree of (*): the root is labelled by a, its immediate descendants are labelled d1 , . . . , dm , and the derivation trees t1 , . . . , tm of the subderivations are attached to them.]

Figure 1.3. [Derivation tree of the word xxxyyy in the grammar of Example 1.6.5.]

tree is also called a derivation tree of w, and w can be read from the “leaves” of the tree.
The grammar G of Example 1.6.5 has the rather special property that every word in
L(G) has just one derivation in G.


Example 1.6.6 Consider the CF grammar

G = ({a0 , a, b}, {x, y}, P, a0 )

where P consists of the productions

a0 → ab, a → xay, a → xy, b → ybx and b → yx.

Obviously, L(G) = {xm y m+n xn | m, n ≥ 1}. The word xyyx ∈ L(G) has the two
derivations
a0 ⇒ ab ⇒ xyb ⇒ xyyx
and
a0 ⇒ ab ⇒ ayx ⇒ xyyx
both of which are represented by the derivation tree shown in Fig. 1.4. In general, the
word xm y m+n xn has (m+n choose n) different derivations all of which are represented by the same
derivation tree. ✷

Figure 1.4. [Derivation tree of the word xyyx: the root a0 has the descendants a and b, and the frontier reads x y y x.]

In Example 1.6.6 the different derivations of the same word do not represent different
syntactic descriptions of the word. In fact, they can all be obtained from each other by
changing the order in which the individual steps are carried out. If we agree on some
fixed order in which the subderivations are to be carried out, then there would be just
one derivation for each derivation tree of a word in the language.

Definition 1.6.7 A derivation

u0 ⇒ u1 ⇒ u2 ⇒ . . . ⇒ uk

in a CF grammar G = (N, X, P, a0 ) is called a leftmost derivation, if we can write, for


every i = 0, . . . , k − 1,
ui = wi au′i and ui+1 = wi γu′i
so that wi ∈ X ∗ , a ∈ N and a → γ ∈ P . The grammar G is ambiguous if some word w in
L(G) has two different leftmost derivations from a0 . Otherwise G is unambiguous. A CF
language generated by at least one unambiguous CF grammar is said to be unambiguous.
If all CF grammars generating a given CF language are ambiguous, then the language
is said to be inherently ambiguous.


A CF grammar G is unambiguous if every word w ∈ L(G) has exactly one derivation


tree. It is ambiguous, if at least one word w ∈ L(G) has more than one derivation tree.
The grammars of Examples 1.6.5 and 1.6.6 are unambiguous. Every regular language is
unambiguous. Of course, a language generated by an ambiguous CF grammar may be
unambiguous. The language

{xi y j z k | i = j or j = k (i, j, k ≥ 1)}

is a well-known example of an inherently ambiguous language.


There are many simplifying additional conditions that a CF grammar may always be
assumed to satisfy. Some of these are listed below.

Definition 1.6.8 Let G = (N, X, P, a0 ) be a CF grammar.

(a) G is reduced if either P = ∅ and N = {a0 }, or else for every a ∈ N ,

a0 ⇒∗ uav ⇒∗ w

for some u, v ∈ (N ∪ X)∗ and w ∈ X ∗ .

(b) G is in Chomsky normal form if each production is of the form


(i) a → bc (a ∈ N, b, c ∈ N − a0 ),
(ii) a → x (a ∈ N, x ∈ X), or
(iii) a0 → e.

(c) G is in Greibach normal form if each production is of the form


(i) a → xa1 . . . am (m ≥ 0, a ∈ N, a1 , . . . , am ∈ N − a0 , x ∈ X), or
(ii) a0 → e.

If m ≤ k for all productions of type (i), then G is said to be in Greibach k-form (k ≥ 0).

Proofs for the following facts can be found in the references given at the end of the
section.

Theorem 1.6.9 (a) Every CF grammar (N, X, P, a0 ) can be converted into an equiv-
alent reduced CF grammar (N ′ , X, P ′ , a0 ), where N ′ ⊆ N and P ′ ⊆ P .

(b) Every CF grammar can be converted into an equivalent CF grammar in any one
of the following normal forms: Chomsky normal form, Greibach normal form, and
Greibach 2-form. In all cases the grammar can be made reduced. ✷

We recall now some of the closure properties of the family CF.

Theorem 1.6.10 If the languages U and V are CF, then so are U ∪ V , U V and U ∗ . ✷


The languages U = {xm y n z n | m, n ≥ 1} and V = {xn y n z m | m, n ≥ 1} are CF, but


U ∩ V = {xn y n z n | n ≥ 1} is not. This observation implies also that the difference U − V
of two CF languages U and V may be noncontext-free. However, the following theorem
holds.

Theorem 1.6.11 If U is a CF language and V is a regular language, then U ∩ V and


U − V are CF languages. ✷

The following theorem implies, as a special case, that CF is closed under morphisms.

Lemma 1.6.12 Let ϕ : pX ∗ → pY ∗ be a substitution mapping such that xϕ ∈ CF(Y )


for all x ∈ X. If U ∈ CF(X), then U ϕ ∈ CF(Y ). ✷

The following useful lemma is obtained most naturally by considering derivation trees.

Lemma 1.6.13 (Bar-Hillel’s pumping lemma). For each CF grammar G one can find
two natural numbers p and q such that the following holds for every word w ∈ L(G): if
|w| > p, then we may write w = u1 v1 w′ v2 u2 so that

(i) |v1 w′ v2 | ≤ q,

(ii) v1 v2 ≠ e, and

(iii) u1 v1i w′ v2i u2 ∈ L(G) for every i ≥ 0. ✷

Next we recall some decidability properties of CF languages. A CF language is always


assumed to be given by a CF grammar generating it.

Theorem 1.6.14 There are algorithms for deciding the following questions:

(1) Is a given word in a given CF language?

(2) Is a given CF language empty?

(3) Is a given CF language finite? ✷

The decidability of the finiteness problem follows from Bar-Hillel’s lemma. The other
two statements can be justified quite directly.

Theorem 1.6.15 The following questions are undecidable:

(a) Are two given CF languages equal?

(b) Is the intersection of two given CF languages empty? | finite? | regular? | context-
free?

(c) Is the complement X ∗ − U of a CF X-language U empty? | finite? | regular? |


context-free?


(d) Is a given CF grammar ambiguous?


(e) Is a given CF language inherently ambiguous? ✷
In the previous section we noted that every regular language has a minimal recognizer.
One might want to find a CF grammar equivalent to a given one with the smallest
possible number of nonterminals (nonterminal minimization problem) or with a mini-
mum number of productions (production minimization problem). However, the following
theorem holds.
Theorem 1.6.16 Both the nonterminal minimization problem and the production min-
imization problem are unsolvable. ✷
Let n be a fixed natural number. The sum of two n-tuples of nonnegative integers
a = (a1 , . . . , an ) and b = (b1 , . . . , bn )
is formed componentwise:
a + b = (a1 + b1 , . . . , an + bn ).
Similarly, we put
ka = (ka1 , . . . , kan )
for all k ∈ N0 and a ∈ Nn0 .
A subset K of Nn0 is called linear , if there exist an m ≥ 0 and n-tuples a1 , . . . , am ,
b ∈ Nn0 such that
K = {k1 a1 + . . . + km am + b | k1 , . . . , km ∈ N0 }.
A subset of Nn0 is semilinear if it is the union of finitely many linear sets.
Let X be an alphabet with n letters (n ≥ 1). It is convenient to think that the letters
of X are listed in some fixed order, x1 , . . . , xn . The Parikh vector of a word w ∈ X ∗ is
the n-tuple
Par(w) = (a1 , . . . , an )
where ai is the number of occurrences of xi in w (i = 1, . . . , n). The resulting Parikh
mapping
Par : X ∗ → Nn0
satisfies the conditions
(i) Par(e) = (0, . . . , 0)
and
(ii) Par(uv) = Par(u) + Par(v) (u, v ∈ X ∗ ).
The mapping Par is extended to X-languages in the natural way:
Par(L) = {Par(w) | w ∈ L}
for all L ⊆ X ∗ .
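
The Parikh mapping itself is immediate to compute; the following two-line Python illustration is ours. Conditions (i) and (ii) hold because counting occurrences is additive over catenation.

def parikh(w, letters):
    # Par(w) with respect to the fixed order x1, ..., xn of the letters.
    return tuple(w.count(x) for x in letters)

print(parikh('xyyxz', ['x', 'y', 'z']))    # (2, 2, 1)
print(parikh('', ['x', 'y', 'z']))         # (0, 0, 0), i.e. Par(e)
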
Theorem 1.6.17 For every CF language L, the Parikh set Par(L) is semilinear. ✷


1.7 SEQUENTIAL MACHINES


Automata that produce outputs in response to inputs are generally called sequential
machines. The basic example of these is provided by the Mealy-machine which arose
as an abstract model of digital circuits with memory. A Mealy-machine is a system
A = (X, A, Y, a0 , δ, λ), where
(1) X is the input alphabet,

(2) A is a finite, nonempty set of states,

(3) Y is the output alphabet,

(4) a0 ∈ A is the initial state,

(5) δ : A × X → A is the next-state function, and

(6) λ : A × X → Y is the output function.


In many applications there is no fixed initial state, and a0 is then omitted from the
definition. The operation of A can be described as follows. If A is in state a (∈A) and
receives an input x (∈ X), then it enters state δ(a, x) and emits the letter λ(a, x). In
order to describe the behaviour of A under an arbitrary input word w ∈ X ∗ we extend
δ and λ to mappings
δ̂ : A × X ∗ → A, λ̂ : A × X ∗ → Y ∗
as follows:
1◦ δ̂(a, e) = a and λ̂(a, e) = e for every a ∈ A.

2◦ δ̂(a, wx) = δ(δ̂(a, w), x) and λ̂(a, wx) = λ̂(a, w)λ(δ̂(a, w), x) for all a ∈ A, w ∈ X ∗ ,
x ∈ X.
If A receives in state a the input word w, it emits the word λ̂(a, w) (∈ Y ∗ ) and ends
up in state δ̂(a, w). The translation induced by A is defined as the relation

τA = {(w, λ̂(a0 , w)) | w ∈ X ∗ } (⊆ X ∗ × Y ∗ ).

Two Mealy-machines are said to be equivalent if they define the same translation.
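
The extension clauses 1° and 2° give a direct way to compute the translation of an input word letter by letter. The following Python sketch, with δ and λ encoded as dictionaries, is our illustration only.

def mealy_translate(delta, lam, a0, word):
    # Computes the pair (delta^(a0, w), lambda^(a0, w)) following 1° and 2°.
    state, output = a0, []
    for x in word:
        output.append(lam[(state, x)])     # emit lambda(current state, x)
        state = delta[(state, x)]          # then move to the next state
    return state, ''.join(output)

# A one-state machine over X = Y = {'0', '1'} that complements every bit
# (a hypothetical example):
delta = {('q', '0'): 'q', ('q', '1'): 'q'}
lam = {('q', '0'): '1', ('q', '1'): '0'}
print(mealy_translate(delta, lam, 'q', '1001'))   # ('q', '0110')
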
In the case of a Mealy-machine A every input word w has exactly one translation
λ̂(a0 , w) and this has the same length as w. Mealy-machines enjoy a number of desirable
properties and they have a well-developed theory. For example, the following facts are
known:

(a) The translations induced by Mealy-machines have a very simple characterization.

(b) The equivalence problem of Mealy-machines is decidable.

(c) For any Mealy-machine one can find an equivalent minimal Mealy-machine and this
is unique up to isomorphism.


(d) Let A be the Mealy-machine defined above. If L ∈ Rec X, then LτA ∈ Rec Y . If
L ∈ Rec Y , then LτA−1 ∈ Rec X.
There are several ways to generalize Mealy-machines. First of all, both the next-state
and the output behaviour may be nondeterministic. Another generalization allows the
sequential machine to emit a word in response to each input letter. Moreover, one may
add a set of final states. Then a translation of a word is accepted just in case it leaves
the machine in a final state. We shall now define a generalized sequential machine which
includes all these features. It is now convenient to use a set of productions which will
account both for the next-state behaviour and for the outputs. We arrive at the following
concept.
Definition 1.7.1 A (nondeterministic) generalized sequential machine (gsm) is a sys-
tem A = (X, A, Y, a0 , P, A′ ) where
(1) X is the input alphabet,
(2) A is a finite, nonempty set of states,
(3) Y is the output alphabet,
(4) a0 (∈A) is the initial state,
(5) P is a set of productions of the form ax → wb with a, b ∈ A, x ∈ X and w ∈ Y ∗ ,
and
(6) A′ ⊆ A is the set of final states.
It is assumed that A ∩ (X ∪ Y ) = ∅. The gsm A is said to be deterministic if there exists
for each pair (a, x) ∈ A × X exactly one production of the form ax → wb.
Let A be the above gsm. A production ax → wb is interpreted as follows. If A
is in state a and receives the input x, A may enter state b and simultaneously emit
the word w. We shall now define the translation performed by A. For any two words
p, q ∈ (A ∪ X ∪ Y )∗ , we write p ⇒A q if there exist a production ax → wb in P and
words p′ and p′′ such that p = p′ axp′′ and q = p′ wbp′′ . The reflexive, transitive closure
of ⇒A is denoted by ⇒∗A . Thus p ⇒∗A q (p, q ∈ (A ∪ X ∪ Y )∗ ) holds iff there exists a
derivation of the form
p = p 0 ⇒A p 1 ⇒A . . . ⇒A p k = q (k ≥ 0).
Now, the translation induced by A is defined as the relation
τA = {(u, v) | u ∈ X ∗ , v ∈ Y ∗ , a0 u ⇒∗A vb for some b ∈ A′ }.
If (u, v) ∈ τA , then v is a translation of u. If A is deterministic, then each X-word w has
at most one translation. Two gsm’s are equivalent if they induce the same translation.
The tree transducers, which form the subject matter of Chapter 4, may be viewed as
further generalizations of gsm’s in which trees replace words as inputs and as outputs.
The following two theorems may be compared with some of the results to be presented
in Chapter 4.


Theorem 1.7.2 Let A = (X, A, Y, a0 , P, A′ ) be a gsm. If L ∈ Rec X, then LτA ∈
Rec Y . If L ∈ Rec Y , then LτA−1 ∈ Rec X. ✷

Theorem 1.7.3 The equivalence problem of deterministic gsm’s is decidable, but the
equivalence problem of nondeterministic gsm’s is undecidable. ✷

The next-state behaviour of a gsm is identical to that of a nondeterministic Rabin-Scott


recognizer. Thus the following fact, which will be needed in Chapter 4, is obvious.

Lemma 1.7.4 Let A be a gsm as defined above. For any two states a, b ∈ A, the
language
L(a, b) = {u ∈ X ∗ | au ⇒∗A vb for some v ∈ Y ∗ }
is regular. ✷

1.8 REFERENCES
Extensive treatments of universal algebra can be found in the following two standard
references:
• P. M. Cohn, Universal algebra, D. Reidel, Dordrecht (2. ed. 1981).
• G. Grätzer, Universal algebra, Springer-Verlag, New York (2. ed. 1979).

The following more concise texts may also be recommended:


• H. Lugowski, Grundzüge der universellen Algebra, Teubner, Leipzig (1976).
• H. Werner, Einführung in die allgemeine Algebra, Bibliographisches Institut,
Mannheim (1978).

A good introduction to lattice theory (available in German and in French, too):


• G. Szász, Introduction to lattice theory, Academic Press, New York (1963).

Two general texts on finite automata and regular expressions:


• F. Gécseg and I. Peák, Algebraic theory of automata, Akadémiai Kiadó, Bu-
dapest (1972).
• A. Salomaa, Theory of automata, Pergamon Press, Oxford (1969).

An extensive algebraic treatment of the theory of finite automata can be found in the
following two volumes:
• S. Eilenberg, Automata, languages, and machines, Academic Press, New York
(Vol. A 1974, Vol. B 1976).

The general area of formal language theory is covered, for example, by the following
books:
• A. V. Aho and J. D. Ullman, The theory of parsing, translation, and compiling,
Prentice-Hall, Englewood Cliffs, N. J. (1972).


• M. A. Harrison, Introduction to formal language theory, Addison-Wesley, Read-


ing, Mass. (1978).
• J. E. Hopcroft and J. D. Ullman, Formal languages and their relation to
automata, Addison-Wesley, Reading, Mass. (1969).
• A. Salomaa, Formal languages, Academic Press, New York (1973).

A highly recommendable classic on context-free languages is:


• S. Ginsburg, The mathematical theory of context-free languages, McGraw-Hill,
New York (1966).

2 TREE RECOGNIZERS AND
RECOGNIZABLE FORESTS
This chapter is devoted to finite-state tree recognizers and the family of forests recog-
nizable by them. Here trees are defined as terms over a finite operator domain, and a
forest (or tree language) is just a set of trees. As in the case of formal languages, there
are two particularly natural ways to effectively define a forest; a forest can be recognized
by an automaton, or it can be generated by a grammar. In Section 2.2 we introduce
the tree recognizers which correspond to Rabin–Scott recognizers. It does not make any
difference whether Rabin–Scott recognizers are defined to read words from left to right or
from right to left, but here we should consider both recognizers that read trees from the
leaves down towards the root (frontier-to-root recognizers) and recognizers which work
in the opposite direction (root-to-frontier tree recognizers). In both cases the recognizer
may be either deterministic or nondeterministic. This gives us four types of finite-state
tree recognizers. Three of these define the same family of forests, the family Rec of rec-
ognizable forests. Deterministic root-to-frontier recognizers are essentially weaker and
they define a proper subfamily of Rec. In Section 2.3 we define regular tree grammars.
After having shown that these can be reduced to a very simple normal form, we prove
that regular tree grammars generate exactly the recognizable forests. Often it will be
convenient to use regular tree grammars in the study of recognizable forests. In Section
2.4 several operations on forests are considered. Many of these arise as a generalization
of some basic language operation. Usually Rec can be shown to be closed under such
operations. However, one should note that there are often many ways to generalize from
languages to forests, and a right choice among the alternatives is essential if one wants to
generalize the corresponding results, too. For example, there is a natural generalization
of the product of languages with respect to which Rec is not even closed. A related point
is demonstrated by the case of tree homomorphisms. Here the greater generality of trees
compared with words admits of some entirely new phenomena, such as the copying of
subtrees.
In Section 2.5 regular expressions to denote forests are defined, and the appropriate
generalized Kleene theorem can then be proved. Section 2.6 contains the minimization
theory of deterministic frontier-to-root tree recognizers. In Sections 2.7 to 2.9 the family
Rec is characterized in some further ways. Recognizable forests are described by means
of congruences of the term algebra, as solutions of fixed-point equations, and in terms of
local forests. Moreover, a Medvedev-type characterization in terms of certain elementary
forests and elementary operations is given. In Section 2.10 we show that the emptiness,
the finiteness, and the equivalence problems of recognizable forests are decidable. Section
2.11 is devoted to deterministic root-to-frontier recognizers. The forests recognizable by


them are characterized by means of a certain closure property. Furthermore, we show


that these recognizers have canonical minimal forms.
In this chapter we try to cover the central parts of what could be called “the gener-
alized theory of finite automata”, but many topics had to be excluded. Some of these
are mentioned in the Notes and references. There we shall also indicate a few other
developments not directly related to this chapter as well as some applications of the
theory of tree automata.

2.1 TREES AND FORESTS


The “trees” which appear in tree automata theory may be visualized as tree-like directed
labelled graphs. Such a tree has exactly one node, the root, to which no edge enters.
From the root there is exactly one path to every node. Moreover, it is essential that
the edges leaving a given node have a specified left-to-right order. This concept has
been formalized in several ways, but the variations in the definition are of little or no
consequence. We shall choose a definition that suits well an algebraic treatment of the
theory.
For the labelling of the nodes of a tree we need two alphabets of different kind, a ranked
alphabet and a frontier alphabet. As a rule, these two are assumed to be disjoint. A
ranked alphabet is a finite nonempty operator domain (cf. Sect. 1.2). From now on Σ
always represents a ranked alphabet. Other symbols to be used for ranked alphabets
include Ω and Γ. The inclusion Σ ⊆ Ω means that Σm ⊆ Ωm for all m ≥ 0. If Σm ∩ Ωn = ∅
whenever m ≠ n, then Σ ∪ Ω may be defined:
(Σ ∪ Ω)m = Σm ∪ Ωm for all m ≥ 0.
A frontier alphabet is simply an alphabet in the usual sense, but sometimes we should
let it be empty. In fact, in most cases there is no need to exclude this possibility. Our
usual symbols for frontier alphabets are X, Y and Z.
For any Σ and X, a ΣX-tree is simply a ΣX-term. Thus the set of ΣX-trees is FΣ (X).
In many cases Σ or X, or both, are either understood or unspecified. In such cases we
often speak about Σ-trees, X-trees or just trees. A similar situation will arise whenever a
concept involves a ranked alphabet and a frontier alphabet. We shall not lengthen such
definitions by listing the modified names, but they will be used without explanation
whenever convenient.
The letters p, q, r, s and t are reserved for trees.
Although trees are defined as strings, they can be visualized as, and are in fact intended
as representations of, such tree structures as described above.
Example 2.1.1 Let Σ = Σ0 ∪ Σ1 ∪ Σ2 be a ranked alphabet, where Σ0 = {γ}, Σ1 = {ω}
and Σ2 = {σ}. As the frontier alphabet we take X = {x, y}. Then t = ω (σ(y, σ(γ, x)))
is the ΣX-tree shown in Fig. 2.1. ✷
Any other way of writing ΣX-terms would suit our purpose equally well. For example,
in Polish notation the tree t of Example 2.1.1 would be written as ωσyσγx, but it would
still be treated in tree automaton theory as the “tree” shown in Fig. 2.1.


Figure 2.1. [The ΣX-tree t = ω (σ(y, σ(γ, x))) drawn as a labelled tree.]

Term induction will now be called tree induction. Below some important concepts are
defined by tree induction.

Definition 2.1.2 The height hg(t), the root root(t) and the set of subtrees sub(t) of a
ΣX-tree t are defined as follows:
1◦ If t ∈ X ∪ Σ0 , then hg(t) = 0, root(t) = t and sub(t) = {t}.

2◦ If t = σ(t1 , . . . , tm ) (m > 0), then


hg(t) = max(hg(ti ) | i = 1, . . . , m) + 1,
root(t) = σ, and
sub(t) = ⋃(sub(ti ) | 1 ≤ i ≤ m) ∪ {t}.

For the tree of Example 2.1.1 we get hg(t) = 3, root(t) = ω and sub(t) = {t, σ (y, σ(γ, x)) ,
y, σ(γ, x), γ, x}.
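
With trees encoded, say, as nested tuples whose first component is the operator label and whose leaves are strings, Definition 2.1.2 becomes three short recursive functions. This encoding and the names below are ours; the letters w, s, g stand for ω, σ, γ of Example 2.1.1.

def hg(t):
    # Height by tree induction (Definition 2.1.2).
    if isinstance(t, str):
        return 0
    return 1 + max(hg(s) for s in t[1:])

def root(t):
    return t if isinstance(t, str) else t[0]

def sub(t):
    # The set of subtrees of t.
    if isinstance(t, str):
        return {t}
    result = {t}
    for s in t[1:]:
        result |= sub(s)
    return result

# The tree of Example 2.1.1 in this encoding:
t = ('w', ('s', 'y', ('s', 'g', 'x')))
print(hg(t), root(t))     # 3 w
print(len(sub(t)))        # 6 subtrees, as listed above
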
Subtrees of height 0 are referred to as the leaves of the tree. A leaf is labelled by a
letter from the frontier alphabet or by a nullary operator. The length |t| of a tree t is
simply its length as a word. The leaves of tree t of our example are y, γ and x. Its length
is 15 (when parentheses and commas are counted, too). Of course, one can define and
prove things about trees by induction on the length; but in practice this mostly reduces
to tree induction. Induction on the height hg(t) is equivalent to tree induction.
We shall use the term frontier in a rather informal way to designate the part of a tree
consisting of the leaves. The frontier of the tree of Example 2.1.1 consists of the nodes
labelled by y, γ and x. The same letter or nullary operator could appear several times as
a leaf in the frontier. The visual picture of a tree also suggests the notions of a branch
and that of a path. In our t there are two main branches leaving the lower σ. They
correspond to the subtrees y and σ(γ, x). There are three paths from the root to the
frontier. They spell out the words ωσy, ωσσγ and ωσσx, respectively. These terms are
used in a descriptive manner to aid the intuition and no precise definitions are needed.

Note. In the literature the root is often called the “top” of the tree, while its frontier
is referred to as the “bottom”. Then “top-down” indicates the direction from the root
towards the frontier, and “bottom-up” means the opposite direction. This terminology
is connected with the common practice of drawing trees upside-down.


The same tree may occur several times as a subtree of a given tree and one should
distinguish between a subtree and an occurrence of a subtree. It is possible to assign
coordinates to the nodes of a tree and then indicate a certain occurrence of a subtree by
the coordinates of its root. However, the following simple device to specify an occurrence
of a subtree will suffice. For any occurrence of a subtree s of a tree t, there is a unique
way to write t = usv. Here u and v are just words and the occurrence of s is uniquely
determined by u.
We shall now consider some ways to construct new trees from given ones. The very
definition of FΣ (X) suggests such a construction. If m ≥ 0, σ ∈ Σm and t1 , . . . , tm ∈
FΣ (X), then σ(t1 , . . . , tm ) is a new ΣX-tree which could be called the σ-catenation of
t1 , . . . , tm . It is obtained by connecting the roots of the trees t1 , . . . , tm to a new root
labelled by σ. The construction is illustrated by Fig. 2.2.
Figure 2.2. [The σ-catenation σ(t1 , . . . , tm ): the roots of t1 , . . . , tm are connected to a new root labelled by σ.]

Figure 2.3. [The tree σ (x, σ(γ, x, x), σ(x, x, z)) of Example 2.1.3 drawn as a labelled tree.]

Note that the σ-catenation is the σ-operation of the ΣX-term algebra FΣ (X):

σ(t1 , . . . , tm ) = σ FΣ (X) (t1 , . . . , tm ).

Let t be a ΣX-tree and suppose we are given a tree sx for every x ∈ X. The tree
denoted by
t(x ← sx | x ∈ X), or just t(x ← sx ),
is obtained by substituting in t, simultaneously for every x ∈ X, sx for each occurrence
of x. The formal definition by tree induction reads as follows:

1◦ If t = z ∈ X, then t(x ← sx ) = sz .

2◦ If t = σ ∈ Σ0 , then t(x ← sx ) = σ.

3◦ If t = σ(t1 , . . . , tm ), then

t(x ← sx ) = σ (t1 (x ← sx ), . . . , tm (x ← sx )).

If the trees sx are ΣX-trees, then t(x ← sx ) is also a ΣX-tree. However, the construc-
tion works also in the more general case where the trees sx are ΩY -trees for some Ω and
Y such that Σm ∩ Ωn = ∅ whenever m ≠ n. Then t(x ← sx ) ∈ FΣ∪Ω (Y ).
Suppose X = {x1 , . . . , xn }. One may then write t(x ← sx ) in the more explicit form

t (x1 ← sx1 , . . . , xn ← sxn ) .


If the order x1 , . . . , xn is understood, we may write simply t (sx1 , . . . , sxn ).


A letter x may be left unrewritten by choosing sx = x. The notation t(x1 ←
s1 , . . . , xn ← sn ) is used more generally to indicate a substitution where the letters
xi are rewritten as the corresponding si (i = 1, . . . , n), but all the other letters of X are
left unchanged in the tree t.

Example 2.1.3 Suppose γ ∈ Σ0 , σ ∈ Σ3 and x, y, z ∈ X. If t = σ (y, σ(γ, x, y), z), then

t (y ← x, z ← σ(x, x, z)) = σ (x, σ(γ, x, x), σ(x, x, z)) .

The tree is shown in Fig. 2.3. ✷
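
In the nested-tuple encoding used above, the substitution t(x ← sx ) is a one-clause recursion; the sketch below reproduces Example 2.1.3. The function name and the treatment of nullary operators simply as unrewritten leaves are our conventions.

def substitute(t, assignment):
    # t(x <- s_x | x in X): replace every frontier-letter leaf according to
    # the assignment, by tree induction (clauses 1°-3° above).
    if isinstance(t, str):
        return assignment.get(t, t)    # nullary operators and letters not in
                                       # the assignment are left unchanged
    return (t[0],) + tuple(substitute(s, assignment) for s in t[1:])

# Example 2.1.3 (g nullary, s ternary):
t = ('s', 'y', ('s', 'g', 'x', 'y'), 'z')
print(substitute(t, {'y': 'x', 'z': ('s', 'x', 'x', 'z')}))
# ('s', 'x', ('s', 'g', 'x', 'x'), ('s', 'x', 'x', 'z'))
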

Often a certain occurrence of a subtree s of a tree t should be replaced by a tree r. If


the presentation t = usv indicates the particular occurrence of s, then the result is urv.
It is easy to show that urv is also a ΣX-tree whenever t, r ∈ FΣ (X). The operation may
also be described as follows. Let ξ be a new letter. There is a unique tree t′ ∈ FΣ (X ∪ ξ)
with exactly one occurrence of ξ such that t = t′ (ξ ← s). Then urv = t′ (ξ ← r). Other
ways to operate on trees will be encountered later on.
Trees define polynomial functions in algebras. These will be very important, and we
shall now see how the basic tree operations are reflected in them. Let A = (A, Σ)
be a Σ-algebra. If t ∈ FΣ (X) is obtained by σ-catenation from the trees t1 , . . . , tm
(m ≥ 0, σ ∈ Σm ), then
tA = σ A (tA1 , . . . , tAm )

is simply the composition of tA1 , . . . , tAm with σ A . Now we consider the substitution
1 , . . . , tm with σ . Now we consider the substitution
operation. Let X = {x1 , . . . , xn } and t, s1 , . . . , sn ∈ FΣ (X). The polynomial function

t(s1 , . . . , sn )A : AX → A

is computed as follows. For any α : X → A,

t(s1 , . . . , sn )A (α) = tA (β),

where β : X → A is defined so that xi β = sAi (α) for all i = 1, . . . , n.


Finally, consider the replacing of an occurrence of a subtree s of a ΣX-tree t by a
ΣX-tree r. Write t = t′ (ξ ← s) as explained above. For any α : X → A, we get then

t′ (ξ ← r)A (α) = t′A (α′ )

where α′ : X ∪ ξ → A is defined so that α′ |X = α and ξα′ = r A (α).


A ΣX-forest is simply a subset of FΣ (X). Many authors call forests tree languages. In
general, we use the letters R, S and T for forests.
If Σ ⊆ Ω and X ⊆ Y , then all ΣX-trees are ΩY -trees, too. Thus every ΣX-forest
may be viewed as an ΩY -forest. In most cases this can safely be done. For example, a
ΣX-forest is recognizable (in the sense defined in the next section) as a ΣX-forest iff it
is recognizable as an ΩY -forest.


Of course, those forests only are of interest that can be defined in some natural way.
This chapter is devoted to a family of such forests, the forests recognizable by finite tree
automata. In the theory of these forests many concepts and results familiar from the
theory of recognizable languages can be perceived. The generalization from words and
languages to trees and forests will be considered in the next section.

2.2 TREE RECOGNIZERS


In this section we introduce tree recognizers, that is, tree automata which define forests.
There are four basic types of these recognizers. A tree recognizer may be defined in
such a way that it reads its input trees from the frontier towards the root. Then it is
called a frontier-to-root recognizer, or an F -recognizer for short. A tree recognizer which
reads the trees starting at the root proceeding then towards the frontier is called a root-
to-frontier recognizer, or simply an R-recognizer. In both cases the recognizer may be
either deterministic or nondeterministic. As a rule, all tree recognizers considered here
are finite, i.e., they have a finite number of states.
Our first task will be to compare the families of forests recognizable by these four
types of tree recognizers. It turns out that we get just two families. Deterministic F -
recognizers, nondeterministic F -recognizers and nondeterministic R-recognizers all have
the same recognition power. The forests recognized by them are termed recognizable.
Deterministic R-recognizers are considerably weaker and they yield a rather special
subfamily of the recognizable forests.
As stated in the previous section, Σ is always a ranked alphabet and X is a frontier
alphabet.

Definition 2.2.1 A frontier-to-root ΣX-recognizer or an (F )ΣX-recognizer, for short,


A consists of

(1) a finite Σ-algebra A = (A, Σ),

(2) an initial assignment α : X → A and

(3) a set A′ ⊆ A of final states.

We write A = (A, α, A′ ) or A = (A, Σ, X, α, A′ ). The forest recognized by A is the


ΣX-forest
T (A) = {t ∈ FΣ (X) | tA (α) ∈ A′ }.
A ΣX-forest T is said to be recognizable, if there exists a ΣX-recognizer A such that
T = T (A). The family of recognizable forests is denoted by Rec, and Rec(Σ, X) denotes
the set of all recognizable ΣX-forests.
The recognizers defined above are finite and deterministic although this has not been
emphasized in the name. They are our “basic” type of tree recognizer and we shall usually
omit the label “F ” which distinguishes them from root-to-frontier tree recognizers. The
elements of the underlying algebra A are called the states of A and A is its state set.


If not otherwise specified, A will be the ΣX-recognizer (A, α, A′ ). Also B and C will
usually be the ΣX-recognizers (B, β, B ′ ) and (C, γ, C ′ ), respectively. Here B = (B, Σ)
and C = (C, Σ) are Σ-algebras, β : X → B and γ : X → C are the initial assignments,
and B ′ ⊆ B and C ′ ⊆ C.
In algebraic terms the operation of the ΣX-recognizer A can be explained as follows.
Given an input tree t ∈ FΣ (X) the polynomial function tA is evaluated on the initial
assignment α. The tree is accepted exactly in case the result tA (α) is a final state. If
α̂ : FΣ (X) → A
is the extension of α to a homomorphism, then
tA (α) = tα̂ for every t ∈ FΣ (X),
and we may write
T (A) = {t ∈ FΣ (X) | tα̂ ∈ A′ } = A′ α̂−1 .
A more pictorial description of the operation of A in automata theoretic terms is also
possible. Given an input tree t, A starts reading it from the leaves in states that depend
on the labels of the leaves. If a certain leaf is labelled by a frontier letter x, then A is
in state xα at that leaf. If the label is a nullary operator σ, then A starts from that
leaf in state σ A . Now A moves down all the branches towards the root step by step as
follows. If a given node v is labelled by the m-ary operator σ (m > 0), then A enters v
in state σ A (a1 , . . . , am ), where a1 , . . . , am are the states of A at the nodes immediately
above v, listed in order from left to right. The tree is accepted if A enters the root in a
final state.
Example 2.2.2 Let Σ = Σ1 ∪ Σ2 , Σ1 = {∼}, Σ2 = {∧, ∨} and X = {x, y}. Define the
operations of the Σ-algebra A = ({0, 1}, Σ) by the tables below:

a   ∼A (a)              a   b   ∧A (a, b)   ∨A (a, b)
0     1                 0   0       0           0
1     0                 0   1       0           1
                        1   0       0           1
                        1   1       1           1
Define an initial assignment so that xα = 1 and yα = 0. To complete the definition of
our ΣX-recognizer A we choose {1} as the set of final states. The computation of A on
the tree
t = ∧ (∼ (∧(y, x)), ∨(∼ (y), x))
is shown in Fig. 2.4. The states of A at the nodes are shown in parentheses. The tree
is accepted since the state at the root is 1. Let ∼, ∧ and ∨ have their usual meanings as
symbols for the logical connectives “not”, “and” and “or”. Then ΣX-trees are expres-
sions of propositional logic in the two propositional variables x and y. If 0 and 1 are
interpreted as the truth values “false” and “true”, respectively, then A computes the
truth values of propositions, when the truth values of the variables are given. The forest
recognized by A consists of the propositions (in variables x and y) that are true when x
is true and y is false. ✷


Figure 2.4. (The tree t drawn node by node with the state computed by A shown in parentheses at each node; the state at the root is 1.)
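
The frontier-to-root evaluation of this example is easily mimicked by a short program. The following Python sketch is only an illustration of Definition 2.2.1 instantiated with Example 2.2.2; the tuple encoding of trees and the names OPS, ALPHA, FINAL, eval_tree and accepts are our own choices, not notation from the text.

# A minimal sketch of a deterministic frontier-to-root (F) recognizer,
# instantiated with the two-state algebra of Example 2.2.2.
# Trees are nested tuples: ('~', t), ('&', t1, t2), ('|', t1, t2);
# a bare string 'x' or 'y' is a leaf labelled by a frontier letter.

OPS = {
    '~': lambda a: 1 - a,         # the table of the unary operation ∼A
    '&': lambda a, b: a & b,      # the table of ∧A
    '|': lambda a, b: a | b,      # the table of ∨A
}
ALPHA = {'x': 1, 'y': 0}          # the initial assignment α
FINAL = {1}                       # the set A′ of final states

def eval_tree(t):
    """Compute tA(α) from the frontier towards the root."""
    if isinstance(t, str):                      # leaf labelled by a frontier letter
        return ALPHA[t]
    op, *subtrees = t
    return OPS[op](*(eval_tree(s) for s in subtrees))

def accepts(t):
    return eval_tree(t) in FINAL

# The tree t of Example 2.2.2 (cf. Fig. 2.4):
t = ('&', ('~', ('&', 'y', 'x')), ('|', ('~', 'y'), 'x'))
print(accepts(t))                               # True: the state at the root is 1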

Example 2.2.3 Let Σ = Σ2 = {+, ·} and X = {x1 , . . . , xn } for some n ≥ 1. The ΣX-
trees may now be interpreted as arithmetic expressions in variables x1 , . . . , xn . Using
the customary infix notation one could write, for example x1 + x1 · x2 rather than
+(x1 , ·(x1 , x2 )). Let m > 0 and define the Σ-algebra A = ({0, 1, . . . , m − 1}, Σ) so that

a +A b = a + b (mod m)

and
a ·A b = a · b (mod m)
for all a, b = 0, 1, . . . , m − 1. If t is a ΣX-tree and α : X → A is any mapping, then tA (α)
is the value of the expression t (mod m) when the variables are assigned values according
to α. Thus any ΣX-recognizer A = (A, α, A′ ) based on the algebra A recognizes a set of
arithmetic expressions which get a value (mod m) in A′ when each variable xi is given
a certain value xi α (i = 1, . . . , n). ✷

The examples suggest some useful general observations on tree recognizers. A tree
recognizer is a device that evaluates an expression (a tree) for given values of the variables
(given by the initial assignment) and decides then on the basis of this value whether the
expression belongs to a given set or not. Since the state set is finite, such an evaluation
is always “modulo something”. For example, we could not construct a tree recognizer
which would find out whether the value of an arithmetic expression is a prime or not.
Similarly, there is no tree recognizer that recognizes the set of all trees in which two given
operators appear the same number of times. The following example discusses another
manifestation of the same phenomenon.

Example 2.2.4 Let Σ = Σ2 = {σ} and let X be an arbitrary nonempty frontier alpha-
bet. Then the forest
T = {σ(t, t) | t ∈ FΣ (X)}
is not recognizable. For suppose T = T (A) for some ΣX-recognizer A. Since A is finite,
there must exist two different ΣX-trees s and t such that sα̂ = tα̂. But then we would
have that
σ(s, t)α̂ = σ A (sα̂, tα̂) = σ A (sα̂, sα̂) = σ(s, s)α̂ ∈ A′ ,
which implies the contradiction σ(s, t) ∈ T . ✷


Let us now look how tree recognizers arise as generalizations of the Rabin–Scott recog-
nizers through a universal algebraic interpretation. First, let A = (A, I, δ, a0 , A′ ) be an
I-recognizer as defined in Sect. 1.5 (to avoid confusion we use I as the input alphabet).
Define a ranked alphabet Σ such that Σ1 = I and Σm = ∅ for all m ≠ 1. The next-state
mapping of A is completely determined by the Σ-algebra A = (A, Σ) which is defined
so that
σ A (a) = δ(a, σ) for all a ∈ A and σ ∈ I.
If we put X = {x}, then I-words and ΣX-trees can be identified as follows. The empty
word e corresponds to the tree x, and a nonempty word σ1 . . . σk (k ≥ 1, σi ∈ I) may
be interpreted as the tree σk (. . . σ1 (x) . . .) (the reverse Polish notation for trees would
make the identification even more natural). Define α : X → A so that xα = a0 . Then

δ(a0 , t) = tA (α) for all t ∈ I ∗ (= FΣ (X)!).

This implies that the forest recognized by the ΣX-recognizer (A, α, A′ ) is, interpreted
as an I-language, the language recognized by A. Hence a Rabin–Scott recognizer may
be viewed as a tree recognizer over a unary ranked alphabet and a one-element frontier
alphabet. The general ΣX-recognizers result when one does not require Σ to be unary
and allows also an arbitrary frontier alphabet X.
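
The identification of words with trees over a unary ranked alphabet can be made concrete in a few lines. The Python sketch below (ours, purely for illustration) wraps the letters of a word around the single frontier letter x in the order described above.

def word_to_tree(word):
    """Identify the I-word σ1 ... σk with the tree σk(. . . σ1(x) . . .)."""
    t = 'x'                       # the empty word e corresponds to the tree x
    for sym in word:              # the first letter ends up innermost
        t = (sym, t)
    return t

print(word_to_tree('abc'))        # ('c', ('b', ('a', 'x')))
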
The nondeterministic frontier-to-root tree recognizers that we soon shall define may
be viewed as generalized F -tree recognizers in which nondeterminism is allowed both in
the assignment of states to the leaves and in the next-state behaviour. First we have to
introduce nondeterministic operations and nondeterministic algebras.
An m-ary nondeterministic (ND) operation on a set A is a mapping from Am to pA
(m ≥ 0). Thus an m-ary ND operation

f : Am → pA

assigns to every m-tuple of elements from A a subset of A. A nullary ND operation

f : {∅} → pA

fixes a subset of A, and f may be identified with this subset f (∅). A nondeterministic
(ND) Σ-algebra A = (A, Σ) consists of a nonempty set A and a family {σ A | σ ∈ Σ} of
ND operations on A such that for each σ ∈ Σ, σ A is m-ary if σ ∈ Σm . The ND Σ-algebra
is finite if A is finite. A Σ-algebra may be viewed as an ND Σ-algebra when elements
a ∈ A are identified with the corresponding singletons {a}.
On the other hand, we associate with every ND Σ-algebra A = (A, Σ) an ordinary
Σ-algebra, namely the subset algebra

pA = (pA, Σ)

where
σ pA (A1 , . . . , Am ) = ⋃{σ A (a1 , . . . , am ) | a1 ∈ A1 , . . . , am ∈ Am }


for all m ≥ 0, σ ∈ Σm and A1 , . . . , Am ⊆ A. Now any mapping

α : X → pA

may be extended to a homomorphism

α̂ : FΣ (X) → pA.

Consider a ΣX-tree t. The computation of the set tα̂ may be described in automata
theoretic terms as follows. If a leaf is labelled by a letter x, then the “automaton” A
may start at that leaf in any one of the states in xα. If a leaf is labelled by a nullary
operator σ, then σ A is the set of the possible starting states. Let v be any node in the
tree labelled by an m-ary symbol σ (m > 0). Let σ(t1 , . . . , tm ) be the subtree of t
which has v as its root. Then t1 α̂, . . . , tm α̂ are the respective sets of possible states of
A at the nodes immediately above v. Now A may enter v in any one of the states from
σ pA (t1 α̂, . . . , tm α̂). Clearly, tα̂ is the set of all states in which A may be at the root of t.

Definition 2.2.5 A nondeterministic frontier-to-root ΣX-recognizer, or an NDF ΣX-


recognizer for short, A consists of
(1) a finite ND Σ-algebra A = (A, Σ),

(2) an initial assignment α : X → pA and

(3) a set A′ ⊆ A of final states.


We write A = (A, α, A′ ) or A = (A, Σ, X, α, A′ ). The forest recognized by A is the
ΣX-forest
T (A) = {t ∈ FΣ (X) | tα̂ ∩ A′ ≠ ∅}.
The definition of T (A) means that a tree t is accepted by A iff there is a set of choices
of initial states for the leaves and next-states for the other nodes such that A enters the
root of t in a final state. It is rather obvious that the ΣX-recognizer

pA = (pA, α, A′′ ),

where
A′′ = {A1 ∈ pA | A1 ∩ A′ ≠ ∅},
recognizes the same forest as A. Indeed, for any t ∈ FΣ (X),

t ∈ T (pA) iff tpA (α) ∈ A′′ iff tα̂ ∈ A′′

iff tα̂ ∩ A′ ≠ ∅ iff t ∈ T (A).


This is the natural generalization of the usual subset construction as applied to ND
Rabin–Scott recognizers, and pA is the “subset recognizer” corresponding to A. Since
every ΣX-recognizer may be viewed as an equivalent NDF ΣX-recognizer we have veri-
fied the following theorem.


Theorem 2.2.6 The forests recognized by nondeterministic frontier-to-root recognizers


are exactly the recognizable forests. ✷
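
In programming terms the theorem says that the set tα̂ can be computed bottom-up exactly like a deterministic run, only with sets of states in place of single states. The Python sketch below is an illustration under our own encoding (the ND operations are given as functions returning sets of states, and a nullary operator is encoded as the one-element tuple (σ,)); the names nd_run and nd_accepts are not from the text.

from itertools import product

def nd_run(t, nd_ops, alpha):
    """The set tα̂ of states in which the recognizer may be at the root of t."""
    if isinstance(t, str):                      # leaf labelled by a frontier letter
        return set(alpha[t])
    op, *subs = t                               # a nullary operator has no subtrees
    child_sets = [nd_run(s, nd_ops, alpha) for s in subs]
    result = set()
    # σpA(A1, ..., Am) is the union of σA(a1, ..., am) over a1 ∈ A1, ..., am ∈ Am
    for states in product(*child_sets):
        result |= set(nd_ops[op](*states))
    return result

def nd_accepts(t, nd_ops, alpha, final):
    """t is accepted iff tα̂ meets the set of final states."""
    return bool(nd_run(t, nd_ops, alpha) & set(final))
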
We begin the discussion of root-to-frontier tree recognizers with the nondeterministic
version. In a nondeterministic root-to-frontier Σ-algebra (NDR Σ-algebra, for short)
A = (A, Σ), A is a nonempty set and every σ ∈ Σm with m ≥ 1 is realized as a mapping

σ A : A → p(Am ).

For σ ∈ Σ0 , σ A is a subset of A. We call A finite, if A is finite.


Definition 2.2.7 A nondeterministic root-to-frontier ΣX-recognizer A, or an NDR
ΣX-recognizer, consists of
(1) a finite NDR Σ-algebra A = (A, Σ),
(2) a set A′ ⊆ A of initial states, and

(3) a final assignment α : X → pA.


We write A = (A, A′ , α) or A = (A, Σ, X, A′ , α). The elements of A are called states.
In order to make the formal definition of the forest recognized by such an A easier to
understand, we shall first describe its intended operation. At the root of a given ΣX-tree
t, A may be in any initial state a ∈ A′ . Consider now any node v of t labelled by some
σ ∈ Σm with m ≥ 1. If a is a possible state of A at v and (a1 , . . . , am ) ∈ σ A (a), then
A may assume state a1 at the leftmost node immediately above v, state a2 at the node
immediately to the right of this node etc. For every m-tuple in σ A (a), A has such a
sequence of possible next-states for the nodes directly above v. Note that the possible
states at these nodes are connected with each other: (a1 , . . . , am ), (a′1 , . . . , a′m ) ∈ σ A (a)
does not imply, for example, (a′1 , a2 , . . . , am ) ∈ σ A (a). The tree t is accepted by A if it
is possible to choose the initial state for the root and then make the consecutive choices
of next-state vectors in such a way that A arrives at each leaf labelled by a frontier
letter x in a state belonging to xα, and at each leaf labelled by a 0-ary symbol σ in a
state belonging to σ A . It is easier to formalize this recognition process by tracing it from
the leaves back to the root. The idea is to see which states at each node can lead to
acceptance. For the leaves this is clear. If a leaf is labelled by x ∈ X, then the accepting
states for that leaf form the set xα. If a leaf is labelled by σ ∈ Σ0 , then the accepting
states are those belonging to σ A . Now one can infer the states that are accepting at the
nodes immediately below the leaves. When these have been found, we may determine
the states in which A should be at nodes one level deeper in a tree. Finally one finds
out the accepting states for the root. The tree is accepted iff at least one of these is an
initial state.
Definition 2.2.8 Let A = (A, A′ , α) be an NDR ΣX-recognizer. A mapping

α̃ : FΣ (X) → pA

is defined as follows:


1◦ If x ∈ X, then xα̃ = xα.

2◦ If σ ∈ Σ0 , then σ α̃ = σ A .

3◦ If t = σ(t1 , . . . , tm ) (m ≥ 1), then


tα̃ = {a ∈ A | σ A (a) ∩ (t1 α̃ × . . . × tm α̃) ≠ ∅}.

The forest recognized by A is the ΣX-forest

T (A) = {t ∈ FΣ (X) | tα̃ ∩ A′ ≠ ∅}.
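
The mapping α̃ translates directly into a recursion over the input tree. In the Python sketch below (ours, for illustration only), root_ops[σ](a) returns the set σA(a) of next-state vectors, nullary[σ] the set σA for σ ∈ Σ0, alpha the final assignment, and states the state set A.

def ndr_states(t, states, root_ops, alpha, nullary):
    """Compute tα̃ (Definition 2.2.8) from the leaves back to the root."""
    if isinstance(t, str):                      # clause 1: leaf labelled by x ∈ X
        return set(alpha[t])
    op, *subs = t
    if not subs:                                # clause 2: leaf labelled by a nullary operator
        return set(nullary[op])
    child = [ndr_states(s, states, root_ops, alpha, nullary) for s in subs]
    # clause 3: keep a iff some next-state vector in σA(a) is componentwise
    # contained in the sets computed for the subtrees
    return {a for a in states
            if any(all(b in c for b, c in zip(vec, child))
                   for vec in root_ops[op](a))}

def ndr_accepts(t, states, initial, root_ops, alpha, nullary):
    """t is in T(A) iff tα̃ contains an initial state."""
    return bool(ndr_states(t, states, root_ops, alpha, nullary) & set(initial))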

Example 2.2.9 Let us consider again the arithmetic expressions, defined in Example
2.2.3. We shall construct an NDR Σ{x1 , x2 }-recognizer which accepts an expression in
variables x1 and x2 iff the value of the expression is divisible by 4 when x1 = 0 or 2 (mod
4) and x2 = 3 (mod 4). An obvious choice for a state set is A = {0, 1, 2, 3}. The set of
initial states is {0}, and the final assignment is defined by x1 α = {0, 2} and x2 α = {3}.
The next-state behaviour is determined by inferring the possible summands or factors
from the sum or product, respectively. We get

+A (0) = {(0, 0), (1, 3), (2, 2), (3, 1)}

+A (1) = {(0, 1), (1, 0), (2, 3), (3, 2)}


etc., and

·A (0) = {0} × A ∪ A × {0} ∪ {(2, 2)}

·A (1) = {(1, 1), (3, 3)}


etc.
Note that we would get an equivalent NDF-recognizer by “inverting” these operations
(0 +A 0 = 0 etc.), and making {0} the set of final states and α the initial assignment. ✷

The concluding observation of Example 2.2.9 can be generalized as follows. We say


that the NDF ΣX-recognizer A = (A, Σ, X, α, A′ ) and the NDR ΣX-recognizer B =
(B, Σ, X, B ′ , β) are associated if

(1) A = B, A′ = B ′ and α = β,

(2) (a1 , . . . , am ) ∈ σ B (a) iff a ∈ σ A (a1 , . . . , am ), for all m ≥ 1, σ ∈ Σm and


a1 , . . . , am , a ∈ A, and

(3) σ A = σ B for every σ ∈ Σ0 .

It is easy to see that α̂ = β̃ if A and B are associated. Since every NDF tree recognizer
has an associated NDR tree recognizer, and conversely, we get

Theorem 2.2.10 The forests recognizable by NDR tree recognizers are exactly the rec-
ognizable forests. ✷


A deterministic root-to-frontier ΣX-recognizer, or a DR ΣX-recognizer, is an NDR ΣX-


recognizer A = (A, A′ , α) such that A′ and all of the sets σ A (a) (σ ∈ Σm , m ≥ 1, a ∈ A)
and σ A with σ ∈ Σ0 contain exactly one element. Thus a DR ΣX-recognizer A has
exactly one initial state and in every situation there is exactly one choice of next-state
vector. Moreover, there is exactly one final state for each leaf labelled by a nullary
symbol. The forest recognized by A is defined the same way as in the general case.
That determinism is a real limitation in the case of root-to-frontier recognizers is shown
by the following example.

Example 2.2.11 Suppose σ ∈ Σ2 and x, y ∈ X. If a DR ΣX-recognizer accepts the


trees σ(x, y) and σ(y, x), then it must accept σ(x, x), too. Hence, the forest T =
{σ(x, y), σ(y, x)} cannot be recognized by any DR ΣX-recognizer. On the other hand,
it is obvious that T ∈ Rec(Σ, X). ✷

The inability of these recognizers to cope with situations such as that in Example
2.2.11 is due to the fact that they have to read disjoint subtrees separately without any
possibility to combine the information gathered from the individual subtrees. In an NDR
tree recognizer this handicap is compensated for by their ability to make several guesses
about the subtrees jointly before reading them separately.

2.3 REGULAR TREE GRAMMARS


So far, the recognizable forests have been characterized by means of three types of tree
recognizers. Now we shall introduce a class of tree grammars that also defines the
family of recognizable forests. These grammars are the natural counterparts to type 3
grammars.

Definition 2.3.1 A regular ΣX-grammar G consists of

(1) a finite nonempty set N of nonterminal symbols,

(2) a finite set P of productions of the form a → r, where a ∈ N and r ∈ FΣ (N ∪ X),


and

(3) an initial symbol a0 ∈ N .

It is assumed that N ∩ (Σ ∪ X) = ∅. We write G = (N, Σ, X, P, a0 ).

When Σ and X are not specified, we speak about regular tree grammars or just gram-
mars, if there is no danger of confusion.
Let G be a regular tree grammar as in the definition above. The right-hand side of a
production is a tree in which nonterminal symbols may appear at the leaves only. For
p, q ∈ FΣ (X ∪ N ), we write

p ⇒G q (or just p ⇒ q)


if there exist a ∈ N, r ∈ FΣ (X ∪ N ) and words u, v such that p = uav, q = urv and


a → r ∈ P , i.e., p ⇒G q means that q is obtained by replacing an occurrence of a
nonterminal symbol a by a tree r, where a → r is a production of the grammar. More
generally, we write
p ⇒∗G q (or just p ⇒∗ q)
if p = q or there exists a (nontrivial) derivation

p ⇒G p1 ⇒G . . . ⇒G pn−1 ⇒G q (n ≥ 1)

of q from p. Hence, ⇒∗ is the reflexive, transitive closure of ⇒, when we view it as a


relation in FΣ (X ∪ N ).

Definition 2.3.2 The forest generated by a regular ΣX-grammar G = (N, Σ, X, P, a0 )


is the ΣX-forest
T (G) = {t ∈ FΣ (X) | a0 ⇒∗G t}.
Two regular ΣX-grammars G1 and G2 are said to be equivalent if T (G1 ) = T (G2 ).

Example 2.3.3 Let Σ = Σ0 ∪ Σ2 , Σ0 = {ω}, Σ2 = {σ} and X = {x}. Define the


regular ΣX-grammar
G = ({a, b}, Σ, X, P, a),
where
P = {a → σ(x, σ(x, b)), a → σ(ω, a), b → σ(x, x)}.
The tree
t = σ (ω, σ(x, σ(x, σ(x, x))))
is in T (G) and it has the derivation

a ⇒ σ(ω, a) ⇒ σ(ω, σ(x, σ(x, b))) ⇒ t.

If the graphical representation of trees is used, this derivation can be written as in


Fig. 2.5. ✷
Figure 2.5. (The derivation a ⇒ σ(ω, a) ⇒ σ(ω, σ(x, σ(x, b))) ⇒ t written with the trees drawn graphically.)


A regular ΣX-grammar may be viewed as a context-free grammar with a terminal


alphabet consisting of Σ, X, the parentheses and the comma. Thus, if we treat trees as
words, then the forests generated by regular tree grammars are special CF languages.
However, we are mainly interested in them as forests, and we shall prove that exactly
the recognizable forests can be generated by these grammars. To facilitate the proof
first we show that the form of the productions may be restricted considerably without
limiting the generative power of regular tree grammars.
To begin with, we note that productions of the form
a→b (a, b ∈ N )
are not needed. All such productions can be deleted if we add to P all productions a → r
(a ∈ N, r ∈ FΣ (X ∪ N ) − N ) such that a ⇒∗ b and b → r ∈ P for some b ∈ N . (It is
easy to see that a ⇒∗ b is decidable for a, b ∈ N .)
Call hg(r) the height of the production a → r. If the height of a production a → r
is > 1, then r is of the form σ(r1 , . . . , rm ), where m ≥ 1, σ ∈ Σm and hg(ri ) < hg(r)
for each i = 1, . . . , m. If we introduce new nonterminal symbols a1 , . . . , am and the
productions
a → σ(a1 , . . . , am ) (*)
and
ai → r i (i = 1, . . . , m), (**)
then the production a → r may be deleted without changing the forest generated.
Indeed, any application of a → r can be replaced by an application of (*) followed by
applications of the productions (**). On the other hand, none of the productions (**)
can be used unless (*) has first been used, and when (*) has been applied it must be
followed by applications of all productions (**) as there is no other way to rewrite the new
nonterminals ai . The total effect of these steps is the same as that of a single application
of a → r. Thus every production of height > 1 can be replaced by productions of lesser
height. The process can be repeated until there are no productions of height > 1. In
(**) there may be productions of the type a → b, but they can be eliminated. Hence
each production of height 0 may be assumed to be of the type
a → x (a ∈ N, x ∈ X) (i)
or of the form
a→σ (a ∈ N, σ ∈ Σ0 ). (ii)
A production of height 1 is of the form
a → σ(r1 , . . . , rm ) (m ≥ 1, σ ∈ Σm , a ∈ N ),
where each ri is a frontier letter, a 0-ary operator or a nonterminal symbol. If ri is a
letter from X or a 0-ary operator, then we may substitute a new nonterminal symbol
d for it and introduce the production d → ri of height 0 without changing the forest
generated. Thus we may assume that all productions of height 1 are of the form
a → σ(a1 , . . . , am ) (m ≥ 1, σ ∈ Σm , a, a1 , . . . , am ∈ N ). (iii)


We say that a regular tree grammar is in normal form if each of its productions is of
type (i), (ii) or (iii). The previous discussion amounts to the following lemma.

Lemma 2.3.4 Every regular tree grammar can be transformed into an equivalent regular
tree grammar in normal form. ✷

Example 2.3.5 None of the productions of the grammar considered in Example 2.3.3
is in normal form. The production a → σ(x, σ(x, b)) can be replaced by the following
set:
a → σ(a1 , a2 ), a1 → x, a2 → σ(a1 , b).
Notice that we could use the new nonterminal symbol a1 twice since in both positions
it should be rewritten as x. Similarly, the production a → σ(ω, a) is replaced by the two
productions
a → σ(a3 , a) and a3 → ω,
and the production b → σ(x, x) is replaced by b → σ(a1 , a1 ) (we already have a1 → x).
We have got a grammar in normal form with five nonterminal symbols a, b, a1 , a2 and
a3 , and the productions

a → σ(a1 , a2 ), a → σ(a3 , a), b → σ(a1 , a1 ),


a1 → x, a2 → σ(a1 , b) and a3 → ω.

The following minor generalization of regular tree grammars is introduced as a technical


aid. An extended regular ΣX-grammar

G = (N, Σ, X, P, A′ )

is defined otherwise exactly as a regular ΣX-grammar, but it has a set A′ ⊆ N of initial


symbols. Also ⇒∗G is defined the same way as for regular tree grammars. The forest
generated by such a G is

T (G) = {t ∈ FΣ (X) | a0 ⇒∗G t for some a0 ∈ A′ }.

It is immediately clear that every forest generated by an extended regular tree gram-
mar can be generated by an ordinary regular tree grammar, too.

Theorem 2.3.6 The forests generated by regular tree grammars are exactly the recog-
nizable forests.

Proof. We associate with every NDF ΣX-recognizer A = (A, Σ, X, α, A′ ) an extended


regular ΣX-grammar
G = (A, Σ, X, P, A′ ),


where

P = {a → x | x ∈ X, a ∈ xα} ∪ {a → σ | σ ∈ Σ0 , a ∈ σ A }∪
{a → σ(a1 , . . . , am ) | m ≥ 1, σ ∈ Σm , a, a1 , . . . , am ∈ A, a ∈ σ A (a1 , . . . , am )}.

The grammar G is in normal form (i.e., the productions are of type (i)–(iii)). It is clear
that every extended regular ΣX-grammar in normal form arises this way from an NDF
ΣX-recognizer. To prove the theorem it suffices now to show that T (A) = T (G) for
such an associated pair A and G. To do this we show by tree induction that

a ∈ tα̂ iff a ⇒∗G t (*)

holds for all a ∈ A and t ∈ FΣ (X).


1◦ For t = x ∈ X, a ∈ xα̂ iff a → x ∈ P iff a ⇒∗ x (here we needed the fact that G
has no productions of the form a → b).
2◦ The case t = σ ∈ Σ0 is similar: a ∈ σ α̂ iff a ∈ σ A iff a → σ ∈ P iff a ⇒∗ σ.
3◦ Let t = σ(t1 , . . . , tm ) (m ≥ 1) and suppose that (*) holds for t1 , . . . , tm and all
states. If a ⇒∗ t, then there is a derivation of the form

a ⇒ σ(a1 , . . . , am ) ⇒∗ σ(t1 , . . . , tm ),

where a1 , . . . , am ∈ N and

ai ⇒ ∗ t i for i = 1, . . . , m.

Then a ∈ σ A (a1 , . . . , am ) by the definition of P , and (*) implies that a1 ∈ t1 α̂, . . . , am ∈


tm α̂. Hence,
a ∈ σ pA (t1 α̂, . . . , tm α̂) = tα̂.
Conversely, a ∈ tα̂ means that

a ∈ σ A (a1 , . . . , am )

for some a1 ∈ t1 α̂, . . . , am ∈ tm α̂. But then (*) implies a1 ⇒∗ t1 , . . . , am ⇒∗ tm . Also,


P contains the production a → σ(a1 , . . . , am ) and we get the required derivation

a ⇒ σ(a1 , . . . , am ) ⇒∗ σ(t1 , . . . , tm ) = t.

This completes the proof of (*), and we have for every ΣX-tree t,

t ∈ T (A) iff tα̂ ∩ A′ ≠ ∅


iff a ∈ tα̂ for some a ∈ A′
iff a ⇒∗G t for some a ∈ A′
iff t ∈ T (G).

Hence T (A) = T (G) as required. ✷
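
For a grammar in normal form the relation a ⇒∗G t can be decided by a recursion over t, exactly along the equivalence (*) just proved. The Python sketch below is ours; the encoding of trees and productions is an assumption made only for this illustration, and the grammar used is the normal-form grammar of Example 2.3.5.

def derives(a, t, prods):
    """Does the nonterminal a derive the ground tree t?  prods maps a nonterminal
    to a list of normal-form right-hand sides: a frontier letter 'x', a nullary
    operator ('w',), or (sigma, a1, ..., am) with nonterminals ai (types (i)-(iii))."""
    for rhs in prods.get(a, []):
        if isinstance(t, str):                  # t is a frontier letter
            if rhs == t:
                return True
        elif isinstance(rhs, tuple) and len(rhs) == len(t) and rhs[0] == t[0]:
            # same operator and rank; each subtree must be derivable from the
            # corresponding nonterminal
            if all(derives(ai, ti, prods) for ai, ti in zip(rhs[1:], t[1:])):
                return True
    return False

# The normal-form grammar of Example 2.3.5 ('s' stands for σ, ('w',) for ω):
prods = {
    'a':  [('s', 'a1', 'a2'), ('s', 'a3', 'a')],
    'b':  [('s', 'a1', 'a1')],
    'a1': ['x'],
    'a2': [('s', 'a1', 'b')],
    'a3': [('w',)],
}
t = ('s', ('w',), ('s', 'x', ('s', 'x', ('s', 'x', 'x'))))  # the tree of Example 2.3.3
print(derives('a', t, prods))                               # True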


2.4 OPERATIONS ON FORESTS


In this section some more insight into the family of recognizable forests is gained by
studying its closure properties with respect to various forest operations. In the following
definitions and theorems all forests usually have the same ranked alphabet and the same
frontier alphabet. To show that this is no serious limitation, we note the following simple
fact.
Lemma 2.4.1 Let Σ and Ω be ranked alphabets such that Σ ⊆ Ω, and let X and Y be
frontier alphabets such that X ⊆ Y. Then

Rec(Σ, X) = Rec(Ω, Y ) ∩ pFΣ (X). ✷

Of course, the lemma presupposes the point of view that every ΣX-forest is also an
ΩY -forest. Now let Σ and Ω be any ranked alphabets such that Σm ∩ Ωn = ∅ whenever
m ≠ n. Also, let X and Y be arbitrary frontier alphabets. The lemma implies that if
S ∈ Rec(Σ, X) and T ∈ Rec(Ω, Y ), then S and T can be regarded as recognizable forests
over a common ranked alphabet Σ ∪ Ω and a common frontier alphabet X ∪ Y .

Theorem 2.4.2 If S, T ∈ Rec(Σ, X), then S ∩ T , S ∪ T and S − T are also recognizable


ΣX-forests.

Proof. Suppose S and T are recognized by the ΣX-recognizers A and B, respectively.


Let C = A × B and define

γ:X →C by x 7→ (xα, xβ).

Then
tγ̂ = (tα̂, tβ̂) for all t ∈ FΣ (X).
This implies that from C and γ we get ΣX-recognizers for S ∩ T , S ∪ T and S − T by
choosing, respectively, as the set of final states A′ × B ′ , A′ × B ∪ A × B ′ , and A′ × (B − B ′ ).
For example, let
C = (C, γ, A′ × B ′ ).
For any t ∈ FΣ (X),

t ∈ T (C) iff tγ̂ = (tα̂, tβ̂) ∈ A′ × B ′


iff t ∈ T (A) ∩ T (B).

That is, T (C) = S ∩ T . ✷

Note that the complement FΣ (X) − T of a recognizable ΣX-forest T is recognizable. If


T is recognized by a ΣX-recognizer A, then the complement is recognized by (A, α, A −
A′ ).
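
These constructions are easy to try out for finite deterministic recognizers: running the product recognizer C = A × B on a tree amounts to running A and B separately and pairing the two root states, and the choice of final states then decides which Boolean combination is recognized. A Python sketch (ours, using the same tuple encoding of trees as in the earlier sketches):

def run(t, ops, alpha):
    """The state tA(α) computed by a deterministic F-recognizer at the root of t."""
    if isinstance(t, str):
        return alpha[t]
    op, *subs = t
    return ops[op](*(run(s, ops, alpha) for s in subs))

def accepts_boolean(t, A, B, mode='intersection'):
    """A and B are triples (ops, alpha, final) over the same ranked alphabet."""
    (ops_a, al_a, fin_a), (ops_b, al_b, fin_b) = A, B
    in_a = run(t, ops_a, al_a) in fin_a
    in_b = run(t, ops_b, al_b) in fin_b
    if mode == 'intersection':        # final states A′ × B′
        return in_a and in_b
    if mode == 'union':               # final states A′ × B ∪ A × B′
        return in_a or in_b
    return in_a and not in_b          # difference: final states A′ × (B − B′)
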
Definition 2.4.3 Let (Tx | x ∈ X) be an X-indexed family of ΣX-forests. For each
ΣX-tree t we define a forest t(x ← Tx | x ∈ X), mostly written simply t(x ← Tx ), as
follows:


1◦ If t = z ∈ X, then t(x ← Tx ) = Tz .

2◦ If t = σ ∈ Σ0 , then t(x ← Tx ) = σ.

3◦ If t = σ(t1 , . . . , tm )(m ≥ 1), then

t(x ← Tx ) = {σ(s1 , . . . , sm ) | si ∈ ti (x ← Tx ) for i = 1, . . . , m}.

The forest product of the family (Tx | x ∈ X) with the ΣX-forest T is defined as the
ΣX-forest
T (x ← Tx | x ∈ X) = ⋃(t(x ← Tx | x ∈ X) | t ∈ T ).

We shall usually write just T (x ← Tx ). If T consists of a single ΣX-tree t, then

T (x ← Tx ) = t(x ← Tx ).

The trees t(x ← Tx ) are obtained from t by replacing every occurrence of each letter x
by a tree from the corresponding forest Tx . Different occurrences of the same letter x
may be rewritten as different trees from Tx .
If x1 , . . . , xn ∈ X, then we use the notation

T (x1 ← T1 , . . . , xn ← Tn )

for the forest product T (x ← Tx ), where



Tx = Ti for x = xi (i = 1, . . . , n), and Tx = x for x ∉ {x1 , . . . , xn }.

If the letters x1 , . . . , xn and their order are understood, then this notation may be further
simplified to T (T1 , . . . , Tn ).
The comments presented at the beginning of the section show that the definition of
forest products also includes the cases, where T ⊆ FΣ (X) and Tx ⊆ FΩ (Y ) (x ∈ X) for
any such alphabets that Σm ∩ Ωn = ∅ whenever m ≠ n. If T is a ΣX-forest and the
forests Tx are ΩY -forests, then T (x ← Tx ) is a (Σ ∪ Ω)Y -forest.

Example 2.4.4 Let Σ = Σ0 ∪ Σ2 , Σ0 = {ω}, Σ2 = {σ}, X = {x, y} and Y = {y, z}. If


t = σ(x, σ(y, x)), Tx = {σ(y, z), z} and Ty = {σ(ω, y), σ(z, z)}, then t(x ← Tx , y ← Ty )
contains eight trees, among them the tree σ(σ(y, z), σ(σ(ω, y), z)). ✷
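
When the forests Tx are finite, the product t(x ← Tx ) can be computed literally from the three clauses of Definition 2.4.3, each occurrence of a letter being replaced independently of the others. The following Python sketch (ours) reproduces the count stated in Example 2.4.4.

from itertools import product

def tree_product(t, subst):
    """The set t(x ← Tx); subst maps frontier letters to finite sets of trees,
    and letters not mentioned in subst are left unchanged."""
    if isinstance(t, str):                      # clause 1
        return set(subst.get(t, {t}))
    op, *subs = t
    if not subs:                                # clause 2: a nullary operator
        return {t}
    choices = [tree_product(s, subst) for s in subs]
    return {(op,) + combo for combo in product(*choices)}   # clause 3

# Example 2.4.4 ('s' stands for σ, ('w',) for ω):
t  = ('s', 'x', ('s', 'y', 'x'))
Tx = {('s', 'y', 'z'), 'z'}
Ty = {('s', ('w',), 'y'), ('s', 'z', 'z')}
print(len(tree_product(t, {'x': Tx, 'y': Ty})))              # 8, as in Example 2.4.4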

The following special type of forest products is important.

Definition 2.4.5 Let S and T be ΣX-forests and z ∈ X. The z-product of S and T is


the forest product
S ·z T = T (x ← Tx | x ∈ X)
where Tz = S and Tx = x for all x ∈ X, x ≠ z.


The trees in S ·z T are obtained by taking a tree t from T and substituting a tree from
S for every occurrence of z in t. Different occurrences of z may be replaced by different
trees from S.

Theorem 2.4.6 If T ∈ Rec(Σ, X) and Tx ∈ Rec(Σ, X) for all x ∈ X, then T (x ←


Tx ) ∈ Rec(Σ, X). In particular, Rec(Σ, X) is closed under all x-products (x ∈ X).

Proof. Here it is convenient to use regular tree grammars. Suppose T and the forests
Tx (x ∈ X) are generated by the regular ΣX-grammars G = (N, Σ, X, P, a0 ) and Gx =
(Nx , Σ, X, Px , ax ) (x ∈ X), respectively. We may assume that the grammars are in
normal form and that their sets of nonterminal symbols are pairwise disjoint. Construct
a regular ΣX-grammar
G′ = (N ′ , Σ, X, P ′ , a0 )
with N ′ = N ∪ ⋃(Nx | x ∈ X) and
P ′ = P ′′ ∪ {a → ax | x ∈ X, a → x ∈ P } ∪ ⋃(Px | x ∈ X),

where P ′′ is P with all productions of the form a → x (a ∈ N, x ∈ X) deleted.


We claim that T (G′ ) = T (x ← Tx ). The idea is that every derivation a0 ⇒G . . . ⇒G t
of a tree t ∈ T can be imitated by the productions in P ′′ up to the point where frontier
letters x ∈ X are to be generated. Instead of generating a leaf x one transfers then by a
production a → ax to the beginning of a derivation which generates any tree tx ∈ Tx in
place of the leaf. This means that G′ can generate all of T (x ← Tx ). On the other hand,
every derivation in G′ can be brought into this form by rearranging the applications of
the productions suitably. Hence, T (G′ ) ⊆ T (x ← Tx ). For a formal proof it suffices to
show that
a ⇒∗G′ p iff (∃q ∈ FΣ (X)) a ⇒∗G q, p ∈ q(x ← Tx ) (*)
holds for all a ∈ N and p ∈ FΣ (X). We proceed by tree induction on p. The fact that
the grammars G and Gx are in normal form is used without comment.

1◦ Let p = y ∈ X. Suppose there is a q ∈ FΣ (X) such that a ⇒∗G q and y ∈ q(x ← Tx ).


This is possible only in case q = z and y ∈ Tz for some z ∈ X. Then a → z ∈ P
and hence a → az , az → y ∈ P ′ . We get the derivation

a ⇒G′ az ⇒G′ y.

On the other hand, all derivations of y from a in G′ are of this form. Hence, if
a ⇒∗G′ y, then a → az , az → y ∈ P ′ for some z ∈ X. This means that a → z ∈ P
and az → y ∈ Pz , and thus z is the required tree q.

2◦ Let p = σ ∈ Σ0 .
(2a) If there is a q such that a ⇒∗G q and σ ∈ q(x ← Tx ), then there are two
possibilities. The first one is that q = σ. Then P and P ′ both contain a → σ
and we get the required derivation a ⇒∗G′ σ in one step. The other possibility


is that q = x ∈ X and Px contains ax → σ. Then a → ax and ax → σ are in


P ′ and we get the derivation

a ⇒G′ ax ⇒G′ σ.

(2b) Suppose a ⇒∗G′ σ. One possibility is that a → σ ∈ P ′ . Then a → σ is in P ,


too, and we may choose q = σ. The only alternative is that the derivation is
of the form a ⇒G′ ax ⇒G′ σ for some x ∈ X. Then a → x ∈ P and σ ∈ Tx ,
and we may put q = x.

3◦ Let p = σ(p1 , . . . , pm ) (m > 0).


(3a) Suppose we have a tree q such that a ⇒∗G q and p ∈ q(x ← Tx ). Again
there are two cases to consider. If q = z ∈ X, then p ∈ Tz , a → z ∈ P and
az ⇒∗Gz p. Now a → az ∈ P ′ and, since Pz ⊆ P ′ , we get

a ⇒G′ az ⇒∗G′ p.

The other possibility is that

q = σ(q1 , . . . , qm )

for some q1 , . . . , qm ∈ FΣ (X). Then

pi ∈ qi (x ← Tx ) (i = 1, . . . , m)

and the derivation a ⇒∗G q must begin with a step

a ⇒G σ(a1 , . . . , am )

such that
ai ⇒∗G qi for i = 1, . . . , m.
Our silent inductive assumption yields

ai ⇒∗G′ pi for i = 1, . . . , m.

Combining these derivations with a → σ(a1 , . . . , am ) ∈ P ′ we get a ⇒∗G′ p.


(3b) Suppose a ⇒∗G′ p. This could mean that a → z ∈ P and az ⇒∗Gz p for
some z ∈ X. Then we may choose q = z. The other possibility is that the
derivation takes the form

a ⇒G′ σ(a1 , . . . , am ) ⇒∗G′ σ(p1 , . . . , pm ).

Then there exist ΣX-trees qi such that

ai ⇒∗G qi , pi ∈ qi (x ← Tx ) (i = 1, . . . , m).

Now we may put q = σ(q1 , . . . , qm ). ✷


Next we generalize the iteration operation taking the x-products as the starting point.

Definition 2.4.7 Let T be any ΣX-forest and let x ∈ X. Put T 0,x = {x} and

T j+1,x = T j,x ·x T ∪ T j,x

for all j ≥ 0. Then the x-iteration of T is the ΣX-forest


T ∗x = ⋃(T j,x | j ≥ 0).

The forest T ∗x is obtained as follows. First include x. New members of T ∗x are


obtained by substituting in some t ∈ T for every occurrence of x some tree already
known to be in T ∗x . Note that T 1,x = T ∪ x and T j,x ⊆ T j+1,x for every j ≥ 0.

Theorem 2.4.8 If T ∈ Rec(Σ, X), then T ∗x ∈ Rec(Σ, X) for each x ∈ X.

Proof. Let G = (N, Σ, X, P, a0 ) be a regular tree grammar in normal form generating


the forest T . Construct an extended regular ΣX-grammar G′ = (N ′ , Σ, X, P ′ , A′ ), where

(1) N ′ = N ∪ {d} (d ∉ N ),

(2) P ′ = P ∪ {d → x} ∪ {a → r | a → x ∈ P, a0 → r ∈ P }, and

(3) A′ = {a0 , d}.

It is not hard to see that T (G′ ) = T ∗x . ✷

The following operation may be seen as a converse to the x-product.

Definition 2.4.9 Let S and T be ΣX-forests and let x ∈ X. The x-quotient of T by S


is the forest
S −x T = {p ∈ FΣ (X) | S ·x {p} ∩ T ≠ ∅}.
If S = {s} is a singleton, then we write S −x T = s−x T .

A tree p is in S −x T iff one can convert it into a tree in T by substituting for every
occurrence of x a tree from S. If Σ is unary and X = {x}, and if we identify the tree
σk (. . . σ1 (x) . . .) with the word σ1 . . . σk , then

S −x T = S −1 T = {u ∈ Σ∗ | Su ∩ T ≠ ∅}

is the usual (left) quotient language.

Theorem 2.4.10 If T ∈ Rec(Σ, X) and S is any ΣX-forest, then S −x T is recognizable


for every x ∈ X. Moreover, the number of different x-quotients S −x T for any fixed
T ∈ Rec(Σ, X) is finite.


Proof. Let A be a ΣX-recognizer for T . We define an NDF ΣX-recognizer


B = (A, β, A′ )
which is identical to A (when states a ∈ A and singleton sets {a} are identified) except
for the initial assignment which is defined so that
xβ = S α̂
and
zβ = {zα} for all z ∈ X, z ≠ x.
Here S α̂ is the set of all states sα̂ in which A may be after reading a tree s from S. By
tree induction one verifies that
tβ̂ = (S ·x t)α̂
for all t ∈ FΣ (X). Hence
t ∈ T (B) iff tβ̂ ∩ A′ ≠ ∅
iff (S ·x t)α̂ ∩ A′ ≠ ∅
iff S ·x t ∩ T ≠ ∅
iff t ∈ S −x T
for all t ∈ FΣ (X). This implies S −x T = T (B). The second statement follows from this
construction as the number of possible β’s is finite. ✷

Next we introduce the forest operation corresponding to the σ-catenation of trees which
was defined in Section 2.1.
Definition 2.4.11 Let σ ∈ Σ be an m-ary operator and let T1 , . . . , Tm be m ΣX-forests
for some m ≥ 0. The σ-product of the forests T1 , . . . , Tm is the forest
σ(T1 , . . . , Tm ) = {σ(t1 , . . . , tm ) | t1 ∈ T1 , . . . , tm ∈ Tm }.
If m = 0, then the σ-product is always {σ}. In general,
σ(T1 , . . . , Tm ) = {σ(x1 , . . . , xm )}(x1 ← T1 , . . . , xm ← Tm ).
From Theorem 2.4.6 we get the following result which could easily be proved directly,
too.
Corollary 2.4.12 If σ ∈ Σm and T1 , . . . , Tm ∈ Rec(Σ, X) (m ≥ 0), then σ(T1 , . . . , Tm ) ∈
Rec(Σ, X). ✷
We shall now consider some operations in which forests are generally transformed into
forests over another ranked alphabet. The ranked alphabets will be Σ and Ω. Moreover,
we introduce for every m ≥ 0, a new alphabet
Ξm = {ξ1 , . . . , ξm }
which is assumed to be disjoint from all other alphabets.


Definition 2.4.13 Suppose we are given a mapping

hX : X → FΩ (Y )

and for each m ≥ 0 a mapping

hm : Σm → FΩ (Y ∪ Ξm ).

The tree homomorphism determined by these mappings is the mapping

h : FΣ (X) → FΩ (Y )

defined as follows:

1◦ h(x) = hX (x) for each x ∈ X.

2◦ h(σ(t1 , . . . , tm )) = hm (σ)(ξ1 ← h(t1 ), . . . , ξm ← h(tm )) for all m ≥ 0, σ ∈ Σm and


t1 , . . . , tm ∈ FΣ (X).

The tree homomorphism h is said to be linear if no letter ξi appears more than once in
hm (σ) for any m ≥ 0 and σ ∈ Σm .

To define such an h it obviously suffices to give hX and the mappings hm for which
Σm ≠ ∅.

Example 2.4.14 Let Σ = Σ2 = { | }, Ω = Ω1 ∪ Ω2 , Ω1 = {′ }, Ω2 = {∨} and X = Y =


{x, y}. Define hX and h2 by the conditions

hX (x) = x, hX (y) = y and h2 (|) = ∨(′ (ξ1 ),′ (ξ2 )).

If we interpret | as the Sheffer stroke (i.e., the 2-place NAND), ∨ as the symbol of
disjunction and ′ as the symbol of negation, then the tree homomorphism h defined
by hX and h2 transforms |-expressions in variables x and y into equivalent expressions
which use ∨ and ′ only. If the more customary way to write Boolean expressions is used,
we get, for example,

h((x|y)|(x|x)) = h(x|y)′ ∨ h(x|x)′


= (x′ ∨ y ′ )′ ∨ (x′ ∨ x′ )′ .

This tree homomorphism is linear. ✷
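
Definition 2.4.13 is itself a recursion: the subtrees are transformed first, and their images are then substituted for the letters ξ1 , . . . , ξm in hm (σ). The Python sketch below is ours; the placeholder strings 'xi1', 'xi2' and the symbols 'v' and the prime are merely an encoding of Example 2.4.14.

def substitute(pattern, images):
    """Replace every placeholder 'xi<i>' occurring in pattern by images[i-1]."""
    if isinstance(pattern, str):
        return images[int(pattern[2:]) - 1] if pattern.startswith('xi') else pattern
    return (pattern[0],) + tuple(substitute(p, images) for p in pattern[1:])

def hom(t, h_X, h_m):
    """The tree homomorphism determined by h_X and the mappings h_m."""
    if isinstance(t, str):
        return h_X[t]
    op, *subs = t
    return substitute(h_m[op], [hom(s, h_X, h_m) for s in subs])

# Example 2.4.14: the Sheffer stroke '|' rewritten with 'v' (disjunction) and "'" (negation)
h_X = {'x': 'x', 'y': 'y'}
h_m = {'|': ('v', ("'", 'xi1'), ("'", 'xi2'))}
t = ('|', ('|', 'x', 'y'), ('|', 'x', 'x'))
print(hom(t, h_X, h_m))
# ('v', ("'", ('v', ("'", 'x'), ("'", 'y'))), ("'", ('v', ("'", 'x'), ("'", 'x'))))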

Tree homomorphisms are not really homomorphisms in the sense of algebra. The
concept is the result of the dual nature of words. When one generalizes from languages
to forests, words are usually treated as unary terms. On the other hand, many concepts in
language theory arise from the interpretation of words as elements of a free monoid. Here
the initial concept was that of a homomorphism from the free monoid generated by an
alphabet Σ to the free monoid generated by another alphabet Ω. Such a homomorphism
rewrites every letter in a word over Σ as a word over Ω. When Σ and Ω are now viewed


as unary ranked alphabets, this means that every operator from Σ is rewritten as a piece
of Ω-tree to be combined with other such pieces to form the image of a given Σ-word.
The generalization of such mappings to the case of arbitrary ranked alphabets gives tree
homomorphisms.
The following example shows that tree homomorphisms do not always preserve recog-
nizability.

Example 2.4.15 Put Σ = Σ1 = {σ}, X = Y = {x} and Ω = Ω2 = {ω}. Define hX and


h1 so that
hX (x) = x and h1 (σ) = ω(ξ1 , ξ1 ).
All ΣX-trees are of the type

tk = σ(σ(. . . σ(x) . . .)) = σ k (x) (k ≥ 0).

Obviously, h(t0 ) = hX (x) = x and, for all k ≥ 0,

h(tk+1 ) = ω(h(tk ), h(tk )).

Thus h(FΣ (X)) consists of the trees

s0 = x, s1 = ω(x, x), . . . , sk+1 = ω(sk , sk ), . . . .

Suppose A = (A, Ω, Y, α, A′ ) is an ΩY -recognizer such that T (A) = h(FΣ (X)). There


must exist two integers i, j ≥ 0, i ≠ j, such that si α̂ = sj α̂. But then

ω(si , sj )α̂ = ω A (si α̂, sj α̂) = ω A (si α̂, si α̂) = si+1 α̂ ∈ A′

would imply ω(si , sj ) ∈ h(FΣ (X)). Thus h(FΣ (X)) cannot be recognizable. ✷

The nonpreservation of recognizability in Example 2.4.15 is due to the ability of the


tree homomorphism to create arbitrarily large identical subtrees by copying. No tree
recognizer can check whether trees of unbounded height are identical or not. Such
copying is precluded by linearity, and the following closure theorem holds.

Theorem 2.4.16 If h : FΣ (X) → FΩ (Y ) is a linear tree homomorphism and T ∈


Rec(Σ, X), then h(T ) ∈ Rec(Ω, Y ).

Proof. Let G = (N, Σ, X, P, a0 ) be a regular tree grammar in normal form generating T .


We may assume that G has no superfluous nonterminal symbols from which no ΣX-tree
can be generated. Let Σ′ and Ω′ be the ranked alphabets which are obtained by adding
all nonterminal symbols a ∈ N to Σ and Ω, respectively, as nullary operators. We extend
h to a tree homomorphism
h′ : FΣ′ (X) → FΩ′ (Y )
by continuing h0 to a mapping

h′0 : Σ0 ∪ N → FΩ′ (Y )


so that h′0 (a) = a for all a ∈ N . Now let

G′ = (N, Ω, Y, P ′ , a0 )

be the regular ΩY -grammar, where

P ′ = {a → h′ (p) | a → p ∈ P },

i.e., G′ is obtained simply by replacing in every production a → p ∈ P the right-hand


side by the tree h′ (p). The theorem follows when we show that T (G′ ) = h(T ). This
again is obvious once we have shown that

a ⇒∗G′ t iff (∃s ∈ FΣ (X)) h(s) = t, a ⇒∗G s (*)

holds for all a ∈ N and t ∈ FΩ (Y ). We prove the two directions of (*) separately.
Suppose a ⇒∗G′ t for some a ∈ N and ΩY -tree t. We prove the existence of the required
s by induction on the length of the shortest derivation of t from a.

1◦ If t is obtained by a one-step derivation, then P ′ contains the production a → t.


Then P contains a production a → r such that h′ (r) = t. If r does not contain
any nonterminal symbols, we may put s = r. Otherwise we choose for every b ∈ N
appearing in r a tree rb ∈ FΣ (X) such that b ⇒∗G rb . Let s be the tree obtained
by substituting in r these trees for the corresponding nonterminal symbols. Then
h(s) = h′ (r) = t since h′ deletes all nonterminal symbols from r. Moreover,

a ⇒G r ⇒∗G s,

and s is the required tree.

2◦ Suppose now that the derivation consists of k steps (k > 1) and that (*) holds
whenever a shorter derivation exists. The first step must be the application of a
production a → h′ (p), where a → p ∈ P . Since G is in normal form,

p = σ(a1 , . . . , am )

for some m > 0, σ ∈ Σm and a1 , . . . , am ∈ N . The derivation of t can now be


written in the form

a ⇒G′ hm (σ)(ξ1 ← a1 , . . . , ξm ← am ) ⇒G′ . . . ⇒G′ t.

For each ξi (i = 1, . . . , m) which is present in hm (σ) we have a subderivation

ai ⇒G′ . . . ⇒G′ ti (∈ FΩ (Y ))

of length less than k. The linearity of h implies that such a ξi appears in hm (σ)
exactly once, and hence ti is unique. For every ti there is an si ∈ FΣ (X) such that
h(si ) = ti and ai ⇒∗G si . If a certain ξi does not appear in hm (σ), then we choose


any si ∈ FΣ (X) such that ai ⇒∗G si and put ti = h(si ). With these choices we get
a tree
s = σ(s1 , . . . , sm ) ∈ FΣ (X)
such that
a ⇒G σ(a1 , . . . , am ) ⇒∗G σ(s1 , . . . , sm ) = s
and
h(s) = hm (σ)(ξ1 ← h(s1 ), . . . , ξm ← h(sm )) = t.

Now we shall prove the converse part of (*). Suppose a ⇒∗G s and h(s) = t for some
a ∈ N , s ∈ FΣ (X) and t ∈ FΩ (Y ). To show that this implies a ⇒∗G′ t we proceed by
induction on the length of the shortest derivation a ⇒G . . . ⇒G s.

1◦ If there is a derivation of length one, then it consists simply of the application of


the production a → s. But then a → t is a production of G′ and a ⇒G′ t is the
required derivation.

2◦ Suppose now that the derivation is of the form

a ⇒G σ(a1 , . . . , am ) ⇒G . . . ⇒G σ(s1 , . . . , sm ) = s,

where m > 0, σ ∈ Σm and a1 , . . . , am ∈ N . For every i = 1, . . . , m there is a


shorter derivation
ai ⇒ G . . . ⇒ G s i .
Hence, ai ⇒∗G′ h(si ) for each i = 1, . . . , m. Moreover, P ′ contains the production

a → hm (σ)(ξ1 ← a1 , . . . , ξm ← am )

corresponding to the production a → σ(a1 , . . . , am ) of G. Now the required deriva-


tion is
a ⇒G′ hm (σ)(ξ1 ← a1 , . . . , ξm ← am ) ⇒G′ . . .
⇒G′ hm (σ)(ξ1 ← h(s1 ), . . . , ξm ← h(sm ))
= h(s) = t.
This concludes the proof. ✷

Next we show that arbitrary inverse tree homomorphisms preserve recognizability. We


need the following technical lemma. Its proof is left as an exercise.
Lemma 2.4.17 Consider a Σ-algebra A and a mapping α : X → A, where X ∩ A = ∅.
Let
ᾱ : FΣ (X ∪ A) → A
be the unique homomorphism such that ᾱ|X = α and ᾱ|A = 1A . Then ᾱ|FΣ (X) = α̂
and
p(ξ1 ← p1 , . . . , ξk ← pk )α̂ = p(ξ1 ← p1 α̂, . . . , ξk ← pk α̂)ᾱ
for all k ≥ 0, p ∈ FΣ (X ∪ Ξk ) and p1 , . . . , pk ∈ FΣ (X). ✷


Theorem 2.4.18 Let h : FΣ (X) → FΩ (Y ) be a tree homomorphism. If T ∈ Rec(Ω, Y ),


then h−1 (T ) ∈ Rec(Σ, X).

Proof. Let A = (A, Ω, Y, α, A′ ) be an ΩY -recognizer for T . We construct a ΣX-


recognizer B = (A, Σ, X, β, A′ ) as follows. For any m ≥ 0, σ ∈ Σm and a1 , . . . , am ∈ A,
we put
σ B (a1 , . . . , am ) = hm (σ)(ξ1 ← a1 , . . . , ξm ← am )ᾱ,

where ᾱ : FΩ (Y ∪ A) → A is the homomorphism for which ᾱ|Y = α and ᾱ|A = 1A . In

the special case m = 0, we get σ B = h0 (σ)ᾱ = h0 (σ)α̂.
by putting
xβ = h(x)α̂ for all x ∈ X.

Now a proof by tree induction shows that

sβ̂ = h(s)α̂

for all s ∈ FΣ (X). Hence, s ∈ T (B) iff h(s) ∈ T (A). This means that h−1 (T ) = T (B)
is recognizable. ✷

As a conclusion we consider a simple, but very important special type of tree homo-
morphisms.

Definition 2.4.19 A tree homomorphism h : FΣ (X) → FΩ (Y ) is called alphabetic if


the defining mappings hX and hm (m ≥ 0) satisfy the following conditions:

(1) hX (x) ∈ Y for all x ∈ X.

(2) hm (σ) = ω(ξ1 , . . . , ξm ), where ω ∈ Ωm , for all m ≥ 0, σ ∈ Σm .

An alphabetic tree homomorphism FΣ (X) → FΩ (Y ) can be defined only in case Ωm ≠ ∅
for all such m ≥ 0 that Σm ≠ ∅. Alphabetic tree homomorphisms are often called
projections.
Consider the general alphabetic tree homomorphism of the definition. For any t ∈
FΣ (X), the image h(t) is obtained simply by rewriting every x in t as the letter hX (x)
and every σ ∈ Σm as the operator ω, where hm (σ) = ω(ξ1 , . . . , ξm ). Hence h preserves
completely the “shape” of the tree t. Obviously, h is linear. From Theorems 2.4.16 and
2.4.18 we get

Corollary 2.4.20 Let h : FΣ (X) → FΩ (Y ) be an alphabetic tree homomorphism.

(i) If T ∈ Rec(Σ, X), then h(T ) ∈ Rec(Ω, Y ).

(ii) If T ∈ Rec(Ω, Y ), then h−1 (T ) ∈ Rec(Σ, X). ✷


2.5 REGULAR EXPRESSIONS. KLEENE’S THEOREM


Kleene’s theorem is of central importance in the theory of finite automata and it is
quite natural that it was among the first results to be generalized to the theory of
tree automata. Although the greater generality adds some technical complications, the
standard development of the theory can be followed quite close here, too, once the right
generalizations of the basic concepts have been found.
We fix again an arbitrary ranked alphabet Σ and an arbitrary frontier alphabet X. It
turns out that some additional frontier symbols are needed in the construction of regular
forests. Therefore we will operate with an extended alphabet Z which contains X as a
subset.

Definition 2.5.1 The set of regular ΣZ-expressions RE(Σ, Z) is defined as the smallest
set RE such that the following conditions are satisfied:

1◦ ∅ ∈ RE.

2◦ Σ0 ∪ Z ⊆ RE.

3◦ If ζ, η ∈ RE, then (ζ + η) ∈ RE.

4◦ If ζ, η ∈ RE and z ∈ Z, then (ζ ·z η) ∈ RE.

5◦ If ζ ∈ RE and z ∈ Z, then (ζ ∗z ) ∈ RE.

6◦ If m > 0, σ ∈ Σm , η1 , . . . , ηm ∈ RE, then σ(η1 , . . . , ηm ) ∈ RE.

Thus regular ΣZ-expressions are strings of symbols from Σ ∪ Z, of commas etc. Parts
2◦ and 6◦ of the definition imply that every ΣZ-tree is a regular ΣZ-expression. Regular
expressions are intended as representations of forests.

Definition 2.5.2 The forest |η| represented by a regular expression η ∈ RE(Σ, Z) is


defined following the inductive form of Definition 2.5.1:

1◦ |∅| = ∅ (the empty forest).

2◦ If η ∈ Σ0 ∪ Z, then |η| = {η}.

3◦ |(ζ + η)| = |ζ| ∪ |η|.

4◦ |(ζ ·z η)| = |ζ| ·z |η|.

5◦ |(ζ ∗z )| = |ζ|∗z .

6◦ |σ(η1 , . . . , ηm )| = σ(|η1 |, . . . , |ηm |).


Note that the operations in the right-hand sides of 3◦ − 6◦ are forest operations which
have been defined in Section 2.4. It is easy to see that every tree t ∈ FΣ (Z) represents,
as a regular expression, the one-element forest {t}.
With this interpretation in mind we may simplify regular expressions by omitting
parentheses that are not needed in order to specify the intended order of the opera-
tions. First of all, the outermost parentheses in (ζ + η), (ζ ·z η) and (ζ ∗z ) are obviously
superfluous if the expressions do not appear as parts of other expressions. We may
also agree that iterations precede products and that products precede unions. Then the
parentheses around ζ ∗z can always be omitted and, for example,

ζ + η ·x θ ∗y

is interpreted as a short form for

(ζ + (η ·x (θ ∗y ))).

Example 2.5.3 Let Σ = Σ0 ∪ Σ2 , Σ0 = {ω} and Σ2 = {σ} and Z = {x, y}. The forest
represented by
η = ω ·y σ(x, y)∗y
contains the trees ω, σ(x, ω), σ(x, σ(x, ω)) etc. Note that y has a purely auxiliary
function; it does not appear in any tree of the forest |η|. ✷
In the following definition we make the formal distinction between letters that may
appear in trees of the forest represented by a regular expression and those letters that
are used just to mark leaves to be rewritten when products of forests are formed.

Definition 2.5.4 Suppose a regular ΣZ-expression ζ can be written in the form

ζ = u(η ·z θ)v

where η, θ ∈ RE(Σ, Z) and z ∈ Z. Then every occurrence of z within the string ·z θ is


said to be bound. An occurrence of a letter z ∈ Z which is not bound is free. A letter
z ∈ Z is bound in ζ ∈ RE(Σ, Z), if all occurrences of z in ζ are bound, and it is free in ζ
if it has at least one free occurrence in ζ. We denote by Zζ the set of letters z ∈ Z free
in ζ.

In Example 2.5.3 Zη = {x} and y is bound by the y-product.
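
The sets Zζ can be computed by a recursion that follows the shape of the expression, in the same way as Lemma 2.5.5 below. In the Python sketch that follows (ours), regular expressions are nested tuples, and the tags '+', '.', '*' and 'op' are an encoding chosen only for this illustration.

def free_letters(e, Z):
    """The set Z_e of letters of Z with at least one free occurrence in e (Definition 2.5.4)."""
    if e is None:                               # the empty forest
        return set()
    if isinstance(e, str):                      # a letter of Σ0 ∪ Z
        return {e} if e in Z else set()
    tag = e[0]
    if tag == '+':                              # ('+', zeta, eta)  for  zeta + eta
        return free_letters(e[1], Z) | free_letters(e[2], Z)
    if tag == '.':                              # ('.', z, zeta, theta)  for  zeta ._z theta
        _, z, zeta, theta = e
        return free_letters(zeta, Z) | (free_letters(theta, Z) - {z})
    if tag == '*':                              # ('*', z, zeta)  for  zeta^{*z}
        _, z, zeta = e
        return free_letters(zeta, Z) | {z}
    # ('op', sigma, e1, ..., em)  for  sigma(e1, ..., em)
    return set().union(*(free_letters(s, Z) for s in e[2:])) if len(e) > 2 else set()

# eta = omega ._y sigma(x, y)^{*y} from Example 2.5.3:
eta = ('.', 'y', 'w', ('*', 'y', ('op', 's', 'x', 'y')))
print(free_letters(eta, {'x', 'y'}))            # {'x'}: y is bound by the y-product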


Lemma 2.5.5 For any η ∈ RE(Σ, Z), |η| ∈ Rec(Σ, Zη ).
Proof. We proceed by induction following the six parts in Definitions 2.5.1 and 2.5.2.
1◦ Z∅ = ∅ and |∅| = ∅ ∈ Rec(Σ, ∅).

2◦ For each z ∈ Z, Zz = {z} and |z| = {z} ∈ Rec(Σ, {z}). For σ ∈ Σ0 , Zσ = ∅, but
still |σ| = {σ} ∈ Rec(Σ, ∅).


3◦ If η = ζ + θ, then Zη = Zζ ∪ Zθ and |η| = |ζ| ∪ |θ| ∈ Rec(Σ, Zη ) by Lemma 2.4.1


and Theorem 2.4.2.

4◦ If η = ζ ·z θ, then (if we omit the trivial case z ∉ Zθ , |η| = |θ|) Zη = Zζ ∪ (Zθ − z).
There are two cases to consider. If z 6∈ Zζ , then Zη = (Zζ ∪ Zθ )− z. From Theorem
2.4.6 we know that |η| ∈ Rec(Σ, Zζ ∪ Zθ ). Thus it suffices to show that no tree
t ∈ |η| contains any occurrence of z. But this is obvious since every such t is
obtained from some s ∈ |θ| by replacing every occurrence of z by a tree from |ζ|,
and no tree in |ζ| contains z. If z ∈ Zζ , then Zη = Zζ ∪ Zθ and |η| ∈ Rec(Σ, Zη )
follows directly from Theorem 2.4.6.

5◦ If η = ζ ∗z (z ∈ Z), then Zη = Zζ ∪ z. Thus |ζ| ∈ Rec(Σ, Zη ) by Lemma 2.4.1. This


implies |ζ ∗z | ∈ Rec(Σ, Zη ) by Theorem 2.4.8.

6◦ If η = σ(η1 , . . . , ηm ), where m > 0, σ ∈ Σm and |ηi | ∈ Rec(Σ, Zηi ) (i = 1, . . . , m),


then Zη = Zη1 ∪. . .∪Zηm and every |ηi | is also a recognizable ΣZη -forest. Corollary
2.4.12 yields now |η| ∈ Rec(Σ, Zη ). ✷

The operations (finite) union, z-product and z-iteration are called the regular opera-
tions. A forest is regular if it can be constructed from finite forests by applying a finite
number of regular operations. In view of the preceding discussion regularity can also be
defined as follows:

Definition 2.5.6 A ΣX-forest T is regular if there exist an alphabet Z (X ⊆ Z) and a


regular ΣZ-expression η such that |η| = T .

Note that an unlimited number of auxiliary letters (z ∈ Z − X) is allowed in a regu-


lar expression representing a regular forest, but that in any particular case just a finite
number of them are needed. Lemma 2.5.5 implies now that all regular forests are recog-
nizable. The next lemma contains the converse statement.

Lemma 2.5.7 For any ΣX-recognizer A one can construct a regular expression η ∈
RE(Σ, X ∪ A) (we assume X ∩ A = ∅) such that |η| = T (A).

Proof. The proof is modelled after the almost standard proof for the corresponding fact
in the language case (due to R. McNaughton and H. Yamada (1960)). The notation can
be simplified by assuming that

A = {1, 2, . . . , k} for some k ≥ 1.

As in Lemma 2.4.17 let


ᾱ : FΣ (X ∪ A) → A
be the homomorphism such that ᾱ|X = α and ᾱ|A = 1A . For any i ∈ A, K ⊆ A and h,
0 ≤ h ≤ k, we denote by T (K, h, i) the set of all t ∈ FΣ (X ∪ K) such that

(1) tᾱ = i and


(2) sᾱ ∈ {1, . . . , h} for all s ∈ sub(t) − (X ∪ Σ0 ∪ t).


Thus t ∈ T (K, h, i) means that the leaves of t may be labelled, besides frontier letters
and nullary symbols, by states from K. Moreover, the computation of A on t results in
state i and the state of A at any node between the frontier and the root is in the set
{1, . . . , h}. Obviously,
T (A) = ⋃(T (∅, k, i) | i ∈ A′ ).
It suffices therefore to show that all sets T (K, h, i) are regular. To do this we proceed
by induction on the number h.

1◦ When h = 0, no intermediate states between the frontier and the root are allowed.
Every tree t in T (K, 0, i) must hence be of one of the following types:
(i) t = x ∈ X and xα = i.
(ii) t = i ∈ K.
(iii) t = σ ∈ Σ0 with σ A = i.
(iv) t = σ(d1 , . . . , dm ) with m > 0, dj ∈ X ∪ Σ0 ∪ K (j = 1, . . . , m) and tᾱ = i.
In each case a regular expression for {t} can be written. The number of such trees
t is finite and we get a regular expression for T (K, 0, i).
2◦ Suppose we already have a regular expression for each T (K, j, i) such that j ≤ h
for some h < k. We show that
T (K, h + 1, i) = T (K, h, i) ∪ T (K, h, h + 1) ·h+1 T (K ∪ h + 1, h, h + 1)∗h+1 ·h+1 T (K ∪ h + 1, h, i)   (*)
holds for all K ⊆ A and i ∈ A. This will complete the induction because the
right-hand side of (*) is obtained by regular operations from forests for which we
already have regular expressions.
Let T be the right-hand side of (*). From the construction of T it is obvious that
T ⊆ T (K, h + 1, i). If t ∈ T (K, h + 1, i), then either t ∈ T (K, h, i) or t has a proper
subtree s ∉ X ∪ Σ0 such that sᾱ = h + 1. In the former case we get t ∈ T directly.
In the second case we have
t ∈ {p1 , . . . , pd } ·h+1 {q11 , . . . , q1e1 } ·h+1 . . . ·h+1 {qj1 , . . . , qjej } ·h+1 {r},
for some
p1 , . . . , pd ∈ T (K, h, h + 1), q11 , . . . , q1e1 , . . . , qj1 , . . . , qjej ∈ T (K ∪ h + 1, h, h + 1)
and r ∈ T (K ∪ h + 1, h, i). This means that t belongs to the second part of T . ✷

Combining Lemma 2.5.5 and Lemma 2.5.7 we get the following generalized form of
Kleene’s theorem.

Theorem 2.5.8 A forest is recognizable iff it is regular. ✷


2.6 MINIMAL TREE RECOGNIZERS


The number of states is a simple and natural measure of the complexity of a finite
automaton. In this section we consider minimal recognizers of forests. In the case of a
recognizable forest minimality means simply a minimal number of states, and there is
always a minimal recognizer which is unique up to isomorphism. All tree recognizers
recognizing a nonregular forest must be infinite and counting the number of states does
not make any sense. Nevertheless, the general definition of minimality is such that
the minimal recognizer of a forest remains unique even in such a case. The minimal
recognizer of a forest can be derived from any recognizer of this forest. If the forest is
recognizable, then the minimalization procedure is effective. Otherwise, the finiteness
of the recognizers is not needed in this section. Also, some of the concepts and results
presented here will be applied to infinite tree recognizers in the next section. Thus we
will temporarily drop our general assumption that all tree recognizers dealt with are
finite. In all other respects the earlier definitions and conventions remain valid.
We shall now define homomorphisms, congruences and quotients of tree recognizers.
The reader may find it helpful to review the corresponding material from Section 1.2
before going on. Algebraic functions and elementary translations (cf. Sect. 1.3) will also
be needed.

Definition 2.6.1 A homomorphism from a ΣX-recognizer A to a ΣX-recognizer B is


a mapping ϕ : A → B such that

(1) ϕ is a homomorphism from the Σ-algebra A to the Σ-algebra B,

(2) αϕ = β, and

(3) B ′ ϕ−1 = A′ .

If ϕ is a homomorphism from A to B, we write ϕ : A → B. A homomorphism of tree


recognizers is an epimorphism if it is surjective, a monomorphism if it is injective, and it is
called an isomorphism if it is bijective. If there exists an isomorphism ϕ : A → B, then
we write A ∼ = B and say that A and B are isomorphic. If there exists an epimorphism
ϕ : A → B, then B is said to be an epimorphic image of A. A monomorphism is also
called an embedding.

Part (3) of Definition 2.6.1 means that the final states, and these only, map to final
states in a homomorphism. If ϕ is an epimorphism, then (3) implies A′ ϕ = B ′ .

Lemma 2.6.2 Let A and B be two ΣX-recognizers. If there exists a homomorphism


ϕ : A → B, then T (A) = T (B).

Proof. The clauses (1) and (2) of Definition 2.6.1 imply together with Lemma 1.3.6 that

tA (α)ϕ = tB (αϕ) = tB (β)


for every t ∈ FΣ (X). Now clause (3) shows that

t ∈ T (B) iff tB (β) = tA (α)ϕ ∈ B ′


iff tA (α) ∈ A′
iff t ∈ T (A)

for every t ∈ FΣ (X), and the lemma follows. ✷

Definition 2.6.3 A congruence of a ΣX-recognizer A is a congruence ̺ of the algebra


A saturating A′ , that is, such that A′ ̺ = A′ . The set of all congruence relations of A is
denoted by C(A).

Lemma 2.6.4 C(A) is a principal ideal of the complete lattice C(A), and thus (C(A),
⊆) is a complete lattice itself, too.

Proof. It suffices to verify the following simple facts:

(i) δA ∈ C(A) (which implies C(A) ≠ ∅).

(ii) θ ⊆ ̺ ∈ C(A) and θ ∈ C(A) imply θ ∈ C(A).

(iii) ∨(̺ | ̺ ∈ C(A)) ∈ C(A).

In (iii) the supremum is to be formed in C(A). It is the generating element of the


principal ideal. ✷

In Theorem 2.6.10 we shall get a more useful description of the greatest element of
C(A).

Definition 2.6.5 The quotient ΣX-recognizer of a ΣX-recognizer A with respect to a


congruence ̺ is the ΣX-recognizer

A/̺ = (A/̺, α̺ , A′ /̺),

where α̺ is defined so that xα̺ = (xα)̺ for each x ∈ X.

The usual relations between homomorphisms, congruences and quotients hold for tree
recognizers, too. Some of them are listed in the following theorem. We omit the proofs
since they can be constructed exactly as the corresponding proofs in algebra.

Theorem 2.6.6 (a) If ̺ ∈ C(A), the natural mapping

̺♮ : A → A/̺, a 7→ a̺ (a ∈ A),

is an epimorphism A → A/̺ (called the natural epimorphism).


(b) If ϕ : A → B is a homomorphism, then the kernel ϕϕ−1 is a congruence of A and


the image
Aϕ = (Aϕ, β, A′ ϕ)
of A is isomorphic to A/ϕϕ−1 . (In Aϕ, Aϕ is the Σ-algebra (Aϕ, Σ) such that
σ Aϕ = σ B |Aϕ and β is to be interpreted as a mapping from X to Aϕ.)

(c) If π ⊆ ̺ for some π, ̺ ∈ C(A), then A/̺ is an epimorphic image of A/π. ✷

From Theorem 2.6.6 and Lemma 2.6.2 we get

Corollary 2.6.7 If ̺ ∈ C(A), then T (A/̺) = T (A). ✷

Thus any congruence of a tree recognizer yields an equivalent recognizer which is an


epimorphic image of the original one. If the recognizer is finite and the congruence is
nontrivial, then a real reduction in the number of states is achieved. Obviously, the
greatest congruence gives the smallest quotient recognizer. The construction of the
quotient recognizer involves a merging of states which are equivalent in the sense that
one can be substituted for another in any computation without affecting the end result.
We shall now give a precise meaning to this equivalence of states and show then that
the greatest congruence consists exactly of the pairs of equivalent states.

Definition 2.6.8 Two states a and b of a ΣX-recognizer A are said to be equivalent


and we write a ∼A b or just a ∼ b, iff

(∀f ∈ Alg1 (A)) f (a) ∈ A′ ⇐⇒ f (b) ∈ A′ .

To get a better intuitive grasp of this definition we recall the fact that for each algebraic
function f ∈ Alg1 (A) there exists a tree t ∈ FΣ (A ∪ ξ) such that for all a ∈ A,

f (a) = tα̂a ,

where αa : A ∪ ξ → A is defined by αa |A = 1A and ξαa = a (Lemma 1.3.14). This means


that A computes f (a) from the tree t when one assigns state a to all leaves labelled
by ξ. On the other hand, every tree t ∈ FΣ (A ∪ ξ) defines this way a unary algebraic
function. Such a tree may be thought of as the unprocessed part of a ΣX-tree where a
leaf labelled by a state c ∈ A corresponds to a subtree s such that sα̂ = c. Once a value
a ∈ A has been assigned to the leaves labelled by ξ the computation may be completed.
The equivalence of two states a and b means that the assignments ξ = a and ξ = b give
always the same result (mod A′ ) when such a computation is completed.

Definition 2.6.9 The ΣX-recognizer A is

(a) reduced if ∼A = δA ,

(b) connected if every state of A is reachable, i.e., there exists for every a ∈ A a tree
t ∈ FΣ (X) such that tα̂ = a, and A is


(c) minimal if it is connected and reduced.

That a recognizer is reduced means that no two distinct states are equivalent. To
be connected means that every state is possible in some computation performed by
the recognizer on some tree. By Lemma 1.3.8, a tree recognizer A is connected iff
Xα generates A. In the case of a finite recognizer minimality really means a minimal
number of states among equivalent recognizers. If a recognizer is not connected, then
the nonreachable states can be discarded without changing the forest recognized. If A is
finite and ∼A > δA , then A/∼A is a properly smaller recognizer equivalent to A. Hence,
a finite tree recognizer can be minimal with respect to the number of states only if it is
minimal in the sense of Definition 2.6.9. The converse will be established later.

Theorem 2.6.10 For any ΣX-recognizer A, ∼ is the greatest congruence of A and


A/∼ is a reduced ΣX-recognizer equivalent to A.

Proof. It is obvious that ∼ is an equivalence relation on A. Let a ∼ b (a, b ∈ A). For


any two unary algebraic functions f, g ∈ Alg1 (A), the composition

f ◦ g : ξ ↦ g(f (ξ)) (ξ ∈ A)

is a unary algebraic function. Hence

g(f (a)) ∈ A′ iff g(f (b)) ∈ A′ ,

and this implies f (a) ∼ f (b). By Lemma 1.3.16, ∼ is a congruence of the algebra A. If a ∼ b and
a ∈ A′ , then b = 1A (b) ∈ A′ . Thus ∼ saturates A′ , i.e., A′ ∼ = A′ , and ∼ is a congruence of the recognizer A. Let ̺ be
any congruence of A. If a̺b and f ∈ Alg1 (A), then ̺ ∈ C(A) implies f (a)̺f (b). Now
A′ ̺ = A′ implies
f (a) ∈ A′ iff f (b) ∈ A′ .
Hence a ∼ b and we have shown that ∼ is the greatest among the congruences of A.
Corollary 2.6.7 tells us that T (A) = T (A/∼). That A/∼ is reduced follows directly
from the fact, well-known in universal algebra, that the lattice C(A/∼) is isomorphic to
the principal dual ideal [∼) generated by ∼ in C(A). Since ∼ is the greatest element of
C(A), [∼) is trivial and thus ∼A/∼ must be the diagonal relation of A/∼. A more direct
proof is possible, too. It is not hard to show that (a∼) ∼A/∼ (b∼) implies a ∼ b, and
hence a∼ = b∼. ✷

The quotient recognizer A/∼A is often called the reduced form of A. It is clear
from Theorem 2.6.10 that two tree recognizers having isomorphic reduced forms are
equivalent. We show that the converse holds for connected recognizers. In other words,
equivalent minimal recognizers are shown to be isomorphic.

Theorem 2.6.11 Let A and B be two minimal tree recognizers. If A and B are equiv-
alent, then they are also isomorphic.


Proof. Define ϕ : A → B so that

(tα̂)ϕ = tβ̂ for all t ∈ FΣ (X).

We show that ϕ gives the required isomorphism from A to B. This involves the following
seven points:

(i) ϕ associates with every a ∈ A a state of B since A is connected.

(ii) To show that ϕ is well-defined we consider the possibility that sα̂ = tα̂ for two
ΣX-trees s and t. If sβ̂ ≠ tβ̂, then sβ̂ and tβ̂ are nonequivalent and there exists
an algebraic function f ∈ Alg1 (B) such that f (sβ̂) ∈ B ′ and f (tβ̂) ∉ B ′ (or
conversely). By Lemma 1.3.14 there exists a tree p ∈ FΣ (B ∪ ξ) (ξ ∉ B ∪ X) such
that for all b ∈ B,
f (b) = pB (βb ),
where βb : B ∪ ξ → B is defined so that βb |B = 1B and ξβb = b. Since B is
connected there exists for each b ∈ B a ΣX-tree pb such that pb β̂ = b. Let

q = p(b ← pb | b ∈ B)(∈ FΣ (X ∪ ξ)).

Consider the ΣX-trees qs = q(ξ ← s) and qt = q(ξ ← t). Now

qs β̂ = pB (βsβ̂ ) = f (sβ̂) ∈ B ′

and
qt β̂ = pB (βtβ̂ ) = f (tβ̂) ∉ B ′ .

If we assign in q to every letter x ∈ X the value xα, we get a function g ∈ Alg1 (A)
such that for each a ∈ A,
g(a) = q A (αa )
where αa : X ∪ ξ → A is defined so that αa |X = α and ξαa = a. Applying Lemma
1.3.6 we get now
g(sα̂) = q A (αsα̂ ) = qs α̂ and g(tα̂) = q A (αtα̂ ) = qt α̂.
Since qs β̂ = f (sβ̂) ∈ B ′ and qt β̂ = f (tβ̂) ∉ B ′ , we have qs ∈ T (B), but qt ∉ T (B). On
the other hand, sα̂ = tα̂ implies qs α̂ = qt α̂, and so qs ∈ T (A) iff qt ∈ T (A). This is a
contradiction with our assumption that T (A) = T (B).

(iii) Reversing the roles of A and B in Part (ii) one sees that sβ̂ = tβ̂ implies sα̂ = tα̂
for all ΣX-trees s and t. This means that ϕ is injective.

(iv) ϕ is surjective since B is connected.


(v) Let m ≥ 0, σ ∈ Σm and a1 , . . . , am ∈ A. There are trees t1 , . . . , tm ∈ FΣ (X) such


that a1 = t1 α̂, . . . , am = tm α̂. Then

σ A (a1 , . . . , am )ϕ = σ A (t1 α̂, . . . , tm α̂)ϕ


= σ(t1 , . . . , tm )α̂ϕ
= σ(t1 , . . . , tm )β̂
= σ B (t1 β̂, . . . , tm β̂)
= σ B (a1 ϕ, . . . , am ϕ).

Hence ϕ is a homomorphism from A to B.

(vi) For each x ∈ X, xαϕ = xα̂ϕ = xβ̂ = xβ. Thus αϕ = β.

(vii) If tα̂ ∈ A′ (t ∈ FΣ (X)), then tα̂ϕ = tβ̂ ∈ B ′ since t ∈ T (A) = T (B). Similarly,
tα̂ϕ ∈ B ′ implies tα̂ ∈ A′ . Hence, B ′ ϕ−1 = A′ . ✷

Corollary 2.6.12 If A and B are connected ΣX-recognizers such that T (A) = T (B),
then A/∼A ≅ B/∼B . ✷

For every ΣX-forest T there is at least the infinite ΣX-recognizer

FT = (FΣ (X), 1X , T )

where FΣ (X) = (FΣ (X), Σ) is the ΣX-term algebra. Indeed, for each t ∈ FΣ (X) we
have
tFΣ (X) (1X ) = t, and hence t ∈ T (FT ) iff t ∈ T.
Obviously FT is connected. Hence, FT /∼ is a minimal recognizer for T (the relation ∼
will be examined more closely in the next section). To show it we shall verify that every
quotient recognizer of a connected tree recognizer is connected.
Let ϕ : A → B be an epimorphism of ΣX-recognizers. If A is connected, then so is
B. Indeed, let b be any state of B. There exists an a ∈ A such that aϕ = b. Since A is
connected there is a tree t ∈ FΣ (X) so that a = tA (α). Using Lemma 1.3.6 we get

tB (β) = tB (αϕ) = tA (α)ϕ = aϕ = b.

In particular, A/∼A is connected for every tree recognizer A.


We now have everything needed for the main theorem of the section.

Theorem 2.6.13 For every forest T there exists a minimal tree recognizer, and it is
unique up to isomorphism. If A is any connected recognizer of T , then the minimal
recognizer is an epimorphic image of A. In fact, A/∼A is minimal. ✷

The theorem is valid for every forest. It suggests the following two-step procedure for
finding the minimal recognizer for T once any recognizer A of T is given:


1◦ Discard all nonreachable states from A. We get a connected recognizer B such


that T (B) = T .
2◦ Reduce B by finding ∼B and then constructing B/∼B which is the required mini-
mal recognizer.
Both of these steps become effective when T is a recognizable forest and the given
recognizer A is finite.
The reachable states of A form the subalgebra of A generated by the subset Xα. This
can be found as follows. Let H0 = Xα ∪ {σ A | σ ∈ Σ0 } and put
Hi+1 = Hi ∪ {σ A (a1 , . . . , am ) | m > 0, σ ∈ Σm , a1 , . . . , am ∈ Hi }.
Then
H0 ⊆ H1 ⊆ . . . ⊆ A
and Hi = [Xα] (i ≥ 0) if Hi+1 = Hi . Such an i must exist since A is finite.
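As an illustration, this closure computation can be sketched in Python. The representation of a recognizer by an operation table ops (mapping each pair (σ, m) to an m-ary function on states) and an initial assignment alpha is an ad hoc convention chosen only for the sketch.

    from itertools import product

    def reachable_states(ops, alpha):
        """Compute [X*alpha], the subalgebra of reachable states of a finite recognizer.

        ops   : dict mapping (sigma, m) to an m-ary function on states,
        alpha : dict mapping each frontier letter x to the state x*alpha.
        """
        H = set(alpha.values())                               # H_0 starts from X*alpha
        H |= {f() for (sym, m), f in ops.items() if m == 0}   # ... and the 0-ary operators
        changed = True
        while changed:                                        # repeat until H_{i+1} = H_i
            changed = False
            for (sym, m), f in ops.items():
                if m == 0:
                    continue
                for args in product(tuple(H), repeat=m):      # all sigma^A(a_1,...,a_m), a_j in H_i
                    a = f(*args)
                    if a not in H:
                        H.add(a)
                        changed = True
        return H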
Suppose now that we have a finite connected ΣX-recognizer B and consider step 2◦ .
First one should find Alg1 (B). It is finite and can be formed repeating the inductive step
of Definition 1.3.13 a finite number of times. Then ∼B can be determined directly, using
the definition. Although the minimal recognizer B/∼B certainly can be found this way,
the procedure would be quite tedious in most cases. A computationally simpler method
can be derived from the following lemma. The proof is left as an exercise. The crucial
aid is Lemma 1.3.16: an equivalence is a congruence iff it is invariant with respect to all
elementary translations.
Lemma 2.6.14 Define a descending sequence ∼0 ⊇∼1 ⊇ . . . of equivalences on B as
follows: (i) B/∼0 = {B ′ , B − B ′ } and (ii) for all i ≥ 0 and a, b ∈ B, a ∼i+1 b iff a ∼i b
and f (a) ∼i f (b) for all f ∈ ET(B). Then ∼i =∼B if ∼i+1 =∼i , and this holds for some
i < |B|. ✷
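The refinement of the lemma is straightforward to program. The following sketch (again using the ad hoc recognizer representation introduced above) enumerates the elementary translations explicitly and returns a block number for every state; two states are equivalent iff they receive the same block number.

    from itertools import product

    def elementary_translations(states, ops):
        """All unary functions a -> sigma^B(b1,...,a,...,bm) obtained by fixing one
        argument position of an operation and filling the rest with states."""
        ets = []
        for (sym, m), f in ops.items():
            for i in range(m):
                for fixed in product(states, repeat=m - 1):
                    ets.append(lambda a, f=f, i=i, fixed=fixed:
                               f(*fixed[:i], a, *fixed[i:]))
        return ets

    def state_equivalence(states, ops, final):
        """The blocks of ~_B, computed by the refinement ~_0, ~_1, ... of Lemma 2.6.14."""
        ets = elementary_translations(states, ops)
        block = {a: int(a in final) for a in states}          # the partition ~_0
        while True:
            sig = {a: (block[a], tuple(block[g(a)] for g in ets)) for a in states}
            ids = {s: i for i, s in enumerate(sorted(set(sig.values())))}
            new_block = {a: ids[sig[a]] for a in states}
            if len(set(new_block.values())) == len(set(block.values())):
                return new_block                              # ~_{i+1} = ~_i, so this is ~_B
            block = new_block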

2.7 ALGEBRAIC CHARACTERIZATIONS OF


RECOGNIZABILITY
In this section two strictly algebraic characterizations of the recognizable forests are
presented. First some ideas from the previous section are applied to derive a generaliza-
tion of Nerode’s theorem on regular languages and right congruences of the free monoid
(cf. Theorem 1.5.6). Then we show that the recognizable forests can be obtained by
solving fixed-point equations of a certain kind. Again, there is a well-known precursor
in the theory of finite automata. In fact, in the unary case the equations considered here
reduce to Arden’s equations which give the regular languages as their solutions.
Let Σ and X be fixed and denote the ΣX-term algebra FΣ (X) by F, for short. In
the previous section we noted that each ΣX-forest T has the (infinite) ΣX-recognizer
FT = (F, 1X , T ). Consider any ΣX-recognizer A such that T (A) = T . It is easy to
verify that the extension of the initial assignment α : X → A to a homomorphism
α̂ : F → A


is also a homomorphism of ΣX-recognizers from FT to A. Indeed, 1X α̂ = α and


A′ α̂−1 = T (A) = T . The kernel α̂α̂−1 is a congruence of FT with a congruence class
for each reachable state of A. If T is recognizable, A may be chosen as finite, and then
α̂α̂−1 is of finite index. Now, suppose FT has a congruence ̺ of finite index. Then
FT /̺ is a finite ΣX-recognizer such that T (FT /̺) = T (FT ) = T (by Corollary 2.6.7).
Hence T is recognizable. The congruences of FT are simply the congruences of F which
saturate T . Among these there is one of finite index iff the greatest congruence ∼FT
of FT is of finite index. The congruence ∼FT (∼T for short) is the Nerode congruence
of T . These observations may be summed up as

Theorem 2.7.1 For every ΣX-forest T the following three conditions are equivalent:

(i) T ∈ Rec(Σ, X).

(ii) The term algebra FΣ (X) has a congruence of finite index which saturates T .

(iii) The index of the Nerode congruence ∼T is finite. ✷

The recognizer FT is connected and Theorem 2.6.10 implies therefore that FT / ∼T is


the minimal recognizer of the forest T . To find ∼T for a given ΣX-forest T one could
try to apply Definition 2.6.8 to FT : for any s, t ∈ FΣ (X),
 
s ∼T t iff (∀p ∈ FΣ (X ∪ ξ)) (p(ξ ← s) ∈ T ⇐⇒ p(ξ ← t) ∈ T ) .

A part of Theorem 2.7.1 can be restated as follows.

Corollary 2.7.2 A ΣX-forest T is recognizable iff there exist a finite Σ-algebra A, a


homomorphism ϕ : FΣ (X) → A and a subset A′ ⊆ A such that T = A′ ϕ−1 . ✷

The corollary gives, in fact, just an obvious reformulation of the definition of recog-
nizability. Without going into the subject any further here, we note that in this form
recognizability may be defined for subsets of arbitrary algebras (and not just term al-
gebras): a subset T of a Σ-algebra A is said to be recognizable, if there exist a finite
Σ-algebra B, a homomorphism ϕ : A → B and a subset H ⊆ B such that Hϕ−1 = T .
If here A = FΣ (X), then we get the recognizable ΣX-forests, and if A is the free
monoid X ∗ , then we get the recognizable X-languages.
As an introduction to the theory of fixed-point equations we first look at an example
of Arden equations.

Example 2.7.3 Consider the two-state Rabin-Scott recognizer A defined by the state
graph shown in Fig. 2.6. The input alphabet is Σ = {σ, τ }.
Let L1 and L2 be the languages of all words taking A from the initial state 1 to state
1 and 2, respectively. Then the following equations hold:

L1 = L1 σ ∪ L2 σ ∪ e,
L2 = L1 τ ∪ L2 τ .        (1)


Figure 2.6. (State graph of A over Σ = {σ, τ }: state 1 is initial; from either state, σ leads to state 1 and τ leads to state 2.)

If we define a mapping
Π̂ : (pΣ∗ )2 → (pΣ∗ )2
so that for all U, V ⊆ Σ∗ ,
Π̂(U, V ) = (U σ ∪ V σ ∪ e, U τ ∪ V τ ) ,
then (1) means that (L1 , L2 ) is a solution of the fixed-point equation
(v1 , v2 ) = Π̂(v1 , v2 ) . (2)
Moreover, (L1 , L2 ) is the least solution of (2) when (pΣ∗ )2 is partially ordered in the
natural way:
(U1 , V1 ) ≤ (U2 , V2 ) iff U1 ⊆ U2 and V1 ⊆ V2 .
If we view Σ as a unary ranked alphabet and identify Σ{x}-trees and Σ-words as shown
in Section 2.2 (x = e, σk (· · · σ1 (x) · · · ) = σ1 · · · σk ), then the term algebra FΣ ({x}) may
be taken to be
F = (Σ∗ , Σ) ,
where
σ F (u) = uσ (σ ∈ Σ, u ∈ Σ∗ ) .
In the corresponding subset algebra
pF = (pΣ∗ , Σ)
we have the operations
σ pF (L) = Lσ (σ ∈ Σ, L ⊆ Σ∗ ) .
The mapping Π̂ can be defined in terms of these operations, the empty word and unions:

Π̂(U, V ) = (σ pF (U ) ∪ σ pF (V ) ∪ x, τ pF (U ) ∪ τ pF (V )) .
Using forest products we may write this as follows:
Π̂(U, V ) = ({σ(v1 ), σ(v2 ), x}(v1 ← U, v2 ← V ),
{τ (v1 ), τ (v2 )}(v1 ← U, v2 ← V )) .        (3)
Finally, we write (2) in the more readable form
v1 = σ(v1 ) + σ(v2 ) + x,
v2 = τ (v1 ) + τ (v2 )        (4)
as a system of equations to be solved in the forest algebra pF which is augmented by
union as an operation. Union is denoted here by +. ✷


It is obvious that Example 2.7.3 could be repeated for any regular language and that
the language itself is always the union of those components of the minimal fixed-point
which correspond to final states. The interpretation of the equations in terms of forest
operations serves as the starting point for a generalization to equations for regular forests.
Fix again a ranked alphabet Σ and a frontier alphabet X. For any k ≥ 1, let

Fk = (pFΣ (X))k

be the set of k-tuples of ΣX-forests. We order Fk partially by componentwise inclusion:

(S1 , . . . , Sk ) ≤ (T1 , . . . , Tk ) iff S1 ⊆ T1 , . . . , Sk ⊆ Tk .

Then Fk becomes a complete lattice in which least upper bounds and greatest lower
bounds are obtained, respectively, by forming componentwise unions and intersections,
thus
∨((Si1 , . . . , Sik ) | i ∈ I) = (∪(Si1 | i ∈ I), . . . , ∪(Sik | i ∈ I))

and

∧((Si1 , . . . , Sik ) | i ∈ I) = (∩(Si1 | i ∈ I), . . . , ∩(Sik | i ∈ I)) .

The least element is 0 = (∅, . . . , ∅). (We refer the reader to Section 1.4 for the lattice
theory needed here.)
Let Vk = {v1 , . . . , vk } be a set of variables disjoint from Σ and X. With every Σ(X ∪
Vk )-forest P we associate the mapping

P̂ : Fk → pFΣ (X)

defined so that
P̂ (T1 , . . . , Tk ) = P (v1 ← T1 , . . . , vk ← Tk )
for all (T1 , . . . , Tk ) ∈ Fk . A k-tuple Π = (P1 , . . . , Pk ) of finite Σ(X ∪ Vk )-forests is called
a (Σ, X, k)-polynomial and we associate with it the mapping

Π̂ : Fk → Fk

defined so that

Π̂(T) = (P̂1 (T), . . . , P̂k (T)) (T ∈ Fk ) .

Lemma 2.7.4 For any (Σ, X, k)-polynomial Π, the mapping Π̂ : Fk → Fk is ω-continuous.

Proof. Let Π = (P1 , . . . , Pk ). The mapping Π̂ is isotone as

P (v1 ← S1 , . . . , vk ← Sk ) ⊆ P (v1 ← T1 , . . . , vk ← Tk )


obviously holds for all P ⊆ FΣ (X ∪ Vk ) and ΣX-forests S1 , . . . , Sk , T1 , . . . , Tk such that


S1 ⊆ T1 , . . . , Sk ⊆ Tk . Let
T0 ⊆ T1 ⊆ T2 ⊆ . . .
be any ascending ω-sequence of vectors

Ti = (Ti1 , . . . , Tik ) ∈ Fk (i ≥ 0)

of ΣX-forests. Now write

T = (∪(Ti1 | i ≥ 0), . . . , ∪(Tik | i ≥ 0)) .

In order to prove ω-continuity we should show that

Π̂(T) = (∪(P̂1 (Ti ) | i ≥ 0), . . . , ∪(P̂k (Ti ) | i ≥ 0)) ,

or equivalently, that

P̂j (T) = ∪(P̂j (Ti ) | i ≥ 0) (j = 1, . . . , k) . (5)

Every tree t ∈ P̂j (T) is obtained from some p ∈ Pj by substituting a tree from ∪(Tim |
i ≥ 0) for every occurrence of each variable vm and each m = 1, . . . , k. The number of
occurrences of variables in p is finite. Hence there exists an i ≥ 0 such that all trees used
in this substitution appear in a component of Ti . Then t ∈ P̂j (Ti ). This shows that the
left side of (5) is included in the right side of (5) for each j = 1, . . . , k. The converse
inclusions are obvious since Π̂ is isotone and Ti ≤ T for all i ≥ 0. ✷

Now, using Theorem 1.4.8 we get

Corollary 2.7.5 For any (Σ, X, k)-polynomial Π, the mapping Π̂ : Fk → Fk has the
least fixed-point
[Π̂] = ∨(0Π̂ i | i ≥ 0) . ✷

The corollary means that [Π̂] is the least solution of the fixed-point equation

(v1 , . . . , vk ) = Π̂(v1 , . . . , vk ) , (6)

where the vi ’s are “unknowns” that assume ΣX-forests as their values. The equation (6)
can also be written as a system of equations


v1 = P1
...
vk = Pk ,        (7)


where the P ’s are usually expressed as formal sums of their elements (as we did in
Example 2.7.3).
The finiteness of the components Pi was not used in the proof of Lemma 2.7.4. How-
ever, it will be essential for obtaining the main result of this section. In fact, it will
be convenient, although not necessary, to work with an even more restricted class of
fixed-point equations, which we shall soon introduce. Example 2.7.3 provides us with a
guideline here, too.
Let us extend the height function of FΣ (X) to FΣ (X ∪ Vk ) so that

hg(vi ) = −1 (i = 1, . . . , k) .

Then the Σ(X ∪ Vk )-trees of height 0 are


(i) the frontier letters x ∈ X,

(ii) the 0-ary operators σ ∈ Σ0 , and

(iii) the trees of the form σ(vi1 , . . . , vim ), where m > 0, σ ∈ Σm and vi1 , . . . , vim ∈ Vk .

Definition 2.7.6 A (Σ, X, k)-polynomial Π = (P1 , . . . , Pk ) is regular, if every Σ(X∪Vk )-


tree of height 0 belongs to exactly one Pj , and the Pj ’s do not contain any other trees.
If Π is regular, then Π̂ and the corresponding fixed-point equation (6) are also said to
be regular. A ΣX-forest T is called equational if it can be expressed as the union of
some components of the least solution of a regular fixed-point equation.

The fixed-point equation in Example 2.7.3 is regular. It is easy to see that the same
procedure applied to any Rabin-Scott recognizer will yield a regular fixed-point equation.
Hence, every regular language is equational when viewed as a unary forest. It is also
well-known, and easy to prove, that the components of the least solution of a system of
Arden equations are regular.

Example 2.7.7 Let Σ = Σ0 ∪ Σ2 , Σ0 = {γ}, Σ2 = {σ} and X = {x, y}. Then



Π = ({x, γ, σ(v1 , v2 ), σ(v2 , v1 )}, {y, σ(v1 , v1 ), σ(v2 , v2 )})

is a regular (Σ, X, 2)-polynomial. The corresponding regular fixed-point equation can


be written as the system
v1 = x + γ + σ(v1 , v2 ) + σ(v2 , v1 )
v2 = y + σ(v1 , v1 ) + σ(v2 , v2 ) .

The least solution is the pair (T1 , T2 ), where


 
T1 = {x, γ, σ(x, y), σ(γ, y), σ(y, x), σ(y, γ), σ(x, σ(x, x)), . . . }

and

T2 = {y, σ(x, x), σ(γ, γ), σ(y, y), . . . } . ✷
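The approximations 0 ≤ 0Π̂ ≤ 0Π̂ 2 ≤ . . . of this least solution can be computed mechanically. In the following Python sketch trees are encoded as nested tuples and the symbol names are ad hoc; three iterations already produce the trees listed above.

    from itertools import product

    def sg(l, r):                      # the binary symbol sigma
        return ("sigma", l, r)

    def pi_hat(T1, T2):
        """One application of the regular polynomial Pi of the example."""
        S1 = ({"x", "gamma"}
              | {sg(a, b) for a, b in product(T1, T2)}      # sigma(v1, v2)
              | {sg(b, a) for b, a in product(T2, T1)})     # sigma(v2, v1)
        S2 = ({"y"}
              | {sg(a, b) for a, b in product(T1, T1)}      # sigma(v1, v1)
              | {sg(a, b) for a, b in product(T2, T2)})     # sigma(v2, v2)
        return S1, S2

    T1, T2 = set(), set()              # the least element 0 = (empty, empty)
    for _ in range(3):                 # the approximations 0, 0*Pi_hat, 0*Pi_hat^2, ...
        T1, T2 = pi_hat(T1, T2)
    # T1 now contains x, gamma, sigma(x, y), sigma(gamma, y), sigma(y, x), ... ;
    # T2 contains y, sigma(x, x), sigma(gamma, gamma), sigma(y, y), ...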


Let [Π̂] = (T1 , . . . , Tk ) be the least fixed-point for a given (Σ, X, k)-polynomial Π. We
define a binary relation ̺(Π) in FΣ (X):

̺(Π) = {(s, t) | s, t ∈ Ti for some i = 1, . . . , k} .

Lemma 2.7.8 If Π is a regular (Σ, X, k)-polynomial, then ̺(Π) is a congruence of FΣ (X)


with at most k equivalence classes. For each congruence ̺ of FΣ (X) of index k (k ≥ 1)
there exists a regular (Σ, X, k)-polynomial Π such that ̺(Π) = ̺.

Proof. Let Π = (P1 , . . . , Pk ) be a regular (Σ, X, k)-polynomial and [Π̂] = (T1 , . . . , Tk )


the corresponding least fixed-point. From the definition of ̺(Π) it is clear that the
relation is symmetric. To prove that it is reflexive and transitive, too, we show that
every ΣX-tree t belongs to exactly one Ti . First we note that

Ti = Pi (v1 ← T1 , . . . , vk ← Tk ) (i = 1, . . . , k) (8)

as [Π̂] is a fixed-point of Π̂. We proceed now by induction on hg(t).

1o If hg(t) = 0, then t is in exactly one of the sets Pi (i = 1, . . . , k) because Π is


regular. From (8) we see that t is in the corresponding Ti and that it could belong
to some other Tj (j ≠ i) only in case vi ∈ Pj . But hg(vi ) = −1 and vi does not
appear in Π.

2o Consider a tree t = σ(t1 , . . . , tm ) (m > 0) and assume that all trees of lesser height
belong to exactly one Ti . Then there exists for each j = 1, . . . , m exactly one ij
(1 ≤ ij ≤ k) such that tj ∈ Tij . Also, there is exactly one i (1 ≤ i ≤ k) such that
p = σ(vi1 , . . . , vim ) ∈ Pi . Clearly,

t ∈ p(v1 ← T1 , . . . , vk ← Tk ) ⊆ Ti .

The uniqueness of the indices ij implies that p is the only tree of height 0 in FΣ (X ∪
Vk ) from which t can be obtained by the substitutions v1 ← T1 , . . . , vk ← Tk . Hence
t belongs to Ti only.

Now we know that ̺(Π) ∈ E(FΣ (X)). It is obvious that it has at most k equivalence
classes. (There may be less than k classes as some T ’s could be empty.) To prove that it is
a congruence relation we consider any m ≥ 1, σ ∈ Σm and s1 , . . . , sm , t1 , . . . , tm ∈ FΣ (X)
such that 
s1 ≡ t1 , . . . , sm ≡ tm (̺(Π)) .
There are indices i1 , . . . , im such that

sj , tj ∈ Tij for j = 1, . . . , m .

Let σ(vi1 , . . . , vim ) be in Pi . Then

σ(s1 , . . . , sm ), σ(t1 , . . . , tm ) ∈ Ti


by (8). Hence 
σ FΣ (X) (s1 , . . . , sm ) ≡ σ FΣ (X) (t1 , . . . , tm ) (̺(Π))
as required. 
Now, suppose ̺ ∈ C(FΣ (X)) and let S1 , . . . , Sk be the equivalence classes of ̺. We
define a (Σ, X, k)-polynomial Π = (P1 , . . . , Pk ) so that
Pi = {p ∈ FΣ (X ∪ Vk ) | hg(p) = 0, p(v1 ← S1 , . . . , vk ← Sk ) ⊆ Si }
for all i = 1, . . . , k. The fact that ̺ is a congruence means that for each p of height 0
there is exactly one i (1 ≤ i ≤ k) such that
p(v1 ← S1 , . . . , vk ← Sk ) ⊆ Si .
Hence Π is regular. We claim that ̺(Π) = ̺. Let [Π̂] = (T1 , . . . , Tk ). In order to
prove the second statement of the lemma we show by induction on hg(t) that for all
i = 1, . . . , k,  
(∀t ∈ FΣ (X)) (t ∈ Si ⇐⇒ t ∈ Ti ) .
1o If hg(t) = 0, then there is exactly one i such that t ∈ Pi . This means t ∈ Si .
From (8) it follows that t ∈ Ti for the same i.
2o Let t = σ(t1 , . . . , tm ) (m > 0) and suppose the claim holds for all trees of height <
hg(t). Then there are unique indices i1 , . . . , im such that
tj ∈ Sij ∩ Tij (j = 1, . . . , m) .
Also, there is a unique i such that
p = σ(vi1 , . . . , vim ) ∈ Pi .
Then
t ∈ σ(Si1 , . . . , Sim ) = p(v1 ← S1 , . . . , vk ← Sk ) ⊆ Si
by the definition of Pi . On the other hand, (8) implies t ∈ Ti . ✷

If we combine Lemma 2.7.8 and Theorem 2.7.1, we get

Theorem 2.7.9 A forest is equational iff it is recognizable. ✷

From the first part of this section it is clear that a ΣX-forest T can be recognized by a
k-state tree recognizer iff T is saturated by a congruence of FΣ (X) of index ≤ k. From
Lemma 2.7.8 we get a similar connection between the number of states and the number
of variables in a regular fixed-point equation which defines the forest.
There is also a very close connection between regular tree grammars and the fixed-point
equations considered here. For example, the equations of Example 2.7.7 can be converted
into the following set of productions in which v1 and v2 are nonterminal symbols:
v1 → x, v1 → γ, v1 → σ(v1 , v2 ), v1 → σ(v2 , v1 ),
v2 → y, v2 → σ(v1 , v1 ), v2 → σ(v2 , v2 ) .


The resulting regular tree grammar generates T1 if v1 is the initial symbol, and it gen-
erates T2 if v2 is the initial symbol.
On the other hand, every regular ΣX-grammar with k nonterminal symbols can be
converted into a fixed-point system with k equations. This system is not necessarily
regular, but the components of the least solution are nevertheless the regular forests
generated by the grammar from the different nonterminal symbols. For example, if
Σ and X are as in Example 2.7.7 and the productions are

a → x, a → γ, a → σ(a, b), b → σ(b, b), b → y,

then the corresponding equations would be

a = x + γ + σ(a, b) and
b = y + σ(b, b) ,

where a and b now are the unknowns. The least solution is (T (Ga ), T (Gb )), where
Ga and Gb are the grammars which we obtain by choosing a and b, respectively, as the
initial symbol.
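The bookkeeping behind this conversion is completely mechanical: the right-hand sides of the productions with a common left-hand nonterminal are collected into one formal sum. A Python sketch, with productions encoded ad hoc as pairs, might read:

    def grammar_to_equations(productions):
        """Group productions (lhs, rhs) of a regular tree grammar by their left-hand
        sides; the value lists play the role of the formal sums on the right-hand sides."""
        equations = {}
        for lhs, rhs in productions:
            equations.setdefault(lhs, []).append(rhs)
        return equations

    # the grammar discussed above, with trees written as nested tuples
    prods = [("a", "x"), ("a", "gamma"), ("a", ("sigma", "a", "b")),
             ("b", ("sigma", "b", "b")), ("b", "y")]
    # {'a': ['x', 'gamma', ('sigma', 'a', 'b')], 'b': [('sigma', 'b', 'b'), 'y']}
    print(grammar_to_equations(prods))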

2.8 A MEDVEDEV-TYPE CHARACTERIZATION


Our next description of the recognizable forests is a streamlined generalization of a well-
known characterization of the regular languages given by J. Medvedev in 1956. First
we define the family of representable forests. The theorem states then that the repre-
sentable forests are exactly the recognizable forests. The representable forests are defined
collectively for all ranked alphabets as the definition involves tree homomorphisms and
these may take us from one alphabet to another. Recall that r(Σ) is the finite set of
nonnegative integers m for which Σm 6= ∅.

Definition 2.8.1 For every pair (Σ, X) we define the “next-to-root function”
nroot : FΣ (X) − (Σ0 ∪ X) → ∪((Σ ∪ X)m | m ∈ r(Σ))

so that

nroot(σ(t1 , . . . , tm )) = (root(t1 ), . . . , root(tm ))
for all m > 0, σ ∈ Σm and t1 , . . . , tm ∈ FΣ (X).

Definition 2.8.2 The elementary ΣX-forests are the forests

U (d) = root−1 (d) (d ∈ Σ ∪ X) , and (i)

V (d1 , . . . , dm ) = nroot−1 (d1 , . . . , dm ) , (ii)

where m > 0, m ∈ r(Σ), and d1 , . . . , dm ∈ Σ ∪ X.


Note that the definitions of the U (d)- and V (d1 , . . . , dm )-forests presume a Σ and an X
although the notations do not show this. Clearly, U (d) is the set of all ΣX-trees with
the root labelled by d, and V (d1 , . . . , dm ) consists of all ΣX-trees of height ≥ 1 in which
the nodes immediately above the root are labelled, from left to right, by d1 , . . . , dm ,
respectively. Note also that U (d) = {d} when d ∈ Σ0 ∪ X. We need three more
definitions.

Definition 2.8.3 The restriction of a forest T is the forest

rest(T ) = {t ∈ T | sub(t) ⊆ T } .

Definition 2.8.4 The elementary operations on forests are the formation of

(i) the union of two forests,

(ii) the intersection of two forests,

(iii) an alphabetic tree homomorphic image of a forest, and

(iv) the restriction of a forest.

Definition 2.8.5 A forest is representable if it can be constructed from elementary


forests by a finite number of applications of elementary operations.

Now the theorem can be stated.

Theorem 2.8.6 A forest is representable iff it is recognizable.

Proof. To prove that the representable forests are recognizable it suffices to note that
the elementary forests are recognizable and that the elementary operations preserve
recognizability. Consider any Σ and X. If d ∈ Σ0 ∪ X, then U (d) = {d} ∈ Rec(Σ, X). If
d ∈ Σm (m > 0), then

U (d) = {d(y1 , . . . , ym )}(y1 ← FΣ (X), . . . , ym ← FΣ (X))

is again recognizable. Similarly,


V (d1 , . . . , dm ) = ∪({σ(y1 , . . . , ym )}(y1 ← U (d1 ), . . . , ym ← U (dm )) | σ ∈ Σm )

is recognizable for all m ∈ r(Σ) and d1 , . . . , dm ∈ Σ ∪ X. We have already seen in


Section 2.4 that unions, intersections and alphabetic tree homomorphisms preserve rec-
ognizability. Let T be the forest recognized by a ΣX-recognizer A. We construct a
recognizer for rest(T ). First define a Σ-algebra B = (A ∪ b, Σ) (b ∈ / A) so that
(
σ A (b1 , . . . , bm ) if b1 , . . . , bm ∈ A and σ A (b1 , . . . , bm ) ∈ A′ ,
σ B (b1 , . . . , bm ) =
b in all other cases,


for all m ≥ 0, σ ∈ Σm and b1 , . . . , bm ∈ A ∪ b. The initial assignment β : X → A ∪ b is


defined so that for each x ∈ X,
xβ = xα if xα ∈ A′ , and xβ = b if xα ∉ A′ .

Consider any ΣX-tree t. It is easy to show that


tβ̂ = tα̂ if sub(t) ⊆ T , and tβ̂ = b otherwise.

Hence, B = (B, β, A′ ) recognizes rest(T ).


We shall now show that every recognizable forest is representable. Let T = T (A) for
some ΣX-recognizer A. First define a new ranked alphabet Ω such that

Ωm = Σm × (A ∪ X)m for all m≥0 .

We construct two representable ΩX-forests R and S as follows. For c ∈ A ∪ X we


introduce the notation c̄ = c if c ∈ A, and c̄ = cα if c ∈ X.
Then

R = {x ∈ X | xα ∈ A′ } ∪
∪(U ((σ, c1 , . . . , cm )) | (σ, c1 , . . . , cm ) ∈ Ω, σ A (c̄1 , . . . , c̄m ) ∈ A′ ) .

The forest S is the union of all intersections



V (u1 , . . . , um ) ∩ U ((σ, b1 , . . . , bm )) ,

where for each i = 1, . . . , m, either


(i) ui ∈ X and bi = ui α, or

(ii) ui = (τ, c1 , . . . , ck ) ∈ Ωk (k ≥ 0) and bi = τ A (c̄1 , . . . , c̄k ).


Note that the possibility m = 0 is included at appropriate places in the definitions of
R and S.
Define the tree homomorphism

h : FΩ (X) → FΣ (X)

so that  
hm ((σ, b1 , . . . , bm )) = σ(ξ1 , . . . , ξm ) , m ≥ 0, (σ, b1 , . . . , bm ) ∈ Ωm
and hX = 1X . Clearly, h is alphabetic. We claim that

T = h(P )


for the representable forest

P = R ∩ rest(S ∪ Ω0 ∪ X) .

Let p ∈ P . If p = (σ, e) ∈ Ω0 , then p ∈ R implies σ A ∈ A′ . Hence h(p) = σ ∈ T . If


p = x ∈ X, then p ∈ R implies h(x)α̂ = xα ∈ A′ . Again h(p) = x ∈ T . Next we show
that for every p ∈ rest(S ∪ Ω0 ∪ X) of height ≥ 1

h(p)α̂ = σ A (b1 , . . . , bm ) , where (σ, b1 , . . . , bm ) = root(p) . (1)

We proceed by induction on hg(p).

1o If hg(p) = 1, then m ≥ 1 and

p = (σ, b1 , . . . , bm )(u1 , . . . , um )

for some u1 , . . . , um ∈ Ω0 ∪X. Since p ∈ S we have h(ui )α̂ = bi for all i = 1, . . . , m.


But this implies that (1) holds for p.

2o Now let
p = (σ, b1 , . . . , bm )(p1 , . . . , pm )
and assume that (1) holds for the trees p1 , . . . , pm . As p is in S and

h(p)α̂ = σ A (h(p1 )α̂, . . . , h(pm )α̂) ,

it suffices to show that h(pi )α̂ = bi for every i = 1, . . . , m. We should consider


three cases.
(a) If pi is of the form (τ, c1 , . . . , ck )(r1 , . . . , rk ) (k > 0), then the induction hy-
pothesis yields
h(pi )α̂ = τ A (c1 , . . . , ck ) .
Moreover, τ A (c1 , . . . , ck ) = bi since p ∈ S.
(b) If pi = (σ, e) ∈ Ω0 , then h(pi )α̂ = σ A = bi .
(c) If pi = x ∈ X, then h(pi )α̂ = xα = bi .

Now we have completed the proof of (1). Consider any tree

p = (σ, b1 , . . . , bm )(p1 , . . . , pm ) ∈ P .

By using (1) and the fact that p ∈ R we get

h(p)α̂ = σ A (b1 , . . . , bm ) ∈ A′ .

This implies h(p) ∈ T and we have shown that h(P ) ⊆ T .


In order to prove the converse inclusion we show first by tree induction how to construct
for each t ∈ FΣ (X) a tree p ∈ rest(S ∪ Ω0 ∪ X) such that h(p) = t:


1o If t = x ∈ X, then we may choose p = x.

2o If t = σ ∈ Σ0 , put p = (σ, e).

3o Let t = σ(t1 , . . . , tm ) (m > 0) and suppose we have trees p1 , . . . , pm ∈ rest(S ∪


Ω0 ∪ X) such that h(pi ) = ti (i = 1, . . . , m). If we put

p = (σ, b1 , . . . , bm )(p1 , . . . , pm ) ,

where bi = ti α̂ for i = 1, . . . , m, then h(p) = t and p ∈ rest(S ∪ Ω0 ∪ X) as required.


Let t ∈ T and construct a p for t as above. To prove t ∈ h(P ) it suffices to show
that p ∈ R. This can again be done by tree induction:
1o If t = x ∈ X, then xα ∈ A′ and hence p = x ∈ R.

2o If t = σ ∈ Σ0 , then σ A ∈ A′ and p = (σ, e) ∈ U (σ, e) ⊆ R.

3o Let t = σ(t1 , . . . , tm ) (m > 0). If we use (1) and its notation, we get

σ A (b1 , . . . , bm ) = h(p)α̂ = tα̂ ∈ A′ .

This shows that p ∈ R. ✷

2.9 LOCAL FORESTS


In this section a proper subfamily of the recognizable forests is introduced. We will then
also get one more characterization of the recognizable forests, not quite unrelated to
that given in the preceding section.
We need the following auxiliary concept.

Definition 2.9.1 The set of forks fork(t) of a ΣX-tree t is defined as follows:


1o If t ∈ Σ0 ∪ X, then fork(t) = ∅.

2o If t = σ(t1 , . . . , tm ) (m > 0), then


 
fork(t) = fork(t1 ) ∪ · · · ∪ fork(tm ) ∪ {σ(root(t1 ), . . . , root(tm ))} .

The set ∪(fork(t) | t ∈ FΣ (X)) of all forks of ΣX-trees will be denoted by fork(Σ, X).

Example 2.9.2 Let Σ = Σ0 ∪ Σ1 ∪ Σ2 , Σ0 = {γ}, Σ1 = {τ }, Σ2 = {σ} and X = {x, y}.


For the ΣX-tree
t = σ(τ (γ), σ(x, τ (y))) ,
we have
fork(t) = {σ(τ, σ), τ (γ), σ(x, τ ), τ (y)} .
Graphically these forks are represented as in Fig. 2.7 respectively. Obviously, fork(Σ, X)
is always finite and here it consists of 30 forks. ✷


Figure 2.7. (The forks σ(τ, σ), τ (γ), σ(x, τ ) and τ (y) drawn as trees.)

Local forests may now be defined.



Definition 2.9.3 A ΣX-forest T is local if there are sets R(⊆ Σ∪X) and F ⊆ fork(Σ, X)
such that, for each t ∈ FΣ (X),

t∈T iff root(t) ∈ R and fork(t) ⊆ F .

Then we write T = Loc(R, F ).

Hence the membership of a ΣX-tree t in the local forest Loc(R, F ) can be decided by
testing for the local properties root(t) ∈ R and fork(t) ⊆ F .
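This membership test is immediate to program. In the sketch below trees are encoded as nested tuples whose first component is the node label, frontier letters and nullary symbols are plain strings, and a fork is represented by the tuple of its labels; these conventions are ad hoc.

    def root(t):
        return t if isinstance(t, str) else t[0]

    def forks(t):
        """The set fork(t) of Definition 2.9.1."""
        if isinstance(t, str):                        # t in Sigma_0 or X
            return set()
        sym, subtrees = t[0], t[1:]
        f = {(sym,) + tuple(root(s) for s in subtrees)}
        for s in subtrees:
            f |= forks(s)
        return f

    def in_local_forest(t, R, F):
        """t belongs to Loc(R, F) iff root(t) is in R and every fork of t is in F."""
        return root(t) in R and forks(t) <= F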
A ΣX-recognizer for Loc(R, F ) can be constructed as follows. First we define a Σ-
algebra A = (A, Σ). Let A = Σ ∪ X ∪ 0 (0 ∉ Σ ∪ X). For every σ ∈ Σ0 , put σ A = σ.
For m > 0, σ ∈ Σm and a1 , . . . , am ∈ A let
σ A (a1 , . . . , am ) = σ if σ(a1 , . . . , am ) ∈ F , and σ A (a1 , . . . , am ) = 0 otherwise.

Let α : X → A be the embedding x ↦ x (x ∈ X). It is easy to show that for all
t ∈ FΣ (X),
tα̂ = root(t) if fork(t) ⊆ F , and tα̂ = 0 otherwise.
This readily implies T (A) = Loc(R, F ) for A = (A, α, R). Hence we have

Theorem 2.9.4 Every local forest is recognizable. ✷

The converse of Theorem 2.9.4 does not hold. For example, the forest consisting of
the single tree of Example 2.9.2 is not local as there are many other trees with the same
root and the same forks. However, the following fact can be proved.

Theorem 2.9.5 For every recognizable ΣX-forest T there exist a ranked alphabet Ω, a
frontier alphabet Y , a local ΩY -forest S and an alphabetic tree homomorphism

h : FΩ (Y ) → FΣ (X)

such that T = h(S).


Proof. Let G = (N, Σ, X, P, a0 ) be a regular ΣX-grammar generating T . We assume


that G is in normal form. A new ranked alphabet Ω is defined so that

Ωm = {[a → σ(a1 , . . . , am )] | a → σ(a1 , . . . , am ) ∈ P, σ ∈ Σm }

for all m ≥ 0. Also, let

Y = {[a → x] | a → x ∈ P, x ∈ X} .

The local ΩY -forest S = Loc(R, F ) is defined by the sets

R = {[a0 → p] | a0 → p ∈ P }

and

F = {[a → σ(a1 , . . . , am )]([a1 → p1 ], . . . , [am → pm ]) | m > 0,
a → σ(a1 , . . . , am ), a1 → p1 , . . . , am → pm ∈ P } .

Finally, define an alphabetic tree homomorphism

h : FΩ (Y ) → FΣ (X)

by the mappings
hY : Y → FΣ (X), [a → x] ↦ x,
and
hm : Ωm → FΣ (X ∪ Ξm ), [a → σ(a1 , . . . , am )] ↦ σ(ξ1 , . . . , ξm ) .
Now h(S) = T , and thereby the theorem, follows from (1) and (2):

(1) If a ⇒∗G t, for some a ∈ N and t ∈ FΣ (X), then there is a tree s ∈ FΩ (Y ) such that
h(s) = t, fork(s) ⊆ F and root(s) is of the form [a → p].

(2) If s ∈ FΩ (Y ) is such that fork(s) ⊆ F and root(s) = [a → p] for some p ∈ FΣ (N ∪X),


then a ⇒∗G h(s).

Part (1) can be proved by induction on the length of the derivation of t and (2) by
tree induction on s. ✷

Note that h(S) is always recognizable when S is a local forest and h an alphabetic tree
homomorphism (Theorem 2.9.4 and Corollary 2.4.20).

2.10 SOME BASIC DECISION PROBLEMS


In this section we shall show that some of the first questions one might ask about given
tree recognizers are algorithmically decidable. To begin with, we have the emptiness
problem: Is the forest recognized by a given tree recognizer empty? Or one may ask
whether this forest is finite or infinite. This is the finiteness problem. Finally, we have


the important equivalence problem: Do two given tree recognizers recognize the same
forest? In fact, the more general inclusion problem: “T (A) ⊆ T (B)?” is shown to
be decidable. The problems are quite easy and the proofs follow the strategy familiar
from finite automata theory with a “pumping lemma” as the key result. We have seen in
Section 2.2 that any nondeterministic frontier-to-root, or root-to-frontier, tree recognizer
can be converted into an equivalent deterministic F-recognizer. Hence we may again
restrict ourselves to our basic type of tree recognizers.
We need the following special notation. Let Σ and X be given. Introduce a new letter ξ
and let Tξ be the set of all Σ(X ∪ ξ)-trees in which ξ appears exactly once. For any
q ∈ Tξ and p ∈ FΣ (X) ∪ Tξ we denote q(ξ ← p) by p · q. Also, we define the powers q k
as follows:

1o q 0 = ξ,

2o q n+1 = q · q n (n ≥ 0).

Using these notations we may formulate the pumping lemma of tree recognizers as
follows.

Lemma 2.10.1 Let A be a k-state ΣX-recognizer. If t ∈ T (A) and hg(t) ≥ k, then


there are trees p ∈ FΣ (X) and q, r ∈ Tξ such that

(a) t = p · q · r,

(b) hg(q) ≥ 1 and

(c) p · q i · r ∈ T (A) for all i = 0, 1, 2, . . . .

Proof. Suppose t ∈ T (A) and hg(t) ≥ k(= |A|). Then we can write t = σ(t1 , . . . , tm )
(m > 0, σ ∈ Σm ). Choose some j (1 ≤ j ≤ m) such that hg(tj ) = hg(t) − 1. Then

t = tj · s 1 ,

where
s1 = σ(t1 , . . . , tj−1 , ξ, tj+1 , . . . , tm ) ∈ Tξ .
If hg(tj ) > 0, we may decompose tj the same way. Since hg(t) ≥ k the process can be
repeated k times and finally we obtain a representation

t = t′ · sk · . . . · s2 · s1 ,

where t′ ∈ FΣ (X) and s1 , . . . , sk ∈ Tξ . Moreover, hg(si ) ≥ 1 for every i = 1, . . . , k. Let

uk+1 = t′ , uk = t′ · sk , . . . , u1 = t′ · sk · . . . · s1 = t .

There must be indices h and j, k + 1 ≥ h > j ≥ 1, such that

uh α̂ = uj α̂ .


Now let p = uh , q = sh−1 · . . . · sj and r = sj−1 · . . . · s1 (if j = 1, then r = ξ). Then


t = p · q · r and hg(q) ≥ 1. Also, our choice of p and q implies

pα̂ = (p · q)α̂ . (1)

We assume that A ∩ X = ∅, and extend α̂ to a homomorphism

ᾱ : FΣ (X ∪ A) → A

so that ᾱ|A = 1A . By Lemma 2.4.17, sᾱ = sα̂ whenever s ∈ FΣ (X). We verify now by


induction on i that
(p · q i )α̂ = (p · q)α̂ (2)
for every i ≥ 0. From (1) we know that this is true for i = 0. Suppose (2) holds for a
given i. This assumption and (1) imply
 
(p · q i+1 )α̂ = q(ξ ← (p · q i )α̂)ᾱ = q(ξ ← (p · q)α̂)ᾱ = q(ξ ← pα̂)ᾱ = (p · q)α̂ .

Using (2) we get for each i ≥ 0,



(p · q i · r)α̂ = r(ξ ← (p · q i )α̂)ᾱ = r(ξ ← (p · q)α̂)ᾱ = (p · q · r)α̂ .

Hence, p · q i · r ∈ T (A) for all i ≥ 0. ✷

Theorem 2.10.2 Let A be a k-state ΣX-recognizer. Then T (A) is nonempty iff it


contains a tree of height less than k. Hence the emptiness problem of recognizable forests
is decidable.

Proof. Suppose T (A) is nonempty. Let t be a tree in T (A) of minimal length. If


hg(t) ≥ k, we apply the pumping lemma and write t = p · q · r. But then T (A) would
contain the tree p · r which is properly shorter than t as hg(q) ≥ 1. Hence hg(t) < k
must hold. The converse part is trivial. The emptiness of T (A) can always be decided
by going through the finite set of trees of height < k. ✷
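In practice one need not enumerate trees at all: T (A) is nonempty exactly when some final state belongs to the set of reachable states, which can be computed as in Section 2.6. On top of the reachability sketch given there this becomes a one-liner:

    def is_empty(ops, alpha, final):
        """T(A) is empty iff no final state is reachable."""
        return not (reachable_states(ops, alpha) & set(final))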

Suppose two ΣX-recognizers A and B are given. Clearly, T (A) ⊆ T (B) iff T (A) −
T (B) = ∅. But T (A) − T (B) is recognized by

C = (A × B, γ, A′ × (B − B ′ )) ,

where xγ = (xα, xβ) for x ∈ X. Thus the question “T (A) ⊆ T (B)?” can be answered
by deciding whether T (C) is empty or not. The equivalence problem can similarly be
reduced to the emptiness problem. Of course, its decidability follows also from the
decidability of the inclusion problem. We have justified


Theorem 2.10.3 The inclusion problem and the equivalence problem of tree recognizers
are decidable. ✷
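The product construction used above is easy to carry out concretely. The sketch below builds the recognizer C with T (C) = T (A) − T (B); as before, a recognizer is represented ad hoc by the tuple (states, ops, alpha, final).

    def difference_recognizer(A, B):
        """The recognizer C of the proof above, with T(C) = T(A) - T(B)."""
        statesA, opsA, alphaA, finalA = A
        statesB, opsB, alphaB, finalB = B
        states = [(a, b) for a in statesA for b in statesB]
        ops = {key: (lambda *args, fA=opsA[key], fB=opsB[key]:
                     (fA(*(p[0] for p in args)), fB(*(p[1] for p in args))))
               for key in opsA}
        alpha = {x: (alphaA[x], alphaB[x]) for x in alphaA}
        final = {(a, b) for a in finalA for b in statesB if b not in finalB}
        return states, ops, alpha, final

    # T(A) is included in T(B) iff the difference recognizer accepts nothing,
    # which can be tested with the emptiness check sketched above.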
Finally we consider the finiteness problem.
Theorem 2.10.4 It is decidable whether the forest recognized by a given tree recognizer
is finite or infinite.
Proof. Let A be a k-state ΣX-recognizer and write
T = T (A) − {t ∈ FΣ (X) | hg(t) < k} .
We claim that T (A) is finite iff T = ∅. Obviously the condition is sufficient since the
set of ΣX-trees of height < k is finite. If T ≠ ∅ and t ∈ T , then hg(t) ≥ k and we may
apply the pumping lemma and write t = p · q · r so that
p · q i · r ∈ T (A) for all i ≥ 0 .
These trees are pairwise distinct since hg(q) ≥ 1. Hence T (A) is infinite. The forest T
is recognizable and one can easily construct a recognizer for it. This means that the
condition T = ∅ is effectively testable. ✷

The decidability of the finiteness problem may also be deduced from the following
corollary of the pumping lemma. The proof is an exercise.
Lemma 2.10.5 Let A be a k-state tree recognizer. Then T (A) is infinite iff it contains
a tree t such that
k ≤ hg(t) < 2k . ✷

2.11 DETERMINISTIC R-RECOGNIZERS


In Section 2.2 it was shown that NDR-recognizers recognize exactly the family Rec,
but that there are recognizable forests that cannot be recognized by any deterministic
R-recognizer. The limited recognition power of DR-recognizers is due to the fact that
they have no way of combining the information gathered from disjoint subtrees. This
implies that a DR-recognizer will accept any tree in which every path from the root to
the frontier appears in some tree accepted by the recognizer. It will turn out that this
closure property characterizes the forests recognizable by DR-recognizers. Here a “path”
contains, not only a list of the labels of the nodes traversed, but also the information
about the directions taken at the nodes. In the later part of this section we shall consider
the minimization of DR-recognizers. It will be shown that every DR-recognizer can be
reduced to a canonical minimal form which is unique up to isomorphism.
Let Σ be a fixed ranked alphabet. In order to avoid some troublesome technicalities,
we shall assume that Σ0 = ∅. We associate with Σ a unary ranked alphabet
Γ = Γ1 = ∪(Γ(σ) | σ ∈ Σ),
where for all σ, τ ∈ Σ,


(i) Γ(σ) = {σ1 , . . . , σm } if σ ∈ Σm (m ≥ 1), and

(ii) Γ(σ) ∩ Γ(τ ) = ∅ if σ ≠ τ .

The paths in Σ-trees can now be defined as Γ-trees.

Definition 2.11.1 Let X be any frontier alphabet. For each x ∈ X the set gx (t) of
x-paths of a ΣX-tree t is defined as follows:

1◦ gx (x) = {x}, and gx (y) = ∅ for all y 6= x, y ∈ X.

2◦ If t = σ(t1 , . . . , tm ) (σ ∈ Σm , m > 0), then gx (t) = σ1 (gx (t1 )) ∪ · · · ∪ σm (gx (tm )).

We extend gx to a mapping from pFΣ (X) to pFΓ (X) in the natural way. Moreover, we
put
g(T ) = ∪(gx (T ) | x ∈ X)
for each T ⊆ FΣ (X).

Label the edges of the graph representing a tree t ∈ FΣ (X) so that the ith edge (counted
from the left) leaving a node labelled by a symbol σ always gets the label σi . Then the
elements of gx (t) (x ∈ X) are spelled out by the paths leading from the root to a leaf
labelled by x when we interpret a word σ1i1 . . . σkik x (k ≥ 0, σ1i1 , . . . , σkik ∈ Γ) as the
ΓX-tree σ1i1 (. . . σkik (x) . . . ). Moreover, every such path gives an element of gx (t).
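The path sets gx (t) can be computed by a direct recursion on t. In the sketch below trees are nested tuples, and a path is returned as a tuple of edge labels (σ, i), read from the root, followed by the frontier letter; the encoding is ad hoc.

    def x_paths(t, x):
        """The set g_x(t) of Definition 2.11.1."""
        if isinstance(t, str):                        # a frontier letter
            return {(x,)} if t == x else set()
        sym, subtrees = t[0], t[1:]
        result = set()
        for i, s in enumerate(subtrees, start=1):     # the i-th edge gets label sigma_i
            result |= {((sym, i),) + p for p in x_paths(s, x)}
        return result

    def paths(t, X):
        """g({t}): the union of the x-path sets over all frontier letters x."""
        return set().union(*(x_paths(t, x) for x in X)) if X else set()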

Lemma 2.11.2 If T ∈ Rec(Σ, X), then g(T ) ∈ Rec(Γ, X).

Proof. Let G = (N, Σ, X, P, a0 ) be a regular ΣX-grammar in normal form generating


T . The case T = ∅ being trivial, we may assume that every Ga = (N, Σ, X, P, a) (a ∈ N )
generates a nonempty forest. Let G′ = (N, Γ, X, P ′ , a0 ) be the regular ΓX-grammar,
where
P ′ = {a → σi (ai ) | a → σ(a1 , . . . , am ) ∈ P, m > 0, 1 ≤ i ≤ m} ∪
{a → x | a → x ∈ P, x ∈ X}.

We claim that T (G′ ) = g(T ). This follows when we show that, for every tree

p = σ1i1 (. . . σkik (x) . . . ) ∈ FΓ (X)

and every a ∈ N ,
p ∈ T (G′a ) iff p ∈ g(T (Ga )), (*)
where G′a = (N, Γ, X, P ′ , a).
We proceed by induction on hg(p).

1◦ If hg(p) = 0, then p = x. In this case (*) obviously holds as a → x is in P ′ iff it is


in P .

2◦ Suppose hg(p) > 0 and that (*) holds for all trees of lesser height.


If p ∈ T (G′a ), then a ⇒∗G′ σ1i1 (ai1 ) and ai1 ⇒∗G′ σ2i2 (. . . σkik (x) . . . ) for some ai1 ∈ N ,
and P contains a production a → σ1 (a1 , . . . , am ) such that i1 ≤ m. By the inductive
assumption there exists a tree ti1 ∈ T (Gai1 ) such that σ2i2 (. . . σkik (x) . . . ) ∈ gx (ti1 ).
Moreover, we may choose for every i 6= i1 , 1 ≤ i ≤ m, a tree ti ∈ T (Gai ). Then
t = σ1 (t1 , . . . , tm ) ∈ T (Ga ) and p ∈ gx (t) ⊆ g(T (Ga )).
Conversely, let p ∈ g(T (Ga )). Then p ∈ gx (t) for some t ∈ T (Ga ). Obviously, t is of
the form σ1 (t1 , . . . , tm ), where i1 ≤ m, and it has a derivation

a ⇒G σ1 (a1 , . . . , am ) ⇒∗G t.

This means that P ′ contains the production a → σ1i1 (ai1 ). Moreover, ti1 ∈ T (Gai1 ) and
σ2i2 (. . . σkik (x) . . . ) ∈ gx (ti1 ). Hence, we get a derivation

a ⇒G′ σ1i1 (ai1 ) ⇒∗G′ p,

which shows that p ∈ T (G′a ). ✷

Let g be the mapping of Definition 2.11.1 associated with a given frontier alphabet X.
Then we write τX = gg−1 . It is clear that τX is a closure operation in FΣ (X), i.e., for
all S, T ⊆ FΣ (X),

(i) S ⊆ SτX ,

(ii) S ⊆ T implies SτX ⊆ T τX , and

(iii) SτX τX = SτX .

For any T ⊆ FΣ (X), T τX is the closure of T , and T is said to be closed if T τX = T .


Now, consider an arbitrary NDR ΣX-recognizer A = (A, A′ , α). For each a ∈ A, let

T (A, a) = {t ∈ FΣ (X) | a ∈ tα̃}.

A state a ∈ A is a 0-state, if T (A, a) = ∅. We say that A is normalized if for all m > 0,


σ ∈ Σm and a ∈ A one of the following two alternatives holds:

(1) Each component of every vector in σ A (a) is a 0-state.

(2) No component of any vector of σ A (a) is a 0-state.

A normalized NDR ΣX-recognizer A has the following important property. Let p ∈


gx (s) (x ∈ X) for some ΣX-tree s such that A has a computation on s which begins
at the root in an initial state and ends at the leaf corresponding to p in a state which
belongs to xα. Then there exists a tree t in T (A) such that p ∈ gx (t). Such a t can be
built around the x-path p by completing it with trees from appropriate T (A, a)-forests.
An NDR ΣX-recognizer A becomes normalized if we omit from each set σ A (a) every
vector which contains a 0-state. This does not change T (A) because the use of a vector
containing a 0-state cannot lead to an accepting computation. Hence, we have


Lemma 2.11.3 For every NDR-recognizer there is an equivalent normalized NDR-recog-


nizer. ✷

We associate with each NDR ΣX-recognizer A a DR ΣX-recognizer pA = (pA, A′ , β)


defined as follows:

(i) pA = (pA, Σ) is the deterministic root-to-frontier algebra such that


σ pA (H) = (∪(π1 (σ A (a)) | a ∈ H), . . . , ∪(πm (σ A (a)) | a ∈ H))

for all H ∈ pA, m > 0 and σ ∈ Σm . Here πi (1 ≤ i ≤ m) is the ith projection.

(ii) For each x ∈ X, xβ = {H ∈ pA | H ∩ xα ≠ ∅}.
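The componentwise content of this definition is captured by the following small sketch, in which the nondeterministic transition σ A is represented ad hoc as a dictionary mapping each state a to the set of tuples σ A (a), and the subset states are frozensets.

    def p_transition(sigma_A, H, m):
        """sigma^{pA}(H): the i-th component collects the i-th projections of all
        tuples in sigma^A(a) for a in H."""
        return tuple(frozenset(vec[i] for a in H for vec in sigma_A.get(a, ()))
                     for i in range(m))

    def p_initial(alpha, x, subsets):
        """x*beta: all subset states H that meet x*alpha."""
        return {H for H in subsets if H & alpha[x]}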

Lemma 2.11.4 For every normalized NDR ΣX-recognizer A, T (pA) = T (A)τX .

Proof. In order to prove the inclusion T (pA) ⊆ T (A)τX , we consider an arbitrary tree
s ∈ T (pA) and an x-path p ∈ gx (s) (x ∈ X). We should show that p ∈ g(T (A)). Let
p = σ1i1 (. . . (σkik (x)) . . . ). By the definition of pA there are states a0 , a1 , . . . , ak ∈ A
such that

(i) a0 ∈ A′ and ak ∈ xα, and

(ii) aj ∈ πij (σjA (aj−1 )) for j = 1, . . . , k.

Since A is normalized, this implies that there is a tree t ∈ T (A) such that p ∈ gx (t).
Hence p ∈ g(T (A)). Now, let s ∈ T (A)τX and consider any x-path

p = σ1i1 (. . . σkik (x) . . . ) ∈ gx (s) (x ∈ X).

Then p ∈ gx (t) for some t ∈ T (A) and there are states a0 , a1 , . . . , ak ∈ A such that the
above conditions (i) and (ii) hold. But the definition of pA implies that the state of pA
at the leaf corresponding to p includes ak for any tree in which p is an x-path. Hence
pA arrives at the leaf of s corresponding to p in a state belonging to xα. This holds for
every leaf of s and therefore s ∈ T (pA). ✷

Corollary 2.11.5 If T ∈ Rec(Σ, X), then T τX ∈ Rec(Σ, X). ✷

Lemmas 2.11.3 and 2.11.4 also imply that every closed recognizable forest is recognized
by a DR recognizer. But it is easy to see that T (pA) = T (A) if A is deterministic. Hence
we may state the following result.

Theorem 2.11.6 A recognizable forest can be recognized by a DR recognizer iff it is


closed. ✷


The rest of this section deals with the minimization of DR-recognizers. First two
general remarks. When A = (A, a0 , α) is a DR ΣX-recognizer, then the NDR algebra
A = (A, Σ) is deterministic and we may view each σ A (σ ∈ Σm , m > 0) as a mapping

σ A : A → Am .

Hence we write σ A (a) = (a1 , . . . , am ) rather than σ A (a) = {(a1 , . . . , am )}. The second
remark concerns normalized DR recognizers. If the DR ΣX-recognizer A is normalized,
one of the following conditions holds for each pair (a, σ) ∈ A × Σ:
(1) Every component of σ A (a) is a 0-state.

(2) No component of σ A (a) is a 0-state.


Of course, Lemma 2.11.3 and the construction which led to it are valid here, too, but
we define a “standard” normalized form A∗ = (A∗ , a0 , α) of A as follows:

(i) If A has no 0-state, then put A∗ = A.

(ii) If A has a 0-state, choose one of them, say d, and define then for all m > 0,
σ ∈ Σm , and a ∈ A,

σ A∗ (a) = (d, . . . , d) (∈ Am ) if σ A (a) contains a 0-state, and σ A∗ (a) = σ A (a) otherwise.

It is easy to prove that A∗ is normalized and deterministic, and that T (A∗ ) = T (A).
Normalized DR recognizers have also the following useful property.

Lemma 2.11.7 Let A and B be normalized DR ΣX-recognizers, and let a ∈ A, b ∈ B,


m > 0, σ ∈ Σm , σ A (a) = (a1 , . . . , am ) and σ B (b) = (b1 , . . . , bm ). If T (A, a) = T (B, b),
then T (A, ai ) = T (B, bi ) for all i = 1, . . . , m.

Proof. If one of the states ai (1 ≤ i ≤ m) is a 0-state, then all of them are. Moreover,
T (A, a) = T (B, b) does not contain any tree of the form σ(t1 , . . . , tm ). Hence, one of
the forests T (B, bi ) (1 ≤ i ≤ m), and therefore every one of them, is empty. Thus
T (A, ai ) = T (B, bi ) = ∅ for all i = 1, . . . , m.
Suppose now that T (A, ai ) ≠ ∅ and T (B, bi ) ≠ ∅ for all i = 1, . . . , m. Consider any i
(1 ≤ i ≤ m) and ti ∈ T (A, ai ). Choose any t1 ∈ T (A, a1 ), . . . , ti−1 ∈ T (A, ai−1 ), ti+1 ∈
T (A, ai+1 ), . . . , tm ∈ T (A, am ). Then σ(t1 , . . . , tm ) ∈ T (A, a) = T (B, b) implies ti ∈
T (B, bi ). By a symmetrical argument, T (B, bi ) ⊆ T (A, ai ) holds for every i = 1, . . . , m.
Hence, T (A, ai ) = T (B, bi ) for every i = 1, . . . , m, as required. ✷

We shall now define a few algebraic concepts for DR recognizers. Let A = (A, a0 , α)
and B = (B, b0 , β) be DR ΣX-recognizers.
A homomorphism from A to B is a mapping ϕ : A → B such that
(i) for all m > 0, σ ∈ Σm and a ∈ A, σ B (aϕ) = (a1 ϕ, . . . , am ϕ), where (a1 , . . . , am ) =
σ A (a),


(ii) a0 ϕ = b0 , and

(iii) for every x ∈ X, xβϕ−1 = xα.

If ϕ is a homomorphism from A to B, we write ϕ : A → B. If such a ϕ is surjective, it


is called an epimorphism. For an epimorphism condition (iii) implies xαϕ = xβ, too. If
there exists an epimorphism ϕ from A onto B, then B is an epimorphic image of A. If
ϕ : A → B is bijective, then A and B are isomorphic, and we write A ≅ B.
A congruence on A is an equivalence relation ̺ on A such that

(i) for all m > 0, σ ∈ Σm and a, a′ ∈ A, a̺ = a′ ̺ implies σ A (a)/̺ = σ A (a′ )/̺ (recall
the notation from Section 1.1), and

(ii) ̺ saturates every set xα (x ∈ X).

If ̺ is a congruence on A, then the quotient recognizer determined by ̺ is the DR


ΣX-recognizer
A/̺ = (A/̺, a0 ̺, α̺ ),
where A/̺ = (A/̺, Σ) is defined by

σ A/̺ (a̺) = σ A (a)/̺ (σ ∈ Σm , m > 0, a ∈ A),

and α̺ : X → A/̺ is defined by xα̺ = xα/̺ (x ∈ X). It is easy to see that A/̺ is
well-defined.
The following theorem is easily obtained by modifying the proofs of the corresponding
facts from algebra.

Theorem 2.11.8 Let A and B be DR ΣX-recognizers.

(a) If ̺ is a congruence of A, then the natural mapping ̺♯ : A → A/̺ defines an


epimorphism of A onto A/̺.

(b) If ϕ : A → B is an epimorphism, then ̺ = ϕϕ−1 is a congruence on A, and


A/̺ ≅ B. ✷

The following fact will be needed later.

Theorem 2.11.9 If B is an epimorphic image of A, then T (A) = T (B).

Proof. Let ϕ : A → B be an epimorphism. We verify by tree induction that

tα̃ = tβ̃ϕ−1 and tα̃ϕ = tβ̃, (*)

for every t ∈ FΣ (X).

1◦ For t = x ∈ X, (*) follows directly from the fact that ϕ is an epimorphism.


2◦ Let t = σ(t1 , . . . , tm ) and assume that (*) holds for t1 , . . . , tm . Suppose a ∈ tα̃.
If σ A (a) = (a1 , . . . , am ), this means that a1 ∈ t1 α̃, . . . , am ∈ tm α̃. Hence, a1 ϕ ∈
t1 β̃, . . . , am ϕ ∈ tm β̃. This implies
σ B (aϕ) = (a1 ϕ, . . . , am ϕ) ∈ t1 β̃ × · · · × tm β̃.
Hence, aϕ ∈ tβ̃. Suppose now that aϕ ∈ tβ̃, and let σ A (a) = (a1 , . . . , am ). Then
a1 ϕ ∈ t1 β̃, . . . , am ϕ ∈ tm β̃, which implies a1 ∈ t1 α̃, . . . , am ∈ tm α̃. Hence, a ∈ tα̃.
The equality tα̃ = tβ̃ϕ−1 implies tα̃ϕ = tβ̃ as ϕ is surjective.
Now, (*) implies that for every t ∈ FΣ (X),
t ∈ T (A) iff a0 ∈ tα̃
iff a0 ϕ(= b0 ) ∈ tα̃ϕ(= tβ̃)
iff t ∈ T (B). ✷

We call two states a and a′ of a DR ΣX-recognizer A equivalent, and we write a ∼A a′


(or just a ∼ a′ ), if T (A, a) = T (A, a′ ). Obviously, ∼A is an equivalence relation on A.
We say that A is reduced, if ∼A = δA .
Lemma 2.11.10 If A is a normalized DR ΣX-recognizer, then ∼ is a congruence on
A and A/∼ is reduced.
Proof. First we show that ∼ is a congruence relation.
(i) Consider any m > 0, σ ∈ Σm and a, a′ ∈ A such that a ∼ a′ . Let
σ A (a) = (a1 , . . . , am ) and σ A (a′ ) = (a′1 , . . . , a′m ).
But a ∼ a′ means that T (A, a) = T (A, a′ ), and Lemma 2.11.7 implies that
T (A, ai ) = T (A, a′i ) for all i = 1, . . . , m.
Hence, ai ∼ a′i for all i = 1, . . . , m.
(ii) If a ∈ xα and a ∼ a′ , for some x ∈ X and a, a′ ∈ A, then x ∈ T (A, a) = T (A, a′ )
implies a′ ∈ xα. Hence, ∼ saturates xα.
Now we know that the quotient recognizer A/∼ can be defined. It is reduced as
(a∼) ∼A/∼ (a′ ∼) implies a∼ = a′ ∼ (a, a′ ∈ A) because, by Theorem 2.11.9,
T (A, a) = T (A/∼, a∼) = T (A/∼, a′ ∼) = T (A, a′ ). ✷

Let a, a′ ∈ A. We write a ⊢ a′ if there exist an m > 0 and a σ ∈ Σm such that a′


appears in σ A (a). The reflexive, transitive closure of ⊢ is denoted by ⊢∗ . If a ⊢∗ a′ , we
say that a′ is reachable from a. The DR recognizer A is said to be connected if every
state is reachable from the initial state.
The connected component
Ac = (Ac , a0 , αc )
of a DR ΣX-recognizer A is defined as follows:


(i) Ac = (Ac , Σ), where Ac = {a ∈ A | a0 ⊢∗ a} and σ Ac (a) = σ A (a) for all σ ∈ Σ and
a ∈ Ac .

(ii) xαc = xα ∩ Ac for each x ∈ X.


Clearly, the operations σ Ac : Ac → (Ac )m are completely defined (m > 0, σ ∈ Σm ).
The proof of Lemma 2.11.11 is quite straightforward and we shall omit it.

Lemma 2.11.11 Let A be any DR ΣX-recognizer. Then

(a) Ac is connected and deterministic,

(b) Ac = A iff A is connected,

(c) T (Ac ) = T (A), and

(d) if A is normalized, then so is Ac . ✷

We are now ready to present the main theorem of the minimization theory of DR
recognizers.

Theorem 2.11.12 Let A and B be connected, normalized DR ΣX-recognizers. Then


T (A) = T (B) iff A/∼A ≅ B/∼B .

Proof. If A/∼A and B/∼B are isomorphic, then

T (A) = T (A/∼A ) = T (B/∼B ) = T (B)

by Theorems 2.11.8 and 2.11.9.


Assume now that T (A) = T (B). We define a mapping

ϕ : A/∼A → B/∼B

by the condition that

(a∼A )ϕ = b∼B if T (A, a) = T (B, b) (a ∈ A, b ∈ B).

The following steps (i)–(v) show that ϕ is the required isomorphism.

(i) (a∼A )ϕ is defined for all a∼A ∈ A/∼A . Since A is connected, there exist for every
a ∈ A a k ≥ 0 and states a1 , . . . , ak ∈ A such that

a0 ⊢ a1 ⊢ a2 ⊢ · · · ⊢ ak = a.

Using Lemma 2.11.7 one shows by induction on the smallest k (corresponding to


the given a) that there is a b such that T (A, a) = T (B, b).

(ii) ϕ is well-defined. If a ∼A a′ , T (A, a) = T (B, b) and T (A, a′ ) = T (B, b′ ) for some


a, a′ ∈ A and b, b′ ∈ B, then b∼B = b′ ∼B .


(iii) ϕ is injective. This is shown similarly to (ii).

(iv) ϕ is surjective. If we exchange the roles of A and B in (i), we see that there exists
for every b ∈ B an a ∈ A such that T (A, a) = T (B, b).

(v) ϕ is a homomorphism. That ϕ preserves the operations follows from Lemma 2.11.7.
If a∼A ∈ xα/∼A (x ∈ X) and (a∼A )ϕ = b∼B , then x ∈ T (A, a) = T (B, b) implies
b∼B ∈ xβ/∼B . Likewise, (a∼A )ϕ = b∼B ∈ xβ/∼B implies a∼A ∈ xα/∼A . Thus
(xβ/∼B )ϕ−1 = xα/∼A for every x ∈ X. ✷

A DR recognizer A is said to be minimal if no DR recognizer with fewer states recog-


nizes T (A). If A is minimal, then it is connected by Lemma 2.11.11. As T (A∗ ) = T (A)
we may also assume that A is normalized. Then T (A) = T (A/∼A ) implies that A
should be reduced, too. Conversely, if A is connected, normalized and reduced, then
it is minimal and every normalized minimal DR recognizer is isomorphic to it (Theo-
rem 2.11.12). These facts imply that the following three steps yield for any DR recognizer
A an equivalent minimal DR recognizer B. Moreover, this B is normalized.

Step 1. Form A∗ .

Step 2. Form A∗c .

Step 3. Form ∼ for A∗c , and put B = A∗c /∼.

It is not hard to see that these steps are effectively realizable.
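
To make the procedure concrete, Steps 2 and 3 can be programmed along the following lines. The sketch below is only an illustration: the representation of a DR recognizer by a transition dictionary trans, a frontier assignment alpha and an initial state a0 is one possible choice, the relation ∼ is computed by the usual partition refinement, and Step 1 (the formation of A∗ ) is assumed to have been carried out already.

class DRRecognizer:
    """One possible concrete representation of a DR recognizer A = (A, a0, alpha)."""
    def __init__(self, states, trans, alpha, a0):
        self.states = set(states)   # the state set A
        self.trans = trans          # trans[(sigma, a)] = (a1, ..., am), i.e. sigma^A(a)
        self.alpha = alpha          # alpha[x] = set of states a with a in x-alpha
        self.a0 = a0                # the initial (root) state

def connected_component(rec):
    """Step 2: restrict to the states reachable from a0 (the recognizer Ac)."""
    reachable, stack = {rec.a0}, [rec.a0]
    while stack:
        a = stack.pop()
        for (sigma, b), successors in rec.trans.items():
            if b == a:
                for c in successors:
                    if c not in reachable:
                        reachable.add(c)
                        stack.append(c)
    trans = {k: v for k, v in rec.trans.items() if k[1] in reachable}
    alpha = {x: s & reachable for x, s in rec.alpha.items()}
    return DRRecognizer(reachable, trans, alpha, rec.a0)

def reduce_recognizer(rec):
    """Step 3: compute the equivalence ~ by partition refinement and form A/~."""
    symbols = {sigma for (sigma, _) in rec.trans}
    # Initially two states share a block iff they accept the same frontier letters.
    block = {a: frozenset(x for x, s in rec.alpha.items() if a in s) for a in rec.states}
    while True:
        new_block = {
            a: (block[a],
                tuple((sigma, tuple(block[c] for c in rec.trans[(sigma, a)]))
                      for sigma in sorted(symbols, key=str) if (sigma, a) in rec.trans))
            for a in rec.states}
        if len(set(new_block.values())) == len(set(block.values())):
            break                    # no block was split: ~ has been reached
        block = new_block
    names = {}
    for a in rec.states:             # give the blocks the names 0, 1, 2, ...
        names.setdefault(block[a], len(names))
    cls = {a: names[block[a]] for a in rec.states}
    trans = {(sigma, cls[a]): tuple(cls[c] for c in succ)
             for (sigma, a), succ in rec.trans.items()}
    alpha = {x: {cls[a] for a in s} for x, s in rec.alpha.items()}
    return DRRecognizer(set(cls.values()), trans, alpha, cls[rec.a0])

def minimize(rec):
    """Steps 2 and 3 together (Step 1, the normalization, is assumed to be done)."""
    return reduce_recognizer(connected_component(rec))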

2.12 EXERCISES
1. Let leaf(t) denote the set of symbols which label the leaves of a given ΣX-tree t.
Define leaf(t) by tree induction.

2. (a) Define the length |t| of a ΣX-tree t (as a word) by tree induction.
(b) For the sake of simplicity, let Σ = Σ2 . Derive an upper bound for |t| in terms
of hg(t). Give also a lower bound for |t| in terms of hg(t).

3. Let Σ = Σ0 ∪ Σ2 , Σ0 = {ω}, Σ2 = {σ}, and let X = {x, y}. Construct a CF


grammar which generates the set FΣ (X) of all ΣX-trees (when these are viewed as
words). Is the set of all ΣX-trees still a CF language if we use the Polish notation
for ΣX-terms?

4. Let Σ and X be as in the previous exercise. Decide which of the ΣX-forests
R, S, and T are recognizable, when these are defined as follows:
(i) t ∈ R iff the number of σ’s in t is odd.
(ii) t ∈ S iff all paths from the root to a leaf are of the same length.


(iii) t ∈ T iff no leaf labelled by y appears to the left of a leaf labelled by x.

5. Let A be an NDF ΣX-recognizer and B an NDR ΣX-recognizer which are asso-


ciated in the sense of Section 2.2. Prove the equality α̂ = β̃ by tree induction.

6. Use regular tree grammars to prove directly that Rec(Σ, X) is closed under σ-
products (Corollary 2.4.12).

7. Let us change the definition of the forest product T (x ← Tx ) (cf. Definition 2.4.3)
in such a way that every occurrence of each letter x ∈ X should be rewritten as
the same tree tx ∈ Tx . Then we get the new product

T [x ← Tx | x ∈ X] = {t(x ← tx | x ∈ X) | t ∈ T, tx ∈ Tx (x ∈ X)}.

Is Rec(Σ, X) closed under this product?

8. Let T be a ΣX-forest and let x ∈ X. Describe the forests T ·x ∅ and ∅ ·x T .

9. Do the following laws hold for x-products?


(a) R ·x (S ∪ T ) = (R ·x S) ∪ (R ·x T ).
(b) (R ∪ S) ·x T = (R ·x T ) ∪ (S ·x T ).
(c) R ·x (S ·y T ) = (R ·x S) ·y T .

10. Let us change Definition 2.4.7 so that T j+1,x = T ·x T j,x ∪ T j,x for all j ≥ 0.
Does the new x-iteration coincide with the original one? If not, does it preserve
recognizability?

11. Let x ≠ y (x, y ∈ X). Is it possible that (T ∗x )∗y ≠ (T ∗y )∗x for some ΣX-forest T ?

12. Show that the construction of the tree recognizer for the forest S −x T given in the
proof of Theorem 2.4.10 is effective when S is recognizable (and given by a tree
recognizer).

13. Prove Lemma 2.4.17.

14. Prove Corollary 2.4.20 directly without using Theorems 2.4.16 and 2.4.18.

15. Let ϕ : FΣ (X) → FΣ (X) be a homomorphism of Σ-algebras. Prove that if T ∈


Rec(Σ, X), then (a) T ϕ ∈ Rec(Σ, X) and (b) T ϕ−1 ∈ Rec(Σ, X).

16. The set of atomic ΣX-trees is defined as

A(Σ, X) = {σ(xi1 , . . . , xim ) | m ≥ 0, σ ∈ Σm , xi1 , . . . , xim ∈ X}.

For the sake of definiteness, let X = {x1 , . . . , xn } (n ≥ 1). Prove that

(. . . (A(Σ, X)∗x1 )∗x2 . . . )∗xn = FΣ (X)

(cf. Thatcher and Wright [241]).


17. Let Σ = Σ2 = {σ} and X = {x}. Write a regular expression for the forest of all
ΣX-trees which contain an even number of σ’s.
18. Let Σ and X be as in Exercise 3. Construct a ΣX-recognizer for the forest repre-
sented by the regular expression σ(x, y) ·z σ(ω, σ(ω, z))∗z .
19. Prove Theorem 2.6.6.
20. If A is a ΣX-recognizer and T (A) = T , then α̂ is a homomorphism from FT to
A. Prove Lemma 2.6.2 using this observation.
21. Prove Lemma 2.6.14.
22. In Section 2.7 we noted that one may define recognizability for subsets of algebras.
We call T (⊆ A) a recognizable subset of the Σ-algebra A = (A, Σ), if there exists
a congruence θ of finite index which saturates T . Denote by Rec A the set of all
recognizable subsets of A. Prove the following facts:
(a) If S, T ∈ Rec A, then S ∪ T, S ∩ T, S − T ∈ Rec A.
(b) If ϕ : A → B is a homomorphism and T ∈ Rec B, then T ϕ−1 ∈ Rec A.
(Note. T ∈ Rec A does not imply T ϕ ∈ Rec B. A counterexample where A and
B are monoids can be found in Eilenberg’s book (Vol. A) mentioned among the
references of Chapter 1.)
23. Let Σ = Σ2 = {σ} and X = {x, y}, and let (U, V ) be the least fixed-point of the
system
u = x + σ(σ(u, v), y)
v = σ(y, u).
Find a regular (Σ, X, k)-polynomial Π (k ≥ 2) such that U and V can be rep-
resented as unions of some components of [Π̂]. (For a general treatment of such
questions see Mezei and Wright [182].)
24. Show that every local ΣX-forest Loc(R, F ) can be represented in terms of the ele-
mentary forests and the elementary operations intersection, union, and restriction.
Note the resulting connection between the Theorems 2.8.6 and 2.9.5.
25. Show that the decidability of the equivalence problem of tree recognizers follows
from the results of Section 2.6.
26. Prove Lemma 2.10.5.
27. Prove that it is decidable whether a recognizable forest can be recognized by a
DR-recognizer.
28. Are all local forests recognizable by DR-recognizers?
29. Present algorithms for carrying out Steps 2 and 3 of the minimization algorithm
for DR-recognizers which was outlined in Section 2.11.


2.13 NOTES AND REFERENCES


The observation (made about 1960) that finite automata may be defined as unary al-
gebras is attributed to J. R. Büchi and J. B. Wright (see Mezei and Wright [182],
Thatcher [239]). The generalization to tree automata was suggested independently
by Doner [65, 66] and by Thatcher and Wright [240, 241]. Many of the basic re-
sults presented in this chapter were obtained in various forms by several authors, and
often it would be hard to establish any priorities. Most of the important early con-
tributions can be found in Mezei and Wright [182], Eilenberg and Wright [69],
Thatcher and Wright [241], Doner [66], Thatcher [238], Pair and Quere [196],
Brainerd [39, 40], Arbib and Give’on [5], and Magidor and Moran [166].
Already in many of these papers trees were defined as terms, and this formalism is
now very common. However, most authors use no separate frontier alphabet. Also,
often operators may have more than one rank. The original reason for our use of frontier
alphabets was to keep the character of the algebras independent of the number of frontier
symbols. Another popular formalism defines a tree as a pair (D, λ) consisting of a “tree
domain” D and a labelling mapping λ. Each element d of D specifies a node of the
tree and λ(d) is the label of this node. This definition is quite convenient for discussing
concepts and operations which involve specific occurrences of subtrees. Tree domains
were introduced by S. Gorn in 1965 (for a reference, see Brainerd [40]).
Deterministic and nondeterministic frontier-to-root tree recognizers were defined, and
their equivalence was established, by Thatcher and Wright [241], Doner [66], and
Magidor and Moran [166]. Root-to-frontier tree recognizers were introduced by Ra-
bin [204], and Magidor and Moran [166]. Magidor and Moran showed the equivalence
of NDF and NDR recognizers, and they also studied DR recognizers.
Regular tree grammars and the results of Section 2.3 are due to Brainerd [40]. In
Brainerd’s grammars the form of the productions is quite general, but he shows that
they can be reduced to, what we call, regular tree grammars.
The Boolean closure properties of Rec(Σ, X) were noted in many of the early papers
mentioned above. The Kleene theorem (Theorem 2.5.8) was proved by Thatcher and
Wright [241] and by Magidor and Moran [166]. A simplified proof was given by
Arbib and Give’on [5]. Alphabetic tree homomorphisms (called there projections) and
Corollary 2.4.20 appear in Thatcher and Wright [241]. General tree homomorphisms
arose as special cases of finite-state tree transductions (see Thatcher [238, 239] and
Engelfriet [75]). Tree transductions and tree homomorphisms will be considered in
Chapter 4. Forest products (or “substitutions”) were also introduced in this context. Ito
and Ando [127] present a complete axiom system for the equality of regular expressions
(cf. also Ésik [91]).
Minimal tree recognizers and Nerode congruences are discussed in Brainerd [39],
Arbib and Give’on [5], and Magidor and Moran [166].
The theory of equational forests is from Mezei and Wright [182]. We have simpli-
fied the exposition by considering only regular fixed-point equations. Mezei and Wright
considered also equational and recognizable subsets of general algebras (cf. Exercise 22).
They proved that the equational subsets of an algebra (of finite type) are the homo-


morphic images of the recognizable subsets of term algebras. Applied to term algebras
this result gives our Theorem 2.7.9. Eilenberg and Wright [69] present these re-
sults in a category theoretic form. For various classes of subsets in general algebras we
refer also to Wagner [249], Lescanne [150], Marchand [175], Shepard [220], and
Steinby [227]. Dubinsky [67] discusses equational and recognizable subsets of nonde-
terministic algebras. Maibaum [170], and Engelfriet and Schmidt [85] extend the
subject into another direction by considering many-sorted algebras.
The material of Section 2.8 is from Costich [52]. Local forests, or similar concepts, and
results related to Theorems 2.9.4 and 2.9.5 can be found in Doner [66], Thatcher [237,
238], and Takahashi [234].
The characterization of the forests recognizable by DR recognizers is from Virágh [248],
although the basic idea is discernible already in Magidor and Moran [166] (cf. also
Thatcher [239]). The minimization theory of DR recognizers appears in Gécseg and
Steinby [104].
We should also mention an alternative approach, originating with Pair and Quere [196]
and popular among French writers, in which the basic objects are tuples of trees rather
than trees. The usual tree operations are then augmented by operations which catenate
tuples of trees or form a tree from an m-tuple by creating a new root labelled by an
m-ary operator. As an abstract framework for their study Pair and Quere introduced
“binoids”, the tuples of trees form such a binoid. Their results include the basic clo-
sure properties and a Kleene Theorem. This formalism has been developed further by
Arnold and Dauchet [21] to a theory of “magmoids” which also embodies many of the
ideas of Eilenberg and Wright [69]. Arnold [9, 10] discusses many topics relevant
to this chapter within the framework of magmoids.
We shall now discuss briefly some topics and applications of the theory not covered
by this book. The survey is by no means complete, and in many cases the choices
were dictated by personal preference. Some more remarks will be made at the end of
Chapters 3 and 4.
The category theoretic treatment of recognizable and equational subsets by Eilenberg
and Wright [69] was already mentioned. It is based on Lawvere’s “theories”. This
approach was developed further by Give’on and Arbib [111], and others. The theory
of magmoids has also evolved from the same ideas. We have avoided the use of category
theory altogether, but the bibliography contains a sample from the extensive and highly
diversified literature on the subject. The items of interest include Alagić [3, 4], Arbib
and Manes [6], Bobrow and Arbib [38], Goguen [113], Goguen et al [114, 115],
Horváth [122, 123], and Trnková and Adámek [244].
The structure theory of tree automata has received little attention although some initial
steps were taken already by Magidor and Moran [166]. Ricci [209] considered cascade
products of tree automata. Iterative realizations and general products of tree automata
are studied in Steinby [225]. Two sections of Gécseg and Steinby [105] are devoted
to the subject. It is evident that generalizations from the unary case will usually not be
easy in this area.
Transition monoids have proved very useful in finite automaton theory and some
equivalents of them for tree automata have been suggested. The “m-ary monoids” of


Give’on [110] and the “substitution algebras” of Yeh [253] are in fact special Menger
algebras. The same idea reappears in the “clone algebras” of Turner [246]. Sommer-
halder [223] develops the concept further and associates with an algebra a sequence
M1 , M2 , . . . of monoids. Here Mn consists of all n-tuples of n-ary polynomial functions of
the algebra. It would be easy to define syntactic monoids of forests along these lines, but
no such theory seems to have evolved yet. Another variant of the transition semigroup
concept has been studied by Helton [120].
We shall mention some other algebraic topics of potential interest. A ΣX-forest T is
said to be recognizable by a Σ-algebra A = (A, Σ) if one may choose α : X → A and A′ (⊆
A) in such a way that (A, Σ, A′ ) recognizes T . Families of forests recognizable by algebras
belonging to a given variety (equational class) were considered by Steinby [224] and by
Gécseg and Horváth [103]. For a further study in this direction it would probably
be advantageous to follow the example of Eilenberg’s theory of M -varieties and varieties
of recognizable languages and consider “ω-varieties” (usually called pseudovarieties) of
algebras and the families of forests corresponding to them; an ω-variety is a class of
finite algebras closed under the construction of subalgebras, homomorphic images and
finite direct products. In Steinby [226] it was shown that Eilenberg’s basic variety
theorem can be extended to ω-varieties and varieties of recognizable subsets of free
algebras (suitably defined). A specialization of this result to term algebras gives a
correspondence between ω-varieties and varieties of recognizable forests. A ΣX-forest T
is said to be rationally represented by an ΩX-recognizer A if there exists an embedding
ϕ : FΣ (X) → FΩ (X) of a certain kind such that T ϕ = T (A). A variety K of algebras is
said to be rationally complete if every recognizable forest can be rationally represented by
a recognizer based on a finite algebra belonging to K. Gécseg [101] studies the rational
completeness of varieties and the equivalence of tree recognizers with respect to rational
representation. Further results can be found in Maróti [176], and Marchand [173]
also contains some related ideas.
We shall now list a few references to some more topics. Probabilistic tree automata and
related topics have been discussed by Magidor and Moran [166, 167], Ellis [72] and
Karpiński [141, 142]. Forests of infinite trees appear in Rabin [204], Engelfriet [73],
Casteran [50] and Courcelle [54]. An alternative way to generate forests is provided
by the tree adjunct grammars studied by Joshi, Levy and Takahashi [135, 136],
Levy [155], and Levy and Joshi [157]. Also Lindenmayer systems (L-systems) for trees
have been considered; see Čulik [56], Čulik and Maibaum [57], Engelfriet [76, 79],
Karpiński [143], Steyart [230], and Szilard [231].
Although we present our subject as a part of pure automata and formal language
theory, it should be clear that it has many connections to the more applied aspects of
language specification, translation and semantics. As a conclusion we would like to point
out some less obvious areas of application.
When Doner [65, 66] and Thatcher and Wright [240, 241] introduced tree au-
tomata their goal was to prove the decidability of the weak second order theory of
multiple successors. Further applications to logic can be found in Rabin [204, 205].
In syntactic pattern recognition patterns are decomposed into simple basic elements
which are represented by letters of an alphabet. A pattern is then represented, for


example, as a word. However, essential information about the relations between the
basic elements may be lost if the corresponding letters are simply concatenated to form
a word. It is possible that these can be described adequately by representing the pattern
as a tree, and then tree automata theory may be used. For example, the considered class
of patterns may be generated by a tree grammar or recognized by a tree recognizer. One
specific problem prompted by syntactic pattern recognition is the inference of forests
from samples. The interested reader may consult the books by Fu [97] and Gonzalez
and Thomason [117]. Some papers from this area are Berger and Pair [33], Brayer
and Fu [42], Fu and Bhargava [98], Gonzalez, Edwards and Thomason [116], Lu
and Fu [165], Pair [194], Tai [232], and Williams [251].

3 CONTEXT-FREE LANGUAGES AND
TREE RECOGNIZERS
The words generated by a context-free grammar can be read from derivation trees. The
connection between forests and languages implied by this fact is the subject matter
of this chapter. In the first section we define the yield-function by means of which a
word is extracted from a tree. In Section 3.2 the basic relations between recognizable
forests and context-free grammars are established. The usual definition of derivation
trees must be modified slightly so as to make them “trees” in our sense of the term, but
the difference is inessential. The forest of derivation trees of any CF grammar is shown
to be recognizable. On the other hand, we shall see that the yield of any recognizable
forest is a CF language. Hence tree recognizers may also be viewed as recognizers of CF
languages. The section is concluded by showing that every CF language is the yield of
a local forest recognizable by a deterministic R-recognizer.
The inverse image of a CF language under the yield-function is not always a recog-
nizable forest, but we show in the beginning of Section 3.3 that the inverse image of a
regular language is a recognizable forest. Also, a slightly restricted converse of this fact
is presented. Then we show that every CF language can be obtained from a recognizable
forest with a fixed and very simple ranked alphabet. Section 3.3 is concluded by some
examples which show how facts about context-free languages can be proved using the
theory of recognizable forests.
In Section 3.4 another, less well-known, way to obtain the context-free languages from
recognizable forests is presented.

3.1 THE YIELD FUNCTION


We shall now formally define the function that extracts a word from the frontier of a
tree. This will also give a function that associates a language with every forest.

Definition 3.1.1 The yield yd(t) of a ΣX-tree t is defined inductively as follows:

1◦ yd(x) = x for all x ∈ X.

2◦ If t = σ(t1 , . . . , tm ) (m ≥ 0, σ ∈ Σm ), then yd(t) = yd(t1 ) . . . yd(tm ).

The yield of a ΣX-forest T is the X-language yd(T ) = {yd(t) | t ∈ T }.

To obtain the yield of a tree σ(t1 , . . . , tm ) one concatenates the yields of the subtrees
t1 , . . . , tm . In particular, yd(σ) = e for all σ ∈ Σ0 . More generally, yd(t) = e iff


t ∈ FΣ (∅). The mapping


yd : FΣ (X) → X ∗
is not injective; in general, a word is the yield of several trees.
We use the same symbol yd for its extension to forests. Of course, yd presupposes a
Σ and an X although our notation does not show this.

Example 3.1.2 Let ω ∈ Σ0 , σ ∈ Σ3 , and x, y ∈ X. For s = σ(x, σ(y, ω, y), ω) and


t = σ(ω, x, σ(y, y, ω)) we have yd(s) = yd(t) = xyy. ✷
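
The inductive definition translates directly into a recursive procedure. In the following sketch trees are represented, as one possible convention, by nested tuples whose first component is the operator; the two trees of Example 3.1.2 then both receive the yield xyy.

def yd(t):
    """yd(t) computed by tree induction; frontier letters are strings, and
    sigma(t1, ..., tm) is the tuple (sigma, t1, ..., tm)."""
    if isinstance(t, str):                 # 1°: yd(x) = x for x in X
        return t
    _, *subtrees = t                       # 2°: yd(sigma(t1,...,tm)) = yd(t1) ... yd(tm)
    return ''.join(yd(s) for s in subtrees)

s = ('sigma', 'x', ('sigma', 'y', ('omega',), 'y'), ('omega',))
t = ('sigma', ('omega',), 'x', ('sigma', 'y', 'y', ('omega',)))
assert yd(s) == yd(t) == 'xyy'             # the two trees of Example 3.1.2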

Whether or not a given word w ∈ X ∗ is the yield of some ΣX-tree depends on the
length of w and the arities of the operators in Σ.

Lemma 3.1.3 Let r(Σ) = {m1 , . . . , mk }. For a word w ∈ X ∗ there exists a tree t ∈
FΣ (X) such that yd(t) = w iff the length of w can be expressed in the form

|w| = h1 (m1 − 1) + · · · + hk (mk − 1) + 1

for some (integers) h1 , . . . , hk ≥ 0. ✷

The proof of the lemma is an exercise. It is easy to see that yd(FΣ (X)) = X ∗ iff
Σ0 ≠ ∅ and Σ − (Σ1 ∪ Σ0 ) ≠ ∅. When this is the case, there exists for every X-language
L a ΣX-forest T such that yd(T ) = L. The greatest among these is the forest

yd−1 (L) = {t ∈ FΣ (X) | yd(t) ∈ L}.

In general, we know just that yd(yd−1 (L)) ⊆ L. From Lemma 3.1.3 one easily gets

Corollary 3.1.4 For a given L ⊆ X ∗ , there exists a forest T ⊆ FΣ (X) such that
yd(T ) = L iff

{|w| | w ∈ L} ⊆ {h1 (m1 − 1) + · · · + hk (mk − 1) + 1 | h1 , . . . , hk ≥ 0},

where {m1 , . . . , mk } = r(Σ). ✷

In the following lemma we list some obvious properties of yd and yd−1 .

Lemma 3.1.5 Let S and T be ΣX-forests, and K and L X-languages. Then

(a) yd(S ∪ T ) = yd(S) ∪ yd(T ),

(b) yd(S ∩ T ) ⊆ yd(S) ∩ yd(T ),

(c) yd−1 (K ∪ L) = yd−1 (K) ∪ yd−1 (L),

(d) yd−1 (K ∩ L) = yd−1 (K) ∩ yd−1 (L), and

(e) yd−1 (K − L) = yd−1 (K) − yd−1 (L). ✷


3.2 CONTEXT-FREE LANGUAGES AND RECOGNIZABLE FORESTS
In the customary definition of derivation trees the inner nodes are labelled by nonterminal
symbols and a nonterminal may appear at nodes with different numbers of outgoing
edges. Since we allowed a symbol of a ranked alphabet to have just one rank, the
definition of derivation trees should be modified accordingly.
Let G = (N, X, P, a0 ) be a CF grammar as defined in Section 1.6. We associate with
G a ranked alphabet ΣG thus: for each m ≥ 0,

ΣGm = {(a, m) | (∃a → η ∈ P ) |η| = m}.

Definition 3.2.1 Let G and ΣG be as above. For every d ∈ N ∪ X the set D(G, d) of
derivation trees with d as the root is defined by the following conditions:
1◦ D(G, x) = {x} for each x ∈ X.

2◦ For a ∈ N , (a, 0) ∈ D(G, a) iff a → e ∈ P .

3◦ Suppose a → d1 . . . dm ∈ P , with m > 0, a ∈ N and d1 , . . . , dm ∈ N ∪ X. If


t1 ∈ D(G, d1 ), . . . , tm ∈ D(G, dm ), then (a, m)(t1 , . . . , tm ) ∈ D(G, a).

4◦ Nothing is in any D(G, d) unless this follows from a finite number of applications
of the rules 1◦ , 2◦ and 3◦ .
The derivation forest of G is the ΣG X-forest D(G) = D(G, a0 ).

Exactly as in the case of conventional derivation trees, every t in D(G, d) (d ∈ N ∪ X)


corresponds to a unique leftmost derivation in G of the word yd(t) from d. Also, every
derivation
d ⇒G u1 ⇒G · · · ⇒G uk−1 ⇒G w,
with d ∈ N ∪ X and w ∈ X ∗ , can be described by a tree t ∈ D(G, d) such that
yd(t) = w. This is easily shown by induction on the length of the derivation. Hence,
L(G) = yd(D(G)).

Theorem 3.2.2 The derivation forests of CF grammars are local and, therefore, recog-
nizable.

Proof. Let G = (N, X, P, a0 ) be a CF grammar. It is obvious that D(G) is the local


ΣG X-forest L(R, F ) (in the notation of Section 2.9), where

R = {(a0 , m) | m ≥ 0, (a0 , m) ∈ ΣGm }

and the set F of the allowed forks is defined as follows. If m > 0 and a → d1 . . . dm ∈ P ,
then we include in F every fork (a, m)(c1 , . . . , cm ) such that for all i = 1, . . . , m,

ci = di if di ∈ X, and ci = (di , k) with k ≥ 0 and (di , k) ∈ ΣGk if di ∈ N .


Nothing is in F unless this follows from the construction described above. ✷

It is also easy to see that D(G) is generated by the regular ΣG X-grammar GD =


(N, ΣG , X, PD , a0 ), where

PD = {a → (a, m)(d1 , . . . , dm ) | m ≥ 0, a → d1 . . . dm ∈ P, d1 , . . . , dm ∈ N ∪ X}.

Example 3.2.3 Consider the CF grammar

G = ({a0 , b}, {x, y}, {a0 → xa0 b, a0 → e, b → xyb, b → y}, a0 ).

In this case ΣG = ΣG0 ∪ ΣG1 ∪ ΣG3 , where ΣG0 = {(a0 , 0)}, ΣG1 = {(b, 1)} and ΣG3 =
{(a0 , 3), (b, 3)}. The productions of the grammar GD = (N, ΣG , X, PD , a0 ) generating
D(G) are a0 → (a0 , 3)(x, a0 , b), a0 → (a0 , 0), b → (b, 3)(x, y, b) and b → (b, 1)(y). The
allowed roots of the local forest D(G) are (a0 , 0) and (a0 , 3), and the possible forks are
(a0 , 3)(x, (a0 , 0), (b, 1)), (a0 , 3)(x, (a0 , 0), (b, 3)), (a0 , 3)(x, (a0 , 3), (b, 1)), (a0 , 3)(x, (a0 , 3),
(b, 3)), (b, 3)(x, y, (b, 1)), (b, 3)(x, y, (b, 3)) and (b, 1)(y). ✷
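
The ranked alphabet ΣG and the productions PD of GD can be written down mechanically from the productions of G. The following sketch (with a CF production a → d1 . . . dm represented, as a convention of ours, by the pair (a, (d1 , . . . , dm ))) reproduces the data of Example 3.2.3.

def derivation_tree_grammar(cf_productions):
    """From the productions of a CF grammar compute the ranked alphabet Sigma^G
    (as a set of pairs (a, m)) and the productions P_D of G_D."""
    sigma_g = {(a, len(rhs)) for a, rhs in cf_productions}
    p_d = [(a, ((a, len(rhs)),) + tuple(rhs)) for a, rhs in cf_productions]
    return sigma_g, p_d

# The grammar of Example 3.2.3:
P = [('a0', ('x', 'a0', 'b')), ('a0', ()), ('b', ('x', 'y', 'b')), ('b', ('y',))]
sigma_g, p_d = derivation_tree_grammar(P)
assert sigma_g == {('a0', 3), ('a0', 0), ('b', 3), ('b', 1)}
assert ('a0', (('a0', 3), 'x', 'a0', 'b')) in p_d      # a0 -> (a0,3)(x, a0, b)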

Theorem 3.2.2 yields immediately

Corollary 3.2.4 Every CF language is the yield of a recognizable forest. ✷

The converse is also true:

Theorem 3.2.5 The yield of any recognizable forest is a context-free language.

Proof. Let G = (N, Σ, X, P, a0 ) be a regular ΣX-grammar generating the given recog-


nizable ΣX-forest T . To simplify matters we assume that G is in normal form. Now we
construct the CF grammar G1 = (N, X, P1 , a0 ) with

P1 = {a → yd′ (p) | a → p ∈ P }.

Here yd′ is the yield-function corresponding to the extended frontier alphabet X ∪ N .


Inductions on the lengths of the derivations show that
(1) a ⇒∗G t implies a ⇒∗G1 yd(t), for all a ∈ N , t ∈ FΣ (X), and that

(2) for all w ∈ X ∗ and a ∈ N , a ⇒∗G1 w only in case there exists a tree t ∈ FΣ (X)
such that a ⇒∗G t and yd(t) = w.
These two facts imply that yd(T ) = L(G1 ) is CF. ✷

In view of Theorem 3.2.5 any tree recognizer may be seen as a device which recognizes
a CF language by checking the possible syntaxes of given words; a word is accepted iff
it is the yield of at least one tree accepted by the tree recognizer.

Definition 3.2.6 The language recognized by a ΣX-recognizer A is the X-language


L(A) = yd(T (A)).


The previous results can now be expressed as follows.

Theorem 3.2.7 A language is recognized by a tree recognizer iff it is context-free. ✷

The equivalence expressed in Theorem 3.2.7 is effective both ways; for any CF language
given by a CF grammar we can construct a tree recognizer, and for any tree recognizer
A we can construct a CF grammar generating L(A).
By Theorem 3.2.2 every CF language is the yield of a local forest. We shall now show
that even a smaller class of forests will suffice. To this end we replace derivation trees
by trees in which the inner nodes are labelled by complete productions.
With every CF grammar G = (N, X, P, a0 ) we associate another ranked alphabet ΣP
defined as follows. For each m ≥ 0, let

ΣPm = {(a → η) | a → η is in P and |η| = m},

i.e., the m-ary symbols correspond to the productions with right-hand sides of length
m.

Definition 3.2.8 Let G and ΣP be as above. For every d ∈ N ∪ X the set P (G, d) of
production trees with d at the root is defined by the following conditions:
1◦ P (G, x) = {x} for each x ∈ X.
2◦ For a ∈ N , (a → e) ∈ P (G, a) iff a → e ∈ P .
3◦ Suppose a → d1 . . . dm ∈ P (m > 0, a ∈ N and d1 , . . . , dm ∈ N ∪ X). If p1 ∈
P (G, d1 ), . . . , pm ∈ P (G, dm ), then (a → d1 . . . dm )(p1 , . . . , pm ) ∈ P (G, a).
4◦ Nothing is in any P (G, d) unless this follows from a finite number of applications
of 1◦ , 2◦ and 3◦ .
The production forest of G is the ΣP X-forest P (G) = P (G, a0 ).

In our previous discussion of DR-recognizers we excluded nullary symbols, but since


the ranked alphabets ΣP may contain such symbols, we now extend the definition of a
DR ΣX-recognizer A = (A, a0 , A′ ) by setting σ A ∈ A and σ α̃ = {σ A } for any σ ∈ Σ0 .

Theorem 3.2.9 The production forest P (G) of any CF grammar G is local and it is
also recognizable by a deterministic R-recognizer.

Proof. Let G = (N, X, P, a0 ) be a CF grammar. The presentation of P (G) as a local for-


est is similar to that of D(G). We construct a DR ΣP X-recognizer A = (A, ΣP , X, A′ , α)
as follows. Put A = N ∪ X ∪ {d} (d ∉ N ∪ X), A′ = {a0 }, and for each x ∈ X,
xα = {x}. Next, the underlying root-to-frontier algebra A = (A, ΣP ) is defined. If
σ = (a → e) ∈ ΣP0 , then σ A = a. Let σ = (a → c1 . . . cm ) ∈ ΣPm with m > 0. Then we
put σ A (a) = (c1 , . . . , cm ), and σ A (b) = (d, . . . , d) for all b 6= a. It is easy to show by tree
induction that for all t ∈ FΣP (X) and a ∈ N ∪ X,

a ∈ tα̃ iff t ∈ P (G, a).


This implies that A recognizes P (G). ✷

The language recognized by an R-recognizer is defined in the natural way. As it is


obvious that yd(P (G)) = L(G) for every CF grammar G, we may state

Corollary 3.2.10 Every CF language is recognized by a deterministic R-recognizer. ✷

3.3 FURTHER RESULTS AND APPLICATIONS


Every CF language L is the yield of many different forests. Such a forest is not neces-
sarily recognizable. In particular, the greatest of them (for a given Σ) yd−1 (L) may be
nonrecognizable.

Example 3.3.1 Let Σ = Σ2 = {σ} and X = {x, y}. Consider the (minimal linear) CF
language L = {xn y n | n ≥ 1}. If yd−1 (L) were recognized by a ΣX-recognizer A, then
A would accept all trees σ(si , ti ) (i ≥ 1), where (i) s1 = x, t1 = y and (ii) sk+1 = σ(sk , x)
and tk+1 = σ(y, tk ) for all k ≥ 1. As A is finite, it would then also accept some tree
σ(si , tj ) with i ≠ j. But this is a contradiction, because yd(σ(si , tj )) = xi y j ∉ L. ✷

In contrast to Example 3.3.1 we have

Theorem 3.3.2 If L is a regular X-language, then yd−1 (L) ∈ Rec(Σ, X) for any ranked
alphabet Σ.

Proof. Let M be a finite monoid, ϕ : X ∗ → M a homomorphism and H a subset of M


such that L = Hϕ−1 . Let A = (M, Σ) be the Σ-algebra defined so that

σ A (a1 , . . . , am ) = a1 · a2 · . . . · am (product in M )

for all m ≥ 0, σ ∈ Σm and a1 , . . . , am ∈ M . In particular, σ A = 1 when σ ∈ Σ0 . If we


put
α = ϕ|X : X → M,
then
tα̂ = yd(t)ϕ for all t ∈ FΣ (X).
This implies that yd−1 (L) = T (A) for the ΣX-recognizer A = (A, α, H). Indeed, for all
t ∈ FΣ (X),

t ∈ T (A) iff tα̂ = yd(t)ϕ ∈ H
iff yd(t) ∈ L
iff t ∈ yd−1 (L). ✷
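
The recognizer constructed in this proof is easy to simulate: a tree is evaluated by multiplying together, in the monoid, the images of its frontier letters from left to right, and it is accepted iff the value obtained lies in H. In the following sketch the language of all words over {x, y} with an even number of x's, recognized via the monoid Z2 , serves as an illustrative choice; trees are nested tuples as before.

def evaluate(t, unit, mult, phi):
    """t-alpha-hat: yd(t) evaluated in a monoid (M, mult, unit);
    phi maps the frontier letters into M."""
    if isinstance(t, str):                  # frontier letter: alpha is phi restricted to X
        return phi[t]
    _, *subtrees = t                        # a nullary operator evaluates to the unit 1
    value = unit
    for s in subtrees:
        value = mult(value, evaluate(s, unit, mult, phi))
    return value

# Illustration: L = all words over {x, y} with an even number of x's,
# recognized via M = Z_2 (addition mod 2), phi(x) = 1, phi(y) = 0, H = {0}.
phi = {'x': 1, 'y': 0}
accepts = lambda t: evaluate(t, 0, lambda a, b: (a + b) % 2, phi) in {0}
assert accepts(('sigma', 'x', ('sigma', 'x', 'y')))      # yield xxy: two x's
assert not accepts(('sigma', 'x', ('omega',)))           # yield x: one x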

The full converse of Theorem 3.3.2 is not valid, but the following result will be proven
in Exercises 6 and 7.


Theorem 3.3.3 Let L (⊆ X ∗ ) be a language and Σ a ranked alphabet such that


yd(yd−1 (L)) = L. Then yd−1 (L) ∈ Rec(Σ, X) implies L ∈ RecX. ✷

The ranked alphabets ΣG and ΣP depend on the given CF grammar. We shall now
show that every CF language is the yield of a recognizable forest over a fixed ranked
alphabet. In fact, a very simple alphabet will suffice.

Theorem 3.3.4 Let Σ be a ranked alphabet which contains a binary operator and a
nullary operator. Then every CF language is recognized by a Σ-recognizer. For e-free
CF languages the binary symbol alone is sufficient.

Proof. Let us consider the e-free case first. Every CF language L ⊆ X + is generated
by a CF grammar G = (N, X, P, a0 ) in Chomsky normal form, where each production
is of the form a → bc or a → x (a, b, c ∈ N , x ∈ X). By Lemma 2.4.1 we may assume
that Σ = Σ2 = {σ}. Let G1 = (N, Σ, X, P1 , a0 ) be the regular ΣX-grammar, where

P1 = {a → σ(b, c) | a → bc ∈ P } ∪ {a → x | a → x ∈ P }.

Adjoin N to the frontier alphabet and let

yd′ : FΣ (X ∪ N ) → (X ∪ N )∗

be the corresponding yield-function. By induction on the length of the derivation one


can verify that for every derivation

a ⇒G u1 ⇒G . . . ⇒G uk (a ∈ N, k ≥ 1)

there is a derivation

a ⇒G1 p1 ⇒G1 . . . ⇒G1 pk (p1 , . . . , pk ∈ FΣ (X ∪ N )) (*)

such that yd′ (pi ) = ui for i = 1, . . . , k. This implies L(G) ⊆ yd(T (G1 )) as yd′ |FΣ (X) =
yd. The converse inclusion follows from the fact that for every derivation (*) we have a
derivation
a ⇒G yd′ (p1 ) ⇒G . . . ⇒G yd′ (pk ).
If L ⊆ X ∗ and e ∈ L, then we find, as above, a recognizable ΣX-forest T such that
yd(T ) = L − {e}. Now add a nullary operator ω to Σ and let T ′ = T ∪ {ω}. Then T ′ is
recognizable and yd(T ′ ) = L. ✷
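
The passage from a grammar in Chomsky normal form to the regular ΣX-grammar G1 used in this proof is purely mechanical, as the following sketch illustrates on a toy grammar (our own choice) generating {x}+ .

def to_regular_tree_grammar(cnf_productions):
    """Turn Chomsky normal form productions into the productions P1 of the
    regular tree grammar G1 over Sigma = Sigma_2 = {sigma}."""
    P1 = []
    for a, rhs in cnf_productions:
        if len(rhs) == 2:                  # a -> bc   becomes   a -> sigma(b, c)
            P1.append((a, ('sigma', rhs[0], rhs[1])))
        else:                              # a -> x    is kept unchanged
            P1.append((a, rhs[0]))
    return P1

# A toy CNF grammar for {x}^+ : a0 -> a0 a0, a0 -> x.
assert to_regular_tree_grammar([('a0', ('a0', 'a0')), ('a0', ('x',))]) == \
       [('a0', ('sigma', 'a0', 'a0')), ('a0', 'x')]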

The connections established above suggest the possibility of developing, or just inter-
preting, the theory of context-free languages in terms of tree automata and recognizable
forests. We shall illustrate this by a few examples. The results themselves are well
known.

Theorem 3.3.5 The intersection of a context-free language with a regular language is


context-free.


Proof. Consider a CF language L ⊆ X ∗ and a regular language U over the same


alphabet. Choose any ranked alphabet Σ and a recognizable ΣX-forest R such that
yd(R) = L. Then
L ∩ U = yd(R ∩ yd−1 (U )).
Since R ∩ yd−1 (U ) ∈ Rec(Σ, X) by Theorem 3.3.2 and Theorem 2.4.2, this means that
L ∩ U is context-free. ✷

The next example shows how the regular forest operations relate to language opera-
tions.

Definition 3.3.6 Let U and V be X-languages and x ∈ X. The x-substitution of U


into V is the language U ·x V of all words

w0 u1 w1 u2 . . . wk−1 uk wk ,

where k ≥ 0, u1 , . . . , uk ∈ U , w0 xw1 x . . . xwk−1 xwk ∈ V and x does not appear in the


word w0 w1 . . . wk .
The x-substitution closure of U is the language
U ∗x = ∪(U i,x | i ≥ 0),

where U 0,x = {x} and U i,x = U i−1,x ·x U ∪ U i−1,x for i > 0.

Consider two ΣX-forests S and T and a symbol x ∈ X. Every tree p ∈ S ·x T is


obtained from some tree t ∈ T by replacing each occurrence of x by some tree from S.
Suppose x appears k times (k ≥ 0) in t and that we get p by replacing these occurrences,
from left to right, by the trees s1 , . . . , sk ∈ S. If

yd(t) = w0 xw1 x . . . xwk ,

then
yd(p) = w0 yd(s1 )w1 yd(s2 ) . . . yd(sk )wk ∈ yd(S) ·x yd(T ).
Conversely, if w ∈ yd(S) ·x yd(T ), then we may write w in the form

w = w0 u1 w1 u2 . . . wk−1 uk wk

so that k ≥ 0, w0 xw1 x . . . xwk ∈ yd(T ) and u1 , . . . , uk ∈ yd(S). Then there are trees
t ∈ T and s1 , . . . , sk ∈ S such that yd(t) = w0 xw1 x . . . xwk and yd(s1 ) = u1 , . . . ,
yd(sk ) = uk . If we replace the occurrences of x in t by the trees s1 , . . . , sk , then we get
a tree p ∈ S ·x T such that yd(p) = w. An easy induction on i shows now that

yd(T i,x ) = yd(T )i,x for all i ≥ 0.

Using these observations we get

Lemma 3.3.7 For any two ΣX-forests S and T , and any letter x ∈ X,


(a) yd(S ·x T ) = yd(S) ·x yd(T )


and
(b) yd(T ∗x ) = yd(T )∗x . ✷

Now we can derive the following well-known description of the family of context-free
languages.

Theorem 3.3.8 The context-free languages form the smallest family of languages which
contains the finite languages and is closed under (finite) union, x-substitutions and x-
substitution closures.

Proof. Clearly, all finite languages are context-free. Let U, V ⊆ X ∗ be CF and x ∈ X.


There exist recognizable forests S, T ⊆ FΣ (X) such that yd(S) = U , yd(T ) = V . Now
U ∪V = yd(S ∪T ), U ·x V = yd(S ·x T ) and V ∗x = yd(T ∗x ) are all seen to be context-free.
On the other hand, the Kleene theorem (Theorem 2.5.8) together with Corollary 3.2.4
and Lemma 3.3.7 shows that every CF language can be obtained from finite languages
by forming unions, x-substitutions and x-substitution closures. ✷
Note that when a CF X-language is expressed in terms of finite languages, unions,
substitutions and substitution closures, symbols not in X may be used as auxiliary
symbols in substitutions.
As an example we consider the language L = {xn y n | n ≥ 0}. Let ω ∈ Σ0 and σ ∈ Σ3 .
Then L is the yield of, for example, the recognizable ΣX-forest
T = {ω, σ(x, ω, y), σ(x, σ(x, ω, y), y), . . .}
which has the regular expression ω ·z σ(x, z, y)∗z . From this we get for L the represen-
tation
L = {e} ·z {xzy}∗z .
Here z is an auxiliary letter which does not appear in the language represented.
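
For finite languages the x-substitution and the first stages of the x-substitution closure can be computed directly. The following sketch (one possible word-level implementation of Definition 3.3.6) reproduces the beginning of the representation L = {e} ·z {xzy}∗z given above.

from itertools import product

def subst(U, x, V):
    """U ._x V of Definition 3.3.6: each occurrence of the letter x in a word of V
    is replaced, independently, by some word of U."""
    result = set()
    for v in V:
        parts = v.split(x)                           # v = w0 x w1 x ... x wk
        for choice in product(U, repeat=len(parts) - 1):
            w = parts[0]
            for u, rest in zip(choice, parts[1:]):
                w += u + rest
            result.add(w)
    return result

def closure_stages(U, x, rounds):
    """The union of the stages U^{0,x}, ..., U^{rounds,x} of the x-substitution closure."""
    stage, total = {x}, {x}                          # U^{0,x} = {x}
    for _ in range(rounds):
        stage = subst(stage, x, U) | stage           # U^{i,x} = U^{i-1,x} ._x U  u  U^{i-1,x}
        total |= stage
    return total

# L = {e} ._z {xzy}^{*z}: substituting the empty word for z gives the words x^n y^n.
approx = subst({''}, 'z', closure_stages({'xzy'}, 'z', 3))
assert {'', 'xy', 'xxyy', 'xxxyyy'} <= approx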

3.4 ANOTHER WAY TO RECOGNIZE CF LANGUAGES


If an ordinary finite automaton is viewed as a unary algebra, then its input symbols form
a ranked alphabet. There is a way to interpret ΣX-trees as words over Σ in the general
case, too. When this is done, recognizable forests become CF languages. Moreover,
every CF language can be obtained this way as a recognizable forest once its alphabet
is suitably ranked.
We consider the unary case as an introduction. The word
tη = σ1 . . . σk ∈ Σ∗
can be obtained from the corresponding Σ{x}-tree
t = σk (. . . σ1 (x) . . .)
recursively as follows:


1◦ xη = e for all x ∈ X.

2◦ tη = sησ if t = σ(s) (σ ∈ Σ).

Another way to get tη would be to erase the parentheses and x and then reverse the
resulting word. Both of these constructions can serve as a basis for the generalization
to the case of an arbitrary ranked alphabet. The reversing of the order of the word
is an inessential step due to our way of writing trees, and it will be omitted in the
generalization.
Let Σ be an arbitrary ranked alphabet and X any frontier alphabet. We shall treat Σ
as an ordinary alphabet, too. We assume that Σ and X are disjoint and that they do
not contain (, ) or the comma. Let

Y = Σ ∪ X ∪ {(, ), , }

and define
η : Y ∗ → Σ∗

as the monoid homomorphism such that


yη = y for y ∈ Σ, and yη = e for y ∈ Y − Σ.

Applied to a ΣX-tree t, η erases all frontier letters x ∈ X, the parentheses and the
commas, leaving the symbols σ ∈ Σ intact. It is easy to see that this can be carried out
as follows, too.

Lemma 3.4.1 The words tη (t ∈ FΣ (X)) can be found recursively as follows:

1◦ xη = e for all x ∈ X.

2◦ If t = σ(t1 , . . . , tm ) (m ≥ 0, σ ∈ Σm ), then tη = σt1 η . . . tm η. ✷
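
The recursion of Lemma 3.4.1 is easily programmed. In the sketch below (trees as nested tuples, with the operator σ written as the letter 's') the tree σ(x, σ(x, x)) is sent to the word σσ.

def eta(t):
    """The recursion of Lemma 3.4.1; trees are nested tuples, operators are strings."""
    if isinstance(t, str):                 # 1°: x-eta = e for a frontier letter x
        return ''
    sigma, *subtrees = t                   # 2°: (sigma(t1,...,tm))eta = sigma t1-eta ... tm-eta
    return sigma + ''.join(eta(s) for s in subtrees)

assert eta(('s', 'x', ('s', 'x', 'x'))) == 'ss'   # sigma(x, sigma(x, x)) yields sigma sigma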

We have already noted that every regular ΣX-grammar may also be viewed as a
CF grammar generating a Y -language. Moreover, it is well-known that the family of
context-free languages is closed under homomorphisms. Hence we have

Lemma 3.4.2 If T ∈ Rec(Σ, X), then T η ∈ CF(Σ). ✷

Next we prove the following converse of Lemma 3.4.2.

Lemma 3.4.3 Let Σ and X be alphabets. If Σ is ranked so that Σ2 = Σ, then there


exists for each CF language L ⊆ Σ∗ a recognizable ΣX-forest T such that T η = L.


Proof. First, let L be e-free. Then L is generated by a CF grammar G = (N, Σ, P, a0 )


in Greibach 2-form, where each production is of the form (i) a → σbc, (ii) a → σb or
(iii) a → σ (a, b, c ∈ N , σ ∈ Σ). We convert G into a regular ΣX-grammar G1 =
(N, Σ, X, P1 , a0 ), where the set P1 of productions is defined as follows. Fix any x ∈ X
and put then

P1 = {a → σ(b, c) | a → σbc ∈ P } ∪ {a → σ(b, x) | a → σb ∈ P } ∪


{a → σ(x, x) | a → σ ∈ P }.

In order to show that T (G1 ) is the required recognizable forest we extend η to a homo-
morphism
η1 : (Y ∪ N )∗ → (Σ ∪ N )∗
so that η1 |Y = η and η1 |N = 1N . It is easy to see that to every derivation

a ⇒G u1 ⇒G . . . ⇒G uk (a ∈ N, k ≥ 1)

there corresponds a derivation

a ⇒ G 1 v1 ⇒ G 1 . . . ⇒ G 1 vk (*)

such that vi η1 = ui (i = 1, . . . , k). Conversely, every derivation (*) is matched by the


derivation
a ⇒G v1 η1 ⇒G . . . ⇒G vk η1 .
Since η1 |Y ∗ = η, this implies T (G1 )η = L(G) = L. If e ∈ L, we apply this construction
to L − {e} and then add the tree x to T (G1 ). ✷
In the representation of Lemma 3.4.3 the frontier alphabet X can be fixed in advance
independently of Σ and the language L. A one-element alphabet X = {x} always suffices.
We say that a ΣX-recognizer A η-accepts a word w ∈ Σ∗ , if it accepts at least one
ΣX-tree t such that tη = w. The Σ-language η(A) η-recognized by A is the set of all
words η-accepted by A. In this terminology the previous results may be summed up as
follows.

Theorem 3.4.4 A language is η-recognized by some tree recognizer iff it is a context-free


language. ✷

3.5 EXERCISES
1. Is it possible that yd−1 (w) is infinite for some word w?

2. Prove Lemma 3.1.3.

3. Find an example of a nonrecognizable forest T such that yd(T ) is a recognizable


language.

4. Show that for every CF grammar G, D(G) is the image of P (G) under an alphabetic
tree homomorphism.


5. Recall that a groupoid is an algebra with one binary operation (and no other
operations). For Σ = Σ2 = {σ}, FΣ (X) is the free groupoid generated by X.
Verify that yd : FΣ (X) → X + is a groupoid epimorphism. Then prove that a
language L ⊆ X + is context-free iff it is the homomorphic image of a recognizable
subset of the free groupoid generated by X (cf. Exercise 2.22, and Mezei and
Wright [182]).
6. The set Comb(Σ, X) of “comb-like” ΣX-trees is defined as the smallest set S
satisfying the conditions 1◦ and 2◦ :
1◦ X ∪ Σ0 ⊆ S.
2◦ If m > 0, σ ∈ Σm , x1 , . . . , xm−1 ∈ X and t ∈ S, then σ(x1 , . . . , xm−1 , t) ∈ S.
(a) Prove that Comb(Σ, X) ∈ Rec(Σ, X).
(b) Let T be a recognizable forest such that T ⊆ Comb(Σ, X).
Show that T is generated by a regular ΣX-grammar (N, Σ, X, P, a0 ) in
which each production has the form a → σ(x1 , . . . , xm−1 , b), a → ω or
a → x (a, b ∈ N , m > 0, σ ∈ Σm , x1 , . . . , xm−1 ∈ X, ω ∈ Σ0 , x ∈ X).
(c) Infer from (b) that yd(T ) ∈ RecX for every recognizable T ⊆ Comb(Σ, X).
(d) Prove that for every ΣX-tree t there exists a comb-like ΣX-tree s such
that yd(s) = yd(t). Deduce from this fact that if yd(yd−1 (L)) = L for
some L ⊆ X ∗ , then
yd(yd−1 (L) ∩ Comb(Σ, X)) = L.

7. Prove Theorem 3.3.3 using the results of the previous exercise.


8. Give another proof for Theorem 3.3.4 using the fact that every CF language can
be generated by an invertible CF grammar in Chomsky normal form.

In Exercises 9–12 the theory of recognizable forests should be applied.


9. Prove that the language U − V is CF if U is CF and V is a regular language.
10. Let ϕ : X ∗ → Y ∗ be a homomorphism of monoids. Prove that Lϕ−1 ∈ CF(X) for
every L ∈ CF(Y ).
11. Let h(t) denote the tree which is obtained from a given tree by rewriting every op-
erator σ as its rank r(σ). Obviously yd(h(t)) = yd(t). Show that h can be defined,
for any given Σ and X, as an alphabetic tree homomorphism. Two CF grammars
G1 and G2 are said to be structurally equivalent if h(D(G1 )) = h(D(G2 )). Prove
that there is an algorithm to determine whether or not two CF grammars are
structurally equivalent.
12. Prove Bar-Hillel’s pumping lemma (Lemma 1.6.13).
13. Let G be a regular ΣX-grammar. Construct a CF grammar G′ such that L(G′ ) =
T (G)η. Note that Lemma 3.4.2 follows as a result.


3.6 NOTES AND REFERENCES


The basic connection between recognizable forests and context-free languages has been
established in various ways. Mezei and Wright [182] proved that the equational sub-
sets of an algebra of finite type (in the monoid X ∗ these are the CF languages) are
the homomorphic images of the recognizable subsets of term algebras, i.e., recognizable
forests. Applied to groupoids this theorem gives the result of Exercise 5 (credited to
D. Muller). It also implies Theorem 3.3.4 which was explicitly formulated by Magidor
and Moran [166]. The proof using derivation forests goes back to Thatcher [237, 238]
and Doner [66]. Various forms of production trees have been used in this context by
Engelfriet [74], and Steinby [224]. Theorem 3.3.2 appears, for example, in Rounds
[215]. It is a special instance of the fact that the inverse homomorphic images of recog-
nizable subsets of algebras are recognizable (cf. Exercise 2.22). Theorem 3.3.3 appears
to be well-known. The proof outlined in Exercises 6 and 7 is from Steyart [229]. The
idea to use tree automata in the theory of CF languages was proposed by Rounds [214].
More examples of such applications can be found in Thatcher [239] and Engelfriet
[74]. The results of Section 3.4 are due to Ferenci [93]. The interested reader may
consult Ferenci [94] for further work in this direction.
As a conclusion we mention a few other topics. Using a ranked nonterminal alphabet it
is possible to define context-free tree grammars. Rounds [213, 214, 215] shows that the
yield-languages of CF forests are exactly the indexed languages. Arnold and Dauchet
[15, 17, 18], and Engelfriet and Schmidt [85] are some further references.
Possibilities to extend some of the results of this chapter to type 0 or context-sensitive
languages by generalizing the tree-concept have been investigated by Benson [32], But-
telman [48, 49], Hart [118, 119], and Révész [208]. Hierarchies of term languages
obtained by iteration of the yield-forming process have been studied by Maibaum [170],
Engelfriet and Schmidt [85], and Turner [245, 246]. Families of languages defined
by tree recognizers based on algebras belonging to a given variety of algebras were con-
sidered by Steinby [224]. Gécseg and Horváth [103] showed that a proper variety
may be complete in the sense that every CF language is recognizable by a finite algebra
of the variety (cf. the Notes and references section of Chapter 2).

4 TREE TRANSDUCERS AND TREE
TRANSFORMATIONS
In this chapter we shall deal with systems transforming trees into trees similarly as
generalized sequential machines transform strings into strings. There are two main
categories of such systems: frontier-to-root tree transducers which process a tree from
the leaves down towards the root, and root-to-frontier tree transducers which work in
the opposite direction. Special classes of tree transducers will play a basic part in
decomposing tree transformations into simpler ones.

4.1 BASIC CONCEPTS


Throughout this chapter, Σ, Ω and ∆ will stand for ranked alphabets. It will be assumed
that whenever an operator belongs to more than one ranked alphabet, then it has the
same rank in all of them. Moreover, X, Y and Z will always stand for (finite, nonvoid)
frontier alphabets.
Let us recall that FΣ (S) as defined in Section 2.1 denotes the set of Σ-trees over the
frontier alphabet S. Here we shall allow S to be a possibly infinite set of trees and then
use the notation FΣ [S] for FΣ (S). One can easily see that in such a case there always
exist a ranked alphabet Ω and a frontier alphabet Y such that FΣ [S] ⊆ FΩ (Y ).
Binary relations τ ⊆ FΣ (X) × FΩ (Y ) will be called tree transformations. An inclusion
(p, q) ∈ τ is interpreted to mean that τ may transform p into q. Because tree transfor-
mations are binary relations, we can speak about compositions, inverses, domains and
ranges of tree transformations as defined in Section 1.1.
With each tree transformation τ ⊆ FΣ (X) × FΩ (Y ) we associate the translation
{(yd(p), yd(q)) | (p, q) ∈ τ } from X ∗ into Y ∗ .
The important tree transformations are those which can be given in an effective way.
Next we define two general systems (tree transducers) inducing such transformations.
We shall need a countably infinite set

Ξ = {ξ1 , ξ2 , . . .}

of auxiliary variables. The subset of Ξ consisting of its first n ≥ 0 elements will be


denoted by Ξn , i.e., Ξn = {ξ1 , . . . , ξn }. The role of an auxiliary variable is to indicate an
occurrence of a subtree in a tree.
If all variables occurring in a tree q are among ξ1 , . . . , ξn , then the notation q(ξ1 , . . . , ξn )
may be also used for q. Moreover, if q1 , . . . , qn are arbitrary trees, then we generally
write q(q1 , . . . , qn ) for q(ξ1 ← q1 , . . . , ξn ← qn ).


Definition 4.1.1 A frontier-to-root tree transducer (F-transducer ) is a system A =


(Σ, X, A, Ω, Y, P, A′ ) where

(1) Σ and Ω are ranked alphabets,

(2) X and Y are frontier alphabets,

(3) A is a ranked alphabet consisting of unary operators, the state set of A,


(It will be assumed that A is disjoint with all other sets in the definition of A,
except A′ .)

(4) A′ ⊆ A is the set of final states, and

(5) P is a finite set of productions (or rewriting rules) of the following two types:
(i) x → a(q) (x ∈ X, a ∈ A, q ∈ FΩ (Y )),
(ii) σ(a1 (ξ1 ), . . . , am (ξm )) → a(q(ξ1 , . . . , ξm )) (σ ∈ Σm , m ≥ 0, a1 , . . . , am , a ∈ A,
q(ξ1 , . . . , ξm ) ∈ FΩ (Y ∪ Ξm )).
(In the sequel we shall write simply σ(a1 , . . . , am ) for σ(a1 (ξ1 ), . . . , am (ξm )).)

We shall use also the notation (p, q) for a production p → q. Moreover, if a ∈ A is


a state and t is a tree, then we generally write at for a(t). Similarly, if T is a forest,
then AT will denote the forest {at | a ∈ A, t ∈ T }. Furthermore, for any a ∈ A, we put
A(a) = (Σ, X, A, Ω, Y, P, a).
Let us note that in the above definition it would be more exact to speak about produc-
tion schemes instead of productions. Indeed, soon we shall see that they define patterns
for rewriting trees.
Next we define the transformations induced by F-transducers. Consider the F-transducer
A of Definition 4.1.1 and, for every p ∈ FΣ [X ∪ AΞ], let pτA∗ be the subset of AFΩ (Y ∪ Ξ)
given as follows:

(1) if p = aξ (a ∈ A, ξ ∈ Ξ), then aξ ∈ pτA∗ ,

(2) if p ∈ X ∪ Σ0 , then aq ∈ pτA∗ for all (p, aq) ∈ P ,

(3) if p = σ(p1 , . . . , pm ) (σ ∈ Σm , m > 0), then aq(q1 , . . . , qm ) ∈ pτA∗ for all


(σ(a1 , . . . , am ), aq) ∈ P and ai qi ∈ pi τA∗ (a, ai ∈ A, i = 1, . . . , m), and

(4) nothing is in any pτA∗ unless this follows from (1)–(3).

Definition 4.1.2 Take an F-transducer A = (Σ, X, A, Ω, Y, P, A′ ). Then the relation

τA = {(p, q) | p ∈ FΣ (X), q ∈ FΩ (Y ), aq ∈ pτA∗ for some a ∈ A′ }

is called the transformation induced by A.


For Definition 4.1.2 it would be enough to apply τA∗ to trees from FΣ (X). The above
more general case will be needed later.
Sometimes in our proofs we should know how an input tree is transformed step by step
into an output tree. Again, let A be the F-transducer of Definition 4.1.1, and consider
two trees p, q ∈ FΣ [X ∪ AFΩ (Y ∪ Ξ)]. It is said that p directly derives q in A if q can be
obtained from p by

(i) replacing an occurrence of an x ∈ X in p by the right side aq of a production


x → aq from P , or by

(ii) replacing an occurrence of a subtree σ(a1 q1 , . . . , am qm ) (σ ∈ Σm , a1 , . . . , am ∈ A,


q1 , . . . , qm ∈ FΩ (Y ∪ Ξ)) in p by aq(q1 , . . . , qm ), where σ(a1 , . . . , am ) → aq is a
production from P .

Each application of rule (i) or rule (ii) is called a direct derivation in A. If q is obtained
from p by a direct derivation in A (i.e., p directly derives q in A), then we write p ⇒A q.
Therefore, ⇒A is a binary relation in FΣ [X ∪ AFΩ (Y ∪ Ξ)]. If there is no danger of
confusion, we generally omit A in ⇒A .
By finitely many consecutive applications of direct derivations we get derivations.
Accordingly, for any two trees p, q ∈ FΣ [X ∪ AFΩ (Y ∪ Ξ)] we say that

p = p0 ⇒ p1 ⇒ . . . ⇒ pi ⇒ . . . ⇒ pj ⇒ . . . ⇒ pk = q (1)

(k ≥ 0, pℓ ∈ FΣ [X ∪ AFΩ (Y ∪ Ξ)], ℓ = 1, . . . , k, 0 ≤ i < j ≤ k)


is a derivation of q from p in A, k is the length of this derivation and pi ⇒ . . . ⇒ pj
is a subderivation of (1). In this case we write p ⇒∗A q, or p ⇒∗ q if A is understood,
and say that p derives q in A. Therefore, ⇒∗ is the reflexive-transitive closure of ⇒.
Obviously, when p ⇒∗ q, there could be several (but finitely many) derivations of q
from p. However, when we write p ⇒∗ q, we usually have in mind, at least implicitly, a
certain well-defined derivation of q from p. Consequently, we may say that p ⇒∗ q is a
derivation.
Using the notation ⇒∗ the transformation τA induced by an F-transducer A =
(Σ, X, A, Ω, Y, P, A′ ) can also be given thus:

τA = {(p, q) | p ∈ FΣ (X), q ∈ FΩ (Y ), p ⇒∗ aq for some a ∈ A′ }.

As A may have different productions with the same left side, there could be more
than one q ∈ FΩ (Y ) such that (p, q) ∈ τA for a given p ∈ FΣ (X), i.e., A is in general
nondeterministic. However, at each step of a transformation we have only finitely many
choices. Therefore, pτA is finite for every p ∈ FΣ (X).
A tree transformation is an F-transformation if it can be induced by an F-transducer.
The class of all F-transformations will be denoted by F.
Take an arbitrary set A. The ith component of a vector a ∈ An will be denoted by ai ;
i.e., a = (a1 , . . . , an ). If a1 = . . . = an = a, then we write an for a. If a ∈ An and b ∈ B m
are two arbitrary vectors, then (a, b) will stand for (a1 , . . . , an , b1 , . . . , bm ). Assume that


k = min(m, n). Then ab stands for (a1 b1 , . . . , ak bk ) or ((a1 , b1 ), . . . , (ak , bk )), depending
on the context.
Consider a p ∈ FΣ (X ∪ Ξn ), and let p = (p1 , . . . , pn ) be a vector of trees. Then we
shall write p(p) for p(p1 , . . . , pn ). Moreover, if p ∈ FΣ (X ∪ Ξn )m and q = (q1 , . . . , qn ) is
a vector of trees, then p(q) will stand for (p1 (q), . . . , pm (q)).
Consider the homomorphism ϕ : (X ∪ Ξ)∗ → Ξ∗ given by xϕ = e (x ∈ X) and ξϕ = ξ
(ξ ∈ Ξ). Set

F̂Σ (X ∪ Ξn ) = {p ∈ FΣ (X ∪ Ξn ) | yd(p)ϕ is a permutation of ξ1 , . . . , ξn }

and
F̂̂Σ (X ∪ Ξn ) = {p ∈ FΣ (X ∪ Ξn ) | yd(p)ϕ = ξ1 . . . ξn }.
Moreover, if m > 0 then let

F̂Σm (X ∪ Ξn ) = {p ∈ FΣ (X ∪ Ξn )m | yd(p1 )ϕ . . . yd(pm )ϕ is a


permutation of ξ1 , . . . , ξn }.

Now let A = (Σ, X, A, Ω, Y, P, A′ ) be an F-transducer, and consider a derivation

α : p ⇒∗ q (p, q ∈ FΣ [X ∪ AFΩ (Y )]).

Let
r(p1 , p2 ) ⇒ r(p11 , p2 ) ⇒ . . . ⇒ r(p1k , p2 ) ⇒ r(p1k , p′2 ) (2)
(r ∈ F̂Σ [X ∪ AFΩ (Y ) ∪ Ξ2 ])
be a subderivation of α, where the first k direct derivation steps apply to the subtree p1 ,
and then the (k + 1)th step concerns the subtree p2 . Replacing the subderivation (2) in
α by
r(p1 , p2 ) ⇒ r(p1 , p′2 ) ⇒ r(p11 , p′2 ) ⇒ . . . ⇒ r(p1k , p′2 ) (3)
we obviously get a new derivation

β : p ⇒∗ q.

The replacement of (2) in α by (3) is called an inversion of direct derivations. Finitely


many inversions of direct derivations is a reordering of direct derivations.
In the sequel we do not distinguish between derivations obtained from each other by
reorderings of direct derivations.
Again, consider the above F-transducer A and a tree p ∈ FΣ (X). Then by

p = p(p1 , . . . , pm ) ⇒∗ p(a1 q1 , . . . , am qm ) ⇒∗ aq(q1 , . . . , qm )

(p ∈ F̂Σ (X ∪ Ξm ), pi ⇒∗ ai qi , i = 1, . . . , m, p(a1 ξ1 , . . . , am ξm ) ⇒∗ aq)


we mean the derivation

p(p1 , . . . , pm ) ⇒ p(p11 , . . . , pm ) ⇒ . . . ⇒ p(p1k1 , . . . , pm ) ⇒ . . .


p(p1k1 , . . . , pm1 ) ⇒ . . . ⇒ p(p1k1 , . . . , pmkm ) =


p(a1 q1 , . . . , am qm ) ⇒∗ aq(q1 , . . . , qm )
if pi ⇒∗ ai qi is the derivation pi ⇒ pi1 ⇒ . . . ⇒ piki = ai qi (ai ∈ A, qi ∈ FΩ (Y ),
i = 1, . . . , m), and p(a1 q1 , . . . , am qm ) ⇒∗ aq(q1 , . . . , qm ) is obtained by replacing ξi in
p(a1 ξ1 , . . . , am ξm ) ⇒∗ aq by qi (i = 1, . . . , m).
If we say that we write the derivation
α : p ⇒∗ aq (a ∈ A, p ∈ FΣ (X), q ∈ FΩ (Y ))
in the (more detailed) form
β : p = p(p1 , . . . , pm ) ⇒∗ p(a1 q1 , . . . , am qm ) ⇒∗ aq(q1 , . . . , qm )
(p ∈ F̂Σ (X ∪ Ξm ), pi ⇒∗ ai qi , i = 1, . . . , m, p(a1 ξ1 , . . . , am ξm ) ⇒∗ aq)
this also generally means that β is a reordering of α. Of course, such a reordering always
exists.
In the special case p = σ(ξ1 , . . . , ξm ) (σ ∈ Σm ) we write β in the form
β : σ(p1 , . . . , pm ) ⇒∗ σ(a1 q1 , . . . , am qm ) ⇒ aq(q1 , . . . , qm )
(pi ⇒∗ ai qi , i = 1, . . . , m, (σ(a1 , . . . , am ), aq) ∈ P ).
We illustrate the concepts of F-transducers and F-transformations by
Example 4.1.3 Let A = (Σ, {x}, {a0 , a1 }, Ω, {y}, P, {a0 }), where Σ = Σ2 = {σ}, Ω =
Ω1 = {ω} and P consists of the productions x → a1 y and σ(a1 , a1 ) → a0 ω(ξ1 ).
Consider the tree σ(x, x). One of the possible derivations
σ(x, x) ⇒ σ(a1 y, x) ⇒ σ(a1 y, a1 y) ⇒ a0 ω(y)
is illustrated by Fig. 4.1.

[Figure 4.1: the derivation σ(x, x) ⇒ σ(a1 y, x) ⇒ σ(a1 y, a1 y) ⇒ a0 ω(y) drawn as tree diagrams.]

Thus (σ(x, x), ω(y)) is in τA . In fact, τA consists of this single pair (σ(x, x), ω(y)).
Indeed, the only ΣX-tree of height 0 is x, which obviously is not in dom(τA ). If p ∈
FΣ (X) is a tree with height greater than 1, then it should contain at least one of the
following trees as a subtree:
σ(σ(x, x), σ(x, x)), σ(σ(x, x), x) and σ(x, σ(x, x)).
One can easily see that none of these subtrees can be transformed by A. ✷
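
The inductive definition of pτA∗ can be read as a bottom-up procedure: for every subtree one collects the pairs (state, output tree) derivable for it, and a production σ(a1 , . . . , am ) → aq(ξ1 , . . . , ξm ) combines them. The following sketch (our own encoding, with the auxiliary variable ξi written as the integer i) recomputes Example 4.1.3.

from itertools import product

def plug(pattern, args):
    """Substitute args[i-1] for the variable xi_i (encoded as the integer i)."""
    if isinstance(pattern, int):
        return args[pattern - 1]
    if isinstance(pattern, str):
        return pattern
    return (pattern[0],) + tuple(plug(s, args) for s in pattern[1:])

def tau_star(p, leaf_rules, rules):
    """All pairs (a, q) with aq in p-tau_A^*, following the inductive definition."""
    if isinstance(p, str) or len(p) == 1:            # frontier letter or nullary operator
        return set(leaf_rules.get(p, set()))
    sigma, *subtrees = p
    below = [tau_star(s, leaf_rules, rules) for s in subtrees]
    results = set()
    for combination in product(*below):              # one pair (a_i, q_i) per subtree
        states = tuple(a for a, _ in combination)
        args = [q for _, q in combination]
        for a, pattern in rules.get((sigma, states), set()):
            results.add((a, plug(pattern, args)))
    return results

# Example 4.1.3: the productions x -> a1 y  and  sigma(a1, a1) -> a0 omega(xi_1).
leaf_rules = {'x': {('a1', 'y')}}
rules = {('sigma', ('a1', 'a1')): {('a0', ('omega', 1))}}
assert tau_star(('sigma', 'x', 'x'), leaf_rules, rules) == {('a0', ('omega', 'y'))}
# Since a0 is the only final state, tau_A = {(sigma(x, x), omega(y))}.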


F-transducers transform a tree from the leaves of the tree towards the root of the tree.
Now we define a system which works in the opposite direction.

Definition 4.1.4 A root-to-frontier tree transducer (R-transducer ) is a system A =


(Σ, X, A, Ω, Y, P, A′ ), where

(1) Σ, X, A, Ω, Y and A′ are specified the same way as in Definition 4.1.1, but here
A′ is called the set of initial states,

(2) P is a finite set of productions (or rewriting rules) of the following two types:
(i) ax → q (a ∈ A, x ∈ X, q ∈ FΩ (Y )),
(ii) aσ(ξ1 , . . . , ξm ) → q (a ∈ A, σ ∈ Σm , m ≥ 0, q ∈ FΩ [Y ∪ AΞm ]).

In the sequel we shall write simply aσ for aσ(ξ1 , . . . , ξm ). Moreover, for a production
p → q we shall use the notation (p, q), too.
Obviously, a production of type (ii) in Definition 4.1.4 can be written in the form

aσ → q(a1 ξ1n1 , . . . , am ξmnm )

where ai ∈ Ani , ni ≥ 0, i = 1, . . . , m, n1 + . . . + nm = n, and q ∈ F̂Ω (Y ∪ Ξn ). In the sequel
we shall assume that whenever 1 ≤ i ≤ m and n1 + . . . + ni−1 + 1 ≤ i1 < i2 ≤ n1 + . . . + ni ,
ξi1 precedes ξi2 in yd(q)ϕ. Here ϕ is the homomorphism (X ∪ Ξ)∗ → Ξ∗ defined above.
Next we define the transformations induced by R-transducers. Let A be the R-
transducer of Definition 4.1.4. For any a ∈ A and p ∈ FΣ (X) we define the subsets
pτA,a as follows:

(i) if p ∈ Σ0 ∪ X and (ap, q) ∈ P then q ∈ pτA,a ,

(ii) if p = σ(p1 , . . . , pm ) (σ ∈ Σm , m > 0), then for any (aσ, q(a1 ξ1^n1 , . . . , am ξm^nm )) ∈ P
and qij ∈ pi τA,aij (1 ≤ i ≤ m, 1 ≤ j ≤ ni ), q(q1 , . . . , qm ) ∈ pτA,a where qi =
(qi1 , . . . , qini ) (i = 1, . . . , m),

(iii) nothing is in any pτA,a unless this follows from (i) and (ii).

Definition 4.1.5 Let A = (Σ, X, A, Ω, Y, P, A′ ) be an R-transducer. Then the transfor-


mation induced by A is the relation

τA = {(p, q) | p ∈ FΣ (X), q ∈ FΩ (Y ), q ∈ pτA,a for some a ∈ A′ }.

A tree transformation is an R-transformation if it can be induced by an R-transducer.


The class of all R-transformations will be denoted by R.

For R-transformations we also give another definition which shows how a transforma-
tion is carried out step by step.
Let p, q ∈ FΩ [Y ∪ AFΣ (X ∪ Ξ)] be trees, and consider the R-transducer of Defini-
tion 4.1.4. It is said that p directly derives q in A if q can be obtained from p by


(i) replacing an occurrence of a subtree ax (a ∈ A, x ∈ X) in p by the right side q of


a production ax → q in P , or by

(ii) replacing an occurrence of a subtree aσ(p1 , . . . , pm ) (a ∈ A, σ ∈ Σm , m ≥ 0,


p1 , . . . , pm ∈ FΣ (X ∪ Ξ)) in p by q(p1 , . . . , pm ) where aσ → q is in P .

Each application of steps (i) and (ii) is called a direct derivation in A. The relation
expressing the direct derivation will be denoted by ⇒A , i.e., we write p ⇒A q if q is
obtained from p by a direct derivation in A. Frequently, A will be omitted in ⇒A . Any
finite sequence of consecutive direct derivations defines a derivation. More precisely,

p = p0 ⇒ p1 ⇒ . . . ⇒ pi ⇒ . . . ⇒ pj ⇒ . . . ⇒ pk = q (4)

(k ≥ 0, pℓ ∈ FΩ [Y ∪ AFΣ (X ∪ Ξ)], ℓ = 0, . . . , k, 0 ≤ i < j ≤ k)


is a derivation of q from p in A, k is the length of this derivation and pi ⇒ . . . ⇒ pj is a
subderivation of (4). If q can be obtained from p by a derivation, then we write p ⇒∗A q,
or simply p ⇒∗ q if A is understood from the context. Thus, ⇒∗ is the reflexive-transitive
closure of ⇒. Similarly as in the case of an F-transducer, we suppose that the notation
p ⇒∗ q implies a certain derivation of q from p in A.
Using the notation ⇒∗ , the transformation τA induced by an R-transducer A =
(Σ, X, A, Ω, Y, P, A′ ) can equivalently be defined thus:

τA = {(p, q) | p ∈ FΣ (X), q ∈ FΩ (Y ), ap ⇒∗ q for some a ∈ A′ }.

Let us note that although an R-transducer A is generally a nondeterministic system,


pτA is finite for every input tree p of A.
Let A = (Σ, X, A, Ω, Y, P, A′ ) be an R-transducer. Consider some n > 0, a ∈ An ,
p ∈ FΣ (X)n , q ∈ FΩ (Y )n and derivations ai pi ⇒∗ qi (i = 1, . . . , n). Then ap ⇒∗ q will
denote the vector of these derivations. Moreover, we assume that ap ⇒∗ q implicitly
expresses the n derivations ai pi ⇒∗ qi (i = 1, . . . , n).
Take the above R-transducer A and a derivation

α : p ⇒∗ q (p, q ∈ FΩ [Y ∪ AFΣ (X)]).

Let
r(p1 , p2 ) ⇒ r(p11 , p2 ) ⇒ . . . ⇒ r(p1k , p2 ) ⇒ r(p1k , p′2 ) (5)
(r ∈ F̂Ω [Y ∪ AFΣ (X ∪ Ξ2 )])
be a subderivation of α, where the first k direct derivation steps are carried out in the
subtree p1 , and then in the (k + 1)th step we apply a production in the subtree p2 .
Replacing the subderivation (5) in α by

r(p1 , p2 ) ⇒ r(p1 , p′2 ) ⇒ r(p11 , p′2 ) ⇒ . . . ⇒ r(p1k , p′2 ) (6)

we get a derivation
β : p ⇒∗ q.


The replacement of (5) in α by (6) is called an inversion of direct derivations. By


finitely many applications of inversions we get a reordering of direct derivations. We
shall not distinguish between derivations in an R-transducer if they are reorderings of
each other.
Again, take the above R-transducer A, a state a ∈ A and a tree p ∈ FΣ (X). Then by
ap = ap(p1 , . . . , pm ) ⇒∗ q(a1 p1^n1 , . . . , am pm^nm ) ⇒∗ q(q1 , . . . , qm )
(p ∈ F̂Σ (X ∪ Ξm ), ap ⇒∗ q(a1 ξ1^n1 , . . . , am ξm^nm ), ai ∈ A^ni ,
ni ≥ 0, i = 1, . . . , m, n1 + . . . + nm = n, q ∈ F̂Ω (Y ∪ Ξn ),
aj pj^nj ⇒∗ qj , j = 1, . . . , m)
we mean the derivation
ap(p1 , . . . , pm ) ⇒∗ q(a1 p1^n1 , . . . , am pm^nm ) ⇒
q(p11^(1) , a12 p1 , . . . , a1n1 p1 , . . . , am1 pm , . . . , amnm pm ) ⇒ . . . ⇒
q(p11^(k11) , a12 p1 , . . . , a1n1 p1 , . . . , am1 pm , . . . , amnm pm ) ⇒ . . . ⇒
q(p11^(k11) , . . . , p1n1^(k1n1) , . . . , am1 pm , . . . , amnm pm ) ⇒ . . . ⇒
q(p11^(k11) , . . . , p1n1^(k1n1) , . . . , pm1^(km1) , . . . , amnm pm ) ⇒ . . . ⇒
q(p11^(k11) , . . . , p1n1^(k1n1) , . . . , pm1^(km1) , . . . , pmnm^(kmnm) ) =
q(q11 , . . . , q1n1 , . . . , qm1 , . . . , qmnm ),
assuming that ai pi^ni ⇒∗ qi (1 ≤ i ≤ m) has its component derivations
aij pi ⇒ pij^(1) ⇒ . . . ⇒ pij^(kij) = qij (qij ∈ FΩ (Y ), j = 1, . . . , ni ),
and ap(p1 , . . . , pm ) ⇒∗ q(a1 p1^n1 , . . . , am pm^nm ) is obtained by replacing ξi (i = 1, . . . , m) in
ap ⇒∗ q(a1 ξ1^n1 , . . . , am ξm^nm ) by pi .
When we say that we write the derivation
α : ap ⇒∗ q (a ∈ A, p ∈ FΣ (X), q ∈ FΩ (Y ))
in the more detailed form
β : ap = ap(p1 , . . . , pm ) ⇒∗ q(a1 p1^n1 , . . . , am pm^nm ) ⇒∗ q(q1 , . . . , qm )
(p ∈ F̂Σ (X ∪ Ξm ), ap ⇒∗ q(a1 ξ1^n1 , . . . , am ξm^nm ), ai ∈ A^ni , ni ≥ 0,
i = 1, . . . , m, n1 + . . . + nm = n, q ∈ F̂Ω (Y ∪ Ξn ), aj pj^nj ⇒∗ qj , j = 1, . . . , m),
it generally also means that β is a reordering of α. Obviously, such a reordering always
exists.
In case p = σ(ξ1 , . . . , ξm ) (σ ∈ Σm ), we write β in the form
β : aσ(p1 , . . . , pm ) ⇒∗ q(a1 p1^n1 , . . . , am pm^nm ) ⇒∗ q(q1 , . . . , qm )
((aσ, q(a1 ξ1^n1 , . . . , am ξm^nm )) ∈ P, ai ∈ A^ni , ni ≥ 0, i = 1, . . . , m, n1 + . . . + nm = n,
q ∈ F̂Ω (Y ∪ Ξn ), aj pj^nj ⇒∗ qj , j = 1, . . . , m).


Example 4.1.6 Let A = (Σ, {x}, {a0 , a1 , a2 }, Ω, {y1 , y2 }, P, a0 ) be the R-transducer,


where Σ = Σ1 = {σ}, Ω = Ω1 ∪ Ω2 , Ω1 = {ω1 } and Ω2 = {ω2 } and P consists of the
productions
a0 σ → ω2 (a1 ξ1 , a2 ξ1 ),
a1 σ → ω1 (a1 ξ1 ), a2 σ → ω1 (a2 ξ1 ),
a1 x → y 1 , a2 x → y 2 .
Consider the trees p = σ(σ(σ(x))) and q = ω2 (ω1 (ω1 (y1 )), ω1 (ω1 (y2 ))). Then a deriva-
tion of q from a0 p is illustrated in Fig. 4.2.

Figure 4.2

By induction on the heights of input trees one can easily prove that

τA = {(σ n (x), ω2 (ω1n−1 (y1 ), ω1n−1 (y2 ))) | n = 1, 2, . . .},

where σ 0 (ξ) = ξ and σ n (ξ) = σ(σ n−1 (ξ)) if n > 0. ✷
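
The recursive definition of the sets pτA,a given before Definition 4.1.5 can be turned
directly into a program. The Python sketch below (an informal illustration only, with
trees again encoded as nested tuples) computes pτA,a for the transducer of this example;
since A is deterministic, each of these sets contains at most one tree.

    def tau(a, p):
        # the set p tau_{A,a} for the R-transducer of Example 4.1.6
        if p == ('x',):
            return {'a1': {('y1',)}, 'a2': {('y2',)}}.get(a, set())
        (_, p1) = p                                   # p = sigma(p1)
        if a == 'a0':                                 # a0 sigma -> omega2(a1 xi1, a2 xi1)
            return {('omega2', q1, q2) for q1 in tau('a1', p1) for q2 in tau('a2', p1)}
        if a == 'a1':                                 # a1 sigma -> omega1(a1 xi1)
            return {('omega1', q) for q in tau('a1', p1)}
        if a == 'a2':                                 # a2 sigma -> omega1(a2 xi1)
            return {('omega1', q) for q in tau('a2', p1)}
        return set()

    p = ('sigma', ('sigma', ('sigma', ('x',))))       # the tree sigma^3(x)
    print(tau('a0', p))   # the single tree omega2(omega1(omega1(y1)), omega1(omega1(y2)))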

Both F-transducers and R-transducers generalize generalized sequential machines from


strings to trees (or from unary polynomial symbols to polynomial symbols of arbitrary fi-
nite type if strings are interpreted as unary polynomial symbols, as we did in Section 2.2).
At the same time there are the following main differences between F-transducers and
R-transducers:

(1) An F-transducer first processes an input subtree nondeterministically and then


makes copies of the resulting output subtree.


(2) An R-transducer can first make copies of an input subtree and then process each
copy independently in a nondeterministic fashion.

(3) F-transducers should process even those subtrees which are deleted afterwards.

Before ending this section we state and prove some simple general results.
The concept of tree homomorphism was introduced in Section 2.4. It is easy to see
that the tree homomorphism h : FΣ (X) → FΩ (Y ), given by the mappings

hm : Σm → FΩ (Y ∪ Ξm ) (m ≥ 0)

and
hX : X → FΩ (Y ),
can be induced by the one-state F-transducer A = (Σ, X, {a}, Ω, Y, P, a) where

P = {x → ahX (x) | x ∈ X} ∪ {σ(a, . . . , a) → ahm (σ) | σ ∈ Σm , m ≥ 0}.

Definition 4.1.7 A one-state F-transducer A = (Σ, X, {a}, Ω, Y, P, a) is an HF-


transducer if for every x ∈ X, resp. σ ∈ Σ, in P there is exactly one production
with left side x, resp. σ(a, . . . , a).

We have seen that every tree homomorphism can be induced by an HF-transducer.


The converse is also true: transformations induced by HF-transducers are tree homo-
morphisms.
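
As an informal illustration (and not as part of the formal treatment), a tree homomorphism
can be programmed directly from the maps hX and hm . In the Python sketch below the
auxiliary variable ξi is encoded by the integer i; the particular maps chosen are
hypothetical ones which duplicate the argument of a unary symbol σ, in the spirit of
Example 2.4.15.

    def substitute(q, args):
        # replace the auxiliary variable i in q by args[i-1]
        if isinstance(q, int):
            return args[q - 1]
        return (q[0],) + tuple(substitute(s, args) for s in q[1:])

    def hom(p, hX, hm):
        # the tree homomorphism determined by hX on the leaves and hm on the operators
        if p[0] in hX:
            return hX[p[0]]
        return substitute(hm[p[0]], [hom(s, hX, hm) for s in p[1:]])

    hX = {'x': ('y',)}                  # hX(x) = y
    hm = {'sigma': ('omega', 1, 1)}     # h1(sigma) = omega(xi1, xi1)
    print(hom(('sigma', ('sigma', ('x',))), hX, hm))
    # a complete binary omega-tree with four leaves y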
We now introduce the R-transducer counterpart of HF-transducers.

Definition 4.1.8 A one-state R-transducer A = (Σ, X, {a}, Ω, Y, P, a) is an HR-


transducer if for each d ∈ X ∪ Σ in P there is exactly one production with the left
side ad.

Next we prove that the class of all tree homomorphisms coincides with the class of all
transformations induced by HR-transducers.

Theorem 4.1.9 The class of transformations induced by HF-transducers coincides with


the class of all transformations induced by HR-transducers.

Proof. Let A = (Σ, X, {a}, Ω, Y, P, a) be an HF-transducer. Consider the R-transducer


B = (Σ, X, {a}, Ω, Y, P ′ , a), where P ′ is given in the following way:

(ax, q) ∈ P ′ ⇐⇒ (x, aq) ∈ P (x ∈ X)

and

(aσ, q(aξ1 , . . . , aξm )) ∈ P ′ ⇐⇒ (σ(a, . . . , a), aq) ∈ P (σ ∈ Σm , m ≥ 0, q ∈ FΩ (Y ∪Ξm )).

It is obvious that B is an HR-transducer.


By induction on hg(p), we show that for an arbitrary p ∈ FΣ (X) and q ∈ FΩ (Y ) the


equivalence
ap ⇒∗B q ⇐⇒ p ⇒∗A aq (7)
holds. This obviously implies τA = τB .
If hg(p) = 0, then (7) holds by the definition of P ′ .
Let p = σ(p1 , . . . , pm ) (σ ∈ Σm , m > 0), and assume that (7) has been proved for all
trees in FΣ (X) with heights less than hg(p).
Suppose that the left side of (7) holds, i.e., we have ap = aσ(p1 , . . . , pm ) ⇒B
q(ap1 , . . . , apm ) ⇒∗B q(q1 , . . . , qm ) = q, where (aσ, q(aξ1 , . . . , aξm )) ∈ P ′ and api ⇒∗B
qi (i = 1, . . . , m). Then, by the definition of P ′ , the production σ(a, . . . , a) →
aq(ξ1 , . . . , ξm ) is in P . Moreover, by the induction hypothesis, pi ⇒∗A aqi is valid for
each i (1 ≤ i ≤ m). Therefore, we have a desired derivation
p = σ(p1 , . . . , pm ) ⇒∗A σ(aq1 , . . . , aqm ) ⇒A aq(q1 , . . . , qm ) = aq.
The fact that p ⇒∗A aq implies ap ⇒∗B q can be shown by reversing the above argument.
To see that every HR-transformation is induced by an HF-transducer, it suffices to
observe that every HR-transducer B arises from an HF-transducer A by the above
construction. Hence HR- and HF-transducers appear in equivalent “associated” pairs.

We prove two more results.


Theorem 4.1.10 The following statements hold.
(i) For every F-transformation τ ⊆ FΣ (X) × FΩ (Y ), dom(τ ) ∈ Rec(Σ, X).
(ii) There exists a tree homomorphism h : FΣ (X) → FΩ (Y ) such that range(h) ∉
Rec(Ω, Y ).
Proof. In order to show (i) consider an F-transducer A = (Σ, X, A, Ω, Y, P, A′ ). Con-
struct an NDF ΣX-recognizer B = (B, β, B ′ ), where B = (A, Σ), B ′ = A′ , and, for all
m ≥ 0, σ ∈ Σm and a1 , . . . , am ∈ A,
σ B (a1 , . . . , am ) = {a | (∃q ∈ FΩ (Y ∪ Ξm ))((σ(a1 , . . . , am ), aq) ∈ P )}.
Finally let
xβ = {a ∈ A | (∃q ∈ FΩ (Y ))((x, aq) ∈ P )} (x ∈ X).
We end the proof of (i) by the observation that for all a ∈ A and p ∈ FΣ (X) the
equivalence
a ∈ pβ̂ ⇐⇒ (∃q ∈ FΩ (Y ))(p ⇒∗ aq)
holds. This can be shown by induction on hg(p).
For a proof of (ii), see Example 2.4.15. ✷

Example 2.4.15 shows also that the translation of a context-free language by a tree
transducer is not always context-free. In fact, in this example the finite language {x} is
translated into the non-CF language {x^(2^n) | n ≥ 0}.


Lemma 4.1.11 For each T ∈ Rec(Σ, X) there exists an F-transducer A such that
dom(τA ) = range(τA ) = T and τA is the identity mapping of T .

Proof. Let B = (B, β, B ′ ) be an F ΣX-recognizer with B = (B, Σ) and T (B) = T . Take


the F-transducer A = (Σ, X, B, Σ, X, P, B ′ ) where

P = {x → β(x)x | x ∈ X} ∪ {σ(b1 , . . . , bm ) → bσ(ξ1 , . . . , ξm ) |


m ≥ 0, σ ∈ Σm , b, b1 , . . . , bm ∈ B, σ B (b1 , . . . , bm ) = b}.

Obviously, A has the desired properties. ✷

We end this section with

Definition 4.1.12 Two R- or F-transducers A and B are equivalent if τA = τB holds.

4.2 SOME CLASSES OF TREE TRANSFORMATIONS


In this section we shall define several classes of F- and R-transformations and then
compare them with each other with respect to set theoretic inclusion. It will turn out
that in most cases the classes to be investigated are incomparable.

Definition 4.2.1 Let A = (Σ, X, A, Ω, Y, P, A′ ) be an F-transducer. Then:

(1) A production of A is linear if each auxiliary variable occurs at most once in its
right-hand side. Moreover, A is a linear F-transducer (LF-transducer) if all of its
productions are linear.

(2) A is a totally defined F-transducer (TF-transducer) if

(i) for each x ∈ X there is a production in P with left-hand side x and

(ii) for all m ≥ 0, σ ∈ Σm and a1 , . . . , am ∈ A there is a production in P with left-hand


side σ(a1 , . . . , am ).

(3) A is a nondeleting F-transducer (NF-transducer) if for every production


σ(a1 , . . . , am ) → aq (σ ∈ Σm , m ≥ 0) from P each ξi ∈ Ξm occurs at least once in
q.

(4) A is a deterministic F-transducer (DF-transducer) if there are no two distinct pro-


ductions in P with the same left-hand side.

(5) A is an F-relabeling if each of its productions is of the form

(i) x → ay (x ∈ X, a ∈ A, y ∈ Y ) or

(ii) σ(a1 , . . . , am ) → aω(ξ1 , . . . , ξm ), where σ ∈ Σm , a1 , . . . , am , a ∈ A, ω ∈ Ωm .

Transformations induced by F-relabelings are also called F-relabelings.


To illustrate the above concepts, let us take the following example.


Example 4.2.2 Let A = (Σ, {x}, {a0 , a1 }, Ω, {y}, P, {a1 }) be the F-transducer with Σ =
Σ2 = {σ} and Ω = Ω2 = {ω}, where P consists of the productions
x → a0 y,
σ(a0 , a0 ) → a1 ω(ξ1 , ξ2 ), σ(a0 , a1 ) → a0 ω(ξ1 , ξ2 ), σ(a1 , a0 ) → a1 ω(ξ1 , ξ2 ),
σ(a1 , a1 ) → a1 ω(ξ1 , ξ2 ).
Then A is a linear, totally defined, nondeleting and deterministic F-transducer. More-
over, A is an F-relabeling. ✷
Example 4.1.3 gives an F-transducer which is linear and deterministic, but it is neither
totally defined nor nondeleting.
Let us note that F-relabelings are always linear and nondeleting F-transducers.
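
The linearity and nondeletingness conditions of Definition 4.2.1 are easy to test
mechanically for a single right-hand side. The following Python fragment is a small
informal check (auxiliary variables are again encoded by integers) and is meant only to
illustrate the two conditions.

    def variable_counts(q, m):
        # how many times each of xi_1, ..., xi_m occurs in the right-hand side q
        counts = [0] * m
        def walk(t):
            if isinstance(t, int):
                counts[t - 1] += 1
            else:
                for s in t[1:]:
                    walk(s)
        walk(q)
        return counts

    def is_linear(q, m):
        return all(c <= 1 for c in variable_counts(q, m))

    def is_nondeleting(q, m):
        return all(c >= 1 for c in variable_counts(q, m))

    q = ('omega', 1, 2)        # right-hand side of sigma(a0, a0) -> a1 omega(xi1, xi2)
    print(is_linear(q, 2), is_nondeleting(q, 2))     # True True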
We now define the R-transducer counterparts of the above classes of F-transducers.
Definition 4.2.3 Let A = (Σ, X, A, Ω, Y, P, A′ ) be an R-transducer. Then:
(1) A production of A is linear if each auxiliary variable occurs at most once in its
right-hand side. Moreover, A is a linear R-transducer (LR-transducer) if all of its
productions are linear.
(2) A is a totally defined R-transducer (TR-transducer) if
(i) for all a ∈ A and x ∈ X there is a production in P with left-hand side ax, and
(ii) for all a ∈ A and σ ∈ Σm (m ≥ 0) there is a production in P with left-hand side
aσ.
(3) A is a nondeleting R-transducer (NR-transducer) if for every production aσ → q
(σ ∈ Σm , m > 0) from P each ξi ∈ Ξm occurs at least once in q.
(4) A is a deterministic R-transducer (DR-transducer) if A′ is a singleton and there
are no distinct productions in P with the same left-hand side.
(5) A is an R-relabeling if each of the productions of A has the form
(i) ax → y (a ∈ A, x ∈ X, y ∈ Y ) or
(ii) aσ → ω(a1 ξ1 , . . . , am ξm ), where a, a1 , . . . , am ∈ A, σ ∈ Σm , ω ∈ Ωm .
Transformations induced by R-relabelings will also be called R-relabelings.
Example 4.2.4 Let A = (Σ, {x}, {a0 , a1 }, Ω, {y1 , y2 }, P, {a0 }) be an R-transducer with
Σ = Σ2 = {σ} and Ω = Ω2 = {ω}. Moreover, P consists of the productions
a0 x → y 1 , a1 x → y 2 ,
a0 σ → ω(a1 ξ1 , a1 ξ2 ), a1 σ → ω(a0 ξ1 , a0 ξ2 ).
Then A is a linear, totally defined, nondeleting and deterministic R-transducer. More-
over, A is an R-relabeling. ✷


The R-transducer of Example 4.1.6 is deterministic and nondeleting, but it is neither linear nor
totally defined.
Let us note that R-relabelings are linear and nondeleting R-transducers.
The abbreviations introduced above for classes of tree transducers can be combined to
indicate further subclasses. For instance, an LNF-transducer is a linear nondeleting F-
transducer. Moreover, a transformation is a K-transformation if it can be induced by a K-
transducer. The class of all K-transformations will be denoted by K. Thus, for example,
LN F is the class of all LNF-transformations, i.e., the class of all transformations induced
by linear nondeleting F-transducers. By Theorem 4.1.9, we shall write simply H instead
of HF and HR. Moreover, Frel, resp. Rrel, will denote the class of F-relabelings, resp.
R-relabelings.
We now prove

Theorem 4.2.5 F and R are incomparable.

Proof. In order to prove Theorem 4.2.5, we give (i) an F-transformation which is not
in R and (ii) an R-transformation which cannot be induced by any F-transducer.
(i) Consider the LDF-transducer A of Example 4.1.3. If for an R-transducer B =
(Σ, {x}, B, Ω, {y}, P ′ , B ′ ) we have (σ(x, x), ω(y)) ∈ τB , then at the first step of a
derivation bσ(x, x) ⇒∗B ω(y) (b ∈ B ′ ) we should apply a production of the form
bσ → b′ ξ1 , bσ → b′ ξ2 , bσ → ω(b′ ξ1 ), bσ → ω(b′ ξ2 ) or bσ → ω(y), where b′ ∈ B. In
each of the above cases at least one of the auxiliary variables ξ1 and ξ2 is deleted, and the
deleted subtree may be an arbitrary ΣX-tree. Therefore, dom(τB ) is infinite, whereas
dom(τA ) consists of the single tree σ(x, x); hence τB ≠ τA .
(ii) Take the DR-transducer A of Example 4.1.6. Assume that an F-transducer
B = (Σ, {x}, B, Ω, {y1 , y2 }, P ′ , B ′ ) induces τA . Obviously, P ′ should then contain a
production of the form
σ(b) → b1 ω2 (q1 , q2 ) (b, b1 ∈ B).
We may confine ourselves to the following cases:
(I) q1 = ω1^k (y1 ) and q2 = ω1^k (y2 ),

(II) q1 = ω1^l (ξ1 ) and q2 = ω1^k (y2 ),

(III) q1 = ω1^k (y1 ) and q2 = ω1^l (ξ1 ),

(IV) q1 = ω1^m (ξ1 ) and q2 = ω1^n (ξ1 ).

Obviously, in a derivation σ^r (x) ⇒∗B b′ ω2 (ω1^(r−1) (y1 ), ω1^(r−1) (y2 )) (r > 1, b′ ∈ B ′ ) the last
application of the above productions can be followed by applications of productions of
the form σ(b̄) → b̄1 ξ1 (b̄, b̄1 ∈ B) only. Let t be the maximum of the exponents in (I)–(IV).
If r > t + 1 and τB (σ^r (x)) = ω2 (ω1^(r−1) (yi ), ω1^(r−1) (yj )) (1 ≤ i, j ≤ 2) then i = j. ✷

From the proof of Theorem 4.2.5 we directly get

Corollary 4.2.6 DF and DR are incomparable and so are DF and R, and F and DR.


As we have mentioned one of the main differences between F- and R-transducers is


that while F-transducers first process an input subtree and then copy the resulting
output subtree, R-transducers first copy an input subtree and then treat these copies
independently. In the case of an LR-transducer none of the input subtrees of a tree is
copied during the translation of the tree. This property leads to

Theorem 4.2.7 LR is a proper subclass of LF.

Proof. By (i) in the proof of Theorem 4.2.5, LF is not a subclass of LR. Thus, it is
enough to show the validity of LR ⊆ LF.
Let A = (Σ, X, A, Ω, Y, P, A′ ) be an LR-transducer. Then the productions from P can
be written in the form

(i) ax → q (a ∈ A, x ∈ X, q ∈ FΩ (Y )), or

(ii) aσ(ξ1 , . . . , ξm ) → q(a1 ξ1 , . . . , am ξm ) (a, a1 , . . . , am ∈ A, m ≥ 0, σ ∈ Σm , q ∈ FΩ (Y ∪ Ξm )).

Now take the following R-transducer Ā. If A is nondeleting, then Ā = A. In the
opposite case Ā = (Σ, X, Ā, Ω, Y, P̄ , A′ ) is given as follows. Let Ā = A ∪ {∗} (∗ ∉ A).
Fix any y ∈ Y and enlarge P by all productions ∗x → y (x ∈ X) and ∗σ → y (m ≥
0, σ ∈ Σm ). Denote by P̄ the resulting set of productions. Obviously, Ā is linear and
equivalent to A. The only difference between A and Ā is that Ā transforms (in state
∗) even those subtrees of a tree p ∈ FΣ (X) which are deleted during the corresponding
derivation of p in A.
Next, construct the F-transducer B = (Σ, X, B, Ω, Y, P ′ , B ′ ), where B = Ā and B ′ =
A′ . Moreover, given any x ∈ X, b ∈ B and q ∈ FΩ (Y ), x → bq is in P ′ iff bx → q is in P̄ .

Furthermore, the production

σ(b1 , . . . , bm ) → bq(ξ1 , . . . , ξm ) (σ ∈ Σm , m ≥ 0, b1 , . . . , bm , b ∈ B, q ∈ FΩ (Y ∪ Ξm ))

is in P ′ iff P̄ contains a production

bσ → q(c1 ξ1 , . . . , cm ξm ),

such that for each i = 1, . . . , m,



bi = ci if ξi occurs in q, and bi = ∗ otherwise.

Obviously B is linear.
In order to complete the proof of Theorem 4.2.7, it is enough to show that the equiv-
alence
p ⇒∗B bq ⇐⇒ bp ⇒∗Ā q (1)
holds for all b ∈ B, p ∈ FΣ (X) and q ∈ FΩ (Y ). We shall proceed by induction on hg(p).
If hg(p) = 0, then (1) obviously holds by the definition of P ′ .


Now let p = σ(p1 , . . . , pm ) (σ ∈ Σm , m > 0), and assume that (1) has been proved for
all trees in FΣ (X) of lesser height.
(I) Let p ⇒∗B bq hold. More in detail, let

p = σ(p1 , . . . , pm ) ⇒∗B σ(b1 q1 , . . . , bm qm ) ⇒B bq(q1 , . . . , qm ) = bq

where pi ⇒∗B bi qi (i = 1, . . . , m). Then by the induction hypothesis, we have bi pi ⇒∗Ā
qi (i = 1, . . . , m). Moreover, by the definition of P ′ , bσ → q(b1 ξ1 , . . . , bm ξm ) is in P̄ .
Therefore,

bp = bσ(p1 , . . . , pm ) ⇒ q(b1 p1 , . . . , bm pm ) ⇒∗ q(q1 , . . . , qm ) = q

also exists in Ā.
(II) Assume that in Ā we have a derivation

bp = bσ(p1 , . . . , pm ) ⇒ q(b1 p1 , . . . , bm pm ) ⇒∗ q(q1 , . . . , qm ) = q

where each qi (i = 1, . . . , m) is obtained by a derivation bi pi ⇒∗ qi in Ā. Moreover, let


bi = ∗ and qi = y if ξi does not occur in q. Then σ(b1 , . . . , bm ) → bq is in P ′ . Furthermore,
by the induction hypothesis, there are derivations pi ⇒∗B bi qi (i = 1, . . . , m). Therefore,
the derivation

p = σ(p1 , . . . , pm ) ⇒∗B σ(b1 q1 , . . . , bm qm ) ⇒B bq(q1 , . . . , qm ) = bq

is also valid. ✷

For linear nondeleting tree transformations we have the following stronger result.

Theorem 4.2.8 LN R = LN F.

Proof. The LF-transducer B constructed to the LNR-transducer A in the proof of the


previous Theorem is obviously nondeleting.
Conversely, let C = (Σ, X, C, Ω, Y, P ′′ , C ′ ) be an arbitrary LNF-transducer. Construct
the R-transducer A = (Σ, X, C, Ω, Y, P, C ′ ), where P is defined as follows:

(ax, q) ∈ P ⇐⇒ (x, aq) ∈ P ′′

and

(aσ, q(a1 ξ1 , . . . , am ξm )) ∈ P ⇐⇒ (σ(a1 , . . . , am ), aq(ξ1 , . . . , ξm )) ∈ P ′′ ,

where x ∈ X, a, a1 , . . . , am ∈ C, σ ∈ Σm (m ≥ 0) and q ∈ FΩ (Y ∪ Ξm ). Obviously, A is


an LNR-transducer. Now to A construct the F-transducer B as in the proof of Theorem
4.2.7. Then B = C. ✷

The LF-transducer B constructed to an R-relabeling in the proof of Theorem 4.2.7 is


obviously an F-relabeling. Moreover, the R-transducer A given to an F-relabeling C in
the proof of Theorem 4.2.8 is an R-relabeling. Thus, we have


Corollary 4.2.9 Frel = Rrel. ✷

According to Corollary 4.2.9, we may speak simply about relabelings.


One can easily show the existence of an LNF-transformation which is not a relabeling.
Our comparison results can be summarized by the diagram in Fig. 4.3.

Figure 4.3 (a diagram of the inclusions among the classes F, R, DF, LF, DR, LR,
LN F = LN R and Frel = Rrel)

4.3 COMPOSITIONS AND DECOMPOSITIONS OF TREE TRANSFORMATIONS
Let K be a class of tree transformations. We say that K is closed under composition if τ1 ◦
τ2 ∈ K whenever τ1 , τ2 ∈ K. As we shall see, some of our classes of tree transformations
are closed under composition while others are not. On the other hand, in many cases it
is possible to decompose a tree transformation into a composition of simpler ones.
For any two classes K1 and K2 of tree transformations, we introduce the notation
K1 ◦ K2 = {τ1 ◦ τ2 | τ1 ∈ K1 , τ2 ∈ K2 }. Using this notation, the closure of a class K of
tree transformations under composition can be expressed by the inclusion K ◦ K ⊆ K.
Similarly, the fact that all transformations in K can be given as compositions of a
transformation in K1 by a transformation from K2 can be expressed by K ⊆ K1 ◦ K2 .
Finally, if K is a class of tree transformations, then let K1 = K and Kn = K ◦ Kn−1
(n > 1). All of the classes defined in the previous section (R, F, LF , H etc.) include all
identity transformations {(t, t) | t ∈ FΣ (X)}. Hence, if K is any one of these classes,
then we know that
K ⊆ K2 ⊆ Kn ⊆ . . . .
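
Note that τ1 ◦ τ2 is understood in the left-to-right order: first τ1 is applied and then τ2 .
The following Python sketch (for finite relations only, and purely as an illustration of
this convention) composes two transformations given as sets of pairs.

    def compose(tau1, tau2):
        # (p, r) is in tau1 o tau2 iff (p, q) is in tau1 and (q, r) is in tau2 for some q
        return {(p, r) for (p, q1) in tau1 for (q2, r) in tau2 if q1 == q2}

    tau1 = {(('x',), ('y',))}
    tau2 = {(('y',), ('z',))}
    print(compose(tau1, tau2))     # {(('x',), ('z',))}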
First we prove a decomposition theorem concerning F-transformations.

Lemma 4.3.1 F ⊆ LF ◦ H and F ⊆ LR ◦ H.


Proof. Let A = (Σ, X, A, ∆, Z, P, A′ ) be an arbitrary F-transducer. Arrange the pro-


ductions from P in a fixed order and number them from 1 to |P |. For all i(= 1, . . . , |P |),
if the left side of the ith production is x ∈ X, then let x(i) be a new letter. Denote
by Y the set of all such x(i) . Moreover, for all i(= 1, . . . , |P |), if the symbol σ ∈ Σm
(m ≥ 0) occurs in the left-hand side of the ith production, then σ (i) will be a new m-ary
operator. The set of all such operators will be denoted by Ω.
Now we introduce the F-transducer B = (Σ, X, A, Ω, Y, P ′ , A′ ), where P ′ is defined as
follows:
(i) x → ax(i) (x ∈ X, a ∈ A) is in P ′ iff the ith production in P is x → ar for some r,
(ii) σ(a1 , . . . , am ) → aσ (i) (ξ1 , . . . , ξm ) (σ ∈ Σm , m ≥ 0, a1 , . . . , am ∈ A) is in P ′ iff the
ith production in P is σ(a1 , . . . , am ) → ar for some r.
Obviously, B is linear and nondeleting. Thus, by Theorem 4.2.8, τB is a linear nondelet-
ing R-transformation, as well.
Next define the F-transducer C = (Ω, Y, {c0 }, ∆, Z, P ′′ , c0 ) in the following way:
(i) x(i) → c0 r is in P ′′ iff the ith production in P is x → ar,
(ii) σ (i) (c0 , . . . , c0 ) → c0 r is in P ′′ iff the ith production in P is σ(a1 , . . . , am ) → ar.
Then C is an HF-transducer.
We prove that τA = τB ◦ τC . For this it is enough to show that, for all p ∈ FΣ (X),
r ∈ F∆ (Z) and a ∈ A, the equivalence
p ⇒∗A ar ⇐⇒ (∃q ∈ FΩ (Y ))(p ⇒∗B aq ∧ q ⇒∗C c0 r) (1)
holds. We proceed by induction on hg(p).
If hg(p) = 0, then (1) obviously holds.
Assume that p = σ(p1 , . . . , pm ) (σ ∈ Σm , m > 0) and that (1) has been proved for all
trees from FΣ (X) of lesser height.
(I) Let
p ⇒∗A σ(a1 r1 , . . . , am rm ) ⇒A ar(r1 , . . . , rm ) = ar, (2)
where pi ⇒∗A ai ri (ri ∈ F∆ (Z)) holds for each i(= 1, . . . , m). Then, by the induction
hypothesis, there are trees qi ∈ FΩ (Y ) (i = 1, . . . , m) such that pi ⇒∗B ai qi and qi ⇒∗C c0 ri
hold. Assume that the production σ(a1 , . . . , am ) → ar last applied in (2) is the ith one
in P . Then
(σ(a1 , . . . , am ), aσ (i) (ξ1 , . . . , ξm )) ∈ P ′ and (σ (i) (c0 , . . . , c0 ), c0 r) ∈ P ′′ .
Therefore, taking q = σ (i) (q1 , . . . , qm ), we have the desired derivations
p ⇒∗B σ(a1 q1 , . . . , am qm ) ⇒B aσ (i) (q1 , . . . , qm ) = aq
and
q ⇒∗C σ (i) (c0 r1 , . . . , c0 rm ) ⇒C c0 r(r1 , . . . , rm ) = c0 r.
(II) The fact that the right side of (1) implies its left side can be proved by inverting
the above computation. ✷


Lemma 4.3.2 F ◦ H ⊆ F.

Proof. Let A = (Σ, X, A, Ω, Y, P, A′ ) be an F-transducer and B = (Ω, Y, {b0 }, ∆, Z, P̄ , b0 )
an HF-transducer.
whose productions will be composed of productions of A and derivations in B. For this,
using the fact that derivations in B can be started from trees in FΩ [Y ∪ b0 Ξ] (see p.
133), we define derivations in B for trees in FΩ (Y ∪ Ξ). Take two trees q ∈ FΩ (Y ∪ Ξm )
and r ∈ F∆ (Z ∪ Ξm ). We write q ⇒∗B b0 r if

q(b0 ξ1 , . . . , b0 ξm ) ⇒∗B b0 r

holds. Now define an F-transducer C = (Σ, X, A, ∆, Z, P ′′ , A′ ), where P ′′ is given as


follows:

(i) x → ar (x ∈ X, a ∈ A, r ∈ F∆ (Z)) is in P ′′ iff there is a production x → aq in P


such that q ⇒∗B b0 r holds,

(ii) σ(a1 , . . . , am ) → ar (σ ∈ Σm , m ≥ 0, a1 , . . . , am , a ∈ A, r ∈ F∆ (Z ∪ Ξm )) is in P ′′
iff there is a production σ(a1 , . . . , am ) → aq in P such that q ⇒∗B b0 r holds. Since B
is an HF-transducer, each such q determines exactly one r with q ⇒∗B b0 r, and therefore
P ′′ is finite.

We prove that for all a ∈ A, p ∈ FΣ (X) and r ∈ F∆ (Z) the equivalence

p ⇒∗C ar ⇐⇒ (∃q ∈ FΩ (Y ))(p ⇒∗A aq ∧ q ⇒∗B b0 r) (3)


holds. We proceed by induction on hg(p).
If hg (p) = 0 then (3) obviously holds.
Assume that p = σ(p1 , . . . , pm ) (σ ∈ Σm , m > 0) and that (3) has been proved for all
trees from FΣ (X) of lesser height.
(I) First we show that the right side of (3) implies its left side. For this assume that
the derivations
p ⇒∗A σ(a1 q1 , . . . , am qm ) ⇒A aq(q1 , . . . , qm ) = aq
(pi ⇒∗A ai qi , i = 1, . . . , m)
and
q ⇒∗B q(b0 r1 , . . . , b0 rm ) ⇒∗B b0 r(r1 , . . . , rm ) = b0 r
(qi ⇒∗B b0 ri , i = 1, . . . , m)
are given. Then, by the induction hypothesis, the relations pi ⇒∗C ai ri (i = 1, . . . , m)
also hold. Moreover, by the definition of P ′′ , σ(a1 , . . . , am ) → ar is in P ′′ . Thus, we
have the derivation

p ⇒∗C σ(a1 r1 , . . . , am rm ) ⇒C ar(r1 , . . . , rm ) = ar. (4)

(II) Suppose that (4) and the derivations pi ⇒∗C ai ri (i = 1, . . . , m) are valid. Then, by
the induction hypothesis, there are trees qi ∈ FΩ (Y ) (i = 1, . . . , m) such that pi ⇒∗A ai qi


and qi ⇒∗B b0 ri hold. Moreover, by the definition of P ′′ , there exists a q ∈ FΩ (Y ∪ Ξm )


with (σ(a1 , . . . , am ), aq) ∈ P and q ⇒∗B b0 r. Therefore, for q = q(q1 , . . . , qm )

p ⇒∗A σ(a1 q1 , . . . , am qm ) ⇒A aq(q1 , . . . , qm ) = aq

and
q ⇒∗B q(b0 r1 , . . . , b0 rm ) ⇒∗B b0 r(r1 , . . . , rm ) = b0 r
hold. ✷

From Theorem 4.2.7 and the Lemmas 4.3.1 and 4.3.2 we directly obtain

Theorem 4.3.3 F = LF ◦ H = LR ◦ H. ✷

The constructions in the proofs of Lemma 4.3.1 and 4.3.2 preserve determinism. Thus,
we have

Corollary 4.3.4 DF = LDF ◦ H. ✷

Now we investigate some special classes of F-transformations for closure under com-
position.

Lemma 4.3.5 Let A = (Σ, X, A, Ω, Y, P, A′ ) be an F-transducer. Then there exists a


totally defined F-transducer B = (Σ, X, B, Ω, Y, P ′ , B ′ ) such that τA = τB. Moreover, if
A is linear, then B can be chosen linear, too.

Proof. Let B = A ∪ {∗} and B ′ = A′ . The required B results if we put

P ′ = P ∪ {x → ∗y | x ∈ X, y ∈ Y } ∪ {σ(b1 , . . . , bm ) → ∗y | σ ∈ Σm ,

m ≥ 0, b1 , . . . , bm ∈ B, y ∈ Y }.
If A is linear, then so is B. ✷

Theorem 4.3.6 The following equalities hold:

(i) LF ◦ LF = LF,

(ii) LR ◦ LR = LF.

Proof. In order to show (i), take two LF-transducers A = (Σ, X, A, Ω, Y, P, A′ ) and


B = (Ω, Y, B, ∆, Z, P ′ , B ′ ). In view of Lemma 4.3.5, we may assume that B is totally
defined. Construct an F-transducer C = (Σ, X, C, ∆, Z, P ′′ , C ′ ) with C = A × B and
C ′ = A′ × B ′ . Furthermore, P ′′ is defined as follows:

(I) x → (a, b)r (x ∈ X, (a, b) ∈ C, r ∈ F∆ (Z)) is in P ′′ iff there is a production x → aq


in P such that q ⇒∗B br holds,


(II) σ((a1 , b1 ), . . . , (am , bm )) → (a, b)r

(σ ∈ Σm , m ≥ 0, (a1 , b1 ), . . . , (am , bm ), (a, b) ∈ C, r ∈ F∆ (Z ∪ Ξm ))

is in P ′′ iff there is a production σ(a1 , . . . , am ) → aq in P such that q(b1 ξ1 , . . . , bm ξm ) ⇒∗B


br holds.
We shall prove that for arbitrary p ∈ FΣ (X), r ∈ F∆ (Z) and (a, b) ∈ C the equivalence

p ⇒∗C (a, b)r ⇐⇒ (∃q ∈ FΩ (Y ))(p ⇒∗A aq ∧ q ⇒∗B br) (5)

holds. We proceed by induction on hg(p).


If hg(p) = 0, then (5) obviously holds.
Now let p = σ(p1 , . . . , pm ) (σ ∈ Σm , m > 0), and assume that (5) has been proved for
all trees of lesser height.
First we show that the right side of (5) implies the left side. Suppose we are given
derivations
p ⇒∗A σ(a1 q1 , . . . , am qm ) ⇒A aq(q1 , . . . , qm ) = aq
and
q ⇒∗B q(b1 r1 , . . . , bm rm ) ⇒∗B br(r1 , . . . , rm ) = br
where pi ⇒∗A ai qi and qi ⇒∗B bi ri (i = 1, . . . , m). (Observe that for each i (1 ≤ i ≤ m)
there exists an ri such that qi ⇒∗B bi ri holds since B is totally defined.) Then, by
the induction hypothesis, the derivations pi ⇒∗C (ai , bi )ri (i = 1, . . . , m) are also valid.
Furthermore, by the definition of P ′′ , the production

σ((a1 , b1 ), . . . , (am , bm )) → (a, b)r

is in P ′′ . Therefore, we get the derivation

p ⇒∗C σ((a1 , b1 )r1 , . . . , (am , bm )rm ) ⇒C (a, b)r(r1 , . . . , rm ) = (a, b)r.

The fact that the left side of (5) implies its right side can be shown by reversing the
above argument.
In order to prove (ii) it is enough to note that the HF-transducer C constructed to the
LF-transducer A in the proof of Lemma 4.3.1 is also linear. Moreover, by Theorem 4.2.7,
the inclusion LR ⊆ LF holds. ✷

Using an argument similar to that used in the proof of Theorem 4.3.6 (i), one can
prove

Theorem 4.3.7 The classes DF and H are closed under composition. ✷

From Theorem 4.3.7, by Theorem 4.3.6 (i), we get

Corollary 4.3.8 The class LDF is closed under composition. ✷


Using our decomposition results, one can prove

Theorem 4.3.9 F ◦ DF = F. ✷

Now we turn to decomposition of R-transducers.

Lemma 4.3.10 R ⊆ H ◦ LR.

Proof. Let A = (Σ, X, A, ∆, Z, P, A′ ) be an arbitrary R-transducer. Let n be the greatest


integer with Σn ≠ ∅. For any production d ∈ P and natural number i (1 ≤ i ≤ n),
denote by k(d, i) the number of occurrences of ξi in the right-hand side of d. Set
k = max{k(d, i) | d ∈ P, i = 1, . . . , n}. Furthermore, take the ranked alphabet Ω given by
Ω = ∪(Ωm·k | m ≥ 0) and Ωm·k = {σ ′ | σ ∈ Σm } (m ≥ 0).
Let B = (Σ, X, {b0 }, Ω, X, P ′ , b0 ) be the HR-transducer where P ′ consists of all pro-
ductions
b0 x → x (x ∈ X)
and
b0 σ → σ ′ (b0^k ξ1^k , . . . , b0^k ξm^k ) (σ ∈ Σm , m ≥ 0).
Next define an LR-transducer C = (Ω, X, A, ∆, Z, P ′′ , A′ ), where P ′′ is given as follows:

(i) ax → r (x ∈ X) is in P ′′ iff it is in P .

(ii) Let σ ∈ Σm (m ≥ 0) and ξ i ∈ Ξk with ξij = ξ(i−1)k+j (i = 1, . . . , m, j = 1, . . . , k).


Then aσ ′ → r(a1 ξ 1 , . . . , am ξ m ) is in P ′′ iff aσ → r(a1 ξ1^n1 , . . . , am ξm^nm ) is in P (for
some n1 , . . . , nm ).

For each p ∈ FΣ (X) let us denote by p′ ∈ FΩ (X) the tree given as follows:

(I) if p = x ∈ X, then p′ = x,

(II) if p = σ(p1 , . . . , pm ) (σ ∈ Σm , m ≥ 0), then p′ = σ ′ (p′1^k , . . . , p′m^k ).

It is easy to show that the transformation τB is exactly the mapping p → p′ (p ∈


FΣ (X)).
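
The mapping p → p′ is easily programmed. The Python sketch below is an informal
illustration only: primed operators are represented by appending an apostrophe to the
operator name, and every immediate subtree is copied k times as described in (I) and (II).

    def prime(p, k):
        # the tree p' of the proof of Lemma 4.3.10
        if len(p) == 1:                          # a leaf x
            return p
        copies = []
        for s in p[1:]:
            copies.extend([prime(s, k)] * k)     # k copies of every primed subtree
        return (p[0] + "'",) + tuple(copies)

    print(prime(('sigma', ('sigma', ('x',))), 2))
    # ("sigma'", ("sigma'", ('x',), ('x',)), ("sigma'", ('x',), ('x',)))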
In order to prove τA = τB ◦ τC it is enough to show that for all a ∈ A, p ∈ FΣ (X) and
r ∈ F∆ (Z) the equivalence
ap ⇒∗A r ⇐⇒ ap′ ⇒∗C r (6)
holds. We proceed by induction on hg(p).
If hg(p) = 0 then, by the choice of P ′′ , (6) is obviously valid.
Now let p = σ(p1 , . . . , pm ) (σ ∈ Σm , m > 0), and assume that (6) has been proved for
all trees of lesser height.
First we prove that the left side of (6) implies its right side. Assume that

ap ⇒A r(a1 p1^n1 , . . . , am pm^nm ) ⇒∗A r(r1 , . . . , rm ) = r


where ai pi^ni ⇒∗A ri (i = 1, . . . , m). Then, by the definition of P ′′ , the production
aσ ′ → r(a1 ξ 1 , . . . , am ξ m ) is in P ′′ . Moreover, by the induction hypothesis, there are
derivations ai p′i^ni ⇒∗C ri for all i(= 1, . . . , m). Therefore, we have the desired derivation

ap′ ⇒C r(a1 p′1^n1 , . . . , am p′m^nm ) ⇒∗C r(r1 , . . . , rm ) = r.

The fact that the right side of (6) implies its left side can be proved by the converse
of the computation above. ✷

Lemma 4.3.11 H ◦ R ⊆ R.

Proof. Let A = (Σ, X, {a0 }, Ω, Y, P, a0 ) be an HR-transducer and B = (Ω, Y, B, ∆, Z, P ′ ,


B ′ ) an arbitrary R-transducer. Take the R-transducer C = (Σ, X, B, ∆, Z, P ′′ , B ′ ), where
P ′′ is given in the following way:

(i) bx → r (b ∈ B, x ∈ X, r ∈ F∆ (Z)) is in P ′′ iff there is a production a0 x → q in P


such that bq ⇒∗B r holds;

(ii) bσ → r (b ∈ B, σ ∈ Σm , m ≥ 0, r ∈ F∆ [Z ∪ BΞm ]) is in P ′′ iff there is a production


a0 σ → q(a0 ξ1 , . . . , a0 ξm ) (q ∈ FΩ (Y ∪ Ξm )) in P such that bq ⇒∗B r holds.

To show τA ◦ τB = τC it is enough to prove that for arbitrary b ∈ B, p ∈ FΣ (X) and


r ∈ F∆ (Z) the equivalence

bp ⇒∗C r ⇐⇒ (∃q ∈ FΩ (Y ))(a0 p ⇒∗A q ∧ bq ⇒∗B r)

holds. This can be carried out by induction on hg(p). ✷

From Lemmas 4.3.10 and 4.3.11 we directly get

Theorem 4.3.12 R = H ◦ LR. ✷

Using Theorems 4.3.3 and 4.3.12 we obtain

Theorem 4.3.13 For each n ≥ 1 the inclusions F n ⊆ Rn+1 and Rn ⊆ F n+1 hold. ✷

Taking n = 1 in Theorem 4.3.13, we see that every F-transformation can be given as


the composition of two R-transformations, and each R-transformation can be obtained
as the composition of two F-transformations. Thus, taking Theorem 4.2.5 into account,
we get

Corollary 4.3.14 Neither F nor R is closed under composition. ✷

One can show that F is not closed under composition by LNF-transformations either.
For R, we have

Theorem 4.3.15 R ◦ LN R = R.


Proof. By Theorem 4.3.12, it suffices to show that LR is closed under compositions by


LNR-transformations.
Let A = (Σ, X, A, Ω, Y, P, A′ ) be an LR-transducer and B = (Ω, Y, B, ∆, Z, P ′ , B ′ ) an
LNR-transducer. Take the R-transducer C = (Σ, X, C, ∆, Z, P ′′ , C ′ ) with C = A × B
and C ′ = A′ × B ′ . Moreover, P ′′ is given as follows:

(i) (a, b)x → r ((a, b) ∈ C, x ∈ X, r ∈ F∆ (Z)) is in P ′′ iff there is a production ax → q


in P such that bq ⇒∗B r holds.

(ii) (a, b)σ → r((a1 , b1 )ξ1 , . . . , (am , bm )ξm )

((a, b), (a1 , b1 ), . . . , (am , bm ) ∈ C, σ ∈ Σm , m ≥ 0, r ∈ F∆ [Z ∪ CΞm])

is in P ′′ iff there is a production aσ → q(a1 ξ1 , . . . , am ξm ) (q ∈ FΩ (Y ∪ Ξm )) in P


such that bq ⇒∗B r(b1 ξ1 , . . . , bm ξm ) holds.

In order to show τC = τA ◦ τB it is enough to prove that for arbitrary (a, b) ∈ C, p ∈


FΣ (X) and r ∈ F∆ (Z) the equivalence

(a, b)p ⇒∗C r ⇐⇒ (∃q ∈ FΩ (Y ))(ap ⇒∗A q ∧ bq ⇒∗B r)

holds. This can be done by induction on hg(p). ✷

Later on we need the following results.

Lemma 4.3.16 Let τ ⊆ FΣ (X) × FΩ (Y ) be an arbitrary F-transformation and T ∈


Rec(Ω, Y ). Then T τ −1 ∈ Rec(Σ, X).

Proof. By Lemma 4.1.11, there exists an F-transducer A with dom(τA ) = range(τA ) = T


and τA is the identity mapping on T . Moreover, by the proof of Lemma 4.1.11, we may
suppose that A is deterministic. Furthermore, by Theorem 4.3.9, F ◦ DF = F. Thus,
since T τ −1 = dom(τ ◦ τA ), in order to prove Lemma 4.3.16, it is enough to show that
the domain of an F-transformation is recognizable. But this is true by (i) of Theorem
4.1.10. ✷

From Theorem 4.1.10 and Lemma 4.3.16, using the inclusion R ⊆ F 2 (see Theorem
4.3.13), we get

Corollary 4.3.17 Let τ ⊆ FΣ (X) × FΩ (Y ) be an arbitrary R-transformation. If T ∈


Rec(Ω, Y ), then T τ −1 ∈ Rec(Σ, X). In particular, dom(τ ) ∈ Rec(Σ, X). ✷

4.4 TREE TRANSDUCERS WITH REGULAR LOOK-AHEAD


Consider an F-transducer A = (Σ, X, A, Ω, Y, P, A′ ). Take a tree p = σ(p1 , . . . , pm ) ∈
FΣ (X) (σ ∈ Σm , m > 0) and a derivation σ(p1 , . . . , pm ) ⇒∗ σ(a1 q1 , . . . , am qm ) (ai ∈
A, qi ∈ FΩ (Y ), pi ⇒∗ ai qi , i = 1, . . . , m). Then, knowing the states a1 , . . . , am , our


transducer can decide which production σ(a1 , . . . , am ) → q to apply next. In other


words, after inspecting the properties of the subtrees p1 , . . . , pm , the F-transducer A can
select the production to be applied in the next step of the translation of p. Moreover,
these properties of subtrees are regular in the sense that dom(τA(ai ) ) is a regular forest
for each i(= 1, . . . , m). Obviously, R-transducers lack this possibility. This observation
leads to the idea to provide R-transducers with regular look-ahead as follows.

Definition 4.4.1 A root-to-frontier tree transducer with regular look-ahead (RR -trans-
ducer) is a system A = (Σ, X, A, Ω, Y, P, A′ ), where
(1) Σ, X, A, Ω, Y and A′ have the same meanings as in Definition 4.1.4,
(2) P is a finite set of productions (or rewriting rules) of the form (p → q, D), where
p → q is an R-transducer production and D is a mapping of the set of all auxiliary
variables occurring in p into Rec(Σ, X).
If p is of the form ax (x ∈ X) or aσ with σ ∈ Σ0 , then the domain of D is empty. We
write such rules generally as ax → q and aσ → q, respectively. Moreover, for any a ∈ A,
we put A(a) = (Σ, X, A, Ω, Y, P, a).

Definition 4.4.2 Let A be the RR -transducer of Definition 4.4.1. A is called determin-


istic if the following conditions are satisfied:
(i) A′ is a singleton.
(ii) If (p1 → q1 , D1 ) and (p2 → q2 , D2 ) are two productions in P with p1 = p2 , and
q1 ≠ q2 , then there exists an i (1 ≤ i ≤ m) such that D1 (ξi ) ∩ D2 (ξi ) = ∅, where
m is the number of auxiliary variables in p1 (= p2 ).
Linear and nondeleting RR -transducers are defined in the same way as their R-
transducer counterparts.

Definition 4.4.3 Take an RR -transducer A = (Σ, X, A, Ω, Y, P, A′ ), and let p, q ∈


FΩ [Y ∪ AFΣ (X)] be two trees. It is said that p directly derives q in A (in notation,
p ⇒A q) if q can be obtained from p
(i) by replacing an occurrence of an ax (a ∈ A, x ∈ X) in p by the right side q of a
production ax → q in P , or
(ii) by replacing an occurrence of a subtree aσ(p1 , . . . , pm ) (a ∈ A, σ ∈ Σm , m ≥
0, p1 , . . . , pm ∈ FΣ (X)) in p by q(p1 , . . . , pm ), where (aσ → q, D) is in P and
pi ∈ D(ξi ) for each i(= 1, . . . , m).
A sequence
p = p0 ⇒A p1 ⇒A . . . ⇒A pk = q (k ≥ 0)
obtained by consecutive applications of direct derivations is a derivation of q from p in
A. When such a derivation exists, we write p ⇒∗A q. Again, this notation will also be
used to indicate a certain derivation.
If there is no danger of confusion, then we generally omit A in ⇒A and ⇒∗A .


According to Definition 4.4.3, the difference between derivations in R-transducers and


RR -transducers is that in case of an RR -transducer A a production aσ → q can be
applied to a tree aσ(p1 , . . . , pm ) if and only if there is a production (aσ → q, D) of A
such that each subtree pi (1 ≤ i ≤ m) is in the recognizable forest D(ξi ).

Definition 4.4.4 Let A = (Σ, X, A, Ω, Y, P, A′ ) be an RR -transducer. Then the relation

τA = {(p, q) | p ∈ FΣ (X), q ∈ FΩ (Y ), ap ⇒∗ q for some a ∈ A′ }

is called the transformation induced by A.


A relation τ is an RR -transformation if there exists an RR -transducer A such that
τ = τA .
Linear, nondeleting and deterministic RR -transformations are defined in an obvious
way.
The class of all RR -transformations will be denoted by RR .

Let us note that there exists a recursive definition of transformations induced by RR -


transducers. This can be obtained by an obvious modification of the corresponding
definition of transformations induced by R-transducers.
Moreover, for RR -transducers the notion of a reordering of direct derivations can be
defined in the same way as in the case of R-transducers. Furthermore, the remarks
concerning different forms of derivations in R-transducers are valid for RR -transducers,
too.
To illustrate the concepts of RR -transducers and RR -transformations, consider

Example 4.4.5 Let X = {x} and Σ = Σ1 ∪ Σ2 , where Σi = {σi } (i = 1, 2). Take the
forests T1 = {σ1 (x)}∗x and T2 = {σ1 (x)}. Let A = (Σ, X, {a0 , a1 }, Ω, Y, P, a0 ) be the
RR -transducer where Ω = Ω1 = {ω}, Y = {y} and P consists of the productions

(a0 σ2 → ω(a1 ξ1 ), D1 ) (D1 (ξ1 ) = T1 , D1 (ξ2 ) = T2 ),

(a1 σ1 → ω(a1 ξ1 ), D2 ) (D2 (ξ1 ) = T1 ),


a1 x → y.
Then τA = {(σ2 (σ1n (x), σ1 (x)), ω n+1 (y)) | n = 0, 1, . . .}. Observe that (without
regular look-ahead) the corresponding R-transducer would induce the transformation
{(σ2 (σ1n (x), p), ω n+1 (y)) | p ∈ FΣ (X), n = 0, 1, . . .}. ✷
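
The role of the look-ahead mappings can be illustrated by a small Python sketch of one
derivation step of the above transducer. This is an informal simulation only; in
particular, the recognizable forests T1 and T2 are represented here simply by membership
predicates.

    def in_T1(p):                  # T1 = { sigma1^n(x) | n >= 0 }
        return p == ('x',) or (p[0] == 'sigma1' and in_T1(p[1]))

    def in_T2(p):                  # T2 = { sigma1(x) }
        return p == ('sigma1', ('x',))

    def step_a0_sigma2(p):
        # apply (a0 sigma2 -> omega(a1 xi1), D1) if both look-ahead constraints hold
        if p[0] != 'sigma2':
            return None
        p1, p2 = p[1], p[2]
        if in_T1(p1) and in_T2(p2):          # D1(xi1) = T1, D1(xi2) = T2
            return ('omega', ('a1', p1))     # the deleted subtree p2 was still inspected
        return None                          # the production is not applicable

    p = ('sigma2', ('sigma1', ('x',)), ('sigma1', ('x',)))
    print(step_a0_sigma2(p))                 # ('omega', ('a1', ('sigma1', ('x',))))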

Obviously R-transducers are special cases of RR -transducers. On the other hand, RR -


transducers can restrict the domain of possible subtrees of input trees even if these are
deleted. In fact, no R-transducer could induce the τA considered in the above example.
Assume that such an R-transducer

B = (Σ, X, B, Ω, Y, P ′ , B ′ )

exists. Then for every n(≥ 0), the production applied first in a derivation b0 σ2 (σ1n (x),
σ1 (x)) ⇒∗B ω n+1 (y) (b0 ∈ B ′ ) should be of the form


(i) b0 σ2 → q(bξ1 ) or

(ii) b0 σ2 → q(bξ2 ) (b ∈ B, q = ω m (ξ1 ), m ≥ 0).

Let k be the maximum of the heights of right sides of productions from P ′ and n ≥ 3k.
Then the considered production should be of the form (i). But in this case all pairs
(σ2 (σ1n (x), p), ω n+1 (y)) (p ∈ FΣ (X)) are in τB , which is a contradiction. ✷

Theorem 4.4.6 The following inclusions hold:

(i) RR ⊆ DFrel ◦ R,

(ii) LRR ⊆ DFrel ◦ LR,

(iii) DRR ⊆ DFrel ◦ DR,

(iv) LDRR ⊆ DFrel ◦ LDR.

Proof. Let A = (Σ, X, A, ∆, Y, P, A′ ) be an arbitrary RR -transducer. Let T1 , . . . , Tk (⊆


FΣ (X)) be all regular forests which appear as images in the D-mappings of the produc-
tions in P . Denote by V the set of all k-dimensional vectors with components 0 or 1.
Now take a ranked alphabet Ω, where Ω0 = Σ0 , and for each m > 0, Ωm = Σm × V m .
Thus, the elements from Ωm (m > 0) can be given in the form (σ, (v1 , . . . , vm )), where
σ ∈ Σm and v1 , . . . , vm ∈ V.
Let Ai = (Ai , αi , A′i ) be ΣX-recognizers with Ai = (Ai , Σ) and T (Ai ) = Ti (i =
1, . . . , k). We introduce the F-transducer B = (Σ, X, B, Ω, X, P ′ , B ′ ) where B = B ′ =
A1 × . . . × Ak and P ′ consists of the following productions:

(I) x → (xα1 , . . . , xαk )x (x ∈ X),

(II) σ → (σ A1 , . . . , σ Ak )σ (σ ∈ Σ0 ),

(III) σ(a1 , . . . , am ) → a(σ, (v1 , . . . , vm ))(ξ1 , . . . , ξm )

(σ ∈ Σm , m > 0; a, ai ∈ B, vi ∈ V, i = 1, . . . , m),

where
a = (σ A1 (a11 , . . . , am1 ), . . . , σ Ak (a1k , . . . , amk ))
and vij = 1 iff aij ∈ A′j . Obviously, B is a deterministic F-relabeling.
One can easily show that B relabels every ΣX-tree p in the following way:

(α) if p ∈ X ∪ Σ0 , then τB(p) = p,

(β) if p = σ(p1 , . . . , pm ) (σ ∈ Σm , m > 0) then τB(p) = (σ, (v1 , . . . , vm ))(τB (p1 ), . . . ,


τB (pm )), where vij = 1 iff pi ∈ Tj (1 ≤ i ≤ m, 1 ≤ j ≤ k).
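
Informally, B decorates every inner node of a tree with the membership pattern of its
immediate subtrees. The following hypothetical Python sketch of such a relabeling (with
the forests Tj given as predicates) is meant only as an illustration of (α) and (β).

    def relabel(p, tests):
        # attach to every inner node the 0/1-vectors recording membership of its
        # immediate subtrees in the forests T1, ..., Tk (tests[j] decides T_{j+1})
        if len(p) == 1:                      # leaves and nullary symbols are unchanged
            return p
        subtrees = p[1:]
        vectors = tuple(tuple(1 if t(s) else 0 for t in tests) for s in subtrees)
        return ((p[0], vectors),) + tuple(relabel(s, tests) for s in subtrees)

    def in_T1(p):                            # a sample forest T1 = { sigma1^n(x) | n >= 0 }
        return p == ('x',) or (p[0] == 'sigma1' and in_T1(p[1]))

    print(relabel(('sigma2', ('sigma1', ('x',)), ('x',)), [in_T1]))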

Next construct the R-transducer C = (Ω, X, A, ∆, Y, P ′′ , A′ ) where P ′′ consists of the


productions below:


(α′ ) ap → r (a ∈ A, p ∈ X ∪ Ω0 , r ∈ F∆ (Y )) is in P ′′ iff it is in P,
(β ′ ) a(σ, (v1 , . . . , vm )) → r (a ∈ A; σ ∈ Σm , m > 0; vi ∈ V, i = 1, . . . , m; r ∈ F∆ [Y ∪
AΞm ]) is in P ′′ iff (σ, (v1 , . . . , vm )) occurs in a tree τB (p) (p ∈ FΣ (X)) and P
contains a production (aσ → r, D) such that vij = 1 whenever D(ξi ) = Tj (1 ≤
i ≤ m, 1 ≤ j ≤ k).
In order to prove τA = τB ◦ τC it is enough to show that for arbitrary a ∈ A, p ∈ FΣ (X)
and r ∈ F∆ (Y ) the equivalence
ap ⇒∗A r ⇐⇒ aτB (p) ⇒∗C r
holds. This can be carried out by induction on hg(p).
It is also easy to show that C is deterministic (linear) if A is deterministic (linear). ✷

Theorem 4.4.6 (iii) shows that DRR -transducers induce (partial) mappings.
Next we show that RR is closed under certain special F-transformations.

Theorem 4.4.7 The following inclusions hold:


(i) RR ◦ LF ⊆ RR ,
(ii) DRR ◦ DLF ⊆ DRR ,
(iii) DRR ◦ DLR ⊆ DRR ,
(iv) DRR ◦ H ⊆ DRR .

Proof. Let A = (Σ, X, A, Ω, Y, P, A′ ) be an RR -transducer, and take an LF- transducer


B = (Ω, Y, B, ∆, Z, P ′ , B ′ ).
We want to treat cases (i) and (ii) together. Since the set of initial states
of a DRR -transducer should be a singleton, we shall use the LF-transducer B̄ =
(Ω, Y, B̄, ∆, Z, P̄ , b0 ) instead of B, where B̄ = B ∪ {b0 } (b0 ∉ B) and P̄ is obtained
by enlarging P ′ by the following productions: if y → bq (y ∈ Y ) is in P ′ and b ∈ B ′ ,
then y → b0 q is in P̄ . Similarly, if σ(b1 , . . . , bm ) → bq (σ ∈ Ωm , m ≥ 0) is in P ′ and
b ∈ B ′ , then the production σ(b1 , . . . , bm ) → b0 q is in P̄ . It is obvious that τB̄ = τB .
Construct the RR -transducer C = (Σ, X, A × B̄, ∆, Z, P ′′ , A′ × {b0 }), where P ′′ is given
as follows:
(I) (a, b)p → r (a ∈ A, b ∈ B̄, p ∈ X ∪ Σ0 , r ∈ F∆ (Z)) is in P ′′ iff there exists a
production ap → q in P such that q ⇒∗B̄ br holds.
(II) Assume that the production (aσ → q(a1 ξ1^n1 , . . . , am ξm^nm ), D) (a ∈ A; σ ∈ Σm , m >
0; ai ∈ A^ni , i = 1, . . . , m; n1 + . . . + nm = n, q ∈ F̂Ω (Y ∪ Ξn )) is in P and that there
is a derivation q(b1 ξ 1 , . . . , bm ξ m ) ⇒∗B̄ br(ξ 1 , . . . , ξ m ) with b ∈ B̄; bi ∈ B̄^ni , ξ i ∈
Ξni , ξij = ξn1 +...+ni−1 +j , 1 ≤ j ≤ ni , i = 1, . . . , m and r ∈ F∆ (Z ∪ Ξn ). Then P ′′
contains the production ((a, b)σ → r(a1 b1 ξ1^n1 , . . . , am bm ξm^nm ), D ′ ), where D ′ (ξi ) =
∩(τA(aij )^−1 (dom(τB̄(bij ) )) | j = 1, . . . , ni ) ∩ D(ξi ) (i = 1, . . . , m). If b ∈ B ′ , then
((a, b0 )σ → r(a1 b1 ξ1^n1 , . . . , am bm ξm^nm ), D ′ ) is also in P ′′ .


By Corollary 4.3.17, the domain of an R-transformation is regular. Moreover, also


by Corollary 4.3.17, the inverse of an R-transformation preserves regularity. Thus, by
Corollary 4.2.9 and Theorems 4.4.6 and 2.4.2, D ′ (ξi ) (1 ≤ i ≤ m) is regular.
In order to show τA ◦ τB = τC it is enough to prove that for all (a, b) ∈ A × B̄, p ∈ FΣ (X)
and r ∈ F∆ (Z) the equivalence

(a, b)p ⇒∗C r ⇐⇒ (∃q ∈ FΩ (Y ))(ap ⇒∗A q ∧ q ⇒∗B̄ br)

holds. This can be done by induction on hg(p).


One can easily check that if A and B are deterministic, then so is C. Thus, (i) and (ii)
are valid.
For (iii), take a DRR -transducer A = (Σ, X, A, Ω, Y, P, a0 ) and a DLR-transducer B =
(Ω, Y, B, ∆, Z, P ′ , b0 ).
Consider the RR -transducer C = (Σ, X, A × B, ∆, Z, P ′′ , (a0 , b0 )), where P ′′ is given in
the following way:

(I) If ap → q (a ∈ A, p ∈ X ∪ Σ0 , q ∈ FΩ (Y )) is in P and bq ⇒∗B r (b ∈ B, r ∈ F∆ (Z))


holds, then (a, b)p → r is in P ′′ .

(II) Suppose that (aσ → q(a1 ξ1^n1 , . . . , am ξm^nm ), D) (a ∈ A, σ ∈ Σm , m > 0, ai ∈ A^ni , i =
1, . . . , m, n1 + . . . + nm = n, q ∈ F̂Ω (Y ∪ Ξn )) is in P and there is a derivation
bq ⇒∗B r(b1 ξ 1 , . . . , bm ξ m ) with b ∈ B, bi ∈ B^ni , ξ i ∈ Ξni , ξij = ξn1 +...+ni−1 +j , 1 ≤
j ≤ ni , i = 1, . . . , m and r ∈ F∆ (Z ∪ Ξn ). Then the production

((a, b)σ → r(a1 b1 ξ1^n1 , . . . , am bm ξm^nm ), D ′ )

is in P ′′ , where for every i(= 1, . . . , m),

D ′ (ξi ) = ∩(dom(τA(aij ) ) | ξij (1 ≤ j ≤ ni ) does not occur in r) ∩ D(ξi ).

Obviously, C is a DRR -transducer. Moreover, for all a ∈ A, b ∈ B, p ∈ FΣ (X) and


r ∈ F∆ (Z) the equivalence

(a, b)p ⇒∗C r ⇐⇒ (∃q ∈ FΩ (Y ))(ap ⇒∗A q ∧ bq ⇒∗B r)

holds. This can be proved by induction on hg(p). Therefore, τC = τA ◦ τB . Thus we have


shown that DRR ◦ DLR ⊆ DRR .
To show (iv), let A = (Σ, X, A, Ω, Y, P, a0 ) be a DRR -transducer and B =
(Ω, Y, {b0 }, ∆, Z, P ′ , b0 ) an HF-transducer.
Construct an RR -transducer C = (Σ, X, A, ∆, Z, P ′′ , a0 ), where P ′′ is given as follows:

(I) ap → r (a ∈ A, p ∈ Σ0 ∪ X, r ∈ F∆ (Z)) is in P ′′ iff there is a production ap → q in


P such that q ⇒∗B b0 r holds.

(II) Suppose that the production (aσ → q(a1 ξ1^n1 , . . . , am ξm^nm ), D) (a ∈ A, σ ∈ Σm , m >
0, ai ∈ A^ni , i = 1, . . . , m, n1 + . . . + nm = n, q ∈ F̂Ω (Y ∪ Ξn )) is in P and there is a
derivation

q(b0^n1 ξ 1 , . . . , b0^nm ξ m ) ⇒∗B b0 r(ξ11^k11 , . . . , ξ1n1^k1n1 , . . . , ξm1^km1 , . . . , ξmnm^kmnm )

where ξ i ∈ Ξni , ξij = ξn1 +...+ni−1 +j , 1 ≤ j ≤ ni , i = 1, . . . , m, k11 + . . . + k1n1 + . . . + km1 +
. . . + kmnm = k, r ∈ F̂∆ (Z ∪ Ξk ). Then the production

(aσ → r(a11^k11 ξ1^k11 , . . . , a1n1^k1n1 ξ1^k1n1 , . . . , am1^km1 ξm^km1 , . . . , amnm^kmnm ξm^kmnm ), D ′ )

is in P ′′ , where for every i(= 1, . . . , m), D ′ (ξi ) = ∩(dom(τA(aij ) ) | ξij occurs in q
but it does not occur in r) ∩ D(ξi ).

Using a similar argument as in the proof of (ii), we get that D ′ (ξi ) is a regular forest.
It is obvious that C is deterministic.
Finally, to show τA ◦ τB = τC it is enough to prove that for all a ∈ A, p ∈ FΣ (X) and
r ∈ F∆ (Z) the equivalence

ap ⇒∗C r ⇐⇒ pτA(a) ⇒∗B b0 r

holds. This can be done by induction on hg(p). ✷

From Theorem 4.4.7 we get

Corollary 4.4.8 The inclusions

(i) RR ◦ Frel ⊆ RR ,

(ii) DRR ◦ DFrel ⊆ DRR , and

(iii) DRR ◦ DRrel ⊆ DRR

hold. ✷

Next we show that the classes of LF-transformations and LRR -transformations coin-
cide.

Theorem 4.4.9 LRR = LF.

Proof. Since DFrel ⊆ LN F, the inclusion LRR ⊆ LF is implied by Theorems 4.4.6


(ii), 4.2.8 and 4.3.6 (ii).
In order to prove LF ⊆ LRR , take an LF-transducer A = (Σ, X, A, Ω, Y, P, A′ ). Con-
sider the RR -transducer B = (Σ, X, A, Ω, Y, P ′ , A′ ), where P ′ is given as follows:

(i) If x → aq (x ∈ X, a ∈ A, q ∈ FΩ (Y )) is in P , then ax → q is in P ′ .

(ii) If σ(a1 , . . . , am ) → aq (σ ∈ Σm , m ≥ 0, a1 , . . . , am , a ∈ A, q ∈ FΩ (Y ∪ Ξm )) is in P ,
then (aσ → q(a1 ξ1 , . . . , am ξm ), D) is in P ′ , where for every i(= 1, . . . , m),

D(ξi ) = dom(τA(ai ) ) if ξi does not occur in q, and D(ξi ) = FΣ (X) otherwise.


Obviously, B is an LRR -transducer. To prove τA = τB it is enough to show that for


each a ∈ A, p ∈ FΣ (X) and q ∈ FΩ (Y ) the equivalence

p ⇒∗A aq ⇐⇒ ap ⇒∗B q

holds. Again, we omit the straightforward inductive proof. ✷

In the proof of the above theorem we used look-ahead to ensure that the LRR -
transducer will not transform any tree which contains a subtree for which the LF-
transducer has no transform but which it would later delete.
From Theorem 4.4.9, by Theorem 4.3.6 (i), we get

Corollary 4.4.10 LRR is closed under composition. ✷

Next we show that RR is closed under LRR -transformations and DRR is closed under
composition.

Theorem 4.4.11 The following equations hold:

(i) RR ◦ LRR = RR ,

(ii) DRR ◦ DRR = DRR .

Proof. RR ◦ LRR = RR follows from Theorem 4.4.7 by Theorem 4.4.9.


Since, for each Σ and X, the identity mapping on FΣ (X) is in DRR , in order to prove
(ii) it is enough to show the validity of the inclusion DRR ◦ DRR ⊆ DRR .
By Theorem 4.4.6 (iii), the inclusion DRR ◦ DRR ⊆ DRR ◦ DFrel ◦ DR holds from
which, using Corollary 4.4.8 (ii), we get DRR ◦DRR ⊆ DRR ◦DR. This latter inclusion,
by the proof of Lemma 4.3.10, implies DRR ◦ DRR ⊆ DRR ◦ H ◦ DLR. Now, using
Theorem 4.4.7 (iv), we get DRR ◦ DRR ⊆ DRR ◦ DLR, from which by Theorem 4.4.7
(iii), we arrive at the desired inclusion DRR ◦ DRR ⊆ DRR . ✷

To end this section we prove the analogue of Theorem 4.3.12.

Theorem 4.4.12 RR = H ◦ LRR .

Proof. The inclusion H ◦ LRR ⊆ RR directly follows from Theorem 4.4.11 (i). To show
RR ⊆ H ◦ LRR , consider an RR -transducer A = (Σ, X, A, ∆, Z, P, A′ ). Omit regular
look-ahead in A and for the resulting R-transducer consider the H-transducer B and
LR-transducer C given in the proof of Lemma 4.3.10. Now it is impossible to provide C
with a suitable regular look-ahead in an obvious way since H-transducers do not preserve
regularity. We shall solve this problem in the following way.
Take the tree homomorphism h : FΩ (X) → FΣ (X) given as follows:

(i) hX (x) = x (x ∈ X),

(ii) hmk (σ ′ ) = σ(ξ1 , ξk+1 , . . . , ξ(m−1)k+1 ) (σ ∈ Σm , m ≥ 0).


One can easily verify that for every p ∈ FΣ (X) the equality h(τB (p)) = p holds, i.e.,
h(p′ ) = p (for p′ , see the proof of Lemma 4.3.10).
Now replacing each production aσ ′ → r(a1 ξ 1 , . . . , am ξ m ) (σ ∈ Σm , m > 0, (aσ →
r(a1 ξ1^n1 , . . . , am ξm^nm ), D) ∈ P ) in P ′′ by (aσ ′ → r(a1 ξ 1 , . . . , am ξ m ), D ′ ), where
D ′ (ξij ) = h−1 (D(ξi )) (i = 1, . . . , m, j = 1, . . . , k), from C we get an LRR -transducer
since, by Theorem 2.4.18, h−1 preserves recognizability. Let us denote the resulting LRR -transducer
also by C.
Using tree induction, it is easy to prove that τA = τB ◦ τC . ✷

4.5 GENERALIZED SYNTAX DIRECTED TRANSLATORS


In studying certain properties of tree transformations it is technically useful to consider
systems that translate trees into strings. Such systems are also of interest as mathemat-
ical models of syntax directed translations of context-free languages.

Definition 4.5.1 A generalized syntax directed translator (GSDT) is a system A =


(Σ, X, A, Y, P, A′ ), where
(1) Σ is a ranked alphabet,

(2) A is a unary ranked alphabet (the state set),

(3) X and Y are alphabets,

(4) A′ ⊆ A is the set of initial states, and

(5) P is a finite set of productions (or rewriting rules) of the following two types:
(i) ax → w (a ∈ A, x ∈ X, w ∈ Y ∗ ),
(ii) aσ → w (a ∈ A, σ ∈ Σm , m ≥ 0, w ∈ (Y ∪ AΞm )∗ ). (Here AΞm is treated as
an alphabet; the elements of it are the trees of the form aξi with a ∈ A and
ξi ∈ Ξm .)

For ap → w we shall use the notation (ap, w), too. Moreover, for any a ∈ A, we put
A(a) = (Σ, X, A, Y, P, a).
Next we define translations induced by a GSDT A. To this end, we associate with
each a ∈ A and p ∈ FΣ (X) a subset pτA,a as follows:
(i) if p ∈ (X ∪ Σ0 ), then pτA,a = {w | (ap, w) ∈ P };

(ii) if p = σ(p1 , . . . , pm ) (σ ∈ Σm , m > 0), then for all

(aσ, w1 ai(1) ξi1 w2 . . . wk ai(k) ξik wk+1 ) ∈ P

(1 ≤ ij ≤ m, j = 1, . . . , k, w1 , . . . , wk+1 ∈ Y ∗ ) and vij ∈ pij τA,ai(j) (j = 1, . . . , k)


the word w1 vi1 w2 . . . wk vik wk+1 is in pτA,a, and

(iii) nothing is in any pτA,a unless this follows from (i) and (ii).


Definition 4.5.2 Let A = (Σ, X, A, Y, P, A′ ) be a GSDT. Then the translation induced


by A is the relation τA = {(p, w) | p ∈ FΣ (X), w ∈ Y ∗ , w ∈ pτA,a for some a ∈ A′ }.
The class of all translations induced by GSDTs will be denoted by G.
For translations induced by GSDTs we give another definition showing how a transla-
tion is carried out step by step.
Let A be the GSDT of Definition 4.5.1. Take two words v, w ∈ (Y ∪ AFΣ (X ∪ Ξ))∗ .
(Here again each element of AFΣ (X ∪ Ξ) is considered a symbol, i.e., we ignore the fact
that these elements are composed of simpler objects.) We say that v directly derives w
in A, and write v ⇒A w, if w can be obtained from v by
(i) replacing an occurrence of ax (a ∈ A, x ∈ X) in v by the right side w of a production
ax → w from P , or
(ii) replacing an occurrence of an aσ(p1 , . . . , pm ) (a ∈ A, σ ∈ Σm , m ≥ 0, p1 , . . . , pm ∈
FΣ (X ∪ Ξ)) in v by w1 ai(1) pi1 w2 . . . wk ai(k) pik wk+1 where
aσ → w1 ai(1) ξi1 w2 . . . wk ai(k) ξik wk+1 (1 ≤ ij ≤ m, j = 1, . . . , k, w1 , . . . , wk+1 ∈ Y ∗ )
is a production in P.
Each application of a step (i) or (ii) is called a direct derivation in A. A sequence
v = v0 ⇒A v1 ⇒A . . . ⇒A vk = w (k ≥ 0, vi ∈ (Y ∪ AFΣ (X ∪ Ξ))∗ , i = 0, . . . , k)
of consecutive direct derivations is a derivation of w from v in A, and k is the length of
this derivation. If w can be obtained from v by a derivation in A, then we write v ⇒∗A w.
Thus ⇒∗A is the reflexive-transitive closure of ⇒A . Again, we suppose that the notation
v ⇒∗A w implicitly includes a given derivation of w from v.
Using the notation ⇒∗A , the translation τA induced by a GSDT A = (Σ, X, A, Y, P, A′ )
can be given by
τA = {(p, w) | p ∈ FΣ (X), w ∈ Y ∗ , ap ⇒∗A w for some a ∈ A′ }.
The concept of a reordering of direct derivations in GSDTs can be defined in a similar
way as in the case of an R-transducer. Moreover, different forms of derivations can be
introduced in an obvious manner.
Deterministic, linear, totally defined and nondeleting GSDTs are defined in a natural
way. Moreover, a one-state totally defined deterministic GSDT is a GSDH-translator.
The translation induced by a GSDH-translator is called a generalized syntax directed
homomorphism (GSD homomorphism). The class of all GSD homomorphisms will be
denoted by Ghom .
Example 4.5.3 Let B = (Σ, {x}, {b0 , b1 , b2 }, {y1 , y2 }, P ′ , b0 ) be a GSDT, where Σ =
Σ1 = {σ} and P ′ consists of the productions
b0 σ → b1 ξ 1 b2 ξ 1 ,
b1 σ → b1 ξ 1 , b2 σ → b2 ξ 1 ,
b1 x → y1 , b2 x → y2 .


Then B is deterministic, totally defined and nondeleting, but it is not linear.


Take the tree p = σ(σ(σ(x))) and the word w = y1 y2 . Moreover, consider the derivation

b0 p ⇒B b1 σ(σ(x))b2 σ(σ(x)) ⇒B b1 σ(x)b2 σ(σ(x)) ⇒B b1 xb2 σ(σ(x)) ⇒B


y1 b2 σ(σ(x)) ⇒B y1 b2 σ(x) ⇒B y1 b2 x ⇒B y1 y2 = w,

i.e., τB (p) = yd(τA (p)), where A is the R-transducer of Example 4.1.6. One can easily
show that the previous equality holds for every p ∈ FΣ ({x}). ✷

The above relation generally holds between GSDTs and R-transducers as it is shown
by

Theorem 4.5.4 For each GSDT A = (Σ, X, A, Y, P, A′ ) there exist a ranked alphabet Ω
and an R-transducer B = (Σ, X, A, Ω, Y, P ′ , A′ ) such that τA = {(p, yd(q)) | (p, q) ∈ τB }.
Moreover, if A is linear, deterministic, nondeleting or a GSDH-transducer, then B
can also be chosen, correspondingly, as a linear, deterministic, nondeleting or an RH-
transducer.
Conversely, for every R-transducer B there exists a GSDT A such that {(p, yd(q)) |
(p, q) ∈ τB } = τA . If B is, respectively linear, deterministic, nondeleting or an RH-
transducer, then A is linear, deterministic, nondeleting or a GSDH-translator.

Proof. Let A = (Σ, X, A, Y, P, A′ ) be a GSDT. To define B, for each production ap → w


(a ∈ A, p ∈ X ∪ Σ, w ∈ (Y ∪ AΞ)∗ ) in P , let ω(ap,w) be an operator with rank |w|. Let
Ω be the resulting ranked alphabet. Moreover, P ′ is defined as follows:
(i) If ap → w (a ∈ A, p ∈ X ∪ Σ0 , w ∈ Y ∗ ) is in P and |w| = k, then the production
ap → ω(ap,w) (q1 , . . . , qk ) (qi ∈ Y, i = 1, . . . , k) with yd(ω(ap,w) (q1 , . . . , qk )) = w is in
P ′.

(ii) If aσ → w (a ∈ A, σ ∈ Σm , m > 0, w ∈ (Y ∪ AΞm )∗ ) is in P with |w| = k, then


the production aσ → ω(aσ,w) (q1 , . . . , qk ) (qi ∈ Y ∪ AΞm , i = 1, . . . , k) satisfying
yd(ω(aσ,w) (q1 , . . . , qk )) = w is in P ′ , where yd is taken over the frontier alphabet
Y ∪ AΞm .
In order to prove τA = {(p, yd(q)) | (p, q) ∈ τB } it is enough to show that, for all
a ∈ A, p ∈ FΣ (X) and w ∈ Y ∗ , the equivalence

ap ⇒∗A w ⇐⇒ (∃q ∈ FΩ (Y ))(ap ⇒∗B q ∧ yd(q) = w)

holds. This can be done in an obvious way by induction on hg(p).


It is also obvious from the construction of B that the remaining conclusions of the
first part of Theorem 4.5.4 hold, too.
Conversely, consider an R-transducer B = (Σ, X, B, Ω, Y, P ′ , B ′ ). The productions of
the desired GSDT A = (Σ, X, B, Y, P, B ′ ) are given as follows:
(I) For all b ∈ B, p ∈ X ∪ Σ0 and q ∈ FΩ (Y ), if bp → q is in P ′ , then bp → yd(q) is in
P.


(II) For all b ∈ B, σ ∈ Σm (m > 0) and q ∈ FΩ (Y ∪ BΞm), if bσ → q is in P ′ , then


bσ → yd(q) is in P , where yd is again taken over the alphabet Y ∪ BΞm.
To prove τA = {(p, yd(q)) | (p, q) ∈ τB } it is enough to show that the equivalence

bp ⇒∗A w ⇐⇒ (∃q ∈ FΩ (Y ))(bp ⇒∗B q ∧ yd(q) = w)

holds for arbitrary b ∈ B, p ∈ FΣ (X) and w ∈ Y ∗ . This can be carried out by induction
on hg(p). Moreover, the remaining conclusions of the second part of Theorem 4.5.4 are
obviously valid. ✷

4.6 SURFACE FORESTS


The images of regular forests under tree transformations are called surface forests. In
this section we compare classes of surface forests belonging to different classes of tree
transformations.

Definition 4.6.1 Let K be a class of tree transformations. A forest S ⊆ FΩ (Y ) is


called a K-surface forest if there exist a ranked alphabet Σ, a frontier alphabet X, a
forest R ∈ Rec(Σ, X), and a K-transformation τ ⊆ FΣ (X) × FΩ (Y ) such that S = Rτ .
The class of all K-surface forests is denoted by Surf(K).

The following lemma is obvious.

Lemma 4.6.2 If K is a class of tree transformations which contains all identity trans-
formations, then Rec is included as a subclass in Surf(K). ✷

Of course, this lemma applies to all of the classes of tree transformations which we
have considered (F, R, LF , H etc.).
Next we characterize F-transformations preserving regularity. For this we should in-
troduce some more terminology.

Definition 4.6.3 A tree transformation τ ⊆ FΣ (X) × FΩ (Y ) is said to preserve regu-


larity if Rτ ∈ Rec(Ω, Y ) whenever R ∈ Rec(Σ, X). Moreover, a class K of tree transfor-
mations preserves regularity if every τ in K preserves regularity.

We say that an F-transducer A = (Σ, X, A, Ω, Y, P, A′ ) is connected if for each a ∈ A


there are p ∈ FΣ (X) and q ∈ FΩ (Y ) such that p ⇒∗ aq holds.

Definition 4.6.4 For each p ∈ FΣ (X ∪Ξn ), pathi (p) (1 ≤ i ≤ n) is given in the following
way:
(i) if p ∈ Σ0 ∪ X, then pathi (p) = ∅,

(ii) if p = ξi , then pathi (p) = {e},

(iii) if p = ξj (j ≠ i), then pathi (p) = ∅,


(iv) if p = σ(p1 , . . . , pm ) (σ ∈ Σm , m > 0), then

pathi (p) = {jwj | wj ∈ pathi (pj ), j = 1, . . . , m}.

Thus, pathi (p) is a language over the alphabet {1, . . . , m}, where m is the maximal
integer with Σm ≠ ∅.
Obviously, the elements of pathi (p) describe paths leading from the root of p to a leaf
labelled by ξi . 
If pathi (p) consists of a single word, then l(pathi (p)) denotes the length of this word.
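For instance, if Σ = Σ2 = {σ}, X = {x} and p = σ(ξ1 , σ(x, ξ1 )), then path1 (p) = {1, 22}; since this set consists of more than one word, l(path1 (p)) is not defined in this case.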

Lemma 4.6.5 LF preserves regularity.

Proof. Since the F-transducer given in the proof of Lemma 4.1.11 is linear, by Theorem
4.3.6 (i), it is enough to show that for each LF-transducer A = (Σ, X, A, Ω, Y, P, A′ ),
range(τA ) is regular. Without loss of generality, we may assume that A is connected.
Consider the regular ΩY -grammar G = (A, Ω, Y, P ′ , A′ ), where P ′ is given as follows:

(i) if x → aq (x ∈ X, a ∈ A, q ∈ FΩ (Y )) is in P , then a → q is in P ′ ,

(ii) if σ(a1 , . . . , am ) → aq (σ ∈ Σm , m ≥ 0, a1 , . . . , am , a ∈ A, q ∈ FΩ (Y ∪ Ξm )) is in P ,
then a → q(a1 , . . . , am ) is in P ′ .
In order to prove the lemma it is enough to show that the equivalence

a ⇒∗G q ⇐⇒ ∃p ∈ FΣ (X) (p ⇒∗A aq) (1)

holds for all a ∈ A and q ∈ FΩ (Y ).


(I) First we prove that the left side of (1) implies its right side. For this, assume that
a ⇒∗G q is valid. We shall proceed by induction on the length l of a ⇒∗G q.
Let l = 1. Then a → q is in P ′ , and the following two cases are possible:

(Ia) There is a production x → aq (x ∈ X, a ∈ A, q ∈ FΩ (Y )).
(Ib) There is a production σ(a1 , . . . , am ) → aq (σ ∈ Σm , m ≥ 0, a1 , . . . , am , a ∈ A)
such that in q no auxiliary variables occur, i.e., q ∈ FΩ (Y ).
In case (Ia) take p = x.
In case (Ib), since A is connected, there are pi ∈ FΣ (X) and qi ∈ FΩ (Y ) (i =
1, . . . , m) such that pi ⇒∗A ai qi hold. Now taking p = σ(p1 , . . . , pm ) we have
p = σ(p1 , . . . , pm ) ⇒∗A σ(a1 q1 , . . . , am qm ) ⇒A aq(q1 , . . . , qm ) = aq.
Next, assume that l > 1 and that our statement has been proved for derivations of
length less than l. Then a ⇒∗G q can be written in the form a ⇒G q(a1 , . . . , am ) ⇒∗G
q(q1 , . . . , qm ) = q, where σ(a1 , . . . , am ) → aq is in P for some σ ∈ Σm (m > 0) and
ai ⇒∗G qi (1 ≤ i ≤ m) if ξi occurs in q. By the induction hypothesis, for all such i
there exists a pi ∈ FΣ (X) with pi ⇒∗A ai qi . In the remaining cases, i.e., if ξi does
not occur in q, let pi ∈ FΣ (X) and qi ∈ FΩ (Y ) (1 ≤ i ≤ m) be arbitrary such that
pi ⇒∗A ai qi . Then p = σ(p1 , . . . , pm ) satisfies p ⇒∗A aq.


(II) Assume that p ⇒∗A aq holds. We shall show by induction on hg(p) that the left
side of (1) is also valid. If hg(p) = 0, then, by the choice of P ′ , the right side of
(1) obviously implies its left side.
Now let p = σ(p1 , . . . , pm ) (σ ∈ Σm , m > 0), and assume that our statement
has been proved for all trees from FΣ (X) with height less than hg(p). Moreover,
let us write p ⇒∗A aq in the form p ⇒∗A σ(a1 q1 , . . . , am qm ) ⇒A aq(q1 , . . . , qm ),
where σ(a1 , . . . , am ) → aq is in P and pi ⇒∗A ai qi (i = 1, . . . , m). Then, by the
definition of P ′ and the induction hypothesis, we have a ⇒G q(a1 , . . . , am ) ⇒∗G
q(q1 , . . . , qm ) = q. ✷

From Lemma 4.6.5, using Theorems 4.2.7 and 4.4.9, respectively, we get the following
results.

Corollary 4.6.6 LR preserves regularity. ✷

Corollary 4.6.7 LRR preserves regularity. ✷

A state a ∈ A of an F-transducer A = (Σ, X, A, Ω, Y, P, A′ ) is nondeleting if there exist


two trees p ∈ F̂Σ (X ∪ Ξ1 ) and q ∈ FΩ (Y ∪ Ξ1 ) such that p(aξ1 ) ⇒∗ a′ q for some a′ ∈ A′
and ξ1 occurs in q. Otherwise a is deleting. The state a is copying if there are two trees
p ∈ F̂Σ (X ∪ Ξ1 ) and q ∈ FΩ (Y ∪ Ξ1 ) such that p(aξ1 ) ⇒∗ a′ q for some a′ ∈ A′ and ξ1
occurs at least twice in q.
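For instance, consider the F-transducer A = (Σ, {x}, {a, a′ }, Ω, {y}, P, {a′ }) with Σ = Σ1 = {σ}, Ω = Ω2 = {ω} and P = {x → ay, σ(a) → a′ ω(ξ1 , ξ1 )}. Here the state a is copying: for p = σ(ξ1 ) we have p(aξ1 ) = σ(aξ1 ) ⇒ a′ ω(ξ1 , ξ1 ), and ξ1 occurs twice in ω(ξ1 , ξ1 ). Note that range(τA(a) ) = {y} is finite, in accordance with the next lemma.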

Lemma 4.6.8 Let A = (Σ, X, A, Ω, Y, P, A′ ) be a connected F-transducer. If τA pre-


serves regularity and a ∈ A is copying, then range(τA(a) ) is finite.

Proof. Assume that τA preserves regularity. Let a ∈ A be a copying state, and take two
trees p ∈ F̂Σ (X ∪ Ξ1 ) and q ∈ F̂Ω (Y ∪ Ξn ) such that p(aξ1 ) ⇒∗ a′ q(ξ1n ) where a′ ∈ A′
and n > 1. Suppose that range(τA(a) ) is infinite. Then there is an s ∈ range(τA(a) ) with
hg(s) > k · |A|, where k is the maximum of the heights of the right-hand sides of the
productions in P . Let r ∈ FΣ (X) be a tree such that r ⇒∗ as. Since hg(s) > k · |A|,
there are trees r1 , r2 ∈ F̂Σ (X ∪ Ξ1 ) and r3 ∈ FΣ (X) such that the following conditions
are satisfied:

(i) r1 r2 (r3 ) = r,

(ii) r3 ⇒∗ bs3 , r2 (bξ1 ) ⇒∗ bs2 and r1 (bξ1 ) ⇒∗ as1 for some b ∈ A, s1 , s2 ∈ FΩ (Y ∪ Ξ1 )


and s3 ∈ FΩ (Y ),

(iii) hg(s2 ) > 0, and ξ1 occurs in s1 and s2 ,

(iv) s1 (s2 (s3 )) = s.



Therefore, for each i (= 0, 1, . . .), there is a derivation pi = p(r1 (r2i (r3 ))) ⇒∗ a′ q(tni ) = qi ,
where ti = s1 (si2 (s3 )) (the powers ti of any tree t ∈ FΣ (X ∪ Ξ1 ) are defined thus: t0 = ξ1 ,
and ti+1 = t(ti ) for each i ≥ 0). Obviously, hg(qi ) increases with i when i is large enough.
Now consider the forest T = {pi | i = 0, 1, . . .}. Obviously, T is regular. Since τA
preserves regularity, this implies that T ′ = T τA is also regular. Take an ΩY -recognizer
B = (B, Ω, Y, β, B ′ ) with T ′ = T (B). Choose an
 
i ≥ 2k(hg(p(r)) + 1) + 2|B|k(hg(p(r)) + 1).

Then there exists a tree t ∈ FΩ (Y ) with k(hg(p(r)) + 1) + |B| ≤ hg(t) < k(hg(p(r)) + 1) + 2|B| such that
q = q(t, tin−1 ) (2)
is also in T ′ . To prove the lemma it is enough to show that there exist no j and a′′ ∈ A′
such that pj ⇒∗ a′′ q. Suppose

pj ⇒∗ a′′ q ′ (t′m ) = a′′ q

holds, where a′′ ∈ A′ , q ′ ∈ F̂Ω (Y ∪ Ξm ), r3 ⇒∗ b1 s′1 , r2 (bl ξ1 ) ⇒∗ bl+1 s′l+1 (b1 , bl+1 ∈ A, s′l+1 ∈ FΩ (Y ∪ Ξ1 ), l = 1, . . . , j, s′1 ∈ FΩ (Y )), r1 (bj+1 ξ1 ) ⇒∗ bj+2 s′j+2 (bj+2 ∈ A, s′j+2 ∈ FΩ (Y ∪ Ξ1 )), p(bj+2 ξ1 ) ⇒∗ a′′ s′j+3 = a′′ q ′ and t′ = s′j+2 (s′j+1 (. . . (s′1 ) . . .)).
By the choice of i, there exists a u (2 ≤ u ≤ j + 3) such that ξ1 occurs in s′u , s′u+1 , . . . , s′j+3 but ξ1 does not occur in s′u−1 . Moreover, let u − 1 ≤ u1 < . . . < uv ≤ j + 3 be a maximal sequence with 1 ≤ hg(su1 ) < . . . < hg(suv ), where sl = s′l (s′l−1 (. . . (s′1 ) . . .)) (l = 1, . . . , j + 3). Then v ≥ 2k(hg(p(r)) + 1) + 2|B|. Taking into consideration that hg(t) ≥ k(hg(p(r)) + 1) + |B| (and |B| ≥ 1), for an l (2 ≤ l ≤ v), the word w forming path1 (q) is a subword of a word in path1 (s′j+3 (s′j+2 (. . . (s′ul ) . . .))). (Informally speaking, this means that there is a word in path1 (s′j+3 (s′j+2 (. . . (s′ul ) . . .))) going through the root of t.) Therefore, we have l(path1 (q)) + hg(t) ≥ 2k(hg(p(r)) + 1) + 2|B|. But, by (2) and the choice of t, l(path1 (q)) + hg(t) < 2k(hg(p(r)) + 1) + 2|B|, which is
a contradiction. ✷

Lemma 4.6.9 Let A = (Σ, X, A, Ω, Y, P, A′ ) be a connected F-transducer such that for


every copying state a ∈ A, range(τA(a) ) is finite. Then A is equivalent to a linear F-
transducer.

Proof. Suppose that a1 , . . . , ak areS all the copying states of A. Let Ti = range(τA(ai ) )
(i = 1, . . . , k). Moreover, set T = (Ti | i = 1, . . . , k). By our assumptions, T is finite.
Define an F-transducer B = (Σ, X, B, Ω, Y, P ′ , B ′ ), where
[
B = (A − {ai | i = 1, . . . , k}) ∪ ({ai } × Ti | i = 1, . . . , k)

and
B ′ = (A′ ∪ A′ × T ) ∩ B.
Moreover, P ′ is given as follows:


(i) If p → aq (p ∈ Σ0 ∪ X) is in P and a = ai for some i (1 ≤ i ≤ k), then p → (a, q)q


is in P ′ . If a 6∈ {a1 , . . . , ak }, then p → aq itself is in P ′ .

(ii) Let
σ(b1 , . . . , bm ) → aq(ξ1 , . . . , ξm )
(σ ∈ Σm , m > 0, b1 , . . . , bm , a ∈ A, q ∈ FΩ (Y ∪ Ξm )) be in P . We distinguish the
following cases:
(iia) The state a is deleting. Fix any q ∈ FΩ (Y ∪ Ξm ) such that every ξi occurs
at most once in q. Then P ′ contains every linear production σ(c1 , . . . , cm ) →
aq(ξ1 , . . . , ξm ) such that

cj = (bj , qj ) (qj ∈ Tl ) if bj is copying and bj = al , and cj = bj otherwise.

(iib) The state a is nondeleting but not copying. Then all productions

σ(c1 , . . . , cm ) → aq(η1 , . . . , ηm )

are in P ′ where for each j(= 1, . . . , m),



cj = (bj , qj ) (qj ∈ Tl ) if bj is copying and bj = al , and cj = bj otherwise,

and ηj = ξj if ξj occurs at most once in q, and ηj = qj (= π2 (cj )) otherwise.
(Observe that if ξj occurs at least twice in q, then bj is copying.)
(iic) The state a is copying. Then P ′ contains all productions

σ(c1 , . . . , cm ) → (a, q)q

where q = q(η1 , . . . , ηm ) and for each j(= 1, . . . , m),



cj = (bj , qj ) (qj ∈ Tl ) if bj is copying and bj = al , and cj = bj otherwise,

and ηj = qj if ξj occurs in q, and ηj is any fixed tree from FΩ (Y ) otherwise.

(Note that bj is copying if ξj occurs in q.)

This ends the construction of P ′ . Obviously, B is an LF-transducer.


We show that A is equivalent to B.

(I) Assume that p ⇒∗A aq (p ∈ FΣ (X), q ∈ FΩ (Y ), a ∈ A) holds. We prove that


(Ia) p ⇒∗B aq if a is nondeleting but not copying,


(Ib) p ⇒∗B (a, q)q if a is copying,
(Ic) p ⇒∗B aq for some q ∈ FΩ (Y ) if a is deleting.
We shall proceed by induction on hg(p). If hg(p) = 0 then, by (i), (Ia), (Ib) and
(Ic) obviously hold.
Next let p = σ(p1 , . . . , pm ) (σ ∈ Σm , m > 0), and write p ⇒∗A aq in the more
detailed form

σ(p1 , . . . , pm ) ⇒∗A σ(b1 q1′ , . . . , bm qm′ ) ⇒A aq ′ (q1′ , . . . , qm′ ) = aq

where σ(b1 , . . . , bm ) → aq ′ is in P and for each j (1 ≤ j ≤ m), pj ⇒∗A bj qj′ . Then,


by the induction hypothesis, for all j(= 1, . . . , m), we have pj ⇒∗B cj qj , where
(Ia′ ) cj = bj and qj = qj′ if bj is nondeleting and not copying,
(Ib′ ) cj = (bj , qj ) and qj = qj′ if bj is copying,
(Ic′ ) cj = bj and qj = q j for some q j ∈ FΩ (Y ) if bj is deleting.
Therefore:
(Ia′′ ) If a is nondeleting but not copying, then the production

σ(c1 , . . . , cm ) → aq ′ (η1 , . . . , ηm )

is in P ′ , where ηj (j = 1, . . . , m) is given by (iib).


(Ib′′ ) If a is copying then the production

σ(c1 , . . . , cm ) → (a, q ′ )q ′

with q ′ = q ′ (η1 , . . . , ηm ) is in P ′ , where ηj (j = 1, . . . , m) is given by (iic).


(Ic′′ ) If a is deleting then the production

σ(c1 , . . . , cm ) → aq ′

given by (iia) is in P ′ .
Thus, in all three cases the required derivations in B exist.

(II) Assume that one of the following relations hold:


(IIa) p ⇒∗B aq or
(IIb) p ⇒∗B (a, q)q
where p ∈ FΣ (X), q ∈ FΩ (Y ) and a ∈ A.
Then, by reversing the above computation, one can show that the desired deriva-
tions


(IIc) p ⇒∗A aq if a is nondeleting,


(IId) p ⇒∗A aq for some q ∈ FΩ (Y ) if a is deleting
exist. Since the final states are nondeleting, this ends the proof of the lemma. ✷

We can now state and prove

Theorem 4.6.10 Let A = (Σ, X, A, Ω, Y, P, A′ ) be an arbitrary F-transducer. Then τA


preserves regularity iff A is equivalent to an LF-transducer.

Proof. If A is equivalent to an LF-transducer then, by Lemma 4.6.5, τA preserves


regularity.
Conversely, let τA preserve regularity. We may assume that A is connected. Then by
Lemmas 4.6.8 and 4.6.9, A is equivalent to an LF-transducer. ✷

From Example 2.4.15, we directly obtain

Theorem 4.6.11 Neither F nor R preserves regularity. ✷

The following result shows that Surf(F) ⊂ Surf(R). More precisely, we have

Theorem 4.6.12 Surf(F) = Surf(H) and Surf(H) is a proper subclass of Surf(R).

Proof. The first statement of Theorem 4.6.12 follows from Theorem 4.3.3 and Lemma
4.6.5.
It is obvious that Surf(H) ⊆ Surf(R). We show that the inclusion is proper. For this,
consider Example 4.1.6. Moreover, let S = {ω2 (ω1n (y1 ), ω1n (y2 )) | n = 0, 1, . . .}. If R
denotes the regular forest {σ(x)} ·x {σ(x)}∗x , then RτA = S. Therefore, S ∈ Surf(R).
Assume that for an HR-transducer B = (∆, Z, {b0 }, Ω, Y, P ′ , b0 ) and regular forest
T ⊆ F∆ (Z), we have S = T τB. Then B can be chosen linear since in the opposite case
in T τB there is a tree with at least two occurrences of a subtree. Therefore, by Theorem
2.4.16, S is regular. But one can show similarly as in Example 2.4.15 that S is not
regular. ✷

Next we show some closure properties of surface forests which will be needed also in
Section 4.7.

Theorem 4.6.13 Let S ∈ Surf(F) and let T be a recognizable forest. Then S ∩ T ∈


Surf(F).

Proof. Let τ1 ⊆ FΣ (X) × FΩ (Y ) be an F-transformation and S = Rτ1 where R ∈


Rec(Σ, X). Take an arbitrary regular forest T ⊆ FΩ (Y ). Denote by τ2 ⊆ FΩ (Y )× FΩ (Y )
the DF-transformation given in the proof of Lemma 4.1.11 which corresponds to T . Then
Rτ1 ◦ τ2 = S ∩ T . But, by Theorem 4.3.9, τ1 ◦ τ2 is an F-transformation. ✷

For R-surface forests we have a similar result.


Theorem 4.6.14 The intersection of an R-surface forest with a regular forest is again
an R-surface forest.

Proof. The proof is similar to that of the previous theorem, but now we shall use the fact
that the transformation given in the proof of Lemma 4.1.11 is an LNR-transformation.
Moreover, by Theorem 4.3.15, the composition of an R-transformation by an LNR-
transformation is again an R-transformation. ✷

By Theorem 4.3.7, DF is closed under composition. Therefore, Surf(DF ) is closed


under DF-transformations. Although DR is not closed under composition, we shall
show that Surf(DR) is closed under DR-transformations. For this, we need

Theorem 4.6.15 Let A = (Σ, X, A, Ω, Y, P, a0 ) and B = (Ω, Y, B, ∆, Z, P ′ , b0 ) be any


DR-transducers. Then there exists a DR-transducer C = (Σ, X, C, ∆, Z, P ′′ , c0 ) such that
for every R ⊆ FΣ (X), SτC = RτA ◦ τB, where S = R ∩ dom(τA ◦ τB).

Proof. Let C = A × B and c0 = (a0 , b0 ). We want to define P ′′ in such a way that
whenever ap ⇒∗A q (a ∈ A, p ∈ FΣ (X), q ∈ FΩ (Y )) and bq ⇒∗B r (b ∈ B, r ∈ F∆ (Z))
hold, then (a, b)p ⇒∗C r. If p ∈ Σ0 ∪ X, then (ap, q) ∈ P . If we put the production
(a, b)p → r in P ′′ , C will have the desired property for these a, b, p, q and r.
Now let p = σ(p1 , . . . , pm ) (σ ∈ Σm , m > 0) and suppose

ap = aσ(p1 , . . . , pm ) ⇒A q(. . . , aij pi , . . .) ⇒∗A q(. . . , qij , . . .) = q,

where (aσ, q(. . . , aij ξi , . . .)) ∈ P (q ∈ F̂Ω (Y ∪ Ξn ) for some n) and aij pi ⇒∗A qij , i.e.,
the considered copy of pi is translated by A starting in state aij into qij . Furthermore,
suppose that applying to q the transducer B starting in b, we get

bq = bq(. . . , qij , . . .) ⇒∗B r(. . . , bij1 qij , . . . , bijk qij , . . .) ⇒∗B r(. . . , rij1 , . . . , rijk , . . .) = r

(bq ⇒∗B r, r ∈ F∆ (Z ∪ Ξn ), bijl qij ⇒∗B rijl , l = 1, . . . , k)


(meaning that the given occurrence of qij in q has k translations by B starting the
translations in states bij1 , . . . , bijk ). Thus, if we have the production

(a, b)σ → r(. . . , (aij , bij1 )ξi , . . . , (aij , bijk )ξi , . . .)

in P ′′ and suppose that C has the required property for trees with height less than hg(p),
then (a, b)p ⇒∗C r also holds. Accordingly, the formal definition of P ′′ reads as follows:

(i) The production (a, b)x → r ((a, b) ∈ C, x ∈ X, r ∈ F∆ (Z)) is in P ′′ if there is an
ax → q in P such that bq ⇒∗B r.

(ii) If the production aσ → q(a1 ξ1n1 , . . . , am ξmnm ) (a ∈ A, σ ∈ Σm , m ≥ 0, ai ∈ Ani , i = 1, . . . , m, n1 + . . . + nm = n, q ∈ F̂Ω (Y ∪ Ξn )) is in P and

bq ⇒∗B r(b11 ξ11n′11 , . . . , b1n1 ξ1n1n′1n1 , . . . , bm1 ξm1n′m1 , . . . , bmnm ξmnmn′mnm )

(bij ∈ B n′ij , ξij = ξn1 +...+ni−1 +j , i = 1, . . . , m, j = 1, . . . , ni , n′11 + . . . + n′mnm = n′ , r ∈ F̂∆ (Z ∪ Ξn′ )) holds, then the production

(a, b)σ → r((a11n′11 b11 , . . . , a1n1n′1n1 b1n1 )ξ1k1 , . . . , (am1n′m1 bm1 , . . . , amnmn′mnm bmnm )ξmkm )

is in P ′′ , where ki = n′i1 + . . . + n′ini (i = 1, . . . , m).

Obviously, C is a DR-transducer. Moreover, to prove the theorem it is enough to show


that for arbitrary (a, b) ∈ C, p ∈ FΣ (X), q ∈ FΩ (Y ) and r ∈ F∆ (Z), ap ⇒∗A q and
bq ⇒∗B r jointly imply (a, b)p ⇒∗C r. This can be proved by induction on hg(p). ✷

Let us note that the C constructed above may delete certain subtrees of input trees so
that dom(τC ) becomes larger than dom(τA ◦ τB ).
If R in Theorem 4.6.15 is regular then, by Corollary 4.3.17 and Theorem 2.4.2, S is
also regular. Thus we have

Corollary 4.6.16 Surf(DR) is closed under DR-transformations. ✷

4.7 AUXILIARY CONCEPTS AND RESULTS


In Section 4.3 it has been shown that neither F nor R is closed under composition.
In the next section we shall prove that compositions of n F-transformations or n R-
transformations lead to proper hierarchies when n assumes the values 0, 1, 2, . . ..
The purpose of this section is to introduce concepts and present results needed in
Section 4.8.
Let K be a class of forests and S a class of tree transformations. Then S(K) denotes
the class {T τ | T ∈ K, τ ∈ S}. Moreover, yd S(K) will stand for {yd(T ) | T ∈ S(K)}.

Definition 4.7.1 Let Σ be a ranked alphabet and X an alphabet. Let f be a mapping


which associates with each d ∈ Σ ∪ X a nonvoid recognizable forest Td ⊆ FΩ(d) (Ξ1 )
where Ω(d) is a ranked alphabet consisting of unary operational symbols only. It is also
supposed that Ω(d) is disjoint with Σ.
Now define the mapping f̄ from the set of all ΣX-forests into the set of subsets of
FΣ∪Ω (X) (Ω = ∪(Ω(d) | d ∈ Σ ∪ X)) in the following way:

(i) if p ∈ Σ0 ∪ X, then f̄ (p) = {q(p) | q ∈ Tp },

(ii) if p = σ(p1 , . . . , pm ) (σ ∈ Σm , m > 0, p1 , . . . , pm ∈ FΣ (X)), then

f̄ (p) = {q(σ(q1′ , . . . , qm′ )) | q ∈ Tσ , qi′ ∈ f̄ (pi ), i = 1, . . . , m}, and

(iii) if T ⊆ FΣ (X), then f̄ (T ) = ∪(f̄ (p) | p ∈ T ).


The mapping f̄ is called a regular insertion.

In the sequel we shall write simply f for f̄ .
The above regular insertion can be interpreted as follows: f inserts directly below each
node of a tree p ∈ FΣ (X) a unary tree from the regular forest Td if the label of the node
in question is d. The insertion of ξ1 means that the given node is unchanged. The name
“regular insertion” is more expressive if trees are given in Polish prefix form. In this
case f inserts a word from Td directly before an occurrence of d in the word p.
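For instance, let Σ2 = {σ}, Σ0 = {γ}, X = {x}, and let f be the regular insertion given by Tσ = Tγ = {ξ1 } (so that nothing is inserted at nodes labelled by σ or γ) and Tx = {ω(ξ1 )}∗ξ1 , where Ω(x) = {ω} consists of one unary symbol. Then f (σ(x, γ)) = {σ(x, γ), σ(ω(x), γ), σ(ω(ω(x)), γ), . . .}.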

Lemma 4.7.2 Rec is closed under regular insertion.

Proof. Let T ⊆ FΣ (X) be a regular forest and f a regular insertion given by f (d) = Td
(d ∈ Σ ∪ X, Td ⊆ FΩ (Ξ1 )). Consider a regular tree grammar G = (N, Σ, X, P, a0 )
given in normal form such that T (G) = T . Moreover, for every Td (d ∈ Σ ∪ X) let
Gd = (N d , Ω, Ξ1 , P d , ad0 ) be a regular tree grammar in normal form generating Td . For
each d ∈ Σ ∪ X and a ∈ N consider the tree grammar Gda = (Nad , Ω, Ξ1 , Pad , (ad0 , a)),
where Nad = N d × {a} and

Pad = {(ad , a) → ω((bd , a)) | ad → ω(bd ) ∈ P d } ∪ {(ad , a) → ξ1 | ad → ξ1 ∈ P d }.

Obviously, T (Gda ) = Td holds for each d ∈ Σ ∪ X and a ∈ N .


Assume that the sets of nonterminal symbols of the grammars Gd (d ∈ Σ ∪ X) are
pairwise disjoint and also disjoint with N and N × (Σ ∪ X). Construct the tree grammar
G′ = (N ′ , Σ ∪ Ω, X, P ′ , a0 ), where N ′ = ∪(Nad | d ∈ Σ ∪ X, a ∈ N ) ∪ N ∪ N × (Σ ∪ X)
and P ′ is given as follows:

P ′ = {a → (ad0 , a) | a ∈ N, d ∈ Σ ∪ X}
∪ (∪(Pad − {(ad , a) → ξ1 | ad ∈ N d } | a ∈ N, d ∈ Σ ∪ X))
∪ {(ad , a) → (a, d) | ad → ξ1 ∈ Pad , ad ∈ N d , a ∈ N, d ∈ Σ ∪ X}
∪ {(a, σ) → σ(a1 , . . . , am ) | a → σ(a1 , . . . , am ) ∈ P, σ ∈ Σm , m > 0, a, a1 , . . . , am ∈ N }
∪ {(a, d) → d | a → d ∈ P, a ∈ N, d ∈ Σ0 ∪ X}.

From the construction of G′ it is obvious that the following statements are valid:

(ia) For any production a → σ(a1 , . . . , am ) ∈ P (σ ∈ Σm , m > 0) and tree q ∈ Tσ there


exists a derivation in G′

a ⇒ (aσ0 , a) ⇒∗ q((aσ , a)) ⇒ q((a, σ)) ⇒ q(σ(a1 , . . . , am )) (aσ ∈ N σ ).

(ib) For any production a → d ∈ P (d ∈ Σ0 ∪ X) and tree q ∈ Td there exists a


derivation in G′ , a ⇒ (ad0 , a) ⇒∗ q((ad , a)) ⇒ q((a, d)) ⇒ d (ad ∈ N d ).

Conversely,

(ii) for any a ∈ N and p ∈ FΣ∪Ω (X) each derivation a ⇒∗G′ p should have the form


(iia) a ⇒ (aσ0 , a) ⇒ q1 ((aσ1 , a)) ⇒ . . . ⇒ qn ((aσn , a)) ⇒ qn ((a, σ)) ⇒ qn (σ(a1 , . . . , am )) ⇒∗ p for some a → σ(a1 , . . . , am ) ∈ P, qn ∈ Tσ , σ ∈ Σm , m > 0 and aσ0 , . . . , aσn ∈ N σ , or the form

(iib) a ⇒ (ad0 , a) ⇒ q1 ((ad1 , a)) ⇒ . . . ⇒ qn ((adn , a)) ⇒ qn ((a, d)) ⇒ qn (d) for some a → d ∈ P, qn ∈ Td , d ∈ Σ0 ∪ X and ad0 , . . . , adn ∈ N d .
Properties (ia), (ib), and (ii) obviously imply that T (G′ ) = f (T ). ✷

Lemma 4.7.3 Let K be a class of forests closed under regular insertion. Then R(K)
is also closed under regular insertion.

Proof. Let R ∈ K be an arbitrary ΣX-forest and take an R-transducer A =


(Σ, X, A, Ω, Y, P, A′ ). Set S = RτA . Moreover, for every d ∈ Σ ∪ X take a unary
operator #d , and let f be the regular insertion given by f (d) = {#d (ξ1 )}∗ξ1 .
First we shall show that if g is a regular insertion for which g(d) = {#(ξ1 )}∗ξ1 (d ∈
Ω ∪ Y ), then g(S) ∈ R(K).
Construct the R-transducer B = (Σ ∪ {#d | d ∈ Σ ∪ X}, X, B, Ω ∪ {#}, Y, P ′ , A′ ) with
B = A ∪ C, where C = {p | p ∈ ∪(sub(q) | q is the right-hand side of a rule in P ) − Ξ}.
Moreover, P ′ is the union of the following ten sets of productions:

P1 ={a#d → #(aξ1 ), a#d → aξ1 | a ∈ A, d ∈ Σ ∪ X},

P2 ={a#d → ω(q 1 ξ1 , . . . , q m ξ1 ) | ad → q is in P for some


d ∈ Σ ∪ X, a ∈ A, q = ω(q1 , . . . , qm ), ω ∈ Ωm , m > 0},

P3 ={a#d → qξ1 | ad → q is in P for some d ∈ Σ, q = a′ ξi , a, a′ ∈ A},

P4 ={a#d → qξ1 | ad → q is in P for some d ∈ Σ ∪ X, a ∈ A, q = ω ∈ Ω0 },

P5 ={a#d → qξ1 | ad → q is in P for some d ∈ Σ ∪ X, a ∈ A, q = y ∈ Y },

P6 ={q#d → #(qξ1 ), q#d → ω(q 1 ξ1 , . . . , q m ξ1 ), q#d → qξ1 |


q = ω(q1 , . . . , qm ), ω ∈ Ωm , m > 0},

P7 ={aξi #d → #(aξi ξ1 ), aξi #d → aξi ξ1 , ω#d → #(ωξ1 ), ω#d → ωξ1 ,


y#d → #(yξ1 ), y#d → yξ1 | 1 ≤ i ≤ r(P ), r(P ) is the maximum of ranks
of the operators appearing in the left-hand sides of productions from P,
a ∈ A, ω ∈ Ω0 , y ∈ Y },

P8 ={aξi σ → aξi | a ∈ A, σ ∈ Σm , m > 0, 1 ≤ i ≤ m},

P9 ={ωd → ω | ω ∈ Ω0 , d ∈ Σ ∪ X} and

P10 ={yd → y | y ∈ Y, d ∈ Σ ∪ X}.


One can easily see that B works as follows: assume that for some a ∈ A, p ∈ FΣ (X)
and q ∈ FΩ (Y ) a derivation ap ⇒∗A q exists. Let q ′ be a tree obtained by inserting
in q arbitrary trees from {#(ξ1 )}∗ξ1 below symbols from Ω ∪ Y . Then for a p′ ∈ f (p),
ap′ ⇒∗B q ′ holds. Conversely, if for some a ∈ A, p ∈ FΣ (X), p′ ∈ f (p) and q ′ ∈ FΩ∪{#} (Y )
a derivation ap′ ⇒∗B q ′ holds then there is a q ∈ FΩ (Y ) such that q ′ ∈ g(q) and ap ⇒∗A q.
Now, consider an arbitrary regular insertion h (into ΩY -trees). For each d ∈ Ω ∪ Y ,
there is a regular tree grammar Gd = (Nd , Ω(d), Ξ1 , Pd , {ad0 }) such that h(d) = T (Gd ).
We may assume that every Gd is in normal form. Since Ω(d) is unary, this means that
the productions of Gd are of the form ad → ωd (a′d ) or ad → ξ1 (ad , a′d ∈ Nd , ωd ∈ Ω(d)).
Furthermore we may assume that the sets Nd are pairwise disjoint. Now construct the
R-transducer
C = (Ω ∪ {#}, Y, C, ∆, Y, P ′′ , C ′ )
with
C = ∪(Nd | d ∈ Ω ∪ Y ), C ′ = {ad0 | d ∈ Ω ∪ Y }
and
∆ = ∪(Ω(d) | d ∈ Ω ∪ Y ) ∪ Ω (∆1 = ∪(Ω(d) | d ∈ Ω ∪ Y ) ∪ Ω1 , ∆m = Ωm (m ≠ 1)).

Furthermore, P ′′ is given as follows:

(I) ad # → ωd (a′d ξ1 ) (ad , a′d ∈ Nd , ωd ∈ Ω(d), d ∈ Ω ∪ Y ) is in P ′′ if ad → ωd (a′d ) is in


Pd .

(II) aω ω → ω(ad10 ξ1 , . . . , adm0 ξm ) is in P ′′ for ω ∈ Ωm , m ≥ 0, d1 , . . . , dm ∈ Ω ∪ Y


and aω ∈ Nω if aω → ξ1 is in Pω .

(III) For each y ∈ Y and ay ∈ Ny , ay y → y is in P ′′ if ay → ξ1 is in Py .

Obviously, C is an R-relabeling. Therefore, by Theorem 4.3.15, τB ◦ τC = τ is an R-


transformation. Moreover, by the constructions of B and C, it is clear that the equality
h(S) = f (R)τ holds. ✷

In the next section we shall need

Theorem 4.7.4 Let τ : X ∗ → Y ∗ be a mapping induced by a deterministic gsm and


Σ a ranked alphabet. Then there exist a ranked alphabet Ω and a DRR -transducer B =
(Σ, X, B, Ω, Y, P ′ , b0 ) such that the equality yd(T )τ = yd(T τB) holds for every T ⊆
FΣ (X).

Proof. Consider the deterministic gsm A = (X, A, Y, a0 , P, A′ ) inducing τ . We shall


show the existence of a ranked alphabet Ω and that of a DRR -transducer B =
(Σ, X, B, Ω, Y, P ′ , b0 ) such that for any p ∈ FΣ (X),

(i) yd(pτB) = yd(p)τ if yd(p) ∈ dom(τ ), and


(ii) p ∈ dom(τB ) implies yd(p) ∈ dom(τ ).


These obviously will imply the validity of Theorem 4.7.4.
For each a1 , a2 ∈ A, let T (a1 , a2 ) denote the set of all such trees p ∈ FΣ (X) that
a1 yd(p) ⇒∗A wa2 holds for some w ∈ Y ∗ . By Lemma 1.7.4 and Theorem 3.3.2, every
T (a1 , a2 ) = yd−1 (L(a1 , a2 )) is a regular forest. Now let B = (A × A) ∪ {b0 } (b0 ∉ A)
and Ω = Σ ∪ {ωax | a ∈ A, x ∈ X}, where r(ωax ) equals the length of the word w
obtained from the production ax → wa′ ∈ P (a ∈ A′ ). (The ranks of symbols from Σ
are unchanged.) Moreover, P ′ is given as follows:
(I) For arbitrary m > 0, σ ∈ Σm and a1 , a2 , . . . , am+1 ∈ A, P ′ contains the production
((a1 , am+1 )σ → σ((a1 , a2 )ξ1 , . . . , (am , am+1 )ξm ), D), where D(ξi ) = T (ai , ai+1 ) (i =
1, . . . , m).

(II) If σ ∈ Σ0 and a ∈ A, then the production (a, a)σ → σ is in P ′ .

(III) For arbitrary x ∈ X and (a1 , a2 ) ∈ A×A, P ′ contains the production (a1 , a2 )x → q,
where a1 x ⇒A wa2 (w ∈ Y ∗ ) and q ∈ FΩ (Y ) is a fixed tree with yd(q) = w (such
q exists by the definition of ωa1 x ).

(IV) For arbitrary m > 0, σ ∈ Σm and a1 , . . . , am+1 ∈ A, if a1 = a0 and am+1 ∈ A′ ,


then the production b0 σ → σ((a1 , a2 )ξ1 , . . . , (am , am+1 )ξm ), D is in P ′ , where
D(ξi ) = T (ai , ai+1 ) (i = 1, . . . , m).

(V) For arbitrary x ∈ X, if a0 x ⇒A wa1 (w ∈ Y ∗ ) and a1 ∈ A′ , then the production


b0 x → q is in P ′ , where q ∈ FΩ (Y ) is a fixed tree with yd(q) = w (again, by the
definition of ωa0 x , such q exists).

(VI) If a0 ∈ A′ and σ ∈ Σ0 , then the production b0 σ → σ is in P ′ .


In order to prove Theorem 4.7.4 it is enough to show that for arbitrary (a1 , a2 ) ∈
A × A, p ∈ FΣ (X) and q ∈ FΩ (Y ) the implication

(a1 , a2 )p ⇒∗B q =⇒ a1 yd(p) ⇒∗A yd(q)a2

holds. This can be carried out by induction on hg(p). ✷

We shall now introduce some more concepts that will be needed in the next section.
Let A = (Σ, X, A, Ω, Y, P, A′ ) be an R-transducer. Take a tree p ∈ FΣ (X) and a node
d of p. Denote by s the subtree of p at this node d. Consider a state a and a derivation
α : ap ⇒∗ q (q ∈ FΩ (Y )). Suppose exactly k copies of this occurrence of s are created
during α and that these are translated into the trees t1 , . . . , tk (∈ FΩ (Y )) starting the
translations, respectively, in states a1 , . . . , ak . In the next definition we distinguish a
sequence of these states which will be called the state-sequence of α at d.

Definition 4.7.5 Let A = (Σ, X, A, Ω, Y, P, A′ ) be an R-transducer. Take a derivation

α : ap ⇒∗ q (a ∈ A, p ∈ FΣ (X), q ∈ FΩ (Y )).


Let d be a node of p and s the subtree at this node d. Replace the given occurrence of
s in p by ξ1 and denote by r the resulting tree. Write α in the form

ap = ar(s) ⇒∗ q(asn ) ⇒∗ q(t),


where q ∈ F̂Ω (Y ∪ Ξn ), a ∈ An , ar ⇒∗ q(aξ1n ), asn ⇒∗ t and t ∈ FΩ (Y )n . Denote by
ai di → qi (ai ∈ A, di ∈ Σ ∪ X) the production applied first in the derivation ai s ⇒∗ ti
(i = 1, . . . , n). Then a = (a1 , . . . , an ) is the state-sequence and

(a1 d1 → q1 , . . . , an dn → qn )

is the production-sequence of α at d.

Often we shall speak about the state-sequence and production-sequence of α at a


subtree s. In such cases the node to which the given occurrence of s belongs will be clear
from the context.
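For instance, let A be the R-transducer with Σ = Σ1 = {σ}, X = {x}, Ω = Ω2 = {ω}, Y = {y}, a single (initial) state a and the productions aσ → ω(aξ1 , aξ1 ) and ax → y. For the derivation α : aσ(x) ⇒ ω(ax, ax) ⇒ ω(y, ax) ⇒ ω(y, y), the state-sequence of α at the node labelled by x is (a, a), and the production-sequence of α at this node is (ax → y, ax → y).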
We now define state-sequences for derivations in GSDTs.

Definition 4.7.6 Let A = (Σ, X, A, Y, P, A′ ) be a GSDT. Take a derivation

α : ap ⇒∗ w (a ∈ A, p ∈ FΣ (X), w ∈ Y ∗ ).

Let d be a node of p and s the subtree of p at d. Replace the given occurrence of s in p


by ξ1 and denote by r the resulting tree. Write α in the form

ap = ar(s) ⇒∗ w1 a1 sw2 . . . wn an swn+1 ⇒∗ w1 v1 w2 . . . wn vn wn+1 ,

where ar ⇒∗ w1 a1 ξ1 w2 . . . wn an ξ1 wn+1 (wi ∈ Y ∗ , i = 1, . . . , n + 1, a1 , . . . , an ∈ A) and


ai s ⇒∗ vi (vi ∈ Y ∗ , i = 1, . . . , n). Then a = (a1 , . . . , an ) is the state-sequence of α at d.

Like in the case of R-transducers, we shall also speak about the state-sequence of α at
the subtree s.
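For instance, for the GSDT with a single (initial) state a and the productions ax → e and aσ → y aξ1 aξ1 (Σ = Σ1 = {σ}, X = {x}, Y = {y}), the derivation β : aσ(x) ⇒ y ax ax ⇒ y ax ⇒ y has the state-sequence (a, a) at the node labelled by x.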

Definition 4.7.7 Let A be an R-transducer A = (Σ, X, A, Ω, Y, P, A′ ) [a GSDT A =


(Σ, X, A, Y, P, A′ )]. Then a derivation α : ap ⇒∗ q (a ∈ A, p ∈ FΣ (X), q ∈ FΩ (Y ))
[β : ap ⇒∗ w (a ∈ A, p ∈ FΣ (X), w ∈ Y ∗ )] is k-copying if for every node d of p the
length of the state sequence of α [β] at d is at most k. Moreover, A is k-copying if every
derivation α : ap ⇒∗ q (p ∈ FΣ (X), q ∈ FΩ (Y )) [β : ap ⇒∗ w (p ∈ FΣ (X), w ∈ Y ∗ )]
with a ∈ A′ is k-copying. Finally, A is finite-copying if it is k-copying for some k.

We shall use the notation Rk for the class of all transformations induced by k-copying
R-transducers. Similarly, Gk denotes the class of all transformations induced by k-
copying GSDT’s. Moreover, Rf and Gf will stand for the classes of transformations
induced by finite-copying R-transducers and finite copying GSDT’s, respectively. Cor-
responding notations will be used for the classes DR, DG etc.
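For instance, the R-transducer with the productions aσ → ω(aξ1 , aξ1 ) and ax → y considered after Definition 4.7.5 is not finite-copying: the length of the state-sequence at the leaf x doubles with each additional occurrence of σ in the input tree, so this transducer is k-copying for no k. On the other hand, every linear R-transducer and every R-relabeling is obviously 1-copying.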
The next result shows that R-transformational languages can be studied through gen-
eralized syntax directed translations.


Theorem 4.7.8 For every k-copying GSDT A = (Σ, X, A, Y, P, A′ ) there exist a ranked
alphabet Ω and a k-copying R-transducer B = (Σ, X, A, Ω, Y, P ′ , A′ ) such that τA =
{(p, yd(q)) | (p, q) ∈ τB}.
Conversely, for every k-copying R-transducer B there exists a k-copying GSDT A such
that τA = {(p, yd(q)) | (p, q) ∈ τB }.

Proof. The R-transducer and GSDT constructed in the proof of Theorem 4.5.4 obviously
have the required properties. ✷

The following theorem gives sufficient conditions under which Rk (K) = DRk (K) holds
for a given class K of forests.

Theorem 4.7.9 Let K be a class of forests closed under relabeling and regular insertion.
Take an R-transducer A = (Σ, X, A, Ω, Y, P, A′ ), an R ∈ K and a positive integer k.
Then

S = {q ∈ FΩ (Y ) | there is a k-copying derivation ap ⇒∗ q for some a ∈ A′ and p ∈ R}

is in DRk (K).

Proof. Since K is closed under regular insertion, we may assume that A′ is a singleton.
Indeed, in the opposite case enlarge A by a new state a0 , Σ by a new unary operational
symbol σ and P by all productions a0 σ → aξ1 (a ∈ A′ ). Let Ā be the resulting R-transducer with initial state a0 , and let R̄ = f (R), where f is a regular insertion given by f (d) = {σ(ξ1 )} (d ∈ X ∪ Σ). Then R̄ ∈ K and τĀ (R̄) = τA (R). Furthermore, a derivation ap ⇒∗A q (a ∈ A′ , p ∈ R, q ∈ FΩ (Y )) is k-copying if the corresponding derivation a0 σ(p) ⇒∗Ā q is k-copying, and conversely. Thus, we shall assume that A′ =

{a0 }. Now we introduce the alphabet

X = {((a1 x, q1 ), . . . , (at x, qt )) | t ≤ k, x ∈ X, ai x → qi ∈ P (i = 1, . . . , t)}

and the ranked alphabet ∆ with

∆m = {((a1 σ, q1 ), . . . , (at σ, qt )) | t ≤ k, σ ∈ Σm , ai σ → qi ∈ P (i = 1, . . . , t)}

(m = 0, 1, . . .). Consider the R-transducer B = (Σ, X, {b0 }, ∆, X, P ′ , b0 ) where P ′ con-


sists of the productions

b0 x → ((a1 x, q1 ), . . . , (at x, qt )) (x ∈ X, ((a1 x, q1 ), . . . , (at x, qt )) ∈ X)

and
b0 σ → ((a1 σ, q1 ), . . . , (at σ, qt ))(b0 ξ1 , . . . , b0 ξm )
(σ ∈ Σm , ((a1 σ, q1 ), . . . , (at σ, qt )) ∈ ∆m , m = 0, 1, . . .).
Obviously, B is an R-relabeling which relabels trees in the following way: if σ ∈ Σ
[resp. x ∈ X] is a label at a node d of a tree p ∈ FΣ (X), then B relabels d by a


sequence of productions ((a1 σ, q1 ), . . . , (at σ, qt )) [resp. ((a1 x, q1 ), . . . , (at x, qt ))] from P


with length at most k.
Next define an R-transducer C = (∆, X, C, Ω, Y, P ′′ , c0 ) with

C = {(u; a1 , . . . , at ) | 1 ≤ u ≤ t ≤ k, ai ∈ A (i = 1, . . . , t)}

and c0 = (1; a0 ). Moreover, P ′′ is defined as follows:

(i) For each (u; a1 , . . . , at ) ∈ C and ((a1 p, q1 ), . . . , (at p, qt )) ∈ ∆0 ∪ X,


(u; a1 , . . . , at )((a1 p, q1 ), . . . , (at p, qt )) → qu is in P ′′ .

(ii) Let (u; a1 , . . . , at ) ∈ C and ((a1 σ, q1 ), . . . , (at σ, qt )) ∈ ∆m (m > 0). Write


(ai σ, qi ) in the more detailed form ai σ → qi (ai1 ξ1ni1 , . . . , aim ξmnim ) (aij ∈ Anij , j = 1, . . . , m, ni1 + . . . + nim = ni , qi ∈ F̂Ω (Y ∪ Ξni ), i = 1, . . . , t). Then the production

(u; a1 , . . . , at )((a1 σ, q1 ), . . . , (at σ, qt )) →

qu (((u11 ; b1 ), . . . , (u1nu1 ; b1 ))ξ1nu1 , . . . , ((um1 ; bm ), . . . , (umnum ; bm ))ξmnum )

is in P ′′ , provided that n1j + . . . + ntj ≤ k (j = 1, . . . , m), where ujl = n1j + . . . + nu−1j + l, bj = (a1j , . . . , atj ) and j = 1, . . . , m.

Obviously, C is a deterministic R-transducer. Furthermore, one can easily see the


following connection between derivations in A and C:
Let p ∈ FΣ (X) and q ∈ FΩ (Y ) be arbitrary trees, and take a k-copying derivation

α : a0 p ⇒∗A q.

Consider the tree p with (p, p) ∈ τB which is the result of relabeling each node d of p by
the production-sequence of α at d. Then in C we have a derivation

β : (1; a0 )p ⇒∗C q

such that if a = (a1 , . . . , an ) (n ≤ k) is the state-sequence of α at d then ((1; a), . . . , (n; a))
is the state-sequence of β at d. Conversely, if for a p′ ∈ F∆ (X) and q ′ ∈ FΩ (Y ) there is
a derivation
β ′ : (1; a0 )p′ ⇒∗C q ′ ,
then for the (uniquely determined) tree p′ ∈ FΣ (X) with (p′ , p′ ) ∈ τB we have the
derivation
α′ : a0 p′ ⇒∗A q ′ .
Moreover, the state-sequence of β ′ at a node d of p′ is of the form ((1; a′ ), . . . , (m; a′ ))
(a′ = (a′1 , . . . , a′m )) with m ≤ k, and a′ is the state-sequence of α′ at d. Therefore, C
is k-copying and S = RτB ◦ τC holds. Since K is closed under relabelings, this implies
S ∈ DRk (K). ✷

From Theorem 4.7.9, by Theorem 4.7.8, we get


Corollary 4.7.10 Let K be a class of forests closed under relabeling and regular inser-
tion. Take a GSDT A = (Σ, X, A, Y, P, A′ ), a T ∈ K and a positive integer k. Then the
language

L = {w ∈ Y ∗ | there is a k-copying derivation ap ⇒∗ w for some a ∈ A′ and p ∈ T }

is in DG k (K). ✷

Three more language operations will be needed.

Definition 4.7.11 Let X be an alphabet and # ∉ X a symbol. For each L ⊆ X ∗ ,


res(L, #) (regular substitution) denotes the language defined as follows:

(i) if L = {e}, then res(L, #) = #∗ ,

(ii) if L = {x} (x ∈ X), then res(L, #) = #∗ x#∗ ,

(iii) if L = {ux} (u ∈ X ∗ , x ∈ X), then res(L, #) = res(u, #)res(x, #),


(iv) if L is arbitrary, then res(L, #) = ∪(res(w, #) | w ∈ L).
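For instance, if X = {x1 , x2 } and L = {x1 x2 }, then res(L, #) = #∗ x1 #∗ x2 #∗ ; in general, res(L, #) consists of all words obtained from the words of L by inserting arbitrary (possibly empty) blocks of #’s before, between and after their letters.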

Theorem 4.7.12 Let K be a class of forests closed under regular insertion. For each
R ∈ K there exist a linear nondeleting GSDT A and a forest S ∈ K such that
res(yd(R), #) = SτA .

Proof. Let R ⊆ FΣ (X), R ∈ K, and denote yd(R) by L. Let ∆ = ∆1 = {d | d ∈ Σ ∪ X}


and let f be the regular insertion defined by f (d) = {d(ξ1 )}∗ξ1 (d ∈ Σ ∪ X). Define the
GSDT A = (Ω, X, {a0 }, X ∪ {#}, P, a0 ) with Ω = Σ ∪ ∆ (Ω1 = Σ1 ∪ ∆, Ωm = Σm , m ≠ 1)
so that

P = {a0 x → #a0 ξ1 , a0 x → a0 ξ1 # | x ∈ X} ∪ {a0 σ → #a0 ξ1 | σ ∈ Σ0 }∪

{a0 x → x | x ∈ X} ∪ {a0 σ → a0 ξ1 . . . a0 ξm | σ ∈ Σm , m ≥ 0}.


Obviously, A is a linear nondeleting GSDT satisfying res(L, #) = f (R)τA . Moreover, by
our assumptions, f (R) = S ∈ K. ✷

Theorem 4.7.13 Let Y be an alphabet and # ∉ Y a symbol. Take a language L ⊆ Y ∗


and a class K of forests closed under relabeling and regular insertion. If res(L, #) ∈
DG(K), then L ∈ DG f (K).

Proof. Let res(L, #) = T τA where A = (Σ, X, A, Y ∪ {#}, P, a0 ) is a deterministic


GSDT and T ⊆ FΣ (X) is a forest from K. Moreover, let A = {a1 , . . . , ak }. A
word yi1 #n1 yi2 #n2 . . . yir−1 #nr−1 yir (∈ res(L, #), yi1 , . . . , yir ∈ Y ) is called proper if
n1 , n2 , . . . , nr−1 are pairwise distinct.
Consider a derivation

α : a0 p ⇒∗ w1 b1 p1 w2 b2 p1 w3 . . . ws bs p1 ws+1 ⇒∗ w1 v1 w2 v2 w3 . . . ws vs ws+1 = w,


where p ∈ T , p1 is a subtree of p, (b1 , b2 , . . . , bs ) is the state-sequence of α at p1 , bi p1 ⇒∗ vi


(i = 1, . . . , s) and w1 , . . . , ws+1 , v1 , . . . , vs ∈ (Y ∪{#})∗ . If w is proper and bi = bj (i ≠ j),
then in vi (and thus in vj ) at most one symbol from Y may occur.
Now for each σ ∈ Σm (m > 0) take all pairs (σ, M ), where M is a matrix of type k × m
whose elements are from Y ∪ AΞm ∪ {e}. Moreover, let Ω be a ranked alphabet with
Ω0 = Σ0 and Ωm = {(σ, M ) | σ ∈ Σm } (m > 0).
Let Y = {y1 , . . . , yl } and denote by Tij (i = 1, . . . , k, j = 1, . . . , l) the set of all trees
p ∈ FΣ (X) for which v ∈ #∗ yj #∗ , where v is the word obtained from the derivation
ai p ⇒∗ v. Moreover, let Til+1 (i = 1, . . . , k) be the forest of all trees p ∈ FΣ (X) satisfying
v ∈ #∗ , where v is obtained again by the derivation ai p ⇒∗ v.
By Theorems 4.5.4 and 3.3.2 and Corollary 4.3.17, the Tij (i = 1, . . . , k, j = 1, . . . , l+1)
are recognizable forests. Therefore, there are ΣX-recognizers Aij = (Aij , αij , A′ij ) (i =
1, . . . , k, j = 1, . . . , l + 1) with Aij = (Aij , Σ) such that T (Aij ) = Tij . Consider the
DF-relabeling B = (Σ, X, B, Ω, X, P ′ , B) where

B = {(pα̂11 , . . . , pα̂1l+1 , . . . , pα̂k1 , . . . , pα̂kl+1 ) | p ∈ FΣ (X)},

and P ′ is given as follows:


(i) For each x ∈ X, the production

x → (xα11 , . . . , xα1l+1 , . . . , xαk1 , . . . , xαkl+1 )x

is in P ′ .

(ii) For every σ ∈ Σ0 , the production

σ → (σ A11 , . . . , σ A1l+1 , . . . , σ Ak1 , . . . , σ Akl+1 )σ

is in P ′ .

(iii) For each σ ∈ Σm (m > 0) the productions

σ(b1 , . . . , bm ) → b(σ, M )(ξ1 , . . . , ξm )

are in P ′ , where bt = (b11(t) , . . . , b1l+1(t) , . . . , bk1(t) , . . . , bkl+1(t) ), b = (b11 , . . . , b1l+1 , . . . , bk1 , . . . , bkl+1 ) ∈ B (t = 1, . . . , m), bij = σ Aij (bij(1) , . . . , bij(m) ) (i = 1, . . . , k, j = 1, . . . , l + 1), and the element mit (i = 1, . . . , k, t = 1, . . . , m) of the matrix M is given by mit = e if bil+1(t) ∈ A′il+1 , mit = yu if biu(t) ∈ A′iu (1 ≤ u ≤ l), and mit = ai ξt otherwise.

Obviously, mit is well-defined since there are no two components bij1(t) and bij2(t) (1 ≤ i ≤ k, 1 ≤ j1 , j2 ≤ l + 1, j1 ≠ j2 ) such that bij1(t) ∈ A′ij1 and bij2(t) ∈ A′ij2 both
hold.


By the definition of B, it relabels trees in the following way: take a tree p ∈ FΣ (X),
and let σ(p1 , . . . , pm ) (m > 0) be the subtree of p at a node d. Then B provides us with the
information about which of the subtrees p1 , . . . , pm is translated by A(ai ) (i = 1, . . . , k)
into a word from (Y ∪ {#})∗ with

(I) no occurrence of letters from Y ,

(II) exactly one occurrence of letters from Y ,

(IIIa) at least two occurrences of letters from Y , or

(IIIb) the given subtree is not in dom(τA(ai ) ).

Next take the GSDT C = (Ω, X, A, Y, P ′′ , a0 ) where P ′′ is given as follows:



(a) If ap → w (a ∈ A, p ∈ X ∪ Σ0 , w ∈ (Y ∪ {#})∗ ) is in P , then the production
obtained from ap → w by replacing all occurrences of # in w by e will be in P ′′ .

(b) Let aσ → w (a ∈ A, σ ∈ Σm , m > 0, w ∈ (Y ∪ {#} ∪ AΞm )∗ ) be in P . Then
all productions a(σ, M ) → w′ are in P ′′ where w′ is the result of replacing all
occurrences of ai ξj in w by mij (1 ≤ i ≤ k, 1 ≤ j ≤ m) and all occurrences of #
by e.

It is clear that C is deterministic. Moreover, one can show by induction on hg(p) for
arbitrary a ∈ A, p ∈ FΣ (X) and w ∈ (Y ∪ {#})∗ the implication

ap ⇒∗A w =⇒ aτB (p) ⇒∗C ϕ(w)

holds, where ϕ : (Y ∪ {#})∗ → Y ∗ is the homomorphism given by ϕ(y) = y (y ∈ Y ) and


ϕ(#) = e. Thus

L = {w′ ∈ Y ∗ | a0 τB (p) ⇒∗C w′ , a0 p ⇒∗A w, p ∈ T, w ∈ (Y ∪ {#})∗ and w is proper if |w′ | > 2}.   (1)

Furthermore, by our remark concerning state-sequences of derivations yielding proper


words and the construction of C, the elements of a state-sequence of a derivation
a0 τB (p) ⇒∗C w′ from (1) are different at any node of τB (p). Therefore, since C has
k states, each element of L can be obtained by a k-copying derivation in C. Finally,
since by our assumptions T τB ∈ K, using Corollary 4.7.10 we get L ∈ DG k (K). ✷

Definition 4.7.14 Let X be an alphabet and # ∉ X a symbol. For each language


L ⊆ X ∗ , the language c∗ (L, #) is defined by

c∗ (L, #) = {(w#)n | w ∈ L, n = 1, 2, . . .}.
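For instance, if L = {x1 x2 }, then c∗ (L, #) = {x1 x2 #, x1 x2 #x1 x2 #, x1 x2 #x1 x2 #x1 x2 #, . . .}.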

Theorem 4.7.15 Let K be a class of forests closed under regular insertion. For each
R ∈ K there exist a DGSDT A and a forest S ∈ K such that c∗ (yd(R), #) = SτA .


Proof. Suppose R ⊆ FΣ (X) and let L = yd(R). We introduce the ranked alpha-
bet ∆ = ∆1 = {d | d ∈ Σ ∪ X} and define a regular insertion f by f (d) = {d(ξ1 )}∗ξ1
(d ∈ Σ ∪ X). Moreover, let Ω be the ranked alphabet for which Ω1 = Σ1 ∪ ∆ and
Ωm = Σm (m ≥ 0, m ≠ 1). Consider the GSDT

A = (Ω, X, {a1 , a2 }, X ∪ {#}, P, a1 )

where

P = {a1 d → a1 ξ1 a2 ξ1 # | d ∈ Σ ∪ X}
∪ {a2 d → a2 ξ1 | d ∈ Σ ∪ X}
∪ {a1 x → e | x ∈ X} ∪ {a1 σ → e | σ ∈ Σm , m ≥ 0}
∪ {a2 x → x | x ∈ X} ∪ {a2 σ → a2 ξ1 . . . a2 ξm | σ ∈ Σm , m ≥ 0}.

It is obvious that A is a deterministic GSDT satisfying c∗ (L, #) = SτA , where S = f (R).


Moreover, by our assumptions S ∈ K. ✷

Theorem 4.7.16 Let U ⊆ c∗ (L, #) (L ⊆ Z ∗ , # ∉ Z) be a language containing infinitely
many words (w#)n for each w ∈ L. Furthermore, let K be a class of forests closed under
relabeling and regular insertion. If U ∈ DG f (R(K)), then L ∈ DG(K).

Proof. Let A = (Σ, X, A, Ω, Y, P, A′ ) be an R-transducer and B = (Ω, Y, B, Z ∪


{#}, P ′ , b0 ) a k-copying deterministic GSDT. Moreover, take a forest R ⊆ FΣ (X) from
K satisfying U = (RτA )τB . Since K is closed under regular insertion, we may, without
any loss of generality, assume that A′ is a singleton, say A′ = {a0 }. First we shall
construct an R-transducer A = (Σ, X, A, Ω, Y, P , a0 ) which translates every p ∈ FΣ (X)
into a tree q ∈ FΩ (Y ) in the same way as A provided that q ∈ dom(τB ). In addition, if
during the translation of p into q by A, an occurrence of a subtree p′ in p is translated
starting in a state a into a tree q ′ , then during the corresponding translation of p by
A, p′ will be translated starting in a state consisting of a and the state-sequence of the
derivation of q in B at the subtree q ′ . Thus, A will have the property that if during
the above translation of p by A, two copies of an occurrence of p′ are translated starting
in states a1 and a2 , respectively, into the trees q1 and q2 such that a1 = a2 , then the
state-sequences of the derivation of q in B at q1 and q2 coincide.
Let τB(q) = (w#)m (w ∈ Z ∗ ). If m is large enough, then the properties of A will
make it possible to replace in a derivation a0 p ⇒∗A q different derivations of p′ starting
from the same state by one of them such that the resulting output tree is translated by B into a word (w#)m′ with m′ ≥ m. By prescribing the applications of productions of
A in this manner we shall arrive at a DR-transducer A1 such that (SτA1 )τB contains
infinitely many words (w#)m for each w ∈ L and S is obtained from R by a relabeling.
Afterwards applying a deterministic gsm to (SτA1 )τB , we shall get L.
Thus construct the R-transducer A = (Σ, X, A, Ω, Y, P , a0 ) where

A = {(a, b) | a ∈ A, b ∈ B n , n = 0, 1, . . . , k}

and a0 = (a0 , (b0 )). Moreover, P is given in the following way:



(i) Let ap → q (a ∈ A, p ∈ X ∪ Σ0 , q ∈ FΩ (Y )) be in P and take a vector b ∈ B n
(0 ≤ n ≤ k). Then the production (a, b)p → q is in P .

(ii) Let aσ → q(a1 ξ1n1 , . . . , am ξmnm ) (a ∈ A, σ ∈ Σm , m > 0, ai ∈ Ani , i = 1, . . . , m,
n1 + . . . + nm = n, q ∈ F̂Ω (Y ∪ Ξn )) be in P and b = (b1 , . . . , bs ) ∈ B s . Moreover,
for every u (1 ≤ u ≤ s), and every j (1 ≤ j ≤ n) take the derivation

bu q ⇒∗B wuj1 buj1 ξj wuj2 . . . wujuj bujuj ξj wujuj +1

(wuj1 , . . . , wujuj +1 ∈ (Z ∪ {#} ∪ B(Ξn − {ξj }))∗ , buj1 , . . . , bujuj ∈ B).

Set bj = (b1j1 , . . . , b1j1j , . . . , bsj1 , . . . , bsjsj ) (j = 1, . . . , n). Then the production

(a, b)σ → q(((a11 , b1 ), . . . , (a1n1 , bn1 ))ξ1n1 , ((a21 , bn1 +1 ), . . . , (a2n2 , bn1 +n2 ))ξ2n2 , . . . , ((am1 , bn1 +...+nm−1 +1 ), . . . , (amnm , bn ))ξmnm )

is in P , provided that for each j = 1, . . . , n the length of the sequence bj is not


greater than k.
From the construction of A, one can easily see the following connection between A and
A. Take a tree p ∈ FΣ (X), a node  d of p and let p′ be the subtree of p at d. Moreover,
write p = r(p′ ) (r ∈ F̂Σ (X ∪ Ξ1 )), and consider a derivation

α : a0 r(p′ ) ⇒∗A q(ap′n ) ⇒∗A q(t) = q


(q ∈ FΩ (Y ), a0 r ⇒∗A q(aξ1n ), q ∈ F̂Ω (Y ∪ Ξn ), ap′n ⇒∗A t, t ∈ FΩ (Y )n )

with q ∈ dom(τB ). Then in A we have a derivation


 
β : a0 , (b0 ) r(p′ ) ⇒∗ q ((a1 , b1 ), . . . , (an , bn ))p′n ⇒∗ q(t) = q,

where bi (1 ≤ i ≤ n) is the state-sequence of the derivation



γ : b0 q ⇒∗B w ∈ (Z ∪ {#})∗

at the subtree ti . Therefore, if (ai , bi ) = (aj , bj ) (1 ≤ i, j ≤ n), then the state-sequences


of γ at the subtrees ti and tj coincide. We can assume that A itself has this property,
because the equality τA ◦ τB = τA ◦ τB obviously holds.
Consider a word (w#)m ∈ (RτA )τB with m > 2k + 1.  More exactly, let p ∈ R be a
tree for which under the derivation a0 p ⇒∗A q ∈ FΩ (Y ) the equality τB (q) = (w#)m
holds. Let r ∈ F̂Σ (X ∪ Ξ1 ) and p′ ∈ FΣ (X) with r(p′ ) = p. Moreover, write the above
derivation in the form
α′ : a0 r(p′ ) ⇒∗A q(ap′n ) ⇒∗A q(t) = q
(q ∈ FΩ (Y ), a0 r ⇒∗A q(aξ1n ), q ∈ F̂Ω (Y ∪ Ξn ), ap′n ⇒∗A t, t ∈ FΩ (Y )n ).

Assume that a state a ∈ A occurs more than once in a, and let ai1 , . . . , aij (1 ≤ i1 <
. . . < ij ≤ n) be all occurrences of a in a. Then the state-sequences of

β ′ : b0 q ⇒∗B (w#)m ∈ (Z ∪ {#})∗


at the subtrees ti1 , . . . , tij coincide. Let (b1 , . . . , bs ) be this common state-sequence.
Among ti1 , . . . , tij let ti1 be the tree for which τB(b1 ) (ti1 ) . . . τB(bs ) (ti1 ) has a maximal
number of occurrences of #. Replace the considered occurrences of ti1 , . . . , tij in q by

ti1 , and denote by q ′ the resulting tree. We claim that for q ′ we have τB (q ′ ) = (w#)m′
with m′ ≥ m. To prove it let us distinguish the following two cases:

(I) There exists an r (1 ≤ r ≤ s) such that # occurs at least twice in the word
τB(br ) (ti1 ). Then our claim obviously holds.

(II) # occurs at most once in each word τB(b1 ) (ti1 ), . . . , τB(bs ) (ti1 ). Take a fixed r
(1 < r ≤ j), and write β ′ in the form

b0 q ⇒∗B w1 b1 tir w2 . . . ws bs tir ws+1 ⇒∗B


w1 v1 w2 . . . ws vs ws+1 = (w#)m .

Since m > 2k + 1 and s ≤ k, there exists a wu (1 ≤ u ≤ s + 1) such that # occurs


at least twice in wu . This also implies our claim.

Thus we have got the following result. If we replace in α′ every subderivation



ar p′ ⇒∗A tr (ar = a, r = i1 , . . . , ij ) by ap′ ⇒∗A ti1 , then b0 q ′ ⇒∗B (w#)m′ with m′ ≥ m
holds for the resulting output tree q ′ . Therefore, prescribing the applications of the
productions of A in this way, we arrive at a deterministic R-transformation whose com-
position by τB , applied to a suitable forest from K, for each w ∈ L yields infinitely many
words (w#)m (m ≥ 1), and only such words. Next we show how this can be carried out.
First we define a deterministic R-transducer A1 .
Let A = {a1 . . . , as }, and define a set X of variables by

X = { x, (c1 , . . . , cs ) | x ∈ X, ci = (ai x, qi ) ∈ P or ci = ∗, i = 1, . . . , s}

where ∗ is a new symbol. Moreover, define the ranked alphabet ∆, where for each
m (≥ 0),

∆m = { σ, (c1 , . . . , cs ) | σ ∈ Σm , ci = (ai σ, qi ) ∈ P or ci = ∗, i = 1, . . . , s}.

Now take the R-transducer A1 = (∆, X, A, Ω, Y, P1 , a0 ) for which P1 is given as follows:



(α) For each ai ∈ A and x, (c1 , . . . , cs ) ∈ X, if ci = (ai x, qi ), then the production

ai x, (c1 , . . . , cs ) → qi

is in P1 .

(β) For each ai ∈ A and σ, (c1 , . . . , cs ) ∈ ∆m , if ci = (ai σ, qi ), then the production

ai σ, (c1 , . . . , cs ) → qi

is in P1 .


Obviously, A1 is a deterministic R-transducer.


Next, let D = (Σ, X, {d0 }, ∆, X, P ′′ , d0 ) be the F-relabeling where
 
P ′′ = {x → d0 x, (c1 , . . . , cs ) | x ∈ X,  x, (c 1 , . . . , cs ) ∈ X}∪
{σ(d0 , . . . , d0 ) → d0 σ, (c1 , .. . , cs ) (ξ1 , . . . , ξm ) | σ ∈ Σm ,
σ, (c1 , . . . , cs ) ∈ ∆m , m ≥ 0}.
Put S = RτD . Since K is closed under relabeling, S ∈ K. Moreover, taking into
consideration the remarks preceding the construction of A1 , one can easily see that, for
each w ∈ L, (SτA1 )τB contains infinitely many words of the form (w#)m (m ≥ 1), and
only such words.
Finally, take the deterministic gsm C = (Z ∪ {#}, {c0 , c1 }, Z, c0 , PC , {c1 }) where
PC = {c0 z → zc0 | z ∈ Z} ∪ {c0 # → ec1 } ∪ {c1 z → ec1 | z ∈ Z ∪ {#}}.
Obviously, (w#)m τC = w for all w ∈ Z ∗ and m ≥ 1.
Denote by B1 the deterministic k-copying R-transducer obtained from B by Theorems
4.5.4 and 4.7.8. Moreover, let C1 be the DRR -transducer given to C by Theorem 4.7.4.
Then the equality L = yd(SτA1 ◦ τB1 ◦ τC1 ) holds. Thus, by a repeated application
of Theorem 4.4.6 (iii) and Corollary 4.4.8 (ii) and using Theorem 4.6.15 and Corollary
4.3.17, we get for a suitable deterministic R-transformation τ and a suitable T ∈ K the
equality T τ = SτA1 ◦ τB1 ◦ τC1 . (Observe that the F-transducer A given in Lemma 4.1.11
is an F-relabeling. Hence, closure under relabeling implies closure under intersection
with regular forests.) Finally, again by Theorem 4.5.4, we have L ∈ DG(K). ✷
Definition 4.7.17 Let X be an alphabet and # ∉ X a symbol. Then for L ⊆ X ∗ the
language c2 (L, #) is defined by c2 (L, #) = {w#w | w ∈ L}.
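For instance, c2 ({x1 x2 }, #) = {x1 x2 #x1 x2 }; in general, c2 (L, #) contains exactly one word w#w for each word w of L.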
Theorem 4.7.18 Let K be a class of forests closed under relabeling and regular inser-
tion. If R ∈ K, then there exist a 2-copying GSDH-transducer A and a forest T ∈ K
such that c2 (yd(R), #) = T τA .
Proof. Suppose R ⊆ FΣ (X) and let L = yd(R). Moreover, take the ranked alphabet
∆ = ∆1 = {d | d ∈ Σ ∪ X}, and consider the regular insertion defined by f (d) =
{d(ξ1 )}∗ξ1 (d ∈ Σ ∪ X), and set S = f (R). Then S ∈ K. Finally, let Ω = Σ ∪ ∆ be the
ranked alphabet with Ω1 = Σ1 ∪ ∆ and Ωm = Σm (m ≥ 0, m 6= 1).
Now consider the R-relabeling B = (Ω, X, {b0 , b1 }, Ω, X, P, b0 ), where
P = {b0 d → d(b1 ξ1 ) | d ∈ Σ ∪ X}∪
{b1 σ → σ(b1 ξ1 , . . . , b1 ξm ) | σ ∈ Σm , m ≥ 0}∪
{b1 x → x | x ∈ X}.
Obviously, T = SτB consists of all trees of the form d(r), where r ∈ R and
d = root(r). Since B is a relabeling, T ∈ K. Now we construct the required GSDT
A = (Ω, X, {a0 }, X ∪ {#}, P ′ , a0 ), where
P ′ = {a0 d → a0 ξ1 #a0 ξ1 | d ∈ Σ ∪ X}∪
{a0 σ → a0 ξ1 . . . a0 ξm | σ ∈ Σm , m ≥ 0} ∪ {a0 x → x | x ∈ X}.
It is clear that A is a 2-copying GSDH-transducer and that c2 (L, #) = T τA holds. ✷


Theorem 4.7.19 Let Y be an alphabet and # 6∈ Y a symbol. Take a language L ⊆ Y ∗ and a class K of forests closed under relabeling and regular insertion. If c2 (L, #) ∈ G(K), then L ∈ DG(K).
Proof. The idea behind the proof is similar to that of Theorem 4.7.16, but this is much
simpler.
Let A = (Σ, X, A, Y ∪ {#}, P, A′ ) be a GSDT and R ∈ K a ΣX-forest such that
RτA = c2 (L, #). Since K is closed under regular insertion, we may assume that A′ is a
singleton, say A′ = {a0 }. 
Take a tree p ∈ R, a subtree p′ of p and let p = r(p′ ) (r ∈ F̂Σ (X ∪ Ξ1 )). Consider a
derivation
α : a0 r(p′ ) ⇒∗ w1 a1 p′ w2 . . . wk ak p′ wk+1 ⇒∗ w1 v1 w2 . . . wk vk wk+1 = w#w,
where a0 r(ξ1 ) ⇒∗ w1 a1 ξ1 w2 . . . wk ak ξ1 wk+1 , w1 , . . . , wk+1 , v1 , . . . , vk ∈ (Y ∪ {#})∗ and
ai p′ ⇒∗ vi (i = 1, . . . , k). Then (a1 , . . . , ak ) is the state-sequence of α at p′ . Assume that
a state a ∈ A occurs at least twice in (a1 , . . . , ak ), and let ai1 and ai2 (1 ≤ i1 < i2 ≤ k)
be two such occurrences of a. Then, taking the relevant occurrences of vi1 and vi2 in
w#w, we have the decomposition w#w = u1 vi1 u2 vi2 u3 . On the other hand the words
u1 vij u2 vij u3 (j = 1, 2) are also in RτA . Hence, vi1 = vi2 must hold. This implies that
if we replace for each t (1 ≤ t ≤ k) such that at = a, at p′ ⇒∗ vt by at p′ ⇒∗ vi1 , we get
the same word w#w. Therefore, prescribing accordingly the applications of productions
from P , we arrive at a deterministic GSDT yielding c2 (L, #). This can be carried out in
the same way as in the proof of Theorem 4.7.16, but here the resulting A1 is a DGSDT.
Thus, taking the F-relabeling D defined in the proof of Theorem 4.7.16, for S = RτD ,
we have S ∈ K and SτA1 = c2 (L, #). Moreover, by Theorem 4.5.4, there exists a DR-
transducer B1 with c2 (L, #) = yd(SτB1 ). Finally, consider the deterministic gsm C of
the proof of Theorem 4.7.16 with Y instead of Z, and let C1 be the corresponding DRR -
transducer. Then the equality L = yd(SτB1 ◦ τC1 ) holds. Thus, by Theorem 4.4.6 (iii),
Corollary 4.4.8 (ii), Theorem 4.6.15 and Corollary 4.3.17, for suitable DR-transformation
τ and a T ∈ K, we get T τ = SτB1 ◦ τC1 . This, by Theorem 4.5.4, implies L ∈ DG(T ). ✷

4.8 THE HIERARCHIES OF TREE TRANSFORMATIONS, SURFACE FORESTS AND TRANSFORMATIONAL LANGUAGES
In this section we prove that the compositions of n F-transformations or n R-
transformations form proper hierarchies when n = 0, 1, 2, . . .. Similar results will be
shown for the classes of forests (n-surface forests) which can be obtained from regular
forests by compositions of n F- or n R-transformations. All these results will follow from
the fact that the classes of languages (n-transformational languages) obtained by taking
the yields of n-surface forests form a proper hierarchy.
Definition 4.8.1 A forest T is an (n, R)-surface forest if T ∈ Surf(Rn ). (n, F)- and
(n, RR )-surface forests are defined in a similar way.


Definition 4.8.2 A (string) language L is an (n, R)-transformational language if L = yd(T ) for some (n, R)-surface forest T . (n, F)- and (n, RR )-transformational languages are defined similarly.
If n = 1 then we shall speak about R-, F- and RR -transformational languages, as well.
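For example, for every regular forest T and every R-transducer A, the forest T τA is a (1, R)-surface forest and yd(T τA ) is an R-transformational language.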

The following results show that in studying (n, R)-surface forests and (n, R)-
transformational languages we can use RR -transformations, too.

Theorem 4.8.3 For each natural number n, the equality Surf(Rn ) = Surf(RnR ) holds.

Proof. This follows from Theorems 4.4.6 (i) and 4.3.15 and Lemma 4.6.5. ✷

From Theorem 4.8.3 we directly get

Corollary 4.8.4 For every natural number n, the class of (n, R)-transformational lan-
guages coincides with the class of (n, RR )-transformational languages. ✷

Using Theorems 4.4.7 (i) and 4.2.7, from Theorem 4.8.3 we obtain

Corollary 4.8.5 For every natural number n, Surf(Rn ) is closed under LF-
transformations and LR-transformations. ✷

Now we can state and prove a result giving a recursive procedure by which the hierarchy
theorems can be proved easily. The procedure will be based on the “bridge theorems” of
the previous section which concern the operations res, c2 and c∗ . These associate with
each language which is not in a given class another language which is not in another,
larger class.

Theorem 4.8.6 Let K be a class of forests closed under relabeling and regular insertion.
If ydDRf (K) ⊂ ydR(K), then for each integer n ≥ 1,
 
ydRn (K) ⊂ ydDRf (Rn (K)) ⊂ ydDR(Rn (K)) ⊂ ydRn+1 (K).

Proof. By Theorem 4.3.15 and Lemma 4.7.3, Rn (K) is closed under relabeling and
regular insertion, for every n ≥ 1. In the sequel these facts will be used without further
mention.
We shall proceed by induction on n. Let n = 1. Take a forest R such that R ∈ R(K)
and yd(R) 6∈ ydDRf (K). Then by Theorems 4.7.12, 4.5.4 and 4.2.8 there exist an LNF-transformation τ and a forest S ∈ R(K) such that res(yd(R), #) = yd(Sτ ). Moreover, by Theorem 4.3.15, Sτ ∈ R(K). On the other hand, since yd(R) 6∈ ydDRf (K), by Theorems 4.7.13 and 4.5.4, res(yd(R), #) 6∈ ydDR(K). Thus, the proper inclusion
ydDR(K) ⊂ ydR(K) holds.
Next take an R ∈ R(K) with yd(R) 6∈ ydDR(K). Then, by Theorems 4.7.18 and 4.7.8, there exist a 2-copying homomorphism τ and a forest S ∈ R(K) such that c2 (yd(R), #) = yd(Sτ ). On the other hand, since yd(R) 6∈ ydDR(K), by Theorems 4.5.4 and 4.7.19, c2 (yd(R), #) 6∈ ydR(K). Therefore, the inclusion ydR(K) ⊂ ydDRf (R(K))
is valid.
Again take an R ∈ R(K) with yd(R) 6∈ ydDR(K). By Theorems 4.7.15 and 4.5.4 there exist a DR-transformation τ and a forest S ∈ R(K) such that c∗ (yd(R), #) = yd(Sτ ). Moreover, since yd(R) 6∈ ydDR(K), by Theorems 4.7.16 and 4.7.8, c∗ (yd(R), #) 6∈ ydDRf (R(K)). Thus we have got that

ydDRf (R(K)) ⊂ ydDR(R(K)).

Finally, take an R ∈ R2 (K) with yd(R) 6∈ ydDRf (R(K)). Then again by Theorems 4.7.12 and 4.5.4, there exist an LNF-transformation τ and a forest S ∈ R2 (K) such that res(yd(R), #) = yd(Sτ ). Moreover, by Theorem 4.3.15, Sτ ∈ R2 (K). On the other hand, since yd(R) 6∈ ydDRf (R(K)), by Theorems 4.7.13 and 4.7.8, res(yd(R), #) 6∈ ydDR(R(K)). Therefore, ydDR(R(K)) ⊂ ydR2 (K).
Summarizing our results, we have
 
ydR(K) ⊂ ydDRf (R(K)) ⊂ ydDR(R(K)) ⊂ ydR2 (K)

which completes the proof for n = 1.


The transition from n to n + 1 is illustrated by Fig. 4.4. ✷

Figure 4.4. [Diagram: the classes ydDR(Rn (K)), ydRn+1 (K), ydDRf (Rn+1 (K)), ydDR(Rn+1 (K)) and ydRn+2 (K), with arrows labelled res, c2 and c∗ indicating the operations by which the proper inclusions between consecutive classes are obtained.]

According to Theorem 4.8.6, to show that the classes of (n, R)-transformational languages form a proper hierarchy it is enough to prove the properness of the inclusion ydDRf (Rec) ⊂ ydR(Rec). For this we need

Lemma 4.8.7 For each k-copying DGSDT A = (Σ, X, A, Y, P, a0 ) there exists a lin-
ear DGSDT B = (Σ, X, B, Y, P ′ , b0 ) such that Par(T τB ) = Par(T τA ), for every forest
T ⊆ FΣ (X).

Proof. For each w ∈ (Y ∪ AΞ)∗ , let w̄ denote the word obtained from w by erasing all aξ’s (a ∈ A, ξ ∈ Ξ).
Let B = {(a1 , . . . , an ) | n ≤ k, ai ∈ A(i = 1, . . . , n)} and b0 = (a0 ). Moreover, P ′ is
defined in the following way:


(i) Let a = (a1 , . . . , an ) ∈ B and x ∈ X be arbitrary. Assume that the productions ai x → vi (ai ∈ A, vi ∈ Y ∗ , i = 1, . . . , n) are in P . Then the production ax → v1 . . . vn is in P ′ .

(ii) Take an arbitrary a = (a1 , . . . , an ) ∈ B and σ ∈ Σm (m ≥ 0). Suppose P contains, for each i = 1, . . . , n, a production

ai σ → wij1 aij1 ξj wij2 . . . wijij aijij ξj wijij +1 = wi

(wij1 , . . . , wijij +1 ∈ (Y ∪ A(Ξm − {ξj }))∗ , aij1 , . . . , aijij ∈ A, 1 ≤ j ≤ m).

Then the production

aσ → (a111 , . . . , a1111 , . . . , an11 , . . . , an1n1 )ξ1 . . .


. . . (a1m1 , . . . , a1m1m , . . . , anm1 , . . . , anmnm )ξm w̄1 . . . w̄n

is in P ′ , provided that 1j + . . . + nj ≤ k (j = 1, . . . , m).

Obviously, B is a linear DGSDT. Moreover, the derivations in A and in B are related as follows. Take a vector a ∈ An (n ≤ k) and a tree p ∈ FΣ (X). Consider the derivations
α : apn =⇒∗A w, where w = w1 . . . wn ∈ Y ∗ and αi : ai p =⇒∗A wi (i = 1, . . . , n). By the
state-sequence of α at a node d of p we mean (a1 , . . . , an ), where ai (1 ≤ i ≤ n) is the
state-sequence of αi at d. Furthermore, we say that α is k-copying if the length of the
state-sequence of α at any node of p is at most k. Assume that α is k-copying. Then for
some w′ ∈ Y ∗ , β : apn =⇒∗B w′ exists. One can easily show by induction on hg(p) that
the state-sequence of β at any node d of p is of length one (if it exists) and coincides, as
a sequence of states of A, with the state-sequence of α at d. Finally, w is a permutation
of w′ . Therefore, the equality Par(T τA ) = Par(T τB ) holds. ✷

From Lemma 4.8.7, by Theorems 1.6.17 and 4.5.4 and Corollary 4.6.6, we get

Corollary 4.8.8 Let T ⊆ FΣ (X) be a recognizable forest and A = (Σ, X, A, Y, P, a0 ) a finite-copying DGSDT. Then Par(T τA ) is semilinear. ✷

We now can state and prove that the hierarchy of (n,R)-transformational languages is
infinite.

Theorem 4.8.9 For every natural number n, the inclusions


 
ydRn (Rec) ⊂ ydDRf (Rn (Rec)) ⊂ ydDR(Rn (Rec)) ⊂ ydRn+1 (Rec)

hold.

Proof. By Lemma 4.7.2 and Corollary 4.6.6, Rec is closed under regular insertion and
relabeling. Thus, by Theorems 4.8.6, 4.5.4, and 4.7.8, and Corollary 4.8.8, it is enough
to show that there exist a regular forest T ⊆ FΣ (X) and a GSDT A = (Σ, X, A, Y, P, a0 )
such that Par(T τA ) is not semilinear. For this let Σ = Σ1 = {σ}, A = {a0 }, X = {x},


Y = {y} and P = {a0 σ → a0 ξ1 a0 ξ1 , a0 x → y}. Moreover, let T = {σ(x)}∗x . Then T τA = {y^(2^n) | n = 0, 1, . . .}: each application of the production a0 σ → a0 ξ1 a0 ξ1 doubles the number of copies, so the tree σ^n(x) is translated into y^(2^n). Thus, Par(T τA ) = {⟨2^n⟩ | n = 0, 1, . . .}, which is obviously not semilinear. ✷

From Theorem 4.8.9 we directly get

Corollary 4.8.10 For every natural number n the inclusions


(i) ydRn (Rec) ⊂ ydRn+1 (Rec),
(ii) Rn (Rec) ⊂ Rn+1 (Rec),
(iii) Rn ⊂ Rn+1
hold. ✷

Finally, we give two more hierarchies of transformational languages, surface forests and tree transformations.

Theorem 4.8.11 For every natural number n the inclusions

ydRn (Rec) ⊂ ydF n+1 (Rec) ⊂ ydRn+1 (Rec)

are valid.

Proof. By Theorems 4.3.3 and 4.3.12 and Corollary 4.6.6, the inclusions ydRn (Rec) ⊆ ydF n+1 (Rec) ⊆ ydRn+1 (Rec) hold. By the proofs of Theorems 4.8.6 and 4.8.9, ydRn (Rec) is a proper subclass of ydH(Rn (Rec)). Moreover, by Theorems 4.3.3 and 4.3.12 and Corollary 4.6.6, the equality H(Rn (Rec)) = F n+1 (Rec) holds. Thus, the inclusion ydRn (Rec) ⊂ ydF n+1 (Rec) is valid. Finally, by Theorem 4.8.9, ydH(Rn (Rec)) ⊆ ydDR(Rn (Rec)) ⊂ ydRn+1 (Rec). Therefore, the inclusion ydF n+1 (Rec) ⊂ ydRn+1 (Rec) is also valid. ✷

From Theorem 4.8.11, using Theorems 4.3.3 and 4.3.12 and Corollary 4.6.6, we get the
following results.

Corollary 4.8.12 For every natural number n the inclusions

Rn (Rec) ⊂ F n+1 (Rec) ⊂ Rn+1 (Rec)

hold. ✷

Corollary 4.8.13 For every natural number n the inclusions


(i) ydF n (Rec) ⊂ ydF n+1 (Rec),
(ii) F n (Rec) ⊂ F n+1 (Rec),
(iii) F n ⊂ F n+1
are valid. ✷


4.9 THE EQUIVALENCE OF TREE TRANSDUCERS


Since the equivalence problem for (nondeterministic) generalized sequential machines
is undecidable, there exists no algorithm to decide for two arbitrary tree transducers
whether or not they are equivalent. In this section we show that there is an algorithm
for deciding the equivalence of two tree transducers when at least one of them induces
a partial mapping. Moreover, we shall prove that it is decidable whether the tree trans-
formation induced by a given tree transducer is a partial mapping when restricted to a
given recognizable forest.
We start by introducing a concept.
Definition 4.9.1 Let p ∈ FΣ (X). A tree p′ ∈ F̂Σ (X ∪ Ξn ) is called a supertree of p if
there are trees p1 , . . . , pn ∈ FΣ (X) such that p = p′ (p1 , . . . , pn ).
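For instance, assuming a binary symbol σ ∈ Σ2 and a frontier variable x ∈ X, the tree σ(x, ξ1 ) is a supertree of σ(x, σ(x, x)), since σ(x, σ(x, x)) = σ(x, ξ1 )(σ(x, x)).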
To prove the decidability results we shall give five reduction rules formulated in the
following five lemmas. In these lemmas A = (Σ, X, A, Ω, Y, P, A′ ) will be a fixed R-
transducer and B = (B, β, B ′ ) will be a fixed ΣX-recognizer with B = (B, Σ) and
T (B) = T . Furthermore, set Q = {p ∈ T | |pτA | ≥ 2}, i.e., Q consists of all trees from T
which are translated into at least two different output trees by A.
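Note that τA |T is a (partial) mapping if and only if Q = ∅.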
Lemma 4.9.2 Let p1 , p2 ∈ F̂Σ (X ∪ Ξ1 ), p3 ∈ FΣ (X), n1 , n′1 , n2 , n′2 ≥ 0, q1 ∈ F̂Ω (Y ∪ Ξn1 ), q1′ ∈ F̂Ω (Y ∪ Ξn′1 ), q2 ∈ F̂Ωn1 (Y ∪ Ξn2 ), q′2 ∈ F̂Ωn′1 (Y ∪ Ξn′2 ), q3 ∈ FΩ (Y )n2 , q′3 ∈ FΩ (Y )n′2 , a0 , a′0 ∈ A′ and ai ∈ Ani , a′i ∈ An′i (i = 1, 2). Moreover, set Ai = {aij | j = 1, . . . , ni } and A′i = {a′ij | j = 1, . . . , n′i } (i = 1, 2). Assume that the following conditions are satisfied:

(i) p1 (p2 (p3 )) ∈ T ,

(ii) a0 p1 ⇒∗ q1 (a1 ξ1n1 ), a′0 p1 ⇒∗ q1′ (a′1 ξ1n′1 ),

(iii) a1 p2n1 ⇒∗ q2 (a2 ξ1n2 ), a′1 p2n′1 ⇒∗ q′2 (a′2 ξ1n′2 ),

(iv) a2 p3n2 ⇒∗ q3 , a′2 p3n′2 ⇒∗ q′3 ,

(v) p3 β̂ = p2 (p3 )β̂, A1 ⊆ A2 , A′1 ⊆ A′2 ,

(vi) for all r ∈ FΩ (Y )n1 and r′ ∈ FΩ (Y )n′1 , q1 (r) 6= q1′ (r′ ).
Then p1 (p3 ) ∈ Q.

Proof. First let us note that the conditions of Lemma 4.9.2 imply p1 (p2 (p3 )) ∈ Q.
Next take two mappings f : {1, . . . , n1 } → {1, . . . , n2 } and g : {1, . . . , n′1 } → {1, . . . , n′2 } such that a1i = a2f (i) (i = 1, . . . , n1 ) and a′1i = a′2g(i) (i = 1, . . . , n′1 ). By (v), there are such mappings f and g. Thus, by (iv), we have a1 p3n1 ⇒∗ r and a′1 p3n′1 ⇒∗ r′ with r = (q3f (1) , . . . , q3f (n1 ) ) and r′ = (q3′ g(1) , . . . , q3′ g(n′1 ) ). This, by (ii), implies a0 p1 (p3 ) ⇒∗ q1 (r) and a′0 p1 (p3 ) ⇒∗ q1′ (r′ ). By (vi), q1 (r) 6= q1′ (r′ ). Moreover, by (v), p1 (p3 ) ∈ T . Therefore,
p1 (p3 ) ∈ Q. ✷


Lemma 4.9.3 Let p1 ∈ F̂Σ (X ∪ Ξ1 ), p2 ∈ FΣ (X), n, n′ > 0, q1 ∈ F̂Ω (Y ∪ Ξn ), q1′ ∈ F̂Ω (Y ∪ Ξn′ ), q2 ∈ FΩ (Y )n , q′2 ∈ FΩ (Y )n′ , a0 , a′0 ∈ A′ , a ∈ An and a′ ∈ An′ . Furthermore, let K be the maximum of the heights of the right-hand sides of the productions from P . Assume that the following conditions are satisfied:

(i) p1 (p2 ) ∈ T ,

(ii) a0 p1 ⇒∗ q1 (aξ1n ), a′0 p1 ⇒∗ q1′ (a′ ξ1n′ ),

(iii) ap2n ⇒∗ q2 , a′ p2n′ ⇒∗ q′2 ,

(iv) path1 (q1 ) is an initial segment of path1 (q1′ ), and
l(path1 (q1′ )) − l(path1 (q1 )) > kAk2 |B|K, hg(p2 ) ≥ kAk2 |B|.

Then there exists an r ∈ FΣ (X) with |r| < |p2 | such that p1 (r) ∈ Q.

Proof. Set R = {r ∈ FΣ (X) | p1 (r) ∈ T , |r| ≤ |p2 |, ar n ⇒∗ s, a′ r n′ ⇒∗ s′ for some s ∈ FΩ (Y )n and s′ ∈ FΩ (Y )n′ }. Obviously, R is nonvoid. Denote by r an element from R with minimal length. We prove that p1 (r) ∈ Q and hg(r) < kAk2 |B|.
First assume that hg(r) ≥ kAk2 |B|. Then there are

r1 , r2 ∈ F̂Σ (X ∪ Ξ1 ), r3 ∈ FΣ (X), m1 , m′1 , m2 , m′2 ≥ 0, s1 ∈ F̂Ωn (Y ∪ Ξm1 ), s′1 ∈ F̂Ωn′ (Y ∪ Ξm′1 ), s2 ∈ F̂Ωm1 (Y ∪ Ξm2 ), s′2 ∈ F̂Ωm′1 (Y ∪ Ξm′2 ), s3 ∈ FΩ (Y )m2 , s′3 ∈ FΩ (Y )m′2 , bi ∈ Ami , b′i ∈ Am′i (i = 1, 2) such that

(I) r = r1 (r2 (r3 )), r2 6= ξ1 ,

(II) ar1n ⇒∗ s1 (b1 ξ1m1 ), a′ r1n′ ⇒∗ s′1 (b′1 ξ1m′1 ),

(III) b1 r2m1 ⇒∗ s2 (b2 ξ1m2 ), b′1 r2m′1 ⇒∗ s′2 (b′2 ξ1m′2 ),

(IV) b2 r3m2 ⇒∗ s3 , b′2 r3m′2 ⇒∗ s′3 ,

(V) r3 β̂ = r2 (r3 )β̂, B1 ⊆ B2 and B1′ ⊆ B2′ , where Bi = {bij | 1 ≤ j ≤ mi }, Bi′ = {b′ij | 1 ≤ j ≤ m′i } (i = 1, 2).

Take two mappings f : {1, . . . , m1 } → {1, . . . , m2 } and g : {1, . . . , m′1 } → {1, . . . , m′2 } such that b1i = b2f (i) (1 ≤ i ≤ m1 ) and b′1i = b′2g(i) (1 ≤ i ≤ m′1 ). Obviously, atn ⇒∗ s1 (s3f (1) , . . . , s3f (m1 ) ) and a′ tn′ ⇒∗ s′1 (s′3g(1) , . . . , s′3g(m′1 ) ), where t = r1 (r3 ). Moreover, r1 (r3 )β̂ = r β̂ also holds. Therefore, r1 (r3 ) ∈ R, which is a contradiction since |r1 (r3 )| < |r|.
Thus, we got that hg(r) < kAk2 |B|. Therefore, for arbitrary vectors s ∈ FΩ (Y )n and s′ ∈ FΩ (Y )n′ satisfying ar n ⇒∗ s and a′ r n′ ⇒∗ s′ , the inequalities hg(s1 ), hg(s′1 ) ≤ kAk2 |B|K hold. This, by (iv), obviously implies the conclusion of Lemma 4.9.3. ✷


Lemma 4.9.4 Let p1 , p2 , p3 ∈ F̂Σ (X ∪ Ξ1 ), p4 ∈ FΣ (X), ni , n′i , mi ≥ 0 (i = 1, 2, 3),

q1 ∈ F̂Ω (Y ∪ Ξn1 +1 ), q1′ ∈ F̂Ω (Y ∪ Ξn′1 +1 ), r1 ∈ F̂Ω (Y ∪ Ξm1 ),
q2 ∈ F̂Ωn1 (Y ∪ Ξn2 ), q′2 ∈ F̂Ωn′1 (Y ∪ Ξn′2 ), r2 ∈ F̂Ωm1 (Y ∪ Ξm2 ),
q3 ∈ F̂Ωn2 (Y ∪ Ξn3 ), q′3 ∈ F̂Ωn′2 (Y ∪ Ξn′3 ), r3 ∈ F̂Ωm2 (Y ∪ Ξm3 ),
q4 ∈ FΩ (Y )n3 , q′4 ∈ FΩ (Y )n′3 , r4 ∈ FΩ (Y )m3 ,
a0 , a′0 ∈ A′ , a ∈ A, ai ∈ Ani , a′i ∈ An′i , bi ∈ Ami (i = 1, 2, 3).

Moreover, take an r ∈ FΩ (Y ), and let r ′ = r1 (r2 (r3 (r4 ))). Finally, set Ai = {aij | j = 1, . . . , ni }, A′i = {a′ij | j = 1, . . . , n′i } and Bi = {bij | j = 1, . . . , mi } (i = 1, 2, 3). Assume
that the following conditions are satisfied:

(i) p1 (p2 (p3 (p4 ))) ∈ T ,

(ii) a0 p1 ⇒∗ q1 (aξ1 , a1 ξ1n1 ), a′0 p1 ⇒∗ q1′ (r1 (b1 ξ1m1 ), a′1 ξ1n′1 ),

(iii) a1 p2n1 ⇒∗ q2 (a2 ξ1n2 ), a′1 p2n′1 ⇒∗ q′2 (a′2 ξ1n′2 ), ap2 ⇒∗ aξ1 , b1 p2m1 ⇒∗ r2 (b2 ξ1m2 ),

(iv) a2 p3n2 ⇒∗ q3 (a3 ξ1n3 ), a′2 p3n′2 ⇒∗ q′3 (a′3 ξ1n′3 ), ap3 ⇒∗ aξ1 , b2 p3m2 ⇒∗ r3 (b3 ξ1m3 ),

(v) a3 p4n3 ⇒∗ q4 , a′3 p4n′3 ⇒∗ q′4 , ap4 ⇒∗ r, b3 p4m3 ⇒∗ r4 ,

(vi) p4 β̂ = p3 (p4 )β̂ = p2 (p3 (p4 ))β̂, A1 ⊆ A2 ⊆ A3 , A′1 ⊆ A′2 ⊆ A′3 , B1 = B2 ⊆ B3 ,

(vii) r 6= r ′ , path1 (q1 ) = path1 (q1′ ).


 
Then at least one of the trees p1 (p2 (p4 )), p1 (p3 (p4 )) and p1 (p4 ) is in Q.

Proof. First note that the conditions of Lemma 4.9.4 imply p1 (p2 (p3 (p4 ))) ∈ Q. Indeed, let t = q1 (ξ1 , q2 (q3 (q4 ))) and t′ = q1′ (ξ1 , q′2 (q′3 (q′4 ))). Then

a0 p1 (p2 (p3 (p4 ))) ⇒∗ t(r), a′0 p1 (p2 (p3 (p4 ))) ⇒∗ t′ (r ′ )

and t(r) 6= t′ (r ′ ).


Take six mappings fi : {1, . . . , ni } → {1, . . . , ni+1 }, gi : {1, . . . , n′i } → {1, . . . , n′i+1 } and hi : {1, . . . , mi } → {1, . . . , mi+1 } (i = 1, 2) such that

aij = ai+1fi (j) (i = 1, 2, 1 ≤ j ≤ ni ), a′ij = a′i+1gi (j) (i = 1, 2, 1 ≤ j ≤ n′i ),
bij = bi+1hi (j) (i = 1, 2, 1 ≤ j ≤ mi ).


Furthermore, set f3 = f1 ◦ f2 , g3 = g1 ◦ g2 and h3 = h1 ◦ h2 . Moreover, introduce the notations

s1 = (q3f1 (1) , . . . , q3f1 (n1 ) )(q4 ), s′1 = (q3′ g1 (1) , . . . , q3′ g1 (n′1 ) )(q′4 ),
t1 = (r3h1 (1) , . . . , r3h1 (m1 ) )(r4 ),
s2 = q2 (q4f2 (1) , . . . , q4f2 (n2 ) ), s′2 = q′2 (q4′ g2 (1) , . . . , q4′ g2 (n′2 ) ),
t2 = r2 (r4h2 (1) , . . . , r4h2 (m2 ) ),
s3 = (q4f3 (1) , . . . , q4f3 (n1 ) ), s′3 = (q4′ g3 (1) , . . . , q4′ g3 (n′1 ) ),
t3 = (r4h3 (1) , . . . , r4h3 (m1 ) ).

Then the following derivations obviously hold:

a0 p1 (p3 (p4 )) ⇒∗ q1 (r, s1 ), a′0 p1 (p3 (p4 )) ⇒∗ q1′ (r1 (t1 ), s′1 ),
a0 p1 (p2 (p4 )) ⇒∗ q1 (r, s2 ), a′0 p1 (p2 (p4 )) ⇒∗ q1′ (r1 (t2 ), s′2 ),
a0 p1 (p4 ) ⇒∗ q1 (r, s3 ), a′0 p1 (p4 ) ⇒∗ q1′ (r1 (t3 ), s′3 ).

It is also obvious that p1 (p3 (p4 )), p1 (p2 (p4 )), p1 (p4 ) ∈ T .
Now assume that p1 (p2 (p4 )) 6∈ Q. Then, by (vi) and (vii), m1 , m2 , m3 > 0 and there exists an i (1 ≤ i ≤ m2 ) such that r3i (r4 ) 6= r4h2 (i) . We can choose h1 in such a way that for some j (1 ≤ j ≤ m1 ) h1 (j) = i holds. Now assume that, under the latter choice of h1 , none of p1 (p3 (p4 )) and p1 (p4 ) are in Q. Then we get r1 (t1 ) = r1 (t3 ) = r. But this
is impossible since t1j 6= t3j . ✷

Lemma 4.9.5 Let p1 , p2 , p3 ∈ F̂Σ (X ∪ Ξ1 ), p4 ∈ FΣ (X), ni , n′i , mi ≥ 0 (i = 1, 2, 3),

q1 ∈ F̂Ω (Y ∪ Ξn1 +1 ), q1′ ∈ F̂Ω (Y ∪ Ξn′1 +1 ), r1 ∈ F̂Ω (Y ∪ Ξm1 ),
q2 ∈ F̂Ωn1 (Y ∪ Ξn2 ), q′2 ∈ F̂Ωn′1 (Y ∪ Ξn′2 ), r2 ∈ F̂Ωm1 (Y ∪ Ξm2 ),
q3 ∈ F̂Ωn2 (Y ∪ Ξn3 ), q′3 ∈ F̂Ωn′2 (Y ∪ Ξn′3 ), r3 ∈ F̂Ωm2 (Y ∪ Ξm3 ),
q4 ∈ FΩ (Y )n3 , q′4 ∈ FΩ (Y )n′3 , r4 ∈ FΩ (Y )m3 ,
a0 , a′0 ∈ A′ , ai ∈ Ani , a′i ∈ An′i , bi ∈ Ami (i = 1, 2, 3).

Moreover, take an r ′ ∈ FΩ (Y ), and let r = r1 (r2 (r3 (r4 ))). Finally, set Ai = {aij | j = 1, . . . , ni }, A′i = {a′ij | j = 1, . . . , n′i } and Bi = {bij | j = 1, . . . , mi } (i = 1, 2, 3). Assume
that the following conditions are satisfied:

(i) p1 (p2 (p3 (p4 ))) ∈ T ,

(ii) a0 p1 ⇒∗ q1 (r1 (b1 ξ1m1 ), a1 ξ1n1 ), a′0 p1 ⇒∗ q1′ (r ′ , a′1 ξ1n′1 ),

(iii) a1 p2n1 ⇒∗ q2 (a2 ξ1n2 ), a′1 p2n′1 ⇒∗ q′2 (a′2 ξ1n′2 ), b1 p2m1 ⇒∗ r2 (b2 ξ1m2 ),

(iv) a2 p3n2 ⇒∗ q3 (a3 ξ1n3 ), a′2 p3n′2 ⇒∗ q′3 (a′3 ξ1n′3 ), b2 p3m2 ⇒∗ r3 (b3 ξ1m3 ),

(v) a3 p4n3 ⇒∗ q4 , a′3 p4n′3 ⇒∗ q′4 , b3 p4m3 ⇒∗ r4 ,

(vi) p4 β̂ = p3 (p4 )β̂ = p2 (p3 (p4 ))β̂, A1 ⊆ A2 ⊆ A3 , A′1 ⊆ A′2 ⊆ A′3 , B1 = B2 ⊆ B3 ,

(vii) r 6= r ′ , path1 (q1 ) = path1 (q1′ ).

Then at least one of the trees p1 (p2 (p4 )), p1 (p3 (p4 )) and p1 (p4 ) is in Q.

Proof. The proof of this lemma is similar to that of Lemma 4.9.4. ✷

Lemma 4.9.6 Let

p1 , p2 ∈ F̂Σ (X ∪ Ξ1 ), p3 ∈ FΣ (X), k, l, m, k′ , l′ , m′ ≥ 0,
q1 ∈ F̂Ω (Y ∪ Ξk+1 ), q1′ ∈ F̂Ω (Y ∪ Ξk′ +1 ), q2 ∈ F̂Ω (Y ∪ Ξl+1 ), q2′ ∈ F̂Ω (Y ∪ Ξl′ +1 ),
r ∈ F̂Ωk (Y ∪ Ξm ), r′ ∈ F̂Ωk′ (Y ∪ Ξm′ ), q3 ∈ F̂Ω (Y ∪ Ξ1 ), q3′ , r ∈ FΩ (Y ),
s ∈ FΩ (Y )l , s′ ∈ FΩ (Y )l′ , t ∈ FΩ (Y )m , t′ ∈ FΩ (Y )m′ , a0 , a′0 ∈ A′ , a, a′ ∈ A,
a ∈ Ak , a′ ∈ Ak′ , b ∈ Al , b′ ∈ Al′ , c ∈ Am and c′ ∈ Am′ .

Moreover, set A1 = {ai | i = 1, . . . , k}, B1 = {bi | i = 1, . . . , l}, C1 = {ci | i = 1, . . . , m}, A′1 = {a′i | i = 1, . . . , k′ }, B1′ = {b′i | i = 1, . . . , l′ } and C1′ = {c′i | i = 1, . . . , m′ }. Assume
that the following conditions are satisfied:

(i) p1 (p2 (p3 )) ∈ T ,

(ii) a0 p1 ⇒∗ q1 (aξ1 , aξ1k ), a′0 p1 ⇒∗ q1′ (a′ ξ1 , a′ ξ1k′ ),

(iii) ap2 ⇒∗ q2 (aξ1 , bξ1l ), a′ p2 ⇒∗ q2′ (a′ ξ1 , b′ ξ1l′ ), ap2k ⇒∗ r(cξ1m ), a′ p2k′ ⇒∗ r′ (c′ ξ1m′ ),

(iv) ap3 ⇒∗ q3 (r), a′ p3 ⇒∗ q3′ , bp3l ⇒∗ s, b′ p3l′ ⇒∗ s′ , cp3m ⇒∗ t, c′ p3m′ ⇒∗ t′ ,

(v) A1 ⊆ B1 ∪ C1 , A′1 ⊆ B1′ ∪ C1′ , p3 β̂ = p2 (p3 )β̂,

(vi) path1 (q1′ ) = path1 (q1 )path1 (q3 ) and r 6= q3′ .

Then p1 (p3 ) ∈ Q.

Proof. Introduce the notation d = (b, c), d′ = (b′ , c′ ), u = (s, t) and u′ = (s′ , t′ ). Moreover, take two mappings f : {1, . . . , k} → {1, . . . , l + m} and g : {1, . . . , k ′ } → {1, . . . , l′ + m′ } satisfying the equalities ai = df (i) (1 ≤ i ≤ k) and a′i = d′g(i) (1 ≤ i ≤ k′ ). Obviously, there are derivations a0 p1 (p3 ) ⇒∗ q1 (q3 (r), uf (1) , . . . , uf (k) ) and a′0 p1 (p3 ) ⇒∗ q1′ (q3′ , u′g(1) , . . . , u′g(k′ ) ). Moreover, p1 (p3 ) ∈ T . Since

path1 (q1 (q3 (ξ1 ), uf (1) , . . . , uf (k) )) = path1 (q1′ (ξ1 , u′g(1) , . . . , u′g(k′ ) ))

and q3′ 6= r, q1 (q3 (r), uf (1) , . . . , uf (k) ) 6= q1′ (q3′ , u′g(1) , . . . , u′g(k′ ) ). Hence, p1 (p3 ) ∈ Q. ✷

Now we are ready to state a theorem from which the main decidability results of this
section easily follow.

Theorem 4.9.7 There exists an algorithm to decide whether Q is empty.

Proof. Let K denote the maximum of the heights of the right-hand sides of the pro-
ductions from P , kAk = 2^|A| and let L be the number of all words over {1, . . . , rΣ } with
length at most kAk2 |B|K, where rΣ is the maximal m for which Σm 6= ∅. Moreover, let
k = kAk2 |A|2 |B|2L + 1, l = k + (2kAk3 |A||B|)(kAk2 |B|K + 1) and m = l + 2kAk3 |B|.
We shall show that Q is nonvoid iff it contains a tree with height less than m. The
case K = 0 being obvious, we assume that K 6= 0.
Let p be an element of Q with minimal length, and q, q ′ ∈ FΩ (Y ) trees such that
q 6= q ′ and (p, q), (p, q ′ ) ∈ τA . Assume that hg(p) ≥ m. Then there are a0 , a′0 ∈ A′ ,
p0 , . . . , pm ∈ F̂Σ (X ∪ Ξ1 ), pm+1 ∈ FΣ (X), ni , n′i ≥ 0 (i = 0, . . . , m), q0 ∈ F̂Ω (Y ∪ Ξn0 ), q0′ ∈ F̂Ω (Y ∪ Ξn′0 ), qi ∈ F̂Ωni−1 (Y ∪ Ξni ), q′i ∈ F̂Ωn′i−1 (Y ∪ Ξn′i ) (i = 1, . . . , m), qm+1 ∈ FΩ (Y )nm , q′m+1 ∈ FΩ (Y )n′m , ai ∈ Ani , a′i ∈ An′i (i = 0, . . . , m) such that the following
conditions are satisfied:

(1) p = p0 (p1 (. . . (pm+1 ) . . .)), pi 6= ξ1 (i = 1, . . . , m),

(2) q = q0 (q1 (. . . (qm+1 ) . . .)), q ′ = q0′ (q′1 (. . . (q′m+1 ) . . .)),

(3) a0 p0 ⇒∗ q0 (a0 ξ1n0 ), a′0 p0 ⇒∗ q0′ (a′0 ξ1n′0 ),
ai pi+1ni ⇒∗ qi+1 (ai+1 ξ1ni+1 ), a′i pi+1n′i ⇒∗ q′i+1 (a′i+1 ξ1n′i+1 ) (i = 0, . . . , m − 1),
am pm+1nm ⇒∗ qm+1 , a′m pm+1n′m ⇒∗ q′m+1 .

For i = 0, . . . , m, introduce the notations p̌i = p0 (p1 (. . . (pi ) . . .)), q̌i = q0 (q1 (. . . (qi ) . . .)) and q̌i′ = q0′ (q′1 (. . . (q′i ) . . .)). Moreover, let p̂i = pi+1 (. . . (pm+1 ) . . .), q̂i = qi+1 (. . . (qm+1 ) . . .) and q̂i′ = q′i+1 (. . . (q′m+1 ) . . .) (i = 0, . . . , m). Finally, set Ai = {aij | 1 ≤ j ≤ ni } and A′i = {a′ij | 1 ≤ j ≤ n′i } (i = 0, . . . , m).

If q̌l (r) 6= q̌l′ (r′ ) holds for all r ∈ FΩ (Y )nl and r′ ∈ FΩ (Y )n′l , then the fact that m − l + 1 > kAk2 |B| makes Lemma 4.9.2 applicable and hence there are i and j with
l ≤ i < j ≤ m such that p̌i (p̂j ) ∈ Q. This is obviously a contradiction since |p̌i (p̂j )| < |p|.
Thus, we may assume that at least one of nl and n′l , say nl , is greater than 0. Moreover,
it can also be supposed that there are an il (1 ≤ il ≤ nl ), an r ′ ∈ F̂Ω (Y ∪ Ξ1 ) and an
s′ ∈ FΩ (Y ) such that q ′ = r ′ (s′ ), path1 (r ′ ) = pathil (q̌l ) and s′ 6= q̂lil . Then for each
j < l, nj > 0. Now let ij (0 ≤ j < l, 1 ≤ ij ≤ nj ) be those uniquely determined integers
for which pathij (q̌j ) are initial segments of pathil (q̌l ). Without loss of generality, we may
assume that i0 = . . . = il = 1.
Now suppose that there exists no w ∈ {pathi (q̌l′ ) | 1 ≤ i ≤ n′l } such that path1 (q̌l ) is an
initial segment of w or w is an initial segment of path1 (q̌l ). Then for each i (l ≤ i ≤ m),


set
Bi = {aij | path1 (q̌l ) is an initial segment of pathj (q̌i )}
and
Ci = {aij | path1 (q̌l ) is not an initial segment of pathj (q̌i )}.
Since the cardinality of {l, . . . , m} is 2kAk3 |B|+1, there are i1 , i2 , i3 (l ≤ i1 < i2 < i3 ≤
m) such that the following conditions are satisfied: p̂i1 β̂ = p̂i2 β̂ = p̂i3 β̂, Bi1 = Bi2 ⊆ Bi3 ,
Ci1 ⊆ Ci2 ⊆ Ci3 and A′i1 ⊆ A′i2 ⊆ A′i3 . From this, by Lemma 4.9.5 we get that at least
one of the trees p̌i2 (p̂i3 ), p̌i1 (p̂i2 ) and p̌i1 (p̂i3 ) is in Q, which is again a contradiction.
Therefore, for an il (1 ≤ il ≤ n′l ), pathil (q̌l′ ) is an initial segment of path1 (q̌l ) or
path1 (q̌l ) is an initial segment of pathil (q̌l′ ). Let ij (0 ≤ j < l, 1 ≤ ij ≤ n′j ) be
those uniquely determined integers for which pathij (q̌j′ ) are initial segments of pathil (q̌l′ ).
Without loss of generality we may assume that i0 = . . . = il = 1. We can also assume
that path1 (q̌l ) is an initial segment of path1 (q̌l′ ).
Now let us distinguish the following two cases:

a) path1 (q̌k′ ) is an initial segment of path1 (q̌l ). If in addition for some i (0 ≤ i ≤ k), abs(l(path1 (q̌i )) − l(path1 (q̌i′ ))) > kAk2 |B|K then, by Lemma 4.9.3, there exists an r ∈ FΣ (X) such that p̌i (r) ∈ Q and |r| < |p̂i |. (Here abs stands for absolute value.) This obviously is a contradiction. Therefore, for each i (0 ≤ i ≤ k), abs(l(path1 (q̌i )) − l(path1 (q̌i′ ))) ≤ kAk2 |B|K. Then, since the cardinality of {1, . . . , k} is kAk2 |A|2 |B|2L + 1, for some integers i and j (1 ≤ i < j ≤ k),
we have:
(I) path1 (q̌i ) is an initial segment of path1 (q̌i′ ), path1 (q̌j ) is an initial segment of
path1 (q̌j′ ), path1 (q̌i′ )/path1 (q̌i ) = path1 (q̌j′ )/path1 (q̌j ), or
(II) path1 (q̌i′ ) is an initial segment of path1 (q̌i ), path1 (q̌j′ ) is an initial segment
of path1 (q̌j ), path1 (q̌i )/path1 (q̌i′ ) = path1 (q̌j )/path1 (q̌j′ ). (Here uv/u = v for
any two words u and v.) Moreover, p̂j β̂ = p̂i β̂, ai1 = aj1 , a′i1 = a′j1 , Bi ⊆ Bj
and Bi′ ⊆ Bj′ , where Bs = {ast | 2 ≤ t ≤ ns } and Bs′ = {a′st | 2 ≤ t ≤
n′s } (s = i, j). Then, by Lemma 4.9.6, p̌i (p̂j ) ∈ Q, which is a contradiction
since |p̌i (p̂j )| < |p|.

b) path1 (q̌l ) is an initial segment of path1 (q̌k′ ). We shall show that

l(path1 (q̌l )) − l(path1 (q̌k )) > kAk2 |B|K.

Then l(path1 (q̌k′ )) − l(path1 (q̌k )) > kAk2 |B|K will also hold, which, by Lemma 4.9.3, will be a contradiction.

Thus, assume that l(path1 (q̌l )) − l(path1 (q̌k )) ≤ kAk2 |B|K. Then, since the
cardinality of {k + 1, . . . , l} is (2kAk3 |A||B|)(kAk2 |B|K + 1), there are i1 and
i2 (k ≤ i1 < i2 ≤ l) such that i2 − i1 = 2kAk3 |A||B| and path1 (q̌i1 ) = . . . =
path1 (q̌i2 ), i.e., q(i1 +1)1 = . . . = qi21 = ξ1 . Now for each j (i1 ≤ j ≤ i2 ) set

Bj = {a′jt | 1 ≤ t ≤ n′j , path1 (q̌i′1 ) is an initial segment of patht (q̌j′ )}

and

Cj = {a′jt | 1 ≤ t ≤ n′j , path1 (q̌i′1 ) is not an initial segment of patht (q̌j′ )}.

Since the cardinality of {i1 , . . . , i2 } is 2kAk3 |A||B| + 1, there are integers j1 , j2 and
j3 (i1 ≤ j1 < j2 < j3 ≤ i2 ) such that p̂j1 β̂ = p̂j2 β̂ = p̂j3 β̂, aj11 = aj21 = aj31 ,
Aj1 ⊆ Aj2 ⊆ Aj3 , Bj1 = Bj2 ⊆ Bj3 and Cj1 ⊆ Cj2 ⊆ Cj3 , where Ajt = {ajts | 2 ≤
s ≤ njt } (t = 1, 2, 3). Therefore, by Lemma 4.9.4, at least one of the trees p̌j2 (p̂j3 ),
p̌j1 (p̂j2 ) and p̌j1 (p̂j3 ) is in Q which is again a contradiction. ✷
Now we are ready to prove

Theorem 4.9.8 For any two R-transducers A = (Σ, X, A, Ω, Y, P, A′ ) and B = (Σ, X, B, Ω, Y, P ′ , B ′ ) and any recognizable ΣX-forest T it is decidable
(i) whether τA |T is a (partial) mapping,

(ii) whether τA |T ⊆ τB |T , provided that τB|T is a (partial) mapping,

(iii) whether A is equivalent to B, provided that τA or τB is a (partial) mapping,


and
(iv) whether A is equivalent to B, provided that at least one of them is deterministic.

Proof. By Theorem 4.9.7, (i) is true. Moreover, (iii) and (iv) follow from (ii) since the
domain of an R-transformation is regular and, by Theorem 2.10.3, it is decidable for two
regular forests whether one of them contains the other one. Therefore, it is enough to
prove (ii).
We may assume that A ∩ B = ∅. Let us construct an R-transducer C =
(Σ, X, C, Ω, Y, P ′′ , C ′ ) with C = A ∪ B, C ′ = A′ ∪ B ′ and P ′′ = P ∪ P ′ . Obviously,
τC |T = τA |T ∪ τB|T . Thus τA |T ⊆ τB|T holds iff dom(τA ) ∩ T ⊆ dom(τB ) ∩ T and τC |T
is a partial mapping. ✷

Before stating the analogous result for F-transducers we prove a lemma.

Lemma 4.9.9 For any F-transducer A = (Σ, X, A, ∆, Y, P, A′ ) and R ∈ Rec(Σ, X) one can effectively give an R-transducer B = (Ω, X, B, ∆, Y, P ′ , B ′ ) and a forest S ∈ Rec(Ω, X) such that τA |R is a partial mapping iff τB|S is a partial mapping.

Proof. Construct an RR -transducer Ā = (Σ, X, A, ∆, Y, P̄ , A′ ) where P̄ is given as follows:

(i) If x → ar (x ∈ X, a ∈ A, r ∈ F∆ (Y )) is in P , then ax → r is in P̄ .

(ii) If σ(a1 , . . . , am ) → ar (σ ∈ Σm , m ≥ 0, a1 , . . . , am , a ∈ A, r ∈ F∆ (Y ∪ Ξm )) is in P , then aσ → r(a1 ξ1 , . . . , am ξm ), D is in P̄ , where D(ξi ) = dom(τA(ai ) ) (i = 1, . . . , m). Since, by Theorem 4.1.10 (i), dom(τA(a) ) (a ∈ A) is regular, Ā is an RR -transducer. Observe that τA(a) ⊆ τĀ(a) holds for every a ∈ A.


We shall show that for all {a, a′ } ⊆ A and p ∈ FΣ (X) the equivalence

|τA(a) (p) ∪ τA(a′ ) (p)| > 1 ⇐⇒ |τĀ(a) (p) ∪ τĀ(a′ ) (p)| > 1 (1)

holds. (Note that a and a′ are not necessarily distinct.)


Since τA(a) ⊆ τĀ(a) , the left side of (1) implies its right side.
The converse will be proved by induction on hg(p). If hg(p) = 0, then our statement obviously holds. Now let p = σ(p1 , . . . , pm ) (σ ∈ Σm , m > 0, p1 , . . . , pm ∈ FΣ (X)) and r, r ′ ∈ F∆ (Y ) be such that ap ⇒∗Ā r, a′ p ⇒∗Ā r ′ and r 6= r ′ . Moreover, assume that the right side of (1) implies its left side for every state and every ΣX-tree of height less than hg(p).
Let us write the above derivations in the form

aσ ⇒Ā r(a1n1 ξ1n1 , . . . , amnm ξmnm ), aini pini ⇒Ā ri (i = 1, . . . , m)

and

a′ σ ⇒Ā r′ (b1n′1 ξ1n′1 , . . . , bmn′m ξmn′m ), bin′i pin′i ⇒Ā r′i (i = 1, . . . , m),

where a, a′ , ai , bi ∈ A, i = 1, . . . , m, n1 + . . . + nm = n, n′1 + . . . + n′m = n′ , r ∈ F̂∆ (Y ∪ Ξn ), r ′ ∈ F̂∆ (Y ∪ Ξn′ ), r(r1 , . . . , rm ) = r and r′ (r′1 , . . . , r′m ) = r ′ . Moreover, (σ(a1 , . . . , am ), ar(ξ1n1 , . . . , ξmnm )), (σ(b1 , . . . , bm ), a′ r ′ (ξ1n′1 , . . . , ξmn′m )) ∈ P .

Now distinguish the following two cases:

(I) There exists an i (1 ≤ i ≤ m) with ni > 0 and |τĀ(ai ) (pi )| > 1 or there exists a j (1 ≤ j ≤ m) with n′j > 0 and |τĀ(bj ) (pj )| > 1. Then, by the induction hypothesis, |τA(ai ) (pi )| > 1 or |τA(bj ) (pj )| > 1. Therefore, by the definition of P̄ , |τA(a) (p)| > 1 or |τA(a′ ) (p)| > 1 also holds.

(II) Assume that there are no i and j satisfying (I). Then, ri1 = . . . = rini = ri (1 ≤ i ≤ m) if ni > 0. For all such i, by τA(ai ) ⊆ τĀ(ai ) and the choice of D, we have pi ⇒∗A ai ri . Moreover, again by the choice of D, if ni = 0 then also there exists an ri ∈ F∆ (Y ) such that pi ⇒∗A ai ri holds. Thus, we have the derivation p ⇒∗A ar. Using similar arguments, one can show that p ⇒∗A a′ r ′ is also valid. Therefore, |τA(a) (p) ∪ τA(a′ ) (p)| > 1.

Thus, we have proved that τA |R is a partial mapping iff τĀ |R is a partial mapping. By Theorem 4.4.6 (i), there exist a deterministic F-relabeling τ : FΣ (X) → FΩ (X) and an R-transducer B = (Ω, X, B, ∆, Y, P ′′ , B ′ ) such that τĀ = τ ◦ τB . Moreover, by Lemma 4.6.5, Rτ = S is in Rec(Ω, X) and S can be obtained effectively from R. Therefore, τĀ |R is a partial mapping iff τB |S is a partial mapping. ✷

Now we state and prove


Theorem 4.9.10 For any two F-transducers A = (Σ, X, A, Ω, Y, P, A′ ) and B = (Σ, X, B, Ω, Y, P ′ , B ′ ) and recognizable ΣX-forest T , it is decidable

(i) whether τA |T is a partial mapping,

(ii) whether τA |T ⊆ τB |T , provided that τB|T is a partial mapping,

(iii) whether A is equivalent to B, provided that τA or τB is a partial mapping, and

(iv) whether A is equivalent to B, provided that at least one of them is deterministic.

Proof. Obviously, (i) follows from Theorem 4.9.8 by Lemma 4.9.9. Moreover, (ii) im-
plies (iii) and (iv) since, by Theorem 4.1.10 (i), the domain of an F-transformation is
recognizable. Thus, it suffices to prove (ii).
Assume that A ∩ B = ∅, and construct the F-transducer

C = (Σ, X, C, Ω, Y, P ′′ , C ′ )

with C = A ∪ B, C ′ = A′ ∪ B ′ and P ′′ = P ∪ P ′ . Obviously, τC = τA ∪ τB . Therefore,


τA |T ⊆ τB|T iff dom(τA ) ∩ T ⊆ dom(τB ) ∩ T and τC |T is a partial mapping. ✷

4.10 EXERCISES
1. Define generalized sequential machines as tree transducers when strings are inter-
preted as unary trees in the usual way.

2. Let τ be a DR-transformation. Then dom(τ ) can be recognized by a DR-recognizer.

3. Show that the classes LDF and LDR, and similarly the classes LNDF and LNDR, are incomparable.

4. Let us call a DR-transducer A = (Σ, X, A, Ω, Y, P, A′ ) simple, if for every aσ → q ∈ P , whenever a1 ξi and a2 ξi occur in q, then a1 = a2 . If A is a simple DR-transducer, then τA can be induced by an F-transducer.

5. Prove that DR is not closed under composition.

6. The composition of a totally defined DR-transformation by an R-transformation is an R-transformation.

7. Is R closed under composition with LR-transformations from the right?

8. Show that F is not closed under composition with LNF-transformations from the
right.

9. Prove Theorems 4.3.7 and 4.3.9.

10. Find two R-transformations τ1 and τ2 such that τ1 ◦ τ2 is the F-transformation given in Example 4.1.3.


11. Give two F-transformations whose composition is the R-transformation of Example 4.1.6.
12. Show that F and RR are incomparable.
13. Prove that DRR is closed under DF-transformations.
14. An F-transformation (or an R-transformation) is a partial mapping iff it can be
induced by a DRR -transducer.
15. Find a DRR -transducer which is not equivalent to any DR-transducer.
16. The equivalence problem of two RR -transducers is decidable, provided that at least
one of them induces a partial mapping.
17. Find an algorithm to decide for an F-transducer whether it is equivalent to an
LF-transducer.
18. Let A = (Σ, X, A, Y, P, A′ ) be a GSDT and Ω a ranked alphabet. Let {n1 , . . . , nr }
be the set of lengths of right-hand sides of all rules from P (each element of AΞ is
counted as one symbol). Moreover, let r(Ω) = {m1 , . . . , ms }. Assume that there
exists a mapping f : {n1 , . . . , nr } → r(Ω) such that the equality
nk = mf (k) + l1 (m1 − 1) + . . . + ls (ms − 1)
holds for every k (= 1, . . . , r), where l1 , . . . , ls ≥ 0. Then there is an R-transducer B = (Σ, X, B, Ω, P ′ , B ′ ) with τA = {(p, yd(q)) | (p, q) ∈ τB }.
19. Find an R-transducer A such that τA preserves recognizability, but A is not equiv-
alent to any LF-transducer.
20. An R-transducer A = (Σ, X, A, Ω, Y, P, a0 ) is called k-metalinear if the following
conditions are satisfied:
(1) a0 does not appear in the right-hand sides in rules from P ,
(2) for each rule a0 σ → q (σ ∈ Σm ) in P every ξi (1 ≤ i ≤ m) can occur in q at
most k times, and
(3) for each rule aσ → q (a 6= a0 , σ ∈ Σm ) in P the number of occurrences of each
ξi (1 ≤ i ≤ m) in q is 0 or 1.
Let A be a k-metalinear R-transducer. Does τA preserve recognizability?
21. For a ranked alphabet Σ let Σ̃ = Σ̃0 ∪ Σ̃1 be the ranked alphabet with Σ̃0 = Σ0
and Σ̃1 = {σ̃ | σ ∈ Σm , m > 0}. Define the mapping ph : FΣ (X) → pFΣ̃ (X)
by ph(d) = {d} (d ∈ Σ0 ∪ X) and

ph(σ(p1 , . . . , pm )) = {σ̃(t) | t ∈ ph(p1 ) ∪ . . . ∪ ph(pm )}

(σ ∈ Σm , m > 0, p1 , . . . , pm ∈ FΣ (X)). Show that if T ∈ Surf(R) then ph(T ) = ∪{ph(t) | t ∈ T } is recognizable.
ph(t) | t ∈ T is recognizable.


22. Is Surf(R) closed under intersection?

23. Give a recursive definition of the concepts of state-sequence and production-


sequence.

24. For every F-transducer there is an equivalent totally defined F-transducer with a
single final state.

25. For every DF-transducer (DR-transducer) one can effectively give an equivalent
DF-transducer (DR-transducer) with a minimal number of states.

4.11 NOTES AND REFERENCES


The concept of the R-transducer was introduced by Rounds [215] and Thatcher [238], thus extending generalized sequential machines from strings to trees and giving a tree automaton formalism for parts of mathematical linguistics (in particular, for the theory
of syntax directed compilation). The F-transducer is due to Thatcher [239]. As in
the case of tree recognizers, many of the authors dealing with tree transducers allow a
symbol from a ranked alphabet to have more than one rank, and most of them use no
separate frontier alphabets.
The results of Section 4.2 can be found in Engelfriet [75], and most results of Section
4.3 are also from this work. Theorems 4.3.3, 4.3.12, 4.3.13 were obtained by Baker [26].
Tree transducers with regular look-ahead are defined and investigated in Engelfriet
[78]. Generalized syntax directed translations were introduced by Aho and Ullman [2]
in the special case where the domain of the translation is the forest of all parse trees of a
given context-free grammar. (Parse trees are almost the same as our production trees.)
Applying a generalized syntax directed translation in the sense of Aho and Ullman is
equivalent to applying a DGSDT of Section 4.5 which, by Theorem 4.5.4, is equivalent
to applying a DR-transducer and then taking the yield of the resulting tree. The more
general concept of a GSDT was introduced in Baker [28]. In the same work she proved
that for each n, ydSurf(Rn ) and ydSurf(F n ) are properly contained in the family of
deterministic context-sensitive languages.
The results of Section 4.6 are from Engelfriet [75], Gécseg [102] and Rounds [215].
The first result about the Surf(Rn )-hierarchy can be found in Ogden and Rounds
[190], where they proved that Surf(R) is a proper subclass of Surf(R2 ) and conjec-
tured the properness of the hierarchy. It was Engelfriet [80, 83] who succeeded in
proving that the Rn -, Surf(Rn )-, and ydSurf(Rn )-hierarchies (and their F-transducer
counterparts) are proper. Sections 4.7 and 4.8 are based on his work.
The decidability results of Section 4.9 are from Ésik [90]. Using a different technique
Zachar [254] also proved the decidability of the equivalence problem of DF-transducers.
As a conclusion we mention some other topics relevant to the subject matter of Chap-
ter 4.
A sequential program machine (sp-machine) introduced by Buda [46] is such a gen-
eralization of a gsm whose inputs are strings and whose outputs are n-tuples of n-ary trees. Buda showed that the equivalence problem of sp-machines is solvable and that
this implies that the equivalence of certain program schemes is also decidable.
Engelfriet and Filè introduced a new type of tree transducer called macro tree trans-
ducer which is a combination of the R-transducer and the context-free tree grammar
(see Engelfriet [82]). They propose to use macro tree transducers to model attribute
grammars of D. E. Knuth (Math. Systems Theory 2 (1968), 127–145: Correction: ibid
5 (1971), 95–96). For tree transformations in terms of magmoids we refer the reader to
Arnold and Dauchet [13, 16], Dauchet [61, 62], and Lilin [159, 160].
Finally, we note that much of the category theoretic work mentioned in the Notes and
References of Chapter 2 deal with tree transductions.

Bibliography
We hope that most of the literature dealing with tree automata, tree grammars, forests,
tree transductions, or their applications (published by the end of 1982) is listed in this
bibliography. It also includes some more general works which devote at least a part to
our subject, as well as a few items on closely related topics. As to the latter category
the decision on inclusion or exclusion has sometimes been difficult. Of a paper published
more than once in almost identical form, just the more complete, or the more widely
available, version is mentioned. Preliminary reports and unpublished theses are not
included except for a few cases. Items published by the same author(s) in the same year
are distinguished for reference by a letter after the year. For some of the most often
recurring journals and proceedings we use the following abbreviations:

n. Ann. ACM STC = Proceedings of the nth Annual ACM Symposium on Theory of
Computing
n. Coll. Lille = Les Arbres en Algébre et en Programmation, nth Colloque du Lille,
Université de Lille I
IC = Information and Control
n. IEEE Symp. (n ≤ 15) = nth Annual Symposium on Switching and Automata Theory
n. IEEE Symp (n > 15) = nth Annual Symposium on Foundations of Computer Science
J. ACM = J. Assoc. Comput. Mach.
J. CSS = J. Comput. System Sci.
LN in CS = Lecture Notes in Computer Science (Springer-Verlag)
MST = Mathematical Systems Theory
S-C-C = Systems-Computers-Controls

[1] ADÁMEK, J. and TRNKOVÁ, V. (1981): Varietors and machines in a category. – Al-
gebra Universalis 13 (1981), 89-132.

[2] AHO, A. V. and ULLMAN, J. D. (1971): Translations on a context-free grammar. –


IC 19 (1971), 439-475.

[3] ALAGIĆ, S. (1975a): Categorical theory of tree processing. – Category Theory Ap-
plied to Computation and Control (Proc. Symp., San Francisco, 1974), LN in CS
25 (1975), 65-72.

[4] ALAGIĆ, S. (1975b): Natural state transformations. – J. CSS 10 (1975), 266-307.

[5] ARBIB, M. A. and GIVE’ON, Y. (1968): Algebra automata I: Parallel programming


as a prolegomena to the categorical approach. – IC 12 (1968), 331-345.


[6] ARBIB, M. A. and MANES, E. G. (1974): Machines in a category: An expository


introduction. – SIAM Review 16 (1974), 163-192.

[7] ARBIB, M. A. and MANES, E. G. (1978): Tree transformations and the semantics of
loop-free programs. – Acta Cybernet. 4 (1978), 11-17.

[8] ARBIB, M. A. and MANES, E. G. (1979): Interwined recursion, tree transformations,


and linear systems. – IC 40 (1979), 144-180.

[9] ARNOLD, A. (1977a): Rational sets of trees. – 2. Coll. Lille (1977), 20-28.

[10] ARNOLD, A. (1977b): Systèmes d’equations dans le magmoide. Ensembles rationnels


et algébriques d’arbres. – Thèse de doctorat, Université de Lille I (1977).

[11] ARNOLD, A. (1980): Le théorème de transversale rationnelle dans les langages


d’arbres. – MST 13 (1980), 275-282.

[12] ARNOLD, A. and DAUCHET, M. (1976a): Theorie des magmoides. – 1. Coll. Lille
(1976), 15-30.

[13] ARNOLD, A. and DAUCHET, M. (1976b): Bimorphismes de magmoides. – 1. Coll.


Lille (1976), 31-43.

[14] ARNOLD, A. and DAUCHET, M. (1976c): Transductions de forêts reconnaissables


monadiques. Forêts corégulières – RAIRO Informat. Théor. 10 (1976), No. 3, 5-28.

[15] ARNOLD, A. and DAUCHET, M.(1976d): Un théorème de duplication pour les forêts
algébriques. – J. CSS 13 (1976), 223-244.

[16] ARNOLD, A. and DAUCHET, M. (1976e): Bi-transductions de forêts, – Automata,


Languages and Programming (Conf. Rec., Edinburgh, 1976), University Press, Ed-
inburgh (1976), 74-86.

[17] ARNOLD, A. and DAUCHET, M. (1977): Un théorème de Chomsky-Schützenberger


pour les forêts algébriques. – Calcolo 14 (1977), 161-184.

[18] ARNOLD, A. and DAUCHET, M. (1978a): Forêts algébriques et homomorphismes


inverses. – IC 37 (1978), 182-196.

[19] ARNOLD, A. and DAUCHET, M. (1978b): Sur l’inversion des morphismes d’arbres.
– Automata, Languages and Programming (Fifth Coll., Udine 1978), LN in CS 62
(1978), 26-35.

[20] ARNOLD, A. and DAUCHET, M. (1978c): Une relation d’equivalence decidable sur la
classe des forêts reconnaissables. – MST 12 (1978), 103-128.

[21] ARNOLD, A. and DAUCHET, M. (1978d, 1979): Theorie des magmoides


(I) – RAIRO Inform. Théor. 12 (1978), 235-257.


(II) – RAIRO Inform. Théor. 13 (1979), 135-154.


[22] ARNOLD, A. and DAUCHET, M. (1982): Morphismes et bimorphismes d’arbres. –
Theor. Comput. Sci. 20 (1982), 33-93.
[23] ARNOLD, A. and LEGUY, B. (1979a): Une propriété des forêts algébriques ”de
Greibach”. – 4. Coll. Lille (1979), 1-17.
[24] ARNOLD, A. and LEGUY, B. (1979b): Forêts de Greibach et homomorphismes in-
verses. – Fundam. Comput. Theory ’79 (Proc. Conf., Berlin/Wendisch-Rietz 1979),
Akademie-Verlag Berlin (1979), 31-37.
[25] ASVELD, P. R. J. and ENGELFRIET, J. (1979): Extended linear macro grammars,
iteration grammars, and register programs. – Acta Inform. 11 (1979), 259-285.
[26] BAKER, B. S. (1973): Tree transductions and families of tree languages. – 5. Ann.
ACM STC (1973), 200-206.
[27] BAKER, B. S. (1978a): Tree transducers and tree languages. – IC 37 (1978), 241-266.
[28] BAKER, B. S. (1978b): Generalized syntax directed translation, tree transducers,
and linear space. – SIAM J. Comput. 7 (1978), 876-891.
[29] BAKER, B. S. (1979): Composition of top-down and bottom-up tree transductions.
– IC 41 (1979), 186-213.
[30] BARRERO, A. and GONZALEZ, R. C. (1976): Minimization of deterministic tree gram-
mars and automata. – Proc. IEEE Conf. Decision and Control and the 15th Symp.
Adaptive Processes (Clearwater, Fla., 1976), Inst. Electr. Electron. Engrs., New
York (1976), 404-407.
[31] BARRERO, A., GONZALEZ, R. C. and THOMASON, M. G. (1981): Equivalence and
reduction of expansive tree grammars. – IEEE Trans. Pattern Anal. & Mach. Intell.
PAMI – 3 (1981), 204-206.
[32] BENSON, D. B. (1975): Semantic preserving translations. – MST 8 (1975), 105-126.
[33] BERGER, J. and PAIR, C. (1978): Inference for regular bilanguages. – J. CSS 16
(1978), 100-122.
[34] BERSTEL, J. and REUTENAUER, C. (1982): Recognizable power series on trees. –
Theor. Comput. Sci. 18 (1982), 115-148.
[35] BERTSCH, E. (1973): Some considerations about classes of mappings between
context-free derivation systems. – GI. 1. Fachtagung Automatentheorie Formale
Sprachen (Bonn, 1973), LN in CS 2 (1973), 278-283.
[36] BILSTEIN, J. and DAMM, W. (1981): Top-down tree-transducers for infinite trees I.
– CAAP’81 (Trees in algebra and programming, 6th Coll., Genoa, March 1981), LN
in CS 112 (1981), 117-134.

209
Bibliography

[37] BLOOM, S. L. and ELGOT, C. C. (1976): The existence and construction of free
iterative theories. – J. CSS 12 (1976), 305-318.

[38] BOBROW, L. S. and ARBIB , M. A. (1974): Discrete Mathematics, Applied Algebra


for Computer and Information Science. – W. S. Saunders Co., Philadelphia (1974).

[39] BRAINERD, W. S. (1968): The minimalization of tree automata. – IC 13 (1968),


484-491.

[40] BRAINERD, W. S. (1969a): Tree generating regular systems. – IC 14 (1969), 217-231.

[41] BRAINERD, W. S. (1969b): Semi-Thue systems and representations of trees. –


10. IEEE Symp. (1969), 240-244.

[42] BRAYER, J. M. and FU, K.-S. (1977): A note on the k-tail method of tree grammar
inference. – IEEE Trans. Systems Man Cybernetics SMC – 7 (1977), 293-300.

[43] BUDA, A. (1978a): The equivalence problem for sequential program machines. –
3. Coll. Lille (1978), 19-26.

[44] BUDA, A. O. (1978b): Abstraktnye mashiny programm (Russian). – Akad. Nauk SSSR Sib. Otd., Vychisl. Tsentr, Preprint 108, Novosibirsk (1978).

[45] BUDA, A. (1978c): Languages of program machines (Russian). – C. R. Acad. Bulgare


Sci. 31 (1978), 1543-1544.

[46] BUDA, A. (1979): Generalized 1.5 sequential machines. – Inform. Process. Lett. 8


(1979), No. 1, 38-40.

[47] BUTTELMANN, H. W. (1971): On generalized finite automata and unrestricted gen-


erative grammars. – 3. Ann. ACM STC (1971), 63-77.

[48] BUTTELMANN, H. W. (1975a): On the syntactic structures of unrestricted gram-


mars I: Generative grammars and phrase structure grammars. – IC 29 (1975),
29-80.

[49] BUTTELMANN, H. W. (1975b): On the syntactic structures of unrestricted grammars


II: Automata. – IC 29 (1975), 81-101.

[50] CASTERAN, P. (1978): Représentation rationelle d’arbres infinis. – 3. Coll. Lille


(1978), 27-39.

[51] CATALANO, A., GNESI, S. and MONTANARI, U. (1978): Shortest path problems and
tree grammars: An algebraic framework. – Graph-grammars and their application
to computer science and biology (International workshop, Bad Honnef, 1978), LN
in CS 73 (1978), 167-179.

[52] COSTICH, O. L. (1972): A Medvedev characterization of sets recognized by general-


ized finite automata. – MST 6 (1972), 263-267.


[53] COURCELLE, B. (1976): Arbres algébriques et langages déterministes. – 1. Coll. Lille


(1976), 60-64.

[54] COURCELLE, B. (1978): Frontiers of infinite trees, – 3. Coll. Lille (1978), 76-102.

[55] CRESPI REGHIZZI, S. and DELLA VIGNA, P. (1973): Approximation of phrase markers
by regular sets. – Automata, Languages and Programming (Proc. Coll., Rocquen-
court, 1972), North Holland, Amsterdam (1973), 367-376.

[56] ČULIK, K. II (1974): Structured OL-systems. – L Systems, LN in CS 15 (1974),


216-229.

[57] ČULIK, K. II and MAIBAUM, T. S. E. (1974): Parallel rewriting systems on terms. –


Automata, Languages and Programming (Proc. Symp., Saarbrücken, 1974), LN in
CS 14 (1974), 495-511.

[58] DAMM, W. (1977): Languages defined by higher program schemes. – Automata,


Languages and Programming (Proc. Coll., Turku, 1977), LN in CS 52 (1977), 164-
179.

[59] DAMM, W. (1979): An algebraic extension of the Chomsky-hierarchy. – 4. Coll. Lille


(1979), 66-78.

[60] DAMM, W. (1982): The IO- and OI-hierarchies. – Theor. Comput. Sci. 20 (1982),
95-207.

[61] DAUCHET, M. (1977a): Grammaires transformationelles et bimorphismes de mag-


moides. – 2. Coll. Lille (1977), 249-273.

[62] DAUCHET, M. (1977b): Transductions de forêts, bimorphismes de magmoides. –


Thèse de doctorat Universite de Lille I (1977).

[63] DAUCHET, M. and MONGY, J. (1979a): Image de noyaux reconnaissables par diverses
classes de transformations. – 4. Coll. Lille (1979), 79-101.

[64] DAUCHET, M. and MONGY, J. (1979b): Transformations de noyaux reconnaissables


capacité générative des bimorphismes de forêts, – Fundam. Computation Theory ’79
(Proc. Conf. Berlin/Wendisch-Rietz 1979), Akademie-Verlag, Berlin (1979), 92-97.

[65] DONER, J. E. (1965): Decidability of the weak second-order theory of two successors.
– Notices Amer. Math. Soc. 12 (1965), Abstract 65T-468, 819.

[66] DONER, J. E. (1970): Tree acceptors and some of their applications. – J. CSS 4
(1970), 406-451.

[67] DUBINSKY, A. (1975): Computation on arbitrary algebras. – Symp. on λ-calculus


and Computer Science Theory (Rome, 1975), LN in CS 37 (1975), 319-341.


[68] DUSKE, J. (1970): Funktionenautomaten. – Automaten und Formale Sprachen


(Tagung Math. Forschungsinst., Oberwolfach, 1969), Bibliographisches Institut,
Mannheim (1970), 23-26.

[69] EILENBERG, S. and WRIGHT, J. B. (1967): Automata in general algebras. – IC 11


(1967), 452-470.

[70] ELGOT, C. C. (1975): Monadic computation and iterative algebraic theories. – Logic
Colloquium ’73, Studies in Logic, Vol. 80 (Eds. M. E. Rose and J. C. Sheperdson),
North-Holland, Amsterdam (1975), 175-230.

[71] ELGOT, C. C., BLOOM, S. L. and TINDELL, R. (1978): On the algebraic structure of
rooted trees. – J. CSS 16 (1978), 362-399.

[72] ELLIS, C. A. (1971): Probabilistic tree automata. – IC 19 (1971), 401-416.

[73] ENGELFRIET, J. (1972): A note on infinite trees. – Information Processing Lett. 1


(1972), 229-232.

[74] ENGELFRIET, J. (1975a): Tree automata and tree grammars, – Lecture notes, DAIMI
FN-10, Inst. Math., Aarhus Univ., Aarhus (1975).

[75] ENGELFRIET, J. (1975b): Bottom-up and top-down tree transformations. A com-


parison. – MST 9 (1975), 198-231.

[76] ENGELFRIET, J. (1976a): Surface tree languages and parallel derivation trees. –
Theor. Comput, Sci. 2 (1976), 9-27.

[77] ENGELFRIET, J. (1976b): Some remarks on classes of macro languages. – 1. Coll.


Lille (1976), 71-79.

[78] ENGELFRIET, J. (1976/77): Top-down tree transducers with regular look-ahead. –


MST 10 (1976/77), 289-303.

[79] ENGELFRIET, J. (1977): Macro grammars, Lindenmayer systems and other copying
devices. – Automata, Languages and Programming (Proc. Coll., Turku, 1977), LN
in CS 52 (1977), 221-229.

[80] ENGELFRIET, J. (1978a): A hierarchy of tree transducers. – 3. Coll. Lille (1978),


103-106.

[81] ENGELFRIET, J. (1978b): On tree transducers for partial functions. – Inform. Pro-
cess. Lett. 7 (1978), 170-172.

[82] ENGELFRIET, J. (1980): Some open questions and recent results on tree transducers
and tree languages. – Formal language theory. Perspectives and open problems (ed.
R. V. Book), Academic Press, New York (1980), 241-286.

[83] ENGELFRIET, J. (1982): Three hierarchies of transducers. – MST 15 (1982), 95-125.


[84] ENGELFRIET, J., ROZENBERG, G. and SLUTZKI, G. (1980): Tree transducers, L sys-
tems, and two way machines. – J. CSS 20 (1980), 150-202.

[85] ENGELFRIET, J. and SCHMIDT, E. M. (1977, 1978): IO and OI.


I – J. CSS 15 (1977), 328-353.
II – J. CSS 16 (1978), 67-99.

[86] ENGELFRIET, J. and SKYUM, S. (1976): Copying theorems. – Information Processing


Lett. 4 (1976), 157-161.

[87] ENGELFRIET, J. and SKYUM, S. (1982): The copying power of one-state tree trans-
ducers. – J. CSS 25 (1982), 418-435.

[88] ÉSIK, Z. (1978): On decidability of injectivity of tree transducers. – 3. Coll. Lille


(1978), 107-133.

[89] ÉSIK, Z. (1979): On functional tree transducers. – Fundam. Computation Theory


’79 (Proc. Conf., Berlin/Wendisch-Rietz 1979), Akademie-Verlag, Berlin (1979),
121-127.

[90] ÉSIK, Z. (1980): Decidability results concerning tree transducers I. – Acta Cybernet.
5 (1980). 1-20.

[91] ÉSIK, Z. (1981): An axiomatization of regular forests in the language of algebraic


theories with iteration. – Fundamentals of computation theory (Proc, Conf., Szeged
1981), LN in CS 117 (1981), 130-136.

[92] ESTENFELD, K. (1982): A new characterization theorem of tree transductions. – Elektron. Informationsverarbeit. Kybernet. 18 (1982), 187-204.

[93] FERENCI, F. (1976): A new representation of context-free languages by tree automata. – Found. Control Engrg. 1 (1976), 217-222.

[94] FERENCI, F. (1980): Groupoids of pseudoautomata. – Acta Cybernet. 4 (1980),


389-399.

[95] FISCHER, M. J. (1968): Grammars with macro-like productions – 9. IEEE Symp.


(1968), 131-142.

[96] FU, K.-S. (1980): Picture syntax. – Pictorial Information Systems (Eds, S. K. Chang
and K.-S. Fu), LN in CS 80 (1980), 104-127.

[97] FU, K.-S. (1982): Syntactic pattern recognition and applications. – Prentice-Hall,
Englewood Cliffs, N. J. (1982).

[98] FU, K.-S. and BHARKAVA, B. K. (1973): Tree systems for syntactic pattern recog-
nition. – IEEE Trans. Computers C-22 (1973), 1087-1099.


[99] FU, K.-S. and FAN, T.-I. (1982): Tree translation and its application to a time-
varying image analysis problem. – IEEE Trans. Systems, Man and Cybernetics,
SMC – 12 (1982), 856-867.

[100] FÜLÖP, Z. (1981): On attributed tree transducers. – Acta Cybernet. 5 (1981),


261-279.

[101] GÉCSEG, F. (1977): Universal algebras and tree automata. – Fundamentals of


Computation Theory (Proc. Symp., Poznań-Kórnik, 1977), LN in CS 56 (1977),
98-112.

[102] GÉCSEG, F. (1981): Tree transformations preserving recognizability. – Finite Al-


gebra and Multiple-valued Logic (Record Coll. Universal Algebra, Szeged 1980),
North-Holland, Amsterdam (1981), 251-273.

[103] GÉCSEG, F. and HORVÁTH, GY. (1976): On representation of trees and context-free
languages by tree automata. – Found. Control Engrg. 1 (1976), 161-168.

[104] GÉCSEG, F. and STEINBY, M. (1978a): Minimal ascending tree automata. – Acta
Cybernet. 4 (1978), 37-44.

[105] GÉCSEG, F. and STEINBY, M. (1978b, 1979): A faautomaták algebrai elmélete.


I – Mat. Lapok 26 (1978), 169-207.
II – Mat. Lapok 27 (1979), 283-336.

[106] GÉCSEG, F. and E.-TÓTH, P. (1977): Algebra and logic in theoretical computer sci-
ence. – Mathematical Foundations of Computer Science, 1977 (Tatranska Lomnica),
LN in CS 53 (1977), 78-92.

[107] GEORGEFF, M. P. (1981): Interdependent translation schemes. – J. CSS 22 (1981),


198-219.

[108] GINALI, S. (1979): Regular trees and the free iterative theory. – J. CSS 18 (1979),
228-242.

[109] GINSBURG, G. and MAYER, O. (1982): Tree acceptors and grammar forms. – Com-
puting 29 (1982), 1-9.

[110] GIVE’ON, Y. (1971): Algebraic theory of m-ary systems. – Theory of machines


and computations (Eds. Z. Kohavi and A. Paz), Academic Press, New York (1971),
275-286.

[111] GIVE’ON, Y. and ARBIB, M. A. (1968): Algebra automata II: the categorical frame-
work for dynamic analysis. – IC 12 (1968), 346-370.

[112] GNESI, S., MONTANARI, U. and MARTELLI, A. (1981): Dynamic programming as


graph searching: an algebraic approach. – J. ACM 28 (1981), 737-751.


[113] GOGUEN, J. A. (1975): Semantics of computation. – Category Theory Applied to


Computation and Control (Proc. Symp., San Francisco, 1974), LN in CS 25 (1975),
151-163.
[114] GOGUEN, J. A. and THATCHER, J. W. (1974): Initial algebra semantics. – 15. IEEE
Symp. (1974), 63-77.
[115] GOGUEN, J. A., THATCHER, J. W., WAGNER, E. G. and WRIGHT, J. B. (1977):
Initial algebra semantics and continuous algebras. – J. ACM 24 (1977), 68-95.
[116] GONZALEZ, R. C., EDWARDS, J. J. and THOMASON, M. G. (1976): An algorithm
for the inference of tree grammars. – Intern. J. Comput. Information Sci. 5 (1976),
145-164.
[117] GONZALEZ, R. C. and THOMASON, M. G. (1978): Syntactic pattern recognition. –
Addison-Wesley, New York (1978).
[118] HART, J. M. (1974): Acceptors for the derivation languages of phrase-structure
grammars. – IC 25 (1974), 75-92.
[119] HART, J. M. (1976): The derivation language of a phrase structure grammar. – J.
CSS 12 (1976), 64-79.
[120] HELTON, F. J. (1976): The semigroup of an algebra automaton. – J. CSS 12 (1976),
13-24.
[121] HÖPNER, M. (1971): Eine Charakterisierung der Szilardsprachen. – GI-4. Jahresta-
gung (Berlin, 1974), LN in CS 26 (1975), 113-121.
[122] HORVÁTH, GY. (1979): On machine maps in categories. – Fundamentals of Com-
putation Theory ’79 (Proc. conf., Berlin/Wendisch-Rietz 1979), Akademie-Verlag,
Berlin (1979), 182-186.
[123] HORVÁTH, GY. (1981): Functor state machines. – Acta Cybernet. 6 (1981), 147-
172.
[124] HÜBLER, A. (1975): Zur Dechiffrierung von Baum-Akzeptoren mittels Mehrfach-
experimenten. – Elektron. Informationsverarb. Kybernet. 11 (1975), 590-593.
[125] HUPBACH, U. L. (1978): Rekursive Funktionen in mehrsortigen Peano-Algebren. –
Elektron. Informationsverarb. Kybernet. 14 (1978), 491-506.
[126] INOUE, K. and NAKAMURA, A. (1976): Some topological properties of Σ-structure
automata. – S-C-C 7 (1976), No. 5, 19-27.
[127] ITO, T. and ANDO, S. (1974): A complete axiom system of super-regular expres-
sions. – Proc. IFIP Congress 74 (Stockholm, 1974), 661-665.
[128] ITO, H. and FUKUMURA, T. (1974): Dendrolanguage generating systems on sets of
control strings. – S-C-C 5 (1974), No. 4, 9-17.


[129] ITO, H., INAGAKI, Y. and FUKUMURA, T. (1973a): Characterization of derivation


trees of context sensitive tree generating systems. – S-C-C 4 (1973), No.2, 24-32.

[130] ITO, H., INAGAKI, Y. and FUKUMURA, T. (1973b): Scattered tree automata and
scattered context-sensitive tree-generating systems. – S-C-C 4 (1973), No.4, 22-28.

[131] ITO, H., INAGAKI, Y. and FUKUMURA, T. (1973c): Hierarchy of the families of
dendrolanguages. – S-C-C 4 (1973), No. 5, 48-56.

[132] ITO, H., INAGAKI, Y. and FUKUMURA, T. (1974): Dendrolanguage generating sys-
tems on control state sets. A hierarchy between context-free and context-sensitive
dendrolanguages. – S-C-C 5 (1974), No. 5, 1-8.

[133] JACOB, G. (1979): Elements de la théorie algébriques des arbres. – Fundamentals


of Computation Theory ’79 (Proc. Conf., Berlin/Wendisch-Rietz 1979), Akademie-
Verlag, Berlin (1979), 193-206.

[134] JOSHI, A. K. and LEVY, L. S. (1977): Constraints on structural descriptions: Local


transformations. – SIAM J. Comput. 6 (1977), 272-284.

[135] JOSHI, A. K., LEVY, L. S. and TAKAHASHI, M. (1973): A tree generating system.
– Automata, Languages and Programming (Proc. Symp., Rocquencourt, 1972),
North-Holland, Amsterdam (1973), 453-465.

[136] JOSHI, A. K., LEVY, L. S. and TAKAHASHI, M. (1975): Tree adjunct grammars. –
J. CSS 10 (1975), 136-163.

[137] JOSHI, A. K., LEVY, L. S. and YUEH, K. (1980): Local constraints in programming
languages. Part I: Syntax. – Theoret. Comput. Sci. 12 (1980), 265-280.

[138] KAMIMURA, T. and SLUTZKI, G. (1979): DAGs and Chomsky hierarchy (extended
abstract). – Automata, languages and programming, (6th Colloq., Graz 1979), LN
in CS 71 (1979), 331-337.

[139] KAMIMURA, T. and SLUTZKI, G. (1982): Transductions of dags and trees. – MST
15 (1982), 225-249.

[140] KARPIŃSKI, M. (1973a, b, c, 1974a): Free structure tree automata.


I – Equivalence. – Bull. Acad. Polon. Sci. Sér. Sci. Math. Astron. Phys. 21
(1973), 441-446.
II – Nondeterministic and deterministic regularity. – ibid 21 (1973), 447-450.
III – Normalized climbing automata. – ibid. 21 (1973), 567-572.
IV – Sequential representation. – ibid. 22 (1974), 87-91.

[141] KARPIŃSKI, M. (1974b): Probabilistic climbing and sinking languages. – Bull.


Acad. Sci. Sér. Sci. Math. Astron. Phys. 22 (1974), 1057-1061.


[142] KARPIŃSKI, M.(1975): Stretching by probabilistic tree automata and Santos gram-
mars. – Mathematical Foundations of Computer Science (Proc. Symp., Jadwisin
1974), LN in CS 28 (1975), 249-255.

[143] KARPIŃSKI, M. (1977): The equivalence problems for binary EOL-systems are
decidable. – Fundamentals of Computation Theory (Proc. Symp., Poznań-Kórnik,
1977), LN in CS 56 (1977), 423-434.

[144] KAWAHARA, Y. (1980): Relational tree automata and context-free sets. – Bull.
Kyushu Inst. Technol., Math. Nat. Sci. 27 (1980), 17-25.

[145] KAWAHARA, Y. and YAMAGUCHI, M. (1980): Minimal realization theory for free
process machines in monoidal categories. – Mem. Fac. Sci. Kyushu Univ. Ser. A. 34
(1980), No. 1, 71-78.

[146] KOJIMA, M. and HONDA, N. (1972): Properties of context-sensitive tree automata


and characterizations of derivation trees of context-sensitive grammars. – S-C-C 3
(1972), No. 5, 23-30.

[147] KOJIMA, M. and HONDA, N. (1973): A characterization of sets of trees acceptable


by tree automata. – S-C-C 4 (1973), No. 1, 40-47.

[148] KOZEN, D. (1977): Complexity of finitely presented algebras. – 9. Ann. ACM STC
(Boulder, Col. 1977), 164-177.

[149] LAWVERE, F. W. (1963): Functorial semantics of algebraic theories. – Proc. Nat.


Acad. Sci. USA 50 (1963), 869-872.

[150] LESCANNE, P. (1976): Equivalence entre la famille des ensembles réguliers et la


famille des ensembles algébriques. – RAIRO Inform. Théor. Sér. Rouge 10 (1976),
No. 8, 57-81.

[151] LESCANNE, P. (1977): Quelques applications des classes équationelles conformes. –


2. Coll. Lille (1977), 199-212.

[152] LEVINE, B. (1981): Derivatives of tree sets with applications to grammatical infer-
ence. – IEEE Trans. Pattern Anal. & Mach. Intell., PAMI-3 (1981), 285-293.

[153] LEVINE, B. (1982): The use of tree derivatives and a sample support parameter for
inferring tree systems. – IEEE Trans. Pattern Anal. Mach. Intell., PAMI-4 (1982),
25-34.

[154] LEVY, L. S. (1971): Tree adjunct, parenthesis, and distributed adjunct grammars.
– Theory of machines and computations (Eds. Z. Kohavi and A. Paz), Academic
Press, New York (1971), 127-142.

[155] LEVY, L. S. (1973): Structural aspects of local adjunct grammars. – IC 23 (1973),


260-287.


[156] LEVY, L. S. (1980): Discrete structures of computer science. – John Wiley & Sons,
New York (1980).

[157] LEVY, L. S. and JOSHI, A. K. (1973): Some results in tree automata. – MST 6
(1973), 334-342.

[158] LEVY, L. S. and JOSHI, A. K. (1978): Skeletal structural descriptions. – IC 39


(1978), 192-211.

[159] LILIN, E. (1978a): S-transducteurs de forêts. – 3. Coll. Lille (1978), 189-206.

[160] LILIN, E. (1978b): Une generalization des transducteurs d’etats finis d’arbres: les
S-transducteurs. – Thése de doctorat, Université de Lille I (1978).

[161] LILIN, E. (1981): Transducteurs finis d’arbres et tests d’egalite. – RAIRO Inform.
Theor. 15 (1981), 213-232.

[162] LIPPE, W.-M. (1982): Context-sensitive top-down creative dendrogrammars. –


Bull. EATCS, No. 9 (Oct. 1979), 41-45.

[163] LU, S. Y. (1979a): Stochastic tree grammar inference for texture synthesis and
discrimination. – Comput. Graphics and Image Process. 9 (1979), 234-245.

[164] LU, S. Y. (1979b): A tree-to-tree distance and its application to cluster analysis.
– IEEE Trans. Pattern. Anal. & Mach. Intell., PAMI-1 (1979), 219-224.

[165] LU, S. Y. and FU, K.-S. (1978): Error-correcting tree automata for syntactic pattern
recognition. – IEEE Trans. Comput. C-27 (1978), 1040-1053.

[166] MAGIDOR, M. and MORAN, G. (1969): Finite automata over finite trees. – Technical
Report 30, Hebrew University, Jerusalem (1969).

[167] MAGIDOR, M. and MORAN, G. (1970): Probabilistic tree automata. – Israel J.


Math. 8 (1970), 340-348.

[168] MAHN, F. K. (1969): Primitiv-rekursive Funktionen auf Termmengen. – Arch.


Math. Logik Grundlagenforsch. 12 (1969), 54-65.

[169] MAIBAUM, T. S. E. (1972): The characterization of the derivation trees of context-


free sets of terms as regular sets. – 13. IEEE Symp. (1972), 224–230.

[170] MAIBAUM, T. S. E. (1974): A generalized approach to formal languages. – J. CSS


8 (1974), 409-439.

[171] MAIBAUM, T. S. E. (1978): Pumping lemmas for term languages. – J. CSS 17


(1978), 319-330.

[172] MARCHAND, P. (1976): Bigrammes et systèmes transformationnels. – 1. Coll. Lille


(1976), 175-195.


[173] MARCHAND, P. (1979): Construction des algèbres minimales des sous-ensembles


des algèbres libres. Applications aux parties reconnaissables. – 4. Coll. Lille (1979),
134-158.
[174] MARCHAND, P. (1980): Grammaires parenthésées et bilangages réguliers. – RAIRO
Inform. Theor. 14 (1980), 3-38.
[175] MARCHAND, P. (1981): Langages d’arbres. Langages dans les algèbres libres. –
Thesis, CRIN 81-T-030, Université de Nancy, Nancy (1981).
[176] MARÓTI, G. (1977): Rational representation of forests by tree automata. – Acta
Cybernet. 3 (1977), 309-320.
[177] MARTIN, D. E. and VERE, S. A. (1970): On syntax-directed transduction and tree
transducers. – 2. Ann. ACM STC (1970), 129-135.
[178] MAYER, O. (1975): On the analysis and synthesis problems for context-free expres-
sions. – Mathematical Foundations of Computer Science (Proc. Symp., Mariánské
Lázně 1975), LN in CS 32 (1975), 308-314.
[179] MEISSNER, H.-G. (1976): Über die Fortsetzbarkeit von sequentiellen Baumope-
ratoren mit endlichem Gewicht. – Elektron. Informationsverarbeit. Kybernet. 11
(1976), 578-579.
[180] MEISSNER, H.-G. (1977): Zu einigen Begriffen und Resultaten aus der Theorie der
Baumautomaten. – Rostock. Math. Kolloq. 3 (1977), 85-102.
[181] MERZENICH, W. (1979): A binary operation on trees and an initial algebra char-
acterization for finite tree types. – Acta Inform. 11 (1979), 149-168.
[182] MEZEI, J. and WRIGHT, J. B. (1967): Algebraic automata and context-free sets. –
IC 11 (1967), 3-29.
[183] MODINA, L. S. (1975a): Derevnye grammatiki i yazyki. – Kibernetika
(Kiev) (1975), No. 5, 86-93.
[184] MODINA, L. S. (1975b): On some formal grammars generating dependency trees.
– Mathematical Foundations of Computer Science 1975 (Proc. Symp. Mariánské
Lázně), LN in CS 32 (1975), 326-329.
[185] MOSTOWSKI, A. W. (1979): A note concerning the complexity of a decision problem
for positive formulas in SkS. – 4. Coll. Lille (1979), 173-180.
[186] MOSTOWSKI, A. W. (1982): Determinancy of sinking automata on infinite trees
and inequalities between Rabin’s pair indices. – Information Processing Lett. 15
(1982), 159-163.
[187] NG, P. and YEH, R. T. (1973): Tree transformations via finite recursive transition
machines. – Mathematical Foundations of Computer Science (Proc. Symp., High
Tatras 1973), 273-278.


[188] NG, P. A. and YEH, R. T. (1976): Sequential tree-walking automata. – Nanta


Math. IX (1976), 159-167.

[189] NIVAT, M. (1973): Langages algébriques sur le magma libre et sémantique des
schémas de programme. – Automata, Languages and Programming (Proc. Symp.,
Rocquencourt 1972), North-Holland, Amsterdam (1973), 367-376.

[190] OGDEN, W. F. and ROUNDS, W. C. (1972): Compositions of n tree transducers. –


4. Ann. ACM STC (1972), 198-206.

[191] OPP, M. (1975a): Eine Beschreibung contextfreier Sprachen durch endliche Men-
gensysteme. – Automata Theory and Formal Languages (2nd GI Conf., Kaiserslautern
1975), LN in CS 33 (1975), 190-197.

[192] OPP, M. (1975b): Allgemeine Σ-Grammatiken. – GI-5. Jahrestagung (Dortmund


1975), LN in CS 34 (1975), 420-428.

[193] OPP, M. (1976): Characterizations of recognizable subsets in generic algebras. –


1. Coll. Lille (1976), 164-174.

[194] PAIR, C. (1976a): Inference for regular bilanguages. – Formal Languages and Pro-
gramming (Proc. Semin., Madrid 1975), North-Holland, Amsterdam (1976), 15-30.

[195] PAIR, C. (1976b): Les arbres en théorie des langages. – 1. Coll. Lille (1976), 196-216.

[196] PAIR, C. and QUERE, A. (1968): Définition et étude des bilangages réguliers. – IC
13 (1968), 565-593.

[197] PERRAULT, C. R. (1976a): Intercalation lemmas for tree transducer languages. –


J. CSS 13 (1976), 246-277.

[198] PERRAULT, C. R. (1976b): Augmented transition networks and their relation to


tree transducers. – Information Sci. 11 (1976), 93-120.

[199] PETROV, S. V. (1978): Graph grammars and automata (survey). – Autom. Remote
Control 39 (1978), 1034-1050.

[200] PETTOROSSI, A. (1976): Combinators as tree transducers. – 2. Coll. Lille (1976),


213-223.

[201] PYSTER, A. (1978): Context-dependent tree automata. – IC 38 (1978), 81-102.

[202] PYSTER, A. and BUTTELMANN, H. W. (1978): Semantic-syntax-directed transla-


tion. – IC 36 (1978), 320-361.

[203] RABIN, M. O. (1967): Mathematical theory of automata. – Mathematical Aspects


of Computer Science (Proc. Symp. Appl. Math. XIX), Amer. Math. Soc., Providence
(1967), 153-175.


[204] RABIN, M. O. (1969): Decidability of second-order theories and automata on infi-


nite trees. Trans. Amer. Math. Soc. 141 (1969), 1-35.

[205] RABIN, M. O. (1970): Weakly definable relations and special automata. – Mathe-
matical Logic and Foundations of Set Theory (Proc. Coll., Jerusalem 1968), North-
Holland, Amsterdam (1970), 1-23.

[206] RAOULT, J.-C. (1981): Finiteness results on rewriting systems. – RAIRO Inform.
Théor. 15 (1981), 373-391.

[207] REISIG, W. (1979): A note on the representation of finite automata. – Inform.


Process. Lett. 8 (1979), 239-240.

[208] RÉVÉSZ, Gy. (1977): Algebraic properties of derivation words. – 2. Coll. Lille
(1977), 224-234.

[209] RICCI, G. (1973): Cascades of tree-automata and computations in universal alge-


bras. – MST 7 (1973), 201-218.

[210] RIHA, A. (1981): A certain type of dependency tree transformations. – Mathemati-


cal logic in computer science (Proc. Coll., Salgótarján, Hungary, Sept. 10-15, 1978),
Elsevier North-Holland Publ. Co., New York (1981), 699-709.

[211] ROSEN, B. K. (1973): Tree-manipulating systems and Church-Rosser theorems. –


J. ACM 20 (1973), 160-187.

[212] ROSEN, B. K. (1974): Syntactic complexity. – IC 24 (1974), 305-335.

[213] ROUNDS, W. C. (1969): Context-free grammars on trees. – 1. Ann. ACM STC


(1969), 143-148.

[214] ROUNDS, W. C. (1970a): Tree-oriented proofs of some theorems on context-free


and indexed languages. – 2. Ann. ACM STC (1970), 109-116.

[215] ROUNDS, W. C. (1970b): Mappings and grammars on trees. – MST 4 (1970),


257-287.

[216] ROUNDS, W. C. (1973): Complexity of recognition in intermediate-level languages.


– 14. IEEE Symp. (1973), 145-158.

[217] SCHREIBER, P. P. (1976): Tree transducers and syntax-connected transductions. –


1. Coll. Lille (1976), 217-238.

[218] SCHÜTT, D. (1970): Baumautomaten. – Bericht 36, Gesellschaft für Math. u.


Datenverarbeitung, Bonn (1971).

[219] SCHÜTT, D. (1973): Zustandsfolgenabbildungen von verallgemeinerten endlichen


Automaten. – 1. Fachtagung über Automatentheorie und Formale Sprachen (Bonn
1973), LN in CS 2 (1973), 88-97.


[220] SHEPARD, C. D. (1969): Languages in general algebras. – 1. Ann. ACM STC (1969),
155-163.

[221] SHI, Q.-Y. and FU, K.-S. (1982): Efficient error-correcting parsing for (attributed
and stochastic) tree grammars. – Information Sciences 26 (1982), 159-188.

[222] SIEFKES, D. (1978): An axiom system for the weak monadic second-order theory
of two successors. – Israel J. Math. 30 (1978), 264-284.

[223] SOMMERHALDER, R. (1974): Monoids associated with algebras and automata. –


Unpublished Report, Delft (1974).

[224] STEINBY, M. (1977a): On algebras as tree automata. – Contributions to Universal


Algebra (Record Coll. Universal Algebra, Szeged 1975), North-Holland, Amsterdam
(1977), 441-455.

[225] STEINBY, M. (1977b): On the structure and realizations of tree automata. – 2. Coll.
Lille (1977), 235-248.

[226] STEINBY, M. (1979): Syntactic algebras and varieties of recognizable sets. – 4. Coll.
Lille (1979), 226-240.

[227] STEINBY, M. (1981): Some algebraic aspects of recognizability and rationality. –


Fundamentals of computation theory (Proc. Conf., Szeged 1981), LN in CS 117
(1981), 360-372.

[228] STEYART, J.-M. (1977a): Sur les index rationelles des feuillages de forêts lineaires.
– C. R. Acad. Sci. Paris, Sér. A, t. 285 (1977), 473-476.

[229] STEYART, J.-M. (1977b): Evaluation des index rationnels de quelques familles de
langages. – Technical Report No. 261, IRIA, Rocquencourt, France (1977).

[230] STEYART, J.-M. (1978): Index rationnel des ETOL-langages. – 3. Coll. Lille (1978),
246-249.

[231] SZILARD, A. L. (1974): Ω-OL systems. – L-systems, LN in CS 15 (1974), 258-291.

[232] TAI, K.-CH. (1979): The tree-to-tree correction problem. – J. ACM 26 (1979),
422-433.

[233] TAKAHASHI, M. (1973): Primitive transformations of regular sets and recognizable


sets. – Automata, Languages and Programming (Proc. Coll., Rocquencourt 1972),
North-Holland, Amsterdam (1973), 475-480.

[234] TAKAHASHI, M. (1975a): Generalizations of regular sets and their application to a


study of context-free languages. – IC 27 (1975), 1-36.

[235] TAKAHASHI, M. (1975b): A mathematical approach to the structure of language.


On the fundamental concept of a tree (Japanese). – Sugaku 27 (1975), 241-252.


[236] TAKAHASHI, M. (1977): Rational relations on binary trees. – Automata, Languages


and Programming (Proc. Coll., Turku 1977), LN in CS 52 (1977), 524-538.

[237] THATCHER, J. W. (1967): Characterizing derivation trees of context-free grammars


through a generalization of finite automata theory. – J. CSS 1 (1967), 317-322.

[238] THATCHER, J. W. (1970): Generalized² sequential machines. – J. CSS 4 (1970),


339-367.

[239] THATCHER, J. W. (1973): Tree automata: an informal survey. – Currents in the


Theory of Computing (ed. A. V. AHO), Prentice-Hall, Englewood Cliffs, N. J. (1973),
143-172.

[240] THATCHER, J. W. and WRIGHT, J. B. (1965): Generalized finite automata. – No-


tices Amer. Math. Soc. 12 (1965), Abstract No. 65T-649, 820.

[241] THATCHER, J. W. and WRIGHT, J. B. (1968): Generalized finite automata theory


with an application to a decision problem of second order logic. – MST 2 (1968),
57-81.

[242] TIURYN, J. (1977a, b): Fixed-points and algebras with infinitely long expressions.
I – Mathematical Foundations of Computer Science 1977 (Proc. Symp., Tatran-
ska Lomnica), LN in CS 53 (1977), 513-522.
II – Fundamentals of Computation Theory (Proc. Symp., Poznań-Kórnik 1977),
LN in CS 56 (1977), 332-339.

[243] TOKURA, N. and KASAMI, T. (1974): Automata with labelled tree inputs. – S-C-C 5
(1974), No. 3, 88-95.

[244] TRNKOVÁ, V. and ADÁMEK, J. (1979): Tree-group automata. – Fundamentals of


Computation Theory ’79 (Proc. Conf., Berlin/Wendisch-Rietz 1979), Akademie-
Verlag (1979), 462-468.

[245] TURNER, R. (1973): An infinite hierarchy of term languages - an approach to


mathematical complexity. – Automata, Languages and Programming (Proc. Symp.,
Rocquencourt 1972), North-Holland, Amsterdam (1973), 593-608.

[246] TURNER, R. (1975): An algebraic theory of formal languages. – Mathematical


Foundations of Computer Science (Proc. Symp. Mariánské Lázně 1975), LN in CS
32 (1975), 426-431.

[247] UPTON, R. A. (1981): An extension of tree adjunct grammars. – IC 51 (1981),


248-274.

[248] VIRÁGH, J. (1980): Deterministic ascending tree automata I. – Acta Cybernet. 5


(1980), 33-42.


[249] WAGNER, E. G. (1971): An algebraic theory of recursive definitions and recursive


languages. – 3. Ann. ACM STC (1971), 12-23.

[250] WAGNER, E. G., WRIGHT, J. B., GOGUEN, J. A. and THATCHER, J. W. (1976):


Some fundamentals of order-algebraic semantics. – Mathematical Foundations of
Computer Science (Proc. Symp. Gdańsk 1976), LN in CS 45 (1976), 153-168.

[251] WILLIAMS, K. L. (1975): A multidimensional approach to syntactic pattern recog-


nition. – Pattern Recognition 7 (1975), 125-137.

[252] WRIGHT, J. B., THATCHER, J. W., WAGNER, E. G. and GOGUEN, J. A. (1976):


Rational algebraic theories and fixed-point solutions. – 17. IEEE Symp. (1976),
147-158.

[253] YEH, R. T. (1971): Some structural properties of generalized automata and alge-
bras. – MST 5 (1971), 306-318.

[254] ZACHAR, Z. (1979): The solvability of the equivalence problem for deterministic
frontier-to-root tree transducers. – Acta Cybernet. 4 (1979), 167-177.

Index

Algebra, 11 upper, 24
Boolean, 11 branch of tree, 49
clone, 115
finite, 11 chain, 24
finite ND, 55 Chomsky hierarchy, 35
finitely generated, 12 class
free, 20 congruence, 13
freely generated over a class, 20 equivalence, 7
ND, 55 of tree transformations closed under
NDR, 57 composition, 147
nondeterministic, 55 of tree transformations preserving
nondeterministic root-to-frontier, 57 regularity, 165
of finite type, 11 closure
power, 16 of forest, 104
quotient, 14 x-substitution, 124
ΣX-term, 20 comparable elements, 24
subset, 16 compatible partition, 13
substitution, 115 complete sublattice, 25
trivial, 11 complete variety, 129
universal, 10 composition of
alphabet, 27 mappings, 8
frontier, 48 operations, 10
ranked, 48 relations, 7
terminal, 36 tree transformations, 131
arity of congruence
operation, 9 of DR ΣX-recognizer, 107
operator, 11 of recognizer, 33
associated ΣX-recognizers, 58 of Σ-algebra, 13
of ΣX-recognizer, 80
bijection, 8 right, 31
binoid, 114 syntactic, 32
bound connected component of DR ΣX-
greatest lower, 24 recognizer, 108
least upper, 24 connected part of recognizer, 34
lower, 24 converse of relation, 6


derivation in tree recognizers, 100


F-transducer, 134 equivalence of states in
grammar, 36 DR recognizer, 108
GSDT, 163 recognizer, 33
gsm, 44 ΣX-recognizer, 81
RR -transducer, 155 extension of mapping, 8
R-transducer, 137
DF-transducer, 142 family of languages, 27
direct derivation in final assignment of NDR ΣX-recognizer,
F-transducer, 133 57
GSDT, 163 final state of
RR -transducer, 155 F-transducer, 132
R-transducer, 136 gsm, 44
direct generation in grammar, 36 NDF ΣX-recognizer, 56
direct power of algebra, 15 recognizer, 28
direct product of ΣX-recognizer, 52
algebras, 15 fixed-point, 26
posets, 25 least, 26
domain forest
of relation, 7 closed, 104
of tree transformation, 131 derivation, 119
operator, 11 elementary, 93
tree, 113 equational, 90
DR-transducer, 143 generated by regular ΣX-grammar,
simple, 202 60
K-surface, 165
element local, 98
unit, 24 (n, F)-surface, 188
zero, 24 (n, RR )-surface, 188
embedding of (n, R)-surface, 188
algebra, 12 production, 121
ΣX-recognizer, 79 recognizable, 52
epimorphism recognized by NDF ΣX-recognizer,
natural, 80 56
of algebra, 12 recognized by NDR ΣX-recognizer,
of DR ΣX-recognizer, 107 58
of recognizer, 34 recognized by ΣX-recognizer, 52
of ΣX-recognizer, 79 regular, 77
equivalence of representable, 94
grammars, 36 represented by regular expression,
gsm’s, 44 75
Mealy machines, 43 fork of ΣX-tree, 97
R- and F-transducers, 142 F-relabeling, 142
regular ΣX-grammars, 60 frontier of tree, 49


F-transducer, 132 tree, 49, 90


connected, 165 HF-transducer, 140
deterministic, 142 homomorphism
linear, 142 alphabetic tree, 74
nondeleting, 142 length-preserving, 33
totally defined, 142 linear tree, 70
F-transformation, 133 natural, 14
function, 8 of algebra, 12
output, 43 of DR ΣX-recognizer, 106
polynomial, 18 of recognizer, 34
unary algebraic, 22 of ΣX-recognizer, 79
tree, 70
generalized syntax directed homomor- HR-transducer, 140
phism, 163
generalized syntax directed translator, ideal, 25
162 dual, 26
generating set, 12 principal, 26
free, 20 principal dual, 26
grammar, 36, 59 image, 8
ambiguous, 39 epimorphic, 12, 79, 107
attribute, 205 inverse, 8
CF, 37 index of equivalence relation, 8
context-free, 37 induction
context-free tree, 129 term, 17
reduced CF, 40 tree, 49
right linear, 36 inference of forests, 116
tree adjunct, 115 infimum, 24
unambiguous, 39 infix notation, 9
Greibach k-form, 40 initial assignment of
groupoid, 128 NDF ΣX-recognizer, 56
GSD homomorphism, 163 ΣX-recognizer, 52
GSDH-translator, 163 initial state of
GSDT, 162 GSDT, 162
deterministic, 163 gsm, 44
finite copying, 178 Mealy machine, 43
k-copying, 178 NDR ΣX-recognizer, 57
linear, 163 R-transducer, 136
nondeleting, 163 recognizer, 28
totally defined, 163 initial symbol of
gsm, 44 grammar, 36
deterministic, 44 regular ΣX-grammar, 59
injection, 8
height of input alphabet of
production, 61 gsm, 44


Mealy machine, 43 complete, 24


recognizer, 27 leaf of tree, 49
inverse of tree transformation, 131 leftmost derivation, 39
inversion of direct derivations in length of
F-transducer, 134 derivation in GSDT, 163
R-transducer, 138 derivation in F-transducer, 133
isomorphism of derivation in R-transducer, 137
algebras, 12 tree, 49
DR ΣX-recognizers, 107 word, 27
recognizers, 34 letter, 27
ΣX-recognizers, 79 LF-transducer, 142
iteration, 29 Lindenmayer system, 115
linear production of
join, 24 F-transducer, 142
k-copying derivation in R-transducer, 143
GSDT, 178 LR-transducer, 143
R-transducer, 178
machine
kernel of mapping, 8
generalized sequential, 44
K-transformation, 144
Mealy, 43
language, 27 sequential program, 204
CF, 37 magmoid, 114
context-free, 37 mapping, 8
e-free, 27 bijective, 8
η-recognized, 127 constant, 22
F-transformational, 189 identity, 8
generated by grammar, 36 injective, 8
inherently ambiguous CF, 39 isotone, 26
local, 33 natural, 8
(n, F)-transformational, 189 ω-continuous, 26
(n, RR )-transformational, 189 onto, 8
(n, R)-transformational, 189 Parikh, 42
of type, 36 partial, 9
quotient, 30 substitution, 30
recognizable, 28 surjective, 8
recognized by ΣX-recognizer, 120 undefined for an element, 9
recognized by recognizer, 28 meet, 24
regular, 29 mirror image, 30
right linear, 36 monoid
RR -transformational, 189 free, 27
R-transformational, 189 m-ary, 114
tree, 51 syntactic, 33
unambiguous CF, 39 monomorphism of
lattice, 24 algebra, 12


ΣX-recognizer, 79 path in tree, 49, 165


morphism, 12 poset, 23
dual, 24
Nerode congruence of power of
forest, 86 language, 28
language, 32 relation, 7
next-state function of probabilistic tree automaton, 115
Mealy machine, 43 problem
recognizer, 27 emptiness, 99
NF-transducer, 142 equivalence, 100
nonterminal symbol of finiteness, 99
grammar, 36 inclusion, 100
regular ΣX-grammar, 59 nonterminal minimization, 42
normal form of CF grammar production minimization, 42
Chomsky, 40 product
Greibach, 40 forest, 65
normal form of regular tree grammar, 62
of languages, 28
normalized NDR ΣX-recognizer, 104
of mappings, 8
NR-transducer, 143
of relations, 7
occurrence of tree automata, 114
bound, 76 production of
free, 76 F-transducer, 132
of subtree, 50 grammar, 36
ω-sequence, 9 GSDT, 162
ω-variety, 115 gsm, 44
operation regular ΣX-grammar, 59
binary, 9 RR -transducer, 155
elementary, 94 R-transducer, 136
finitary, 10 production-sequence, 178
m-ary, 9 projection, 113
m-ary nondeterministic, 55 pseudovariety, 115
partial m-ary, 10
regular, 77 range of
unary, 9 relation, 7
operational symbol, 11 tree transformation, 131
operator, 11 rank of
ordering operation, 9
partial, 23 operator, 11
total, 24 rational completeness, 115
output alphabet of rational representation, 115
gsm, 44 reachability of state in
Mealy machine, 43 DR ΣX-recognizer, 108
ΣX-recognizer, 81
Parikh vector, 42 realization of


operator, 11 root of tree, 49


tree automaton, 114 root-to-frontier tree transducer, 136
recognizer, 27 with regular look-ahead, 155
connected, 33 RR -transducer, 155
minimal, 33 deterministic, 155
nondeterministic, 30 linear, 155
quotient, 34 nondeleting, 155
Rabin-Scott, 27 RR -transformation, 156
reduced, 33 deterministic, 156
reduced form of ΣX-recognizer, 82 linear, 156
reflexive transitive closure, 7 nondeleting, 156
regular expression, 29 R-relabeling, 143
regular fixed-point equation, 90 R-transducer, 136
regular insertion, 174 k-metalinear, 203
regular operations, 77 deterministic, 143
regular ΣX-expression, 90 finite copying, 178
regular ΣX-grammar, 59 k-copying, 178
extended, 62 linear, 143
regular tree grammar, 59 nondeleting, 143
relation, 6 totally defined, 143
antisymmetric, 7 R-transformation, 136
congruence, 13
diagonal, 7 set
equivalence, 7 free generating, 20
invariant with respect to operation, generating, 12
13 Parikh, 42
reflexive, 7 power, 5
saturating a subset, 8 quotient, 8
symmetric, 7 Σ-algebra, see algebra, 11
total, 7 σ-catenation, 50
transitive, 7 σ-product, 69
reordering of direct derivations in Σ-term in X, 17
F-transducer, 134 Σ-tree, 48
R-transducer, 138 ΣX-forest, see also forest, 51
restriction of ΣX-recognizer
forest, 94 connected, 81
mapping, 8 connected DR, 108
operation, 10 deterministic root-to-frontier, 59
rewriting rule of DR, 59
F-transducer, 132 frontier-to-root, 52
GSDT, 162 minimal, 82
RR -transducer, 155 minimal DR, 110
R-transducer, 136 NDF, 56
ρ-class, 7 NDR, 57


nondeterministic frontier-to-root, 56 supremum, 24


nondeterministic root-to-frontier, 57 surjection, 8
quotient, 80 syntactic pattern recognition, 115
quotient DR, 107
reduced, 81 term, 17
reduced DR, 108 TF-transducer, 142
ΣX-term, 17 theories, 114
ΣX-tree, 48 TR-transducer, 143
atomic, 111 transformation induced by
(Σ, X, k)-polynomial, 88 F-transducer, 132
regular, 90 RR -transducer, 156
sp-machine, 204 R-transducer, 136
state translation, 43
copying, 167 elementary, 23
deleting, 167 induced by GSDT, 163
nondeleting, 167 induced by gsm, 44
of F-transducer, 132 induced by Mealy machine, 43
of GSDT, 162 induced by tree transformation, 131
of gsm, 44 tree, 48
of Mealy machine, 43 derivation, 37, 119
of NDR ΣX-recognizer, 57 infinite, 115
of recognizer, 27 parse, 204
of ΣX-recognizer, 52 production, 121
state-sequence of tree transducer
GSDT, 178 frontier-to-root, 132
R-transducer, 178 macro, 205
structural equivalence of CF grammars, root-to-frontier, 136
128 tree transformation, 131
subalgebra, 12 preserving regularity, 165
generated by a set, 12
variable, 17
subderivation in
F-transducer, 133 word, 27
R-transducer, 137 accepted by recognizer, 28
subrecognizer, 34 empty, 27
subset η-accepted, 127
closed, 12 proper, 181
closed with respect to operation, 10
linear, 42 x-iteration, 68
recognizable, 112 x-path of ΣX-tree, 103
semilinear, 42 x-quotient, 68
subset construction, 31 x-substitution, 124
substitution, 113 X-language, see language, 27
subtree, 49 X-recognizer, see recognizer, 27
supertree, 193 X-tree, 48


X-word, see word, 27

yield of
forest, 117
tree, 117

z-product, 65
0-state, 104

5 APPENDIX
SOME FURTHER TOPICS AND REFERENCES
by Magnus Steinby

The purpose of this Appendix is to supplement the original book with notes on some
further topics and a selection of more recent references. The choice of topics and ref-
erences is partly influenced by personal preferences, but I trust that the areas included
deserve to be mentioned, and that the general expositions, surveys and research papers
appearing in the bibliography are useful. Hence, I hope that these notes may serve as
an initial guide to the subjects discussed, and that they give an idea of the continuing
vitality of the theory and of its applications.
Before considering any specific areas, let me note some works of a general nature
published after Tree Automata was written. J. R. Büchi’s posthumous book Finite Au-
tomata, Their Algebras and Grammars [11] appeared in 1989 (edited by D. Siefkes).
The main part of it treats unary algebras, finite acceptors, regular languages and pro-
duction systems, but in a manner that suggests tree automata and tree languages as
natural generalizations. The last two chapters deal with terms, trees, algebras as tree
automata, tree grammars, and connections between context-free languages and push-
down automata. Especially this latter part of the book appears quite unfinished, but
the author’s grand design, a theory that would encompass algebras, automata, formal
languages and rewriting systems, is clearly discernible. The terminology and notation
are often nonstandard, sometimes even confusing, but a patient reader is rewarded by
original insights and interesting historical remarks.
The book Tree Automata and Languages [66] edited by M. Nivat and A. Podelski, which
appeared in 1992, is a collection of papers that discuss a variety of topics involving trees.
The survey paper [46] by F. Gécseg and M. Steinby may be viewed as a condensed and
somewhat modernized version of Tree Automata, but it also takes up some further topics
and its bibliography includes many additional items.
In their book Syntax-Directed Semantics. Formal Models Based on Tree Transducers
[44], Z. Fülöp and H. Vogler consider formal models of syntax-directed semantics based
on tree transducers. They also develop a fair amount of the general theory of total
deterministic top-down, macro, attributed, and macro attributed tree transducers. In
particular, they compare with each other the classes of tree transformations defined
by the different types of tree transducers, and they present several composition and
decomposition results for these tree transformations.
The internet book Tree Automata Techniques and Applications [13], to be referred to
as TATA, is a joint enterprise of several authors. First launched in 1997, it has already
been revised and extended a few times. The presentation is often rather informal, but
the ideas are richly illustrated by examples and many interesting facts are also given as
exercises. The first two chapters review some basic material about finite tree recognizers,
regular tree languages, and regular tree grammars, but also mention context-free tree
languages. Chapter 6 contains a brief account of tree transducers (without proofs). The
remaining five chapters deal with topics not covered by our book. The tutorial [55] by
C. Löding focuses on applications of tree automata and emphasizes algorithmic aspects.
Automata on infinite trees and the connections between tree automata and logic were
the most important topics excluded from Tree Automata. The two are strongly linked
with each other and have been studied intensively ever since tree automata were intro-
duced, and by now they form an extensive theory with important applications to logic
and computer science. Although mainly concerned with the word case, the survey pa-
pers [81] and [82] by W. Thomas offer very readable introductions to this area, and they
also include extensive bibliographies. Chapter 3 of TATA [13] is a further useful general
reference, and some of the papers in [66] deal with this topic. The book Automata,
Logics, and Infinite Games [50] edited by E. Grädel, Thomas and T. Wilke contains
twenty tutorial papers that form an excellent overview of the study of automata, logics
and games. About half of them concern trees and tree automata. Besides MSO logics,
they elucidate the uses and properties of various modal logics, fixed-point logics and
guarded logics, and demonstrate the usefulness of alternating tree automata.
The continual development of the theory of tree transformations is also largely driven
by applications, and tree transducers will be mentioned also in connection with some of
the other themes to be discussed below. Here I shall note separately a few important
topics. The study of compositions of tree transformation classes initiated by B. S. Baker
(1973, 1979)¹ and J. Engelfriet (1975) has been pursued further especially by Fülöp and
S. Vágvölgyi [40, 42, 43, 35]. In particular, they have considered semigroups of the com-
positions of some given tree transformation classes, and presented rewriting systems by
which the equality of two composition classes can be decided. They have also considered
some variants of Engelfriet’s (1977) important idea of regular look-ahead for top-down
tree transducers ([41], for example). Recently, Engelfriet, S. Maneth and H. Seidl [25]
have shown that in certain cases it can be decided whether a deterministic top-down
tree transducer with regular look-ahead is equivalent to a deterministic top-down tree
transducer, and that such a transducer without look-ahead can be constructed if the
answer is positive. Macro tree transducers were first defined by Engelfriet (1980) but, as
noted in [44] for example, the primitive recursive program schemes independently intro-
duced by B. Courcelle and P. Franchi-Zannettacci [14] amount to many-sorted versions
of them. Macro and other higher-level tree transducers have been studied in depth by
Engelfriet and Vogler [26, 27, 28, 29] (cf. also [21, 22]). For further information about
these matters, I recommend the bibliographic notes in [44]. The work [8] on equational
tree transformations by S. Bozapalidis, Fülöp and G. Rahonis is a natural extension of
a classical theme.
¹ The references can be found in the original bibliography of Tree Automata.
The decidability of the question whether the image of a given regular tree language
under a given tree homomorphism is regular has been a relatively long-standing open
problem, but recently an affirmative solution was presented by G. Godoy and O. Giménez
[48]. Their approach uses tree automata with equality or disequality tests, and their work
contains also some results of independent interest concerning such automata. Moreover,
it has some applications to term rewriting and XML theory. Fülöp and P. Gyenizse [37]
have shown that injectivity is undecidable for tree homomorphisms while it is decidable
for linear deterministic top-down tree transformations. Furthermore, in [36] Fülöp proves
that several questions concerning the ranges of deterministic top-down tree transforma-
tions are undecidable. The decidability of the equivalence of deterministic top-down tree
transducers was proved by Ésik already in 1980. More recently, Engelfriet, Maneth and
Seidl [24] showed that the equivalence of total deterministic top-down tree transducers
can be decided in polynomial time by reducing the transducers to a certain canonical
form, and their method can be applied also to deterministic top-down tree transducers
with regular look-ahead. In [34], S. Friese, Seidl and Maneth present a corresponding
equivalence checking algorithm based on normal forms for bottom-up tree transducers.
In [23], Engelfriet and Maneth prove that the equivalence of deterministic MSO tree
transducers is decidable. These results, as well as many other decidability questions for
tree transducers are discussed in the recent survey paper [58] by Maneth. Finally, two
quite recent contributions should be mentioned. Firstly, Seidl, Maneth and G. Kemper
[78] prove the decidability of the equivalence of deterministic top-down tree-to-string
transducers. In [33], E. Filiot, Maneth, P.-A. Reynier and J.-M. Talbot introduce tree
transducers for which every output tree is augmented with information about the origin
of each of its nodes, and prove several decidability results concerning the equivalence or
injectivity of such transducers.
Since terms can be seen as syntactic representations of trees over ranked alphabets,
it is to be expected that there are some connections between tree automata and term
rewriting systems (TRSs). Indeed, various tree automata and tree grammars are often
defined as special term rewriting systems. On the other hand, tree automata can be
used for solving problems concerning TRSs and such applications have, in turn, inspired
new developments in the theory of tree automata. In the mid-1980s it was noted that the
set Red(R) of terms reducible by a finite left-linear TRS R, as well as its complement,
the set Irr(R) of irreducible terms, are regular tree languages. Since this means that
many questions concerning reducibility and normal forms are decidable for such TRSs,
the observation was quickly followed by several studies of related matters. Thus it
was shown that a finite TRS R for which Red(R) is regular can be “linearized” and
that the regularity of Red(R) is decidable, the regular sets Red(R) were characterized
in terms of a new class of finite tree automata, and questions of ground reducibility
were considered. So-called monadic and semi-monadic TRSs were studied using tree
pushdown automata. For extending such applications to TRSs that are not left-linear,
new classes of tree automata are needed. The problem here is that automata that are
able to recognize also non-regular sets Red(R) or the sets of all ground instances of
a given non-linear term, tend to be too powerful to be manageable themselves. An
example of increased power combined with good decidability properties is provided by
the automata with comparisons between brothers introduced in the 1990s. The ground
tree transducer is another important tree automaton sprung from the theory of term
rewriting. Much material concerning these matters can be found in TATA [13], and
introductions to this subject and many references are provided also by the surveys [47],
[67] and [79]. For some recent work on this theme, cf. [83], for example.
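To make the role of left-linearity a little more concrete, the following small sketch (my own toy illustration, not taken from any of the works cited above; the term encoding and the single rewrite rule are invented for the purpose) tests membership in Red(R) by plain recursive pattern matching in Python. Since a linear left-hand side never requires comparing two subterms with each other, only a bounded amount of information is needed at each node, which is the intuition behind the regularity of Red(R) for left-linear R.

    # A minimal sketch in Python (assumed encoding: a ground term is a tuple
    # (symbol, subterm, ..., subterm); every variable occurrence in a linear
    # pattern is written simply as the string "x").
    def matches(pattern, term):
        # Does the linear pattern match the ground term at its root?
        if pattern == "x":                                  # a variable matches anything
            return True
        if pattern[0] != term[0] or len(pattern) != len(term):
            return False
        return all(matches(p, t) for p, t in zip(pattern[1:], term[1:]))

    def reducible(lhss, term):
        # Is some left-hand side applicable at some node of the term?
        if any(matches(lhs, term) for lhs in lhss):
            return True
        return any(reducible(lhss, sub) for sub in term[1:])

    lhs = ("f", ("g", "x"), "x")                            # left-hand side of f(g(x), y) -> ...
    print(reducible([lhs], ("f", ("g", ("a",)), ("a",))))   # True: a redex at the root
    print(reducible([lhs], ("f", ("a",), ("g", ("a",)))))   # False: no redex anywhere

For a non-left-linear rule the matcher would have to test the equality of arbitrarily large subterms, which is exactly what a finite tree recognizer cannot do; this is where the more powerful automata mentioned above come in.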
Weighted tree automata, tree series and weighted tree transformations have been stud-
ied quite extensively in recent years. Most aspects of this work (up to around 2009) are
reviewed in the handbook chapter [45] by Fülöp and Vogler, and a broad introduction
is provided also by the survey paper [31] by Z. Ésik and W. Kuich. Weighted logics for
weighted tree automata have been studied by M. Droste, Vogler and others, cf. [18, 39],
for example. Equational weighted tree transformations are considered by Bozapalidis,
Fülöp and Rahonis [9]. In [71] Rahonis introduces weighted Muller-automata on infinite
trees and a corresponding weighted MSO-logic. The dissertation [61] of C. Mathissen
contains, among other matters, also much interesting material belonging to this area as
well as a useful bibliography.
In an unranked tree a node labeled with a given symbol may have any number of
children. Languages of such trees were considered already in the 1960s in two notable
papers. J. W. Thatcher (1967) introduced finite unranked tree recognizers and showed
that the yields of the recognizable unranked tree languages are precisely the context-free
languages. C. Pair and A. Quere (1968) created an algebraic framework for the study
of regular unranked tree languages that also incorporated hedges, i.e. finite sequences
of unranked trees, and they proved many of the usual properties of regular sets for
recognizable unranked tree languages. Nevertheless, the topic received little attention
before it was discovered that it is natural to represent XML documents by unranked
trees and that unranked tree automata may be useful for handling questions concerning
them. The revival of the theory of unranked tree and hedge languages by M. Murata et al.
[63, 64, 10] initiated a lively activity in the area. TATA [13] devotes a chapter to unranked
tree languages and their applications. As a sample from the extensive literature, let us
mention just the papers [15, 59, 60, 65] and the survey [77] by T. Schwentick. Since this
work is mostly quite application-oriented, algorithmic and complexity issues are much
to the fore. X. Piao and K. Salomaa [69, 70] have considered state complexity questions
connected with conversions between different types of unranked automata as well as
lower bounds for the size of unranked tree automata. An overview of logics for unranked
trees is given by L. Libkin [54]. Weighted unranked tree automata are studied in [19]
and [17] by Droste, Vogler, and D. Heusel.
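To give at least a schematic impression of such devices, here is a small sketch of my own (an invented toy example in Python, not the exact formalism of any of the papers cited above): the recognizer assigns a state to each node depending on the node's label and on the sequence of states of its children, and the admissible child-state sequences are described by regular expressions.

    import re

    # A minimal sketch (assumed encoding: an unranked tree is a pair
    # (label, list of subtrees); states are single characters, and the
    # "horizontal" languages over child states are given as regular expressions).
    transitions = [
        ("a", re.compile(r"q*\Z"), "q"),   # an "a"-node whose children are all in
    ]                                      # state q gets state q, for any arity

    def run(tree, transitions):
        label, children = tree
        child_states = "".join(run(c, transitions) or "#" for c in children)
        for lab, horizontal, state in transitions:
            if lab == label and horizontal.match(child_states):
                return state
        return None                        # no applicable transition: reject

    print(run(("a", [("a", []), ("a", []), ("a", [])]), transitions))   # q
    print(run(("a", [("b", [])]), transitions))                         # None

The essential point is that a node may have arbitrarily many children, so the "horizontal" behaviour at a node is specified by a regular language over the states rather than by a finite transition table.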
Natural language description and processing has become an important area of applica-
tion of the theory of tree automata and tree languages. Of course, parse trees of natural
languages have always been prime examples of ‘trees’ and some of the early works on
tree automata explicitly refer to linguistic motivations, but the current activity really
took off much later. In his book [62] F. Morawietz discusses formalizations of natural
language syntax that are based on monadic second-order (MSO) logic on trees and tree
language theory. A key fact here is the effective correspondence between weak MSO logic
and finite tree automata established already by Thatcher and J. B. Wright (1965, 1968)
and J. Doner (1965, 1970), but actually a whole array of tree language-related notions
are utilized or noted as potentially useful. These include tree walking automata [4, 6, 7],
macro tree transducers [26, 21], and tree-adjoining grammars (cf. [51], for a survey).
Recently, the theory of tree automata has attracted the attention of linguists especially
because of the promise shown by tree-based approaches to machine translation. Besides
classical notions and results appearing already in our book, work in this area draws
also upon some newer developments. In particular, it has both utilized and inspired
work on unranked and weighted tree languages as well as weighted tree transducers.
Furthermore, it has revived the interest in the generalized top-down tree transducers
studied much earlier by E. Lilin (1978). Also compositions and decompositions of vari-
ous tree transformations are used in machine translation systems. The papers [52, 53]
expose some of the relevant questions from a linguist’s point of view, while the papers
[20, 49, 56, 57] form a sample of theoretical work in the area.
Almost all papers on varieties of tree languages, and classes of special regular tree
languages in general, have appeared after 1984. Most of the work in this area published
before 2005 is at least mentioned in the survey [80], and all the references pointed to
(by author and year) below can be found there. Eilenberg-like variety theories for tree
languages were presented by Steinby (1979, 1992, 1998) and J. Almeida (1990, 1995).
Ésik (1999) has set forth a variety theory in which finitary algebraic theories take the
place of finite algebras, and later he together with P. Weil [32] formulated a similar theory
in terms of preclones. Syntactic monoids of tree languages were introduced by Thomas
(1982, 1984) and studied further by Salomaa (1983). A similar notion for binary trees
has been used by Nivat and Podelski (1989, 1992). The families of regular tree languages
considered in the literature include those of the finite and co-finite tree languages (Gécseg
and B. Imreh 1988), definite, reverse definite and generalized definite tree languages
(U. Heuter 1989, 1992), k-testable tree languages (Heuter 1989, T. Knuutila 1992), and
aperiodic tree languages (Thomas 1984). All of them are varieties of tree languages (cf.
Steinby 1992, 1998), and in some cases the corresponding varieties of finite algebras are
also known.
Although Thomas (1984) could characterize the aperiodic tree languages by their syn-
tactic monoids, it was obvious that such a characterization is not possible for all varieties
of tree languages. This was confirmed when S. Salehi [72] described the (generalized) va-
rieties definable by syntactic monoids or semigroups. His result shows, for example, that
the definite tree languages cannot be characterized by syntactic semigroups (as claimed
in an earlier paper). However, in [12] A. Cano Gomez and Steinby introduce generalized
syntactic semigroups (and monoids) in terms of which the definite tree languages can
be characterized. Wilke (1996) gave an effective characterization of the reverse definite
binary tree languages in terms of so-called tree algebras. Salehi and Steinby [74] stud-
ied the tree algebra formalism in some detail and presented a variety theorem for it.
Noticing that the well-known equivalence of aperiodicity, star-freeness, and first-order
definability of string languages fails for trees, Thomas (1984) introduced logics in which
set quantifications are limited to chains or to antichains of nodes. He proved then, for
example, that a regular tree language is star-free iff it is antichain-definable. This line of
research has been pursued further by Heuter (1989, 1991) and A. Potthoff (1994, 1995),
for example.
Some families of tree languages have been introduced by first defining a class of finite
algebras. For example, the monotone tree languages studied by Gécseg and Imreh (2002)
were defined as the languages recognized by monotone algebras. Similarly, Ésik and
Sz. Iván [30] introduce a hierarchy of aperiodicity notions for finite algebras and consider
then, besides the properties of the obtained varieties of finite algebras, the corresponding
families of tree languages. There are a few different extensions of the variety theory
of tree languages: positive varieties of tree languages by T. Petković and Salehi [68],
varieties of many sorted sets (with tree languages as a special case) by Salehi and Steinby
[73], and varieties of recognizable tree series by Fülöp and Steinby [38].
A section of Tree Automata is devoted to deterministic root-to-frontier (DR) recogniz-
ers and DR tree languages, but the topic has been studied quite extensively also later. In
her thesis E. Jurvanen (1995) considers closure properties and the variety generated by
DR tree languages as well as ways of strengthening DR recognizers. The latter include,
in particular, the regular frontier check mechanism introduced by Jurvanen, Potthoff and
Thomas (1994). The thesis is also a good general introduction and a reference for work
done before 1995. In the synchronized deterministic top-down automata of Salomaa
[75, 76] a limited communication between the computations in different branches is al-
lowed. Gécseg and Steinby (2001) introduced syntactic monoids for DR tree languages,
and these were used by Gécseg and Imreh (2002, 2004) for characterizing monotone,
nilpotent and definite DR tree languages. In [59] W. Martens, F. Neven and Schwentick
discuss several aspects of DR-recognition. In particular, motivated by applications to
schema languages for XML, they study DR recognizers of unranked tree languages.
The book Grammatical Picture Generation. A Tree-Based Approach [16] by F. Drewes
is a comprehensive treatment of tree-based picture generation. The picture generating
systems considered consist, roughly speaking, of a device for producing a tree language
and a picture algebra that interprets trees as pictures. The devices used for producing
the tree languages include regular tree grammars, ET0L tree grammars, branching tree
grammars, and tree transducers. The needed tree language theory is given in several
inserts in the main text and in a separate appendix. Thus this fascinating book offers
also a general introduction to tree languages.
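The flavour of the approach can be conveyed by a very small sketch of my own in Python (the three symbols and their pictorial interpretation are invented for this illustration and are far simpler than the picture algebras actually treated in [16]): trees over the symbols dot, beside and frame are evaluated to ASCII pictures.

    # A minimal sketch: a toy "picture algebra" interpreting trees as lists of
    # equally long strings (the rows of an ASCII picture).
    def picture(tree):
        op, *args = tree
        if op == "dot":                    # a primitive one-character picture
            return ["*"]
        if op == "beside":                 # place two pictures side by side
            left, right = picture(args[0]), picture(args[1])
            height = max(len(left), len(right))
            left += [" " * len(left[0])] * (height - len(left))
            right += [" " * len(right[0])] * (height - len(right))
            return [l + r for l, r in zip(left, right)]
        if op == "frame":                  # draw a border around a picture
            inner = picture(args[0])
            width = len(inner[0])
            return (["+" + "-" * width + "+"]
                    + ["|" + row + "|" for row in inner]
                    + ["+" + "-" * width + "+"])
        raise ValueError("unknown symbol: " + op)

    print("\n".join(picture(("frame", ("beside", ("dot",), ("frame", ("dot",)))))))

Any device generating a tree language over these symbols, a regular tree grammar for instance, then generates a set of pictures; this is, very roughly, the division of labour described in [16].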
A great number of concepts and results from several branches of mathematics are
used in the theory of tree automata. However, as a conclusion of this appendix, I shall
mention some introductions to just two subjects most intimately connected with tree
automata: universal algebra and term rewriting. Besides the texts listed at the end of
Chapter I of Tree Automata, there are several other good books on universal algebra. As
general introductions, I recommend the classic [5] by S. Burris and H. P. Sankappanavar
and the more recent textbook by C. Bergman [3]. The book [84] by W. Wechler, written
expressly for computer scientists, is also very useful. The books [1] by J. Avenhaus and
[2] by F. Baader and T. Nipkow offer two good introductions to term rewriting systems.

Bibliography of the Appendix
[1] AVENHAUS, J. (1995): Reduktionssysteme. Springer-Verlag, Berlin 1995.
[2] BAADER, F. and NIPKOW, T. (1998): Term Rewriting and All That. Cambridge
University Press, Cambridge, UK 1998.
[3] BERGMAN, C. (2012): Universal Algebra. Fundamentals and Selected Topics. CRC
Press, A Chapman & Hall Book, Boca Raton, FL 2012.
[4] BLOEM, J. and ENGELFRIET, J. (1997): Monadic second order logic and node re-
lations on graphs and trees. – Structures in Logic and Computer Science (Eds.
J. Mycielski, G. Rozenberg and A. Salomaa), Lecture Notes in Computer Science
1261, Springer-Verlag, Berlin 1997, 144-161.
[5] BURRIS, S. and SANKAPPANAVAR, H. P. (1981): A Course in Universal Algebra.
Springer-Verlag, New York 1981.
[6] BOJANCZYK, M. and COLCOMBET, T. (2006): Tree-walking automata cannot be
determinized. Theoretical Computer Science 350 (2006), 164-173.
[7] BOJANCZYK, M. and COLCOMBET, T. (2008): Tree-walking automata do not recog-
nize all regular languages. SIAM Journal of Computing 38 (2008), 658-701.
[8] BOZAPALIDIS, S., FÜLÖP, Z. and RAHONIS, G. (2011): Equational tree transforma-
tions. Theoretical Computer Science 412 (2011), 3676-3692.
[9] BOZAPALIDIS, S., FÜLÖP, Z. and RAHONIS, G. (2012): Equational weighted tree
transformations. Acta Informatica 49 (2012), 29-52.
[10] BRÜGGEMANN-KLEIN, A., MURATA, M. and WOOD, D. (2001): Regular tree and reg-
ular hedge languages over unranked alphabets: Version 1, April 3, 2001. Technical
Report HKUST-TCSC-2001-05, The Hong Kong University of Science and Technology 2001.
[11] BÜCHI, J. R. (1989): Finite Automata, Their Algebras and Grammars. Towards a
Theory of Formal Expressions (Ed. D. Siefkes), Springer-Verlag, New York 1989.
[12] CANO GOMEZ, A. and STEINBY, M. (2011): Generalized contexts and n-ary syntactic
semigroups of tree languages. Asian-European Journal of Mathematics 4 (2011), 49-
79.
[13] COMON, H., DAUCHET, M., GILLERON, R., JACQUEMARD, F., LUGIEZ, D., LÖDING, C.,
TISON, S. and TOMMASI, M. (2008): Tree Automata Techniques and Applications.
Available at http://tata.gforge.inria.fr.


[14] COURCELLE, B. and FRANCHI-ZANNETTACCI, P. (1982): Attribute grammars and


recursive program schemes I and II. Theoretical Computer Science 17 (1982), 163-
191 and 235-257.

[15] CRISTAU, J., LÖDING, C. and THOMAS, W. (2005): Deterministic automata on un-
ranked trees. – Fundamentals of Computation Theory, FCT 2005 (Eds. M. Liśkiewicz
and R. Reischuk), Lecture Notes in Computer Science 3623, Springer-Verlag,
Berlin 2005, 68-79.

[16] DREWES, F. (2006): Grammatical Picture Generation. A Tree-Based Approach,


Springer-Verlag, Berlin 2006.

[17] DROSTE, M. and HEUSEL, D. (2015): The supports of weighted unranked tree au-
tomata. Fundamenta Informaticae 136 (2015), 37-58.

[18] DROSTE, M. and VOGLER, H. (2006): Weighted tree automata and weighted logics.
Theoretical Computer Science 366 (2006), 228-247.

[19] DROSTE, M. and VOGLER, H. (2011): Weighted logics for unranked tree automata.
Theory of Computing Systems 48 (2011), 23-47.

[20] ENGELFRIET, J., LILIN, E. and MALETTI, A. (2009): Extended multi bottom-up tree
transducers – Composition and decomposition. Acta Informatica 46 (2009), 561-590.

[21] ENGELFRIET, J. and MANETH, S. (1999): Macro tree transducers, attribute gram-
mars, MSO definable tree translations. Information and Computation 154 (1999),
34-91.

[22] ENGELFRIET, J. and MANETH, S. (2003): Macro tree translations of linear size are
MSO definable. SIAM Journal on Computing 32 (2003), 950-1006.

[23] ENGELFRIET, J. and MANETH, S. (2006): The equivalence problem for deterministic
MSO tree transducers is decidable. Information Processing Letters 100 (2006), 206-
212.

[24] ENGELFRIET, J., MANETH, S. and SEIDL, H. (2009): Deciding equivalence of top-
down XML transformations in polynomial time. Journal of Computer and System
Sciences 75 (2009), 271-286.

[25] ENGELFRIET, J., MANETH, S. and SEIDL, H. (2014): How to remove the look-ahead
of top-down tree transducers. – Developments in Language Theory, DLT 2014 (Eds.
A.M. Shur and M.V. Volkov), Lecture Notes in Computer Science 8633, Springer
International Publishing Switzerland 2014, 103-115.

[26] ENGELFRIET, J. and VOGLER, H. (1985): Macro tree transducers. Journal of
Computer and System Sciences 31 (1985), 71-146.

[27] ENGELFRIET, J. and VOGLER, H. (1986): Pushdown machines for the macro tree
transducer. Theoretical Computer Science 42 (1986), 251-368.

[28] ENGELFRIET, J. and VOGLER, H. (1988): High level tree transducers and iterated
pushdown tree transducers. Acta Informatica 26 (1988), 131-192.

[29] ENGELFRIET, J. and VOGLER, H. (1991): Modular tree transducers. Theoretical
Computer Science 78 (1991), 267-304.

[30] ÉSIK, Z. and IVÁN, Sz. (2007): Aperiodicity in tree automata. – Algebraic Infor-
matics CAI 2007 (Eds. S. Bozapalidis and G. Rahonis), Lecture Notes in Computer
Science 4782, Springer-Verlag, Berlin 2007, 189-207.

[31] ÉSIK, Z. and KUICH, W. (2003): Formal tree series. Journal of Automata, Languages
and Combinatorics 8(2) (2003), 219-285.

[32] ÉSIK, Z. and WEIL, P. (2005): Algebraic recognizability of tree languages. Theoret-
ical Computer Science 340 (2005), 291-321.

[33] FILIOT, E., MANETH, S., REYNIER, P.-A. and TALBOT, J.-M. (2015): Decision prob-
lems of tree transducers. – Automata, Languages, and Programming (Proc. 42nd
Intern. Coll. ICALP 2015, Kyoto, Japan, July 2015), Lecture Notes in Computer
Science 9135, Springer-Verlag, Berlin 2015, 209-221.

[34] FRIESE, S., SEIDL, H. and MANETH, S. (2011): Earliest normal form and mini-
mization for bottom-up tree transducers. International Journal of Foundations of
Computer Science 22 (2011), 1607-1623.

[35] FÜLÖP, Z. (1991): A complete description for a monoid of deterministic bottom-up
tree transformation classes. Theoretical Computer Science 88 (1991), 253-268.

[36] FÜLÖP, Z. (1994): Undecidable properties of top-down tree transducers. Theoretical
Computer Science 134 (1994), 311-328.

[37] FÜLÖP, Z. and GYENIZSE, P. (1993): On injectivity of deterministic top-down tree
transducers. Information Processing Letters 48 (1993), 183-188.

[38] FÜLÖP, Z. and STEINBY, M. (2011): Varieties of recognizable tree series over fields.
Theoretical Computer Science 412 (2011), 736-752.

[39] FÜLÖP, Z., STÜBER, T. and VOGLER, H. (2012): A Büchi-like theorem for weighted
tree automata over multioperator monoids. Theory of Computing Systems 50
(2012), 241-278.

[40] FÜLÖP, Z. and VÁGVÖLGYI, S. (1987): Results on compositions of deterministic
root-to-frontier tree transformations. Acta Cybernetica 8 (1987), 49-61.

[41] FÜLÖP, Z. and VÁGVÖLGYI, S. (1989): Variants of top-down tree transducers with
look-ahead. Mathematical Systems Theory 21 (1989), 125-145.

[42] FÜLÖP, Z. and VÁGVÖLGYI, S. (1990): A complete rewriting system for a monoid
of tree transformation classes. Information and Computation 86 (1990), 195-212.

[43] FÜLÖP, Z. and VÁGVÖLGYI, S. (1991): A complete classification of deterministic
root-to-frontier tree transformation classes. Theoretical Computer Science 81
(1991), 1-15.

[44] FÜLÖP, Z. and VOGLER, H. (1998): Syntax-Directed Semantics. Formal Models
Based on Tree Transducers, Springer-Verlag, Berlin 1998.

[45] FÜLÖP, Z. and VOGLER, H. (2009): Weighted tree automata and tree transducers.
– Handbook of Weighted Automata (Eds. M. Droste, W. Kuich and H. Vogler),
Springer-Verlag, Berlin 2009, 313-403.

[46] GÉCSEG, F. and STEINBY, M. (1997): Tree languages. – Handbook of Formal Lan-
guages, Vol. 3 (Eds. G. Rozenberg and A. Salomaa), Springer-Verlag, Berlin 1997,
1-68.

[47] GILLERON, R. and TISON, S. (1995): Regular tree languages and rewrite systems.
Fundamenta Informaticae 24 (1995), 157-175.

[48] GODOY, G. and GIMÉNEZ, O. (2013): The HOM problem is decidable. Journal of
the ACM 60(4) (2013), Article 23.

[49] GRAEHL, J., KNIGHT, K. and MAY, J. (2008): Training tree transducers. Computa-
tional Linguistics 34 (2008), 391-427.

[50] GRÄDEL, E., THOMAS, W. and WILKE, T. (Eds.) (2002): Automata, Logics, and
Infinite Games, Springer-Verlag, Berlin 2002.

[51] JOSHI, A. K. and SCHABES, Y. (1997): Tree-adjoining grammars. – Handbook of
Formal Languages, Vol. 3 (Eds. G. Rozenberg and A. Salomaa), Springer-Verlag,
Berlin 1997, 69-123.

[52] KNIGHT, K. (2007): Capturing practical natural language transformations. Machine
Translation 21 (2007), 121-133.

[53] KNIGHT, K. and GRAEHL, J. (2005): An overview of probabilistic tree transducers
for natural language processing. – Computational Linguistics and Intelligent
Text Processing (Proc. 6th International Conference, CICLing 2005, Mexico City,
Mexico, February 2005), Lecture Notes in Computer Science 3406, Springer-Verlag,
Berlin 2005, 1-24.

[54] LIBKIN, L. (2006): Logics for unranked trees: an overview. Logical Methods in
Computer Science 2 (2006), 1-31.

[55] LÖDING, C. (2012): Basics on tree automata. – Modern Applications of Automata
Theory (Eds. D. D’Souza and P. Shankar), World Scientific, Singapore 2012, 79-109.

[56] MALETTI, A. (2011a): Survey. Weighted top-down tree transducers. Part I – Basics
and expressive power. Acta Cybernetica 20 (2011), 223-250.

[57] MALETTI, A. (2011b): Survey. Weighted top-down tree transducers. Part II –
Applications in machine translation. Fundamenta Informaticae 112 (2011), 239-261.

[58] MANETH, S. (2014): Equivalence problems for tree transducers: a brief survey. –
Automata and Formal Languages 2014, AFL 2014 (Eds. Z. Ésik and Z. Fülöp),
EPTCS 151, 2014, 74-93.

[59] MARTENS, W., NEVEN, F. and SCHWENTICK, T. (2008): Deterministic top-down
tree automata: past, present and future. – Logic and Automata. Texts in Logic and
Games, Vol. 2 (Eds. J. Flum, E. Grädel and T. Wilke), Amsterdam University
Press, Amsterdam 2008, 515-541.

[60] MARTENS, W. and NIEHREN, J. (2007): On the minimization of XML Schemas and
tree automata for unranked trees. Journal of Computer and System Sciences 73
(2007), 550-583.

[61] MATHISSEN, C. (2009): Weighted Automata and Weighted Logics over Tree-like
Structures. Dissertation, Faculty of Mathematics and Informatics, University of
Leipzig, Leipzig 2009.

[62] MORAWIETZ, F. (2003): Two-Step Approach to Natural Language Formalisms.
Studies in Generative Grammar 64, Mouton de Gruyter, Berlin 2003.

[63] MURATA, M. (1995): Forest-regular and tree-regular languages. Technical Report,
Fuji-Xerox, Japan 1995.

[64] MURATA, M. (2000): Hedge automata: A formal model for XML schemata. Fuji-
Xerox Information Systems, Japan 2000.

[65] NEVEN, F. (2002): Automata, logic, and XML. – Computer Science Logic (Proc.
16th Internat. Workshop, CSL 2002, Edinburgh, UK, 2002). Lecture Notes in Com-
puter Science 2471, Springer-Verlag, Berlin 2002, 2-26.

[66] NIVAT, M. and PODELSKI, A. (Eds.) (1992): Tree Automata and Languages, Studies
in Computer Science and Artificial Intelligence 10, North-Holland, Amsterdam 1992.

[67] OTTO, F. (1999): On the connections between rewriting and formal languages. –
Rewriting Techniques and Applications, RTA-99 (Proc. Conf., Trento, Italy, 1999),
Lecture Notes in Computer Science 1631, Springer-Verlag, Berlin 1999, 332-355.

[68] PETKOVIĆ, T. and SALEHI, S. (2005): Positive varieties of tree languages. Theoretical
Computer Science 347 (2005), 1-35.

[69] PIAO, X. and SALOMAA, K. (2011): Transformations between different models of
unranked bottom-up tree automata. Fundamenta Informaticae 109 (2011), 405-424.

[70] PIAO, X. and SALOMAA, K. (2012): Lower bounds for the size of deterministic
unranked tree automata. Theoretical Computer Science 454 (2012), 231-239.

[71] RAHONIS, G. (2007): Weighted Muller tree automata and weighted logics. Journal
of Automata, Languages and Combinatorics 12 (2007), 455-483.

[72] SALEHI, S. (2005): Varieties of tree languages definable by syntactic monoids. Acta
Cybernetica 17 (2005), 21-41.

[73] SALEHI, S. and STEINBY, M. (2007a): Varieties of many-sorted recognizable sets.
PU.M.A. 18 (2007), 319-343. Also as: TUCS Technical Report No 626, Turku 2004.

[74] SALEHI, S. and STEINBY, M. (2007b): Tree algebras and varieties of tree languages.
Theoretical Computer Science 377 (2007), 1-24.

[75] SALOMAA, K. (1994): Synchronized tree automata. Theoretical Computer Science
127 (1994), 25-51.

[76] SALOMAA, K. (1996): Decidability of equivalence for deterministic synchronized tree
automata. Theoretical Computer Science 167 (1996), 171-192.

[77] SCHWENTICK, T. (2007): Automata for XML – A survey. Journal of Computer and
System Sciences 73 (2007), 289-315.

[78] SEIDL, H., MANETH, S. and KEMPER, G. (2015): Equivalence of deterministic top-
down tree-to-string transducers is decidable. arXiv:1503.09163 [cs.FL], 31 March 2015.

[79] STEINBY, M. (2003): Tree automata in the theory of term rewriting. – Words,
Languages and Combinatorics III (Proc. Intern. Conf., Kyoto, Japan, 2000) (Eds.
M. Ito and T. Imaoka), World Scientific, New Jersey 2003, 434-449.

[80] STEINBY, M. (2005): Algebraic classifications of regular tree languages. – Structural
Theory of Automata, Semigroups and Universal Algebra (Eds. V.B. Kudryavtsev
and I.G. Rosenberg), NATO Science Series, Mathematics, Physics and Chemistry,
vol. 207 (2005), 381-432.

[81] THOMAS, W. (1990): Automata on infinite objects. – Handbook of Theoretical
Computer Science, Vol. B (Ed. J. van Leeuwen), Elsevier, Amsterdam 1990, 133-191.

[82] THOMAS, W. (1997): Languages, automata, and logic. – Handbook of Formal Lan-
guages, Vol. 3 (Eds. G. Rozenberg and A. Salomaa), Springer-Verlag, Berlin 1997,
389-455.

[83] VÁGVÖLGYI, S. (2013): Rewriting preserving recognizability of finite tree languages.
The Journal of Logic and Algebraic Programming 82 (2013), 71-94.

[84] WECHLER, W. (1992): Universal Algebra for Computer Scientists, Springer-Verlag,
Berlin 1992.
