Signatures, Algebra, Homomorphisms

Download as pdf or txt
Download as pdf or txt
You are on page 1of 10

(*

Interlude: abstracting from the particular language


- to a theory of all abstract languages, and
- (all) possible interpreters (evaluators)
*)
(*
Signatures and Signed/Ranked Algebras *)
(*
Let us present a general framework for discussing Abstract Syntax

For simplicity, we confine our interest to a homogeneous algebraic framework: all


expressions are of one sort. (The generalisation is called multi-sorted algebras.)
*)

(*
We first fix the symbols which any candidate (abstract) language will use:
Definition: Signature ∑ - a set of symbols,
and for each symbol, its arity ≥ 0
*)

(* Example: Boolean expressions


T , F arity 0
¬ arity 1
\/, /\ arity 2
*)

(* Example: Integer expressions


(a denumerable set of) numerals arity 0
+ , * arity 2
unary minus - arity 1
*)

(*
We now discuss the meaning of expressions in any language whose symbols have
been specified using a signature ∑.

Semantics

Definition (∑-algebra)
A — a carrier set, and
for each 0-ary symbol in ∑, associate some element of A
for each k-ary symbol in ∑, associate a total function in A k → A

Notes:
• Not all elements of A require having a 0-ary symbol in ∑ associated with them.
That is, the carrier set may contain values (perhaps infinitely many) for which
there is no syntax. Irrational real numbers are a good examples — there may be
no way of writing a symbol for each irrational number. In fact, we may not even
be able to write an expression for each irrational number.
• Different 0-ary symbols in ∑ do not necessarily have to be associated with
different elements — two different symbols can be mapped to the same value.
Similarly for k-ary symbols, two different symbols can be mapped to the same
function.
• Two ∑-algebras can have the same carrier set A, but differ on the interpretation of
the symbols in the signature ∑. These would be considered different ∑-algebras.
*)

(*
Note: Symbols do not inherently carry meaning. We can assign whatever meanings
we choose to …. but once we assign meaning by choosing a particular ∑-algebra, we
have fixed the meanings of the symbols.

In particular, we can give


Standard examples of ∑-algebras, where the symbols are given meanings that we all
expect and agree upon, e.g., binary symbol + means addition on naturals, the 0-ary
symbol numeral 0 means the natural number zero.

But we can also give


non-standard examples of ∑-algebras, where the meanings of symbols differ from
what we normally expect, e.g., 0-ary symbol numeral 0 can be given the meaning
“forty-two”.
*)

(*
Notation:
If = ⟨A, …⟩ is a ∑-algebra, and a ∈ A is associated with 0-ary symbol c in ∑, we
write c to denote the element a ∈ A, highlighting that this is the meaning
associated with symbol c in algebra .
Similarly, if f is a k-ary symbol in ∑, and in ∑-algebra = ⟨A, …⟩, f is associated
k
with a total function g : A → A, we write f to denote this function g.
*)
(* Abstract Syntax, abstractly and formally

Definition: (Carrier set TreeΣ)


Suppose ∑ is a given signature.
The set TreeΣ is inductively defined as follows:
Base cases: for each 0-ary symbol c in ∑, a labelled node •c is in TreeΣ
Induction cases: for each k > 0,
for each k-ary symbol f in ∑,
for any trees t1, …, tk (already) in TreeΣ,
the tree consisting of
a labelled node •f at the root
with the k trees t1, …, tk ordered as subtrees below the root node
is in TreeΣ.
𝒜
𝒜
𝒜
𝒜𝒜
Notes: The roots of t1, …, tk are the children, ordered 1…k, of the new root node •f
The ti need not be distinct.

The Abstract Syntax Tree Algebra: Syntax as Semantics


Definition: TreeΣ is the ∑-algebra with carrier set TreeΣ, where
- every 0-ary symbol c in ∑ is interpreted as the labelled node •c, which is in TreeΣ,
and
- every k-ary symbol f in ∑ is interpreted as the tree-forming function that takes k
trees t1, …, tk and creates the tree
•f
/ \
t1…tk
in TreeΣ by sticking t1, …, tk as the subtrees below a new root node labelled •f.
*)
(*
Structure-preserving functions between ∑-algebras
∑-homomorphisms are structure-preserving functions between ∑-algebras which
respect the symbol association (imposed with respect to the symbols in the
signature ∑) by the semantics of the source and target ∑-algebras.

Definition (∑-homomorphisms) Suppose ∑ is a given signature.


Suppose = ⟨A, …⟩ and ℬ = ⟨B, …⟩ are both ∑-algebras (for the same
signature ∑).
A total function h : A → B between the carrier sets of these algebras is called a ∑-
homomorphism if
- for all 0-ary symbols c in ∑: h(c ) = cℬ
- for all k-ary (k >0) symbols f in ∑,
for all a1, …ak ∈ A,
h( f (a1, …, ak ) ) = fℬ( h(a1), …, h(ak ) )
*)
(*
Initiality Theorem
For any ∑-algebra, ℬ = ⟨B, …⟩, there is a unique ∑-homomorphism
iℬ : TreeΣ → B
from the ∑-algebra TreeΣ to the ∑-algebra ℬ.
Proof — by construction, define iℬ to be a ∑-homomorphism.
Define iℬ : TreeΣ → B as follows:
• for each 0-ary symbol c in ∑: iℬ( •c ) = cℬ
• for each k-ary (k >0) symbol f in ∑,
iℬ ( •f ) = fℬ( iℬ(t1), …, iℬ(tk) )
/ \
t1…tk
𝒜
𝒜
𝒜
Uniqueness of iℬ
Suppose j : TreeΣ → B is also a ∑-homomorphism from TreeΣ to ℬ.
Proof by induction on (ht t) that for all t in TreeΣ, iℬ( t ) = j( t )
Base cases (ht t) = 0.
t must be of the form •c for some 0-ary symbol in ∑.
But for each 0-ary symbol c in ∑: iℬ( •c ) = cℬ // defn of iℬ
= j( •c ) // j is a ∑-homomorphism
// from TreeΣ to ℬ
Induction Hypothesis:
Assume that for all t’ in TreeΣ such that (ht t’) ≤ n,
iℬ( t’ ) = j( t’ )
Induction Step: Consider t in TreeΣ such that (ht t) = n+1
Therefore t is of the form
•f
/ \
t1…tk
with (ht ti ) ≤ n (1 ≤ i ≤ k)
Now, by definition of iℬ,
for each k-ary symbol (k >0) f in ∑,
iℬ( •f )
/ \
t1…tk

= fℬ( iℬ( t1 ),…, iℬ( tk ) ) // definition of iℬ

= fℬ( j( t1 ),…, j( tk ) ). // by IH on each of t1…tk

j( •f ) // j is a ∑-homomorphism from TreeΣ to ℬ


/ \
t1…tk
*)
(*
Variables

We now move from expressions to formulas, i.e., expressions that contain


variables. As the name implies, a variable is a 0-ary symbol can be given different
values, in contrast with constants in a signature ∑, whose values are fixed, once
we fix a ∑-algebra . Assume a (denumerable) set of variables, disjoint from
symbols in ∑, i.e., ∩ Σ = ∅ *)
(* Let us represent variables using the OCaml type string and extend the
encoding of the abstract syntax for expressions in the data type exp:
*)
type exp = N of int
| V of string
| Plus of exp * exp
𝒜𝒳𝒳
| Times of exp * exp;;

(* Note that we have a new family of base cases: variables, which are denoted using
the parameterised constructor V(_) which takes arguments of type string.
*)
(* The corresponding Concrete Syntax can also be extended
E —> T
E —> T + E
T —> F
T —> F * T
F —> n
F —> v // a family of variables
F —> ( E )
*)
(*
We now extend the syntactic functions on tree-structured expressions *)

(* size, as before, returns the number of nodes in the tree, with a variable counted
as a single node. *)
let rec size t = match t with
N _ -> 1
| V _ -> 1
| Plus(t1, t2) -> (size t1) + (size t2) + 1
| Times(t1, t2) -> (size t1) + (size t2) + 1
;;

(* ht, also as before, returns one less than the number of levels in the tree, with
variables being treated as leaf nodes. *)

let rec ht t = match t with


N _ -> 0
| V _ -> 0
| Plus(t1, t2) -> (max (ht t1) (ht t2)) + 1
| Times(t1, t2) -> (max (ht t1) (ht t2)) + 1
;;

(* As mentioned before, both size and ht are good measures for performing
induction on trees. *)

(* We now define a syntactic function on exp called vars which returns the list of
variables in the tree. (We will later see that we really want a function returning the
set of variables in a syntax tree. If we represent sets as lists, then the difference
goes away; of course, we must avoid duplicates and not care about the relative order
in which elements appear). *)
let rec vars t = match t with
N _ -> [ ]
| V x -> [x]
| Plus(t1, t2) -> (vars t1) @ (vars t2)
| Times(t1, t2) -> (vars t1) @ (vars t2)
;;

(* We now extend the standard semantics — the definitional interpreter of a


language of expressions which include variables. The main question to answer is
“what is the value of an expression that contains variables?” The way to approach
this question is to pose a counter-question, asking what value each variable takes.
So eval now takes an extra argument, which we call rho, that maps each variable
to a value. For the most part, except the new base cases, argument rho is carried
along as extra baggage in each call. But it cannot be left out. (Why not?) Note that
rho is a function from string to int ! *)

let rec eval t rho = match t with


N n -> n
| V x -> rho x
| Plus(t1, t2) -> (eval t1 rho) + (eval t2 rho)
| Times(t1, t2) -> (eval t1 rho) * (eval t2 rho)
;;

(* We now extend the compile function. First we introduce an opcode LOOKUP to


look up the value associated with a given variable.
*)
type opcode = LD of int
| LOOKUP of string //look up variable’s value
| PLUS
| TIMES
;;

(* The compiler as before, is a post-order traversal, i.e., a tree-recursive function,


with a straightforward new base case. *)
let rec compile t = match t with
N n -> [LD (n)]
| V x -> [ LOOKUP(x) ]
| Plus(t1, t2) -> (compile t1) @ (compile t2) @ [PLUS]
| Times(t1, t2) -> (compile t1) @ (compile t2) @ [TIMES]
;;

(* The stack machine is again written as a tail recursive function driven


by the first op-code and the state of the stack, and a new component env
(environment), which maps variables to their corresponding answers.
*)
let rec stkmc s code env = match code with
[ ] -> s
| (LD(n)) :: code' -> stkmc (n :: s) code' env
| (LOOKUP(x))::code' -> let n=env x
in stkmc (n::s) code' env
| PLUS :: code' -> (match s with
n1 :: n2 :: s' -> let n = n1 + n2
in stkmc (n :: s') code' env
| _ -> raise Stuck)
| TIMES :: code' -> (match s with
n1 :: n2 :: s' -> let n = n1 * n2
in stkmc (n :: s') code' env
| _ -> raise Stuck)
;;

(* Let us now abstract this treatment to arbitrary abstract syntax trees over
arbitrary signatures ∑. We do so by defining a set TreeΣ( ).

Definition: Suppose ∑ is a given signature.


Suppose is a (denumerable) set of variables.
The set TreeΣ( ) is inductively defined as follows:
Base cases:
- for each 0-ary symbol c in ∑, a labelled node •c is in TreeΣ( )
- for each variable x in , a labelled node •x is in TreeΣ( ).
Induction cases: for each k > 0,
for each k-ary symbol f in ∑,
for any trees t1, …, tk in TreeΣ( ),
the tree consisting of a labelled node •f at the root
with k trees t1, …, tk ordered as subtrees below the root node
is in TreeΣ( ).

The algebra of abstract syntax trees


TreeΣ( ) is the ∑-algebra with carrier set TreeΣ( ), where (as before)
- every 0-ary symbol c in ∑ is interpreted as the labelled node •c, in TreeΣ( ),
- every k-ary symbol (k >0) f in ∑ is interpreted as the tree-forming function
that takes k trees t1, …, tk from TreeΣ( ), and creates the tree
•f
/ \
t1…tk
by sticking t1, …, tk as the subtrees below a new root node labelled •f. *)
(* Observe is in 1-1 correspondence with a subset • of TreeΣ( ).
Note also that TreeΣ( ) contains • ∪ TreeΣ.

Definition (B-valuation). A function ρ : → B that maps variables to values in a


carrier set B is called a B-valuation.

Definition (Function extension)


A function h: TreeΣ( ) → B extends function ρ : → B if
for all x ∈ , h(•x ) = ρ(x)

Unique Homomorphic Extension Theorem


For any signature ∑, and
any ∑-algebra, ℬ = ⟨B, …⟩,
𝒳𝒳
𝒳
𝒳𝒳
𝒳
𝒳
𝒳
𝒳
𝒳
𝒳
𝒳
𝒳𝒳
𝒳
𝒳𝒳
𝒳
𝒳
𝒳
𝒳
and any B-valuation ρ : → B,
there is a unique ∑-homomorphic extension of ρ : → B, written
ρ ̂ : TreeΣ( ) → B
from the (syntactic) ∑-algebra TreeΣ( ) to the ∑-algebra ℬ.

Proof (First) Existence of a ∑-homomorphic extension of ρ : → B.


By construction: define ρ ̂ : TreeΣ( ) → B to be a ∑-homomorphism as follows:
- For each 0-ary symbol c in ∑: ρ (̂ •c ) = cℬ
- For each x ∈ , ρ (̂ •x ) = ρ(x)
- For each k-ary symbol (k >0) f in ∑, and t1, …, tk in TreeΣ( )
ρ (̂ •f ) = fℬ( ρ (̂ t1 ),…, ρ (̂ tk ) )
/ \
t1…tk

Uniqueness of ρ ̂

Suppose j is an extension of ρ, and j : TreeΣ( ) → B is a ∑-homomorphism from


∑-algebra TreeΣ( ) to the ∑-algebra ℬ.

Proof by induction on (ht t) that for all t in TreeΣ( ), ρ (̂ t ) = j( t )


Base cases (ht t) = 0.
subcase t is of the form •c
for each 0-ary symbol c in ∑: ρ (̂ •c ) = cℬ // defn of ρ ̂
= j( •c ) // j is a ∑-homomorphism
// from TreeΣ( ) to ℬ
subcase t is of the form •x for some x ∈
ρ (̂ •x ) = ρ(x) // defn of ρ ̂
= j( •x ) // j extends ρ

Induction Hypothesis:
Assume that for all t’ in TreeΣ( ) such that (ht t) ≤ n,
ρ (̂ t’ ) = j( t’ )

Induction Step: Consider t in TreeΣ( ) s.t. (ht t) = n+1


t must be of the form
•f
/ \
t1…tk
with (ht ti) ≤ n (1 ≤ i≤ k)
Now, by definition,
for each k-ary symbol (k >0) f in ∑, and t1, …, tk in TreeΣ( ) s.t. (ht ti) ≤ n
ρ (̂ •f )
/ \
t1…tk
𝒳
𝒳
𝒳
𝒳
𝒳
𝒳
𝒳
𝒳
𝒳
𝒳
𝒳
𝒳
𝒳
𝒳
𝒳
𝒳
= fℬ( ρ (̂ t1 ),…, ρ (̂ tk ) ) // definition of ρ ̂
= fℬ( j( t1 ),…, j( tk ) ) // by IH on each of t1…tk
= j( •f ) // j : TreeΣ( ) → B is a ∑-homomorphism
/ \ // from ∑-algebra TreeΣ( ) to the ∑-algebra ℬ.
t1…tk
*)

(* Only those variables that appear in a tree (term, expression) matter in its
evaluation.
Lemma (Relevant Variables): Suppose ∑ is an arbitrary signature and is the
set of variables.
Let t in TreeΣ( ) be an arbitrary tree (term, expression).
Let V = vars( t ).
Consider any ∑-algebra ℬ = ⟨B, …⟩, and let ρ1, ρ2 ∈ → B be any two B-
valuations such that for all variables x ∈ V, ρ1(x) = ρ2(x).
[Note: ρ1(y) and ρ2(y) may be very different for y ∉ V. ]
Then ρ1 ̂(t) = ρ2 ̂(t).

Proof by induction on ht( t ).


Base cases (ht( t ) = 0 ).

Therefore ρ1 ̂(∙c) = cℬ = ρ2 ̂(∙c)


subcases: t is of the form •c for some 0-ary symbol c in ∑.

subcases t is of the form •x for some x in . So x ∈ V.


Therefore ρ1 ̂(∙x) = ρ1(x) = ρ2(x) = ρ2 ̂(∙x)
Induction Hypothesis: Assume for all t’ in TreeΣ( ) such that ht( t’ ) ≤ n,
for all ρ′1, ρ′2 ∈ → B such that
for all variables x ∈ V′ = vars( t’ ), ρ′1(x) = ρ′2(x),
that ρ′1 ̂(t′) = ρ′2 ̂(t′).
Induction Step: Let t in TreeΣ( ) be such that ht( t ) = n+1.
By analysis t is of the form •f for some symbol f of arity k in ∑
/ \
t1…tk
and trees t1, …, tk from TreeΣ( ), where max ht( ti) = n. (1≤i≤k)
ρ1 ̂( •f ) = fℬ( ρ1 (̂ t1 ),…, ρ1 ̂( tk ) )
/ \
t1…tk
= fℬ( ρ2 ̂( t1 ),…, ρ2 (̂ tk ) ) // By IH on t1, …, tk with ρ′1 = ρ1 and ρ′2 = ρ2
// Note that Vi = vars( ti ) ⊆ vars( t) = V
// so if for all x ∈ V, ρ1(x) = ρ2(x),
// then for all x ∈ Vi, ρ1(x) = ρ2(x) (1≤i≤k)
= ρ2 (̂ •f )
/ \
t1…tk
*)




𝒳
𝒳
𝒳
𝒳
𝒳
𝒳
𝒳






𝒳
𝒳
𝒳

(*
Proposition: For any t in TreeΣ( ), V = vars( t ) is finite.
|V|
|B|
Corollary: t in TreeΣ( ) can evaluate to at most | B | values.
*)
𝒳
𝒳

You might also like