0% found this document useful (0 votes)
21 views8 pages

COL226 9 Variables in the Calculator

The document discusses the extension of a toy programming language to include variables, allowing expressions to vary with different inputs. It outlines the modifications needed for type-checking, evaluation, and the definitional interpreter to accommodate these variables, including the introduction of typing assumptions and a new 'eval' function. Additionally, it addresses the operational semantics and the conditions for soundness and completeness in relation to the definitional interpreter.

Uploaded by

mayankgoel726
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
21 views8 pages

COL226 9 Variables in the Calculator

The document discusses the extension of a toy programming language to include variables, allowing expressions to vary with different inputs. It outlines the modifications needed for type-checking, evaluation, and the definitional interpreter to accommodate these variables, including the introduction of typing assumptions and a new 'eval' function. Additionally, it addresses the operational semantics and the conditions for soundness and completeness in relation to the definitional interpreter.

Uploaded by

mayankgoel726
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 8

From Fixed Expressions to Formulas

Variables
So far, we have looked at type-checking, evaluating (in a value domain) and calculating (in
abstract syntax) expressions. However, “programming” makes sense when one performs
the same calculation on different inputs. In that sense, our toy language compiler or a
definitional interpreter are programs that can work on different expressions as input —
however, the toy language is not yet a “programming language”. The next step we will take
is to make some parts of expressions vary. Mathematicians do this all the time: for
example, they calculate a formula b 2 − 4ac for different values of a, b, c. The expressions
a, b, c are called input variables since they can (each) be given various values.

So our first step is to extend the toy language with a new case of expressions — namely,
variables. Assume that we have a (denumerable) set of identifiers (assume also that
these identifiers are different from other symbols in the language), with
x, y, z, x1, y1, … ∈ representing typical variables. (We call x, y, z, x1, y1, … “meta-
variables” — the actual variables in the language will be particular strings).

Following an “abstract grammatical notation” to characterise the inductively defined set of


expressions (i.e., abstract syntax), we can write
e ∈ Exp ::= n ∣ T ∣ F ∣ x ∣ e1 + e2 ∣ e1 * e2 ∣ e1 ∧ e2 ∣ e1 ∨ e2 ∣ ¬e1 ∣ e1 = e2 ∣ e1 > e2

We extend the OCaml encoding of the language


type exp = Num of int | Bl of myBool
| V of string (* variables *)
| Plus of exp * exp | Times of exp * exp
| And of exp * exp | Or of exp * exp | Not of exp
| Eq of exp * exp | Gt of exp * exp
;;
by redefining the type exp to include another case (namely, variables) represented using a
constructor V that takes a string (the name of a particular variable in the toy language) as
an argument.

The functions ht and size are amended as follows, mapping all variables to have height 0,
and size 1.

let rec ht e = match e with


Num n -> 0
| Bl b -> 0
| V x -> 0
| Plus (e1, e2) -> 1 + (max (ht e1) (ht e2))
| Times (e1, e2) -> 1 + (max (ht e1) (ht e2))
| And (e1, e2) -> 1 + (max (ht e1) (ht e2))
| Or (e1, e2) -> 1 + (max (ht e1) (ht e2))
| Not e1 -> 1 + (ht e1)
| Eq (e1, e2) -> 1 + (max (ht e1) (ht e2))
| Gt(e1, e2) -> 1 + (max (ht e1) (ht e2))
;;

and
𝒳
𝒳
let rec size e = match e with
Num n -> 1
| Bl b -> 1
| V x -> 1
| Plus (e1, e2) -> 1 + (size e1) + (size e2)
| Times (e1, e2) -> 1 + (size e1) + (size e2)
| And (e1, e2) -> 1 + (size e1) + (size e2)
| Or (e1, e2) -> 1 + (size e1) + (size e2)
| Not e1 -> 1 + (size e1)
| Eq (e1, e2) -> 1 + (size e1) + (size e2)
| Gt(e1, e2) -> 1 + (size e1) + (size e2)
;;

Types and typing rules


Assume we have a set Typ of types. Let τ, τ1, τ′ ∈ Typ represent typical types (again,
τ, τ1, τ′ are “meta-variables” ranging over types). So far, we have considered Int and Bool
as members of Typ. (This will be extended as we proceed later). We have seen earlier
typing rules that associate numeric expressions with type Int and boolean expressions with
type Bool. But what type should we give variables?

Type Assumptions. A typing assumption Γ ∈ →fin Typ is a finite-domain function


from variables to types, that is, it associates a type with any variable in its domain.
If Γ is a type assumption, x ∈ a variable, and τ ∈ Typ a type, we write Γ[ x : τ] to
denote the type assumption that associates the type τ to variable x and for other variables
in dom(Γ), associates them to types exactly as Γ would.
This notion generalises as follows:
If Γ, Γ1 are type assumptions, then Γ[Γ1] denotes the type assumption defined as
Γ[Γ1]( x) = Γ1( x) if x ∈ dom(Γ1);
Γ[Γ1]( x) = Γ( x) if x ∈ dom(Γ) − dom(Γ1);
and undefined if x ∉ dom(Γ) ∪ dom(Γ1).
(This is in fact, the standard notion of one finite domain function being augmented by
another.)

The “has type” relation is now modified to carry a type assumption, written to the left of
the “turnstile”, to handle the presence of variables within expressions, and by adding a rule
to deal with the base case of variables. Note that each statement is modified to read as
Γ⊢_:_

(NumT) All numerals n


Γ ⊢ n : Int
have type Int for any Γ

(BoolT) All boolean constants b have type Bool for any Γ


Γ ⊢ b : Bool

(VarT) A variable has the type it is assumed to have.


Γ ⊢ x : Γ( x)
𝒳
𝒳


Γ ⊢ e1 : Int Γ ⊢ e2 : Int
(PlusT)
Γ ⊢ e1 + e2 : Int
All addition expressions e1 + e2 have type Int, provided the subexpressions e1 and e2
both have type Int under the same type assumptions Γ.

Γ ⊢ e1 : Int Γ ⊢ e2 : Int
(TimesT)
Γ ⊢ e1 * e2 : Int
All multiplication expressions e1 * e2 have type Int, provided the subexpressions e1
and e2 both have type Int under the same type assumptions Γ.

Γ ⊢ e1 : Bool
(NotT)
Γ ⊢ ¬e1 : Bool
All negation expressions ¬e1 have type Bool, provided the subexpressions e1 have
type Bool under the same type assumptions Γ.

Γ ⊢ e1 : Bool Γ ⊢ e2 : Bool
(AndT)
Γ ⊢ e1 ∧ e2 : Bool
All conjunction expressions e1 ∧ e2 have type Bool, provided the subexpressions e1
and e2 both have type BoolT under the same type assumptions Γ.

Γ ⊢ e1 : Bool Γ ⊢ e2 : Bool
(OrT)
Γ ⊢ e1 ∨ e2 : Bool
All disjunction expressions E1 ∨ E2 have type Bool, provided the subexpressions e1
and e2 both have type Bool under the same type assumptions Γ.

Γ ⊢ e1 : τ Γ ⊢ e2 : τ
(EqT)
Γ ⊢ e1 = e2 : Bool
All equality expressions e1 = e2 have type Bool, provided the subexpressions e1 and
e2 both have type τ under the same type assumptions Γ.

Γ ⊢ e1 : Int Γ ⊢ e2 : Int
(GtT)
Γ ⊢ e1 > e2 : Bool
All greater-than expressions e1 > e2 have type Bool, provided the subexpressions e1
and e2 both have type Int under the same type assumptions Γ.

Modifying the definitional interpreter


How does the definitional interpreter change? Well, what is the value of a variable?
Whatever value we give to the variable by a “valuation”, i.e., a function ρ ∈ →
from variables to values in the set of values . So each equation for eval takes ρ as
an additional argument:
𝕍
𝒳
𝕍
eval[[ n]] ρ = n
eval[[ T ]] ρ = true and eval[[F ]] ρ = false

eval[[ x]] ρ = ρ( x)

eval[[e1 + e2]] ρ = (eval[[e1 ]] ρ) + (eval[[e2 ]] ρ)


eval[[e1 * e2]] ρ = (eval[[e1 ]] ρ) × (eval[[e2 ]] ρ)
(where +, × represent integer addition and multiplication).

eval[[e1 ∧ e2]] ρ = (eval[[e1 ]] ρ) && (eval[[e2 ]] ρ)


eval[[e1 ∨ e2]] ρ = (eval[[e1 ]] ρ) ∣ ∣ (eval[[e2 ]] ρ)
eval[[ ¬e1]]ρ = not (eval[[e1 ]] ρ)
(where && , ∣ ∣ , not represent boolean conjunction, disjunction and
negation).

eval[[e1 = e2]] ρ = (eval[[e1 ]] ρ) =? (eval[[e2 ]] ρ)


eval[[e1 > e2]] ρ = (eval[[e1 ]] ρ) >? (eval[[e2 ]] ρ)
(where =? , >? represent equality and greater-than comparisons on
integers).

The Modified Definitional Interpreter in OCaml

type values = N of int | B of bool;;

let rec eval e rho = match e with


Num n -> N n
| Bl b -> B (myBool2bool b)
| V x -> rho x
| Plus (e1, e2) -> let N n1 = (eval e1 rho)
and N n2 = (eval e2 rho)
in N (n1 + n2)
| Times (e1, e2) -> let N n1 = (eval e1 rho)
and N n2 = (eval e2 rho)
in N (n1 * n2)
| And (e1, e2) -> let B b1 = (eval e1 rho)
and B b2 = (eval e2 rho)
in B (b1 && b2)
| Or (e1, e2) -> let B b1 = (eval e1 rho)
and B b2 = (eval e2 rho)
in B (b1 || b2)
| Not e1 -> let B b1 = (eval e1 rho) in B (not b1)
| Eq (e1, e2) -> let a1 = (eval e1 rho)
and a2 = (eval e2 rho)
in B (a1 = a2)
| Gt(e1, e2) -> let N n1 = (eval e1 rho)
and N n2 = (eval e2 rho)
in B (n1 > n2)
;;
Note that the OCaml interpreter is able to effortlessly infer the type of the modified eval
function:
val eval : exp -> (string -> values) -> values = <fun>

Big Step (Natural, Kahn-style) Operational Semantics

The big-step (Kahn-style) operational semantics also needs to be modified. The calculates
relation now needs to return an answer when the input expression is a variable. What
answer? Whatever answer the variable is bound to. So we need a data structure that
associates variables to canonical answers. Let us call this a “table”, which is nothing but a
finite-domain function γ ∈ →fin Ans. Why “finite-domain”? Because a calculator has
to operate with finite data structures and not mathematical abstractions such as valuations
(which can be infinite).

Accordingly, we modify the calculates relation by introducing a table in each of the rules.
And we add a rule to deal with the case of variables. To highlight that the table does not
change during the calculation process, we place the table to the left of a turnstile.

(CalcNum) for any γ


γ ⊢n⟹n

(CalcBool) for any γ


γ ⊢b⟹b

(CalcVar) provided x ∈ dom(γ)


γ ⊢ x ⟹ γ ( x)
γ ⊢ e1 ⟹ n1 γ ⊢ e2 ⟹ n2
(CalcPlus) provided pLUS(n1, n 2, n)
γ ⊢ e1 + e2 ⟹ n

γ ⊢ e1 ⟹ n1 γ ⊢ e2 ⟹ n2
(CalcTimes) provided tIMES(n1, n 2, n)
γ ⊢ e1 * e2 ⟹ n

γ ⊢ e1 ⟹ b1
(CalcNot) provided myNot (b1, b)
γ ⊢ ¬e1 ⟹ b

γ ⊢ e1 ⟹ b1 γ ⊢ e2 ⟹ b2
(CalcAnd) provided myAnd(b1, b2, b)
γ ⊢ e1 ∧ e2 ⟹ b

γ ⊢ e1 ⟹ b1 γ ⊢ e2 ⟹ b2
CalcOr) provided myAnd(b1, b2, b)
γ ⊢ e1 ∨ e2 ⟹ b
𝒳
γ ⊢ e1 ⟹ a1 γ ⊢ e2 ⟹ a2
(CalcEq) provided eQ(a1, a 2, b)
γ ⊢ e1 = e2 ⟹ b

γ ⊢ e1 ⟹ n1 γ ⊢ e2 ⟹ n2
(CalcGt) provided gT (n1, n 2, b)
γ ⊢ e1 > e2 ⟹ b

Note that in (CalcVar), no answer is returned if x ∉ dom(γ). The calculator gets stuck!

What about Soundness and Completeness of the calculator with respect to the
Definitional Interpreter? How do those statements change?

We need a notion of a table and a valuation agreeing with each other. That is,
for every x ∈ dom(γ): eval[[γ ( x)]] ρ = ρ( x)

With this assumption on γ and ρ, it is fairly easy to state and prove Soundness.

Completeness is somewhat harder, and requires an additional assumption — namely that


x ∈ vars( e), x ∈ dom(γ), where vars( e) denotes the set of variables appearing in
expression E. This condition ensures that the calculator does not get stuck because a
variable cannot be looked up in the table.

Exercise: Define the function vars( e). Note the similarity in structure to ht, size, and
eval.

Type Preservation (version 2)

The type preservation theorem requires some additional conditions (apart from the
assumption that elementary operations are type-sound).

We say that a table γ is type-consistent with a type assumption Γ if for every x ∈ dom(γ):
x ∈ dom(Γ) and Γ ⊢ γ ( x) : Γ( x). That is, the answer associated with any variable in a
table is indeed of the same type associated with it by the type assumption.

Theorem (Type Preservation under γ ⊢ e ⟹ a)


For all expressions e ∈ Exp, a ∈ Ans,
for all type assumptions Γ,
for all tables γ type-consistent with Γ,
for all types τ ∈ Typ,
if Γ ⊢ e : τ and γ ⊢ e ⟹ a, then Γ ⊢ a : τ

Proof (By Induction on the structure/ht of e).


Base cases (ht ( e) = 0)
Subcases (e ≡ n) and (e ≡ b) are essentially unchanged.
There is a new base case: e ≡ x.
Assume Γ ⊢ e : τ. Therefore τ = Γ( x). (The case of x ∉ dom(Γ) cannot arise from the
assumption).
Now γ ⊢ x ⟹ γ ( x). Since γ is assumed type-consistent with Γ, Γ ⊢ γ ( x) : Γ( x).
So Γ ⊢ a : τ

The Induction Hypothesis is a suitably modified version of the earlier IH, and the cases
in the Induction Step (ht ( e) = 1 + k) are more or less the same as before, with the
appropriate changes.

Exercise: Complete this proof.

Exercise: Encode the type-checking relation Γ ⊢ e : τ in PROLOG as a predicate


hastype(G,E,T).

Compilation and execution on a Stack Machine


The stack machine now needs to be modified to incorporate an additional component,
namely the table. The configurations are now triples — a table, a stack of values and a
code list.

The opcodes need only a small extension however — an opcode to look up a variable in the
table. (In practice, we get rid of the variables and use some address/indexing mechanism).

type opcode = LDN of int | LDB of myBool | LOOKUP of string


| PLUS | TIMES | AND | OR | NOT | EQ | GT;;

The compile function therefore has minimal changes == the inclusion on a line for
compiling variables

let rec compile e = match e with


Num n -> [ LDN n ]
| Bl mb -> [LDB mb ] (* Constants *)
| V x -> [LOOKUP x] (* Variables *)
| Plus (e1, e2) -> (compile e1) @ (compile e2) @ [PLUS]
| Times (e1, e2) -> (compile e1) @ (compile e2) @ [TIMES]
| And (e1, e2) -> (compile e1) @ (compile e2) @ [AND]
| Or (e1, e2) -> (compile e1) @ (compile e2) @ [OR]
| Not e1 -> (compile e1) @ [NOT]
| Eq (e1, e2) -> (compile e1) @ (compile e2) @ [EQ]
| Gt(e1, e2) -> (compile e1) @ (compile e2) @ [GT]
;;

The stack machine, now endowed with a table in its configurations, needs to specify
how the LOOKUP(x) opcode is executed. Otherwise, it is substantially the same
(other than now containing a table component). Out of indolence, we have
represented a table as a function from strings to exp (and not very precisely to
answers).

exception Stuck of (string -> exp) * exp list * opcode list;;

let rec stkmc g s c = match s, c with


a::[], [ ] -> a (* no more opcodes, return top *)
| s', (LDN n)::c' -> stkmc g ((Num n)::s') c'
| s', (LDB b)::c' -> stkmc g ((Bl b)::s') c'
| s', (LOOKUP x)::c' -> stkmc g ((g x)::s') c'
| (Num n2)::(Num n1)::s', PLUS::c' ->
stkmc g (Num(n1+n2)::s') c'
| (Num n2)::(Num n1)::s', TIMES::c' ->
stkmc g (Num(n1*n2)::s') c'
| (Bl b2)::(Bl b1)::s', AND::c' ->
stkmc g (Bl(myAnd b1 b2)::s') c'
| (Bl b2)::(Bl b1)::s', OR::c' ->
stkmc g (Bl(myOr b1 b2)::s') c'
| (Bl b1)::s', NOT::c' -> stkmc g (Bl(myNot b1)::s') c'
| (Num n2)::(Num n1)::s', EQ::c' ->
stkmc g (Bl (bool2myBool(n1 = n2))::s') c'
| (Num n2)::(Num n1)::s', GT::c' ->
stkmc g (Bl (bool2myBool(n1 > n2))::s') c'
| _, _ -> raise (Stuck (g, s, c))
;;

The exception Stuck now takes 3 arguments, namely the table, stack and opcode
list. The only new line is
| s, (LOOKUP x)::c' -> stkmc g ((g x)::s) c'
where the value obtained from the table, namely (g x), is pushed onto the stack.

You might also like