CNG242 Lecture3 Variables and Storage

Download as ppt, pdf, or txt
Download as ppt, pdf, or txt
You are on page 1of 55

Variables and Storage

CNG 242 Programming Language Concepts


Lecture - 3
Outline

1. Storage 4. Commands
• Array Variables • Assignment
2. Semantics of Assignment • Procedure Call
3. Variable Lifetime • Block commands
• Global Lifetime • Conditional commands
• Local Lifetime • Iterative statements
• Heap Variable Lifetime 5. Memory Representation
• Dangling Reference and Garbage 6. Summary
• Persistent Variable Lifetime
Introduction

• Functional language variables: math like, defined or solved (A


mathematical variable stands for a fixed but unknown value;
there is no implication of change over time).

• Remains same afterwards.

• Imperative language variables: variable has a state and value.


• It can be assigned to different values in same phrase.
Introduction (2)

• Two basic operations a variable: inspect and update.


• In imperative (and object-oriented and concurrent) programming
languages, a variable is a container for a value, and may be
inspected and updated as often as desired.
• Variables are used to model real-world objects whose state
changes over time, such as today’s date, the current weather,
the population of the world, or a country’s economic
performance.
Variables
unallocated
• To understand such variables, cells
assume a simple abstract model
of storage: 7
allocated
– A store is a collection of true cells
storage cells. Each storage cell 3.14
has a unique address.
– Each storage cell is either
allocated or unallocated.
– Each allocated storage cell ? undefined
contains either a simple value or
undefined.
‘X’
Simple vs composite variables

• A simple value is one that can be stored in a single


storage cell (typically a primitive value or a pointer).
• A simple variable occupies a single allocated
storage cell.
• A composite variable occupies a group of allocated
storage cells.
Simple Variable Example

Cells are initially unallocated.

Then, allocated/undefined.
Ready to use but value
unknown.

Then, storable

After the including block


terminates, again unallocated
Composite variables

• A variable of a composite type has the same


structure as a value of that type. For instance:
– A record variable is a tuple of component variables.
– An array variable is a mapping from an index range to
a group of component variables.
• The component variables can be inspected and
updated either totally or selectively.
Array Variables

• Different approaches exist in implementation of array


variables:
1. Static arrays
2. Dynamic arrays
3. Flexible arrays
Static vs dynamic vs flexible arrays

• A static array is an array variable whose index range


is fixed by the program code.
• A dynamic array is an array variable whose index
range is fixed at the time when the array variable is
created.
• A flexible array is an array variable whose index
range is not fixed at all, but may change whenever a
new array value is assigned.
Static Array Example

Array size is fixed at compile time to a constant value


or expression.

C example:
Dynamic Array Example
•Array size is defined when variable is allocated. Remains
constant afterwards.
Example: GNU Compiler Collection (GCC) supports many language features
not found in the ANSI/ISO standards for C and C++

Example: C++ with templates


Flexible Arrays Example
•Array size is completely variable. Arrays may expand or shrink at run time.
Script languages like Perl, PHP, Python

•Perl example:

•C++ and object orient languages allow overload of [] operator to make flexible
arrays possible. STL (Standard Template Library) classes in C++ like vector,
map are like such flexible array implementations.
Copy semantics vs reference semantics

• What exactly happens when a composite value is


assigned to a variable of the same type?
• Copy semantics: All components of the composite
value are copied into the corresponding components
of the composite variable.
• Reference semantics: The composite variable is
made to contain a pointer (or reference) to the
composite value.
• C and Ada adopt copy semantics.
• Java adopts copy semantics for primitive values, but
reference semantics for objects.
Assignment by Copy vs Reference. Example

•Copy: All content is copied into the other


variables storage.
•Two copies with same values in
memory.

•Reference: Reference of variable is copied to other


variable.
•Two variables share the same storage and
values.
Example
•Assignment semantics is defined by the language design

•C structure follows copy semantics. Arrays cannot be assigned. Pointers are


used to implement reference semantics. C++ objects are similar.

•Java follows copy semantics for primitive types. All other types (objects) are
reference semantics.

•Copy semantics is slower

•Reference semantics cause problems from storage sharing (all operations effect
both variables). Deallocation of one makes the other invalid.

•Java provides copy semantic via a member function called copy(). Java
garbage collector avoids invalid values (in case of deallocation)
Example: Java reference semantics

• Declarations:
class Date {
int y, m, d;
public Date (int y, int m, int d) { … }
}
Date dateR = new Date(2004, 1, 1);
Date dateS = new Date(2004, 12, 25);
• Effect of reference semantics: dateR dateS
dateR dateS

dateS = dateR;
dateR.y = 2005; 2005
2004
2004 2004
11 12
11 25
Variable Lifetime

• Every variable is created (or allocated) at some


definite time, and destroyed (or deallocated) at some
later time when it is no longer needed.
• A variable’s lifetime is the interval between its
creation and destruction.
• A variable occupies storage cells only during its
lifetime. When the variable is destroyed, the storage
cells that it occupied may be deallocated (and
subsequently allocated for some other purpose).
Variable Lifetime (2)

Variable lifetime: The period between allocation of a


variable and deallocation of a variable.
4 kinds of variable lifetime.

1.Global lifetime (while program is running)


2.Local lifetime (while declaring block is active)
3.Heap lifetime (arbitrary)
4.Persistent lifetime (continues after program terminates)
Global Lifetime
•Life of global variables start at program startup and
finishes when program terminates.

•In C, all variables not defined inside of a function


(including main()) are global variables and have global
lifetime:

What are C static variables inside functions?


Life time of static variables remains equal to the life time of the program. Any
local or global variable can be made static depending upon what the logic
expects out of that variable.
Local Lifetime
•Lifetime of a local variable, (a variable defined in a function or statement
block) is the time between the declaring block is activated and the block
finishes.

•Formal parameters are local variables.

•Multiple instances of same local variable may alive at the same time in
recursive functions.
Variable Lifetime

Show lifetimes of each variable


Heap Variable Lifetime

•The heap is a region of computer's memory that is not


managed automatically for you.

•It is a more free-floating region of memory .

•In C to allocate memory on the heap, we use malloc() or


calloc(). Once you have allocated memory on the heap,
you are responsible for using free() to deallocate that
memory once you don't need it any more.
Heap Variable Lifetime (2)
•Heap variables: Allocation and deallocation is not automatic but
explicitly requested by programmer via function calls.
•C: malloc(), free(), C++: new, delete.
•Heap variables are accessed via pointers. Some languages use
references
double *p;
p=malloc(sizeof(double));
*p=3.4; ...
free(p);
•p and *p are different variables p has pointer type and usually a local
or global lifetime, *p is heap variable.

•heap variable lifetime can start or end at anytime.


Heap Variable Lifetime (3)

Show lifetimes of each variable


Reachability

• A heap variable remains reachable as long as it can


be accessed by following pointers from a global or
local variable.
• A heap variable’s lifetime extends from its creation
until:
– it is destroyed by a deallocator, or
– it becomes unreachable, or
– the program stops.
Dangling pointers

• A dangling pointer is a pointer to a variable that has


been destroyed.
• Dangling pointers arise from the following situations:
– where a pointer to a heap variable still exists after the
heap variable is destroyed by a deallocator
– where a pointer to a local variable still exists at exit
from the block in which the local variable was declared.
• A deallocator immediately destroys a heap variable;
all existing pointers to that heap variable then
become dangling pointers. Thus deallocators are
inherently unsafe.
Dangling Reference (Example)
• dangling reference: trying to access a variable whose lifetime is ended
and already deallocated.

• both p’s are deallocated or ended lifetime variable, thus dangling


reference
• sometimes operating system tolerates dangling references.
• Sometimes generates run-time erros like “protection fault”,
“segmentation fault” are generated.
Dangling pointers (2)

• C is highly unsafe:
– After a heap variable is destroyed, pointers to it might
still exist.
– At exit from a block, pointers to its local variables might
still exist (e.g., stored in global variables).
• Ada is safer:
– After a heap variable is destroyed, pointers to it might
still exist.
– But pointers to local variables may not be stored in
global variables.
• Java is very safe:
– It has no deallocator.
– Pointers to local variables cannot be obtained.
Example: C dangling pointers
struct Date {int y, m, d;};
allocates a new
Date* dateP; Date* dateQ; heap variable
dateP = (Date*)malloc(sizeof Date);
dateP->y = 2004; dateP->m = 1; dateP->d
= 1;
dateQ = dateP; makes dateQ point
to the same heap
free(dateQ); variable as dateP
deallocates that heap
printf("%d4", dateP->y); variable (dateP and
dateP->y = 2005; dateQ are now
dangling pointers)
fails fails
Garbage Variables
•Garbage variables: The variables with lifetime still continue but there is
no way to access.

•When the pointer value is lost or lifetime of the pointer is over, heap
variable is unaccessible. (*p in examples)
Garbage Collection
•A solution to dangling reference and garbage problem:
•PL does management of heap variable deallocation automatically.
•This is called garbage collection. (Java, Lisp, ML, Haskell, most
functional languages) no call like free() or delete exists.
•Count of all possible references is kept for each heap variable.
•When reference count gets to 0 garbage collector deallocates the heap
variable.
• Garbage collector usually works in a separate thread when CPU is
idle.

•Another but too restrictive solution: Reference cannot be assigned


to a longer lifetime variable. local variable references cannot be
assigned to global reference/pointer.
Persistent Variable Lifetime
•Variables with lifetime continues after program terminates: file,
database, web service object,...

•Stored in secondary storage or external process.

•Only a few experimental language has transparent persistence.

•Persistence achieved via IO instructions


•C files: fopen(), fseek(), fread(), fwrite()

•In object oriented languages; serialization: Converting object into a


binary image that can be written on disk or sent to network.

•This way objects snapshot can be taken, saved, restored and object
continue from where it remains.
Commands

• Expression: program segment with a value.


Statement: program segment without a value but with
purpose of altering the state.
• Input, output, variable assignment, iteration...
1. Assignment
2. Procedure call
3. Block commands
4. Conditional commands
5. Iterative commands
• Commands are characteristic of imperative and OO
(but not functional) PLs.
Assignment
•C: “Var = Expr;”, Pascal “Var := Expr;”.

•Evaluates RHS expression and sets the value of the variable at RHS

•x = x + 1 . LHS x is a variable reference (l-value), RHS is the value

•multiple assignment: x=y=z=0;

•parallel assignment: (Perl, PHP)


($a,$b) = ($b, $a);
($name, $surname, $no) =(“John",“Smith",00001);
Assignment: “reference aggregate” → “value aggregate”

•assignment with operator: x += 3; x *= 2;


Procedure calls

• A procedure call achieves its effect by applying a


procedure to some arguments.
• Typical form:
P(E1, , En);
Here P determines the procedure to be applied, and E1,
, En are evaluated to determine the arguments. Each
argument may be either a value or (sometimes) a
reference to a variable.
• The net effect of the procedure call is to update
variables. The procedure achieves this effect by
updating variables passed by reference, and/or by
updating global variables. (But updating its local
variables has no net effect.)
Procedure calls

• Procedure: user defined commands.


– Pascal: procedure,
– C:function returning void

• void functname (param1 , param2 , ..., paramn )

• Usage is similar to functions but call is in a statement


position(on a separate line of program)
Block Commands

•Composition of a block from multiple statements


•Sequential commands: { C1; C2; ...; Cn }
•A command is executed, after it finishes the next command is executed,...
•Commands enclosed in a block behaves like single command:“if” blocks,
loop bodies,...

•Collateral commands: { C1, C2, ... Cn } Commands can be executed in


any order. (not C ‘,’)!
•The order of execution is non-deterministic. Compiler or optimizer can
choose any order. If commands are independent, effectively deterministic:
‘y=3 , x=x+1 ;’ vs ‘x=3, x=x+1 ;’
•Can be executed in parallel.
Block Commands (2)
•Concurrent commands : concurrent paradigm languages:
•{ C1| C2| ...| Cn }
•All commands start concurrently in parallel. Block finishes when the last
active command finishes.
•Real parallelism in multi-core/multi-processor machines.
•Transparently handled by only a few languages. Thread libraries required in
languages like Java, C, C++.
Conditional Commands
•Commands to choose between alternative commands based
on a condition
•in C : if (cond ) C1 else C2
switch (value ) { case L1 : C 1 ; case L2: C 2 ; ...}
•if commands can be nested for multi-conditioned selection.
switch like commands chooses statements based on a value
If-commands

• Typical forms (Ada and C/Java, respectively):


if E then if (E) E must be of
C1 C1 type Boolean
else else
C2 C2
end if;
– if E yields true, C1 is executed; otherwise C2 is
executed.
Case-commands
• In C and Java:
switch (E) {
case v1:
C1

case vn:
Cn
default:
C0
}
– if the value of E equals some vi, then Ci, …, Cn, C0
are all executed; otherwise only C0 is executed.
Conditional Commands (2)
•non-deterministic conditionals: conditions are evaluated in
collaterally and commands are executed if condition holds.
•hypothetically:
if (cond1) C1 or if (cond2) C2 or if (cond3) C3 ;
switch (val ) {
case L1: C1 | case L2: C2 | case L3: C3

•Tests can run concurrently


Iterative Statements
•Repeating same command or command block multiple times possibly
with different data or state. Loop commands.

•Loop classification: minimum number of iteration: 0 or 1.

•An iterative command (or loop) repeatedly executes a subcommand,


which is called the loop body.

•Each execution of the loop body is called an iteration.


Iterative Statements (2)
•Repeating same command or command block multiple times possibly with
different data or state. Loop commands.

•Loop classification: minimum number of iteration: 0 or 1.

C: while (...) { ... } C: do {...} while


(...);
Iterative Statements (3)
•Repeating same command or command block multiple times possibly with
different data or state. Loop commands.

•Another classification: definite vs indefinite iteration

•Definite vs indefinite loops


•Indefinite iteration: Number of iterations of the loop is not known until loop
finishes.
•C loops are indefinite iteration loops.
•Definite iteration: Number of iterations is fixed when loop started.
•Pascal for loop is a definite iteration loop.
for i:= k to m do begin .... end; has (m - k + 1) iterations.
•Pascal forbids update of the loop index variable.
•List and set based iterations: PHP, Perl, Python, Shell
Memory Representation

•Global variables are kept in fixed region of data segment in


memory. They are directly accessible

•Heap variables are kept in dynamic region of data segment


in memory In a data structure. A memory manager required.

•Local variables are usually kept in run-time stack (Why?)


to keep track of the point to which each active subroutine
should return control when it finishes executing.
Implementation notes

• Each variable occupies storage space throughout its


lifetime. That storage space must be allocated at the
start of the variable’s lifetime (or before), and
deallocated at the end of the variable’s lifetime (or later).

• The amount of storage space occupied by each variable


depends on its type.
Storage for global and local variables (1)

• A global variable’s lifetime is the program’s entire


run-time. So the compiler can allocate a fixed storage
space to each global variable.

• A local variable’s lifetime is an activation of the block


in which the variable is declared. The lifetimes of
local variables are nested. So the compiler allocates
storage space to local variables on a stack.
Storage for global and local variables (2)
• At any given time, the stack contains several
activation frames.
• Each activation frame contains housekeeping
enough space for the local variables data
of a particular procedure.
local variables

 An activation frame is:


• pushed on to the stack when a procedure is called
• popped off the stack when the procedure returns.

 Storage can be allocated to local variables of recursive


procedures in exactly the same way.
Storage for heap variables

• A heap variable’s lifetime starts when the heap


variable is created and ends when it is destroyed or
becomes unreachable. There is no pattern in their
lifetimes.
• Heap variables occupy a storage region called the
heap. At any given time, the heap contains all
currently-live heap variables, interspersed with
unallocated storage space.
– When a new heap variable is to be created, some
unallocated storage space is allocated to it.
– When a heap variable is to be destroyed, its storage
space reverts to being unallocated.
Storage for heap variables (2)
• A heap manager (part of the run-time system) keeps track of
allocated and unallocated storage space.
• If the programming language has no explicit deallocator, the
heap manager must be able to find any unreachable heap
variables. (Otherwise heap storage will eventually be
exhausted.) This is called garbage collection.

• A garbage collector must visit all heap variables in order to find


the unreachable ones. This is time-consuming.
• But garbage collection eliminates some common errors:
– omitting to destroy unreachable heap variables
– destroying heap variables that are still reachable.
Representation of dynamic/flexible arrays

• The array indexing operation will behave unpredictably if the


index value is out-of-range. To avoid this, in general, we need a
run-time range check on the index value.
• A static array’s index range is known at compile-time. So the
compiler can easily generate object code to perform the
necessary range check.
• However, a dynamic/flexible array’s index range is known only
at run-time. So it must be stored as part of the array’s
representation:
– If the lower bound is fixed, only the length need be stored.
– Otherwise, both lower and upper bounds must be stored.
Representation of dynamic/flexible arrays (2)

• Example (Java):
float[] v1 = new float[4];
float[] v2 = new float[3];
v1 v2

tag float[] tag float[]


length 4 length 3
0 1.0 0 0.0
1 0.5 1 0.0
2 5.0 2 0.0
3 3.5
Representation of dynamic/flexible arrays (2)

• Exercise: Write a function that takes two lists where


one list contains letter grades and another list
contains the credits. You function should calculate
and then return the GPA (Grade Point Average). The
weight of the letter grades are as follow: A = 4, B = 3,
C = 2, D= 1 and F = 0.
• Sample Run:
• *Main> calculateGPA ['A', 'B', 'A', 'C'] [4, 2, 3, 3]
• 3.33

You might also like