unit-1(Programming Language)


UNIT-1

Programming Language
A Brief History of Programming Languages
What is a programming language?
A programming language is a notation for describing algorithms and data.
What is a program?
A program is a sentence of a programming language.
Let us start from the year 1954.
Fortran (FORmula TRANslator)
• It was created in 1954 by John Backus.
• It was the first high-level language.
• It was implemented with one of the first compilers ever developed.
• It is a machine-independent language.
FORTRAN II:
• Released in 1958, it introduced subroutines and functions, alongside loops such as a primitive FOR loop.
IAL (International Algebraic Language)
• It started as a project and was later renamed ALGOL 58.
• The theoretical definition of the language was published.
• No compiler accompanied the definition.
LISP (LISt Processing):
• It was created in 1958 and released in 1960 by John McCarthy of MIT.
• LISP was intended for writing artificial intelligence programs.
Features
• It uses atoms and lists as its data structures.
• Functional programming style: all computation is performed by applying functions to
arguments, and variable declarations are rarely used.
• Reliance on recursion: a strong reliance on recursion has allowed LISP to be successful in
many areas, including artificial intelligence.
• Garbage collection: LISP has built-in garbage collection, so programmers do not need to
explicitly free dynamically allocated memory.
COBOL (COmmon Business Oriented Language)
• It was created in May 1959 by the Short Range Committee, convened under the US Department
of Defense (DoD).
• CODASYL (COnference on DAta SYstems Languages) worked on it from May 1959 to April 1960.
• ANSI standards include COBOL-68 (1968), COBOL-74 (1974), COBOL-85 (1985), and
COBOL-2002 (2002).
• An object-oriented version of COBOL was introduced in 1997 (COBOL-97).

• Introduced the Record Data Structure.


ALGOL (ALGOrithmic Language)
• It was released in 1960, with major versions in 1960 and 1968.
• It was the first block-structured language.
• It is considered to be the first second-generation computer language.
• It is a machine-independent language.
• It introduced concepts such as:
• 1) Block-structured code (marked by BEGIN and END)
• 2) Scope of variables (local variables are scoped to their enclosing block)
• 3) BNF (Backus-Naur Form), a notation for defining syntax
• 4) Dynamic arrays
• 5) Reserved words
• 6) IF THEN ELSE, FOR, and WHILE loops
• 7) The := symbol for assignment
• 8) SWITCH with GOTO
• 9) User-defined data types
SNOBOL (StriNg Oriented symBOlic Language)
• It was created in 1964.
• It was intended for string processing.
• It was the first language to use associative arrays, indexed by any type of key.
• It had features for pattern matching, concatenation, and alternation.
• It allowed running code stored in strings.
• Data types: integer, real, array, pattern, and user-defined types.
BASIC (Beginner's All-purpose Symbolic Instruction Code)
• It was designed as a teaching language in 1963 by John George Kemeny and Thomas Eugene Kurtz
of Dartmouth College.
• It was intended to make programming easy to learn.
PL/I (Programming Language One)
• It was created in 1964.
• It was intended to combine the features of Fortran and COBOL, plus additional facilities for
system programming.
• It also borrows from ALGOL 60.
• It was originally called NPL (New Programming Language).
• It introduced storage classes (automatic, static, controlled, and based) and exception
processing (ON conditions).
• It uses a SELECT ... WHEN ... OTHERWISE conditional structure and several variations of the
DO loop.
• It provides numerous data types.
Pascal (named for the French mathematician and philosopher Blaise Pascal)
• It was created in 1970.
• It was intended to replace BASIC as a teaching language.
• It quickly developed into a general-purpose language.
• Programs were compiled to a platform-independent intermediate p-code.
• The compiler for Pascal was written in Pascal.
C language:
• It was developed from 1969 to 1972 by Dennis Ritchie.
• It was used for system programming on UNIX.
ANSI C: The American National Standards Institute (ANSI) formed a technical subcommittee,
X3J11, to create a standard for the C language and its runtime libraries.
Ada:
• It was released in 1983 (Ada 83), with major revisions in 1995 (Ada 95) and 2005 (Ada 2005).
• It was created for the US Department of Defense (DoD).
• It was intended for embedded systems and later for all military computing purposes.
Perl (Practical Extraction and Report Language)
• It was created by Larry Wall in 1987.
• It was intended to replace the Unix shell, sed, and AWK.
Python:
• It was created in 1991 by Guido van Rossum.
• It is a scripting language with dynamic typing, intended to replace Perl.
Characteristics of a Good Programming Language
There are various reasons why programmers prefer one language over another. Some of the
characteristics of a good programming language are:
1) Clarity, simplicity, and unity: A programming language provides both a conceptual
framework for planning algorithms and a means of expressing them. It should provide a clear,
simple, and unified set of concepts that can be used as primitives in developing algorithms.
It should have
• a minimum number of different concepts,
• with rules for their combination being
• simple and regular.
This attribute is called conceptual integrity.
2) Orthogonality: This is one of the most important features of a programming language.
Orthogonality is the property that "changing A does not change B".
A real-world example of an orthogonal system is a radio, where changing the station does not
change the volume and vice versa.
When the features of a language are orthogonal, the language is easier to learn and programs are
easier to write, because there are only a few exceptions and special cases to remember.
3) Support for abstraction: There is always a substantial gap between the abstract data
structures and operations that characterize the solution to a problem and the particular data
structures and operations built into a language. A good language lets the programmer define
new abstractions to bridge this gap.
4) Programming environment: An appropriate programming environment adds utility and makes
a language easier to work with. This includes:
• The availability of reliable, efficient, and well-documented implementations.
• Special editors and testing packages that speed up program creation and testing.
• Facilities for maintaining and modifying multiple versions of a program or software product.

5) Ease of program verification and reusability: The reusability of programs written in a
language is always a central concern. A program is checked by various testing techniques, such as
• formal verification methods,
• desk checking,
• input-output test checking.
Programs can be verified by many other techniques as well. A language that makes program
verification difficult may be far more troublesome to use. Simplicity of semantic and syntactic
structure is a primary aspect that tends to simplify program verification.
6) Portability of programs: A programming language should be portable, meaning it should be
easy to transfer a program from the computer on which it was developed to another computer.
Only a language whose definition is independent of the features of a particular machine can
support portability. Examples: Ada, FORTRAN, C, C++, Java.
Introduction to Translators:
To be executed, a computer program written in a high-level language must be translated into a
machine-understandable language, that is, machine language or machine code.
A translator is a computer program that translates a program written in a given programming
language into a functionally equivalent program in a different computer language, without
losing the functional or logical structure of the original code.
There are three types of programming language translators:
• Assemblers
• Compilers
• Interpreters
Source code: the input to a translator.
Executable code: the code that is output from a translator.
Assembler:
Assembly language consists of mnemonics such as GO, HALT, JUMP, and NOT, which are
translated to machine language by a programming language translator called an assembler.
An assembler is thus a program that takes assembly instructions and converts them into a
pattern of bits that the computer's processor can use to perform its basic operations. This
pattern of bits is the machine language.

Examples of assemblers are NASM and MASM.

Interpreter:
An interpreter is also a program that translates high-level source code into executable code.
However, the difference between a compiler and an interpreter is that an interpreter translates
one line at a time and then executes it. No object code is produced, so the program has to be
interpreted each time it is run.
Advantages of an interpreter:
• 1) It is good at locating errors in the program.
• 2) Debugging is easier, since the interpreter stops when it encounters an error.
• 3) If an error is detected, there is no need to re-translate the whole program.
• 4) It uses less memory, because only a few lines have to be in memory at a time and no
object code is stored.
Disadvantages of an interpreter:
• 1) It is rather slow: speed is the biggest disadvantage.
• 2) No object code is produced, so translation has to be done every time the program runs.
• 3) For the program to run, the interpreter must be present.
An interpreter has to analyse each line of code (or byte code) and translate it into machine
code before it can be executed.
What are Elementary Data Types?
A data object is a region of storage that contains a value or group of values. Each value can be
accessed using its identifier or a more complex expression that refers to the object. In addition,
each object has a unique data type.
What is a Data Object?
Definition: A data object represents a container for data values, a place where data values may
be stored and later retrieved.
Definition: A run-time grouping of one or more pieces of data in a virtual computer.
Definition: A location in memory with an assigned name in the actual computer.
Data objects can be:
• 1) Programmer-defined: directly accessible during program execution (for example, variables,
constants, arrays, files).
• 2) System-defined: not directly accessible to the programmer (for example, run-time storage,
stacks, file buffers, free-space lists).
Data values can be:
• A single number
• A character
• A pointer to other objects

A data object is usually represented as storage in computer memory, and a data value is
represented by a pattern of bits. This gives the relation between a data object and a
data value.
A data object is elementary if it contains a data value that is always manipulated as a unit.
A data object is a data structure if it is an aggregate of other data objects.
Binding and Attributes of Data Objects:
A binding is an association between an entity, such as a data object, and one of its attributes,
such as a value. A data object takes part in the following bindings during its lifetime:
1) Type: This binding associates the data object with the set of data values that the object may
take.
2) Location: This is the binding to a storage location in memory where the data
object is represented. Only the storage management routines of the virtual computer can
change this binding.
3) Value: This binding is usually the result of an assignment operation.
4) Name: The binding to one or more names by which the object may be referenced during
program execution is usually set up by declarations and modified by subprogram calls and
returns.
5) Component: The binding of a data object to one or more data objects of which it is a
component is often represented by a pointer value, and may be modified by a change in the
pointer.
An elementary data object contains a single data value. A class of elementary data objects,
together with a set of operations for creating and manipulating them, is called an elementary data
type. Examples of elementary data types are integer, real, character, Boolean, and pointer.
The basic components of elementary data types are as follows:
• Attributes: Attributes are the characteristics that distinguish one data object from others.
The main attributes of a data object are its name, associated address, and data type. Consider
the following declaration in C:
int a;
It specifies that a data object named 'a' is of type integer. The attributes of a data object can be
stored in a collection of memory cells called a descriptor (or dope vector). A descriptor is the
group of attributes of a variable. If the attributes are all fixed, descriptors are needed only at
compile time. They are built by the compiler, generally as part of the symbol table, and
are used during compilation.
• Values: This is the set of all possible values that a data object can contain. The values
that a data object can assume are determined by the type of that data object. An
elementary data object contains a single value from the set of values at any point during
its lifetime. For example, the C declaration int a; specifies that the data object a can
assume a single integer value from the set of integer values. The value contained in a data
object can change during the lifetime of the data object and is therefore represented
explicitly during program execution.
• Operations: An operation is a mathematical function for the manipulation of data
objects. An operation includes:
o Domain: the set of all possible input arguments on which the operation is defined.
o Range: the set of all possible results that the operation can produce as output.
o Action: the result produced for any given set of arguments.
o Algorithm: a description of how to compute the result for any given set of arguments.
It is used to specify the action of an operation.
o Signature: the number, order, and data types of the arguments in the domain of the
operation, and the order and data type of the resulting range.
Elementary Data Types
Variables and Constants
• Variables: A variable is a quadruple composed of a name, a set of attributes, a
reference, and a value.
• A simple variable is an elementary data object with a name, whose binding to a
value may change during its lifetime. Such data objects are defined and named
explicitly by the programmer.
Attributes of a variable:
Let us take an example of a variable in ALGOL:
y := 9;
We can say that it has four attributes:
• 1) The name of the box: y.
• 2) The name or description of the current contents,
i.e. 9 (which we could also describe as the square of 3).
• 3) The box or storage location(s) that hold(s) the value.
• 4) The contents of the box, i.e. 9.
The name of the box and its storage location are fixed, but the contents and their name may
vary over time.

Let us take another example in C:
int N; declares a simple data object N of type integer.
N = 27; assigns the value 27 to the variable N.
• 1) The declaration introduces the variable name N of type integer.
• 2) The lifetime of N lasts until the end of execution.
• 3) The data object is bound to N until the end of execution.
• 4) The value 27 is assigned and may be changed during the lifetime of N.
• 5) Hidden from the programmer are other bindings made by the virtual computer, such as
creating an activation record and allocating storage for it on the run-time stack.
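
The bindings in the C example can be observed directly. The following is a minimal sketch, not a definitive demonstration: the function name location_stays_fixed is hypothetical, and N is declared as a local variable here, so its lifetime is one activation of the function rather than the whole execution.

```c
/* Hypothetical helper: returns 1 if the storage location of N stays the
   same while its value changes, and 0 otherwise. */
int location_stays_fixed(void)
{
    int N;                 /* name binding: N refers to an int data object */
    int *before = &N;      /* location binding, made when the function is entered */

    N = 27;                /* value binding: N now contains 27 */
    N = N + 1;             /* the value binding changes to 28 */

    int *after = &N;       /* the location binding has not changed */
    return before == after && N == 28;
}
```

Calling location_stays_fixed() from main should return 1: the name and location of N are fixed for its lifetime, and only the value binding varies.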

Constant:
• A constant is a data object with a name that is bound to a value (or values) permanently
during its lifetime. The constant value can only be a number, a string, or an identifier
that denotes a constant.
• A constant definition in Pascal introduces an identifier as a synonym for the constant value.
Pascal uses the reserved word const to begin a constant declaration:
const PI = 3.1415;
• In ALGOL 68, a constant can be defined by:
real root2 = 1.4142135;
This style was widely accepted at the time.
• Ada provides a uniform notation for setting constants to initial values and for
initializing variables:
X : constant INTEGER := 17;
• In C, const is used to declare a constant and give it its value:
const int MAX = 30;
The constant MAX is a programmer-defined constant, because the programmer
explicitly defines the name for the value 30.
• C also has macro definitions, which are used to control the compilation of a program
and can be used for declaring constants. Example:
#define MAX 30
This is a compile-time operation that causes all references to MAX in the
program to be replaced by the constant 30.
• Here the value 30 has two names, the programmer-defined name MAX and the literal name
30, both of which may be used to refer to a data object containing the value 30.
• #define MAX 30 is a command that the translator uses to
equate "MAX" with the value "30", whereas the const attribute in C is a
translator directive stating that MAX will always contain the value 30.
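
The difference between the macro and the const form can be sketched in C as follows. This is an illustration under assumptions: the name LIMIT is hypothetical (it is used so that the macro MAX and the const object can coexist in one file), as are the three helper functions.

```c
#define MAX 30                 /* macro: the preprocessor replaces MAX with 30 */

static const int LIMIT = 30;   /* const object: LIMIT names storage holding 30 */

/* Hypothetical helpers for illustration. */
int macro_value(void) { return MAX; }     /* compiled as: return 30; */
int const_value(void) { return LIMIT; }   /* reads the const object */

/* A const object has an address. A macro does not, because the name MAX
   no longer exists after preprocessing. Returns 1 when *p is 30. */
int const_has_address(void)
{
    const int *p = &LIMIT;
    return *p == 30;
}
```

Both macro_value() and const_value() return 30, but only LIMIT is a data object that the program can point to.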
Data Types in Programming Languages
• A data type is a set of objects and a set of operations on those objects that
create, build up, destroy, modify, and pick apart instances of the objects.
or
• A data type is a class of data objects together with a set of operations
for creating and manipulating them.
• A programming language necessarily deals with data types such as the classes of arrays,
integers, or files, and with the operations provided for manipulating arrays, integers, or files.

Example:
• In LISP, the major data type is the binary tree (called an S-expression), and the basic
operations are CAR, CDR, and CONS.
• Fortran 77: integer, real, logical, character, double precision, complex
• ALGOL: integer, real, boolean
• Pascal: integer, real, boolean, char
• Ada: integer, float, Boolean, character

The basic elements of a specification of a data type:
• Attributes: these distinguish data objects of that type.
• Values: the values that data objects of that type may have.
• Operations: the possible manipulations of data objects of that type.

Example: Array data type
Attributes:
• The number of dimensions
• The subscript range for each dimension
• The data type of the components
Values:
• The sets of numbers that form valid values for array components.
Operations:
• Subscripting to select individual array components
• Creating arrays
• Changing their shape
• Performing arithmetic on pairs of arrays

The basic elements of the implementation of a data type:
• 1) Storage representation: This is used to represent the data objects of the data
type in the storage of the computer during program execution.
• 2) Algorithms or procedures: The manner in which the operations defined for
the data type are represented in terms of particular algorithms or procedures
that manipulate the chosen storage representation of the data objects.
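
The array example can be turned into a small C sketch that shows both parts of an implementation: a storage representation and the procedures that manipulate it. All names here (IntArray, array_create, array_at, array_destroy) are hypothetical, and the sketch covers only a one-dimensional integer array.

```c
#include <stdlib.h>

/* Hypothetical descriptor for a one-dimensional integer array. */
typedef struct {
    int lower;      /* attribute: lowest valid subscript */
    int upper;      /* attribute: highest valid subscript */
    int *data;      /* storage representation: contiguous block of ints */
} IntArray;

/* Operation "create": allocates zero-filled storage for lower..upper.
   Returns 0 on success, -1 on failure. */
int array_create(IntArray *a, int lower, int upper)
{
    if (upper < lower) return -1;
    a->data = calloc((size_t)(upper - lower) + 1, sizeof *a->data);
    if (a->data == NULL) return -1;
    a->lower = lower;
    a->upper = upper;
    return 0;
}

/* Operation "subscript": returns a pointer to element i, or NULL if i is
   outside the declared subscript range. */
int *array_at(IntArray *a, int i)
{
    if (i < a->lower || i > a->upper) return NULL;
    return &a->data[i - a->lower];
}

/* Releases the storage owned by the array. */
void array_destroy(IntArray *a)
{
    free(a->data);
    a->data = NULL;
}
```

After array_create(&a, 1, 5), the call array_at(&a, 3) selects the third component, while array_at(&a, 0) returns NULL because 0 lies outside the subscript range stored in the descriptor.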

DATA TYPE

A data type is an attribute associated with a piece of data that tells a computer system how
to interpret its value. Understanding data types ensures that data is collected in the preferred
format and the value of each property is as expected.

For example, knowing the data type for “Ross, Bob” will help a computer know:

• whether the data refers to someone’s full name (“Bob Ross”)

• or to a list of two names (“Bob” and “Ross”)

Understanding data types will help you ensure that:

• the data you collect is always in the right format (“Ross, Bob” vs. “Bob Ross”)

• the value is as expected (“Ross, Bob” vs. “R0$$, B0b”)

Data Type | Definition | Examples
Integer (int) | Numeric data type for numbers without fractions | -707, 0, 707
Floating Point (float) | Numeric data type for numbers with fractions | 707.07, 0.7, 707.00
Character (char) | Single letter, digit, punctuation mark, symbol, or blank space | a, 1, !
String (str or text) | Sequence of characters, digits, or symbols, always treated as text | hello, +1-999-666-3333
Boolean (bool) | True or false values | 0 (false), 1 (true)
Enumerated type (enum) | Small set of predefined unique values (elements or enumerators) that can be text-based or numerical | rock (0), jazz (1)
Array | List with a number of elements in a specific order, typically of the same type | rock (0), jazz (1), blues (2), pop (3)
Date | Date in the YYYY-MM-DD format (ISO 8601 syntax) | 2021-09-28
Time | Time in the hh:mm:ss format for the time of day, time since an event, or time interval between events | 12:00:59
Datetime | Date and time together in the YYYY-MM-DD hh:mm:ss format | 2021-09-28 12:00:59
Timestamp | Number of seconds that have elapsed since midnight (00:00:00 UTC), 1st January 1970 (Unix time) | 1632855600

Common data types


Integer (int)

It is the most common numeric data type used to store numbers without a fractional component
(-707, 0, 707).

Floating Point (float)

It is also a numeric data type used to store numbers that may have a fractional component like
monetary values do (707.07, 0.7, 707.00).

Please note that number is often used as a data type that includes both int and float types.

Character (char)

It is used to store a single letter, digit, punctuation mark, symbol, or blank space.

String (str or text)

It is a sequence of characters and the most commonly used data type to store text. A string
can also include digits and symbols, but it is always treated as text.

A phone number is usually stored as a string (+1-999-666-3333) but can also be stored as an
integer (9996663333).

Boolean (bool)

It represents the values true and false. When working with the boolean data type, it is helpful to
keep in mind that sometimes a boolean value is also represented as 0 (for false) and 1 (for true).

Enumerated type (enum)

It contains a small set of predefined unique values (also known as elements or enumerators) that
can be compared and assigned to a variable of enumerated data type.

The values of an enumerated type can be text-based or numerical. In fact, the boolean data type
is a pre-defined enumeration of the values true and false.
For example, if rock and jazz are the enumerators, an enumerated type variable genre can be
assigned either of the two values, but not both.

With enumerated type, values can be stored and retrieved as numeric indices (0, 1, 2) or strings.
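
In C, the genre example might be sketched as below. The type name genre and the function genre_name are hypothetical. C stores enumerators as integers, so the text form has to be produced by a lookup such as this one.

```c
/* Hypothetical enumerated type for the music-genre example. */
enum genre { ROCK, JAZZ };      /* ROCK == 0, JAZZ == 1 */

/* Maps an enumerator to its text form; returns "unknown" for any
   value that is not one of the declared enumerators. */
const char *genre_name(enum genre g)
{
    switch (g) {
    case ROCK: return "rock";
    case JAZZ: return "jazz";
    }
    return "unknown";
}
```

A variable of type enum genre holds either ROCK or JAZZ at any moment, never both, which matches the description above.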

Array

Also known as a list, an array is a data type that stores a number of elements in a specific order,
typically all of the same type.

Since an array stores multiple elements or values, the structure of data stored by an array is
referred to as an array data structure.

Each element of an array can be retrieved using an integer index (0, 1, 2,…), and the total number
of elements in an array represents the length of an array.

For example, an array variable genre can store one or more of the elements rock, jazz, and blues.
The indices of the three values are 0 (rock), 1 (jazz), and 2 (blues), and the length of the array is
3 (since it contains three elements).

Continuing on the example of the music app, if you are asked to choose one or more of the three
genres and you happen to like all three (cheers to that), the variable genre will store all three
elements (rock, jazz, blues).

Date

Needs no explanation; typically stores a date in the YYYY-MM-DD format (ISO 8601 syntax).

Time

Stores a time in the hh:mm:ss format. Besides the time of the day, it can also be used to store the
time elapsed or the time interval between two events which could be more than 24 hours. For
example, the time elapsed since an event took place could be 72+ hours (72:00:59).

Datetime
Stores a value containing both date and time together in the YYYY-MM-DD hh:mm:ss format.

Timestamp

Typically represented in Unix time, a timestamp represents the number of seconds that have
elapsed since midnight (00:00:00 UTC), 1st January 1970.
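
As a rough sketch, a Unix timestamp can be split into calendar fields with the standard C function gmtime. The helper name unix_to_utc is hypothetical, and the code assumes a platform where time_t counts seconds since 1 January 1970, as on POSIX systems.

```c
#include <time.h>

/* Hypothetical helper: splits a Unix timestamp into UTC calendar fields.
   Returns 0 on success, -1 if the conversion fails. */
int unix_to_utc(long long seconds, int *year, int *month, int *day, int *hour)
{
    time_t t = (time_t)seconds;
    struct tm *utc = gmtime(&t);    /* broken-down time in UTC */
    if (utc == NULL) return -1;
    *year  = utc->tm_year + 1900;   /* tm_year counts years from 1900 */
    *month = utc->tm_mon + 1;       /* tm_mon is zero-based */
    *day   = utc->tm_mday;
    *hour  = utc->tm_hour;
    return 0;
}
```

Under that assumption, the timestamp 1632855600 decodes to 28 September 2021, 19:00 UTC.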

Specification of Elementary Data Types

An elementary data object contains a single data value, and a class of such data objects over
which various operations are defined is termed an elementary data type.

Some elementary data types are integer, real, character, Boolean, enumeration, and pointer. Their
specification may differ significantly between two languages.

Attributes: Basic attributes of any data object, such as its data type and name, are usually
invariant during its lifetime.

Some attributes may be stored in a descriptor as part of the data object during program execution.
Others may be used only to determine the storage representation of the data object.

The value of an attribute of a data object is different from the value that the data object contains.

Values: The type of a data object determines the set of possible values that it may contain.

For example, C defines the following four classes of integer types:

int, short, long, and char

Because most hardware implements multiple-precision integer arithmetic (for example, 16-bit and
32-bit integers, or 32-bit and 64-bit integers), short can be used for the shortest integer word
length.

long uses the longest integer length implemented by the hardware.

int uses the most efficient integer length that the hardware implements.
In C, characters are typically stored as 8-bit integers in the type char, which is a subtype of
integer.
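
The relationship between these integer types can be checked with a short C sketch. The function names are hypothetical, and the exact sizes depend on the hardware and compiler; the comparison below holds on mainstream platforms.

```c
#include <limits.h>

/* Returns 1 when the integer types are ordered by size as described:
   char <= short <= int <= long. */
int integer_sizes_are_ordered(void)
{
    return sizeof(char) <= sizeof(short)
        && sizeof(short) <= sizeof(int)
        && sizeof(int) <= sizeof(long);
}

/* char is an integer type, so arithmetic on characters is allowed. */
char next_char(char c)
{
    return (char)(c + 1);
}

/* Number of bits in a char on this platform (at least 8). */
int bits_per_char(void)
{
    return CHAR_BIT;
}
```

On an ASCII system next_char('A') yields 'B', which shows that a char value is manipulated as a small integer.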

Operations: The set of operations defined for a data type determines how data objects
of that type may be manipulated.

Operations may be primitive operations, meaning they are specified as part of the language, or

programmer-defined operations, in the form of subprograms or method declarations as
part of class definitions.

Example

a) +: integer × integer -> integer

Integer addition is an operation that takes two integer data objects as arguments and produces
an integer data object as a result.

b) SQRT: real -> real

A square-root operation, SQRT, on real-number data objects is specified in this way.

(Port of operation)

An algorithm that specifies how to compute the result for any given set of arguments is a
common method for specifying the action of an operation.

In C, the concept of a function prototype gives the signature of an operation: the number,
order, and data types of the arguments in the domain of the operation are given, as well as
the order and data type of the resulting range.

Binary operation: two arguments with a single result.

Monadic operation: a single argument with a single result.
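
A binary and a monadic operation might be written as C prototypes like this. The names add and negate are hypothetical and chosen only for illustration.

```c
/* Binary operation: integer x integer -> integer */
int add(int a, int b);

/* Monadic operation: integer -> integer */
int negate(int a);

/* Definitions matching the prototypes above. */
int add(int a, int b) { return a + b; }
int negate(int a)     { return -a; }
```

Each prototype states the number, order, and types of the arguments (the domain) and the type of the result (the range), without saying anything about the algorithm.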

Implementation of Elementary Data Types:

The implementation of an elementary data type consists of

• A storage representation for data objects and values of that type

• A set of algorithms or procedures that define the operations of the type in terms of
manipulations of the storage representation.

Storage representation of elementary data types:

1) Hardware-influenced: Computer hardware influences the storage representation of elementary
data types, because the hardware executes the program. If the hardware storage representations
are used, then the basic operations on data of that type may be implemented using
hardware-provided operations.

2) Software-influenced: If we do not use the hardware storage representations, then the
operations must be software-simulated, and some operations will execute much less efficiently.

Two methods to treat attributes:

1) The attributes are determined by the compiler and are not stored in descriptors during
execution, that is, not stored in the run-time storage representation. This is the usual method
in C.

2) The attributes are stored in a descriptor as part of the data object at run time, as in LISP
and Prolog.

The storage representation is usually described in terms of

• The size of the block of memory required (the number of memory words, bytes, or bits needed)

• The layout of attributes and data values within the block.

Implementation of operations:
Each operation defined for data objects of a given type may be implemented in one of three main
ways:

1) Directly as a hardware operation: If simple data types are stored using the hardware
representation, then the primitive operations are implemented using the arithmetic operations
built into the hardware.

2) As a subprogram or procedure: A square-root operation, for example, is usually not provided
directly as a hardware operation, so it is software-simulated, implemented as a procedure or
function.

3) As an inline code sequence: This is also a software implementation of the operation.
Instead of using a subprogram, the operations in the subprogram are copied into the program at
the point where the subprogram would otherwise have been invoked.

For example, the absolute-value function on numbers,

abs(x) = if x < 0 then -x else x

is usually implemented as an inline code sequence:

a) fetch the value of x from memory

b) if x >= 0, skip the next instruction

c) set x = -x

d) store the new value of x in memory

Here each line is implemented by a single hardware operation.
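
In C, the same idea can be approximated with an inline function. This is only a sketch: abs_inline and distance are hypothetical names, and the inline keyword is a request that the compiler is free to ignore.

```c
/* Hypothetical inline version of abs(x) = if x < 0 then -x else x.
   'static inline' asks the compiler to expand the body at each call site
   instead of emitting a call; the compiler may still decide otherwise. */
static inline int abs_inline(int x)
{
    if (x < 0)      /* steps a and b: fetch x and test its sign */
        x = -x;     /* step c: negate */
    return x;       /* step d: the result replaces the old value */
}

/* Example caller: the body of abs_inline can be expanded here. */
int distance(int a, int b)
{
    return abs_inline(a - b);
}
```

A call such as distance(3, 10) then needs no subprogram linkage for the absolute value; the few instructions are placed directly in the caller.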


Declarations

Declarations provide information about the names and types of data objects needed during
program execution.

Two types of declaration:

- implicit declaration

- explicit declaration

Implicit or default declaration:

Implicit declarations are those that hold when no explicit, user-defined declaration is given;
the translator infers them.

Example

$abc='astring';

$abc=7;

In Perl, the translator implicitly understands that after

$abc='astring'; the variable $abc is a string variable, and after

$abc=7; it is an integer variable.

Explicit declaration of a data object:

float A, B;

This is an example of an explicit declaration in C. In an explicit declaration, the user
explicitly defines the type of the variable. This example specifies that the variables named A
and B are of type float.

A declaration also serves to indicate the desired lifetime of data objects.


Declarations of operations:

- The compiler needs the signature (prototype) of a subprogram or function so that it can
determine the types of the arguments being used and what the result type will be.

- The translator needs all this information before the subprogram is called.

Example in C:

float sub(int z, float y);

It declares sub to have the signature

sub: int × float -> float
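
The effect of declaring the signature before the call can be sketched as follows. The body of sub and the caller call_sub are assumptions made for illustration; the source gives only the prototype.

```c
/* Prototype: declares the signature sub: int x float -> float
   before any call appears. */
float sub(int z, float y);

/* Because the prototype is visible here, the compiler knows that the
   second parameter is a float and converts the int argument 2 to 2.0f. */
float call_sub(void)
{
    return sub(5, 2);
}

/* Definition: must match the prototype above. The subtraction is a
   hypothetical body chosen for the example. */
float sub(int z, float y)
{
    return (float)z - y;
}
```

With this body, call_sub() returns 3.0, and the translator could check the argument types at the call even though the definition of sub appears later in the file.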

Purpose of Declarations:

1) Choice of storage representation: The translator can determine the best storage
representation for a data object only if it knows the data type and attributes of the
data object.

2) Storage management: Declarations make it possible to use the best storage management for a
data object, because the information they provide also indicates the lifetime of the data object.

For example, C offers several options for declaring elementary data objects:

a) Simple declaration, such as float A, B;

The lifetime of such an object lasts at most until the end of execution, as the lifetime of
every data object can be at most the execution time.

A simple declaration also tells the translator that a single block of memory will be allocated.
b) Run-time declaration: C and many other languages provide dynamic memory allocation, in C
through the library functions malloc and calloc.

In this case a separate block of memory is allocated at run time, and its lifetime is different:
it lasts until the block is explicitly freed.
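
A minimal sketch of run-time allocation in C is shown below. The helper names make_int and make_zeroed_floats are hypothetical; malloc, calloc, and free are the standard library functions.

```c
#include <stdlib.h>

/* Allocates one int at run time. The object lives until free() is called
   on it, not until the end of this function. Returns NULL on failure. */
int *make_int(int value)
{
    int *p = malloc(sizeof *p);
    if (p != NULL)
        *p = value;
    return p;
}

/* Allocates n zero-initialised floats with calloc. Returns NULL on failure. */
float *make_zeroed_floats(size_t n)
{
    return calloc(n, sizeof(float));
}
```

A caller would write int *p = make_int(27); use *p; and later call free(p); to end the lifetime of the data object explicitly.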

3) Polymorphic operations: Most languages use some special symbol, such as +, to designate any
one of several different operations, depending on the types of the arguments provided.

In this case the operation symbol (here +) is said to be overloaded, because it does not
designate one specific operation.

Ada allows the programmer to overload subprograms.

ML expands this concept with full polymorphism, where a function has one name but a variety of
implementations depending on the types of its arguments.

4) Type checking: Declarations make static type checking possible, rather than dynamic type
checking.

Type Checking and Type Conversion

Type checking means checking that each operation should receive proper number of arguments
and of proper data type.

Like

A=B*j+d;

* and - are basically int and float data types based operations and if any variable in
this A=B*j+d;Is of other than int and float then compiler will generate type error.

Two ways of Type Checking:

1) Dynamic Type Checking:

• It is done at run time.
• It uses a type tag, stored in each data object, that indicates the data type of the object.

Example:

An integer data object contains both its type and its value.

An operation is performed only after a type-checking sequence in which the type tag of each
argument is checked. If the types are not correct, an error is generated.

• Perl and Prolog use dynamic type checking, because the data types of the variables A and B in
A+B may change during program execution.

• Type checking must therefore be done at run time.
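
The type-tag idea can be sketched in C, which has no dynamic typing of its own. Everything here (TypeTag, Value, make_int, make_str, checked_add) is a hypothetical illustration, not how Perl or Prolog are actually implemented.

```c
#include <stdio.h>

typedef enum { TAG_INT, TAG_STRING } TypeTag;

typedef struct {
    TypeTag tag;              /* the type attribute, kept at run time */
    union {
        long i;
        const char *s;
    } value;                  /* the value attribute */
} Value;

Value make_int(long i)
{
    Value v;
    v.tag = TAG_INT;
    v.value.i = i;
    return v;
}

Value make_str(const char *s)
{
    Value v;
    v.tag = TAG_STRING;
    v.value.s = s;
    return v;
}

/* Checks the tag of each argument before adding.
   Returns 0 on success, -1 on a run-time type error. */
int checked_add(Value a, Value b, Value *result)
{
    if (a.tag != TAG_INT || b.tag != TAG_INT) {
        fprintf(stderr, "type error: + expects two integers\n");
        return -1;
    }
    *result = make_int(a.value.i + b.value.i);
    return 0;
}
```

Note that the error for checked_add(make_int(2), make_str("x"), &r) appears only when that call is actually executed, which is the debugging problem described for dynamic type checking.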

Advantages of Dynamic Type:

• It gives much more flexibility in program design.

• No declarations are required.

• The type of a data object may change during execution.

• The programmer is freed from most concerns about data types.

Disadvantage of Dynamic Type:

• 1) Difficult to debug: Testing exercises only some execution paths, and with dynamic type
checking an operation on a path that is never executed is never checked, so type errors can
remain hidden.

• 2) Extra storage: Dynamic type checking needs extra storage to keep type information during
execution.

• 3) Seldom hardware support: Hardware seldom supports dynamic type checking, so it has to be
implemented in software, which reduces execution speed.

Type checking: Static Type Checking:

Static type checking is done at compile time.

The information needed at compile time is provided by declarations and by language structures.

The information required includes:

1) For each operation: the number, order, and data types of its arguments.

2) For each variable: the name and data type of the data object.

Example-

A+B

Here the types of the variables A and B must not change during execution.

3) For each constant: the name, data type, and value.

const int x=28;

const float y=2.087;

Here the name, data type, and value are all specified, and the compiler can check that the value
assigned matches the declared data type.

Advantages of Static Type Checking:

1) Compiler saves information: once the compiler has determined that the types of the data suit
the operation, it saves that information for checking later operations, so it need not be
determined again.

2) Checked execution paths: As static type checking covers all operations that appear in any
program statement, all possible execution paths are checked, and further testing for type errors is
not needed. So type tags on data objects are not required at run time, and no dynamic checking
is needed.

Disadvantages of Static Type Checking:

It affects many aspects of a language:

1) declarations

2) data control structures

3) provisions for compiling some subprograms separately.

Strong Typing:

If we can detect all type errors statically in a program, we say that the language is
'strongly typed'.

It provides a level of security to our program.

Example

f: S -> R

A function f with signature f: S -> R is type safe if, given an argument of type S, it always
produces a result of type R, that is, a result never outside the range of the R data type.

If every operation is type safe, then the language is strongly typed.

Languages often cited as strongly typed are:

C, Java, C++, Ruby, Smalltalk, Python.

Type inference: Here, as in ML, the language implementation infers any missing type
information from the other declared types.

Example:

fun area(length:int, width:int):int = length*width;

This is the standard declaration, which says that length and width are of type int, that the return
type is int, and that the function name is area. But omitting any two of these declarations still
leaves the function with only one interpretation. Knowing that * can multiply together either two
reals or two integers, ML interprets the following as equivalent to the previous example.

fun area(length, width):int = length*width;

fun area(length:int, width) = length*width;

fun area(length, width:int) = length*width;

However:

fun area(length, width) = length*width;

is invalid, as it is now ambiguous as to the type of the arguments. They could all be int or they
could all be real.

Type Conversion And Coercion:

Explicit Type Conversion: Routines to change from one data type to another.

Example:

Pascal: The function 'round' converts a real value to an integer.

C: e.g. (int)x, for float x, converts the value of x to type int.

Coercion: Implicit type conversion, performed by the system.

Pascal: for + applied to an integer and a real, the integer is converted to real.

Java: Permits implicit coercion if the conversion is widening; for a narrowing conversion an
explicit cast must be given.

C: Converts implicitly between arithmetic types; an explicit cast may also be given.

Two opposite approaches to type coercion:

• No coercion: any type mismatch is considered an error, as (almost entirely) in Pascal and Ada.

• Coercion is the rule: an error is reported only if no conversion is possible.

Advantages of Coercion:

It frees the programmer from some low-level concerns, such as adding two
values of different data types, e.g. real and int.

Disadvantage of Coercion:

Because the programmer pays less attention to types, coercion may hide serious
errors that are not easy to track down.
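
A small C sketch of both sides of coercion. The function names are hypothetical; the behaviour shown follows from C's standard arithmetic conversions.

```c
/* Coercion hides an error: the division is done on two ints first,
   and only the already truncated result is converted to double. */
double average_coerced(int sum, int count)
{
    return sum / count;
}

/* Explicit conversion: sum is cast to double, count is then coerced
   to double as well, and the division keeps the fraction. */
double average_cast(int sum, int count)
{
    return (double)sum / count;
}
```

With sum = 7 and count = 2 the first function yields 3.0 and the second 3.5. Both compile without complaint, which is exactly why such errors are hard to spot.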

Assignment of Data Types

Assignment: A basic operation for changing the binding of a value to a data object. In languages
like C, LISP and many more, assignment also returns a value, which is a data object containing
a copy of the value assigned.

In Pascal:

Assignment(:=): integer1 * integer2 -> void

The value of integer2 is copied into integer1.

In C:

Assignment(=): integer1 * integer2 -> integer3

with this action: set the value contained in data object integer1 to be a copy of the value
contained in data object integer2, and also create and return a new data object integer3
containing a copy of the new value of integer1.
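
Because assignment in C returns a value, it can appear inside a larger expression. A minimal sketch (chain_assign is a hypothetical name):

```c
/* Stores value in both *a and *b, then uses an assignment as an operand. */
int chain_assign(int *a, int *b, int value)
{
    /* *b = value yields the stored value, which is then assigned to *a. */
    *a = *b = value;

    /* The assignment expression returns the value assigned, so 1 can be added to it. */
    return (*a = *b) + 1;
}
```

In Pascal the corresponding statement A := B returns nothing, so it cannot be nested in this way.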

Two Concepts through which we can define assignment


L-Value: Location for an object
R-Value: Content at that location
Using L-Value and R-Value gives a more concise way to describe expression semantics.
Example in the case of integers:

A=B

This copies the value of variable B to variable A, i.e. it assigns to the L-value of A the
R-value of B.

In the case of pointers:

A=B

Here A and B are pointer variables. If B is a pointer, then B's R-value is the L-value of
some other data object. The assignment then means
"Make the R-value of A refer to the same data object as the R-value of B".
Thus, the assignment A=B means "Assign a copy of the pointer stored in variable B
to variable A".

Two views of assignment:

Copy value (Pascal):

A:=B

Copy pointer (SNOBOL):

A=B (a pointer to the value of variable B is assigned to variable A)
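
Both views can be imitated in C, using plain integers for copy-value and pointer variables for copy-pointer. The function names are hypothetical.

```c
/* Copy value: A receives its own copy of B's R-value. */
int copy_value_demo(void)
{
    int a = 1, b = 2;
    a = b;          /* A now holds 2 */
    b = 99;         /* a later change to B does not affect A */
    return a;
}

/* Copy pointer: A and B end up referring to the same data object. */
int copy_pointer_demo(void)
{
    int x = 2;
    int *a = 0;
    int *b = &x;    /* B's R-value is the L-value (address) of x */
    a = b;          /* A now refers to the same object as B */
    *b = 99;        /* the change is visible through A */
    return *a;
}
```

The first function returns 2 and the second returns 99.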

Initialization of Data Types

An uninitialized variable is a data object that has been created, with a name and storage, but
not yet assigned a value (i.e., an L-value with no corresponding R-value). Initialization gives it
its first value.
• When the data object is created, only a block of storage is allocated.
• That block may already hold some arbitrary bit pattern left in memory.
• It is a serious programming error to use an uninitialised variable, as it is difficult to
distinguish a genuine value from a leftover one: both are just bit patterns.
Numeric Data Types

Integers: The most primitive numeric data type is integer.

Specification of Integer Data Type:

C has four different integer specifications:

int, short, long and char

The maximal and minimal values depend on the word size of the underlying hardware,
and in some languages these values are available as defined constants,

like Maxint in Pascal.
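
C offers the same kind of defined constants through the standard header <limits.h>. The function name below is hypothetical; the values printed depend on the machine and compiler.

```c
#include <limits.h>
#include <stdio.h>

/* Prints the range of each of C's four integer specifications. */
void print_integer_ranges(void)
{
    printf("char : %d .. %d\n", CHAR_MIN, CHAR_MAX);
    printf("short: %d .. %d\n", SHRT_MIN, SHRT_MAX);
    printf("int  : %d .. %d\n", INT_MIN, INT_MAX);
    printf("long : %ld .. %ld\n", LONG_MIN, LONG_MAX);
}
```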

Types of operations on Integers:

1) Arithmetic operations:
These are basically of two types:
A) Binary operations: BinOp: integer * integer -> integer

Examples: addition (+), subtraction (-), multiplication (*), division (/), remainder (mod)

B) Unary operations: UnaryOp: integer -> integer

Examples: negation (-), identity (+), absolute value (abs)

2) Relational operations:
The signature is
RelOp: integer * integer -> Boolean
where RelOp may be equal, not equal, less than, greater than, less-than-or-equal, or
greater-than-or-equal. A relational operation compares the values of its two arguments and
returns a Boolean (true or false) data object as its result.

3) Assignment operations: The signature is

assignment: integer * integer -> void
or
assignment: integer * integer -> integer

4) Bit operations:
In C, integers also play the role of Boolean values. Therefore additional bit operations are also
defined.
Signature: BinOp: integer * integer -> integer
Operator (&) ands the bits together
Operator (|) ors the bits together
Operator (<<) shifts the bits left, among others.
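
The three bit operators can be tried out with a few small wrappers. The wrapper names are hypothetical; unsigned operands are used to keep the shift well defined.

```c
/* Each function applies one bit operator to its arguments. */
unsigned and_bits(unsigned a, unsigned b) { return a & b; }
unsigned or_bits(unsigned a, unsigned b)  { return a | b; }
unsigned shift_left(unsigned a, unsigned n) { return a << n; }
```

For example, with 12 = 1100 and 10 = 1010 in binary, and_bits(12, 10) gives 1000 = 8, or_bits(12, 10) gives 1110 = 14, and shift_left(1, 3) gives 1000 = 8.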

Implementation of integer

Most often using the hardware-defined integer storage representation and a set of hardware
arithmetic and relational operations on integers.
Numeric Data Types: Sub-ranges of an Integer

Specification of Sub-range Data Types:

A sub-range of an integer data type is a subtype of the integer data type and consists of a sequence
of integer values within some restricted range.
Declaration in Pascal

A: 1..10

Declaration in Ada

A: integer range 1..10

Implementation: Sub-range types have two advantages

1) Smaller storage requirement: As it covers a smaller range of values, a sub-range value can
usually be stored in fewer bits than a general integer value.

2) Better type checking: More precise type checking can be performed on the values assigned to
such variables.
For example, if the variable Month is declared as Month: 1..12, then the assignment
Month := 0 is invalid and can be detected at compile time.
For the assignment Month := Month + 1,
a run-time check is needed to ensure that the range limit is not exceeded.
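
C has no sub-range types, so the run-time check that a Pascal or Ada translator would generate has to be written by hand. The sketch below is only an imitation of that check; the Month type and month_assign are hypothetical names.

```c
/* A month value that is meant to stay within 1..12. */
typedef struct { int value; } Month;

/* Performs the range check before storing.
   Returns 1 if v was stored, 0 if it lies outside 1..12. */
int month_assign(Month *m, int v)
{
    if (v < 1 || v > 12)
        return 0;           /* range violation: value left unchanged */
    m->value = v;
    return 1;
}
```

With a Month holding 12, month_assign(&m, m.value + 1) is rejected, which corresponds to the run-time range check on Month := Month + 1.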

Numeric Data Type: Floating Point Real Numbers


• The precision required for floating point numbers, in terms of the number of digits used in the
decimal representation, may be specified by the programmer, as in Ada.
• Similar arithmetic, relational, and assignment operations as with integers are usually provided
for reals.
• Boolean operations have restrictions.
• Equality between two real numbers is rarely achieved, due to round-off. A program
that checks for equality to exit a loop may therefore never terminate.
• Some built-in functions are usually provided, like

sine and maximum value:

sin: real -> real

and

max: real * real -> real
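
The round-off problem with equality can be seen in C with the classic 0.1 + 0.2 case. A common remedy, shown here as a sketch with hypothetical function names, is to compare within a tolerance chosen by the programmer.

```c
/* Exact comparison: often fails for values that "should" be equal. */
int exactly_equal(double a, double b)
{
    return a == b;
}

/* Comparison within a tolerance supplied by the caller. */
int nearly_equal(double a, double b, double tolerance)
{
    double diff = a - b;
    if (diff < 0)
        diff = -diff;       /* absolute difference */
    return diff <= tolerance;
}
```

On the usual binary floating point hardware, exactly_equal(0.1 + 0.2, 0.3) is false, while nearly_equal(0.1 + 0.2, 0.3, 1e-9) is true. A loop that waits for exact equality is therefore better written with a tolerance or with <=.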

Implementation of Floating Point

• The storage representation is based on the hardware representation, in which a storage location
is divided into a mantissa (that is, the significant digits of the number) and an exponent.
• Any number N can be expressed in the form N = m * 2^k, for some m between 0 and 1 and some
integer k.
• A double-precision form of floating point number is also often available, in which an
additional memory word is used to store an extended mantissa.
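
The form N = m * 2^k can be computed by repeated halving or doubling. The function below is a hypothetical sketch for positive numbers only; the standard library function frexp in <math.h> does the same job for all values.

```c
/* Splits a positive n into m and k such that n == m * 2^k and 0.5 <= m < 1.
   For n <= 0 the sketch gives up and returns 0 with k set to 0. */
double split_float(double n, int *k)
{
    double m = n;
    *k = 0;
    if (n <= 0.0)
        return 0.0;
    while (m >= 1.0) { m /= 2.0; (*k)++; }   /* too large: halve, raise exponent */
    while (m < 0.5)  { m *= 2.0; (*k)--; }   /* too small: double, lower exponent */
    return m;
}
```

For example, 8 is split into m = 0.5 and k = 4, since 0.5 * 2^4 = 8.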

Scalar Data Type Enumeration

An enumerated data type is a data type whose domain values are given in an ordered list
and whose only operations are equality and assignment.
or
An enumeration is an ordered list of distinct values.
or
An enumeration is a complete ordered listing of all items in a collection.
Pascal was the first language to introduce enumerations. To make an enumeration facility useful, a
programming language must provide a mechanism for declaring and defining the new data type
and for declaring variables whose values will come from the elements of the type.

It is assumed that these literals are distinct, and thus equality can be directly defined.

"What did we have before the era of enumerations?"

For example: A variable StudentClass might have only 4 possible values representing fresher,
sophomore, junior and senior. Similarly, a variable StudentSex might have only two values
representing male and female.
Before the concept of enumeration, in languages like
Fortran or Cobol such variables were declared as integer type and distinct values were assigned, like
fresher=1, sophomore=2, and so on
and male=0, female=1.
The translator then manipulates the values as integers.
That creates a big problem, for example
fresher=1 and female=1:
both have the same value, so can we apply integer operations to them, or compare them? From the
programmer's point of view this should not be allowed, but the translator allows it because both
are of integer type.
Languages such as C, Pascal and Ada therefore include an enumeration data type that allows
the programmer to define and manipulate such variables directly.

Specification of Enumeration

The programmer defines both the literal names to be used for the values and their ordering, using
a declaration such as, in Pascal:
type months = (jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec);

In C
enum studentclass { fresh, soph, junior, senior };
enum studentsex { male, female };
In Pascal, the C example can be written as

type Class = (fresh, soph, junior, senior);
type Sex = (male, female);

followed by declarations for variables such as
StudentClass: Class;
StudentSex: Sex;

Here the type definition introduces the type name Class, which may be used wherever a primitive
type name such as integer might be used.
It also introduces the literals fresh, soph, junior, and senior, which may be used wherever a
language-defined literal such as "27" might be used. Thus we can write
if StudentClass = junior then ...
instead of the less understandable
if StudentClass = 3 then ...
which would be required if integer variables were used. With static type checking, the compiler can
find errors such as
if StudentClass = male then ...
as male is not a literal of type Class. The operations we can perform are
• Relational operations (equal, less-than, greater-than, etc.)
• Assignment
• Successor and predecessor
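
The operations listed above can be sketched in C using the studentclass declaration from the text. C has no built-in successor operation, so successor and is_upperclass are hypothetical helpers written for this illustration.

```c
enum studentclass { fresh, soph, junior, senior };

/* Successor: the next literal in the declared order.
   senior has no successor, so this sketch returns it unchanged. */
enum studentclass successor(enum studentclass c)
{
    return c == senior ? senior : (enum studentclass)(c + 1);
}

/* Relational operation based on the declared ordering. */
int is_upperclass(enum studentclass c)
{
    return c >= junior;
}
```

Unlike Pascal, C treats enumeration values as integers, so a C compiler will typically accept a comparison between a studentclass value and male without reporting a type error.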

Implementation of Enumeration

• Each value in the enumeration sequence is represented at run time by one of the integers
0, 1, 2, ..., as only a small set of values is involved and the values are never negative.
• This integer representation is often shortened to omit the sign bit and use only enough bits for
the range of values required, as with sub-range values.
• For example, only 2 bits are required to represent senior=3 in memory, because
3 = 11 (binary).
In C, the programmer may override the default and set any values desired for the enumeration
values, for example:
enum class { fresh=74, soph=89, junior=7, senior=28 };

With this storage representation for enumeration types, relational operations such as =, >, and
< may be implemented directly as integer comparisons.

Scalar Data Types: Booleans

The Boolean data type is a data type, having two values(usually denoted true or false), intended
to represent the truth values of logic and Boolean algebra.

Specification: In Pascal and Ada, the Boolean data type is considered simply a language -
defined enumeration, viz;

type Boolean=(false, true);

Which both defines the names true and false for the values of the types and define ordering
false<true

Common Operations in boolean are

and : Boolean*Boolean->Boolean(conjunction)

or : Boolean*Boolean->Boolean(inclusive disjunction )

not : Boolean ->Boolean(negation or complement)

Implementation of boolean data type:


A single bit of storage is sufficient, and no descriptor designating the data type is needed. Because
a single bit may not be separately addressable in memory, a byte or word is often used
to represent a Boolean value. The values true and false might then be represented in two ways
within the storage unit:

• 1) A particular bit is used for the value (often the sign bit of the number representation), with '0=
false', '1=true', and the rest of the byte or word ignored, or

• 2) A zero value in the entire storage unit represents false, and any non-zero value
represents true.
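
The two representations can be sketched in C as follows. The function names are hypothetical; the second representation is the one C itself uses when an integer appears in a condition.

```c
#include <stdbool.h>

/* Representation 1: only the top bit of the byte carries the value;
   the remaining bits are ignored. */
bool from_top_bit(unsigned char unit)
{
    return (unit & 0x80u) != 0;
}

/* Representation 2: zero is false, any other value is true. */
bool from_nonzero(int unit)
{
    return unit != 0;
}
```
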
Scalar Data Types: Characters

Specification of Characters

A character data type provides data objects that have a single character as their value. The set of
values in a character data type depends on the hardware and operating system, e.g. the ASCII
character set. The ordering of the characters in this character set is called the collating
sequence, and it is the ordering given by the relational operations.

The character set includes

• Letters
• Spaces
• Digits
• Special characters @, #, $, & etc.

Operations on character data include only

• Relational operations
• Assignment and
• Tests of a character for letter, digit, or special character.

Implementation of Character Data Type

Character data values are almost always directly supported by the underlying hardware and operating
system because of their use in input-output.

ASCII values of character data in C are

0 to 9: 48 to 57
A to Z: 65 to 90
a to z: 97 to 122
and the remaining codes are for special and control characters.

In C a character is declared as

char a;   // declaration
a='A';    // initializing character data object with A

where 'A' = 65 = 1000001 (binary).
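
The character tests mentioned above are available in C through <ctype.h>. The classify function is a hypothetical wrapper around the standard isalpha and isdigit routines.

```c
#include <ctype.h>

/* Returns 'L' for a letter, 'D' for a digit, and 'S' for anything else. */
char classify(char c)
{
    unsigned char u = (unsigned char)c;   /* the ctype functions expect this range */
    if (isalpha(u)) return 'L';
    if (isdigit(u)) return 'D';
    return 'S';
}
```

Relational operations follow the collating sequence, so on an ASCII system 'A' < 'a' holds because 65 < 97.
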
Introduction to Syntax and Semantics

Syntax:

Like an ordinary language such as English, programming languages have syntax. The syntax of a
(programming) language is a set of rules that define what sequences of symbols are considered
to be valid expressions (programs) in the language.
or
The syntax of a programming language is what the program looks like.
Syntax provides significant information needed for understanding a program and provides much
needed information for the translation of the source program into the object program.

A valid representation of syntax

X=Y+Z

An invalid representation may be

XY+-

For 2+3*4, most languages interpret this expression as having the value 14 and not 20. That is,
the expression is interpreted as if written 2+(3*4), not (2+3)*4.

We can specify either interpretation, if we wish, by syntax, and hence guide the translator into
generating the correct operations for evaluating this expression.

In a statement

X=2.82 + 3.68

syntax cannot tell us the type of X, on which the result depends.

If X is real then the result will be 6.50, and
if X is integer then the result will be 6.
To completely describe a programming language we need
something beyond syntax that tells us the meaning of expressions, statements and program units.
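
Both points can be checked in C. The function names are hypothetical; the first pair shows how grouping changes the value of 2+3*4, and the second pair shows the same right-hand side giving different results depending on the declared type of X.

```c
/* Usual precedence: * binds tighter than +. */
int default_grouping(void)  { return 2 + 3 * 4; }

/* Parentheses in the syntax force the other interpretation. */
int explicit_grouping(void) { return (2 + 3) * 4; }

/* Same expression, X declared as an integer: the fraction is discarded. */
int assign_to_integer(void)
{
    int x = 2.82 + 3.68;
    return x;
}

/* Same expression, X declared as a real: the fraction is kept. */
double assign_to_real(void)
{
    double x = 2.82 + 3.68;
    return x;
}
```

The first two functions return 14 and 20. The third returns 6, while the fourth returns a value of about 6.5.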

Semantics: Semantics is the meaning of an expression (program) in a programming


language.

In C, a 10-element vector V of integers has the declaration

int v[10];

In Pascal
v: array[0..9] of integer
Although both create similar data objects at run time, their syntax is very different. To
understand the meaning of a declaration we need to know the semantics of both Pascal and C for
such array declarations.

Another example

while (<boolean_expr>) <statement>

The semantics of this statement form is that when the current value of the Boolean expression is
true, the embedded statement is executed, and the expression is then tested again.

The General Problem of Describing Syntax

A language, whether natural, like English, or artificial, like C or Java, is a set of strings of
characters from some alphabet. The strings of a language are called "sentences" or "statements".
The syntax rules of a language specify which strings of characters from the language's alphabet
are in the language.
Formal descriptions of the syntax of programming languages, for simplicity's sake, often do not
include descriptions of the lowest-level syntactic units. These small units are called "lexemes".
Lexemes include numeric literals, operators and special words, among others. We can think of
a program as a string of lexemes rather than of characters.
Lexemes are partitioned into groups. For example, the names of variables, methods, classes
and so forth in a programming language form a group called identifiers.
Each lexeme group is represented by a name, or token. So, a token of a language is a category
of its lexemes.

Consider the following Java statement:

index = 2 * count + 17;

Lexemes    Tokens

index      identifier
=          equal-sign
2          int-literal
*          mult-operator
count      identifier
+          plus-operator
17         int-literal
;          semicolon
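
Splitting a statement into lexemes and naming their tokens can be sketched with a very small scanner. This is a hypothetical illustration that handles only the lexemes in the table above; the function name next_token and its interface are assumptions of this sketch.

```c
#include <ctype.h>
#include <stdio.h>

/* Reads the next lexeme from *src into lexeme (capacity cap, at least 2),
   advances *src past it, and returns the token name.
   Returns NULL when the input is exhausted. */
const char *next_token(const char **src, char *lexeme, int cap)
{
    const char *p = *src;
    int n = 0;

    while (isspace((unsigned char)*p))      /* skip blanks between lexemes */
        p++;
    if (*p == '\0') {
        *src = p;
        return NULL;
    }
    if (isalpha((unsigned char)*p)) {       /* letter (letter | digit)* */
        while (isalnum((unsigned char)*p) && n < cap - 1)
            lexeme[n++] = *p++;
        lexeme[n] = '\0';
        *src = p;
        return "identifier";
    }
    if (isdigit((unsigned char)*p)) {       /* digit+ */
        while (isdigit((unsigned char)*p) && n < cap - 1)
            lexeme[n++] = *p++;
        lexeme[n] = '\0';
        *src = p;
        return "int-literal";
    }
    lexeme[0] = *p;                         /* single-character lexemes */
    lexeme[1] = '\0';
    *src = p + 1;
    switch (lexeme[0]) {
    case '=': return "equal-sign";
    case '*': return "mult-operator";
    case '+': return "plus-operator";
    case ';': return "semicolon";
    default:  return "unknown";
    }
}
```

Calling next_token repeatedly on "index = 2 * count + 17;" until it returns NULL yields the eight lexeme and token pairs of the table, in order.
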
Two distinct ways of defining a language

A) Language Recognizers:

• A recognition device reads input strings of the language and decides whether each input string
belongs to the language.
• Syntax analyzers determine whether given programs are syntactically correct.

B) Language Generators:

• A generator is a device that generates sentences of a language.
• People prefer certain forms of generators over recognizers because they can more easily read
and understand them.
By contrast, the syntax-checking portion of a compiler (a language recognizer) is not as useful
a language description for a programmer, because it can be used only in trial-and-error mode.

To determine the correct syntax of a particular statement using a compiler, the programmer
can only submit a speculated version and note whether the compiler accepts it.

Formal Methods of Describing Syntax

English:
Alphabet: a,b,c,...,z,A,B,C,...,Z. Punctuation: , . ' ! " _ /

Words: combinations of letters of the alphabet

Sentences:

I eat chocolate. (a correct sentence)

Eat I. Chocolate* (the words are correct, but their arrangement is not)

He eat chocolate.* (the words are correct, but eat should be eats to agree with the subject)

Programming language:
1. Alphabets: a,b,c,d,...,z,A,B,...,Z
2. Digits: 0,1,2,3,...,9
3. Arithmetic symbols: *,-,/,+
4. Special symbols: ->, #, @, !
A programming language, in more technical terms, consists of
header files (e.g. math.h), statements, expressions, identifiers,
tokens, lexemes, and literals.
These all need a grammar which can describe their syntax.

Grammar: The syntax and structure of a language. It is used in compiler construction to describe
the syntax of a programming language.

Natural Language: A language that has developed naturally through use.

Artificial Language: A language used to communicate with computers.

Syntax: The rules governing the arrangement of words and phrases to create well-formed
sentences.
Meta language:
• A language used to model another language.
• Determines whether a series of characters is valid.
• Generates well-formed statements.
• Breaks down a statement into constituent parts so it can be converted into machine language.

Token: The smallest individual units in a programming language are known as tokens.

C tokens are of six types

Type              Example

Keywords          int, while
Identifiers       main, total
Constants         10, 20
Strings           "total", "hello"
Special symbols   (), {}
Operators         +, *, -, /

Example

#include <stdio.h>

int main()
{
    int x, y, total;

    x=10, y=20;

    total=x+y;

    printf("total=%d\n", total);   /* prints total=30 */

    return 0;
}

Where

main                                 identifier
{, }, (, )                           delimiters (special symbols)
int                                  keyword
x, y, total                          identifiers
main, {, }, (, ), int, x, y, total   tokens

Grammars and Derivations

• A grammar is a "generative device" for defining languages.
• The sentences of the language are generated through a sequence of applications of the
rules, beginning with a special nonterminal of the grammar called the start symbol.
• This sequence of rule applications is called a derivation.
Parse Tree and Ambiguity

Parse Tree

The hierarchical syntactic structure of a sentence of a language is called a parse tree.

Example sentence: A=B*(A+C)

Ambiguity

A grammar that generates a sentential form for which there are two or more distinct parse trees is
said to be ambiguous.
For example, under an ambiguous expression grammar A=B+C*A has two distinct parse trees: one
groups it as (B+C)*A and the other as B+(C*A).
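
An unambiguous grammar for such assignments can be turned directly into a recognizer, with one function per nonterminal. The grammar below is an assumption made for this sketch, not one given in these notes: identifiers are single upper-case letters and blanks are not allowed.

```c
#include <ctype.h>

/* Assumed grammar (start symbol: assign):
     assign -> id = expr
     expr   -> term { + term }
     term   -> factor { * factor }
     factor -> id | ( expr )
   Because * is introduced lower in the grammar than +, every sentence
   has exactly one parse tree, with * grouped before +. */

static const char *pos;     /* current position in the input string */

static int expr(void);

static int factor(void)
{
    if (isupper((unsigned char)*pos)) {     /* factor -> id */
        pos++;
        return 1;
    }
    if (*pos == '(') {                      /* factor -> ( expr ) */
        pos++;
        if (!expr() || *pos != ')')
            return 0;
        pos++;
        return 1;
    }
    return 0;
}

static int term(void)
{
    if (!factor())
        return 0;
    while (*pos == '*') {
        pos++;
        if (!factor())
            return 0;
    }
    return 1;
}

static int expr(void)
{
    if (!term())
        return 0;
    while (*pos == '+') {
        pos++;
        if (!term())
            return 0;
    }
    return 1;
}

/* Returns 1 if s is a sentence of the assumed grammar, 0 otherwise. */
int is_assignment(const char *s)
{
    pos = s;
    if (!isupper((unsigned char)*pos))
        return 0;
    pos++;
    if (*pos != '=')
        return 0;
    pos++;
    return expr() && *pos == '\0';
}
```

Under these assumptions is_assignment accepts both "A=B*(A+C)" and "A=B+C*A", and rejects a string such as "XY+-". The order in which the functions call one another mirrors a derivation that starts from the start symbol.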
