Dr.
Islam El-Maddah
Islam_elmaddah@yahoo.co.uk
Ain Shams University Faculty of Engineering
Spring 2022
Software Programs
Data Code
Arrays
Dynamic Array
Linked List +Sort
+Search Algorithms
Recursive functions
Stack
Queues
Tree (BST)
Binary Tree (tree
search)
Graph (traverse) + SN
Hashing
Data Structure and Algorithm Analysis in C++
3rd edition Mark Allen Weiss.
Midterm 15
Assignments and Quizzes 10
Project 15
Final Exam 110
Abstract Data Types
6
An abstract data type is a mathematical set
of data, along with operations defined on
that kind of data
Examples:
◦ int: it is the set of integers (up to a certain
magnitude), with operations +, -, /, *, %
◦ double: it’s the set of decimal numbers (up to a
certain magnitude), with operations +, -, /, *
7
The previous examples belong to what is
called built-in data types
That is, they are provided by the
programming language
But new abstract data types can be defined by
users, using arrays, classes (if object oriented
programming), etc.
8
For a phone book application
we are interested in fast search,
fair add, ok delete, etc
Phone book
Search by name Delete/remove
Add new item
by number item
Introduction to Data Structures
10
A data structure is a user-defined abstract
data type
Examples:
◦ Complex numbers: with operations +, -, /, *,
magnitude, angle, etc.
◦ Stack: with operations push, pop, peek, isempty
◦ Queue: enqueue, dequeue, isempty …
◦ Binary Search Tree: insert, delete, search.
◦ Heap: insert, min, delete-min.
11
Specification
◦ A set of data
◦ Specifications for a number of operations to be
performed on the data
Design
◦ A lay-out organization of the data
◦ Algorithms for the operations
Goals of Design: fast operations
12
Representation of the data using built-in data
types of the programming language (such as
int, double, char, strings, arrays, structs,
classes, pointers, etc.)
Language implementation (code) of the
algorithms for the operations
13
In OOP languages such as C++, both the
data representation and the operations are
aggregated together into what is called
objects
The data type of such objects are called
classes.
Classes are blue prints, objects are
instances.
14
Pseudo-code of finding a maximum value of
the array x[]:
double M=x[0];
for i=1 to n-1 do
if (x[i] > M)
M=x[i];
endif
endfor
return M;
15
T(n) = a+(n-1)(b+a) = O(n)
Where “a” is the time of one assignment, and
“b” is the time of one comparison
Both “a” and “b” are constants that depend on
the hardware
Observe that the big O spares us from
◦ Relatively unimportant arithmetic details
◦ Hardware dependency
16
The big-O notation is a simplification
mechanism of time/memory estimates.
It loses some precision, trading precision
for simplicity
Retains enough information to give an idea
of speed/cost of an algorithm, and to be
able to compare competing algorithms.
17
T(n)=O(1)// constant time, eg adding numbers
T(n)=O(log n) // logarithmic, like binary search
T(n)=O(n)// linear, like linear search
T(n)=O(n2) //quadratic, like selection sort
T(n)=O(n3) //cubic, like multiplying two
matrices
T(n)=O(nc), c 1 // polynomial
T(n)=O(logc n), c 1 // polylogarithmic
T(n)=O(nlog n) // like merge sort
18
a+ a + …(n items).+a = an = O(n)
a+2a+3a+…+na= a*n(n+1)/2 = O(n2)
12+22+32+…+n2= n(n+1)(2n+1)/6 = O(n3)
1+x+x2+x3+…+xn-1=(x n – 1)/(x-1) = O(xn-1)
19
a+2a+3a+…+na= a*n(n+1)/2 = O(n2)
Sum of a Mathematic series = (first + last)/2 *
number of items
E.g. 1+2+3+ 10 = 10*(1+10)/2 = 55
1+x+x2+x3+…+xn-1=(x n – 1)/(x-1) = O(xn-1)
Sum of a Geometric series = (last*step-first)/(step-
1)
E.g. 1+2+4+8+16 = (32-1)/(2-1) = 31
2+4+8+16 = (32-2)/(2-1)=30
20
How to define arrays
int array[50];
float array[50]={1.2,2.1,3.4,2,1.2e-3};
How to access arrays
Using index which takes an integer number
from 0 to size-1: like array[0] and array[49]
or array[i], where i changes from 0 to 49
int a[]={12,10,-3,120,5} // compiler will allocate
address 1010 to the array a
Arrays are stored in memory consecutive locations
Arrays are static, Address Content
the index boundary is limited with 1010 12
the maximum size, 1012 10
otherwise runtime error
1014 -3
will be issued in most compilers
1016 120
1018 5
Arrays are not good for dynamic space
requirements when the need for storing
is changed from one program run to another
or during the same run
Very common use of array is to store variable
length items like phonebook for example.
const int MAXSIZE = 100;
string pbook[MAXSIZE];
int size =0;
for(int i=0; i<size; i++)
cout<< pbook[i]<<endl;
string someentry;
cin>> someentry;
pbook[size] =someentry;
size++;
Inserting item at the middle
int pos;
string someentry;
cin>> pos >> someentry;
pbook[size] = pbook[pos];
Pbook[pos]=someentry;
size++;
Removing item at the middle
int pos;
cin>> pos;
pbook[pos] = pbook[size-1];
size--;
int maxindex=0; //assume it is at the top
for(int i=1; i<size;i++)
If(pbook[i]>pbook[maxindex])
maxindex=i;
cout<< “the maximum value is” <<pbook[maxindex];
Sorting can be done in many ways (each has
its own big O)
one way is to perform selection sort.
The array is visited number of times = size-1;
In each time, the maximum of the array is
determined and swapped with the last
element of the array.
for(int p=0; p<size-1;p++)
{int maxindex=0; //assume it is at the top
for(int i=1; i<size-p;i++)
If(pbook[i]>pbook[maxindex])
maxindex=i;
cout<< “the maximum value is” <<pbook[maxindex];
//swap pbook[size-1-p], pbook[maxindex]
int temp=pbook[size-1-p];
pbook[size-1-p]=pbook[maxindex];
pbook[maxindex]=temp;
This is called the
}
selection sort it is
O(n2) how?
Content Content Content
12 12 5
10 10 10
-3 -3 -3
120 5 12
P=0 P=1
5 Maxindex=3
120 Maxindex=0 120
Swap 3,4 Swap 0,3
Content Content Content
5 5 -3
10 -3 5
-3 10 10
12 12 12
P=2 P=3
120 Maxindex=1
120 Maxindex=0 120
Swap 1,2 Swap 0,1
A pointer refers to a position/address in the
memory where a value is found
Pointers are created in C/C++ for one of the data
types as follows int *pint;
To refer to a variable address use & before the
variable name
Like int x=9, *pint;
pint=&x
Pointers can allocate a new space using new
pint = new int
The values that pointer points at can be accessed
using * then the pointer name like cout<<*pint
Dynamic Arrays are allocated during run time using
pointers
float * darr;
darr = new float[size];
You can access the values same as arrays darr[i] where I
between 0 and size-1
*darr the value of the first float value
*(darr+1) the value of the second float darr[1]
darr++ advance the pointer to the second element in the
array
At the end, The pointer space will be freed using delete
delete [] darr; //most of c++ compilers put value NULL
or ptrnull
A C++ class for phone book
class pbook
{int size;
string *pdata;
public: pbook(int s) {pdata = new string[s]; size=0;}
void add(string s1) {pdata[size]=s1; size++;}
void insert(string s2, int pos);
void remove(int pos);
void list() ;
void sort();
~pbook() {delete [] pdata;}
}
main()
{
pbook a(10); // the constructor of object a will be
invoked
a.add(“my_friend”); Pointer variables may be
used as a cash to
pbook *ptr; reference recently used
items !
ptr =new pbook(5); // call the constructor
ptr->add(“my_friend”);
delete ptr; // call the destructor of *ptr
} // implicitly call the destructor of a
void get_sqrt( double x)
// Precondition: x >= 0.
// Postcondition: The output is a number
// y = the square root of x.
38
Use the library call
assert (condition)
You have to include #include <cassert>
It makes sure that condition is satisfied (=
true), in which case the execution of the
program continues.
If condition is false, the program execution
terminates, and an error message is printed
out, describing the cause of the
termination.
39
Determining an estimate of the time and
memory requirement of the algorithm.
Time estimation is called time complexity
analysis
Memory size estimation is called space
complexity analysis.
Because memory is cheap and abundant,
we rarely do space complexity analysis
Since time is “expensive” , analysis now
defaults to time complexity analysis
CS 103 40
Performance Analysis and
Big-O
41
Let n be a non-negative integer representing
the size of the input to an algorithm
Let f(n) and g(n) be two positive functions,
representing the number of basic calculations
(operations, instructions) that an algorithm
takes (or the number of memory words an
algorithm needs).
CS 103 42
f(n)=O(g(n)) iff there exist a positive constant
C and non-negative integer n0 such that
f(n) Cg(n) for all nn0.
g(n) is said to be an upper bound of f(n).
CS 103 43
f(n) = 5n+2 = O(n) // g(n) = n
◦ f(n) 6n, for n 3 (C=6, n0=3)
f(n)=n/2 –3 = O(n)
◦ f(n) 0.5 n for n 0 (C=0.5, n0=0)
n2-n = O(n2) // g(n) = n2
◦ n2-n n2 for n 0 (C=1, n0=0)
n(n+1)/2 = O(n2)
◦ n(n+1)/2 n2 for n 0 (C=1, n0=0)
CS 103 44
When computing the complexity,
◦ f(n) is the actual time formula
◦ g(n) is the simplified version of f
Since f(n) stands often for time, we use T(n)
instead of f(n)
In practice, the simplification of T(n) occurs
while it is being computed by the designer
CS 103 45
If T(n) is the sum of a constant number of
terms, drop all the terms except for the most
dominant (biggest) term;
Drop any multiplicative factor of that term
What remains is the simplified g(n).
amnm + am-1nm-1+...+ a1n+ a0=O(nm).
n2-n+log n = O(n2)
CS 103 46
T(n)=O(1) // constant time
T(n)=O(log n) // logarithmic
T(n)=O(n) // linear
T(n)=O(n2) //quadratic
T(n)=O(n3) //cubic
T(n)=O(nc), c 1 // polynomial
T(n)=O(logc n), c 1 // polylogarithmic
T(n)=O(nlog n)
CS 103 47
The big-O notation is a simplification
mechanism of time/memory estimates.
It loses precision, trading precision for
simplicity
Retains enough information to give a ballpark
idea of speed/cost of an algorithm, and to be
able to compare competing algorithms.
CS 103 48