Data Structures and Algorithms


Programming Algorithms and Data Structures

Introduction
Stack and Queue / Slide 2

Overview

The Lecturer

Course structure
Course Aims / Objectives

Topics
Assessment
Course Policies
Stack and Queue / Slide 3

Lecturer’s Information
Dr. Phillip Kisembe

Mobile: 0552215093

Email: drpkisembe@anuc.edu.gh
Stack and Queue / Slide 4

Aims and Summary


 This module builds on the concepts and principles
outlined in the Programming module at level 1.
The aim of this module is to give the student
additional insight into programming techniques, in
the context of data structures and algorithms. The
subject of complexity analysis for algorithms and
operations upon data structures will be introduced
and developed within a practical context.
Advanced language features, such as event-driven
programming, will also be introduced in this
module.
Stack and Queue / Slide 5

Course Educational Objectives


 The intended learning outcomes are that on
completion of this module the student should be able
to:
 1. Understand and select appropriate algorithms for
solving a range of problems.
 2. Design and implement algorithms and data
structures for novel problems.
 3. Reason about the complexity and efficiency of
algorithms.
 4. Demonstrate the use of object-oriented
programming features and advanced language
features such as events and GUIs.
Stack and Queue / Slide 6

Teaching And Learning Methods


 Class contact time will comprise a combination of lecture,
discussion and tutorial sessions.

 During lectures, students will be required to participate by
answering questions and contributing to topics put on the floor for
discussion.

 The class will meet for three hours every week (see Time
table).
Stack and Queue / Slide 7

Indicative Content
Data Structures:
Arrays, Lists, Trees, Stacks, Queues, Heaps, Self-balancing trees, Graphs and Hash
tables
 
Algorithms:
Searching and sorting, Recursion, Divide and Conquer strategies, Greedy algorithms,
Prim’s algorithm, Dijkstra’s algorithm, Bellman-Ford algorithm
 
Complexity and efficiency:
Time and space complexity, Big-O notation, Analysis of algorithms
 
Software reuse and advanced language features:
Encapsulation, Data hiding, Polymorphism, Exception handling, Events, GUIs.
Stack and Queue / Slide 8

Course Requirements
Assessment     Component                  Weighting   Learning Outcomes (2, 3, 4)
Coursework     45 min Phase Test (30%)    60%         Y  Y  Y
               45 min Phase Test (30%)
Examination    2-hour exam paper          40%         Y  Y  Y

Stack and Queue / Slide 9

Literature And Reading Materials

 Essential Reading
 McGrath, M. (2011) C++ Programming in
Easy Steps. 4th ed. Southam: In Easy Steps

 Recommended Reading
 Cormen, Thomas H. (2009) Introduction to
Algorithms. 3rd ed. Cambridge, Mass: MIT
Press
Stack and Queue / Slide 10

Course Policies
 Class Participation:
 Preparation and engaged participation at all class
sessions are expected of all students.

 Deadlines are sacred and firm.


 Failure to keep deadlines will adversely affect your
grade.
 All written assignments should be typed.

 Attendance: regular attendance and promptness


are expected at each lecture.
 When absent, the student is responsible for getting
notes and assignments from other students.
Stack and Queue / Slide 11

What is a Computer Program?

Problem  →  Computer Program  →  Solution

Input (DS)  →  Process (Algorithm)  →  Output (DS)

Data Structures + Algorithms = Programs
Stack and Queue / Slide 12

What are Data Structures?


 A data structure is a specialized format for
organizing and storing data.

 Any data structure is designed to organize data to


suit a specific purpose so that it can be accessed
and worked with in appropriate ways.

 General data structure types include the array, the


stack, the record, the queue, the tree, and so on.
Stack and Queue / Slide 13

Data Structures
 Data may be organized in different ways

 Data structure is the logical or mathematical


model of a particular organization of data

 It must be rich enough to mirror the actual
relationships of the data in the real world.
What are Data Structures?
Stack and Queue / Slide 14

Data structures let the input and output be represented in a


way that can be handled efficiently and effectively.

Array

Linked List

Queue
Tree
Stack
Stack and Queue / Slide 15

Data Structures
 Data can be organized into:
Primitive types
Composite types
Abstract data types
Stack and Queue / Slide 16

Primitive Data Type


 Classic basic primitive types may include:

Character (character, char);

Integer (integer, int, short, long, byte) with a


variety of precisions;

Floating-point number (float, double, real);

Boolean having the values true and false.


Stack and Queue / Slide 17

Composite Data Types


 Composite data types are data types that
can be constructed in a program from the
programming language's primitive data types
and other composite types.

 The act of constructing a composite type is


known as composition.
Composite Data Types
Stack and Queue / Slide 18

 A struct is C's and C++'s notion of a
composite type.
 A data type that composes a fixed set of
labeled fields or members.
 It is so called because of the struct keyword
used in declaring them, which is short for
structure or, more precisely, user-defined
data structure.

struct Account
{
    int account_number;
    char *first_name;
    char *last_name;
    float balance;
};
Stack and Queue / Slide 19

Abstract Data Type


 An Abstract Data type is defined as a mathematical model of the data
objects that make up a data type as well as the functions that operate on
these objects

 An abstract data type is defined indirectly, only by the operations


that may be performed on it and by mathematical constraints on
the effects (and possibly cost) of those operations.

 For example, an abstract stack data structure could be defined


by two operations:
 push, that inserts some data item into the structure,
 pop, that extracts an item from it;
 with the constraint that each pop always returns the most recently
pushed item that has not been popped yet.
Stack and Queue / Slide 20

The Need for Data Structures


 Goal: to organize data

 Criteria: to facilitate efficiency in the:


storage of data
retrieval of data
manipulation of data
Stack and Queue / Slide 21

Data Structure Operations


 Traversing
 Accessing each record exactly once so that certain items in the
record may be processed
 Searching
 Finding the location of the record with the given key value or finding
the location of all records which satisfy one or more conditions
 Insertion
 Adding a new record to the structure
 Deletion
 Removing a record from the structure

 Sorting
 Arrange the records in a logical order
 Merging
 Combining records from two or more files or data structures into one
Data Structures and Algorithms
Program Efficiency & Complexity Analysis
of Algorithms
Stack and Queue / Slide 23

What is an Algorithm?
 An algorithm is a definite procedure for solving a
problem in a finite number of steps.

 An algorithm is a well-defined computational procedure
that takes some value(s) as input and produces
some value(s) as output.

 An algorithm is a finite number of computational
statements that transform the input into the output.

 Algorithm Definition: a finite set of statements
that guarantees an optimal solution in a finite
interval of time.
Stack and Queue / Slide 24

Algorithm
 An algorithm is a set of instructions to be followed
to solve a problem.
 There can be more than one solution (more than one
algorithm) to solve a given problem.
 An algorithm can be implemented using different
programming languages on different platforms.
 An algorithm must be correct: it should correctly
solve the problem for every valid input.
 e.g. For sorting, this means producing sorted output
even if (1) the input is already sorted, or (2) it
contains repeated elements.
 Once we have a correct algorithm for a problem,
we have to determine the efficiency of that
algorithm.
CENG 213 Data Structures 24
Stack and Queue / Slide 25

Favorite Algorithms
 Takes less memory (Space Efficient)
 Smaller execution time
 Smaller programming time
 Time complexity (most important)
Stack and Queue / Slide 26

Efficient Algorithms

 Consumes a smaller amount of resources while
solving a problem of size n
 Memory

 Time

 So do we just measure the processor time?


Stack and Queue / Slide 27

Measuring Efficiency
The efficiency of an algorithm is a measure of
the amount of resources consumed in solving a
problem of size n.

The resource we are most interested


in is time

We can use the same techniques to


analyze the consumption of other
resources, such as memory space.
Stack and Queue / Slide 28

Measuring Efficiency
 It would seem that the most obvious way to
measure the efficiency of an algorithm is to
run it and measure how much processor time
is needed

 Is it correct??
Stack and Queue / Slide 29

Real Time Execution Vs. Algorithm Complexity

 The same algorithm running on different
processors doesn't take the same time. Why?

Processor speed
Processor type
Programming language
Quality of compiler
Size and nature of input
Operating system
Stack and Queue / Slide 30

Theoretical Analysis
 Uses a high-level description of the algorithm
instead of an implementation

 Characterizes running time as a function of the


input size, N.

 Taking into account all possible inputs allows us
to evaluate the speed of an algorithm
independently of the hardware/software
environment
Stack and Queue / Slide 31

Running Time of an Algorithm


 Factors affecting running time:
 Nature of input
 Number of inputs
 Number of steps/primitive operations

 Running time is measured in terms of number of


steps/primitive operations.

 Generally time grows with size of input, so running time of


an algorithm is usually measured as function of input size.

 Independent from machine, OS

 Would not vary from processor to processor as algorithm


steps would remain the same
Stack and Queue / Slide 32

Analyzing an Algorithm
 Finding running time of an Algorithm

 Running time is measured by number of


steps/primitive operations performed

 A step means an elementary operation such as
−, +, *, <, =, A[i], etc.

 We will measure number of steps taken in term of


size of input
Stack and Queue / Slide 33

Analysis of Algorithms

 Assume the input size to be N (or n)


 Primitive steps: +,-,*,/,= etc.
 What about loops and control
structures?
Stack and Queue / Slide 34

Simple Example (1)

// Input:  int A[N], array of N integers
// Output: sum of all numbers in array A

int Sum(int A[], int N)
{
    int s = 0;
    for (int i = 0; i < N; i++)
        s = s + A[i];
    return s;
}

How should we analyse this?

Stack and Queue / Slide 35

Simple Example (2)

// Input:  int A[N], array of N integers
// Output: sum of all numbers in array A

int Sum(int A[], int N)
{
    int s = 0;           // 1
    for (int i = 0;      // 2
         i < N;          // 3
         i++)            // 4
        s = s + A[i];    // 5, 6, 7
    return s;            // 8
}

Operations 1, 2, 8 execute once each; operations 3, 4, 5, 6, 7 execute
once per iteration of the for loop, i.e. N times each.

Total: 5N + 3
The complexity function of the algorithm is: f(N) = 5N + 3
Stack and Queue / Slide 36

Simple Example (3): Growth of 5N + 3


Estimated running time for different values of N:

N = 10 => 53 steps
N = 100 => 503 steps
N = 1,000 => 5003 steps
N = 1,000,000 => 5,000,003 steps

As N grows, the number of steps grow in


linear proportion to N for this function “Sum”
Stack and Queue / Slide 37

What Dominates in Previous Example?


 What about the +3 and the 5 in 5N + 3?
 As N gets large, the +3 becomes insignificant.
 The 5 is inaccurate anyway, as different operations require varying
amounts of time, and the coefficient carries no significant information.

 What is fundamental is that the time is linear in N.

Asymptotic Complexity: as N gets large, concentrate on the highest-order term:
 Drop lower-order terms such as +3
 Drop the constant coefficient of the highest-order term, i.e. 5
Stack and Queue / Slide 38

Asymptotic Complexity
 The 5N+3 time bound is said to "grow
asymptotically" like N

 This gives us an approximation of the complexity


of the algorithm

 Ignores lots of (machine dependent) details,


concentrate on the bigger picture
Stack and Queue / Slide 39

 The simplest example is, when considering a


function f(n), there is a need to describe its
properties when n becomes very large.

 Thus, if f(n) = n^2 + 3n, the term 3n becomes
insignificant compared to n^2 when n is very
large. The function f(n) is then said to be
"asymptotically equivalent to n^2 as n → ∞", and
this is written symbolically as f(n) ~ n^2
Stack and Queue / Slide 40

Relative rates of growth


 most algorithms' runtime can be expressed as a function of
the input size n: T(n)

 rate of growth: measure of how quickly the graph of a


function rises

 goal: distinguish between fast- and slow-growing functions

 we only care about very large input sizes


(for small sizes, almost any algorithm is fast enough)

 this helps us discover which algorithms will run more quickly or


slowly, for large input sizes
Stack and Queue / Slide 41

Time complexity
 In computer science, the time complexity of an algorithm
quantifies the amount of time taken by an algorithm to run
as a function of the size of the input to the problem.

 The time complexity of an algorithm is commonly


expressed using big O notation, which suppresses
multiplicative constants and lower order terms.

 When expressed this way, the time complexity is said to be


described asymptotically, i.e., as the input size goes to
infinity.
Stack and Queue / Slide 42

Time Complexity
 For example, if the time required by an algorithm on all inputs
of size n is at most 5n^3 + 3n, the asymptotic time complexity is
O(n^3).

 Time complexity is commonly estimated by counting the


number of elementary operations performed by the algorithm,
where an elementary operation takes a fixed amount of time to
perform.

 Thus the amount of time taken and the number of elementary


operations performed by the algorithm differ by at most a
constant factor.
Stack and Queue / Slide 43

Time Complexity
 Since an algorithm may take a different amount
of time even on inputs of the same size, the
most commonly used measure of time
complexity, the worst-case time complexity of
an algorithm, denoted as T(n), is the maximum
amount of time taken on any input of size n.

 Time complexities are classified by the nature


of the function T(n).
Stack and Queue / Slide 44

Big O Notation

 Big O notation is used in Computer Science to


describe the performance or complexity of an
algorithm.

 Big O specifically describes the worst-case


scenario, and can be used to describe the
execution time required or the space used
(e.g. in memory or on disk) by an algorithm.
Stack and Queue / Slide 45

Big O Notation
 describes the limiting behavior of a function
when the argument tends towards a particular
value or infinity, usually in terms of simpler
functions.

 Big O notation characterizes functions


according to their growth rates: different
functions with the same growth rate may be
represented using the same O notation.
Stack and Queue / Slide 46

Constant Time (O(1))

 O(1) describes an algorithm that will always execute in the same


time (or space) regardless of the size of the input data set.

bool IsFirstElementNull(String[] strings)
{
    if (strings[0] == null)
    {
        return true;
    }
    return false;
}
Stack and Queue / Slide 47

Constant Time (O(1))


 An algorithm is said to be constant time (also
written as O(1) time) if the value of T(n) is
bounded by a value that does not depend on
the size of the input.

 For example, accessing any single element in


an array takes constant time as only one
operation has to be performed to locate it
Stack and Queue / Slide 48

linear time O(N)

 O(N) describes an algorithm whose performance will


grow linearly and in direct proportion to the size of the
input data set.

 The example below also demonstrates how Big O


favours the worst-case performance scenario; a
matching string could be found during any iteration of
the for loop and the function would return early, but Big
O notation will always assume the upper limit where the
algorithm will perform the maximum number of
iterations.
Stack and Queue / Slide 49

linear time O(N)


bool ContainsValue(String[] strings, String value)
{
    for (int i = 0; i < strings.Length; i++)
    {
        if (strings[i] == value)
        {
            return true;
        }
    }
    return false;
}
Stack and Queue / Slide 50

linear time O(N)


 An algorithm is said to take linear time, or O(n) time, if its time
complexity is O(n).

 Informally, this means that for large enough input sizes the
running time increases linearly with the size of the input.

 For example, a procedure that adds up all elements of a list


requires time proportional to the length of the list. This
description is slightly inaccurate, since the running time can
significantly deviate from a precise proportionality, especially for
small values of n.
Stack and Queue / Slide 51

Quadratic time O(N^2)

 O(N^2) represents an algorithm whose
performance is directly proportional to the
square of the size of the input data set.

 This is common with algorithms that involve
nested iterations over the data set.

 Deeper nested iterations will result in O(N^3),
O(N^4), etc.
Stack and Queue / Slide 52

Quadratic time O(N^2)

bool ContainsDuplicates(String[] strings)
{
    for (int i = 0; i < strings.Length; i++)
    {
        for (int j = 0; j < strings.Length; j++)
        {
            if (i == j)   // Don't compare with self
            {
                continue;
            }
            if (strings[i] == strings[j])
            {
                return true;
            }
        }
    }
    return false;
}
Stack and Queue / Slide 53

Exponential time O(2^N)

 O(2^N) denotes an algorithm whose running time
will double with each additional element in the
input data set.

 The execution time of an O(2^N) function will
quickly become very large.
Stack and Queue / Slide 54

Exponential time O(2^N)


 Finding the (exact) solution to the traveling
salesman problem using dynamic
programming;

 determining if two logical statements are


equivalent using brute-force search
Stack and Queue / Slide 55

Logarithmic time
O(log N)

 An algorithm is said to take logarithmic time if T(n)
= O(log n). Due to the use of the binary numeral
system by computers, the logarithm is frequently
base 2 (that is, log_2 n, sometimes written lg n).

 However, by the change-of-base equation for
logarithms, log_a n and log_b n differ only by a constant
multiplier, which in big-O notation is discarded; thus
O(log n) is the standard notation for logarithmic time
algorithms regardless of the base of the logarithm.
Stack and Queue / Slide 56

Logarithmic time
O(log N).

 Algorithms taking logarithmic time are


commonly found in operations on binary trees
or when using binary search.

 Finding an item in a sorted array with a binary


search or a balanced search tree as well as
all operations in a Binomial heap.
Stack and Queue / Slide 57

Big-Oh Rules
 If f(n) is a polynomial of degree d, then f(n) is
O(n^d), i.e.,
1. Drop lower-order terms
2. Drop constant factors

 Use the smallest possible class of functions
Say "2n is O(n)" instead of "2n is O(n^2)"

 Use the simplest expression of the class
Say "3n + 5 is O(n)" instead of "3n + 5 is O(3n)"
Stack and Queue / Slide 58

The Big-Oh Notation


 Given functions f(n) and g(n), we say that f(n) is
O(g(n)) if there are positive constants c and n0
such that f(n) ≤ cg(n) for n ≥ n0

 Example: 2n + 10 is O(n)
 2n + 10 ≤ cn
 (c − 2) n ≥ 10
 n ≥ 10/(c − 2)
 Pick c = 3 and n0 = 10
Stack and Queue / Slide 59

The Big-Oh Notation


 Big-O notation is one of the ways in which
we talk about how complex an algorithm or
program is.

 It gives us a nice way of quantifying or


classifying how fast or slow a program is
as a function of the size of its input.
Stack and Queue / Slide 60

Big-Oh Example
 The function n^2 is not O(n)
 n^2 ≤ cn would require n ≤ c
 This inequality cannot be satisfied for all
n ≥ n0, since c must be a constant
Stack and Queue / Slide 61

More Big-Oh Examples


 7n − 2
 7n − 2 is O(n)
 need c > 0 and n0 ≥ 1 such that 7n − 2 ≤ c·n for n ≥ n0
 this is true for c = 7 and n0 = 1

 3n^3 + 20n^2 + 5
 3n^3 + 20n^2 + 5 is O(n^3)
 need c > 0 and n0 ≥ 1 such that 3n^3 + 20n^2 + 5 ≤ c·n^3
for n ≥ n0
 this is true for c = 4 and n0 = 21
Stack and Queue / Slide 62

Orders of common functions

Name               Running time T(n)   Example running times   Example algorithms
constant time      O(1)                10                      Determining if a number is even or odd
linear time        O(n)                n                       Finding the smallest item in an unsorted array
quadratic time     O(n^2)              n^2                     Bubble sort; insertion sort
logarithmic time   O(log n)            log n, log(n^2)         Binary search
exponential time   2^O(n)              1.1^n, 10^n             Solving the traveling salesman problem using dynamic programming
Hierarchy of Big-Oh
Stack and Queue / Slide 63

 Functions, ranked in increasing order of
growth:
 1
 log log n
 log n
 n
 n log n
 n^2
 n^2 log n
 n^3
 ...
 2^n
 n!
 n^n
Stack and Queue / Slide 64

Growth-Rate Functions
O(1)           Time requirement is constant, independent of the problem's size.
O(log_2 n)     Time requirement for a logarithmic algorithm increases slowly
               as the problem size increases.
O(n)           Time requirement for a linear algorithm increases directly with
               the size of the problem.
O(n·log_2 n)   Time requirement for an n·log_2 n algorithm increases more
               rapidly than for a linear algorithm.
O(n^2)         Time requirement for a quadratic algorithm increases rapidly
               with the size of the problem.
O(n^3)         Time requirement for a cubic algorithm increases more rapidly
               with the size of the problem than for a quadratic algorithm.
O(2^n)         As the size of the problem increases, the time requirement for
               an exponential algorithm increases too rapidly to be practical.



Stack and Queue / Slide 65

Growth-Rate Functions
 If an algorithm takes 1 second to run with the problem size 8,
what is the time requirement (approximately) for that algorithm
with the problem size 16?
 If its order is:
O(1)          T(n) = 1 second
O(log_2 n)    T(n) = (1·log_2 16) / log_2 8 = 4/3 seconds
O(n)          T(n) = (1·16) / 8 = 2 seconds
O(n·log_2 n)  T(n) = (1·16·log_2 16) / (8·log_2 8) = 8/3 seconds
O(n^2)        T(n) = (1·16^2) / 8^2 = 4 seconds
O(n^3)        T(n) = (1·16^3) / 8^3 = 8 seconds
O(2^n)        T(n) = (1·2^16) / 2^8 = 2^8 seconds = 256 seconds



Stack and Queue / Slide 66

Best, worst and average case

 In computer science, best, worst and


average cases of a given algorithm express
what the resource usage is at least, at most
and on average, respectively.

 Usually the resource being considered is


running time, but it could also be memory or
other resources.
Stack and Queue / Slide 67

Best, worst and average case


 In real-time computing, the worst-case
execution time is often of particular concern
since it is important to know how much time
might be needed in the worst case to guarantee
that the algorithm will always finish on time.

 Average performance and worst-case


performance are the most used in algorithm
analysis.
Stack and Queue / Slide 68

Best, worst and average case


 Less widely found is best-case performance,
but it does have uses: for example, where the
best cases of individual tasks are known, they
can be used to improve the accuracy of an
overall worst-case analysis
Stack and Queue / Slide 69

Best, worst and average case


 Worst-case analysis has its own difficulties: it is typically impossible
to determine the exact worst-case scenario. Instead, a scenario is
considered that is at least as bad as the worst case.

 For example, when analysing an algorithm, it may be possible to


find the longest possible path through the algorithm (by considering
the maximum number of loops, for instance) even if it is not
possible to determine the exact input that would generate this path
(indeed, such an input may not exist).

 This gives a safe analysis (the worst case is never underestimated),


but one which is pessimistic, since there may be no input that would
require this path.
Stack and Queue / Slide 70

Any Questions?
