Data Structures- Recap
What is the "Data Structure" ?
Ways to represent data
Why data structure ?
To design and implement large-scale computer
system
Have proven correct algorithms
The art of programming
How to master in data structure ?
practice, discuss, and think
System Life Cycle
Summary
RADRCV
Requirements
What inputs, functions, and outputs
Analysis
Break the problem down into manageable pieces
Top-down approach
Bottom-up approach
System Life Cycle(Cont.)
Design
Create abstract data types and the algorithm
specifications, language independent
Refinement and Coding
Determining data structures and algorithms
Verification
Developing correctness proofs, testing the
program, and removing errors
Verification
Correctness proofs
Prove program mathematically
time-consuming and difficult to develop for large
system
Testing
Verify that every piece of code runs correctly
provide data including all possible scenarios
Error removal
Guarantee no new errors generated
Notes
Select a proven correct algorithm is important
Initial tests focus on verifying that a program runs correctly, then
reduce the running time
What is the module all about?
Design and Analysis of Algorithms
What is an Algorithm?
What is the module all about?
The theoretical study of design and analysis of
computer algorithms
Basic goals for an algorithm:
• always correct
• always terminates
• This class: performance
Performance often draws the line between
what is possible and what is impossible.
Analysis of Algorithms
Input Algorithm Output
An algorithm is a step-by-step procedure for
solving a problem in a finite amount of time.
Design and Analysis of Algorithms
• Analysis: predict the cost of an algorithm in terms of
resources and performance
• Design: design algorithms which minimize the cost
Algorithm Specification
Definition
An algorithm is a finite set of instructions that, if followed,
accomplishes a particular task. In addition, all algorithms
must satisfy the following criteria:
(1)Input.
(2)Output
(3)Definiteness
(4)Finiteness
(5)Effectiveness.
Meaning of the above criterions?????
Algorithm Specification
Definition
An algorithm is a finite set of instructions that, if followed,
accomplishes a particular task. In addition, all algorithms must satisfy
the following criteria:
(1)Input. There are zero or more quantities that are externally supplied.
(2)Output. At least one quantity is produced.
(3)Definiteness. Each instruction is clear and unambiguous.
(4)Finiteness. If we trace out the instructions of an algorithm, then for
all cases, the algorithm terminates after a finite number of steps.
(5)Effectiveness. Every instruction must be basic enough to be carried
out, in principle, by a person using only pencil and paper. It is not
enough that each operation be definite as in (3); it also must be
feasible.
Describing Algorithms
Natural language
English, Chinese
Instructions must be definite and effectiveness
Graphic representation
Flowchart
work well only if the algorithm is small and simple
Pseudo language
Readable
Instructions must be definite and effectiveness
Combining English and C++, Python
Translating a Problem into an Algorithm
Problem
Devise a program that sorts a set of n>= 1
integers.
How can we generate an algorithm?
Translating a Problem into an Algorithm
Problem
Devise a program that sorts a set of n>= 1 integers
Step I - Concept
From those integers that are currently unsorted, find the
smallest and place it next in the sorted list
Step II - Algorithm
for (i= 0; i< n; i++) {
Examine list[i] to list[n-1] and suppose that the
smallest integer is list[min];
Interchange list[i] and list[min];
}
Translating a Problem into an Algorithm(Cont.)
Step III - Coding
void sort(int *a, int n)
{
for (i= 0; i< n; i++)
{
int j= i;
for (int k= i+1; k< n; k++){
if (a[k ]< a[ j]) j= k;
int temp=a[i]; a[i]=a[ j]; a[ j]=temp;
}
}
Correctness Proof
Theorem
Function sort(a, n) correctly sorts a set of n>= 1
integers. The result remains in a[0], ..., a[n-1]
such that a[0]<= a[1]<=...<=a[n-1].
Proof:
For i= q, following the execution of line 6-11, we
have a[q]<= a[r], q< r< =n-1.
For i> q, observing, a[0], ..., a[q] are unchanged.
Hence, increasing i, for i= n-2, we have
a[0]<= a[1]<= ...<=a[n-1]
Analysis of algorithms
Issues:
correctness
time efficiency
space efficiency
optimality
Approaches:
theoretical analysis
empirical analysis
16
Running Time
Most algorithms transform best case
input objects into output average case
worst case
objects. 120
The running time of an 100
algorithm typically grows
Running Time
80
with the input size.
60
Average case time is often
difficult to determine. 40
We focus on the worst case 20
running time. 0
1000 2000 3000 4000
Easier to analyze Input Size
Crucial to applications such as
games, finance and robotics
Analysis of Algorithms 17
Experimental Studies
Write a program 9000
implementing the 8000
algorithm 7000
Run the program with 6000
Time (ms)
inputs of varying size and 5000
composition 4000
Use a method like 3000
System.currentTimeMillis() to 2000
get an accurate measure
1000
of the actual running time
0
Plot the results 0 50 100
Input Size
Analysis of Algorithms 18
Limitations of Experiments
It is necessary to implement the
algorithm, which may be difficult
Results may not be indicative of the
running time on other inputs not included
in the experiment.
In order to compare two algorithms, the
same hardware and software
environments must be used
Analysis of Algorithms 19
Theoretical Analysis
Uses a high-level description of the
algorithm instead of an implementation
Characterizes running time as a function
of the input size, n.
Takes into account all possible inputs
Allows us to evaluate the speed of an
algorithm independent of the
hardware/software environment
Analysis of Algorithms 20
Theoretical analysis of time
efficiency
Time efficiency is analyzed by determining
the number of repetitions of the basic
operation as a function of input size
Basic operation: the operation that
contributes most towards the running time
of the algorithm
input size
T(n) ≈ copC(n)
running time execution time Number of times
for basic operation basic operation is
executed 21
Input size and basic operation examples
Problem Input size measure Basic operation
Searching for key in a Number of list’s items, i.e.
Key comparison
list of n items n
Multiplication of two Matrix dimensions or total
??
matrices number of elements
Checking primality of a n’size = number of digits
??
given integer n (in binary representation)
Typical graph problem #vertices and/or edges ??
22
Input size and basic operation examples
Problem Input size measure Basic operation
Searching for key in a Number of list’s items, i.e.
Key comparison
list of n items n
Multiplication of two Matrix dimensions or total Multiplication of two
matrices number of elements numbers
Checking primality of a n’size = number of digits
Division
given integer n (in binary representation)
Visiting a vertex or
Typical graph problem #vertices and/or edges
traversing an edge
23
Empirical analysis of time efficiency
Select a specific (typical) sample of inputs
Use physical unit of time (e.g., milliseconds)
or
Count actual number of basic operation’s
executions
Analyze the empirical data
24
Best-case, average-case, worst-case
For some algorithms efficiency depends on form of input:
Worst case: Cworst(n) – maximum over inputs of size n
Best case: Cbest(n) – minimum over inputs of size n
Average case: Cavg(n) – “average” over inputs of size n
Number of times the basic operation will be executed on
typical input
Expected number of basic operations considered as a
random variable under some assumption about the
probability distribution of all possible inputs
25
Pseudocode
High-level description Example: find max
of an algorithm element of an array
More structured than Algorithm arrayMax(A, n)
English prose Input array A of n integers
Less detailed than a Output maximum element of A
program
Preferred notation for currentMax A[0]
describing algorithms for i 1 to n 1 do
Hides program design if A[i] currentMax then
issues currentMax A[i]
return currentMax
Analysis of Algorithms 26
Pseudocode Details
Control flow Method call
if … then … [else …] var.method (arg [, arg…])
while … do … Return value
repeat … until … return expression
for … do … Expressions
Indentation replaces braces Assignment
(like in Java)
Method declaration Equality testing
Algorithm method (arg [, arg…]) (like in Java)
Input … n2 Superscripts and other
Output … mathematical
formatting allowed
Analysis of Algorithms 27
The Random Access Memory
(RAM) Model
A CPU
An potentially unbounded
bank of memory cells, 2
1
each of which can hold an 0
arbitrary number or
character
Memory cells are numbered and accessing
any cell in memory takes unit time.
Analysis of Algorithms 28
Primitive Operations
Basic computations
Examples:
performed by an algorithm Evaluating an
Identifiable in pseudocode expression
Largely independent from the Assigning a value
to a variable
programming language Indexing into an
Assumed to take a constant array
amount of time in the RAM Calling a method
model Returning from a
method
Analysis of Algorithms 30
Growth Rate of Running Time
Changing the hardware/ software
environment
Affects T(n) by a constant factor, but
Does not alter the growth rate of T(n)
The linear growth rate of the running
time T(n) is an intrinsic property of
algorithm arrayMax
Analysis of Algorithms 31
Seven Important Functions
Seven functions that
often appear in 1E+29
algorithm analysis: 1E+27 Cubic
Constant 1 1E+25 Quadratic
1E+23
Logarithmic log n 1E+21 Linear
Linear n 1E+19
1E+17
N-Log-N n log n
T(n)
1E+15
Quadratic n2 1E+13
Cubic n3 1E+11
1E+9
Exponential 2n 1E+7
1E+5
In a log-log chart, the 1E+3
1E+1
slope of the line 1E-1
corresponds to the 1E-1 1E+2 1E+5 1E+8
growth rate of the n
function
Analysis of Algorithms 32
Constant Factors
1E+25
The growth rate is 1E+23
Quadratic
not affected by 1E+21
Quadratic
Linear
1E+19
constant factors or 1E+17 Linear
lower-order terms 1E+15
T(n)
1E+13
Examples 1E+11
102n 105 is a linear 1E+9
1E+7
function 1E+5
105n2 108n is a 1E+3
quadratic function 1E+1
1E-1
1E-1 1E+2 1E+5 1E+8
n
Analysis of Algorithms 33
Math you need to Review
Summations
Logarithms and Exponents
properties of logarithms:
logb(xy) = logbx + logby
logb (x/y) = logbx - logby
logbxa = alogbx
logba = logxa/logxb
properties of exponentials:
a(b+c) = aba c
Proof techniques abc = (ab)c
Basic probability ab /ac = a(b-c)
b = a logab
bc = a c*logab
Analysis of Algorithms 34