Data Structures and Algorithms BBIT 2.2 L1
Jomo Kenyatta University of Agriculture and Technology
LECTURE 1
Introduction, efficiency and complexity
Definition
An algorithm for a particular task can be defined as “a finite sequence of instructions,
each of which has a clear meaning and can be performed with a finite amount of effort in
a finite length of time”. As such, an algorithm must be precise enough to be understood
by human beings.
However, in order to be executed by a computer, an algorithm will generally need to be expressed as a program written in a formal programming language.
It is also worth bearing in mind the distinction between different programming
paradigms: Imperative Programming describes computation in terms of
instructions that change the program/data state, whereas Declarative Programming
specifies what the program should accomplish without describing how to do it.
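The contrast between the two paradigms can be sketched in Python (the language is our choice here; the lecture itself gives no code) on a single task, summing the squares of the numbers 1 to n:

```python
def sum_squares_imperative(n):
    # Imperative: step-by-step instructions that mutate program state.
    total = 0
    for i in range(1, n + 1):
        total += i * i
    return total

def sum_squares_declarative(n):
    # Declarative flavour: state *what* is wanted, not how to loop;
    # the built-in sum() hides the mechanics of iteration.
    return sum(i * i for i in range(1, n + 1))

print(sum_squares_imperative(4))   # 1 + 4 + 9 + 16 = 30
print(sum_squares_declarative(4))  # 30
```

Both functions satisfy the same specification; only the style of describing the computation differs.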
Fundamental questions about algorithms
Given an algorithm to solve a particular problem, we are naturally led to ask:
1. What is it supposed to do?
2. Does it really do what it is supposed to do?
3. How efficiently does it do it?
The technical terms normally used for these three aspects are:
1. Specification.
2. Verification.
3. Performance analysis.
The specification should formalize the crucial details of the problem that the algorithm
is intended to solve. Sometimes that will be based on a particular representation of the
associated data, and sometimes it will be presented more abstractly. Typically, it will
have to specify how the inputs and outputs of the algorithm are related, though there is
no general requirement that the specification be complete or unambiguous.
For more complicated specifications and/or algorithms, the fact that an algorithm
satisfies its specification may not be obvious at all. In this case, we need to spend some
effort verifying whether the algorithm is indeed correct.
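The three aspects can be made concrete with a small hypothetical example (the function and its spec are ours, not the lecture's). The specification states what `largest` must return; the assertion at the end is a partial verification, checking the spec on one input, whereas a full proof would argue about every input:

```python
# Specification: for a non-empty list xs, largest(xs) returns a value m
# such that m occurs in xs and m >= x for every x in xs.

def largest(xs):
    """Return the largest element of the non-empty list xs."""
    m = xs[0]
    for x in xs[1:]:
        if x > m:
            m = x
    return m

# Partial verification: check the specification holds on a sample input.
xs = [3, 1, 4, 1, 5]
m = largest(xs)
assert m in xs and all(m >= x for x in xs)
```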
Dynamic and static arrays
A dynamic array is an array that can be changed in size after it is created. A static array
is an array whose size is set when it is created and cannot be changed.
Dynamic arrays are often used because they can be changed as needed, which can be
helpful when the size of the data set is unknown or may change over time. Static arrays
are often used when the size of the data set is known in advance and will not change.
Both dynamic and static arrays have their advantages and disadvantages. Dynamic
arrays are more flexible, but they can be more difficult to manage. Static arrays are
easier to manage, but they are not as flexible.
It is important to choose the right type of array for the task at hand. If the size of the
data set is known in advance and will not change, then a static array may be the best
choice. If the size of the data set is unknown or may change over time, then a dynamic
array may be the best choice.
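As a rough sketch (Python is our choice of language): Python's built-in list behaves as a dynamic array, growing as elements arrive, while a static array can be simulated by a fixed-size list that is only ever indexed, never resized:

```python
# Dynamic array: size unknown in advance, grows as needed.
dynamic = []
for value in [10, 20, 30]:
    dynamic.append(value)

# Simulated static array: size fixed at creation time.
STATIC_SIZE = 3
static = [0] * STATIC_SIZE
for i, value in enumerate([10, 20, 30]):
    static[i] = value    # writing to an index >= STATIC_SIZE would raise IndexError

print(dynamic)  # [10, 20, 30]
print(static)   # [10, 20, 30]
```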
Invariants
An invariant, as the name suggests, is a condition that does not change during execution
of a given program or algorithm. It may be a simple inequality, such as “i < 20”, or
something more abstract, such as “the items in the array are sorted”.
Invariants are important for data structures and algorithms because they enable
correctness proofs and verification. In particular, a loop-invariant is a condition that is
true at the beginning and end of every iteration of the given loop.
We will also see how invariants (sometimes called inductive assertions) can be used
to formulate similar correctness proofs concerning properties of data structures that are
defined inductively.
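A loop-invariant can be made explicit with assertions, as in this small sketch (the function is ours, for illustration). The invariant "total equals the sum of xs[0:i]" holds at the start and end of every iteration, which is the key step in a correctness proof of the loop:

```python
def running_sum(xs):
    total = 0
    i = 0
    while i < len(xs):
        assert total == sum(xs[:i])   # invariant holds at the start of the iteration
        total += xs[i]
        i += 1
        assert total == sum(xs[:i])   # invariant restored at the end of the iteration
    return total

print(running_sum([1, 2, 3, 4]))  # 10
```

When the loop terminates, i == len(xs), so the invariant directly gives total == sum(xs), i.e. the function is correct.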
Efficiency and Complexity
The time complexity of an algorithm indicates how its execution time depends on the size of the input data structure. Similarly, its space complexity indicates how its memory requirement depends on the size of the input data structure.
Worst versus average complexity
Another thing that has to be decided when making efficiency considerations is whether
it is the average case performance of an algorithm/program that is important, or
whether it is more important to guarantee that even in the worst case the performance
obeys certain rules. For many applications, the average case is more important, because
saving time overall is usually more important than guaranteeing good behaviour in the
worst case. However, for time-critical problems, such as keeping track of aeroplanes in
certain sectors of air space, it may be totally unacceptable for the software to take too
long if the worst case arises.
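The distinction can be illustrated (our own sketch, not from the lecture) by counting comparisons in a linear search over n items: the worst case, when the target is absent, costs n comparisons, while the average over all possible positions is about n/2:

```python
def linear_search(xs, target):
    # Returns the number of comparisons performed before finding target
    # (or scanning the whole list if it is absent).
    comparisons = 0
    for x in xs:
        comparisons += 1
        if x == target:
            return comparisons
    return comparisons

xs = list(range(100))
worst = linear_search(xs, -1)                             # target absent: 100 comparisons
average = sum(linear_search(xs, t) for t in xs) / len(xs)  # mean over all positions
print(worst, average)   # 100 50.5
```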
Big-O notation for complexity class
Very often, we are not interested in the actual function C(n) that describes the time
complexity of an algorithm in terms of the problem size n, but just its complexity class.
This ignores any constant overheads and small constant factors, and just tells us about
the principal growth.
Big O notation (with a capital letter O, not a zero), also called Landau's symbol, is a
symbolism used in complexity theory, computer science, and mathematics to describe
the asymptotic behavior of functions. Basically, it tells you how fast a function grows or
declines. Landau's symbol is named after the German number theoretician
Edmund Landau, who popularised the notation. The letter O is used because the rate of
growth of a function is also called its order. For example, when analyzing some
algorithm, one might find that the time (or the number of steps) it takes to complete a
problem of size n is given by T(n) = 4n^2 - 2n + 2. If we ignore constants (which makes
sense because those depend on the particular hardware the program is run on) and
slower-growing terms, we could say "T(n) grows at the order of n^2" and write
T(n) = O(n^2).
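A quick numerical check (our own sketch) makes this plausible: for T(n) = 4n^2 - 2n + 2, the ratio T(n)/n^2 approaches the constant 4 as n grows, which is exactly why the lower-order terms and constant factors are dropped:

```python
def T(n):
    # The example running-time function from the text.
    return 4 * n**2 - 2 * n + 2

for n in [10, 100, 1000]:
    # The ratio tends to 4, the leading coefficient.
    print(n, T(n) / n**2)
```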
Here is a list of classes of functions that are commonly encountered when analyzing
algorithms. The slower growing functions are listed first. c is some arbitrary constant.
1. O(1) constant
2. O(log(n)) logarithmic
3. O(n) linear
4. O(n log(n)) "n log n"
5. O(n^2) quadratic
6. O(n^c) polynomial
7. O(c^n) exponential
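To make the growth rates concrete, here is a small sketch (our own, using base-2 logarithms as an arbitrary choice; the base only changes a constant factor) that tabulates rough step counts for some of the classes above at a few input sizes:

```python
import math

n_values = [10, 100, 1000]
classes = {
    "O(1)":       lambda n: 1,
    "O(log n)":   lambda n: math.log2(n),
    "O(n)":       lambda n: n,
    "O(n log n)": lambda n: n * math.log2(n),
    "O(n^2)":     lambda n: n ** 2,
}
for name, f in classes.items():
    # Round the counts for readability; the ordering of the rows
    # matches the slower-growing-first ordering of the list above.
    print(name, [round(f(n)) for n in n_values])
```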