Week 6
Introduction to Information Technology
Introduction to Algorithms
An algorithm is a well-defined computational procedure that takes a value, or a set of values, as input, performs a sequence
of computational steps, and produces some value, or set of values, as output. Algorithms are used for solving computational
problems.
An algorithm is an effective method, expressed in a well-defined formal language, for calculating a function; it can be
executed within a finite amount of space and time.
An example of an algorithm is the Euclidean algorithm, which is used to determine the greatest common divisor (GCD) of
two integers.
Figure 1 - Flow chart that shows the algorithm for calculating the Greatest Common Divisor (GCD)
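As an illustration, here is a minimal Python sketch of the algorithm in the flow chart (the function name gcd is our own choice):

    def gcd(a, b):
        # Euclidean algorithm: repeatedly replace (a, b) with (b, a mod b)
        # until the remainder is zero; the last nonzero value is the GCD.
        while b != 0:
            a, b = b, a % b
        return a

    print(gcd(48, 36))  # prints 12

For example, gcd(48, 36) computes 48 mod 36 = 12, then 36 mod 12 = 0, so the answer is 12.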
There are various ways to classify algorithms, each with its own merits.
Classification by purpose
Classification by implementation
Classification by design paradigm
Classification by complexity
Classification by purpose
Each algorithm has a goal; for example, the purpose of the Quick Sort algorithm is to sort data in ascending or descending
order. Because the number of possible goals is essentially unlimited, we group algorithms by kind of purpose.
Classification by implementation
Recursive or iterative
A recursive algorithm is one that invokes (refers to) itself repeatedly until a certain condition (also known as the termination
condition) is met, an approach common to functional programming. Iterative algorithms use repetitive constructs such as
loops, and sometimes additional data structures such as stacks, to solve the given problems. Some problems are naturally
suited to one implementation or the other.
For example, the Towers of Hanoi puzzle is well understood using a recursive implementation. Every recursive version has
an equivalent (though possibly more or less complex) iterative version, and vice versa.
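As a small illustration (a sketch of ours, not part of the original notes), the factorial function written both ways in Python:

    def factorial_recursive(n):
        # Termination condition: 0! = 1
        if n == 0:
            return 1
        return n * factorial_recursive(n - 1)

    def factorial_iterative(n):
        # A loop replaces the chain of recursive calls.
        result = 1
        for i in range(2, n + 1):
            result *= i
        return result

    assert factorial_recursive(6) == factorial_iterative(6) == 720

Both versions compute the same function; the recursive one mirrors the mathematical definition, while the iterative one avoids the overhead of repeated function calls.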
Logical
A logic component expresses the axioms which may be used in the computation and a control component determines the
way in which deduction is applied to the axioms.
This is the basis of logic programming. In pure logic programming languages the control component is fixed, and
algorithms are specified by supplying only the logic component.
The appeal of this approach is the elegant semantics: a change in the axioms has a well-defined change in the algorithm.
Serial, parallel or distributed
Algorithms are usually discussed with the assumption that computers execute one instruction of an algorithm at a time.
Those computers are sometimes called serial computers. An algorithm designed for such an environment is called a serial
algorithm, as opposed to parallel algorithms or distributed algorithms. Parallel algorithms take advantage of computer
architectures where several processors can work on a problem at the same time, whereas distributed algorithms utilize
multiple machines connected by a network. Parallel or distributed algorithms divide the problem into symmetric or
asymmetric subproblems and collect the results back together. The resource consumption of such algorithms includes not
only processor cycles on each processor but also the communication overhead between the processors. Some sorting
algorithms can be parallelized efficiently, but their communication overhead is expensive. Iterative algorithms are generally
parallelizable. Some problems have no parallel algorithms and are called inherently serial problems.
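As a rough sketch of this divide-and-collect idea (assuming Python's standard concurrent.futures module; the chunk size is an arbitrary choice of ours), a sum computed over four subproblems in parallel:

    from concurrent.futures import ProcessPoolExecutor

    def partial_sum(chunk):
        return sum(chunk)

    if __name__ == "__main__":
        data = list(range(1_000_000))
        # Divide the problem into subproblems (chunks of the data)...
        chunks = [data[i:i + 250_000] for i in range(0, len(data), 250_000)]
        with ProcessPoolExecutor() as pool:
            # ...solve them on separate processors, then collect the results.
            total = sum(pool.map(partial_sum, chunks))
        print(total == sum(data))  # True

The communication overhead mentioned above shows up here as the cost of sending each chunk to a worker process and receiving the partial sums back.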
Deterministic or non-deterministic
Deterministic algorithms solve the problem with an exact decision at every step, whereas non-deterministic algorithms solve
problems by guessing, although typical guesses are made more accurate through the use of heuristics.
Classification by design paradigm
A design paradigm is a domain in research or a class of problems that requires a dedicated kind of algorithm:
Dynamic programming
The shortest path in a weighted graph can be found by using the shortest path to the goal from all adjacent vertices.
When the optimal solution to a problem can be constructed from optimal solutions to subproblems, using dynamic
programming avoids recomputing solutions that have already been computed.
- The main difference from the "divide and conquer" approach is that subproblems are independent in divide and conquer,
whereas subproblems overlap in dynamic programming.
- Dynamic programming and memoization go together. The difference from straightforward recursion is the caching, or
memoization, of recursive calls. Where subproblems are independent, this caching is useless. By using memoization, or
maintaining a table of subproblems already solved, dynamic programming reduces the exponential nature of many problems
to polynomial complexity.
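A classic small example (our sketch, using Python's standard functools.lru_cache as the memoization table) is the Fibonacci sequence, whose naive recursion recomputes the same overlapping subproblems exponentially many times:

    from functools import lru_cache

    @lru_cache(maxsize=None)  # caches (memoizes) the result of each call
    def fib(n):
        if n < 2:
            return n
        # Overlapping subproblems are looked up in the cache, not recomputed.
        return fib(n - 1) + fib(n - 2)

    print(fib(60))  # 1548008755920; naive recursion would take far too long

With memoization, each value fib(0) through fib(n) is computed only once, so the running time drops from exponential to linear.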
Linear programming
The problem is expressed as a set of linear inequalities, and an attempt is then made to maximize or minimize a linear
objective function of the inputs. Many problems, such as finding the maximum flow in directed graphs, can be solved this
way, notably by using the simplex algorithm.
A more difficult variant of linear programming is called integer programming, where the solution space is restricted to the
integers.
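As a hedged sketch (assuming SciPy is installed; any LP solver would do), here is a tiny linear program solved in Python. We maximize x + 2y subject to x + y <= 4 and x <= 3, with x, y >= 0; since scipy.optimize.linprog minimizes, the objective is negated:

    from scipy.optimize import linprog

    result = linprog(c=[-1, -2],             # minimize -(x + 2y)
                     A_ub=[[1, 1], [1, 0]],  # x + y <= 4, x <= 3
                     b_ub=[4, 3],
                     bounds=[(0, None), (0, None)])
    print(result.x)     # optimal point, approximately [0, 4]
    print(-result.fun)  # optimal value, 8.0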
Using graphs
Many problems, such as playing chess, can be modeled as problems on graphs, and graph exploration algorithms are used to
solve them. This category also includes search algorithms and backtracking.
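For instance, breadth-first search is a standard graph exploration algorithm; the sketch below (our own, with a made-up four-vertex graph) visits every vertex reachable from a start vertex:

    from collections import deque

    def bfs(graph, start):
        # Breadth-first exploration; returns vertices in the order visited.
        visited, order = {start}, []
        queue = deque([start])
        while queue:
            vertex = queue.popleft()
            order.append(vertex)
            for neighbour in graph[vertex]:
                if neighbour not in visited:
                    visited.add(neighbour)
                    queue.append(neighbour)
        return order

    graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
    print(bfs(graph, "A"))  # ['A', 'B', 'C', 'D']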
Probabilistic
Probabilistic algorithms are those that make some choices randomly.
Genetic
Genetic algorithms attempt to find solutions to problems by mimicking biological evolutionary processes, with a cycle of
random mutations yielding successive generations of "solutions". Thus, they emulate reproduction and "survival of the
fittest".
Heuristic
Heuristic algorithms do not aim to find an optimal solution, but rather an approximate solution, for cases where the time or
resources needed to find a perfect solution are impractical.
Classification by complexity
Some algorithms complete in linear time, some complete in an exponential amount of time, and some never complete.
The majority of algorithms of interest operate on data, so particular ways of organizing data play a critical role in the design
and analysis of algorithms. A data structure can be defined as a particular scheme of organizing related data items.
There are a few data structures that have proved to be particularly important for computer algorithms.
Arrays
Linked lists
Lists
Stacks
Queues
Arrays
The two most important elementary data structures are the array and the linked list. A (one-dimensional) array is a sequence
of n items of the same data type that are stored contiguously in computer memory and made accessible by specifying a value
of the array’s index.
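In Python, the standard array module provides such a homogeneous, index-accessed sequence (a small illustration of ours):

    from array import array

    a = array('i', [10, 20, 30, 40])  # a sequence of ints of the same type
    print(a[2])  # 30: accessed in constant time by specifying the index
    a[2] = 35    # updated in constant time the same way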
Linked List
A linked list is a sequence of zero or more elements called nodes, each containing two kinds of information: some data and
one or more links called pointers to other nodes of the linked list. (A special pointer called “null” is used to indicate the
absence of a node’s successor.) In a singly linked list, each node except the last one contains a single pointer to the next
element.
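A minimal Python sketch of a singly linked list (the class name and variable names are our own):

    class Node:
        def __init__(self, data):
            self.data = data  # the data part of the node
            self.next = None  # pointer to the successor; None plays the role of "null"

    # Build the list 1 -> 2 -> 3, then traverse it from the head.
    head = Node(1)
    head.next = Node(2)
    head.next.next = Node(3)

    node = head
    while node is not None:
        print(node.data)
        node = node.next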
The array and linked list are two principal choices in representing a more abstract data structure called a linear list or simply
a list. A list is a finite sequence of data items, i.e., a collection of data items arranged in a certain linear order. The basic
operations performed on this data structure are searching for, inserting, and deleting an element.
Two special types of lists, stacks and queues, are particularly important. A stack is a list in which insertions and deletions
can be done only at the end. This end is called the top because a stack is usually visualized not horizontally but vertically—
akin to a stack of plates whose “operations” it mimics very closely. As a result, when elements are added to (pushed onto) a
stack and deleted from (popped off) it, the structure operates in a “last-in–first-out” (LIFO) fashion— exactly like a stack of
plates if we can add or remove a plate only from the top.
Stacks have a multitude of applications; in particular, they are indispensable for implementing recursive algorithms.
A real-world stack allows operations at one end only. For example, we can place or remove a card or plate only from the top
of the stack. Likewise, the stack ADT allows all data operations at one end only. At any given time, we can only access the
top element of a stack.
This feature makes it a LIFO data structure. LIFO stands for last-in, first-out. Here, the element that is placed (inserted or
added) last is accessed first. In stack terminology, the insertion operation is called push and the removal operation is called
pop.
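In Python, a plain list already behaves as a stack (a brief sketch):

    stack = []
    stack.append('a')   # push
    stack.append('b')   # push
    stack.append('c')   # push
    print(stack.pop())  # pop -> 'c': the last element in is the first out
    print(stack[-1])    # peek at the top without removing it -> 'b'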
A queue, on the other hand, is a list from which elements are deleted from one end of the structure, called the front (this
operation is called dequeue), and new elements are added to the other end, called the rear (this operation is called enqueue).
Consequently, a queue operates in a “first-in–first-out” (FIFO) fashion—akin to a queue of customers served by a single
teller in a bank. Queues also have many important applications, including several algorithms for graph problems. One
important variation of a queue is called a priority queue. A priority queue acts like a queue in that you dequeue an item by
removing it from the front. However, in a priority queue the logical order of items inside a queue is determined by their
priority. The highest priority items are at the front of the queue and the lowest priority items are at the back. Thus when you
enqueue an item on a priority queue, the new item may move all the way to the front. We will see that the priority queue is a
useful data structure for some of the graph algorithms.
A real-world example of a queue is a single-lane one-way road, where the vehicle that enters first exits first. More
real-world examples are the queues at ticket windows and bus stops.
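A brief Python sketch of the two queue operations, using the standard collections.deque:

    from collections import deque

    queue = deque()
    queue.append('first')   # enqueue at the rear
    queue.append('second')  # enqueue at the rear
    queue.append('third')   # enqueue at the rear
    print(queue.popleft())  # dequeue from the front -> 'first' (FIFO)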
The classic way to implement a priority queue is to use a data structure called a binary heap. A binary heap allows us to both
enqueue and dequeue items in O(log n) time.
The binary heap is interesting to study because when we diagram the heap it looks a lot like a tree, but when we implement it
we use only a single list as an internal representation. The binary heap has two common variations: the min heap, in which
the smallest key is always at the front, and the max heap, in which the largest key is always at the front. Here we focus on
the min heap.
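Rather than a full implementation, here is a minimal sketch using Python's standard heapq module, which realizes exactly this min-heap-in-a-list idea:

    import heapq

    heap = []  # a single plain list serves as the internal representation
    for key in [5, 1, 4, 2]:
        heapq.heappush(heap, key)   # enqueue in O(log n)
    while heap:
        print(heapq.heappop(heap))  # dequeue in O(log n): prints 1, 2, 4, 5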