Data Structure Manual
Before storing data in a data structure, one should ask: (1) what are data? (2) where do data come from? (3) how are they computed? (4) how can we manage data?
Data are raw figures, values and facts pertaining to an object, person, organization, place or thing. Data are considerably more than just a collection of numbers; each number must be interpreted as part of the representation of an object. Concerning the source of data: data (singular: datum) are the raw material from which information is derived. Data can come from many sources; they can be compiled as a result of a telephone call, received in the form of letters and turnaround documents, collected at a remote workstation, or captured at a point-of-sale transaction. Data can also be obtained in other ways, e.g. a census in the street, or sensors that relay data to a central computer that controls traffic. You must know in advance exactly what you want to do with the data before you collect them.
Data are commonly organized into the following hierarchy of units:
(1) Bit
(2) Character
(3) Field
(4) Record
(5) File
(6) Data Base
BIT: The name is derived from the two words "binary digit" (0 and 1). It is the smallest unit of information and the most primitive data item that a computer can manipulate.
RECORD: A record is a collection of related data items, each of which is called a field or attribute, e.g. student records, farm records, employee records. An example of a student record is: MATRIC NO., AGE, NAME, CLASS, DEPARTMENT.
FILE: This is the collection of records of the entities in a given entity set e.g. File of
ND 1 students of computer engineering department.
DATA BASE: A database is a repository where data/information are kept and organized for easy access, management and updating, e.g. sales transactions, customer data, financial and product information.
[Figure: a file made up of records.]
BIT: A bit (binary digit, 0 or 1) is the smallest unit of information and the most primitive data item that a computer can manipulate.
BYTE: A byte is formed when bits are grouped together. It consists of eight (8) consecutive bits and is the most universally accepted basic unit for measuring memory size.
WORD: A word is a fixed-size group of bytes/bits handled as a unit by the computer (e.g. 16 bits or two bytes).
The bits of a word are usually numbered from 0, starting from the right and moving to the left. For a 16-bit word the bit positions are:
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Most computers have no instruction for manipulating a single bit. However, some programming languages provide bit-wise operations that can be applied to a byte or word together with an appropriate sequence of bits (called a mask) to isolate a bit for manipulation.
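As a brief illustration of the idea of a mask, here is a minimal C sketch (the variable names and the value 0x00A5 are illustrative only, not taken from the manual) showing how a single bit of a 16-bit word can be tested, set, cleared and toggled:

#include <stdio.h>

int main(void) {
    unsigned short word = 0x00A5;         /* a 16-bit word: 0000 0000 1010 0101 */
    unsigned short mask = 1u << 3;        /* a mask with only bit 3 set         */

    int bit3 = (word & mask) ? 1 : 0;     /* isolate (test) bit 3               */
    word |= mask;                         /* set bit 3                          */
    word &= (unsigned short)~mask;        /* clear bit 3                        */
    word ^= mask;                         /* toggle bit 3                       */

    printf("bit 3 was %d, word is now 0x%04X\n", bit3, (unsigned)word);
    return 0;
}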
Example: Student records have variable lengths, since different students take different numbers of courses. Variable-length records have a minimum and a maximum length. The organization of data into fields, records and files described above may not be powerful enough to maintain and efficiently process certain collections of data. For this reason, data are also organized into more complex types of structures.
In traditional file processing, files are sorted, merged and processed by a key data element. When you write a program based on the traditional approach, data are manipulated and retrieved from the recording medium using either sequential or random-access capabilities. The problems inherent in traditional file processing are overcome by DBMS (Database Management System) software. Facilities vary considerably among the commercially available DBMS software packages.
DATA STRUCTURE
A data structure is an arrangement of data in an interconnected manner that makes access to each data item by an algorithm easy. Usually, the algorithms defined on a structure include retrieval instructions, such as add and fetch, and processing operations, such as search. Examples of data structures are arrays, sets, stacks, queues, matrices, records, trees, graphs, lists, etc. These can conveniently be specified in an abstract data language. A data structure is the principal method for collecting related data values into a single unit. An employee record may consist of a name, address, phone number and salary, each of which may be of a different data type, but together they represent the employee's information as a whole. Another example is a group of items, all of which have the same data type and which need to be kept together for the purpose of searching. A typical data structure provided by programming languages is the "array". An array collects data into a sequence of individually indexed items, e.g. int A[10], where "A" is the name of an array of 10 integer values. An array occupies successive locations in memory where the data are kept. Depending on your requirements and project, it is important to choose the right data structure. For example, an array is a good choice if a user wants to store data in their normal order in memory.
Memory location:  1004  1005  1006  1007  1008  1009  1010
Contents:          ***     2     1     5     3     4   ***
Index:                     0     1     2     3     4

Array data structure representation
1. Improved productivity: more jobs can be done within the same span of time that a slower processor would take to complete one job.
2. Reduced time taken to solve complex problems (e.g. some large simulation and number-crunching programs).
CONCEPT OF ABSTRACTION
ABSTRACT DATA TYPE (ADT): An ADT is the specification of a data type which specifies the logical and mathematical model of the data type. In other words, an ADT is defined as a mathematical model of the data objects that make up a data type, together with the functions that operate on those objects. We can think of an ADT as a mathematical model with a collection of operations defined on that model. Sets of integers, together with the operations of union, intersection and set difference, form a simple example of an ADT. Thus, a useful tool for specifying the logical properties of a data type is the abstract data type.
An ADT is the specification of the logical and mathematical properties of a data type or data structure. A data type is a class of data objects together with a set of operations for creating and manipulating them. A data object represents space in the memory of the actual processor, which the programmer can recognize by a variable name or identifier. A data object is also called a container for data values. A data object is elementary if it contains a data value that is always manipulated as a unit, and it is called a data structure if it is an aggregate of other objects. An important step in the definition of an ADT is to ask what properties the object should possess. When these questions are correctly answered, only the relevant information is made available to the user; details of the implementation are hidden from the user. Checks can then easily be made to ensure that the implementation of the designed ADT is a proper representation of what is expected or required. Examples of ADTs that will be discussed are the stack, queue, list, tree and graph.
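To make the separation between specification and implementation concrete, the following is a hedged C sketch of a stack ADT of integers; the names Stack, create, push, pop and isEmpty, and the fixed capacity MAX, are illustrative assumptions rather than a standard library interface:

/* Specification: a stack of integers with operations create, isEmpty, push and pop. */
#define MAX 100                      /* fixed capacity chosen for this sketch */

typedef struct {
    int items[MAX];
    int top;                         /* index of the top element, -1 when empty */
} Stack;

void create(Stack *s) { s->top = -1; }

int isEmpty(const Stack *s) { return s->top == -1; }

/* Returns 1 on success, 0 if the stack is full. */
int push(Stack *s, int x) {
    if (s->top == MAX - 1) return 0;
    s->items[++s->top] = x;
    return 1;
}

/* Returns 1 and stores the removed value in *x, or 0 if the stack is empty. */
int pop(Stack *s, int *x) {
    if (isEmpty(s)) return 0;
    *x = s->items[s->top--];
    return 1;
}

A user of this ADT only needs the operation names and their meanings; the array and the top index are implementation details that could be replaced (for example, by a linked list) without changing the specification.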
ABSTRACT DATA TYPE SPECIFICATION
The specification of an ADT does not imply any implementation consideration. The implementation of an ADT is a translation, into statements of a programming language, of the declaration that defines a variable to be of that particular ADT, together with a procedure in that language for each operation of the ADT. The aim here is to show how to specify the various attributes of our ADT. The specification of an abstract data type usually defines a syntax part and a semantics part. The syntax part describes the manner in which the constructs can be used; this covers such details as the types used and the types of arguments and results. The semantics part, on the other hand, describes the meaning or behaviour of the operations on the abstract data type; this meaning can be defined by specifying the desired axioms. As part of the semantics, there may also be a section listing exceptions and restrictions. The use of abstract data types in programming has a number of advantages.
Data structures help us to understand the relationship of one element with another and to organize the data within memory. The study of complex data structures includes the following three steps:
1. Logical or mathematical description of the structure
2. Implementation of the structure on a computer
3. Quantitative analysis of the structure, which includes determining the amount of memory needed to store the structure and the time required to execute operations on the structure.
Areas in which data structures are applied extensively include:
⚫ DBMS
⚫ Simulation
⚫ Operating systems
⚫ Statistical analysis packages
⚫ Artificial intelligence
⚫ Numerical analysis
⚫ Graphics
⚫ Computer design
⚫ Network analysis
NOTE: Data structures and data types are slightly different. A data structure is a collection of data types arranged in a specific order, while a data type is a class of data objects together with a set of operations for creating and manipulating them.
Data structures are used in almost every program or software system. Specific data structures are essential ingredients of many efficient algorithms, and they make possible the management of huge amounts of data, such as large databases and internet indexing services. Some formal design methods and programming languages emphasize data structures, rather than algorithms, as the key organizing factor in software design.
DATA TYPE VS. DATA STRUCTURE
1. Group items: data items that are themselves divisible into sub-items, e.g. a name: first name, other name and last name.
2. Elementary items: data items that cannot be further divided, e.g. sex, age.
ENTITY: An entity is something that has certain attributes or properties which may be assigned values; these values may be numeric, non-numeric, or a mixture of the two (alphanumeric). For example, the following are possible attributes and their corresponding values for an entity such as a student of a school.
Entities with similar attributes form an entity set, e.g. all 100L students in the department of computer science. Each attribute of an entity set has a range of values.
The term "information" is used to describe meaningfully processed data, or to refer to processed data from which meaningful decisions can be made. This is why data are considerably more than just a collection of numbers or facts; each number or fact must be part of a model or structure.
[Classification diagram: data structures divide into linear and non-linear structures. Primitive types include int, float and char. Linear structures include the linear list, linked list, stack, queue, deque and circular list; non-linear structures include the tree and the graph.]
LINEAR DATA STRUCTURE
[Figure: examples of linear data structures: an array; a stack of elements 1, 2, ..., n with insertion and deletion at the top; and a linked list A → B → C.]
[Figure: examples of non-linear data structures, e.g. b) a graph.]
PRIMITIVE DATA STRUCTURE: Primitive data structures are the basic data structures that are directly operated upon by machine instructions; they sit at the most primitive level. They have different representations on different computers. Examples are integers, floating-point numbers, characters, string constants, pointers, etc.
4. HOMOGENEOUS AND NON-HOMOGENEOUS DATA STRUCTURES:
In homogeneous data structures the data elements are all of the same type, as in an array. In non-homogeneous data structures the data elements may not be of the same type, as in a structure (struct) in a programming language such as "C" (see the sketch after this list).
6. SEARCHING: Searching involves finding the location of the record with a given value, or finding the locations of all records which satisfy one or more conditions on the data.
7. SORTING: Sorting is the process of arranging the records in some logical order, i.e. in ascending or descending order for numeric data, or alphabetically for character data.
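As a small sketch of the homogeneous/non-homogeneous distinction made in item 4 above (the field names are invented for illustration), an array holds elements of one type, while a C struct can aggregate fields of different types:

int scores[5] = {70, 65, 88, 92, 54};   /* homogeneous: every element is an int        */

struct Student {                        /* non-homogeneous: fields of different types  */
    int   matricNo;
    char  name[20];
    float cgpa;
};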
ARRAY
DECLARATION OF ARRAY
Arrays must be declared before they are used, just like every other variable. The general form of the declaration is:
type array_name [size];
The type specifies the type of the elements that will be contained in the array, such as int, float or char, and the size indicates the maximum number of elements that can be stored in the array. So, when we declare an array, we have to assign a type as well as a size. For example, when we want to store 10 integer values, we can use the following declaration: int A[10].
The starting address of the array is stored in A; when we say A[0] we are referring to the first integer value in A.

float A[10];   (data type: float; array name: A; array size: 10)
ONE-DIMENSIONAL ARRAY:
If 'n' is 10, then the array elements A[0], A[1], ..., A[9] are stored in sequential memory locations as follows:
For reading an array of ‘n’ elements.
The operations that are normally performed on any linear (one-dimensional) array are: traversal, search, insertion, deletion, sorting and merging.
The location of the Kth element of a linear array LA is computed as
LOC(LA[K]) = Base(LA) + W × (K − LB)
where Base(LA) is the address of the first element, LB is the lower bound of the index set, and W is the number of bytes per storage location for one element of the array LA. The time to calculate LOC(LA[K]) is essentially the same for any value of K. Given any subscript K, one can locate and access the content of LA[K] without scanning any other element of LA. The memory of the computer is composed of a sequence of words; each word is referred to by its position in the sequence of words (its address). The address of the first word is normally zero.
Example 1:
Let the base address of the first element of the array be 400 and let each element of the array occupy 2 bytes in memory. Then the address of element A[4] of the array A[10] is:
LOC(A[4]) = 400 + 2 × 4
          = 400 + 8
          = 408
Length = UB – LB + 1
Where UB is the largest index, called the upper bound and LB is the smallest index,
known as the lower bound of the array.
Example 2
Suppose AUTO is a linear array that records the number of automobiles sold each year from 1970 through 1988. Then Length(AUTO) = 1988 − 1970 + 1 = 19; that is, AUTO contains 19 elements and its index set consists of the integers from 1970 through 1988.
Example 3
consider the array auto in the previous example, which records the number of
automobiles sold each year from 1970 to 1988. Which means Base(AUTO) = 200,
and W = 4 Words per memory cell for AUTO then LOC(AUTO[1970]) = 200,
LOC(AUTO[1971]) = 204, LOC(AUTO[1972]) = 208, LOC(AUTO[1973]) = 212.
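A small C sketch of the address formula LOC(LA[K]) = Base(LA) + W × (K − LB) applied to the AUTO example; the function name loc is illustrative:

#include <stdio.h>

/* Compute LOC(LA[K]) = Base(LA) + W * (K - LB). */
long loc(long base, int w, int k, int lb) {
    return base + (long)w * (k - lb);
}

int main(void) {
    /* Base(AUTO) = 200, W = 4 and LB = 1970, as in Example 3. */
    printf("%ld\n", loc(200, 4, 1972, 1970));   /* prints 208 */
    return 0;
}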
NOTE: A collection A of data elements is said to be indexed if any element of A, which we shall call A[K], can be located and processed in a time that is independent of K. (This is a very important property of a linear array.)
TRAVERSING A LINEAR ARRAY
Let A be a collection of data elements stored in the memory of the computer. Suppose we want to print the contents of each element of A, or we want to count the elements of A with a given property. This can be accomplished by traversing A, that is, by accessing and processing (frequently called visiting) each element of A exactly once. The following algorithm traverses a linear array LA. The simplicity of the algorithm comes from the fact that LA is a linear structure. Other linear structures, such as linked lists, can also be easily traversed. On the other hand, the traversal of non-linear structures such as trees and graphs is considerably more complicated.
Algorithm 1a (To traverse a linear array)
1. [Initialize counter] Set K := LB
2. Repeat steps 3 and 4 while K <= UB
3. [Visit element] Apply PROCESS to LA[K]
4. [Increase counter] Set K := K + 1
   [End of loop]
5. Exit.
NOTE: - LA is a linear array with lower bound LB and upper bound UB, this
algorithm traverses LA applying the operation process to each element of LA.
Algorithm 1b
1. Repeat for k = LB TO UB
Apply process to LA[k]
[End of Loop]
2. Exit
The operation PROCESS in the traversal algorithm may use certain variables which must be initialized before PROCESS is applied to any of the elements in the array. Accordingly, the algorithm may need to be preceded by such an initialization step.
Example 4
Consider the array AUTO in Example 2, which records the number of automobiles sold each year from 1970 through 1988. Each of the following modules, which carry out the given operations, involves traversing AUTO.
(i) Find the number NUM of years during which more than 300 automobiles were sold.
1. [Initialization step] Set NUM := 0
2. Repeat for K = 1970 to 1988:
       If AUTO[K] > 300, then: Set NUM := NUM + 1
   [End of loop]
3. Return
Note: this algorithm requires an initialization step for the variable NUM before traversing the array AUTO.
(ii) Print each year and the number of automobiles sold in that year.
1. Repeat for K = 1970 to 1988:
       Write: K, AUTO[K]
   [End of loop]
2. Return
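A hedged C sketch of the two modules above, assuming the AUTO values are held in an ordinary array indexed 0..18 for the years 1970..1988 (the sales figures themselves are not given in the manual, so the array below is left at zero):

#include <stdio.h>

#define LB 1970
#define UB 1988

int main(void) {
    int AUTO[UB - LB + 1] = {0};            /* sales figures would be filled in here */
    int num = 0;

    /* (i) count the years in which more than 300 automobiles were sold */
    for (int k = LB; k <= UB; k++)
        if (AUTO[k - LB] > 300)
            num++;

    /* (ii) print each year and the number of automobiles sold in that year */
    for (int k = LB; k <= UB; k++)
        printf("%d %d\n", k, AUTO[k - LB]);

    printf("NUM = %d\n", num);
    return 0;
}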
INSERTING AND DELETING
INSERTING: Insertion refers to the operation of adding another element to the collection. If the memory space allocated for the linear array is large enough, inserting an element at the end is easy, but inserting an element in the middle poses some difficulty. This is because, on average, about half of the elements of the array must be moved downward to new locations to accommodate the new element and maintain the order of the elements.
Example 5
The insertion operation is illustrated below using COURSE, a linear array with six elements: Agric, Botany, Computer, Geography, History and Physics. The titles are stored alphabetically, and the array is kept in alphabetical order at all times. If English is added to the array, then Geography, History and Physics will have to move one location downward. If Government is then added, History and Physics will have to move down again.
Algorithm (inserting into a linear array): INSERT(LA, N, K, ITEM)
1. [Initialize counter] Set J := N
2. Repeat steps 3 and 4 while J >= K
3. [Move Jth element downward] Set LA[J + 1] := LA[J]
4. [Decrease counter] Set J := J - 1
   [End of step 2 loop]
5. [Insert element] Set LA[K] := ITEM
6. [Reset N] Set N := N + 1
7. Exit
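A minimal C sketch of the INSERT algorithm, using 0-based indexes (an assumption; the algorithm above is written with 1-based indexes) and an illustrative function name:

/* Insert item at index k of la[0..*n-1]; *n is increased by one.
   Assumes the array has capacity for at least *n + 1 elements. */
void insert_at(int la[], int *n, int k, int item) {
    for (int j = *n - 1; j >= k; j--)    /* move elements downward (to higher indexes) */
        la[j + 1] = la[j];
    la[k] = item;                        /* insert the new element */
    (*n)++;                              /* reset N                */
}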
DELETING
Deleting is the removal of a record from the structure, that is, the operation of removing one of the elements from the collection of data elements in the memory of the computer. Similarly, it is easy to delete an element at the end of the array, but tedious to do so in the middle of the array, as it requires moving each subsequent element one location upward in order to close up the array. Given the COURSE array below, to remove History, Physics would have to move upward.
COURSE
1 AGRIC
2 BOTANY
3 COMPUTER
4 ENGLISH
5 GEOGRAPHY
6 GOVERNMENT
7 PHYSICS
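A corresponding C sketch of deletion from a linear array: every element after position k is moved one location upward (to a lower index); again 0-based indexing and the function name are assumptions:

/* Delete the element at index k of la[0..*n-1]; *n is decreased by one. */
void delete_at(int la[], int *n, int k) {
    for (int j = k; j < *n - 1; j++)     /* move later elements upward */
        la[j] = la[j + 1];
    (*n)--;
}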
CLASSIFICATION OF SORTING METHODS
Sorting methods used in computing can be classified into 2 areas:
(1) internal
(2) external
Internal sorting involves storing in main memory all the data to be sorted. However, when the amount of data is too large to be stored and sorted in main memory, it is kept in an external (secondary) storage region, such as a disk, and successive parts of the data are sorted in main memory; such a technique is known as external sorting.
The type of sorting method that is used will depend upon at least one of the following factors:
i. The amount of information to be sorted; clearly it will be time consuming to use a relatively inefficient sorting method on a large quantity of data.
ii. The computer configuration being used, e.g. the size of the main memory and the number of tapes or disks available.
iii. The nature of the application and the urgency of the result.
BUBBLE SORT
Bubble sort is the simplest sorting algorithm; it works by repeatedly swapping adjacent elements if they are in the wrong order. In other words, it refers to the operation of repeatedly rearranging pairs of elements. Suppose the list of numbers A[1], A[2], ..., A[N] is in memory. The bubble sort algorithm works as follows:
Step 1: Compare A[1] and A[2] and arrange them in the desired order, so that A[1] < A[2]. Then compare A[2] and A[3] and arrange them so that A[2] < A[3]. Then compare A[3] and A[4] and arrange them so that A[3] < A[4]. Continue until we compare A[N-1] with A[N] and arrange them so that A[N-1] < A[N].
Step 2: Repeat step 1 with one fewer comparison; that is, we stop after we compare and possibly rearrange A[N-2] and A[N-1]. (Step 2 involves N-2 comparisons and, when step 2 is completed, the second largest element will occupy A[N-1].)
Step 3: Repeat step 1 with two fewer comparisons; that is, we stop after we compare and possibly rearrange A[N-3] and A[N-2].
...
Step N-1: Compare A[1] with A[2] and arrange them so that A[1] < A[2].
NOTE: After N-1 steps, the list will be sorted in increasing order.
Example 6
Suppose the following numbers are stored in an array "A" 32, 51, 27, 85, 66, 23,
13, 57. We apply bubble sort to the array A, each pass will be discussed separately.
Pass 1:
We have the following comparison:
a. Compare A1 and A2: since 32 < 51, the list is not altered.
b. Compare A2 and A3: since 51 > 27, interchange 51 and 27 as follows 32,
27,51,85,66,23,13,57
c. Compare A3 and A4: since 51 < 85, the list is not altered
d. Compare A4 and A5: since 85 > 66, interchange 85 and 66 as
follows 32, 27, 51, 66, 85, 23, 13, 57
e. Compare A5 and A6: since 85 > 23, interchange 85 and 23 as follows:
32, 27, 51, 66, 23, 85, 13, 57
f. Compare A6 and A7: since 85 > 13, interchange 85 and 13 as follows 32,
27,51,66,23, 13,85,57
g. Compare A7 and A8: since 85 > 57, interchange 85 and 57 to yield 32, 27, 51,
66, 23, 13, 57, 85
At the end of this first pass, the largest number, 85, has moved to the last position. However, the rest of the numbers are not sorted, even though some of them have changed their positions. For the remainder of the passes, we show only the interchanges.
Pass 2:
27, 32, 51, 66, 23, 13, 57, 85
27, 32, 51, 23, 66, 13, 57, 85
27, 32, 51, 23, 13, 66, 57, 85
27, 32, 51, 23, 13, 57, 66, 85
At the end of pass 2, the second largest number 66 has moved to the next to the
last position.
Pass 3:
27, 32, 23, 51, 13, 57, 66, 85
27, 32, 23, 13, 51, 57, 66, 85
Pass 4
27, 23, 32, 13, 51, 57, 66, 85
27, 23, 13, 32, 51, 57, 66, 85
Pass 5:
23, 27, 13, 32, 51, 57, 66, 85
23, 13, 27, 32, 51, 57, 66, 85
Pass 6:
13, 23, 27, 32, 51, 57, 66, 85
Pass 6 actually involves two comparisons: A1 with A2, and A2 with A3. The second comparison does not involve an interchange.
Pass 7: Finally, A1 is compared with A2. Since 13 < 23, no interchange takes place. Since the list has 8 elements, it is sorted after the seventh pass.
Algorithm (Bubble sort)
1. Repeat steps 2 and 3 for K = 1 to N-1
2. Set PTR := 1 [Initialize pass pointer PTR]
3. Repeat while PTR <= N-K: [Execute pass]
   (a) If DATA[PTR] > DATA[PTR + 1], then interchange DATA[PTR] and DATA[PTR + 1]
       [End of If structure]
   (b) Set PTR := PTR + 1
   [End of inner loop]
   [End of step 1 outer loop]
4. Exit.
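The algorithm above translates directly into C; this sketch sorts an int array of length n in ascending order (0-based indexing is assumed):

void bubble_sort(int data[], int n) {
    for (int k = 1; k <= n - 1; k++) {           /* step 1: N-1 passes        */
        for (int ptr = 0; ptr < n - k; ptr++) {  /* step 3: execute one pass  */
            if (data[ptr] > data[ptr + 1]) {     /* compare adjacent elements */
                int tmp = data[ptr];             /* ...and interchange them   */
                data[ptr] = data[ptr + 1];
                data[ptr + 1] = tmp;
            }
        }
    }
}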
GRAPH
A graph is a non-linear data structure that consists of a set of vertices (also
known as nodes) and edges. The vertices represent objects, and the edges
represent relationships between the objects. In a graph, the vertices and edges can
have different weights or values that represent the strength or importance of the
relationship between the vertices. A graph can be either directed or undirected. In
a directed graph, the edges have a direction, meaning that they have a start vertex
and an end vertex, while in an undirected graph, the edges have no direction and
are bidirectional. Graphs are used in various applications such as computer
networks, social networks, road networks, and many others. They provide a way
to model and analyse complex relationships between objects. Different types of
graphs exist, including weighted graphs, connected graphs, disconnected graphs,
and many others, each with its own properties and uses. In other words, a graph
is a pictorial representation of a set of objects where some pairs of objects are
connected by links. The interconnected objects are represented by points termed
as vertices, and the links that connect the vertices are called edges. More formally, a graph is a pair of sets (V, E), where V is the set of vertices and E is the set of edges connecting pairs of vertices.
BASIC OPERATIONS
The following are the basic primary operations of a graph:
• Add Vertex − Adds a vertex to the graph.
• Add Edge − Adds an edge between the two vertices of the graph.
• Display Vertex − Displays a vertex of the graph.
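A hedged C sketch of these basic operations on an undirected graph stored as an adjacency matrix; the fixed capacity MAX_V and the function names are illustrative choices, not the only possible representation:

#include <stdio.h>

#define MAX_V 10                       /* fixed capacity for this sketch */

typedef struct {
    int n;                             /* number of vertices currently in the graph */
    int adj[MAX_V][MAX_V];             /* adj[u][v] == 1 if an edge joins u and v   */
} Graph;

/* Add a vertex and return its index, or -1 if the graph is full. */
int addVertex(Graph *g) {
    return (g->n < MAX_V) ? g->n++ : -1;
}

/* Add an undirected edge between vertices u and v. */
void addEdge(Graph *g, int u, int v) {
    g->adj[u][v] = 1;
    g->adj[v][u] = 1;
}

/* Display a vertex of the graph. */
void displayVertex(const Graph *g, int v) {
    if (v >= 0 && v < g->n) printf("vertex %d\n", v);
}

int main(void) {
    Graph g = {0};                     /* start with an empty graph */
    int a = addVertex(&g), b = addVertex(&g);
    addEdge(&g, a, b);
    displayVertex(&g, a);
    return 0;
}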
RELATIONS
A binary relation is determined by specifying all ordered pairs of objects in that
relation; it does not matter by what property the set of these ordered pairs is
described. We are led to the following definition: “A set R is a binary relation if all
elements of R are ordered pairs, i.e., if for any z ∈ R there exist x and y such that z
= (x, y). It is customary to write xRy instead of (x, y) ∈ R. We say that x is in relation
R with y if xRy holds. The set of all x which are in relation R with some y is called
the domain of R and denoted by “dom R.” So, dom R = {x | there exists y such that
xRy}. dom R is the set of all first coordinates of ordered pairs in R. The set of all y
such that, for some x, x is in relation R with y is called the range of R, denoted by
“ran R.” So, ran R = {y | there exists x such that xRy}.
SYMBOL
A symbol is something such as an object, picture, written word, sound, or
particular mark that represents something else by association, resemblance, or
convention. For example, a red octagon may stand for "STOP". On maps, crossed
sabres may indicate a battlefield. Numerals are symbols for numbers. The study or
interpretation of symbols is known as symbology, and the study of signs is known
as semiotics.
Logic Symbols
s/n   Symbol   LaTeX command       HTML entity
1     ¬        \neg                &not;
2     ∧        \wedge              &and;
3     ∨        \vee                &or;
4     ⊕        \oplus              &oplus;
5     ⇒        \Rightarrow         &rArr;
6     ⇔        \Leftrightarrow     &hArr;
7     ∃        \exists             &exist;
8     ∀        \forall             &forall;
EQUIVALENCE RELATION
An equivalence relation is a binary relation between two elements of a set which groups them together as being "equivalent" in some way. Let a, b, and c be arbitrary elements of some set X. Then "a ~ b" or "a ≡ b" denotes that a is equivalent to b. An equivalence relation "~" is reflexive, symmetric, and transitive. In other words, the following must hold for "~" to be an equivalence relation on X:
• Reflexivity: a ~ a
• Symmetry: if a ~ b then b ~ a
• Transitivity: if a ~ b and b ~ c then a ~ c.
An equivalence relation partitions a set into several disjoint subsets, called equivalence classes. All the elements in a given equivalence class are equivalent among themselves, and no element is equivalent to any element from a different class. The equivalence class of a under "~", denoted [a], is the subset of X consisting of every element b for which a ~ b. X together with "~" is called a setoid. A ubiquitous example of an equivalence relation is the equality ("=") relation between elements of any set.
Other examples include:
• "Has the same birthday as" on the set of all people, given naive set theory.
• "Is similar to" or "congruent to" on the set of all triangles.
• "Is congruent to modulo n" on the integers.
• "Has the same image under a function" on the elements of the domain of the
function.
• Logical equivalence of logical sentences.
• "Is isomorphic to" on models of a set of sentences.
• In some axiomatic set theories other than the canonical ZFC (e.g., New
Foundations and related theories):
➢ Similarity on the universe of well-orderings gives rise to equivalence classes
that are the ordinal numbers.
➢ Equinumerosity on the universe of:
▪ Finite sets give rise to equivalence classes which are the natural
numbers.
▪ Infinite sets give rise to equivalence classes which are the transfinite
cardinal numbers.
• Let a, b, c, d be natural numbers, and let (a, b) and (c, d) be ordered pairs of such
numbers. Then the equivalence classes under the relation (a, b) ~ (c, d) are the:
▪ Integers if a + d = b + c;
▪ Positive rational numbers if ad = bc.
• Let (rn) and (sn) be any two Cauchy sequences of rational numbers. The real
numbers are the equivalence classes of the relation (rn) ~ (sn), if the sequence (rn
− sn) has limit 0.
• Green's relations are five equivalence relations on the elements of a semigroup.
• "Is parallel to" on the set of subspaces of an affine space.
Examples of relations that are not equivalences:
• The relation "≥" between real numbers is reflexive and transitive, but not
symmetric. For example, 7 ≥ 5 does not imply that 5 ≥ 7. It is, however, a partial
order.
• The relation "has a common factor greater than 1 with" between natural
numbers greater than 1, is reflexive and symmetric, but not transitive. (The natural
numbers 2 and 6 have a common factor greater than 1, and 6 and 3 have a common
factor greater than 1, but 2 and 3 do not have a common factor greater than 1).
• The empty relation R on a non-empty set X (i.e., aRb is never true) is vacuously
symmetric and transitive, but not reflexive. (If X is also empty then R is reflexive).
• The relation "is approximately equal to" between real numbers, even if more
precisely defined, is not an equivalence relation, although reflexive and symmetric,
it is not transitive, since multiple small changes can accumulate to become a big
change. However, if the approximation is defined asymptotically, for example by
saying that two functions f and g are approximately equal near some point if the
limit of f-g is 0 at that point, then this defines an equivalence relation.
• The relation "is a sibling of" on the set of all human beings is not an equivalence
relation. Although siblinghood is symmetric (if A is a sibling of B, then B is a sibling
of A) it is neither reflexive (no one is a sibling of himself), nor transitive (since if A
is a sibling of B, then B is a sibling of A, but A is not a sibling of A). Instead of being
transitive, siblinghood is "almost transitive", meaning that if A ~ B, and B ~ C, and
A ≠ C, then A ~ C. However, the relation "A is a sibling of B or A is B" is an
equivalence relation. (This applies only to full siblings. A and B could have the same
mother, and B and C the same father, without A and C having a common parent).
• The concept of parallelism in ordered geometry is not symmetric and is
therefore, not an equivalence relation.
An equivalence relation on a set is never an equivalence relation on a proper superset of that set. For example, R = {(1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), (3,3)} is an equivalence relation on {1,2,3} but not on {1,2,3,4} or on the natural numbers. The problem is that reflexivity fails because (4,4) is not a member.
• Three-point entry to the entity rectangle, if many instances of the entity may be used for this relation.
• Single-point entry, if the relation can (or should) participate in only one instance of the entity.
Below are the specialized Entity Relationship diagram symbols and the meanings of those symbols.
[Figures: ER diagram symbols for entities, attributes and relationships.]
Composite Relations
If the elements of a set A are related to those of a set B, and those of B are in turn
related to the elements of a set C, then one can expect a relation between A and
C. For example, if Tom is my father (parent-child relation) and Sarah is a sister of
Tom (sister relation), then Sarah is my aunt (aunt-nephew/niece relation).
Composite relations give that kind of relations.
Definition (composite relation): Let R1 be a binary relation from a set A to a set B, and R2 a binary relation from B to a set C. Then the composite relation from A to C, denoted by R1R2 (also written R1 ∘ R2), is defined as:
R1R2 = {(a, c) | a ∈ A, c ∈ C, and there exists b ∈ B such that (a, b) ∈ R1 and (b, c) ∈ R2}.
PROPERTIES OF GRAPH
a) Route: Routing is the act of finding a path to a destination and moving data
across this path from source to destination. It is also known as path in graph
theory, which is any route along the edges of a graph. A path may follow a
single edge directly between two vertices, or it may follow multiple edges
through multiple vertices. If there is a path linking any two vertices in a
graph, that graph is said to be connected.
b) Edges: The line segment which acts as an interface between two faces is
called an edge. Sometimes it is also described as the line segment joining
two vertices. A vertex is a corner. An edge is a line segment between faces.
A face is a single flat surface.
c) Sequences: Sequence graph, also called an alignment graph, breakpoint
graph, or adjacency graph, are bidirected graphs used in comparative
genomics. The structure consists of multiple graphs or genomes with a
series of edges and vertices represented as adjacencies between segments
in a genome respectively.
d) Directed: A directed graph (or digraph) is a set of vertices and a collection
of directed edges that each connects an ordered pair of vertices. We say
that a directed edge points from the first vertex in the pair and points to
the second vertex in the pair. We use the names 0 through V-1 for the
vertices in a V-vertex graph. In a directed graph, each edge has a strict direction relative to its two connected vertices. Directed graphs are a class of graphs that do not presume symmetry or reciprocity in the edges established between vertices: if there is an edge from one vertex to another, this does not necessarily mean that an edge in the opposite direction also exists.
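The lack of symmetry is easy to see in an adjacency-matrix sketch of a directed graph (the vertex count and names below are illustrative):

#define V 3

int adj[V][V];                   /* adj[u][v] == 1 means an edge points from u to v */

void addDirectedEdge(int u, int v) {
    adj[u][v] = 1;               /* only one direction is recorded; adj[v][u] may remain 0 */
}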
SETS
Sets are fundamental objects that can be used to define all other concepts in mathematics; they are not defined in terms of more fundamental concepts. Rather, sets are introduced either informally, and understood as something self-evident, or, as is now standard in modern mathematics, axiomatically, with their properties postulated by the appropriate formal axioms. The language of set theory is based on a single fundamental relation, called membership. We say that A is a member of B (in symbols A ∈ B), or that the set B contains A as its element. The understanding is that a set is determined by its elements; in other words, two sets are deemed equal if they have exactly the same elements. In practice, one considers sets of numbers, sets of points, sets of functions and so on. In theory, it is not necessary to distinguish between objects that are members and objects that contain members; the only objects one needs for the theory are sets.
The type SET denotes sets whose elements are integers in the range 0 to a small number, typically 31 or 63. Given, for example, the variables
VAR r, s, t: SET
possible assignments are
r := {5}; s := {x, y .. z}; t := {}
Here, the value assigned to r is the singleton set consisting of the single element 5; to t is assigned the empty set, and to s the elements x, y, y+1, ..., z-1, z.
SET OPERATIONS
A relation is a set of ordered pairs if it is a binary relation, and it is a set of ordered n-tuples if it is an n-ary relation. Thus, all the set operations, such as ∪, ∩ and complementation, apply to relations. For example, the union of the "less than" and "equality" relations on the set of integers is the "less than or equal to" relation on the set of integers. The intersection of the "less than" and "less than or equal to" relations on the set of integers is the "less than" relation on the same set. The complement of the "less than" relation on the set of integers is the "greater than or equal to" relation on the same set. The following elementary operators are defined on variables of type SET:
*set intersection
+ set union
- set difference
/ symmetric set difference
IN set membership
Constructing the intersection or the union of two sets is often called set
multiplication or set addition, respectively; the priorities of the set operators are
defined accordingly, with the intersection operator having priority over the union
and difference operators, which in turn have priority over the membership
operator, which is classified as a relational operator. Following are examples of set
expressions and their fully parenthesized equivalents:
r * s + t = (r*s) + t
r - s * t = r - (s*t)
r - s + t = (r-s) + t
r + s / t = r + (s/t)
x IN s + t = x IN (s+t)
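Because the type SET holds only small integers (0 to 31, or 0 to 63 on machines with 64-bit words), it can be sketched in C as a single machine word in which each bit records the presence of one element; the operator correspondences are shown in the comments, and the function names are illustrative:

typedef unsigned long Set;                    /* one bit per possible element */

Set singleton(int x)               { return 1UL << x; }
Set set_union(Set r, Set s)        { return r | s;  }                 /* r + s  */
Set set_intersection(Set r, Set s) { return r & s;  }                 /* r * s  */
Set set_difference(Set r, Set s)   { return r & ~s; }                 /* r - s  */
Set set_symdiff(Set r, Set s)      { return r ^ s;  }                 /* r / s  */
int set_member(int x, Set s)       { return (int)((s >> x) & 1UL); }  /* x IN s */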
The symbol for membership is ∈. It can be read "is an element of" and looks quite similar to the Greek letter epsilon (ε). The symbol for subset is ⊆. Some books will allow and use it reversed (⊇); we will not. A superset is a set that includes other sets. For example: if A ⊆ B, then A is a subset of B and B is a superset of A.
A subset might have no members, in which case it is termed the null set or empty set. The empty set is denoted either by {} or by Ø, a Norwegian letter. The null set is a subset of every set. Note: a common mistake is to use {Ø} to denote the null set; this is actually a set with one element, and that element is the null set. Since some people slash their zeroes, it is safest when handwriting to always use the notation {} to denote the empty or null set. A singleton is a set with only one element. A subset might contain every member of the original set; in this case it is termed an improper subset. A proper subset does not contain every member of the original set. Sets may be finite, {1, 2, 3, ..., 10}, or infinite, {1, 2, 3, ...}. The cardinality of a set A, n(A), is how many elements are in the set.
The power set of a set is the complete set of subsets of the set. Only well-defined sets will be considered in this part; any set we consider should be well-defined. There should be no ambiguity as to whether or not an element belongs to a set. That is why we will avoid things like the village barber who shaves everyone in the village that does not shave himself; this results in a contradiction as to whether or not he shaves himself. Also consider Russell's Paradox: form the set of sets that are not members of themselves; it is both true and false that this set must contain itself. These are examples of ill-defined sets. Sometimes, instead of listing elements in a set, we use set-builder notation: {x | x is a letter in the word "mathematics"}. The symbol | can be read as "such that." Sometimes the symbol ⊂ is reserved to mean proper subset and the symbol ⊆ is used to allow the inclusion of the improper subset. Compare this with the use of < and ≤ to exclude or include an endpoint. We will make no such distinction. A set may contain the same elements as another set. Such sets are equal or identical sets; element order is unimportant.
A = B where
A = {m, o, r, e} and
B = {r, o, m, e}.
In general A = B if A ⊆ B and B ⊆ A. Sets may be termed equivalent if they have the
same cardinality. If they are equivalent, a one-to-one correspondence can be
established between their elements. The universal set is chosen arbitrarily, but
must be large enough to include all elements of all sets under discussion.
Complementary set, A', is a set that contains all the elements of the universal set
that are not included in A. The symbol ' can be read "prime."
For example:
if U = {0, 1, 2, 3, 4, 5, 6, ...} and
A = {0, 2, 4, 6, ...}, then
A' = {1, 3, 5, ...}.
Care must be taken with such paradoxes as those mentioned above, particularly those involving infinities.
INTERSECTION AND UNION
Once we have created the concept of a set, we can manipulate sets in useful ways
termed set operations. Consider the following sets: animals, birds, and white
things. Some animals are white: polar bears, mountain goats, big horn sheep, for
example. Some birds are white: dove, stork, sea gulls. Some white things are not
birds or animal (but birds are animals!): snow, milk, wedding gowns (usually). The
intersection of sets is the set of those elements which belong to all of the intersected sets. Although we usually intersect only two sets, the definition above is general. The symbol for intersection is ∩. The union of sets is the set of those elements which belong to any set in the union. Again, although we usually form the union of only two sets, the definition above is general. The symbol for union is ∪.
For the example given above, we can see that:
{white things} ∩ {birds} = white birds
{white animals} ∪ {birds} = white animals and all birds
{white birds} ∪ {white animals} ⊆ {animals}
Another name for intersection is conjunction. This comes from the fact that an element must be a member of set A and set B to be a member of A ∩ B. Another name for union is disjunction. This comes from the fact that an element must be a member of set A or set B to be a member of A ∪ B. Conjunction and disjunction are grammar terms and date back to when Latin was widely used.
OCCUPANCY
Occupancy refers to the number of elements or items stored in a particular structure, such as an array or a hash table. "Load factor" is the measure of how full a data structure is, calculated as the ratio of the number of elements stored in the structure to the number of spaces or buckets available to store elements. "Lean" generally means that the data structure is not very full and has a low load factor; this can be desirable in some cases, as it may lead to faster operations and better performance. "Empty" means that the data structure has no elements and its load factor is 0. "Loose" generally means that the data structure is relatively full and has a high load factor; this can lead to slower operations and worse performance, as the structure may become too full and require rehashing or resizing.
SEQUENTIAL LIST
A sequential list is also known as a linear list. It is a type of data structure that stores a sequence of elements in a linear order. Each element in the list has a unique position, and elements can be accessed by their position in the list. In a sequential list, elements are typically stored in contiguous memory locations and can be accessed directly by their index. The most common operations on a sequential list include adding an element to the list, removing an element from the list, and searching for an element in the list. A common implementation of a sequential list is an array, where each element of the list is stored at a specific index in the array. It can also be implemented as a linked list, where each element of the list is represented by a node that contains the element's value and a reference to the next node in the list. Sequential lists are useful in situations where elements need to be stored in a specific order, and the order of elements is important.
access and manipulate the data stored in the structure, and can also make it more
difficult to ensure that the data is stored in a specific format. It is important to note
that depending on the data structure, different structures will have different
advantages and disadvantages with regard to fixed-length and variable-length
fields.
For example, a fixed-length field can be declared inside a C struct:
struct myStruct {
    int id;
    char name[10];   /* fixed-length field: always occupies 10 bytes */
    int age;
};
Here, the name field is of fixed length and will occupy 10 bytes of memory. On the
other hand, implementing variable-length fields in a data structure can be done
using dynamic arrays or linked lists. For example, in a linked list, each node
contains a variable-length field that stores a value and a pointer to the next node
in the list. The size of the field can be determined at runtime based on the actual
data stored in the field.
struct myStruct {
    int id;
    char* name;      /* variable-length field: points to dynamically allocated storage */
    int age;
};
Here, the name field is of variable length, the size of this field will be determined
at runtime based on the value assigned to the field. Another way to implement
variable fields is by using pointers; variable-length fields can be implemented by
storing a pointer to a block of memory allocated dynamically, that can be resized
as needed. It is important to note that different data structures handle fixed and
variable length fields differently, and the choice of which one to use will depend
on the specific requirements of the application and the operations that will be
performed on the data.
The common operations on an ordered list include:
1. Append: This operation adds a new element to the end of the ordered list.
2. Search: This operation searches for a specific element in the ordered list. It
can be implemented using various algorithms such as linear search or
binary search.
3. Delete: This operation removes an element from the ordered list. It can be
implemented by searching for the element and then removing it from the
list.
4. Sort: This operation rearranges the elements in the ordered list according
to a specific order, such as ascending or descending. There are various
sorting algorithms that can be used, such as bubble sort, insertion sort, and
quick sort.
5. Merge: This operation combines two ordered lists into a single ordered list. The resulting list will also be ordered (a sketch is given after this list).
6. Multiway merge: This operation combines multiple ordered lists into a
single ordered list. It is an extension of the merge operation.
7. Balance merge: This operation is used to balance the distribution of
elements in an ordered list. It is useful in scenarios where the list is
unbalanced and certain elements are grouped together, making searching
and accessing them slower.
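A hedged C sketch of the merge operation mentioned in item 5 above: two arrays already in ascending order are combined into one ordered array (the function name, and the assumption that the caller supplies an output array large enough for na + nb elements, are illustrative):

/* Merge ordered arrays a[0..na-1] and b[0..nb-1] into c[0..na+nb-1] (ascending). */
void merge(const int a[], int na, const int b[], int nb, int c[]) {
    int i = 0, j = 0, k = 0;
    while (i < na && j < nb)
        c[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
    while (i < na) c[k++] = a[i++];      /* copy any remaining elements of a */
    while (j < nb) c[k++] = b[j++];      /* copy any remaining elements of b */
}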
LINKED LIST
A linked list is a data structure that consists of a sequence of elements,
where each element is called a "node". Each node contains two fields: a value field
that stores the element's value, and a next field that stores a reference to the next
node in the list (or pointer). The first node in the list is called the "head", and the
last node in the list is called the "tail". The tail's next field contains a null reference,
indicating the end of the list. Linked lists can be used to implement various data
structures such as stacks, queues, and hash tables. A singly linked list is a type of
linked list where each node points to its next node in the list, a doubly linked list is
a type of linked list where each node points to both next and previous nodes in the
list. Linked lists are dynamic data structures and can change their size during the
execution of a program, they are useful when you don't know how much data you
will have, or when you need to add/remove elements frequently.
The main difference between a linked list and a linear list is how the
elements are stored in memory. In a linear list, elements are stored in contiguous
memory locations, whereas in a linked list, elements are stored in non-contiguous
memory locations. This means that in a linear list, elements can be accessed by
their index position, whereas in a linked list, elements must be accessed by
following the next pointers.
Linked lists have the advantage of being dynamic data structures and can
change their size during the execution of a program, they are useful when you
don't know how much data you will have, or when you need to add/remove
elements frequently. On the other hand, linear list has the advantage of being
more cache-friendly and allow for constant-time access to elements at a specific
index position.
1. Singly Linked List: In this type of linked list, each node contains a value and a reference to the next node in the list, so the nodes can be traversed in only one direction (from head to tail).
2. Doubly Linked List: In this type of linked list, each node contains a value and references to both the next and previous nodes in the list. This type of linked list is also known as a two-way linked list, as the nodes can be traversed in both directions (from head to tail and from tail to head).
3. Circular Linked List: In this type of linked list, the last node in the list
contains a reference to the first node in the list, creating a circular loop. This
type of linked list can be either singly or doubly linked.
4. Multi-Linked List: In this type of linked list, each node contains multiple
references to other nodes in the list. This allows for multiple traversals of
the list and can be used to implement more complex data structures.
5. Skip List: Skip list is a data structure that allows for efficient search,
insertion and deletion of elements. It is a type of linked list where each node
has multiple pointers, each pointing to another node at a different level.
6. XOR Linked List: XOR linked list is a special type of linked list, where each
node contains an XOR of the addresses of its previous and next node.
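A minimal C sketch of a singly linked list node and an insertion at the head, which takes constant time because only a couple of pointers change; the struct and function names are illustrative:

#include <stdlib.h>

struct Node {
    int value;                  /* value field                          */
    struct Node *next;          /* reference (pointer) to the next node */
};

/* Insert a new node at the head of the list and return the new head. */
struct Node *push_front(struct Node *head, int value) {
    struct Node *node = malloc(sizeof *node);
    if (node == NULL) return head;       /* allocation failed; list unchanged */
    node->value = value;
    node->next = head;
    return node;
}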
POINTERS
Pointers are a fundamental concept in computer programming and are used in
many programming languages, including C and C++. They are variables that store
memory addresses of other variables. Here are some of the main uses of pointers:
1. Dynamic memory allocation: Pointers are commonly used to dynamically
allocate memory during the execution of a program. This allows for the
creation of data structures such as linked lists and trees, which can change
their size during runtime.
2. Implementing data structures: Pointers are used to implement various
data structures such as linked lists, trees, and graphs. They are used to link
together the different elements of the data structure, allowing for efficient
traversal and manipulation of the data.
3. Function parameters: Pointers can be used as function parameters to pass variables by reference rather than by value. This can be useful in situations where the called function needs to modify the original variable, rather than a copy of it (see the sketch after this list).
4. Indirection: This means that the pointers can be used to refer to other data
indirectly. This can be useful in situations where the location of data
changes frequently, such as in linked lists or dynamic arrays.
5. Arrays and Strings: Pointers are used to manipulate arrays and strings; they
are used to traverse and access elements at specific indexes.
6. Inter-process communication: Pointers can be used in inter-process
communication, where two or more processes share the same memory
space, it helps in sharing data between processes.
7. Device drivers: Pointers are used in device drivers to interact with
hardware, it helps to access memory addresses of specific hardware
registers to read or write data.
8. Smart pointers: smart pointers are a type of C++ feature that are used to
manage dynamically allocated memory and help to prevent memory leaks
and other issues related to manual memory management. They also
provide additional functionality such as thread-safety and automatic
deallocation of memory.
9. Callback functions: Pointers to functions can be used to create callback
functions, where a function pointer is passed as an argument to another
function, and the function pointer is called at a certain point in the
execution of the second function.
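A short sketch of use 3 above (passing variables by reference): because swap receives the addresses of x and y, it modifies the caller's variables rather than copies of them; the function name is illustrative:

#include <stdio.h>

void swap(int *a, int *b) {
    int tmp = *a;
    *a = *b;
    *b = tmp;
}

int main(void) {
    int x = 3, y = 7;
    swap(&x, &y);                        /* pass the addresses, not the values */
    printf("x=%d y=%d\n", x, y);         /* prints x=7 y=3 */
    return 0;
}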
STORAGE MAPPING
to their physical storage locations. This allows for efficient access to data, as
the index can be used to quickly locate the desired data element.
4. Hashed Storage Mapping: In this technique, data elements are stored in non-
contiguous blocks of memory and hashed key is used to map the data elements
to their physical storage locations. This allows for efficient access to data, as
the hashed key can be used to quickly locate the desired data element.
The choice of storage mapping technique will depend on the specific requirements
of the application, such as the size of the data, the number of data elements, and
the frequency of data access.
TREE
A tree is a non-linear data structure that consists of nodes connected by
edges usually in form of hierarchy. The topmost node is called the root, and its
children are referred to as its descendants. The nodes without any children are
called leaves. The height of a tree is the number of edges from the root to the
furthest leaf. Each node in a tree has a parent (except the root) and zero or more
children.
1. Binary Tree: A binary tree is a tree data structure in which each node has at
most two children, which are referred to as the left child and the right child.
Binary trees can be used to implement various data structures such as binary
search trees and heap.
2. AVL Tree: An AVL tree is a self-balancing binary search tree, in which the
difference in heights of left and right subtrees of any node is at most one. AVL
tree is a balanced tree, which performs basic operations like insert, delete and
search in O (log n) time.
3. B-tree: A B-tree is a self-balancing tree data structure that is commonly used in
databases and file systems to store large amounts of data. B-trees have a higher
branching factor than other trees, which allows them to efficiently store and
retrieve large amounts of data.
4. B+ tree: A B+ tree is a variation of a B-tree, where all the data is stored in leaf
nodes, and all the leaf nodes are linked together to form a linked list. This
allows for efficient range queries and sequential access to the data.
5. Trie: A Trie (prefix tree) is a tree-like data structure that is used to store a
collection of strings. Each node in the Trie represents a character in the string,
and the path from the root to the node represents the prefix of the string. Tries
are commonly used for efficient string matching and prefix matching.
6. Heap: A Heap is a special tree-based data structure in which the root-node key
is compared with its children and arranged accordingly, it can be either min
heap or max heap. It is used to implement priority queues and also used in
algorithms such as Dijkstra's shortest path algorithm.
7. Red-Black Tree: A red-black tree is a self-balancing binary search tree, where
each node is coloured either red or black. It is similar to AVL tree, but it has less
strict balancing condition and is simpler to implement.
Properties of tree
A tree has several properties that define its structure and behaviour:
1. Hierarchical structure: A tree has a hierarchical structure where each node
has a parent node and zero or more child nodes.
2. Root node: The root node is the topmost node in the tree and has no
parent.
3. Leaf nodes: Leaf nodes are the nodes without any children.
4. Edge: An edge is the connection between a parent node and a child node.
5. Depth: The depth of a node is the number of edges from the root to that
node.
6. Height: The height of a tree is the number of edges from the root to the
furthest leaf.
7. Path: A path is a sequence of nodes connected by edges from the root to a
leaf.
8. Ancestors and descendants: Ancestors of a node are its parent and
grandparent, and so on. Descendants of a node are its children,
grandchildren, and so on.
9. Subtree: A subtree is a portion of a tree consisting of a node and its
descendants.
10. Degree: The degree of a node is the number of children it has.
These properties help to describe the structure of a tree and its behaviour when
used in various algorithms and data structures.
Binary tree
A binary tree is a type of tree in data structure where each node has at most
two children.
TYPES OF BINARY TREE
There are several types of binary trees, including:
1. Full binary tree: A full binary tree is a binary tree where every node has
either 0 or 2 children.
2. Complete binary tree: A complete binary tree is a binary tree in which all
the levels are completely filled except possibly the last level and the last
level has all keys as left as possible.
3. Perfect binary tree: A perfect binary tree is a tree in which all the internal
nodes have two children and all leaf nodes are at the same level.
4. Balanced binary tree: A balanced binary tree is a tree in which the height of
the two subtrees of any node never differ by more than 1. Examples of
balanced binary trees are AVL trees and Red-Black trees.
5. Skewed binary tree: A skewed binary tree is a binary tree in which one of
the subtrees is much taller than the other. Examples of skewed binary trees
are left-skewed trees and right-skewed trees.
6. Binary search tree: A binary search tree is a tree in which, for each node, the keys of all the nodes in its left subtree are less than its own key, and the keys of all the nodes in its right subtree are greater than its own key (a sketch is given below).
These different types of binary trees have different properties and are used in
various algorithms and data structures for specific purposes.
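A hedged C sketch of the binary search tree of item 6, with insertion and search that follow the ordering property (smaller keys to the left, larger keys to the right); the struct and function names are illustrative:

#include <stdlib.h>

struct TNode {
    int key;
    struct TNode *left, *right;
};

/* Insert key into the BST rooted at root and return the (possibly new) root. */
struct TNode *bst_insert(struct TNode *root, int key) {
    if (root == NULL) {
        struct TNode *n = malloc(sizeof *n);
        if (n == NULL) return root;          /* out of memory: leave the tree unchanged */
        n->key = key;
        n->left = n->right = NULL;
        return n;
    }
    if (key < root->key)      root->left  = bst_insert(root->left, key);
    else if (key > root->key) root->right = bst_insert(root->right, key);
    return root;                             /* duplicate keys are ignored in this sketch */
}

/* Return 1 if key is present in the BST, 0 otherwise. */
int bst_search(const struct TNode *root, int key) {
    while (root != NULL) {
        if (key == root->key) return 1;
        root = (key < root->key) ? root->left : root->right;
    }
    return 0;
}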
It's important to note that Big ‘O’ notation provides an upper bound on the growth
rate, and does not take into account factors such as the actual constant factors or
lower order terms.
Analysing algorithm and determine their running time
The running time of an algorithm refers to the amount of computational
resources (such as time or memory) required to run the algorithm as a function of
the size of the input. The order of the running time of an algorithm is often
described using big ‘O’ notation, which provides an upper bound on the growth
rate of the running time as the size of the input increases. For example, a linear
search algorithm has a running time of O(n), where n is the number of elements in
the list, as it needs to examine each element in the list once to find the target
element. On the other hand, a binary search algorithm has a running time of O(log
n), as it reduces the search space by half with each iteration.
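As a concrete illustration of the O(log n) behaviour mentioned above, here is a hedged C sketch of binary search over a sorted int array:

/* Return the index of target in the sorted array a[0..n-1], or -1 if it is absent.
   Each iteration halves the remaining search space, giving O(log n) running time. */
int binary_search(const int a[], int n, int target) {
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        if (a[mid] == target)     return mid;
        else if (a[mid] < target) lo = mid + 1;
        else                      hi = mid - 1;
    }
    return -1;
}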
Linked list algorithms have different running times depending on the specific
operation being performed. For example, inserting an element at the beginning of
a linked list has a running time of O(1), as it only requires updating a few pointers.
However, inserting an element at a specific position in a linked list has a running
time of O(n), as it may require traversing the entire list to find the correct position.
In conclusion, the running time and the order of running time of algorithms
can be analysed using big O notation and depends on the specific operations being
performed and the size of the input.
SORTING
Sorting is the process of arranging a set of elements in a specific order,
either ascending or descending, based on their values. Some of the most
commonly used sorting algorithms include:
1. Bubble sort
2. Insertion sort
3. Selection sort
4. Quick sort
5. Merge sort
6. Heap sort
7. Radix sort
Sorting is an important operation in computer science and is used in various
applications, such as searching, data analysis, and database management. The
choice of sorting algorithm depends on the specific requirements of the problem,
such as the size of the data, the required speed of sorting, and the memory
constraints.
Sorting Techniques
There are several sorting algorithms, each with its own advantages and
disadvantages. Some of the most commonly used sorting techniques include:
1. Bubble sort: Bubble sort is a simple sorting algorithm that repeatedly steps
through the list, compares adjacent elements and swaps them if they are in the
wrong order. The process continues until the list is sorted.
2. Insertion sort: Insertion sort is a simple sorting algorithm that builds the final
sorted array one item at a time. It is much less efficient on large lists than more
advanced algorithms such as quicksort, heapsort, or merge sort.
3. Selection sort: Selection sort is a simple sorting algorithm that works by
repeatedly finding the minimum element from the unsorted part of the list and
swapping it with the first unsorted element.
4. Quick sort: Quick sort is a divide and conquer algorithm that works by selecting
a "pivot" element from the array and partitioning the other elements into two
sub-arrays, according to whether they are less than or greater than the pivot.
The sub-arrays are then sorted recursively.
5. Merge sort: Merge sort is a divide and conquer algorithm that works by dividing
the unsorted list into n sub-lists, each containing one element, and then
repeatedly merging sub-lists to produce new sorted sub-lists until there is only
one sub-list remaining.
6. Heap sort: Heap sort is a comparison-based sorting algorithm that works by
dividing the input into a sorted and an unsorted region, and iteratively
shrinking the unsorted region by extracting the largest element and moving
that to the sorted region.
7. Radix sort: Radix sort is a non-comparative integer sorting algorithm that sorts
data with integer keys by grouping the keys by the individual digits which share
the same significant position and value.
These are some of the most widely used sorting techniques. The choice of sorting
algorithm depends on the specific requirements of the problem, such as the size
of the data, the required speed of sorting, and the memory constraints.
Comparison-based sorting techniques include:
i. Bubble sort
ii. Insertion sort
iii. Selection sort
iv. Quick sort
v. Merge sort
vi. Heap sort
In comparison-based sorting, the elements are compared to each other, and their
positions are swapped if they are in the wrong order. The algorithms use the
comparison operator (<, >, =) to determine the order of elements. Comparison-
based sorting algorithms can be applied to any data type for which a comparison
operator is defined. The performance of comparison-based sorting algorithms is
typically measured in terms of the number of comparisons required to sort the
data. Some algorithms, such as quick sort and merge sort, have an average-case time
complexity of O(n log n), while others, such as selection sort and bubble sort, have
a worst-case time complexity of O(n^2). The choice of sorting algorithm depends
on the specific requirements of the problem, such as the size of the data, the
required speed of sorting, and the memory constraints.
Bubble Sorting Algorithm
Bubble sort is a simple sorting algorithm that works by repeatedly stepping
through the list, comparing adjacent elements and swapping them if they are in
the wrong order. The process continues until the list is sorted.
Here is how bubble sort works:
i. Start at the beginning of the list.
ii. Compare the first two elements of the list.
iii. If the first element is greater than the second, swap their positions.
iv. Move to the next pair of elements and repeat step iii.
v. Repeat steps iii and iv for all elements in the list.
If no swaps were made during the pass through the list, the list is sorted. If not,
repeat the entire process again. Bubble sort has a time complexity of O(n^2) in the
worst-case scenario, which means it is not very efficient for large lists. However, it
is easy to understand and implement, making it a good starting point for beginners
learning about sorting algorithms. Bubble sort can be improved by adding a flag
that keeps track of whether any swaps were made during a pass through the list.
If no swaps were made, the list is already sorted, and the algorithm can be stopped.
This optimization reduces the number of passes required for the algorithm, making
it slightly faster.
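A minimal C++ sketch of this optimization (the function name bubbleSortOptimized and the flag name swapped are chosen for illustration):

// Bubble sort with an early-exit flag: stop as soon as a full pass makes no swaps.
void bubbleSortOptimized(int arr[], int n) {
    for (int i = 0; i < n-1; i++) {
        bool swapped = false;
        for (int j = 0; j < n-i-1; j++) {
            if (arr[j] > arr[j+1]) {
                int temp = arr[j];
                arr[j] = arr[j+1];
                arr[j+1] = temp;
                swapped = true;
            }
        }
        if (!swapped)
            break;   // no swaps in this pass, so the list is already sorted
    }
}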
Selection Sorting Algorithm
Selection sort is a simple sorting algorithm that works by repeatedly finding the
minimum element from the unsorted part of the list and swapping it with the first
unsorted element.
Here is how selection sort works:
i. Start with the first element in the list and consider it the current minimum.
ii. Compare the current minimum with each remaining element in the list.
iii. If a smaller element is found, set it as the new minimum.
iv. After checking all the remaining elements, swap the current minimum with
the first unsorted element.
v. Repeat steps i to iv for the remaining unsorted elements until the whole list is sorted.
Selection sort has a time complexity of O(n^2) in the worst-case scenario, which
means it is not very efficient for large lists. However, it is simple to understand and
implement, making it a good starting point for beginners learning about sorting
algorithms. The key to the selection sort algorithm is the concept of dividing the
input data into two parts, the sorted and the unsorted, and iteratively sorting the
unsorted part. The algorithm works by finding the minimum element from the
unsorted part of the list and swapping it with the first unsorted element. The
algorithm continues until all elements are in their correct positions.
Insertion Sorting
Insertion sort is a sorting algorithm that builds the sorted list one element at
a time. The algorithm takes each element of the input in turn and inserts it into
its proper position within the already-sorted portion at the front of the list.
This is done by comparing the element to the elements already sorted: elements
that are greater than the current element are shifted one position to the right
to make room for it. This process is repeated until the entire input list is
sorted. Insertion sort is an in-place sorting algorithm - it does not require any
extra memory to store the sorted list. It is a simple sorting algorithm which has
a worst-case time complexity of O(n^2).
Algorithm for Insertion Sorting
1. Begin by setting the first element of the array as the sorted partition.
2. For each element in the unsorted partition, starting from the second element,
compare it to the elements in the sorted partition.
3. If the element is smaller than the element in the sorted partition, move all
elements greater than it one index to the right.
4. Insert the element into its correct position in the sorted partition.
5. Repeat steps 2-4 until all elements in the unsorted partition have been sorted.
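A minimal C++ sketch of these steps (the function name insertionSort is chosen for illustration):

#include <iostream>
using namespace std;

// Grow a sorted partition from the left of the array, inserting each
// new element into its correct position within that partition.
void insertionSort(int arr[], int n) {
    for (int i = 1; i < n; i++) {
        int key = arr[i];
        int j = i - 1;
        // Shift elements greater than key one index to the right.
        while (j >= 0 && arr[j] > key) {
            arr[j+1] = arr[j];
            j--;
        }
        arr[j+1] = key;   // insert key into its correct position
    }
}

int main() {
    int arr[] = {8, 5, 2, 9, 7, 1, 3};
    int n = sizeof(arr)/sizeof(arr[0]);
    insertionSort(arr, n);
    for (int i = 0; i < n; i++)
        cout << arr[i] << " ";   // prints 1 2 3 5 7 8 9
    return 0;
}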
Differences between Linear and Binary Search Algorithm Techniques
Linear Search: Linear search is a very simple search algorithm. It sequentially
checks each element of the list until a match is found or the whole list has been
searched. Linear search is also known as sequential search. Binary search, on the
other hand, is a much more efficient searching algorithm than linear search. It
works by repeatedly dividing the search interval in half, narrowing the search by
eliminating the half of the elements that cannot contain the target. Binary search
works only on a sorted list of elements; the list must be in either ascending or
descending order.
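To make the contrast concrete, here is a minimal iterative binary search sketch in C++. It assumes the list is already sorted in ascending order; the function name binarySearch is chosen for illustration.

#include <iostream>
using namespace std;

// Iterative binary search on an array sorted in ascending order.
// Returns the index of x, or -1 if x is not present.
int binarySearch(int arr[], int n, int x) {
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        if (arr[mid] == x)
            return mid;
        else if (arr[mid] < x)
            low = mid + 1;    // discard the left half
        else
            high = mid - 1;   // discard the right half
    }
    return -1;
}

int main() {
    int arr[] = {1, 2, 3, 5, 7, 8, 9};
    int n = sizeof(arr)/sizeof(arr[0]);
    cout << "Element found at index: " << binarySearch(arr, n, 7) << endl;   // prints 4
    return 0;
}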
Apply a sorting algorithm to sort the following array of elements:
[8, 5, 2, 9, 7, 1, 3]
Answer:
[1, 2, 3, 5, 7, 8, 9].
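One way to arrive at this answer is to apply the selection sort steps described
earlier; each pass swaps the minimum of the unsorted part to the front:
Pass 1: [1, 5, 2, 9, 7, 8, 3]
Pass 2: [1, 2, 5, 9, 7, 8, 3]
Pass 3: [1, 2, 3, 9, 7, 8, 5]
Pass 4: [1, 2, 3, 5, 7, 8, 9]
The remaining passes make no swaps, so the array is sorted.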
SORTING:
i. Bubble Sort:
#include <iostream>
using namespace std;

// Repeatedly compare adjacent elements and swap them if they are out of order.
void bubbleSort(int arr[], int n) {
    for (int i = 0; i < n-1; i++) {
        for (int j = 0; j < n-i-1; j++) {
            if (arr[j] > arr[j+1]) {
                int temp = arr[j];
                arr[j] = arr[j+1];
                arr[j+1] = temp;
            }
        }
    }
}

int main() {
    int arr[] = {64, 34, 25, 12, 22, 11, 90};
    int n = sizeof(arr)/sizeof(arr[0]);

    bubbleSort(arr, n);

    cout << "Sorted array: \n";
    for (int i = 0; i < n; i++)
        cout << arr[i] << " ";
    return 0;
}
ii. Selection Sort:
#include <iostream>
using namespace std;

int main() {
    int arr[] = {64, 34, 25, 12, 22, 11, 90};
    int n = sizeof(arr)/sizeof(arr[0]);

    selectionSort(arr, n);   // selectionSort is defined in the sketch below

    cout << "Sorted array: \n";
    for (int i = 0; i < n; i++)
        cout << arr[i] << " ";
    return 0;
}
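The program above calls selectionSort without defining it. A minimal definition,
consistent with the selection sort steps described earlier and placed above main
in a complete program, might look like this:

// Repeatedly find the minimum of the unsorted part and swap it
// with the first unsorted element.
void selectionSort(int arr[], int n) {
    for (int i = 0; i < n-1; i++) {
        int minIdx = i;
        for (int j = i+1; j < n; j++) {
            if (arr[j] < arr[minIdx])
                minIdx = j;
        }
        int temp = arr[i];
        arr[i] = arr[minIdx];
        arr[minIdx] = temp;
    }
}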
Searching:
Linear Search:
#include <iostream>
using namespace std;

int linearSearch(int arr[], int n, int x) {
    for (int i = 0; i < n; i++) {
        if (arr[i] == x)
            return i;
    }
    return -1;
}

int main() {
    int arr[] = {64, 34, 25, 12, 22, 11, 90};
    int n = sizeof(arr)/sizeof(arr[0]);
    int x = 25;

    int result = linearSearch(arr, n, x);
    if (result == -1)
        cout << "Element not found\n";
    else
        cout << "Element found at index: " << result << endl;
    return 0;
}