DS Unit-1
DS Unit-1
T-SEM-II
Chap 1. Introduction
❖
DATA AND INFORMATION
There is a subtle difference between data and information. Data are the facts or details from
which information is derived. Individual pieces of data are rarely useful alone. For data to
become information, data needs to be put into context.
Data: Data is raw, unorganized facts that need to be processed. Data can be something
simple and seemingly random and useless until it is organized.
CHARACTERISTICS:
Atomic , traceable, accurate, clear and concise
The Data structure can also be defined as a mathematical or logical model, which relates to
a particular organization of different data elements:
Basically, the term data structure and algorithm both are somehow related to each other. Set
of a well-defined algorithm and data structure makes a computer program efficient.
❖
Classification of data structure
The classification of data structure mainly consists of:
The primitive data structures are known as basic data structures. These data structures are
directly operated upon by the machine instructions. Normally, primitive data structures have
different representation on different computers.
• Integer
• Float
• Character
• Pointer
➢
Non-Primitive data structure:
The non-primitive data structures are highly developed complex data structures. Basically,
these are developed from the primitive data structure. The non-primitive data structure is
responsible for organizing the group of homogeneous and heterogeneous data elements.
• Arrays
• Lists
• Stack
• queue
❖
ABSTRACT DATATYPES
Abstract Data Type: ADT may be defined as a set of data values and associated operations that
are precisely specified independent of any particular implementation.
Thus an Abstract Data Type is an organized collection of information and a set of operations
used to manage that information.
The set of operations defines the interface of the ADT.
As long as the ADT fulfills the conditions of the interface, it doesn’t really matter how the ADT is
implemented.
Since, in ADT, the data values and operations are defined with mathematical precision, rather
than as an implementation in a computer language, we may reason about effects of the
operations, relations to other abstract data types whether a program implements the data type
etc.
One of the simplest abstract data type is the stack data type for which functions might
be provided to create an empty stack, to push values onto a stack and to pop values
from a stack.
❖
DATA STRUCTURES VS FILE ORGANIZATION:
(REFER DEFINITION OF DATA STRUCTURE FROM ABOVE)
➢
FILE ORGANIZATION
File is a collection of records related to each other. The file size is limited by the size of
memory and storage medium.
File organization ensures that records are available for processing.
There are three types of organizing the file:-
1. Sequential access file organization
2. Direct access file organization
3. Indexed sequential access file organization
Storing and sorting in contiguous block within files on tape or disk is called as sequential
access file organization.
Direct access file is also known as random access or relative file organization.
• Direct access file helps in online transaction processing system (OLTP) like online
railway reservation system.
• In direct access file, sorting of the records are not required.
• It accesses the desired records immediately.
• It updates several files quickly.
• It has better control over record allocation.
Indexed sequential access file combines both sequential file and direct access file organization.
• In indexed sequential access file, sequential file and random file access is possible.
• It accesses the records very fast if the index table is properly organized.
• The records can be inserted in the middle of the file.
• It provides quick access for sequential and direct processing.
• It reduces the degree of the sequential search.
• Indexed sequential access file requires unique keys and periodic reorganization.
• Indexed sequential access file takes longer time to search the index for the data
access or retrieval.
• It requires more storage space.
• It is expensive because it requires special software.
• It is less efficient in the use of storage space as compared to other file organizations.
❖
OPERATIONS ON DATA STRUCTURES:
Traversing: Accessing each records exactly once so that certain items in the record may
be processed.
Searching: Finding the location of a particular record with a given key value, or finding
the location of all records which satisfy one or more conditions. Inserting: Adding a new
record to the structure.
Deleting: Removing the record from the structure.
Sorting: Managing the data or record in some logical order(Ascending or descending order).
Merging: Combining the record in two different sorted files into a single sorted file.
❖
ALGORITHM
From the data structure point of view, following are some important categories of algorithms −
Characteristics of an Algorithm:
Not all procedures can be called an algorithm. An algorithm should have the
following characteristics −
• Unambiguous − Algorithm should be clear and unambiguous. Each of its steps (or
phases), and their inputs/outputs should be clear and must lead to only one meaning.
• Input − An algorithm should have 0 or more well-defined inputs.
• Output − An algorithm should have 1 or more well-defined outputs, and should
match the desired output.
• Finiteness − Algorithms must terminate after a finite number of steps.
• Feasibility − Should be feasible with the available resources.
• Independent − An algorithm should have step-by-step directions, which should
be independent of any programming code.
Example:-
Step 3 − c = a + b
Step 4 − display c
Step 5 − STOP
❖
Importance of Algorithm Analysis
Space complexity is the amount of memory used by the algorithm (including the input values to
the algorithm) to execute and produce the result.
Space complexity S(P) of any algorithm P is S(P) = C + SP(I), where C is the fixed part and S(I) is
the variable part of the algorithm, which depends on instance characteristic I. Following is a
simple example that tries to explain the concept −
Time complexity of an algorithm signifies the total time required by the program to run till
its completion.
For example, addition of two n-bit integers takes n steps. Consequently, the total computational time is T(n) = c ∗ n, where c is the time taken
for the addition of two bits. Here, we observe that T(n) grows linearly as the input size increases.
#include<iostream.h>
#include<conio.h>
int main ()
{
int number[5]; sum=0;
A one dimensional array is structured collection of array elements that can be access
individually.
2 4 1 6 2 28 9 8
Box drawn in memory with a series of subdivision called elements, to store data. The box above could
represent an array of 8 integers.
#include<iostream.h> #include<conio.h>
int main()
{
int I;
Float marks[6];
Cout<<”enter the marks of your 6 subject:\n”;
For(i=0;i<6;i++)
{
Cin>>marks[i];
}
Float sum=0; For(i=0;i<6;i++)
{
Sum+=marks[i];
}
Float Ave=sum/6;
Cout<<”average mark is:”<<Ave<<'\n’;
getch(); return 0;
}
❖
Memory Representation of one dimensional array
1-Darray are linear array in which elements are stored in the successive memory locations.
The element at the very first position in the array is called it’s base address.
• An array is stored such that the position of each element can be computed from its index tuple by
a mathematical formula.
• For example, an array of 10 32-bit integer variables, with indices 0 through 9, may be stored as 10 words at
memory addresses 2000, 2004, 2008, ... 2036, so that the element with index i has the address 2000 + 4 × i.
Elements 34 78 98 45 56 76 54
Index arr[0]=100 arr[1]=? arr[2]=? arr[3]=? arr[4]=? arr[5]=? arr[6]=?
• Here we defined an array of seven elements of integer type, whose index element base address is
100.i.e,the element arr[0] is stored at base 100.
• For calculating starting address of the next element i.e. of a[1],we can use the following formula.
• For example, the value of c will be 4 bytes, since an integer occupies 4 bytes of memory. We can calculate
the starting address of second element of the array as:
arr[1]=100+4*1= 104
The starting address of second element of array is 104.
❖
Traversing of array
• Traversing basically means the accessing the and every element of the array at least once.
• After insertion and deletion operation you would usually want to check whether it has been successfully or not,
To check this we can use traversing, and can make sure that whether it has been element
is successfully inserted or deleted.
• Let A be a collection of data elements stored in the memory of the computer.
• Suppose we want to print the content of each element of A or suppose we want to count the number of elements
of A with given property.
➢
Algorithm: (traversing a linear array)
• Here A is linear array with lower bound LB and upper bound UB.
• Array has an upper and lower bound and array class provides us functions to get the minimum and
maximum index value of an array.
• Each dimension in an array has an upper and lower bound, which determine the range of values that
can be used as subscripts for that dimension.The lower bound is a specification expression.
Step 1: [initialize counter] Set I:=LB.
Step 2: repeat step 3 and 4
while I <= UB:
Step 3: Apply process to A.
Step4: Set I:I+1
[End of step 2 loop]
Step 5:Exit.
• Traversing a linear structure means moving through it sequentially, node by node.
• Processing the data element of a node be complex , but the general pattern is as follows:
▪
Begin at the first node.
▪
Repeat until there are no more nodes [last node of the of the next part will contain the null].
▪
Process the current node.
▪
Move to the next node.
❖
Insertion of array
• To insert an element in an array in c++ programming, you have to ask user to enter the array size and
array elements.
• And ask to the user to enter the element to insert the element at desired position in the array. After inserting
the element at the desired position in the array, display the new array.
#include<iostream.h>
#include<conio.h>
int main()
{
int a[100],pos,c,n,value;
cout<<”enter number of elements in array”<<endl;
cin>>n;
cout<<”enter elements”<<endl; for(c=0;c<n;c++)
{
Cin>>a[c];
}
Cout<<”enter location where you want to insert the element”;
Cin>>pos;
Cout<<”enter value”<<endl;
Cin>>value;
for(c=n-1;c>=pos-1;c--)
{
a[c+1]=a[c]; a[pos-1]=value;
}
Cout<<”result in array is”;
For(c=0;c<=n;c++)
{
Cout<<a[c]<<endl;
}
getch();
return 0;
}
❖
Deletion of arrays
• To delete element from an array in c++ programming, you have to first ask to the user to enter the array size
then ask to enter the array elements.
• Now ask to enter the element which is to be deleted.
• Search that number if found then place the next element after the founded element to the back until the last as
shown here in program.
#include<iostream.h>
#include<conio.h>
int main()
{
int j,a[5],no,pos;
cout<<”enter data in array”;
for(j=0;j<5;j++){
Cin>>a[j];
}
Cout<<”data shown in array”;
For(j=0;j<5;j++)
{
Cout<<a[j];
}
Cout<<”enter position to delete=”;
Cin>>pos; if(pos>5)
{
Cout<<”number is not in array”;
}
else
{
--pos;
for(j=pos;j<=4;j++)
{
a[j]=a[j+1];
}
Cout<<”after data deletion”;
For(j=0;j<4;j++)
{
Cout<<a[j];
}
getch();
return 0;
}
❖
Searching of arrays
• To search any element present inside the array in c++ programming using linear search technique,
• You have to ask to the user to enter the array size and elements to store the elements in the array.
• Now, ask to the user to enter the element that she/he want to check or search whether the entered
number/element is present in the array or not.
• To check/search for the element, just compare with the number to each element present in the array.
#include<iostream.h> #include<conio.h>
void main()
{
clrscr();
int a[10],i,num,n,c=0,pos; cout<<"Enter the
array size="; cin>>n;
cout<<"Enter array element="; for(i=0;i<n;i++)
{
cin>>a[i];
}
cout<<"Enter the number to be search="; cin>>num;
for(i=0;i<n;i++)
{
if(a[i]==num)
{ c=1;
pos=i+1; break;
}
if(c==0)
{
cout<<"Number not found... ";
}
else
{
cout<<num<<"Position"<<pos;
}
getch();
}
❖
Sorting of array
• Sorting is the process of putting data in order either numerically or alphabetically.
• It is often necessary to arrange the elements in an array in numerical order from highest to lowest values.
• If the array contain string values, alphabetical order may be needed.
• In sorting there Are two types of sorting Ascending and Descending.
➢
Ascending:
#include<iostream.h>
#include<conio.h> void main()
{
int i,a[10],temp,j ;
clrscr();
cout<<"Enter any 10 num in array: \n"; for(i=0;i<=10;i++)
{
cin>>a[i];
}
cout<<"\nData before sorting: "; for(j=0;j<10;j++)
{
cout<<a[j]; cout<<"\n";
}
for(i=0;i<=10;i++)
{
for(j=0;j<=10-i;j++)
{
if(a[j]>a[j+1])
{
temp=a[j]; a[j]=a[j+1];
a[j+1]=temp;
}
}
}
cout<<"\nData in Ascending Order:\n"; for(j=0;j<10;j++)
{
cout<<a[j];
cout<<"\n";
}
getch();
}
➢
Descending:
#include<iostream.h>
#include<conio.h> void main ()
{
clrscr();
int num[10];
int i, j, desc;
cout<<"Enter 10 Numbers :\n";
for (i = 0; i < 10; ++i)
{
cin>>num[i];
for (i = 0; i < 10; ++i)
{
for (j = i + 1; j < 10; ++j)
{
if (num[i] < num[j])
{
desc = num[i];
num[i] = num[j];
num[j] = desc;
}
}
}
cout<<"\n";
cout<<"Numbers in Descending Order : \n";
for (i = 0; i < 10; ++i)
{
cout<<num[i];
cout<<"\n";
}
getch();
}
❖
Merging of array
To merge two arrays in C++ programming,
• you have to ask to the user to enter the array 1 size and elements then array 2 size and elements to merge both
the array and store the merged result in the third array say merge[ ].
• To merge two array, start adding the element of first array to the third array after this start appending the
elements of second array to the third array.
#include<iostream.h>
#include<conio.h>
void main()
{
clrscr();
int arr1[50], arr2[50], size1, size2, size, i, j, k, merge[100];
cout<<"Enter Array 1 Size : ";
cin>>size1;
cout<<"Enter Array 1 Elements : ";
for(i=0; i<size1; i++)
{
cin>>arr1[i];
}
cout<<"Enter Array 2 Size : "; cin>>size2;
cout<<"Enter Array 2 Elements : ";
for(i=0; i<size2; i++)
{
cin>>arr2[i];
}
for(i=0; i<size1; i++)
{
merge[i]=arr1[i];
}
size=size1+size2;
for(i=0, k=size1; k<size && i<size2; i++, k++)
{
merge[k]=arr2[i];
}
cout<<"Now the new array after merging is :\n";
for(i=0; i<size; i++)
{
cout<<merge[i]<<" ";
}
getch();
}
❖
Multidimensional array
• C++ allows multidimensional arrays.
• It is a collection of data elements of same data type arranged in rows and columns.
• General form of a multidimensional array declaration:
• Type name[size1][size2]…..[sizeN];
• For example, int arr[5][10][4];
❖
Two-dimensional arrays
• The simplest form of the multidimensional array is the two dimensional array.
• A two dimensional array is, in essence ,a list of one dimensional arrays.
• To declare two dimensional integer array of size x,y, you would write something as follows:
• Declaration of two dimensional array
Type arrayname [number of rows][number of columns]; For
example, int sales [3][3];
X[0][0] X[0][1] X[0][2]
X[1][0] X[1][1] X[1][2]
X[2][0] X[2][1] X[2][2]
#include<iostream.h>
#include<conio.h>
void main()
{
clrscr();
int arr[3][3],i,j,arrt[3][3];
cout<<"Enter 3*3 Array Elements:\n";
for(i=0;i<3;i++)
{
cout<<"\n";
for(j=0;j<3;j++)
{
cin>>arr[i][j];
}
}
cout<<"\n 2-D Matrix Entered is:\n";
for(i=0;i<3;i++)
{
for(j=0;j<3;j++)
{
arrt[i][j]=arr[j][i];
}
}
getch();
}
❖
Accessing Elements of Two-Dimensional Arrays:
• Elements in Two-Dimensional arrays are accessed using the row indexes and column
indexes. Example:
int x[2][1];
• The above example represents the element present in second row and first column.
❖
Memory Representation of two dimensional arrays
• When working with multi dimensional arrays, one important programmers have to make fairly early on in
the project is what memory layout to use for sorting the data,
• And how to access such data in the most efficient manner.
1. Column-Major Order
2. Row-Major Order
1. Column-Major Order:
In this method the elements are stored column wise, i.e. m elements of first column are stored in first
m locations, m elements of second column are stored in next m locations and so on.
E.g. a 3 x 4 array will stored as below:
In this method the elements are stored row wise, i.e. n elements of first row are stored in first n locations,
n elements of second row are stored in next n locations and so on.
❖
General multidimensional arrays
Vector based multidimensional arrays
Vectors are a STL(standard template library) container that allow you to store pretty much anything in them.
• When used correctly they can be very powerful containers.
• They provide an added benefit that they will remove memory automatically they use when they go
out of scope.
• This means that objects stored within a vector do not need to be de-escalate (but pointer to object do).
Pointer based multidimensional arrays
• Pointer based multidimensional arrays provide you with a more raw access to the objects.
• The benefits can be added speed and you can apply custom optimization to them.
❖
Sparse arrays and sparse matrix
• The sparse matrix represents a special type of two dimensional array consisting of a large number
of elements out of which a very high proportion is occupied by null elements.
• This, the classical declaration of two dimensional array in the c++ programming language leads to
unnecessary use of memory by storing null values and of the processing capacity.
The declaration of sparse matrix type data structure is based on the following elements.
• The memory space needed to store the sparse matrix which is much larger than in the case
than number of not null values is not much smaller than the number of null values;
• More difficult implementation of the operation at matrix level due to the indirect access
way through the data structure defined for storing the sparse matrix. Storage structure of
sparse matrixes.
• Three arras for storing the values of the line index, column index and not null values; in
linear shape, the first two arrays are replaced with the linear index;
• The article type structure with the following fields: line index, column index and the not
null value; the articles are stored in an array or a linear list;
• Array on bits for marking the position of the not null element in the linear matrix and an array
to store not null values.
• A matrix is a two dimensional data object made of m rows and n columns , therefore
having total m x n values.
Why to use sparse matrix instead of simple matrix:
Storage: there are lesser non-zero elements than zeros and thus lesser memory can be used to store
only those elements.
Computing time: computing time can be solved by logically designing a data structure traversing only
non-zero elements.
For example, 0 0 3 4
5002
0054
❖
Memory representation of special kind of matrices
Representation is a method used by a computer language to store matrices of more than one
dimension in memory.
Fortran and Cuse different schemes for their native arrays
Fortran uses "Column Major", in which all the elements for a given column are stored contiguously in
memory.
C uses "Row Major", which stores all the elements for a given row contiguously in memory.
LAPACK defines various matrix representations in memory. There is also Sparse matrix representation
and Morton-order matrix representation.
According to the documentation, in LAPACK the unitary matrix representation is optimized.
Some languages such as Javastore matrices using vectors. These are particularly useful for storing irregular
matrices. Matrices are of primary importance in linear algebra.
❖
Advantages and limitations of arrays
➢
Advantages of arrays:
• It allows random accessing of elements i.e. any element of the array can be randomly accessed
using indexes.
➢
Disadvantages of arrays:
• To delete one element in the array, we need to traverse throughout the array.