Overview of Data Manipulation
Indexing methods for data storage rely on two main file organization
mechanisms:
Sequential File Organization (Ordered Index File):
Indices are based on a sorted order of values.
Two types of sequential file organization: Dense Index (one index record for each
data record) and Sparse Index (index records for only some data items).
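Searching a sparse index involves finding the index record with the largest search key
value less than or equal to the desired value, then scanning the data file sequentially
from the record it points to.
Number of block accesses required for searching: log₂(n) + 1, where n is the number of
blocks occupied by the index file (a binary search over the index blocks, plus one
access to the data block).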
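The sparse-index search described above can be sketched as follows. This is a minimal illustration, not a storage engine: the names DATA_BLOCKS, SPARSE_KEYS and sparse_lookup are hypothetical, and in-memory lists stand in for disk blocks.

```python
from bisect import bisect_right

# Hypothetical data file: blocks of (key, value) records, sorted by key.
DATA_BLOCKS = [
    [(10, "a"), (20, "b"), (30, "c")],
    [(40, "d"), (50, "e"), (60, "f")],
    [(70, "g"), (80, "h"), (90, "i")],
]

# Sparse index: one entry per block, holding the first key of that block.
SPARSE_KEYS = [block[0][0] for block in DATA_BLOCKS]


def sparse_lookup(key):
    """Return the value stored under key, or None if it is absent."""
    # Binary search for the largest index entry whose key is <= the target.
    pos = bisect_right(SPARSE_KEYS, key) - 1
    if pos < 0:
        return None  # target is smaller than every indexed key
    # Proceed sequentially within the block that entry points to.
    for k, v in DATA_BLOCKS[pos]:
        if k == key:
            return v
        if k > key:
            break
    return None
```

A dense index would instead hold one entry per record, so the lookup would end at the index entry itself with no sequential scan.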
Hash File Organization:
Indices are based on values distributed uniformly across buckets using a hash
function.
Independently of the file organization, three methods of indexing are commonly
distinguished: Clustered Indexing (grouping related records together), Primary
Indexing (a clustered index built on the primary key), and Non-Clustered or
Secondary Indexing (providing pointers to data locations).
Non-Clustered indexing provides references to data locations but doesn't physically
organize the data in index order.
Multilevel Indexing is used to manage large indices by breaking them into smaller
blocks, reducing memory overhead.
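To illustrate hash file organization, in which a hash function distributes key values across buckets, here is a small sketch. The bucket count, the modulo hash function and all function names are assumptions chosen for illustration; a real system would use a hash function suited to its key distribution.

```python
NUM_BUCKETS = 4  # assumed bucket count, for illustration only


def bucket_of(key):
    # A deliberately simple hash function for integer keys.
    return key % NUM_BUCKETS


def build_hash_index(records):
    """Place each (key, value) record into the bucket chosen by bucket_of."""
    buckets = [[] for _ in range(NUM_BUCKETS)]
    for key, value in records:
        buckets[bucket_of(key)].append((key, value))
    return buckets


def hash_lookup(buckets, key):
    """Scan only the single bucket that the key hashes to."""
    for k, v in buckets[bucket_of(key)]:
        if k == key:
            return v
    return None
```

A lookup touches one bucket rather than the whole file, which is why a roughly uniform distribution of keys across buckets matters.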
Clustered Indexing:
Used when multiple records that share the same value of some field are stored together.
Typically defined on a data file that is ordered on a non-key field or column.
Groups records with the same field value together and creates one index entry per
group.
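A clustering index over a file ordered on a non-key field might look like the sketch below. The employee data, the dept_id field and the function names are hypothetical; list offsets stand in for block addresses.

```python
# Hypothetical employee file, ordered on the non-key field dept_id.
EMPLOYEES = [
    (1, "Ana"), (1, "Ben"),
    (2, "Cho"), (2, "Dev"), (2, "Eli"),
    (3, "Fay"),
]


def build_clustering_index(records):
    """Map each distinct dept_id to the offset of its first record."""
    index = {}
    for offset, (dept_id, _name) in enumerate(records):
        index.setdefault(dept_id, offset)  # keep only the first offset seen
    return index


def records_in_group(records, index, dept_id):
    """Jump to the group's first record, then read until the value changes."""
    start = index.get(dept_id)
    if start is None:
        return []
    names = []
    for d, name in records[start:]:
        if d != dept_id:
            break
        names.append(name)
    return names
```

The index holds one entry per distinct value rather than one per record, which works only because records with equal values are physically adjacent.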
Primary Indexing:
A type of Clustered Indexing using the primary key of a database table for indexing.
Induces sequential file organization: because primary keys are unique and the file is
kept sorted on them, searching is efficient.
Non-Clustered or Secondary Indexing:
Provides pointers or references to data locations but doesn't physically organize data
in the index order.
Similar to a book's table of contents, giving references to where data is stored.
The index is dense: because the data is not physically sorted on the indexed field, a
sparse index could not locate the records that fall between its entries.
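A secondary index can be sketched as a dense mapping from field values to record positions. The rows, the city field and the function names below are hypothetical, and list positions stand in for record pointers.

```python
from collections import defaultdict

# Hypothetical file stored in primary-key order: (id, city).
ROWS = [(1, "Oslo"), (2, "Lima"), (3, "Oslo"), (4, "Kyiv"), (5, "Lima")]


def build_secondary_index(rows):
    """Dense index on city: every row contributes a pointer (its position)."""
    index = defaultdict(list)
    for position, (_id, city) in enumerate(rows):
        index[city].append(position)
    return dict(index)


def find_by_city(rows, index, city):
    """Follow each pointer; matching rows need not be adjacent in the file."""
    return [rows[p][0] for p in index.get(city, [])]
```

Note that the rows for one city are scattered through the file, so each pointer may lead to a different block. That is the cost of not organizing the data in index order.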
Multilevel Indexing:
Used when the index size becomes too large for main memory.
Splits the index into smaller blocks so that only a small outer index has to be kept
in main memory.
Entries in the outer index point to inner index blocks, which in turn point to data
blocks, reducing memory overhead.
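The outer-to-inner-to-data traversal can be sketched with a two-level index. The block contents and the names OUTER, INNER, DATA and multilevel_contains are assumptions for illustration; a real system may use more levels.

```python
from bisect import bisect_right

# Hypothetical data blocks, sorted by key.
DATA = [[1, 3], [5, 7], [9, 11], [13, 15]]

# Inner index blocks: (first key of a data block, data block number).
INNER = [
    [(1, 0), (5, 1)],
    [(9, 2), (13, 3)],
]

# Outer index: first key covered by each inner block (small, kept in memory).
OUTER = [1, 9]


def multilevel_contains(key):
    """Return True if key is present, following outer -> inner -> data."""
    outer_pos = bisect_right(OUTER, key) - 1
    if outer_pos < 0:
        return False  # key is smaller than every indexed key
    inner_block = INNER[outer_pos]
    inner_keys = [first_key for first_key, _ in inner_block]
    # inner_pos is >= 0 because the first inner key equals OUTER[outer_pos].
    inner_pos = bisect_right(inner_keys, key) - 1
    _, data_block_no = inner_block[inner_pos]
    return key in DATA[data_block_no]
```

Each lookup reads one inner index block and one data block, while only the outer index stays resident in memory.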
In summary, these file organization mechanisms and indexing methods help manage
and access data efficiently, with each having its own advantages and use cases
depending on the specific requirements of the database system.