DATABASE MANAGEMENT SYSTEM

INDEXING AND TYPES OF INDEXING

 

Indexing in a Database Management System (DBMS) is a data structure technique used to efficiently retrieve records from the database files based on some attributes on which the indexing has been done. 

The first column i.e. the Search key contains a copy of the primary key or candidate keys or non-keys of the table.

The second column i.e., the pointer contains a pointer or the address of the block where that particular Search key value can be found.

TYPES OF INDEXING 


 

 

  1. Primary Indexing

 

Primary indexing in Database Management Systems (DBMS) is a type of indexing where the index is built on the primary key of a database table. This type of index helps in quickly accessing records based on the primary key, which is unique for each record in the table. 

Data Elements must be Ordered and are Key values .

The primary index can be classified into two types: Dense index and Sparse index.

Dense Index

A dense index is an indexing method where there is an index entry for every search key value in the data file. This means that every record in the table has a corresponding index entry.

Characteristics of Dense Index

  1. Every Key: Contains an index entry for every search key value in the data file.
  2. Direct Access: Provides direct access to the data record for every key.
  3. Storage: Requires more storage space because it maintains an index entry for each record.
  4. Speed: Generally provides faster search operations since every record has a corresponding index entry.

Example

Consider a table Employee:

A dense index for the EmployeeID column would look like this:

Advantages of Dense Index

  • Fast Searches: Quick access to records because there is an index entry for each record.
  • Direct Access: Directly points to the data record, making retrieval operations efficient.

Disadvantages of Dense Index

  • Storage Overhead: Requires more storage space for maintaining the index entries.
  • Maintenance Overhead: More maintenance is required for updates, inserts, and deletes, as the index needs to be updated accordingly.

Sparse Index

A sparse index is an indexing method where index entries are not created for every search key value. Instead, index entries are created only for some of the search key values. Each index entry points to a block or a page of records.

Characteristics of Sparse Index

  1. Not Every Key: Contains index entries for only some of the search key values.
  2. Block Level: Each index entry points to a block or page of records rather than individual records.
  3. Storage: Requires less storage space compared to a dense index because it has fewer index entries.

 

Example

Using the same Employee table:

A sparse index for the EmployeeID column might look like this:

Here, Block 1 might contain records for EmployeeID 1 and 2, and Block 2 might contain records for EmployeeID 3 and 4.

Advantages of Sparse Index

  • Storage Efficiency: Requires less storage space because it has fewer index entries.
  • Maintenance Efficiency: Easier to maintain compared to dense indexing, especially for large datasets.

Disadvantages of Sparse Index

  • Slower Access: May require additional block reads to locate the exact record, making searches slightly slower compared to dense indexing.
  • Indirect Access: Provides indirect access to records, which may increase search time.

When to Use

  • Dense Index: Suitable for smaller tables or tables where search speed is critical, and storage overhead is not a primary concern.
  • Sparse Index: Ideal for large tables where storage space is a consideration and search operations can tolerate a slight performance overhead.

2. Clustering Indexing

In a clustered index, data elements are ordered and  non-key values. This results in similar elements being grouped together into clusters. Each cluster, containing a group of similar values, is assigned a single key index. Clustered indexing utilizes only sparse indexing. The data in the search table is ordered and unique. The arrow from B1 to B2 is called block hanker .


 

3. Secondary Indexing 

In databases, searches can be performed on both primary key and non-primary key values. Secondary indexing is used when the data is unordered, and the values can be either key elements or non-key elements. This approach involves using two indexes to facilitate quicker and easier searches.

For key values, a primary index is used, which is a sparse index. For non-key values, a secondary index is created. In this case, the non-key values are first ordered in a secondary table. Then, a separate index table with a dense index is prepared to manage these non-key values.