Indexes in SQL Server with Examples
In this article, I am going to discuss Indexes in SQL Server and we will also discuss how do the indexes make your search operations faster with examples. Please read our previous article, where we discussed Joins in SQL Server with examples. As part of this article, we are going to discuss the following pointers in detail.
- How data search when there is no index?
- Understanding the Balanced Tree (B-Tree) in SQL Server.
- How will be the database engine retrieves the data from a table?
- Multiple Examples to understand Indexes in SQL Server.
- What is an index?
- When SQL Server uses Indexes?
- Types of indexes in SQL Server.
The goal of the index is to make the search operation faster. Then the next question that should come to your mind is how does index make your search faster? Indexes make the search faster by creating something called as B-Tree (Balanced Tree) structure internally.
So, in this article, first, we will understand the theory Balanced Tree (B-Tree) Structure and then we will see the practical implementation of how indexes make the search faster.
How data search when there is no index?
Let us understand how the SQL Server Database Engine searches the data when there is no index on the table. When there is no index the data search sequentially. Please have a look at the following image.
Suppose, you want to search the value 50, then search engine (i.e. SQL Server Search Engine) will scan the record sequentially one by one from the beginning i.e. from 1 and until it reaches the value 50.
If you want to increase the search performance, then somehow you have to minimize the number of scans. That what exactly the B-Tree (Balanced Tree) does.
Understanding the Balanced Tree (B-Tree):
Whenever you index some data or index some column on SQL Server then what happens is, it internally creates a B-Tree structure. In B-Tree structure, the data is divided into three sections i.e. Root Node, Non-Leaf Nodes, and Leaf Nodes.
In order to understand this better please have a look at the following image which shows how the data is divided and stored. As you can see, in the Root Node it has 30 and 50. In the Non-Leaf node, it has 10, 30, 40, and 50. And in the leaf node, we have the actual data. So, the leaf is actually pointing to data.
Suppose, you want to search 50 here, then what will happen internally is, the search engine will start the search from the root node. It will check whether 50 is less than or equal to 30. As 50 is not less than or equal to 30, so the non-leaf nodes and leaf nodes which come under the root node 30 is completely bypassed.
Then it will go to the next node i.e. 50 and check whether 50 is less than or equal to 50. And the condition satisfies here. Then it goes to the non-leaf nodes (40, 50) which are under the root node 50. It will check whether 50 is less than or equal to 40 and condition fail, so, it will bypass all the leaf nodes which come under the non-leaf node 40. Then it will check the other non-leaf node i.e. 50 and here the condition satisfies as 50 equals 50 and it goes to scan the leaf node sequentially. That is it approximately scans 10 records.
So, as you can see, due to the Root Node, Non-Leaf Nodes and Leaf Nodes arrangement, the complete records from 1 to 39 are bypassed.
How will be the database engine retrieves the data from a table?
Whenever the database engine wants to retrieve the data from a database table it will adopt two different mechanisms for searching the data
- Table scan
- Index Scan/Seek
Table Scan in SQL Server:
In Table Scan, the SQL Server Search Engine will search for the required information sequentially one by one from the start to the last record of the table. If the table having more number of rows, then it will take more time for searching the required data, so it is a time-consuming process.
Index Scan/Seek in SQL Server:
In Index Scan, the SQL Server Search Engine uses B-Tree structure to search the required data which can reduce the disk I/O operation saving the time, as a result, we will get better performance.
Example to understand how SQL Server Index improves search performance.
We are going to use the following Employee table to understand the concept Index.
Please use the following SQL Script to create and populate the Employee table with the required test data.
CREATE TABLE Employee ( Id INT, Name VARCHAR(50), Salary INT, Gender VARCHAR(10), City VARCHAR(50), Dept VARCHAR(50) ) GO INSERT INTO Employee VALUES (3,'Pranaya', 4500, 'Male', 'New York', 'IT') INSERT INTO Employee VALUES (1,'Anurag', 2500, 'Male', 'London', 'IT') INSERT INTO Employee VALUES (4,'Priyanka', 5500, 'Female', 'Tokiyo', 'HR') INSERT INTO Employee VALUES (5,'Sambit', 3000, 'Male', 'Toronto', 'IT') INSERT INTO Employee VALUES (7,'Preety', 6500, 'Female', 'Mumbai', 'HR') INSERT INTO Employee VALUES (6,'Tarun', 4000, 'Male', 'Delhi', 'IT') INSERT INTO Employee VALUES (2,'Hina', 500, 'Female', 'Sydney', 'HR') INSERT INTO Employee VALUES (8,'John', 6500, 'Male', 'Mumbai', 'HR') INSERT INTO Employee VALUES (10,'Pam', 4000, 'Female', 'Delhi', 'IT') INSERT INTO Employee VALUES (9,'Sara', 500, 'Female', 'London', 'IT')
Currently, we don’t have an index in any of the columns of the Employee table. Let us write a query to get the employee info whose id is 8.
SELECT * FROM Employee Where Id = 8;
When you execute the above query it will use a table scan to get the data. In order to make sure it uses a table scan, please click on the Display Estimated Execution Plan button as shown in the below image.
Once you click on the above Display Estimated Execution Plan Icon it will show the following image which clearly says that it performs a Table scan. Further, if you mouse over of Table Scan option, then you will see that the value of the Estimated Number of Rows as 10 as our table currently holds 10 records. You will not find any performance issue currently as the number of records is less. But if your table contains a huge amount of data let say 10000000 records, then it will definitely take much more time to get the data.
Creating Index on Id Column:
Let us create an index on the Id column by executing the following query. Later we will discuss the syntax and the different types of indexes and their need and use. But for now, we are just focusing on the need for Indexes and how we can improve the search performance.
CREATE CLUSTERED INDEX IX_Employee_ID ON Employee(Id ASC);
Once you execute the above statement the index gets created and you can see the index in the indexes folder which is present inside the Employee table as shown below.
The above index stores the ID of each employee in the ascending order. Once you create the index, now let us execute the same following query to get the employee info whose id is 8.
SELECT * FROM Employee Where Id = 8;
Once you execute the above query, now again click on the Display Estimated Execution Plan option which will show the following image which clearly shows that it perform an Index Scan or Index Seek. Further, if you mouse over of Index Seek option, then you will see that the value of Estimated Number of Rows as 1 which means it only scan 1 row which will improve the search operation.
Now, I hope you understood the basic need of Indexes. With this keep in mind let us proceed and discuss some of the theoretical concepts which are good if you are preparing for an interview.
What is an index?
- It is a database object which is used to speed up the querying process by providing query access to rows in the data table.
- When we create an index on any column SQL Server internally maintains a separate table called index table so that when any user trying to retrieve the data from the existing table depends on index table SQL Server directly go to the table and retrieve required data very quickly.
- In a table, we can use a maximum of 250 indexes. The index type refers to the way the index is stored internally by SQL Server.
When SQL Server uses Indexes?
The SQL Server uses indexes of a table provided the select or update or delete statement contained the “WHERE” clause and moreover the where condition column must be an indexed column. If the select statement contains an “ORDER BY” clause then also the indexes can be used.
Note: When SQL Server is searching for information under the database first it verifies the best execution plan for retrieving the data and uses that plan which can be either a full-page scan and index scan.
The syntax for creating Index in SQL Server:
CREATE [UNIQUE] [CLUSTERED/ NON-CLUSTERED] INDEX <INDEX NAME> ON <TABLE NAME> (<COLUMN LIST>)
To see the index: sp_helpindex Employee
To drop an index: Drop index Employee.IX_Employee_Id
Types of indexes
Indexes are divided into two types such as
- Clustered index
- Non- clustered index
In the next article, I am going to discuss the Clustered Index in SQL Server with some examples. Here, in this article, I try to explain the need for Indexes in SQL Server step by step with one example. I hope you enjoy this article.