Table Scan vs. Index Scan
We know that an index seek would be preferred over an index scan, but what should we do if we see a table scan in our execution plan? This demonstration shows the difference between a table scan and an index scan.
A table scan is performed on a table which does not have an index upon it (a heap) – it looks at the rows in the table. An index scan is performed on an indexed table – the index itself.
I’ve seen it written that the two are the same, and I can understand why that conclusion is made. In fact running side-by-side there is little difference between a table scan and an index scan. In terms of performance, they both run the same. The performance balance between the two is changed when you are wanting to do any meaningful operations on the data, such as join another table.
I’d like to demonstrate first the similarity of the two without furthering the query:
SET NOCOUNT ON; USE [master] GO IF NOT EXISTS (SELECT * FROM sys.databases WHERE [name] = 'Index_Test') CREATE DATABASE Index_Test GO USE [Index_Test] GO IF EXISTS(select * FROM sys.tables WHERE [name] = 'without_index') DROP TABLE without_index CREATE TABLE without_index ( ID int NOT NULL, value varchar(50) NOT NULL ) IF EXISTS(select * FROM sys.tables WHERE [name] = 'with_index') DROP TABLE with_index CREATE TABLE with_index ( ID int NOT NULL PRIMARY KEY CLUSTERED, value varchar(50) NOT NULL ) INSERT INTO with_index SELECT 1,'testvalue' UNION SELECT 2,'anothertestvalue' UNION SELECT 3, 'my last test value' INSERT INTO without_index SELECT 1,'testvalue' UNION SELECT 2,'anothertestvalue' UNION SELECT 3, 'my last test value'
Then if Include Actual Execution Plan (CTRL+M), and execute the following:
SELECT * FROM dbo.without_index SELECT * FROM dbo.with_index
You’ll notice that the first statement on the table without the index will cause a ‘Table Scan’, while the second statement causes an ‘Index Scan’.
Notice as well that the I/O usage and CPU cost for these two operations are different, however the sub tree cost for the two operations is identical.
Clustered Index Scan:
If we were to create a non-clustered index on the table ‘without_index’:
CREATE NONCLUSTERED INDEX [IX_TEST] ON [dbo].[without_index] ( [ID] ASC, [value] ASC )
And we perform the two selects statements we’ll get two index scans:
You’ll notice that the I/O and CPU cost is the same.
This is only for 3 rows of ordered data inserted. If you take that to millions of rows the results are still the same. The real benefit to having an index on the table is made aparent when you want to perform any operations on this data such as joining another table. If we magnify the scale of the inserts up to 1m, when a join onto another table is performed the table scan performs a sort and uses a paralellism to hash join the data. Whereas our indexed table is straing joined as the data is already sorted. This is when the performance cost is 96% of the batch for the without_index table and just 4% for with_index table over 1m rows.
So we know that index seeks are preferred over index scans. This demonstration shows that index scans are preferred over table scans. In fact, a good rule of thumb is to always include at least one index on a table this way we avoid table scans altogether.
This final script will drop the database:
--DROP DATABASE USE [master] GO ALTER DATABASE [Index_Test] SET SINGLE_USER WITH ROLLBACK IMMEDIATE GO DROP DATABASE [Index_Test] GO