Analysing Indexes Part 4 - Size does matter
( can't post this in one go so it's going to be a few parts )
v I’ll say now that this is a whole lot of figures and values, I’ve attempted to summarize what I can at the end of the article; for those who want to cut to the chase.
o ( The naming used for these examples is not representative of any conventions or standards )
v As part of my analysis of indexes I was particularly interested in the size of indexes, the database I am working with has many tables of several hundred million rows, these tables are often wide with clustered primary keys of up to 6 columns and twenty or more secondary indexes.
v I’m not interested in the theory of indexing here or making any ground breaking revelations, I’m just observing that an awareness of how the index is formed may help us understand why the optimiser may sometimes seem to ignore what we may think is a perfectly good index
v My initial interest in this subject was aroused when I noticed I had an index of 8gb for a single datetime column on a 10gb table which had a considerable number of columns.
1. For my tests I created two tables and populated them with two million rows each.
Why two tables? Well I wanted to see what difference a nullable column might have.
Whilst I would have liked to create much larger tables the reality is that I needed to run repeated calls to sys.dm_db_index_physical_stats and I know from experience that these calls can take a long time to run on a four hundred million row table.
Results for the initial table creation – the nullable table returns the exact same size results as my non nullable table., using sp_spaceused and sys.dm_db_index_physical_stats
|
name |
rows |
reserved |
data |
index_size |
unused |
|
|
Testtable2 |
2000000 |
1333576 KB |
1333392 KB |
8 KB |
176 KB |
|
|
with nulls |
|
|
|
|
|
|
|
index_id |
index_type_desc |
index_depth |
index_level |
page_count |
record_count |
|
|
0 |
HEAP |
1 |
0 |
166674 |
2000000 |
577 |
|
|
|
|
|
|
|
|
|
name |
rows |
reserved |
data |
index_size |
unused |
|
|
TestTable |
2000000 |
1333576 KB |
1333392 KB |
8 KB |
176 KB |
|
|
no nulls |
|
|
|
|
|
|
|
index_id |
index_type_desc |
index_depth |
index_level |
page_count |
record_count |
|
|
0 |
HEAP |
1 |
0 |
166674 |
2000000 |
577 |
For all my results I will scale up to four hundred million rows, I’ll refer to this as my “target table”
2. Leaving the tables as HEAPs I created a non clustered secondary index on a datetime column, what was noticeable was that the index on the nullable column showed an average record size 3 bytes larger, this might not sound much but on my target table of four hundred million rows that equates to 1.1gb.
Testtable2 ( no nulls )
|
Index
id |
index_typec |
Index
depth |
Index
level |
Page
count |
Record
count |
avg_record
size_in
bytes |
|
0 |
HEAP |
1 |
0 |
166674 |
2000000 |
577 |
|
2 |
NONCLUSTERED INDEX |
3 |
0 |
4695 |
2000000 |
17 |
|
2 |
NONCLUSTERED INDEX |
3 |
1 |
15 |
4695 |
23 |
|
2 |
NONCLUSTERED INDEX |
3 |
2 |
1 |
15 |
23 |
Testtable ( nulls )
|
Index
id |
index_typec |
Index
depth |
Index
level |
Page
count |
Record
count |
avg_record
size_in
bytes |
|
0 |
HEAP |
1 |
0 |
166674 |
2000000 |
577 |
|
2 |
NONCLUSTERED INDEX |
3 |
0 |
5435 |
2000000 |
20 |
|
2 |
NONCLUSTERED INDEX |
3 |
1 |
19 |
5435 |
|