Keeping Your Data in Order
Data, data, data. It seems to be all around us now. Everywhere you look there is data stored electronically. Storing this data is easy, but retrieving the data sometimes can be slow and painful. An efficient way to SELECT this data is the use of indexes: clustered and non-clustered.
By default, SQL Server will insert a row of data into the table as it is presented with no logical order. As the number of rows increase it will be become increasingly harder for the SQL engine to find a specific row or a group of rows.
The easiest way to keep track of all this data is to keep it in order. A very common method of keeping things in order is to sort things alphabetically, just like a phone book. Ordering the data would allow the SQL Engine to SEEK the indexes of the table to find a specific entry in the phone book, rather than SCAN the entire table. It also helps SQL Server know how many of a single item there are once it finds the first entry. For example, how many entries are there in the phone book “Smith” (very common) vs “Smythe” (rare).
One of the solutions for this is what we call an “index”. A simple little short cut that tells the SQL Engine what names are on what page. A phone book index, usually found at the top, will tell a user what last names are found on each page. The SQL Engine index works the same way.
In SQL Server, there are two main types of table indexes, clustered and non-clustered. Clustered indexes “sort and store the data rows in the table or view based on their key values” (https://msdn.microsoft.com/en-us/library/ms190457.aspx). This is the crucial index and the one that can usually improve performance without any additional work. Tables that do not have a clustered (ordered) index are called “heaps”. Finding a single row among several 100,000 rows is like find a “needle in a haystack, a heap of hay! Plus, once you find the needle, you have to keep going through the entire stack, because you don’t know how many needles you’ll find!
Clustered indexes must be chosen carefully because you can only have one per table; data can only be stored one way. There are 4 simple rules in determining what makes a good clustered index and we at Sparkhound have a fun easy way to remember them: NUNS. A good table design will enforce this principle when deciding on which columns to include in your clustered index. Your clustered index should be Non-changing, Unique, Narrow, and Sequential. (Good Clustered Index Design)
The safest bet is to always go with a column named ID using int as data type and IDENTITY(1,1) as the parameter. A good clustered index is like a phone book index; it lets the SQL Engine know exactly what page to find your data. Clustered indexes are so important; the following script will help identify tables in a database that are a HEAP table and don’t have clustered indexes.
-- List all heap tables USE <name of database>; GO SELECT SCH.name + '.' + TBL.name AS TableName FROM sys.tables AS TBL INNER JOIN sys.schemas AS SCH ON TBL.schema_id = SCH.schema_id INNER JOIN sys.indexes AS IDX ON TBL.object_id = IDX.object_id AND IDX.type = 0 -- = Heap ORDER BY TableName
So, let’s get rid of those heaps by creating Clustered Indexes on those tables!
Once we have our data ordered by using a clustered index, sometimes we need to SELECT data by another field. This is where non-clustered indexes are helpful. A table can have many non-clustered indexes to help you find your data; but be warned the more non-clustered indexes a table has the more your database performance could suffer inserting or updating rows.
Using our phone book example scenario, the yellow pages usually divides the entries by classification. The example on the below shows the first couple of entries for “Dentists”. This is a perfect example of a non-clustered index. In a table, each yellow page entry would have a column defined as “JOB”. A non-clustered index on “JOB” would allow you to quickly and easily find all the dentists in your table.
Check out these links to learn more about clustered and non-clustered indexes:
8 Weeks of Indexes -- a tutorial on the basics of indexing
Stairway to SQL Server Indexes -- courtesy of SQL Server Central
If you are experience performance problems, slow applications, reports that never finish, or you would like to just discuss this topic further; feel free to contact anyone on our Database Team of SQL Server DBAs and we will gladly help you resolve some of your SQL pain.
Information and material in our blog posts are provided "as is" with no warranties either expressed or implied. Each post is an individual expression of our Sparkies. Should you identify any such content that is harmful, malicious, sensitive or unnecessary, please contact email@example.com