SQL Server

Top 20 tips to optimize your T-SQL code (Part-1)

Top 20 tips to optimize your T-SQL code
(Part-1)

Author: Basit A. Farooq

Troubleshooting the database performance bottlenecks is not an easy task, since it can be influenced by a lot of different factors. However, common symptoms of SQL Server database performance bottlenecks are related to bad Transact-SQL (T-SQL) code. This is because, the core database application logic is done through T-SQL statements whether they are issued directly to retrieve the data from SQL Server database or submitted behind the scenes through an API. Therefore, the best option for us to tune the performance of a SQL Server database application is by optimizing the underlying T-SQL code. In this first of the two part article series, I will share 10 of the 20 tips which you must follow to optimize T-SQL code.

Tip#1: Use SET NOCOUNT

To improve performance, try to add SET NOCOUNT ON statement at the top of your batches, stored procedures and triggers. That’s because, when specified, does not return the number of rows affected message by a SQL statement. It also reduces network traffic, because our client will not receive any message indicating the number of rows affected by a SQL statement.

Tip#2: Use TOP keyword

Try to use TOP keyword in the select statements, if we need to return only the first n rows. This can improve performance of our queries, as the smaller result set will be returned. It can also reduce the traffic between the server and the clients.

Tip#3: Avoid using (NOLOCK) query hint

The NOLOCK query hint is one of the most common practices in T-SQL, however one of the worst. Most developer’s perception is that the risk of using NOLOCK hint is the possibility of getting inconsistent data, since it only reads rows and not have to wait till others have committed their Select’s, Update’s, and Delete’s. In short, the effect of using NOLOCK hint is the possibility of getting uncommitted reads (aka dirty reads), which is true. However, there is more to it than just reading the uncommitted rows. As we know, the transactions do more than just selecting, updating and deleting rows, for example, the transaction often requires an index that is updated or run out of space on the data page. This may require the allocation of new pages and relocation of existing rows on the existing page to this new page (page split). Because of this, you may be missing a number of rows and / or have rows twice, which usually is not allowed if you are running your queries without NOLOCK query hint.

Tip#4: Avoid the use of DISTINCT and UNION clauses

Where possible, try to avoid the use of DISTINCT or UNION clauses in your T-SQL queries. That’s because they can adversely affect the query performance due to sorting operation they require to identify and remove duplicates from the result set of the query.

Tip#5: Use actual column_list for your SELECT and INSERT statements

Although the column_list parameters are optional for the SELECT and INSERT statements, it is always a good idea to provide a full-column list for these statements. This is because when full-column list is not specified, SQL Server resolves full-column list whenever the SELECT and INSERT statements execute. Moreover, the SELECT or INSERT statement may generate an error if the underlying table schema changes. So to reduce network traffic and improve the overall performance of the SELECT query, use the actual columns names in the SELECT statement.

Tip#6: Prefer EXISTS keyword over IN keyword

When checking for existence of records, simply prefer EXISTS keyword over IN keyword. This is because, IN keyword operates on lists and return the complete result set from subqueries before doing further processing, while subqueries using the EXISTS keyword return either TRUE or FALSE and is faster because once the match is found, it will quit looking as the condition has proven true.

Tip#7: Avoid dynamic SQL

Avoid using dynamic SQL; try to find alternatives that do not constitute of dynamic SQL. If you use dynamic SQL, use sp_executesql instead of EXECUTE (EXEC) because sp_executesql is more efficient and versatile than EXECUTE, as it supports parameter substitution and generates execution plans that are more likely to be reused by SQL Server.

Tip#8: Use table variables for smaller data sets

Try to use table variables instead of temporary tables as table variables require less locking resources as well as less logging resources than the temporary tables, so table variables should be used whenever possible.

Tip#9: Have appropriate indexes on the tables for your queries

Table should have minimum of one clustered index and appropriate number of non clustered indexes, which should be created on columns of table based on query which is running following the priority order as WHERE clause, then JOIN clause, then ORDER BY clause and finally the SELECT clause.

Also, try to create indexes on columns that have integer values rather than character values. Because the integer values have less size than the size of characters values, so we can reduce the number of index pages which are being used to store the index keys. This finally reduces the number of reads required to read the index and enhances the overall index performance. If you need to join several tables very frequently, then consider creating index on the joined columns which can significantly improve performance of the queries against the joined tables.

Tip#10: Remove duplicate indexes

Microsoft SQL Server allows you to create multiple indexes on the same column of a table. This increases the likelihood of having duplicate indexes in the database because SQL Server does not prevent you from creating duplicate indexes, infect we can create up to 999 duplicate indexes on each table inside database. Duplicate indexes are indexes of same type, created on the same set of key columns in the same order, and same set of non-key columns (also known as included columns) with same or different order, but have different names. We should not create duplicate indexes on the table, and remove them if you find them. Having duplicate indexes on tables columns can significantly hurt the performance of your database because SQL Server has to maintain each duplicate index separately (such as updating these duplicate indexes during DML operations and calculating and updating statistics for these duplicate indexes). Moreover, SQL Server query optimizer will consider each of them when it optimizes queries, which can cause serious performance impact. In this article, we’ll understand what a duplicate index is and how we can find and remove the duplicate indexes from database.

No one deliberately creates duplicate indexes on the table. Sometimes, you can create duplicate indexes unknowingly. For example, creating primary key and unique key on the same column creates duplicate indexes because by default SQL Server creates clustered index and unique key created non-clustered index.

Continued to Part-2…