SQL Server Query Optimization Techniques PDF
SQL Server Query Optimization Techniques PDF
SQL Server Query Optimization Techniques PDF
GROUP BY COLUMN / ROLLUP (column) / CUBE (column) (5) GROUP BY (6) CUBE | ROLLUP
For determining the number of logical reads, you can turn the STATISTICS IO option ON.
SET STATISTICS IO ON
SELECT *
FROM tablename;
Table 'tablename'. Scan count 1, logical reads 33, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical
FROM table_name;
Instead of:
SELECT *
FROM table_name;
WITH COUNT_ROWS
SELECT rows
FROM sys.sysindexes
WHERE id = OBJECT_ID('table_name')
SELECT rows
FROM sys.partitions
Instead of:
SELECT COUNT(*)
FROM table_name;
FROM table_name
GROUP BY col_1;
Instead of:
SELECT Col_1, count(Col_1)
FROM table_name
GROUP BY Col_1
HAVING Col_1 != 'testvalue1'
SELECT col_1
FROM table_name1
FROM table_name2)
Instead of:
-- For SQL Server
SELECT col_1
FROM table_name1
FROM table_name2)
FROM table_name2)
Try to use operators like EXISTS, IN and JOINS appropriately in your query
Exists vs. In
The EXISTS function searches for the presence of a single row that meets the stated criteria, as opposed to the IN
statement that looks for all occurrences.
(a) Usually IN has the slowest performance.
(b) IN is efficient only when most of the filter criteria for selection are placed in the subquery of a SQL statement.
(c) EXISTS is efficient when most of the filter criteria for selection is in the main query of a SQL statement.
Example: Write the query as
SELECT *
FROM table1 t1
FROM table2 t2
Instead of:
SELECT *
FROM table1 t1
FROM table1 t1
Instead of:
SELECT *
FROM table1 t1
FROM table2 t2
FROM system_user
FROM system_user
MINUS
SELECT ac_user
FROM account)
Instead of:
SELECT *
FROM system_user
FROM account)
Use EXISTS instead of DISTINCT when using table joins that involves tables having one-to-many relationships
Example: Write the query as
SELECT d.col_id, d.col2
FROM table1 d
FROM table2 e
Instead of:
SELECT DISTINCT d.col_id, d.col2
FROM table1
UNION ALL
FROM table2;
Instead of:
SELECT id, col1, col2
FROM table1
UNION
FROM table2;
FROM TABLE
Instead of:
SELECT id, col1, col2
FROM TABLE
FROM TABLE
Instead of:
SELECT id, col1, col2
FROM TABLE
FROM TABLE
Instead of:
SELECT Col1, Col2
FROM TABLE
WHERE Col3 >= MAX(Col3)
Use NON-Column expression on one side of the SQL query as it will be processed before any other clause
Example: Write the query as
SELECT id, Col1, Col2
FROM TABLE
Instead of:
SELECT id, Col1, Col2
FROM TABLE
MAX(p.created) AS latest_post
FROM authors a
GROUP BY a.id
Instead of:
SELECT a.id,
(SELECT MAX(created)
FROM posts
FROM authors a
FROM a, b
UNION
SELECT *
FROM a, b
Instead of:
SELECT *
FROM a, b
SELECT *
FROM TABLE
-- Postfix wildcard
SELECT *
FROM TABLE
-- Prefix wildcard
SELECT *
FROM TABLE
Alias Usage
A short alias is parsed more quickly than a long table name or alias.
Example: Write the query as
SELECT e.first_name, e.last_name, c.country
Instead of:
SELECT first_name, last_name, country
WHERE country_id = id
FROM emp e2
Instead of:
SELECT OUTER.*
SOME MORE TIPS FOR OPTIMIZATION OF QUERIES/ TABLES/ STORED PROCEDURES IN SQL
SERVER
a. Table should have minimum of one clustered index and appropriate number of non-clustered indexes, which
should be created on columns of table based on query which is running following the priority order as WHERE
clause, then JOIN clause, then ORDER BY clause and finally the SELECT clause.
b. Avoid using Triggers if possible; incorporate the logic of trigger in a stored procedure.
c. Table should have a primary key.
d. Try to use constraints for selection instead of using triggers. Constraints are efficient than triggers enhance
performance. So, you should use constraints instead of triggers, whenever possible.
e. Try to use table variables instead of temporary tables as table variables require less locking resources as well as
less logging resources than the temporary tables, so table variables should be used whenever possible.
f. Avoid the use of views or replace views with original tables.
g. Try to avoid the use of DISTINCT clause, wherever possible. As the DISTINCT clause will result in performance
degradation, we should use this clause only when it is necessary or unavoidable.
h. Try to add SET NOCOUNT ON statement into your stored procedures as it stops the message indicating the
number of rows affected by a SQL statement. It also reduces network traffic, because our client will not receive
any message indicating the number of rows affected by a SQL statement.
i. Try to use TOP keyword or the SET ROWCOUNT statement in the select statements, if we need to return only the
first n rows. This can improve performance of our queries, as the smaller result set will be returned. It can also
reduce the traffic between the server and the clients.
j. Try to use user-defined functions to keep the encapsulated code for reuse in future. The user-defined functions
(UDFs) contain one or more SQL statements that can be used to encapsulate code for reuse. Using UDFs can also
reduce the network traffic.
k. If possible move the logic of UDF to SP as well.
l. If you need to delete all rows of a table, try to use TRUNCATE TABLE command instead of DELETE command.
Using the TRUNCATE TABLE is a faster way to delete all rows of a tables, because it removes all rows from a
table without logging each row delete.
m. Remove any unnecessary joins from tables.
n. If there is cursor used in query, see if there is any other way to avoid the usage of this (either by SELECT … INTO
or INSERT … INTO, etc.). Try to avoid using cursors whenever possible. As SQL Server cursors can result in
some performance degradation as compared to select statements. Try to use correlated subqueries or derived tables
for row-by-row operations on tables.
o. When writing a subquery (a SELECT statement within the WHERE or HAVING clause of another SQL
statement):
1. Try to use a correlated (refers to at least one value from the outer query) subquery when the return is
relatively small and/or other criteria are efficient i.e. if the tables within the subquery have efficient indexes.
2. Try to use a non-correlated (does not refer to the outer query) subquery when dealing with large tables from
which you expect a large return (many rows) and/or if the tables within the subquery do not have efficient
indexes.
3. Ensure that multiple subqueries are in the most efficient order.
4. Remember that rewriting a subquery as a join can sometimes increase efficiency.
p. Use char/varchar columns data type, instead of nchar/nvarchar data type if we do not need to store Unicode data.
The char/varchar data value uses only one byte to store one character; whereas the nchar/nvarchar value uses two
bytes to store one character, so the char/varchar columns use two times less space to store data as compared to
nchar/nvarchar data columns.
q. Try to use stored procedures instead of heavy queries as they can reduce network traffic, because our client will
send to server only stored procedure name (along with some parameters) instead of heavy and lengthy queries text.
Stored procedures can be used to enhance security. For example, we can give different users, different set of
permissions to execute the stored procedure to work with the restricted set of the columns and data.
r. We should try to return an integer value from a RETURN statement instead of returning an integer value as a part
of a record set. As the RETURN statement exits from a stored procedure unconditionally, so the statements
following the RETURN statement are not executed. The RETURN statement is generally used for error checking,
but we can also use this statement to return an integer value for any other reason. Using RETURN statement can
improve performance as SQL Server will not create a record set.
s. Try to drop indexes that are not being used. Because each index takes up disk space and slow the DML operations,
we should drop indexes that are not used. We can use Index Wizard to identify indexes that are not being used in
our SQL queries.
t. We should try to create indexes on columns that have integer values rather than character values. Because the
integer values have less size than the size of characters values, so we can reduce the number of index pages which
are being used to store the index keys. This finally reduces the number of reads required to read the index and
enhances the overall index performance.
u. If we need to join several tables very frequently, then we should consider creating index on the joined columns
which can significantly improve performance of the queries against the joined tables.
v. Try to avoid any operations on the fields, where ever possible. Some operations will prevent the use of index on a
field even if it exists—for example, ltrim(rtrim(FieldColumnName)) as such operations will degrade the
performance. For example, instead of using the condition cast(DateColumn as varchar(20)) = @dateStr, we should
try to convert @dateStr to an expression of datetime type and then compare it to DateColumn value.