Using local variables in T-SQL queries

A query plan is a set of steps generated by the database engine to retrieve data. Query plans are produced by the query optimizer from SQL statements.

SQL Server automatically caches query plans and try to reuse them whenever possible. For many applications (such as OLTP transactional applications), plan reuse is a very good thing since it avoids unneeded compilations that may take much time to complete each time a query is executed. SQL Server caches query plans (execution plans based on parameter assumptions) but not execution contexts (execution plans based on the actual parameters values). If you execute a query or stored procedure several times per second, you want to reuse the query plan as much as possible. However, when querying large tables, using the optimal plan is preferable since the queries may take several minutes to complete. In these cases, it is obviously better to save minutes (sometime hours) with an optimal plan at the cost of that extra 1-second of plan compilation.

In many cases you may end up with a sub-optimal query plan because the queries are compiled before the actual parameter values are known. Such is the case when local variables are used. Let’s create a sales table with 1 million sales on July 1st and 5 sales on July 2th with a index on the sale date (tested on SQL Server 2008 R2):

SET NOCOUNT ON

-- Drop Sales Table
IF EXISTS(select 1 from INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME='SALES_TABLE')
DROP TABLE SALES_TABLE

-- Create Sales Table
CREATE TABLE SALES_TABLE(
[SALES_ID] [int] NOT NULL IDENTITY(1,1) PRIMARY KEY CLUSTERED
,SALE_DATE [datetime] NOT NULL
,SALE_AMOUNT [numeric](28,10) NOT NULL
)

-- Insert Sales Data
DECLARE @I INT = 0
BEGIN TRAN
WHILE @I < 1000000
BEGIN
INSERT INTO SALES_TABLE(SALE_DATE,SALE_AMOUNT) SELECT '20110701', RAND() * 100.0
SET @I=@I+1
END

SET @I=0
WHILE @I < 5
BEGIN
INSERT INTO SALES_TABLE(SALE_DATE,SALE_AMOUNT) SELECT '20110702', RAND() * 100.0
SET @I=@I+1
END
COMMIT TRAN

-- Create index on Sale Date
CREATE NONCLUSTERED INDEX [IX_SALE_DATE] ON [SALES_TABLE]
(
[SALE_DATE] ASC
)

Let’s summarized the sales for July 2th:

-- Query with constant value
set statistics profile on
SELECT SUM([SALE_AMOUNT]) FROM SALES_TABLE WHERE [SALE_DATE]='20110702'
set statistics profile off

You will notice in the results that the engine does an Index Seek on the IX_SALE_DATE index. This is the optimal plan since there is only 5 sales on July 2th. Now let’s declare the local variable @mydate and set it to July 2th:

-- Parametrized query with local variable
declare @mydate datetime = '20110702'
set statistics profile on
SELECT SUM([SALE_AMOUNT]) FROM SALES_TABLE WHERE [SALE_DATE]=@mydate
set statistics profile off

SQL Server does not use the optimal plan (it does a clustered index scan instead). Why is that? This is because the engine simply ignores the local variable value and compiles a plan based on general statistics assumptions. The compiled query plan is “good enough” for just about any value of @mydate (note that sometimes the query optimizer is way off, you must make sure that you have enough statistics and that they are up-to-date). When you are using local variables in your queries and you can afford to lose one or two seconds, you should force query recompilation using this syntax:

Use OPTION (RECOMPILE)

-- Parametrized query with local variable and OPTION (RECOMPILE)
declare @mydate datetime = '20110702'
set statistics profile on
SELECT SUM([SALE_AMOUNT]) FROM SALES_TABLE WHERE [SALE_DATE]=@mydate OPTION (RECOMPILE)
set statistics profile off

Note that the same principles applies to stored procedure compilations (you can use WITH RECOMPILE argument with the EXEC statement) when you want the procedure to recompile with the provided parameters, thus avoiding parameter sniffing).