Cardinality Estimation Bug with Lookups in SQL Server 2008 onward

Posted by Paul White on SQL Blog See other posts from SQL Blog or by Paul White
Published on Mon, 15 Oct 2012 06:25:05 GMT Indexed on 2012/10/15 9:45 UTC
Read the original article Hit count: 409

Filed under:

Cost-based optimization stands or falls on the quality of cardinality estimates (expected row counts). If the optimizer has incorrect information to start with, it is quite unlikely to produce good quality execution plans except by chance. There are many ways we can provide good starting information to the optimizer, and even more ways for cardinality estimation to go wrong. Good database people know this, and work hard to write optimizer-friendly queries with a schema and metadata (e.g. statistics) that reduce the chances of poor cardinality estimation producing a sub-optimal plan. Today, I am going to look at a case where poor cardinality estimation is Microsoft’s fault, and not yours.

SQL Server 2005

     SELECT

        th.ProductID,

        th.TransactionID,

        th.TransactionDate

    FROM Production.TransactionHistory AS th 

    WHERE 

        th.ProductID = 1 

        AND th.TransactionDate BETWEEN '20030901' AND '20031231';

The query plan on SQL Server 2005 is as follows (if you are using a more recent version of AdventureWorks, you will need to change the year on the date range from 2003 to 2007):

There is an Index Seek on ProductID = 1, followed by a Key Lookup to find the Transaction Date for each row, and finally a Filter to restrict the results to only those rows where Transaction Date falls in the range specified. The cardinality estimate of 45 rows at the Index Seek is exactly correct. The table is not very large, there are up-to-date statistics associated with the index, so this is as expected.

The estimate for the Key Lookup is also exactly right. Each lookup into the Clustered Index to find the Transaction Date is guaranteed to return exactly one row. The plan shows that the Key Lookup is expected to be executed 45 times. The estimate for the Inner Join output is also correct – 45 rows from the seek joining to one row each time, gives 45 rows as output.

The Filter estimate is also very good: the optimizer estimates 16.9951 rows will match the specified range of transaction dates. Eleven rows are produced by this query, but that small difference is quite normal and certainly nothing to worry about here. All good so far.

SQL Server 2008 onward

The same query executed against an identical copy of AdventureWorks on SQL Server 2008 produces a different execution plan:

The optimizer has pushed the Filter conditions seen in the 2005 plan down to the Key Lookup. This is a good optimization – it makes sense to filter rows out as early as possible. Unfortunately, it has made a bit of a mess of the cardinality estimates.

The post-Filter estimate of 16.9951 rows seen in the 2005 plan has moved with the predicate on Transaction Date. Instead of estimating one row, the plan now suggests that 16.9951 rows will be produced by each clustered index lookup – clearly not right! This misinformation also confuses SQL Sentry Plan Explorer:

Plan Explorer shows 765 rows expected from the Key Lookup (it multiplies a rounded estimate of 17 rows by 45 expected executions to give 765 rows total).

Workarounds

One workaround is to provide a covering non-clustered index (avoiding the lookup avoids the problem of course):

    CREATE INDEX nc1 

    ON Production.TransactionHistory (ProductID) 

    INCLUDE (TransactionDate);

With the Transaction Date filter applied as a residual predicate in the same operator as the seek, the estimate is again as expected:

We could also force the use of the ultimate covering index (the clustered one):

    SELECT

        th.ProductID,

        th.TransactionID,

        th.TransactionDate

    FROM Production.TransactionHistory AS th WITH (INDEX(1))

    WHERE 

        th.ProductID = 1 

        AND th.TransactionDate BETWEEN '20030901' AND '20031231';

Summary

Providing a covering non-clustered index for all possible queries is not always practical, and scanning the clustered index will rarely be optimal. Nevertheless, these are the best workarounds we have today.

In the meantime, watch out for poor cardinality estimates when a predicate is applied as part of a lookup.

The worst thing is that the estimate after the lookup join in the 2008+ plans is wrong. It’s not hopelessly wrong in this particular case (45 versus 16.9951 is not the end of the world) but it easily can be much worse, and there’s not much you can do about it. Any decisions made by the optimizer after such a lookup could be based on very wrong information – which can only be bad news.

If you think this situation should be improved, please vote for this Connect item.

twitter: @SQL_Kiwi
email: [email protected]

Developer IT