SQL Server CTE referred in self joins slow

Posted by Kharlos Dominguez on Stack Overflow See other posts from Stack Overflow or by Kharlos Dominguez
Published on 2010-06-16T15:39:54Z Indexed on 2010/06/16 15:42 UTC
Read the original article Hit count: 247

Hello,

I have written a table-valued UDF that starts by a CTE to return a subset of the rows from a large table. There are several joins in the CTE. A couple of inner and one left join to other tables, which don't contain a lot of rows. The CTE has a where clause that returns the rows within a date range, in order to return only the rows needed.

I'm then referencing this CTE in 4 self left joins, in order to build subtotals using different criterias.

The query is quite complex but here is a simplified pseudo-version of it

WITH DataCTE as
(
     SELECT [columns] FROM table
                      INNER JOIN table2
                      ON [...]

                      INNER JOIN table3
                      ON [...]

                      LEFT JOIN table3
                      ON [...]
)
SELECT [aggregates_columns of each subset] FROM DataCTE Main
LEFT JOIN DataCTE BananasSubset
               ON [...] 
             AND Product = 'Bananas'
             AND Quality = 100
LEFT JOIN DataCTE DamagedBananasSubset
               ON [...]
             AND Product = 'Bananas'
             AND Quality < 20
LEFT JOIN DataCTE MangosSubset
               ON [...]
GROUP BY [

I have the feeling that SQL Server gets confused and calls the CTE for each self join, which seems confirmed by looking at the execution plan, although I confess not being an expert at reading those.

I would have assumed SQL Server to be smart enough to only perform the data retrieval from the CTE only once, rather than do it several times.

I have tried the same approach but rather than using a CTE to get the subset of the data, I used the same select query as in the CTE, but made it output to a temp table instead.

The version referring the CTE version takes 40 seconds. The version referring the temp table takes between 1 and 2 seconds.

Why isn't SQL Server smart enough to keep the CTE results in memory?

I like CTEs, especially in this case as my UDF is a table-valued one, so it allowed me to keep everything in a single statement.

To use a temp table, I would need to write a multi-statement table valued UDF, which I find a slightly less elegant solution.

Did some of you had this kind of performance issues with CTE, and if so, how did you get them sorted?

Thanks,

Kharlos

© Stack Overflow or respective owner

Related posts about sql-server

Related posts about Performance