Find The Bug

Posted by Alois Kraus on Geeks with Blogs
Published on Sun, 06 Mar 2011 21:07:05 GMT

What does this code print and why?

    HashSet<int> set = new HashSet<int>();
    int[] data = new int[] { 1, 2, 1, 2 };
    var unique = from i in data
                 where set.Add(i)
                 select i;
    // Compiles to: var unique = Enumerable.Where(data, (i) => set.Add(i));
    foreach (var i in unique)
    {
        Console.WriteLine("First: {0}", i);
    }

    foreach (var i in unique)
    {
        Console.WriteLine("Second: {0}", i);
    }

The output is:

First: 1
First: 2

Why does the second loop print nothing? The reason is that a LINQ query does not cache its results; it re-runs the filter over the source for every new enumeration. Since my Where clause depends on state (the HashSet decides which entries are part of the output), the second enumeration yields an empty sequence: HashSet.Add returns false for every value it has already seen during the first pass, which leaves nothing to return a second time.
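
To make the re-evaluation visible, here is a minimal sketch that counts how often the predicate runs. It assumes C# 9 or later (top-level statements); the names seen, predicateCalls, firstPass and secondPass are not part of the original code and are introduced only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var seen = new HashSet<int>();
int[] data = { 1, 2, 1, 2 };
int predicateCalls = 0; // illustration only: counts predicate invocations

// Building the query runs nothing; Where only stores the source and the predicate.
IEnumerable<int> unique = data.Where(i =>
{
    predicateCalls++;
    return seen.Add(i); // true only the first time a value is added
});
Console.WriteLine("Calls after building the query: {0}", predicateCalls); // 0

// Each string.Join enumerates the query from the start.
string firstPass = string.Join(", ", unique);
Console.WriteLine("First pass:  [{0}] after {1} calls", firstPass, predicateCalls);  // [1, 2] after 4 calls

string secondPass = string.Join(", ", unique);
Console.WriteLine("Second pass: [{0}] after {1} calls", secondPass, predicateCalls); // [] after 8 calls
```

The predicate runs four more times during the second pass, but by then the set already contains every value, so each call returns false.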

The solution is quite simple: use the Distinct extension method, or cache the results by calling ToList() or ToArray() on the LINQ query.
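
Both fixes could look roughly like this. Again a sketch that assumes C# 9 or later for top-level statements; the variable names are illustrative and not taken from the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int[] data = { 1, 2, 1, 2 };

// Option 1: Distinct tracks seen values internally and starts fresh
// on every enumeration, so the query can safely be enumerated twice.
IEnumerable<int> distinct = data.Distinct();
string distinctFirst = string.Join(", ", distinct);
string distinctSecond = string.Join(", ", distinct);
Console.WriteLine("Distinct: [{0}] then [{1}]", distinctFirst, distinctSecond);

// Option 2: keep the stateful filter, but run it exactly once and
// store the result. Later loops read the list, not the query.
var seen = new HashSet<int>();
List<int> cached = data.Where(i => seen.Add(i)).ToList();
string cachedFirst = string.Join(", ", cached);
string cachedSecond = string.Join(", ", cached);
Console.WriteLine("ToList:   [{0}] then [{1}]", cachedFirst, cachedSecond);
```

Distinct is the cleaner choice when removing duplicates is the whole point; ToList() or ToArray() is the general escape hatch whenever a query has side effects or is expensive to re-run.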

Lesson learned: never forget to think about state in Where clauses!
