Distinctly LINQ – Getting a Distinct List of Objects

Posted by David Totzke on Geeks with Blogs
Published on Mon, 06 Dec 2010 15:13:50 GMT

Let’s say that you have a list of objects that contains duplicate items and you want to extract a subset of distinct items.  This is pretty straightforward in the trivial case where duplicates compare as equal by value, as with the integers in the following example:
    List<int> ages = new List<int> { 21, 46, 46, 55, 17, 21, 55, 55 };
    IEnumerable<int> distinctAges = ages.Distinct();
    Console.WriteLine("Distinct ages:");
    foreach (int age in distinctAges)
    {
        Console.WriteLine(age);
    }
    /*
    This code produces the following output:
    Distinct ages:
    21
    46
    55
    17

    */

What if you are working with reference types instead?  Imagine a list of search results where items in the results, while unique in and of themselves, also point to a parent.  We’d like to be able to select a bunch of items in the list but then see only a distinct list of parents.  Distinct isn’t going to help us much on its own as all of the items are distinct already.  Perhaps we can create a class with just the information we are interested in like the Id and Name of the parents. 
    public class SelectedItem 
    {
        public int ItemID { get; set; }
        public string DisplayName { get; set; }

    }

We can then use LINQ to populate a list containing objects with just the information we are interested in and then get rid of the duplicates.

    IEnumerable<SelectedItem> list =
        (from item in ResultView.SelectedRows.OfType<Contract.ReceiptSelectResults>()
            select new SelectedItem { ItemID = item.ParentId, DisplayName = item.ParentName })
            .Distinct();

Most of you will have guessed that this didn’t work.  Even though some of our objects now carry identical data, SelectedItem is a reference type that does not override Equals or GetHashCode, so Distinct() compares object references rather than property values and every instance is still considered unique.  What we need is a way to define equality for the Distinct() extension method.
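
The behaviour is easy to reproduce in isolation.  What follows is a minimal sketch, assuming the SelectedItem class shown earlier; the DefaultDistinctDemo helper and the sample values are made up for illustration and are not part of the original code.  Two instances with the same property values both survive a parameterless Distinct().

```csharp
using System.Collections.Generic;
using System.Linq;

public class SelectedItem
{
    public int ItemID { get; set; }
    public string DisplayName { get; set; }
}

// Hypothetical helper, for illustration only.
public static class DefaultDistinctDemo
{
    public static int CountDistinct()
    {
        var items = new List<SelectedItem>
        {
            new SelectedItem { ItemID = 1, DisplayName = "Parent A" },
            new SelectedItem { ItemID = 1, DisplayName = "Parent A" }, // same values, different instance
            new SelectedItem { ItemID = 2, DisplayName = "Parent B" }
        };

        // No comparer is supplied and SelectedItem does not override
        // Equals/GetHashCode, so instances are compared by reference
        // and nothing is removed.
        return items.Distinct().Count();
    }
}
```

Calling DefaultDistinctDemo.CountDistinct() returns 3 rather than 2, because each `new SelectedItem` is a separate reference.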

IEqualityComparer<T>

Looking at the Distinct method we see that there is an overload that accepts an IEqualityComparer<T>.  We can simply create a class that implements this interface and that allows us to define equality for our SelectedItem class.

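    public class SelectedItemComparer : IEqualityComparer<SelectedItem>
    {
        // Two items are equal when both the ID and the display name match.
        public bool Equals(SelectedItem x, SelectedItem y)
        {
            if (ReferenceEquals(x, y)) return true;
            if (x == null || y == null) return false;
            return x.ItemID == y.ItemID && x.DisplayName == y.DisplayName;
        }

        // Build the hash from the same properties that Equals compares,
        // so that equal items always produce equal hash codes.
        public int GetHashCode(SelectedItem obj)
        {
            if (obj == null) return 0;
            string code = obj.DisplayName + obj.ItemID.ToString();
            return code.GetHashCode();
        }
    }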

In the Equals method we simply do whatever comparisons are necessary to determine equality and then return true or false.  Take note of the implementation of the GetHashCode method.  GetHashCode must return the same value for two different objects if our Equals method says they are equal.  Get this wrong and your comparer won’t work.  Distinct() groups items by hash code first and only calls Equals on items whose hash codes match, so even though Equals would return true, mismatched hash codes will cause the comparison to fail.  For our example, we simply build a string from the properties of the object and then call GetHashCode() on that.
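
To see the hash code rule in action, here is a small sketch that pits a deliberately broken comparer against a consistent one.  Everything in it other than SelectedItem and IEqualityComparer<T> is hypothetical: the two comparer classes, the HashContractDemo helper and the sample data exist only for illustration, and both comparers treat items as equal on ItemID alone to keep the example short.

```csharp
using System.Collections.Generic;
using System.Linq;

public class SelectedItem
{
    public int ItemID { get; set; }
    public string DisplayName { get; set; }
}

// Deliberately broken: Equals compares ItemID, but the hash code comes from
// a different property, so "equal" items can report different hash codes.
public class MismatchedHashComparer : IEqualityComparer<SelectedItem>
{
    public bool Equals(SelectedItem x, SelectedItem y)
    {
        return x.ItemID == y.ItemID;
    }

    public int GetHashCode(SelectedItem obj)
    {
        return obj.DisplayName.Length;
    }
}

// Consistent: the hash code is derived from the same property Equals compares.
public class ConsistentHashComparer : IEqualityComparer<SelectedItem>
{
    public bool Equals(SelectedItem x, SelectedItem y)
    {
        return x.ItemID == y.ItemID;
    }

    public int GetHashCode(SelectedItem obj)
    {
        return obj.ItemID;
    }
}

// Hypothetical helper, for illustration only.
public static class HashContractDemo
{
    public static int CountDistinct(IEqualityComparer<SelectedItem> comparer)
    {
        var items = new List<SelectedItem>
        {
            new SelectedItem { ItemID = 1, DisplayName = "A" },  // name length 1
            new SelectedItem { ItemID = 1, DisplayName = "AB" }  // name length 2
        };

        return items.Distinct(comparer).Count();
    }
}
```

With MismatchedHashComparer the two items hash to 1 and 2, Equals is never consulted, and both are kept.  With ConsistentHashComparer they share a hash code, Equals returns true, and only one remains.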

Now all we have to do is pass an instance of our IEqualityComparer<SelectedItem> implementation to Distinct and all will be well:

    IEnumerable<SelectedItem> list =
        (from item in ResultView.SelectedRows.OfType<Contract.ReceiptSelectResults>()
            select new SelectedItem { ItemID = item.ParentId, DisplayName = item.ParentName })
            .Distinct(new SelectedItemComparer());
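
For anyone who wants to try this without the grid, here is a self-contained sketch of the whole approach.  It makes a few assumptions: ReceiptResult is a hypothetical stand-in for Contract.ReceiptSelectResults with only the properties the query needs, a plain IEnumerable replaces ResultView.SelectedRows, and DistinctParentsDemo is an illustrative helper rather than anything from the real project.

```csharp
using System.Collections.Generic;
using System.Linq;

public class SelectedItem
{
    public int ItemID { get; set; }
    public string DisplayName { get; set; }
}

public class SelectedItemComparer : IEqualityComparer<SelectedItem>
{
    public bool Equals(SelectedItem x, SelectedItem y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.ItemID == y.ItemID && x.DisplayName == y.DisplayName;
    }

    public int GetHashCode(SelectedItem obj)
    {
        if (obj == null) return 0;
        return (obj.DisplayName + obj.ItemID.ToString()).GetHashCode();
    }
}

// Hypothetical stand-in for Contract.ReceiptSelectResults.
public class ReceiptResult
{
    public int ReceiptId { get; set; }
    public int ParentId { get; set; }
    public string ParentName { get; set; }
}

// Hypothetical helper, for illustration only.
public static class DistinctParentsDemo
{
    public static List<SelectedItem> GetDistinctParents(IEnumerable<ReceiptResult> selectedRows)
    {
        // Project each row to its parent, then let the comparer
        // decide which projected parents are duplicates.
        return (from item in selectedRows
                select new SelectedItem { ItemID = item.ParentId, DisplayName = item.ParentName })
               .Distinct(new SelectedItemComparer())
               .ToList();
    }
}
```

Passing in three rows where two share ParentId 10 and ParentName "Parent A" should yield a list of two parents.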

 

Enjoy.

Dave
Just because I can…


© Geeks with Blogs or respective owner