Remove duplicate records/objects uniquely identified by multiple attributes

Posted by keruilin on Stack Overflow See other posts from Stack Overflow or by keruilin
Published on 2010-05-07T20:10:02Z Indexed on 2010/05/07 20:18 UTC
Read the original article Hit count: 256

I have a model called HeroStatus with the following attributes:

  • id
  • user_id
  • recordable_type
  • hero_type (can be NULL!)
  • recordable_id
  • created_at

There are over 100 hero_statuses, and a user can have many hero_statuses, but can't have the same hero_status more than once.

A user's hero_status is uniquely identified by the combination of recordable_type + hero_type + recordable_id. What I'm trying to say essentially is that there can't be a duplicate hero_status for a specific user.

Unfortunately, I didn't have a validation in place to assure this, so I got some duplicate hero_statuses for users after I made some code changes. For example:

user_id = 18
recordable_type = 'Evil'
hero_type = 'Halitosis'
recordable_id = 1
created_at = '2010-05-03 18:30:30'

user_id = 18
recordable_type = 'Evil'
hero_type = 'Halitosis'
recordable_id = 1
created_at = '2009-03-03 15:30:00'

user_id = 18
recordable_type = 'Good'
hero_type = 'Hugs'
recordable_id = 1
created_at = '2009-02-03 12:30:00'

user_id = 18
recordable_type = 'Good'
hero_type = NULL
recordable_id = 2
created_at = '2009-012-03 08:30:00'

(Last two are not a dups obviously. First two are.) So what I want to do is get rid of the duplicate hero_status. Which one? The one with the most-recent date.

I have three questions:

  1. How do I remove the duplicates using a SQL-only approach?

  2. How do I remove the duplicates using a pure Ruby solution? Something similar to this: http://stackoverflow.com/questions/2790004/removing-duplicate-objects.

  3. How do I put a validation in place to prevent duplicate entries in the future?

© Stack Overflow or respective owner

Related posts about ruby-on-rails

Related posts about ruby