Verify uniqueness of new content

Posted by rogerkk on Programmers
Published on 2012-11-15T09:37:50Z

I'm working on a review site that has a minor problem with near-duplicate reviews: reviews that are almost identical across items, with just a few words changed. It would be very nice to catch these duplicates before a moderator approves them, and I'm hoping someone can chime in on the best strategy for getting there.

The site runs Ruby on Rails on a Postgres database and uses Thinking Sphinx for search (all on Heroku). So far the best option I see is to pull all the reviews out of the db and compare the strings with a module like amatch. That's not very efficient, so I guess I'd have to limit the number or age of the reviews scanned for dupes.
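To make the idea concrete, here is a minimal sketch of that brute-force comparison in plain Ruby. It is only an illustration under some assumptions: instead of amatch it uses a hand-rolled Jaccard similarity over three-word shingles (so it runs with the standard library alone), the method names `shingles`, `similarity` and `near_duplicates` are made up for this example, the 0.5 threshold is arbitrary, and the `recent` hash stands in for whatever limited set of reviews gets pulled from the db.

```ruby
require 'set'

# Build the set of overlapping n-word "shingles" for a text.
# Lowercasing and dropping punctuation makes small edits matter less.
def shingles(text, size = 3)
  words = text.downcase.scan(/[[:alnum:]']+/)
  return Set.new if words.empty?
  return Set.new([words.join(' ')]) if words.length <= size
  words.each_cons(size).map { |group| group.join(' ') }.to_set
end

# Jaccard similarity of the two shingle sets:
# 0.0 means nothing in common, 1.0 means identical sets.
def similarity(a, b)
  sa = shingles(a)
  sb = shingles(b)
  union = (sa | sb).size
  return 0.0 if union.zero?
  (sa & sb).size.to_f / union
end

# Compare one new review against a hash of { id => text } candidates and
# return [id, score] pairs at or above the threshold, best match first.
def near_duplicates(new_text, candidates, threshold = 0.5)
  candidates
    .map { |id, text| [id, similarity(new_text, text)] }
    .select { |_, score| score >= threshold }
    .sort_by { |_, score| -score }
end

# Hypothetical data: in the app this would be the recent reviews from the db.
recent = {
  1 => "Great little hotel, friendly staff and a very good breakfast every morning.",
  2 => "The pizza was cold and the service was slow."
}
incoming = "Great little hotel, helpful staff and a very good breakfast every morning."

p near_duplicates(incoming, recent)
```

Here only review 1 should be flagged, since a single changed word leaves most of the shingles intact. The cost is one comparison per candidate for every new review, which is why capping the candidates by number or age matters with this approach.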

Anyone got a better idea?

© Programmers or respective owner