Algorithm to classify a list of products?

Posted by Martin on Stack Overflow
Published on 2009-03-29T20:44:05Z Indexed on 2010/05/22 20:41 UTC


I have a list representing products which are more or less the same. For instance, in the list below, they are all Seagate hard drives.

  1. Seagate Hard Drive 500Go
  2. Seagate Hard Drive 120Go for laptop
  3. Seagate Barracuda 7200.12 ST3500418AS 500GB 7200 RPM SATA 3.0Gb/s Hard Drive
  4. New and shinny 500Go hard drive from Seagate
  5. Seagate Barracuda 7200.12
  6. Seagate FreeAgent Desk 500GB External Hard Drive Silver 7200RPM USB2.0 Retail

To a human, hard drives 3 and 5 are clearly the same product. We could go a little further and treat products 1, 3, 4 and 5 as the same, and put products 2 and 6 in separate categories.
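
To make the difficulty concrete, here is a minimal sketch of a naive baseline: comparing titles by token overlap (Jaccard similarity). The helper names `tokens` and `jaccard` are illustrative, and the rule that rewrites the French unit "Go" as "GB" is an assumption about how the titles should be normalized. It is not meant as a solution, only as a way to see why raw word overlap does not match the human grouping.

```python
import re

def tokens(title):
    """Lowercase the title, rewrite '500Go' as '500gb' (assumed normalization),
    and return the set of alphanumeric tokens."""
    text = title.lower()
    text = re.sub(r"(\d+)\s*go\b", r"\1gb", text)
    return set(re.findall(r"[a-z0-9.]+", text))

def jaccard(a, b):
    """Jaccard similarity of two titles: shared tokens / all distinct tokens."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0

products = [
    "Seagate Hard Drive 500Go",
    "Seagate Hard Drive 120Go for laptop",
    "Seagate Barracuda 7200.12 ST3500418AS 500GB 7200 RPM SATA 3.0Gb/s Hard Drive",
    "New and shinny 500Go hard drive from Seagate",
    "Seagate Barracuda 7200.12",
    "Seagate FreeAgent Desk 500GB External Hard Drive Silver 7200RPM USB2.0 Retail",
]

print(jaccard(products[0], products[3]))  # products 1 and 4 -> 0.5
print(jaccard(products[2], products[4]))  # products 3 and 5 -> 0.25
print(jaccard(products[0], products[1]))  # products 1 and 2 -> 3/7, about 0.43
```

With this measure, products 3 and 5 (the same drive to a human) score lower than products 1 and 2 (different capacities), which suggests that some tokens, such as model names and capacities, need to count for more than others.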

We have a huge list of products that I would like to classify. Does anybody have an idea of what would be the best algorithm for this? Any suggestions?

I thought of a Bayesian classifier, but I am not sure it is the best choice. Any help would be appreciated!
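
For reference, this is roughly what a Bayesian classifier would look like here: a small multinomial naive Bayes sketch with add-one smoothing, written in plain Python. The class name `NaiveBayes` and the category labels `internal`, `laptop` and `external` are hypothetical and chosen only for illustration. Note that this approach assumes the categories are known in advance and that some titles have already been labeled by hand, which the list above does not provide.

```python
import math
import re
from collections import Counter, defaultdict

def words(title):
    """Split a lowercased title into alphanumeric tokens."""
    return re.findall(r"[a-z0-9.]+", title.lower())

class NaiveBayes:
    """Multinomial naive Bayes over title words, with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> number of titles
        self.vocab = set()

    def train(self, title, category):
        self.doc_counts[category] += 1
        for w in words(title):
            self.word_counts[category][w] += 1
            self.vocab.add(w)

    def classify(self, title):
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for cat, n_docs in self.doc_counts.items():
            # log prior + sum of smoothed log likelihoods of each word
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[cat].values())
            for w in words(title):
                count = self.word_counts[cat][w] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best, best_score = cat, score
        return best

# Hand-labeled training titles; the labels are hypothetical.
nb = NaiveBayes()
nb.train("Seagate Hard Drive 500Go", "internal")
nb.train("Seagate Barracuda 7200.12 ST3500418AS 500GB 7200 RPM SATA 3.0Gb/s Hard Drive", "internal")
nb.train("Seagate Hard Drive 120Go for laptop", "laptop")
nb.train("Seagate FreeAgent Desk 500GB External Hard Drive Silver 7200RPM USB2.0 Retail", "external")

print(nb.classify("Seagate Barracuda 7200.12"))  # -> internal
```

The need for labeled examples is the main reason for doubting this choice: if the categories are not known beforehand, the problem is closer to clustering than to classification.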

Thanks.

© Stack Overflow or respective owner
