Package to compare LSA, TFIDF, Cosine metrics and Language Models

Posted by gouwsmeister on Stack Overflow
Published on 2009-10-12T21:12:48Z Indexed on 2010/05/02 17:58 UTC

Hi,

I'm looking for a package (any language, really) that I can run on a corpus of 50 documents to compute inter-document similarity under several metrics, such as TF-IDF, Okapi, language models, LSA, etc.
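
To make the request concrete, here is a minimal sketch of one of these metrics (TF-IDF weighting with cosine similarity) in plain Python. It is only an illustration under stated assumptions, not a recommendation of a particular package: the function names `tfidf_vectors`, `cosine`, and `similarity_matrix` are hypothetical, tokenization is naive whitespace splitting, and the smoothed idf formula is one common variant among several.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Assumption: smoothed idf, log((1 + N) / (1 + df)) + 1.
        vectors.append(
            {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1.0)
             for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_u * norm_v)


def similarity_matrix(docs):
    """Return the full document-by-document cosine similarity matrix."""
    vecs = tfidf_vectors(docs)
    return [[cosine(u, v) for v in vecs] for u in vecs]
```

Calling `similarity_matrix(list_of_document_strings)` gives a square list of lists whose entry `[i][j]` is the similarity of document `i` to document `j`; multiplying by 100 gives the "x% similar" reading. What I am after is a package that does this for the other metrics as well.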

As output I want a document similarity matrix, i.e. doc1 is x% similar to doc2, and so on. This is for research purposes, not for production. I specifically need the similarity matrix because I want to correlate it with human ratings.
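
As a rough sketch of the comparison step, assuming both the system scores and the human ratings are available as square, symmetric matrices (lists of lists) over the same documents: the helper names `upper_triangle`, `pearson`, and `correlate_with_humans` are hypothetical, and Pearson correlation is an assumption here. A rank correlation such as Spearman's may suit human ratings better.

```python
import math


def upper_triangle(matrix):
    """Flatten the entries above the diagonal, one per unordered document pair."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def correlate_with_humans(system_sim, human_sim):
    """Correlate system similarities with human ratings over all document pairs."""
    # The diagonal (self-similarity) and the mirrored lower triangle are skipped
    # so that each pair of documents is counted exactly once.
    return pearson(upper_triangle(system_sim), upper_triangle(human_sim))
```

With 50 documents this compares 50 * 49 / 2 = 1225 pair scores per metric, so each metric yields a single correlation value against the human ratings.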

Thank you in advance!

© Stack Overflow or respective owner
