Filtering spam and dirty words from comment posts in Python (Django)

Posted by sintaloo on Stack Overflow
Published on 2010-05-20T08:07:34Z Indexed on 2010/05/20 8:10 UTC

Hi All,

My basic question is how to filter spam and dirty words in a comment posting system built with Python (Django).

I have a collection of phrases (approximately 3000 phrases) to be filtered.

Question (1): Is there an existing open source Python (or Django) package/module/plugin that can handle this job? I know there is one called Akismet, but from what I understand it will not solve my problem. As far as I can tell, Akismet is just a web service that filters against a word dictionary it defines itself, whereas I have my own collection of phrases. Please correct me if I am wrong.

Question (2): If there is no such open source package I can use, how do I create my own? The only thing I can think of is to use a regular expression and join all the phrases with 'or' (|) in a single pattern. But with 3000 phrases, I doubt this will perform well enough when every comment post has to be filtered. Any suggestions on where I should start?
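To make the regular expression idea concrete, here is a minimal sketch of what I have in mind. The names BLOCKED_PHRASES, build_filter, contains_blocked and censor are made up for illustration, and the three sample phrases stand in for the real collection. It assumes that whole-word, case-insensitive matching is what is wanted, and it says nothing about how fast this is with 3000 phrases, which would have to be measured.

```python
import re

# Hypothetical sample data; the real list would hold the ~3000 phrases.
BLOCKED_PHRASES = ["buy cheap pills", "darn", "free money"]


def build_filter(phrases):
    """Compile all phrases into one case-insensitive alternation."""
    # Drop empty strings and duplicates; put longer phrases first so a
    # long phrase is tried before any shorter phrase that is its prefix.
    ordered = sorted({p for p in phrases if p}, key=len, reverse=True)
    if not ordered:
        raise ValueError("at least one non-empty phrase is required")
    # re.escape makes each phrase match literally; the lookarounds
    # reject matches that sit inside a longer word.
    alternation = "|".join(re.escape(p) for p in ordered)
    return re.compile(r"(?<!\w)(?:" + alternation + r")(?!\w)", re.IGNORECASE)


# Compile once at import time, not once per comment.
BLOCKED_RE = build_filter(BLOCKED_PHRASES)


def contains_blocked(text):
    """Return True if the text contains any blocked phrase."""
    return BLOCKED_RE.search(text) is not None


def censor(text, mask="*"):
    """Replace every blocked phrase with mask characters of equal length."""
    return BLOCKED_RE.sub(lambda m: mask * len(m.group(0)), text)
```

In a Django project, contains_blocked could presumably be called from a comment form's validation step, but that wiring is not shown here.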

Thank you very much for your help and time.

© Stack Overflow or respective owner
