How to estimate the quality of a web page?

Posted by roddik on Stack Overflow
Published on 2010-05-01T07:01:11Z

Filed under: nlp | data

Hello, I'm doing a university project that must gather and combine data on a user-provided topic. The problem I've encountered is that Google search results for many terms are polluted with low-quality, autogenerated pages, and if I use them I can end up with wrong facts. How can I estimate the quality/trustworthiness of a page?

You may think, "Google engineers have been working on this problem for ten years, and he's asking for a solution," but consider the difference in constraints: a search engine must return up-to-date content, and if it marks a good page as bad, users will be dissatisfied. I don't have such limitations, so an algorithm that occasionally discards good pages along with the bad ones wouldn't be a problem.
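Because false positives are acceptable here, even crude heuristics can help. The sketch below is a minimal, illustrative scorer (not anything from the original question): it combines two cheap signals, the visible-text-to-markup ratio and the stopword ratio of the extracted text, on the assumption that template-heavy spam pages carry little visible text per byte of HTML and that keyword-stuffed autogenerated text lacks the stopword density of natural English. The stopword list and the 0.4 "natural prose" center are hypothetical tuning choices.

```python
import re

# A few common English stopwords; genuine prose usually contains a
# healthy proportion of these, while keyword-stuffed autogenerated
# pages often do not. (Illustrative list, not exhaustive.)
STOPWORDS = {"the", "a", "an", "and", "or", "but", "of", "to", "in",
             "is", "it", "that", "for", "on", "with", "as", "was"}

def strip_tags(html):
    """Crude removal of scripts, styles, and tags to get visible text."""
    html = re.sub(r"(?s)<(script|style).*?</\1>", " ", html)
    return re.sub(r"<[^>]+>", " ", html)

def quality_score(html):
    """Return a rough 0..1 quality estimate for an HTML page.

    Combines two cheap signals:
      * text-to-markup ratio -- template-heavy spam pages carry
        little visible text per byte of HTML;
      * stopword ratio -- natural English prose tends to sit near
        a moderate stopword density, keyword lists do not.
    """
    text = strip_tags(html)
    words = re.findall(r"[a-zA-Z']+", text.lower())
    if not words:
        return 0.0
    text_ratio = min(len(text) / max(len(html), 1), 1.0)
    stop_ratio = sum(w in STOPWORDS for w in words) / len(words)
    # Reward stopword ratios near the assumed "natural prose" band (~0.4).
    stop_signal = 1.0 - min(abs(stop_ratio - 0.4) / 0.4, 1.0)
    return 0.5 * text_ratio + 0.5 * stop_signal

# Since discarding some good pages is tolerable, keep a page only
# when its score clears a deliberately strict threshold.
def keep_page(html, threshold=0.5):
    return quality_score(html) >= threshold
```

A real system would add stronger signals (link graph, domain reputation, duplicate-content detection), but the point is that when you can afford to over-reject, a strict threshold on even weak features already filters much of the autogenerated noise.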

© Stack Overflow or respective owner
