How to store and collect data for mining such information as most viewed for last 24 hours, last 7 d

Posted by Kirzilla on Stack Overflow See other posts from Stack Overflow or by Kirzilla
Published on 2010-06-07T15:00:00Z Indexed on 2010/06/07 15:02 UTC
Read the original article Hit count: 151

Hello,

Let's imagine that we have high traffic project (a tube site) which should provide sorting using this options (NOT IN REAL TIME). Number of videos is about 200K and all information about videos is stored in MySQL. Number of daily video views is about 1.5KK. As instruments we have Hard Disk Drive (text files), MySQL, Redis.

Views
 top viewed
 top viewed last 24 hours
 top viewed last 7 days
 top viewed last 30 days
 top rated last 365 days

How should I store such information?

The first idea is to log all visits to text files (single file per hour, for example visits_20080101_00.log). At the beginning of each hour calculate views per video for previous hour and insert this information into MySQL. Then recalculate totals (for last 24 hours) and update statistics in tables. At the beginning of every day we have to do the same but recalculate for last 7 days, last 30 days, last 365 days. This method seems to be very poor for me because we have to store information about last 365 days for each video to make correct calculations.

Is there any other good methods? Probably, we have to choose another instruments for this?

Thank you.

© Stack Overflow or respective owner

Related posts about algorithm

Related posts about data-structures