A very interesting MySQL problem (related to indexing, millions of records, and algorithms)

Posted by terence410 on Stack Overflow
Published on 2010-05-03T17:34:07Z
This problem is pretty hard to describe, and therefore it's difficult to search for an answer. I hope some experts can share their opinions on it.

I have a table with around 1 million records. The table structure is similar to this:

items {
    uid     (primary key, bigint, 15)
    updated (indexed, int, 11)
    enabled (indexed, tinyint, 1)
}
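For concreteness, the matching DDL might look something like the following; the storage engine, column defaults, and index names are my guesses, not details from the post:

CREATE TABLE items (
    uid     BIGINT(15)  NOT NULL AUTO_INCREMENT,
    updated INT(11)     NOT NULL DEFAULT 0,   -- unix timestamp of last processing
    enabled TINYINT(1)  NOT NULL DEFAULT 1,
    PRIMARY KEY (uid),
    KEY idx_updated (updated),                -- index name assumed
    KEY idx_enabled (enabled)                 -- index name assumed
) ENGINE=MyISAM;                              -- engine is a guess; InnoDB is also plausible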

The scenario is like this: I have to select all of the records every day and do some processing. It takes around 3 seconds to handle each item.

I have written a PHP script to fetch 200 items each time using the following query:

select * from items where updated < unix_timestamp(now()) - 86400 and enabled = 1 limit 200;

I will then update the "updated" field of the selected items to make sure they won't be selected again within one day. The update query is something like this:

update items set updated = unix_timestamp(now()) where uid in (1,2,3,4,...);

Then the PHP script will continue to run and process the data, which doesn't require any MySQL connection anymore.
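Putting the pieces together, the script is structured roughly like this; PDO, the connection details, and the process_item() function are my assumptions for illustration, not code from the original script:

<?php
// Sketch of the daily batch described above. The DSN, credentials and
// process_item() are placeholders, not details from the original post.
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');

// Fetch up to 200 enabled items not processed within the last 24 hours.
$rows = $pdo->query(
    'SELECT * FROM items
     WHERE updated < UNIX_TIMESTAMP(NOW()) - 86400
       AND enabled = 1
     LIMIT 200'
)->fetchAll(PDO::FETCH_ASSOC);

if ($rows) {
    // Stamp the selected rows so later runs skip them for a day.
    $ids = array();
    foreach ($rows as $row) {
        $ids[] = (int) $row['uid'];
    }
    $pdo->exec('UPDATE items SET updated = UNIX_TIMESTAMP(NOW())
                WHERE uid IN (' . implode(',', $ids) . ')');
}

// No MySQL connection is needed from this point on.
$pdo = null;

foreach ($rows as $row) {
    process_item($row);  // hypothetical worker, ~3 seconds per item
}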


Since I have a million records and each record takes around 3 seconds to process, it's definitely impossible to do it sequentially. Therefore, I execute the PHP script every 10 seconds.
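Since cron only has minute-level granularity, launching the script every 10 seconds could be done with a small wrapper like the following; the script path is a placeholder of mine:

<?php
// Hypothetical wrapper: start a new batch every 10 seconds.
// '/path/to/batch.php' is a placeholder, not from the original post.
while (true) {
    // Launch in the background so a slow batch doesn't delay the next one.
    shell_exec('php /path/to/batch.php > /dev/null 2>&1 &');
    sleep(10);
}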

However, as time goes by and the table grows, the select gets much slower. Sometimes it takes more than 100 seconds to run!


Do you guys have any suggestions on how I might solve this problem?

© Stack Overflow or respective owner
