How to schedule hundreds of thousands of tasks?
        Posted  
        
            by wehriam
        on Stack Overflow
        
        See other posts from Stack Overflow
        
            or by wehriam
        
        
        
        Published on 2010-03-16T21:28:27Z
        Indexed on 
            2010/03/16
            21:31 UTC
        
        
        Read the original article
        Hit count: 188
        
python
We have hundreds of thousands of tasks that need to be run at a variety of arbitrary intervals, some every hour, some every day, and so on. The tasks are resource intensive and need to be distributed across many machines.
Right now tasks are stored in a database with an "execute at this time" timestamp. To find tasks that need to be executed, we query the database for jobs that are due to be executed, then update the timestamps when the task is complete. Naturally this leads to a substantial write load on the database.
As far as I can tell, we are looking for something to release tasks into a queue at a set interval. (Workers could then request tasks from that queue.)
What is the best way to schedule recurring tasks at scale?
For what it's worth we're largely using Python, although we have no problems using components (RabbitMQ?) written in other languages.
© Stack Overflow or respective owner