Program structure in a long-running data-processing Python script
Posted by fmark on Stack Overflow
Published on 2010-05-27T13:11:20Z
        
For my current job I am writing long-running (hours to days) scripts that do CPU-intensive data processing. The program flow is very simple: it enters the main loop, completes it, saves the output, and terminates. The basic structure of my programs tends to look like this:
<import statements>
<constant declarations>
<misc function declarations>
def main():
    for blah in blahs():
        <lots of local variables>
        <lots of tightly coupled computation>
        for something in somethings():
            <lots more local variables>
            <lots more computation>
    <etc., etc.>
    <save results>
if __name__ == "__main__":
    main()
This gets unmanageable quickly, so I want to refactor it into something more maintainable without sacrificing execution speed.
Each chunk of code relies on a large number of variables, however, so factoring parts of the computation out into functions would make the parameter lists grow out of hand very quickly. Should I put this sort of code into a Python class and turn the local variables into instance attributes? It doesn't make a great deal of sense to me conceptually to turn the program into a class, as the class would never be reused and only one instance would ever be created per run.
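To make the trade-off concrete, here is a minimal sketch of the two options I am weighing. Everything in it is hypothetical and only for illustration: the names `process_item`, `main_with_functions`, and `Job`, the variables `scale`, `offset`, `threshold`, and `totals`, and the toy arithmetic all stand in for the real, much larger computation.

```python
# Hypothetical toy example: all names and the arithmetic are made up
# to illustrate the two structures, not taken from the real scripts.


# Option 1: plain functions. Every value the step needs is passed in
# explicitly, so the parameter list grows with each shared variable.
def process_item(item, scale, offset, threshold, totals):
    value = item * scale + offset
    if value > threshold:
        totals.append(value)
    return value


def main_with_functions(items):
    scale, offset, threshold = 2.0, 1.0, 5.0
    totals = []
    for item in items:
        process_item(item, scale, offset, threshold, totals)
    return totals


# Option 2: a single-use class. The former local variables become
# instance attributes, so methods take few parameters but share state.
class Job:
    def __init__(self, items):
        self.items = items
        self.scale = 2.0
        self.offset = 1.0
        self.threshold = 5.0
        self.totals = []

    def process_item(self, item):
        value = item * self.scale + self.offset
        if value > self.threshold:
            self.totals.append(value)
        return value

    def run(self):
        for item in self.items:
            self.process_item(item)
        return self.totals


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    print(main_with_functions(data))
    print(Job(data).run())
```

Both variants compute the same result. The difference is only in where the shared variables live: in ever-longer parameter lists, or in the attributes of an object that is created exactly once.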
What is the best-practice structure for this kind of program? I am using Python, but the question is relatively language-agnostic, assuming the features of a modern object-oriented language.
© Stack Overflow or respective owner