Python: Script works, but seems to deadlock after some time

Posted by sberry2A on Stack Overflow
Published on 2010-03-20T16:41:18Z Indexed on 2010/03/20 17:11 UTC

I have the following script, which works for the most part (link to PasteBin). The script's job is to start a number of threads, each of which in turn starts a subprocess with Popen. The output from each subprocess is as follows:

1
2
3
.
.
.
n
Done

Basically, the subprocess transfers 10M records from tables in one database to different tables in another database, with a lot of data massaging/manipulation in between because of the different schemas. If the subprocess fails at any point in its execution (bad records, duplicate primary keys, etc.), or if it completes successfully, it outputs "Done\n". If there are no more records to select for transfer, it outputs "NO DATA\n".
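To make the output protocol above concrete, here is a minimal sketch of a line classifier. The function name `classify_line` and the returned tags are hypothetical (they do not come from the original script), and Python 3 is assumed.

```python
def classify_line(line):
    """Classify one line of subprocess output.

    Hypothetical helper: maps the protocol described above
    (counter lines, "Done", "NO DATA") to a (tag, value) tuple.
    """
    text = line.strip()
    if text == "Done":
        # Emitted on success AND on failure, so it only means "this run ended".
        return ("done", None)
    if text == "NO DATA":
        # Nothing left to select for transfer.
        return ("no_data", None)
    if text.isdecimal():
        # A running counter line such as "1", "2", ... "n".
        return ("progress", int(text))
    # Anything else (blank lines, tracebacks) is passed through untouched.
    return ("unknown", text)
```

Because "Done" is printed on both success and failure, a reader of this output cannot tell the two apart from that line alone; the sketch deliberately does not try to.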

My intent was to create a script, "tableTransfer.py", that would spawn a number of these processes, read their output, and in turn output information such as the number of updates completed, time remaining, time elapsed, and number of transfers per second.
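The statistics listed above can be derived from three numbers: records completed, total records, and elapsed time. The following is a rough sketch, assuming Python 3; `progress_stats` and its dictionary keys are hypothetical names, not taken from tableTransfer.py.

```python
def progress_stats(completed, total, elapsed_seconds):
    """Return rate and time-remaining estimates for a running transfer.

    Hypothetical helper: assumes a roughly constant transfer rate,
    so the remaining-time figure is only an estimate.
    """
    # Transfers per second so far; guard against division by zero at startup.
    rate = completed / elapsed_seconds if elapsed_seconds > 0 else 0.0
    # Estimated seconds left; infinite until at least one record is done.
    remaining = (total - completed) / rate if rate > 0 else float("inf")
    return {
        "completed": completed,
        "elapsed": elapsed_seconds,
        "rate": rate,
        "remaining": remaining,
    }
```

For example, 2,500,000 of 10,000,000 records after 5,000 seconds gives a rate of 500 records per second and an estimated 15,000 seconds remaining.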

I started running the process last night and checked in this morning to find it had deadlocked. There were no subprocesses running, there were still records to be updated, and the script had not exited. It was simply sitting there, no longer printing status information, because no subprocesses were running to update the total number complete, which is what drives updates to the output. This is running on OS X.

I am looking for three things:

  1. I would like to get rid of the possibility of this deadlock occurring so I don't need to check in on it as frequently. Is there some issue with locking?
  2. Am I doing this in a bad way (a gThreading variable to control the loop that spawns additional threads, etc.)? I would appreciate suggestions for improving my overall methodology.
  3. How should I handle a Ctrl-C exit? Right now I need to kill the process, but I assume I should be able to use the signal module or something similar to catch the signal and kill the threads. Is that right?
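One possible shape for points 2 and 3 is sketched below. It is not a diagnosis of the original script, which is not reproduced here. It assumes Python 3.7+, a fixed set of worker threads that each respawn their own subprocess (instead of a global flag controlling thread creation), and a `threading.Event` for shutdown. All names (`run_worker`, `worker_loop`, `main`, `DEMO_CMD`) are hypothetical, and `DEMO_CMD` is a stand-in child process, not the real transfer command.

```python
import subprocess
import sys
import threading

# Stand-in child that mimics the output protocol; replace with the real command.
DEMO_CMD = [sys.executable, "-c", "print(1); print(2); print('NO DATA')"]


def run_worker(cmd, stop_event, totals, lock):
    """Run one subprocess and count its numeric progress lines."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    try:
        for line in proc.stdout:
            if stop_event.is_set():
                break
            text = line.strip()
            if text == "NO DATA":
                stop_event.set()  # nothing left to transfer: stop respawning
            elif text.isdecimal():
                with lock:  # shared counter is only touched under the lock
                    totals["completed"] += 1
            # "Done" needs no action: the child exits and the loop hits EOF.
    finally:
        if proc.poll() is None:
            proc.terminate()  # do not leave a child behind on early exit
        proc.stdout.close()
        proc.wait()


def worker_loop(cmd, stop_event, totals, lock):
    """Keep starting subprocesses until asked to stop."""
    while not stop_event.is_set():
        run_worker(cmd, stop_event, totals, lock)


def main(cmd, num_threads=4):
    stop_event = threading.Event()
    totals = {"completed": 0}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker_loop,
                         args=(cmd, stop_event, totals, lock), daemon=True)
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    try:
        # Join with a timeout so the main thread stays responsive to Ctrl-C.
        while any(t.is_alive() for t in threads):
            for t in threads:
                t.join(timeout=0.5)
    except KeyboardInterrupt:
        stop_event.set()
        for t in threads:
            t.join(timeout=5)
    return totals["completed"]
```

Calling `main(DEMO_CMD, num_threads=1)` runs the stand-in child once and returns the number of progress lines counted. A caveat of this sketch: a worker blocked on a child that produces no output will not notice the stop event until the next line or end of file arrives, so a real implementation may need to terminate children directly on shutdown.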

I am not sure whether I should be pasting my entire script here, since I usually just paste snippets. Let me know if I should paste it here as well.

© Stack Overflow or respective owner
