Python: Catching / blocking SIGINT during system call
- by danben
I've written a web crawler that I'd like to be able to stop via the keyboard. I don't want the program to die the moment I interrupt it; it needs to flush its data to disk first. I also don't want to catch KeyboardInterrupt, because the exception could be raised at any point and leave the persistent data in an inconsistent state.
My current solution is to define a signal handler that catches SIGINT and sets a flag; each iteration of the main loop checks this flag before processing the next URL.
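A minimal sketch of that flag-based setup might look like the following. The names `state`, `handle_sigint`, `install_handler`, `crawl`, `process`, and `flush` are hypothetical placeholders for illustration, not the actual crawler code.

```python
import signal

# Shared flag set by the signal handler (hypothetical name).
state = {"stop": False}

def handle_sigint(signum, frame):
    # Only record that an interrupt arrived; the main loop decides when to stop.
    print("Interrupted; stopping...")
    state["stop"] = True

def install_handler():
    # Must be called from the main thread.
    signal.signal(signal.SIGINT, handle_sigint)

def crawl(urls, process, flush):
    """Process URLs until the list is exhausted or the flag is set, then flush."""
    processed = []
    for url in urls:
        if state["stop"]:
            break  # stop between URLs, never in the middle of one
        process(url)
        processed.append(url)
    flush()  # persist data before exiting
    return processed
```

Here `process` stands in for whatever fetches and parses a single URL, and `flush` for whatever writes the data to disk. The flag is only checked between URLs, so a signal that arrives while `process` is blocked inside a system call is still delivered during that call.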
However, I've found that if the system happens to be executing socket.recv() when I send the interrupt, I get this:
^C
Interrupted; stopping...  // indicates my interrupt handler ran
Traceback (most recent call last):
  File "crawler_test.py", line 154, in <module>
    main()
  ...
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/socket.py", line 397, in readline
    data = recv(1)
socket.error: [Errno 4] Interrupted system call
and the process exits completely. Why does this happen? Is there a way I can prevent the interrupt from affecting the system call?