Large-scale file replication with an option to "unsubscribe" from a replicated file on a given machine

Posted by Alexander Gladysh on Server Fault
Published on 2013-06-29T21:38:30Z

I have 100+ GB of files per day arriving on one machine. (The size of individual files is arbitrary and can be adjusted as needed.)

I have several other machines that do some work on these files.

I need to reliably deliver each incoming file to the worker machines. A worker machine should be able to free the HDD space used by a file once it is done working with it.

Preferably, a file is uploaded to a worker only once, processed in place, and then deleted, without being copied anywhere else, to minimize the already high HDD load. (The worker itself requires quite a bit of bandwidth.)
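To make the intended lifecycle concrete, here is a minimal sketch of what I imagine on the worker side. It is not an existing tool, only an illustration. It assumes the delivery mechanism writes each file under a temporary name and renames it once the transfer is complete; the `.part` suffix, the function name `process_and_free`, and the `handle` callback are all hypothetical.

```python
from pathlib import Path
from typing import Callable, List

# Hypothetical suffix marking a file whose transfer is still in progress.
PARTIAL_SUFFIX = ".part"


def process_and_free(incoming: Path, handle: Callable[[Path], None]) -> List[str]:
    """Process every fully delivered file in place, then delete it.

    Returns the names of the files that were processed and removed.
    """
    done = []
    for path in sorted(incoming.iterdir()):
        # Skip directories and files that have not finished arriving.
        if not path.is_file() or path.name.endswith(PARTIAL_SUFFIX):
            continue
        handle(path)   # business logic reads the file where it landed
        path.unlink()  # free the HDD space; the file is never copied
        done.append(path.name)
    return done
```

The business logic only sees ordinary files in a directory. If `handle` raises an exception, the file is not deleted, so it can be retried on the next pass.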

Please suggest a solution that is not based on Java. None of the existing replication solutions that I've seen can free the HDD from a file once it has been processed, but maybe I'm missing something...

The preferred solution should work with plain files from the point of view of our business logic code, and should not require the business logic to connect to a queue or anything similar. (Internally, the solution may use whatever technology it needs, except Java.)

