What is the fastest way to move 1 petabyte from one storage system to a new one?

Posted by marc.riera on Server Fault
Published on 2012-04-03T21:27:08Z

First of all, thanks for reading, and sorry for asking something related to my job. I understand that this is something I should solve by myself, but as you will see, it's a bit difficult.

A small description:

Now

Storage => 1 PB using DDN S2A9900 storage for the OSTs, 4 OSS, 10 GigE network (Lustre 1.6)

100 compute nodes with 2x InfiniBand

1 InfiniBand switch with 36 ports

After

Storage => Previous storage + another 1 PB using DDN S2A 990 or LSI E5400 (still to be decided) (Lustre 2.0)

8 OSS, 10 GigE network

100 compute nodes with 2x InfiniBand

Previous experience: transferred 120 TB in less than 3 days using the following command:

    tar -C /old --record-size 2048 -b 2048 -cf - dir \
      | tar -C /new --record-size 2048 -b 2048 -xvf - 2>&1 \
      | tee /tmp/dir.log

So, here is the big problem: extrapolating from that transfer, I conclude that we are going to need about 1 month to move the data from the old storage to the new one. During this time the researchers will have to stand by, and I'm personally not happy with this.
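The one-month figure can be roughly reproduced from the earlier run. 120 TB in 3 days is about 40 TB per day (around 463 MB/s on average), so 1 PB would take about 25 days at the same rate. The sketch below is only a linear extrapolation: it assumes throughput stays constant and takes 1 PB as 1000 TB, and the function name `estimate_days` is made up for illustration.

```shell
#!/usr/bin/env bash
# Linear extrapolation from the earlier run: 120 TB took about 3 days.
# Assumptions: constant throughput, 1 PB = 1000 TB.

estimate_days() {
    local total_tb=$1        # data to move, in TB
    local ref_tb=${2:-120}   # size of the reference transfer, in TB
    local ref_days=${3:-3}   # duration of the reference transfer, in days
    # Integer arithmetic, rounded up to whole days.
    echo $(( (total_tb * ref_days + ref_tb - 1) / ref_tb ))
}

estimate_days 1000   # prints 25
```

Since the reference transfer took "less than 3 days", the real rate was somewhat higher and this number is closer to an upper bound for a single tar stream.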

I mention the InfiniBand connections because I think there may be a chance to use them: 18 compute nodes (18 * 2 IB = 36 ports) could transfer the data from one storage system to the other. I'm trying to figure out whether the IB switch will handle all that traffic, but even if it is pushed to its limit, it should go faster than using 10 GigE.

Also, having Lustre 1.6 and 2.0 agents on the same server works quite well, so there is no need to go through 1.8 and upgrade the metadata servers in two steps.
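One way to picture the fan-out over 18 compute nodes is a dry run that assigns top-level directories to nodes round-robin and prints one tar pipeline per directory. This is only a sketch under several assumptions that are not in the setup described above: the hosts are named node01..node18 and reachable over ssh, every node mounts the old filesystem at /old and the new one at /new, and directory names contain no whitespace or quote characters. The function name `plan_copy` and the directory names are hypothetical.

```shell
#!/usr/bin/env bash
# Dry run only: print the commands, execute nothing.
# Assumed (hypothetical) setup: hosts node01..node18, each mounting the old
# filesystem at /old and the new one at /new.

TAR_OPTS='--record-size 2048 -b 2048'   # same tar options as the earlier run

plan_copy() {
    local nodes=$1; shift    # number of copy nodes
    local i=0 dir host
    for dir in "$@"; do
        # Round-robin: directory i goes to node (i mod nodes) + 1.
        printf -v host 'node%02d' $(( i % nodes + 1 ))
        printf "ssh %s 'tar -C /old %s -cf - %s | tar -C /new %s -xf -'\n" \
            "$host" "$TAR_OPTS" "$dir" "$TAR_OPTS"
        i=$(( i + 1 ))
    done
}

plan_copy 18 projA projB projC
```

Round-robin ignores directory sizes, so a real plan would probably balance by size instead. Whether 18 parallel streams help at all depends on where the bottleneck is: the OSS nodes sit on 10 GigE in both layouts.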

Any ideas?

Many thanks

Note 1: Zoredache, we can divide it into two blocks, (A) 600 TB and (B) 400 TB. The idea is to move (A) to the new storage, which is formatted with Lustre 2.0, then reformat the space where (A) was with Lustre 2.0, move (B) into that Lustre 2.0 block, and extend it with the space where (B) was.

This way we will end up with (A) and (B) on separate filesystems of 1 PB each.
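The capacity side of this two-block plan can be checked with simple arithmetic. The sketch below walks through the steps in TB, taking 1 PB as 1000 TB and assuming the old storage splits exactly into the 600 TB and 400 TB areas that hold (A) and (B); the helper `fits` and the variable names are made up for illustration.

```shell
#!/usr/bin/env bash
# Capacity walk-through of the two-block plan, in TB (1 PB taken as 1000 TB).
# Assumption: the old storage splits exactly into the areas holding (A) and (B).

A=600; B=400
NEW_FS=1000             # new 1 PB storage, formatted with Lustre 2.0

fits() {                # fits <data_tb> <capacity_tb>
    [ "$1" -le "$2" ]
}

# Step 1: move (A) to the new storage.
fits "$A" "$NEW_FS" && echo "step 1: (A) ${A} TB fits in ${NEW_FS} TB"

# Step 2: reformat the area (A) occupied and move (B) into it.
freed=$A
fits "$B" "$freed" && echo "step 2: (B) ${B} TB fits in ${freed} TB"

# Step 3: extend the second filesystem with the area (B) occupied.
second_fs=$(( freed + B ))
echo "step 3: second filesystem grows to ${second_fs} TB"
```

Under these assumptions both moves fit, and the second filesystem ends at 1000 TB, which matches the 1 PB per filesystem stated above.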
