DRBD stacked resources: recovering from failure

Posted by Marcus Downing on Server Fault
Published on 2011-01-10T16:32:11Z

We're running a stacked four-node DRBD setup like this:

A  -->  B
|       |
v       v
C       D

Servers A and B are Xen hosts running VMs, while servers C and D are for backups. A is in the same datacentre as C, and B is in the same datacentre as D. That gives us three DRBD resources across the four servers:

  1. From server A to server C, in the first datacentre, using protocol B
  2. From server B to server D, in the second datacentre, using protocol B
  3. From server A to server B, different datacentres, stacked resource using protocol A
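To make the layout concrete, here is a minimal sketch of the three resources as plain Python data. The resource names are the ones used in the configs further down; treating B as the primary of the B-to-D resource is an assumption read off the arrow direction in the diagram, and the class and function names are purely illustrative.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Lower:
    """A lower-level DRBD resource between two physical hosts."""
    name: str
    primary: str    # host the arrow starts from (assumed primary)
    secondary: str  # backup host
    protocol: str


@dataclass(frozen=True)
class Stacked:
    """A stacked resource sitting on top of two lower-level resources."""
    name: str
    lower: Tuple[str, str]
    protocol: str


LOWER = [
    Lower("test-AC", "A", "C", "B"),
    Lower("test-BD", "B", "D", "B"),
]
STACKED = [
    Stacked("test-AB", ("test-AC", "test-BD"), "A"),
]


def resources_on(host: str) -> List[str]:
    """Names of resources with an endpoint on `host`, lower-level first."""
    primaries = {low.name: low.primary for low in LOWER}
    names = [low.name for low in LOWER if host in (low.primary, low.secondary)]
    # A stacked resource lives on the primary host of each lower-level resource.
    names += [s.name for s in STACKED
              if host in {primaries[n] for n in s.lower}]
    return names
```

Calling resources_on("A") lists test-AC before test-AB, which is also the order in which they have to be brought up on that host.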

First question: booting a stacked resource

We haven't got any vital data running on this setup yet; we're still making sure it works first. That means simulating power cuts, network outages and so on, and seeing what steps we need to take to recover.

When we pull the power out of server A, both of its resources go down, and DRBD attempts to bring them back up at the next boot. However, it only succeeds in bringing up the lower-level resource, A->C. The stacked resource A->B doesn't even try to connect, presumably because its backing device doesn't exist until the lower-level resource is a connected primary.

So if anything goes wrong, we need to log in manually, bring that resource up, and then start the virtual machine on top of it. Is there a way to have the stacked resource come up automatically at boot?
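For reference, the manual steps on server A look roughly like the sequence below. This is a sketch rather than a tested recovery script: it assumes the resource names from the configs further down (test-AC and test-AB), uses drbdadm's --stacked option for the upper resource, and defaults to a dry run that only returns the command lines. The function names are made up for illustration, and starting the VM afterwards is left out.

```python
import subprocess
from typing import List


def stacked_recovery_commands(lower: str, stacked: str) -> List[List[str]]:
    """Commands to run on the node that should hold the stacked primary."""
    return [
        # The lower-level resource must be primary before the stacked
        # resource can find its backing device.
        ["drbdadm", "primary", lower],
        # Attach and connect the stacked resource.
        ["drbdadm", "--stacked", "up", stacked],
        # Promote the stacked resource so the VM can use it.
        ["drbdadm", "--stacked", "primary", stacked],
    ]


def recover(lower: str, stacked: str, dry_run: bool = True) -> List[str]:
    """Return the command lines; run them too when dry_run is False."""
    lines = []
    for cmd in stacked_recovery_commands(lower, stacked):
        lines.append(" ".join(cmd))
        if not dry_run:
            # check=True stops the sequence at the first failing command.
            subprocess.run(cmd, check=True)
    return lines
```

With the default dry run, recover("test-AC", "test-AB") just returns the three command lines in order, so the sequence can be reviewed before anything is executed.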

Second question: setting the primary of a stacked resource

Our lower-level resources are configured so that the right one is considered primary:

resource test-AC {
    on A { ... }

    on C { ... }

    startup {
        become-primary-on   A;
    }
}

But I don't see any way to do the same with a stacked resource, as the following isn't a valid config:

resource test-AB {
    stacked-on-top-of test-AC { ... }

    stacked-on-top-of test-BD { ... }

    startup {
        become-primary-on   test-AC;
    }
}

This also means that recovering from a failure requires manual intervention. Is there no way to have a stacked resource become primary automatically?
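If this ends up being scripted at boot instead, the key check is that the stacked resource is only promoted once the lower-level one is primary locally. Below is a minimal sketch of that check. It assumes the output of "drbdadm role <resource>" has the form local/peer, for example "Primary/Secondary"; the helper names are hypothetical and this is not a built-in DRBD mechanism.

```python
from typing import Tuple


def parse_role(output: str) -> Tuple[str, str]:
    """Split role output such as 'Primary/Secondary' into (local, peer)."""
    local, _, peer = output.strip().partition("/")
    return local, peer


def may_promote_stacked(lower_role_output: str) -> bool:
    """True when the lower-level resource is primary on this node."""
    local, _ = parse_role(lower_role_output)
    return local == "Primary"
```

A boot script would feed the role output for test-AC into may_promote_stacked() and only then bring up and promote test-AB.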

© Server Fault or respective owner
