Search Results

Search found 78 results on 4 pages for 'drbd'.

Page 1/4 | 1 2 3 4  | Next Page >

  • Kernel panic when bringing up DRBD resource

    - by sc.
    I'm trying to set up two machines synchonizing with DRBD. The storage is setup as follows: PV - LVM - DRBD - CLVM - GFS2. DRBD is set up in dual primary mode. The first server is set up and running fine in primary mode. The drives on the first server have data on them. I've set up the second server and I'm trying to bring up the DRBD resources. I created all the base LVM's to match the first server. After initializing the resources with `` drbdadm create-md storage I'm bringing up the resources by issuing drbdadm up storage After issuing that command, I get a kernel panic and the server reboots in 30 seconds. Here's a screen capture. My configuration is as follows: OS: CentOS 6 uname -a Linux host.structuralcomponents.net 2.6.32-279.5.2.el6.x86_64 #1 SMP Fri Aug 24 01:07:11 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux rpm -qa | grep drbd kmod-drbd84-8.4.1-2.el6.elrepo.x86_64 drbd84-utils-8.4.1-2.el6.elrepo.x86_64 cat /etc/drbd.d/global_common.conf global { usage-count yes; # minor-count dialog-refresh disable-ip-verification } common { handlers { pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b /proc/sysrq-trigger ; reboot -f"; pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b /proc/sysrq-trigger ; reboot -f"; local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o /proc/sysrq-trigger ; halt -f"; # fence-peer "/usr/lib/drbd/crm-fence-peer.sh"; # split-brain "/usr/lib/drbd/notify-split-brain.sh root"; # out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root"; # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k"; # after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh; } startup { # wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb become-primary-on both; wfc-timeout 30; degr-wfc-timeout 10; outdated-wfc-timeout 10; } options { # cpu-mask on-no-data-accessible } disk { # size max-bio-bvecs on-io-error fencing disk-barrier disk-flushes # disk-drain md-flushes resync-rate resync-after al-extents # c-plan-ahead c-delay-target c-fill-target c-max-rate # c-min-rate disk-timeout } net { # protocol timeout max-epoch-size max-buffers unplug-watermark # connect-int ping-int sndbuf-size rcvbuf-size ko-count # allow-two-primaries cram-hmac-alg shared-secret after-sb-0pri # after-sb-1pri after-sb-2pri always-asbp rr-conflict # ping-timeout data-integrity-alg tcp-cork on-congestion # congestion-fill congestion-extents csums-alg verify-alg # use-rle protocol C; allow-two-primaries yes; after-sb-0pri discard-zero-changes; after-sb-1pri discard-secondary; after-sb-2pri disconnect; } } cat /etc/drbd.d/storage.res resource storage { device /dev/drbd0; meta-disk internal; on host.structuralcomponents.net { address 10.10.1.120:7788; disk /dev/vg_storage/lv_storage; } on host2.structuralcomponents.net { address 10.10.1.121:7788; disk /dev/vg_storage/lv_storage; } /var/log/messages is not logging anything about the crash. I've been trying to find a cause of this but I've come up with nothing. Can anyone help me out? Thanks.

    Read the article

  • Xen DomU on DRBD device: barrier errors

    - by Halfgaar
    I'm testing setting up a Xen DomU with a DRBD storage for easy failover. Most of the time, immediatly after booting the DomU, I get an IO error: [ 3.153370] EXT3-fs (xvda2): using internal journal [ 3.277115] ip_tables: (C) 2000-2006 Netfilter Core Team [ 3.336014] nf_conntrack version 0.5.0 (3899 buckets, 15596 max) [ 3.515604] init: failsafe main process (397) killed by TERM signal [ 3.801589] blkfront: barrier: write xvda2 op failed [ 3.801597] blkfront: xvda2: barrier or flush: disabled [ 3.801611] end_request: I/O error, dev xvda2, sector 52171168 [ 3.801630] end_request: I/O error, dev xvda2, sector 52171168 [ 3.801642] Buffer I/O error on device xvda2, logical block 6521396 [ 3.801652] lost page write due to I/O error on xvda2 [ 3.801755] Aborting journal on device xvda2. [ 3.804415] EXT3-fs (xvda2): error: ext3_journal_start_sb: Detected aborted journal [ 3.804434] EXT3-fs (xvda2): error: remounting filesystem read-only [ 3.814754] journal commit I/O error [ 6.973831] init: udev-fallback-graphics main process (538) terminated with status 1 [ 6.992267] init: plymouth-splash main process (546) terminated with status 1 The manpage of drbdsetup says that LVM (which I use) doesn't support barriers (better known as tagged command queuing or native command queing), so I configured the drbd device not to use barriers. This can be seen in /proc/drbd (by "wo:f, meaning flush, the next method drbd chooses after barrier): 3: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r---- ns:2160152 nr:520204 dw:2680344 dr:2678107 al:3549 bm:9183 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0 And on the other host: 3: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r---- ns:0 nr:2160152 dw:2160152 dr:0 al:0 bm:8052 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0 I also enabled the option disable_sendpage, as per the drbd docs: cat /sys/module/drbd/parameters/disable_sendpage Y I also tried adding barriers=0 to fstab as mount option. Still it sometimes says: [ 58.603896] blkfront: barrier: write xvda2 op failed [ 58.603903] blkfront: xvda2: barrier or flush: disabled I don't even know if ext3 has a nobarrier option. And, because only one of my storage systems is battery backed, it would not be smart anyway. Why does it still compain about barriers when I disabled that? Both host are: Debian: 6.0.4 uname -a: Linux 2.6.32-5-xen-amd64 drbd: 8.3.7 Xen: 4.0.1 Guest: Ubuntu 12.04 LTS uname -a: Linux 3.2.0-24-generic pvops drbd resource: resource drbdvm { meta-disk internal; device /dev/drbd3; startup { # The timeout value when the last known state of the other side was available. 0 means infinite. wfc-timeout 0; # Timeout value when the last known state was disconnected. 0 means infinite. degr-wfc-timeout 180; } syncer { # This is recommended only for low-bandwidth lines, to only send those # blocks which really have changed. #csums-alg md5; # Set to about half your net speed rate 60M; # It seems that this option moved to the 'net' section in drbd 8.4. (later release than Debian has currently) verify-alg md5; } net { # The manpage says this is recommended only in pre-production (because of its performance), to determine # if your LAN card has a TCP checksum offloading bug. #data-integrity-alg md5; } disk { # Detach causes the device to work over-the-network-only after the # underlying disk fails. Detach is not default for historical reasons, but is # recommended by the docs. # However, the Debian defaults in drbd.conf suggest the machine will reboot in that event... on-io-error detach; # LVM doesn't support barriers, so disabling it. It will revert to flush. Check wo: in /proc/drbd. If you don't disable it, you get IO errors. no-disk-barrier; } on host1 { # universe is a VG disk /dev/universe/drbdvm-disk; address 10.0.0.1:7792; } on host2 { # universe is a VG disk /dev/universe/drbdvm-disk; address 10.0.0.2:7792; } } DomU cfg: bootloader = '/usr/lib/xen-default/bin/pygrub' vcpus = '2' memory = '512' # # Disk device(s). # root = '/dev/xvda2 ro' disk = [ 'phy:/dev/drbd3,xvda2,w', 'phy:/dev/universe/drbdvm-swap,xvda1,w', ] # # Hostname # name = 'drbdvm' # # Networking # # fake IP for posting vif = [ 'ip=1.2.3.4,mac=00:16:3E:22:A8:A7' ] # # Behaviour # on_poweroff = 'destroy' on_reboot = 'restart' on_crash = 'restart' In my test setup: the primary host's storage is 9650SE SATA-II RAID PCIe with battery. The secondary is software RAID1. Isn't DRBD+Xen widely used? With these problems, it's not going to work.

    Read the article

  • DRBD Not syncing between my nodes

    - by Mike Curry
    Some version info: Operating system is Ubuntu 11.10, on EC2, kernel is 3.0.0-16-virtual and the application info is: Version: 8.3.11 (api:88) GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 build by [email protected], 2011-07-05 19:51:07 Getting some strange errors in dmesg (seen below) as well, there is no replication happening. I have made my first node primary and its showing: drbd driver loaded OK; device status: version: 8.3.11 (api:88/proto:86-96) srcversion: DA5A13F16DE6553FC7CE9B2 m:res cs ro ds p mounted fstype 0:r0 StandAlone Primary/Unknown UpToDate/DUnknown r----s ext3 my secondary node is showing: drbd driver loaded OK; device status: version: 8.3.11 (api:88/proto:86-96) srcversion: DA5A13F16DE6553FC7CE9B2 m:res cs ro ds p mounted fstype 0:r0 StandAlone Secondary/Unknown Inconsistent/DUnknown r----s Showing /proc/drbd on the master shows: version: 8.3.11 (api:88/proto:86-96) srcversion: DA5A13F16DE6553FC7CE9B2 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r----s ns:0 nr:0 dw:4 dr:1073 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:262135964 Showing /proc/drbd on the slave shows that there is nothing being transfered... version: 8.3.11 (api:88/proto:86-96) srcversion: DA5A13F16DE6553FC7CE9B2 0: cs:StandAlone ro:Secondary/Unknown ds:Inconsistent/DUnknown r----s ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:262135964 Here is my config... resource r0 { protocol C; startup { wfc-timeout 15; degr-wfc-timeout 60; } net { cram-hmac-alg sha1; shared-secret "test123; } on drbd01 { device /dev/drbd0; disk /dev/xvdm; address 23.XX.XX.XX:7788; # blocked out ip meta-disk internal; } on drbd02 { device /dev/drbd0; disk /dev/xvdm; address 184.XX.XX.XX:7788; #blocked out ip meta-disk internal; } } I have run the following on the master: sudo drbdadm -- --overwrite-data-of-peer primary all There is no firewall between the systems. Here is the dmesg with some errors: [2285172.969955] drbd: initialized. Version: 8.3.11 (api:88/proto:86-96) [2285172.969960] drbd: srcversion: DA5A13F16DE6553FC7CE9B2 [2285172.969962] drbd: registered as block device major 147 [2285172.969965] drbd: minor_table @ 0xffff88000276ea00 [2285173.000952] block drbd0: Starting worker thread (from drbdsetup [1300]) [2285173.003971] block drbd0: disk( Diskless -> Attaching ) [2285173.006150] block drbd0: No usable activity log found. [2285173.006154] block drbd0: Method to ensure write ordering: flush [2285173.006158] block drbd0: max BIO size = 4096 [2285173.006165] block drbd0: drbd_bm_resize called with capacity == 524271928 [2285173.008512] block drbd0: resync bitmap: bits=65533991 words=1023969 pages=2000 [2285173.008518] block drbd0: size = 250 GB (262135964 KB) [2285173.079566] block drbd0: bitmap READ of 2000 pages took 17 jiffies [2285173.081189] block drbd0: recounting of set bits took additional 1 jiffies [2285173.081194] block drbd0: 250 GB (65533991 bits) marked out-of-sync by on disk bit-map. [2285173.081203] block drbd0: Suspended AL updates [2285173.081210] block drbd0: disk( Attaching -> UpToDate ) [2285173.081214] block drbd0: attached to UUIDs 1C1291D39584C1D1:0000000000000004:0000000000000000:0000000000000000 [2285173.095016] block drbd0: conn( StandAlone -> Unconnected ) [2285173.095046] block drbd0: Starting receiver thread (from drbd0_worker [1301]) [2285173.099297] block drbd0: receiver (re)started [2285173.099304] block drbd0: conn( Unconnected -> WFConnection ) [2285173.099330] block drbd0: bind before connect failed, err = -99 [2285173.099346] block drbd0: conn( WFConnection -> Disconnecting ) [2285173.295788] block drbd0: Discarding network configuration. [2285173.295815] block drbd0: Connection closed [2285173.295826] block drbd0: conn( Disconnecting -> StandAlone ) [2285173.295840] block drbd0: receiver terminated [2285173.295844] block drbd0: Terminating drbd0_receiver Edit: Reading some other similar issues, it was suggested to do a 'drbdadm dump all', so I figured it couldn't hurt. [email protected]:~$ drbdadm dump all /etc/drbd.conf:19: in resource r0, on drbd01: IP 23.XX.XX.XX not found on this host. and on slave: [email protected]:~# drbdadm dump all /etc/drbd.conf:25: in resource r0, on drbd02: IP 184.XX.XX.XX not found on this host. Strange it doesn't find its own ip, however, this is an Amazon EC2 system using an elastic IP... here are my ipconfigs for both... master: [email protected]:~$ ifconfig eth0 Link encap:Ethernet HWaddr 22:00:0a:1c:27:11 inet addr:10.28.39.17 Bcast:10.28.39.63 Mask:255.255.255.192 inet6 addr: fe80::2000:aff:fe1c:2711/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:1569 errors:0 dropped:0 overruns:0 frame:0 TX packets:1169 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:124409 (124.4 KB) TX bytes:213601 (213.6 KB) Interrupt:26 lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:16436 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) slave: [email protected]:~# ifconfig eth0 Link encap:Ethernet HWaddr 12:31:3f:00:14:9d inet addr:10.160.27.107 Bcast:10.160.27.255 Mask:255.255.254.0 inet6 addr: fe80::1031:3fff:fe00:149d/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:915 errors:0 dropped:0 overruns:0 frame:0 TX packets:774 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:75381 (75.3 KB) TX bytes:109673 (109.6 KB) Interrupt:26 lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:16436 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

    Read the article

  • mysql with DRBD on rackspace

    - by Richard Castle
    I am trying to set up a failover secondary MySQL server that is a mirror of my primary MySQL server using DRBD. The problem is that I am on a rackspace cloud server and I need a second partition on both the primary and secondary servers that I will replicate with DRBD. Rackspace does not allow me to create a second partition. I am left with the default single partition. How can I mirror using DRBD?

    Read the article

  • drbd block device as storage for kvm virtual machine

    - by facha
    Hi, everyone I've setup a drbd replication between two machines and used a drbd block device as storage for a kvm machine. Everything is running well. However I'm in doubt if this setup is ok to use. From what I've read so far on the internet, people tend to use drbd-ospf-qcow2_file as storage for their virtual machines.

    Read the article

  • Proxmox drbd configuration split brain [on hold]

    - by AudioDan
    I am planning a proxmox HA configuration with two Dell R710 machines (dual 6 core processors in each) with enterprise level drive raid arrays. I would be using DRBD with a quorum disk on a third machine. I would dedicate two 1GB nics on each server to the DRBD communications. We would have approximately 12 to 14 Virtual Machines running on this pair of servers. The proxmox manual recommends creating two DRBD resources - one for the Virtual Machines that normally run on ServerA and one for the Virtual Machines that normally run on ServerB. This is because of the Primary/Primary state in which this configuration runs. If both servers have VMs talking to the same DRBD resource and a split brain situation occurs, there is potential for data corruption that must be resolved. While I understand it would take more effort to create new virtual machines, can anybody foresee any potential problems with running a separate DRBD resource for each VM instead? Does anyone have experience running a setup that way and has it worked well? It seems to me that would allow more flexibility in moving machines back and forth.

    Read the article

  • Pygrub with DRBD on Xen 3.2

    - by Joril
    Hi all, we have a two-node cluster using DRBD 8.2 on CentOS 5.2 64bit. The cluster runs a few VMs on top of Xen 3.2.1, here's the configuration for an Ubuntu Jaunty VM: name = 'dev' bootloader = '/usr/bin/pygrub' memory = '512' vif = [ 'ip=192.168.1.217,mac=00:16:3E:CD:60:80' ] disk = [ 'phy:/dev/drbd24,xvda1,w', 'phy:/dev/drbd25,xvda2,w' ] As you can see, the disks are specified like "phy:", and as such pygrub doesn't know a thing about the underlying drbd device... So my problem is that even though the VM boots just fine, it doesn't handle the state of the drbd device. As a result, when for some reason the device gets to a secondary/secondary state, the VM won't boot, and I have to manually specify which node is primary. I read that starting with Xen 3.3 pygrub understands the "drbd:" specification, and I think that it would fix my problem, but I can't upgrade Xen at the moment... Is there a workaround? For example, could I use the 3.3 version of pygrub? Thanks!

    Read the article

  • DRBD not syncing between my nodes when IP is reset

    - by ramdaz
    I am trying to setup DRBD by following the article at http://www.howtoforge.com/setting-up-network-raid1-with-drbd-on-ubuntu-11.10-p2 I am using Ubuntu 10.04 DRBD - 8.3.11 In the first run I had everything working perfectly and when shifting the systems to a production environment I decided to restart the Meta Data creation part and start from scratch. The IPs had changed entirely in the production environment. Issuing drdbadm create-md r0 in both the servers runs successfully. But when I do "drbdadm -- --overwrite-data-of-peer primary all" on the primary it fails to start the re sync. My config file is as given below resource r0 { protocol C; syncer { rate 50M; } startup { wfc-timeout 15; degr-wfc-timeout 60; } net { cram-hmac-alg sha1; shared-secret "aklsadkjlhdbskjndsf8738734jkfkjfkjf"; } on primaryds { device /dev/drbd0; disk /dev/md2; address 172.16.7.1:7788; meta-disk internal; } on secondaryds { device /dev/drbd0; disk /dev/md2; address 172.16.7.3:7788; meta-disk internal; } } Status on primary root at primaryds:~# cat /proc/drbd version: 8.3.7 (api:88/proto:86-91) GIT-hash: ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by root at primaryds, 2012-05-12 15:08:01 0: cs:WFBitMapS ro:Primary/Secondary ds:UpToDate/Inconsistent C r---- ns:0 nr:0 dw:0 dr:200 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:5690352828 Status on secondary root at secondaryds:/etc/drbd.d# cat /proc/drbd version: 8.3.7 (api:88/proto:86-91) GIT-hash: ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by root at secondaryds, 2012-05-12 15:25:25 0: cs:WFBitMapT ro:Secondary/Primary ds:Inconsistent/UpToDate C r---- ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:5690352828 Log of Primary May 30 13:42:23 primaryds kernel: [ 1584.057076] block drbd0: role( Secondary -> Primary ) disk( Inconsistent -> UpToDate ) May 30 13:42:23 primaryds kernel: [ 1584.086264] block drbd0: Forced to consider local data as UpToDate! May 30 13:42:23 primaryds kernel: [ 1584.086303] block drbd0: Creating new current UUID May 30 13:42:26 primaryds kernel: [ 1586.405551] block drbd0: drbd_sync_handshake: May 30 13:42:26 primaryds kernel: [ 1586.405564] block drbd0: self E8A075F378173D4B:0000000000000004:0000000000000000:0000000000000000 bits:1422588207 flags:0 May 30 13:42:26 primaryds kernel: [ 1586.405574] block drbd0: peer 0000000000000004:0000000000000000:0000000000000000:0000000000000000 bits:1422588207 flags:0 May 30 13:42:26 primaryds kernel: [ 1586.405582] block drbd0: uuid_compare()=2 by rule 30 May 30 13:42:26 primaryds kernel: [ 1586.405587] block drbd0: Becoming sync source due to disk states. May 30 13:42:26 primaryds kernel: [ 1586.405592] block drbd0: Writing the whole bitmap, full sync required after drbd_sync_handshake. May 30 13:42:27 primaryds kernel: [ 1588.171638] block drbd0: 5427 GB (1422588207 bits) marked out-of-sync by on disk bit-map. May 30 13:42:27 primaryds kernel: [ 1588.172769] block drbd0: conn( Connected -> WFBitMapS ) Log in Secondary May 30 13:42:24 secondaryds kernel: [ 1563.304894] block drbd0: peer( Secondary - Primary ) pdsk( Inconsistent - UpToDate ) May 30 13:42:24 secondaryds kernel: [ 1563.339674] block drbd0: drbd_sync_handshake: May 30 13:42:24 secondaryds kernel: [ 1563.339685] block drbd0: self 0000000000000004:0000000000000000:0000000000000000:0000000000000000 bits:1422588207 flags:0 May 30 13:42:24 secondaryds kernel: [ 1563.339695] block drbd0: peer E8A075F378173D4B:0000000000000004:0000000000000000:0000000000000000 bits:1422588207 flags:0 May 30 13:42:24 secondaryds kernel: [ 1563.339703] block drbd0: uuid_compare()=-2 by rule 20 May 30 13:42:24 secondaryds kernel: [ 1563.339709] block drbd0: Becoming sync target due to disk states. May 30 13:42:24 secondaryds kernel: [ 1563.339714] block drbd0: Writing the whole bitmap, full sync required after drbd_sync_handshake. May 30 13:42:26 secondaryds kernel: [ 1565.652342] block drbd0: 5427 GB (1422588207 bits) marked out-of-sync by on disk bit-map. May 30 13:42:26 secondaryds kernel: [ 1565.652965] block drbd0: conn( Connected - WFBitMapT ) The serves are not responding once it reaches this stage. Tried redoing it couple of time but noting happens. Why could the resync not be taking place? I would like some advice? Directions?

    Read the article

  • DRBD Replication failure

    - by user62513
    A couple of weeks ago I setup a 2 nodes CRM system with one of the resources managed being MySQL over DRBD. Today for maintenance reasons I restarted both nodes but now they can't connect to each other anymore. DRBD fell out of sync and I followed this guide to get it back connected but it's only able to run successfully on one node. But this strange thing happens: If I crm node standby both nodes and I try: crm node online node0 before crm node online node1, all the CRM resources start successfully but the DRBD partitions are still running in StandAlone state. crm node online node1 beofre crm node online node0, the DRBD resource fails to start, thus causing mysql not to start. If I standby both resources and call crm node online node0 then it times out and prints this error: Running crm node online node0 produces this output after timing out Error setting standby=off (section=nodes, set=<null>): Remote node did not respond Error performing operation: Remote node did not respond Is there anything I'm doing wrong here? An alternative will be just do MySQL replication but I'm not sure how to promote a slave to master when the master database is not available.

    Read the article

  • DRBD as a block device for XEN VM (Centos 5.3)

    - by SaberTooth
    Hi all, I have setup a drbd resource between 2 server nodes - everything works correctly when doing sync tests between the two. (I want to create a HA cluster using drbd,xen and heartbeat) However, when I try and create a XEN VM with Centos as guest operating system, I get through to the partitioning screen on the install but when I select a partitioning type the next screen gives me the following error : "An error has occurred - no valid devices were found on which to create new file systems. Please check your hardware for the cause of this problem." This is the first time attempting create a setup like this and searching Google does not help much... my config files for DRBD and XEN.... DRBD (just the section that is pertinent) on xennode0 { device /dev/drbd0; disk /dev/sda5; address X.X.X.X:7788; flexible-meta-disk internal; } on xennode1 { device /dev/drbd0; disk /dev/sda5; address X.X.X.X:7788; meta-disk internal; } XEN kernel = "/boot/xeninstall/vmlinuz" ramdisk = "/boot/xeninstall/initrd.img" extra = "text" name = "VM" maxmem = 3000 memory = 3000 vcpus = 4 on_poweroff = "destroy" on_reboot = "restart" on_crash = "restart" vfb = [ ] disk = [ "phy:/dev/drbd0,sda1,w", "tap:aio:/srv/xen/xenswap.img,sda2,w" ] vif = [ "mac=00:16:3e:11:67:ae,bridge=xenbr0" ] root = "/dev/sda1 ro" Thanks in advance!

    Read the article

  • NFS failover WITHOUT DRBD?

    - by user439407
    So I am trying to set up a redundant NFS share in a cloud environment(all links internal, half gig links), and I am looking into using heartbeat for failover, but all the guides seem to be about combining DRBD and heartbeat to create a robust environment. If need be I can do that, but since my content is almost completely static, I would like to avoid the extra overhead and complexity of DRBD if possible, but still be able to fail over if one of the NFS servers fails. Is it possible to use heartbeat with NFS to achieve high-availability without using DRBD to copy the blocks? I am not married to NFSv4, so if NFSv3 over UDP is necessary, that won't be a problem(only a very small number of clients will be connecting to the share) Any comments are appreciated.

    Read the article

  • DRBD on a disk with existing file system that takes all the place

    - by Karolis T.
    I'm currently trying to simulate the environment via XEN. I have installed two debian systems with such FS layout: cltest1:/etc# df -h Filesystem Size Used Avail Use% Mounted on /dev/xvda2 6.0G 417M 5.2G 8% / tmpfs 257M 0 257M 0% /lib/init/rw udev 10M 16K 10M 1% /dev tmpfs 257M 4.0K 257M 1% /dev/shm Host cltest2 is identical. Here's my drbd.conf global { minor-count 1; } resource mysql { protocol C; syncer { rate 10M; # 10 Megabytes } on cltest1 { device /dev/drbd0; disk /dev/xvda2; address 192.168.1.186:7789; meta-disk internal; } on cltest2 { device /dev/drbd0; disk /dev/xvda2; address 192.168.1.187:7789; meta-disk internal; } } I have not created filesystem on drbd0 Starting DRBD via init.d script errors out with: Starting DRBD resources: [ d(mysql) /dev/drbd0: Failure: (114) Lower device is already claimed. This usually means it is mounted. [mysql] cmd /sbin/drbdsetup /dev/drbd0 disk /dev/xvda2 /dev/xvda2 internal --set-defaults --create-device failed - continuing! Running: drbdadm create-md mysql gives: cltest1:/etc# drbdadm create-md mysql md_offset 6442446848 al_offset 6442414080 bm_offset 6442217472 Found ext3 filesystem which uses 6291456 kB current configuration leaves usable 6291228 kB Device size would be truncated, which would corrupt data and result in 'access beyond end of device' errors. You need to either * use external meta data (recommended) * shrink that filesystem first * zero out the device (destroy the filesystem) Operation refused. Command 'drbdmeta /dev/drbd0 v08 /dev/xvda2 internal create-md' terminated with exit code 40 drbdadm aborting As I understand, all of my problems are because I don't have unallocated disk space on xvda2. What are my options besides shrinking FS and connecting a separate physical disk? Can't the meta-data be stored on a file in the local filesystem?

    Read the article

  • DRBD setup problem

    - by cuthieu
    I'm so new to DRBD, please help me fixing the problem below. Enclosed my drbd.conf. Many thanks [[email protected] ~]# drbdadm create-md all open(/dev/hdb3) failed: No such file or directory Command 'drbdmeta /dev/drbd0 v08 /dev/hdb3 internal create-md' terminated with exit code 20 drbdsetup exited with code 20 [[email protected] ~]# vi /etc/drbd.conf global { usage-count no; } resource repdata { protocol C; startup { wfc-timeout 0; degr-wfc-timeout 120; } disk { on-io-error detach; } # or panic # net { cram-hmac-alg "hdd1"; shared-secret "testing"; } syncer { rate 10M; } on skonkwerks1 { device /dev/drbd0; disk /dev/hdb1; address 172.29.156.1:7788; meta-disk internal; } on skonkwerks2 { device /dev/drbd0; disk /dev/hdb1; address 172.29.156.2:7788; meta-disk internal; } }

    Read the article

  • IP issue with Heartbeat & DRBD

    - by adam0345
    I'm in the process of setting up 3-node stacked DRBD, and i'm experiencing a rather bizarre issue. Two nodes are located at the data center, and the 3rd node is located locally. The Primary and Secondary nodes are working as expected, however the 3rd node won't connect to the primary. If I ping the IP provided by heartbeat on the 3rd node it will return 100% packet loss, if I reset networking interfaces, ping will then return a few successful packets, but then stop returning any packets. I can't work out any reason why this would be behaving like this. All nodes are running Debian Squeeze, and the latest version of DRBD.

    Read the article

  • Ubuntu Server mdadm drbd ocfs2 kvm hangs under heavy file reading

    - by Stefano Annese
    I have deployed four ubuntu 10.04 server. They are coupled two by two in a cluster scenario. on both sides we have software raid1 disks, drbd8 and OCFS2 and on top of it some kvm machines run with qcow2 disks. I followed this: Link corosync is just used for DRBD and OCFS, the kvm machines are run "manually" When it works is fine: good performances, good I/O, but at a given time one of the two cluster started hanging. Then we tried with just one server turned on and it hangs the same. It seems to happen when an heavy READ in one of the virtual machines occurs, that is during rsyn backup. When the fact occurs the virtual machines are not reachable any more and the real server responds with good delay to the ping but no screen and no ssh is available. All we can do is force shutdown (hold the button) and restart and when it turns on again the raid on which relay drbd is resyncing. All the time it hangs we see such fact. After a couple of week of pain on one side this morning also the other cluster hung, but it has different moteherboard, ram, kvm instances. What is similar is reading for rsync scenario and Western Digital RAID Edistion disks on both side. Can anybody give me some input to solve such issue?

    Read the article

  • DRBD stacked resources: recovering from failure

    - by Marcus Downing
    We're running a stacked four-node DRBD setup like this: A --> B | | v v C D This means three DRBD resources running across these four servers. Servers A and B are Xen hosts running VMs, while servers C and D are for backups. A is in the same datacentre as C. From server A to server C, in the first datacentre, using protocol B From server B to server D, in the second datacentre, using protocol B From server A to server B, different datacentres, stacked resource using protocol A First question: booting a stacked resource We haven't got any vital data running on this setup yet - we're still making sure it works first. This means simulating power cuts, network outages etc and seeing what steps we need to recover. When we pull the power out of server A, both resources go down; it attempts to bring them back up at next boot. However, it only succeeds at bringing up the lower-level resource, A-C. The stacked resource A-B doesn't even try to connect, presumably because it can't find the device until it's a connected primary on the lower level. So if anything goes wrong we need to manually log in and bring that resource up, then start the virtual machine on top of it. Second question: setting the primary of a stacked resource Our lower-level resources are configured so that the right one is considered primary: resource test-AC { on A { ... } on C { ... } startup { become-primary-on A; } } But I don't see any way to do the same with a stacked resource, as the following isn't a valid config: resource test-AB { stacked-on-top-of test-AC { ... } stacked-on-top-of test-BD { ... } startup { become-primary-on test-AC; } } This too means that recovering from a failure requires manual intervention. Is there no way to set the automatic primary for a stacked resource?

    Read the article

  • Why do I see a large performance hit with DRBD?

    - by BHS
    I see a much larger performance hit with DRBD than their user manual says I should get. I'm using DRBD 8.3.7 (Fedora 13 RPMs). I've setup a DRBD test and measured throughput of disk and network without DRBD: dd if=/dev/zero of=/data.tmp bs=512M count=1 oflag=direct 536870912 bytes (537 MB) copied, 4.62985 s, 116 MB/s / is a logical volume on the disk I'm testing with, mounted without DRBD iperf: [ 4] 0.0-10.0 sec 1.10 GBytes 941 Mbits/sec According to Throughput overhead expectations, the bottleneck would be whichever is slower, the network or the disk and DRBD should have an overhead of 3%. In my case network and I/O seem to be pretty evenly matched. It sounds like I should be able to get around 100 MB/s. So, with the raw drbd device, I get dd if=/dev/zero of=/dev/drbd2 bs=512M count=1 oflag=direct 536870912 bytes (537 MB) copied, 6.61362 s, 81.2 MB/s which is slower than I would expect. Then, once I format the device with ext4, I get dd if=/dev/zero of=/mnt/data.tmp bs=512M count=1 oflag=direct 536870912 bytes (537 MB) copied, 9.60918 s, 55.9 MB/s This doesn't seem right. There must be some other factor playing into this that I'm not aware of. global_common.conf global { usage-count yes; } common { protocol C; } syncer { al-extents 1801; rate 33M; } data_mirror.res resource data_mirror { device /dev/drbd1; disk /dev/sdb1; meta-disk internal; on cluster1 { address 192.168.33.10:7789; } on cluster2 { address 192.168.33.12:7789; } } For the hardware I have two identical machines: 6 GB RAM Quad core AMD Phenom 3.2Ghz Motherboard SATA controller 7200 RPM 64MB cache 1TB WD drive The network is 1Gb connected via a switch. I know that a direct connection is recommended, but could it make this much of a difference? Edited I just tried monitoring the bandwidth used to try to see what's happening. I used ibmonitor and measured average bandwidth while I ran the dd test 10 times. I got: avg ~450Mbits writing to ext4 avg ~800Mbits writing to raw device It looks like with ext4, drbd is using about half the bandwidth it uses with the raw device so there's a bottleneck that is not the network.

    Read the article

  • DRBD with MySQL

    - by tdimmig
    Question about using DRBD to provide HA for MySQL. I need to be sure that my backup MySQL instance is always going to be in a functional state when the failover occurs. What happens, for example, if the primary dies part way through committing a transaction? Are we going to end up with data copied to the secondary that mysql can't handle? Or, what if the network goes away while the two are syncing, and not all of the data makes it across. It seems like it's possible to get into a state where incomplete data on the secondary makes it impossible for mysql to start up and read the database. Am I missing something?

    Read the article

  • Drbd Primary/Primary + iSCSI: accessing to different files avoids split brain?

    - by Eddie C.
    I have a question / curiosity about split-brain on a Drbd Primary/Primary configuration. Supposing two nodes (hosts), host1 and host2 configured with Drbd Primary/Primary and two different shares (NFS, CIFS o iSCSI) of a replicated area (saying /drbd) /drbd/file1.data /drbd/file2.data If a pool of client would access only by host1 share reading and wrinting only file1.data and another pool only by host2 share to file2.data, this scenario should avoid split brain situation in case of one node failure or it's just a conjecture? The final purpose is load balance between the two nodes in normal condition and collapsing to one node only in case of failure. Thank you! Eddie

    Read the article

  • DRBD and MySQL - Virtualbox Setup

    DRBD is a Linux project that provides a real-time distributed filesystem. Sean Hull demonstrates how to use Sun's virtualbox software to create a pair of VMs, then configure those VMs with DRBD, and finally install and test MySQL running on volumes sitting on DRBD.

    Read the article

  • DRBD and MySQL - Virtualbox Setup

    DRBD is a Linux project that provides a real-time distributed filesystem. Sean Hull demonstrates how to use Sun's virtualbox software to create a pair of VMs, then configure those VMs with DRBD, and finally install and test MySQL running on volumes sitting on DRBD.

    Read the article

  • Possible to use DRBD on two ESXi virtualized servers?

    - by chen
    I have two servers (attached disks have been set up as hardware RAID1 for disk device level failure resilience). Here is the setup in my mind: 1) Install ESXi on each of the physical server, M1, M2; 2) Start one VM on each of the ESXi virtualized physical server V1, V2; 3) Install the DRDB drivers within V1 and V2. Essentially, this is a "virtualizing machine running DRBD in the VM's instead of bare metal hardware" idea. My question is whether the above setup can achieve the same "networked RAID1" goal that DRDB can achieve in the bare-metal physical machines (http://www.drbd.org/). Thanks. [EDIT] I found this (http://serverfault.com/questions/49305/drbd-experimentation-and-virtualization) is a similar question, but the answer does not seem to be firmative enough for me to follow.

    Read the article

  • DRBD and MySQL - Heartbeat Setup

    Heartbeat automates all the moving parts and can work as well with the MySQL master-master active/passive solution as well as it can with the MySQL & DRBD solution. It manages the virtual IP address used by the database, directs DRBD to become primary, or relinquish primary duties, mounts the /dev/drbd0 device, and starts/stops MySQL as needed.

    Read the article

  • DRBD and MySQL - Heartbeat Setup

    Heartbeat automates all the moving parts and can work as well with the MySQL master-master active/passive solution as well as it can with the MySQL & DRBD solution. It manages the virtual IP address used by the database, directs DRBD to become primary, or relinquish primary duties, mounts the /dev/drbd0 device, and starts/stops MySQL as needed.

    Read the article

  • When to use MySQL replication or DRBD for HA on Xen VM?

    - by user62513
    I'm setting up a database which needs to be needs to provide High Availabilty. My primary concern is high performance and robustness (I don't want something that will fail fast and badly). The database is accessed by the application at an average of 300 qps. It's will run on Xen VMs and it has some InnoDB tables as well as MyISAM tables. The VMs are connected via ethernet 100Mbit/s ethernet cables. Which of the two - MySQL replication or DRBD - would you recommend in such a situation? Or should I use DRBD to make the master database Highly Available and use MySQL replication on the slaves? I'm a developer so these things are all not so easy for me to make a sound judgement.

    Read the article

1 2 3 4  | Next Page >