Search Results

Search found 13 results on 1 pages for 'deadtime'.

Page 1/1 | 1 

  • How do you setLayoutParams() for an ImageView?

    - by DeadTime
    I want to set the LayoutParams for an ImageView but cant seem to find out the proper way to do it. I can only find documentation in the API for the various ViewGroups, but not an ImageView. Yet the ImageView seems to have this functionality. This code doesn't work... myImageView.setLayoutParams(new ImageView.LayoutParams(30,30)); How do I do it?

    Read the article

  • How can I use a prefix for the fields in my strongly typed View?

    - by deadtime
    Hi, I have an EditProfile action in my PersonController, that finds a Person object by ID and does return View(person). The EditProfile view has a bunch of fields that should be filled in with the information from the Person object. One of the fields is named "Person.FirstName", and is not filled in automatically when the View is rendered. However, if I rename the textbox to "FirstName", it gets filled in. My question is: How can I specify to the view that it should use the "Person" prefix when filling in the information from the Person object? The reason I want to do this is because of consistency - in some of my views I allow the user to edit more than one object at the same time, and use prefixes in order to bind these correctly.

    Read the article

  • IP failover with 2 nodes on different subnet: cannot ping virtual IP from second node?

    - by quanta
    I'm going to setup redundant failover Redmine: another instance was installed on the second server without problem MySQL (running on the same machine with Redmine) was configured as master-master replication Because they are in different subnet (192.168.3.x and 192.168.6.x), it seems that VIPArip is the only choice. /etc/ha.d/ha.cf on node1 logfacility none debug 1 debugfile /var/log/ha-debug logfile /var/log/ha-log autojoin none warntime 3 deadtime 6 initdead 60 udpport 694 ucast eth1 node2.ip keepalive 1 node node1 node node2 crm respawn /etc/ha.d/ha.cf on node2: logfacility none debug 1 debugfile /var/log/ha-debug logfile /var/log/ha-log autojoin none warntime 3 deadtime 6 initdead 60 udpport 694 ucast eth0 node1.ip keepalive 1 node node1 node node2 crm respawn crm configure show: node $id="6c27077e-d718-4c82-b307-7dccaa027a72" node1 node $id="740d0726-e91d-40ed-9dc0-2368214a1f56" node2 primitive VIPArip ocf:heartbeat:VIPArip \ params ip="192.168.6.8" nic="lo:0" \ op start interval="0" timeout="20s" \ op monitor interval="5s" timeout="20s" depth="0" \ op stop interval="0" timeout="20s" \ meta is-managed="true" property $id="cib-bootstrap-options" \ stonith-enabled="false" \ dc-version="1.0.12-unknown" \ cluster-infrastructure="Heartbeat" \ last-lrm-refresh="1338870303" crm_mon -1: ============ Last updated: Tue Jun 5 18:36:42 2012 Stack: Heartbeat Current DC: node2 (740d0726-e91d-40ed-9dc0-2368214a1f56) - partition with quorum Version: 1.0.12-unknown 2 Nodes configured, unknown expected votes 1 Resources configured. ============ Online: [ node1 node2 ] VIPArip (ocf::heartbeat:VIPArip): Started node1 ip addr show lo: 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo inet 192.168.6.8/32 scope global lo inet6 ::1/128 scope host valid_lft forever preferred_lft forever I can ping 192.168.6.8 from node1 (192.168.3.x): # ping -c 4 192.168.6.8 PING 192.168.6.8 (192.168.6.8) 56(84) bytes of data. 64 bytes from 192.168.6.8: icmp_seq=1 ttl=64 time=0.062 ms 64 bytes from 192.168.6.8: icmp_seq=2 ttl=64 time=0.046 ms 64 bytes from 192.168.6.8: icmp_seq=3 ttl=64 time=0.059 ms 64 bytes from 192.168.6.8: icmp_seq=4 ttl=64 time=0.071 ms --- 192.168.6.8 ping statistics --- 4 packets transmitted, 4 received, 0% packet loss, time 3000ms rtt min/avg/max/mdev = 0.046/0.059/0.071/0.011 ms but cannot ping virtual IP from node2 (192.168.6.x) and outside. Did I miss something? PS: you probably want to set IP2UTIL=/sbin/ip in the /usr/lib/ocf/resource.d/heartbeat/VIPArip resource agent script if you get something like this: Jun 5 11:08:10 node1 lrmd: [19832]: info: RA output: (VIPArip:stop:stderr) 2012/06/05_11:08:10 ERROR: Invalid OCF_RESK EY_ip [192.168.6.8] http://www.clusterlabs.org/wiki/Debugging_Resource_Failures Reply to @DukeLion: Which router receives RIP updates? When I start the VIPArip resource, ripd was run with below configuration file (on node1): /var/run/resource-agents/VIPArip-ripd.conf: hostname ripd password zebra debug rip events debug rip packet debug rip zebra log file /var/log/quagga/quagga.log router rip !nic_tag no passive-interface lo:0 network lo:0 distribute-list private out lo:0 distribute-list private in lo:0 !metric_tag redistribute connected metric 3 !ip_tag access-list private permit 192.168.6.8/32 access-list private deny any

    Read the article

  • heartbeat: Bad nodename in /etc/ha.d//haresources [node1]

    - by Richard
    I'm trying to start heartbeat on Ubuntu 10.04 with service heartbeat start, but getting the following errors: heartbeat[24829]: 2011/11/22_19:31:07 ERROR: Bad nodename in /etc/ha.d//haresources [node1] heartbeat[24829]: 2011/11/22_19:31:07 ERROR: Configuration error, heartbeat not started. On on server uname -n produces loadb1, on the second server uname -n produces loadb2. The two servers can ping each other okay with those names. This is /etc/ha.d/ha.cnf on both servers: debugfile /var/log/ha-debug logfile /var/log/ha-log logfacility local0 keepalive 2 deadtime 10 udpport 694 bcast eth1 ucast eth0 my.external.ip ucast eth0 my.external.ip ucast eth1 10.0.0.5 ucast eth1 10.0.0.6 #udp eth0 node loadb1 node loadb2 auto_failback off And this is /etc/ha.d/haresources on both servers: node1 IPaddr::46.20.121.113 httpd smb dhcpd Authkeys is also set up. What am I doing wrong? The part where I'm least clear is the ucast/bcast lines.

    Read the article

  • How to setup heartbeat for IP fail over on SSH failure

    - by Tony
    I wonder if anyone can help me, I am trying to setup heartbeat on a redhat 5 to failover an IP address when ssh stops responding on a server. So basically you ssh to a VIP and then get put through which ever server has the floating ip. 192.168.0.100 | | /------------------------\ | /------------------------\ | Server 01 | | | Server 02 | | eth0 - 192.168.0.1 |-----/ | eth0 - 192.168.0.2 | | eth0:0 - 192.168.0.100 | | eth0:0 - down | \------------------------/ \------------------------/ if ssh stops responding i want eth0:0 to be brought up on the second machine to allow ssh connections to carry on being served. I have tried to follow some documents I have found online so here is my current configuration: ha.cf bcast eth0 keepalive 2 warntime 10 deadtime 30 initdead 120 udpport 694 auto_failback off node vm-bal01 node vm-bal02 debugfile /var/log/ha-debug logfile /var/log/ha-log authkeys auth 1 1 sha1 sshhhsecret1234 haresources server01 192.168.0.100/24/eth0:0/192.168.0.255 Hope someone can help as this is driving me nuts...

    Read the article

  • configure Heartbeat on Centos Linux - error message

    - by Elad Dotan
    I installed Heartbeat on my Centos Linux and it seems to partially work..but I'm trying to monitor a service with no success. only when I reboot the main server the backup server takes over. in the logs I get : heartbeat[30476]: 2012/03/20_18:51:57 WARN: string2msg_ll: node [node1] failed authentication heartbeat[30476]: 2012/03/20_18:51:58 WARN: string2msg_ll: node [node02] failed authentication the authkeys is identical (copied from one to another). this is my ha.cf: logfile /var/log/ha-log logfacility local0 keepalive 2 deadtime 30 initdead 120 bcast eth0 udpport 694 auto_failback on node server01.com node server02.com haresources : server01.com 38.108.117.3 aim chat any idea how to fix the problem so if a service stops the other server take over Thanks! E.

    Read the article

  • Cisco PIX firewall blocking inbound Exchange email

    - by sumsaricum
    [Cisco PIX, SBS2003] I can telnet server port 25 from inside but not outside, hence all inbound email is blocked. (as an aside, inbox on iPhones do not list/update emails, but calendar works a charm) I'm inexperienced in Cisco PIX and looking for some assistance before mails start bouncing :/ interface ethernet0 auto interface ethernet1 100full nameif ethernet0 outside security0 nameif ethernet1 inside security100 hostname pixfirewall domain-name ciscopix.com fixup protocol dns maximum-length 512 fixup protocol ftp 21 fixup protocol h323 h225 1720 fixup protocol h323 ras 1718-1719 fixup protocol http 80 fixup protocol rsh 514 fixup protocol rtsp 554 fixup protocol sip 5060 fixup protocol sip udp 5060 fixup protocol skinny 2000 no fixup protocol smtp 25 fixup protocol sqlnet 1521 fixup protocol tftp 69 names name 192.168.1.10 SERVER access-list inside_outbound_nat0_acl permit ip 192.168.1.0 255.255.255.0 192.168.1.96 255.255.255.240 access-list outside_cryptomap_dyn_20 permit ip any 192.168.1.96 255.255.255.240 access-list outside_acl permit tcp any host 213.xxx.xxx.xxx eq 3389 access-list outside_acl permit tcp any interface outside eq ftp access-list outside_acl permit tcp any host 213.xxx.xxx.xxx eq https access-list outside_acl permit tcp any host 213.xxx.xxx.xxx eq www access-list outside_acl permit tcp any interface outside eq 993 access-list outside_acl permit tcp any interface outside eq imap4 access-list outside_acl permit tcp any interface outside eq 465 access-list outside_acl permit tcp any host 213.xxx.xxx.xxx eq smtp access-list outside_cryptomap_dyn_40 permit ip any 192.168.1.96 255.255.255.240 access-list COMPANYVPN_splitTunnelAcl permit ip 192.168.1.0 255.255.255.0 any access-list COMPANY_splitTunnelAcl permit ip 192.168.1.0 255.255.255.0 any access-list outside_cryptomap_dyn_60 permit ip any 192.168.1.96 255.255.255.240 access-list COMPANY_VPN_splitTunnelAcl permit ip 192.168.1.0 255.255.255.0 any access-list outside_cryptomap_dyn_80 permit ip any 192.168.1.96 255.255.255.240 pager lines 24 icmp permit host 217.157.xxx.xxx outside mtu outside 1500 mtu inside 1500 ip address outside 213.xxx.xxx.xxx 255.255.255.128 ip address inside 192.168.1.1 255.255.255.0 ip audit info action alarm ip audit attack action alarm ip local pool VPN 192.168.1.100-192.168.1.110 pdm location 0.0.0.0 255.255.255.128 outside pdm location 0.0.0.0 255.255.255.0 inside pdm location 217.yyy.yyy.yyy 255.255.255.255 outside pdm location SERVER 255.255.255.255 inside pdm logging informational 100 pdm history enable arp timeout 14400 global (outside) 1 interface nat (inside) 0 access-list inside_outbound_nat0_acl nat (inside) 1 0.0.0.0 0.0.0.0 0 0 static (inside,outside) tcp 213.xxx.xxx.xxx 3389 SERVER 3389 netmask 255.255.255.255 0 0 static (inside,outside) tcp 213.xxx.xxx.xxx smtp SERVER smtp netmask 255.255.255.255 0 0 static (inside,outside) tcp 213.xxx.xxx.xxx https SERVER https netmask 255.255.255.255 0 0 static (inside,outside) tcp 213.xxx.xxx.xxx www SERVER www netmask 255.255.255.255 0 0 static (inside,outside) tcp interface imap4 SERVER imap4 netmask 255.255.255.255 0 0 static (inside,outside) tcp interface 993 SERVER 993 netmask 255.255.255.255 0 0 static (inside,outside) tcp interface 465 SERVER 465 netmask 255.255.255.255 0 0 static (inside,outside) tcp interface ftp SERVER ftp netmask 255.255.255.255 0 0 access-group outside_acl in interface outside route outside 0.0.0.0 0.0.0.0 213.zzz.zzz.zzz timeout xlate 0:05:00 timeout conn 1:00:00 half-closed 0:10:00 udp 0:02:00 rpc 0:10:00 h225 1:00:00 timeout h323 0:05:00 mgcp 0:05:00 sip 0:30:00 sip_media 0:02:00 timeout sip-disconnect 0:02:00 sip-invite 0:03:00 timeout uauth 0:05:00 absolute aaa-server TACACS+ protocol tacacs+ aaa-server TACACS+ max-failed-attempts 3 aaa-server TACACS+ deadtime 10 aaa-server RADIUS protocol radius aaa-server RADIUS max-failed-attempts 3 aaa-server RADIUS deadtime 10 aaa-server RADIUS (inside) host SERVER *** timeout 10 aaa-server LOCAL protocol local http server enable http 217.yyy.yyy.yyy 255.255.255.255 outside http 192.168.1.0 255.255.255.0 inside no snmp-server location no snmp-server contact snmp-server community public no snmp-server enable traps floodguard enable sysopt connection permit-ipsec crypto ipsec transform-set ESP-3DES-MD5 esp-3des esp-md5-hmac crypto dynamic-map outside_dyn_map 20 match address outside_cryptomap_dyn_20 crypto dynamic-map outside_dyn_map 20 set transform-set ESP-3DES-MD5 crypto dynamic-map outside_dyn_map 40 match address outside_cryptomap_dyn_40 crypto dynamic-map outside_dyn_map 40 set transform-set ESP-3DES-MD5 crypto dynamic-map outside_dyn_map 60 match address outside_cryptomap_dyn_60 crypto dynamic-map outside_dyn_map 60 set transform-set ESP-3DES-MD5 crypto dynamic-map outside_dyn_map 80 match address outside_cryptomap_dyn_80 crypto dynamic-map outside_dyn_map 80 set transform-set ESP-3DES-MD5 crypto map outside_map 65535 ipsec-isakmp dynamic outside_dyn_map crypto map outside_map client authentication RADIUS LOCAL crypto map outside_map interface outside isakmp enable outside isakmp policy 20 authentication pre-share isakmp policy 20 encryption 3des isakmp policy 20 hash md5 isakmp policy 20 group 2 isakmp policy 20 lifetime 86400 telnet 217.yyy.yyy.yyy 255.255.255.255 outside telnet 0.0.0.0 0.0.0.0 inside telnet timeout 5 ssh 217.yyy.yyy.yyy 255.255.255.255 outside ssh 0.0.0.0 255.255.255.0 inside ssh timeout 5 management-access inside console timeout 0 dhcpd address 192.168.1.20-192.168.1.40 inside dhcpd dns SERVER 195.184.xxx.xxx dhcpd wins SERVER dhcpd lease 3600 dhcpd ping_timeout 750 dhcpd auto_config outside dhcpd enable inside : end I have Kiwi SysLog running but could use some pointers in that regard to narrow down the torrent of log messages, if that helps?!

    Read the article

  • How does Heartbeat determine when to switch to the secondary? Can you force it to switch?

    - by John
    I've been trying to understand exactly how Heartbeat works - I understand how when one server dies, it switches to the backup. But, for me, it also switches when the primary has a large increase in workload. But, it doesn't always switch at the same value. There doesn't seem to much information on the web about how it works. The best I've found is this article. How does Heartbeat determine when to switch to the secondary, and how does it determine when it switch back to the primary? Is this an editable setting, and can I force it to switch between one and the other? Sometimes when Heartbeat will switch to the secondary, it takes a few days or I've even seen two weeks before it switches back to the primary. This is well after the primary traffic has gone down. I'm currently using BlueOnyx, and my Heartbeat settings are: Auto Failback: on Keepalive: 1 seconds Warntime: 10 seconds Deadtime: 20 seconds Initdead: 30 seconds

    Read the article

  • Heartbeat/DRBD failover didn't work as expected. How do I make the failover more robust?

    - by Quinn Murphy
    I had a scenario where a DRBD-heartbeat set up had a failed node but did not failover. What happened was the primary node had locked up, but didn't go down directly (it was inaccessible via ssh or with the nfs mount, but it could be pinged). The desired behavior would have been to detect this and failover to the secondary node, but it appears that since the primary didn't go full down (there is a dedicated network connection from server to server), heartbeat's detection mechanism didn't pick up on that and therefore didn't failover. Has anyone seen this? Is there something that I need to configure to have more robust cluster failover? DRBD seems to otherwise work fine (had to resync when I rebooted the old primary), but without good failover, it's use is limited. heartbeat 3.0.4 drbd84 RHEL 6.1 We are not using Pacemaker nfs03 is the primary server in this setup, and nfs01 is the secondary. ha.cf # Hearbeat Logging logfacility daemon udpport 694 ucast eth0 192.168.10.47 ucast eth0 192.168.10.42 # Cluster members node nfs01.openair.com node nfs03.openair.com # Hearbeat communication timing. # Sets the triggers and pulse time for swapping over. keepalive 1 warntime 10 deadtime 30 initdead 120 #fail back automatically auto_failback on and here is the haresources file: nfs03.openair.com IPaddr::192.168.10.50/255.255.255.0/eth0 drbddisk::data Filesystem::/dev/drbd0::/data::ext4 nfs nfslock

    Read the article

  • Linux HA cluster w/Xen, Heartbeat, Pacemaker. domU does not failover to secondary node

    - by Kendall
    I am having the followig problem with an OenSuSE + Heartbeat + Pacemaker + Xen HA cluster: when the node a Xen domU is running on is "dead" the Xen domU running on it is not restarted on the second node. The cluster is setup with two nodes, each running OpenSuSE-11.3, Heartbeat 3.0, and Pacemaker 1.0 in CRM mode. For storage I am using a LUN on an iSCSI SAN device; the LUN is formatted with OCFS2 and managed with LVM. The Xen domU has two logical volumes; one for root and the other for swap. I am using IPMI cards for STONITH devices, and a dedicated ethernet link for heartbeat communications. The ha.cf file is as follows: keepalive 1 deadtime 10 warntime 5 udpport 694 ucast eth1 auto_failback off node dhcp-166 node stage use_logd yes crm yes My resources look as follows: shocrm(live)configure# show node $id="5c1aa924-bba4-4f95-a367-6c9a58ac4a38" dhcp-166 node $id="cebc92eb-af24-4833-aaf0-672adf80b58e" stage primitive Xen-Util ocf:heartbeat:Xen \ meta target-role="Started" \ operations $id="Xen-Util-operations" \ op start interval="0" timeout="60" start-delay="0" \ op stop interval="0" timeout="120" \ params xmfile="/etc/xen/vm/xen-util" primitive my-stonith stonith:external/ipmi \ params hostname="dhcp-166" ipaddr="192.168.3.106" userid="ADMIN" passwd="xxx" \ op monitor interval="2m" timeout="60s" primitive my-stonith2 stonith:external/ipmi \ params hostname="stage" ipaddr="192.168.3.105" userid="ADMIN" passwd="xxx" \ op monitor interval="2m" timeout="60s" property $id="cib-bootstrap-options" \ dc-version="1.0.9-89bd754939df5150de7cd76835f98fe90851b677" \ cluster-infrastructure="Heartbeat" The Xen domU config file is as follows: name = "xen-util" bootloader = "/usr/lib/xen/boot/domUloader.py" #bootargs = "xvda1:/vmlinuz-xen,/initrd-xen" bootargs = "--entry=xvda1:/boot/vmlinuz-xen,/boot/initrd-xen" memory = 4096 disk = [ 'phy:vg_xen/xen-util-root,xvda1,w', 'phy:vg_xen/xen-util-swap,xvda2,w', ] root = "/dev/xvda1" vif = [ 'mac=00:16:3e:42:42:06' ] #vfb = [ 'type=vnc,vncunused=0,vnclisten=192.168.3.172' ] extra = "" Say domU "Xen-Util" is running on node "stage"; if "stage" goes down, "Xen-Util" does not restart on node "dhcp-166". It seems to want to try as an "xm list" will show it for a few seconds and if you "xm console xen-util" it will give a message like "copying /boot/kernel.gz from xvda1 to /var/lib/xen/tmp/kernel.a53gs for booting". However, it never gets past that, eventually gives up, and no longer appears in "xm list". Now, when node "stage" comes back online after being power cycled, it detects that "Xen-Util" isn't running, and starts it (on stage). I've tried starting "Xen-Util" on node "dhcp-166" without the cluster running, and it works fine. No problems. So, I know it works in that respect. Any ideas? Thanks!

    Read the article

  • How to keep group-writeable shares on Samba with OSX clients?

    - by Oliver Salzburg
    I have a FreeNAS server on a network with OSX and Windows clients. When the OSX clients interact with SMB/CIFS shares on the server, they are causing permission problems for all other clients. Update: I can no longer verify any answers because we abandoned the project, but feel free to post any help for future visitors. The details of this behavior seem to also be dependent on the version of OSX the client is running. For this question, let's assume a client running 10.8.2. When I mount the CIFS share on an OSX client and create a new directory on it, the directory will be created with drwxr-x-rx permissions. This is undesirable because it will not allow anyone but me to write to the directory. There are other users in my group which should have write permissions as well. This behavior happens even though the following settings are present in smb.conf on the server: [global] create mask= 0666 directory mask= 0777 [share] force directory mode= 0775 force create mode= 0660 I was under the impression that these settings should make sure that directories are at least created with rwxrwxr-x permissions. But, I guess, that doesn't stop the client from changing the permissions after creating the directory. When I create a folder on the same share from a Windows client, the new folder will have the desired access permissions (rwxrwxrwx), so I'm currently assuming that the problem lies with the OSX client. I guess this wouldn't be such an issue if you could easily change the permissions of the directories you've created, but you can't. When opening the directory info in Finder, I get the old "You have custom access" notice with no ability to make any changes. I'm assuming that this is caused because we're using Windows ACLs on the share, but that's just a wild guess. Changing the write permissions for the group through the terminal works fine, but this is unpractical for the deployment and unreasonable to expect from anyone to do. This is the complete smb.conf: [global] encrypt passwords = yes dns proxy = no strict locking = no read raw = yes write raw = yes oplocks = yes max xmit = 65535 deadtime = 15 display charset = LOCALE max log size = 10 syslog only = yes syslog = 1 load printers = no printing = bsd printcap name = /dev/null disable spoolss = yes smb passwd file = /var/etc/private/smbpasswd private dir = /var/etc/private getwd cache = yes guest account = nobody map to guest = Bad Password obey pam restrictions = Yes # NOTE: read smb.conf. directory name cache size = 0 max protocol = SMB2 netbios name = freenas workgroup = COMPANY server string = FreeNAS Server store dos attributes = yes hostname lookups = yes security = user passdb backend = ldapsam:ldap://ldap.company.local ldap admin dn = cn=admin,dc=company,dc=local ldap suffix = dc=company,dc=local ldap user suffix = ou=Users ldap group suffix = ou=Groups ldap machine suffix = ou=Computers ldap ssl = off ldap replication sleep = 1000 ldap passwd sync = yes #ldap debug level = 1 #ldap debug threshold = 1 ldapsam:trusted = yes idmap uid = 10000-39999 idmap gid = 10000-39999 create mask = 0666 directory mask = 0777 client ntlmv2 auth = yes dos charset = CP437 unix charset = UTF-8 log level = 1 [share] path = /mnt/zfs0 printable = no veto files = /.snap/.windows/.zfs/ writeable = yes browseable = yes inherit owner = no inherit permissions = no vfs objects = zfsacl guest ok = no inherit acls = Yes map archive = No map readonly = no nfs4:mode = special nfs4:acedup = merge nfs4:chown = yes hide dot files force directory mode = 0775 force create mode = 0660

    Read the article

  • Unexpected start of already-primary server processes when heartbeat on secondary is stopped.

    - by vorik
    Hi, I've got an active-passive Heartbeat cluster with Apache, MySQL, ActiveMQ and DRBD. Today, I wanted to perform hardware-maintenance on the secondary node (node04), so I stopped the heartbeat service before shutting it down. Then, the primary node (node03) received a shutdown notice from the secondary node (node04). This logging comes from the primary node: node03 heartbeat[4458]: 2010/03/08_08:52:56 info: Received shutdown notice from 'node04.companydomain.nl'. heartbeat[4458]: 2010/03/08_08:52:56 info: Resources being acquired from node04.companydomain.nl. harc[27522]: 2010/03/08_08:52:56 info: Running /etc/ha.d/rc.d/status status heartbeat[27523]: 2010/03/08_08:52:56 info: Local Resource acquisition completed. mach_down[27567]: 2010/03/08_08:52:56 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired mach_down[27567]: 2010/03/08_08:52:56 info: mach_down takeover complete for node node04.companydomain.nl. heartbeat[4458]: 2010/03/08_08:52:56 info: mach_down takeover complete. harc[27620]: 2010/03/08_08:52:56 info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp ip-request-resp[27620]: 2010/03/08_08:52:56 received ip-request-resp drbddisk OK yes ResourceManager[27645]: 2010/03/08_08:52:56 info: Acquiring resource group: node03.companydomain.nl drbddisk Filesystem::/dev/drbd0::/data::ext3 mysql apache::/etc/httpd/conf/httpd.conf LVSSyncDaemonSwap::master monitor activemq tivoli-cluster MailTo::[email protected]::DRBDFailureDrisAcc MailTo::[email protected]::DRBDFailureDrisAcc 1.2.3.212 ResourceManager[27645]: 2010/03/08_08:52:56 info: Running /etc/ha.d/resource.d/drbddisk start Filesystem[27700]: 2010/03/08_08:52:57 INFO: Running OK ResourceManager[27645]: 2010/03/08_08:52:57 info: Running /etc/ha.d/resource.d/mysql start mysql[27783]: 2010/03/08_08:52:57 Starting MySQL[ OK ] apache[27853]: 2010/03/08_08:52:57 INFO: Running OK ResourceManager[27645]: 2010/03/08_08:52:57 info: Running /etc/ha.d/resource.d/monitor start monitor[28160]: 2010/03/08_08:52:58 ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/activemq start activemq[28210]: 2010/03/08_08:52:58 Starting ActiveMQ Broker... ActiveMQ Broker is already running. ResourceManager[27645]: 2010/03/08_08:52:58 ERROR: Return code 1 from /etc/ha.d/resource.d/activemq ResourceManager[27645]: 2010/03/08_08:52:58 CRIT: Giving up resources due to failure of activemq ResourceManager[27645]: 2010/03/08_08:52:58 info: Releasing resource group: node03.companydomain.nl drbddisk Filesystem::/dev/drbd0::/data::ext3 mysql apache::/etc/httpd/conf/httpd.conf LVSSyncDaemonSwap::master monitor activemq tivoli-cluster MailTo::[email protected]::DRBDFailureDrisAcc MailTo::[email protected]::DRBDFailureDrisAcc 1.2.3.212 ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/IPaddr 1.2.3.212 stop IPaddr[28329]: 2010/03/08_08:52:58 INFO: ifconfig eth0:0 down IPaddr[28312]: 2010/03/08_08:52:58 INFO: Success ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/MailTo [email protected] DRBDFailureDrisAcc stop MailTo[28378]: 2010/03/08_08:52:58 INFO: Success ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/MailTo [email protected] DRBDFailureDrisAcc stop MailTo[28433]: 2010/03/08_08:52:58 INFO: Success ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/tivoli-cluster stop ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/activemq stop activemq[28503]: 2010/03/08_08:53:01 Stopping ActiveMQ Broker... Stopped ActiveMQ Broker. ResourceManager[27645]: 2010/03/08_08:53:01 info: Running /etc/ha.d/resource.d/monitor stop monitor[28681]: 2010/03/08_08:53:01 ResourceManager[27645]: 2010/03/08_08:53:01 info: Running /etc/ha.d/resource.d/LVSSyncDaemonSwap master stop LVSSyncDaemonSwap[28714]: 2010/03/08_08:53:02 info: ipvs_syncmaster down LVSSyncDaemonSwap[28714]: 2010/03/08_08:53:02 info: ipvs_syncbackup up LVSSyncDaemonSwap[28714]: 2010/03/08_08:53:02 info: ipvs_syncmaster released ResourceManager[27645]: 2010/03/08_08:53:02 info: Running /etc/ha.d/resource.d/apache /etc/httpd/conf/httpd.conf stop apache[28782]: 2010/03/08_08:53:03 INFO: Killing apache PID 18390 apache[28782]: 2010/03/08_08:53:03 INFO: apache stopped. apache[28771]: 2010/03/08_08:53:03 INFO: Success ResourceManager[27645]: 2010/03/08_08:53:03 info: Running /etc/ha.d/resource.d/mysql stop mysql[28851]: 2010/03/08_08:53:24 Shutting down MySQL.....................[ OK ] ResourceManager[27645]: 2010/03/08_08:53:24 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /data ext3 stop Filesystem[29010]: 2010/03/08_08:53:25 INFO: Running stop for /dev/drbd0 on /data Filesystem[29010]: 2010/03/08_08:53:25 INFO: Trying to unmount /data Filesystem[29010]: 2010/03/08_08:53:25 ERROR: Couldn't unmount /data; trying cleanup with SIGTERM Filesystem[29010]: 2010/03/08_08:53:25 INFO: Some processes on /data were signalled Filesystem[29010]: 2010/03/08_08:53:27 INFO: unmounted /data successfully Filesystem[28999]: 2010/03/08_08:53:27 INFO: Success ResourceManager[27645]: 2010/03/08_08:53:27 info: Running /etc/ha.d/resource.d/drbddisk stop heartbeat[4458]: 2010/03/08_08:53:29 WARN: node node04.companydomain.nl: is dead heartbeat[4458]: 2010/03/08_08:53:29 info: Dead node node04.companydomain.nl gave up resources. heartbeat[4458]: 2010/03/08_08:53:29 info: Link node04.companydomain.nl:eth0 dead. heartbeat[4458]: 2010/03/08_08:53:29 info: Link node04.companydomain.nl:eth1 dead. hb_standby[29193]: 2010/03/08_08:53:57 Going standby [foreign]. heartbeat[4458]: 2010/03/08_08:53:57 info: node03.companydomain.nl wants to go standby [foreign] Soo... What just happened here??? Heartbeat on node04 stopped and told node03, which was the active node at the time. Somehow, node03 decided to start the cluster processes that were already running. (For the processes that are not critical, I always return a 0 from the startupscript so it does not stops the entire cluster when a non-essential part fails.) When starting ActiveMQ, it returns status 1 because it is already running. This fails the node and shuts everything down. As heartbeat is not running on the secondary node, it cannot failover to there. When I tried to run ha_takeover to restart the resources, absolutely nothing happened. Only after I restarted heartbeat on the primary node the resources could be started (after a delay of 2 minutes). These are my questions: Why does heartbeat on the primary node try to start the cluster processes again? Why did ha_takeover not work? What can I do to prevent this from happening? Server configuration: DRBD: version: 8.3.7 (api:88/proto:86-91) GIT-hash: ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by [email protected], 2010-01-20 09:14:48 0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate B r---- ns:0 nr:6459432 dw:6459432 dr:0 al:0 bm:301 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0 uname -a Linux node04 2.6.18-164.11.1.el5 #1 SMP Wed Jan 6 13:26:04 EST 2010 x86_64 x86_64 x86_64 GNU/Linux haresources node03.companydomain.nl \ drbddisk \ Filesystem::/dev/drbd0::/data::ext3 \ mysql \ apache::/etc/httpd/conf/httpd.conf \ LVSSyncDaemonSwap::master \ monitor \ activemq \ tivoli-cluster \ MailTo::[email protected]::DRBDFailureDrisAcc \ MailTo::[email protected]::DRBDFailureDrisAcc \ 1.2.3.212 ha.cf debugfile /var/log/ha-debug logfile /var/log/ha-log keepalive 500ms deadtime 30 warntime 10 initdead 120 udpport 694 mcast eth0 225.0.0.3 694 1 0 mcast eth1 225.0.0.4 694 1 0 auto_failback off node node03.companydomain.nl node node04.companydomain.nl respawn hacluster /usr/lib64/heartbeat/dopd apiauth dopd gid=haclient uid=hacluster Thank you very much in advance, Ger Apeldoorn

    Read the article

  • Heartbeat won't successfully start up resources from a cold boot when a failed node is present

    - by Matthew
    I currently have two ubuntu servers running Heartbeat and DRBD. The servers are directory connected with a 1000Mbps crossover cable on eth1 and have access to an IP camera LAN on eth0. Now, let's say that one node is down and the remaining functional node is booting after having been shut down. The node that is still functioning won't start up heartbeat and provide access to the drbd resource from a cold boot. I have to manually restart heartbeat by sudo service heartbeat restart to get everything up and running. How can I get it to start fine from a cold start, when only one server is present? Here is the ha.cf: debug /var/log/ha-debug logfile /var/log/ha-log logfacility none keepalive 2 deadtime 10 warntime 7 initdead 60 ucast eth1 192.168.2.2 ucast eth0 10.1.10.201 node EMserver1 node EMserver2 respawn hacluster /usr/lib/heartbeat/ipfail ping 10.1.10.22 10.1.10.21 10.1.10.11 auto_failback off Some material from the syslog: harc[4604]: 2012/11/27_13:54:49 info: Running /etc/ha.d//rc.d/status status mach_down[4632]: 2012/11/27_13:54:49 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired mach_down[4632]: 2012/11/27_13:54:49 info: mach_down takeover complete for node emserver2. Nov 27 13:54:49 EMserver1 heartbeat: [4586]: info: Initial resource acquisition complete (T_RESOURCES(us)) Nov 27 13:54:49 EMserver1 heartbeat: [4586]: info: mach_down takeover complete. IPaddr[4679]: 2012/11/27_13:54:49 INFO: Resource is stopped Nov 27 13:54:49 EMserver1 heartbeat: [4605]: info: Local Resource acquisition completed. harc[4713]: 2012/11/27_13:54:49 info: Running /etc/ha.d//rc.d/ip-request-resp ip-request-resp ip-request-resp[4713]: 2012/11/27_13:54:49 received ip-request-resp IPaddr::10.1.10.254 OK yes ResourceManager[4732]: 2012/11/27_13:54:50 info: Acquiring resource group: emserver1 IPaddr::10.1.10.254 drbddisk::r0 Filesystem::/dev/drbd1::/shr::ext4 nfs-kernel-server IPaddr[4759]: 2012/11/27_13:54:50 INFO: Resource is stopped ResourceManager[4732]: 2012/11/27_13:54:50 info: Running /etc/ha.d/resource.d/IPaddr 10.1.10.254 start IPaddr[4816]: 2012/11/27_13:54:50 INFO: Using calculated nic for 10.1.10.254: eth0 IPaddr[4816]: 2012/11/27_13:54:50 INFO: Using calculated netmask for 10.1.10.254: 255.255.255.0 IPaddr[4816]: 2012/11/27_13:54:50 INFO: eval ifconfig eth0:0 10.1.10.254 netmask 255.255.255.0 broadcast 10.1.10.255 IPaddr[4804]: 2012/11/27_13:54:50 INFO: Success ResourceManager[4732]: 2012/11/27_13:54:50 info: Running /etc/ha.d/resource.d/drbddisk r0 start Filesystem[4965]: 2012/11/27_13:54:50 INFO: Resource is stopped ResourceManager[4732]: 2012/11/27_13:54:50 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /shr ext4 start Filesystem[5039]: 2012/11/27_13:54:50 INFO: Running start for /dev/drbd1 on /shr Filesystem[5033]: 2012/11/27_13:54:51 INFO: Success ResourceManager[4732]: 2012/11/27_13:54:51 info: Running /etc/init.d/nfs-kernel-server start Nov 27 13:55:00 EMserver1 heartbeat: [4586]: info: Local Resource acquisition completed. (none) Nov 27 13:55:00 EMserver1 heartbeat: [4586]: info: local resource transition completed. Nov 27 13:57:46 EMserver1 heartbeat: [4586]: info: Heartbeat shutdown in progress. (4586) Nov 27 13:57:46 EMserver1 heartbeat: [5286]: info: Giving up all HA resources. ResourceManager[5301]: 2012/11/27_13:57:46 info: Releasing resource group: emserver1 IPaddr::10.1.10.254 drbddisk::r0 Filesystem::/dev/drbd1::/shr::ext4 nfs-kernel-server ResourceManager[5301]: 2012/11/27_13:57:46 info: Running /etc/init.d/nfs-kernel-server stop ResourceManager[5301]: 2012/11/27_13:57:46 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /shr ext4 stop Filesystem[5372]: 2012/11/27_13:57:46 INFO: Running stop for /dev/drbd1 on /shr Filesystem[5372]: 2012/11/27_13:57:47 INFO: Trying to unmount /shr Filesystem[5372]: 2012/11/27_13:57:47 INFO: unmounted /shr successfully Filesystem[5366]: 2012/11/27_13:57:47 INFO: Success ResourceManager[5301]: 2012/11/27_13:57:47 info: Running /etc/ha.d/resource.d/drbddisk r0 stop ResourceManager[5301]: 2012/11/27_13:57:47 info: Running /etc/ha.d/resource.d/IPaddr 10.1.10.254 stop IPaddr[5509]: 2012/11/27_13:57:47 INFO: ifconfig eth0:0 down IPaddr[5497]: 2012/11/27_13:57:47 INFO: Success Nov 27 13:57:47 EMserver1 heartbeat: [5286]: info: All HA resources relinquished. Nov 27 13:57:48 EMserver1 heartbeat: [4586]: info: killing /usr/lib/heartbeat/ipfail process group 4603 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBFIFO process 4589 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBWRITE process 4590 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBREAD process 4591 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBWRITE process 4592 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBREAD process 4593 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBWRITE process 4594 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBREAD process 4595 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBWRITE process 4596 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBREAD process 4597 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBWRITE process 4598 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBREAD process 4599 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4589 exited. 11 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4596 exited. 10 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4598 exited. 9 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4590 exited. 8 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4595 exited. 7 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4591 exited. 6 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4592 exited. 5 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4593 exited. 4 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4597 exited. 3 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4594 exited. 2 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4599 exited. 1 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: emserver1 Heartbeat shutdown complete. Here is some more from the log ResourceManager[2576]: 2012/11/28_16:32:42 info: Acquiring resource group: emserver1 IPaddr::10.1.10.254 drbddisk::r0 Filesystem::/dev/drbd1::/shr::ext4 nfs-kernel-server IPaddr[2602]: 2012/11/28_16:32:42 INFO: Running OK Filesystem[2653]: 2012/11/28_16:32:43 INFO: Running OK Nov 28 16:32:52 EMserver1 heartbeat: [1695]: WARN: node emserver2: is dead Nov 28 16:32:52 EMserver1 heartbeat: [1695]: info: Dead node emserver2 gave up resources. Nov 28 16:32:52 EMserver1 ipfail: [1807]: info: Status update: Node emserver2 now has status dead Nov 28 16:32:52 EMserver1 heartbeat: [1695]: info: Link emserver2:eth1 dead. Nov 28 16:32:53 EMserver1 ipfail: [1807]: info: NS: We are still alive! Nov 28 16:32:53 EMserver1 ipfail: [1807]: info: Link Status update: Link emserver2/eth1 now has status dead Nov 28 16:32:55 EMserver1 ipfail: [1807]: info: Asking other side for ping node count. Nov 28 16:32:55 EMserver1 ipfail: [1807]: info: Checking remote count of ping nodes. Nov 28 16:32:57 EMserver1 heartbeat: [1695]: info: Heartbeat shutdown in progress. (1695) Nov 28 16:32:57 EMserver1 heartbeat: [2734]: info: Giving up all HA resources. ResourceManager[2751]: 2012/11/28_16:32:57 info: Releasing resource group: emserver1 IPaddr::10.1.10.254 drbddisk::r0 Filesystem::/dev/drbd1::/shr::ext4 nfs-kernel-server ResourceManager[2751]: 2012/11/28_16:32:57 info: Running /etc/init.d/nfs-kernel-server stop ResourceManager[2751]: 2012/11/28_16:32:57 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /shr ext4 stop Filesystem[2829]: 2012/11/28_16:32:57 INFO: Running stop for /dev/drbd1 on /shr Filesystem[2829]: 2012/11/28_16:32:57 INFO: Trying to unmount /shr Filesystem[2829]: 2012/11/28_16:32:58 INFO: unmounted /shr successfully Filesystem[2823]: 2012/11/28_16:32:58 INFO: Success ResourceManager[2751]: 2012/11/28_16:32:58 info: Running /etc/ha.d/resource.d/drbddisk r0 stop ResourceManager[2751]: 2012/11/28_16:32:58 info: Running /etc/ha.d/resource.d/IPaddr 10.1.10.254 stop IPaddr[2971]: 2012/11/28_16:32:58 INFO: ifconfig eth0:0 down IPaddr[2958]: 2012/11/28_16:32:58 INFO: Success Nov 28 16:32:58 EMserver1 heartbeat: [2734]: info: All HA resources relinquished. Nov 28 16:32:59 EMserver1 heartbeat: [1695]: info: killing /usr/lib/heartbeat/ipfail process group 1807 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBFIFO process 1777 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBWRITE process 1778 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBREAD process 1779 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBWRITE process 1780 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBREAD process 1781 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBWRITE process 1782 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBREAD process 1783 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBWRITE process 1784 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBREAD process 1785 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBWRITE process 1786 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBREAD process 1787 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1778 exited. 11 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1779 exited. 10 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1780 exited. 9 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1781 exited. 8 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1782 exited. 7 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1783 exited. 6 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1784 exited. 5 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1785 exited. 4 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1786 exited. 3 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1787 exited. 2 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1777 exited. 1 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: emserver1 Heartbeat shutdown complete. If I restarted heartbeat at this point... the resources heartbeat controls would start up fine.... please help!

    Read the article

1