Ethernet links went down: Sat Oct 30 12:33:22 CDT 2010
Ethernet links came up:   Sat Oct 30 12:33:58 CDT 2010

This caused the active load balancer to drop and the standby unit to take over; 36 seconds later, it cut back. The normally standby unit's ethernet links were unaffected during this, even though both load balancers are plugged into the same physical switches on adjacent ports. This points to a problem on the normally active load balancer, and I'll need to do more log searching to check for a potential permanent failure.

Summary: All services handled on our cluster are back online.

Techno-babble starts here...

The cutover from active to standby and back to active left the servers behind the load balancers with a cached ARP (Address Resolution Protocol) entry for their gateway that still pointed at the now-standby unit.

I know that sounds confusing, sorry! To help explain:

           gateway is on 192.168.x.254 xx:xx:xx:xx:xx:17 (normal active unit)
  failure  gateway is on 192.168.x.254 xy:xy:xy:xy:xy:4a (normal standby unit)
  cutback  gateway is on 192.168.x.254 xx:xx:xx:xx:xx:17 (normal active unit)

but the systems cached the ..:4a address when it changed and would not recheck until the entry expired. This is completely normal and desired behaviour.

  Oct 30 12:33:31 web-10 kernel: arp: 192.168.x.254 moved from xx:xx:xx:xx:xx:17 to xy:xy:xy:xy:xy:4a on em0

I had to log into each of the servers and clear the ARP cache entry for the gateway address, and everything came back online.

Web services and mail services handled on our cluster were affected during this event. All services are back online; the outage lasted approximately 15 minutes while I logged into each and every server. During this time some of the servers had already updated their ARP cache on their own, so the different services were back online faster than the time it took me to log into all of the servers.

I find no errors on services at this time and everything looks good.
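
For anyone who wants to see the ARP caching behaviour described above in miniature, here is a rough Python sketch. It is a toy model, not how any real kernel implements ARP. The class and function names are made up for illustration, the 1200-second expiry is an assumed value, and the model assumes the servers did not pick up a fresh announcement at cutback. Times are seconds after the links went down at 12:33:22.

```python
from dataclasses import dataclass
from typing import Dict, Optional

GATEWAY = "192.168.x.254"
ACTIVE_MAC = "xx:xx:xx:xx:xx:17"   # normal active unit
STANDBY_MAC = "xy:xy:xy:xy:xy:4a"  # normal standby unit


@dataclass
class ArpEntry:
    mac: str
    learned_at: float  # seconds on the sketch's own clock


class ArpCache:
    """Toy cache: an entry is trusted until it expires or is cleared."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        self.entries: Dict[str, ArpEntry] = {}

    def learn(self, ip: str, mac: str, now: float) -> None:
        # The host heard an ARP announcement or reply for this IP.
        self.entries[ip] = ArpEntry(mac, now)

    def clear(self, ip: str) -> None:
        # Stand-in for clearing the gateway entry by hand on a server.
        self.entries.pop(ip, None)

    def lookup(self, ip: str, now: float) -> Optional[str]:
        # Return the cached MAC, or None if missing or expired.
        entry = self.entries.get(ip)
        if entry is None or now - entry.learned_at >= self.ttl_seconds:
            self.entries.pop(ip, None)
            return None
        return entry.mac


def resolve(cache: ArpCache, ip: str, owner_mac: str, now: float) -> str:
    """MAC the host sends to; it only re-asks the network on a cache miss."""
    mac = cache.lookup(ip, now)
    if mac is None:
        cache.learn(ip, owner_mac, now)  # whoever owns the IP answers
        mac = owner_mac
    return mac


TTL = 1200.0  # assumed expiry, for illustration only

# Server A: gets its cache cleared by hand.
server_a = ArpCache(TTL)
server_a.learn(GATEWAY, ACTIVE_MAC, now=-100.0)  # before the event
server_a.learn(GATEWAY, STANDBY_MAC, now=9.0)    # 12:33:31, "moved from ... to ..."
# Cutback at t=36: the active unit owns the gateway again, cache unchanged.
during_outage = resolve(server_a, GATEWAY, ACTIVE_MAC, now=60.0)
server_a.clear(GATEWAY)
after_clear = resolve(server_a, GATEWAY, ACTIVE_MAC, now=120.0)

# Server B: nobody logs in, the entry simply ages out.
server_b = ArpCache(TTL)
server_b.learn(GATEWAY, STANDBY_MAC, now=9.0)
after_expiry = resolve(server_b, GATEWAY, ACTIVE_MAC, now=9.0 + TTL)

print("during outage:", during_outage)
print("after clear:  ", after_clear)
print("after expiry: ", after_expiry)
```

In this model the server keeps sending to the ..:4a address after cutback until its entry is either cleared by hand or ages out, which matches what I saw: some servers recovered on their own while I was still working through the rest.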

Now starts the investigation into why the normally active load balancer thought that *all* of its ethernet ports went down at precisely the same moment.

Support can be reached Monday thru Friday from 8:00am until 8:00pm via phone at 612-337-6340, or via email at [log in to unmask]

-- 
Mike Horwath
ipHouse - Welcome home!
[log in to unmask]

The universe is an island, surrounded by whatever it is that surrounds universes.
  - Berkeley Fortune