On 4/12/2016, alerts started coming in around 6:55PM CDT that something was causing a large load on one of our Zebi storage SAN units, where some of our VM (SVC/VDC) customers are stored. This outage only affects VMware hosting customers on our VMForge platform.

Many VMs were waiting on storage and stacking up latency on Zebi4, yet Zebi4 itself was showing lower-than-average utilization, with no alerts, problems, or outages reported. All stats looked normal, just a little lower than the load expected for this time of night.

On a hunch, we failed over the Zebi4 controller to the standby around 8:10PM CDT, and the system load on the storage node went up to a more normal pattern. This cleared the storage latency backlog across the board for the VMs on Zebi4. Our engineers are now reviewing all VMs stored there to make sure they have recovered, or to help them along if needed.

Unfortunately, there are no errors in the logs and nothing out of the ordinary on the (now standby) controller. By all appearances everything there is operating at 100%, but not in practice. Since the only way for the vendor to debug this state would be to put everybody back in the error condition, we're not going to pursue that at this time.

We are up to date on software revisions. There is one newer release available, although its release notes say it fixes something else; we'll most likely move to it after reviewing it more.

If you have any further problems or questions, please let us know at [log in to unmask], or call us at 612-337-6340.

Thank you.

--
Doug McIntyre <[log in to unmask]>
-- ipHouse/Goldengate/Bitstream/ProNS --
Network Engineer/Provisioning/Jack of all Trades