Update 2013-10-17 00:30 UTC
Packet loss stopped, back to 100% connectivity. Awaiting RFO from network ops.
Update 2013-10-16 18:45 UTC
Service restored but 25-30% packet loss ongoing. Pending RFO (Reason for Outage) report from network ops.
Update 2013-10-16 16:45 UTC
Network ops are performing a full reload on the affected routers.
Update 2013-10-16 16:15 UTC
Network ops have identified the problem and are working to resolve it as soon as possible. Still no firm ETA available.
Update 2013-10-16 15:45 UTC
Our hardware is working and links are up, but no traffic is reaching our switches. Network ops are investigating.
2013-10-16 15:15 UTC
We are currently experiencing an outage at our Amsterdam 1 datacenter. Engineers are working on a solution but we do not yet have a firm ETA. Please check back shortly for an update.
Customers affected include shared Web hosting, reseller Web hosting, ns1.anu.net and ns2.anu.net DNS resolvers (ns3 is hosted in Chicago and is still up), and customers with virtual servers hosted on ams1-cloudmin.anu.net.
The Cloudmin VM management system for our Amsterdam 2 datacenter crashed last night (11th October). The crash was due to a bug in the VM status collection system, which caused it to use excessive resources and eventually run out of memory.
We have resolved this issue by restarting the Cloudmin server. We have also applied an unofficial patch for this bug, pending the next maintenance release of Cloudmin, which will fix this known issue.
Services affected: DNS resolution for ams2-cloudmin.anu.net zone, Web GUI VM management for VMs in Amsterdam 2 datacenter, API/customer portal management of VMs in Amsterdam 2 datacenter.
A power distribution unit (PDU) failure at our Amsterdam 1 datacenter this morning took about a quarter of our Xen hosts there offline. Engineers are en route to replace the failed unit.
In the meantime, we have rebooted all affected virtual servers using spare capacity on the remaining Xen hosts. No data has been lost, as our redundant centralised storage servers were not affected.
Our customer portal, the shared.anu.net Lasso/PHP hosting server, and a handful of customer VMs were briefly affected by the outage, but all have now been restored.