Tuesday, December 14, 2010

Part Of The Reason For The Delay in Deploying Incursion Part 2

CCP had to delay the release of the second Incursion patch from today to tomorrow. CCP Red Button posted one of the reasons yesterday on the forums.

Thought I'd post some pictures of Stuff™ that happened today for those interested.

[Images: burned bladecenter, burned blade 1, burned blade 2]

As you can see, we had a decent bonfire burning in our datacenter today. While not the primary reason for the delayed patch tomorrow, this was definitely a pretty big contributing factor. This morning we had an alarm triggered on a power supply in one of the bladecenters containing the Singularity test cluster. Reseating the power supply, which is part of a routine troubleshooting procedure before replacing it with a spare part, somehow mysteriously, even magically (cause as yet undetermined, and this is the first time we have had this happen), caused an internal short circuit in the bladecenter, completely fried one of the blades and seriously singed a few of the others. As you can imagine, this caused us to immediately pull the bladecenter out of service, shutting down our primary public test server while we were juggling hamsters between cages to make Stuff™ work again.

Having Singularity, which is our primary staging server, shut down for several hours on the day before a big deployment is far from ideal; in fact it is critical, as we use every hour right up until the last minute to test, retest and test again that everything is working. So add this to the other issues we were dealing with today, and the prudent thing to do was to delay the patch and have a second dress rehearsal tomorrow, this time hopefully without incident.

Oh, and by the way, the hamsters are still alive, if somewhat visibly shaken.
