External Event that Affected Data Centers and Lessons Learnt

[Photo: Tianjin explosion]

Background

Exactly one year before the day this article was posted, on 12th August 2015 at about 22:51 hours, a fire broke out at a dangerous goods warehouse in Tianjin BinHai District. After the fire services had arrived on site, an explosion rocked the site at 23:30 hours, followed by a second explosion at 23:34 hours.

There were at least two data center facilities within a 2km radius of the site and another three facilities within a 3.8km radius. The data centers were already built and running before the dangerous goods warehouse (constructed in 2013) came into existence.

As I noted in one of my earlier posts on data center site selection, it is one thing to assess environmental factors and the distance from dangerous goods storage facilities before selecting a data center site. The same exercise should also be repeated annually to evaluate whether the environmental risk has changed.

[Image: Tianjin incident]

Data Centers in the Blast Radius

The data centers within this 2km circle include Tencent, the Standard Chartered Bank Tianjin backend processing center (which contains a data center), the National Supercomputer Center, the China Hewlett Packard cloud solution centre, and the LiePin (a job search company) data center. Data centers that are further away but still within Tianjin include Sohu, 58TongCheng, 21Vianet, China Telecom and China Unicom.

Data centers known to have stopped operations or suffered impaired IT services:

  1. National Supercomputer Center
  2. Tencent Tianjin Data Center

The Tencent Tianjin Data Center had to stop operations because the authorities ordered an evacuation, as the chemical fumes from the incident site were harmful to humans. Damage to existing data center equipment and facilities was not too serious, because the wall of the facility closest to the explosion site was still undergoing fitting-out work as part of the phase 2 project. However, the force of the explosions was so great that some air-handling units awaiting installation for phase 2 were moved a meter or so.

Tencent transferred its Tianjin Data Center workload to its main data center in Shenzhen and evacuated its people. Its vendors were informed and stood ready to repair damaged equipment once the evacuation order was lifted.

From Tencent's response and preparedness, we can deduce that a few key factors made it ready for a major power or site accessibility incident:

  1. The People Factor: the people at both the Tianjin and Shenzhen facilities were trained and ready for such a major event.
  2. The System Factor: Tencent has several public-facing systems such as QQ, WeChat, WeChat Pay, Tencent eMall, gaming, video streaming and so forth. All of these backend systems are kept up to date, and the frontend apps are engineered so that they can be served from any of Tencent's data centers. Tencent probably had to take steps on the backend to ensure that the data integrity of its transactional systems (e.g. online gaming and Tencent eMall) was preserved despite the unavailability of the Tianjin Data Center facility.
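
The idea of a frontend that can be served from any data center can be illustrated with a minimal sketch. This is not Tencent's actual mechanism, which is not public in this post. The `DataCenter` class, the `healthy` flag, the `priority` field and the `pick_serving_site` function are all hypothetical names introduced purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class DataCenter:
    name: str
    healthy: bool   # hypothetical result of a site health probe
    priority: int   # hypothetical preference; lower number = preferred site


def pick_serving_site(sites):
    """Return the preferred healthy site, or None if every site is down."""
    healthy_sites = [s for s in sites if s.healthy]
    if not healthy_sites:
        return None
    return min(healthy_sites, key=lambda s: s.priority)


# Illustrative state only: the preferred site is unavailable (e.g. evacuated),
# so traffic should be directed to the next healthy site.
sites = [
    DataCenter("Tianjin", healthy=False, priority=1),
    DataCenter("Shenzhen", healthy=True, priority=2),
]

serving = pick_serving_site(sites)
print(serving.name if serving else "no site available")
```

A real deployment would also need to handle replication lag and in-flight transactions, which is the data integrity concern mentioned above. This sketch covers only the site selection step.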

 

From what some Tencent staff shared on social media, however, they were unprepared in terms of:

  1. Food and Drinking Water Supply
  2. Accommodation
  3. Up-to-date news, although this was probably due to the lockdown or to disparate information from the authorities during the first 24-48 hours of the incident.

[Image: Tianjin news]

[Photos: inside an office]

Another Similar Incident

On 21st August 2015, an explosion attributed to a generator in a building basement occurred in downtown Los Angeles (http://www.datacenterknowledge.com/archives/2015/08/21/explosion-downtown-los-angeles-disrupts-data-center-operations/).

Quote: “The blast at 811 West Wilshire Blvd. took out an on-site power station, leaving 12 buildings in the area without electricity, according to the local utility.

The explosion interrupted connectivity on network infrastructure operated by Level 3 Communications, which serves a lot of data center users in the area, Craig VerColen, a spokesman for LogMeIn, a company whose data center went dark as a result of the incident, said via email. Level 3 issued a statement saying its technicians were working to restore services.”

 

List of DOs for Data Center Service Providers

  • Regular review of site profile and surroundings, assess risk and devise mitigating measures
  • Business Continuity and Disaster Recovery plan and exercise
  • Evacuation Plan and exercise
  • Disaster Response Checklist (update those that are not up to date, check your fuel supply vendor contacts and their ability to supply fuel)
  • Updated Vendor Contact List
  • Include a damage assessment checklist that lists all the major equipment and systems, covering the inside of the building, the data halls, plant rooms, fuel tanks, and the surroundings.
  • Stay in touch with every staff member, on and off shift
  • Update staff on safe transport routes
  • Procure food supplies and store them in the office
  • Inform your legal and finance people (document your loss, and ask your legal and finance teams to prepare for insurance claims)
  • Get additional resources ready, including staff and transport
  • TIA-942-A (2012) states that a data center site should be more than 0.4km away from a chemical plant. It is better to be safe, for example by being more than 3.2km away.
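
The regular site review and the distance guidance above can be sketched as a small script that measures the distance from a data center to each known hazard and flags anything inside the two thresholds quoted in this post. This is a rough illustration only: the coordinates and hazard names are made up, the function names are hypothetical, and the great-circle (haversine) distance is a simplification that ignores terrain, wind direction and blast characteristics.

```python
import math

EARTH_RADIUS_KM = 6371.0
MIN_DISTANCE_KM = 0.4        # minimum distance quoted in this post
PREFERRED_DISTANCE_KM = 3.2  # more conservative distance suggested in this post


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def classify(distance_km):
    """Map a distance to a review outcome using the two thresholds above."""
    if distance_km <= MIN_DISTANCE_KM:
        return "too close"
    if distance_km <= PREFERRED_DISTANCE_KM:
        return "review"
    return "ok"


# Hypothetical coordinates, for illustration only.
site = (39.00, 117.70)
hazards = [
    ("Hazard A (hypothetical)", 39.01, 117.70),
    ("Hazard B (hypothetical)", 39.10, 117.70),
]

for name, lat, lon in hazards:
    d = haversine_km(site[0], site[1], lat, lon)
    print(f"{name}: {d:.2f} km -> {classify(d)}")
```

Re-running such a check each year against an updated hazard list is one way to notice a new dangerous goods facility that appears after the data center was built, as happened in Tianjin.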

 

References:

  1. https://zh.wikipedia.org/wiki/2015%E5%B9%B4%E5%A4%A9%E6%B4%A5%E6%B8%AF%E5%8D%B1%E5%8C%96%E5%93%81%E5%80%89%E5%BA%AB%E7%88%86%E7%82%B8%E4%BA%8B%E6%95%85
  2. http://www.missionmode.com/blog/business-continuity-lessons-tianjin-port-explosion/
  3. http://www.datacenterknowledge.com/archives/2015/08/21/explosion-downtown-los-angeles-disrupts-data-center-operations/