Data Center Downtime at the Core and the Edge: A Survey of Frequency, Duration and Attitudes

Edge computing is expanding rapidly and reshaping the data center ecosystem as organizations across industries move computing and storage closer to users to improve response times and reduce bandwidth requirements.

While forms of distributed computing have been common in some sectors for years, this current evolution is distinct in that it is enabling a broad range of new and emerging applications and has higher criticality requirements than traditional distributed computing sites.

At the same time, core data center managers are dealing with increased complexity and balancing multiple and sometimes conflicting priorities that can compromise availability.

As a result, today’s data center networks are more vulnerable to downtime than ever before. In an effort to quantify that vulnerability, the Ponemon Institute conducted a study of downtime frequency, duration and attitudes at the core and the edge, sponsored by Vertiv.

The study is based on responses from 425 participants representing 132 data centers and 1,667 edge locations. All core and edge data centers included in the study are located in the United States/Canada and Latin America (LATAM).

The study found that both core and edge facilities are vulnerable to downtime. Core data centers experienced an average of 2.4 total facility shutdowns per year, with an average duration of more than two hours (138 minutes). This is in addition to almost 10 downtime events per year that were isolated to select racks or servers. At the edge, total facility shutdowns were even more frequent, although those outages lasted less than half as long as the ones in core data centers.

The study also looks at the attitudes that shape decisions regarding core and edge data centers to help identify factors that could be contributing to downtime events. More than half (54%) of all core data centers are not using best practices in system design and redundancy, and 69% say their risk of an unplanned outage is increased as a result of cost constraints.

Leading causes of unplanned downtime events at the core and the edge included cyberattacks, IT equipment failures, human error, UPS battery failure, and UPS equipment failure.

Finally, the study asked participants to identify the actions their organizations could take to prevent future downtime events. They identified activities ranging from investment in new equipment to infrastructure redundancy to improved training and documentation.

Key Findings

Facility Size
Edge data centers aren’t necessarily defined by size but by function. For the purpose of this research, edge data centers are defined as facilities that bring computation and data storage closer to the location where they are needed to improve response times and save bandwidth. Nevertheless, edge data centers were on average about one-third the size of the core data centers.

The extrapolated size for core data centers that participated in this study is 15,153 square feet/1,408 square meters. For edge computing facilities, the average size is 5,010 square feet/465 square meters.
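
As a quick sanity check on the figures above, the short sketch below converts the reported square footage to square meters and computes the edge-to-core size ratio. The variable names are illustrative and not from the study; the only inputs are the two averages quoted in the text.

```python
# Sanity check of the reported facility sizes (inputs taken from the text above).
SQFT_TO_SQM = 0.09290304  # exact conversion factor: (0.3048 m) squared

core_sqft = 15_153  # extrapolated average core data center size
edge_sqft = 5_010   # average edge facility size

core_sqm = core_sqft * SQFT_TO_SQM
edge_sqm = edge_sqft * SQFT_TO_SQM
edge_to_core_ratio = edge_sqft / core_sqft

print(f"Core: {core_sqm:.0f} square meters")
print(f"Edge: {edge_sqm:.0f} square meters")
print(f"Edge/core size ratio: {edge_to_core_ratio:.2f}")
```

The ratio comes out at roughly 0.33, which agrees with the "about one-third" description.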

Frequency of Core and Edge Downtime

Figure 3 shows the shutdown experience of participating core data centers over the past 24 months. Total data center shutdowns have the lowest frequency, at an average of 4.81 events. However, these events are also the most disruptive, and 4.81 unplanned total facility shutdowns in a 24-month period would be considered unacceptable by many organizations.

Partial outages of certain racks in the data center have the highest frequency, at 9.93 events over the same 24 months, followed by individual server outages at 9.43.

It can be difficult to directly compare the total number of downtime events in edge and core facilities due to the higher complexity generally found in core data centers and the increased presence of personnel in these facilities. However, it is possible to compare total facility shutdowns for core and edge data centers. Edge data centers experienced a slightly higher frequency of total facility shutdowns at an average of 5.39 over 24 months. As edge sites continue to proliferate, reducing the frequency of outages at the edge will become a high priority for many organizations.
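
The 24-month counts in this section can be related to the per-year figures in the summary with simple arithmetic. The sketch below is illustrative only: the dictionary keys and the helper `annualize` are hypothetical names, and the annual downtime estimate multiplies two reported averages (frequency and the 138-minute average duration), which assumes the two are independent. Treat that last number as a rough approximation, not a result from the study.

```python
# Average event counts per data center over the 24-month survey window,
# as reported in the text. Key names are illustrative.
EVENTS_24_MONTHS = {
    "core_total_shutdown": 4.81,
    "core_partial_rack_outage": 9.93,
    "core_server_outage": 9.43,
    "edge_total_shutdown": 5.39,
}
WINDOW_MONTHS = 24
CORE_AVG_SHUTDOWN_MINUTES = 138  # average core total-shutdown duration from the summary


def annualize(count, window_months=WINDOW_MONTHS):
    """Convert a count observed over `window_months` into a per-year rate."""
    return count * 12 / window_months


annual = {name: annualize(count) for name, count in EVENTS_24_MONTHS.items()}

# Rack-level and server-level outages combined, per year.
core_localized = annual["core_partial_rack_outage"] + annual["core_server_outage"]

# Rough estimate: average shutdowns per year times average minutes per shutdown.
core_shutdown_minutes = annual["core_total_shutdown"] * CORE_AVG_SHUTDOWN_MINUTES

# How much more often edge sites shut down completely, relative to core sites.
edge_vs_core = EVENTS_24_MONTHS["edge_total_shutdown"] / EVENTS_24_MONTHS["core_total_shutdown"]

for name, rate in annual.items():
    print(f"{name}: {rate:.2f} per year")
print(f"core localized outages: {core_localized:.2f} per year")
print(f"core total-shutdown downtime: {core_shutdown_minutes:.0f} minutes per year (estimate)")
print(f"edge vs. core total shutdowns: {edge_vs_core:.2f}x")
```

Annualized this way, core total shutdowns come to about 2.4 per year and combined rack and server outages to about 9.7 per year, which lines up with the "2.4" and "almost 10" figures in the summary. Edge total shutdowns work out to about 2.7 per year, roughly 12% more frequent than at the core.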

