Professional Services

Management and Operations Guideline

This online tool provides a Management and Operations (M&O) guideline that identifies the behaviors that, independent of Tier level and installed infrastructure, affect the ability of a data center to meet its business and performance objectives. The guideline was developed specifically to highlight the M&O behaviors that apply regardless of data center design, encompassing both the day-to-day activities on the ground and upstream planning and decision making.

The behaviors were drawn directly from Uptime Institute's Tier Standard: Operational Sustainability. They were reviewed and verified by a Coalition composed of key stakeholders in the enterprise owner, outsourced operations, and third-party/multi-tenant industry segments, to ensure that the M&O criteria are compatible with a variety of management solutions and with multiple computing environments. This Guideline documents the results of that effort. The M&O criteria in this guideline were developed by and for the data center industry rather than re-interpreted from standards originating in other industries.

The behavior criteria in each of the management and operations components (staffing, maintenance, training, planning, and operating conditions) provide data center owners, operators, and managers with the minimum behaviors required to ensure solid 24x7 operation of a non-Tier facility. Focusing on the recommended behaviors will help a site attain the full performance and uptime potential of the installed infrastructure, improve the efficiency of operations, and realize opportunities for energy efficiency.

Management and Operations Behaviors

The M&O behaviors addressed in the categories below are essential for a site to achieve its full uptime potential, obtain maximum leverage of the installed infrastructure and design, operate the data center efficiently, and increase the opportunity for gaining energy efficiencies.

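The M&O behaviors fall into 5 categories: Staffing and Organization; Maintenance; Training; Planning, Coordination, and Management; and Operating Conditions.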

The set of behaviors in each category is independent of the infrastructure design, site location, and building characteristics elements found in Operational Sustainability, which are not directly applicable to measuring the day-to-day operations and management activities of the site.

Staffing and Organization:
The right number of qualified individuals organized correctly is critical to a data center meeting long-term performance objectives. Enough qualified in-house staff and/or vendor support must be available to perform all the maintenance activities and operate the data center to provide the greatest opportunity to meet the uptime objective. All personnel working in a data center must have the experience and technical qualifications necessary to perform their assigned activities without impacting the data center. The roles and responsibilities of each position should be defined and their criticality acknowledged by management. The data center organization needs to focus on achieving the desired uptime objective.

Maintenance:
An effective maintenance program consisting of preventive and predictive maintenance programs, vendor support, adequate resources, and a tracking capability is necessary to keep equipment in like-new condition and to minimize equipment failures. A preventive maintenance (PM) program that keeps equipment in top performance condition is the most effective way to minimize equipment failures. Fully scripted processes and procedures for accomplishing all necessary maintenance activities need to exist.

A maintenance management system (MMS) that tracks the status of equipment and trends maintenance activities is required for an effective maintenance program. An effective predictive maintenance program identifies potential issues before they can cause a problem and provides the information that management needs to better allocate maintenance resources. The MMS is critical for ensuring that maintenance is scheduled and completed, for tracking and justifying staffing loads, and for developing life-cycle plans and budgets. Tracking the status of maintenance activities is vital to minimizing deferred maintenance (any maintenance activity that is deferred becomes a risk to the data center). Tracking outages and determining their root cause is important to ensure that corrective actions can be taken to prevent recurrence.
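As an illustration only, the deferred-maintenance tracking described above can be sketched in a few lines of Python. The `WorkOrder` record, the equipment names, and the dates are hypothetical and do not come from the guideline or from any particular MMS product; the sketch only shows the kind of status query such a system would need to answer.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class WorkOrder:
    """Hypothetical maintenance record; field names are assumptions."""
    equipment: str
    due: date
    completed: Optional[date] = None


def deferred(orders: List[WorkOrder], today: date) -> List[WorkOrder]:
    """Return work orders that are past due and not yet completed."""
    return [o for o in orders if o.completed is None and o.due < today]


def completion_rate(orders: List[WorkOrder], today: date) -> float:
    """Fraction of work orders due on or before `today` that are completed."""
    due = [o for o in orders if o.due <= today]
    if not due:
        return 1.0
    return sum(o.completed is not None for o in due) / len(due)


# Made-up sample data for illustration.
orders = [
    WorkOrder("UPS-A", date(2024, 1, 10), completed=date(2024, 1, 9)),
    WorkOrder("CRAH-3", date(2024, 1, 15)),
    WorkOrder("GEN-1", date(2024, 2, 1)),
]
today = date(2024, 1, 20)

for order in deferred(orders, today):
    print(f"Deferred: {order.equipment} (due {order.due})")
print(f"Completion rate: {completion_rate(orders, today):.0%}")
```

A real MMS would also record root-cause findings for outages and trend these figures over time; the sketch covers only the deferred-work query.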

Any level of vendor support to maintain infrastructure should have a corresponding list of qualified vendors with formal contracts specifying the scope of work, call-in process, qualifications, and response times to ensure the level of service required meets the uptime objectives. Housekeeping is an equally important aspect of maintenance, keeping combustibles and contaminants out of the computer room and the critical environment.

Training:
A training program ensures that all personnel understand the policies, procedures, and unique requirements for working in the data center. This is essential to avoiding unplanned outages and to ensuring the proper response to both anticipated and unplanned events. As the uptime objective or site complexity increases, so does the requirement for a more comprehensive and rigorous training program to prevent human error. Training programs need to be well documented to ensure consistent training of all individuals.

The amount of training required for vendors depends on whether or not they are escorted at all times. Vendor training should go beyond verification of qualifications/certifications for the specific activities and equipment they maintain. They need to be trained on, and required to follow, site-specific policies and procedures.

Planning, Coordination, and Management:
Components of an effective planning, coordination, and management program include site policies; financial management policies; site infrastructure library; and space, power, and cooling capacity management tools.

All data center policies and procedures need to be documented to ensure that they are understood and followed. Inconsistencies in the performance of data center management and operations can lead to outages. A complete on-site reference library of all information on the data center infrastructure is critical for individuals who might be required to respond to an abnormal condition. Additionally, accurate as-built drawings for the entire data center should be readily available.

Monitoring and analysis of airflow and electrical power can indicate potential problems before they occur, improve resource utilization and data center availability, and provide an environment to take advantage of energy efficiencies.

A financial process that ensures a data center has the budget to support the business objective is essential. The data from an MMS (covered in the maintenance section) is extremely valuable in creating, reviewing, and justifying staffing levels and overall data center operation budgets.
Operating Conditions:
Consistent and documented management of capacity components and set points is needed to ensure power and cooling are always available for IT equipment. Operating conditions should be based on both risk and cost. Load management decisions need to be established, documented, and practiced based on electrical capacity components to ensure that maximum loads are not exceeded and that capacity is reserved for switching between components.
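One way to make the load management rule above concrete is a simple headroom check, sketched here in Python. The policy of reserving the capacity of the largest component(s) for switching is an assumption made for illustration, not a requirement stated in this guideline, and the function name, parameters, and kW figures are hypothetical.

```python
from typing import Dict, List


def load_check(capacities_kw: List[float], load_kw: float,
               reserve_components: int = 1) -> Dict[str, object]:
    """Compare the load with the capacity left after reserving components.

    Assumed policy: the `reserve_components` largest components are held
    back so the load can still be carried while switching between components.
    """
    if not 0 <= reserve_components < len(capacities_kw):
        raise ValueError("reserve_components must be >= 0 and less than "
                         "the number of components")
    # Sort ascending and drop the largest components from the usable total.
    usable_count = len(capacities_kw) - reserve_components
    usable_kw = sum(sorted(capacities_kw)[:usable_count])
    return {
        "usable_kw": usable_kw,
        "headroom_kw": usable_kw - load_kw,
        "within_limit": load_kw <= usable_kw,
    }


# Made-up example: three 500 kW components, one reserved for switching.
print(load_check([500, 500, 500], load_kw=900))
print(load_check([500, 500, 500], load_kw=1100))
```

In the first call the 900 kW load fits within the 1000 kW left after the reservation; in the second, 1100 kW exceeds it and the check reports that the documented limit would be violated.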