Key Data Center Cooling Metrics
Power Usage Effectiveness (PUE)
The most widely adopted metric for data center efficiency, PUE, was defined by The Green Grid as the ratio of the total energy used in a data center to the energy used by IT equipment (servers, storage, network switches, etc.); overhead such as cooling, lighting, and UPS losses counts toward total facility energy rather than IT energy.
\[ \begin{equation} \text{PUE}=\frac{\text{total facility energy}}{\text{IT equipment energy}} \end{equation} \]
A PUE of 1.0 indicates that all energy entering the data center is used by revenue-generating IT workloads, while a PUE of 2.0 means only half of it is. In effect, PUE characterizes the overhead required to run a facility.
According to the Uptime Institute’s 2024 Global Data Center Survey, average data center PUE has settled at about 1.5.
PUE Value | Data Center Efficiency |
---|---|
< 1.2 | Excellent |
1.3-1.5 | Good |
1.6-1.8 | Acceptable |
> 1.8 | Poor |
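As a quick illustration, the ratio and the rating bands above can be sketched in a few lines of Python. This is a hypothetical helper; the band boundaries come from the table and are treated as contiguous (e.g. 1.2 falls in "Good"):

```python
# Hypothetical sketch: PUE and the rating bands from the table above.
# The band boundaries are treated as contiguous (e.g. 1.2 falls in "Good").

def pue(total_facility_energy: float, it_equipment_energy: float) -> float:
    """PUE = total facility energy / IT equipment energy (same units)."""
    if it_equipment_energy <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_energy / it_equipment_energy

def rate_pue(value: float) -> str:
    """Map a PUE value onto the efficiency bands in the table above."""
    if value < 1.2:
        return "Excellent"
    if value <= 1.5:
        return "Good"
    if value <= 1.8:
        return "Acceptable"
    return "Poor"

# A facility drawing 3 MWh total while IT consumes 2 MWh:
print(rate_pue(pue(3.0, 2.0)))  # PUE 1.5 -> "Good"
```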
Considerations for PUE
While PUE provides a straightforward way to gauge overall data center efficiency by comparing total facility energy to IT-specific energy consumption, it also has certain limitations that have become more apparent as data center designs and operations have evolved.
Energy Classification
PUE groups all energy use into “facility” versus “IT” categories, which can obscure where exactly energy is consumed. For example, it is not always clear whether server fans should be counted as part of IT equipment or as facility cooling loads—different interpretations can lead to notably different PUE values.
IT Workloads
Because PUE focuses solely on energy ratios, it does not distinguish between energy spent on active computing tasks versus energy drawn by underutilized or idle servers. A data hall running predominantly idle machines can achieve a low PUE, but this does not necessarily reflect effective use of power toward productive work.
Regional and Climate Variability
Comparing PUE across geographically diverse sites can be misleading due to differences in local climate. For instance, a data center in a cooler, drier region may rely more on free‐cooling methods and exhibit a lower PUE, whereas a facility in a warmer, more humid location may require more mechanical cooling, resulting in a higher PUE—even if both centers employ similarly efficient infrastructure.
Rack Cooling Index™ (RCI)
The Rack Cooling Index, a trademark of ANCIS Inc., captures how effectively server racks are being cooled relative to the ASHRAE TC 9.9 thermal guidelines for recommended and allowable inlet temperatures. The guidelines recommend a range of 64.4-80.6°F, so the metric comes in two flavors, RCIHi and RCILo, one for each end of the range. These are defined below as:
\[ \text{RCI}_{Hi}=\left[1-\left(\frac{\sum_{i=1}^n(T_i-T_{R,Hi})}{N\times(T_{A,Hi}-T_{R,Hi})}\right)\right]\times 100 \]where:
- \(T_i\) is the maximum inlet temperature for the \(i^{th}\) rack
- \(T_{R,Hi}\) is the ASHRAE max recommended temperature (80.6°F)
- \(T_{A,Hi}\) is the ASHRAE max allowable temperature (89.6°F)
- \(N\) is the number of racks exceeding the max recommended temperature
and:
\[ \text{RCI}_{Lo}=\left[1-\left(\frac{\sum_{i=1}^n(T_{R,Lo}-T_i)}{N\times(T_{R,Lo}-T_{A,Lo})}\right)\right]\times 100 \]where:
- \(T_i\) is the minimum inlet temperature for the \(i^{th}\) rack
- \(T_{R,Lo}\) is the ASHRAE minimum recommended temperature (64.4°F)
- \(T_{A,Lo}\) is the ASHRAE minimum allowable temperature (59°F)
- \(N\) is the number of racks below the minimum recommended temperature
Put into words for RCIHi:
- RCIHi = 100%: all rack inlet temperatures are below \(T_{R,Hi}\)
- RCIHi < 100%: at least one rack inlet temperature is above \(T_{R,Hi}\)
- RCIHi = 0%: the racks above \(T_{R,Hi}\) exceed it by the full recommended-to-allowable margin on average (i.e., they average at or above \(T_{A,Hi}\))
Similarly for RCILo:
- RCILo = 100%: all rack inlet temperatures are above \(T_{R,Lo}\)
- RCILo < 100%: at least one rack inlet temperature is below \(T_{R,Lo}\)
- RCILo = 0%: the racks below \(T_{R,Lo}\) fall short of it by the full recommended-to-allowable margin on average (i.e., they average at or below \(T_{A,Lo}\))
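Under the definitions above (including this article's convention that \(N\) counts only the non-compliant racks), both indices can be sketched as hypothetical Python helpers, with the ASHRAE default temperatures baked in as keyword arguments:

```python
# Hypothetical sketch of RCI_Hi / RCI_Lo as defined above, following this
# article's convention that N counts only the non-compliant racks.
# Temperatures are in degrees F; defaults are the ASHRAE values listed above.

def rci_hi(inlet_temps, t_rec_hi=80.6, t_all_hi=89.6):
    """RCI_Hi: 100% means no rack exceeds the max recommended temperature."""
    over = [t - t_rec_hi for t in inlet_temps if t > t_rec_hi]
    if not over:
        return 100.0
    return (1 - sum(over) / (len(over) * (t_all_hi - t_rec_hi))) * 100

def rci_lo(inlet_temps, t_rec_lo=64.4, t_all_lo=59.0):
    """RCI_Lo: 100% means no rack falls below the min recommended temperature."""
    under = [t_rec_lo - t for t in inlet_temps if t < t_rec_lo]
    if not under:
        return 100.0
    return (1 - sum(under) / (len(under) * (t_rec_lo - t_all_lo))) * 100
```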
Further intuition for RCI can be gained through the table below:
RCI Value | Cooling Performance |
---|---|
100% | Ideal |
≥ 96% | Good |
91-95% | Acceptable |
≤ 90% | Poor |
Example RCIHi Calculation
To work through an RCIHi calculation, consider the table below:
Rack | Inlet Temp (°F) |
---|---|
1 | 68.0 |
2 | 71.6 |
3 | 77.0 |
4 | 82.4 |
5 | 86.0 |
6 | 78.8 |
7 | 66.2 |
8 | 91.4 |
9 | 75.2 |
10 | 73.5 |
Reminding ourselves that the ASHRAE recommended temperature range is 64.4-80.6°F, we consider the racks outside of this range and subtract the max recommended temperature from the inlet temperature value:
Rack | Inlet temp (°F) | High side excess (°F) |
---|---|---|
4 | 82.4 | 1.8 |
5 | 86.0 | 5.4 |
8 | 91.4 | 10.8 |
Three racks are out of compliance, therefore \(N=3\). We then sum the high side excess for these racks:
\[ \sum_{i=1}^n(T_i-T_{R,Hi})=1.8+5.4+10.8=18°\text{F} \]
Now we have all the terms needed for the full RCIHi calculation:
\[ \begin{align} \text{RCI}_{Hi}&=\left[1-\left(\frac{\sum_{i=1}^n(T_i-T_{R,Hi})}{N\times(T_{A,Hi}-T_{R,Hi})}\right)\right]\times 100 \\ &=\left[1-\left(\frac{18°\text{F}}{3\times(89.6°\text{F}-80.6°\text{F})}\right)\right]\times 100 \\ &=\left[1-\left(\frac{18}{27}\right)\right]\times 100 \\ &=\left[1-\left(\frac{2}{3}\right)\right]\times 100 = \boxed{33.33\%} \end{align} \]
This value falls far below the 90% threshold for poor cooling performance, indicating serious cooling issues in our sample data center.
The calculation can be similarly performed for RCILo.
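The worked example above can be double-checked with a few lines of Python (values copied from the table):

```python
# Checking the worked RCIHi example: inlet temperatures from the table above.
t_rec_hi, t_all_hi = 80.6, 89.6  # ASHRAE max recommended / max allowable (deg F)
inlets = [68.0, 71.6, 77.0, 82.4, 86.0, 78.8, 66.2, 91.4, 75.2, 73.5]

excess = [t - t_rec_hi for t in inlets if t > t_rec_hi]  # high side excesses
n = len(excess)                                          # non-compliant racks
rci_hi = (1 - sum(excess) / (n * (t_all_hi - t_rec_hi))) * 100
print(n, round(rci_hi, 2))  # 3 33.33
```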
Return Temperature Index™ (RTI)
RTI, another trademark of ANCIS Inc., measures the effectiveness of the air management system from an energy standpoint and is defined by:
\[ RTI=\frac{\text{Air handler }\Delta T}{\text{Rack }\Delta T}\times 100 \]
To understand the principle, a quick refresher on basic heat transfer fundamentals may be helpful. Recall the specific heat formula in rate form:
\[ \dot{Q}=\dot{m}c_p\Delta T \]where:
- \(\dot{Q}\) is the rate of heat added or removed (W)
- \(\dot{m}\) is the mass flow rate of the working fluid (kg/s)
- \(c_p\) is the specific heat capacity of the working fluid (J/kg-K)
- \(\Delta T\) is the temperature difference, \(|T_{out}-T_{in}|\)
In a data center, heat transfer occurs mainly in two places:
- inside the server rack, where cool supply air is heated by electronic components
- inside the air handler, where warm air exhausted by the racks is cooled by a heat exchanger
In both places, it is ideal for heat transfer to be maximized. This can be done by increasing any of \(\dot{m}\), \(c_p\), or \(\Delta T\):
- Specific heat is a property of the cooling system's working fluid and cannot practically be increased
- Increasing the mass flow rate forces the rack and air handler fans to run at higher power (fan power grows roughly with the cube of flow rate), so it is neither practical nor scalable
This leaves manipulating \(\Delta T\), illustrating why rack and air handler \(\Delta T\) are used in the RTI metric. CRAC unit efficiency increases as return temperature increases.
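A minimal numeric sketch of this relationship, assuming air as the working fluid and made-up heat load and flow rates, shows how the same \(\dot{Q}\) fixes both \(\Delta T\)s:

```python
# Hypothetical energy-balance sketch with air as the working fluid.
# The heat load and flow rates are made-up illustrative numbers.
cp_air = 1005.0     # J/(kg*K), approximate specific heat of air

q_it = 50_000.0     # W of IT heat generated by the racks (assumed)
m_dot_rack = 4.0    # kg/s of air through the racks (assumed)
m_dot_ahu = 5.0     # kg/s of air through the air handler (assumed)

# At steady state the air handler removes exactly the heat the racks add,
# so the same Q_dot determines both temperature differences:
dt_rack = q_it / (m_dot_rack * cp_air)  # rack delta-T, ~12.4 K
dt_ahu = q_it / (m_dot_ahu * cp_air)    # air handler delta-T, ~9.95 K
```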
Since \(\Delta T\) is typically prescribed by operators, a more intuitive representation substitutes flow rates for the \(\Delta T\)s. At steady state the racks and the air handler move the same heat, so \(\dot{m}_{rack}\Delta T_{rack}=\dot{m}_{AH}\Delta T_{AH}\), and the ratio of \(\Delta T\)s equals the inverse ratio of mass flow rates:
\[ RTI=\frac{\text{Rack flow rate}}{\text{Air handler flow rate}}\times 100 \]
If the rack flow rate is greater than the air handler flow rate (RTI > 100%), the racks can be starved of cool air, and recirculation from rack exhausts to rack inlets can occur, raising rack inlet temperatures and lowering cooling effectiveness.
If the air handler flow rate is greater than the rack flow rate (RTI < 100%), cool supply air can bypass the rack inlets and short-circuit to the CRAC return, lowering the return temperature and CRAC effectiveness. This can occur when the air handler flow rate is increased to combat rack hotspots.
RTI Value | Airflow status |
---|---|
100% | Balanced |
> 100% | Net recirculation |
< 100% | Net bypass |
While RTI = 100% is ideal in principle, slight deviations from this mark may be necessary in real-world contexts. If mixing is required to ensure an even distribution of rack inlet temperatures, an RTI slightly above 100% can be desirable to drive recirculation and mixing.
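To close, the flow-rate form of RTI and the airflow-status bands above can be sketched as hypothetical Python helpers:

```python
# Hypothetical helpers for the flow-rate form of RTI and the
# airflow-status bands from the table above.

def rti(rack_flow: float, air_handler_flow: float) -> float:
    """RTI = rack flow rate / air handler flow rate x 100 (consistent units)."""
    return rack_flow / air_handler_flow * 100

def airflow_status(rti_value: float) -> str:
    """Classify an RTI value per the table above."""
    if rti_value > 100:
        return "Net recirculation"
    if rti_value < 100:
        return "Net bypass"
    return "Balanced"

# Racks pulling 12 m^3/s against an air handler supplying 10 m^3/s:
print(airflow_status(rti(12.0, 10.0)))  # "Net recirculation"
```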