Liquid Cooling vs Air Cooling for AI Workloads
The era of air-cooled data centers handling every workload is ending. NVIDIA's H100 and H200 GPUs draw up to 700W each, and Blackwell-generation B200 and GB200 systems push per-GPU power to 1,000W and beyond, demanding cooling that air simply cannot deliver efficiently. If you're deploying AI infrastructure in 2026, understanding your cooling options isn't just about efficiency: it's about whether your deployment is even physically possible. This guide provides a comprehensive comparison of liquid and air cooling technologies for AI workloads.
The Cooling Crisis in AI Data Centers
Traditional data center cooling was designed for servers drawing 5-15 kW per rack. A typical CRAC (Computer Room Air Conditioning) unit can handle about 100 kW of heat dissipation, serving 10-20 standard racks. But AI has changed the math completely:
- A single NVIDIA DGX H100 system generates approximately 10 kW of heat
- A rack of 4 DGX systems generates 40+ kW — 4-8x what traditional cooling was designed for
- NVIDIA's GB200 NVL72 rack generates 120+ kW of heat
- At these densities, you'd need one CRAC unit for every 1-2 racks, which is physically impossible in most facilities (see the quick arithmetic below)
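To make the mismatch concrete, here is a minimal back-of-the-envelope sketch in Python using the round numbers above; the 100 kW CRAC capacity and rack figures are the approximations quoted in this section, not measurements from any specific facility.

```python
# Back-of-the-envelope check: how many CRAC units do AI-density racks need?
# All figures are the approximate values quoted above.

CRAC_CAPACITY_KW = 100  # rough heat removal capacity of one CRAC unit

def cracs_needed(racks: int, kw_per_rack: float) -> float:
    """CRAC units required to remove the heat produced by `racks` racks."""
    return racks * kw_per_rack / CRAC_CAPACITY_KW

# Traditional enterprise row: 20 racks at 10 kW each -> 2 CRAC units.
print(cracs_needed(20, 10))   # 2.0

# AI row: 20 racks of 4x DGX H100 at ~40 kW each -> 8 CRAC units.
print(cracs_needed(20, 40))   # 8.0

# A single GB200 NVL72 rack at ~120 kW exceeds one full CRAC unit on its own.
print(cracs_needed(1, 120))   # 1.2
```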
This thermal challenge is driving the most significant shift in data center engineering in decades: the transition from air cooling to liquid cooling.
Air Cooling Technologies
Traditional CRAC/CRAH Systems
Computer Room Air Conditioning (CRAC) and Computer Room Air Handler (CRAH) units are the workhorses of traditional data center cooling. They push cold air under a raised floor or through overhead ducts, cooling equipment through convection. For standard enterprise workloads at 5-10 kW per rack, they work well. For AI, they're increasingly inadequate.
- Capacity: Effective up to 15-20 kW per rack with hot/cold aisle containment
- PUE impact: Typically adds 0.3-0.5 to PUE (facility PUE of 1.3-1.5)
- Cost: Lowest capital cost, highest operating cost at high densities
- Retrofittability: Already standard in existing facilities, so no retrofit is needed
In-Row Cooling
In-row cooling units sit between server racks, providing targeted cooling exactly where heat is generated. This eliminates the inefficiency of cooling an entire room when only specific areas have high-density equipment. In-row units can extend air cooling to 25-30 kW per rack when combined with hot aisle containment, making them a bridge technology for moderate AI deployments.
Rear-Door Heat Exchangers (RDHx)
Rear-door heat exchangers replace the standard rear door of a server cabinet with a door containing a liquid-cooled heat exchanger. Hot exhaust air passes through the heat exchanger, transferring heat to a chilled water loop. This is technically a hybrid approach — air cools the components, but liquid removes the heat from the rack. RDHx can support 30-40 kW per rack and is one of the easiest liquid cooling technologies to deploy in existing facilities.
Liquid Cooling Technologies
Direct-to-Chip (Cold Plate) Cooling
Direct-to-chip cooling places cold plates directly on the hottest components — GPUs and CPUs — with liquid flowing through them to carry heat away. This is the approach NVIDIA has standardized for their data center GPU platforms and is the most widely deployed liquid cooling technology for AI workloads in 2026.
- Capacity: Can handle 80-150+ kW per rack, suitable for current and next-gen GPU systems
- PUE impact: Excellent efficiency, facility PUE of 1.1-1.2 achievable
- Compatibility: NVIDIA's reference designs for H100, H200, and B200 all support direct-to-chip cooling. Most major server OEMs offer cold plate options.
- Coolant: Typically uses treated water or water-glycol mix, operating at 35-45°C — warm enough for free cooling in many climates
- Considerations: Requires plumbing infrastructure in the data hall, leak detection systems, and trained personnel (the flow-rate sketch after this list gives a sense of the coolant volumes involved). Air cooling is still needed for memory, storage, and networking components.
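For a sense of what the plumbing has to carry, the sketch below estimates coolant flow for a cold-plate loop from the sensible-heat relation Q = ṁ · c_p · ΔT. The 10°C loop temperature rise and plain-water properties are illustrative assumptions, not vendor specifications.

```python
# Rough coolant flow for a direct-to-chip loop, from Q = m_dot * c_p * delta_T.
# The 10 C loop temperature rise and plain-water properties are assumptions.

CP_WATER_KJ_PER_KG_K = 4.186     # specific heat of water
WATER_DENSITY_KG_PER_M3 = 1000.0

def coolant_flow_lpm(heat_kw: float, delta_t_c: float = 10.0) -> float:
    """Liters per minute of water needed to carry `heat_kw` at the given delta-T."""
    mass_flow_kg_per_s = heat_kw / (CP_WATER_KJ_PER_KG_K * delta_t_c)  # kW = kJ/s
    return mass_flow_kg_per_s / WATER_DENSITY_KG_PER_M3 * 1000 * 60    # -> L/min

print(round(coolant_flow_lpm(80), 1))    # ~114.7 L/min for an 80 kW rack
print(round(coolant_flow_lpm(150), 1))   # ~215.0 L/min for a 150 kW rack
```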
Single-Phase Immersion Cooling
In single-phase immersion cooling, servers are submerged in a tank of dielectric (non-conductive) fluid. The fluid absorbs heat directly from all components, and warm fluid is pumped to a heat exchanger where it's cooled and recirculated. The fluid remains liquid throughout the process (hence "single-phase").
- Capacity: Can handle 100+ kW per tank, with excellent heat removal from all components
- PUE impact: Best-in-class efficiency, PUE of 1.02-1.05 achievable since no fans are needed
- Noise: Near-silent operation since fans are eliminated
- Hardware compatibility: Requires removing fans and potentially modifying server designs. Not all server platforms are certified for immersion.
- Considerations: Higher capital cost, specialized maintenance procedures, and warranty implications from some OEMs. Hardware swaps require draining or fishing servers out of fluid.
Two-Phase Immersion Cooling
Two-phase immersion uses a dielectric fluid with a low boiling point. As components heat up, the fluid boils on contact, absorbing heat through phase change. The vapor rises, condenses on a cool surface, and drips back into the tank. This process is extremely efficient because vaporizing the fluid absorbs far more energy per unit of fluid than simply raising its temperature.
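To put rough numbers on that, the sketch below compares the heat absorbed by boiling one kilogram of a representative two-phase dielectric fluid against merely warming it by 10°C. The property values (latent heat around 100 kJ/kg, specific heat around 1.1 kJ/(kg·K)) are ballpark assumptions for engineered fluorocarbons, not figures for any specific product.

```python
# Sensible vs. latent heat for 1 kg of a representative dielectric fluid.
# Property values are ballpark assumptions, not data for a specific product.

LATENT_HEAT_KJ_PER_KG = 100.0     # heat absorbed by vaporizing 1 kg of fluid
SPECIFIC_HEAT_KJ_PER_KG_K = 1.1   # heat absorbed per kg per degree C (liquid)
DELTA_T_C = 10.0                  # temperature rise a single-phase loop might use

sensible_kj = SPECIFIC_HEAT_KJ_PER_KG_K * DELTA_T_C   # warming the liquid: 11 kJ/kg
latent_kj = LATENT_HEAT_KJ_PER_KG                     # boiling the liquid: 100 kJ/kg

print(f"Warming 1 kg by {DELTA_T_C:.0f} C absorbs ~{sensible_kj:.0f} kJ")
print(f"Boiling 1 kg absorbs ~{latent_kj:.0f} kJ")
print(f"Phase change carries roughly {latent_kj / sensible_kj:.0f}x more heat per kg")
```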
- Capacity: Highest cooling capacity per unit volume, 150+ kW per tank possible
- PUE impact: Theoretical PUE of 1.01-1.03, the lowest achievable
- Challenges: Two-phase fluids (typically engineered fluorocarbons) are expensive, and some have environmental concerns (high global warming potential). Fluid containment is critical.
- Maturity: Still relatively niche compared to direct-to-chip and single-phase immersion. Fewer vendors and less operational experience in production environments.
Head-to-Head Comparison
The table below summarizes the figures from the sections above.

| Technology | Supported density | Typical facility PUE | Key considerations |
| --- | --- | --- | --- |
| CRAC/CRAH air cooling | Up to 15-20 kW per rack (with containment) | 1.3-1.5 | Lowest capital cost; standard in existing facilities |
| In-row cooling | 25-30 kW per rack | n/a | Bridge technology for moderate AI densities |
| Rear-door heat exchangers | 30-40 kW per rack | n/a | Easiest liquid technology to retrofit |
| Direct-to-chip (cold plate) | 80-150+ kW per rack | 1.1-1.2 | Needs plumbing and leak detection; air still required for other components |
| Single-phase immersion | 100+ kW per tank | 1.02-1.05 | Hardware modifications and specialized maintenance |
| Two-phase immersion | 150+ kW per tank | 1.01-1.03 | Expensive fluids; least mature option |
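To see what those PUE differences mean for a real deployment, the sketch below converts each technology's typical PUE (midpoints of the ranges quoted above) into overhead power for a hypothetical 1 MW IT load; the 1 MW figure is an arbitrary example, not a reference design.

```python
# Facility power = IT power * PUE, so overhead = IT power * (PUE - 1).
# PUE values are midpoints of the ranges quoted above; the 1 MW IT load is
# an arbitrary example (roughly 25 racks of 4x DGX H100).

IT_LOAD_KW = 1000

typical_pue = {
    "Air (CRAC/CRAH)":        1.40,
    "Direct-to-chip":         1.15,
    "Single-phase immersion": 1.035,
    "Two-phase immersion":    1.02,
}

for tech, pue in typical_pue.items():
    overhead_kw = IT_LOAD_KW * (pue - 1)
    print(f"{tech:<24} PUE {pue:<5} -> ~{overhead_kw:.0f} kW of overhead")
```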
Which Cooling Technology Should You Choose?
Choose Air Cooling If:
- Your rack density is under 20 kW
- You're running inference (not training) with moderate GPU counts
- You're deploying in an existing facility without liquid cooling infrastructure
- Your budget prioritizes lower capital cost over operating efficiency
Choose Direct-to-Chip If:
- You're deploying H100, H200, or B200 GPU clusters (NVIDIA's recommended approach)
- Your rack density is 40-150 kW
- You want the best balance of performance, cost, and operational simplicity
- You're building new or can retrofit piping into your data hall
- This is the default choice for most AI deployments in 2026
Choose Immersion If:
- You're building a purpose-built AI facility from the ground up
- Maximum efficiency and density are top priorities
- You have the budget and operational expertise to manage immersion systems
- You're deploying at massive scale where the efficiency gains justify the complexity
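The rules of thumb above reduce to a simple lookup. The sketch below encodes them purely as a starting point keyed on rack density; the thresholds are the ones quoted in this guide, and real selection also depends on budget, facility constraints, and operational expertise.

```python
# A sketch of the rack-density rules of thumb in this guide. Real selection
# also depends on budget, facility constraints, and operational expertise.

def suggest_cooling(kw_per_rack: float, purpose_built_facility: bool = False) -> str:
    """Map a planned rack density to the cooling approach suggested above."""
    if kw_per_rack < 20:
        return "Air cooling (CRAC/CRAH with hot/cold aisle containment)"
    if kw_per_rack <= 40:
        return "In-row cooling or rear-door heat exchangers"
    if purpose_built_facility and kw_per_rack > 100:
        return "Immersion cooling (single- or two-phase)"
    return "Direct-to-chip (cold plate) cooling"

print(suggest_cooling(10))                                # air cooling
print(suggest_cooling(35))                                # in-row / RDHx
print(suggest_cooling(120))                               # direct-to-chip
print(suggest_cooling(140, purpose_built_facility=True))  # immersion
```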
Finding Liquid-Cooled Facilities
Not all colocation providers offer liquid cooling. When evaluating facilities for AI workloads, ask specifically about:
- What liquid cooling technologies are available (direct-to-chip, immersion, RDHx)?
- What is the maximum supported power density per rack?
- Is the liquid cooling infrastructure in the data hall today, or is it a future plan?
- What is the coolant water temperature, and does the facility use free cooling?
- What leak detection and containment systems are in place?
Browse AI-ready facilities in our directory, or learn about power capacity requirements for AI workloads.