Doing the best

11.12.09

Since the National Science Foundation gave the nod to build Blue Waters—a sustained petascale computer for open scientific research—NCSA, the University of Illinois, IBM, and partners around the country have been collaborating on the machine and a facility to house it. Building the two simultaneously has afforded unprecedented opportunities for synergy between machine and facility. Access' Barbara Jewett chatted with IBM Fellow Ed Seminaro, chief architect for POWER HPC servers at IBM, about this synergy as well as some of the unique aspects of the Blue Waters project.

IBM is collaborating with NCSA and the University of Illinois on the Illinois Petascale Computing Facility (PCF) as well as the Blue Waters hardware that's going into it. Is collaborating on the building something you normally do?

We do, but not necessarily to the degree that we've done it here. It's not often that you get in on the ground floor and that you start before the building has even been drawn up on blueprints, so we were able, in fact, to design some of the server racks to accommodate this specific facility. It's likely a lot of things we're doing here will become more standard practices, but we were able to do some things that are unique in this situation.

As one example, we actually have the system laid out on a raised floor to take up all the available area from side to side; there's almost no wasted space whatsoever. The aisles on both sides of our cabinets are about the minimum you'd want them to be. If a room is constructed ahead of time or plans are drawn up ahead of time you're usually not able to achieve that.

The Blue Waters system is completely water-cooled. You really have to plan up front for that. What we've been able to do with the help of EYP, the building architect, is position the machines directly over the cooling main lines so we have no extra plumbing whatsoever for feeding water to the parts of the machine. We were able to do very similar things with how we distribute power to the machine.

Talk about the cooling strategy for Blue Waters.

Putting the water almost in contact with the chips that are dissipating the power literally conducts the heat directly out without having to move any air around. That allows the electronics to run cooler, which causes them to dissipate less power. Less power dissipated means less energy is used.

During a large part of the year we'll run with what's called free cooling. With the outdoor water-cooling tower, the water won't be any cooler than the outside air temperature, but during a large portion of the year the weather is cool. So we'll literally circulate the water through the electronics, outside through a big radiator with a fan blowing through it, and right back into the building. The only energy we have to provide is the energy for the fans that help cool that radiator outside and for the pumps that move the fluid around.

Does the water have to be a particular temperature to effectively cool?

For maximum efficiency, you want to be able to run the water at the warmest possible temperature because that will capture the most free-cooling days. The coldest the water can be is just above freezing, but the water is moving at a good enough pace that it never gets quite that cold. The warmest the water can be, in our case, is about 68 degrees Fahrenheit. We're actually investigating running the water even warmer so we can use free cooling for more months of the year.
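As a rough illustration of why a warmer water limit captures more free-cooling days, here is a minimal sketch. The function name, the `approach_f` margin, the 75-degree alternative limit, and the sample temperatures are all hypothetical; only the 68-degree limit comes from the interview.

```python
def free_cooling_possible(outdoor_f: float, max_water_f: float = 68.0,
                          approach_f: float = 0.0) -> bool:
    """Return True if an outdoor tower alone can supply water at or below max_water_f.

    The tower cannot cool water below the outdoor air temperature, so the
    coolest water it can supply is modeled as outdoor_f + approach_f.
    approach_f is an assumed margin, not a figure from the interview.
    """
    return outdoor_f + approach_f <= max_water_f


# Hypothetical daily outdoor temperatures in degrees Fahrenheit (illustrative only).
sample_days_f = [30.0, 45.0, 60.0, 66.0, 70.0, 75.0, 82.0]

# Count free-cooling days at the 68 F limit and at a hypothetical warmer 75 F limit.
at_68 = sum(free_cooling_possible(t, max_water_f=68.0) for t in sample_days_f)
at_75 = sum(free_cooling_possible(t, max_water_f=75.0) for t in sample_days_f)

print(f"Free-cooling days at 68 F limit: {at_68} of {len(sample_days_f)}")
print(f"Free-cooling days at 75 F limit: {at_75} of {len(sample_days_f)}")
```

With these made-up temperatures, raising the allowed water temperature turns two more days into free-cooling days, which is the effect described above.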

We have a metric we use in the industry, power usage effectiveness or PUE. You have power coming into the building at some voltage and you have energy used by the machine, and you take the ratio of the energy consumed by the entire building to the energy used by the IT equipment. If that ratio is 1.00, that means no energy was wasted in the building infrastructure before getting to the machine. What would that wasted energy be used for? It would be used to reduce the voltage down to a voltage the machine could use and to transmit power, both of which involve some losses, but the biggest share is the energy it takes to move the heat from the IT equipment out to the outside environment. And when the outside temperature is higher than you would like the room to be, you also have to cool the heat transfer medium, which is typically water.

With PCF and Blue Waters, we will achieve a PUE in the neighborhood of 1.18. To give you an idea of how good that is, a typical well-designed data center will be 1.4. The lower the PUE, the more efficient the building. If we can gain additional free-cooling days, that number will be even lower because we'll save more energy.
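To make the PUE figures concrete, the following sketch computes PUE in its standard form, total facility energy divided by IT equipment energy. The function names and the kilowatt-hour values are hypothetical, chosen only to reproduce the 1.18 and 1.4 ratios quoted above.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power usage effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def overhead_fraction(pue_value: float) -> float:
    """Fraction of total facility energy that never reaches the IT equipment."""
    return 1.0 - 1.0 / pue_value


# Hypothetical example: 1,000 kWh delivered to the IT equipment in both cases.
it_kwh = 1000.0
efficient = pue(1180.0, it_kwh)  # a facility at the quoted PUE of about 1.18
typical = pue(1400.0, it_kwh)    # a facility at the quoted typical PUE of 1.4

for label, value in (("PCF target", efficient), ("typical", typical)):
    print(f"{label}: PUE {value:.2f}, "
          f"{overhead_fraction(value):.1%} of facility energy is overhead")
```

At a PUE of 1.18, roughly 15 percent of the energy entering the building goes to cooling, conversion, and distribution rather than to the machine; at 1.4 that share is closer to 29 percent.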

Do you ever worry about water leaks in the machines?

Bringing the water right to the electronics means there is more potential for leaks, but we really haven't seen any in our POWER 575 systems that are presently cooled this way. To guard against leaks, we use very high-quality components within the cabinet, and each cabinet has a completely self-contained water-cooling system. In our present product, that system holds about seven gallons. But the machine would sense a leak and shut off long before we lost seven gallons of water.

What other technologies are employed to increase energy efficiency?

After you get past cooling, the next thing is power delivery to the product itself. In the world of power delivery, the higher the voltage the more efficient the power transmission. So the best thing is to always use the highest voltage you can.

In PCF, we'll run directly off line from 480 volts, which is the standard AC voltage in North America. The ability to do that really depends on how good the IT equipment's immunity is to power line disturbances, and Blue Waters' cabinets have exceptional immunity to power line voltage glitches. In the case of PCF, it was a very high-availability environment before we even started, so we took the infrastructure that was there and routed that right into the building. The machines will literally run directly off the voltage that comes out of the substation transformers in the building. By doing that, we only lose about 2 percent in power conversion, which is an extremely low number.

I've heard you refer to new metrics in computing. What are they?

In the world of what most people refer to as enterprise computing—which is really any type of computing done in very large scale—you generally have a lot of equipment and your goal is to get a lot of throughput [a lot of work through the machine]. The higher throughput you need, the more compute power you need. Throughput per watt or throughput per amount of power used is one of the prime metrics in designing the server. Because, of course, not only is it environmentally friendly to have the most throughput per watt, it also saves money, especially with the rising cost of energy.

Also, throughput per the amount of space used is a very important metric, especially when a new building is being constructed. The smaller the space we can put the computer in, the lower the cost of the building will be. Last, another important metric is how much building infrastructure we need in a facility to accommodate a server. How do we transfer the power coming into the building, which is typically high voltage, down to the voltage the machine can use; how do we transfer the heat from the machine out to the outside ambient environment while at the same time keeping the inside ambient environment at some temperature limit we establish? The more lax we're able to make those requirements of the data center, again, the lower the cost of the facility.

What are the common mistakes people make when building a data center?

One of the most common mistakes I see is designing the data center to be a little too flexible. It is easy to convince yourself that, when you build a building, you really want to build it to accommodate any type of equipment, but this is at the cost of power efficiency.

Today the best efficiency comes from running high-voltage power directly to an IT cabinet and directly water-cooling it, as we're doing with Blue Waters. Of course, not every piece of IT equipment is equipped to be directly connected in that fashion. So if you want to be more flexible, you might want a center that is air cooled with some water-cooling capability, and the capability to bring low-voltage or high-voltage power to the cabinets.

Another mistake I see is shorting space. It starts out that a customer has a lot of space, usually because they're replacing some very old equipment with more modern equipment that is inherently smaller for the amount of work that's done. So they convert the extra space to meet other company needs and then have no room to add more hardware later. If you need more power and more cooling there often are solutions, but when you're out of space, you're out of space.

What do you see as data center trends?

Centralization, consolidation. I see more and more of that. I see companies that had many smaller data centers considering consolidating to two very, very large centers. Most of the time people will want more than one so they have some disaster recovery capability, and they'll be separated by a fairly large distance. But the idea of some very, very large companies wanting to go down to just two data centers, that's definitely a breakthrough.

Another is the cost of building construction. Some people spend enormous sums, but really, it gets back to whether you can design the IT equipment so that it doesn't require too much special capability. What that really means is that you don't have to build a very special facility; you just have to be able to build the general power and cooling capabilities you need and a good sturdy raised floor. This can save a phenomenal amount of money.

There's a thought that one way to achieve the same thing is with an approach called container computing, where you actually take a tractor trailer, retrofit it with computing hardware, and then roll that entire trailer on site. And, in a similar fashion to what I mentioned, they connect electricity and water to it. The space is basically the space in the parking lot where you park the trailer. Or the container can be within a facility for those who like the IT equipment under their roof.

A single Blue Waters rack in itself will be a breakthrough in the industry of medium- to large-scale computing. A single rack containing server, networking and storage capability can replace a small data center and can be placed in a small 11' x 16' (176 square foot) room. All that is needed to power and cool this system is two small water lines and a standard 480-volt power panel. A second rack will double the compute power, and both fit in a 15' x 16' (240 square foot) room.
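The room dimensions above imply that compute density improves when a second rack is added, since compute doubles while floor area grows by much less. A small sketch of that arithmetic follows; "compute units" is a hypothetical, dimensionless measure in which one rack equals 1.0.

```python
def room_area_sqft(width_ft: float, length_ft: float) -> float:
    """Floor area of a rectangular room in square feet."""
    return width_ft * length_ft


one_rack_area = room_area_sqft(11, 16)  # room quoted for a single rack
two_rack_area = room_area_sqft(15, 16)  # room quoted for two racks

# Hypothetical compute units: one rack = 1.0, two racks = 2.0.
density_one = 1.0 / one_rack_area
density_two = 2.0 / two_rack_area
density_gain = density_two / density_one

print(f"One rack:  {one_rack_area:.0f} sq ft")
print(f"Two racks: {two_rack_area:.0f} sq ft")
print(f"Compute per square foot improves by a factor of {density_gain:.2f}")
```

Under these assumptions the second rack adds 64 square feet of floor space while doubling compute, so compute per square foot rises by roughly 47 percent.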

Imagine fitting an entire data center of today in a room the size of a bedroom. The need to pack a tremendous amount of Blue Waters compute power in an area of just 10,000 square feet in PCF has inherently yielded an architecture that makes housing data center equipment far more economical.

What's been the biggest challenge for you with the Blue Waters/PCF project?

The biggest challenge when you are doing something like this is to do it cost effectively. With government contracts the money is lean in the first place because you are trying to do an awful lot with the money you have. This is something all of us who are in the throughput computing arena deal with on a daily basis. A lot of times the way we get by is by using commodity hardware because at least you do have a good understanding of what the costs likely are.

Here, we're doing this with leading-edge, very custom hardware to try to achieve the best energy efficiency and the highest compute density. Having to invent on a schedule, and having to do it extremely cost effectively, is very difficult. That has positively been the biggest challenge. And it's nothing completely unique to the National Science Foundation and to NCSA. Generally, in all these types of large-scale projects this is what you see. Possibly not quite to this degree, I'll say.

But that's been a challenge—trying to do everything the best and to do it cost effectively.

National Science Foundation

Blue Waters is supported by the National Science Foundation through awards ACI-0725070 and ACI-1238993.