Editor’s note: This is part of a series of virtual visits with NCSA thought leaders on current topics impacting the field of high-performance computing.

Different Approaches to AI in Academic, Commercial Communities
By Bill Gropp, NCSA Director

Around this same time last year, I expounded on what the “Future of AI” may entail. A lot has happened in the 12 months since then, including new approaches, new trends and, yes, new complications.

A lot of the news covering artificial intelligence stems from efforts in the commercial sector. Whether it’s well-known chatbots and large language models like ChatGPT or legacy tech companies like Google and Microsoft investing heavily in AI development, the public’s awareness of the field continues to grow.

But what about academic research into AI?

Research is absolutely a major part of AI in the commercial sector. But in the end, commercial companies are looking to make money, and they’re looking to make money before somebody else does. If you think of AI as revolutionary, you want to be the one that succeeds in delivering the revolution, because that’s where you’ll get the most return. That’s why I think we see this huge commercial investment in AI systems that apply existing algorithms, or modest improvements of them, combined with ever more GPU and computing resources to deliver AI capabilities you can put into products.

On the academic research side, there’s a lot of interest in understanding the strengths and weaknesses of AI systems – “explainable AI.” There’s great interest in figuring out how we can build AI systems that don’t produce false outputs. NCSA’s partnership with the National Deep Inference Fabric provides an environment in which researchers can better understand how inference happens and how a trained model comes up with a certain kind of response.
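
To make “understanding how inference happens” concrete, here is a minimal sketch, in plain PyTorch, of capturing a model’s intermediate activations with a forward hook. This is a generic illustration, not NDIF’s actual interface; the model and layer index are arbitrary choices for the example.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    captured = {}

    def save_activation(module, inputs, output):
        # GPT-2 blocks return a tuple; the hidden states are the first element.
        captured["block_5"] = output[0].detach()

    # Register a hook on one transformer block (the layer index is arbitrary here).
    handle = model.transformer.h[5].register_forward_hook(save_activation)

    with torch.no_grad():
        ids = tokenizer("The capital of Illinois is", return_tensors="pt")
        model(**ids)

    handle.remove()
    print(captured["block_5"].shape)  # (batch, sequence_length, hidden_size)

Hooks like this expose the hidden states a model produces on the way to its answer, which is the raw material for the kind of interpretability work described above.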

There’s also substantial interest in adding local updates to a trained model. How do you increase the speed at which you can do that while reducing the computing and energy it takes? There’s a huge amount of research that needs to be done that doesn’t require the enormous scale you find in the commercial sector. This is one of the goals of the National Artificial Intelligence Research Resource (NAIRR), which aims to democratize access to computing resources for those who do AI research in the many areas that don’t require tens or hundreds of thousands of GPUs.
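
As a sketch of what a cheap local update can look like, one widely studied approach freezes the trained weights and learns a small low-rank correction on top of them, so only a tiny fraction of the parameters changes. The dimensions and rank below are illustrative assumptions, not tied to any particular system:

    import torch
    import torch.nn as nn

    class LowRankAdapter(nn.Module):
        """Wraps a frozen linear layer with a trainable low-rank update:
        y = W x + (B @ A) x, where A and B are small matrices."""
        def __init__(self, linear: nn.Linear, rank: int = 8):
            super().__init__()
            self.linear = linear
            for p in self.linear.parameters():
                p.requires_grad = False  # the pretrained weights stay frozen
            self.A = nn.Parameter(torch.randn(rank, linear.in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(linear.out_features, rank))

        def forward(self, x):
            return self.linear(x) + x @ self.A.T @ self.B.T

    layer = LowRankAdapter(nn.Linear(4096, 4096), rank=8)
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    total = sum(p.numel() for p in layer.parameters())
    print(f"trainable fraction: {trainable / total:.4%}")  # well under 1%

Because only the small matrices are trained, an update like this takes a fraction of the computing and energy of retraining the full model.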

Different goals, similar outcomes
Of course, there are different sets of guidelines, goals and rewards within the academic and commercial communities. Both are effective in their own way and can be beneficial to each other. The academic system encourages people to explore new ideas and increase the body of knowledge. Research is trying to answer questions, discover new knowledge, gain insight and understanding, and make it all available to everyone by publishing or releasing artifacts others can use. The commercial sector – capitalism – encourages people to think about how to deliver value, measured as profit, in a way that can benefit society as well.

Academic researchers came up with the predecessor to the internet, which would not have happened without them. But the recognition of the commercial opportunities in ubiquitous networking provided a lot of the resources that have driven the availability of that networking across the planet. Neither group would have accomplished the whole on its own, and we see the same situation with AI. The initial work and algorithms that AI depends upon came out of the academic research community. A lot of the amazing tools that people are interested in exploring and understanding better came out of the resources that the commercial world was able to bring to bear.

Building efficient AI
One challenge in conducting AI research is the need for large numbers of GPUs. While GPUs are power-efficient for the work they do, in aggregate they still consume large amounts of energy. NCSA has an advantage here, particularly among academic computing centers, because we operate a data center with 24 megawatts of power available for computing and infrastructure for liquid cooling, the most efficient way to cool a system and allow the processors to operate at their full capability.
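
To put a power budget like that in perspective, here is a back-of-envelope estimate. The per-GPU draw, host overhead and cooling-efficiency factors below are illustrative assumptions, not NCSA figures:

    # Rough accelerator-count estimate for a fixed facility power budget.
    # All numbers below are illustrative assumptions, not NCSA specifications.
    facility_kw = 24_000   # 24 MW available for computing
    gpu_watts = 700        # assumed draw per high-end training GPU
    host_overhead = 1.3    # assumed CPU/memory/network share per GPU
    pue_liquid = 1.1       # assumed facility overhead with liquid cooling
    pue_air = 1.5          # assumed facility overhead with air cooling

    def max_gpus(pue: float) -> int:
        per_gpu_kw = gpu_watts * host_overhead * pue / 1000
        return int(facility_kw / per_gpu_kw)

    print(max_gpus(pue_liquid))  # about 24,000 GPUs with liquid cooling
    print(max_gpus(pue_air))     # about 17,600 GPUs with air cooling

The gap between those two numbers is the reason cooling keeps coming up: the same building supports meaningfully more computing when it is cooled with liquid.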

I think there are some very interesting long-term research questions since the methods we’re currently using are somewhat brute force. Because of the commercial implications, there’s a lot of focus on “What can we do really, really quickly?” But there are opportunities to discover better ways to train and implement AI. One of the things that’s really exciting about being in a research environment is that some people are also looking at “How do we do this more efficiently so that we don’t need so much energy?”

In the short run, we’re still looking for those methods, and so power can really be a constraint. People who want to deploy these systems have to have a place that has the power and the cooling. I keep mentioning the cooling because, in talking with some of my colleagues, I know the challenge they’re running into: researchers are acquiring GPU clusters, then showing up and saying, “Here, plug this in.” There isn’t enough power, and even when there is, there isn’t enough cooling, particularly if the researcher acquired a system that doesn’t match the cooling infrastructure. Air cooling is not as efficient as liquid cooling, and it makes everything much more complicated. Researchers need to be thinking about these things and how they may impact their work.

NCSA’s AI strategy
At NCSA, we’ve approached AI on a couple of different levels. The easiest one is building on our long-term leadership in providing computing resources for the nation. It started with Delta, where we recognized there was a broad unmet need for GPU resources – not just in AI, but also in high-performance computing. We did see the growing interest in AI, but the design of Delta predated the emergence of today’s large language models. At that point, the computing demands of AI were growing, but they had not yet exploded the way they have since we proposed Delta.

As the demand grew, we realized there was going to be an increasing need for GPUs at all scales – at scales beyond what anything but the very largest systems can provide, but also at the more modest levels where a lot of research is done. That led us to propose DeltaAI and to configure that machine more for AI work than for HPC work. We also realized there was an opportunity to look at the application of AI, so we established the Center for AI Innovation, which looks at the translational use of AI and at research into how you use AI in various tasks.

Strategically, NCSA has focused on building up expertise in the application of AI. Another piece of this is the recent formation of the Office of Data Science Research, which will also help us better connect AI and data science efforts across campus. AI is not a subset of data science and data science is not a subset of AI, but they do have a lot of overlap.

AI fatigue
While consumers of the news may feel that every other story is about AI, there’s a similar concern in the research community: AI is taking up too much of everybody’s attention and pushing aside other things that are important. For some researchers, AI may simply not be a relevant tool. We’re seeing increased recognition that there are research spaces where AI is going to transform what we can do, but there are other places where it just doesn’t work.

And there we need to keep working. It’s one of the reasons why a focus for DeltaAI is, of course, AI – it’s right there in the name – but the machine is also a good HPC platform. We expect significant HPC use, and by HPC use I’m not talking about solving high-performance computing problems with AI. I’m talking about using the techniques that have already been established, and continuing to apply and improve them, in areas where AI doesn’t actually help us.

Some research just isn’t suited for AI. There may be some points of contact, but it’s not clear that AI is going to be a solution in that work. There’s a lot of concern that there is too much focus on AI – or quantum, that’s another one – and that some of the fundamentals will get lost. We can’t allow that to happen, no matter how exciting the possibilities and intriguing the outcomes of AI may be.
