Delta Powers Student Research

September 21, 2023

Researchers at the University of Illinois Urbana-Champaign (UIUC) have long taken advantage of the opportunities afforded to them by having a supercomputing center embedded on campus. NCSA’s proximity to the labs, classrooms and buildings where academic researchers teach and work makes it easy to incorporate cyberinfrastructure into their project plans. However, student researchers may not be aware that they can also take advantage of these resources through ACCESS.

When ACCESS launched last fall, one of the program’s new features was a wider range of allocation types, some with simpler application requirements. For example, the new Explore ACCESS allocation option made it very simple for smaller projects to get an allocation. Student researchers also benefit significantly from the less restrictive requirements: graduate students are eligible to be principal investigators (PIs) on Explore ACCESS allocations.

Allowing students to be more involved in the allocation process supports research in several ways. These new allocations have a knock-on effect similar to that of student internship programs like SPIN and REU. Students often can’t obtain sufficient computational resources on their own, and the expense and support needed to get time on a supercomputer can overwhelm a small project, no matter how much the resource would help. Allowing student researchers to use these resources better prepares the next generation of researchers by familiarizing them with the process of obtaining time on a supercomputer and working with a support team to use it. As with NCSA’s research interns, this experience complements their coursework, giving them practical hands-on time during a crucial learning phase of their careers. When they graduate, these students will not only have a leg up on new researchers who didn’t take advantage of these opportunities, but they’ll also be able to integrate into working research teams much faster, so the research keeps its momentum instead of pausing while new team members learn the basics. Additionally, giving graduate students the ability to be PIs gives them the opportunity to learn how to draft a project proposal, a skill they’ll find immensely beneficial in their future careers.

Image: Delta, NCSA’s GPU-based supercomputer, with its name spelled out on the cabinets in colorful geometric shapes. Delta is one of the resources in the ACCESS portfolio. Researchers can request time on Delta through ACCESS, an NSF-funded program.

One such student research team, from the student-led campus organization AI@UIUC, shows these benefits at work. AI@UIUC works directly with faculty advisors to develop long-term research projects with the goal of being published and presented. Its members also collaborate with interested academic and industry partners to create tools or products that use artificial intelligence. AI@UIUC is exactly the type of organization that can benefit from more widely available NSF-funded HPC resources.

The team was given an allocation on Delta, NCSA’s GPU-based supercomputer. Their project was focused on Federated Learning (FL). Federated Learning is a way to train machine-learning models by having many different computers work together while preserving the privacy of data local to each machine. These computers are called clients, and they’re teamed up to train a model with the aid of a central server. This differs from the more traditional methods of training a model where all the data is sent directly to the server. In FL, updates to the model are sent between the client and the server. This is beneficial in certain situations where one wants to keep data private, like with medical information, because the client computers don’t need to share the data with the server. In short, FL allows an AI model to be trained without having to share sensitive or protected information.
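As a rough illustration of the client-server exchange described above, here is a minimal sketch of federated averaging, one common FL scheme. It is not the team’s method. Everything in it is an assumption made for illustration: the one-variable linear model, the synthetic client data, and the function names (`local_update`, `federated_average`, `make_client_data`, `run_federated_training`). The point to notice is that each client’s raw data never leaves `local_update`; only the updated model weights travel to the server.

```python
import random


def local_update(weights, data, lr=0.1, epochs=5):
    """Client step: a few gradient-descent passes on private data.

    The model is y = w * x + b. Only the updated (w, b) is returned,
    so the raw (x, y) pairs never leave the client.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += 2 * err * x
            grad_b += 2 * err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return (w, b)


def federated_average(client_weights, client_sizes):
    """Server step: average client models, weighted by local dataset size."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)


def make_client_data(rng, n, lo, hi):
    """Hypothetical private dataset: noisy samples of y = 3x + 1."""
    xs = [rng.uniform(lo, hi) for _ in range(n)]
    return [(x, 3 * x + 1 + rng.gauss(0, 0.01)) for x in xs]


def run_federated_training(rounds=200, seed=0):
    rng = random.Random(seed)
    # Three clients, each seeing a different slice of the input range.
    clients = [
        make_client_data(rng, 20, -1.0, 0.0),
        make_client_data(rng, 30, 0.0, 1.0),
        make_client_data(rng, 50, -1.0, 1.0),
    ]
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        # Server sends the current model; clients train locally.
        updates = [local_update(global_weights, data) for data in clients]
        # Server aggregates the returned weights, never the data.
        global_weights = federated_average(updates, [len(d) for d in clients])
    return global_weights


if __name__ == "__main__":
    w, b = run_federated_training()
    print(f"learned w={w:.2f}, b={b:.2f}")
```

In this toy setup the averaged model should settle close to the underlying line (w near 3, b near 1) even though no client sees the full input range. Real FL systems add pieces this sketch leaves out, such as client sampling, secure aggregation, and models with millions of parameters.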

Rishub Tamirisa is one of the student researchers who worked on this project. He said they came to ACCESS because they knew they needed more power for their work. “We chose ACCESS because the GPUs are better than most GPUs in typical university lab clusters (NVIDIA A100s vs. V100s). For federated learning or any large model training, having the best GPUs at one’s disposal is critical for doing efficient research.”

The team was pleasantly surprised at how easy it was to get started. “Getting an ACCESS allocation through the application was very straightforward. We gave details about our project and got approval quickly,” Tamirisa said.

His group’s project aimed to improve on existing FL methods. “The goal of our research was to introduce a new method for federated learning,” Tamirisa explained. “Federated learning aims to solve the problem of having multiple models trained on different datasets learn from each other during training, without sharing training data (thus preserving privacy). Our paper introduced a new algorithm for doing this that achieved higher accuracy on existing benchmarks than prior methods.”

They call their method FedSelect. The idea is to customize both the architecture and the parameters for each client based on its local data during training. Each client model should select only the subset of shared parameters that encodes global information relevant to its local task, since using all the global information from any full layer or layers may not be optimal. This method shows improved performance compared to other pruning-based FL methods and other personalized FL approaches. It also reduces communication costs compared to partial model personalization methods.
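To make the idea of sharing only a subset of parameters concrete, here is a small sketch of mask-based partial aggregation. It is an illustration only and does not reproduce FedSelect: how each client chooses its subset is the substance of the team’s algorithm and is not shown here. The masks are simply given as inputs, and the names `aggregate_shared`, `apply_global`, and `upload_size` are hypothetical.

```python
def aggregate_shared(client_params, client_masks, global_params):
    """Server step: average each parameter over only the clients sharing it.

    client_masks[k][i] is True when client k treats parameter i as shared.
    Parameters that no client shares keep their previous global value.
    """
    new_global = list(global_params)
    for i in range(len(global_params)):
        shared = [p[i] for p, m in zip(client_params, client_masks) if m[i]]
        if shared:
            new_global[i] = sum(shared) / len(shared)
    return new_global


def apply_global(local_params, mask, global_params):
    """Client step: take global values for shared entries, keep the rest local."""
    return [g if m else p for p, m, g in zip(local_params, mask, global_params)]


def upload_size(mask):
    """Number of parameter values a client sends to the server per round."""
    return sum(1 for m in mask if m)


if __name__ == "__main__":
    # Two hypothetical clients with three parameters each.
    client_params = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
    # Assumed masks: client 0 shares parameters 0 and 1, client 1 shares only 0.
    client_masks = [[True, True, False], [True, False, False]]

    new_global = aggregate_shared(client_params, client_masks, [0.0, 0.0, 0.0])
    print(new_global)
    print(apply_global(client_params[1], client_masks[1], new_global))
    print([upload_size(m) for m in client_masks])
```

Because each client uploads only the entries its mask marks as shared, `upload_size` is smaller than the full parameter count, which is one way a selective scheme can cut communication relative to sending whole layers.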

“The impact of our work is both theoretical and practical,” said Tamirisa. “Our algorithm, in particular, is unique because we develop a new hypothesis for how knowledge is stored in deep-learning models that wasn’t previously discussed in the field. Practically, a new algorithm that achieves higher accuracy is useful for any individual or company wanting to implement federated learning algorithms for their services.”

While many research domains could benefit from time on a supercomputer, supercomputers are quickly becoming essential for machine learning. “Federated learning, in particular, is expensive,” said Tamirisa, “since it requires training multiple models in parallel and aggregating their results. ACCESS helped us get much faster results because of the high-end GPU access.”

“The multiple-GPU setup via Delta made training significantly more efficient and enabled faster research iteration on our ideas. In just three months, we went from initial research formulation to implementation, resulting in work accepted at ICML, one of the top ML conferences worldwide.”
– Rishub Tamirisa, student researcher, UIUC

But NCSA is more than its hardware. The allocation Tamirisa’s team was awarded came with mentorship and support. “We were thoroughly supported by NCSA throughout our entire project,” he said. “It was very easy to ask questions or find out details on using their systems.” With some help from NCSA’s support staff, Tamirisa’s team quickly learned how to use Delta. “Working on Delta was easy to use, especially with prior experience working with a distributed computing environment.” He similarly praised the Open OnDemand portal, an open-source application designed to make interfacing with cyberinfrastructure simple. In fact, the Delta team has worked diligently to make using Delta easy for all, including those who’ve never used such resources before.

Image: Volodymyr Kindratenko standing next to an HPC system. Kindratenko is the director of the Center for AI Innovation (CAII). CAII’s mission is to advance AI research, provide students with opportunities for career development in AI, and address industrial grand challenges through innovative use of AI by engaging the research community, students, and industry collaborators.

Volodymyr Kindratenko, the director of the Center for AI Innovation (CAII) and faculty mentor for the AI@UIUC student organization, explained that the Open OnDemand web-based interface, which is currently available on Delta, was initially deployed and tested on NCSA’s HAL system. This interface was designed to provide an easy-to-use framework for users who may not be familiar with the traditional command-line interface to HPC systems. Kindratenko emphasized that this significantly simplifies access to the compute resources, enabling users to start training models immediately without having to learn the complexities of the batch job-submission system, command-line interface and other specific processes of working with cyberinfrastructure like Delta.

The entire experience and the results of the research were made possible by the availability of such a powerful resource right on campus. “GPUs are expensive,” Tamirisa explained, “and modern ML research costs can be prohibitive for most project ideas. We were very excited about getting our allocation.” He also posits that the demand for these types of resources is only going to increase as time goes on. “Distributed training environments are increasingly common for training state-of-the-art models across deep learning. With generative modeling being particularly popular, an environment like Delta would be beneficial for anyone wishing to produce meaningful work in the area.”

Tamirisa’s team made the most of their opportunity, and it paid off in big ways for them. Not only did the allocation allow them to complete their work, but it also gave them the resources to produce a paper that was accepted for a poster presentation at the 2023 ICML Federated Learning workshop. While his team still has a ways to go before they finish their degrees, they are now well on their way to working professionally within the research community.

NCSA’s Volodymyr Kindratenko, director of the Center for AI Innovation (CAII), contributed to this story.
