Illinois-NCSA team receives $1.8M to create data platform for Big Data in plants

07.13.15

By Claudia Lutz, Institute for Genomic Biology

Historically, successful trait selection in plant breeding has involved manual measurement of individual plants. This requirement limits both the number of plants that can be evaluated and the scope of properties that can be measured. A new grant from the Department of Energy to researchers at the Donald Danforth Plant Science Center and multiple partner institutions, including the University of Illinois and the National Center for Supercomputing Applications (NCSA), will fund the development of a system to automate the measurement of plants using cameras and other sensors mounted on drones, tractors, and robots. The grant will also support analysis of the resulting large data sets to facilitate the development of high-yielding strains of sorghum, a key bioenergy crop.

The $8 million grant was one of several awarded by the DOE Advanced Research Projects Agency-Energy (ARPA-E) Transportation Energy Resources from Renewable Agriculture (TERRA) program. Todd Mockler, the Geraldine and Robert Virgil Distinguished Investigator at the Danforth Center, is the principal investigator. Of the total grant, $1.8 million will go to NCSA to establish a supercomputing pipeline for a reference sensing platform. Plant biologist David LeBauer will act as principal investigator for this component of the project.

LeBauer, an NCSA Fellow and Carl R. Woese Institute for Genomic Biology affiliate, will work with groups at NCSA to establish a reference data set and computing environment that will support all of the researchers funded by the TERRA program.

Researchers will develop a cutting-edge, automated system to collect, analyze, and share data on multiple characteristics of plants growing in the field via sensors in the air, on the ground, and mounted on tractors, and link these observations to genomic data collected from individual plants. The data collected will serve as a reference upon which new sensors, sensor platforms, and data analysis pipelines can be developed. By the project’s end, groups expect to develop a system that can survey every plant in a 50-hectare area (almost 100 football fields) each day.

“[Our] goal is to reduce the overall program cost by providing a single large dataset and computing platform for all of the projects funded by the TERRA program,” said LeBauer. “This reference data set will allow researchers to . . . develop smaller, less expensive platforms that more efficiently target the most useful observations.” Ultimately, he said, these tools will be used to develop plant strains that are able to tolerate stresses such as drought, temperature, and disease.

Researchers at NCSA will also make the computing solutions developed for the project publicly available, allowing their future use in a variety of Big Data research applications.

“In addition to using technology to make breeding more efficient, [our platform] will have more general applications in agriculture, robotics, sensing, and data analysis,” LeBauer said. “Our goal is to make data, software, and computing resources available to the larger scientific and engineering communities interested in these and other areas, by adopting the tools and methods of open science.”

Other partner institutions for the grant include Clemson University, the HudsonAlpha Institute for Biotechnology, Kansas State University, Texas A&M University, the University of Arizona, and Washington University in St. Louis, with key collaborators at the U.S. Arid Land Agricultural Research Center of the USDA-Agricultural Research Service.

Written with materials provided by the Donald Danforth Plant Science Center and David LeBauer.