Looking to the Past to Save the Oceans Now
March 21, 2022
By NCSA News Staff
A floating laboratory embarked on a four-year mission to chart the ocean floor, measure ocean temperatures and chemistry, and collect marine specimens from the unexplored deep sea. By mission’s end, it had wildly exceeded even its ambitious expectations.
Sounds like pretty typical oceanographic research – except the year was 1872, and the ship was a small British warship that had been converted into the world’s first floating laboratory.
During the course of its mission, the HMS Challenger sailed nearly 70,000 miles and recorded data at over 360 individual stations, identifying the world’s major ocean basins and currents, as well as 4,700 new species of marine creatures and plants. The thousands of bottled specimens, original illustrations by expedition artists, and 50 volumes of Challenger data inaugurated the modern fields of oceanography and marine biology.
The Oceans 1876 project
NCSA’s Software directorate is collaborating with Gillen D’Arcy Wood on the Oceans 1876 project. Wood is a professor of environmental humanities and English at the University of Illinois Urbana-Champaign, where he also serves as associate director of the Institute for Sustainability, Energy and Environment. He received a 2021 Andrew Carnegie Fellowship for this project.
The Oceans 1876 project revisits the famous voyage of the HMS Challenger, marking its 150th anniversary. The project also examines untapped Challenger data to explore what the data might tell us about the current state of the oceans – and how to protect them. The data can serve as a pre-industrial baseline for our deteriorating oceans. The project is unusual in that it is blending humanities research methods with the physical sciences, while also using technology and Big Data.
Kaveh Karimi Asli says that after hearing Wood speak about his research two years ago, he was intrigued by the data possibilities and worked with Wood on his own time to explore them. Wood’s Carnegie fellowship provided funding for the two to collaborate officially. Karimi Asli was an NCSA research software engineer when he and Wood began their collaboration; he’s now a senior research software engineer at the University of Oslo and continues to provide part-time technical guidance for the project through NCSA.
Wood says he was intrigued by the idea of turning “the vast trove of Challenger data languishing in archival form, or as informational fragments spread across various marine science databases and print sources, into an accessible resource. The development of an open-source platform of this nature will help enable developers and researchers in the digital humanities and other fields to efficiently transform historical analog data sources into structured, searchable resources.”
The team at NCSA, along with Wood, is collaborating with curators and archivists at the Natural History Museum in London to create a fully modernized Challenger database and website to improve both public access and the research potential of the collection.
NCSA Overseeing the Data Challenge
Christopher Navarro and Karimi Asli are leading NCSA’s efforts. The technical goal of the project, they say, is to extract, validate, and modernize the oceans and species data from the Challenger’s 50-volume reports. That’s more than 30,000 pages. As a key first step in a multi-phase process, the NCSA team is focusing on the published Victorian-era marine data, extracting and processing them into structured data and making them available via a public application programming interface, or API.
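As a rough illustration of what "structured data" served through an API could look like, the sketch below models a single station record and serializes it to JSON. The class names, field names, units, and sample values are all hypothetical placeholders chosen for this example; they are not the project's actual schema and not real Challenger measurements.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class DepthReading:
    """One measurement taken at a given depth (hypothetical fields)."""
    depth_m: float                          # sampling depth in meters
    temperature_c: Optional[float] = None   # water temperature in degrees Celsius


@dataclass
class StationRecord:
    """One station visited during the voyage (hypothetical schema)."""
    station_id: str                         # made-up identifier format
    date: str                               # ISO 8601 date string
    latitude: float
    longitude: float
    air_temperature_c: Optional[float] = None
    readings: List[DepthReading] = field(default_factory=list)
    species: List[str] = field(default_factory=list)


def to_api_payload(record: StationRecord) -> str:
    """Serialize a station record to the JSON a client might receive."""
    # asdict() converts nested dataclasses (the readings) into plain dicts.
    return json.dumps(asdict(record), indent=2, sort_keys=True)


# Placeholder values for illustration only; not real Challenger data.
example = StationRecord(
    station_id="station-000",
    date="1873-01-01",
    latitude=0.0,
    longitude=0.0,
    air_temperature_c=20.0,
    readings=[DepthReading(depth_m=100.0, temperature_c=15.0)],
    species=["Examplea moderna"],
)
print(to_api_payload(example))
```

Keeping each depth reading as its own nested object is one possible design choice: it lets a station carry any number of measurements without changing the record's shape.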
The team has started the enormous task of extracting and standardizing the published Challenger data, which contain species information and environmental attributes of the stations visited, such as air temperature and water density at different depths, and hopes to finish soon.
Simultaneously, the NCSA team needs to calibrate and normalize the extracted data to make them comparable to modern marine species and ocean data. The main normalization priority is mapping species to their modern names, as many have gone through multiple name changes and recategorizations during the past 150 years.
In a parallel task, the NCSA team is creating a web application that lets users interact with the data, investigate the historical records, and compare them with the latest data on the state of the oceans.
Graduate students are performing much of the data extraction work. Karimi Asli says the work is mainly creating pipelines. It starts with a collection of scripts that leverage optical character recognition, or OCR, libraries to extract entities from the text. These entities are categorized into attributes of interest, such as environmental variables or species. A separate pipeline annotates each category with relevant data fetched from third-party resources such as the World Register of Marine Species and the Catalogue of Life.
For example, Karimi Asli explains, they are using the Global Names Index, a catalog of biological scientific names. Their pipeline scripts leverage this dataset by extracting species names from scanned HMS Challenger texts and then verifying them against the modern species names in the catalog.
The Results
For researchers, the digitization, calibration, and organization of Challenger data into this accessible format will enable a wide range of comparative and historical marine research that is not currently possible. Marine biologists, says Karimi Asli, are excited.
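To make the pipeline Karimi Asli describes a little more concrete, here is a minimal sketch of two of its stages: sorting OCR'd lines into environmental readings and candidate species names, then checking each name against a catalog. This is not the project's code. The line layout, the regular expressions, and the function names are assumptions, and the small dictionary with invented species names merely stands in for a real taxonomic service. Fuzzy matching with Python's standard `difflib` module is used here as one simple way to tolerate OCR misreadings.

```python
import re
from difflib import get_close_matches

# Hypothetical stand-in for a name catalog: historical name -> accepted name.
# These names are invented; a real pipeline would query a taxonomic service.
NAME_CATALOG = {
    "Examplea antiqua": "Examplea moderna",
    "Placeholderia profunda": "Placeholderia profunda",
}

# Assumed, simplified layout for readings, e.g. "Surface temperature: 77.5".
ENV_PATTERN = re.compile(r"^(?P<name>[A-Za-z ]+):\s*(?P<value>-?\d+(?:\.\d+)?)$")
# A capitalized genus followed by a lowercase species epithet.
SPECIES_PATTERN = re.compile(r"^[A-Z][a-z]+ [a-z]+$")


def categorize(lines):
    """Split OCR'd lines into environmental readings and candidate species."""
    readings, species = {}, []
    for raw in lines:
        line = raw.strip()
        env = ENV_PATTERN.match(line)
        if env:
            readings[env.group("name").strip().lower()] = float(env.group("value"))
        elif SPECIES_PATTERN.match(line):
            species.append(line)
        # Anything else (page furniture, noise) is ignored in this sketch.
    return readings, species


def resolve_name(name, catalog=NAME_CATALOG, cutoff=0.85):
    """Return (accepted_name, matched_historical_name), or None if no match.

    The similarity cutoff lets small OCR errors still find their catalog entry.
    """
    match = get_close_matches(name, list(catalog), n=1, cutoff=cutoff)
    if not match:
        return None
    return catalog[match[0]], match[0]


# "antigua" simulates an OCR misreading of "antiqua".
ocr_lines = ["Surface temperature: 77.5", "Examplea antigua", "~~ page 12 ~~"]
readings, species = categorize(ocr_lines)
for name in species:
    print(name, "->", resolve_name(name))
```

In a sketch like this, names that fail to match would be returned as `None` and could be set aside for human review rather than silently dropped.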
“We hope to be able to provide data in a way that you can browse the journey and interact with it in a user-friendly way. So instead of just getting raw data out of an API, you can see the timeline of the journey and then see details of all the visits, the stations. We also hope to connect our database to digital assets that exist in various collections around the world, specifically the one at London’s Natural History Museum and the Royal Albert Memorial Museum in the United Kingdom,” Karimi Asli says.
The digitized data will also help answer vital eco-historical research questions. For example: Where has ocean temperature increased, and by how much? Which marine species encountered by Challenger scientists have migrated, expanded or contracted in range, or gone extinct? How has ocean acidification affected marine ecosystems? And how has warming altered the direction and strength of major ocean currents? Scientists look forward to discovering some answers.