Developing a High-Throughput Unsupervised Learning System for Pollen Identification
Surangi W. Punyasena
College: Liberal Arts and Sciences
Award year: 2012-2013
The fossil pollen and spore record contains one of the most comprehensive histories of terrestrial vegetation and its response to long-term environmental change. However, the inefficiencies inherent in collecting and analyzing palynological (pollen and spore) data have left palynology one of the few areas of scientific inquiry where automated data acquisition and image analysis have made little headway. To address the problem of data collection and data availability for the tropics, we propose a radical alternative to traditional palynological counts: a high-throughput, unsupervised, automated counting system. This machine classification system, based on advanced machine learning algorithms, will be developed and tested using high-resolution images of modern Neotropical pollen rain samples. It will be designed to handle the large volume of uncompressed image data that we will generate (~5 TB/day).