Past Awardee

Promise of the Honey Bees: Comparative and Functional Genomics of Insect Genomes to Understand Cisregulatory Basis of Social Behavior

Saurabh Sinha

College: Engineering
Award year: 2007-2008

Honey bees live in societies that exhibit remarkable sophistication in the form of a highly regulated division of labor. While this behavior has been studied for hundreds of years, the molecular basis of socially regulated behavior is largely unknown. Finding the genetic sources of specialized animal behavior is an important open problem, and understanding the underlying themes of behavior in the honey bee should help build the foundations for studying human behavior in the future. The first leap in this direction has come from the recent sequencing of the honey bee genome, which opens the floodgates to genomic studies that will extract biological knowledge about social behavior from the gigabytes of available genomic data.

This project will identify the key players in the transcriptional networks that implement the "logic" of socially regulated behavior. Our strategy will hinge on (i) comparing the honey bee genome to those of other insects (wasp, beetle, and fruit fly) that do not exhibit complex social behavior, and (ii) exploiting the wealth of genetic information on Drosophila, especially in developmental contexts. We have recently published preliminary work in this direction (Sinha et al., PNAS 2006) and are well positioned to perform a more complete and improved study along the same lines.

We have built the basic tools needed for our research, but face a wall of computational complexity. Mutual comparison of four or more genomes totaling about 1 Gbp of sequence and hundreds of cis-regulatory elements, using sophisticated machine learning tools such as hidden Markov models, will dwarf our current resources. The computing power and grid computing expertise of the NCSA are exactly what can make our project a reality. Apart from the obvious need to parallelize our comparative genomics tools, we will bank on the NCSA's expertise in efficient data retrieval models that facilitate search/query, integration, portability, extensibility, and maintenance of the data and code pipeline.

Our project, while having its own clearly defined biological objective, will also serve as the prototype for future functional studies on other families of highly diverged organisms.