ADASS XXII University of Illinois
November 4-8, 2012

ADASS XXII Conference

Oral Presentations

See the schedule for presentation times.

Badenhorst, Scott J. O01: Acceleration of automated HI source extraction.
Baxter, Richard J. O02: GPU-based Acceleration of Radio Interferometry Point Source Visibility Calculations in the MEQtrees Framework.
Berriman, G. Bruce O03: A Tale of 160 Scientists, Three Applications, A Workshop, and A Cloud.
Brunner, Robert J. O04: Practical Informatics: Training the Next Generation
Bulgarelli, Andrea O05: AGILE/GRID science alert monitoring system: the workflow and the Crab flare case
Carrasco Kind, Matias O06: Implementing Probabilistic Photometric Redshifts
Chang, Seo-Won O07: Improvement of Time-Series Photometry based on Multi-Aperture Indexing and Spatiotemporal de-trending
Costa, Alessandro O08: VisIVO: a web-based, workflow-enabled Gateway for Astrophysical Visualization
Csépány, Gergely O09: Design concepts of the "Fly's Eye" all-sky camera system
Eguchi, Satoshi O10: Prototype Implementation of Web and Desktop Applications for ALMA Science Verification Data and the Lessons Learned
Fabbro, Sebastien O11: Delivering astronomy software with minimal user maintenance
Farivar, Reza O12: Cloud based processing of large photometric surveys
Fernique, Pierre O13: HEALPix based cross-correlation in Astronomy
Foucaud, Sebastien O14: The Taiwan Extragalactic Astronomical Data Center
Gopu, Arvind O15: Framework for a Web-browser and AMQP messaging based Interactive Astronomy Data Analysis
Hack, Warren O35: DrizzlePac: Managing Multi-component WCS solutions for HST Data
Haase, Jonas O16: How to transplant a large archive
Jeschke, Eric O17: Introducing the Ginga FITS Viewer and Toolkit
Kamennoff, Nicolas O18: Development of an astrophysics-specific language for big data computation
Knapic, Cristina O19: Full tolerance archiving system
Laidler, Chris O20: GPU accelerated sinusoidal Hough Transformations, to detect binary pulsars.
Lubow, Steve O21: Hubble Source Catalog
Mann, Robert G. O22: Astronomy and Computing: a new journal for the astronomical computing community
Martinez-Rubi, O. O23: LEDDB: LOFAR Epoch of Reionization Diagnostic Database
Massimino, Pietro O24: SpaceMission - Learning Astrophysics through Mobile Gaming
Ott, Stephan O25: Herschel Data Processing Development - 10 years after
Pozanenko, Alexei O26: Searching for secondary photometric standards for wide FOV and fast transients observations
Roby, William O27: Using Firefly Tools to enhance archive web pages
Schaaf, Andre O28: Feedback about astronomical application developments for mobile devices
Teplitz, Harry O29: Enhancing Science During the Rapid Expansion of IRSA
Tollerud, Erik O30: The Astropy Project: A Community Python Library for Astrophysics
Warner, Craig O31: Redefining the Data Pipeline Using GPUs
Williams, Stewart O32: Spectral Line Selection in the ALMA Observing Tool
Winkelman, Sherry O33: An Observation-Centric View of the Chandra Data Archive
Wise, Michael W. O34: The LOFAR Data System: An Integrated Observing, Processing, and Archiving Facility

O01: Acceleration of automated HI source extraction.

Badenhorst, Scott J. University of Cape Town
Kuttel, Michelle University of Cape Town
Blyth, Sarah University of Cape Town

Future deep and wide-field neutral hydrogen (HI) galaxy surveys on both the proposed Square Kilometre Array and precursor instruments (e.g. MeerKAT) are expected to produce extremely large data sets, which are likely to render conventional manual extraction of HI sources infeasible. Software such as SExtractor and the popular DUCHAMP source extraction and analysis packages automate the source extraction process. The à trous wavelet reconstruction algorithm, used for noise removal within the DUCHAMP source extraction package, has been shown to greatly improve the reliability of automated source extraction.[1][2] However, as existing automated methods for source extraction are very computationally expensive and currently do not scale to extremely large datasets, there is considerable potential to accelerate source finding algorithms with current commodity parallel technologies: multi-core CPUs and inexpensive Graphics Processing Units (GPUs).

The aim of this work is to enable fast automated extraction of HI sources from noisy large-survey data. This has two components: handling the extremely large files (in excess of 5 TB) that will be produced by next-generation interferometers, and accelerating the source extraction algorithm pipeline. We compare three popular memory management libraries (Mmap, Boost and Stxxl) that allow the processing of data files too large to fit into main memory, to establish which scheme provides the best loading times from external memory. We also investigate possible speedups in automated source extraction through an efficient parallel implementation of the à trous wavelet reconstruction algorithm on both multicore and graphics hardware, which is evaluated against the original serial code implemented in the DUCHAMP package.
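The à trous ("with holes") reconstruction at the heart of DUCHAMP's noise removal can be sketched in one dimension. The following is an illustrative NumPy implementation using the common B3-spline kernel, not the DUCHAMP or authors' code:

```python
import numpy as np

def atrous_decompose(signal, n_scales=4):
    """Undecimated ('a trous') wavelet decomposition of a 1-D signal.

    At each scale the B3-spline kernel is dilated by inserting zeros
    ('holes') between its taps; wavelet planes are the differences
    between successive smoothings.
    """
    h = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0
    c = signal.astype(float)
    planes = []
    for j in range(n_scales):
        # Dilate the kernel: 2**j - 1 zeros between taps.
        step = 2 ** j
        kernel = np.zeros((len(h) - 1) * step + 1)
        kernel[::step] = h
        smoothed = np.convolve(c, kernel, mode="same")
        planes.append(c - smoothed)   # wavelet plane at scale j
        c = smoothed
    return planes, c                  # planes + final smooth residual

def atrous_denoise(signal, n_scales=4, k=3.0):
    """Reconstruct the signal keeping only significant wavelet coefficients."""
    planes, residual = atrous_decompose(signal, n_scales)
    out = residual.copy()
    for w in planes:
        sigma = np.median(np.abs(w)) / 0.6745   # robust noise estimate
        out += np.where(np.abs(w) > k * sigma, w, 0.0)
    return out
```

Because the transform is undecimated, summing all wavelet planes plus the residual reproduces the input exactly; thresholding the planes before summation suppresses noise while retaining bright sources.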


  1. Popping A., Jurek R., Westmeier T., Serra P., Flöer L., Meyer M., Koribalski B. (2012) Comparison of Potential ASKAP Hi Survey Source Finders. Publications of the Astronomical Society of Australia.
  2. Matthew T. Whiting (2012) Duchamp: a 3D source finder for spectral-line data. Monthly Notices of the Royal Astronomical Society. Volume 421, Issue 4, April 2012, Pages: 3242 - 3256

O02: GPU-based Acceleration of Radio Interferometry Point Source Visibility Calculations in the MEQtrees Framework

Baxter, Richard J. University of Cape Town
Marais, Patrick University of Cape Town
Kuttel, Michelle M. University of Cape Town

Modern radio interferometer arrays are powerful tools for obtaining high resolution, low frequency images of objects in deep space. While single dish telescopes convert the electromagnetic radiation directly into an image of the sky (or sky-intensity map), interference patterns between dishes in the array are converted into samples of the Fourier plane (UV-data or visibilities). A subsequent Fourier transform of the visibilities reveals the image of the sky. Conversely, a sky-intensity map comprising a collection of point sources can undergo an inverse Fourier transform to simulate the corresponding point source visibilities. Such simulated visibilities are important for testing models of external factors that affect the accuracy of observed data, such as radio frequency interference and interaction with the ionosphere. MeqTrees is a widely used radio interferometry calibration and simulation software package containing a Point Source Visibility module. However, calculation of visibilities is extremely computationally intensive, as it essentially involves applying the same Fourier equation to many point sources across multiple frequency bands and time slots. There is great potential for this module to be accelerated by the highly parallel Single-Instruction-Multiple-Data architectures in modern commodity Graphics Processing Units. Here we report a GPU/CUDA implementation of the Point Source Visibility calculation within the existing MeqTrees framework. For large numbers of sources, this implementation achieves an 18× speedup over the existing CPU module. With modifications to the MeqTrees memory management system to reduce overheads by incorporating GPU memory operations, real speedups of 25× should be achievable.
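The core of the point-source visibility calculation is a direct Fourier sum over sources; a minimal NumPy sketch (not the MeqTrees or CUDA code) is:

```python
import numpy as np

def point_source_visibilities(flux, l, m, u, v):
    """Direct Fourier sum: V(u,v) = sum_k I_k exp(-2*pi*i*(u*l_k + v*m_k)).

    flux : (n_src,) source brightnesses
    l, m : (n_src,) direction cosines relative to the phase centre
    u, v : (n_bl,)  baseline coordinates in wavelengths
    """
    # (n_bl, n_src) phase matrix: one row per baseline, one column per source
    phase = -2.0j * np.pi * (np.outer(u, l) + np.outer(v, m))
    return np.exp(phase) @ flux     # (n_bl,) complex visibilities
```

The same expression is evaluated independently for every source, baseline, frequency band and time slot, which is why the problem maps so naturally onto SIMD GPU hardware.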

O03: A Tale of 160 Scientists, Three Applications, A Workshop, and A Cloud.

Berriman, G. Bruce IPAC, Caltech
Brinkworth, C. IPAC, Caltech
Gelino, D. M. IPAC, Caltech
Wittman, D. K. IPAC, Caltech
Deelman, E. ISI, USC
Juve, G. ISI, USC
Rynge, M. ISI, USC
Kinney, J.,. Inc

The NASA Exoplanet Science Institute (NExScI) hosts the annual Sagan Workshops, thematic meetings aimed at introducing researchers to the latest tools and methodologies in exoplanet research. The theme of the Summer 2012 workshop, held from July 23 to July 27 at Caltech, was to explore the use of exoplanet light curves to study planetary system architectures and atmospheres. A major part of the workshop was to use hands-on sessions to instruct attendees in the use of three open source tools for the analysis of light curves, especially from the Kepler mission. Each hands-on session involved the 160 attendees using their laptops to follow step-by-step tutorials given by experts.

One of the applications, PyKE, is a suite of Python tools designed to reduce and analyze Kepler light curves; these tools can be invoked from the command line or a GUI in PyRAF, or from the Unix command line. The Transit Analysis Package (TAP) uses Markov Chain Monte Carlo (MCMC) techniques to fit light curves under the Interactive Data Language (IDL) environment, and Transit Timing Variations (TTV) uses IDL tools and Java-based GUI’s to confirm and detect exoplanets from timing variations in light curve fitting.

Rather than attempt to run these diverse applications on the inevitably wide range of environments on attendees' laptops, they were run instead on the Amazon Elastic Compute Cloud (EC2). The cloud offers features ideal for this type of short-term need: computing and storage services are made available on demand for as long as needed, and a processing environment can be customized and replicated as needed. The cloud environment included an NFS file server virtual machine (VM), 15 client VMs for use by attendees, and a VM to enable ftp downloads of attendees' results. The file server was configured with a 1 TB Elastic Block Storage (EBS) volume (network-attached storage mounted as a device) containing the application software and attendees' home directories. The clients were configured to mount the applications and home directories from the server via NFS. All VMs were built with CentOS version 5.8. Attendees connected their laptops to one of the client VMs using the Virtual Network Computing (VNC) protocol, which enabled them to interact with a remote desktop GUI during the hands-on sessions.

We will describe the mechanisms for handling security, failovers, and licensing of commercial software. In particular, IDL licenses were managed through a server at Caltech, connected to the IDL instances running on Amazon EC2 via a Secure Shell (ssh) tunnel. The system operated flawlessly during the workshop, and we will list a set of best practices for building and running applications on the cloud.

O04: Practical Informatics: Training the Next Generation

Brunner, Robert J. University of Illinois

A commonly discussed yet infrequently addressed problem in the scientific community is the inadequate training our students receive in dealing with large data, a subject more popularly known as informatics. Yet as presented by the late Jim Gray, we now have a fourth paradigm for scientific research, namely data intensive science. Over the last few years, I have tried to address this educational deficiency at the University of Illinois at Urbana-Champaign. Initially, I added relevant informatics content into the standard Astronomy curricula in order to increase the students' exposure to this new paradigm. Realizing that this was merely a band-aid solution, I next created and offered a new course, entitled Practical Informatics for the Physical Sciences, which was warmly received by undergraduate and graduate students in several science and engineering disciplines. More recently, I have been tasked by the University with expanding this course into a two-part online course to introduce informatics concepts and techniques to a wider audience.

In this paper, I present my initial motivation for adopting informatics material into the Astronomy curricula, my thoughts and experiences in developing the Practical Informatics course, lessons learned from the entire process, and my progress in developing the new online course. I hope that others can make use of these lessons to more broadly improve the training of the next generation of scientists.

O05: AGILE/GRID science alert monitoring system: the workflow and the Crab flare case

Bulgarelli, Andrea INAF/IASF Bologna, Italy
Trifoglio, Massimo INAF/IASF Bologna, Italy
Gianotti, Fulvio INAF/IASF Bologna, Italy
Tavani, Marco INAF/IAPS Roma, Italy
Parmiggiani, Nicolò Università di Modena e Reggio Emilia
Conforti, Vito INAF/IASF Bologna, Italy

During the first 5 years of the AGILE mission we have observed many gamma-ray transients of Galactic and extragalactic origin. A fast reaction to unexpected transient events is a crucial part of the AGILE monitoring program, because the follow-up of astrophysical transients is the key point for the scientific return of the mission; it is crucial to follow these events when they occur. Every 90 minutes (the duration of the AGILE orbit) the data are acquired from the satellite by the Ground Segment and sent to the AGILE team for automatic processing and alert generation by the AGILE/GRID science alert monitoring system presented in this paper.

The monitoring system of the AGILE mission checks the current observation in real time (i.e. during the data acquisition) to detect unexpected astrophysical events; to reach this objective we have developed two independent pipelines with two different blind search methods, one at ASI/ASDC and one at INAF/IASF Bologna. The latter is presented in this paper.

In September 2010 the science alert monitoring system recorded a transient phenomenon from the Crab Nebula, generating an automated alert sent via email and SMS two hours after the end of an AGILE satellite orbit, i.e. two hours after the Crab flare itself; for this discovery AGILE won the 2012 Bruno Rossi Prize. In this, as in many other cases, the reaction speed of the monitoring system was crucial. Many other interesting transient events have been detected by the AGILE/GRID science monitoring system (e.g. Cygnus X-3, PSR B1509-58, and transients from the Cygnus region).

We present the workflow and the software designed and developed by the AGILE Team to perform automatic analysis of the AGILE data for the detection of gamma-ray transients. In addition, an iPhone app will be released this autumn that will enable the AGILE team to access the monitoring system from their mobile phones.

AGILE is a gamma-ray observatory launched in 2007 that operates in the energy range 30 MeV-50 GeV; its scientific mission is the exploration of the high-energy gamma-ray sky.

O06: Implementing Probabilistic Photometric Redshifts

Carrasco Kind, Matias University of Illinois at Urbana Champaign
Brunner, Robert University of Illinois at Urbana Champaign

Photometric redshifts have become more important with the growth of large imaging surveys. But their basic implementation has not changed significantly from their original development, as most techniques provide a single estimate and a computed error for the source redshift. In this paper, we present a new approach that provides accurate probability density functions (PDFs) of redshifts for galaxies by efficiently combining standard template fitting techniques with powerful machine learning methods in a new, fully probabilistic manner. In addition, our approach also provides extra information about the internal structure of the data, including the relative importance of variables, identification of areas in the training sample that provide poor predictions, and outlier rejection. Our final implementation will handle massive data sets and will be developed to capitalize on modern computational systems.

The main structure of our method is now complete, and we have carried out several performance tests using data from the SDSS, DES, COSMOS, and DEEP2, with promising results. First, we use a template fitting method to obtain likelihoods in the photo-z/color space using either observed or synthetic spectra. Within a Bayesian framework that involves a training sample, we compute novel and accurate priors using a Random Naive Bayes classifier, in addition to a principal component analysis, which provide improved PDFs. An important advantage of the Bayesian approach is that the accuracy of the computed photo-zs can be characterized in a way that has no equivalent in other statistical approaches, enabling the selection of galaxy samples with very reliable photo-zs.

The second step computes PDFs using prediction trees and a random forest algorithm we developed called TPZ. This requires the construction of a large number of trees that are grown dynamically, which differs from current techniques that employ a static process, producing a better predictor. By using a random forest analysis, we are able either to incorporate existing training data in an efficient manner or to recommend specific new observations of additional data to improve the overall efficacy of our redshift PDF estimation. We are also able to identify the most important variables in our calculations to improve our photo-zs.

The final step combines both PDFs. These are independent methods with different origins and little or no covariance. We use the training data to identify the regimes where each method performs best and weight their PDFs accordingly in order to obtain a final photo-z PDF. This powerful combination enables us to obtain a robust PDF and also to better identify outliers. As an example, we applied our initial code to the DES simulated data and find that our approach meets almost all of the survey science requirements. Our algorithm runs in parallel using MPI and can easily be deployed on large clusters to handle massive data from large photometric surveys.
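The weighted combination of two independent redshift PDFs described above can be sketched as follows (an illustrative example, not the authors' code; the per-method weights here stand in for the performance weights learned from the training data):

```python
import numpy as np

def combine_pdfs(z_grid, pdf_template, pdf_tpz, w_template=0.5, w_tpz=0.5):
    """Weighted average of two photo-z PDFs on a common redshift grid,
    renormalized so the combined PDF integrates to one (uniform grid
    assumed, so a simple Riemann sum suffices)."""
    combined = w_template * pdf_template + w_tpz * pdf_tpz
    dz = z_grid[1] - z_grid[0]
    return combined / (np.sum(combined) * dz)
```

In practice the weights would vary with the regime (e.g. magnitude or color) in which each method performs best, as determined from the training sample.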

O07: Improvement of Time-Series Photometry based on Multi-Aperture Indexing and Spatiotemporal de-trending

Chang, Seo-Won Department of Astronomy, Yonsei University, Korea
Byun, Yong-Ik Department of Astronomy, Yonsei University, Korea

High-precision time-series photometry is often very difficult with wide-field ground-based surveys. Systematic noise from cloud passages and subtle PSF variations are the major causes. We developed a new photometry algorithm based on multi-aperture measurement and indexed aperture correction, followed by a spatio-temporal de-trending process. This turns out to be a powerful method for improving overall photometric accuracy and also for minimizing the damage caused by cosmic rays, moving objects and other cosmetic problems of CCD pixels. The performance of our new method is demonstrated with the MMT survey of M37 and HAT-South survey data. The de-trending alone is also a very useful tool for removing systematics from light curves, as we demonstrate with a subset of light curves from the LINEAR database. Our method removes serious systematic variations that are shared by light curves of nearby stars, while true variability is preserved. This greatly improves the usefulness of archived light curve data.
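The de-trending idea can be illustrated with a toy version: subtract, per epoch, the median behaviour shared by nearby comparison stars (a sketch of the principle only, not the authors' algorithm):

```python
import numpy as np

def detrend(target_lc, neighbour_lcs):
    """Remove systematics shared with nearby stars.

    target_lc     : (n_epochs,) magnitudes of the target
    neighbour_lcs : (n_stars, n_epochs) magnitudes of nearby comparison stars

    Each comparison light curve is median-centred first, so only the
    common *variation* is removed and the target's mean level (and any
    variability unique to it) is preserved.
    """
    centred = neighbour_lcs - np.median(neighbour_lcs, axis=1, keepdims=True)
    trend = np.median(centred, axis=0)   # shared systematic per epoch
    return target_lc - trend
```

A systematic common to all stars (e.g. a cloud passage) cancels, while a variation present only in the target survives the per-epoch median.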

O08: VisIVO: a web-based, workflow-enabled Gateway for Astrophysical Visualization

Costa, Alessandro INAF Catania
Bandieramonte, Marilena University of Catania
Becciani, Ugo INAF Catania
Krokos, Mel University of Portsmouth
Massimino, Piero INAF Catania
Petta, Catia University of Catania
Pistagna, Costantino INAF Catania
Riggi, Simone INAF Catania
Sciacca, Eva INAF Catania
Vitello, Fabio INAF Catania

We present VisIVO Science Gateway, a web-based, workflow-enabled framework, integrating large-scale multidimensional datasets and applications for visualization and data exploration on Distributed Computing Infrastructures (DCIs). Our framework is implemented through a workflow-enabled portal wrapped around WS-PGRADE which is a portal of the grid User Support Environment (gUSE). We provide customized interfaces for creating, invoking, monitoring and modifying scientific workflows. All technical aspects related to underlying visualization and DCI configurations are conveniently hidden from view. A number of workflows are enabled by default, e.g. implementing local or remote uploading and creation of scientific movies. Scientific movies are useful not only to scientists for presenting their research results, but also to museums and science centers for engaging the general public with complex scientific concepts. Our gateway can be accessed via standard web interfaces but also through a newly developed iOS mobile application offering scientific communities novel ways to share their results and experiences for analysis and exploration of large-scale astrophysical datasets within collaborative visualization environments.

O09: Design concepts of the "Fly's Eye" all-sky camera system

Csépány, Gergely MTA Research Centre for Astronomy and Earth Sciences, Konkoly Thege Miklós út 15-17, Budapest, H-1121, Hungary
Pál, András MTA Research Centre for Astronomy and Earth Sciences, Konkoly Thege Miklós út 15-17, Budapest, H-1121, Hungary
Vida, Krisztián MTA Research Centre for Astronomy and Earth Sciences, Konkoly Thege Miklós út 15-17, Budapest, H-1121, Hungary
Regály, Zsolt MTA Research Centre for Astronomy and Earth Sciences, Konkoly Thege Miklós út 15-17, Budapest, H-1121, Hungary
Mészáros, László MTA Research Centre for Astronomy and Earth Sciences, Konkoly Thege Miklós út 15-17, Budapest, H-1121, Hungary
Olá, Katalin MTA Research Centre for Astronomy and Earth Sciences, Konkoly Thege Miklós út 15-17, Budapest, H-1121, Hungary
Kiss, Csaba MTA Research Centre for Astronomy and Earth Sciences, Konkoly Thege Miklós út 15-17, Budapest, H-1121, Hungary
Döbrentei, László MTA Research Centre for Astronomy and Earth Sciences, Konkoly Thege Miklós út 15-17, Budapest, H-1121, Hungary
Mezö, György MTA Research Centre for Astronomy and Earth Sciences, Konkoly Thege Miklós út 15-17, Budapest, H-1121, Hungary

In this presentation we briefly summarize the design concepts of the ``Fly's Eye'' Camera System, a proposed high resolution all-sky monitoring device intended to perform time domain astronomy in multiple optical passbands while still achieving a high étendue. Funding has already been secured to design and build a ``Fly's Eye'' device. In principle, the device contains 19 wide-field, fast focal-ratio lenses and cameras equipped with 4k x 4k detectors, all mounted on a hexapod platform. This platform provides sidereal tracking during image acquisition and resets the attitude between subsequent exposures. The optical setup covers the sky above a horizontal altitude of 30 degrees, while the cumulative light-collecting power of such a device is similar to that of the Pan-STARRS or Kepler telescopes. Hence, the expected data flow rate is also rather large, in the range of 120 TB/year assuming continuous operation and exposure times of 3 minutes. Beyond the technical details and the actual scientific goals, we also demonstrate the possibilities and yields of a possible network operation involving approximately a dozen sites distributed nearly homogeneously across the globe. Such a network would yield an integrated étendue similar to that of LSST. In this project, we intend to follow an ``open design, open source and open data'' model. In addition, the design is well suited to operation in harsh environments, thanks to an enclosure with optical windows and regulated internal temperature and humidity. The robust mechanical design, which exploits a hexapod for local sidereal tracking, lacks unique moving parts and is fault tolerant due to its redundancy. Moreover, exactly the same instrument can be built independently of the actual geographical location, and the installation procedure is simple since there is no need for polar alignment.
This presentation will also focus on the data management issues raised by the proposed image acquisition scheme and the implied flow rate.
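The quoted data rate can be checked with back-of-the-envelope arithmetic (16-bit pixels and fully continuous operation are assumptions of this sketch, not figures taken from the design):

```python
BYTES_PER_PIXEL = 2          # assumed 16-bit ADC
PIXELS = 4096 * 4096         # one 4k x 4k detector
CAMERAS = 19
CADENCE_MIN = 3              # one exposure per camera every 3 minutes

bytes_per_cycle = BYTES_PER_PIXEL * PIXELS * CAMERAS
cycles_per_year = 365.25 * 24 * 60 / CADENCE_MIN
tb_per_year = bytes_per_cycle * cycles_per_year / 1e12
# roughly 112 TB/year of raw pixels, consistent with the quoted ~120 TB/year
# once headers and calibration frames are included
```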

O10: Prototype Implementation of Web and Desktop Applications for ALMA Science Verification Data and the Lessons Learned

Eguchi, Satoshi National Astronomical Observatory of Japan
Kawasaki, Wataru National Astronomical Observatory of Japan
Shirasaki, Yuji National Astronomical Observatory of Japan
Komiya, Yutaka National Astronomical Observatory of Japan
Kosugi, George National Astronomical Observatory of Japan
Ohishi, Masatoshi National Astronomical Observatory of Japan
Mizumoto, Yoshihiko National Astronomical Observatory of Japan

The Atacama Large Millimeter/submillimeter Array (ALMA), built on the Chajnantor plateau in northern Chile, is the largest radio telescope in the world, and is expected to provide much useful information for a deep understanding of the universe thanks to the unprecedented spatial and spectral resolution of its data. ALMA is estimated to generate TB-scale data in even a single observation; astronomers must somehow identify which part of the data they are really interested in. We have been developing new GUI software for this purpose utilizing the VO interface: the ALMA Web Quick Look System (Web QL) and the ALMA Desktop Application (Desktop App.; Kawasaki et al., in this conference). The former is written in JavaScript and HTML5 generated from Java code by the Google Web Toolkit, and the latter in pure Java. These two applications are designed to communicate with each other through the VO interface, which is developed to handle huge multidimensional astronomical data effectively. Users can access Web QL via the Desktop App. as well as our portal site, depending on their research purposes. Here is a use case example:

  1. A user accesses our portal site and searches for objects they are interested in.
  2. Then Web QL is automatically launched in the web browser.
  3. All user operations are translated into ADQL queries and sent to the VO server behind the scenes; the results are sent back to the QL, which updates the screen.
  4. If the user is satisfied with the data on the screen, they can download the result as a FITS file; the QL sends the latest parameters to a cut-out service in VO at the moment.
  5. The user loads the FITS file into Desktop App., and performs more detailed analysis.
  6. If the user finds that the data are insufficient for the analysis, they can jump directly back into Web QL and obtain a more suitable file.
An essential point of our approach is how to reduce network traffic: we prepare, in advance, "compressed" FITS files of 2x2x1 (horizontal, vertical, and spectral directions, respectively) binning, 2x2x2 binning, 4x4x2 binning data, and so on. These files are hidden from users, and Web QL automatically chooses the proper one for each user operation. Through this work, we found that class-based programming languages (e.g., Java) are preferable to prototype-based ones (e.g., JavaScript) for the development of Web QL, since we have to convert the FITS format into multiple JavaScript Object Notations in the proper order. We also find that FITS file I/O is very time-consuming because the FITS format is not designed to hold TB-scale data: pixels lie sequentially in only one dimension, so a partial cutout of data from a multidimensional FITS file corresponds to multiple sequential accesses throughout the file. Hence we have to develop alternative data containers for much faster data processing. In this paper, I introduce our data analysis systems and describe what we learned through the development.
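The pre-binned pyramid strategy amounts to picking, per request, the coarsest pre-computed binning that still oversamples the display. A sketch of that selection logic (illustrative only; the level set mirrors the 2x2x1, 2x2x2, 4x4x2, ... scheme described above):

```python
def choose_binning(cube_width, viewport_width, levels=(1, 2, 4, 8)):
    """Pick the largest spatial binning factor such that the binned
    image is still at least as wide as the viewport, minimising the
    data sent over the network without visible loss of detail."""
    best = 1
    for b in sorted(levels):
        if cube_width // b >= viewport_width:
            best = b
    return best
```

For a 4096-pixel-wide cube shown in a 1024-pixel viewport this selects 4x4 binning, cutting the transferred spatial data by a factor of 16.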

O11: Delivering astronomy software with minimal user maintenance

Fabbro, Sebastien University of Victoria
Goliath, Sharon Canadian Astronomy Data Centre

We present an approach to deliver on-demand astronomy processing using virtualization and a network file system. User-requested astronomy software applications are built and tested on a dedicated server, and distributed on demand to cloud-based worker clients using a fast HTTP read-only cache file system. The worker clients are light virtual appliances which keep overheads to processing resources very small, while still ensuring the portability of all software applications. The goal is to limit the need for users of processing resources to carry out software installation, portability and maintenance tasks. We describe the design and infrastructure of the system, the software building process on the server, and show two typical applications: performing standard data analysis on a desktop, and using the system with a virtual machine on a cloud processing infrastructure in order to detect transients on a wide field survey.

O12: Cloud based processing of large photometric surveys

Farivar, Reza Department of Electrical and Computer Engineering, University of Illinois
Brunner, Robert Department of Astronomy, University of Illinois
Santucci, Robert Department of Physics, University of Illinois
Campbell, Roy Department of Computer Science, University of Illinois

Astronomy, as is the case with many scientific domains, has entered the realm of being a data rich science. Nowhere is this reflected more clearly than in the growth of large area surveys, such as the recently completed Sloan Digital Sky Survey (SDSS). The photometric data generated by these surveys are often processed in complementary manners. First, the data is rapidly analyzed to identify transient phenomena. Second, the nightly data is processed for release within the scientific collaboration. Finally, on longer timescales, the entire data set is reprocessed to produce a homogeneous public data set.

These different data processing strategies utilize vastly different computational approaches. Transient searches often involve fast, minimalistic data reductions in order to be completed as close to the data gathering as possible. Nightly processing is more thorough and follows the data gathering, yet is not a supercomputing challenge. On the other hand, full data releases require massive data reprocessing using the latest calibration parameters, but this is typically only needed once or twice a year. As a result, these projects face a dilemma: do they acquire a dedicated computational system to handle all processing tasks, which might often sit idle, or do they focus on the day-to-day processing and try to leverage other systems when the need arises? This question has become even more relevant with the growth of large photometric surveys, like the Dark Energy Survey, which will soon obtain PBs of imaging data, and next-generation surveys like the Large Synoptic Survey Telescope, which will acquire PBs of raw imaging data per year.

In this paper, we demonstrate a new approach to this common problem. By leveraging the cloud-computing model, we demonstrate how computing on demand and understanding data locality and access patterns can produce a competitive computing model for scientific big data challenges. As a specific example, we have tackled reprocessing the SDSS imaging data by using the SExtractor analysis program. Overall, the work is executed in two parts. First, we fetch in parallel the remotely hosted SDSS calibrated imaging data from Fermi National Accelerator Laboratory, and process the images by using different invocations of SExtractor running within a 512-core Hadoop cluster. This ensures the input file locality provided by the HDFS/Hadoop framework remains effective. Second, we match the processed output images to generate a single catalog. Using the intermediate key/value pair design of Hadoop, our framework matches objects across different SExtractor invocations to create a unified catalog from all SDSS processed data. We conclude by presenting our experimental results and a discussion of the lessons we have learned in completing this challenge.
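The catalog-unification step maps naturally onto Hadoop's key/value model; a minimal in-process Python sketch of the idea follows, keying each detection by a sky cell so that detections of the same object from different SExtractor invocations meet at the same reducer (the cell size and the naive mean-position merge are arbitrary illustrative choices, not the authors' design):

```python
from collections import defaultdict

def map_detection(det, cell_deg=0.01):
    """Map step: emit a (sky-cell key, detection) pair."""
    key = (int(det["ra"] / cell_deg), int(det["dec"] / cell_deg))
    return key, det

def reduce_cells(detections):
    """Shuffle + reduce: group detections by cell key, then merge each
    cell's detections into a single catalog entry (here: naive mean)."""
    cells = defaultdict(list)
    for det in detections:            # the 'shuffle' phase
        key, value = map_detection(det)
        cells[key].append(value)
    catalog = []
    for dets in cells.values():       # the 'reduce' phase
        n = len(dets)
        catalog.append({"ra": sum(d["ra"] for d in dets) / n,
                        "dec": sum(d["dec"] for d in dets) / n,
                        "n_det": n})
    return catalog
```

In a real Hadoop job the map and reduce functions run on separate nodes and objects near cell boundaries need extra care, but the key/value flow is the same.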

O13: HEALPix based cross-correlation in Astronomy

Fernique, Pierre CDS, Observatoire de Strasbourg
Durand, Daniel National Research Council Canada
Boch, Thomas CDS, Observatoire de Strasbourg
Oberto, Anaïs CDS, Observatoire de Strasbourg
Pineau, François-Xavier CDS, Observatoire de Strasbourg

We present work on a cross-correlation system based on HEALPix cell indexing. The system allows users to answer scientific questions like "find all HST images on which there is an observation of a radio-quiet quasar" in a single query, since its index uses the footprint of any given observation. The baseline of this system is the creation of HEALPix indexes grouped hierarchically and organized in a special file format called MOC, developed by the CDS. In this way, the cross-correlation between images all around the sky and a full survey is reduced to searches only in meaningful areas (HEALPix cells defined on both sides). Provided that the survey database also uses an internal HEALPix positional index, the search result comes back almost immediately (typically a few seconds for the example above). We have started to build the index for some surveys and catalogs (VizieR catalogs, Simbad, ...) and some pointed-mode archives (like HST at CADC), and we are developing an elementary tool which will compute the intersection of any input MOC files. Once the few matching cells are identified, access to the original contributing observations is straightforward, since we keep the list of matching observations per cell. MOC files are starting to be used throughout the VO community as a general indexing method, and tools such as Aladin and TOPCAT are beginning to make use of them. Since MOC files are simply encoded in FITS or JSON, it is easy to use them in any context. Finally, the generated index retains knowledge of the original progenitors, allowing direct access to the original data when using the MOC file. As time permits, we will present a live demo.
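The hierarchical-cell intersection at the heart of MOC can be sketched without any HEALPix library: represent a coverage map as a set of (order, ipix) cells, where a cell at order o corresponds to 4**(o'-o) cells at a deeper order o', and intersect two maps by expanding both to a common order (a toy illustration of the logic, not the CDS implementation):

```python
def to_order(cells, order):
    """Expand a set of (order, ipix) cells to a flat set of pixel
    indices at the given (deepest) order; each HEALPix cell at order o
    splits into exactly 4 children at order o+1."""
    pixels = set()
    for o, ipix in cells:
        factor = 4 ** (order - o)
        pixels.update(range(ipix * factor, (ipix + 1) * factor))
    return pixels

def moc_intersection(moc_a, moc_b):
    """Intersect two MOC-like coverage maps (sets of (order, ipix));
    returns the pixel set at the deepest order present in either map."""
    deepest = max(o for o, _ in moc_a | moc_b)
    return to_order(moc_a, deepest) & to_order(moc_b, deepest)
```

Production MOC tools avoid the full expansion by intersecting nested pixel ranges directly, but the containment rule is the same.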

O14: The Taiwan Extragalactic Astronomical Data Center

Foucaud, Sebastien National Taiwan Normal University, Taiwan
Hashimoto, Yasuhiro National Taiwan Normal University, Taiwan
Tsai, Meng-Feng National Central University, Taiwan

Founded in 2010, the Taiwan Extragalactic Astronomical Data Center (TWEA-DC) aims to provide the Taiwanese and international community with access to large amounts of data, focusing its efforts on extragalactic science. Continuing individual efforts in Taiwan over the past few years, it is the first stepping-stone towards the building of a National Virtual Observatory.

Taking advantage of our own fast indexing algorithm (BLINK), based on an octahedral meshing of the sky coupled with a very fast kd-tree and careful parallelization across available resources, TWEA-DC will offer from spring 2013 an "on-the-fly" matching service between on-site catalogs and user-supplied catalogs. We also plan to offer access to raw and reducible data available from archives worldwide, providing friendly access to this goldmine of under-exploited information. Finally, we are developing our own specific on-line analysis tools, such as an automated photometric redshift and SED-fitting code, an automated group and cluster finder, and a multiple-object and arc finder.

I will introduce our data center, focusing on the planned services such as the matching tool, our automated photometric redshift algorithm (APz), and our automated group and cluster finder (APFoF).
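At its core, the "on-the-fly" matching service is a positional cross-match. The sketch below illustrates the concept with a brute-force pair comparison in numpy; BLINK replaces this O(N×M) search with its octahedral mesh and kd-tree index, so the code is illustrative only.

```python
import numpy as np

def crossmatch(ra1, dec1, ra2, dec2, radius_arcsec):
    """Brute-force positional cross-match of two catalogs (coordinates in
    degrees), returning index pairs closer than radius_arcsec. A production
    service replaces this O(N*M) pair search with a spatial index such as
    a kd-tree."""
    def unit(ra, dec):
        ra, dec = np.radians(ra), np.radians(dec)
        return np.stack([np.cos(dec) * np.cos(ra),
                         np.cos(dec) * np.sin(ra),
                         np.sin(dec)], axis=-1)
    u1 = unit(np.asarray(ra1), np.asarray(dec1))
    u2 = unit(np.asarray(ra2), np.asarray(dec2))
    # angular separation from the dot product of unit vectors
    cos_sep = np.clip(u1 @ u2.T, -1.0, 1.0)
    sep_arcsec = np.degrees(np.arccos(cos_sep)) * 3600.0
    i, j = np.nonzero(sep_arcsec <= radius_arcsec)
    return list(zip(i.tolist(), j.tolist()))
```

Working on unit vectors rather than raw (RA, Dec) avoids the pole and wrap-around pathologies of naive coordinate differences.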

O15: Framework for a Web-browser and AMQP messaging based Interactive Astronomy Data Analysis

Gopu, Arvind Indiana University
Hayashi, Soichi Indiana University
Young, Michael Indiana University

The WIYN 3.5m telescope's One Degree Imager (ODI), with a fully populated focal plane, will produce raw observational data on the order of 2-4 GB per exposure, and approximately 0.5 TB for a typical 3-night observing run. Pipeline-processed calibrated data are expected to be of the same order of magnitude in size. Traditional desktop-based astronomical processing techniques will no longer be sufficient to reduce data of this magnitude. The ODI Pipeline, Portal, and Archive (ODI-PPA) is a web-browser based solution being developed for ODI's proprietary and archival data search/access and pipeline processing needs.

In particular, we are developing a sub-component of the ODI-PPA portal: an interactive web-based data analysis framework. It will allow ODI-PPA users to request that typical data processing algorithms be applied to their data from within their PPA account in the web browser. It uses PHP and jQuery within the Zend Framework for the front-end functionality, well-known astronomy data processing modules (including IRAF's imexamine and imstat) on the backend, and AMQP-based messaging to accomplish the necessary Remote Procedure Calls (RPC) to those modules. The modules themselves will be executed on powerful, dedicated compute nodes of an existing compute cluster at Indiana University. When fully implemented, the framework will let users execute typical interactive analysis steps such as contour plots, point source detection and photometry, surface photometry, and catalog source matching.
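To illustrate the messaging pattern, an RPC request over AMQP is typically a small serialized envelope carrying a correlation id and a reply queue; the sketch below is a hypothetical envelope, not the actual ODI-PPA protocol, and all field names are invented.

```python
import json
import uuid

def make_rpc_request(module, args, reply_queue):
    """Build a hypothetical RPC envelope to be published to an AMQP broker.
    Field names here are illustrative, not the real ODI-PPA protocol."""
    return json.dumps({
        "correlation_id": str(uuid.uuid4()),  # lets the caller match the reply
        "reply_to": reply_queue,              # queue where the compute node posts results
        "module": module,                     # e.g. an IRAF task such as "imstat"
        "args": args,
    })

msg = make_rpc_request("imstat", {"image": "obs001.fits"}, "results.user42")
```

The compute node would run the named module and publish its result to `reply_to`, tagged with the same correlation id, so the browser session can pair responses with outstanding requests.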

An ODI instrument with a partially populated focal plane is being commissioned this fall as a precursor to the full ODI instrument. We will use data generated from this instrument to further optimize the design and configuration of our framework.

O16: How to transplant a large archive

Haase, Jonas ESO/ESA
Arviset, Christophe ESA/ESAC
Osuna, Pedro ESA/ESAC
Rosa, Michael ESA

During 2011 and 2012, the European Hubble Space Telescope Archive was moved with all its data and services from its old home at ESO near Munich to its new one at ESA/ESAC near Madrid. The successful move of the active HST and Hubble Legacy Archives took only slightly more than a year, despite a limited amount of preparation beforehand and a minimum of manpower available for the task.

This talk describes the logistics of the move, lessons learned and the strategies which have been employed to make the Hubble archives easy to maintain, self-contained and easily portable between environments with sufficient storage capacity and ubiquitous grid processing power.

O17: Introducing the Ginga FITS Viewer and Toolkit

Jeschke, Eric Subaru Telescope, National Astronomical Observatory of Japan
Inagaki, Takeshi Subaru Telescope, National Astronomical Observatory of Japan
Kackley, Russell Subaru Telescope, National Astronomical Observatory of Japan

This paper and presentation introduce Ginga, a new open-source FITS viewer and toolkit based on Python astronomical packages such as pyfits, numpy, scipy, matplotlib, and pywcs. For developers, we present a set of Python classes for viewing FITS files under the modern Gtk and Qt widget sets, together with a more full-featured viewer that has a plugin architecture. We further describe how plugins can be written to extend the viewer with many different capabilities.
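As a rough illustration of the plugin-architecture idea, a viewer can expose a registry that plugins attach themselves to; the sketch below is a generic registry pattern, not Ginga's actual plugin API.

```python
class ViewerPluginRegistry:
    """Generic plugin-registry pattern; Ginga's real plugin interface
    differs in its details."""
    def __init__(self):
        self._plugins = {}

    def register(self, name):
        """Class decorator that records a plugin under a given name."""
        def deco(cls):
            self._plugins[name] = cls
            return cls
        return deco

    def start(self, name, viewer):
        """Instantiate a registered plugin, handing it the viewer object."""
        return self._plugins[name](viewer)

registry = ViewerPluginRegistry()

@registry.register("contours")
class ContourPlugin:
    """Hypothetical plugin that would draw contour overlays."""
    def __init__(self, viewer):
        self.viewer = viewer
```

The viewer core never needs to know which plugins exist at compile time; anything registered becomes available by name.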

The software may be of interest to developers looking for a solution for integrating FITS visualization into their Python programs, and to end users interested in a new and different FITS viewer that is not based on Tcl/Tk widget technology. The software has been released under a BSD license.

O18: Development of an astrophysics-specific language for big data computation

Kamennoff, Nicolas ACSEL / Epitech
Foucaud, Sébastien NTNU-ES
Reybier, Sébastien SoaMI
Auroux, Lionel LSE / Epita

The link between astrophysics and computer science tightens as the storage and computation needs of the former pose a challenge for the latter. Over the past years much software has been developed, but only a little of it uses High Performance Computing (HPC). Indeed, access to both hardware and development resources comes at a cost and is therefore out of reach for many scientists, while parallel and distributed programming has become a burning issue for software developers as single-core computers vanish. In collaboration with the Taiwan Extragalactic Astronomical Data Center (TWEA-DC), we aim to develop an open-source software framework of a new kind. This paper describes the BLINK project (Billion Line INdexing in a clicK), whose goal is a Domain Specific Language (DSL) that will allow people to describe their requests for data and computation in an astrophysicist-friendly way and will distribute those tasks over the HPC centers hosting the service. On the computer science side, we focus on strong software and hardware optimizations, trying to take advantage of all available resources (multi-core CPUs, GPGPUs, APUs, and so on) while offering an easy-to-use service. Since we cannot release a generic tool that suits everyone, the system must reach a high standard of modularity and control; in this way we hope to deliver a powerful framework that lets community members add their own language keywords and modules to the solution. As we are at the prelude of our journey, several milestones have been set; our first step is to release a fast cross-matching service based on this system.
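To give a flavor of what parsing such a DSL statement might involve, the sketch below recognizes an invented cross-match statement; the real BLINK syntax is still being designed, so this grammar is purely illustrative.

```python
import re

# Hypothetical BLINK-style statement; the real DSL grammar is still being
# designed, so this pattern is invented for illustration only.
PATTERN = re.compile(
    r"MATCH\s+(?P<left>\w+)\s+WITH\s+(?P<right>\w+)"
    r"\s+RADIUS\s+(?P<radius>[\d.]+)\s*arcsec",
    re.IGNORECASE,
)

def parse_match(statement):
    """Turn a textual MATCH statement into a task description that a
    scheduler could dispatch to an HPC center."""
    m = PATTERN.match(statement.strip())
    if m is None:
        raise ValueError("not a MATCH statement")
    return {"left": m.group("left"),
            "right": m.group("right"),
            "radius_arcsec": float(m.group("radius"))}
```

The point of the DSL layer is exactly this separation: the astrophysicist writes the statement, and the parsed task description is what gets distributed and optimized.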

O19: Fault-tolerant archiving system

Knapic, Cristina INAF - OATs
Molinaro, Marco INAF - OATs
Smareglia, Riccardo INAF - OATs

The archiving system at the Centro Italiano Archivi Astronomici (IA2) manages data from external sources such as telescopes, observatories, and surveys, and handles them so as to guarantee preservation, dissemination, and reliability, where possible in a Virtual Observatory (VO) compliant manner. A dynamic metadata model constructor and a data archiving program are new ideas needed to automate the management of different astronomical data sources in a fault-tolerant environment. The goal is a fully fault-tolerant archiving system, but this is complicated by data models, file formats (FITS, HDF5, ROOT, PDS, etc.), and metadata content that vary and change over time, even within the same project. To avoid this catastrophic scenario, several considerations have emerged to guarantee archive ingestion, backward compatibility, and preservation of information. A solution could be based on an Archive Supervisor developed in a distributed CORBA-based framework such as ACS (ALMA Common Software). It could use the component-container paradigm, providing common services such as logging, error and alarm management, a configuration database, and life-cycle management, and managing subsystems so that they cooperate through any of the supported programming languages. Cooperation means creating and using a semi-structured data model and processes that handle incoming files of different formats, store them in different types of databases, and distribute private data in different ways. These processes have to map incoming file metadata dynamically (e.g. using an XML schema) and store the data in storage and the metadata in a database appropriately. The database configuration should be independent of the actual data structure and model. Depending on the data type (raw or calibrated), developers could adopt different database management systems; in this scenario, both SQL and NoSQL databases are under evaluation.
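The dynamic metadata mapping idea can be sketched as follows; the field and keyword names are invented, and the real system would drive such mappings from an XML schema rather than a hard-coded table.

```python
# Per-format mapping tables standing in for the XML-schema-driven
# mappings of the real system; all keyword names are invented.
MAPPINGS = {
    "fits": {"object": "OBJECT", "date_obs": "DATE-OBS", "exptime": "EXPTIME"},
    "hdf5": {"object": "target_name", "date_obs": "obs_start", "exptime": "exposure"},
}

def extract_metadata(fmt, header):
    """Map an incoming file's header onto the common metadata model,
    leaving the database schema independent of the file format."""
    mapping = MAPPINGS[fmt]
    return {field: header.get(key) for field, key in mapping.items()}
```

Adding support for a new format or project then means adding a mapping, not changing the ingestion code or the database schema.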

O20: GPU-accelerated sinusoidal Hough Transformations to detect binary pulsars

Laidler, Chris University of Cape Town

Analysis of relativistic binary pulsars is currently the best means by which to test theories of gravity in strong gravitational fields. However, automated blind searching for pulsars is time consuming and resource intensive. This is particularly the case when searching for pulsars in binary orbits, as the signal from these pulsars is Doppler-smeared by their orbital motion, which reduces the sensitivity of standard search procedures. Two methods are commonly used to detect binary pulsars: acceleration searches and side-band searches. An acceleration search is effective when the orbital period of the pulsar is significantly longer than the observation time, while a side-band search is effective when the orbital period is shorter than the observation period. Thus, there is a sensitivity gap when the orbital period is approximately equal to the duration of observation (a few hours). To search for such pulsars, a method applying a Hough Transformation to a Dynamic Power Spectrum (DPS) has been developed. This method is computationally intensive, as many orbital parameters have to be considered during the transformation. There is potential for commodity graphics processors (GPUs) to accelerate this algorithm at minimal expense: because each transformation of a DPS point is essentially an independent process, the Hough Transformation is well suited to parallelization.

In this work, we present a GPU implementation of a custom four-dimensional Hough Transformation to detect sinusoids in noisy images. We apply this transformation to synthesized DPSs to detect the sinusoidal shift in the observed spin frequency of binary pulsars in approximately circular orbits, and report on its effectiveness in detecting binary pulsars.
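The voting scheme behind such a Hough Transformation can be sketched as follows. This toy version fixes the orbital period and votes over amplitude, phase, and base frequency, whereas the full four-dimensional search also scans the period; a GPU version would parallelize the per-point loop, since each point's votes are independent.

```python
import numpy as np

def sinusoid_hough(points, period, amps, phases, f0_res=0.1):
    """Toy Hough vote for a sinusoid f(t) = f0 + A*sin(2*pi*t/period + phi)
    through a set of (t, f) detections. The orbital period is held fixed
    here; the full search described above also scans it, making the
    accumulator four-dimensional."""
    acc = {}
    for t, f in points:
        for a in amps:
            for ph in phases:
                # each candidate (A, phi) "votes" for the base frequency
                # that would place this point on the sinusoid
                f0 = round((f - a * np.sin(2 * np.pi * t / period + ph)) / f0_res) * f0_res
                key = (a, ph, round(f0, 6))
                acc[key] = acc.get(key, 0) + 1
    # the accumulator cell with the most votes is the detected sinusoid
    return max(acc, key=acc.get)
```

Points lying on a common sinusoid all vote into the same accumulator cell, so the detection survives substantial noise in the individual points.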

O21: Hubble Source Catalog

Lubow, Steve STScI
Budavari, Tamas JHU

We have created an initial catalog of objects observed by the WFPC2 and ACS instruments on the Hubble Space Telescope (HST). The catalog is based on observations taken during more than 6000 visits (telescope pointings) of ACS/WFC and more than 25000 visits of WFPC2. It is obtained by cross-matching, by position on the sky, all Hubble Legacy Archive (HLA) Source Extractor source lists for these instruments; the source lists describe the properties of source detections within a visit. The calculations are performed on a SQL Server database system. First we collect overlapping images into groups, e.g., Eta Car, and determine nearby (approximately matching) pairs of sources from different images within each group. We then apply a novel algorithm that improves the cross-matching of source pairs by adjusting the astrometry of the images. Next, we combine pairwise matches into maximal sets of possible multi-source matches and apply a greedy Bayesian method to split the maximal matches into more reliable ones. We test the accuracy of the matches by comparing the fluxes of the matched sources. The result is a set of information that ties together multiple observations of the same object. A byproduct of the catalog is greatly improved relative astrometry for many of the HST images. We also provide information on nondetections, which can be used to determine dropouts. With the catalog, for the first time, one can carry out time-domain, multi-wavelength studies across a large set of HST data. The catalog is publicly available, and much more can be done to expand its capabilities.
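The step of combining pairwise matches into maximal sets is a connected-components problem, which can be sketched with a union-find structure; the greedy Bayesian splitting applied afterwards is beyond this sketch.

```python
def maximal_match_sets(pairs):
    """Group pairwise source matches into maximal (connected) sets using
    union-find with path halving."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return sorted(sorted(g) for g in groups.values())
```

Any two sources linked by a chain of pairwise matches end up in the same maximal set, which is exactly why a subsequent splitting step is needed to break up chains that join distinct objects.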

O22: Astronomy and Computing: a new journal for the astronomical computing community

Mann, Robert G University of Edinburgh
Accomazzi, A. Harvard-Smithsonian Center for Astrophysics
Budavari, T. Johns Hopkins University
Fluke, C. Swinburne University of Technology
Gray, N. University of Glasgow
O'Mullane, W. European Space Astronomy Centre
Wicenec, A. University of Western Australia
Wise, M. ASTRON: Netherlands Institute for Radio Astronomy

We announce the launch of "Astronomy and Computing", a new peer-reviewed journal to serve the astronomical computing community. The need for such a journal was identified in a BoF discussion at the Boston ADASS, which also started the debate as to what aims and scope the journal should have to best meet the requirements of our community. In this talk we will outline how that debate has progressed to the point of the launch of A&C and discuss the role that the community has in shaping its development.

A&C is currently accepting papers for its first issue, which is due for publication this winter. Following the discussion at the Boston ADASS, the journal will be accepting a range of different types of paper: from research papers and "reports on practice", presenting technical lessons learned, through to invited reviews and "white papers". Each of these will be refereed according to appropriate criteria to ensure that A&C develops a corpus of long-lasting value to the community, as well as providing a means for sharing information within it. Authors will be strongly encouraged to make their manuscripts available to the community in a timely fashion via the ArXiv, and to provide sustainable links to data and source code described in their papers, while the journal intends to develop innovative ways of preparing and publishing papers to best present their content.

Members of the Editorial and Scientific Advisory Boards of A&C will be present to take further input from the ADASS community as to how the journal can best serve its needs: "Astronomy and Computing" is your journal, please engage with us to make it a success.

Further information about A&C can be found at

O23: LEDDB: LOFAR Epoch of Reionization Diagnostic Database

Martinez-Rubi, O. University of Groningen, Kapteyn Astronomical Institute, Groningen, The Netherlands.
Veligatla, V. K. University of Groningen, Kapteyn Astronomical Institute, Groningen, The Netherlands.
de Bruyn, A. G. ASTRON, Dwingeloo, The Netherlands.
Lampropoulos, P. ASTRON, Dwingeloo, The Netherlands.
Offringa, A. R. University of Groningen, Kapteyn Astronomical Institute, Groningen, The Netherlands.
Yatawatta, S. ASTRON, Dwingeloo, The Netherlands.

We present the details of the LEDDB (LOFAR EoR Diagnostic Database) that will be used in the storage, management, processing and analysis of the LOFAR EoR project observations.

LOFAR (Low-Frequency Array) is an antenna array that observes at low radio frequencies. It consists of about 70 stations spread around Europe that combine their signals to form an interferometric aperture synthesis array.

The LOFAR EoR (Epoch of Reionization) experiment is one of the key science projects of LOFAR. It aims to study the redshifted 21-cm line of neutral hydrogen from the Epoch of Reionization. There are many challenges to meet this goal including strong astrophysical foreground contamination, ionospheric distortions, complex instrumental response and different types of noise. The very faint signals require hundreds of hours of observation thereby accumulating petabytes of data. To diagnose and monitor the various instrumental and ionospheric parameters, as well as manage the data, we have developed the LEDDB. Its main tasks and uses are:

  • To store referencing information of the observations, mainly the locations of the data but also other indexing information.
  • To store diagnostic parameters of the observations extracted through calibration.
  • To facilitate efficient data management and pipeline processing.
  • To monitor the performance of the telescope as a function of date.
  • To visualize the diagnostic parameters. This includes tools for the generation of plots and animations to analyze the diagnostic data across all its dimensions. For example, we can plot the complex gain of all the stations as a function of time and frequency to visualize ionospheric distortions affecting a large part of the array.

From the petabytes of data generated by the hundreds of observations, we estimate that 10 terabytes of diagnostic data will be stored in the LEDDB. In addition to the size challenge, the most important issue to take into account in the design of the database and its query engine is the number of rows in some of the tables, which is in fact the main bottleneck in the queries.

The LEDDB is implemented with PostgreSQL and accessed through a Python interface provided by the psycopg2 module. The query engine is a Python API that provides fast and flexible access to the database. We use a Python-based web server (CherryPy) to interface with the query engine, and the client-side user interface in the web page is implemented with the jQuery UI framework. Low access times are achieved thanks to efficient table indexing, the minimization of join operations, the use of persistent connections (eased by the session handling provided by CherryPy), and an extensive set of options in the query engine for the selection of data.
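As an illustration of the kind of parameterized query such an engine composes, the sketch below builds SQL with placeholders; the table and column names are hypothetical, and psycopg2 substitutes the %s placeholders safely when the query is executed.

```python
def build_gain_query(station_ids, start_mjd, end_mjd):
    """Compose a parameterized query of the kind the LEDDB query engine
    might generate. Table and column names here are hypothetical; the
    %s placeholders follow psycopg2's parameter-passing convention."""
    sql = ("SELECT station_id, mjd, frequency, gain "
           "FROM station_gain "
           "WHERE station_id = ANY(%s) AND mjd BETWEEN %s AND %s "
           "ORDER BY mjd")
    return sql, (list(station_ids), start_mjd, end_mjd)
```

A real call would then be `cursor.execute(sql, params)`, letting the driver handle quoting and letting PostgreSQL use the time and station indexes that keep the large tables from becoming the bottleneck.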

O24: SpaceMission - Learning Astrophysics through Mobile Gaming

Massimino, Pietro INAF - Astrophysical Observatory of Catania - Italy
Bandieramonte, Marilena INAF - Astrophysical Observatory of Catania - Italy
Becciani, Ugo INAF - Astrophysical Observatory of Catania - Italy
Costa, Alessandro INAF - Astrophysical Observatory of Catania - Italy
Krokos, Mel University of Portsmouth - U.K.
Petta, Catia University of Catania - Italy
Pistagna, Costantino INAF - Astrophysical Observatory of Catania - Italy
Riggi, Simone INAF - Astrophysical Observatory of Catania - Italy
Sciacca, Eva INAF - Astrophysical Observatory of Catania - Italy
Vitello, Fabio INAF - Astrophysical Observatory of Catania - Italy

SpaceMission is a mobile application (iOS) offering hands-on experience of astrophysical concepts using scientific simulations. The application is based on VisIVO, a suite of software tools for visual discovery through 3D views generated from astrophysical datasets.

The game employs a standard dark matter N-body cosmological simulation carried out in a computational box of 80 Mpc containing 1 billion particles. The application contains data in the form of 3D renderings corresponding to a number of pre-determined view points inside the simulation and focusing either at the center of the computational box or at the center of interesting objects that are to be discovered by the players.

The goal is to find a number of regions of interest inside the cosmological simulation, for example a spiral galaxy or a collision between galaxies. During investigation players are provided with tools to create movies of their explorations. Such movies are rendered by exploiting (in a seamless way) high-performance computational infrastructures, e.g. desktop grids or remotely connected web grids. Once a player discovers an object, scientific information is displayed on the screen e.g. about its structure and evolution. Moreover a dynamic evolution can be shown, starting from the beginning of the simulation (redshift = 50), i.e. from immediately after the big-bang to the present day.

The typical operational scenario is first to select frames to be part of the player's virtual journey in the cosmos. Secondly, a selection of images is required, taken from particular viewpoints and associated with user-defined zoom levels. Finally, an HTTP request is submitted to a VisIVO server to produce a cinematic film.

The required computational resources depend on the underlying hardware specification of the server. As an example, the rendering time to generate a film based on 6 player-selected frames may last up to 45 minutes. The reason is that all in-between frames are generated directly from the original cosmological simulation to obtain very high quality movies.

The current version of the app utilizes 192 MB for displaying 1 million bodies. Once a movie is completed, the player receives by email all the necessary instructions for watching it, e.g. using a common browser or the app itself. Movies can also be uploaded to YouTube.

The SpaceMission app is the first in what the developers hope will be a range of new-generation mobile tools exploiting specialized scientific data with access to DCIs to help encourage young people to become more interested in sciences.

To this end, an interactive exhibit (supported by the Science and Technology Facilities Council, UK) based on SpaceMission has been installed at INTECH, a major science centre and planetarium in Winchester, UK. Our initial feedback is encouraging, but we are planning to formally validate SpaceMission through a competition among high school students in the UK and Italy.

O25: Herschel Data Processing Development - 10 years after

Ott, Stephan Herschel Science Centre

The Herschel Space Observatory, the fourth cornerstone mission in the ESA science programme, was launched on the 14th of May 2009. With a 3.5 m telescope, it is the largest space telescope ever launched. Herschel's three instruments (HIFI, PACS, and SPIRE) perform photometry and spectroscopy in the 55-672 micron range and will deliver exciting science to the astronomical community during at least three years of routine observations. Since the 2nd of December 2009, Herschel has been performing and processing observations in routine science mode. As a cryogenic mission, Herschel's operational lifetime is limited by its supply of liquid helium, which is currently estimated to run out in March 2013.

Originally it was considered sufficient to provide astronomers with raw data and the software tools to carry out a basic data reduction, and no 'data products' were to be generated and delivered. Later in the development process it was realised that the astronomical community's expectations of an observatory and its data processing system, data products, and archive had evolved, and that data analysis tools and scientifically useful products must be provided to increase the scientific return of a mission.

Therefore, additional resources were made available to develop a freely distributable data processing system. The goal was to provide a single "cradle to grave" data analysis system, supporting the needs of both the project team and the general astronomical community: starting from early instrument-level tests, covering pre-launch system operational verification tests, the check-out and performance verification phase, operations, and post-operations, and finishing with the population of the Herschel legacy archive.

We will summarise the lessons learned during those ten years of Herschel data processing development, address the main challenges of this major software development project, reflect on what was planned and went right, and where the plans needed to be adapted to better address the needs of the widely distributed Herschel Science Ground Segment Consortium.

O26: Searching for secondary photometric standards for wide FOV and fast transients observations

Pozanenko, Alexei Space Research Institute
Vovchenko, A. Institute of Informatics Problems
Volnova, A. Sternberg Astronomical Institute, Moscow State University
Denisenko, D. Sternberg Astronomical Institute, Moscow State University
Kalinichenko, L. Institute of Informatics Problems
Skvortsov, N. Institute of Informatics Problems
Stupnikov, S. Institute of Informatics Problems
Koupriyanov, V. Central Astronomical Observatory (Pulkovo)

We discuss an approach to the automatic selection of secondary photometric standards. The problem of photometric standards arises in particular in wide field-of-view, real-time photometric observations such as those of gamma-ray bursts (GRBs). The quality of the selected standards affects the photometric light curve assembled from values obtained by different observatories. This is particularly important since, for most afterglows, no subsequent cross-calibration is performed and the photometric light curve data are collected from the original GCN network publications. Sources cross-identified in various catalogs qualify as photometric standards if, among other criteria, they are point-like, non-variable stars with non-extreme color terms and negligible proper motions.
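The selection criteria listed above amount to a per-source filter. A minimal sketch, with invented field names and thresholds (the mediator's actual rules are specified declaratively, not hard-coded like this), might look like:

```python
def is_secondary_standard(src, max_pm_mas_yr=10.0, max_color=1.5):
    """Toy per-source filter; field names and thresholds are invented."""
    return (src["point_like"]                         # point-like object
            and not src["variable"]                   # non-variable star
            and abs(src["color_index"]) <= max_color  # non-extreme color term
            and src["proper_motion_mas_yr"] <= max_pm_mas_yr)  # negligible proper motion
```

In the mediator architecture, such predicates are expressed over the conceptual schema, so the same selection runs unchanged whichever underlying catalogs supply the source attributes.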

For the selection of appropriate secondary photometric standards, a conceptual approach to solving the problem has been applied. The main distinguishing feature of the approach is the formulation of the problem in terms of the application domain, independently of particular resources (catalogs, services, photometric systems). Classes of application objects and functions expressing the data transformations required to solve the problem are specified declaratively at the mediator layer, which provides virtual integration of heterogeneous information resources, hiding them from the application. Specifically, the conceptual problem-solving approach is implemented in a hybrid architecture combining mediation support facilities with AstroGrid. The mapping of resource schemas into the mediator metadata is performed in accordance with the global/local-as-view data integration approach.

For the secondary-standards search problem, two subject mediators have been specified. One performs the actual standards selection, while the other, auxiliary one eliminates variable stars during the selection process. The set of real catalogs used can be changed without changing the specifications of the mediators and supporting tools. Currently, the catalogs used include SDSS, 2MASS, USNO-A2.0, USNO-B1.0, ASAS, GSC, UCAC, GCVS, VSX, NVSS, and NED. The secondary-standards search is organized in real time as a follow-up to a GRB event notification and is used in GRB follow-up networks.

O27: Using Firefly Tools to enhance archive web pages

Roby, William Caltech/IPAC
Ly, Loi Caltech/IPAC
Wu, Xiuqin Caltech/IPAC
Goldina, Tatiana Caltech/IPAC

Astronomy web developers are looking for fast and powerful HTML5/AJAX tools to enhance their web archives. We are exploring ways to make this easier for the developer. How could you have a full FITS visualizer, or a Web 2.0 table that supports paging, sorting, and filtering, in your web page within 10 minutes? Can it be done without installing any software or maintaining a server?

Firefly is a powerful, configurable system for building web-based user interfaces to access astronomy science archives. It has been in production for the past 3 years. Recently we have made some of the advanced components available through very simple JavaScript calls. This allows a web developer without any significant knowledge of Firefly to put FITS visualizers, tables, and spectrum plots on their web pages with a minimal learning curve. Because we use cross-site JSONP, installing a server is not necessary. Web sites that use these tools can be created in minutes.

We will give a brief overview of Firefly Tools and then demonstrate how to use these components.

We are using Firefly to serve data for several projects, both ground and space based. These projects include Spitzer, Planck, WISE, PTF, LSST, and others. The similarities between the different archive user interfaces greatly reduce the learning curve and enhance the user experience of the archive systems. Firefly was created at IRSA, the NASA/IPAC Infrared Science Archive.

O28: Feedback on astronomical application development for mobile devices

Schaaff, Andre CDS, CNRS, Observatoire astronomique de Strasbourg
Boch, Thomas CDS, UDS, Observatoire astronomique de Strasbourg
Fernique, Pierre CDS, UDS, Observatoire astronomique de Strasbourg
Houpin, Romain Université de Lorraine
Kaestlé, Vincent Université de Strasbourg
Royer, Maxime Université de Lorraine
Scheffmann, Julien Université de Lorraine
Weiler, Alexandre Université de Lorraine

Within a few years, smartphones have become the standard for mobile telephony, and we are now witnessing the rapid development of Internet tablets. These mobile devices have hardware powerful enough to run increasingly complex applications. In the field of astronomy it is possible to use these tools to access data via a simple browser, but also to develop native applications reusing libraries (Java for Android, Objective-C for iOS) developed for desktops. We have been working for two years on mobile application development and now have skills in native iOS and Android development, Web development (especially HTML5, JavaScript, CSS3), and conversion tools (PhoneGap) from Web developments to native applications. The biggest change comes from the human/computer interaction, which is radically altered by the use of multitouch. This interaction requires a redesign of interfaces to take advantage of new features (simultaneous selections in different parts of the screen, etc.). In the case of native applications, distribution is usually done through online stores (App Store, Google Play, etc.), which gives visibility to a wider audience. Our approach has been to test hardware and to develop prototypes as well as operational applications. Native application development is costly in development time, but the possibilities are broader: it is possible, for example, to use the gyroscope and the accelerometer to point at an object in the sky. Web developments depend on the browser, and rendering and performance are often very different from one browser to another. It is also possible to convert Web developments into native applications, but currently it is better to restrict this approach to applications that are light in terms of functionality. Developments in HTML5 are promising but still lag far behind what is available on desktops. HTML5 has the advantage of allowing developments independent of the evolution of the mobile platforms ("write once, run anywhere").
The arrival of Windows 8, supported on desktops and Internet tablets as well as in a mobile version for smartphones, will further expand the family of native systems; this will increase the interest of Web development. In 2010 we started to develop for Android with a prototype based on VizieR Mine. In 2011 we developed SkySurveys, which reuses the HEALPix Java libraries from Aladin and lets the user navigate through surveys; it also uses the OpenGL library, which provides a display roughly 30 times faster. In 2012 we developed SkyObjects natively for iOS and Android. This application provides information about astronomical objects, stores them locally together with the user's own information, points to the objects in the sky, etc. We have also tested the same type of application with HTML5, and we are working to improve its display performance. SkyObjects is in the submission process on Google Play and the App Store; SkySurveys is available directly from the CDS Web pages. We will present these developments with a critical eye and give a demonstration.

O29: Enhancing Science During the Rapid Expansion of IRSA

Teplitz, Harry IRSA/CalTech
Team, The IRSA IRSA/CalTech

The NASA/IPAC Infrared Science Archive is undergoing a rapid expansion in the portfolio of missions for which we serve data, including the recent additions of the Spitzer Heritage Archive, the WISE archive, and NASA's Planck archive. This expansion has led to more than an order of magnitude increase in data holdings within a few years, and the inclusion of catalogs of up to 10 billion rows. As a result, IRSA has a special opportunity to grow its services, supporting not only individual IR missions but also optimizing the synergy between them. We discuss the ways in which we enhance the science return from multiple IR data sets, including data discovery, moving object "precovery", and image cross-comparison.

O30: The Astropy Project: A Community Python Library for Astrophysics

Tollerud, Erik Yale University
Greenfield, Perry STScI
Robitaille, Thomas MPIA

I will introduce and describe progress on Astropy, a large, community effort to provide common astronomy/astrophysics utilities and promote reuse of software. It is based on a model of a collaborative open-source core package (currently under heavy development) and independent but affiliated packages contributed by individuals or organizations. I will describe some of the features in the current core package, the organizational structure of the community, and the direction the project is headed in the near future.

O31: Redefining the Data Pipeline Using GPUs

Warner, Craig University of Florida
Gonzalez, Anthony University of Florida
Eikenberry, Stephen University of Florida
Packham, Christopher University of Texas at San Antonio

There are two major challenges facing the next generation of data processing pipelines: 1) handling an ever increasing volume of data as array sizes continue to increase and 2) the desire to process data in near real-time to maximize observing efficiency by providing rapid feedback on data quality. Combining the power of modern graphics processing units (GPUs), relational database management systems (RDBMSs), and extensible markup language (XML) to re-imagine traditional data pipelines will allow us to meet these challenges.

Modern GPUs contain hundreds of processing cores, each of which can process hundreds of threads concurrently. Technologies such as Nvidia's Compute Unified Device Architecture (CUDA) platform and the PyCUDA module for Python allow us to write parallel algorithms and easily link GPU-optimized code into existing data pipeline frameworks. This approach has produced speed gains of over a factor of 100 compared to CPU implementations for individual algorithms, and overall pipeline speed gains of a factor of 10-25 compared to traditionally built data pipelines for both imaging and spectroscopy (Warner et al., 2011).

However, there are still many bottlenecks inherent in the design of traditional data pipelines. For instance, file input/output of intermediate steps is now a significant portion of the overall processing time. In addition, most traditional pipelines are not designed to be able to process real-time data on-the-fly.

We present a model for a next-generation data pipeline that has the flexibility to process data in near real-time at the observatory as well as to automatically process huge archives of past data by using a simple XML configuration file. XML is ideal for describing both the dataset and the processes that will be applied to the data. Metadata for the datasets would be stored using an RDBMS (such as MySQL or PostgreSQL), which could be easily and rapidly queried, and file I/O would be kept to a minimum. We believe this redefined data pipeline will be able to process data in near real-time at the telescope, concurrent with continuing observations, thus maximizing precious observing time and optimizing the observational process in general. We also believe that using this design, it is possible to obtain a speed gain of a factor of 30-40 over traditional data pipelines when processing large archives of data.
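To make the idea of an XML-driven pipeline concrete, here is a minimal sketch of how a configuration file could describe a dataset and the ordered processing steps applied to it. The tag names, attributes, and step names below are invented for illustration; they are not the schema used by the authors.

```python
import xml.etree.ElementTree as ET

# Hypothetical pipeline configuration. The element and attribute names
# are illustrative only, not an actual pipeline schema.
CONFIG = """
<pipeline name="imaging">
  <dataset instrument="example-camera" mode="imaging"/>
  <process name="bias_subtract"/>
  <process name="flat_field"/>
  <process name="stack" args="median"/>
</pipeline>
"""

# Registry mapping step names to (placeholder) processing functions.
STEPS = {
    "bias_subtract": lambda data, args: data,  # would subtract a bias frame
    "flat_field":    lambda data, args: data,  # would divide by a flat field
    "stack":         lambda data, args: data,  # would combine exposures
}

def run_pipeline(xml_text, data):
    """Parse the XML description and apply each <process> step in order."""
    root = ET.fromstring(xml_text)
    applied = []
    for step in root.findall("process"):
        name = step.get("name")
        data = STEPS[name](data, step.get("args"))
        applied.append(name)
    return data, applied

_, applied = run_pipeline(CONFIG, data=[])
print(applied)  # the processing order is taken entirely from the XML
```

The same configuration file could then drive either real-time processing at the telescope or a bulk reprocessing of an archive, with only the `<dataset>` element changing.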

O32: Spectral Line Selection in the ALMA Observing Tool

Williams, Stewart UK Astronomy Technology Center
Bridger, Alan UK Astronomy Technology Center

ALMA is the world's most powerful ground-based telescope offering views of the submm/mm universe with unparalleled resolution and flexibility in terms of receiver, front-end and back-end configuration. Proposals making full use of ALMA capabilities are complex, yet despite their complexity they should be easily constructed by scientists inexperienced in submm/mm radio astronomy.

The high level of flexibility offered by the ALMA front-end creates numerous pitfalls for the inexperienced radio astronomer, with many ways to specify a proposal with an invalid instrument configuration. In general, previous proposal preparation tools have either validated the proposal just prior to submission, or let the user submit potentially infeasible spectral setups. The ALMA Observing Tool provides a new perspective, assisting the user by being aware of the hardware limitations and steering the proposal towards valid configurations and away from invalid spectral setups.

In this presentation I will discuss the spectral line selection implementation used by the ALMA Observing Tool and the real-time filtering techniques that let the user identify their primary target lines and locate 'lines of opportunity', all while preventing the user from constructing invalid proposals. I will also discuss the challenges in the implementation and in connecting to virtual observatory spectral line services.
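The core of such real-time filtering can be sketched as a simple feasibility check: given the frequency coverage of a receiver band, keep only the lines a valid setup could observe. The band limits, line list, and interface below are simplified examples for illustration and do not reflect ALMA's actual band definitions or the Observing Tool's internals.

```python
# Example receiver band coverage in GHz (illustrative values only).
BAND_COVERAGE_GHZ = {3: (84.0, 116.0), 6: (211.0, 275.0)}

# A tiny example line list with rest frequencies in GHz.
LINES_GHZ = {
    "CO(1-0)": 115.271,
    "CO(2-1)": 230.538,
    "HCN(1-0)": 88.632,
}

def redshifted(rest_ghz, z):
    """Sky frequency of a line observed at redshift z."""
    return rest_ghz / (1.0 + z)

def observable_lines(band, z=0.0):
    """Filter the line list to lines falling inside the band coverage,
    i.e. lines for which a valid spectral setup exists."""
    lo, hi = BAND_COVERAGE_GHZ[band]
    return sorted(name for name, rest in LINES_GHZ.items()
                  if lo <= redshifted(rest, z) <= hi)

print(observable_lines(3))          # lines reachable in Band 3 at z = 0
print(observable_lines(6, z=0.05))  # CO(2-1) shifts into Band 6 coverage
```

A tool applying this check interactively, as the user edits the target redshift or band, can steer the proposal toward valid configurations instead of rejecting an invalid setup at submission time.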

O33: An Observation-Centric View of the Chandra Data Archive

Winkelman, Sherry SAO
Rots, Arnold SAO

The Chandra Data Archive (CDA) plays a crucial role in the Chandra X-ray Center (CXC), which manages the operations of the observatory. The archive contains more than Chandra data: it contains proposals submitted to the observatory; scheduling information; observation parameters; a complete processing history of all data; and a bibliography. The archive operations group maintains a search-and-retrieve database which records who searches for data, who downloads data, and where the data go. It is the data that tie these disparate snapshots of the archive together. So what significant events occur to the data during its lifetime? Can they tell us anything about the science impact of the archive? Can they provide a new way of revealing the data to the astronomical community? In the end, a data-centric view of the CDA can help us visualize how our data fit into the astronomical community and can provide insight into how the archive can best serve the science needs of the community.

The CDA has identified and recorded a number of key dates which occur for every dataset: acceptance of a proposal; completion of an observation; data delivery; and processing history. Other logged events occur for some or most of the data: grouping data into aggregates; monitoring the distribution of data; and monitoring the publication history in press releases, journals, and proceedings. What information can be gleaned by looking at this metadata in an observation-centric way? We can build timelines which show how public release of data, press releases, reprocessing of data, and publications affect the download of data. We can build word clouds from the proposal and publication abstracts linked to the data, to get a visual feel for what a dataset is and perhaps to provide new parameters for searching for data. We can build distribution maps of where the data go, which might help scientists identify regional areas of expertise on a subject. Examining which key events haven't yet occurred to data could lead to developing tools and interfaces which encourage the use of that data in research. This presentation will offer an observation-centric view of Chandra data and will seek to address some of the questions raised in this abstract. This work is supported by NASA contract NAS8-03060.
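The word-cloud idea amounts to weighting terms by their frequency across the abstracts linked to a dataset. A toy sketch of that weighting step is below; the abstract text, the stopword list, and the function names are invented for illustration.

```python
from collections import Counter
import re

# A minimal stopword list; a real implementation would use a fuller one.
STOPWORDS = {"the", "of", "a", "and", "in", "we", "to"}

def word_weights(abstracts):
    """Count word frequencies across linked abstracts, skipping stopwords.
    The resulting counts would set font sizes in a word cloud, or feed
    a keyword index for dataset search."""
    words = re.findall(r"[a-z]+", " ".join(abstracts).lower())
    return Counter(w for w in words if w not in STOPWORDS)

# Invented example text standing in for linked proposal abstracts.
weights = word_weights([
    "Deep observation of the cluster merger",
    "X-ray observation of cluster cooling flows",
])
print(weights.most_common(3))
```

The same counts could be stored back into the metadata database as derived search parameters, alongside the observation's recorded key dates.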

O34: The LOFAR Data System: An Integrated Observing, Processing, and Archiving Facility

Wise, Michael W. ASTRON (Netherlands Institute for Radio Astronomy)
Nijboer, Ronald J. ASTRON (Netherlands Institute for Radio Astronomy)
Holties, Hanno A. ASTRON (Netherlands Institute for Radio Astronomy)

LOFAR, the Low Frequency Array, is a next-generation radio interferometer constructed in the north of the Netherlands and across Europe. Utilizing a novel phased-array design, LOFAR covers the largely unexplored low frequency range from 10 to 240 MHz and has been designed to support a broad range of science capabilities. These include deep, all-sky radio surveys, searches for highly redshifted 21cm line emission from the Epoch of Reionization, surveys of pulsars and cosmic radio transients, and the detection of ultra-high energy cosmic rays. With its dense core array and long interferometric baselines, LOFAR achieves unparalleled sensitivity and spatial resolution in the low-frequency radio regime. Digital beam-forming techniques make the LOFAR system agile and allow for rapid repointing of the telescope as well as the potential for multiple simultaneous observations. Supporting such a wide range of observing capabilities, however, represents a data management challenge due to both the size and complexity of the scientific data produced by the system. For LOFAR, the raw system telemetry can exceed 15 Tbits/s corresponding to potentially ~1 exabyte (EB) of accumulated science data per week.

The LOFAR data system has been designed to reduce this large, raw data stream to a set of well-defined scientific data products and make them accessible to the user. It tightly couples the ability to configure and schedule a wide range of observing modes with a set of scientific processing pipelines and an active long-term archive (LTA) system. LOFAR is one of the first radio observatories to feature automated processing pipelines that deliver fully calibrated science products to its user community. The LOFAR LTA system has been designed to provide not only the traditional search-and-retrieve functionality that astronomers have come to expect, but also to serve as an additional processing resource, either during standard processing or via user-initiated post-processing. In this talk, we present an overview of the full LOFAR data system. We describe its current capabilities and limitations as well as plans for additional functionality to be added through ongoing development. Finally, we will review some of the issues of scientific data management at these scales and discuss how LOFAR can serve as an important pathfinder for the data challenges facing the SKA (Square Kilometre Array) and data-intensive astronomy in the coming decade.

O35: DrizzlePac: Managing Multi-component WCS solutions for HST Data

Hack, Warren Space Telescope Science Institute
Dencheva, Nadezhda Space Telescope Science Institute
Fruchter, Andrew Space Telescope Science Institute

Calibration of the geometric distortion of HST instruments includes up to three separate distortion components to be used in conjunction with the WCS information. Managing and applying these separate components efficiently required merging multiple FITS conventions into a single WCS representation, including the full distortion model, that is stored in the FITS header itself. The capabilities of this multi-component WCS already simplify how HST images are aligned and combined by users, based on calibrations with improved accuracy, while headerlets have the potential to allow alignment solutions to be shared more easily within the astronomical community. The logic implemented to combine these FITS conventions is described in this presentation. The DrizzlePac Python package now serves as a practical demonstration of how this new logic works with real HST data and shows how this set of tools provides all the pieces necessary for managing and applying these highly accurate, complex WCS representations with minimal effort.
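The essence of a multi-component WCS is function composition: each distortion component is a correction applied to pixel coordinates before the linear WCS transform. The sketch below shows only this composition idea; the polynomial coefficients, the lookup stand-in, and the linear transform values are invented, whereas the real HST model combines SIP polynomials with residual lookup tables stored via several FITS conventions.

```python
# Each distortion component maps corrected pixel coords -> pixel coords.
def poly_distortion(x, y):
    """Toy low-order polynomial distortion correction (invented terms)."""
    return x + 1e-6 * x * y, y - 1e-6 * x * x

def lookup_distortion(x, y):
    """Stand-in for a small residual correction from a lookup table."""
    return x + 0.01, y - 0.02

def linear_wcs(x, y, crval=(150.0, 2.0), scale=1.4e-5):
    """Toy linear pixel -> sky transform, in degrees (invented values)."""
    return crval[0] + scale * x, crval[1] + scale * y

def pixel_to_sky(x, y):
    """Apply every distortion component in order, then the linear WCS.
    This chaining is what a single merged WCS representation captures."""
    for correct in (poly_distortion, lookup_distortion):
        x, y = correct(x, y)
    return linear_wcs(x, y)

ra, dec = pixel_to_sky(512.0, 512.0)
print(ra, dec)
```

Storing the full chain in one header means any consumer of the file applies exactly the same sequence of corrections, which is what makes shared alignment solutions (headerlets) portable.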