LSST E-News

April 2009  •  Volume 2 Number 1

Data Management Boundary Meeting Defines Science Needs

Tim Axelrod, Jeff Kantor and Anna Spitz

Data Management Boundary Meeting Attendees

How do you manage 30 terabytes of data each night and enable cutting-edge science, including some that hasn’t even been defined yet? The LSST data management and science teams convened a meeting February 9-10, 2009, at the University of California, Davis (UC-Davis) to further refine just how to accomplish this enormous, paradigm-shifting task. LSST science requirements are the drivers for defining how data management is structured and carried out. “One of the challenges is that a lot of the science is unpredictable. There is a broad distribution of access and computation needs,” says LSST Director Tony Tyson. Adding to this underlying challenge is the communication challenge: assembling a collaboration strategy to get scientists more involved in how data will be captured, analyzed, moved through, and archived in the LSST system. The UC-Davis meeting met these challenges by refining boundary definitions for products to create a software environment that enables access to LSST data for diverse users, and by instituting steps to improve communication between the architects of the data management system and the data users.

The LSST Science Requirement Document (SRD) specifies a set of science goals for the LSST observing program to achieve. To accomplish this, “the data management system (DMS) must generate, or enable the generation of, a set of data products, and to make them available to scientists and the public.” According to Tim Axelrod, Data Management Scientist, the DMS carries out this mission through the following major functions:

  • The DMS processes the incoming stream of images the camera system generates to produce transient alerts and to archive the raw images.
  • Approximately once per year, the DMS creates and archives a Data Release, a static, self-consistent collection of data products generated from all survey data taken from the date of survey initiation to the cutoff date for that Data Release.
  • The DMS periodically creates new calibration data products that other processing functions will use.
  • The DMS makes all LSST data available through an interface that uses community-based standards and facilitates user data analysis and production of user-defined data products at Data Access Centers (DACs) and external sites.

Attendees tackled the basic needs and expectations of the DM system. Day one focused on the question “How do we go from raw pixels to calibrated databases?” and day two on the charge “How do we make this useful to the world?” With these goals in mind, the group organized discussions around the needs of science collaborations, the data management deliverables, challenges of data collection, and development possibilities beyond the core data management products.

Attendees:
Lee Armus, Tim Axelrod, Jacek Becla, Kirk Borne, M.E. Brown, David Burke, Chuck Claver, Andy Connolly, Roc Cutri, Gregory Dubois-Felsman, Harry Ferguson, Rob Gibson, Oskar Holm, Hu Zhan, Zeljko Ivezic, Suzanne Jacoby, James Jee, Lynne Jones, Steve Kahn, Jeff Kantor, Robert Lupton, Phil Marshall, Ray Plante, Abi Saha, Sam Schmidt, Ryan Scranton, Michael Strauss, Don Sweeney, Tony Tyson, David Wittman, Sidney Wolff

The meeting posed an array of questions concerning details of processing for the science collaborations. Questions ranged from what “facilitates user data analysis and production of user-defined data products” means to how the system will optimize detection of the objects that interest each science collaboration.

LSST collaborators have envisioned data products driven by four core science cases in the SRD: constraining dark energy and dark matter, taking an inventory of the solar system, exploring the transient optical sky, and mapping the Milky Way. These data products are organized into three groups based primarily on where and when they are produced.

Pipeline processing produces Level 1 data products continuously every observing night through a highly automated process. Level 1 data products include exposures, alerts, detected source catalogs, astronomical object catalogs, and nightly summary statistics.

Level 2 data products are generated as part of a Data Release and include products that require significant computation and human interaction because they combine data from many sources. Level 2 data products consist of co-added exposures, calibration products, and more extended catalogs.

Researchers derive Level 3 data products from Level 1 and 2 products to support specific science goals. The Data Management System provides computing capabilities (hardware and software) that enable researchers to create Level 3 data products and, in certain cases, to propose them for elevation to Level 1 or 2.

Level 1 and Level 2 data products that have passed quality control tests are required to be accessible to the public without restriction. Access policies for Level 3 products will be product- and source-specific and in some cases proprietary.
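
The three levels and the access rule described above can be sketched as a small data model. This is only an illustration, not LSST software: the names `Level`, `DataProduct`, `has_unrestricted_public_access`, and the `level3_public` flag are hypothetical, and treating a Level 1 or 2 product that has not yet passed quality control as not publicly accessible is an assumption the article does not state.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Data product groups as described in the article (names are hypothetical)."""
    NIGHTLY = 1        # produced continuously every observing night
    DATA_RELEASE = 2   # generated as part of a Data Release
    USER_DERIVED = 3   # derived by researchers from Level 1 and 2 products


@dataclass(frozen=True)
class DataProduct:
    name: str
    level: Level
    passed_qc: bool = False
    # Only consulted for Level 3, whose access policy is product- and
    # source-specific; this flag stands in for that policy (assumption).
    level3_public: bool = False


def has_unrestricted_public_access(product: DataProduct) -> bool:
    """Apply the access rule stated in the article.

    Level 1 and 2 products that passed quality control must be public.
    Products that have not passed are treated as not public here, which
    is an assumption; the article does not say.
    """
    if product.level in (Level.NIGHTLY, Level.DATA_RELEASE):
        return product.passed_qc
    return product.level3_public


alert = DataProduct("transient alert", Level.NIGHTLY, passed_qc=True)
coadd = DataProduct("co-added exposure", Level.DATA_RELEASE, passed_qc=False)
custom = DataProduct("collaboration catalog", Level.USER_DERIVED)

for product in (alert, coadd, custom):
    print(product.name, has_unrestricted_public_access(product))
```

Modeling the Level 3 policy as a single flag keeps the sketch short; a real policy would likely need to record the product's source and any proprietary period.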

Science collaborators left the meeting with the charge to take the list of ten science projects for each collaboration and consider how they would perform the analyses, which LSST-provided data products they need, and what processing and support they require. Science collaborations are to report back in one month. Combining this information with Axelrod’s summary document about the system, the DM group will be able to produce a computing and storage estimate and create a table of the total resources needed to achieve everything the science collaborations want to do with LSST’s massive amounts of data.

LSST is a public-private partnership. Funding for design and development activity comes from the National Science Foundation, private donations, grants to universities, and in-kind support at Department of Energy laboratories and other LSSTC Institutional Members:

Brookhaven National Laboratory; California Institute of Technology; Carnegie Mellon University; Columbia University; Google, Inc.; Harvard-Smithsonian Center for Astrophysics; Johns Hopkins University; Kavli Institute for Particle Astrophysics and Cosmology - Stanford University; Las Cumbres Observatory Global Telescope Network, Inc.; Lawrence Livermore National Laboratory; National Optical Astronomy Observatory; Princeton University; Purdue University; Research Corporation for Science Advancement; Rutgers University; SLAC National Accelerator Laboratory; The Pennsylvania State University; The University of Arizona; University of California at Davis; University of California at Irvine; University of Illinois at Urbana-Champaign; University of Pennsylvania; University of Pittsburgh; University of Washington; Vanderbilt University

LSST E-News is a free email publication of the Large Synoptic Survey Telescope Project. It is for informational purposes only, and the information is subject to change without notice.

Copyright © 2009 LSST Corp., Tucson, AZ • www.lsst.org