
Data Products

The Rubin Data Products, Abridged

Contact author: Melissa Graham, Lead Community Scientist for Rubin Observatory

Date: August 2022

 

This webpage supplies a brief, informal summary of the planned Rubin Observatory data products and analysis tools, and outlines the boundary between what Rubin Observatory will provide and what will be left to the expertise of the science community.

 

These plans are subject to change. The Data Products Definitions Document (DPDD; ls.st/dpdd) and the publication “LSST: From Science Drivers to Reference Design and Anticipated Data Products” (Ivezić et al. 2019) remain the ultimate reference for descriptions of the planned LSST data products and pipelines.

 

The information on this webpage is also available as a slide deck on Zenodo (DOI:10.5281/zenodo.7011229), and as a recorded presentation. As of August 19, 2022, the Zenodo slide deck had been updated to the 2022 version (with the new 80-hour timescale for promptly processed images), and a new recording made and posted below.

 

 

 

1. Introduction

The Rubin Observatory’s LSST Science Pipelines will create the general-use data products and analysis tools that will enable scientists to produce the expected science deliverables in the four science pillars: probing dark energy and dark matter, taking an inventory of the solar system, exploring the transient optical sky, and mapping the Milky Way. The science deliverables are described in the Rubin Science Requirements Document (SRD; ls.st/srd). The general-use data products and analysis tools developed by Rubin staff incorporate algorithms and software that have been designed, built, and validated by the global astronomical community, and represent a culmination of shared knowledge and expertise.

Producing the expected LSST science deliverables - and pushing into new scientific frontiers - will also require the development of specialized algorithms, data products, analysis tools, and cyberinfrastructure that go beyond what will be provided by Rubin Observatory. This considerable amount of work is best left to the specific expertise of the science community, and the independent LSST Science Collaborations are driving this development.

 

2. Transients, Variables, and Moving Objects

Processing Pipelines: The data products for transients, variables, and moving objects will be primarily produced by the Prompt Processing pipelines, which will perform reduction, calibration, difference image analysis (DIA), source detection and measurement, and alert distribution within 60 seconds of image readout. Solar System Processing for moving objects will take place during the day. Images that result from Prompt Processing will be available after 80 hours, and catalogs after 24 hours (as itemized in the lists below); both are fully described in Section 3 of the DPDD. All DIA data products will be re-generated during the annual Data Release Processing. Source detection and measurement on direct images (i.e., non-difference images) will only be done during the annual Data Release Processing.

Alert Packets are ascii files containing data for a single detected source in a difference image (DIASource); they will include catalog data and small cutouts of the difference and template images.

Images

  • processed visit images (PVIs or “direct images”; 80 hours and annually)

  • difference images (template-subtracted; 80 hours and annually)

  • template images (transient-free annual stacks; annually)

 

Catalogs

  • sources detected with SNR>5 via difference image analysis (DIA), and associated forced photometry: the DIASource, DIAObject, and DIAForcedSource tables (24 hours and annually)

  • DIASources linked as moving objects in the solar system (SS) and their orbital parameters: the SSSource, SSObject, and MPCORB tables (24 hours and annually)

  • sources detected with SNR>5 in PVIs, and associated forced photometry: the Source, Object, and ForcedSource tables (annually)
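
The SNR>5 threshold in the list above can be illustrated with a minimal sketch. It assumes SNR is simply the absolute flux divided by its 1-sigma uncertainty; the Detection class and its field names are hypothetical and are not the LSST table schema, and the pipelines' actual detection statistic may differ.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    # Hypothetical record; these are not LSST column names.
    source_id: int
    flux: float       # measured flux, arbitrary units
    flux_err: float   # 1-sigma uncertainty, same units as flux


def snr(det: Detection) -> float:
    """Signal-to-noise ratio, taken here as |flux| / flux_err.

    The absolute value is used because a source in a difference
    image can be fainter than in the template (negative flux).
    """
    return abs(det.flux) / det.flux_err


def select_detections(dets, threshold=5.0):
    """Keep detections whose SNR exceeds the threshold."""
    return [d for d in dets if d.flux_err > 0 and snr(d) > threshold]


candidates = [
    Detection(1, 100.0, 10.0),   # SNR 10, kept
    Detection(2, 40.0, 10.0),    # SNR 4, dropped
    Detection(3, -80.0, 10.0),   # SNR 8 (negative flux), kept
]
kept_ids = [d.source_id for d in select_detections(candidates)]
```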

 

Catalog contents will include:

  • unique identifiers (IDs)

  • measurements (e.g., coords, flux, mag, date/time, shape, size, PSF fit, proper motion, parallax)

  • IDs of nearby LSST static-sky catalog objects (i.e., host association)

  • orbital parameters derived by the Minor Planet Center (MPC)

  • time variability parameters (limited; to be determined with community input; ls.st/dmtn-118)

  • pre-discovery ("precovery") PSF photometry in difference images
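
As an illustration of how the flux and mag measurements listed above relate, the sketch below converts a flux to an AB magnitude. It assumes fluxes are expressed in nanojanskys, which this page does not state; the function name is hypothetical. The AB zero point of 3631 Jy is the standard definition.

```python
import math

AB_ZEROPOINT_JY = 3631.0   # flux density of a 0th-magnitude AB source
NJY_PER_JY = 1.0e9         # nanojanskys per jansky


def njy_to_ab_mag(flux_njy: float) -> float:
    """Convert a flux in nJy (assumed unit) to an AB magnitude."""
    if flux_njy <= 0:
        # Magnitudes are undefined for non-positive flux, which is
        # one reason catalogs often carry fluxes as well as mags.
        raise ValueError("flux must be positive to define a magnitude")
    return -2.5 * math.log10(flux_njy / (AB_ZEROPOINT_JY * NJY_PER_JY))


mag_of_one_njy = njy_to_ab_mag(1.0)   # close to 31.4
```

A factor of 100 in flux corresponds to exactly 5 magnitudes, so brighter sources have numerically smaller magnitudes.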

 

Examples of additional specialized algorithms, data products, and analysis tools that will be left to the expertise of the science community include, but are not limited to: photometric and spectroscopic follow-up observations; object classifications (e.g., light-curve types, astronomical categorization); cyberinfrastructure for the large-scale acquisition, processing, and analysis of follow-up; cross-matching to non-LSST catalogs; host-galaxy confirmation (e.g., distinguishing faint or blended hosts); orbital and/or time-variability parameters beyond what is in the LSST tables; light-curve parameters (e.g., rise/fall times, peak brightness, asteroid rotation rates); shifted-and-stacked images (e.g., to detect faint moving objects); multi-night stacks or difference images (e.g., to detect fainter objects); physical parameters (e.g., redshift, distance, host extinction, composition, intrinsic magnitude); and event occurrence rates (e.g., volumetric rates).
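
To give a sense of the community-side work described above, here is a minimal sketch of deriving two light-curve parameters (peak brightness and rise time) from a list of flux measurements. The function name and the definitions used (peak as the maximum observed flux, rise time as the interval from the first measurement to the peak) are simplifying assumptions for illustration; real analyses typically fit a model to noisy, unevenly sampled data.

```python
def peak_and_rise_time(times, fluxes):
    """Return (peak_flux, rise_time) for a simple light curve.

    times  : observation times (e.g., days), any order
    fluxes : flux at each time, same length as times
    """
    if len(times) != len(fluxes) or len(times) == 0:
        raise ValueError("times and fluxes must be non-empty and equal length")
    # Sort by time so the first element is the earliest measurement.
    points = sorted(zip(times, fluxes))
    t_first = points[0][0]
    # Earliest point with the highest flux is taken as the peak.
    t_peak, f_peak = max(points, key=lambda p: p[1])
    return f_peak, t_peak - t_first


peak, rise = peak_and_rise_time([0.0, 2.0, 5.0, 9.0], [10.0, 30.0, 50.0, 20.0])
```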

 

3. Static-Sky Objects (Stars and Galaxies)

Processing Pipelines: The data products for static-sky objects (stars and galaxies) will be primarily produced by the Data Release Processing pipelines, which will reduce, calibrate, and combine (i.e., stack, coadd) all LSST images, and detect, measure, and characterize sources in both direct and “deep coadded” images. Images and catalogs that result from Data Release Processing will be available annually, and are fully described in Section 4 of the DPDD.

Images

  • processed visit images (PVIs or “direct images”; 80 hours and annually)

  • deep CoAdds (stack of all LSST images; one per filter; annually)

 

Catalogs

  • sources detected with SNR>5 in PVIs, and associated forced photometry: the Source, Object, and ForcedSource tables (annually)

  • forced photometry in PVIs at the location of all Objects: the ForcedSource tables (annually)

 

Catalog contents will include:

  • unique identifiers (IDs)

  • measurements (e.g., flux, mag, color, date/time, shape, size, PSF fit, proper motion, parallax)

  • centroids and adaptive moments

  • Petrosian and Kron fluxes

  • deblending parameters (e.g., parent/child associations; priors for crowded fields)

  • model fits (e.g., point-source, bulge-disk)

  • aperture surface brightness measurements

  • photometric redshift (PZ) estimates (community-vetted algorithm TBD; see ls.st/dmtn-049)

  • local shear estimation measures
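
Shape measurements such as the moments listed above are commonly summarized as an ellipticity. The sketch below uses one standard convention (sometimes called the distortion) computed from second moments; it is not necessarily the convention or the shear estimator that Rubin will adopt, and the function and argument names are hypothetical.

```python
def ellipticity_from_moments(ixx: float, iyy: float, ixy: float):
    """Return (e1, e2) from second moments of an object's light profile.

    e1 = (Ixx - Iyy) / (Ixx + Iyy)
    e2 = 2 * Ixy / (Ixx + Iyy)
    """
    trace = ixx + iyy
    if trace <= 0:
        raise ValueError("Ixx + Iyy must be positive")
    e1 = (ixx - iyy) / trace
    e2 = 2.0 * ixy / trace
    return e1, e2


round_object = ellipticity_from_moments(2.0, 2.0, 0.0)      # circular
elongated_in_x = ellipticity_from_moments(3.0, 1.0, 0.0)    # stretched along x
```

A circular object gives (0, 0); an object elongated along the x axis gives a positive e1 and zero e2.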

 

Examples of additional specialized algorithms, data products, and analysis tools that will be left to the expertise of the science community include, but are not limited to: alternative types of deeply stacked coadded images (e.g., intermediate timescales, multi-band, best-seeing); specialized deblending algorithms (e.g., for crowded fields); probabilistic photometry catalogs (e.g., for crowded fields); stellar types or physical parameters (e.g., metallicity); Milky Way component associations (e.g., disk/bulge/halo stars); specialized low-surface brightness measurements; galaxy PZ or physical parameters (e.g., star formation rates) beyond those from the adopted PZ algorithm; galaxy shear estimates beyond those provided by the adopted shear algorithm; other galaxy characterization (e.g., AGN, interacting galaxies, group or cluster membership, morphological classifications); cyberinfrastructure to support large-scale compute-intensive processing (e.g., wide-area joint pixel analyses with non-LSST data sets or image reprocessing, cosmological simulations).

 

4. Compute Resources and User-Generated Data Products

In order for scientists to access and analyze the LSST data, Rubin Observatory will provide the Rubin Science Platform (RSP). The RSP is a set of integrated web-based applications and services running at the Rubin Observatory Data Access Centers (DACs), which will include tools to query, visualize, subset, and analyze the full LSST data archives in a stable software environment located “next-to-the-data,” with storage space and compute resources for user-generated data products.

User-Generated Data Products refers collectively to the specialized data products that will be generated by the science community. These will be created and stored using suitable Application Programming Interfaces (APIs) that will be provided as part of the RSP. Users and groups will be able to maintain access control over the data products they create, enabling them to have limited distribution or to be shared with the entire Rubin Observatory community.
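
The access-control idea described above (private, limited distribution, or shared with the whole community) can be pictured with a small data-structure sketch. Everything here is hypothetical: the class, field, and visibility names are invented for illustration and do not represent the RSP APIs.

```python
from dataclasses import dataclass, field
from enum import Enum


class Visibility(Enum):
    PRIVATE = "private"       # owner only
    GROUP = "group"           # owner plus named collaborators
    COMMUNITY = "community"   # everyone with Rubin data access


@dataclass
class UserDataProduct:
    # Hypothetical model of a user-generated data product.
    name: str
    owner: str
    visibility: Visibility = Visibility.PRIVATE
    group_members: set = field(default_factory=set)

    def can_read(self, user: str) -> bool:
        """Decide whether a user may read this product."""
        if user == self.owner:
            return True
        if self.visibility is Visibility.COMMUNITY:
            return True
        if self.visibility is Visibility.GROUP:
            return user in self.group_members
        return False


catalog = UserDataProduct("host_matches", owner="alice",
                          visibility=Visibility.GROUP,
                          group_members={"bob"})
```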

As defined in the Science Requirements Document (SRD; ls.st/srd), the Rubin Data Management System will provide at least 10% of its total capacity for user processing and storage. Scientists will be able to pool their compute resource quotas in order to undertake larger processing jobs. If the compute resources provided by Rubin Observatory are oversubscribed, a “Resource Allocation Committee” will be established.

Due to the unprecedented size of the LSST data set, it is anticipated that some of the additional specialized algorithms, data products, and analysis tools that will be left to the expertise of the science community will require significant external cyberinfrastructure support in addition to the RSP. A few examples include: processing and analyzing follow-up observations for LSST time-domain events; running wide-area joint pixel analyses with non-LSST data sets; building and using frameworks for probabilistic catalogs; iterative development and training for machine learning algorithms; and many other applications in the big-data era of the LSST.

 

5. Additional Information

Data Rights: Please refer to the Rubin Observatory Data Policy, ls.st/rdo-013

Definitions and Acronyms: Please use lsst.org/scientists/glossary-acronyms to search for definitions of any terms or acronyms used in the above descriptions.

Questions? The “Data Q&A” Category of the Rubin Community Forum is dedicated to answering questions about the planned Rubin data products. Forum membership is open to everyone. You must sign up for an account in order to post, but most content is viewable without an account.

To ask a question about Rubin data, go to Community.lsst.org, and under ‘Science’ click on ‘Data Q&A’, which will take you to the ‘Data Q&A’ category where you can see the Q&A that others have posted. At upper right, click on +New Topic, compose your question, then +Create Topic. Rubin staff will answer your question as soon as possible.

 

Financial support for Rubin Observatory comes from the National Science Foundation (NSF) through Cooperative Agreement No. 1258333, the Department of Energy (DOE) Office of Science under Contract No. DE-AC02-76SF00515, and private funding raised by the LSST Corporation. The NSF-funded Rubin Observatory Project Office for construction was established as an operating center under management of the Association of Universities for Research in Astronomy (AURA).  The DOE-funded effort to build the Rubin Observatory LSST Camera (LSSTCam) is managed by the SLAC National Accelerator Laboratory (SLAC).
The National Science Foundation (NSF) is an independent federal agency created by Congress in 1950 to promote the progress of science. NSF supports basic research and people to create knowledge that transforms the future.
NSF and DOE will continue to support Rubin Observatory in its Operations phase. They will also provide support for scientific research with LSST data.   



