LSST E-News

October 2012  •  Volume 5 Number 2

Searching for Answers in all the Right Places

The overlapping partitioning concept developed by the LSST database team enables efficient searching of enormous databases by allowing neighboring objects to be found without the time-consuming process of searching multiple partitions. (Graphic: Emily Acosta, LSST)

The Large Synoptic Survey Telescope (LSST) database team has developed an innovative “overlapping partitioning” method for storing enormous amounts of information for rapid access. By overlapping equally sized packets of information in the partitioned sphere, searching for nearest neighbor sources becomes quick and efficient. Further, the technique has been shown to work just as efficiently with increasingly complex systems. The improved algorithms resulting from this innovative architecture will be available as open source software that can be used by a broad spectrum of fields to transform access to large databases.

The graphic illustrates how the LSST database will be partitioned into equal-area chunks, each storing a roughly equal number of astronomical sources. This concept of “spherical partitioning” is crucial for making searches of the massive LSST data set practical. However, the typical method of partitioning data into chunks locates neighbors quickly only when they sit in the same partition. As a result, finding the objects near an object that lies close to a partition’s edge sometimes requires the time-consuming process of searching other partitions.
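As a rough illustration of equal-area chunking, the sketch below maps a sky position to a chunk number. It is not the LSST team's scheme: the function name `chunk_id`, the grid sizes, and the choice of bands spaced evenly in the sine of the declination are all assumptions made for this example. The only fact it relies on is that bands of equal height in sin(dec) cover equal areas of a sphere.

```python
import math

def chunk_id(ra_deg, dec_deg, n_bands=18, n_per_band=36):
    """Map a sky position to a chunk index in a simple equal-area grid.

    Hypothetical layout for illustration only: n_bands bands in declination,
    each split into n_per_band chunks in right ascension.
    """
    # Equal steps in sin(dec) give bands of equal area on the sphere.
    z = math.sin(math.radians(dec_deg))  # ranges over [-1, 1]
    band = min(int((z + 1.0) / 2.0 * n_bands), n_bands - 1)
    # Equal steps in right ascension split each band into equal-area chunks.
    seg = min(int((ra_deg % 360.0) / 360.0 * n_per_band), n_per_band - 1)
    return band * n_per_band + seg

# Example: chunk numbers for the south pole and a point on the equator.
south_pole_chunk = chunk_id(0.0, -90.0)
equator_chunk = chunk_id(0.0, 0.0)
```

With a roughly uniform density of sources, each chunk in such a grid would hold a similar number of rows, which is the property the article describes.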

The new approach developed by LSST researchers partitions the data with overlaps. The expanded graphic’s inset shows the overlapping characteristic: extracted partition 509 contains all sources within chunk 509 (including the red sphere) plus those that also appear on the edges of neighboring chunks (yellow sphere). Because of this overlap, nearest-neighbor searches still benefit from partitioning: objects up to a certain distance away can be found without contacting other partitions. Combining a partitioning scheme with overlapping partition edges means the enormous LSST database can be searched efficiently.
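The overlap idea can be sketched in a few lines. This is a simplified illustration, not the LSST implementation: it assumes a flat plane divided into square cells rather than a sphere, and the names `build_partitions` and `neighbors_within` are hypothetical. Each cell stores its own points plus copies of any neighboring points that fall within the overlap margin, so a search with a radius no larger than the overlap reads only one cell.

```python
import math
from collections import defaultdict

def build_partitions(points, cell, overlap):
    """Store each (x, y) point in its home cell and in the overlap of nearby cells."""
    own = defaultdict(list)   # points that belong to the cell itself
    halo = defaultdict(list)  # copies of neighbors' points within `overlap` of the cell
    for p in points:
        x, y = p
        home = (math.floor(x / cell), math.floor(y / cell))
        own[home].append(p)
        # Visit every cell whose box, expanded by `overlap`, contains the point.
        for i in range(math.floor((x - overlap) / cell),
                       math.floor((x + overlap) / cell) + 1):
            for j in range(math.floor((y - overlap) / cell),
                           math.floor((y + overlap) / cell) + 1):
                if (i, j) != home:
                    halo[(i, j)].append(p)
    return own, halo

def neighbors_within(p, radius, own, halo, cell, overlap):
    """Find points within `radius` of p by reading only p's home partition."""
    if radius > overlap:
        raise ValueError("radius exceeds the overlap; other partitions would be needed")
    home = (math.floor(p[0] / cell), math.floor(p[1] / cell))
    candidates = own[home] + halo[home]
    return [q for q in candidates if q != p and math.dist(p, q) <= radius]

# Example: two sources on opposite sides of the boundary at x = 1.0.
points = [(0.95, 0.5), (1.02, 0.5), (1.5, 0.5), (0.5, 0.5)]
own, halo = build_partitions(points, cell=1.0, overlap=0.1)
near = neighbors_within((0.95, 0.5), 0.1, own, halo, cell=1.0, overlap=0.1)
```

The trade-off in this sketch is extra storage for the duplicated edge points in exchange for never contacting a second partition, and the guarantee holds only up to the chosen overlap distance, matching the "up to a certain distance" limit described above.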

LSST’s database innovation was featured as an NSF highlight on Research.gov.

Additionally, the researchers have demonstrated that the architecture proposed for the LSST database is linearly scalable in that more nodes can be added without system performance degradation. The test database had a 55-billion-row data set on a 150-node parallel database cluster; the experiment was similar in scale to searching hypothetical satellite-imagery databases for red, convertible sports cars traveling near white, full-size pick-up trucks on any road on Earth!

LSST is also creating a general-purpose data and algorithm-parallel framework that, like these database innovations, will be available as open source software. The project’s open source example will be reusable on any high-performance, parallel scientific application, and as a result, can be leveraged by future projects in many fields of science and engineering, especially those that store spatial information, like maps, and information that changes with time. Domains that will benefit most from LSST’s improved spatial search and storage algorithms include the geosciences (e.g. climate, oceanography, and seismology), medical imaging, and oil and gas exploration. And since applications using data that changes with time, or temporal data, are found in virtually every domain, LSST’s innovations possess the potential to transform large database access in many fields, including the financial sector, the internet (modeling user behavior, fraud detection), climate modeling, drug discovery, healthcare, and many retail applications.

Article written by Suzanne Jacoby and Robert McKercher

LSST is a public-private partnership. Funding for design and development activity comes from the National Science Foundation, private donations, grants to universities, and in-kind support at Department of Energy laboratories and other LSSTC Institutional Members:

Adler Planetarium; Argonne National Laboratory; Brookhaven National Laboratory (BNL); California Institute of Technology; Carnegie Mellon University; Chile; Cornell University; Drexel University; Fermi National Accelerator Laboratory; George Mason University; Google, Inc.; Harvard-Smithsonian Center for Astrophysics; Institut de Physique Nucléaire et de Physique des Particules (IN2P3); Johns Hopkins University; Kavli Institute for Particle Astrophysics and Cosmology (KIPAC) – Stanford University; Las Cumbres Observatory Global Telescope Network, Inc.; Lawrence Livermore National Laboratory (LLNL); Los Alamos National Laboratory (LANL); National Optical Astronomy Observatory; National Radio Astronomy Observatory; Princeton University; Purdue University; Research Corporation for Science Advancement; Rutgers University; SLAC National Accelerator Laboratory; Space Telescope Science Institute; Texas A & M University; The Pennsylvania State University; The University of Arizona; University of California at Davis; University of California at Irvine; University of Illinois at Urbana-Champaign; University of Michigan; University of Pennsylvania; University of Pittsburgh; University of Washington; Vanderbilt University

LSST E-News Team:

  • Suzanne Jacoby (Editor-in-Chief)
  • Anna Spitz (Writer at Large)
  • Robert McKercher (Staff Writer)
  • Mark Newhouse (Design & Production: Web)
  • Emily Acosta (Design & Production: PDF/Print)
  • Sidney Wolff (Editorial Consultant)
  • Additional contributors as noted

LSST E-News is a free email publication of the Large Synoptic Survey Telescope Project. It is for informational purposes only, and the information is subject to change without notice.

Copyright © 2012 LSST Corp., Tucson, AZ • www.lsst.org
