The Borne Identity: There’s No Running from Big Data
Kirk Borne, right, discusses Big Data with LSST Informatics and Statistics Science Collaboration colleague G. Jogesh Babu (Pennsylvania State University) during a break at the 2012 LSST All Hands Meeting in August 2012. (Image Credit: Emily Acosta)
Kirk Borne is used to operating on large scales. Once an avid 40-mile-per-week runner, the LSST Informatics and Statistics Science Collaboration chair has finished several marathons, run in exotic locales around the world, and captained the Space Telescope Science Institute (STScI) running team to victory in several corporate challenge running events. Injuries forced Kirk to give up running in 1996, which is appropriate since there’s no running from the rising tide of Big Data in astrophysics.
Eight years of managing the staff in NASA’s Astronomical Data Center and Astrophysics Data Facility convinced Kirk of the value of Big Data in astronomy and of the enormous scientific discovery potential from datasets. Since 2003, Kirk has been on the faculty of George Mason University’s School of Physics, Astronomy, and Computational Sciences, where he researches data-oriented astronomy and teaches courses on Scientific Databases, Scientific Data, Scientific Data Mining, Statistics, Astroinformatics, and Computational Data Sciences. As chair of the LSST Informatics and Statistics Science Collaboration, Kirk is helping to address the data-to-knowledge challenge of LSST’s future dataset, which will reach multi-petabyte scale during the first month of the 10-year survey’s operations. The Informatics and Statistics collaboration is working on the development and application of sophisticated algorithms for data mining and statistics that will efficiently extract knowledge from the potentially overwhelming mass of LSST data. (See “The LSST Data Avalanche: Astroinformatics Rises to the Challenge,” LSST E-News volume 5, number 2.)
As an example of informatics in action, Kirk recalled a data mining project he worked on with the National Weather Service, in which they developed a neural network for detecting wildfires in remote sensing satellite data.
“We applied a neural network data mining algorithm on a huge number of large satellite images, in multiple wavebands (optical and infrared) to detect wildfires across the United States. This required a lot of training data (i.e., locations that were confirmed by humans to be locations where fires were truly taking place). The goals were: (a) to emulate with an algorithm what humans were doing slowly and painfully; (b) to automate the process; (c) to remove the human errors and subjectivity of the fire detection assessments (i.e., reduce the classification error on these types of transient events) by training an algorithm to do the discovery and classification; and (d) to extend the discovery and classification of these events to worldwide coverage, not only for the USA anymore. Regarding the last item, we would now call this ‘scaling to Big Data’.”
With LSST, Kirk said he is especially interested in the application of data mining and knowledge discovery algorithms for finding new properties of galaxies as a function of cosmic time and cosmic environment, particularly pertaining to the mass assembly history of the Universe: the birth rate, current state, and ultimate fate of colliding and merging galaxies.
Although Kirk officially started working on LSST in 2005, he first learned of the project during “delightful discussions” with LSST Director Tony Tyson at an Aspen Workshop in 2001.
“I was attracted to the education opportunities and the scientific discovery potential from the Big Data that LSST will generate, and LSST has now given me an outlet to exercise all of my interests and skills in this area,” he said. “The sheer scale of the data management system and scientific data products is far beyond anything ever seen in astronomy. I am confident that LSST will reveal numerous unknown unknowns about the Universe. This is exciting from both the scientific perspective and the outreach perspective.”
Kirk also serves on the Outreach Advisory Board for LSST Education and Public Outreach (EPO), where his work includes design and development activities at the interface between EPO and Data Management (DM). Kirk wants to help LSST’s EPO program inspire future generations of students and scientists in the same way that a childhood gift from an uncle inspired him.
“When I was 9 years old, I received, as a gift from an uncle, a colorful big coffee table book on astronomy; I was enthralled by everything in it, and I just had to study this stuff for a career,” Kirk said. “I was hooked! There was no other career path that could ever satisfy my curiosity except astronomy, physics, and math – i.e., astrophysics!”
He describes the LSST EPO team as extraordinarily creative and forward-thinking, with many plans for both formal and informal education with the LSST data products.
“I believe that the work we are doing in Big Data science, in transforming scientific research (developing the new data-oriented 4th paradigm of science), and in STEM education/public outreach is truly transformative and at the leading edge of astronomy research and education.”
Just as running took Kirk to exotic places around the world for “literally breath-taking experiences,” he advises students that STEM education and a multi-disciplinary skill set will take them places they never could have imagined.
“After I learned about data mining and how it is an amazingly useful skill set,” Kirk explained, “I found myself invited to speak before audiences in several federal agencies and in several different scientific disciplines, covering a wide range of domains, while consulting on numerous projects from science to healthcare to national security. Astronomy and physics training gives you the most fundamentally important skills that employers seek: critical thinking and problem solving. Never forget how valuable you are for having this training.”
Article written by Robert McKercher and Kirk Borne
LSST is a public-private partnership. Funding for design and development activity comes from the National Science Foundation, private donations, grants to universities, and in-kind support at Department of Energy laboratories and other LSSTC Institutional Members:
Adler Planetarium; Argonne National Laboratory; Brookhaven National Laboratory (BNL); California Institute of Technology; Carnegie Mellon University; Chile; Cornell University; Drexel University; Fermi National Accelerator Laboratory; George Mason University; Google, Inc.; Harvard-Smithsonian Center for Astrophysics; Institut de Physique Nucléaire et de Physique des Particules (IN2P3); Johns Hopkins University; Kavli Institute for Particle Astrophysics and Cosmology (KIPAC) – Stanford University; Las Cumbres Observatory Global Telescope Network, Inc.; Lawrence Livermore National Laboratory (LLNL); Los Alamos National Laboratory (LANL); National Optical Astronomy Observatory; National Radio Astronomy Observatory; Princeton University; Purdue University; Research Corporation for Science Advancement; Rutgers University; SLAC National Accelerator Laboratory; Space Telescope Science Institute; Texas A & M University; The Pennsylvania State University; The University of Arizona; University of California at Davis; University of California at Irvine; University of Illinois at Urbana-Champaign; University of Michigan; University of Pennsylvania; University of Pittsburgh; University of Washington; Vanderbilt University
LSST E-News is a free email publication of the Large Synoptic Survey Telescope Project. It is for informational purposes only, and the information is subject to change without notice.