This week, the Vera C. Rubin Observatory is launching the first preview of its new Rubin Science Platform (RSP) for an initial cohort of astronomers. The observatory, which is located in Chile but managed by the U.S. National Science Foundation’s NOIRLab in Tucson, Arizona, and SLAC National Accelerator Laboratory in California, is jointly funded by the NSF and the U.S. Department of Energy. The platform provides an easy-to-use interface for storing and analyzing the massive datasets of the Legacy Survey of Space and Time (LSST), which will survey a third of the sky each night for ten years, detecting billions of stars and galaxies and millions of supernovae, variable stars, and small bodies in our Solar System.
The LSST datasets are unprecedented in size and complexity, and will be far too large for scientists to download to their personal computers for analysis. Instead, scientists will use the RSP to process, query, visualize, and analyze the LSST data archives through a mixture of web portal, notebook, and other virtual data analysis services. An initial launch with simulated data, called Data Preview 0, builds on the Rubin Observatory’s three-year partnership with Google to develop an Interim Data Facility (IDF) on Google Cloud to prototype hosting of the massive LSST dataset. This agreement marks the first time a cloud-based data facility has been used for an astronomy application of this magnitude.
Bringing the stars to the cloud
For Data Preview 0, the IDF uses Cloud Storage, Google Kubernetes Engine (GKE), and Compute Engine to give the Rubin Observatory user community access to simulated LSST data in an early version of the RSP. The simulated data were developed over several years by the LSST Dark Energy Science Collaboration to imitate five years of an LSST-like survey over 300 square degrees of the sky (about 1,500 times the area of the full Moon). The resulting images are very realistic: they have the same instrumental characteristics, such as pixel size and sensitivity to photons, that are expected from the Rubin Observatory’s LSST Camera, and they were processed with an early version of the LSST Science Pipelines that will eventually be used to process LSST data. “This will be the first time that these workloads have ever been hosted in a cloud environment. Researchers will have an opportunity to explore an early version of this platform,” says Ranpal Gill, senior manager and head of communications at the Rubin Observatory.
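The sky-area comparison above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the full Moon has an apparent diameter of roughly 0.5 degrees (an approximation, not a figure from the observatory), and the function name `moon_areas` is made up for this example.

```python
import math

# Assumption: approximate apparent (angular) diameter of the full Moon, in degrees.
MOON_ANGULAR_DIAMETER_DEG = 0.5
# Simulated survey area quoted in the article, in square degrees.
SURVEY_AREA_SQ_DEG = 300.0


def moon_areas(survey_area_sq_deg, moon_diameter_deg=MOON_ANGULAR_DIAMETER_DEG):
    """Return how many full-Moon disks fit into the given sky area."""
    # Treat the Moon as a flat disk on the sky: area = pi * r^2.
    moon_area_sq_deg = math.pi * (moon_diameter_deg / 2) ** 2
    return survey_area_sq_deg / moon_area_sq_deg


ratio = moon_areas(SURVEY_AREA_SQ_DEG)
print(f"300 square degrees is roughly {ratio:.0f} full-Moon areas")
```

Under that assumption the ratio comes out a little above 1,500, which is consistent with the rounded figure in the text.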
Broadening access for more researchers
Over 200 scientists and students with Rubin Observatory data rights were selected to participate in Data Preview 0 from a pool of applicants representing a wide range of demographics, regions, and experience levels. Participants will be supported with resources such as tutorials, seminars, communication channels, and networking opportunities, and they will be free to pursue their own science at their own pace using the data in the RSP.
“The revolutionary nature of the future LSST dataset requires a commensurately innovative system for data access and analysis paired with robust support for scientists,” says Melissa Graham, lead community scientist for the Rubin Observatory and research scientist in the astronomy department at the University of Washington. “I’m personally excited to enhance my own skills by using the RSP’s tools for big data analysis, while also helping others to learn and to pursue their LSST-related science goals during Data Preview 0.”
At the same time, because the RSP is hosted in the cloud, researchers at smaller institutions gain access to state-of-the-art astronomy infrastructure comparable to that of the largest national research centers.
The launch benefits the observatory too: the development team can learn what researchers are interested in while also testing and debugging the platform. Graham says that “the platform is still in active development so researchers using it will be able to follow along in the progress, and provide feedback on ways that we can optimize the development of the tools.”
The Rubin Observatory aims to begin the ten-year LSST in 2023-24 and expects the survey to comprise 500 petabytes of data. Through the cloud, Google aims to help make this extraordinary project scalable and accessible to researchers everywhere. To learn more about Data Preview 0, watch this video.
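To see why a dataset of this size cannot simply be downloaded to a personal computer, a back-of-the-envelope estimate helps. The sketch below assumes a sustained 1 gigabit-per-second connection and decimal petabytes (10^15 bytes). Both are illustrative assumptions rather than figures from the observatory, and the helper `download_years` is hypothetical.

```python
PETABYTE = 10**15                      # assumption: decimal petabyte, in bytes
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # average year length in seconds


def download_years(total_bytes, link_bits_per_second):
    """Return the years needed to transfer total_bytes at a constant link speed."""
    seconds = total_bytes * 8 / link_bits_per_second  # 8 bits per byte
    return seconds / SECONDS_PER_YEAR


# 500 PB is the survey size quoted in the article; 1 Gbps is an assumed link speed.
years = download_years(500 * PETABYTE, 1e9)
print(f"Downloading 500 PB at 1 Gbps would take about {years:.0f} years")
```

Even with a link that never drops below 1 Gbps, the transfer would take more than a century, which is why the analysis is brought to the data instead.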
Want to ramp up your own research in the cloud? We offer research credits to academics using Google Cloud for qualifying projects in eligible countries. You can find our application form on Google Cloud’s website or contact our sales team.