IBM and NASA’s Marshall Space Flight Center have announced a collaboration to use IBM’s artificial intelligence (AI) technology to discover new insights in NASA’s massive trove of Earth and geospatial science data. Their combined work will apply AI ‘foundation model’ technology to NASA’s Earth-observing satellite data for the first time, an IBM spokesperson told businessline.

Foundation model at work

AI ‘foundation model’ technology is quickly gaining traction through models like ChatGPT, and the collaboration will apply it to NASA’s Earth-observing satellite data for the first time, the spokesperson explained. The goal is to advance scientific understanding of, and response to, Earth and climate-related issues such as natural disasters and warming temperatures.

‘Foundation models’ are types of AI models that are trained on a broad set of unlabelled data, can be used for different tasks, and can apply information about one situation to another. These models have rapidly advanced the field of natural language processing (NLP) technology over the last five years, and IBM is pioneering applications of foundation models beyond language.

Large volumes of data

Earth observations that allow scientists to study and monitor our planet are being gathered at unprecedented rates and volumes. New and innovative approaches are required to extract knowledge from these vast data resources. The goal of this work is to provide an easier way for researchers to analyse and draw insights from these large datasets, the spokesperson said. IBM’s foundation model technology has the potential to speed up the discovery and analysis of these data, and so to advance the scientific understanding of Earth and the response to climate-related issues.

SWIR false colour composite of the snow-capped Himalayas on November 28, 2022. | Photo Credit: NASA IMPACT

IBM and NASA are planning to develop several new technologies to extract insights from Earth observations. One project will train an IBM geospatial intelligence foundation model on NASA’s Harmonized Landsat Sentinel-2 (HLS) dataset, a record of land cover and land use changes captured by Earth-orbiting satellites. By analysing petabytes of satellite data to identify changes in the geographic footprint of phenomena such as natural disasters, cyclical crop yields, and wildlife habitats, this foundation model technology will help researchers provide critical analysis of our planet’s environmental systems.

Develops new NLP model

Another output from this collaboration is expected to be an easily searchable corpus of Earth science literature. IBM has developed an NLP model trained on nearly 300,000 Earth science journal articles to organise the literature and make it easier to discover new knowledge. The fully trained model, one of the largest AI workloads trained on Red Hat’s OpenShift software to date, uses PrimeQA, IBM’s open-source multilingual question-answering system. Beyond providing a resource to researchers, the new language model for Earth science could be infused into NASA’s scientific data management and stewardship processes.

Rahul Ramachandran, senior research scientist at NASA’s Marshall Space Flight Center in Huntsville, Alabama, said the beauty of foundation models is that they can potentially be used for many downstream applications. “Building these foundation models cannot be tackled by small teams. You need teams across different organisations to bring their different perspectives, resources, and skill sets.”

Valuable insights

Raghu Ganti, principal researcher at IBM, said foundation models have proven successful in NLP, and it’s time to expand that to new domains and modalities important for business and society.

“Applying foundation models to geospatial, event-sequence, time-series, and other non-language factors within Earth science data could make enormously valuable insights and information suddenly available to a much wider group of researchers, businesses, and citizens. Ultimately, it could facilitate a larger number of people working on some of our most pressing climate issues.”

Other potential IBM-NASA joint projects in this agreement include constructing a foundation model for weather and climate prediction using MERRA-2, a dataset of atmospheric observations, the spokesperson for IBM said. This collaboration is part of NASA’s Open-Source Science Initiative, a commitment to building an inclusive, transparent, and collaborative open science community over the next decade.