Standards

The Path to STAC 1.0.0

It’s almost time! The SpatioTemporal Asset Catalog (STAC) specification has been maturing for over 3 years, and already has a rich ecosystem of tools with hundreds of millions of assets cataloged. The core community has agreed that it’s time to put a pin on it and lock in a super stable specification that can be a core building block for years to come. We believe STAC will be the foundation of something truly special: A transformation to a ‘Cloud Native Geospatial’ world that will open up unimaginable innovation.

Our goal has been to build simple, flexible building blocks to expose geospatial data in order to enable others to build incredible value on top of that. This can only happen if there’s a stable base to rely upon. In the next couple of months, we hope to release 1.0.0. Our goal goes beyond just releasing the specification: We aim to make sure there is a ‘complete’ core ecosystem of tools to make it easy for people to create, use, and get value from STAC. See below for our plan, details on our sprint, and also ways that organizations can directly support STAC.

Machine Learning

Archived Training Dataset Downloads now Available on Radiant MLHub

A little over a year ago, we launched the first iteration of Radiant MLHub in the form of a STAC-compliant API, which allows you to browse our training data collections and list and download individual assets from the items within those collections. Today, we’re announcing the ability to download an archived version of a training dataset with a single click. In this post, we’ll describe the process for downloading datasets and the structure of the archived datasets, and provide some tips for effectively traversing the downloaded datasets.
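To make the item-and-asset structure concrete, here is a minimal sketch of listing the download links in a single STAC Item. It assumes only the standard STAC layout, in which an Item carries an "assets" object whose entries each have an "href". The item contents, the example.com URLs, and the helper name list_asset_hrefs are hypothetical and are not taken from Radiant MLHub.

```python
from typing import Dict, List


def list_asset_hrefs(item: Dict) -> List[str]:
    """Return the href of every asset in a STAC Item dictionary."""
    assets = item.get("assets", {})
    # Each asset is an object; skip any entry that lacks an href.
    return [asset["href"] for asset in assets.values() if "href" in asset]


# Hypothetical item, for illustration only.
example_item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "example-item",
    "assets": {
        "image": {"href": "https://example.com/data/example-item/image.tif"},
        "labels": {"href": "https://example.com/data/example-item/labels.geojson"},
    },
}

for href in list_asset_hrefs(example_item):
    print(href)
```

An item without assets simply yields an empty list, so the same helper can be run over every item in a collection.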

We are now offering three different methods of downloading our datasets. The easiest method, downloading on our registry, can be accessed by navigating to a dataset page and clicking on the “Download” link for each collection you would like to download. Clicking this link will direct you to our dashboard, which will ask you to log in if you are not already authenticated and then begin the download process for that collection.

Machine Learning, News

Radiant MLHub in 2021: Realizing a Data Ecosystem

In December 2019, we publicly launched Radiant MLHub, the first open-access cloud-based repository for geospatial training datasets. Since then, we have continuously published new datasets and expanded the ecosystem around Radiant MLHub.

The idea of Radiant MLHub was born in Spring 2018 after several discussions and feedback from members of the community and funders. We had started a new project to develop a global and geographically diverse land cover training dataset using human verification called LandCoverNet. Soon after the launch of LandCoverNet in 2018, we identified a gap in the ecosystem to facilitate publication and uptake of training datasets in our community. That gap in the data value chain led us to the design and implementation of Radiant MLHub.

Machine Learning

Announcing the Updated Machine Learning for Earth Observation Market Map

Meet the 150+ organizations that focus on machine learning applications with satellite data. 

The updated Machine Learning for Earth Observation Market Map is finally here!

The ML4EO market map is a curated list of organizations focused on different machine learning aspects with a satellite data pipeline. This release includes an additional list of 50 organizations, which we missed in the first version published in September 2020.

Standards

SpatioTemporal Asset Catalogs and the Open Geospatial Consortium

Community Voices

Data Labeling Contest: Crowdsourcing a scalable solution to generate labels for satellite imagery

A conversation with the First Place winners of the Data Labeling Contest – In September 2020, we announced the Data Labeling Contest winners. The contest was part of the Cloud Native Geospatial Outreach Day sponsored by Planet, Microsoft, Azavea, and Radiant Earth Foundation. Participants were invited to contribute to open-access training data catalogs by identifying cloudy pixels in Sentinel-2 scenes. Two hundred thirty-one labelers joined the contest, representing a wide range of educational backgrounds, institutions, and geographies. While several awards were given to the top 83 contributions in six categories, in this Q&A, we sat down with Solomon Kica from Uganda and Jhomira Vanessa Loja Zumaeta from Peru, who won the Top Labeler first prize awards. Both winners were selected for the top prize because their scores were incredibly close, a 3.6% difference, and both scores stood out from the rest of the participants.

Machine Learning

Can you guess if this place is real?

Generating synthetic training data that can improve the accuracy of machine learning models – You have probably read about fake images and videos being generated by machine learning (ML) models. While this application might sound more like a fun exercise or, in some cases, malicious activity, synthetic (aka fake) data can help improve the accuracy of ML models. For example, research has shown that Generative Adversarial Networks (GANs) can generate synthetic data to augment real medical image training data that improves liver lesion classification and medical diagnosis.

As part of a project to tackle the scarcity of training data for agricultural monitoring applications, we are using GANs to generate synthetic Sentinel-2 satellite imagery. The results reveal that our GAN model can generate realistic imagery that can be used in classification models. Check out isthisplacereal.com and see how many of the images you can correctly identify as real or synthetic.

Standards

The first STAC API 1.0 release: 1.0.0-beta.1

I’m pleased to share that we’ve just released STAC API 1.0.0-beta.1. This is our first release of the API since we split the specification, with STAC now living in its own repository. You can see the latest specification in the stac-api-spec repository, and we link to browsable API representations of the major portions below.

What started out as a pretty modest release ended up snowballing into a major amount of work, but I think we’re all pretty proud of the end state. Our main goal was to have a version of the API that was released standalone, independent of the STAC Core releases. In previous versions, it was just assumed that a version number applied to both the STAC content and the API. So we wanted to enable services to be more explicit about which version of each they supported. For example, one could upgrade the service to the 1.0.0-beta.1 API but have it still serve STAC 0.9.0 Items.
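The separation of API version and content version can be sketched in a few lines. This example assumes only that each STAC Item carries its own "stac_version" field. How a service advertises its API version is not covered here, so the API version is passed in by the caller. The helper name describe_service and the sample items are hypothetical.

```python
from typing import Dict, Iterable


def describe_service(api_version: str, items: Iterable[Dict]) -> str:
    """Summarize the API version alongside the STAC versions of the served Items."""
    # Content versions come from the Items themselves, not from the API.
    versions = sorted({item.get("stac_version", "unknown") for item in items})
    return f"STAC API {api_version} serving STAC {', '.join(versions)} Items"


# Hypothetical Items still published as STAC 0.9.0.
sample_items = [
    {"id": "item-a", "stac_version": "0.9.0"},
    {"id": "item-b", "stac_version": "0.9.0"},
]

print(describe_service("1.0.0-beta.1", sample_items))
```

Because the two version numbers are tracked separately, a mixed catalog would show up as more than one content version in the summary.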

Machine Learning

Advancing AI for Earth Science: A Data Systems Perspective

Tackling data challenges and incorporating physics into machine learning models will help unlock the potential of artificial intelligence to answer Earth science questions.

The Earth sciences present uniquely challenging problems, from detecting and predicting changes in Earth’s ecosystems in response to climate change to understanding interactions among the ocean, atmosphere, and land in the climate system. Helping address these problems, however, is a wealth of data sets—containing atmospheric, environmental, oceanographic, and other information—that are mostly open and publicly available. This fortuitous combination of pressing challenges and plentiful data is leading to the increased use of data-driven approaches, including machine learning (ML) models, to solve Earth science problems.

Community Voices

Zhuangfang NaNa Yi: Building Machine Learning Applications that Empower Policymakers with Insights to Support Vulnerable Communities

A conversation about the nuances of applying machine learning algorithms to Earth observation for global development organizations.

It is our pleasure to introduce Dr. Zhuangfang NaNa Yi, a machine learning engineer at Development Seed, supporting international development organizations like UNICEF, the World Bank, and USAID in making data-driven policy decisions. She has extensive experience in applying machine learning algorithms to geospatial and satellite data, from building applications that farmers can use to track crop types and changes to water bodies, to mapping forests and measuring food security, and more.