Machine Learning

Archived Training Dataset Downloads Now Available on Radiant MLHub

A little over a year ago, we launched the first iteration of Radiant MLHub in the form of a STAC-compliant API, which allows you to browse our training data collections and to list and download individual assets from the items within those collections. Today, we’re announcing the ability to download an archived version of each training dataset with a single click. In this post, we’ll describe the process for downloading datasets, explain the structure of the archived datasets, and provide some tips for effectively traversing the downloaded datasets.
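As a rough illustration of the collection, item, and asset hierarchy mentioned above, the sketch below lists the asset download links of a single STAC Item. It relies only on the standard STAC Item layout (an `assets` object whose entries each carry an `href`). The item ID, collection name, and URLs are hypothetical placeholders, not real Radiant MLHub identifiers, and `list_asset_hrefs` is a helper invented for this example.

```python
from typing import Dict

# Hypothetical STAC Item, trimmed to the fields used here.
# The IDs and URLs are made-up placeholders, not real Radiant MLHub data.
sample_item = {
    "type": "Feature",
    "id": "example_item_001",
    "collection": "example_collection",
    "assets": {
        "labels": {
            "href": "https://example.com/example_item_001/labels.tif",
            "type": "image/tiff",
        },
        "source": {
            "href": "https://example.com/example_item_001/source.tif",
            "type": "image/tiff",
        },
    },
}


def list_asset_hrefs(item: Dict) -> Dict[str, str]:
    """Map each asset key of a STAC Item to its download href."""
    return {key: asset["href"] for key, asset in item.get("assets", {}).items()}


hrefs = list_asset_hrefs(sample_item)
for key, href in hrefs.items():
    print(f"{key}: {href}")
```

Downloading item by item means repeating this lookup for every item in a collection, which is the work a single archive download avoids.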

We are now offering three different methods of downloading our datasets. The easiest method, downloading from our registry, can be accessed by navigating to a dataset page and clicking on the “Download” link for each collection you would like to download. Clicking this link will direct you to our dashboard, which will ask you to log in if you are not already authenticated and then begin the download process for that collection.

Machine Learning, News

Radiant MLHub in 2021: Realizing a Data Ecosystem

In December 2019, we publicly launched Radiant MLHub, the first open-access cloud-based repository for geospatial training datasets. Since then, we have continuously published new datasets and expanded the ecosystem around Radiant MLHub.

The idea of Radiant MLHub was born in Spring 2018 after several discussions with, and feedback from, members of the community and funders. We had started a new project called LandCoverNet to develop a global and geographically diverse land cover training dataset using human verification. Soon after the launch of LandCoverNet in 2018, we identified a gap in the ecosystem: nothing facilitated the publication and uptake of training datasets in our community. That gap in the data value chain led us to the design and implementation of Radiant MLHub.

Machine Learning

Announcing the Updated Machine Learning for Earth Observation Market Map

Meet the 150+ organizations that focus on machine learning applications with satellite data. 

The updated Machine Learning for Earth Observation Market Map is finally here!

The ML4EO market map is a curated list of organizations focused on different aspects of machine learning with a satellite data pipeline. This release adds 50 organizations that we missed in the first version, published in September 2020.

Machine Learning

Can you guess if this place is real?

Generating synthetic training data that can improve the accuracy of machine learning models

You have probably read about fake images and videos being generated by machine learning (ML) models. While this application might sound more like a fun exercise or, in some cases, malicious activity, synthetic (aka fake) data can help improve the accuracy of ML models. For example, research has shown that Generative Adversarial Networks (GANs) can generate synthetic data to augment real medical image training data, improving liver lesion classification and medical diagnosis.

As part of a project to tackle the scarcity of training data for agricultural monitoring applications, we are using GANs to generate synthetic Sentinel-2 satellite imagery. The results reveal that our GAN model can generate realistic imagery that can be used in classification models. See how many of the images you can correctly identify as real or synthetic.

Machine Learning

Advancing AI for Earth Science: A Data Systems Perspective

Tackling data challenges and incorporating physics into machine learning models will help unlock the potential of artificial intelligence to answer Earth science questions.

The Earth sciences present uniquely challenging problems, from detecting and predicting changes in Earth’s ecosystems in response to climate change to understanding interactions among the ocean, atmosphere, and land in the climate system. Helping address these problems, however, is a wealth of data sets—containing atmospheric, environmental, oceanographic, and other information—that are mostly open and publicly available. This fortuitous combination of pressing challenges and plentiful data is leading to the increased use of data-driven approaches, including machine learning (ML) models, to solve Earth science problems.

Machine Learning, News

Machine Learning for Earth Observation Market Map

Meet the 100+ organizations that focus on machine learning applications with satellite data

Building geospatial machine learning applications involves many interdependent moving parts, from accessing Earth observation (EO) data, labeling imagery, and generating training data to creating and developing models and running analytics. A growing list of organizations from various sectors is providing solutions and services to advance these applications. Who can help you build machine learning applications, identify patterns from your data, or run your crowdsourcing campaign? What organizations are providing software or a platform that you can use to develop your machine learning model?

Machine Learning

Using Generative Adversarial Networks to Address Scarcity of Geospatial Training Data

Results show models based on Generative Adversarial Networks perform better than Convolutional Neural Networks in classifying land cover classes outside of the training dataset.

In many supervised machine learning (ML) applications that use Earth observations (EO), we rely on ground reference data to generate training and validation data. These reference data are the building blocks of those applications and require geographical diversity if one aims to deploy the models across various geographies. Collecting ground reference data, however, is an extensive process, and such data are extremely scarce in the remote areas that would most benefit from the use of EO.

Machine Learning, Standards

Cloud Native Geospatial Outreach Day Recap

Chris Holmes, Technology Fellow at Radiant Earth, gives a recap of the Cloud Native Geospatial Outreach Day and shares some of his favorite parts.

It’s been just over three weeks since the Cloud Native Geospatial Outreach Day. Everyone I’ve talked to felt it was an incredible event, and I definitely concur. Thankfully we managed to record almost all of it, so if you missed it you can still catch the content on YouTube!

We opened with a welcome from Bruno Sánchez-Andrade Nuño and me, representing Microsoft and Planet, the convening sponsors. Then Hamed, the new Executive Director of Radiant Earth, introduced the Data Labeling Contest (which was a great success).

Machine Learning, News

Cloud-Spotting at a Million Pixels an Hour

Jon Engelsman won the Best Quality Labeler award for our recent Data Labeling Contest. We asked him to detail his approach and workflow.

I recently attended the Cloud Native Geospatial Outreach Day, a virtual event designed to “introduce STAC, COG, and other emerging cloud-native geospatial formats and tools to new audiences.” As part of the outreach day, co-sponsors Planet, Microsoft, Azavea, and Radiant Earth teamed up to host a week-long data labeling contest. This friendly competition had contestants race to manually label the shapes of clouds across a large selection of satellite images from around the world. The contest’s ultimate goal was to generate a crowd-sourced collection of high-quality labeled images, data that can be used to train accurate cloud detection models.

Machine Learning, News

Announcing the Winners of the Data Labeling Contest

Earlier this month, we organized a data labeling contest as part of the Cloud Native Geospatial Outreach Day sponsored by Planet, Microsoft, and Azavea. The contest was designed as a crowdsourcing campaign to encourage the global community to contribute to open-access training data catalogs. Participants were asked to identify cloudy pixels in Sentinel-2 scenes.

The labeling contest was conducted on GroundWork, Azavea’s annotation tool designed for geospatial data. We were amazed by the high participation worldwide and the community’s excitement to help develop a large-scale, accurate cloud detection training dataset. In the end, 231 users around the world signed up and labeled 75,645 tasks, which equates to about 2 million km² of classified Sentinel-2 imagery.