Machine Learning

Publish your training data on Radiant MLHub for NeurIPS 2021

Submissions to the new Datasets and Benchmarks track require data documentation and availability on an open repository.

Organizers of the NeurIPS 2021 conference recently announced a new track for Datasets and Benchmarks. This is a significant development for a major machine learning (ML) conference to highlight the importance of data in developing algorithms for real-world problems. We at Radiant Earth Foundation welcome this initiative and applaud the organizers for establishing this new track.

In recent years, there have been many discussions about how to incentivize ML researchers to work on real-world problems. One such incentive is the opportunity to publish a paper at a peer-reviewed conference and gain recognition for working on these problems. The new track at NeurIPS is a necessary step toward realizing these incentives.

 

Machine Learning

Socially Responsible Data Labeling

Generating a global training dataset while supporting social initiatives and sustainable practices.

Labeling satellite imagery is the process of applying tags to scenes to provide context or confirm information. These labeled training datasets form the basis for machine learning (ML) algorithms. In many cases, labeling requires humans to meticulously and manually assign captions to the data, allowing the model to learn patterns and estimate them for other observations.

For a wide range of Earth observation applications, training data labels can be generated by annotating satellite imagery. An entire image can be assigned a single class (e.g., water body), or specific objects within the image can be labeled. However, annotation tasks can only identify features observable in the imagery. For example, with Sentinel-2 imagery at 10-meter spatial resolution, one cannot detect more detailed features of interest, such as crop types, but can distinguish large croplands from other land cover classes.

Community Voices, Machine Learning

Igor Ivanov: Harnessing Machine Learning Skills to Reduce Damages from Tropical Storms

A conversation with the First Place winner of the Radiant Earth Tropical Cyclone Wind Estimation Data Competition

We recently announced the winners of the Radiant Earth Tropical Cyclone Wind Estimation Data Competition, a contest to build machine learning (ML) models that improve NASA IMPACT’s Deep Learning-based Hurricane Intensity Estimator. Seven hundred thirty-three participants leveraged NOAA’s Geostationary Operational Environmental Satellites (GOES) imagery to estimate the wind speeds of storms at different points in time using satellite images captured throughout a storm’s life cycle. In this Q&A, we sat down with Igor Ivanov from Ukraine, winner of the first-place Development Seed Award, to talk about his journey to becoming a data scientist and winning the contest.

Machine Learning

Radiant MLHub Python Client — Beta Release

Using the Python client to discover and download training datasets without managing API requests.

Community Voices, Machine Learning

Celebrating Women Leading the ML4EO Community

Meet the rising stars: women around the world at the forefront of machine learning for Earth observation.

Happy International Women’s Day!

Today, we celebrate the women who break barriers and expand the frontiers of machine learning for Earth observation. This essential field can help us understand the planet’s ecosystem, its different elements, interactions, and changes.

These 15 leading women were selected from 56 outstanding nominations from the ML4EO community. The Radiant Earth Foundation selection committee created a set of criteria to rank the nominees.

Machine Learning

Archived Training Dataset Downloads now Available on Radiant MLHub

A little over a year ago, we launched the first iteration of Radiant MLHub in the form of a STAC-compliant API, which allows you to browse our training data collections and list and download individual assets from the items within those collections. Today, we’re announcing the ability to download an archived version of a training dataset with a single click. In this post, we’ll describe the process for downloading datasets, explain the structure of the archived datasets, and provide some tips for effectively traversing the downloaded datasets.

We now offer three different methods of downloading our datasets. The easiest method, downloading from our registry, can be accessed by navigating to a dataset page and clicking on the “Download” link for each collection you would like to download. Clicking this link will direct you to our dashboard, which will ask you to log in if you are not already authenticated and then begin the download process for that collection.
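As a rough illustration of traversing a downloaded archive, the sketch below walks an extracted directory and collects the assets listed in each STAC Item it finds. It relies only on fields defined by the STAC Item specification (`type`, `id`, `assets`, `href`). The directory layout, the file names, and the helper name `list_item_assets` are assumptions made for this example, not a description of the actual archive structure.

```python
import json
from pathlib import Path


def list_item_assets(archive_root):
    """Map each STAC Item id found under archive_root to its asset hrefs.

    Hypothetical helper: assumes the archive has been extracted and that
    Items are stored as JSON files somewhere below archive_root.
    """
    assets_by_item = {}
    for json_path in sorted(Path(archive_root).rglob("*.json")):
        try:
            doc = json.loads(json_path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            continue  # skip files that are not valid JSON
        # STAC Items are GeoJSON Features; Collections and Catalogs are not.
        if not isinstance(doc, dict) or doc.get("type") != "Feature":
            continue
        assets = doc.get("assets", {})
        item_id = doc.get("id", json_path.stem)
        assets_by_item[item_id] = {
            key: asset.get("href") for key, asset in assets.items()
        }
    return assets_by_item
```

Calling `list_item_assets("path/to/extracted/archive")` returns a dictionary keyed by Item id, which can then be filtered by asset key (for example, labels versus source imagery) before loading any files.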

Machine Learning, News

Radiant MLHub in 2021: Realizing a Data Ecosystem

In December 2019, we publicly launched Radiant MLHub, the first open-access cloud-based repository for geospatial training datasets. Since then, we have continuously published new datasets and expanded the ecosystem around Radiant MLHub.

The idea of Radiant MLHub was born in Spring 2018 after several discussions with, and feedback from, members of the community and funders. We had started a new project called LandCoverNet to develop a global and geographically diverse land cover training dataset using human verification. Soon after the launch of LandCoverNet in 2018, we identified a gap in the ecosystem for facilitating the publication and uptake of training datasets in our community. That gap in the data value chain led us to the design and implementation of Radiant MLHub.

Machine Learning

Announcing the Updated Machine Learning for Earth Observation Market Map

Meet the 150+ organizations that focus on machine learning applications with satellite data. 

The updated Machine Learning for Earth Observation Market Map is finally here!

The ML4EO market map is a curated list of organizations focused on different aspects of machine learning in a satellite data pipeline. This release adds 50 organizations that we missed in the first version, published in September 2020.

Machine Learning

Can you guess if this place is real?

Generating synthetic training data that can improve the accuracy of machine learning models

You have probably read about fake images and videos being generated by machine learning (ML) models. While this application might sound more like a fun exercise or, in some cases, malicious activity, synthetic (aka fake) data can help improve the accuracy of ML models. For example, research has shown that Generative Adversarial Networks (GANs) can generate synthetic data to augment real medical image training data, which improves liver lesion classification and medical diagnosis.

As part of a project to tackle the scarcity of training data for agricultural monitoring applications, we are using GANs to generate synthetic Sentinel-2 satellite imagery. The results reveal that our GAN model can generate realistic imagery that can be used in classification models. Check out isthisplacereal.com and see how many of the images you can correctly identify as real or synthetic.

Machine Learning

Advancing AI for Earth Science: A Data Systems Perspective

Tackling data challenges and incorporating physics into machine learning models will help unlock the potential of artificial intelligence to answer Earth science questions.

The Earth sciences present uniquely challenging problems, from detecting and predicting changes in Earth’s ecosystems in response to climate change to understanding interactions among the ocean, atmosphere, and land in the climate system. Helping address these problems, however, is a wealth of data sets—containing atmospheric, environmental, oceanographic, and other information—that are mostly open and publicly available. This fortuitous combination of pressing challenges and plentiful data is leading to the increased use of data-driven approaches, including machine learning (ML) models, to solve Earth science problems.