The models include metadata based on the STAC ML Model Extension to enable easy sharing and retrieval. Radiant MLHub has been the source for high-quality open geospatial training data for use with machine learning (ML) algorithms since 2019. Today, we’re excited to announce the addition of a model repository, giving Radiant MLHub users access to both geospatial training data and ML models. The geospatial model catalog includes metadata that describes the training data associated with a model and the architecture used to train it and generate predictions.
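To make the idea of model metadata concrete, here is a minimal sketch of a STAC-style item describing a model, its architecture, and a link to its training data. It is not Radiant MLHub's actual catalog entry. The function name, identifiers, bounding box, and URL are hypothetical, and the `ml-model:*` property names and link relation are written from memory of the ML Model Extension, so treat them as assumptions and check them against the published schema.

```python
import json


def build_model_item(model_id, architecture, training_data_href, bbox):
    """Build a minimal STAC-style Item (as a plain dict) describing an ML model.

    All values are illustrative. The "ml-model:*" keys and the link relation
    are assumptions to be verified against the extension's schema.
    """
    west, south, east, north = bbox
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": model_id,
        "bbox": [west, south, east, north],
        # Footprint of the area the model applies to, as a closed polygon ring.
        "geometry": {
            "type": "Polygon",
            "coordinates": [[
                [west, south], [east, south], [east, north],
                [west, north], [west, south],
            ]],
        },
        "properties": {
            "datetime": "2021-01-01T00:00:00Z",  # placeholder date
            "ml-model:architecture": architecture,        # assumed field name
            "ml-model:learning_approach": "supervised",   # assumed field name
            "ml-model:prediction_type": "classification", # assumed field name
        },
        # Link from the model to the dataset it was trained on.
        "links": [
            {"rel": "ml-model:train-data", "href": training_data_href},  # assumed rel
        ],
        "assets": {},
    }


item = build_model_item(
    model_id="example-cropland-model",              # hypothetical identifier
    architecture="RandomForest",                    # hypothetical architecture
    training_data_href="https://example.com/training-collection.json",  # placeholder URL
    bbox=(0.0, 6.0, 2.0, 11.0),                     # hypothetical bounding box
)
print(json.dumps(item, indent=2))
```

Keeping the architecture and the training-data link in one record is what lets a user find a model and its training data together.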
We have the pleasure of introducing Radiant Earth Foundation’s first online course, Machine Learning for Earth Observations (ML4EO) Bootcamp. Available on Atingi, an open digital learning platform designed to improve training and employment opportunities, this self-paced course contains a mixture of lectures and hands-on exercises for novice data science or remote sensing practitioners. Atingi is implemented by the Deutsche Gesellschaft für Internationale Zusammenarbeit (GIZ) on behalf of the German Federal Ministry for Economic Cooperation and Development (BMZ).
Describing ML Models with the Geospatial Machine Learning Model Catalog (GMLMC).
During the height of the COVID-19 pandemic, the government of Togo launched a program to “boost national food production in response to the COVID-19 crisis by distributing aid to farmers” [1]. To accomplish this, the government needed accurate information about the distribution of smallholder farmers throughout the country. This kind of cropland map did not exist for the country, so they worked with NASA Harvest to rapidly develop a cropland map using AI. Finding enough high-resolution labeled training data to train the machine learning model was also a significant challenge, so the team combined global and local crowdsourced labels collected using the Geo-Wiki platform [2] with hand-labeled imagery in targeted areas to train a new model for predicting crop areas.
Now featuring 250+ organizations that focus on machine learning applications with satellite data
The latest interactive Machine Learning for Earth Observation Market Map, a curated list of organizations focused on different aspects of machine learning in a satellite data pipeline, is available for download. This release adds 100+ organizations, thanks to a crowdsourcing effort on social media. Earlier in September, we asked our followers on Twitter and LinkedIn to identify organizations that we missed in the earlier version of the market map or that have been established since then. The large number of contributions in such a short period says a lot about the niche area of machine learning (ML) for Earth observation (EO). The entries hint at the incredible aptitude of organizations to optimize these innovative technologies and expand them in the service of humanity.
Using STAC to catalog machine learning training data.
Researchers and data scientists are increasingly combining Earth observation (EO) with ground truth data from a variety of sources to build faster, more accurate machine learning (ML) models that yield valuable insights in domains ranging from agriculture to autonomous navigation to ecosystem health monitoring. These models are integrated into analytic pipelines that generate on-the-fly predictions at scale. The accuracy of these inferences is then evaluated using well-defined validation metrics, and the results are used to improve the performance of the original model in a continuous feedback loop.
If this sounds like a complex process, that’s because it is! Ad-hoc techniques for handling these workflows may work well within a single organization, but can lead to a bewildering array of algorithms and data for end-users.
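As a small illustration of the evaluation step in that feedback loop, the sketch below compares model predictions against ground truth labels using precision, recall, and F1. The excerpt does not say which validation metrics are used, so these three are an assumption, and the labels are made-up values for a binary task (for example, cropland versus not cropland).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BinaryMetrics:
    precision: float
    recall: float
    f1: float


def evaluate_binary(y_true: Sequence[int], y_pred: Sequence[int]) -> BinaryMetrics:
    """Compute precision, recall, and F1 for 0/1 labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return BinaryMetrics(precision, recall, f1)


# Hypothetical labels: 1 = cropland, 0 = not cropland.
ground_truth = [1, 1, 0, 0, 1, 0, 1, 0]
predictions = [1, 0, 0, 1, 1, 0, 1, 0]

metrics = evaluate_binary(ground_truth, predictions)
print(metrics)
```

In a real pipeline these scores would be computed on a held-out validation set and tracked over time to decide when a model needs retraining.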
Submissions to the new Datasets and Benchmarks track require data documentation and availability on an open repository.
Organizers of the NeurIPS 2021 conference recently announced a new track for Datasets and Benchmarks. This is a significant development for a major machine learning (ML) conference to highlight the importance of data in developing algorithms for real-world problems. We at Radiant Earth Foundation welcome this initiative and applaud the organizers for establishing this new track.
In recent years, there have been many discussions and arguments about how to incentivize ML researchers to work on real-world problems. One of those incentive mechanisms is the opportunity to publish a paper at a peer-reviewed conference and gain recognition for working on these problems. The new track at NeurIPS is a necessary step to realize these incentives.
Generating a global training dataset while supporting social initiatives and sustainable practices.
Labeling satellite imagery is the process of applying tags to scenes to provide context or confirm information. These labeled training datasets form the basis for machine learning (ML) algorithms. In many cases, labeling requires humans to meticulously and manually assign captions to the data, allowing the model to learn patterns and estimate them for other observations.
For a wide range of Earth observation applications, training data labels can be generated by annotating satellite imagery. Images can be classified to assign the entire image to a class (e.g., water body) or to identify specific objects within the satellite image. However, annotation tasks can only identify features observable in the imagery. For example, with Sentinel-2 imagery at 10-meter spatial resolution, one cannot detect more detailed features of interest, such as crop types, but would be able to distinguish large croplands from other land cover classes.
A conversation with the First Place winner of the Radiant Earth Tropical Cyclone Wind Estimation Data Competition
We recently announced the winners of the Radiant Earth Tropical Cyclone Wind Estimation Data Competition, a contest designed to build a machine learning (ML) model to improve NASA IMPACT’s Deep Learning-based Hurricane Intensity Estimator. Seven hundred thirty-three participants leveraged NOAA’s Geostationary Operational Environmental Satellites (GOES) imagery to estimate the wind speeds of storms at different points in time using satellite images captured throughout a storm’s life cycle. In this Q&A, we sat down with Igor Ivanov from Ukraine, winner of the first place Development Seed Award, to talk about his journey to becoming a data scientist and winning the contest.
Using the Python client to discover and download training datasets without managing API requests.
We are excited to announce the first beta release of the radiant_mlhub library, a Python client for working with the Radiant MLHub API! With this release, users can work with Radiant MLHub datasets through an intuitive Python interface without having to worry about constructing API requests and managing authentication.
The library is still in the early stages of development, but we encourage you to try it out and give us feedback on how well it addresses your use-cases. This article will walk you through the process of installing and configuring the library, navigating datasets and their collections, and downloading training datasets. For more detailed documentation of the Python library, please see the official documentation here. A basic knowledge of Python programming is recommended.