
The RarePlanes Dataset


RarePlanes is a unique open-source machine learning dataset from CosmiQ Works and AI.Reverie that incorporates both real and synthetically generated satellite imagery. The dataset focuses on the value of AI.Reverie synthetic data in helping computer vision algorithms automatically detect aircraft and their attributes in satellite imagery. Although other synthetic/real combination datasets exist, RarePlanes is the largest openly available very-high-resolution dataset built to test the value of synthetic data from an overhead perspective. Previous research has shown that synthetic data can reduce the amount of real training data needed and potentially improve performance for many tasks in the computer vision domain. The real portion of the dataset consists of 253 Maxar WorldView-3 satellite scenes spanning 112 locations and 2,142 km² with 14,700 hand-annotated aircraft. The accompanying synthetic dataset is generated via AI.Reverie’s novel simulation platform and features 50,000 synthetic satellite images with ~630,000 aircraft annotations. Both the real and synthetically generated aircraft are annotated with 10 fine-grained attributes: aircraft length, wingspan, wing shape, wing position, wingspan class, propulsion, number of engines, number of vertical stabilizers, presence of canards, and aircraft role. Finally, we conduct extensive experiments to evaluate the real and synthetic datasets and compare their performance. By doing so, we show the value of synthetic data for the task of detecting and classifying aircraft from an overhead perspective.

Read the paper here.

Dataset Download

The RarePlanes dataset is available through the AWS Open-Data Program (https://registry.opendata.aws/rareplanes/) for free download.

To download the real training and testing tiles (~3 GB), execute these commands:

mkdir train test
cd train
aws s3 cp s3://rareplanes-public/real/tarballs/train/RarePlanes_train_geojson_aircraft_tiled.tar.gz .
aws s3 cp s3://rareplanes-public/real/tarballs/train/RarePlanes_train_PS-RGB_tiled.tar.gz .
cd ../test
aws s3 cp s3://rareplanes-public/real/tarballs/test/RarePlanes_test_geojson_aircraft_tiled.tar.gz .
aws s3 cp s3://rareplanes-public/real/tarballs/test/RarePlanes_test_PS-RGB_tiled.tar.gz .
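The downloaded files are gzip-compressed tarballs, so they need to be unpacked before use. The following is a minimal sketch of one way to do that. The helper name `extract_tarballs` is hypothetical and not part of the RarePlanes tooling, and the directory layout inside each archive is not described here, so inspect an archive with `tar -tzf` first if that matters to you.

```shell
# Hypothetical helper (not part of RarePlanes): unpack every .tar.gz
# archive found in a directory into that same directory.
extract_tarballs() {
    dir="$1"
    for archive in "$dir"/*.tar.gz; do
        # If nothing matched, the glob stays literal; skip it.
        [ -e "$archive" ] || continue
        # -x extract, -z gunzip, -f archive file, -C target directory
        tar -xzf "$archive" -C "$dir" || return 1
    done
}
```

From the parent directory of `train` and `test`, this could be called as `extract_tarballs train` and then `extract_tarballs test`.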

All real data (~107 GB) including both tiled and untiled formats can be downloaded using this command:

aws s3 cp --recursive s3://rareplanes-public/real/tarballs/ .

All synthetic data (~211 GB) can be downloaded using this command:

aws s3 cp --recursive s3://rareplanes-public/synthetic/ .

All model weights (~3.8 GB) used in the experiments can be downloaded using this command:

aws s3 cp --recursive s3://rareplanes-public/weights/ .
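Because the full downloads are large, it can help to preview a transfer before starting it. The sketch below wraps the commands above in a hypothetical helper, `rareplanes_fetch`, that passes extra flags through to `aws s3 cp`. It assumes the bucket permits anonymous reads, which is why it adds the AWS CLI's `--no-sign-request` option. If that assumption does not hold for your setup, drop the flag and use configured credentials.

```shell
# Hypothetical helper (not part of RarePlanes): copy one prefix of the
# public bucket into a local directory. Any extra arguments, such as
# --dryrun, are passed through to `aws s3 cp`.
rareplanes_fetch() {
    prefix="$1"   # e.g. real/tarballs/, synthetic/, or weights/
    dest="$2"     # local target directory, created if missing
    shift 2
    mkdir -p "$dest"
    # --no-sign-request: assumes anonymous access to the bucket is allowed
    aws s3 cp --recursive --no-sign-request "$@" \
        "s3://rareplanes-public/${prefix}" "$dest"
}
```

For example, `rareplanes_fetch weights/ ./weights --dryrun` lists the operations that would be performed without transferring any data. Running it again without `--dryrun` performs the download.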

Dataset User Guide

www.iqt.org/library/rareplanes-public-user-guide

Codebase

https://github.com/aireveries/RarePlanes

Attribution

These data are licensed under the CC BY-SA 4.0 license.

Please cite our work if you use the dataset:

@misc{RarePlanes_Dataset,
    title={RarePlanes Dataset},
    author={Shermeyer, Jacob and Hossler, Thomas and Van Etten, Adam and Hogan, Daniel and Lewis, Ryan and Kim, Daeil},
    organization = {In-Q-Tel - CosmiQ Works and AI.Reverie},
    month = {June},
    year = {2020}
}

@article{RarePlanes_Paper,
    title={RarePlanes: Synthetic Data Takes Flight},
    author={Shermeyer, Jacob and Hossler, Thomas and Van Etten, Adam and Hogan, Daniel and Lewis, Ryan and Kim, Daeil},
    organization = {In-Q-Tel - CosmiQ Works and AI.Reverie},
    month = {June},
    year = {2020}
}