RarePlanes is a unique open-source machine learning dataset from CosmiQ Works and AI.Reverie that incorporates both real and synthetically generated satellite imagery. The RarePlanes dataset specifically focuses on the value of AI.Reverie synthetic data to aid computer vision algorithms in their ability to automatically detect aircraft and their attributes in satellite imagery. Although other synthetic/real combination datasets exist, RarePlanes is the largest openly available very-high-resolution dataset built to test the value of synthetic data from an overhead perspective. Previous research has shown that synthetic data can reduce the amount of real training data needed and potentially improve performance for many tasks in the computer vision domain. The real portion of the dataset consists of 253 Maxar WorldView-3 satellite scenes spanning 112 locations and 2,142 km^2 with 14,700 hand-annotated aircraft. The accompanying synthetic dataset is generated via AI.Reverie’s novel simulation platform and features 50,000 synthetic satellite images with ~630,000 aircraft annotations. Both the real and synthetically generated aircraft feature 10 fine-grained attributes: aircraft length, wingspan, wing shape, wing position, wingspan class, propulsion, number of engines, number of vertical stabilizers, presence of canards, and aircraft role. Finally, we conduct extensive experiments to evaluate the real and synthetic datasets and compare their performance. By doing so, we show the value of synthetic data for the task of detecting and classifying aircraft from an overhead perspective.
Read the paper here.
Dataset Download
The RarePlanes dataset is available through the AWS Open-Data Program (https://registry.opendata.aws/rareplanes/) for free download.
To download the real training and testing tiles (~3 GB) execute these commands:
mkdir train test
cd train
aws s3 cp s3://rareplanes-public/real/tarballs/train/RarePlanes_train_geojson_aircraft_tiled.tar.gz .
aws s3 cp s3://rareplanes-public/real/tarballs/train/RarePlanes_train_PS-RGB_tiled.tar.gz .
cd ../test
aws s3 cp s3://rareplanes-public/real/tarballs/test/RarePlanes_test_geojson_aircraft_tiled.tar.gz .
aws s3 cp s3://rareplanes-public/real/tarballs/test/RarePlanes_test_PS-RGB_tiled.tar.gz .
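Once the four tarballs above are downloaded, they still need to be unpacked. The following is a minimal sketch of one way to do that, not an official step from the dataset authors. The helper name `extract_tarballs` is hypothetical, and the sketch assumes the `train/` and `test/` layout created by the commands above; the directory structure inside each archive is not described here, so check the extracted contents before pointing training code at them.

```shell
# Hypothetical helper: extract every .tar.gz archive found directly
# inside each directory passed as an argument, in place.
extract_tarballs() {
    for dir in "$@"; do
        for archive in "$dir"/*.tar.gz; do
            # Skip the unexpanded glob when a directory holds no archives.
            [ -e "$archive" ] || continue
            echo "Extracting $archive"
            # -x extract, -z gunzip, -f archive file, -C target directory
            tar -xzf "$archive" -C "$dir"
        done
    done
}
```

Run from the directory that contains both folders, a call such as `extract_tarballs train test` would unpack the imagery and GeoJSON archives next to the tarballs themselves.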
All real data (~107 GB) including both tiled and untiled formats can be downloaded using this command:
aws s3 cp --recursive s3://rareplanes-public/real/tarballs/ .
All synthetic data (~211 GB) can be downloaded using this command:
aws s3 cp --recursive s3://rareplanes-public/synthetic/ .
All model weights (~3.8 GB) used in the experiments can be downloaded using this command:
aws s3 cp --recursive s3://rareplanes-public/weights/ .
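Because the full downloads above are large, it can be useful to check the size of a prefix before copying it. The sketch below wraps `aws s3 ls` with its `--recursive`, `--summarize`, and `--human-readable` options. The function name `rareplanes_summary` and the `AWS_CLI` override variable are hypothetical, introduced only for this illustration. The `--no-sign-request` option skips credential signing; that the bucket accepts such anonymous requests is an assumption, so drop the option if you prefer to use configured credentials.

```shell
# Hypothetical helper: list everything under a prefix of the RarePlanes
# bucket and print a summary of object count and total size.
# AWS_CLI is an illustrative override hook; it defaults to the `aws` binary.
rareplanes_summary() {
    prefix="$1"
    # --no-sign-request assumes the bucket allows anonymous reads.
    "${AWS_CLI:-aws}" s3 ls --recursive --summarize --human-readable \
        --no-sign-request "s3://rareplanes-public/${prefix}"
}
```

For example, `rareplanes_summary weights/ | tail -n 2` should show only the summary lines for the model weights, which can be compared against the approximate sizes quoted above.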
Dataset User Guide
www.iqt.org/library/rareplanes-public-user-guide
Codebase
https://github.com/aireveries/RarePlanes
Attribution
These data are licensed under the CC BY-SA 4.0 license.
Please cite our work if you use the dataset:
@misc{RarePlanes_Dataset,
title={RarePlanes Dataset},
author={Shermeyer, Jacob and Hossler, Thomas and Van Etten, Adam and Hogan, Daniel and Lewis, Ryan and Kim, Daeil},
organization = {In-Q-Tel - CosmiQ Works and AI.Reverie},
month = {June},
year = {2020}
}
@article{RarePlanes_Paper,
title={RarePlanes: Synthetic Data Takes Flight},
author={Shermeyer, Jacob and Hossler, Thomas and Van Etten, Adam and Hogan, Daniel and Lewis, Ryan and Kim, Daeil},
organization = {In-Q-Tel - CosmiQ Works and AI.Reverie},
month = {June},
year = {2020}
}