A database of 28 by 28 grayscale images of UFO-like shapes, sighting patterns, and common lookalikes, in the style of the MNIST handwritten digit database.
This page provides the UFO-MNIST V1 files, a small benchmark for machine learning examples, baseline classifiers, and computer-vision experiments. The database has a training set of 8,000 examples and a test set of 2,000 examples. Each image is centered when possible and stored as a single-channel 28 by 28 unsigned-byte image.
UFO-MNIST was assembled from public UFO/UAP sighting references, official release material, and generated augmentations that make the classes balanced and easy to use in MNIST-style experiments.
The following samples show examples from each of the ten classes derived from common spotting categories. They are enlarged here by the browser; the stored images are 28 by 28 pixels.
The main file is a compressed NumPy archive. It contains the arrays train_images, train_labels, test_images, and test_labels, plus class_names and seed.
| File | Description |
|---|---|
| ufo_mnist_28x28.npz | MNIST-style image arrays and labels. |
| labels.json | Integer label to class-name mapping. |
| manifest.csv | Source manifest with URLs, license notes, and redistribution flags. |
| samples.csv | Per-sample split, label, source family, and deterministic seed. |
| dataset_card.md | Dataset card with intended use, composition, and construction notes. |
| checksums.json | SHA-256 checksums for generated release files. |
| baseline_metrics.json | Metrics from the simple linear baseline. |
| cnn_metrics.json | Metrics from the current best CNN benchmark. |
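Since checksums.json carries SHA-256 digests for the release files, a download can be checked before use. The following is a minimal sketch, not part of the release: it assumes checksums.json is a flat mapping of file name to hex digest, which this page does not confirm, and the helper names `sha256_of` and `verify_release` are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_release(directory, checksum_file="checksums.json"):
    """Map each listed file name to True if its digest matches.

    ASSUMPTION: checksums.json looks like {"file name": "hex digest"}.
    Adjust the parsing if the real file is structured differently.
    """
    directory = Path(directory)
    expected = json.loads((directory / checksum_file).read_text())
    results = {}
    for name, digest in expected.items():
        path = directory / name
        # A missing file counts as a failed check rather than an error.
        results[name] = path.is_file() and sha256_of(path) == digest.lower()
    return results
```

Calling `verify_release("data/ufo_mnist_v1")` would then return a dictionary in which any `False` value marks a file that is missing or differs from the recorded digest.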
The image arrays use C (row-major) order, with the last index changing fastest: the first index selects the image, the second the row, and the third the column. The arrays have the following shapes:
| Array | Shape | Type |
|---|---|---|
| train_images | (8000, 28, 28) | uint8 |
| train_labels | (8000,) | uint8 |
| test_images | (2000, 28, 28) | uint8 |
| test_labels | (2000,) | uint8 |
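The shapes and types listed above can be checked programmatically after loading. The sketch below is illustrative only: `EXPECTED`, `check_arrays`, and `flat_offset` are hypothetical names, and `check_arrays` accepts any mapping from array name to array, such as the object returned by `np.load`. The `flat_offset` helper shows what "last index changing fastest" means for a flattened image array.

```python
import numpy as np

# Shapes and dtypes as documented on this page.
EXPECTED = {
    "train_images": ((8000, 28, 28), np.uint8),
    "train_labels": ((8000,), np.uint8),
    "test_images": ((2000, 28, 28), np.uint8),
    "test_labels": ((2000,), np.uint8),
}


def check_arrays(arrays, expected=EXPECTED):
    """Return a list of mismatch descriptions (empty if everything matches)."""
    problems = []
    for name, (shape, dtype) in expected.items():
        if name not in arrays:
            problems.append(f"{name}: missing")
            continue
        array = arrays[name]
        if array.shape != shape:
            problems.append(f"{name}: shape {array.shape}, expected {shape}")
        if array.dtype != dtype:
            problems.append(f"{name}: dtype {array.dtype}, expected uint8")
    return problems


def flat_offset(image, row, col, height=28, width=28):
    """Offset of pixel (image, row, col) in the C-order flattened array."""
    return (image * height + row) * width + col
```

For any C-order image array `a`, `a.ravel()[flat_offset(i, r, c)]` is the same element as `a[i, r, c]`.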
Pixel values are unsigned bytes in the range 0 to 255. Larger values are brighter. The construction script adds controlled blur, noise, scan-line artifacts, hot pixels, and small offsets so that the samples resemble low-resolution spotting imagery rather than perfectly clean symbols.
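To make the listed artifacts concrete, here is a hypothetical degradation function that applies a small offset, a box blur, scan-line darkening, Gaussian noise, and hot pixels to one image. It is not the construction script: the function name `degrade`, the order of operations, and every parameter value are assumptions chosen only for illustration.

```python
import numpy as np


def degrade(image, rng, max_shift=2, noise_sigma=8.0,
            hot_pixels=2, scanline_gain=0.85):
    """Return a degraded uint8 copy of a 2-D uint8 image.

    All parameter defaults are hypothetical, not the release settings.
    """
    height, width = image.shape
    source = image.astype(np.float32)

    # Small offset: shift by up to max_shift pixels, filling with black.
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    out = np.zeros_like(source)
    crop = source[max(0, -dy):height - max(0, dy),
                  max(0, -dx):width - max(0, dx)]
    out[max(0, dy):max(0, dy) + crop.shape[0],
        max(0, dx):max(0, dx) + crop.shape[1]] = crop

    # Blur: 3x3 box filter built from shifted views of a padded copy.
    padded = np.pad(out, 1, mode="edge")
    out = sum(padded[r:r + height, c:c + width]
              for r in range(3) for c in range(3)) / 9.0

    # Scan lines: darken every second row.
    out[::2] *= scanline_gain

    # Sensor noise: additive Gaussian noise.
    out += rng.normal(0.0, noise_sigma, size=out.shape)

    # Hot pixels: a few randomly placed saturated pixels.
    rows = rng.integers(0, height, size=hot_pixels)
    cols = rng.integers(0, width, size=hot_pixels)
    out[rows, cols] = 255.0

    return np.clip(np.rint(out), 0, 255).astype(np.uint8)
```

Passing a generator created with a fixed seed, for example `np.random.default_rng(1337)`, makes the output repeatable for a given input image.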
| Label | Name | Examples per split |
|---|---|---|
| 0 | disk | 800 train, 200 test |
| 1 | orb | 800 train, 200 test |
| 2 | triangle | 800 train, 200 test |
| 3 | cigar_rod | 800 train, 200 test |
| 4 | light_formation | 800 train, 200 test |
| 5 | irregular_glow | 800 train, 200 test |
| 6 | aircraft | 800 train, 200 test |
| 7 | balloon | 800 train, 200 test |
| 8 | bird | 800 train, 200 test |
| 9 | celestial_or_artifact | 800 train, 200 test |
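The balance shown in the table can be confirmed from a label array with `np.bincount`. This is a small sketch: `CLASS_NAMES` is copied from the table above rather than read from labels.json, and the function names are hypothetical.

```python
import numpy as np

# Class names in label order, copied from the table above.
CLASS_NAMES = [
    "disk", "orb", "triangle", "cigar_rod", "light_formation",
    "irregular_glow", "aircraft", "balloon", "bird",
    "celestial_or_artifact",
]


def class_counts(labels, num_classes=len(CLASS_NAMES)):
    """Count how many samples carry each integer label."""
    labels = np.asarray(labels, dtype=np.int64)
    return np.bincount(labels, minlength=num_classes)


def is_balanced(labels, num_classes=len(CLASS_NAMES)):
    """True if every class has the same number of samples."""
    counts = class_counts(labels, num_classes)
    return bool(np.all(counts == counts[0]))
```

For the documented training split, `class_counts(train_labels)` should contain 800 for every class, and `dict(zip(CLASS_NAMES, class_counts(train_labels)))` gives the counts by name.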
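The arrays can be loaded with NumPy:

    import numpy as np

    # Open the compressed archive; arrays are read on access.
    data = np.load("data/ufo_mnist_v1/ufo_mnist_28x28.npz")

    # Images are (N, 28, 28) uint8 arrays; labels are (N,) uint8 arrays.
    train_images = data["train_images"]
    train_labels = data["train_labels"]
    test_images = data["test_images"]
    test_labels = data["test_labels"]
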
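Many classifiers expect floating-point feature vectors rather than uint8 images. A minimal conversion, assuming the common convention of scaling to the range 0 to 1 (this page does not prescribe one), might look like this; `to_features` is a hypothetical helper name.

```python
import numpy as np


def to_features(images):
    """Flatten (N, H, W) uint8 images to (N, H*W) float32 values in [0, 1]."""
    images = np.asarray(images)
    if images.dtype != np.uint8 or images.ndim != 3:
        raise ValueError("expected a uint8 array shaped (N, H, W)")
    flat = images.reshape(images.shape[0], -1)
    return flat.astype(np.float32) / 255.0
```

Applied to the training images, `to_features(train_images)` would have shape (8000, 784).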
UFO-MNIST V1 was generated with seed 1337. Official records from the public WAR.GOV/UFO release and AARO imagery pages are cited in the source manifest (manifest.csv) as reference material. The released 28 by 28 arrays are balanced with generated and augmented examples so that every class has 800 training samples and 200 test samples.
The current nearest-centroid baseline reaches 42.0 percent accuracy on the test split, against a chance level of 10 percent for ten balanced classes. This gives a simple reference point for quick classifiers.
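For readers unfamiliar with the method, a nearest-centroid classifier averages the training images of each class and assigns a test image to the class with the closest average. The sketch below assumes Euclidean distance on pixels scaled to the range 0 to 1; the page does not state which distance or preprocessing the reported baseline used, so this code may not reproduce the 42.0 percent figure exactly. All function names are hypothetical.

```python
import numpy as np


def _features(images):
    """Flatten uint8 images to float32 vectors in [0, 1]."""
    images = np.asarray(images)
    return images.reshape(images.shape[0], -1).astype(np.float32) / 255.0


def fit_centroids(images, labels, num_classes=10):
    """Return a (num_classes, pixels) array of per-class mean images.

    Assumes every class has at least one training sample.
    """
    features = _features(images)
    labels = np.asarray(labels)
    return np.stack([features[labels == k].mean(axis=0)
                     for k in range(num_classes)])


def predict_centroid(centroids, images):
    """Assign each image to the class with the nearest centroid."""
    features = _features(images)
    # ||x - c||^2 = ||x||^2 - 2 x.c + ||c||^2; the ||x||^2 term is the
    # same for every class, so it can be dropped before the argmin.
    scores = -2.0 * features @ centroids.T + (centroids ** 2).sum(axis=1)
    return scores.argmin(axis=1)


def accuracy(labels, predictions):
    """Fraction of predictions that equal the true labels."""
    return float(np.mean(np.asarray(labels) == np.asarray(predictions)))
```

With the arrays from the archive, `accuracy(test_labels, predict_centroid(fit_centroids(train_images, train_labels), test_images))` would give the test accuracy of this particular variant.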
The current best local benchmark is a compact convolutional neural network with three convolutional blocks, batch normalization, dropout, and adaptive pooling, trained with the AdamW optimizer. It was trained on the 8,000-image training split and evaluated once on the 2,000-image test split.
| Metric | Value |
|---|---|
| CNN accuracy | 99.6 percent |
| CNN macro F1 | 99.6 percent |
| CNN weighted F1 | 99.6 percent |
| Linear baseline accuracy | 45.9 percent |
| Nearest-centroid accuracy | 42.0 percent |
The CNN reached an F1 of 100 percent on disk, orb, triangle, cigar_rod, and bird. The hardest remaining class was celestial_or_artifact, with an F1 of 98.8 percent. Full per-class precision, recall, F1, epoch history, and confusion-matrix values are in cnn_metrics.json.
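The relationship between a confusion matrix and the accuracy, macro F1, and weighted F1 figures can be sketched as follows. This is a generic illustration with hypothetical function names, not the code that produced cnn_metrics.json. Rows are taken to be true labels and columns predicted labels.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes=10):
    """Count samples by (true label, predicted label)."""
    y_true = np.asarray(y_true, dtype=np.int64)
    y_pred = np.asarray(y_pred, dtype=np.int64)
    matrix = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(matrix, (y_true, y_pred), 1)
    return matrix


def summarize(matrix):
    """Return per-class F1, macro F1, weighted F1, and accuracy."""
    true_pos = np.diag(matrix).astype(np.float64)
    predicted = matrix.sum(axis=0)   # TP + FP for each class
    actual = matrix.sum(axis=1)      # TP + FN for each class (support)
    denom = predicted + actual       # 2*TP + FP + FN
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class is absent.
    f1 = np.divide(2.0 * true_pos, denom,
                   out=np.zeros_like(true_pos), where=denom > 0)
    macro_f1 = float(f1.mean())
    weighted_f1 = float((f1 * actual).sum() / actual.sum())
    acc = float(true_pos.sum() / matrix.sum())
    return f1, macro_f1, weighted_f1, acc
```

Because every class has the same number of test samples, the support weights are equal and weighted F1 coincides with macro F1, which is consistent with the two identical values in the table.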