The MNIST database of handwritten digits, available from this page, has a training set of 60,000 examples, and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image.
ImageNet (http://image-net.org)
ImageNet is an image dataset organized according to the WordNet hierarchy. Each meaningful concept in WordNet, possibly described by multiple words or word phrases, is called a "synonym set" or "synset". There are more than 100,000 synsets in WordNet, the majority of which are nouns (80,000+).
Summary and Statistics (updated on April 30, 2010)
- Total number of non-empty synsets: 21,841
- Total number of images: 14,197,122
- Number of images with bounding box annotations: 1,034,908
- Number of synsets with SIFT features: 1000
- Number of images with SIFT features: 1.2 million
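To put the counts above in proportion, a short sketch can derive two ratios from them: the mean number of images per non-empty synset and the share of images with bounding box annotations. The constants are copied from the summary; the function names are illustrative and not part of any ImageNet tooling.

```python
# Figures copied from the summary above (April 30, 2010 snapshot).
NON_EMPTY_SYNSETS = 21_841
TOTAL_IMAGES = 14_197_122
IMAGES_WITH_BBOXES = 1_034_908


def images_per_synset(total_images: int, synsets: int) -> float:
    """Mean number of images per non-empty synset."""
    return total_images / synsets


def bbox_coverage(with_bboxes: int, total_images: int) -> float:
    """Fraction of all images that carry bounding box annotations."""
    return with_bboxes / total_images


if __name__ == "__main__":
    mean_images = images_per_synset(TOTAL_IMAGES, NON_EMPTY_SYNSETS)
    coverage = bbox_coverage(IMAGES_WITH_BBOXES, TOTAL_IMAGES)
    print(f"{mean_images:.0f} images per synset on average")
    print(f"{coverage:.1%} of images have bounding boxes")
```

With these figures the average comes to roughly 650 images per synset, and about 7.3% of images have bounding boxes.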
Challenges
COCO (http://cocodataset.org)
COCO is a large-scale object detection, segmentation, and captioning dataset.
- Object segmentation
- Recognition in context
- Superpixel stuff segmentation
- 330K images (>200K labeled)
- 1.5 million object instances
- 80 object categories
- 91 stuff categories
- 5 captions per image
- 250,000 people with keypoints
Challenges
- COCO2018 Detection task with object segmentation (ongoing)
Use the 2017 datasets for detection and keypoints
- Download with
gsutil -m rsync gs://images.cocodataset.org/val2017 val2017
replacing val2017 with train2017, test2017, or unlabeled2017 for the other splits
- Train images (with Train/Val annotations): 118K/18GB
- Validation images (with Stuff Train/Val annotations): 5K/1GB
- Test images (with image info): 41K/6GB
- Unlabeled images (with image info): 123K/19GB
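The download command above can be repeated for each split with a small wrapper. The following is a minimal sketch, assuming `gsutil` is installed and on the PATH; `rsync_command` and `download` are hypothetical helper names, and the default dry-run mode only prints the commands without running them.

```python
import os
import shlex
import subprocess

# Bucket and split names are taken from the gsutil command and list above.
BUCKET = "gs://images.cocodataset.org"
SPLITS = ["train2017", "val2017", "test2017", "unlabeled2017"]


def rsync_command(split: str, dest_root: str = ".") -> list:
    """Build the gsutil rsync argument list for one split."""
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split!r}")
    dest = os.path.join(dest_root, split)
    return ["gsutil", "-m", "rsync", f"{BUCKET}/{split}", dest]


def download(splits, dest_root: str = ".", dry_run: bool = True) -> None:
    """Print (dry run) or execute one rsync per requested split."""
    for split in splits:
        cmd = rsync_command(split, dest_root)
        if dry_run:
            print(shlex.join(cmd))
        else:
            # Create the local target directory before syncing into it.
            os.makedirs(cmd[-1], exist_ok=True)
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    download(["val2017"])  # dry run: prints the command only
```

Starting with `val2017` (5K images, 1GB) is a cheap way to check the setup before pulling the 18GB training split.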
COCO API (cocoapi) for Lua, MATLAB, and Python
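The API reads COCO annotation files, which are JSON documents. As a rough, standard-library-only sketch of that layout, the example below counts object instances per category. It assumes the usual top-level keys `images`, `annotations`, and `categories`; the `sample` data is a hand-made placeholder rather than real COCO content, and the helper names are hypothetical.

```python
from collections import Counter

# Hypothetical, hand-made sample following the COCO annotation layout.
# Bounding boxes are [x, y, width, height].
sample = {
    "images": [
        {"id": 1, "file_name": "a.jpg"},
        {"id": 2, "file_name": "b.jpg"},
    ],
    "categories": [
        {"id": 1, "name": "person"},
        {"id": 18, "name": "dog"},
    ],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [10, 20, 30, 40]},
        {"id": 11, "image_id": 1, "category_id": 18, "bbox": [5, 5, 50, 60]},
        {"id": 12, "image_id": 2, "category_id": 1, "bbox": [0, 0, 15, 25]},
    ],
}


def instances_per_category(dataset: dict) -> dict:
    """Count annotated object instances for each category name."""
    names = {cat["id"]: cat["name"] for cat in dataset["categories"]}
    counts = Counter(names[ann["category_id"]] for ann in dataset["annotations"])
    return dict(counts)


def images_with_category(dataset: dict, name: str) -> list:
    """Return sorted ids of images containing at least one instance of `name`."""
    cat_ids = {cat["id"] for cat in dataset["categories"] if cat["name"] == name}
    image_ids = {
        ann["image_id"]
        for ann in dataset["annotations"]
        if ann["category_id"] in cat_ids
    }
    return sorted(image_ids)


if __name__ == "__main__":
    print(instances_per_category(sample))
    print(images_with_category(sample, "person"))
```

For a real annotation file, `json.load` should yield a dictionary of the same shape. The Python API's `COCO` class provides indexed lookups such as `getAnnIds` and `loadAnns`, which are likely the better choice for files of full size.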
Designed to facilitate the research on learning visual representation from noisy web data.
WebVision Dataset 2.0 (over 5,000 synsets and 16 million images crawled from the Flickr website and Google Images search)
Challenges