COCO

Overview

COCO is a large-scale object detection, segmentation, and captioning dataset. It contains over 200 000 annotated images, 1.5 million object instances, 80 object categories, 91 stuff categories and 250 000 people with keypoints. COCO has several features: object segmentation, recognition in context and superpixel stuff segmentation.

Associated Paper or Article

For more information, please read Microsoft COCO: Common Objects in Context.

Annotations

The annotations come as JSON files, one for each type of task (object detection, keypoint detection, captioning, stuff segmentation, panoptic segmentation). Each file holds a JSON object whose top-level fields, such as images and annotations, contain JSON arrays of records.

The COCO dataset uses the same annotation format as the LVIS dataset. You can find a comprehensive annotation guide for the latter here.
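
To make the annotation layout more concrete, here is a minimal sketch of a miniature object detection annotation file and a helper that groups bounding boxes by image. The field names (images, categories, annotations, image_id, category_id, bbox, iscrowd) follow the commonly documented COCO detection layout. The sample values and the helper name group_boxes_by_image are made up for illustration. Other task types (captions, keypoints, panoptic) use different per-annotation fields, so treat this as a sketch rather than a full specification.

```python
import json
from collections import defaultdict

# Hypothetical miniature annotation file in the COCO object detection layout.
# All values below are invented for illustration only.
SAMPLE = {
    "images": [
        {"id": 1, "file_name": "000000000001.jpg", "width": 640, "height": 480},
    ],
    "categories": [
        {"id": 18, "name": "dog", "supercategory": "animal"},
    ],
    "annotations": [
        {
            "id": 101,
            "image_id": 1,        # refers to images[].id
            "category_id": 18,    # refers to categories[].id
            "bbox": [100.0, 120.0, 200.0, 150.0],  # [x, y, width, height] in pixels
            "area": 30000.0,
            "iscrowd": 0,
        },
    ],
}


def group_boxes_by_image(coco):
    """Return {file_name: [(category_name, bbox), ...]} for a COCO-style dict."""
    names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    files = {img["id"]: img["file_name"] for img in coco["images"]}
    grouped = defaultdict(list)
    for ann in coco["annotations"]:
        key = files[ann["image_id"]]
        grouped[key].append((names[ann["category_id"]], ann["bbox"]))
    return dict(grouped)


if __name__ == "__main__":
    # Round-trip through JSON to mimic reading the data from a file on disk.
    coco = json.loads(json.dumps(SAMPLE))
    print(group_boxes_by_image(coco))
```

With a real annotation file, the dictionary would instead come from json.load on the downloaded file. The grouping step stays the same because every annotation record points back to its image through image_id.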

Download

You can download the dataset here. You can choose between a 2014 and a 2017 version of the dataset.

Model

No official model has been provided for this dataset.

Benchmarks

No official benchmarks have been provided for this dataset. However, you can consult the object detection, keypoint detection, stuff segmentation, panoptic segmentation or captioning leaderboards.

Associated Challenges

The dataset has associated annual challenges, held from 2015 to 2020. They are listed in the Tasks tab on the official website. Each challenge comprises two to four tasks, such as object detection, keypoint detection, stuff segmentation and panoptic segmentation.

License

The dataset is licensed under the CC BY 4.0 licence.