Figure 1. Example of annotated image with one bumblebee

Insect benchmark datasets with time-lapse images, as described in the paper:

Bjerge K, Alison J, Dyrmann M, Frigaard C.E., Mann H. M. R., Høye T.T., Accurate detection and identification of insects from camera trap images with deep learning, bioRxiv, 2022

MD5 hash: 6831b05cab0988743a113819eb23be75
Zipped size: 8.8 GB
Unzipped size: 41.5 GB
Files: 43,683

MD5 hash: 88317db11fd10fab4976edb4d8d4a71f
Zipped size: 1.1 GB
Unzipped size: 5.24 GB
Files: 5,708

MD5 hash: d940cac65cf067a3baf356ecaa9944e3
Zipped size: 2.83 GB
Unzipped size: 12.9 GB
Files: 14,590
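
A downloaded archive can be checked against the MD5 hashes listed above before unzipping. The following is a minimal sketch using only the Python standard library; the function names and the file name in the usage note are placeholders for illustration, not part of the dataset.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, reading it in chunks
    so that multi-gigabyte archives do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_matches(path, expected_md5):
    """Compare a file's MD5 digest with a published value (case-insensitive)."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, `archive_matches("dataset.zip", "6831b05cab0988743a113819eb23be75")` should return `True` for an intact copy of the first archive, where `dataset.zip` stands for whatever name the downloaded file has.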

Labels in YOLO format: ultralytics/yolov5: label format
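
In the YOLO label format, each line of a .txt file describes one object as `class x_center y_center width height`, with the four box values normalized to the range 0 to 1 relative to the image size. The sketch below converts such lines to pixel coordinates. The image width and height are passed in by the caller, since the image resolution is not stated here, and the `Box` type and function name are illustrative.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    class_id: int
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def parse_yolo_labels(text: str, image_width: int, image_height: int) -> List[Box]:
    """Convert YOLO label lines to pixel-coordinate corner boxes.

    An empty label file (a background image) yields an empty list.
    """
    boxes = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        class_id = int(parts[0])
        x_c, y_c, w, h = (float(value) for value in parts[1:5])
        boxes.append(
            Box(
                class_id,
                (x_c - w / 2) * image_width,
                (y_c - h / 2) * image_height,
                (x_c + w / 2) * image_width,
                (y_c + h / 2) * image_height,
            )
        )
    return boxes
```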

The annotated training and validation datasets contain insects of nine different species, as listed below:

0 Coccinellidae septempunctata
1 Apis mellifera
2 Bombus lapidarius
3 Bombus terrestris
4 Eupeodes corolla
5 Episyrphus balteatus
6 Aglais urticae
7 Vespula vulgaris
8 Eristalis tenax

The test dataset contains additional classes of insects.

9 Non-Bombus Anthophila
10 Bombus spp.
11 Syrphidae
12 Fly spp.
13 Unclear insect
14 Mixed animals:
Non-Anthophila Hymenoptera
Non-Syrphidae Diptera
Non-Coccinellidae Coleoptera
Other animals
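
The class indices above can be kept in a lookup table when inspecting label files. The sketch below counts how many annotations of each class a label file contains. The names are copied from the lists above; the dictionary and function names are illustrative and do not come from the dataset's own scripts.

```python
from collections import Counter

# Class index -> name, copied from the lists above.
CLASS_NAMES = {
    0: "Coccinellidae septempunctata",
    1: "Apis mellifera",
    2: "Bombus lapidarius",
    3: "Bombus terrestris",
    4: "Eupeodes corolla",
    5: "Episyrphus balteatus",
    6: "Aglais urticae",
    7: "Vespula vulgaris",
    8: "Eristalis tenax",
    9: "Non-Bombus Anthophila",
    10: "Bombus spp.",
    11: "Syrphidae",
    12: "Fly spp.",
    13: "Unclear insect",
    14: "Mixed animals",
}


def count_classes(label_text):
    """Count annotations per class name in the text of one YOLO label file."""
    counts = Counter()
    for line in label_text.splitlines():
        parts = line.split()
        if parts:
            # The first field of each label line is the class index.
            counts[CLASS_NAMES.get(int(parts[0]), "unknown")] += 1
    return dict(counts)
```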

There are two naming conventions for image (.jpg) and label (.txt) files.

Background images without insects are named:
Background image: 12_13-20190704172200-snapshot.jpg
Empty label file: 12_13-20190704172200-snapshot.txt

Images annotated with insects are named:
Image file: S1_146-Aug23_1_156-20190822133230.jpg
Label file: S1_146-Aug23_1_156-20190822133230.txt


YYYYMMDDHHMMSS – Capture timestamp with year, month, day, hour, minute, and second
Seq – Sequence number created by the motion program to separate images
C – Camera Id (0 or 1) of one of the two cameras in the system identified by SZ_IP
MonthDate – Name of the folder where the original images were stored in the system
SZ_IP – Identification of five camera systems: S1_123, S2_146, S3_194, S4_199, S5_187 (Two cameras in each system)
X – An index number related to a specific camera and folder ensuring unique file names of background images from different camera systems.

The important information in a filename is the system (SZ_IP), the camera Id (C), and the timestamp (YYYYMMDDHHMMSS).
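
The system, camera Id, and timestamp can be extracted from the name of an annotated image with a regular expression. The sketch below assumes the field order SZ_IP-MonthDate_C_Seq-YYYYMMDDHHMMSS, which is inferred from the example name S1_146-Aug23_1_156-20190822133230.jpg rather than stated explicitly, so it should be checked against the actual files. It does not attempt to parse background image names and returns None for them.

```python
import re
from datetime import datetime
from pathlib import Path

# ASSUMPTION: field order inferred from the example file name in this README.
ANNOTATED_NAME = re.compile(
    r"^(?P<system>S\d+_\d+)"    # SZ_IP, e.g. S1_146
    r"-(?P<folder>[A-Za-z]+\d+)"  # MonthDate, e.g. Aug23
    r"_(?P<camera>[01])"        # C, camera Id 0 or 1
    r"_(?P<seq>\d+)"            # Seq, sequence number
    r"-(?P<timestamp>\d{14})$"  # YYYYMMDDHHMMSS
)


def parse_annotated_name(filename):
    """Return the fields of an annotated image or label file name,
    or None if the name does not follow the assumed pattern."""
    match = ANNOTATED_NAME.match(Path(filename).stem)
    if match is None:
        return None
    return {
        "system": match.group("system"),
        "folder": match.group("folder"),
        "camera": int(match.group("camera")),
        "seq": int(match.group("seq")),
        "timestamp": datetime.strptime(match.group("timestamp"), "%Y%m%d%H%M%S"),
    }
```

Because the extension is stripped before matching, the same function works for both the .jpg image and its .txt label file.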

The three best YOLOv5 models from the paper are available in PyTorch format.

All models are tested with YOLOv5 release v7.0 (22-11-2022): ultralytics/yolov5: YOLOv5 🚀 in PyTorch

Download the archive containing the files listed below.
MD5 hash: bc2194e94bfbe0ba93e4a66df6eb6f1b
Zipped size: 489 MByte
Unzipped size: 528 MByte

Model no. 6 in Table 2 (F1=0.912)
Model no. 8 in Table 2 (F1=0.925)
Model no. 10 in Table 2 (F1=0.932)

insects-1201val.yaml: YAML file with label names to train YOLOv5
Linux bash shell script with parameters to train YOLOv5m6
Linux bash shell script with parameters to validate models

Copyright © 2018 AU Signal Processing Group