Lecture

Object Detection Exercises

13 Oct 2022 • Richard Kuo

The exercises include Image Annotation Tools, examples of YOLOv4, v5, v6, v7, YOLOR, YOLOX, CSL-YOLO, and YOLOv5 applications, Mask R-CNN, SSD MobileNet, YOLOv5 + DeepSort, Objectron, Steel Defect Detection, PCB Defect Detection, Identify Military Vehicles in Satellite Imagery, Pothole Detection, and Car Braking Detection.

Image Annotation

FiftyOne

Annotating Datasets with Labelbox
To get started, you need to install FiftyOne and the Labelbox Python client:
!pip install fiftyone labelbox

labelme

$ pip install labelme

LabelImg

$ pip install labelImg

$ labelImg
$ labelImg [IMAGE_PATH] [PRE-DEFINED CLASS FILE]

VOC .xml convert to YOLO .txt

$ cd ~/tf/raccoon/annotations
$ python ~/tf/xml2yolo.py

Annotation formats

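YOLO format in .txt: one object per line as class_num x_center y_center width height, with the four box values normalized to [0, 1] by the image width and height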

0 0.5222826086956521 0.5518115942028986 0.025 0.010869565217391304
0 0.5271739130434783 0.5057971014492754 0.013043478260869565 0.004347826086956522
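
To make the normalized fields concrete, here is a minimal sketch that parses one label line and maps it back to pixel corners. The function name `yolo_line_to_pixels`, the sample line, and the 640x480 image size are illustrative assumptions, not part of any tool listed here.

```python
def yolo_line_to_pixels(line, img_w, img_h):
    """Parse 'class x_center y_center width height' (box values normalized to 0..1)
    and return (class_id, xmin, ymin, xmax, ymax) in pixels."""
    cls, xc, yc, w, h = line.split()
    # Scale x-related values by image width, y-related values by image height
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    # The stored point is the box centre, so step half the size in each direction
    return int(cls), xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2


# Hypothetical label line on an assumed 640x480 image
print(yolo_line_to_pixels("0 0.5 0.5 0.25 0.5", 640, 480))
# → (0, 240.0, 120.0, 400.0, 360.0)
```

The image size is not stored in the label file, so it has to come from the image itself when converting back to pixels.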

Pascal VOC format in .xml

<annotation>
	<folder>JPEGImages</folder>
	<filename>BloodImage_00000.jpg</filename>
	<path>/home/pi/detection_dataset/JPEGImages/BloodImage_00000.jpg</path>
	<source>
		<database>Unknown</database>
	</source>
	<size>
		<width>640</width>
		<height>480</height>
		<depth>3</depth>
	</size>
	<segmented>0</segmented>
	<object>
		<name>WBC</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>260</xmin>
			<ymin>177</ymin>
			<xmax>491</xmax>
			<ymax>376</ymax>
		</bndbox>
	</object>
</annotation>
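
An annotation file like the one above can be read with the standard library's `xml.etree.ElementTree`. The sketch below is one possible way to do it; the helper name `read_voc_boxes` is an assumption for illustration, and the inline XML is a trimmed copy of the sample.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample annotation, kept inline so the sketch is self-contained
SAMPLE = """<annotation>
  <filename>BloodImage_00000.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>WBC</name>
    <bndbox><xmin>260</xmin><ymin>177</ymin><xmax>491</xmax><ymax>376</ymax></bndbox>
  </object>
</annotation>"""


def read_voc_boxes(xml_text):
    """Return ((img_w, img_h), [(class_name, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = int(size.findtext("width"))
    img_h = int(size.findtext("height"))
    boxes = []
    for obj in root.iter("object"):
        bb = obj.find("bndbox")
        corners = [int(bb.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((obj.findtext("name"), *corners))
    return (img_w, img_h), boxes


print(read_voc_boxes(SAMPLE))
# → ((640, 480), [('WBC', 260, 177, 491, 376)])
```

Unlike the YOLO format, the box here is stored as absolute pixel corners, and the image size travels with the annotation.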

YOLOs

YOLOv5 applications

YOLOv5 Detect

detect image / video

YOLOv5 Elephant

train YOLOv5 to detect elephants (dataset from Open Images V6)

YOLOv5 BCCD

BCCD Dataset is a small-scale dataset for blood cell detection.
3 classes: RBC (Red Blood Cell), WBC (White Blood Cell), Platelets
Github: https://github.com/Shenggan/BCCD_Dataset
Kaggle: https://www.kaggle.com/datasets/surajiiitm/bccd-dataset

Directory Structure:

├── BCCD
│   ├── Annotations
│   │   └── BloodImage_00000.xml ~ 00410.xml (364 files)
│   ├── ImageSets/Main/train.txt, val.txt, test.txt, trainval.txt (filename lists)
│   └── JPEGImages
│       └── BloodImage_00000.jpg ~ 00410.jpg (364 files)

Convert Annotations (from Pascal VOC .xml to YOLO format .txt)

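import os
from xml.dom.minidom import parse

import numpy as np

# Assumed locations and class order; adjust these to your own layout.
ANNOTATIONS_PATH = 'BCCD/Annotations'
LABELS_ROOT = 'Labels'
classes = ['Platelets', 'RBC', 'WBC']  # order must match `names` in the YAML
os.makedirs(LABELS_ROOT, exist_ok=True)

def cord_converter(size, box):
    """Convert an xml annotation box to darknet (YOLO) format coordinates.

    :param size: image size [w, h]
    :param box: bounding box [upper-left x, upper-left y, lower-right x, lower-right y]
    :return: converted [x, y, w, h], box centre and size normalized to [0, 1]
    """
    x1 = int(box[0])
    y1 = int(box[1])
    x2 = int(box[2])
    y2 = int(box[3])

    # Scale factors: 1 / image width and 1 / image height
    dw = np.float32(1. / int(size[0]))
    dh = np.float32(1. / int(size[1]))

    # Box size and centre in pixels
    w = x2 - x1
    h = y2 - y1
    x = x1 + (w / 2)
    y = y1 + (h / 2)

    # Normalize by the image size
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return [x, y, w, h]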

def save_file(file_stem, size, img_box):
    """Write one YOLO label file with a 'class x y w h' line per box."""
    save_file_name = LABELS_ROOT + '/' + file_stem + '.txt'
    print(save_file_name)
    # Mode "w" so that re-running the script does not append duplicate boxes
    with open(save_file_name, "w") as label_file:
        for box in img_box:
            cls_num = classes.index(box[0])  # find class_id
            new_box = cord_converter(size, box[1:])  # convert box coord into YOLO x,y,w,h
            label_file.write(f"{cls_num} {new_box[0]} {new_box[1]} {new_box[2]} {new_box[3]}\n")
    
def get_xml_data(file_path, img_xml_file):
    """Read one VOC .xml annotation and write the matching YOLO .txt label file."""
    xml_path = file_path + '/' + img_xml_file + '.xml'
    print(xml_path)

    dom = parse(xml_path)
    root = dom.documentElement
    img_size = root.getElementsByTagName("size")[0]
    objects = root.getElementsByTagName("object")
    img_w = img_size.getElementsByTagName("width")[0].childNodes[0].data
    img_h = img_size.getElementsByTagName("height")[0].childNodes[0].data

    # Collect [class_name, xmin, ymin, xmax, ymax] for every object in the image
    img_box = []
    for box in objects:
        cls_name = box.getElementsByTagName("name")[0].childNodes[0].data
        x1 = int(box.getElementsByTagName("xmin")[0].childNodes[0].data)
        y1 = int(box.getElementsByTagName("ymin")[0].childNodes[0].data)
        x2 = int(box.getElementsByTagName("xmax")[0].childNodes[0].data)
        y2 = int(box.getElementsByTagName("ymax")[0].childNodes[0].data)
        img_box.append([cls_name, x1, y1, x2, y2])

    save_file(img_xml_file, [img_w, img_h], img_box)

for file in sorted(os.listdir(ANNOTATIONS_PATH)):
    stem, ext = os.path.splitext(file)
    if ext.lower() != '.xml':
        continue  # ignore anything that is not an annotation file
    print("file name: ", file)
    get_xml_data(ANNOTATIONS_PATH, stem)
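
The YAML in the next step points at `Dataset/images/train` and `Dataset/images/val`, and YOLOv5 looks for each label file under a parallel `labels` directory. A minimal sketch of arranging the files that way from the `ImageSets/Main` lists follows. It assumes each list holds one image stem per line (the Pascal VOC convention); the function name `build_split`, the `labels_root` argument, and copying rather than symlinking are choices made for illustration.

```python
import shutil
from pathlib import Path


def build_split(bccd_root, labels_root, out_root, split):
    """Copy the images and YOLO labels named in ImageSets/Main/<split>.txt
    into out_root/images/<split> and out_root/labels/<split>.
    Returns the number of image/label pairs copied."""
    bccd_root, labels_root, out_root = Path(bccd_root), Path(labels_root), Path(out_root)
    img_dir = out_root / "images" / split
    lbl_dir = out_root / "labels" / split
    img_dir.mkdir(parents=True, exist_ok=True)
    lbl_dir.mkdir(parents=True, exist_ok=True)

    # One stem per line, e.g. BloodImage_00000 (assumed list format)
    stems = (bccd_root / "ImageSets" / "Main" / f"{split}.txt").read_text().split()
    copied = 0
    for stem in stems:
        img = bccd_root / "JPEGImages" / f"{stem}.jpg"
        lbl = labels_root / f"{stem}.txt"
        if img.exists() and lbl.exists():  # skip entries with a missing file
            shutil.copy(img, img_dir / img.name)
            shutil.copy(lbl, lbl_dir / lbl.name)
            copied += 1
    return copied
```

A call such as `build_split("BCCD", "Labels", "Dataset", "train")`, repeated for `"val"`, would produce the layout; the three path arguments are placeholders for wherever the dataset and the converted labels actually live.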

Create yaml for YOLO (train, val path & labels)

!echo "train: Dataset/images/train" > data/bccd.yaml
!echo "val:   Dataset/images/val" >> data/bccd.yaml
!echo "nc: 3" >> data/bccd.yaml
!echo "names: ['Platelets', 'RBC', 'WBC']" >> data/bccd.yaml

!cat data/bccd.yaml
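
The class ids written during conversion come from the order of the `classes` list, so `names` in the YAML has to use the same order and `nc` has to equal its length. A small sketch that derives both from one list, using only the standard library, is shown here; the helper name `dataset_yaml` is an assumption for illustration.

```python
import json


def dataset_yaml(train, val, names):
    """Return the text of a YOLOv5-style dataset YAML; nc is derived from names."""
    return (
        f"train: {train}\n"
        f"val: {val}\n"
        f"nc: {len(names)}\n"
        # A JSON list is also valid YAML flow-sequence syntax
        f"names: {json.dumps(names)}\n"
    )


classes = ["Platelets", "RBC", "WBC"]
print(dataset_yaml("Dataset/images/train", "Dataset/images/val", classes))
```

Writing the returned text to `data/bccd.yaml` gives the same four keys as the shell commands, with no chance of `nc` and `names` drifting apart.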

YOLOv5 Helmet

SafetyHelmetWearing-Dataset

|--VOC2028    
    |---Annotations    
    |---ImageSets    
    |---JPEGImages   

dataset conversion from Pascal VOC to YOLO format

YOLOv5 Facemask
train YOLOv5 for facemask detection

YOLOv5 Traffic Analysis
use YOLOv5 to detect cars and trucks in each frame, then count vehicles per lane and estimate their speed
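
Once each vehicle has a box centre per frame, the counting and speed steps reduce to simple geometry. The sketch below illustrates the idea under stated assumptions and is not the notebook's code: lanes are vertical strips given as pixel x-ranges, a single `meters_per_pixel` scale holds across the frame, and some tracker has already linked the same vehicle between frames. All function and parameter names are hypothetical.

```python
def lane_of(cx, lanes):
    """Return the index of the lane whose [x_start, x_end) contains cx, else None."""
    for i, (x_start, x_end) in enumerate(lanes):
        if x_start <= cx < x_end:
            return i
    return None


def count_per_lane(centers, lanes):
    """Count detected box centres (cx, cy) falling inside each lane strip."""
    counts = [0] * len(lanes)
    for cx, _cy in centers:
        lane = lane_of(cx, lanes)
        if lane is not None:
            counts[lane] += 1
    return counts


def speed_kmh(p_prev, p_curr, frames_apart, fps, meters_per_pixel):
    """Estimate speed from the displacement of one tracked centre between two frames."""
    dx = p_curr[0] - p_prev[0]
    dy = p_curr[1] - p_prev[1]
    dist_m = (dx * dx + dy * dy) ** 0.5 * meters_per_pixel
    seconds = frames_apart / fps
    return dist_m / seconds * 3.6  # m/s to km/h


# Assumed two-lane layout and detections, for illustration only
lanes = [(0, 200), (200, 400)]
print(count_per_lane([(50, 10), (250, 20), (260, 30), (500, 5)], lanes))
print(speed_kmh((0, 0), (6, 8), frames_apart=5, fps=25, meters_per_pixel=0.5))
```

A constant pixel-to-metre scale is a strong simplification; with a perspective camera view the scale changes with distance, so a real estimate would need calibration.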

YOLOv5 Global Wheat Detection
train YOLOv5 for wheat detection

EfficientDet Global Wheat Detection

Mask R-CNN

Kaggle: rkuo2000/mask-rcnn

Mask R-CNN transfer learning

Kaggle: Mask RCNN transfer learning

YOLOv5 + DeepSort

Kaggle: YOLOv5 DeepSort

Objectron

Kaggle: rkuo2000/mediapipe-objectron

OpenCV-Python play GTA5

Ref. Reading game frames in Python with OpenCV - Python Plays GTA V
Code: Sentdex/pygta5

Steel Defect Detection

Dataset: Severstal: Steel Defect Detection
Kaggle: https://www.kaggle.com/code/jaysmit/u-net (Keras UNet)

PCB Defect Detection

Dataset: HRIPCB dataset (dropbox)

Identify Military Vehicles in Satellite Imagery

Blog: Identify Military Vehicles in Satellite Imagery with TensorFlow
Dataset: Moving and Stationary Target Acquisition and Recognition (MSTAR) Dataset

Pothole Detection

Blog: Pothole Detection using YOLOv4
Code: yolov4_pothole_detection.ipynb
Kaggle: YOLOv7 Pothole Detection

create .yaml for YOLO

%%writefile data/pothole.yaml
train: ../pothole_dataset/images/train 
val: ../pothole_dataset/images/valid
test: ../pothole_dataset/images/test

# Classes
nc: 1  # number of classes
names: ['pothole']  # class names
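
With `nc: 1`, every label line should carry class id 0 and four box values in [0, 1]; a label exported with any other id would not match this YAML. A minimal sketch of such a check follows. `check_label_line` is a hypothetical helper, not part of YOLOv4 or YOLOv7.

```python
def check_label_line(line, nc):
    """Return a list of problems found in one YOLO label line (empty if none)."""
    parts = line.split()
    if len(parts) != 5:
        return [f"expected 5 fields, got {len(parts)}"]
    try:
        cls = int(parts[0])
        coords = [float(p) for p in parts[1:]]
    except ValueError:
        return ["non-numeric field"]
    problems = []
    if not 0 <= cls < nc:
        problems.append(f"class id {cls} outside 0..{nc - 1}")
    if any(not 0.0 <= c <= 1.0 for c in coords):
        problems.append("coordinate outside [0, 1]")
    return problems


print(check_label_line("0 0.5 0.5 0.1 0.1", nc=1))  # → []
print(check_label_line("1 0.5 0.5 0.1 0.1", nc=1))  # → ['class id 1 outside 0..0']
```

Running this over every line of every file in the `labels` directories before training is a cheap way to catch a mismatched class list early.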

Car Braking Detection

Code: YOLOv7 Braking Detection

This site was last updated December 22, 2022.