Object Detection Exercises

The exercises include Image Annotation Tools, examples of YOLOv4, v5, v6, v7, YOLOR, YOLOX, CSL-YOLO, and YOLOv5 applications, Mask R-CNN, SSD MobileNet, YOLOv5 + DeepSort, Objectron, Steel Defect Detection, PCB Defect Detection, Identify Military Vehicles in Satellite Imagery, Pothole Detection, and Car Braking Detection.


Image Annotation

FiftyOne

Annotating Datasets with Labelbox
To get started, you need to install FiftyOne and the Labelbox Python client:
!pip install fiftyone labelbox


labelme

$ pip install labelme


LabelImg

$ pip install labelImg

$ labelImg
$ labelImg [IMAGE_PATH] [PRE-DEFINED CLASS FILE]

VOC .xml convert to YOLO .txt

$ cd ~/tf/raccoon/annotations
$ python ~/tf/xml2yolo.py


Annotation formats

  • YOLO format in .txt: class_num x_center y_center width height (box values normalized to 0-1 by image width and height)
    0 0.5222826086956521 0.5518115942028986 0.025 0.010869565217391304
    0 0.5271739130434783 0.5057971014492754 0.013043478260869565 0.004347826086956522
    
  • Pascal VOC format in .xml
<annotation>
	<folder>JPEGImages</folder>
	<filename>BloodImage_00000.jpg</filename>
	<path>/home/pi/detection_dataset/JPEGImages/BloodImage_00000.jpg</path>
	<source>
		<database>Unknown</database>
	</source>
	<size>
		<width>640</width>
		<height>480</height>
		<depth>3</depth>
	</size>
	<segmented>0</segmented>
	<object>
		<name>WBC</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>260</xmin>
			<ymin>177</ymin>
			<xmax>491</xmax>
			<ymax>376</ymax>
		</bndbox>
	</object>
</annotation>
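
To make the relationship between the two formats concrete, here is a minimal Python sketch that turns one YOLO label line back into pixel corner coordinates of the kind stored under bndbox. The function name yolo_to_voc and the rounding to whole pixels are assumptions made for illustration; they are not taken from any of the tools listed above.

```python
def yolo_to_voc(line, img_w, img_h):
    """Convert one YOLO label line to (class_id, xmin, ymin, xmax, ymax) in pixels."""
    fields = line.split()
    class_id = int(fields[0])
    # YOLO stores the box center and size, normalized by image width and height
    x_c, y_c, w, h = (float(v) for v in fields[1:5])
    box_w, box_h = w * img_w, h * img_h
    xmin = round(x_c * img_w - box_w / 2)
    ymin = round(y_c * img_h - box_h / 2)
    xmax = round(x_c * img_w + box_w / 2)
    ymax = round(y_c * img_h + box_h / 2)
    return class_id, xmin, ymin, xmax, ymax


if __name__ == "__main__":
    # First YOLO example line, with an assumed (hypothetical) 640x480 image size
    line = "0 0.5222826086956521 0.5518115942028986 0.025 0.010869565217391304"
    print(yolo_to_voc(line, 640, 480))
```

Because YOLO values are relative to the image size, the same label line maps to different pixel boxes for different image sizes, so the width and height must always come from the image itself.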

YOLOs

YOLOv4

Kaggle: rkuo2000/yolov4

YOLOv5

Kaggle: rkuo2000/yolov5

Scaled YOLOv4

Kaggle: rkuo2000/scaled-yolov4

YOLOR

Kaggle: rkuo2000/yolor

YOLOX

Kaggle: rkuo2000/yolox

CSL-YOLO

Kaggle: rkuo2000/csl-yolo

PP-YOLOE

Kaggle: rkuo2000/pp-yoloe

YOLOv6

Kaggle: rkuo2000/yolov6

YOLOv7

Kaggle: rkuo2000/yolov7


YOLOv5 applications

YOLOv5 Detect

detect image / video


YOLOv5 Elephant

train YOLOv5 to detect elephants (dataset from Open Images V6)


YOLOv5 BCCD

BCCD Dataset is a small-scale dataset for blood cells detection.
3 classes: RBC (Red Blood Cell), WBC (White Blood Cell), Platelets
Github: https://github.com/Shenggan/BCCD_Dataset
Kaggle: https://www.kaggle.com/datasets/surajiiitm/bccd-dataset

  • Directory Structure:
├── BCCD
│   ├── Annotations
│   │       └── BloodImage_00000.xml ~ 00410.xml (364 files)
│   ├── ImageSets/Main/train.txt, val.txt, test.txt, trainval.txt (filename list)
│   └── JPEGImages
│       └── BloodImage_00000.jpg ~ 00410.jpg (364 files)
  • Convert Annotations (from Pascal VOC .xml to YOLO format .txt)
import os
from xml.dom.minidom import parse

# Paths and class order are assumptions; adjust them to your own layout.
ANNOTATIONS_PATH = 'BCCD/Annotations'   # folder with the VOC .xml files
LABELS_ROOT = 'BCCD/Labels'             # assumed output folder for YOLO .txt files
classes = ['Platelets', 'RBC', 'WBC']   # class_id is the index in this list

def cord_converter(size, box):
    """Convert an xml annotation box to darknet (YOLO) format coordinates.

    :param size: image size [w, h]
    :param box: box corners [upper-left x, upper-left y, lower-right x, lower-right y]
    :return: normalized [x_center, y_center, w, h]
    """
    x1, y1, x2, y2 = (int(v) for v in box)

    # scale factors that normalize pixel values to the 0-1 range
    dw = 1.0 / int(size[0])
    dh = 1.0 / int(size[1])

    w = x2 - x1
    h = y2 - y1
    x = x1 + (w / 2)
    y = y1 + (h / 2)

    return [x * dw, y * dh, w * dw, h * dh]

def save_file(file_stem, size, img_box):
    """Write one YOLO label file (one line per box) for the given image."""
    os.makedirs(LABELS_ROOT, exist_ok=True)
    save_file_name = os.path.join(LABELS_ROOT, file_stem + '.txt')
    print(save_file_name)
    # open with "w" (not "a+") so re-running the conversion does not duplicate labels
    with open(save_file_name, "w") as f:
        for box in img_box:
            cls_num = classes.index(box[0])          # find class_id
            new_box = cord_converter(size, box[1:])  # convert box coords into YOLO x,y,w,h
            f.write(f"{cls_num} {new_box[0]} {new_box[1]} {new_box[2]} {new_box[3]}\n")

def get_xml_data(file_path, img_xml_file):
    """Parse one VOC .xml file and save its boxes as a YOLO .txt file."""
    img_path = os.path.join(file_path, img_xml_file + '.xml')
    print(img_path)

    dom = parse(img_path)
    root = dom.documentElement
    img_size = root.getElementsByTagName("size")[0]
    objects = root.getElementsByTagName("object")
    img_w = img_size.getElementsByTagName("width")[0].childNodes[0].data
    img_h = img_size.getElementsByTagName("height")[0].childNodes[0].data
    # print("image_info:(w,h)", img_w, img_h)
    img_box = []
    for box in objects:
        cls_name = box.getElementsByTagName("name")[0].childNodes[0].data
        x1 = int(box.getElementsByTagName("xmin")[0].childNodes[0].data)
        y1 = int(box.getElementsByTagName("ymin")[0].childNodes[0].data)
        x2 = int(box.getElementsByTagName("xmax")[0].childNodes[0].data)
        y2 = int(box.getElementsByTagName("ymax")[0].childNodes[0].data)
        # print("box:(c,xmin,ymin,xmax,ymax)", cls_name, x1, y1, x2, y2)
        img_box.append([cls_name, x1, y1, x2, y2])
    # print(img_box)

    save_file(img_xml_file, [img_w, img_h], img_box)

# convert every .xml annotation found in ANNOTATIONS_PATH
for file in sorted(os.listdir(ANNOTATIONS_PATH)):
    name, ext = os.path.splitext(file)
    if ext.lower() != '.xml':
        continue  # skip anything that is not an annotation file
    print("file name: ", file)
    get_xml_data(ANNOTATIONS_PATH, name)
  • Create yaml for YOLO (train, val path & labels)
!echo "train: Dataset/images/train" > data/bccd.yaml
!echo "val: Dataset/images/val" >> data/bccd.yaml
!echo "nc: 3" >> data/bccd.yaml
!echo "names: ['Platelets', 'RBC', 'WBC']" >> data/bccd.yaml

!cat data/bccd.yaml
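
The yaml above points at Dataset/images/train and Dataset/images/val, while the BCCD download keeps all images in one folder and lists the splits in ImageSets/Main. One possible way to bridge the two is sketched below. The function name split_dataset and its parameters are hypothetical, and labels_dir stands for wherever the converted .txt files were written (LABELS_ROOT in the conversion code). The sketch writes labels to a labels folder next to images because YOLOv5 looks up each label file by replacing "images" with "labels" in the image path.

```python
import shutil
from pathlib import Path


def split_dataset(voc_root, labels_dir, out_root, subsets=("train", "val")):
    """Copy images and YOLO labels into out_root/images/<subset> and out_root/labels/<subset>.

    Returns a dict mapping each subset name to the number of images copied.
    """
    voc_root, labels_dir, out_root = Path(voc_root), Path(labels_dir), Path(out_root)
    counts = {}
    for subset in subsets:
        # each list file holds one image name (without extension) per line
        list_file = voc_root / "ImageSets" / "Main" / f"{subset}.txt"
        names = [n.strip() for n in list_file.read_text().splitlines() if n.strip()]
        img_out = out_root / "images" / subset
        lbl_out = out_root / "labels" / subset
        img_out.mkdir(parents=True, exist_ok=True)
        lbl_out.mkdir(parents=True, exist_ok=True)
        copied = 0
        for name in names:
            img = voc_root / "JPEGImages" / f"{name}.jpg"
            lbl = labels_dir / f"{name}.txt"
            if not (img.exists() and lbl.exists()):
                continue  # skip entries without both an image and a label file
            shutil.copy(img, img_out / img.name)
            shutil.copy(lbl, lbl_out / lbl.name)
            copied += 1
        counts[subset] = copied
    return counts
```

A call such as split_dataset("BCCD", "BCCD/Labels", "Dataset") would then produce the folder layout the yaml expects, assuming those paths match your setup.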

YOLOv5 Helmet

SafetyHelmetWearing-Dataset

|--VOC2028    
    |---Annotations    
    |---ImageSets    
    |---JPEGImages   

dataset conversion from Pascal VOC to YOLO format


YOLOv5 Facemask
train YOLOv5 for facemask detection


YOLOv5 Traffic Analysis
use YOLOv5 to detect car/truck per frame, then analyze vehicle counts per lane and the estimated speed
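
The notebook's own analysis code is not reproduced here. As a rough illustration of the two measurements, the sketch below counts boxes per lane by their horizontal center and estimates speed from how far one vehicle's box center moves between two frames. Everything in it is an assumption: the function names, lanes modeled as vertical strips given by x positions in pixels, and a single fixed meters-per-pixel scale for the whole image.

```python
def count_per_lane(boxes, lane_edges):
    """Count detections per lane for one frame.

    boxes: list of (xmin, ymin, xmax, ymax) in pixels.
    lane_edges: ascending x positions of lane borders; N edges define N-1 lanes.
    """
    counts = [0] * (len(lane_edges) - 1)
    for xmin, ymin, xmax, ymax in boxes:
        cx = (xmin + xmax) / 2  # horizontal center of the box
        for i in range(len(counts)):
            if lane_edges[i] <= cx < lane_edges[i + 1]:
                counts[i] += 1
                break  # a box belongs to at most one lane
    return counts


def estimate_speed_kmh(center_prev, center_curr, frames_apart, fps, meters_per_pixel):
    """Estimate speed from the displacement of one vehicle's box center between two frames."""
    dx = center_curr[0] - center_prev[0]
    dy = center_curr[1] - center_prev[1]
    distance_m = (dx ** 2 + dy ** 2) ** 0.5 * meters_per_pixel
    seconds = frames_apart / fps
    return distance_m / seconds * 3.6  # m/s -> km/h
```

The speed estimate only makes sense when both centers belong to the same vehicle, so in practice the detections need to be associated across frames first, for example with a tracker.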


YOLOv5 Global Wheat Detection
train YOLOv5 for wheat detection


EfficientDet Global Wheat Detection


Mask R-CNN

Kaggle: rkuo2000/mask-rcnn


Mask R-CNN transfer learning

Kaggle: Mask RCNN transfer learning


YOLOv5 + DeepSort

Kaggle: YOLOv5 DeepSort


Objectron

Kaggle: rkuo2000/mediapipe-objectron


OpenCV-Python plays GTA5

Ref. Reading game frames in Python with OpenCV - Python Plays GTA V
Code: Sentdex/pygta5


Steel Defect Detection

Dataset: Severstal: Steel Defect Detection
Kaggle: https://www.kaggle.com/code/jaysmit/u-net (Keras UNet)


PCB Defect Detection

Dataset: HRIPCB dataset (dropbox)


Identify Military Vehicles in Satellite Imagery

Blog: Identify Military Vehicles in Satellite Imagery with TensorFlow
Dataset: Moving and Stationary Target Acquisition and Recognition (MSTAR) Dataset


Pothole Detection

Blog: Pothole Detection using YOLOv4
Code: yolov4_pothole_detection.ipynb
Kaggle: YOLOv7 Pothole Detection

  • create .yaml for YOLO
%%writefile data/pothole.yaml
train: ../pothole_dataset/images/train 
val: ../pothole_dataset/images/valid
test: ../pothole_dataset/images/test

# Classes
nc: 1  # number of classes
names: ['pothole']  # class names
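
%%writefile is a notebook magic, so it is not available in a plain Python script. A small sketch of how the same file could be produced outside a notebook follows; build_data_yaml is a hypothetical helper, not part of YOLOv7. It derives nc from the class list so the two values cannot drift apart.

```python
from pathlib import Path


def build_data_yaml(train, val, names, test=None):
    """Return the text of a YOLO dataset yaml; nc is derived from names."""
    lines = [f"train: {train}", f"val: {val}"]
    if test is not None:
        lines.append(f"test: {test}")
    lines += [
        "",
        "# Classes",
        f"nc: {len(names)}  # number of classes",
        f"names: {list(names)!r}  # class names",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = build_data_yaml(
        train="../pothole_dataset/images/train",
        val="../pothole_dataset/images/valid",
        test="../pothole_dataset/images/test",
        names=["pothole"],
    )
    out = Path("data/pothole.yaml")
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(text)
```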

Car Braking Detection

Code: YOLOv7 Braking Detection



This site was last updated December 22, 2022.