Image Classification

Introduction to Image Classification, including datasets, models, applications, and transfer learning


Datasets

PASCAL VOC (Visual Object Classes)

VOC2007 train/val/test: 9,963 annotated images containing 24,640 annotated objects
VOC2012 train/val/test: 11,530 annotated images containing 27,450 ROI-annotated objects
20 classes:

  • Person: person
  • Animal: bird, cat, cow, dog, horse, sheep
  • Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
  • Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor

COCO Dataset

  • Object segmentation
  • Recognition in context
  • Superpixel stuff segmentation
  • 330K images (>200K labeled)
  • 1.5 million object instances
  • 80 object categories
  • 91 stuff categories
  • 5 captions per image
  • 250,000 people with keypoints

ImageNet

The ILSVRC 2012 subset of ImageNet spans 1,000 object classes and contains 1,281,167 training images, 50,000 validation images, and 100,000 test images. This subset is available on Kaggle.


Open Images V6+

  • Blog: Open Images V6 — Now Featuring Localized Narratives
  • 15,851,536 boxes on 600 categories
  • 2,785,498 instance segmentations on 350 categories
  • 3,284,280 relationship annotations on 1,466 relationships
  • 675,155 localized narratives
  • 59,919,574 image-level labels on 19,957 categories
  • Extension - 478,000 crowdsourced images with 6,000+ categories

Download
Download and Visualize using FiftyOne

As with any other dataset in the FiftyOne Dataset Zoo, downloading it is as easy as calling:
dataset = fiftyone.zoo.load_zoo_dataset("open-images-v6", split="validation")

import fiftyone as fo
import fiftyone.zoo as foz

# List available zoo datasets
print(foz.list_zoo_datasets())

#
# Load the COCO-2017 validation split into a FiftyOne dataset
#
# This will download the dataset from the web, if necessary
#
dataset = foz.load_zoo_dataset("coco-2017", split="validation")

# Give the dataset a new name, and make it persistent so that you can
# work with it in future sessions
dataset.name = "coco-2017-validation-example"
dataset.persistent = True

# Visualize the dataset in the App
session = fo.launch_app(dataset)

Applications

CIFAR-10

Dataset: CIFAR-10

The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.
Kaggle: https://www.kaggle.com/rkuo2000/cifar10-cnn
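
The CIFAR-10 figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers from the description; the 3 colour channels and one byte per channel value are assumptions (the text only says "colour images"), so the size estimate is indicative rather than the size of any particular download.

```python
# Figures taken from the CIFAR-10 description above.
NUM_CLASSES = 10
IMAGES_PER_CLASS = 6000
TRAIN_IMAGES = 50000
TEST_IMAGES = 10000

# Assumption: "colour" means 3 channels (RGB), stored as one byte per value.
HEIGHT, WIDTH, CHANNELS = 32, 32, 3

total_images = NUM_CLASSES * IMAGES_PER_CLASS
# The class counts and the train/test split should describe the same total.
assert total_images == TRAIN_IMAGES + TEST_IMAGES

bytes_per_image = HEIGHT * WIDTH * CHANNELS
raw_size_mb = total_images * bytes_per_image / 1e6  # uncompressed pixel data

print(f"total images:    {total_images}")          # 60000
print(f"bytes per image: {bytes_per_image}")       # 3072
print(f"raw pixel data:  {raw_size_mb:.1f} MB")    # 184.3 MB
```

At roughly 184 MB of raw pixels, the whole dataset fits comfortably in memory, which is one reason it is a common first benchmark for small CNNs.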


Traffic Sign Classifier (交通號誌辨識)

Dataset: German Traffic Sign Recognition Benchmark (GTSRB)

43 traffic sign classes, 39209 training images, 12630 test images
Kaggle: https://www.kaggle.com/rkuo2000/gtsrb-cnn


Emotion Detection (情緒偵測)

Dataset: FER-2013 (Facial Expression Recognition)

7 facial expressions, 28709 training images, 7178 test images
labels = ["angry", "disgusted", "fearful", "happy", "neutral", "sad", "surprised"]
Kaggle: https://www.kaggle.com/rkuo2000/fer2013-cnn
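
A classifier for this task outputs one probability per expression, and the label list is what turns the highest-scoring index back into a name. The following is a minimal sketch of that decoding step. The helper `decode_prediction` and the example probabilities are hypothetical, and it assumes the model's output indices follow the same order as the `labels` list.

```python
labels = ["angry", "disgusted", "fearful", "happy", "neutral", "sad", "surprised"]

def decode_prediction(probabilities, class_names=labels):
    """Return (label, probability) for the highest-scoring class.

    Hypothetical helper: assumes probabilities[i] belongs to class_names[i].
    """
    if len(probabilities) != len(class_names):
        raise ValueError("expected one probability per class")
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return class_names[best_index], probabilities[best_index]

# Made-up softmax output for a single face image (illustration only).
example_probs = [0.05, 0.02, 0.03, 0.70, 0.10, 0.06, 0.04]
label, prob = decode_prediction(example_probs)
print(label, prob)  # happy 0.7
```

If the training pipeline assigns class indices in a different order (for example, from folder names), the list has to be reordered to match before decoding.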


Pneumonia Detection (肺炎偵測)

Dataset: https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia

Kaggle: https://www.kaggle.com/rkuo2000/pneumonia-cnn


COVID19 Detection (新冠肺炎偵測)

Dataset: https://www.kaggle.com/bachrr/covid-chest-xray

Kaggle:


FaceMask Classification (人臉口罩辨識)

Dataset: Face Mask ~12K Images dataset

Kaggle: https://www.kaggle.com/rkuo2000/facemask-cnn


Garbage Classification (垃圾分類)

Dataset: https://www.kaggle.com/asdasdasasdas/garbage-classification (42MB)

6 categories: cardboard (403), glass (501), metal (410), paper (594), plastic (482), trash (137)

Kaggle: https://www.kaggle.com/rkuo2000/garbage-cnn
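
The per-category counts above are uneven (trash has 137 images, paper has 594), so a model can lean toward the larger classes. One common mitigation is to weight each class inversely to its frequency. The sketch below computes such weights from the counts listed above; it is not taken from the linked notebook, and it assumes class indices follow the alphabetical order of the category names, as Keras directory loaders assign them.

```python
# Image counts per category, copied from the dataset description above.
counts = {
    "cardboard": 403,
    "glass": 501,
    "metal": 410,
    "paper": 594,
    "plastic": 482,
    "trash": 137,
}

total = sum(counts.values())
num_classes = len(counts)

# "Balanced" heuristic: weight = total / (num_classes * count).
# Assumption: class index i corresponds to the i-th name in sorted order.
class_weight = {
    index: total / (num_classes * counts[name])
    for index, name in enumerate(sorted(counts))
}

for index, name in enumerate(sorted(counts)):
    print(f"{index} {name:<10} {class_weight[index]:.3f}")
```

The resulting dictionary has the shape expected by the `class_weight` argument of Keras `model.fit`, so rare classes such as trash contribute more to the loss per sample.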


Food Classification (食物分類)

Dataset: Food-11
The dataset consists of 16,643 images belonging to 11 major food categories:

  • Bread (1,724 images)
  • Dairy product (721 images)
  • Dessert (2,500 images)
  • Egg (1,648 images)
  • Fried food (1,461 images)
  • Meat (2,206 images)
  • Noodles/pasta (734 images)
  • Rice (472 images)
  • Seafood (1,505 images)
  • Soup (2,500 images)
  • Vegetable/fruit (1,172 images)

Kaggle: https://www.kaggle.com/rkuo2000/food11-classification


Mango Classification (芒果分類)

Dataset: Taiwan High-Value Crops: Irwin Mango Image Recognition Competition, official round (台灣高經濟作物 - 愛文芒果影像辨識正式賽)
Kaggle:


Transfer Learning

Birds Classification (鳥類分類)

Dataset: https://www.kaggle.com/rkuo2000/birds2

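Search for photos with Google, download about 20 to 30 photos of each bird species, place them in a folder named birds, compress the folder into birds.zip, and upload it to Kaggle.com/datasets.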

Kaggle: https://www.kaggle.com/rkuo2000/birds-classification
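
The compression step of the dataset preparation can also be scripted. The sketch below is one way to do it with the Python standard library. The helper `zip_dataset`, the one-subfolder-per-class layout, and the placeholder file names are assumptions for illustration; the upload to Kaggle is still done separately.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path

def zip_dataset(folder: Path, output_dir: Path) -> Path:
    """Compress `folder` into <output_dir>/<folder name>.zip.

    The top-level folder is kept inside the archive, so unzipping
    recreates e.g. birds/<class name>/<photo>.
    """
    archive = shutil.make_archive(
        base_name=str(output_dir / folder.name),  # ".zip" is appended automatically
        format="zip",
        root_dir=str(folder.parent),
        base_dir=folder.name,
    )
    return Path(archive)

# Demo on a throwaway directory with placeholder files (not real photos).
with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)
    birds = work / "birds"
    for class_name in ("class_a", "class_b"):  # hypothetical class folders
        class_dir = birds / class_name
        class_dir.mkdir(parents=True)
        (class_dir / "photo_001.jpg").write_bytes(b"placeholder")

    out_dir = work / "out"
    out_dir.mkdir()
    archive_path = zip_dataset(birds, out_dir)
    archive_name = archive_path.name
    with zipfile.ZipFile(archive_path) as zf:
        members = zf.namelist()

print(archive_name)  # birds.zip
```

Keeping one subfolder per class is a convenient layout because Keras directory-based loaders can infer labels from the folder names.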


Animes Classification (卡通人物分類)

Dataset: https://www.kaggle.com/datasets/rkuo2000/animes

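Search for photos with Google, download about 20 to 30 photos of each cartoon character, place them in a folder named animes, compress the folder into animes.zip, and upload it to Kaggle.com/datasets.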

Kaggle: https://www.kaggle.com/rkuo2000/anime-classification


Worms Classification (害蟲分類)

Dataset: worms4

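Search for photos with Google, download about 20 to 30 photos of each class, place them in a folder named worms, compress the folder into worms.zip, and upload it to Kaggle.com/datasets.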

Kaggle: https://www.kaggle.com/rkuo2000/worms-classification


Railway Track Fault Detection (鐵軌故障偵測)

Dataset: Railway Track Fault Detection
Kaggle: https://www.kaggle.com/code/rkuo2000/railtrack-resnet50v2

from tensorflow.keras.applications import ResNet50V2
from tensorflow.keras import models, layers

# Assumed values (not given in the text); adjust them to the dataset
input_shape = (224, 224, 3)
num_classes = 2

# Load ResNet50V2 pre-trained on ImageNet, without its classification head
base_model = ResNet50V2(input_shape=input_shape, weights='imagenet', include_top=False)
base_model.trainable = False  # freeze the base model (for transfer learning)

# Add a new classification head on top of the frozen base
x = base_model.output
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(128, activation='relu')(x)  # fully connected layer
preds = layers.Dense(num_classes, activation='softmax')(x)  # final layer with softmax activation

model = models.Model(inputs=base_model.input, outputs=preds)
model.summary()
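
The final layer of the model above uses a softmax activation, which turns the raw class scores into probabilities that sum to 1. The following is a minimal pure-Python sketch of that computation, independent of Keras; the example scores are made up for illustration.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing on large scores
    # and does not change the result.
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for a two-class problem (illustration only).
logits = [2.0, 0.5]
probs = softmax(logits)
print(probs, sum(probs))
```

The class with the larger score always receives the larger probability, so the predicted class is simply the index of the maximum output.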

Kaggle: https://www.kaggle.com/code/rkuo2000/railtrack-efficientnet

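import efficientnet.tfkeras as efn
from tensorflow.keras import models, layers, optimizers, regularizers, callbacks

# Assumed values (not given in the text); adjust them to the dataset
input_shape = (224, 224, 3)
num_classes = 2

# Load EfficientNetB7 pre-trained on ImageNet, without its classification head
base_model = efn.EfficientNetB7(input_shape=input_shape, weights='imagenet', include_top=False)
base_model.trainable = False  # freeze the base model (for transfer learning)

# Add a new classification head on top of the frozen base
x = base_model.output
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(128, activation='relu')(x)  # ReLU added to match the ResNet50V2 head; the original layer had no activation
out = layers.Dense(num_classes, activation='softmax')(x)  # final layer with softmax activation

model = models.Model(inputs=base_model.input, outputs=out)
model.summary()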

Skin Lesion Classification (皮膚病變分類)

Dataset: Skin Cancer MNIST: HAM10000

7 types of lesions (image size: 600x450):

  • Actinic Keratoses (光化角化病)
  • Basal Cell Carcinoma (基底細胞癌)
  • Benign Keratosis (良性角化病)
  • Dermatofibroma (皮膚纖維瘤)
  • Malignant Melanoma (惡性黑色素瘤)
  • Melanocytic Nevi (黑素細胞痣)
  • Vascular Lesions (血管病變)

Kaggle: https://www.kaggle.com/code/rkuo2000/skin-lesion-classification

  • assign base_model (uncomment one of the lines below; they assume `from tensorflow.keras import applications`)
    #base_model=applications.MobileNetV2(input_shape=(224,224,3), weights='imagenet',include_top=False)
    #base_model=applications.InceptionV3(input_shape=(224,224,3), weights='imagenet',include_top=False)
    #base_model=applications.ResNet50V2(input_shape=(224,224,3), weights='imagenet',include_top=False)
    #base_model=applications.ResNet101V2(input_shape=(224,224,3), weights='imagenet',include_top=False)
    #base_model=applications.ResNet152V2(input_shape=(224,224,3), weights='imagenet',include_top=False)
    #base_model=applications.DenseNet121(input_shape=(224,224,3), weights='imagenet',include_top=False)
    #base_model=applications.DenseNet169(input_shape=(224,224,3), weights='imagenet',include_top=False)
    #base_model=applications.DenseNet201(input_shape=(224,224,3), weights='imagenet',include_top=False)
    #base_model=applications.NASNetMobile(input_shape=(224,224,3), weights='imagenet',include_top=False)
    #base_model=applications.NASNetLarge(input_shape=(331,331,3), weights='imagenet',include_top=False)
    
  • import EfficientNet model
    import efficientnet.tfkeras as efn
    base_model = efn.EfficientNetB7(input_shape=(224,224,3), weights='imagenet', include_top=False)
    



This site was last updated December 22, 2022.