Image Classification
Introduction to Image Classification, including datasets, models, applications, and transfer learning
Datasets
PASCAL VOC (Visual Object Classes)
VOC2007 train/val/test: 9,963 annotated images containing 24,640 annotated objects
VOC2012 train/val/test: 11,530 annotated images containing 27,450 ROI-annotated objects
20 classes:
- Person: person
- Animal: bird, cat, cow, dog, horse, sheep
- Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
- Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor
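Training a classifier on these classes needs an integer label per class name. A minimal sketch of such a mapping, using the names exactly as listed above; the alphabetical ordering and the variable names are illustrative assumptions, not something the dataset tooling is confirmed to require:

```python
# Super-categories and class names exactly as listed above.
VOC_GROUPS = {
    "Person": ["person"],
    "Animal": ["bird", "cat", "cow", "dog", "horse", "sheep"],
    "Vehicle": ["aeroplane", "bicycle", "boat", "bus", "car", "motorbike", "train"],
    "Indoor": ["bottle", "chair", "dining table", "potted plant", "sofa", "tv/monitor"],
}

# Flatten the groups and sort alphabetically (an assumed, illustrative ordering).
VOC_CLASSES = sorted(name for names in VOC_GROUPS.values() for name in names)

# Map each class name to an integer label, and back again.
class_to_index = {name: i for i, name in enumerate(VOC_CLASSES)}
index_to_class = {i: name for name, i in class_to_index.items()}

print(len(VOC_CLASSES))           # number of classes
print(class_to_index["person"])   # integer label for "person"
```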
COCO Dataset
- Object segmentation
- Recognition in context
- Superpixel stuff segmentation
- 330K images (>200K labeled)
- 1.5 million object instances
- 80 object categories
- 91 stuff categories
- 5 captions per image
- 250,000 people with keypoints
ImageNet
The most widely used subset of ImageNet, the ILSVRC image classification dataset, spans 1,000 object classes and contains 1,281,167 training images, 50,000 validation images, and 100,000 test images. This subset is available on Kaggle.
Open Images V6+
- Blog: Open Images V6 — Now Featuring Localized Narratives
- 15,851,536 boxes on 600 categories
- 2,785,498 instance segmentations on 350 categories
- 3,284,280 relationship annotations on 1,466 relationships
- 675,155 localized narratives
- 59,919,574 image-level labels on 19,957 categories
- Extension - 478,000 crowdsourced images with 6,000+ categories
Download
Download and Visualize using FiftyOne
As with any other dataset in the FiftyOne Dataset Zoo, downloading it is as easy as calling:
dataset = fiftyone.zoo.load_zoo_dataset("open-images-v6", split="validation")
import fiftyone as fo
import fiftyone.zoo as foz
# List available zoo datasets
print(foz.list_zoo_datasets())
#
# Load the COCO-2017 validation split into a FiftyOne dataset
#
# This will download the dataset from the web, if necessary
#
dataset = foz.load_zoo_dataset("coco-2017", split="validation")
# Give the dataset a new name, and make it persistent so that you can
# work with it in future sessions
dataset.name = "coco-2017-validation-example"
dataset.persistent = True
# Visualize the dataset in the App
session = fo.launch_app(dataset)
Applications
CIFAR-10
Dataset: CIFAR-10
The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.
Kaggle: https://www.kaggle.com/rkuo2000/cifar10-cnn
Traffic Sign Classifier
Dataset: German Traffic Sign Recognition Benchmark (GTSRB)
43 traffic sign classes, 39,209 training images, 12,630 test images
Kaggle: https://www.kaggle.com/rkuo2000/gtsrb-cnn
Emotion Detection
Dataset: FER-2013 (Facial Expression Recognition)
7 facial expressions, 28,709 training images, 7,178 test images
labels = ["angry", "disgusted", "fearful", "happy", "neutral", "sad", "surprised"]
Kaggle: https://www.kaggle.com/rkuo2000/fer2013-cnn
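A classifier with a 7-way softmax output returns one score per expression, and the label list turns the winning index back into a name. A minimal sketch in plain Python, assuming the list order matches the class indices used during training; the logits are made-up example values and `decode_prediction` is a hypothetical helper:

```python
import math

LABELS = ["angry", "disgusted", "fearful", "happy", "neutral", "sad", "surprised"]

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode_prediction(logits, labels=LABELS):
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

# Made-up scores for a single image (illustrative only).
example_logits = [0.1, -1.2, 0.3, 2.5, 0.9, -0.4, 0.0]
label, confidence = decode_prediction(example_logits)
print(label, round(confidence, 3))
```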
Pneumonia Detection
Dataset: https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia
Kaggle: https://www.kaggle.com/rkuo2000/pneumonia-cnn
COVID19 Detection
Dataset: https://www.kaggle.com/bachrr/covid-chest-xray
Kaggle:
FaceMask Classification
Dataset: Face Mask ~12K Images dataset
Kaggle: https://www.kaggle.com/rkuo2000/facemask-cnn
Garbage Classification
Dataset: https://www.kaggle.com/asdasdasasdas/garbage-classification (42MB)
6 categories: cardboard (403), glass (501), metal (410), paper (594), plastic (482), trash (137)
Kaggle: https://www.kaggle.com/rkuo2000/garbage-cnn
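The category counts above are uneven (trash has far fewer images than paper), so one common option is to weight classes inversely to their frequency. A sketch of the usual "balanced" heuristic, weight = total / (num_classes * count); the linked notebook is not confirmed to use class weights, so treat this as an illustration only:

```python
# Image counts per category, as listed above.
GARBAGE_COUNTS = {
    "cardboard": 403,
    "glass": 501,
    "metal": 410,
    "paper": 594,
    "plastic": 482,
    "trash": 137,
}

def balanced_class_weights(counts):
    """Weight each class by total / (num_classes * class_count)."""
    total = sum(counts.values())
    num_classes = len(counts)
    return {name: total / (num_classes * n) for name, n in counts.items()}

total_images = sum(GARBAGE_COUNTS.values())
weights = balanced_class_weights(GARBAGE_COUNTS)

print(total_images)
for name, w in weights.items():
    print(f"{name}: {w:.3f}")
```

In Keras, weights like these can be passed to `model.fit` through its `class_weight` argument, keyed by integer class index rather than by name.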
Food Classification
Dataset: Food-11
The dataset consists of 16,643 images belonging to 11 major food categories:
- Bread (1,724 images)
- Dairy product (721 images)
- Dessert (2,500 images)
- Egg (1,648 images)
- Fried food (1,461 images)
- Meat (2,206 images)
- Noodles/pasta (734 images)
- Rice (472 images)
- Seafood (1,505 images)
- Soup (2,500 images)
- Vegetable/fruit (1,172 images)
Kaggle: https://www.kaggle.com/rkuo2000/food11-classification
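As a sanity check on the counts above, the per-category numbers can be summed and used to estimate a majority-class baseline, the accuracy of always predicting the largest category. A small sketch; the counts cover the whole dataset, so the baseline on any particular train/validation split is an assumption and may differ slightly:

```python
# Image counts per category, as listed above.
FOOD11_COUNTS = {
    "Bread": 1724,
    "Dairy product": 721,
    "Dessert": 2500,
    "Egg": 1648,
    "Fried food": 1461,
    "Meat": 2206,
    "Noodles/pasta": 734,
    "Rice": 472,
    "Seafood": 1505,
    "Soup": 2500,
    "Vegetable/fruit": 1172,
}

total_images = sum(FOOD11_COUNTS.values())
largest_class_size = max(FOOD11_COUNTS.values())

# Accuracy of a model that always predicts one of the largest categories.
majority_baseline = largest_class_size / total_images

print(total_images)
print(f"{majority_baseline:.1%}")
```

A trained model should score well above this baseline to be considered useful.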
Mango Classification
Dataset: Taiwan High-Value Crops: Irwin Mango Image Recognition Competition, Official Round (台灣高經濟作物 - 愛文芒果影像辨識正式賽)
Kaggle:
- https://www.kaggle.com/rkuo2000/mango-classification
- https://www.kaggle.com/rkuo2000/mango-efficientnet
Transfer Learning
Birds Classification
Dataset: https://www.kaggle.com/rkuo2000/birds2
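Search Google for photos, download 20 to 30 photos of each class, place them in a folder named birds, compress the folder into birds.zip, then upload it to Kaggle.com/datasets.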
Kaggle: https://www.kaggle.com/rkuo2000/birds-classification
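The compression step can also be scripted. A minimal sketch using only the Python standard library, assuming one subfolder per class inside the dataset folder (the source does not specify the layout); `zip_dataset_folder` is a hypothetical helper, and the class names and files in the demo are placeholders:

```python
import shutil
import tempfile
import zipfile
from pathlib import Path

def zip_dataset_folder(folder, output_dir):
    """Compress `folder` into <output_dir>/<folder name>.zip and return its path."""
    folder = Path(folder)
    base_name = Path(output_dir) / folder.name  # make_archive appends ".zip"
    archive = shutil.make_archive(
        str(base_name), "zip", root_dir=str(folder.parent), base_dir=folder.name
    )
    return Path(archive)

# Demo on a throwaway directory tree with placeholder images.
with tempfile.TemporaryDirectory() as tmp:
    birds = Path(tmp) / "birds"
    for class_name in ("sparrow", "eagle"):  # placeholder class names
        class_dir = birds / class_name
        class_dir.mkdir(parents=True)
        (class_dir / "001.jpg").write_bytes(b"placeholder")

    out_dir = Path(tmp) / "out"
    out_dir.mkdir()
    archive_path = zip_dataset_folder(birds, out_dir)
    archive_name = archive_path.name

    # List the image entries stored in the archive.
    with zipfile.ZipFile(archive_path) as zf:
        archived_files = sorted(n for n in zf.namelist() if n.endswith(".jpg"))

print(archive_name)
print(archived_files)
```

Keeping the top-level folder name inside the archive means the class subfolders unpack under a single directory after upload.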
Animes Classification (cartoon characters)
Dataset: https://www.kaggle.com/datasets/rkuo2000/animes
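Search Google for photos, download about 20 to 30 photos of each cartoon character, place them in a folder named animes, compress the folder into animes.zip, then upload it to Kaggle.com/datasets.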
Kaggle: https://www.kaggle.com/rkuo2000/anime-classification
Worms Classification (pests)
Dataset: worms4
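Search Google for photos, download 20 to 30 photos of each class, place them in a folder named worms, compress the folder into worms.zip, then upload it to Kaggle.com/datasets.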
Kaggle: https://www.kaggle.com/rkuo2000/worms-classification
Railway Track Fault Detection
Dataset: Railway Track Fault Detection
Kaggle: https://www.kaggle.com/code/rkuo2000/railtrack-resnet50v2
from tensorflow.keras.applications import ResNet50V2
from tensorflow.keras import models, layers

# input_shape and num_classes are expected to be defined earlier in the notebook
# Load ResNet50V2 pretrained on ImageNet, without its classification head
base_model = ResNet50V2(input_shape=input_shape, weights='imagenet', include_top=False)
base_model.trainable = False  # freeze the base model (for transfer learning)

# Add a new classification head on top of the frozen base
x = base_model.output
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(128, activation='relu')(x)  # fully connected layer
preds = layers.Dense(num_classes, activation='softmax')(x)  # final layer with softmax activation
model = models.Model(inputs=base_model.input, outputs=preds)
model.summary()
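With the base frozen, only the two Dense layers in the new head are trained. A back-of-the-envelope sketch of how many trainable parameters that leaves, assuming the pooled ResNet50V2 feature vector has 2048 channels; `num_classes = 2` is an illustrative assumption (for example, defective versus non-defective track), not a value taken from the notebook:

```python
def dense_params(inputs, units):
    """Parameters of a Dense layer: one weight per input-unit pair, plus biases."""
    return inputs * units + units

def head_params(feature_channels, hidden_units, num_classes):
    """Trainable parameters of a head with one hidden Dense and one output Dense."""
    return dense_params(feature_channels, hidden_units) + dense_params(hidden_units, num_classes)

RESNET50V2_FEATURES = 2048   # channels after GlobalAveragePooling2D (pooling adds no parameters)
example_num_classes = 2      # assumption for illustration only

trainable = head_params(RESNET50V2_FEATURES, 128, example_num_classes)
print(trainable)
```

The count can be compared against the "Trainable params" line printed by `model.summary()`.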
Kaggle: https://www.kaggle.com/code/rkuo2000/railtrack-efficientnet
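import efficientnet.tfkeras as efn
from tensorflow.keras import models, layers, optimizers, regularizers, callbacks

# input_shape and num_classes are expected to be defined earlier in the notebook
# Load EfficientNetB7 pretrained on ImageNet, without its classification head
base_model = efn.EfficientNetB7(input_shape=input_shape, weights='imagenet', include_top=False)
base_model.trainable = False  # freeze the base model (for transfer learning)

# Add a new classification head on top of the frozen base
x = base_model.output
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(128)(x)  # fully connected layer (no activation specified, so it is linear)
out = layers.Dense(num_classes, activation="softmax")(x)  # final layer with softmax activation
model = models.Model(inputs=base_model.input, outputs=out)
model.summary()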
Skin Lesion Classification
Dataset: Skin Cancer MNIST: HAM10000
7 types of lesions (images are 600x450):
- Actinic Keratoses
- Basal Cell Carcinoma
- Benign Keratosis
- Dermatofibroma
- Malignant Melanoma
- Melanocytic Nevi
- Vascular Lesions
Kaggle: https://www.kaggle.com/code/rkuo2000/skin-lesion-classification
- assign base_model
#base_model=applications.MobileNetV2(input_shape=(224,224,3), weights='imagenet',include_top=False)
#base_model=applications.InceptionV3(input_shape=(224,224,3), weights='imagenet',include_top=False)
#base_model=applications.ResNet50V2(input_shape=(224,224,3), weights='imagenet',include_top=False)
#base_model=applications.ResNet101V2(input_shape=(224,224,3), weights='imagenet',include_top=False)
#base_model=applications.ResNet152V2(input_shape=(224,224,3), weights='imagenet',include_top=False)
#base_model=applications.DenseNet121(input_shape=(224,224,3), weights='imagenet',include_top=False)
#base_model=applications.DenseNet169(input_shape=(224,224,3), weights='imagenet',include_top=False)
#base_model=applications.DenseNet201(input_shape=(224,224,3), weights='imagenet',include_top=False)
#base_model=applications.NASNetMobile(input_shape=(224,224,3), weights='imagenet',include_top=False)
#base_model=applications.NASNetLarge(input_shape=(331,331,3), weights='imagenet',include_top=False)
- import EfficientNet model
import efficientnet.tfkeras as efn
base_model = efn.EfficientNetB7(input_shape=(224,224,3), weights='imagenet', include_top=False)
This site was last updated December 22, 2022.