Pose Estimation

Pose Estimation covers Applications, Body Pose, Head Pose, Hand Pose, and Object Pose.


Pose Estimation Applications

  • Production-line SOP
    At Yawen Plastics (雅文塑膠), for example, production-line operators' movements are concentrated in the upper body, mainly the head, neck, shoulders, arms, and hands. To meet this need, Beseye_alpha replicated a successful AI-model development case from a large Japanese manufacturing plant, discussed the requirements with the customer over several rounds, planned the workstations on site, and trained on data collected in the actual field, producing a "body-rhythm analysis" model that effectively reduces the required computation.

  • Other applications


Body Pose

Ref. A 2019 Guide to Human Pose Estimation

BodyPix - Person Segmentation in the Browser

Code: tfjs-models/body-pix
Install: pip install tf_bodypix
Live Demo
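
Below is a minimal person-segmentation sketch using the tf_bodypix package; the MobileNet model choice, input/output paths, and the 0.75 mask threshold are illustrative assumptions (see the tf-bodypix README for details).

import tensorflow as tf
from tf_bodypix.api import download_model, load_model, BodyPixModelPaths

# Download and load a pre-trained BodyPix model (MobileNet backbone assumed)
bodypix_model = load_model(download_model(BodyPixModelPaths.MOBILENET_FLOAT_50_STRIDE_16))

# Read an input image and run person segmentation on it
image = tf.keras.preprocessing.image.load_img('input-image.jpg')
image_array = tf.keras.preprocessing.image.img_to_array(image)
result = bodypix_model.predict_single(image_array)

# Threshold the segmentation scores into a binary person mask and save it
mask = result.get_mask(threshold=0.75)
tf.keras.preprocessing.image.save_img('output-mask.jpg', mask)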


OpenPose

Paper: arxiv.org/abs/1812.08008
Code: CMU-Perceptual-Computing-Lab/openpose
Ref. A Guide to OpenPose in 2021


PoseNet

PoseNet is built to run on lightweight platforms such as the browser or a mobile device, whereas
OpenPose is more accurate but is meant to be run on GPU-powered systems. You can see the performance benchmarks below.

Paper: arxiv.org/abs/1505.07427
Code: rwightman/posenet-pytorch
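
The snippet below is a minimal single-image sketch of running posenet-pytorch; it assumes the helper functions exposed by the rwightman/posenet-pytorch repo (posenet.load_model, posenet.read_imgfile, posenet.decode_multiple_poses), so check the repo's demo scripts for the exact signatures.

import torch
import posenet

# Load a pre-trained PoseNet model by architecture id (101 assumed here)
model = posenet.load_model(101)
output_stride = model.output_stride

# Read and preprocess an input image (path is a placeholder)
input_image, draw_image, output_scale = posenet.read_imgfile(
    'person.jpg', scale_factor=1.0, output_stride=output_stride)

with torch.no_grad():
    heatmaps, offsets, disp_fwd, disp_bwd = model(torch.Tensor(input_image))
    # Decode heatmaps/offsets into up to 10 poses with keypoint coordinates
    pose_scores, keypoint_scores, keypoint_coords = posenet.decode_multiple_poses(
        heatmaps.squeeze(0), offsets.squeeze(0), disp_fwd.squeeze(0), disp_bwd.squeeze(0),
        output_stride=output_stride, max_pose_detections=10, min_pose_score=0.25)

keypoint_coords *= output_scale   # map keypoints back to the original image scale
print(pose_scores, keypoint_coords)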


Pose Recognition

Using pose keypoints as a dataset to train a DNN (a worked example appears under Exercises below).
Code: burningion/dab-and-tpose-controlled-lights
IPYNB: pose-control-lights


MMPose

Code: open-mmlab
Model Zoo
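
A minimal top-down inference sketch, assuming the MMPose 0.x API (init_pose_model, inference_top_down_pose_model, vis_pose_result); the HRNet config/checkpoint names and the whole-image bounding box are placeholder assumptions taken from the model zoo.

from mmpose.apis import (init_pose_model, inference_top_down_pose_model,
                         vis_pose_result)

# Build the pose model from a model-zoo config and checkpoint (paths assumed)
pose_model = init_pose_model(
    'configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/hrnet_w48_coco_256x192.py',
    'hrnet_w48_coco_256x192.pth',
    device='cpu')

# Use the whole image as a single person bounding box (x1, y1, x2, y2)
image = 'person.jpg'
person_results = [{'bbox': [0, 0, 640, 480]}]

pose_results, _ = inference_top_down_pose_model(
    pose_model, image, person_results, format='xyxy')

# Draw the predicted keypoints onto the image and save the visualization
vis_pose_result(pose_model, image, pose_results, out_file='vis_person.jpg')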


DensePose RCNN

Paper: arxiv.org/abs/1802.00434
Code: facebookresearch/DensePose

Figure: the region-based DensePose architecture

Figure: multi-task cascaded architectures


Multi-Person Part Segmentation

Paper: arxiv.org/abs/1907.05193
Code: kevinlin311tw/CDCL-human-part-segmentation


Head Pose

Head Pose Estimation

Code: yinguobing/head-pose-estimation
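
That repo follows the classic two-stage recipe: detect facial landmarks, then solve a Perspective-n-Point (PnP) problem against a generic 3D face model. Below is a minimal sketch of the PnP step with OpenCV's cv2.solvePnP; the six 3D model points, the 2D landmark coordinates, and the camera intrinsics are illustrative assumptions.

import cv2
import numpy as np

# Generic 3D face model points (nose tip, chin, eye corners, mouth corners),
# in an arbitrary model coordinate frame -- illustrative values
model_points = np.array([
    (0.0, 0.0, 0.0),           # nose tip
    (0.0, -330.0, -65.0),      # chin
    (-225.0, 170.0, -135.0),   # left eye, left corner
    (225.0, 170.0, -135.0),    # right eye, right corner
    (-150.0, -150.0, -125.0),  # left mouth corner
    (150.0, -150.0, -125.0),   # right mouth corner
])

# Matching 2D landmarks from a face-landmark detector (pixel coords, assumed)
image_points = np.array([
    (359, 391), (399, 561), (337, 297),
    (513, 301), (345, 465), (453, 469)
], dtype='double')

# Approximate pinhole camera: focal length ~ image width, center at midpoint
w, h = 640, 480
camera_matrix = np.array([[w, 0, w / 2],
                          [0, w, h / 2],
                          [0, 0, 1]], dtype='double')
dist_coeffs = np.zeros((4, 1))  # assume no lens distortion

ok, rvec, tvec = cv2.solvePnP(model_points, image_points,
                              camera_matrix, dist_coeffs,
                              flags=cv2.SOLVEPNP_ITERATIVE)
print('rotation vector:', rvec.ravel())  # head pose as a Rodrigues rotation vector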

Hopenet

Paper: Fine-Grained Head Pose Estimation Without Keypoints

Code: deep-head-pose

Code: hopenet
Blog: HOPE-Net: A Machine Learning Model for Estimating Face Orientation


VTuber

The number of VTubers has passed 16,000, with growth holding steady at about 3,000 per year. According to a report by the Japanese data-analytics firm User Local, the latest User Local VTuber ranking now tracks more than 16,000 VTubers.

No. 1: Gawr Gura (がうるぐら) Gawr Gura Ch. hololive-EN

No. 2: Kizuna AI (キズナアイ) A.I.Channel

No. 3: Mori Calliope (森カリオペ) Mori Calliope Ch.

VTuber-Unity = Head-Pose-Estimation + Face-Alignment + GazeTracking

VRoid Studio

VTuber_Unity

OpenVtuber


Hand Pose

Hand Pose Estimation papers

Hand3D

Paper: arxiv.org/abs/1705.01389
Code: lmb-freiburg/hand3d

DeepHPS

Paper: arxiv.org/abs/1808.09208

GraphPoseGAN

Paper: arxiv.org/abs/1912.01875

3D Hand Shape

Paper: arxiv.org/abs/1903.00812
Code: 3d-hand-shape/hand-graph-cnn


Facebook InterHand2.6M

Paper: InterHand2.6M: A Dataset and Baseline for 3D Interacting Hand Pose Estimation from a Single RGB Image
Code: facebookresearch/InterHand2.6M


FrankMocap: Fast Monocular 3D Hand and Body Motion Capture by Regression and Integration

Paper: arxiv.org/abs/2008.08324
Code: facebookresearch/frankmocap


A Skeleton-Driven Neural Occupancy Representation for Articulated Hands

Paper: arxiv.org/abs/2109.11399

Towards unconstrained joint hand-object reconstruction from RGB videos

Paper: arxiv.org/abs/2108.07044
Code: hassony2/homan

Fast Monocular Hand Pose Estimation on Embedded Systems

Paper: arxiv.org/abs/2102.07067

Recent Advances in 3D Object and Hand Pose Estimation

Paper: arxiv.org/abs/2006.05927


Object Pose

Benchmark for 6D Object Pose

Core datasets:

LM-O, T-LESS, TUD-L, IC-BIN, ITODD, HB, YCB-V

Other datasets: LM, RU-APC, IC-MI, TYO-L.


Real-Time Seamless Single Shot 6D Object Pose Prediction (YOLO-6D)

Paper: arxiv.org/abs/1711.08848
Code: microsoft/singleshotpose


PoseCNN

Paper: arxiv.org/abs/1711.00199
Code: yuxng/PoseCNN



DeepIM

Paper: arxiv.org/abs/1804.00175
Code: liyi14/mx-DeepIM


Segmentation-driven Pose

Paper: arxiv.org/abs/1812.02541
Code: cvlab-epfl/segmentation-driven-pose


DPOD

Paper: arxiv.org/abs/1902.11020
Code: yashs97/DPOD


HO-3D_v3 Dataset

Paper: arxiv.org/abs/2107.00887
Github: shreyashampali/ho3d
HO-3D is a dataset with 3D pose annotations for hands and objects under severe mutual occlusion.


Exercises of Pose Estimation

BodyPix

Kaggle: rkuo2000/BodyPix


PoseNet

Kaggle: rkuo2000/posenet-pytorch
Kaggle: rkuo2000/posenet-human-pose


MMPose

Kaggle: rkuo2000/MMPose

Demos in the notebook:

  • 2D Human Pose
  • 2D Human Whole-Body
  • 2D Hand Pose
  • 2D Face Keypoints
  • 3D Human Pose
  • 2D Pose Tracking
  • 2D Animal Pose
  • 3D Hand Pose
  • WebCam Effect


OpenPose

Kaggle: rkuo2000/openpose-pytorch


Pose Recognition

Project steps:

  1. Build a dataset of body-pose photos (e.g., 5 poses, take 20 pictures of each pose).
  2. Use MMPose to detect the body keypoints of each pose in the photos (converting each pose into 16 (x, y) keypoints).
  3. Build the pose-keypoint dataset with x_train.append(pose_keypoints), giving x_train.shape = (20x5, 16, 2) and y_train.shape = (20x5, 1); see the sketch after this list.
  4. Build a DNN model and train it, then download the model file pose_dnn.h5 to a PC.
  5. On the PC, build a server program with camera input that loads pose_dnn.h5 to recognize pose movements.
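
A minimal sketch of step 3, assembling the keypoint arrays and one-hot labels that the training sample below expects; detect_keypoints and the pose_images/<label>/ folder layout are hypothetical, standing in for whatever MMPose pipeline extracts the 16 (x, y) keypoints per photo.

import numpy as np
from tensorflow.keras.utils import to_categorical
from sklearn.model_selection import train_test_split

labels = ['stand', 'raise-right-arm', 'raise-left-arm', 'cross-arms', 'both-arms-left']

x_data, y_data = [], []
for class_id, label in enumerate(labels):
    for i in range(20):  # 20 pictures per pose
        # detect_keypoints() is a hypothetical helper returning the 16 (x, y)
        # MMPose keypoints for one photo
        pose_keypoints = detect_keypoints(f'pose_images/{label}/{i}.jpg')
        x_data.append(pose_keypoints)       # each entry has shape (16, 2)
        y_data.append(class_id)

x_data = np.array(x_data)                                 # (100, 16, 2)
y_data = to_categorical(y_data, num_classes=len(labels))  # one-hot, for categorical_crossentropy
x_train, x_test, y_train, y_test = train_test_split(x_data, y_data, test_size=0.2)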

Sample code for building and training the model (PC or Kaggle):

# Imports (assumed; the original sample omitted them)
from tensorflow.keras import layers, models

input_shape = (16, 2)   # 16 keypoints, (x, y) each
num_classes = 5

inputs = layers.Input(shape=input_shape)
x = layers.Flatten()(inputs)               # flatten (16, 2) keypoints into a 32-vector
x = layers.Dense(128, activation='relu')(x)
outputs = layers.Dense(num_classes, activation='softmax')(x)
model = models.Model(inputs=inputs, outputs=outputs)

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

history = model.fit(x_train, y_train, batch_size=1, epochs=20, validation_data=(x_test, y_test))
models.save_model(model, 'pose_dnn.h5')

Sample code for the pose-recognition server (PC with camera):

# Imports (assumed; the original sample omitted them). det_model, pose_model,
# and args come from the usual MMDetection/MMPose setup, as in the MMPose demos.
import cv2
import numpy as np
from tensorflow.keras import models
from mmdet.apis import inference_detector
from mmpose.apis import inference_top_down_pose_model, process_mmdet_results

model = models.load_model('models/pose_dnn.h5')
labels = ['stand', 'raise-right-arm', 'raise-left-arm', 'cross-arms', 'both-arms-left']

cap = cv2.VideoCapture(0)

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    mmdet_results = inference_detector(det_model, image)                    # detect persons, producing bboxes
    person_results = process_mmdet_results(mmdet_results, args.det_cat_id)  # keep the person bboxes
    pose_results, returned_outputs = inference_top_down_pose_model(...)     # estimate pose keypoints

    x_test = pose_results[0]['keypoints'][:, :2].reshape(1, 16, 2)  # keypoint list -> numpy array
    preds = model.predict(x_test)                                   # classify the pose movement
    maxindex = int(np.argmax(preds))
    txt = labels[maxindex]
    print(txt)

cap.release()

Head Pose Estimation

Kaggle: rkuo2000/head-pose-estimation


VTuber-Unity

Head-Pose-Estimation + Face-Alignment + GazeTracking

Build-up Steps:

  1. Create a character: VRoid Studio
  2. Synchronize the face: VTuber_Unity
  3. Take video: OBS Studio
  4. Post-processing:
  5. Upload: YouTube
  6. [Optional] Install CUDA & CuDNN to enable GPU acceleration
  7. To run:
    $ git clone https://github.com/kwea123/VTuber_Unity
    $ cd VTuber_Unity
    $ python demo.py --debug --cpu


OpenVtuber

Build-up Steps:

  • Clone the GitHub repo
    $ git clone https://github.com/1996scarlet/OpenVtuber
    $ cd OpenVtuber
    $ pip3 install -r requirements.txt
  • Install node.js for Windows
  • Run the Socket-IO server
    $ cd NodeServer
    $ npm install express socket.io
    $ node index.js
  • Open a browser at http://127.0.0.1:6789/kizuna
  • Run the Python client with a webcam
    $ cd ../PythonClient
    $ python3 vtuber_link_start.py


Hand Pose

Dataset: InterHand2.6M

  1. Download the pre-trained InterNet from here
  2. Put the model in the demo folder
  3. Go to the demo folder and edit the bbox in here
  4. Run python demo.py --gpu 0 --test_epoch 20
  5. You will see result_2D.jpg and a 3D viewer.

Camera positions visualization demo:

  1. cd tool/camera_visualize
  2. Run python camera_visualize.py


