adirik / dwpose

Whole-body pose estimation

Run time and cost

This model runs on Nvidia A40 GPU hardware. Predictions typically complete within 71 seconds. The predict time for this model varies significantly based on the inputs.

Readme

DWPose

DWPose is a whole-body pose estimation model that detects 2D body, hand, and face keypoints for multiple people in an image. Refer to the paper and the original repo for details.

Using the API

To use DWPose, upload an image and set the threshold to filter out low-probability detections. The API outputs an .npz file with the keypoint detections for each person in the image, and a plot of the detected keypoints overlaid on the image. The output file is organized as follows:

{
    "person_0": {
        "body": np.array of shape (18, 2),
        "face": np.array of shape (68, 2),
        "hands": np.array of shape (2, 21, 2),
    },
    "person_1": {
        "body": np.array of shape (18, 2),
        "face": np.array of shape (68, 2),
        "hands": np.array of shape (2, 21, 2),
    },
    ...
}

Keypoints are given as relative (x, y) coordinates, so they are independent of image size. For each person, DWPose returns 18 body keypoints, 68 face keypoints, and 21 keypoints per hand.
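Because the keypoints are relative, drawing them on the original image requires scaling them back to pixels. The sketch below assumes that "relative" means each x is a fraction of the image width and each y a fraction of the image height; the page does not spell this out, so treat it as an assumption. The helper name to_pixel_coords is hypothetical and not part of the API.

```python
import numpy as np

def to_pixel_coords(kpts, image_width, image_height):
    """Scale relative (x, y) keypoints to pixel coordinates.

    Assumption: x is a fraction of the image width and y a fraction of the
    image height. Works for any array whose last axis is (x, y), for example
    body (18, 2), face (68, 2) or hands (2, 21, 2).
    """
    kpts = np.asarray(kpts, dtype=float)
    scale = np.array([image_width, image_height], dtype=float)
    # Broadcasting multiplies every (x, y) pair by (width, height).
    return kpts * scale
```

For example, to_pixel_coords(body_kpts, 640, 480) would return an array with the same shape as body_kpts, expressed in pixels of a 640x480 image.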

import numpy as np

# load keypoints
data = np.load("result.npz", allow_pickle=True)

# number of detected people
num_people = len(data.files)

# body, face and hand keypoints of person_0
person_0 = data["person_0"].item()  # each entry wraps a dict; .item() unwraps it
body_kpts = person_0["body"]    # shape (18, 2)
face_kpts = person_0["face"]    # shape (68, 2)
hands_kpts = person_0["hands"]  # shape (2, 21, 2)
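The snippet above reads a single person. To process every detection, one option is to loop over the entries in the file, as in the minimal sketch below. It assumes that every entry is named person_<index> and follows the layout described earlier; the helper names load_people and stack_body_keypoints are hypothetical.

```python
import numpy as np

def load_people(npz_path):
    """Return {person_name: {"body": ..., "face": ..., "hands": ...}}."""
    people = {}
    with np.load(npz_path, allow_pickle=True) as data:
        for name in data.files:
            # Each entry is a 0-d object array wrapping a dict.
            people[name] = data[name].item()
    return people

def stack_body_keypoints(people):
    """Stack body keypoints into one array of shape (num_people, 18, 2)."""
    if not people:
        return np.empty((0, 18, 2))
    # Sort by the numeric suffix so person_10 comes after person_2.
    names = sorted(people, key=lambda n: int(n.split("_")[1]))
    return np.stack([people[n]["body"] for n in names])
```

Calling stack_body_keypoints(load_people("result.npz")) gives a single array that is convenient for batch operations such as computing per-person bounding boxes. Note that allow_pickle=True can execute arbitrary code when loading, so only use it on files from a source you trust.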

References

@inproceedings{yang2023effective,
  title={Effective whole-body pose estimation with two-stages distillation},
  author={Yang, Zhendong and Zeng, Ailing and Yuan, Chun and Li, Yu},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={4210--4220},
  year={2023}
}