Vision-Based Hand Detection and Tracking Using Fusion of Kernelized Correlation Filter and Single-Shot Detection
Main Authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online Access: | http://eprints.uthm.edu.my/10103/1/J16219_8930626b82c06d4375e69da013ec81a8.pdf http://eprints.uthm.edu.my/10103/2/J16219_8930626b82c06d4375e69da013ec81a8.pdf http://eprints.uthm.edu.my/10103/3/J16219_8930626b82c06d4375e69da013ec81a8.pdf http://eprints.uthm.edu.my/10103/ https://doi.org/10.3390/app13137433 |
Summary: | Hand detection and tracking are key components in many computer vision applications, including hand pose estimation and gesture recognition for human–computer interaction systems, virtual reality, and augmented reality. Despite their importance, reliable hand detection in cluttered scenes remains a challenge. This study explores the use of deep learning techniques for fast and robust hand detection and tracking. A novel algorithm is proposed by combining the Kernelized Correlation Filter (KCF) tracker with the Single-Shot Detection (SSD) method. This integration enables the detection and tracking of hands in challenging environments, such as cluttered backgrounds and occlusions. The SSD algorithm helps reinitialize the KCF tracker when it fails or encounters drift issues due to sudden changes in hand gestures or fast movements. Testing in challenging scenes showed that the proposed tracker achieved a tracking rate of over 90% and a speed of 17 frames per second (FPS). Comparison with the KCF tracker on 17 video sequences revealed an average improvement of 13.31% in tracking detection rate (TRDR) and 27.04% in object tracking error (OTE). Additional comparison with the MediaPipe hand tracker on 10 hand gesture videos taken from the Intelligent Biometric Group Hand Tracking (IBGHT) dataset showed that the proposed method outperformed the MediaPipe hand tracker in terms of overall TRDR and tracking speed. The results demonstrate the promising potential of the proposed method for long-sequence tracking stability, reducing drift issues, and improving tracking performance during occlusions. |
---|
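The summary describes a detector that reinitializes a correlation-filter tracker whenever the tracker fails. The control flow of such a fusion can be sketched as below. This is a minimal illustration and not the authors' implementation: the function name `track_with_redetection`, the `detect` callable (standing in for an SSD hand detector), and the `make_tracker` factory (standing in for a KCF tracker) are all hypothetical, and the only interface assumed is `init(frame, box)` plus `update(frame)` returning `(ok, box)`. The record does not say how drift is detected, so only outright tracker failure triggers re-detection here.

```python
from typing import Any, Callable, Iterable, Iterator, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def track_with_redetection(
    frames: Iterable[Any],
    detect: Callable[[Any], Optional[Box]],
    make_tracker: Callable[[], Any],
) -> Iterator[Optional[Box]]:
    """Yield one box (or None) per frame.

    Assumed, hypothetical interfaces:
      detect(frame)       -> a box, or None when no hand is found
      make_tracker()      -> object with init(frame, box) and
                             update(frame) -> (ok, box)
    """
    tracker = None
    for frame in frames:
        box = None

        # Cheap path: follow the hand with the existing tracker.
        if tracker is not None:
            ok, box = tracker.update(frame)
            if not ok:
                # Tracker lost the target; discard it and fall back to detection.
                tracker, box = None, None

        # Expensive path: run the detector and reinitialize the tracker.
        if tracker is None:
            box = detect(frame)
            if box is not None:
                tracker = make_tracker()
                tracker.init(frame, box)

        yield box
```

With an OpenCV build that includes the tracking module, `make_tracker` could plausibly be `cv2.TrackerKCF_create`, whose `init` and `update` methods follow the same shape, and `detect` would wrap a forward pass of an SSD model. Running the detector only on failure keeps the per-frame cost close to that of the tracker alone.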