Particle swarm optimization with deep learning for human action recognition

Bibliographic Details
Main Authors: Usmani, U.A., Watada, J., Jaafar, J., Aziz, I.A., Roy, A.
Format: Article
Published: ICIC International 2021
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85121686467&doi=10.24507%2fijicic.17.06.1843&partnerID=40&md5=306e24c2811bb0855b1edc69452ddc0f
http://eprints.utp.edu.my/29597/
Description
Summary: Motion target detection and tracking are among the most important areas of computer vision that rely on advanced image processing techniques. Traditional approaches track a target in two stages: they first detect it and then track it. However, such methods require a constant false-alarm rate system that runs detection over the whole frame captured at each moment, which lowers detection efficiency and degrades the tracking output. In addition, current motion target detection algorithms extract features from the relevant object only if the moving object has complex texture features. The regions these algorithms extract are larger than the region of interest and stretch in the direction of movement. The algorithms are also sensitive to noise, which makes it difficult to predict object locations accurately. This paper proposes a deep learning framework for human action recognition to overcome these drawbacks of the current state-of-the-art methods. To extract appearance-based and structural information, each frame of the action sequences is evaluated for spatial features. The temporal properties of the video sequences are computed across corresponding blocks of frames to provide motion-based information. To reduce computational complexity, the features are reduced using a particle swarm optimization detection technique applied to the video image sequences. If the scene is stationary, moving people are identified using a correlation tracking technique. Finally, a deep learning neural network is used to evaluate the method's effectiveness. Two autoencoders are trained separately, and the information they encode is forwarded to the deep learning neural network to recognize human actions. Our deep learning method also performs atomic morphological operations and shadow removal based on the hue-saturation-value (HSV) color space.
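The feature-reduction step described above can be illustrated with a minimal sketch of binary particle swarm optimization for feature selection. This is not the authors' implementation: the function names `binary_pso_select` and `toy_fitness`, the inertia and acceleration constants, the velocity clamp, and the toy fitness function are all assumptions chosen for illustration. In practice the fitness would be something like validation accuracy of a classifier trained on the selected features.

```python
import math
import random


def binary_pso_select(fitness, n_features, n_particles=20, n_iters=50,
                      w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO sketch: each particle is a 0/1 mask over the features.

    Returns (best_mask, best_score), where higher fitness is better.
    All hyperparameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Random initial masks and velocities.
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)]
           for _ in range(n_particles)]
    # Personal bests start at the initial positions.
    pbest = [p[:] for p in pos]
    pbest_score = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_score[i])
    gbest, gbest_score = pbest[g][:], pbest_score[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                # Standard velocity update: inertia + cognitive + social terms.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the velocity so the sigmoid does not saturate.
                vel[i][d] = max(-6.0, min(6.0, vel[i][d]))
                # The sigmoid of the velocity is the probability the bit is 1.
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            score = fitness(pos[i])
            if score > pbest_score[i]:
                pbest[i], pbest_score[i] = pos[i][:], score
                if score > gbest_score:
                    gbest, gbest_score = pos[i][:], score
    return gbest, gbest_score


def toy_fitness(mask):
    """Hypothetical fitness: features 0-2 are useful, every kept feature costs 0.1."""
    gain = sum(1.0 for d in (0, 1, 2) if mask[d])
    return gain - 0.1 * sum(mask)


if __name__ == "__main__":
    mask, score = binary_pso_select(toy_fitness, n_features=8)
    print("selected mask:", mask, "fitness:", score)
```

The per-feature penalty in the toy fitness stands in for the computational cost the abstract mentions: a mask that keeps fewer features scores higher as long as it retains the informative ones.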
Our method can also track targets that lack high contrast and prominent features relative to the background. We infer that our approach improves tracking stability and increases the robustness of the tracking process, achieving an accuracy of 97.09 on the Stony Brook University interaction dataset, 98.02 on the High-level Human Interaction Recognition Challenge interaction dataset, and 96.17 on the Weizmann dataset. Statistical measures such as precision (96.76), sensitivity (95.39), Matthews correlation coefficient (93.83), and Jaccard index (92.63) are also high, indicating better performance than all the current state-of-the-art methods. © ICIC International 2021.
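For readers unfamiliar with the statistical measures listed above, the following sketch computes them from the four counts of a binary confusion matrix using their standard definitions. The function name `binary_metrics` and the example counts are hypothetical. The abstract does not say how the measures were aggregated across action classes, so the binary form here is an assumption.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Standard metrics from binary confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives. Undefined ratios (zero denominator) return 0.0.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)       # correct share of predicted positives
    sensitivity = ratio(tp, tp + fn)     # recall: share of actual positives found
    jaccard = ratio(tp, tp + fp + fn)    # intersection over union of positives
    # Matthews correlation coefficient, in [-1, 1].
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, mcc_den)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "mcc": mcc,
        "jaccard": jaccard,
    }


if __name__ == "__main__":
    # Hypothetical counts, not taken from the paper.
    print(binary_metrics(tp=90, fp=10, fn=5, tn=95))
```

These definitions yield values between 0 and 1 (or -1 and 1 for the Matthews correlation coefficient), so the figures in the abstract are presumably the same quantities scaled by 100.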