Vision-based human action recognition using time delay input radial basis function networks
Understanding human actions from video sequences is one of the most active and challenging research topics in computer vision. Despite several promising works, particularly in recent years, that achieve high accuracy, there is still a lack of efficient systems for real-time applications, thereby in...
Main Author: | Kalhor, Davood |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2011 |
Online Access: | http://psasir.upm.edu.my/id/eprint/41836/1/FK%202011%20155%20ir.pdf http://psasir.upm.edu.my/id/eprint/41836/ |
id | my.upm.eprints.41836 |
---|---|
record_format | eprints |
spelling |
my.upm.eprints.41836 2016-03-03T03:26:58Z http://psasir.upm.edu.my/id/eprint/41836/ Vision-based human action recognition using time delay input radial basis function networks Kalhor, Davood 2011-12 Thesis NonPeerReviewed application/pdf en http://psasir.upm.edu.my/id/eprint/41836/1/FK%202011%20155%20ir.pdf Kalhor, Davood (2011) Vision-based human action recognition using time delay input radial basis function networks. Masters thesis, Universiti Putra Malaysia. |
institution | Universiti Putra Malaysia |
building | UPM Library |
collection | Institutional Repository |
continent | Asia |
country | Malaysia |
content_provider | Universiti Putra Malaysia |
content_source | UPM Institutional Repository |
url_provider | http://psasir.upm.edu.my/ |
language | English |
description |
Understanding human actions from video sequences is one of the most active and challenging research topics in computer vision. Despite several promising works, particularly in recent years, that achieve high accuracy, there is still a lack of efficient systems for real-time applications, which increases the demand for faster systems. In other words, a high-performance system for real-time applications must consider both accuracy and speed. In practice, however, achieving high accuracy and high speed at the same time is very challenging. This thesis addresses this problem and proposes a method that is sufficiently fast for real-time human action recognition at 10 frames per second (fps).
The proposed method consists of two main parts. In the first part, a feature vector is extracted for each frame, and an action descriptor is then constructed by concatenating these vectors. The choice of appropriate features is vital to the successful design of a high-performance system. Unlike most previous works, which describe actions with very complex, high-dimensional feature vectors, this thesis proposes a new descriptor with low dimensionality and complexity that preserves the required discriminative power. The feature vector is built by merging three information channels: grid-based shape features, the bounding box, and the mass center of silhouettes. In the second part, these feature vectors are classified using a Time Delay Input Radial Basis Function Network (TDIRBFN). This network has no integration layer and therefore has fewer model parameters and requires less computation during model selection. A growing-cell approach is proposed for training this network.
This work is evaluated using the leave-one-actor-out protocol on a human action dataset (provided by the University of Illinois at Urbana-Champaign) containing 14 actions. In experiments implemented in MATLAB, the average execution time for constructing feature vectors is almost 20 ms per frame (50 fps), significantly shorter than the times reported in the literature. The proposed method can be trained to meet two different objectives: high speed (the main requirement of real-time systems) and high accuracy (the main requirement of non-real-time systems). For the first objective, the classifier reaches 90.66% accuracy at 15.5 fps; for the second, it reaches 94.52% accuracy at 2.37 fps. A comparative analysis demonstrates that the proposed system, in addition to achieving accuracy comparable with the literature, outperforms state-of-the-art methods in terms of both speed and overall performance. The findings of this work are significant in that they offer simpler descriptors, as well as the TDIRBFN, as an alternative approach to classifying human actions, particularly for real-time applications. |
format | Thesis |
author | Kalhor, Davood |
title | Vision-based human action recognition using time delay input radial basis function networks |
publishDate | 2011 |
url | http://psasir.upm.edu.my/id/eprint/41836/1/FK%202011%20155%20ir.pdf http://psasir.upm.edu.my/id/eprint/41836/ |
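The two-part pipeline summarized in the description (per-frame silhouette features concatenated into an action descriptor, then classified by a radial basis function network) can be illustrated with a minimal Python sketch. This is not the thesis implementation, which was written in MATLAB: the function names `frame_features`, `action_descriptor`, and `rbf_scores`, the 4x4 grid, the normalization by frame size, and the Gaussian form of the hidden units are all assumptions made for illustration, and the growing-cell training procedure is not shown.

```python
import numpy as np


def frame_features(silhouette, grid=(4, 4)):
    """Per-frame feature vector from a binary silhouette (2-D array).

    Three channels, following the abstract: grid-based shape features,
    bounding box, and mass center. Grid size and normalization are
    assumptions, not values taken from the thesis.
    """
    mask = np.asarray(silhouette, dtype=bool)
    h, w = mask.shape
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        # Empty silhouette: all-zero vector of the usual length.
        return np.zeros(grid[0] * grid[1] + 6)

    top, bottom = ys.min(), ys.max() + 1
    left, right = xs.min(), xs.max() + 1
    box = mask[top:bottom, left:right]

    # Channel 1: fraction of foreground pixels in each grid cell of the box.
    row_edges = np.linspace(0, box.shape[0], grid[0] + 1).astype(int)
    col_edges = np.linspace(0, box.shape[1], grid[1] + 1).astype(int)
    cells = []
    for r0, r1 in zip(row_edges[:-1], row_edges[1:]):
        for c0, c1 in zip(col_edges[:-1], col_edges[1:]):
            cell = box[r0:r1, c0:c1]
            cells.append(float(cell.mean()) if cell.size else 0.0)

    # Channel 2: bounding box position and size, relative to the frame.
    bbox = [left / w, top / h, (right - left) / w, (bottom - top) / h]
    # Channel 3: mass center of the silhouette, relative to the frame.
    center = [xs.mean() / w, ys.mean() / h]
    return np.array(cells + bbox + center, dtype=float)


def action_descriptor(silhouettes, grid=(4, 4)):
    """Concatenate the per-frame vectors of a window of frames."""
    return np.concatenate([frame_features(s, grid) for s in silhouettes])


def rbf_scores(descriptor, centers, widths, weights):
    """Generic Gaussian RBF layer followed by a linear output layer.

    centers: (k, d), widths: (k,), weights: (k, n_classes).
    The whole time-delayed descriptor is fed directly to the hidden units.
    """
    dist2 = ((centers - descriptor) ** 2).sum(axis=1)
    hidden = np.exp(-dist2 / (2.0 * widths ** 2))
    return hidden @ weights
```

With these assumptions, each frame yields a 22-element vector (16 grid cells, 4 bounding-box values, 2 mass-center values), so a window of three frames gives a 66-element descriptor; the predicted action would be the index of the largest value returned by `rbf_scores`.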