An improved object detection model based on optimised CNN.
Object detection is a computer vision technique that individually locates, recognises, and interprets multiple objects in an image. Modern image understanding tasks like image classification have been improved by state-of-the-art deep learning methods, p...
Main Authors: | Mohd. Anuar, Mohd. Syahid; Jayapalan, Senthil Kumar |
---|---|
Format: | Article |
Language: | English |
Published: | Penerbit UTM Press 2022 |
Subjects: | TJ Mechanical engineering and machinery |
Online Access: | http://eprints.utm.my/104593/1/SenthilKumarJayapalanSyahidAnuar2022_AnImprovedObjectDetectionModel.pdf http://eprints.utm.my/104593/ https://oiji.utm.my/index.php/oiji/article/view/230 |
id |
my.utm.104593 |
---|---|
record_format |
eprints |
spelling |
my.utm.1045932024-02-21T08:21:59Z http://eprints.utm.my/104593/ An improved object detection model based on optimised CNN. Mohd. Anuar, Mohd. Syahid Jayapalan, Senthil Kumar TJ Mechanical engineering and machinery Object detection is a computer vision technique that individually locates, recognises, and interprets multiple objects in an image. Modern image understanding tasks like image classification have been improved by state-of-the-art deep learning methods, particularly by convolutional neural networks (CNN). Region-based object detection algorithms such as Fast-RCNN achieve classification with a CNN but take longer to run. You only look once (YOLO) predicts object location and classification in a single step, treating object detection as a regression problem in an end-to-end network, but its accuracy decreases when the image has similar objects in a confined area, particularly when independent of the surrounding context. The aim of the current study is to improve YOLOv3 by optimising Darknet-53 to address the memory issue, using switchable normalisation techniques. We investigated the performance of five pre-trained networks, SqueezeNet, GoogleNet, ShuffleNet, Darknet-53, and Inception-V3, using a confusion matrix and various epochs, learning rates, and mini-batches based on transfer learning. Darknet-53 took five times longer to complete the training and also ran into errors, most likely due to GPU memory shortages, whereas GoogleNet obtained virtually the same results in a fraction of the time. Using switchable normalisation techniques with the 10-class CIFAR-10 dataset and the Deep Network Designer (DND) of MATLAB R2021a, optimised versions of Darknet-53 increased the validation accuracy, considerably reduced the training time, and rectified the memory issue; these versions were then used as a backbone for YOLOv3 for effective object detection. 
The enhanced YOLOv3 was then assessed with average precision on a vehicle dataset and a sample Kuala Lumpur traffic scene. YOLOv3 with the optimised CNN dNet-CIN as the backbone produced the best experimental results, with an FPS of 3.21 and a mAP-50 of 97%. Penerbit UTM Press 2022-12-15 Article PeerReviewed application/pdf en http://eprints.utm.my/104593/1/SenthilKumarJayapalanSyahidAnuar2022_AnImprovedObjectDetectionModel.pdf Mohd. Anuar, Mohd. Syahid and Jayapalan, Senthil Kumar (2022) An improved object detection model based on optimised CNN. Open International Journal Of Informatics, 10 (1). pp. 78-96. ISSN 2289-2370 https://oiji.utm.my/index.php/oiji/article/view/230 NA |
institution |
Universiti Teknologi Malaysia |
building |
UTM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Teknologi Malaysia |
content_source |
UTM Institutional Repository |
url_provider |
http://eprints.utm.my/ |
language |
English |
topic |
TJ Mechanical engineering and machinery |
spellingShingle |
TJ Mechanical engineering and machinery Mohd. Anuar, Mohd. Syahid Jayapalan, Senthil Kumar An improved object detection model based on optimised CNN. |
description |
Object detection is a computer vision technique that individually locates, recognises, and interprets multiple objects in an image. Modern image understanding tasks like image classification have been improved by state-of-the-art deep learning methods, particularly by convolutional neural networks (CNN). Region-based object detection algorithms such as Fast-RCNN achieve classification with a CNN but take longer to run. You only look once (YOLO) predicts object location and classification in a single step, treating object detection as a regression problem in an end-to-end network, but its accuracy decreases when the image has similar objects in a confined area, particularly when independent of the surrounding context. The aim of the current study is to improve YOLOv3 by optimising Darknet-53 to address the memory issue, using switchable normalisation techniques. We investigated the performance of five pre-trained networks, SqueezeNet, GoogleNet, ShuffleNet, Darknet-53, and Inception-V3, using a confusion matrix and various epochs, learning rates, and mini-batches based on transfer learning. Darknet-53 took five times longer to complete the training and also ran into errors, most likely due to GPU memory shortages, whereas GoogleNet obtained virtually the same results in a fraction of the time. Using switchable normalisation techniques with the 10-class CIFAR-10 dataset and the Deep Network Designer (DND) of MATLAB R2021a, optimised versions of Darknet-53 increased the validation accuracy, considerably reduced the training time, and rectified the memory issue; these versions were then used as a backbone for YOLOv3 for effective object detection. The enhanced YOLOv3 was then assessed with average precision on a vehicle dataset and a sample Kuala Lumpur traffic scene. YOLOv3 with the optimised CNN dNet-CIN as the backbone produced the best experimental results, with an FPS of 3.21 and a mAP-50 of 97%. |
format |
Article |
author |
Mohd. Anuar, Mohd. Syahid Jayapalan, Senthil Kumar |
author_facet |
Mohd. Anuar, Mohd. Syahid Jayapalan, Senthil Kumar |
author_sort |
Mohd. Anuar, Mohd. Syahid |
title |
An improved object detection model based on optimised CNN. |
title_short |
An improved object detection model based on optimised CNN. |
title_full |
An improved object detection model based on optimised CNN. |
title_fullStr |
An improved object detection model based on optimised CNN. |
title_full_unstemmed |
An improved object detection model based on optimised CNN. |
title_sort |
improved object detection model based on optimised cnn. |
publisher |
Penerbit UTM Press |
publishDate |
2022 |
url |
http://eprints.utm.my/104593/1/SenthilKumarJayapalanSyahidAnuar2022_AnImprovedObjectDetectionModel.pdf http://eprints.utm.my/104593/ https://oiji.utm.my/index.php/oiji/article/view/230 |
_version_ |
1792147850075832320 |
score |
13.211869 |