A comparative analysis using machine learning approach for thunderstorm prediction in southern region of Peninsular Malaysia

Thunderstorms are one of the most destructive natural phenomena on the planet, as they are predominantly associated with lightning and heavy rainfall that result in human deaths, urban flooding, and agricultural damage. Thus, accurate thunderstorm prediction is essential for planning and managing ag...

Bibliographic Details
Main Authors: Rufus, Shirley, Ahmad, N. Azlinda, Abdullah, Noradlina, Abdul-Malek, Zulkurnain
Format: Conference or Workshop Item
Published: Springer Science and Business Media Deutschland GmbH 2023
Subjects: TK Electrical engineering. Electronics Nuclear engineering
Online Access: http://eprints.utm.my/107595/
http://dx.doi.org/10.1109/SIPDA59763.2023.10349193
id my.utm.107595
record_format eprints
spelling my.utm.107595 2024-09-25T06:26:55Z http://eprints.utm.my/107595/ Rufus, Shirley and Ahmad, N. Azlinda and Abdullah, Noradlina and Abdul-Malek, Zulkurnain (2023) A comparative analysis using machine learning approach for thunderstorm prediction in southern region of Peninsular Malaysia. In: 17th International Symposium on Lightning Protection, SIPDA 2023, Suzhou, China, 9 October 2023 - 13 October 2023. Conference or Workshop Item. PeerReviewed. http://dx.doi.org/10.1109/SIPDA59763.2023.10349193 DOI : 10.1007/s13205-023-03607-x
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
topic TK Electrical engineering. Electronics Nuclear engineering
description Thunderstorms are among the most destructive natural phenomena: the lightning and heavy rainfall that accompany them cause human deaths, urban flooding, and agricultural damage. Accurate thunderstorm prediction is therefore essential for planning and managing agriculture, flood control, and air traffic control. This study uses historical lightning and meteorological data from 2011 to 2018 for the southern region of Peninsular Malaysia to predict thunderstorm occurrence. The lightning data are grouped into three classes (low, medium, and high). High-range lightning occurs far less often in this region than low- and medium-range lightning, owing to the nonlinear and complex characteristics of thunderstorms and lightning, so the dataset is imbalanced. To address the imbalance in the training data, k-fold and stratified k-fold cross-validation (CV) are combined with the SMOTE resampling technique. Five machine learning (ML) algorithms are then trained and tested on the data: Decision Trees (DT), Adaptive Boosting (AdaBoost), Random Forest (RF), Extra Trees (ET), and Gradient Boosting (GB). The GB model with stratified k-fold CV and SMOTE performs best for this region, with accuracy ranging from 74% to 95%, recall from 72% to 93%, precision from 76% to 97%, and F1-score from 74% to 95%. Thunderstorm predictions based on lightning patterns and meteorological data are expected to support an early-warning strategy in which the relevant authorities are notified in time to prevent thunderstorm damage.
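The record does not say how the authors implemented the resampling and validation steps. The following is only a minimal standard-library Python sketch of the general idea described above: split the data into stratified folds, then apply SMOTE-style interpolation to the training folds only, so that the held-out fold keeps its natural class balance. The synthetic two-feature data, the class sizes, the function names `stratified_folds` and `smote`, and the choice of `n_neighbors=3` are all assumptions made for illustration, not values from the study.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, k, rng):
    """Return k lists of indices with roughly equal class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold receives its share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def smote(X, y, rng, n_neighbors=3):
    """Oversample each smaller class up to the largest class size by
    interpolating between a sample and one of its nearest same-class
    neighbours (the core idea of SMOTE)."""
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        points = [x for x, lab in zip(X, y) if lab == label]
        if len(points) < 2:
            continue  # no neighbour to interpolate towards
        for _ in range(target - count):
            base = rng.choice(points)
            others = sorted(
                (p for p in points if p is not base),
                key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
            )
            neighbour = rng.choice(others[:n_neighbors])
            gap = rng.random()  # position along the segment base -> neighbour
            X_out.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
            y_out.append(label)
    return X_out, y_out


# Hypothetical imbalanced data: three lightning classes, two made-up features.
rng = random.Random(0)
class_sizes = {"low": 60, "medium": 30, "high": 10}
centres = {"low": 0.0, "medium": 2.0, "high": 4.0}
X, y = [], []
for label, size in class_sizes.items():
    for _ in range(size):
        X.append([rng.gauss(centres[label], 1.0), rng.gauss(centres[label], 1.0)])
        y.append(label)

folds = stratified_folds(y, k=5, rng=rng)
summary = []
for test_idx in folds:
    held_out = set(test_idx)
    train_idx = [j for j in range(len(y)) if j not in held_out]
    # Resample the training portion only; the test fold stays untouched.
    X_train, y_train = smote([X[j] for j in train_idx],
                             [y[j] for j in train_idx], rng)
    summary.append((dict(Counter(y[j] for j in test_idx)),
                    dict(Counter(y_train))))

for test_counts, train_counts in summary:
    print("test:", test_counts, "train after SMOTE:", train_counts)
```

Resampling inside each fold, rather than before splitting, keeps synthetic points derived from a test sample out of the training data. In practice, library implementations such as scikit-learn's `StratifiedKFold` and imbalanced-learn's `SMOTE` would normally replace these hand-written helpers, and a classifier such as gradient boosting would be fitted on `X_train`, `y_train` in each iteration.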
format Conference or Workshop Item
author Rufus, Shirley
Ahmad, N. Azlinda
Abdullah, Noradlina
Abdul-Malek, Zulkurnain
title A comparative analysis using machine learning approach for thunderstorm prediction in southern region of Peninsular Malaysia
publisher Springer Science and Business Media Deutschland GmbH
publishDate 2023
url http://eprints.utm.my/107595/
http://dx.doi.org/10.1109/SIPDA59763.2023.10349193