Thunderstorm Prediction Model Using Hybrid Clustering and Machine Learning Approach

Bibliographic Details
Main Authors: Shirley, Rufus; Noor Azlinda, Ahmad; Zulkurnain, Abdul Malek; Noradlina, Abdullah; Nurul ‘Izzati, Hashim; Asrani, Lit
Format: Proceeding
Language: English
Published: 2025
Online Access: http://ir.unimas.my/id/eprint/50328/1/Thunderstorm.pdf
http://ir.unimas.my/id/eprint/50328/
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11108819
Description
Summary: This study presents a novel thunderstorm prediction model leveraging a hybrid approach that integrates the Synthetic Minority Oversampling Technique (SMOTE), K-Means clustering and machine learning (ML) models. Using historical lightning and meteorological data from the southern region of Peninsular Malaysia, the study evaluates the performance of five ML models, namely Decision Tree (DT), Random Forest (RF), Extra Trees (ET), Gradient Boosting (GB), and Extreme Gradient Boosting (XGBoost), based on standard performance evaluation metrics such as accuracy, precision, recall, and F1-score. The results demonstrate that ensemble methods, particularly RF, consistently outperform other models across all three clusters, achieving prediction accuracy exceeding 95%. These findings underscore the effectiveness of RF in capturing data complexities and making accurate thunderstorm predictions. The study further emphasizes the role of balanced datasets through SMOTE and robust clustering techniques in enhancing model reliability. Future work will focus on integrating real-time data and incorporating additional meteorological variables to further improve predictive performance.
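
Illustrative sketch: the summary names three building blocks (SMOTE, K-Means clustering, and an ML classifier evaluated per cluster). The Python sketch below shows one way such a pipeline could be wired together with NumPy and scikit-learn. It is not the authors' implementation. The data are randomly generated stand-ins rather than the study's lightning and meteorological records, the four features are hypothetical, the `smote` function is a minimal hand-written version of the interpolation idea, and the order of steps (cluster first, then oversample only the training split of each cluster) and the Random Forest settings are assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors


def smote(X_min, n_new, k=5, rng=None):
    """Minimal SMOTE: each synthetic point is a random interpolation between
    a minority sample and one of its k nearest minority neighbours."""
    if len(X_min) < 2:
        raise ValueError("need at least two minority samples")
    rng = np.random.default_rng(rng)
    k = min(k, len(X_min) - 1)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    neigh = nn.kneighbors(X_min, return_distance=False)[:, 1:]  # drop self
    base = rng.integers(0, len(X_min), size=n_new)
    picked = neigh[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[picked] - X_min[base])


def evaluate_per_cluster(X, y, n_clusters=3, seed=0):
    """Cluster with K-Means, then balance, train and score one RF per cluster.
    Assumes class 1 is the minority class in every cluster."""
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(X)
    results = {}
    for c in range(n_clusters):
        Xc, yc = X[labels == c], y[labels == c]
        X_tr, X_te, y_tr, y_te = train_test_split(
            Xc, yc, test_size=0.25, stratify=yc, random_state=seed
        )
        # Oversample the minority class in the training split only,
        # so the test split keeps its original class balance.
        n_new = int((y_tr == 0).sum() - (y_tr == 1).sum())
        X_syn = smote(X_tr[y_tr == 1], n_new, rng=seed)
        X_bal = np.vstack([X_tr, X_syn])
        y_bal = np.concatenate([y_tr, np.ones(n_new, dtype=int)])
        model = RandomForestClassifier(n_estimators=100, random_state=seed)
        pred = model.fit(X_bal, y_bal).predict(X_te)
        results[c] = {
            "accuracy": accuracy_score(y_te, pred),
            "precision": precision_score(y_te, pred, zero_division=0),
            "recall": recall_score(y_te, pred, zero_division=0),
            "f1": f1_score(y_te, pred, zero_division=0),
        }
    return results


# Synthetic stand-in data: three well-separated groups, four hypothetical
# features, and a rare positive class (roughly 5% of samples) in each group.
rng = np.random.default_rng(0)
n = 1500
centers = np.array([[0, 0, 0, 0], [8, 8, 0, 0], [-8, 8, 0, 0]], dtype=float)
noise = rng.normal(size=(n, 4))
X = centers[rng.integers(0, 3, size=n)] + noise
y = (noise[:, 0] + 0.5 * noise[:, 1] > 1.8).astype(int)

results = evaluate_per_cluster(X, y)
for cluster, metrics in results.items():
    print(cluster, {name: round(value, 3) for name, value in metrics.items()})
```

Because the positive class is rare, accuracy alone can look high even for a model that never predicts a thunderstorm, which is why the sketch also reports precision, recall, and F1-score, the same metrics the summary lists. The scores it prints describe only the synthetic data and say nothing about the study's results.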