An accident prediction model based on ARIMA in Kuala Lumpur, Malaysia using time series of actual accidents and related data


Bibliographic Details
Main Authors: Choo, Boon Chong, Abdul Razak, Musab, Mohd Tohir, Mohd Zahirasri, Awang Biak, Dayang Radiah, Syam, Syafiie
Format: Article
Language: English
Published: Universiti Putra Malaysia Press 2023
Online Access: http://psasir.upm.edu.my/id/eprint/106503/1/07%20JST-4635-2023.pdf
http://psasir.upm.edu.my/id/eprint/106503/
http://www.pertanika.upm.edu.my/pjst/browse/regular-issue?article=JST-4635-2023
Description
Summary: Recently, there has been an emerging trend to analyse time series data and to use sophisticated tools to fit time series models optimally. To date, Malaysian industrial accident data have been underutilised and lack informative records. Thus, this paper aims to investigate the Malaysian accident database and to evaluate the optimal forecasting models for accident prediction. The models' input was based on data available from the Department of Occupational Safety and Health, Malaysia (DOSH) from 2018 until 2021, with 80% of the dataset used to train the models and the remaining 20% used for validation. Prediction using the negative binomial and Poisson distributions gave a mean absolute percentage error (MAPE) of 33% and 51%, respectively, indicating that the negative binomial distribution performed better than the Poisson distribution in predicting accident frequency. The available time series accident data covered four years, and stationarity was checked with the Augmented Dickey-Fuller test in R Studio. The ARIMA models were considered after the data showed autocorrelation. The lowest Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC) and other error values were used to justify the best model, which was ARIMA(2,0,2)(2,0,0)[12]. The MAPE values for ARIMA in R and for the manual time series were 40% and 49%, respectively. Therefore, accident prediction using R Studio would outperform the manually computed negative binomial and Poisson distribution predictions. Based on the findings, industrial safety practitioners should report accidents to DOSH truthfully in the era of digitalisation, which could enable future data-driven accident predictions to be carried out.
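
The 80/20 train/validation split and the MAPE metric mentioned in the summary can be illustrated with a minimal sketch. This is not the authors' code (the summary says they worked in R Studio). The function names, the monthly counts and the naive mean baseline are all hypothetical, chosen only to show how a chronological split and MAPE might be computed.

```python
from typing import List, Sequence, Tuple


def chronological_split(
    series: Sequence[float], train_fraction: float = 0.8
) -> Tuple[List[float], List[float]]:
    """Split a time series into train/validation parts without shuffling."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = int(len(series) * train_fraction)
    return list(series[:cut]), list(series[cut:])


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, expressed in percent."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical monthly accident counts (illustrative only, not DOSH data).
monthly_counts = [10, 12, 8, 15, 11, 9, 14, 13, 10, 20]
train, validation = chronological_split(monthly_counts)

# Assumed naive baseline: forecast every validation month with the training mean.
baseline = sum(train) / len(train)
forecast = [baseline] * len(validation)
print(f"MAPE of naive baseline: {mape(validation, forecast):.1f}%")
```

The split keeps the observations in time order, so the validation block always follows the training block. A fitted model's forecasts (for example from an ARIMA fit) would take the place of the naive baseline when computing MAPE.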