Extreme Gradient Boosting (XGBoost) and Random Forest (RF) Hybrid Ensemble with Bayesian Optimization in Landslide Susceptibility Mapping

Bibliographic Details
Main Author: Dorothy, Anak Martin Atok
Format: Thesis
Language: en
Published: UNIMAS 2025
Online Access:http://ir.unimas.my/id/eprint/48367/5/DOROTHY_Student%20Declaration%20of%20Original%20Work.pdf
http://ir.unimas.my/id/eprint/48367/7/Extreme%20Gradient%20Boosting%20%28XGBoost%29%20and%20Random%20Forest%20%28RF%29%20Hybrid%20Ensemble%20with%20Bayesian%20Optimization%20in%20Landslide%20Susceptibility%20Mapping%20%2824%20pgs%29.pdf
http://ir.unimas.my/id/eprint/48367/6/Extreme%20Gradient%20Boosting%20%28XGBoost%29%20and%20Random%20Forest%20%28RF%29%20Hybrid%20Ensemble%20with%20Bayesian%20Optimization%20in%20Landslide%20Susceptibility%20Mapping.pdf
http://ir.unimas.my/id/eprint/48367/
Description
Summary: Many models have been developed for predicting landslide vulnerability, but they have shortcomings, particularly overfitting and overestimation. The objective of this study is therefore to assess and enhance the performance of Extreme Gradient Boosting (XGBoost) and Random Forest (RF) in predicting landslide susceptibility by combining the two models and applying the Bayesian Optimization (BO) algorithm. The study area is Penang Island, a landslide-prone region in Malaysia. Topographical, hydrological, anthropogenic, and external factors that influence landslides were considered. Geographic Information Systems (GIS) were used to create the spatial databases of all the landslide conditioning factors. With an area under the curve (AUC) of 100.0% for the success rate (SR) and 97.1% for the prediction rate (PR), the results indicate that optimized XGBoost performed best, followed by random forest (SR: 99.3%, PR: 95.9%) and the stacked model (SR: 96.8%, PR: 95.6%). The final AUC findings showed that stacking and optimizing the hyperparameters can increase the algorithms' performance accuracy by overcoming the overfitting and overestimation issues. Landslide susceptibility maps have many applications and are crucial for site selection, engineering structures, and disaster avoidance.
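
The reported figures can be compared with a little arithmetic. The sketch below is illustrative only and is not taken from the thesis. It assumes that SR is the AUC computed on the training landslide cells and PR the AUC on the validation cells, and that the difference SR - PR can serve as a rough indicator of overfitting; both are assumptions, as the summary does not state them. The names `auc_from_scores`, `REPORTED_AUC`, and `sr_pr_gaps` are hypothetical, and the AUC values are the percentages quoted in the summary.

```python
# Illustrative sketch, not the thesis's code.
# Assumption: SR = AUC on training data, PR = AUC on validation data,
# and the SR - PR gap is read as a rough overfitting indicator.


def auc_from_scores(labels, scores):
    """Rank-based AUC: the fraction of (landslide, non-landslide) pairs
    in which the landslide cell gets the higher susceptibility score.
    Ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# (SR %, PR %) as quoted in the summary.
REPORTED_AUC = {
    "XGBoost (optimized)": (100.0, 97.1),
    "Random Forest": (99.3, 95.9),
    "Stacked": (96.8, 95.6),
}


def sr_pr_gaps(reported):
    """Return SR - PR in percentage points for each model."""
    return {name: round(sr - pr, 1) for name, (sr, pr) in reported.items()}


if __name__ == "__main__":
    for name, gap in sr_pr_gaps(REPORTED_AUC).items():
        print(f"{name}: SR - PR = {gap} percentage points")
```

Under these assumptions, the stacked model has the smallest gap between SR and PR of the three, even though its absolute AUC values are the lowest.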