Enhancing house price prediction using hybrid feature selection: A combination of information gain and SVM-RFE

Bibliographic Details
Main Author: Low, Jun Liang
Format: Final Year Project / Dissertation / Thesis
Published: 2025
Online Access: http://eprints.utar.edu.my/6142/1/Final_Year_Project__Low_Jun_Liang_%2D_JUN_LIANG_LOW.pdf
http://eprints.utar.edu.my/6142/
Description
Summary: Accurate house price prediction is crucial for buyers, investors, and policymakers to make informed decisions. However, real estate datasets often contain high-dimensional features, including redundant and irrelevant attributes, which can degrade model performance. This study proposes a hybrid feature selection approach that combines Information Gain (IG) and Support Vector Machine Recursive Feature Elimination (SVM-RFE) to enhance predictive accuracy. The proposed hybrid method significantly improves model performance, achieving a 22.2% reduction in Root Mean Squared Error (RMSE) (from 185,518.52 to 154,403.70) and a 22.7% increase in R-squared (from 0.6522 to 0.8008) compared to using IG alone. While IG is effective at ranking features by their relevance to the target variable, it scores each feature independently and so does not account for feature interactions or redundancy, which can lead to suboptimal feature selection. Adding SVM-RFE addresses this limitation by iteratively refining the feature set so that only the most informative attributes are retained. Furthermore, the hybrid approach remained robust in the presence of artificially introduced noise. Hyperparameter tuning further optimized the best-performing model, yielding marginal improvements in accuracy. These findings highlight the effectiveness of combining filter and wrapper methods for real estate price prediction, demonstrating that hybrid feature selection leads to more reliable and interpretable models.
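
The filter-then-wrapper idea described in the summary can be illustrated with a minimal sketch. This is not the thesis's implementation, which this record does not describe. It assumes scikit-learn, uses `mutual_info_regression` as a stand-in for Information Gain on a continuous target, and runs on synthetic data. The function name `hybrid_select` and the parameters `n_filter` and `n_final` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFE, mutual_info_regression
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def hybrid_select(X, y, n_filter=20, n_final=8, random_state=0):
    """Return sorted column indices kept by a filter stage, then a wrapper stage.

    Hypothetical helper: names and default sizes are illustrative assumptions.
    """
    # Stage 1 (filter): score each feature against the target independently.
    # Mutual information is used here as a stand-in for Information Gain.
    scores = mutual_info_regression(X, y, random_state=random_state)
    filtered = np.argsort(scores)[::-1][:n_filter]

    # Standardize features and target so the linear SVM weights are comparable.
    X_scaled = StandardScaler().fit_transform(X[:, filtered])
    y_scaled = (y - y.mean()) / y.std()

    # Stage 2 (wrapper): recursive feature elimination with a linear SVM,
    # removing the feature with the smallest weight magnitude each round.
    rfe = RFE(SVR(kernel="linear"), n_features_to_select=n_final, step=1)
    rfe.fit(X_scaled, y_scaled)
    return np.sort(filtered[rfe.support_])


# Synthetic stand-in for a housing table: 30 features, 6 of them informative.
X, y = make_regression(
    n_samples=200, n_features=30, n_informative=6, noise=10.0, random_state=0
)
selected = hybrid_select(X, y)
print("Selected feature indices:", selected)
```

The filter stage cheaply discards features with little individual relevance. The wrapper stage then evaluates the survivors jointly, which is where redundancy between features can be detected.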