Classification of Diabetes Mellitus using Ensemble Algorithms

Bibliographic Details
Main Authors: Noor, N.A.B.S., Elamvazuthi, I., Yahya, N.
Format: Conference or Workshop Item
Published: Institute of Electrical and Electronics Engineers Inc. 2021
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85124145391&doi=10.1109%2fICIAS49414.2021.9642508&partnerID=40&md5=1d27bf9ffd020cabf2625a9327eb2990
http://eprints.utp.edu.my/29205/
Description
Summary: Diabetes Mellitus (DM) is one of the most prevalent diseases in the world today. It is characterized by high glucose levels in the body, caused either by inadequate production of insulin or by the body's cells not responding to the insulin produced. Data mining and machine learning techniques can be extremely useful in the classification of DM, given the need to shift from current traditional methods, which use sharp needles to draw blood, towards a non-invasive method. The objective of this study is to perform DM classification using various machine learning algorithms. In this paper, individual classifiers such as Support Vector Machine, Naïve Bayes, Bayes Net, Decision Stump, k-Nearest Neighbors, Logistic Regression, Multilayer Perceptron and Decision Tree are evaluated. In addition, ensemble methods are studied: bagging, boosting, hybrid classifiers that combine Random Forest with other base classifiers, and the Random Forest ensemble algorithm itself. The proposed DM classification model is selected as the optimized model with the best accuracy and overall performance. In this research, the ensemble method using the hybrid Random Forest - Bayes Net classifier was found to be the best DM classification model, with an accuracy of 83.91% and an AUC of 0.904 on the Pima Indian Diabetes Dataset (PIDD). © 2021 IEEE.
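
The summary reports results as accuracy and AUC for a hybrid of two classifiers. The record does not say how the authors combined the models or which toolkit they used, so the following is only a minimal sketch under stated assumptions: it assumes the two models' positive-class probabilities are averaged (soft voting), and every name and number in it (average_probabilities, rf_probs, bn_probs, the toy labels) is hypothetical and not taken from the paper or from PIDD. It is meant to illustrate how the two reported metrics can be computed, not to reproduce the study.

```python
from typing import List, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of labels predicted correctly."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative counts as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_probabilities(*model_probs: Sequence[float]) -> List[float]:
    """Assumed combiner (soft voting): mean positive-class probability per sample."""
    return [sum(ps) / len(ps) for ps in zip(*model_probs)]


# Toy, made-up values purely for illustration (not from PIDD or the paper).
y_true = [1, 0, 1, 0, 1, 0]
rf_probs = [0.9, 0.2, 0.6, 0.4, 0.3, 0.1]  # hypothetical Random Forest outputs
bn_probs = [0.7, 0.4, 0.8, 0.2, 0.5, 0.3]  # hypothetical Bayes Net outputs

hybrid_probs = average_probabilities(rf_probs, bn_probs)
hybrid_pred = [1 if p >= 0.5 else 0 for p in hybrid_probs]

print(f"accuracy = {accuracy(y_true, hybrid_pred):.3f}")
print(f"AUC      = {roc_auc(y_true, hybrid_probs):.3f}")
```

Accuracy depends on the 0.5 decision threshold chosen here, while AUC depends only on how the scores rank positives against negatives, which is why studies of this kind often report both.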