Bayesian logistic regression model on risk factors of type 2 diabetes mellitus
Main Author: | |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2016 |
Subjects: | |
Online Access: | http://psasir.upm.edu.my/id/eprint/69118/1/FS%202016%2045%20UPM%20IR.pdf http://psasir.upm.edu.my/id/eprint/69118/ |
Summary: | The logistic regression model is long established and is commonly used to analyse a
binary outcome (dependent variable) by relating it to several independent variables.
Estimates of the coefficients of these variables are obtained by the method of maximum
likelihood, from the frequentist point of view. Bayesian analysis, by contrast, allows
prior information to be incorporated: a prior distribution is assumed for each coefficient
of interest and is combined with the likelihood function to obtain the posterior
distribution.
The Bayesian logistic regression methods use the Metropolis-Hastings (random walk)
algorithm and the Gibbs sampler, with non-informative flat and non-informative non-flat
prior distributions, to obtain the posterior distribution for each coefficient. The flat
prior distribution is already widely used in different fields of study. This work also
incorporates a non-flat prior, which is the main focus of the research and, to the best of
our knowledge, has not previously been applied to any type 2 diabetes mellitus (T2DM)
dataset in Malaysia.
This study evaluates the following risk factors: age, ethnicity, gender, physical activity,
hypertension, body mass index, family history of diabetes and waist circumference. The
coefficients of these variables were first estimated by maximum likelihood to identify the
significant variables, and the coefficients of those significant variables were then
re-estimated using the Bayesian logistic regression (BLR) method. The BLR approach, via
both the Gibbs sampler and the random walk Metropolis algorithm, suggests that family
history of diabetes, waist circumference and body mass index are the significant risk
factors associated with type 2 diabetes mellitus. The posterior standard deviations of the
parameters are slightly smaller under the non-flat prior than under the non-informative
flat prior. Because the differences between the models are small, all the models appear
adequate and fit the data well. |
---|
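The summary describes combining a prior with the logistic likelihood and sampling the posterior with a random walk Metropolis algorithm. The sketch below illustrates that idea only; it is not the thesis's implementation. The data are simulated, the function names (`simulate`, `log_likelihood`, `log_prior`, `random_walk_metropolis`) and tuning values are hypothetical, and the optional Normal prior is merely a stand-in, since the record does not state which non-flat prior the thesis used.

```python
import math
import random


def simulate(n=200, beta=(-0.5, 1.0), seed=1):
    """Simulated data (an assumption, not the thesis dataset): intercept + one predictor."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        p = 1.0 / (1.0 + math.exp(-(beta[0] + beta[1] * x)))
        X.append([1.0, x])
        y.append(1 if rng.random() < p else 0)
    return X, y


def log_likelihood(beta, X, y):
    """Bernoulli log-likelihood of a logistic regression model."""
    total = 0.0
    for row, label in zip(X, y):
        eta = sum(b * x for b, x in zip(beta, row))
        # Numerically stable log(1 + exp(eta)).
        log1pexp = max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))
        total += label * eta - log1pexp
    return total


def log_prior(beta, prior_sd=None):
    """Flat prior when prior_sd is None; otherwise independent Normal(0, prior_sd^2).

    The Normal prior is a hypothetical stand-in for a non-flat prior.
    """
    if prior_sd is None:
        return 0.0
    return -0.5 * sum((b / prior_sd) ** 2 for b in beta)


def random_walk_metropolis(X, y, n_iter=3000, step=0.2, prior_sd=None, seed=0):
    """Sample the coefficient posterior with Gaussian random walk proposals."""
    rng = random.Random(seed)
    beta = [0.0] * len(X[0])
    current = log_likelihood(beta, X, y) + log_prior(beta, prior_sd)
    samples, accepted = [], 0
    for _ in range(n_iter):
        proposal = [b + rng.gauss(0.0, step) for b in beta]
        candidate = log_likelihood(proposal, X, y) + log_prior(proposal, prior_sd)
        # Symmetric proposal: accept with probability min(1, posterior ratio).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < candidate - current:
            beta, current = proposal, candidate
            accepted += 1
        samples.append(list(beta))
    return samples, accepted / n_iter


X, y = simulate()
samples, rate = random_walk_metropolis(X, y)
kept = samples[1000:]  # discard burn-in draws
slope_mean = sum(s[1] for s in kept) / len(kept)
slope_sd = math.sqrt(sum((s[1] - slope_mean) ** 2 for s in kept) / (len(kept) - 1))
```

Passing a value for `prior_sd` reruns the same sampler under the stand-in non-flat prior, so the posterior standard deviations from the two runs can be compared in the way the summary describes.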