Enhanced mechanism to handle missing data of Hadith classifier

Tree-structured modeling is a data mining technique that recursively partitions a data set into relatively homogeneous subgroups in order to make more accurate predictions on future instances. Decision tree algorithms can deal with missing or wrong values. While this abili...

Bibliographic Details
Main Authors: Aldhlan, Kawther A., Zeki, Ahmed M., Zeki, Akram M.
Format: Conference or Workshop Item
Language: English
Published: 2011
Subjects: T Technology (General)
Online Access: http://irep.iium.edu.my/11307/2/Enhanced_mechanism_to_handle_missing_data_of_Hadith_classifier_11307.pdf
http://irep.iium.edu.my/11307/
http://www.ontariointernational.org/ConferenceMalaysia2011.htm
id my.iium.irep.11307
record_format dspace
spelling my.iium.irep.11307 2014-12-08T07:52:19Z http://irep.iium.edu.my/11307/ Enhanced mechanism to handle missing data of Hadith classifier Aldhlan, Kawther A. Zeki, Ahmed M. Zeki, Akram M. T Technology (General) 2011-12-05 Conference or Workshop Item REM application/pdf en http://irep.iium.edu.my/11307/2/Enhanced_mechanism_to_handle_missing_data_of_Hadith_classifier_11307.pdf
Aldhlan, Kawther A. and Zeki, Ahmed M. and Zeki, Akram M. (2011) Enhanced mechanism to handle missing data of Hadith classifier. In: International Conference on Sustainable Development 2011, 5-7 December 2011, Putrajaya, Malaysia. (Unpublished) http://www.ontariointernational.org/ConferenceMalaysia2011.htm
institution Universiti Islam Antarabangsa Malaysia
building IIUM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider International Islamic University Malaysia
content_source IIUM Repository (IREP)
url_provider http://irep.iium.edu.my/
language English
topic T Technology (General)
description Tree-structured modeling is a data mining technique that recursively partitions a data set into relatively homogeneous subgroups in order to make more accurate predictions on future instances. Decision tree algorithms can deal with missing or wrong values. While this ability is considered an advantage, the considerable effort required to achieve it is considered a drawback. When a tested feature is missing, the correct branch to take is unknown, so the algorithm must employ enhanced mechanisms to handle missing values. Moreover, ignoring missing data may lead users or administrators to critical decisions, especially in cases that concern religion. The Hadith classifier is a method for classifying a Hadith into one of four major classes, Sahih, Hasan, Da'ef and Maudo', according to the status of its Isnad (chain of narrators). This research produced a mechanism to deal with missing data in a Hadith database. The sample consisted of 999 Hadiths from Sahih Al-Bukhari, Jami'u Al-Termithi and Selseelt AlaHadith Aldae'ifah w' Almadu'h. The attributes of the Hadith database were obtained according to the validation methods of Hadith science, and the experiment applied the C4.5 algorithm to extract the classification rules. The experiment had two phases, training and testing. In the first phase, the machine learnt from the training dataset while a detector found missing data and replaced each missing value with the correct attribute according to the validity method. In the second phase, the machine detected any missing data, replaced it with the correct attribute, and dealt with passive narrator chains. The findings showed that the proposed approach improved the accuracy rate of the classifier by 1.65%; on the other hand, the time to build the classifier was affected by 0.05 seconds. With the naïve Bayes algorithm, the accuracy rate improved by 0.6% and, in contrast to the C4.5 algorithm, the time to build the classifier remained unchanged at 0.02 seconds. Furthermore, in both cases the accuracy rate of the classifier was positively affected by the size of the training dataset.
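The detect-and-replace step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the replacement rule used by the "validity method", so this sketch substitutes a common stand-in: each missing attribute is filled with the most frequent known value seen in the training data. The attribute names (`chain`, `narrator`), the label key (`grade`), the use of `None` as the missing marker, and the toy records are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict

MISSING = None  # assumption: a missing attribute is stored as None


def learn_modes(records, label_key="grade"):
    """Training phase: record the most frequent known value of each attribute."""
    counts = defaultdict(Counter)
    for rec in records:
        for attr, value in rec.items():
            if attr != label_key and value is not MISSING:
                counts[attr][value] += 1
    return {attr: c.most_common(1)[0][0] for attr, c in counts.items()}


def fill_missing(record, modes):
    """Detector: return a copy of the record with missing attributes replaced.

    Attributes never observed during training stay missing.
    """
    return {
        attr: modes.get(attr, MISSING) if value is MISSING else value
        for attr, value in record.items()
    }


# Hypothetical toy records; real attributes would come from Hadith science.
training = [
    {"chain": "connected", "narrator": "trusted", "grade": "Sahih"},
    {"chain": "connected", "narrator": None, "grade": "Sahih"},
    {"chain": "broken", "narrator": "weak", "grade": "Da'ef"},
    {"chain": None, "narrator": "trusted", "grade": "Hasan"},
]

modes = learn_modes(training)                          # phase 1: learn fill values
cleaned = [fill_missing(rec, modes) for rec in training]
test_record = fill_missing({"chain": None, "narrator": "weak"}, modes)  # phase 2
```

The cleaned records could then be passed to any decision tree or naïve Bayes learner. Applying the same fill values in both phases keeps training and test data consistent, which is one plausible reading of the two-phase design in the abstract.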
format Conference or Workshop Item
author Aldhlan, Kawther A.
Zeki, Ahmed M.
Zeki, Akram M.
title Enhanced mechanism to handle missing data of Hadith classifier