Staff View: A comparative study between rough and decision tree classifiers

Rule-based classification (RBC) systems have been widely used in many real-world applications because rules are easy to interpret. RBC mines a collection of rules from the knowledge hidden in a dataset in order to accurately map new cases to the decision class. In the real world, the number o...

Bibliographic Details
Main Author:	Mohamad Mohsin, Mohamad Farhan
Format:	Monograph
Language:	English
Published:	Universiti Utara Malaysia 2008
Subjects:	H Social Sciences (General)
Online Access:	http://repo.uum.edu.my/7807/1/fAR.pdf http://repo.uum.edu.my/7807/3/1.Mohamad%20Farhan.pdf http://repo.uum.edu.my/7807/ http://lintas.uum.edu.my:8080/elmu/index.jsp?module=webopac-l&action=fullDisplayRetriever.jsp&szMaterialNo=0000301019

id	my.uum.repo.7807
record_format	eprints
spelling	my.uum.repo.7807 2014-07-08T01:16:21Z http://repo.uum.edu.my/7807/ A comparative study between rough and decision tree classifiers Mohamad Mohsin, Mohamad Farhan H Social Sciences (General) Rule-based classification (RBC) systems have been widely used in many real-world applications because rules are easy to interpret. RBC mines a collection of rules from the knowledge hidden in a dataset in order to accurately map new cases to the decision class. In the real world, the number of attributes in a dataset can be very large because database technology is capable of storing a great deal of information. As a result, a large dataset may contain thousands of relationships, and it is likely to provide more knowledge, since the interrelationships between data give more description. Furthermore, the resulting model is also likely to contain a large number of rules, some of them unnecessary or redundant. Theoretically, a good set of knowledge should provide good accuracy when dealing with new cases. Besides accuracy, a good rule set must also have a minimum number of rules, and each rule should be as short as possible. Often, a rule set that contains fewer rules has rules with more conditions. An ideal model should produce fewer, shorter rules and classify new data with good accuracy. Consequently, compact, high-quality knowledge provides managers with a good decision model. For that reason, the search for an appropriate data mining approach that can provide quality knowledge is important. The rough classifier (RC) and the decision tree classifier (DTC) are both categorized as RBC. The purpose of this study is to investigate the capability of RC and DTC to generate quality knowledge that leads to good accuracy. To that end, the two classifiers are compared on four measures: classification accuracy, number of rules, rule length, and rule coverage. Five datasets from the UCI Machine Learning Repository, namely United States Congressional Voting Records, Credit Approval, Wisconsin Diagnostic Breast Cancer, Pima Indians Diabetes Database, and Vehicle Silhouettes, were chosen as experimental data. All datasets were mined using the RC toolkit ROSETTA, while the C4.5 algorithm in the WEKA application was chosen as the DTC rule generator. The experimental results indicated that both classifiers produced good classification results and generated quality rules in different respects: higher accuracy, fewer rules, shorter rules, and higher coverage. In terms of accuracy, RC obtained higher accuracy on average, while DTC generated a significantly lower number of rules than RC. In terms of rule length, RC produced more compact and shorter rules than DTC, although the difference in length is not significant. Meanwhile, RC has better coverage than DTC. The final conclusion is as follows: “If the user is interested in a variety of rule patterns with good accuracy and the number of rules is not important, RC is the best solution, whereas if the user looks for fewer rules, DTC might be the best choice.” Universiti Utara Malaysia 2008 Monograph NonPeerReviewed application/pdf en http://repo.uum.edu.my/7807/1/fAR.pdf application/pdf en http://repo.uum.edu.my/7807/3/1.Mohamad%20Farhan.pdf Mohamad Mohsin, Mohamad Farhan (2008) A comparative study between rough and decision tree classifiers. Project Report. Universiti Utara Malaysia. (Unpublished) http://lintas.uum.edu.my:8080/elmu/index.jsp?module=webopac-l&action=fullDisplayRetriever.jsp&szMaterialNo=0000301019
institution	Universiti Utara Malaysia
building	UUM Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Utara Malaysia
content_source	UUM Institutional Repository
url_provider	http://repo.uum.edu.my/
language	English
topic	H Social Sciences (General)
spellingShingle	H Social Sciences (General) Mohamad Mohsin, Mohamad Farhan A comparative study between rough and decision tree classifiers
description	Rule-based classification (RBC) systems have been widely used in many real-world applications because rules are easy to interpret. RBC mines a collection of rules from the knowledge hidden in a dataset in order to accurately map new cases to the decision class. In the real world, the number of attributes in a dataset can be very large because database technology is capable of storing a great deal of information. As a result, a large dataset may contain thousands of relationships, and it is likely to provide more knowledge, since the interrelationships between data give more description. Furthermore, the resulting model is also likely to contain a large number of rules, some of them unnecessary or redundant. Theoretically, a good set of knowledge should provide good accuracy when dealing with new cases. Besides accuracy, a good rule set must also have a minimum number of rules, and each rule should be as short as possible. Often, a rule set that contains fewer rules has rules with more conditions. An ideal model should produce fewer, shorter rules and classify new data with good accuracy. Consequently, compact, high-quality knowledge provides managers with a good decision model. For that reason, the search for an appropriate data mining approach that can provide quality knowledge is important. The rough classifier (RC) and the decision tree classifier (DTC) are both categorized as RBC. The purpose of this study is to investigate the capability of RC and DTC to generate quality knowledge that leads to good accuracy. To that end, the two classifiers are compared on four measures: classification accuracy, number of rules, rule length, and rule coverage. Five datasets from the UCI Machine Learning Repository, namely United States Congressional Voting Records, Credit Approval, Wisconsin Diagnostic Breast Cancer, Pima Indians Diabetes Database, and Vehicle Silhouettes, were chosen as experimental data. All datasets were mined using the RC toolkit ROSETTA, while the C4.5 algorithm in the WEKA application was chosen as the DTC rule generator. The experimental results indicated that both classifiers produced good classification results and generated quality rules in different respects: higher accuracy, fewer rules, shorter rules, and higher coverage. In terms of accuracy, RC obtained higher accuracy on average, while DTC generated a significantly lower number of rules than RC. In terms of rule length, RC produced more compact and shorter rules than DTC, although the difference in length is not significant. Meanwhile, RC has better coverage than DTC. The final conclusion is as follows: “If the user is interested in a variety of rule patterns with good accuracy and the number of rules is not important, RC is the best solution, whereas if the user looks for fewer rules, DTC might be the best choice.”
format	Monograph
author	Mohamad Mohsin, Mohamad Farhan
author_facet	Mohamad Mohsin, Mohamad Farhan
author_sort	Mohamad Mohsin, Mohamad Farhan
title	A comparative study between rough and decision tree classifiers
title_short	A comparative study between rough and decision tree classifiers
title_full	A comparative study between rough and decision tree classifiers
title_fullStr	A comparative study between rough and decision tree classifiers
title_full_unstemmed	A comparative study between rough and decision tree classifiers
title_sort	comparative study between rough and decision tree classifiers
publisher	Universiti Utara Malaysia
publishDate	2008
url	http://repo.uum.edu.my/7807/1/fAR.pdf http://repo.uum.edu.my/7807/3/1.Mohamad%20Farhan.pdf http://repo.uum.edu.my/7807/ http://lintas.uum.edu.my:8080/elmu/index.jsp?module=webopac-l&action=fullDisplayRetriever.jsp&szMaterialNo=0000301019
_version_	1644279647597232128
score	13.211869

Similar Items