A modified π rough k-means algorithm for web page recommendation system
Main Author: | |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2018 |
Online Access: | http://psasir.upm.edu.my/id/eprint/68816/1/FSKTM%202018%2023%20-%20IR.pdf http://psasir.upm.edu.my/id/eprint/68816/ |
Summary: | A Web page recommendation system is an application of Web Usage Mining (WUM)
that predicts a user's next browsing activity in real time in order to deliver
personalized recommendations. To date, many studies have investigated the use of
data mining techniques (e.g., clustering) in Web page applications. Most of this
research uses partitional clustering algorithms to discover user profiles and
thereby improve the quality of recommendations. However, current solutions achieve
accuracy only in the range of 60-70%. This is attributed to a weakness of the
partitive clustering criterion in Web page recommendation systems: it does not
handle overlapping profiles well, an issue that requires more attention.
To tackle this problem, a modified algorithm for Web page recommendation is
proposed. The ultimate goal is to improve recommendation quality and thereby
increase prediction accuracy. To this end, the study pursues three objectives in
support of the modified clustering algorithm. Firstly, an extended K-Means
clustering algorithm (called the X-Means algorithm) is proposed to filter noise
from user session data, eliminating outliers and irrelevant pages. Secondly, a
modified πRKM algorithm is proposed to partition the user session data. The
modified πRKM produces a better partition by identifying the objects that overlap
between the correct clusters, and it can re-partition the data using the
indiscernibility relation function. Thirdly, a local and global similarity
algorithm is proposed to classify the current user's page requests and produce
recommendations.
Several datasets were used in extensive experiments, as follows. Firstly, the Iris
and Vowel datasets were used to assess the effectiveness of the proposed modified
πRKM, with rough classifier assessment strategies used to measure the quality of
the overlapping classes. The experimental results revealed that the modified πRKM
algorithm performed better than the previous version in correctly identifying
overlapping objects between positive clusters. Secondly, the CTI dataset, which
existing research has shown to be a more suitable set of Web server logs for
assessing Web page recommendation quality, was used to measure the performance of
the proposed modified algorithm for the Web page recommendation system. The
experiment was divided into three interdependent stages of the usage mining
process: data preparation, pattern discovery, and recommendation. In the data
preparation stage, the quality of the prepared data was measured by the Local
Outlier Factor (LOF) method. The experimental results revealed that the degree of
outlying user sessions was lower than with the previous method. In the pattern
discovery stage, the user session partitions produced by the modified πRKM
algorithm were measured by the Davies-Bouldin Index (DBI). The experimental results
revealed that the modified πRKM algorithm significantly affected the quality of
the cluster partitions obtained. In the third stage, the results of the
recommendation engine were measured using three accuracy parameters, namely
Precision, Coverage, and F-measure. The proposed modified algorithm for the Web
page recommendation system achieved an accuracy of 76-82%, significantly
outperforming the previous work. |
---|
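
The summary names Precision, Coverage, and F-measure as the accuracy parameters for the recommendation stage but does not state their formulas. The sketch below assumes the set-based definitions that are common in Web usage mining: precision is the share of recommended pages the user actually visited, coverage is the share of visited pages that were recommended, and F-measure is their harmonic mean. The thesis may define them differently. The function name `recommendation_metrics` and the sample page identifiers are hypothetical, chosen only for illustration.

```python
def recommendation_metrics(recommended, visited):
    """Return (precision, coverage, f_measure) for one evaluation session.

    Assumed definitions (not taken from the thesis):
      precision = |recommended ∩ visited| / |recommended|
      coverage  = |recommended ∩ visited| / |visited|
      f_measure = harmonic mean of precision and coverage
    """
    rec = set(recommended)
    seen = set(visited)
    hits = len(rec & seen)

    # Guard against empty sets so the ratios are always defined.
    precision = hits / len(rec) if rec else 0.0
    coverage = hits / len(seen) if seen else 0.0

    if precision + coverage == 0:
        return precision, coverage, 0.0
    f_measure = 2 * precision * coverage / (precision + coverage)
    return precision, coverage, f_measure


# Hypothetical example: four pages recommended, three pages actually visited.
precision, coverage, f_measure = recommendation_metrics(
    ["a", "b", "c", "d"], ["b", "c", "e"]
)
print(f"precision={precision:.3f} coverage={coverage:.3f} F={f_measure:.3f}")
```

Under these assumed definitions, a system that recommends many pages tends to gain coverage at the cost of precision, which is why a combined score such as the F-measure is usually reported alongside the two.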