A clickstream-based web page significance ranking metric for web crawlers

The rapid, unpredictable growth of the World Wide Web and its dynamic nature pose considerable obstacles for Web crawlers, including scaling issues and the presence of incorrect and irrelevant answers in search result sets. Hence, solutions that are more promising are in dem...

Bibliographic Details
Main Authors: Selamat, Ali, Ahmadi-Abkenari, Fatemeh
Format: Conference or Workshop Item
Published: 2011
Online Access: http://eprints.utm.my/id/eprint/45469/
http://dx.doi.org/10.1109/MySEC.2011.6140674
id my.utm.45469
record_format eprints
spelling my.utm.45469 2017-08-29T00:59:19Z http://eprints.utm.my/id/eprint/45469/ Selamat, Ali and Ahmadi-Abkenari, Fatemeh (2011) A clickstream-based web page significance ranking metric for web crawlers. In: The 5th Malaysian Software Engineering Conference (Mysec 2011). Conference or Workshop Item. PeerReviewed. http://dx.doi.org/10.1109/MySEC.2011.6140674
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
description The rapid, unpredictable growth of the World Wide Web and its dynamic nature pose considerable obstacles for Web crawlers, including scaling issues and the presence of incorrect and irrelevant answers in search result sets. More promising solutions are therefore needed to provide more accurate search results. Existing Web page importance metrics, whether link-based or context-based, cannot by themselves guarantee coverage of authorized, fresh Web content or resolve accuracy concerns when implemented within a parallel crawler, so employing these metrics is not a final approach within a search engine's architecture. This paper analyzes clickstream data to discover the popularity of Web pages in the crawl frontier. It proposes the metric itself and presents experimental results on ranking the UTM Web pages with the proposed metric.
format Conference or Workshop Item
author Selamat, Ali
Ahmadi-Abkenari, Fatemeh
title A clickstream-based web page significance ranking metric for web crawlers
publishDate 2011
url http://eprints.utm.my/id/eprint/45469/
http://dx.doi.org/10.1109/MySEC.2011.6140674