Logrank: a clickstream-based web page importance metric for web crawlers

Bibliographic Details
Main Authors: Ahmadi Abkenari, F., Selamat, Ali
Format: Article
Published: 2012
Subjects:
Online Access: http://eprints.utm.my/id/eprint/47162/
Description
Summary: Extracting information from the Web and ranking Web pages more precisely remain open issues in search engine research because of the ever-growing and dynamic nature of the World Wide Web. Proposing novel approaches or enhancing existing algorithms is therefore a concern of many researchers in this field. Because the performance of any Web crawler depends heavily on the Web page importance metric it applies, and given the obstacles of existing link-dependent or context-based metrics, innovative heuristics that guarantee accurate search results and make better use of resources are in high demand. This paper introduces a novel link-independent, clickstream-based Web page importance metric, illustrates the metric's effectiveness by testing it experimentally on the UTM University Web domain, and evaluates the results with information retrieval evaluation measures.