Efficient document retrieval system using locality sensitive hashing nearest neighbor algorithm and weighted jaccard distance for retrieving closest personalities

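The process of retrieving significant documents based on the search key from a corpus has been a vital research problem in the information retrieval field. This paper proposes an efficient way to retrieve documents related to different personalities extracted from Wikipedia. The proposed method utilizes the Locality Sensitive Hashing Nearest Neighbor algorithm combined with Weighted Jaccard Distance to measure document similarity with enhanced precision. This document retrieval system demonstrates competitive performance compared to existing methods in the Personality Identification domain. The introduction of a document centroid normalization technique significantly improves the effectiveness of information retrieval by enabling better discrimination between documents. The personality document search results were compared for different distance measures using performance metrics like Normalized Discounted Cumulative Gain and Mean Average Precision. The results presented in this paper show that the TF-IDF scheme with Locality Sensitive Hashing Nearest Neighbor Algorithm using the Weighted Jaccard Distance can yield superior retrieval efficiency when contrasted with alternative approaches found in the existing literature.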
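
As an illustration of the weighted Jaccard distance named in the title, the following is a minimal sketch, not the authors' implementation. It assumes documents are represented as sparse dictionaries of non-negative term weights (for example TF-IDF scores), and it ranks neighbours by brute force rather than with locality sensitive hashing. The names `weighted_jaccard_distance`, `nearest`, and the sample weights are hypothetical.

```python
from typing import Dict, List, Tuple

# A document is a sparse map from term to a non-negative weight (e.g. TF-IDF).
Vector = Dict[str, float]


def weighted_jaccard_distance(a: Vector, b: Vector) -> float:
    """Return 1 - sum(min(a_i, b_i)) / sum(max(a_i, b_i)) over all terms."""
    terms = set(a) | set(b)
    overlap = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in terms)
    union = sum(max(a.get(t, 0.0), b.get(t, 0.0)) for t in terms)
    if union == 0.0:
        # Assumption: two empty documents are treated as identical.
        return 0.0
    return 1.0 - overlap / union


def nearest(query: Vector, corpus: Dict[str, Vector], k: int = 3) -> List[Tuple[str, float]]:
    """Brute-force k nearest documents; an LSH index would replace this scan."""
    scored = [(name, weighted_jaccard_distance(query, vec)) for name, vec in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1])[:k]


if __name__ == "__main__":
    # Hypothetical term weights, for illustration only.
    corpus = {
        "doc_a": {"physics": 2.0, "nobel": 1.0},
        "doc_b": {"football": 3.0, "club": 1.0},
    }
    print(nearest({"physics": 1.0, "nobel": 1.0}, corpus, k=1))
```

The brute-force scan compares the query against every document; an approximate index only narrows the candidate set before the same distance is applied.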

Bibliographic Details
Main Authors: SE. Ben Georgea, G. Jeba Rosline, N. Balasupramanian, N.R. Wilfred Blessing
Format: Article
Language: en
Published: Penerbit Universiti Kebangsaan Malaysia 2024
Online Access: http://journalarticle.ukm.my/25559/1/kejut_19.pdf
http://journalarticle.ukm.my/25559/
https://www.ukm.my/jkukm/volume-3604-2024/
building Tun Sri Lanang Library
collection Institutional Repository
content_provider Universiti Kebangsaan Malaysia
content_source UKM Journal Article Repository
continent Asia
country Malaysia
description The process of retrieving significant documents based on the search key from a corpus has been a vital research problem in the information retrieval field. This paper proposes an efficient way to retrieve documents related to different personalities extracted from Wikipedia. The proposed method utilizes the Locality Sensitive Hashing Nearest Neighbor algorithm combined with Weighted Jaccard Distance to measure document similarity with enhanced precision. This document retrieval system demonstrates competitive performance compared to existing methods in the Personality Identification domain. The introduction of a document centroid normalization technique significantly improves the effectiveness of information retrieval by enabling better discrimination between documents. The personality document search results were compared for different distance measures using performance metrics like Normalized Discounted Cumulative Gain and Mean Average Precision. The results presented in this paper show that the TF-IDF scheme with Locality Sensitive Hashing Nearest Neighbor Algorithm using the Weighted Jaccard Distance can yield superior retrieval efficiency when contrasted with alternative approaches found in the existing literature.
format Article
id my-ukm.journal.25559
institution Universiti Kebangsaan Malaysia
language en
publishDate 2024
publisher Penerbit Universiti Kebangsaan Malaysia
record_format eprints
spelling my-ukm.journal.25559 2025-07-14T08:21:27Z http://journalarticle.ukm.my/25559/ SE. Ben Georgea, G. Jeba Rosline, N. Balasupramanian, and N.R. Wilfred Blessing (2024) Efficient document retrieval system using locality sensitive hashing nearest neighbor algorithm and weighted jaccard distance for retrieving closest personalities. Jurnal Kejuruteraan, 36 (4). pp. 1535-1543. ISSN 0128-0198. Penerbit Universiti Kebangsaan Malaysia, 2024-07. Article, PeerReviewed, application/pdf, en. http://journalarticle.ukm.my/25559/1/kejut_19.pdf https://www.ukm.my/jkukm/volume-3604-2024/
url_provider http://journalarticle.ukm.my/
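
The description above names Normalized Discounted Cumulative Gain and Mean Average Precision as the evaluation metrics. The following is a minimal sketch of their common textbook definitions, not the paper's evaluation code. It assumes graded relevance scores listed in ranked order for NDCG and boolean hit lists for average precision, and it normalizes average precision by the number of relevant documents actually retrieved. All function names are hypothetical.

```python
import math
from typing import Sequence


def dcg(relevances: Sequence[float]) -> float:
    """Discounted cumulative gain: each gain is divided by log2(rank + 1)."""
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances: Sequence[float]) -> float:
    """DCG of the ranking divided by the DCG of the ideal (sorted) ranking."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


def average_precision(hits: Sequence[bool]) -> float:
    """Mean of precision@rank taken at each rank where a relevant item appears.

    Assumption: normalized by the relevant items retrieved, not by the total
    number of relevant items in the corpus.
    """
    found = 0
    total = 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            found += 1
            total += found / rank
    return total / found if found else 0.0


def mean_average_precision(runs: Sequence[Sequence[bool]]) -> float:
    """Average of average_precision over several queries."""
    return sum(average_precision(r) for r in runs) / len(runs) if runs else 0.0


if __name__ == "__main__":
    print(ndcg([3, 1, 2]))
    print(mean_average_precision([[True, False, True], [False, True]]))
```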