Region-Based Distance Analysis of Keyphrases: A New Unsupervised Method for Extracting Keyphrases Feature from Articles

Due to the exponential growth of information and web sources, automatic keyphrase extraction remains a challenging research problem. Keyphrases are helpful for several tasks in natural language processing (NLP) and information retrieval (IR) systems. Feature extractions for...

Bibliographic Details
Main Authors: Miah, Mohammad Badrul Alam; Suryanti, Awang; Azad, Md. Saiful
Format: Conference or Workshop Item
Language: English
Published: IEEE 2021
Subjects: QA76 Computer software
Online Access: http://umpir.ump.edu.my/id/eprint/33128/7/Region-Based%20Distance%20Analysis%20of%20Keyphrases1.pdf
http://umpir.ump.edu.my/id/eprint/33128/
https://doi.org/10.1109/ICSECS52883.2021.00030
id my.ump.umpir.33128
record_format eprints
citation Miah, Mohammad Badrul Alam and Suryanti, Awang and Azad, Md. Saiful (2021) Region-Based Distance Analysis of Keyphrases: A New Unsupervised Method for Extracting Keyphrases Feature from Articles. In: International Conference on Software Engineering & Computer Systems and 4th International Conference on Computational Science and Information Management (ICSECS-ICOCSIM 2021), 24-26 August 2021, Pekan, Pahang, Malaysia. pp. 124-129. ISBN 978-1-6654-1407-4. Peer reviewed. https://doi.org/10.1109/ICSECS52883.2021.00030
institution Universiti Malaysia Pahang
building UMP Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaysia Pahang
content_source UMP Institutional Repository
url_provider http://umpir.ump.edu.my/
language English
topic QA76 Computer software
description Due to the exponential growth of information and web sources, automatic keyphrase extraction remains a challenging research problem. Keyphrases are helpful for several tasks in natural language processing (NLP) and information retrieval (IR) systems. Feature extraction for keyphrases plays a vital role in extracting high-quality keyphrases and in summarising documents well. This paper proposes region-based distance analysis of keyphrases (RDAK), a new unsupervised technique for extracting keyphrase features from articles. The proposed method comprises six phases: data acquisition and preprocessing, data processing, distance calculation, average distance, curve plotting, and curve fitting. First, the collected datasets are passed to the preprocessing step, which applies text preprocessing techniques. The preprocessed data then goes through the data processing phase; after distance calculation, it is passed to the region-based average calculation, followed by curve plotting analysis and, finally, curve fitting. The proposed system is tested and evaluated on benchmark datasets. The authors expect the proposed system to significantly improve the performance of existing keyphrase extraction techniques.
format Conference or Workshop Item
author Miah, Mohammad Badrul Alam
Suryanti, Awang
Azad, Md. Saiful
title Region-Based Distance Analysis of Keyphrases: A New Unsupervised Method for Extracting Keyphrases Feature from Articles
publisher IEEE
publishDate 2021
url http://umpir.ump.edu.my/id/eprint/33128/7/Region-Based%20Distance%20Analysis%20of%20Keyphrases1.pdf
http://umpir.ump.edu.my/id/eprint/33128/
https://doi.org/10.1109/ICSECS52883.2021.00030