BERT based named entity recognition for automated hadith narrator identification
Hadith serves as the second source of Islamic law for Muslims worldwide, including in Indonesia, which has the world's largest Muslim population at 228.68 million people. However, not all Hadith texts have been certified and approved for use, and several falsified Hadiths make it challe...
Main Authors: | Luthfi, Emha Taufiq; Mohd Yusoh, Zeratul Izzah; Mohd Aboobaider, Burhanuddin |
---|---|
Format: | Article |
Language: | English |
Published: | Science and Information Organization, 2022 |
Online Access: | http://eprints.utem.edu.my/id/eprint/26420/2/2022%20EMHA%20IJACSA.PDF http://eprints.utem.edu.my/id/eprint/26420/ https://thesai.org/Downloads/Volume13No1/Paper_73-BERT_based_Named_Entity_Recognition.pdf |
id |
my.utem.eprints.26420 |
---|---|
record_format |
eprints |
spelling |
my.utem.eprints.26420 2023-03-28T13:44:43Z http://eprints.utem.edu.my/id/eprint/26420/ Luthfi, Emha Taufiq and Mohd Yusoh, Zeratul Izzah and Mohd Aboobaider, Burhanuddin (2022) BERT based named entity recognition for automated hadith narrator identification. International Journal of Advanced Computer Science and Applications, 13 (1). pp. 604-611. ISSN 2158-107X. Science and Information Organization 2022 Article PeerReviewed text en http://eprints.utem.edu.my/id/eprint/26420/2/2022%20EMHA%20IJACSA.PDF https://thesai.org/Downloads/Volume13No1/Paper_73-BERT_based_Named_Entity_Recognition.pdf DOI: 10.14569/IJACSA.2022.0130173 |
institution |
Universiti Teknikal Malaysia Melaka |
building |
UTEM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Teknikal Malaysia Melaka |
content_source |
UTEM Institutional Repository |
url_provider |
http://eprints.utem.edu.my/ |
language |
English |
description |
Hadith serves as the second source of Islamic law for Muslims worldwide, including in Indonesia, which has the world's largest Muslim population at 228.68 million people. However, not all Hadith texts have been certified and approved for use, and the existence of falsified Hadiths makes it challenging to distinguish authentic Hadiths from fabricated ones. In Hadith science, the authenticity of a Hadith can be determined by examining its Sanad and Matn. The Sanad is an essential part of the Hadith because it records the chain of Narrators who transmitted it. This paper presents a Natural Language Processing (NLP) technique, based on Named Entity Recognition (NER), for identifying and authenticating the Narrators of a Hadith as part of its Sanad, addressing the need to authenticate Hadiths. The NER technique adds a feed-forward classifier on top of the last layer of a pre-trained BERT model. In testing with Cahya/bert-base-indonesian-1.5G, the proposed solution achieved an overall F1-score of 99.63 percent. In the final evaluation, Hadith Narrator identification on other Hadith passages yielded an F1-score of 98.27 percent. |
format |
Article |
author |
Luthfi, Emha Taufiq; Mohd Yusoh, Zeratul Izzah; Mohd Aboobaider, Burhanuddin |
title |
BERT based named entity recognition for automated hadith narrator identification |
publisher |
Science and Information Organization |
publishDate |
2022 |
url |
http://eprints.utem.edu.my/id/eprint/26420/2/2022%20EMHA%20IJACSA.PDF http://eprints.utem.edu.my/id/eprint/26420/ https://thesai.org/Downloads/Volume13No1/Paper_73-BERT_based_Named_Entity_Recognition.pdf |
_version_ |
1761623119262384128 |
score |
13.211869 |
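
The description states that the NER technique places a feed-forward classifier on the last layer of a pre-trained BERT model and is scored by F1. The following is a minimal sketch of that idea in plain Python, not the authors' implementation. Several things here are assumptions made for illustration: the BIO tag set (`O`, `B-NAR`, `I-NAR`), the hand-written three-dimensional vectors that stand in for BERT's last-layer token outputs, the identity weight matrix, the sample tokens, and span-level F1 as the scoring method. None of them come from the record.

```python
from typing import List, Sequence, Tuple

# Assumed BIO tag set; the record does not state the paper's label scheme.
LABELS = ["O", "B-NAR", "I-NAR"]

Span = Tuple[int, int, str]  # (start index, end index exclusive, text)


def classify_tokens(hidden_states: Sequence[Sequence[float]],
                    weights: Sequence[Sequence[float]],
                    bias: Sequence[float]) -> List[str]:
    """Apply one linear (feed-forward) layer per token and take the argmax."""
    tags = []
    for h in hidden_states:
        logits = [sum(w * x for w, x in zip(row, h)) + b
                  for row, b in zip(weights, bias)]
        best = max(range(len(logits)), key=logits.__getitem__)
        tags.append(LABELS[best])
    return tags


def decode_spans(tokens: Sequence[str], tags: Sequence[str]) -> List[Span]:
    """Group B-NAR/I-NAR runs into narrator spans."""
    spans: List[Span] = []
    start = None
    # A trailing "O" sentinel closes a span that runs to the last token.
    for i, tag in enumerate(list(tags) + ["O"]):
        if tag == "I-NAR" and start is not None:
            continue  # still inside the current narrator name
        if start is not None:
            spans.append((start, i, " ".join(tokens[start:i])))
            start = None
        if tag == "B-NAR":
            start = i
    return spans


def span_f1(predicted: Sequence[Span], gold: Sequence[Span]) -> float:
    """Span-level F1: a prediction counts only if both boundaries match."""
    pred_set = {(s, e) for s, e, _ in predicted}
    gold_set = {(s, e) for s, e, _ in gold}
    true_pos = len(pred_set & gold_set)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred_set)
    recall = true_pos / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Illustrative Sanad-style tokens (hypothetical sample, not from the paper's data).
tokens = ["telah", "menceritakan", "kepada", "kami",
          "Abdullah", "bin", "Yusuf", "dari", "Malik"]

# Toy stand-ins for BERT's last-layer output, one vector per token.
hidden_states = [
    [0.9, 0.1, 0.0], [0.9, 0.1, 0.0], [0.9, 0.1, 0.0], [0.9, 0.1, 0.0],
    [0.1, 0.8, 0.1], [0.0, 0.2, 0.8], [0.1, 0.1, 0.8],
    [0.9, 0.1, 0.0], [0.2, 0.7, 0.1],
]
weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # hypothetical
bias = [0.0, 0.0, 0.0]

tags = classify_tokens(hidden_states, weights, bias)
spans = decode_spans(tokens, tags)
print(tags)
print([text for _, _, text in spans])
print(span_f1(spans[:1], spans))  # score when only the first narrator is found
```

In a real setup the hidden vectors would come from the pre-trained encoder and the weights would be learned during fine-tuning; the sketch only shows how a per-token classifier, tag decoding, and F1 scoring fit together.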