Staff View: Systematic Literature Review of Speaker Diarization Techniques : Toward Bridging Gaps in Low-resourced Languages using Machine Learning

Speaker diarization, the process of segmenting audio into speaker-specific regions, plays a critical role in various speech technologies by determining "who spoke when" in a conversation. This technique is particularly valuable for enhancing automatic speech recognition (ASR) and conversat...

Bibliographic Details
Main Authors:	Mohd Zulhafiz, Rahim; Sarah Flora, Samson Juan; Syahrul Nizam, Junaini
Format:	Article
Language:	English
Published:	ARQII Publication 2025
Subjects:	QA75 Electronic computers. Computer science
Online Access:	http://ir.unimas.my/id/eprint/47225/2/801-2473-10%20%281%29.pdf http://ir.unimas.my/id/eprint/47225/ https://arqiipubl.com/ojs/index.php/AMS_Journal/article/view/801

id	my.unimas.ir-47225
record_format	eprints
spelling	my.unimas.ir-47225 2025-01-02T06:52:38Z http://ir.unimas.my/id/eprint/47225/ Systematic Literature Review of Speaker Diarization Techniques : Toward Bridging Gaps in Low-resourced Languages using Machine Learning Mohd Zulhafiz, Rahim Sarah Flora, Samson Juan Syahrul Nizam, Junaini QA75 Electronic computers. Computer science Speaker diarization, the process of segmenting audio into speaker-specific regions, plays a critical role in various speech technologies by determining "who spoke when" in a conversation. This technique is particularly valuable for enhancing automatic speech recognition (ASR) and conversational artificial intelligence systems. However, its application to low-resourced languages remains underexplored, which limits the performance of speaker diarization in these languages and also stalls the advancement of ASR for them. This is because speaker diarization enables speaker adaptation in ASR, which is crucial for maximizing ASR performance. The lack of digital resources for speaker diarization in low-resourced languages, together with the scarcity of implementations, creates a gap between low-resourced and widely used languages in the advancement of speech technologies. This paper focuses on Sarawak Malay, a low-resourced language, and presents crowd-sourced conversational data that still lacks speaker-turn annotations and transcripts. These missing annotations create challenges for building accurate acoustic models. To address this, we conducted a systematic review of recent speaker diarization research and related machine learning techniques. Using the PRISMA methodology, we reviewed 42 articles published between 2018 and 2023. Our findings identify key machine learning models, such as i-vectors and x-vectors, and open-source tools like Pyannote, which offer promising advancements in diarization performance. In addition, these tools have shown potential for developing speaker diarization models for low-resourced languages. By highlighting the gaps in current research for low-resourced languages, we provide a pathway for improving speaker diarization models in these underrepresented languages through machine learning techniques. ARQII Publication 2025-01 Article PeerReviewed text en http://ir.unimas.my/id/eprint/47225/2/801-2473-10%20%281%29.pdf Mohd Zulhafiz, Rahim and Sarah Flora, Samson Juan and Syahrul Nizam, Junaini (2025) Systematic Literature Review of Speaker Diarization Techniques : Toward Bridging Gaps in Low-resourced Languages using Machine Learning. Applications of Modelling and Simulation, 9. pp. 22-36. ISSN 2600-8084 https://arqiipubl.com/ojs/index.php/AMS_Journal/article/view/801
institution	Universiti Malaysia Sarawak
building	Centre for Academic Information Services (CAIS)
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Malaysia Sarawak
content_source	UNIMAS Institutional Repository
url_provider	http://ir.unimas.my/
language	English
topic	QA75 Electronic computers. Computer science
spellingShingle	QA75 Electronic computers. Computer science Mohd Zulhafiz, Rahim Sarah Flora, Samson Juan Syahrul Nizam, Junaini Systematic Literature Review of Speaker Diarization Techniques : Toward Bridging Gaps in Low-resourced Languages using Machine Learning
description	Speaker diarization, the process of segmenting audio into speaker-specific regions, plays a critical role in various speech technologies by determining "who spoke when" in a conversation. This technique is particularly valuable for enhancing automatic speech recognition (ASR) and conversational artificial intelligence systems. However, its application to low-resourced languages remains underexplored, which limits the performance of speaker diarization in these languages and also stalls the advancement of ASR for them. This is because speaker diarization enables speaker adaptation in ASR, which is crucial for maximizing ASR performance. The lack of digital resources for speaker diarization in low-resourced languages, together with the scarcity of implementations, creates a gap between low-resourced and widely used languages in the advancement of speech technologies. This paper focuses on Sarawak Malay, a low-resourced language, and presents crowd-sourced conversational data that still lacks speaker-turn annotations and transcripts. These missing annotations create challenges for building accurate acoustic models. To address this, we conducted a systematic review of recent speaker diarization research and related machine learning techniques. Using the PRISMA methodology, we reviewed 42 articles published between 2018 and 2023. Our findings identify key machine learning models, such as i-vectors and x-vectors, and open-source tools like Pyannote, which offer promising advancements in diarization performance. In addition, these tools have shown potential for developing speaker diarization models for low-resourced languages. By highlighting the gaps in current research for low-resourced languages, we provide a pathway for improving speaker diarization models in these underrepresented languages through machine learning techniques.
format	Article
author	Mohd Zulhafiz, Rahim; Sarah Flora, Samson Juan; Syahrul Nizam, Junaini
author_facet	Mohd Zulhafiz, Rahim; Sarah Flora, Samson Juan; Syahrul Nizam, Junaini
author_sort	Mohd Zulhafiz, Rahim
title	Systematic Literature Review of Speaker Diarization Techniques : Toward Bridging Gaps in Low-resourced Languages using Machine Learning
title_short	Systematic Literature Review of Speaker Diarization Techniques : Toward Bridging Gaps in Low-resourced Languages using Machine Learning
title_full	Systematic Literature Review of Speaker Diarization Techniques : Toward Bridging Gaps in Low-resourced Languages using Machine Learning
title_fullStr	Systematic Literature Review of Speaker Diarization Techniques : Toward Bridging Gaps in Low-resourced Languages using Machine Learning
title_full_unstemmed	Systematic Literature Review of Speaker Diarization Techniques : Toward Bridging Gaps in Low-resourced Languages using Machine Learning
title_sort	systematic literature review of speaker diarization techniques : toward bridging gaps in low-resourced languages using machine learning
publisher	ARQII Publication
publishDate	2025
url	http://ir.unimas.my/id/eprint/47225/2/801-2473-10%20%281%29.pdf http://ir.unimas.my/id/eprint/47225/ https://arqiipubl.com/ojs/index.php/AMS_Journal/article/view/801
_version_	1821007926338256896
score	13.235362

Similar Items