Staff View: Fine-tuning Multilingual Transformers for Hausa-English Sentiment Analysis

Fine-tuning Multilingual Transformers for Hausa-English Sentiment Analysis

Accurate sentiment analysis is greatly hindered by code-switching, especially in low-resource languages such as Hausa. However, most previous studies on Hausa sentiment analysis have ignored this problem. This study explores the use of transformer fine-...

Bibliographic Details
Main Authors:	Yusuf, A., Sarlan, A., Danyaro, K.U., Rahman, A.S.B.A.
Format:	Conference or Workshop Item
Published:	2023
Online Access:	http://scholars.utp.edu.my/id/eprint/38065/ https://www.scopus.com/inward/record.uri?eid=2-s2.0-85174169838&doi=10.1109%2fCITA58204.2023.10262742&partnerID=40&md5=510b4f111c1bb96716742e140aabfcf3

id	oai:scholars.utp.edu.my:38065
record_format	eprints
spelling	oai:scholars.utp.edu.my:38065 2023-12-11T02:54:55Z http://scholars.utp.edu.my/id/eprint/38065/ Fine-tuning Multilingual Transformers for Hausa-English Sentiment Analysis Yusuf, A. Sarlan, A. Danyaro, K.U. Rahman, A.S.B.A. Accurate sentiment analysis is greatly hindered by code-switching, especially in low-resource languages such as Hausa. However, most previous studies on Hausa sentiment analysis have ignored this problem. This study explores transformer fine-tuning techniques for Hausa sentiment classification using three pre-trained multilingual language models: RoBERTa, XLM-R, and mBERT. A multilabel sentiment classification was conducted using the Python programming language and the TensorFlow library, with a GPU hardware accelerator on Google Colaboratory. The Twitter dataset used in this study contains 16849 training samples, 2677 unlabeled dev samples, and 5303 unlabeled test samples of tweets/accounts; each training sample is labelled as positive, negative, or neutral. The findings demonstrate that the mBERT-base-cased model achieves the highest accuracy and F1-score, 0.73 each, outperforming the other two pre-trained models. The training and validation accuracy graph of the mBERT model shows improvement over time. The study underscores the importance of tailoring the implementation code to meet specific requirements and the significance of fine-tuning pre-trained models for optimal performance. © 2023 IEEE. 2023 Conference or Workshop Item NonPeerReviewed Yusuf, A. and Sarlan, A. and Danyaro, K.U. and Rahman, A.S.B.A. (2023) Fine-tuning Multilingual Transformers for Hausa-English Sentiment Analysis. In: UNSPECIFIED. https://www.scopus.com/inward/record.uri?eid=2-s2.0-85174169838&doi=10.1109%2fCITA58204.2023.10262742&partnerID=40&md5=510b4f111c1bb96716742e140aabfcf3 10.1109/CITA58204.2023.10262742 10.1109/CITA58204.2023.10262742 10.1109/CITA58204.2023.10262742
institution	Universiti Teknologi Petronas
building	UTP Resource Centre
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Teknologi Petronas
content_source	UTP Institutional Repository
url_provider	http://eprints.utp.edu.my/
description	Accurate sentiment analysis is greatly hindered by code-switching, especially in low-resource languages such as Hausa. However, most previous studies on Hausa sentiment analysis have ignored this problem. This study explores transformer fine-tuning techniques for Hausa sentiment classification using three pre-trained multilingual language models: RoBERTa, XLM-R, and mBERT. A multilabel sentiment classification was conducted using the Python programming language and the TensorFlow library, with a GPU hardware accelerator on Google Colaboratory. The Twitter dataset used in this study contains 16849 training samples, 2677 unlabeled dev samples, and 5303 unlabeled test samples of tweets/accounts; each training sample is labelled as positive, negative, or neutral. The findings demonstrate that the mBERT-base-cased model achieves the highest accuracy and F1-score, 0.73 each, outperforming the other two pre-trained models. The training and validation accuracy graph of the mBERT model shows improvement over time. The study underscores the importance of tailoring the implementation code to meet specific requirements and the significance of fine-tuning pre-trained models for optimal performance. © 2023 IEEE.
format	Conference or Workshop Item
author	Yusuf, A. Sarlan, A. Danyaro, K.U. Rahman, A.S.B.A.
spellingShingle	Yusuf, A. Sarlan, A. Danyaro, K.U. Rahman, A.S.B.A. Fine-tuning Multilingual Transformers for Hausa-English Sentiment Analysis
author_facet	Yusuf, A. Sarlan, A. Danyaro, K.U. Rahman, A.S.B.A.
author_sort	Yusuf, A.
title	Fine-tuning Multilingual Transformers for Hausa-English Sentiment Analysis
title_short	Fine-tuning Multilingual Transformers for Hausa-English Sentiment Analysis
title_full	Fine-tuning Multilingual Transformers for Hausa-English Sentiment Analysis
title_fullStr	Fine-tuning Multilingual Transformers for Hausa-English Sentiment Analysis
title_full_unstemmed	Fine-tuning Multilingual Transformers for Hausa-English Sentiment Analysis
title_sort	fine-tuning multilingual transformers for hausa-english sentiment analysis
publishDate	2023
url	http://scholars.utp.edu.my/id/eprint/38065/ https://www.scopus.com/inward/record.uri?eid=2-s2.0-85174169838&doi=10.1109%2fCITA58204.2023.10262742&partnerID=40&md5=510b4f111c1bb96716742e140aabfcf3
_version_	1787138262364585984
score	13.23648

Similar Items