Fine-tuning Multilingual Transformers for Hausa-English Sentiment Analysis

Accurate sentiment analysis is greatly hindered by the code-switching phenomenon, especially in low-resource language settings such as Hausa. However, the majority of previous studies on Hausa sentiment analysis have largely ignored this problem. This study explores transformer fine-tuning techniques for Hausa sentiment classification tasks using three pre-trained multilingual language models: RoBERTa, XLM-R, and mBERT. A multiclass sentiment classification was conducted using the Python programming language and the TensorFlow library, with a GPU hardware accelerator on Google Colaboratory. The Twitter dataset used in this study contains 16,849 labelled training samples and 2,677 development and 5,303 test samples (both unlabelled); each training tweet is labelled positive, negative, or neutral. The findings demonstrate that the mBERT-base-cased model achieves the highest accuracy and F1-score of 0.73 and 0.73, respectively, outperforming the other two pre-trained models. The training and validation accuracy curves of the mBERT model show improvement over time. The study underscores the importance of tailoring the implementation code to meet specific requirements and the significance of fine-tuning pre-trained models for optimal performance.
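The fine-tuning setup described in the abstract can be sketched in TensorFlow/Keras. This is a minimal illustration, not the authors' code: the tiny trainable embedding below is a stand-in for the pre-trained mBERT encoder (which would normally be loaded via the Hugging Face `transformers` library), and the vocabulary size, sequence length, and learning rate are assumptions chosen to keep the sketch self-contained and runnable.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

NUM_CLASSES = 3    # positive, negative, neutral
VOCAB_SIZE = 1000  # toy vocabulary; mBERT's WordPiece vocab is far larger
MAX_LEN = 32       # assumed maximum tweet length after tokenization

# Stand-in encoder: in the paper's setup this would be the pre-trained
# mBERT encoder producing contextual embeddings; here a small trainable
# embedding plus average pooling keeps the sketch dependency-light.
inputs = layers.Input(shape=(MAX_LEN,), dtype="int32")
x = layers.Embedding(VOCAB_SIZE, 64)(inputs)
x = layers.GlobalAveragePooling1D()(x)
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)

# Typical fine-tuning recipe: small learning rate, cross-entropy loss,
# accuracy tracked per epoch (as in the paper's train/validation curves).
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Random toy data in place of the tokenized Hausa-English tweets.
X = np.random.randint(0, VOCAB_SIZE, size=(16, MAX_LEN))
y = np.random.randint(0, NUM_CLASSES, size=(16,))
model.fit(X, y, epochs=1, batch_size=8, verbose=0)
probs = model.predict(X, verbose=0)  # one softmax distribution per tweet
```

With the real mBERT encoder swapped in, the same head and training loop perform the three-way sentiment classification the abstract reports.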

Bibliographic Details
Main Authors: Yusuf, A., Sarlan, A., Danyaro, K.U., Rahman, A.S.B.A.
Format: Conference or Workshop Item
Published: 2023
Online Access: http://scholars.utp.edu.my/id/eprint/38065/
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85174169838&doi=10.1109%2fCITA58204.2023.10262742&partnerID=40&md5=510b4f111c1bb96716742e140aabfcf3
id oai:scholars.utp.edu.my:38065
record_format eprints
spelling oai:scholars.utp.edu.my:380652023-12-11T02:54:55Z http://scholars.utp.edu.my/id/eprint/38065/ Fine-tuning Multilingual Transformers for Hausa-English Sentiment Analysis Yusuf, A. Sarlan, A. Danyaro, K.U. Rahman, A.S.B.A. Accurate sentiment analysis is greatly hindered by the code-switching phenomenon, especially in low-resource language settings such as Hausa. However, the majority of previous studies on Hausa sentiment analysis have largely ignored this problem. This study explores transformer fine-tuning techniques for Hausa sentiment classification tasks using three pre-trained multilingual language models: RoBERTa, XLM-R, and mBERT. A multiclass sentiment classification was conducted using the Python programming language and the TensorFlow library, with a GPU hardware accelerator on Google Colaboratory. The Twitter dataset used in this study contains 16,849 labelled training samples and 2,677 development and 5,303 test samples (both unlabelled); each training tweet is labelled positive, negative, or neutral. The findings demonstrate that the mBERT-base-cased model achieves the highest accuracy and F1-score of 0.73 and 0.73, respectively, outperforming the other two pre-trained models. The training and validation accuracy curves of the mBERT model show improvement over time. The study underscores the importance of tailoring the implementation code to meet specific requirements and the significance of fine-tuning pre-trained models for optimal performance. © 2023 IEEE. 2023 Conference or Workshop Item NonPeerReviewed Yusuf, A. and Sarlan, A. and Danyaro, K.U. and Rahman, A.S.B.A. (2023) Fine-tuning Multilingual Transformers for Hausa-English Sentiment Analysis. In: UNSPECIFIED. https://www.scopus.com/inward/record.uri?eid=2-s2.0-85174169838&doi=10.1109%2fCITA58204.2023.10262742&partnerID=40&md5=510b4f111c1bb96716742e140aabfcf3 10.1109/CITA58204.2023.10262742
institution Universiti Teknologi Petronas
building UTP Resource Centre
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Petronas
content_source UTP Institutional Repository
url_provider http://eprints.utp.edu.my/
description Accurate sentiment analysis is greatly hindered by the code-switching phenomenon, especially in low-resource language settings such as Hausa. However, the majority of previous studies on Hausa sentiment analysis have largely ignored this problem. This study explores transformer fine-tuning techniques for Hausa sentiment classification tasks using three pre-trained multilingual language models: RoBERTa, XLM-R, and mBERT. A multiclass sentiment classification was conducted using the Python programming language and the TensorFlow library, with a GPU hardware accelerator on Google Colaboratory. The Twitter dataset used in this study contains 16,849 labelled training samples and 2,677 development and 5,303 test samples (both unlabelled); each training tweet is labelled positive, negative, or neutral. The findings demonstrate that the mBERT-base-cased model achieves the highest accuracy and F1-score of 0.73 and 0.73, respectively, outperforming the other two pre-trained models. The training and validation accuracy curves of the mBERT model show improvement over time. The study underscores the importance of tailoring the implementation code to meet specific requirements and the significance of fine-tuning pre-trained models for optimal performance. © 2023 IEEE.
format Conference or Workshop Item
author Yusuf, A.
Sarlan, A.
Danyaro, K.U.
Rahman, A.S.B.A.
spellingShingle Yusuf, A.
Sarlan, A.
Danyaro, K.U.
Rahman, A.S.B.A.
Fine-tuning Multilingual Transformers for Hausa-English Sentiment Analysis
author_facet Yusuf, A.
Sarlan, A.
Danyaro, K.U.
Rahman, A.S.B.A.
author_sort Yusuf, A.
title Fine-tuning Multilingual Transformers for Hausa-English Sentiment Analysis
title_short Fine-tuning Multilingual Transformers for Hausa-English Sentiment Analysis
title_full Fine-tuning Multilingual Transformers for Hausa-English Sentiment Analysis
title_fullStr Fine-tuning Multilingual Transformers for Hausa-English Sentiment Analysis
title_full_unstemmed Fine-tuning Multilingual Transformers for Hausa-English Sentiment Analysis
title_sort fine-tuning multilingual transformers for hausa-english sentiment analysis
publishDate 2023
url http://scholars.utp.edu.my/id/eprint/38065/
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85174169838&doi=10.1109%2fCITA58204.2023.10262742&partnerID=40&md5=510b4f111c1bb96716742e140aabfcf3
_version_ 1787138262364585984