Fine-tuning Multilingual Transformers for Hausa-English Sentiment Analysis
Accurate sentiment analysis is greatly hindered by code-switching, especially in low-resource languages such as Hausa. However, most previous studies on Hausa sentiment analysis have ignored this problem. This study explores the use of transformer fine-...
Main Authors: | Yusuf, A., Sarlan, A., Danyaro, K.U., Rahman, A.S.B.A. |
---|---|
Format: | Conference or Workshop Item |
Published: | 2023 |
Online Access: | http://scholars.utp.edu.my/id/eprint/38065/ https://www.scopus.com/inward/record.uri?eid=2-s2.0-85174169838&doi=10.1109%2fCITA58204.2023.10262742&partnerID=40&md5=510b4f111c1bb96716742e140aabfcf3 |
id |
oai:scholars.utp.edu.my:38065 |
---|---|
record_format |
eprints |
spelling |
oai:scholars.utp.edu.my:38065 2023-12-11T02:54:55Z http://scholars.utp.edu.my/id/eprint/38065/ Fine-tuning Multilingual Transformers for Hausa-English Sentiment Analysis Yusuf, A. Sarlan, A. Danyaro, K.U. Rahman, A.S.B.A. © 2023 IEEE. 2023 Conference or Workshop Item NonPeerReviewed Yusuf, A. and Sarlan, A. and Danyaro, K.U. and Rahman, A.S.B.A. (2023) Fine-tuning Multilingual Transformers for Hausa-English Sentiment Analysis. In: UNSPECIFIED. https://www.scopus.com/inward/record.uri?eid=2-s2.0-85174169838&doi=10.1109%2fCITA58204.2023.10262742&partnerID=40&md5=510b4f111c1bb96716742e140aabfcf3 10.1109/CITA58204.2023.10262742 |
institution |
Universiti Teknologi Petronas |
building |
UTP Resource Centre |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Teknologi Petronas |
content_source |
UTP Institutional Repository |
url_provider |
http://eprints.utp.edu.my/ |
description |
Accurate sentiment analysis is greatly hindered by code-switching, especially in low-resource languages such as Hausa. However, most previous studies on Hausa sentiment analysis have ignored this problem. This study explores the use of transformer fine-tuning techniques for Hausa sentiment classification using three pre-trained multilingual language models: RoBERTa, XLM-R, and mBERT. A multilabel sentiment classification was conducted using the Python programming language and the TensorFlow library, with a GPU hardware accelerator on Google Colaboratory. The Twitter dataset used in this study contains 16,849 training samples of tweets/accounts, each labelled positive, negative, or neutral, along with 2,677 unlabeled development samples and 5,303 unlabeled test samples. The findings demonstrate that the mBERT-base-cased model achieves the highest accuracy and F1-score, both 0.73, outperforming the other two pre-trained models. The training and validation accuracy graph of the mBERT model shows improvement over time. The study underscores the importance of tailoring the implementation code to meet specific requirements and the significance of fine-tuning pre-trained models for optimal performance. © 2023 IEEE. |
format |
Conference or Workshop Item |
author |
Yusuf, A. Sarlan, A. Danyaro, K.U. Rahman, A.S.B.A. |
title |
Fine-tuning Multilingual Transformers for Hausa-English Sentiment Analysis |
publishDate |
2023 |
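
The description above reports an accuracy and an F1-score of 0.73 for the mBERT model over three sentiment classes. As a minimal sketch of how such figures can be computed, the following plain-Python example derives accuracy, per-class F1, and macro-averaged F1 from gold and predicted labels. The record does not say which F1 averaging the authors used, so macro averaging is an assumption here, and the function names and toy labels are hypothetical illustrations rather than the study's code or data.

```python
from collections import Counter

# The three sentiment classes named in the record's description.
LABELS = ("positive", "negative", "neutral")


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_scores(y_true, y_pred, labels=LABELS):
    """Return (per-class F1 dict, macro-averaged F1).

    Macro averaging is an assumption; the record does not state
    which averaging scheme the reported F1-score uses.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be equal in length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # gold class gains a false negative
    per_class = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the class never occurs.
        per_class[label] = 2 * tp[label] / denom if denom else 0.0
    macro = sum(per_class.values()) / len(labels)
    return per_class, macro


if __name__ == "__main__":
    # Toy labels for illustration only; not taken from the study's dataset.
    gold = ["positive", "positive", "negative", "negative", "neutral", "neutral"]
    pred = ["positive", "negative", "negative", "negative", "neutral", "positive"]
    per_class, macro = f1_scores(gold, pred)
    print(f"accuracy = {accuracy(gold, pred):.4f}")
    print(f"macro F1 = {macro:.4f}")
    for label, score in per_class.items():
        print(f"  {label}: {score:.4f}")
```

Because the development and test splits are described as unlabeled, metrics like these could only be computed locally on a held-out portion of the labelled training data.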