A fine-tuned large language model for domain-specific with reinforcement learning
Main Authors: | , , , |
---|---|
Format: | Proceeding Paper |
Language: | English |
Published: | 2024 |
Subjects: | |
Online Access: | http://irep.iium.edu.my/116732/1/A_Fine-Tuned_Large_Language_Model_for_Domain-Specific_with_Reinforcement_Learning.pdf http://irep.iium.edu.my/116732/ |
Summary: | Large Language Models (LLMs) like GPT-3 and BERT have significantly advanced natural language processing by providing robust tools for understanding and generating human language. However, their broad but shallow knowledge across many domains often leads to less effective performance in domain-specific tasks, where detailed and specialized knowledge is needed. To address this limitation, this paper investigates the effectiveness of fine-tuning LLMs for specific domains. The approach incorporates reinforcement learning to integrate user feedback, allowing the model to dynamically adjust and refine its responses. This ensures that the model adapts iteratively, improving communication and interaction with users. The fine-tuned model's performance is evaluated using two domain-specific datasets, one medical and one dental. Evaluation metrics such as Levenshtein distance and cosine similarity are used to assess the textual accuracy and semantic relevance of the fine-tuned model. The results from the dental and medical datasets indicate a low level of textual differences and strong semantic alignment, respectively. These results suggest that the fine-tuned model effectively processes and preserves the integrity of domain-specific content, demonstrating the potential of fine-tuning LLMs to enhance their applicability in specific domains. |
---|
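
The summary names Levenshtein distance and cosine similarity as the evaluation metrics but does not say how they were computed. The following is a minimal sketch of both, not the authors' implementation. It assumes character-level edit distance and a simple bag-of-words term-count vector for the cosine similarity; the paper may well have used embeddings or another representation. The function names and the sample strings are hypothetical.

```python
import math
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between whitespace-token count vectors.
    Assumption: bag-of-words counts stand in for whatever text
    representation the paper actually used."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical reference answer and model output, for illustration only.
    reference = "brush twice daily with fluoride toothpaste"
    generated = "brush twice a day with fluoride toothpaste"
    print("edit distance:", levenshtein(reference, generated))
    print("cosine similarity:", round(cosine_similarity(reference, generated), 3))
```

Under these assumptions, a lower edit distance indicates closer textual agreement with the reference, while a cosine similarity nearer to 1 indicates greater overlap in vocabulary. Token counts capture only lexical overlap, so an embedding-based vector would be needed to measure semantic relevance in the fuller sense the summary describes.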