Content extraction of historical Malay manuscripts based on event ontology framework

Bibliographic Details
Main Authors: Zahila, M. N., Noorhidawati, Abdullah, Mohd Khalid, Yanti Idaya Aspura
Format: Article
Published: IOS Press 2021
Online Access: http://eprints.um.edu.my/26993/
Description
Summary: This article explores how the content knowledge of historical Malay manuscripts can be represented by extracting event features using an event ontology framework. The manuscript used for testing is Sulalatus Salatin (Sejarah Melayu (SIC)) by Abdul Ahmad Samad, published in the University of Malaya Digital Library database. To align with a domain-specific ontology, the Simple Event Model (SEM) is adopted and an event-based ontology for historical Malay manuscripts is designed. Events are extracted manually from the manuscript and mapped into the Protege editor. Competency questions were constructed and submitted to the Protege editor using SPARQL to check whether the ontology can provide answers and to examine its correctness. The event-based ontology model assists in discovering and representing the content knowledge of historical Malay manuscripts and supports the organisation of knowledge. All the main concepts are extracted from the selected Malay manuscript, and 17 concepts are used to develop the event-based ontology model. The knowledge was verified by three domain experts in Malay manuscripts. In the findings, the interrater reliability for Event and Actor instances is 84%, which means 16% of the instances and their types are incorrect and need amendment. Interrater reliability is 95% for Place and 99% for Role, while the experts achieved 100% agreement for Time. In addition, the experts agreed that the concepts, properties and instances of the Malay Manuscript Ontology complied with the criteria of consistency, completeness, conciseness, expandability and ease of use. The development of an event-based model for an ontology-based system with a high level of semantic granularity reflects the cultural riches and intellectual aspects stored in Malay manuscripts. This will enable systematic research into the knowledge embedded in the manuscripts and make it widely and easily accessible to everyone.
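
To make the idea of a competency question concrete, the following is a minimal sketch of how SEM-style event triples might be stored and queried. It is not the authors' implementation: the article uses the Protege editor and SPARQL, whereas this sketch swaps in a plain Python triple list and a simple pattern matcher so that it runs without any external library. The `sem:` terms (`sem:Event`, `sem:hasActor`, `sem:hasPlace`, `sem:hasTime`) come from the Simple Event Model vocabulary; every `ex:` instance, the `match` helper and the `actors_at_place` question are hypothetical placeholders, not data from the manuscript.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical placeholder instances; none of these come from the manuscript.
TRIPLES: List[Triple] = [
    ("ex:event_1", "rdf:type", "sem:Event"),
    ("ex:event_1", "sem:hasActor", "ex:actor_a"),
    ("ex:event_1", "sem:hasPlace", "ex:place_x"),
    ("ex:event_1", "sem:hasTime", "ex:time_t1"),
    ("ex:event_2", "rdf:type", "sem:Event"),
    ("ex:event_2", "sem:hasActor", "ex:actor_b"),
    ("ex:event_2", "sem:hasPlace", "ex:place_x"),
    ("ex:event_3", "rdf:type", "sem:Event"),
    ("ex:event_3", "sem:hasActor", "ex:actor_c"),
    ("ex:event_3", "sem:hasPlace", "ex:place_y"),
]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def actors_at_place(triples: List[Triple], place: str) -> List[str]:
    """Hypothetical competency question: which actors took part in events at `place`?"""
    # Step 1: find every event located at the given place.
    events = {s for s, _, _ in match(triples, p="sem:hasPlace", o=place)}
    # Step 2: collect the actors attached to those events.
    actors = {o for s, _, o in match(triples, p="sem:hasActor") if s in events}
    return sorted(actors)


if __name__ == "__main__":
    print(actors_at_place(TRIPLES, "ex:place_x"))
```

In a SPARQL setting, the same question would roughly correspond to a query joining two triple patterns on a shared event variable, one using `sem:hasPlace` and one using `sem:hasActor`. An ontology "answers" the competency question when such a query returns the instances that domain experts expect.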