Saga: Encoding the Structures of Historical Documents Using TEI-XML Schema

Bibliographic Details
Main Author: Muhammad Adib Fikri, Johari
Format: Final Year Project Report
Language: English
Published: Universiti Malaysia Sarawak, (UNIMAS) 2023
Online Access: http://ir.unimas.my/id/eprint/44087/1/Muhammad%20Adib%20Fikri%20Bin%20Johari%20%28fulltext%29.pdf
http://ir.unimas.my/id/eprint/44087/
Description
Summary: The Sarawak Gazette is the oldest newspaper published in Sarawak. It contains rich historical information and is an essential source on Sarawak affairs, particularly from 1870 to 1941. A previous project created a website named "e-Sarawak Gazette" as an initiative to preserve the Sarawak Gazette collection digitally. However, the current digital version of the collection is stored as image-based PDF files, which limits exploration of the content of these historical documents. This project therefore aims to retrieve information from the Sarawak Gazette collection and to annotate the data so that the meaningful context contained in the historical documents can be discovered. Building on the digitization and data annotation work, the project also creates TEI XML documents for the Sarawak Gazette, structuring the XML data to support index construction and search functionality and thereby contributing to the development of a dynamic web portal for the Sarawak Gazette. The Sarawak Gazette documents will be structured in compliance with the XML standard and the Text Encoding Initiative's P5 recommendations. This effort is intended to achieve explicit, semantic markup of historical information, which offers several benefits, including the ability to validate the structure of Sarawak Gazette documents and to enable advanced data processing, such as index construction and the integration of search features.
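The summary describes structuring documents as TEI XML so that indexes and search can be built on top of them. The following is a minimal sketch of that idea, not the project's actual encoding or code: the sample document, its article text, the `xml:id` values, and the helper names `build_word_index`, `build_place_index`, and `search` are all hypothetical. It assumes articles are encoded as `<div type="article">` and places as `<placeName>`, both of which are standard TEI P5 elements, and uses only the Python standard library.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# TEI P5 namespace, plus the expanded name ElementTree uses for xml:id.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical sample: placeholder text, NOT transcribed from the Gazette.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sarawak Gazette (hypothetical sample)</title></titleStmt>
      <publicationStmt><p>Illustrative sample only.</p></publicationStmt>
      <sourceDesc><p>Placeholder text.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <div type="article" xml:id="a1">
        <head>Shipping Notes</head>
        <p>A steamer arrived at <placeName>Kuching</placeName> with cargo.</p>
      </div>
      <div type="article" xml:id="a2">
        <head>Weather Report</head>
        <p>Heavy rain was recorded at <placeName>Sibu</placeName>
           and <placeName>Kuching</placeName>.</p>
      </div>
    </body>
  </text>
</TEI>"""


def iter_articles(tei_xml):
    """Yield (article id, element) for each <div type="article">."""
    root = ET.fromstring(tei_xml)
    for div in root.iterfind(".//tei:div[@type='article']", TEI_NS):
        yield div.get(XML_ID), div


def build_word_index(tei_xml):
    """Inverted index: lowercase word -> set of article ids containing it."""
    index = defaultdict(set)
    for art_id, div in iter_articles(tei_xml):
        text = " ".join(div.itertext())
        for word in re.findall(r"[a-z]+", text.lower()):
            index[word].add(art_id)
    return index


def build_place_index(tei_xml):
    """Semantic index: place name -> sorted list of article ids mentioning it."""
    places = defaultdict(set)
    for art_id, div in iter_articles(tei_xml):
        for place in div.iterfind(".//tei:placeName", TEI_NS):
            places["".join(place.itertext()).strip()].add(art_id)
    return {name: sorted(ids) for name, ids in places.items()}


def search(index, query):
    """Return sorted ids of articles containing every word in the query."""
    terms = re.findall(r"[a-z]+", query.lower())
    if not terms:
        return []
    return sorted(set.intersection(*(index.get(t, set()) for t in terms)))


if __name__ == "__main__":
    word_index = build_word_index(SAMPLE_TEI)
    print(search(word_index, "rain Kuching"))
    print(build_place_index(SAMPLE_TEI))
```

Because place names are marked up explicitly, the place index can be built without any text heuristics, which is the kind of benefit semantic markup is meant to provide. Validation against a TEI schema is not shown here; it would typically require a schema file and a validating parser.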