Designing metadata and data structure for Sarawak Gazette based on Text Encoding Initiative guidelines

Bibliographic Details
Main Author: Fong, Tze Min
Format: Final Year Project Report
Language: English
Published: Universiti Malaysia Sarawak (UNIMAS), 2014
Online Access: http://ir.unimas.my/id/eprint/39039/1/Fong%20Tze%20Min%20ft.pdf
http://ir.unimas.my/id/eprint/39039/
Description
Summary: In the domain of historical study, historians and researchers face a daunting task: they must analyze, categorize and assimilate a huge amount of information. Information extraction is a complex and difficult task in research and historical preservation, as the materials may contain a lot of unstructured and ambiguous information. Powerful extraction tools and advanced technology are of little use if the information has not been annotated properly, especially for historical documents. In this project, a metadata and data structure is designed for one of the oldest newspapers in Sarawak, the Sarawak Gazette, based on the latest Text Encoding Initiative (TEI) guidelines. It promotes accurately annotated metadata to make information accessible and thus facilitates information extraction from the historical newspaper. The layout of the Sarawak Gazette will be analyzed using the proposed technique, the bottom-up page segmentation approach, in order to design the metadata structure.
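For readers unfamiliar with TEI, the following is a minimal sketch of what a TEI-encoded newspaper issue could look like, built with Python's standard library. It is not the structure designed in the report, which this record does not reproduce. The element names (TEI, teiHeader, fileDesc, titleStmt, publicationStmt, sourceDesc, text, body, div, head, p) and the namespace come from the TEI Guidelines; the function name build_issue, the "article" division type, the date and the sample text are hypothetical placeholders chosen for illustration.

```python
import xml.etree.ElementTree as ET

# TEI P5 namespace; registering it as the default keeps the output unprefixed.
TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)


def tei(tag):
    """Return the namespace-qualified name of a TEI element."""
    return f"{{{TEI_NS}}}{tag}"


def build_issue(title, issue_date, articles):
    """Build a TEI document for one issue (hypothetical helper).

    articles is a list of (headline, [paragraph, ...]) tuples.
    """
    root = ET.Element(tei("TEI"))

    # Metadata lives in teiHeader/fileDesc, which needs these three children.
    header = ET.SubElement(root, tei("teiHeader"))
    file_desc = ET.SubElement(header, tei("fileDesc"))
    title_stmt = ET.SubElement(file_desc, tei("titleStmt"))
    ET.SubElement(title_stmt, tei("title")).text = title
    pub_stmt = ET.SubElement(file_desc, tei("publicationStmt"))
    ET.SubElement(pub_stmt, tei("p")).text = "Illustrative transcription only."
    source_desc = ET.SubElement(file_desc, tei("sourceDesc"))
    bibl = ET.SubElement(source_desc, tei("bibl"))
    ET.SubElement(bibl, tei("title")).text = title
    ET.SubElement(bibl, tei("date"), {"when": issue_date}).text = issue_date

    # Content: one div per article, as a segmentation step might yield.
    body = ET.SubElement(ET.SubElement(root, tei("text")), tei("body"))
    for headline, paragraphs in articles:
        div = ET.SubElement(body, tei("div"), {"type": "article"})
        ET.SubElement(div, tei("head")).text = headline
        for para in paragraphs:
            ET.SubElement(div, tei("p")).text = para
    return root


# Placeholder values, not taken from a real issue of the newspaper.
issue = build_issue(
    "Sarawak Gazette",
    "1900-01-01",
    [("Placeholder headline", ["Placeholder paragraph."])],
)
xml_text = ET.tostring(issue, encoding="unicode")
print(xml_text)
```

Keeping the descriptive metadata in the header and one division per article in the body is what lets an extraction tool address an individual article or field directly instead of parsing free text.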