Designing metadata and data structure for Sarawak gazette based on text encoding initiative guidelines
Main Author: | |
---|---|
Format: | Final Year Project Report |
Language: | English |
Published: | Universiti Malaysia Sarawak (UNIMAS), 2014 |
Subjects: | |
Online Access: | http://ir.unimas.my/id/eprint/39039/1/Fong%20Tze%20Min%20ft.pdf http://ir.unimas.my/id/eprint/39039/ |
Summary: | In the domain of historical study, historians and researchers face a daunting task: they must analyze, categorize, and assimilate a huge amount of information. Information extraction is a complex and difficult task in research and historical preservation, as the materials may contain a lot of unstructured and ambiguous information. Powerful extraction tools and advanced technology are of little use if the information has not been annotated properly, especially for historical documents. In this project, a metadata and data structure is designed for one of the oldest newspapers in Sarawak, the Sarawak Gazette, based on the latest Text Encoding Initiative (TEI) guidelines. It promotes accurately annotated metadata to make information accessible and thus facilitates information extraction from the historical newspaper. The layout of the Sarawak Gazette will be analyzed using the proposed technique, the bottom-up page segmentation approach, in order to design the metadata structure. |