Abstract: This paper presents an approach for easy creation and
classification of institutional risk profiles supporting endangerment
analysis of file formats. The main contribution of this work is the
employment of data mining techniques to support set up of the most
important risk factors. Subsequently, risk profiles employ risk factors
classifier and associated configurations to support digital preservation
experts with a semi-automatic estimation of endangerment group
for file format risk profiles. Our goal is to make use of an expert
knowledge base, accuired through a digital preservation survey
in order to detect preservation risks for a particular institution.
Another contribution is support for visualisation of risk factors for
a requried dimension for analysis. Using the naive Bayes method,
the decision support system recommends to an expert the matching
risk profile group for the previously selected institutional risk profile.
The proposed methods improve the visibility of risk factor values
and the quality of a digital preservation process. The presented
approach is designed to facilitate decision making for the preservation
of digital content in libraries and archives using domain expert
knowledge and values of file format risk profiles. To facilitate
decision-making, the aggregated information about the risk factors
is presented as a multidimensional vector. The goal is to visualise
particular dimensions of this vector for analysis by an expert and
to define its profile group. The sample risk profile calculation and
the visualisation of some risk factor dimensions is presented in the
evaluation section.
Abstract: This study examined the digital preservation in
Nigeria university libraries. A comparison between the university of
Nigeria Nsukka (UNN) and Ahmadu Bello University Zaria (ABU,
Zaria). The study utilized primary source of data obtained from two
selected institution librarians. Finding revealed varying results in
terms of skills acquired by librarians before and after digitization of
the two institutions. The study reports that journals publication, text
book, CD-ROMS, conference papers and proceedings, theses,
dissertations and seminar papers are among the information resources
available for digitization. The study further documents that copyright
issue, power failure, and unavailability of needed materials are
among the challenges facing the digitization of library of the
institution. On the basis of the finding, the study concluded that
digitization of library enhances efficiency in organization and
retrieval of information services. The study therefore recommended
that software should be upgraded with backup, training of the
librarians on digital process, installation of antivirus and
enhancement of technical collaboration between the library and MIS.
Abstract: Information generated from various computerization processes is a potential rich source of knowledge for its designated community. To pass this information from generation to generation without modifying the meaning is a challenging activity. To preserve and archive the data for future generations it’s very essential to prove the authenticity of the data. It can be achieved by extracting the metadata from the data which can prove the authenticity and create trust on the archived data. Subsequent challenge is the technology obsolescence. Metadata extraction and standardization can be effectively used to resolve and tackle this problem. Metadata can be categorized at two levels i.e. Technical and Domain level broadly. Technical metadata will provide the information that can be used to understand and interpret the data record, but only this level of metadata isn’t sufficient to create trustworthiness. We have developed a tool which will extract and standardize the technical as well as domain level metadata. This paper is about the different features of the tool and how we have developed this.
Abstract: This paper presents an approach for the classification of
an unstructured format description for identification of file formats.
The main contribution of this work is the employment of data mining
techniques to support file format selection with just the unstructured
text description that comprises the most important format features for
a particular organisation. Subsequently, the file format indentification
method employs file format classifier and associated configurations to
support digital preservation experts with an estimation of required file
format. Our goal is to make use of a format specification knowledge
base aggregated from a different Web sources in order to select file
format for a particular institution. Using the naive Bayes method,
the decision support system recommends to an expert, the file format
for his institution. The proposed methods facilitate the selection of
file format and the quality of a digital preservation process. The
presented approach is meant to facilitate decision making for the
preservation of digital content in libraries and archives using domain
expert knowledge and specifications of file formats. To facilitate
decision-making, the aggregated information about the file formats is
presented as a file format vocabulary that comprises most common
terms that are characteristic for all researched formats. The goal is to
suggest a particular file format based on this vocabulary for analysis
by an expert. The sample file format calculation and the calculation
results including probabilities are presented in the evaluation section.