Abstract: In the area of Human Resource Management, the trend is towards online exchange of information about human resources. For example, online applications for employment have become standard and job offerings are posted on many job portals. However, there are too many job portals for a job seeker to monitor all of them. We developed a prototype that integrates information from different job portals into one meta-search engine. First, existing job portals were investigated and XML schema documents were derived automatically from these portals. Second, translation rules for transforming each schema to a central HR-XML-conformant schema were determined. The HR-XML schema is used to build a form for searching jobs. The data supplied by a user in this form is then translated into queries for the different job portals. Each result returned by a job portal is sent to the meta-search engine, which ranks all received job offers according to the user's preferences.
Abstract: In this paper we present a novel technique for data
hiding in binary document images. We use the concept of entropy to
identify document-specific, least-distortive areas throughout
the binary document image. The document image is treated as any
other image and the proposed method utilizes the standard document
characteristics for the embedding process. The proposed method
minimizes perceptual distortion due to embedding and allows
watermark extraction without the requirement of any side information
at the decoder end.
Abstract: Automatic keyphrase extraction is useful in efficiently
locating specific documents in online databases. While several
techniques have been introduced over the years, improvement in
accuracy has been minimal. This research examines attribute scores for
author-supplied keyphrases to better understand how the scores affect
the accuracy rate of automatic keyphrase extraction. Five attributes
are chosen for examination: Term Frequency, First Occurrence, Last
Occurrence, Phrase Position in Sentences, and Term Cohesion
Degree. The results show that First Occurrence is the most reliable
attribute. Term Frequency, Last Occurrence and Term Cohesion
Degree display a wide range of variation but are still usable with
suggested tweaks. Only Phrase Position in Sentences shows a totally
unpredictable pattern. The results imply that the commonly used
ranking approach, which directly extracts the top-ranked potential phrases
from the candidate keyphrase list as the keyphrases, may not be reliable.
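As an illustration of the attributes examined above, the following minimal sketch computes Term Frequency, First Occurrence and Last Occurrence for one candidate phrase. The function name `phrase_attributes` and the choice to express occurrence as a relative word offset are assumptions made for this example; the abstract does not give the exact formulas used in the study.

```python
def phrase_attributes(text, phrase):
    """Return simple attribute scores for a candidate keyphrase.

    Assumption: occurrences are measured as word offsets divided by the
    document length, so 0.0 means "at the very start of the document".
    """
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    # Every word index where the phrase starts.
    positions = [i for i in range(len(words) - n + 1) if words[i:i + n] == target]
    if not positions:
        return None  # phrase does not occur in the text
    total = len(words)
    return {
        "term_frequency": len(positions),
        "first_occurrence": positions[0] / total,
        "last_occurrence": positions[-1] / total,
    }


attrs = phrase_attributes(
    "data mining finds patterns and data mining scales", "data mining"
)
print(attrs)
```

A ranking step would combine such scores per candidate; the abstract's finding suggests weighting First Occurrence more heavily than the less stable attributes.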
Abstract: Building conservation work generally involves complex, non-standard work that differs from new building construction processes. In preparing tenders for building conservation projects, therefore, the quantity surveyor must carefully consider the specificity of non-standard items and demarcate the scope of unique conservation work. While the quantity surveyor must appreciate the full range of works to prepare a good tender document, he typically manages many unfamiliar elements, including practical construction methods, restoration techniques and work sequences. Only by fulfilling the demanding requirements of building conservation work can the quantity surveyor enhance his professionalism in an area of growing cultural value and economic importance. By discussing several issues crucial to tender preparation for building conservation projects in Malaysia, this paper seeks a deeper understanding of how quantity surveying can better standardize tender preparation work and more successfully manage building conservation processes.
Abstract: Character segmentation is an important preprocessing
step for text recognition. In degraded documents, the presence of
touching characters drastically decreases the recognition rate of any
optical character recognition (OCR) system. In this paper we have
proposed a complete solution for segmenting touching characters in
all the three zones of printed Gurmukhi script. A study of touching
Gurmukhi characters is carried out and these characters have been
divided into various categories after a careful analysis. Structural
properties of the Gurmukhi characters are used for defining the
categories. New algorithms have been proposed to segment the
touching characters in the middle, upper and lower zones.
These algorithms have shown a reasonable improvement in
segmenting the touching characters in degraded printed Gurmukhi
script. The algorithms proposed in this paper are applicable only to
machine-printed text. We have also discussed a new and useful
technique to segment horizontally overlapping lines.
Abstract: Text categorization is the problem of classifying text
documents into a set of predefined classes. After a preprocessing
step, the documents are typically represented as large sparse vectors.
When training classifiers on large collections of documents, both
time and memory requirements can be prohibitive. This justifies
the application of feature selection methods to reduce the
dimensionality of the document-representation vector. In this paper,
three feature selection methods are evaluated: Random Selection,
Information Gain (IG) and Support Vector Machine feature selection
(called SVM_FS). We show that the best results were obtained with
the SVM_FS method for a relatively small dimension of the feature
vector. We also present a novel method to better correlate the SVM
kernel's parameters (Polynomial or Gaussian kernel).
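To make one of the evaluated feature selection methods concrete, the sketch below computes Information Gain for a single term treated as a binary feature (present or absent in a document). The helper names and the representation of documents as sets of tokens are assumptions for illustration; the abstract does not specify the implementation used in the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(docs, labels, term):
    """Reduction in class entropy obtained by splitting on term presence.

    Assumption: each document is a set of tokens.
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (len(with_term) / n) * entropy(with_term) \
        + (len(without_term) / n) * entropy(without_term)
    return entropy(labels) - conditional


docs = [{"goal", "match"}, {"goal", "team"}, {"vote", "party"}, {"vote", "law"}]
labels = ["sport", "sport", "politics", "politics"]
# Rank the vocabulary so that the most informative terms come first.
vocabulary = sorted(set().union(*docs))
ranked = sorted(vocabulary, key=lambda t: information_gain(docs, labels, t), reverse=True)
print(ranked[:2])
```

Keeping only the top-ranked terms reduces the dimensionality of the document vectors before a classifier is trained.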
Abstract: Sampling and analysis of leachate from the Bhalaswa
landfill and of groundwater samples from nearby locations clearly
indicated the likely contamination of groundwater by landfill
leachate. The results of simulation studies carried out for the
migration of chloride from the landfill show that the simulation results
are in consonance with the observed concentration of chloride in the
vicinity of the landfill facility. The solid waste disposal system presently
practiced in Delhi consists of mere dumping of the wastes
generated at three locations (Bhalaswa, Ghazipur, and Okhla) without
any regard for the protection of the surrounding
environment. The Bhalaswa landfill site in Delhi, which is being operated
as a dump site, is expected to become a cause of serious groundwater
pollution in its vicinity. The leachate from the Bhalaswa landfill was
found to have a high concentration of chlorides, as well as DOC
and COD. The present study was undertaken to determine the likely
concentrations of principal contaminants in the groundwater over a
period of time due to the discharge of such contaminants from
landfill leachate to the underlying groundwater. The observed
concentration of chlorides in the groundwater within a 75 m
radius of the landfill facility was found to be in consonance with the
simulated concentration of chloride in groundwater obtained from a
one-dimensional transport model with a finite mass of contaminant source.
The governing equation of contaminant transport, involving advection and
diffusion-dispersion, was solved in MATLAB 7.0 using the finite difference
method.
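The kind of calculation described above can be sketched with an explicit finite difference scheme for the one-dimensional advection-dispersion equation, dC/dt = D d2C/dx2 - v dC/dx. This is a simplified illustration in Python, not the authors' MATLAB code: the function name, the parameter values and the constant-concentration source at the inlet are all assumptions (the study itself considers a finite mass of contaminant source).

```python
def simulate_transport(c0, velocity, dispersion, dx, dt, n_cells, n_steps):
    """Explicit upwind/central finite difference solution of 1-D transport.

    Assumptions: constant source concentration c0 at x = 0, zero-gradient
    outflow at the far boundary, uniform velocity and dispersion.
    """
    # Stability condition for this explicit scheme.
    if 2 * dispersion * dt / dx**2 + velocity * dt / dx > 1.0:
        raise ValueError("time step too large for the explicit scheme")
    conc = [0.0] * n_cells
    conc[0] = c0
    for _ in range(n_steps):
        new = conc[:]
        for i in range(1, n_cells - 1):
            advection = -velocity * (conc[i] - conc[i - 1]) / dx  # upwind
            spreading = dispersion * (conc[i + 1] - 2 * conc[i] + conc[i - 1]) / dx**2
            new[i] = conc[i] + dt * (advection + spreading)
        new[-1] = new[-2]  # zero-gradient outflow boundary
        conc = new
    return conc


# Hypothetical values: 76 cells of 1 m cover a 75 m distance from the source.
profile = simulate_transport(c0=1000.0, velocity=0.1, dispersion=0.5,
                             dx=1.0, dt=0.5, n_cells=76, n_steps=200)
print(round(profile[10], 1), round(profile[40], 1))
```

Comparing such a simulated profile with concentrations measured in nearby wells is the kind of check the abstract reports for chloride.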
Abstract: The dramatic effect of information technology on
society is undeniable. In education, it is evident in the use of terms
like active learning, blended learning, electronic learning and mobile
learning (ubiquitous learning). This study explores the perceptions of
54 learners in a higher education institution regarding the use of
mobile devices in a third year module. Using semi-structured
interviews, it was found that mobile devices had a positive impact on
learner motivation, engagement and enjoyment. They also improved the
consistency of learning material, and the convenience and flexibility
(anywhere, anytime) of learning. User-interface limitations, bandwidth
and cognitive overload, however, were of concern. The use of
cloud-based resources like YouTube and Google Docs, through mobile
devices, positively influenced learner perceptions, making them
prosumers (both consumers and producers) of education content.
Abstract: Elateriospermum tapos seed (buah perah) is one of
the rich sources of polyunsaturated fatty acids. It contains a high
percentage of oleic acid, which is an important component in the development
of the nervous system, and also α-linolenic acid (ALA), which is the
precursor of the omega-3 fatty acid series used to synthesize
eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA).
However, few studies have examined this valuable oilseed or exploited
its potential. Therefore, this paper compares the
physico-chemical properties and fatty acid composition of perah oil with those of
palm oil and soybean oil. In the comparison, perah oil shows a low
peroxide value, which indicates good oxidative stability, and a high
iodine value, which suggests that it can be used in the paint industry. The study
showed that perah oil is comparable to palm oil and soybean oil, so it
has high potential to be exploited in the oleochemical,
pharmaceutical, cosmetics and paint industries.
Abstract: Many new experimental films that broke free from conventional movie forms have appeared since the Nouvelle Vague movement in the late 1950s. Forty years after the movement started, on March 13th, 1995, on the 100th anniversary of the birth of film, the declaration called Dogme 95 was issued in Copenhagen, Denmark. It aimed to create a new style of avant-garde film, and showed an anti-Hollywood and anti-genre tendency, in opposition to the highly popular Hollywood trend of movies based on large-scale investment. The main idea of Dogme 95 is opposition to 'the writer's doctrine' that a film should be the artist's individual work and to 'the overuse of technology' in film. The key figures declared ten principles called the 'Vow of Chastity', by which new movie forms were to be produced. Interview (2000), directed by Byun Hyuk, was made five years after Dogme 95 was declared. This movie was designated the first Asian Dogme film. This study surveys the relationship between Korean film and the Vow of Chastity through the Korean films released in theaters, from the viewpoint of technology and content. It also calls attention to the effects of the Vow of Chastity on, and its significance to, Korean film in modern society.
Abstract: Clustering techniques have been used by many intelligent software agents to group similar access patterns of Web users into high-level themes that express users' intentions and interests. However, such techniques have mostly focused on one salient feature of the Web document visited by the user, namely the extracted keywords. The major aim of these techniques is to come up with an optimal threshold for the number of keywords needed to produce more focused themes. In this paper we focus on both keyword and similarity thresholds to generate more concentrated themes, and hence build a sounder model of user behavior. The purpose of this paper is twofold: to use distance-based clustering methods to recognize overall themes from the proxy log file, and to suggest efficient cut-off levels for the keyword and similarity thresholds, which tend to produce better clusters with better focus and efficient size.
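The interplay of the two thresholds can be illustrated with a minimal sketch. It uses a simple single-pass clustering with Jaccard similarity, which is an assumption chosen for brevity and not necessarily the distance-based method used in the paper; `keyword_limit` and `similarity_threshold` are hypothetical parameter names standing in for the keyword and similarity thresholds.

```python
def jaccard(a, b):
    """Jaccard similarity between two keyword sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_sessions(sessions, keyword_limit, similarity_threshold):
    """Group keyword lists into themes.

    Assumption: each session is a list of keywords ordered by importance,
    so the keyword threshold simply keeps the first keyword_limit entries.
    """
    clusters = []  # each cluster: {"theme": set of keywords, "members": [indices]}
    for idx, keywords in enumerate(sessions):
        profile = set(keywords[:keyword_limit])
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = jaccard(profile, cluster["theme"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= similarity_threshold:
            best["members"].append(idx)
            best["theme"] |= profile
        else:
            clusters.append({"theme": set(profile), "members": [idx]})
    return clusters


sessions = [
    ["python", "code", "tutorial"],
    ["python", "code", "examples"],
    ["football", "scores", "league"],
]
for cluster in cluster_sessions(sessions, keyword_limit=3, similarity_threshold=0.4):
    print(sorted(cluster["theme"]), cluster["members"])
```

Raising the similarity threshold yields more, tighter themes, while lowering the keyword limit makes each profile coarser; tuning both together is the trade-off the abstract describes.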
Abstract: Thailand's health system is challenged by the rising
number of patients and the decreasing ratio of medical
practitioners to patients, especially in rural areas. This may tempt
inexperienced GPs to rush through the process of anamnesis, with the
risk of incorrect diagnosis. Patients have to travel far to the hospital
and wait a long time to present their case. Many patients try to
cure themselves with traditional Thai medicine. Many countries are
making use of the Internet for medical information gathering,
distribution and storage. Telemedicine applications are a relatively
new field of study in Thailand; the ICT infrastructure had
hampered widespread use of the Internet for medical
information. With recent improvements, health and technology
professionals can work out novel applications and systems to help
advance telemedicine for the benefit of the people. Here we explore
the use of telemedicine for people with health problems in rural areas
of Thailand and present a Telemedicine Diagnosis System for Rural
Thailand (TEDIST) for diagnosing certain conditions, which people
with Internet access can use to establish contact with Community
Health Centers, e.g. by mobile phone. The system uses a Web-based
input method for individual patients' symptoms, which are passed to
an expert system for the analysis of conditions and appropriate
diseases. The analysis harnesses a knowledge base and a backward
chaining component to find out which health professionals should be
presented with the case. Doctors have the opportunity to exchange
emails or chat with the patients they are responsible for or with other
specialists. Patients' data are then stored in a Personal Health Record.
Abstract: A comparison of classifier performance on the Latin and
Arabic handwritten digit recognition problems is presented. The
performance of ten different classifiers is tested on two similar
Arabic and Latin handwritten digit databases. The analysis shows
that the Arabic handwritten digit recognition problem is easier than that
of Latin digits. This is because the interclass difference in the case of
Latin digits is smaller than in Arabic digits and the variances in writing
Latin digits are larger. Consequently, weaker yet fast classifiers are
expected to play a more prominent role in Arabic handwritten digit
recognition.
Abstract: Underpricing is one anomaly in the initial public offering
(IPO) literature that has been widely observed across different stock
markets with different trends emerging over different time periods.
This study seeks to determine how IPOs on the JSE performed on the
first day, first week and first month over the period of 1996-2011.
Underpricing trends are documented for both hot and cold market
periods in terms of four main sectors (cyclical, defensive, growth
and interest-rate-sensitive stocks). Using a sample of 360 listed
companies on the JSE, the empirical findings establish that IPOs
on the JSE are significantly underpriced, with an average
market-adjusted first-day return of 62.9%. It is also established that hot
market IPOs on the JSE are more underpriced than cold market
IPOs. Also observed is the fact that as the offer price per share
increases above the median price for any given period, the level of
underpricing decreases substantially. While significant differences
exist in the level of underpricing of IPOs in the four different sectors
in the hot and cold market periods, interest-rate-sensitive stocks
showed a different trend from the other sectors and thus require
further investigation to explain this pattern.
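For readers unfamiliar with the measure, a market-adjusted first-day return is commonly computed as the raw first-day return of the IPO minus the return of a market index over the same day. The sketch below uses that common definition as an assumption; the abstract does not state the exact formula or index used in the study, and the prices in the example are hypothetical.

```python
def market_adjusted_return(offer_price, first_close, index_open, index_close):
    """Raw first-day IPO return minus the market index return for that day.

    Assumption: this is the usual textbook definition, not necessarily
    the exact specification used in the study.
    """
    raw_return = (first_close - offer_price) / offer_price
    market_return = (index_close - index_open) / index_open
    return raw_return - market_return


# Hypothetical example: the share rises from 10.00 to 12.00 while the
# index rises from 100 to 102, giving 20% raw and 2% market return.
adjusted = market_adjusted_return(10.0, 12.0, 100.0, 102.0)
print(f"{adjusted:.2%}")
```

A positive value indicates underpricing: the offer price was set below the price at which the market valued the share on its first trading day.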
Abstract: While the problem-based learning (PBL) approach promotes unsupervised self-directed learning (SDL), many students experience difficulty juggling the roles of information recipient and information seeker. Logbooks have been used to assess trainee doctors but not in other areas. This study aimed to determine the effectiveness of a logbook for assessing SDL during PBL sessions in first-year medical students. The logbook included a learning checklist and knowledge and skills components. Comparison of the baseline assessment of student performance in PBL with that at semester end after the logbook intervention showed significant improvements in student performance (31.5 ± 8 vs. 17.7 ± 4.4; p
Abstract: Information is increasing in volume; companies are so overloaded with information that they may lose track of the information they need. Scanning through each lengthy document is a time-consuming task. A shorter version of the document that contains only the gist is preferable for most information seekers. Therefore, in this paper, we implement a text summarization system that produces a summary containing the gist of oil and gas news articles. The summarization is intended to provide important information for oil and gas companies to monitor their competitors' behaviour and to help them formulate business strategies. The system integrates a statistical approach with three underlying concepts: keyword occurrences, the title of the news article and the location of the sentence. The generated summaries were compared with human-generated summaries from an oil and gas company. Precision and recall ratios are used to evaluate the accuracy of the generated summary. Based on the experimental results, the system is able to produce an effective summary with an average recall value of 83% at a compression rate of 25%.
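A minimal sketch of how the three concepts could be combined into a sentence score, together with the precision and recall evaluation, is given below. The equal weighting of the three components, the whitespace tokenization and the function names are assumptions for illustration only; the abstract does not describe the actual scoring formula.

```python
from collections import Counter


def summarize(title, sentences, compression=0.25):
    """Select the highest-scoring sentences, keeping document order.

    Assumption: score = average keyword frequency + number of title words
    in the sentence + 1 / (sentence position), all weighted equally.
    """
    tokenized = [s.lower().split() for s in sentences]
    freq = Counter(word for tokens in tokenized for word in tokens)
    title_words = set(title.lower().split())
    scores = []
    for idx, tokens in enumerate(tokenized):
        keyword_score = sum(freq[w] for w in tokens) / max(len(tokens), 1)
        title_score = len(title_words & set(tokens))
        location_score = 1.0 / (idx + 1)  # earlier sentences score higher
        scores.append(keyword_score + title_score + location_score)
    keep = max(1, round(len(sentences) * compression))
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:keep]
    return [sentences[i] for i in sorted(top)]


def precision_recall(system_summary, reference_summary):
    """Sentence-level precision and recall against a human summary."""
    system, reference = set(system_summary), set(reference_summary)
    hits = len(system & reference)
    precision = hits / len(system) if system else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall


article = [
    "oil prices rise sharply this week",
    "analysts met on tuesday",
    "weather was mild",
    "the meeting ended late",
]
summary = summarize("oil prices rise", article, compression=0.25)
print(summary, precision_recall(summary, [article[0], article[1]]))
```

A compression rate of 25% means the summary keeps roughly a quarter of the original sentences, which is the setting reported in the abstract.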
Abstract: The growing volume of information on the
Internet creates an increasing need for new (semi)automatic
methods for retrieving documents and ranking them according to
their relevance to the user query. In this paper, after a brief review
of ranking models, a new ontology-based approach for ranking
HTML documents is proposed and evaluated in various
circumstances. Our approach is a combination of conceptual,
statistical and linguistic methods. This combination preserves the
precision of ranking without losing speed. Our approach
exploits natural language processing techniques to extract phrases
from documents and the query and to stem words. Then
an ontology-based conceptual method is used to annotate
documents and expand the query. To expand a query, the spread
activation algorithm is improved so that the expansion can be done
flexibly and in various aspects. The annotated documents and the
expanded query are then processed to compute the relevance degree
using statistical methods. The outstanding features of our
approach are (1) combining conceptual, statistical and linguistic
features of documents, (2) expanding the query with its related
concepts before comparing it to documents, (3) extracting and using
both words and phrases to compute the relevance degree, (4) improving
the spread activation algorithm to do the expansion based on a
weighted combination of different conceptual relationships and (5)
allowing variable document vector dimensions. A ranking system
called ORank is developed to implement and test the proposed
model. The test results are included at the end of the paper.
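The idea of expanding a query by spreading activation over weighted conceptual relationships can be sketched as follows. This is a generic illustration, not the ORank implementation: the toy ontology, the relation names, the weights, and the `decay` and `threshold` parameters are all assumptions, and relation weights are assumed to lie between 0 and 1 so that activation shrinks along every path.

```python
def expand_query(terms, ontology, relation_weights, decay=0.5, threshold=0.2):
    """Spread activation from the query terms to related concepts.

    ontology maps a concept to a list of (relation, neighbor) pairs.
    A neighbor is added when its activation reaches the threshold.
    """
    activation = {term: 1.0 for term in terms}
    frontier = list(terms)
    while frontier:
        concept = frontier.pop()
        for relation, neighbor in ontology.get(concept, []):
            value = activation[concept] * relation_weights.get(relation, 0.0) * decay
            # Keep only the strongest activation reaching each concept.
            if value >= threshold and value > activation.get(neighbor, 0.0):
                activation[neighbor] = value
                frontier.append(neighbor)
    return activation


# Hypothetical ontology fragment and relation weights.
ontology = {
    "car": [("synonym", "automobile"), ("part_of", "vehicle_fleet")],
    "automobile": [("is_a", "vehicle")],
}
weights = {"synonym": 1.0, "is_a": 0.9, "part_of": 0.2}
print(expand_query(["car"], ontology, weights))
```

The resulting activation values can serve as term weights in the expanded query, so that closely related concepts influence the relevance score more than distant ones.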
Abstract: The purposes of this paper are to (1) promote excellence in computer science by suggesting a cohesive, innovative approach to fill well-documented deficiencies in current computer science education, (2) justify (using the authors' and others' anecdotal evidence from both the classroom and the real world) why this approach holds great potential to successfully eliminate the deficiencies, and (3) invite other professionals to join the authors in proof-of-concept research. The authors' experiences, though anecdotal, strongly suggest that a new approach involving visual modeling technologies should allow computer science programs to retain a greater percentage of prospective and declared majors as students become more engaged learners, more successful problem-solvers, and better prepared as programmers. In addition, the graduates of such computer science programs will make greater contributions to the profession as skilled problem-solvers. Instead of wearily rememorizing code as they move to the next course, students will have the problem-solving skills to think and work in more sophisticated and creative ways.
Abstract: A distributed wireless sensor network consists of several
scattered nodes in a knowledge area. The only power supply of these
sensors is a pair of batteries that must keep them running for up to five
years without replacement. That is why it is necessary to develop
power-aware algorithms that extend battery lifetime as
much as possible. In this document, a review of power-aware
design for sensor nodes is presented. As examples of implementations,
some resource and task management, communication, topology
control and routing protocols are named.
Abstract: Diagnosis can be achieved by building a model of a
certain organ under surveillance and comparing it with the real time
physiological measurements taken from the patient. This paper
presents the benefits of using Data Mining techniques
in computer-aided diagnosis (CAD), focusing on cancer
detection, in order to help doctors make optimal decisions quickly
and accurately. In the field of noninvasive diagnosis techniques,
endoscopic ultrasound elastography (EUSE) is a recent elasticity
imaging technique that allows the difference between
malignant and benign tumors to be characterized. Digitizing and summarizing the main
features of EUSE sample movies in vector form involves the use
of exploratory data analysis (EDA). Neural networks are then
trained on the corresponding EUSE sample movie vector inputs in
such a way that these intelligent systems are able to offer a very
precise and objective diagnosis, discriminating between benign and
malignant tumors. A concrete application of these Data Mining
techniques illustrates the suitability and the reliability of this
methodology in CAD.