Abstract: As a structure for string processing problems, the suffix
array is widely known and extensively studied. But if the
string access pattern follows the "90/10" rule, the suffix array cannot take
advantage of the fact that we often search for something that we have just
found. Although the splay tree is an efficient data structure for small
documents when the access pattern follows the "90/10" rule, it
requires many structures and an excessive amount of pointer
manipulations for efficiently processing and searching large
documents. In this paper, we propose a new and conceptually powerful
data structure, called the splay suffix array (SSA), for string search. This
data structure combines the features of splay trees and suffix arrays into
a new approach which is suitable for implementation on both
conventional and clustered computers.
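The motivation above can be illustrated with a minimal sketch. It is not the SSA structure proposed in the paper, whose details the abstract does not give. It assumes a naively built suffix array with binary search, plus a small least-recently-used cache standing in for the "recently found items are found again" behaviour. The names `build_suffix_array`, `find_occurrences` and `CachedSuffixSearch` are hypothetical.

```python
from functools import lru_cache


def build_suffix_array(text):
    # Naive construction: sort suffix start positions by the suffix text.
    # Fine for illustration, too slow for large documents.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    # Return all start positions of pattern in text, in ascending order.
    m = len(pattern)
    # First binary search: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    # Second binary search: first suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])


class CachedSuffixSearch:
    # Assumption: an LRU cache is used here as a simple stand-in for
    # exploiting a "90/10" access pattern; the paper uses splaying instead.
    def __init__(self, text, cache_size=128):
        self.text = text
        self.sa = build_suffix_array(text)
        self.lookup = lru_cache(maxsize=cache_size)(self._search)

    def _search(self, pattern):
        return tuple(find_occurrences(self.text, self.sa, pattern))
```

With this sketch, a repeated call such as `CachedSuffixSearch("banana").lookup("ana")` is answered from the cache the second time, without touching the suffix array.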
Abstract: E-mail has become an important means of electronic
communication, but its viability is marred by Unsolicited
Bulk E-mail (UBE) messages. UBE comes in many types,
such as pornographic, virus-infected and 'cry-for-help' messages, as well
as fake and fraudulent offers for jobs, winnings and medicines. UBE
poses technical and socio-economic challenges to the use of e-mail.
To meet this challenge and combat this menace, we need to
understand UBE. Towards this end, this paper presents a
content-based textual analysis of more than 2700 UBE messages
advertising body enhancement medicines. Technically, this is an
application of text parsing and tokenization to unstructured textual
documents, and we approach it using Bag of Words (BOW) and Vector
Space Document Model techniques. We attempt to identify the most
frequently occurring lexis in the UBE documents that advertise
various products for body enhancement. An analysis of the top
100 such lexis is also presented. We exhibit the relationship between
the occurrence of a word from the identified lexis set in a given UBE
and the probability that the given UBE advertises a
fake medicinal product. To the best of our knowledge and our survey of
related literature, this is the first formal attempt to identify the
most frequently occurring lexis in such UBE by textual analysis.
Finally, this is a sincere attempt to raise alertness against, and
mitigate the threat of, such alluring but fake UBE.
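The bag-of-words step described above can be sketched as follows. This is an illustrative sketch only, not the authors' pipeline: the tokenizer (lowercase alphabetic runs) and the function names `tokenize`, `top_terms` and `term_document_matrix` are assumptions.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenization rule: lowercase the text and keep alphabetic runs.
    return re.findall(r"[a-z]+", text.lower())


def top_terms(documents, k=100):
    # Count every token across the whole collection and return the k most
    # frequent (term, count) pairs.
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return counts.most_common(k)


def term_document_matrix(documents, vocabulary):
    # Vector space model: one row per document, one column per vocabulary
    # term, each cell holding the raw term frequency.
    rows = []
    for doc in documents:
        counts = Counter(tokenize(doc))
        rows.append([counts[term] for term in vocabulary])
    return rows
```

Given a small list of message bodies, `top_terms(messages, k=100)` would yield the kind of top-100 term list the abstract refers to, and `term_document_matrix` turns each message into a frequency vector over those terms.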
Abstract: This paper presents two novel techniques for skew
estimation of binary document images. These algorithms are based on
connected component analysis and the Hough transform. Both
methods focus on reducing the amount of input data provided to
the Hough transform. In the first method, referred to as the word centroid
approach, the centroids of selected words are used for skew detection.
In the second method, referred to as the dilate & thin approach, the selected
characters are blocked and dilated to get word blocks, and
thinning is then applied. The final image fed to the Hough transform has the
thinned coordinates of word blocks in the image. The methods have
been successful in reducing the computational complexity of Hough
transform based skew estimation algorithms. Promising experimental
results are also provided to demonstrate the effectiveness of the proposed
methods.
Abstract: In the last decade, digital watermarking procedures have
been increasingly applied to implement the copyright protection
of multimedia digital content distributed on the Internet. To this
end, it is worth noting that many watermarking procedures
for images and videos proposed in the literature are based on spread
spectrum techniques. However, some scepticism about the robustness
and security of such watermarking procedures has arisen because
of some documented attacks which claim to render the inserted
watermarks undetectable. On the other hand, web content providers
wish to exploit watermarking procedures characterized by flexible and
efficient implementations and which can be easily integrated in their
existing web services frameworks or platforms. This paper presents
how a simple spread spectrum watermarking procedure for MPEG-2
videos can be modified to be exploited in web contexts. To this end,
the proposed procedure has been made secure and robust against some
well-known and dangerous attacks. Furthermore, its basic scheme
has been optimized by making the insertion procedure adaptive with
respect to the terminals used to open the videos and the network transactions
carried out to deliver them to buyers. Finally, two different
implementations of the procedure have been developed: the former
is a high performance parallel implementation, whereas the latter is
a portable Java and XML based implementation. Thus, the paper
demonstrates that a simple spread spectrum watermarking procedure,
with limited and appropriate modifications to the embedding scheme,
can still represent a valid alternative to many other well-known and
more recent watermarking procedures proposed in the literature.
Abstract: In this paper, a model for an information retrieval
system is proposed which takes into account that knowledge about
documents and the information needs of users are dynamic. Two
methods are combined, one qualitative or symbolic and the other
quantitative or numeric, which are deemed suitable for many
clustering contexts, data analysis, concept exploration and
knowledge discovery. These two methods may be classified as
inductive learning techniques. In this model, they are introduced to
build "long-term" knowledge about past queries and concepts in a
collection of documents. The "long-term" knowledge can guide
and assist the user in formulating an initial query and can be
exploited in the process of retrieving relevant information. The
different kinds of knowledge are organized from different points of
view. This may be considered an enrichment of the exploration
level which is coherent with the concept of document/query
structure.
Abstract: In recent years, Operational Transconductance Amplifier (OTA) based high frequency integrated circuits, filters and systems have been widely investigated. The usefulness of OTAs over conventional op-amps in the design of both first order and second order active filters is well documented. Using the Matlab/Simulink® software, this paper discusses some tunability issues that were previously unreported for any commercial OTA. Based on the simulation results, two first order voltage controlled all-pass filters with phase tuning capability are proposed.
Abstract: This study investigates the use of genetic algorithms
in information retrieval. The method is shown to be applicable to
three well-known document collections, where the genetic
modification presents more relevant documents to users. In this
paper we present a new fitness function for approximate information
retrieval which is faster and more flexible than the cosine similarity
fitness function.
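For reference, the cosine similarity baseline mentioned above can be sketched as follows. The abstract does not define the new fitness function, so this shows only the baseline. Scoring a candidate query by its mean similarity to a set of relevant documents is an assumption made for illustration, and the names `cosine_similarity` and `cosine_fitness` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    # Cosine of the angle between two equal-length term-weight vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # By convention here, a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


def cosine_fitness(query_vector, relevant_documents):
    # Assumed fitness: mean cosine similarity between a candidate query
    # (a chromosome in the genetic algorithm) and the relevant documents.
    if not relevant_documents:
        return 0.0
    total = sum(cosine_similarity(query_vector, d) for d in relevant_documents)
    return total / len(relevant_documents)
```

Each fitness evaluation costs one pass over every document vector, which is the kind of per-evaluation cost a faster fitness function would aim to reduce.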
Abstract: Machine-understandable data, when strongly
interlinked, constitutes the basis for the Semantic Web. Annotating
web documents is one of the major techniques for creating metadata
on the Web. Annotating websites defines the contained data in a
form which is suitable for interpretation by machines. In this paper,
we present a new approach to annotate websites and documents by
promoting the abstraction level of the annotation process to a
conceptual level. By this means, we hope to solve some of the
problems of the current annotation solutions.
Abstract: Transferring patient information between medical care
sites is necessary to deliver better patient care and to reduce medical
cost. Developing electronic medical records is therefore an important
worldwide trend. The Continuity of Care Document (CCD) is a product of
collaboration between the CDA and CCR standards. In this study, we
develop a system to generate medical records at the entry level based on
the CCD template module.
Abstract: The importance of good requirements engineering is well documented. Agile practices, promoting collaboration and communications, facilitate the elicitation and management of volatile requirements. However, current Agile practices work in a well-defined environment. It is necessary to have a co-located customer. With distributed development it is not always possible to realize this co-location. In this environment a suitable process, possibly supported by tools, is required to support changing requirements. This paper introduces the issues of concern when managing requirements in a distributed environment and describes work done at the Software Technology Research Centre as part of the NOMAD project.
Abstract: This paper investigates the effect of International
Financial Reporting Standards (IFRS) adoption on the frequency of
earnings management towards small positive profits. We focus on
two emerging-market IFRS adopters: South Africa and Turkey.
We estimate our logistic regression using appropriate panel estimation
techniques over a sample of 330 South African and 210
Turkish firm-year observations over the period 2002-2008. Our
results document that mandatory adoption of IFRS is not associated
with a reduction in earnings management towards small positive
profits in emerging markets. These results contradict most of the
previous findings of the studies conducted in developed countries.
Based on the legal system factor, we compare the intensity of
earnings management between a code law country (Turkey) and a
common law country (South Africa) over the pre- and post-adoption
periods. Our findings show that the frequency of such earnings
management practice increases significantly for the code law
country.
Abstract: In Turkey, the curriculum of the primary school science course was redesigned on the basis of constructivism in the 2005-2006 academic year. In this context, the course was renamed "Science and Technology", and its content, course books and student workbooks were redesigned in light of constructivism. The aim of this study is to determine whether the Science and Technology course book and student workbook for the primary school 5th grade are consistent with constructivism by evaluating them against its fundamental principles. Among qualitative research methods, the documentation technique (i.e. document analysis) is applied; for sample selection, criterion sampling, a purposeful sampling technique, is used. When the Science and Technology course book and workbook for the 5th grade are examined, it is seen that the two books complement each other in certain areas. Consequently, it can be claimed that, in spite of some inadequate and missing points, an attempt has been made to design these books according to the principles of constructivism. To overcome the inadequacies in the books, it is suggested that they be redesigned. In addition, so as not to neglect the technology dimension of the course, activities that encourage students to prepare projects using the technology cycle should be included.
Abstract: The volume of XML data exchange is increasing explosively, and the need for efficient mechanisms of XML data management is vital. Many XML storage models have been proposed for storing DTD-independent XML documents in relational database systems. Benchmarking is the best way to highlight the pros and cons of different approaches. In this study, we use a common benchmarking scheme, known as XMark, to compare the most cited and newly proposed DTD-independent methods in terms of logical reads, physical I/O, CPU time and duration. We show the effect of Label Path, of extracting values and storing them in another table, and of the type of join needed for each method's query answering.
Abstract: Current OCR technology does not allow accurate
recognition of small text images, such as those found
in web images. Our goal is to investigate new approaches to
recognizing very low resolution text images containing antialiased
character shapes.
This paper presents a preliminary study on the variability of
such characters and the feasibility of discriminating them
using geometrical features. In a first stage we analyze the
distribution of these features. In a second stage we present a
study on the discriminative power for recognizing isolated
characters, using various rendering methods and font
properties. Finally, we present the results of our
evaluation tests, leading to our conclusion and future focus.
Abstract: The ability of the brain to organize information and generate the functional structures we use to act, think and communicate is a common and easily observable natural phenomenon. In object-oriented analysis, these structures are represented by objects. Objects have been extensively studied and documented, but the process that creates them is not understood. In this work, a new class of discrete, deterministic, dissipative, host-guest dynamical systems is introduced. The new systems have extraordinary self-organizing properties. They can host information representing other physical systems and generate the same functional structures as the brain does. A simple mathematical model is proposed. The new systems are easy to simulate by computer, and measurements needed to confirm the assumptions are abundant and readily available. Experimental results presented here confirm the findings. Applications are many, but among the most immediate are object-oriented engineering, image and voice recognition, search engines, and neuroscience.
Abstract: In this study, a Multi-Layer Perceptron (MLP) with the Back-Propagation learning algorithm is used to classify cases for the effective diagnosis of Parkinson's disease (PD), which is a challenging problem for the medical community. Typically characterized by tremor, PD occurs due to the loss of dopamine in the brain's thalamic region, which results in involuntary or oscillatory movement in the body. A feature selection algorithm is used along with biomedical test values to diagnose Parkinson's disease. Clinical diagnosis relies mostly on the doctor's expertise and experience, but cases of wrong diagnosis and treatment are still reported. Patients are asked to take a number of tests for diagnosis. In many cases, not all the tests contribute towards effective diagnosis of a disease. Our work is to classify the presence of Parkinson's disease with a reduced number of attributes. Originally, 22 attributes are involved in classification. We use Information Gain to select attributes, which reduces the number of attributes that need to be collected from patients. An artificial neural network is used to classify the diagnosis of patients. The twenty-two attributes are reduced to sixteen. The accuracy is 82.051% on the training data set and 83.333% on the validation data set.
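The attribute-ranking step described above can be sketched as follows. This is a minimal illustration of Information Gain, not the authors' code. It assumes attribute values are already discrete (continuous biomedical measurements would need to be binned first), and the function names and the `columns` dictionary layout are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    # Shannon entropy, in bits, of a list of class labels.
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    # Entropy of the labels minus the weighted entropy that remains after
    # splitting the samples by the attribute's (discrete) values.
    n = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def select_top_attributes(columns, labels, k):
    # columns maps an attribute name to its list of values, one per sample.
    # Returns the k attribute names with the highest information gain.
    gains = {name: information_gain(vals, labels) for name, vals in columns.items()}
    return sorted(gains, key=gains.get, reverse=True)[:k]
```

In the setting of the abstract, calling `select_top_attributes` with `k=16` on the 22 original attributes would correspond to the reduction from twenty-two to sixteen inputs before training the MLP.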
Abstract: This is the first report on the optimization of
protein hydrolysate production from J.
curcas cake. Proximate analysis of the raw material showed 18.98%
protein, 5.31% ash, 8.52% moisture and 12.18% lipid. The
appropriate protein hydrolysate production process began with
grinding the J. curcas cake into small pieces. It was then suspended
in 2.5% sodium hydroxide solution at a solution to J.
curcas cake ratio of 80:1 (v/w). The hydrolysis reaction was carried out at
50 °C in a water bath for 45 minutes. After that, the
supernatant (protein hydrolysate) was separated by centrifugation at
8000 g for 30 minutes. The maximum yield of the resulting protein
hydrolysate was 73.27%, with 7.34% moisture, 71.69% total protein,
7.12% lipid and 2.49% ash. The product also dissolved well
in water.
Abstract: The removal of hydrogen sulphide is required for reasons of health, odour, safety and corrosivity. The means of removing hydrogen sulphide depend mainly on its concentration and on the kind of medium to be purified. The paper deals with a method of removing hydrogen sulphide from the air by its catalytic oxidation to elemental sulphur with the use of an Fe-EDTA complex. The possibility of obtaining fibrous filtering materials able to remove small concentrations of H2S from the air is described. The base of these materials is a fibrous ion exchanger with the Fe(III)-EDTA complex immobilized on its functional groups. The complex of trivalent iron converts hydrogen sulphide to elemental sulphur. Bivalent iron formed in the reaction is oxidized by atmospheric oxygen, so the complex of trivalent iron is continuously regenerated and the overall process can be regarded as pseudocatalytic. In the present paper, the properties of several fibrous catalysts based on ion exchangers of different chemical nature (weak acid, weak base and strong base) are described. It is shown that the main parameters affecting the process of catalytic oxidation are: the concentration of hydrogen sulphide in the air, the relative humidity of the purified air, the process time and the content of Fe-EDTA complex in the fibres. The data presented show that filtering layers with an anion exchange package are much more active in the catalytic removal of hydrogen sulphide than cation exchangers and inert materials. In addition to the nature of the fibres, relative air humidity is a critical factor determining the efficiency of the material in purifying air of H2S. It is shown that the most promising carriers of the Fe-EDTA catalyst for hydrogen sulphide oxidation are Fiban A-6 and Fiban AK-22 fibres.
Abstract: Online news websites are one of the main and broadest areas of mass media. Since the nineties, several Jordanian newspapers have been introduced to the World Wide Web to reach large and varied audiences. Examples of newspapers that have online versions are Al-Rai, Ad-Dustor and AlGhad. Purely online news websites include Ammon and Rum. The main aim of this study is to evaluate online newspaper websites using two assessment measures: usability and web content. This aim is achieved through a questionnaire-based evaluation built on the definitions of usability and web content in ISO standard 9241, part 11. The results are based on 204 audience responses. The results of the research show that the usability factor is relatively good for all Jordanian online newspapers, whereas the web content factor is moderate.
Abstract: Iranians' imagination of heaven, which is the reward
for a person's good deeds during their life, has shown itself in
pleasant and green gardens, where earthly gardens were made as
representations of paradise. Iranians are also quite interested in
making earthly gardens and plantations around their buildings.
Given Iran's hot and dry climate and its lack of sufficient water for
plant cover, the importance of the Iranians' art of
garden making becomes evident. This study, drawing on examples,
documents and library studies, investigates the characteristics of
Persian gardens. The results show that elements such as soil, water,
plants and layout have been used in forming a unique style of Persian
garden. Bagh-e Shah Zadeh Mahan (Mahan prince garden) is a
typical example and has been carefully studied. In this paper I try to
investigate and evaluate the characteristics of a Persian garden by
means of a descriptive approach.